In a corpus of N documents, one randomly chosen document contains a total of T terms, and the term “hello” appears K times in it.

What is the correct value for the product of TF (term frequency) and IDF (inverse document frequency) if the term “hello” appears in approximately one-third of the total documents?

a. KT * Log(3)

b. T * Log(3) / K

c. K * Log(3) / T

d. Log(3) / KT

1 Answer

Answer: (c)

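The formula for TF is K / T (occurrences of the term divided by the total number of terms in the document).

The formula for IDF is log(total docs / number of docs containing “hello”)

= log(N / (N/3)) = log(1 / (⅓))

= log(3)

Hence TF * IDF = (K / T) * log(3), so the correct choice is K * Log(3) / T.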
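To check the result numerically, here is a minimal sketch in Python. The function name `tf_idf` and the sample values for K, T, and N are made up for illustration, and the natural logarithm is an assumption, since the question does not state a base.

```python
import math

def tf_idf(k, t, n_docs, n_docs_with_term):
    """TF-IDF for one term in one document (illustrative helper, not a library API)."""
    tf = k / t                                  # term frequency: K / T
    idf = math.log(n_docs / n_docs_with_term)   # inverse document frequency (natural log assumed)
    return tf * idf

# Hypothetical values: "hello" appears 4 times in a 100-term document,
# and 100 of the 300 documents (one-third) contain it.
K, T, N = 4, 100, 300
score = tf_idf(K, T, N, N // 3)

# The score should match option (c): K * log(3) / T
print(math.isclose(score, K * math.log(3) / T))
```

Changing the base of the logarithm only rescales the score by a constant, so option (c) holds either way.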
