The docCount here is the total number of documents in the index.
A simplified version of the formula for idf(term) is totalNumberOfDocs / (number of documents containing term).
That is why it is called inverse document frequency: the number of documents containing the term sits in the denominator. Rare terms that appear in only a few documents have a higher idf, while common terms that appear in many documents have a lower idf.
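
To make the ratio concrete, here is a minimal Python sketch of the simplified formula only, not of any library's actual scoring code. The function name simplified_idf, the whitespace tokenization, and the four sample documents are assumptions made for illustration.

```python
def simplified_idf(term, docs):
    """Simplified idf: total number of docs / number of docs containing term."""
    doc_count = len(docs)  # plays the role of docCount
    # Document frequency: how many documents contain the term at least once.
    doc_freq = sum(1 for doc in docs if term in doc.split())
    if doc_freq == 0:
        raise ValueError(f"term {term!r} does not occur in any document")
    return doc_count / doc_freq


# Hypothetical toy index of four documents.
docs = [
    "the quick brown fox",
    "the lazy dog",
    "the quick dog",
    "the aardvark",
]

idf_the = simplified_idf("the", docs)            # in all 4 docs: 4 / 4
idf_quick = simplified_idf("quick", docs)        # in 2 docs: 4 / 2
idf_aardvark = simplified_idf("aardvark", docs)  # in 1 doc: 4 / 1
print(idf_the, idf_quick, idf_aardvark)
```

The term that occurs in every document gets the lowest value, and the term that occurs in a single document gets the highest, which matches the behavior described above.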