I want to find out whether the term statistics used to compute IDF (that is, the number of documents each term appears in) are stored separately for each document type or for the whole index.
If they are stored for the whole index, is there any way to keep a separate term frequency vector for each type within that index?
My problem is that I have an application with many different document types. Each type has its own corpus, and I don't want them to affect each other.
For example, if one type contains many occurrences of the term X, I don't want that to lower the IDF score of X in the other types.
I know that this can be achieved using multiple indices, but I have many types and some of them contain only a small number of documents, so an index per type would hurt performance.
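
To make the effect I am worried about concrete, here is a toy sketch in Python. It is not how any search engine computes scores internally: the smoothed IDF formula, the function names `idf` and `idf_by_scope`, and the sample documents are all assumptions made up for illustration. It only compares the IDF of a term X computed over the whole index with the IDF computed per type.

```python
import math
from collections import defaultdict


def idf(num_docs, doc_freq):
    # Assumed smoothed IDF form; a real engine may use a different formula.
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def idf_by_scope(docs, term):
    """Return (index_wide_idf, {doc_type: per_type_idf}) for `term`.

    `docs` is a list of (doc_type, terms) pairs, where `terms` is a set of strings.
    """
    total_docs = len(docs)
    total_df = 0
    docs_per_type = defaultdict(int)
    df_per_type = defaultdict(int)
    for doc_type, terms in docs:
        docs_per_type[doc_type] += 1
        if term in terms:
            total_df += 1
            df_per_type[doc_type] += 1
    index_wide = idf(total_docs, total_df)
    per_type = {t: idf(n, df_per_type[t]) for t, n in docs_per_type.items()}
    return index_wide, per_type


# Hypothetical corpus: every document of type "a" contains x,
# but only one of the four documents of type "b" does.
docs = (
    [("a", {"x", "foo"})] * 4
    + [("b", {"x", "bar"})]
    + [("b", {"bar", "baz"})] * 3
)

index_wide, per_type = idf_by_scope(docs, "x")
print(f"index-wide IDF of x: {index_wide:.3f}")
for doc_type in sorted(per_type):
    print(f"type {doc_type!r} IDF of x: {per_type[doc_type]:.3f}")
```

With this data, X is rare within type "b" but common in the index as a whole, so its index-wide IDF comes out lower than the IDF computed over type "b" alone. That difference is exactly what I would like to avoid.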