How to disable TF/IDF completely

(Ahmad Azimi) #1

I'm trying to query short sentence fragments against about 500k records, and TF/IDF scoring makes the results very poor. I want to disable it at index time or query time (without using constant_score).

Does anyone have any ideas?

(David Pilato) #2

What do you mean?

(Ahmad Azimi) #3

I only need to check whether a term exists in the field, but TF/IDF uses word order, count, and frequency, which makes the final scores unacceptable in my case.

PS1. Separate boosts for each field are very important to me.

PS2. I have to use a multi_match query (cross_fields).

PS3. These are my mapping parameters for one field:
'search' => [
    'type' => 'text',
    'norms' => false,
    'index_options' => 'docs',
    'analyzer' => 'index_time_persian',
    'search_analyzer' => 'search_time_persian',
    'term_vector' => 'with_positions_offsets',
],
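
One option not raised in the thread is the per-field `similarity` mapping parameter. Elasticsearch ships a `boolean` similarity that scores a matching term with its query boost and ignores term and document frequencies. The sketch below restates the mapping above as a Python dict and adds that setting. The `similarity` line is an addition, not part of the original mapping. The typeless `mappings.properties` layout assumes a recent Elasticsearch version, and the two Persian analyzers are the poster's own and would still need to be defined in the index settings. How this interacts with `cross_fields` has not been verified here.

```python
import json

# Field mapping from the post, expressed as a Python dict.
search_field = {
    "type": "text",
    "norms": False,
    "index_options": "docs",
    "analyzer": "index_time_persian",        # custom analyzer, defined elsewhere
    "search_analyzer": "search_time_persian",  # custom analyzer, defined elsewhere
    "term_vector": "with_positions_offsets",
    # Addition (not in the original post): score matches by boost only,
    # without term frequency or inverse document frequency.
    "similarity": "boolean",
}

# Assumption: typeless mapping layout; older versions nest this under a type name.
index_body = {"mappings": {"properties": {"search": search_field}}}

print(json.dumps(index_body, indent=2))
```

The similarity of an existing field generally cannot be changed in place, so trying this would likely mean creating a new index and reindexing.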

(David Pilato) #4

So you want to use a term query within a bool -> filter clause, no?

Maybe I misunderstand, in which case please provide a full recreation script as described in "About the Elasticsearch category". It will help us better understand what you are doing. Please try to keep the example as simple as possible.

A full reproduction script will help readers to understand, reproduce and if needed fix your problem.
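
The suggestion above, a term query inside a bool filter clause, could look roughly like the following request body. This is a minimal sketch: the helper name `term_filter_query` and the sample term are hypothetical, and only the field name `search` comes from the mapping in this thread. Clauses in filter context do not contribute to the score, so every hit from a filter-only bool query gets a score of 0, and a `term` query does not analyze its input, so the value has to match an indexed token exactly.

```python
import json


def term_filter_query(field, term):
    """Build a search body that only checks whether `term` exists in `field`."""
    return {
        "query": {
            "bool": {
                # Filter context: match or no match, no relevance scoring.
                "filter": [{"term": {field: term}}]
            }
        }
    }


# Hypothetical usage with the "search" field from the mapping above.
body = term_filter_query("search", "example")
print(json.dumps(body, indent=2))
```

Because all hits score the same, this form gives pure existence matching but no ranking by field boost.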

(Byron Voorbach) #5

You could use a constant_score query to disregard TF/IDF :)

(Ahmad Azimi) #6

Thank you @dadoonet
Ok, I'll prepare it as soon as possible.

(Ahmad Azimi) #7

@byronvoorbach Unfortunately, constant_score completely removes my field boosts and returns 0 or 1 as the score, so I can't use it as is.
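
One possible way around this, sketched under assumptions, is to wrap each field in its own constant_score clause, since constant_score accepts a `boost` that becomes the score of every matching document. Combining the clauses in a bool `should` makes the final score the sum of the boosts of the fields that matched. The helper name `presence_query` and the field names `title` and `body` are hypothetical. This is not equivalent to `multi_match` with `cross_fields`: each field is matched independently here, so it may or may not fit the use case.

```python
import json


def presence_query(text, field_boosts):
    """Score documents only by which fields matched, using per-field boosts."""
    clauses = [
        {
            "constant_score": {
                # The inner match query decides whether the field matches;
                # its own relevance score is discarded.
                "filter": {"match": {field: text}},
                # Every document matching this field gets exactly this score.
                "boost": boost,
            }
        }
        for field, boost in field_boosts.items()
    ]
    return {"query": {"bool": {"should": clauses, "minimum_should_match": 1}}}


# Hypothetical fields and boosts.
body = presence_query("some text", {"title": 3.0, "body": 1.0})
print(json.dumps(body, indent=2))
```

With these example boosts, a document matching only `title` would score 3.0, one matching only `body` would score 1.0, and one matching both would score 4.0.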
