How do I build a query such that each token in a document field is matched?

thale_jacobs · July 31, 2013, 1:52pm

Hello all, I am having an issue similar to the one Brian described. Brian,
did you end up including the token count in your index for filtering? Did
it work well? I am thinking about doing the same, but I have one more
issue to solve.

My index contains a doc that is just "Square" as well as a doc for
"Square Steakhouse", so if I search only on Square, I want to match only
the "Square" document, not the Square Steakhouse doc...

Query: Square Steakhouse Result: Match to Square Steakhouse doc
Query: Square Steakhouses Result: Match to Square Steakhouse doc
Query: Squared Steakhouse Result: Match to Square Steakhouse doc
Query: Steakhouse Result: No Match
Query: Square Result: Match to Steakhouse doc
Query: Squared Result: Match to Steakhouse doc

Any suggestions?

Thanks.

On Wednesday, January 30, 2013 2:30:52 PM UTC-5, Brian Webster wrote:

I'm going to move forward with your idea:

The only thing I can think of doing is to:

1. index the number of tokens in that field
2. count the number of tokens in your query string
3. use a filter to make sure they are the same

Of course, that means ensuring that you're counting the same number of
tokens that would be generated by the analyzer (eg being aware of
stopwords etc)
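
To make the three steps concrete, here is a minimal in-memory sketch in Python. It does not talk to Elasticsearch at all: the `analyze` function is a toy stand-in (lowercase plus whitespace split, no stemming or stopwords), and the names `index_docs` and `search` are made up for illustration. It only shows why comparing token counts keeps a one-word query from matching a two-word document.

```python
def analyze(text):
    """Toy stand-in for a real analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def index_docs(names):
    """Step 1: store each doc's tokens together with its token count."""
    docs = []
    for name in names:
        tokens = analyze(name)
        docs.append({"name": name, "tokens": tokens, "token_count": len(tokens)})
    return docs


def search(docs, query):
    """Steps 2 and 3: count the query tokens, then require equal counts
    in addition to every query token being present in the doc."""
    query_tokens = analyze(query)
    return [
        doc["name"]
        for doc in docs
        if doc["token_count"] == len(query_tokens)
        and all(token in doc["tokens"] for token in query_tokens)
    ]


docs = index_docs(["Square", "Square Steakhouse"])
print(search(docs, "Square"))
print(search(docs, "square steakhouse"))
print(search(docs, "Steakhouse"))
```

With a real analyzer that stems, the counts would have to come from that same analyzer, which is the caveat about stopwords noted above.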

I'm going to write a function that uses the analyze API to extract the
number of tokens given a field: String GetTokenCount(string
Field_Or_Search_Text). This function will use the correct analyzer.
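
A rough Python sketch of what such a helper might look like is below. It assumes a reasonably recent Elasticsearch where the `_analyze` endpoint accepts a JSON body with `field` and `text` and returns a `tokens` array (older releases took these as query parameters). The index name `places`, the field `name`, and the helper names are hypothetical, and it returns an integer rather than a string.

```python
import json
from urllib import request


def build_analyze_request(index, field, text):
    """Return (path, body) for an _analyze call that uses the field's own analyzer."""
    return f"/{index}/_analyze", {"field": field, "text": text}


def count_tokens(analyze_response):
    """Count the tokens listed in a parsed _analyze response."""
    return len(analyze_response.get("tokens", []))


def get_token_count(base_url, index, field, text):
    """POST to _analyze and return the token count (needs a reachable cluster)."""
    path, body = build_analyze_request(index, field, text)
    req = request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return count_tokens(json.load(resp))
```

Usage would be something like `get_token_count("http://localhost:9200", "places", "name", "Square Steakhouse")`, called once at index time on the field value and once at search time on the user's input.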

Then, upon indexing the document type in question, I will store the token
count of the relevant field.

Upon searching, I will use the same GetTokenCount() function to count the
user's search tokens.

Finally, I will structure the search JSON to utilize the filters as you
have suggested.
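
For reference, one way the search body could be shaped is sketched below in Python. This is an assumption, not the exact JSON from the thread: it uses the `bool` query with a `filter` clause from later Elasticsearch versions (releases from 2013 would have wrapped this in a `filtered` query instead), and the field names `name` and `name_token_count` are hypothetical.

```python
import json


def build_exact_token_query(field, count_field, text, token_count):
    """Require every query token to match, and the stored token count to be equal."""
    return {
        "query": {
            "bool": {
                # All analyzed query tokens must appear in the field.
                "must": [{"match": {field: {"query": text, "operator": "and"}}}],
                # The doc must have exactly as many tokens as the query.
                "filter": [{"term": {count_field: token_count}}],
            }
        }
    }


body = build_exact_token_query("name", "name_token_count", "Square", 1)
print(json.dumps(body, indent=2))
```

The `token_count` argument would come from running the user's search text through the same analyzer as the indexed field.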

Obviously this solution is poor for some applications, but I anticipate
fewer than 10,000 searches per day and fewer than 10 index inserts per day
of the type that is involved. Besides, I'd imagine the analyze API is
rather speedy compared to running actual queries.

Thanks for the advice. This will be a little bit tedious, but not so bad.

