Match exact substring in not analyzed field

Maya · January 7, 2014, 1:13pm

Hi,

I have a multi_field mapping:
"testMulti": {
  "type": "multi_field",
  "fields": {
    "testMulti": {
      "type": "string",
      "index": "analyzed",
      "analyzer": "english"
    },
    "exact": {
      "type": "string",
      "index": "not_analyzed"
    }
  }
}

I would like to use the testMulti.exact field for exact matches and also for exact substring matches.
For example, if the field contains:
"There is a dog"
the document should be returned for:
"query_string": {
  "query": "\"is a dog\"",
  "fields": [
    "testMulti.exact"
  ]
}
and also for "query": "\"There is\"", but not for "query": "\"Is a dog\"", "query": "\"are a dog\"", etc.

Currently the document is returned only for a full match, "query": "\"There is a dog\"", and not for a partial match.

How can I achieve a partial match?

Thanks.

brian_yoder · January 7, 2014, 10:30pm

This is an interesting problem. I generally take a dim view of stop words. I
would prefer that the client side avoid searching on them, if that is
desired, rather than having the engine ignore them. Then phrase matching can
work properly. Queries such as The Wall can look for just Wall (ignoring
The as a stop word), while the Google-like +The Wall can look for The
Wall. Yeah, I know that ES is not Google; I only look to Google for ideas
that are nice and for hints about their implementation based upon their
external behavior.

Then, your problem could be solved using a phrase query with no slop.

Maybe your testMulti field is analyzed but no stop words are ignored. Or,
maybe testMulti.raw is analyzed but with no stop words ignored. Either way,
you'd have the full set of words indexed for a phrase query to quickly find
the sub-match. At least, much, much more quickly than a grep-style wildcard
search against a non-analyzed form of the field.
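
A phrase query with no slop, as suggested above, might look like the following request body. This is only a sketch: it assumes a testMulti.raw sub-field exists and is analyzed without stop-word removal, and it uses the match_phrase query with an explicit slop of 0. If the analyzer on that sub-field lowercases tokens, "Is a dog" would match as well, so a case-preserving analyzer is needed for case-sensitive matching.

```json
{
  "query": {
    "match_phrase": {
      "testMulti.raw": {
        "query": "is a dog",
        "slop": 0
      }
    }
  }
}
```

With every word indexed at its position, this should match "There is a dog" but not a document containing only "are a dog".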

I also used phrases within my own table-based synonym matching. Instead of
using ES synonyms, I create a separate type with lists of synonyms. A query
for a synonym is first directed to that type to fetch a list of synonyms;
then an OR query is generated. This has proven to be fast enough. It has
the benefit of allowing the synonyms to be updated with no changes to the
97 million documents that are already indexed. And, synonyms can be phrases,
for example: HUGE -> "VERY BIG". So now a synonym query for HUGE can find The
Very Big Dog. Likewise, a synonym query for the phrase "VERY BIG" can find The
Huge Dog. Really cool; just a matter of Java coding on the front end. And
ES does the heavy lifting underneath. But I digress a little...
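
The query generated by the front end is not shown in this thread. As a rough illustration only, an OR query for HUGE, expanded with its synonym "VERY BIG", could be expressed as a bool query with one match_phrase clause per synonym. The field name description is hypothetical, and the field is assumed to be analyzed with a lowercasing analyzer that keeps all words.

```json
{
  "query": {
    "bool": {
      "should": [
        { "match_phrase": { "description": "huge" } },
        { "match_phrase": { "description": "very big" } }
      ]
    }
  }
}
```

Because the bool query contains only should clauses, a document has to match at least one of them, which gives the OR behavior.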

Hope this helps.

Brian

Maya · January 8, 2014, 7:33am

Thanks for the reply.
In the meantime I switched the exact sub-field to the whitespace analyzer, and it gives pretty good results.
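
The changed mapping was not posted. A minimal sketch of what it might look like, assuming only the exact sub-field was changed from not_analyzed to the built-in whitespace analyzer:

```json
{
  "testMulti": {
    "type": "multi_field",
    "fields": {
      "testMulti": {
        "type": "string",
        "index": "analyzed",
        "analyzer": "english"
      },
      "exact": {
        "type": "string",
        "index": "analyzed",
        "analyzer": "whitespace"
      }
    }
  }
}
```

The whitespace analyzer splits on whitespace only and does not lowercase or remove stop words, so a phrase query for "is a dog" on testMulti.exact can match "There is a dog" while "Is a dog" does not. Punctuation stays attached to the neighboring word, which is one reason the results may be good rather than perfect.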
