beginsWith / endsWith / contains

dawi · November 2, 2011, 9:43am

Hi,

I have to implement a search backend for our product to replace the
old sql queries.

The index is built using the following default analyzer settings:
index.analysis.analyzer.default.filter = lowercase
index.analysis.analyzer.default.tokenizer = keyword
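
To illustrate what these two settings do to a field value, here is a minimal Python sketch. It only imitates the behaviour locally and does not call Elasticsearch; the function name `analyze_default` is made up for this example.

```python
def analyze_default(value: str) -> list:
    """Imitate a keyword tokenizer followed by a lowercase filter."""
    # The keyword tokenizer emits the whole input as one single token,
    # so whitespace and punctuation stay inside the token.
    tokens = [value]
    # The lowercase filter then lowercases every token.
    return [token.lower() for token in tokens]


print(analyze_default("Foo Bar*"))  # ['foo bar*']
```

Term, prefix and wildcard queries are not analyzed, so with this setup the search term would have to be lowercased by the caller before it is sent.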

Now I have to provide the following search methods:

exact
matchesPattern
beginsWith
endsWith
contains

exact is implemented as term query.

"aaa" matches exactly "aaa"
"aa" matches "aa" but not "aaa"

matchesPattern is implemented as wildcard query

"a" matches "a"
"a*" matches "a", "aa", "ab", "a*"
Question here: how can I search for literal wildcard characters in field values?
Something like "a*" that would match "a", "aa" and "ab" but not
"a" and "aa"

beginsWith is implemented as prefix query.

"aaa" matches "aaa", "aaaa", "aaa*"
"aa" matches "aa", "a*aa" but not "aaa"

endsWith
Should behave as beginsWith.
But what is the best way to implement this feature?
Is it possible to define multiple analyzers per field and to use the
reverse filter for this?
index.analysis.analyzer.reverse.filter=reverse
index.analysis.analyzer.reverse.tokenizer=keyword
Or do I have to store a reversed version of the field in the index and
use the prefix query?
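
The reversed-field idea can be sketched locally in Python. This is only an illustration of the technique (index the reversed value, then run a prefix match with the reversed search term) and does not use Elasticsearch; `reverse_token` and `ends_with` are hypothetical helper names.

```python
def reverse_token(value: str) -> str:
    """Imitate keyword tokenizer + lowercase filter + reverse filter."""
    return value.lower()[::-1]


def ends_with(reversed_index: list, suffix: str) -> list:
    """Return the (lowercased) values that end with `suffix`."""
    # Reversing the suffix turns the endsWith problem into a prefix match.
    prefix = reverse_token(suffix)
    return [token[::-1] for token in reversed_index if token.startswith(prefix)]


# Stand-in for the reversed field of four documents.
index = [reverse_token(v) for v in ["aaa", "baa", "aab", "a*aa"]]
print(ends_with(index, "aa"))  # ['aaa', 'baa', 'a*aa']
```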

contains

"a" matches "aa", "aaaa"
"a*" matches "a*" or "aa*aa" but not "aaaa"
Could be implemented using wildcard query if character escaping is
possible there.
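
A minimal sketch of how such escaping could look on the client side, assuming the wildcard query treats a backslash as the escape character (Lucene's WildcardQuery defines one in recent versions; whether the version in use supports it would need checking). The helper names are hypothetical, and the small matcher only approximates wildcard semantics so the patterns can be tried locally.

```python
import re


def escape_wildcards(text: str) -> str:
    """Prefix every wildcard metacharacter with a backslash."""
    out = []
    for ch in text:
        if ch in ("\\", "*", "?"):
            out.append("\\")
        out.append(ch)
    return "".join(out)


def contains_pattern(text: str) -> str:
    """Build a wildcard pattern that matches values containing `text` literally."""
    return "*" + escape_wildcards(text.lower()) + "*"


def wildcard_to_regex(pattern: str):
    """Approximate wildcard matching locally: * = any run, ? = one char."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # An escaped character is matched literally.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)


matcher = wildcard_to_regex(contains_pattern("a*"))
for value in ["a*", "aa*aa", "aaaa"]:
    print(value, bool(matcher.fullmatch(value)))
```

With the escaped pattern, "a*" and "aa*aa" match while "aaaa" does not, which is the contains behaviour described above.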

Hopefully someone has some tips to point me in the right direction.

Regards,
Daniel

Karussell1 · November 2, 2011, 2:34pm

It is possible to use multiple types (and so multiple analyzers) for
one field:

I would also suggest using the edge ngram tokenizer/filter to improve
the performance of wildcard searches:

There is also an option for front or back.
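
To show what edge n-grams look like, here is a small Python sketch that generates them locally. The parameter names `min_gram`, `max_gram` and `side` are modelled loosely on the filter's options and are assumptions here; the function itself is not part of any library.

```python
def edge_ngrams(token: str, min_gram: int = 1, max_gram: int = 10,
                side: str = "front") -> list:
    """Return the n-grams anchored at one edge of the token."""
    upper = min(max_gram, len(token))
    if side == "front":
        # Grams that start at the beginning: useful for beginsWith.
        return [token[:n] for n in range(min_gram, upper + 1)]
    # Grams that end at the end of the token: useful for endsWith.
    return [token[-n:] for n in range(min_gram, upper + 1)]


print(edge_ngrams("abcd", side="front"))  # ['a', 'ab', 'abc', 'abcd']
print(edge_ngrams("abcd", side="back"))   # ['d', 'cd', 'bcd', 'abcd']
```

With these grams in the index, a beginsWith or endsWith search becomes an exact term lookup, at the cost of a larger index.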

Peter.

dawi · November 3, 2011, 8:59am

Hi Karussell,

Thanks for the hint about using the multi-field type.

But we are using Elasticsearch schema-free, so is it possible to define
different analyzers more globally than on a specific field? I am not sure if
this is possible; at least I could not find an example that does so in
the documentation. There, all mappings are defined at the field level.

Concerning the Edge NGram filter: I will try it and see how this influences
performance and index size.

Concerning wildcard queries: Is there any way to escape wildcard
characters (e.g. so that a search for "aa*" finds "aa*")?

Regards,
Daniel
