Prefix "and" search

Kathy_Cashel · April 27, 2015, 12:28pm

My client wants a search that returns both prefix and exact matches per
token. Most (but not all) of the text being searched is human names. The
idea is that "jane smith" and "smith, j" and "j smit" will all return the
same document, so I'm using tokens.

The trick is that it needs to be an "and" search: every token in the search
string must be present in each result, and the prefix query does not seem
to offer this option. Edge n-grams would be a great answer, but the client
wants results returned alphabetically, using no relevance at all, and the
long tail on n-gram searches makes the alphabetical sort infeasible.

So for now I'm using a simple keyword tokenizer (plus some custom
analyzers) to index, and then splitting the search string on spaces to
build queries like the one below. It seems very awkward, cumbersome, and brittle.

All suggestions appreciated.

{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              { "match": { "text": "j" } },
              { "prefix": { "text": "j" } }
            ]
          }
        },
        {
          "bool": {
            "should": [
              { "match": { "text": "smith" } },
              { "prefix": { "text": "smith" } }
            ]
          }
        }
      ]
    }
  }
}
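
For reference, the "split the search string and build one clause per token" step could be sketched in Python roughly as follows. This is only an illustration, not the code actually in use: the function name build_and_prefix_query, the comma handling, and the lowercasing are assumptions. Only the field name "text" and the shape of the query come from the example above.

```python
import json
import re


def build_and_prefix_query(search, field="text"):
    """Build a bool query in which every token must match exactly or as a prefix."""
    # Assumption: tokens are separated by whitespace and/or commas.
    # Assumption: the index analyzer lowercases terms, so the input is
    # lowercased here too (prefix queries are term-level and not analyzed).
    tokens = [t for t in re.split(r"[\s,]+", search.lower()) if t]

    # One nested bool per token: "should" makes it match-OR-prefix, and the
    # outer "must" requires every token to be satisfied (the "and" part).
    must = [
        {
            "bool": {
                "should": [
                    {"match": {field: token}},
                    {"prefix": {field: token}},
                ]
            }
        }
        for token in tokens
    ]
    return {"query": {"bool": {"must": must}}}


if __name__ == "__main__":
    print(json.dumps(build_and_prefix_query("smith, j"), indent=2))
```

Generating the clauses in one place at least keeps the per-token logic from being repeated by hand, though it does not remove the need to split the string client-side.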

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/bf12f1dc-f8d5-4e05-92fa-ce5f36c07d1c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

deepak_daffodil · April 27, 2015, 12:44pm

Use slop with a phrase_prefix query.
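
A minimal sketch of what such a request body might look like, built in Python. The helper name build_phrase_prefix_query and the slop value of 2 are assumptions for illustration, and the field name "text" is taken from the original example. One caveat worth checking against the requirements: match_phrase_prefix treats only the last term of the search string as a prefix, so earlier terms have to match whole tokens.

```python
import json


def build_phrase_prefix_query(search, field="text", slop=2):
    """Build a match_phrase_prefix query body with a slop allowance."""
    # All terms must be present (phrase semantics); slop loosens how far
    # apart or out of order they may be. The value 2 is illustrative only.
    # Only the LAST term in `search` is expanded as a prefix.
    return {
        "query": {
            "match_phrase_prefix": {
                field: {
                    "query": search,
                    "slop": slop,
                }
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_phrase_prefix_query("jane smit"), indent=2))
```

Unlike the hand-built bool query, the search string is passed through as-is and analyzed by Elasticsearch, so no client-side splitting is needed.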


Kathy_Cashel · April 28, 2015, 11:44am

This looks very promising. Thanks so much Deepak!

Kathy

