Prefix "and" search

My client wants a search that returns both prefix and exact matches per
token. Most (but not all) of the text being searched is human names. The
idea is that "jane smith" and "smith, j" and "j smit" will all return the
same document, so I'm using tokens.

The trick is that it needs to be an "and" search: every token in the search
string must match in each result, and the prefix query does not seem to
offer that option. An edge n-gram tokenizer would be a great answer, but the
client wants results returned alphabetically - using no relevance at all -
and the long tail on n-gram searches makes the alpha sort infeasible.

So for now I'm using a simple keyword tokenizer (plus some custom
analyzers) to index, and then splitting the search string on spaces to
build queries like the one below: one "must" clause per token, each wrapping
a "should" of match and prefix. It seems very awkward / cumbersome / brittle.

All suggestions appreciated.

{
  "query" : {
    "bool" : {
      "must" : [{
          "bool" : {
            "should" : [{
                "match" : {
                  "text" : "j"
                }
              }, {
                "prefix" : {
                  "text" : "j"
                }
              }
            ]
          }
        }, {
          "bool" : {
            "should" : [{
                "match" : {
                  "text" : "smith"
                }
              }, {
                "prefix" : {
                  "text" : "smith"
                }
              }
            ]
          }
        }
      ]
    }
  }
}
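
For anyone following along, here is a rough sketch of how a query body like the one above might be generated from the raw search string. It is only an illustration, not the code actually in use: the function name build_and_prefix_query is made up, the field name "text" is taken from the query above, and stripping commas before splitting is an assumption added so that "smith, j" yields the tokens "smith" and "j".

```python
import json


def build_and_prefix_query(search, field="text"):
    """Build a bool query that requires every token to match
    either exactly (match) or as a prefix (prefix).

    Hypothetical helper: the name and the comma handling are
    assumptions, not part of the original post.
    """
    # Treat commas as separators, then split on whitespace.
    tokens = search.replace(",", " ").split()

    # One "must" clause per token; each clause is satisfied when
    # at least one of its "should" alternatives matches.
    must = [
        {
            "bool": {
                "should": [
                    {"match": {field: token}},
                    {"prefix": {field: token}},
                ]
            }
        }
        for token in tokens
    ]
    return {"query": {"bool": {"must": must}}}


if __name__ == "__main__":
    print(json.dumps(build_and_prefix_query("smith, j"), indent=2))
```

Note that the prefix query is a term-level query, so the token is generally not passed through the field's analyzer the way the match text is. Any normalization the index analyzers apply (lowercasing, for example) would probably have to be repeated on the tokens before they go into the prefix clauses.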

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/bf12f1dc-f8d5-4e05-92fa-ce5f36c07d1c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

use slop with phrase_prefix
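
A minimal sketch of what that suggestion could look like as a request body, assuming the match_phrase_prefix query with its slop and max_expansions parameters. The helper name phrase_prefix_query and the values slop=10 and max_expansions=50 are placeholders chosen for illustration, and the field name "text" is taken from the original post.

```python
import json


def phrase_prefix_query(search, field="text", slop=10, max_expansions=50):
    """Build a match_phrase_prefix request body.

    Hypothetical helper: the slop and max_expansions defaults are
    illustrative guesses, not tuned or recommended values.
    """
    return {
        "query": {
            "match_phrase_prefix": {
                field: {
                    "query": search,
                    # slop lets the terms appear out of order or with gaps
                    "slop": slop,
                    # caps how many terms the final prefix may expand to
                    "max_expansions": max_expansions,
                }
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(phrase_prefix_query("smith j"), indent=2))
```

One caveat worth checking: match_phrase_prefix treats only the last term of the search string as a prefix, so "smith j" and "j smith" may not behave the same way against a document containing "jane smith". Whether this covers every case in the original post would need testing.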

On Mon, Apr 27, 2015 at 5:58 PM, Kathy Cashel kathleencashel@gmail.com
wrote:

My client wants a search that returns both prefix and exact matches per
token. [...]

This looks very promising. Thanks so much, Deepak!

Kathy

On Monday, April 27, 2015 at 8:44:28 AM UTC-4, deepak.chauhan wrote:

use slop with phrase_prefix
