Best implementation of a multi-field, multi-term prefix query

I'm just getting started with ES and its Java API. As a proof-of-concept I
implemented a generic search method that takes user-inputted text and does
prefix matching against various fields:

List searchFoos(@Nullable String phrase) {

    final QueryBuilder queryBuilder;
    if (phrase != null && !phrase.isEmpty()) {

        phrase = phrase.trim().toLowerCase();

        final BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();

        for (final String term : phrase.split("\\s+")) {
            boolQueryBuilder
                    .should(QueryBuilders.prefixQuery("name", term))
                    .should(QueryBuilders.prefixQuery("slogan", term))
                    .should(QueryBuilders.prefixQuery("affiliation", term))
                    .should(QueryBuilders.prefixQuery("tags", term));
        }

        queryBuilder = boolQueryBuilder;
    }
    else {
        queryBuilder = QueryBuilders.matchAllQuery();
    }

    //etc.

}

This is obviously a pretty naive implementation. I'd like to improve it
with respect to two overlapping concerns:

My questions are:

  • How can I best improve the above implementation while keeping the same
    behavior?
  • If there is indeed something like multi_match_phrase_prefix, where is
    its documentation, or else helpful examples?
  • As I noted, match_phrase_prefix will behave differently from a prefix
    query for each term of the phrase - is that a more typical implementation
    for the use case of a search bar for a user? This is a really common use
    case so I'm wondering what queries others have used and why.

Paul

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Hi Paul,
The multi_match query supports every option that the match query supports.
That means you can run a phrase_prefix against multiple fields with a
single query like the following:

{
    "query" : {
        "multi_match" : {
            "fields" : ["title", "subtitle"],
            "query" : "trying out ela",
            "type" : "phrase_prefix"
        }
    }
}
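Since the original code is Java, here is a minimal sketch that renders the same request body for an arbitrary list of fields. It builds the JSON by hand only to stay dependency-free; the class name MultiMatchBody and the helpers quote and phrasePrefixBody are made up for illustration and are not part of the Elasticsearch API. The Java API has its own builder for this query (QueryBuilders.multiMatchQuery), and how the type is set there may differ between versions, so check the documentation for the release you are on.

```java
import java.util.Arrays;
import java.util.List;

public class MultiMatchBody {

    // Hypothetical helper: wraps a value in double quotes, escaping backslashes
    // and quotes. Control characters are not handled; a JSON library would do that.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Hypothetical helper: renders a multi_match request body of type phrase_prefix.
    static String phrasePrefixBody(String text, List<String> fields) {
        StringBuilder fieldList = new StringBuilder();
        for (String field : fields) {
            if (fieldList.length() > 0) {
                fieldList.append(", ");
            }
            fieldList.append(quote(field));
        }
        return "{ \"query\": { \"multi_match\": { "
                + "\"fields\": [" + fieldList + "], "
                + "\"query\": " + quote(text) + ", "
                + "\"type\": \"phrase_prefix\" } } }";
    }

    public static void main(String[] args) {
        // Field names taken from the original post.
        System.out.println(phrasePrefixBody("trying out ela",
                Arrays.asList("name", "slogan", "affiliation", "tags")));
    }
}
```

The resulting string can be sent as the body of a search request; the user's text goes in unsplit, because the query itself decides how the terms are treated.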

A match phrase prefix query treats only the last term as a prefix, so, as
you said, it behaves quite differently from a prefix query on each term. On
the other hand, that behavior makes sense for as-you-type auto-complete
queries, and it should perform better too.
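To make the difference concrete, here is a small in-memory sketch of the two matching modes. It is a simplified model and not how Elasticsearch evaluates either query: it assumes whitespace tokenization and lowercasing in place of a real analyzer, no slop, and no limit on prefix expansions. The class and method names are invented for the example.

```java
import java.util.Arrays;
import java.util.List;

public class PrefixModes {

    // Stand-in for an analyzer (assumption): lowercase, split on whitespace.
    static List<String> tokens(String text) {
        return Arrays.asList(text.trim().toLowerCase().split("\\s+"));
    }

    // Per-term prefix, like the bool/should loop: a field matches if any
    // query term is a prefix of any field token.
    static boolean anyTermPrefix(String field, String query) {
        for (String term : tokens(query)) {
            for (String token : tokens(field)) {
                if (token.startsWith(term)) {
                    return true;
                }
            }
        }
        return false;
    }

    // Phrase prefix: every term but the last must equal consecutive field
    // tokens, and the last term must be a prefix of the token that follows.
    static boolean phrasePrefix(String field, String query) {
        List<String> f = tokens(field);
        List<String> q = tokens(query);
        for (int start = 0; start + q.size() <= f.size(); start++) {
            boolean ok = true;
            for (int i = 0; i < q.size() && ok; i++) {
                String token = f.get(start + i);
                String term = q.get(i);
                ok = (i == q.size() - 1) ? token.startsWith(term) : token.equals(term);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String field = "trying out elasticsearch";
        System.out.println(phrasePrefix(field, "trying out ela")); // true
        System.out.println(phrasePrefix(field, "try out ela"));    // false: "try" is not a whole token
        System.out.println(anyTermPrefix(field, "try out ela"));   // true: "try" prefixes "trying"
    }
}
```

In this model a partially typed first word only matches in the per-term mode, which is the trade-off between the two approaches for a search bar.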

Cheers,
Luca


Hi Lucas,

Thanks for clearing up my confusion with your helpful example - I'd
forgotten to look at using the "type" property. Good point about the
auto-complete queries - I think it does make sense for a "type ahead"
situation, but I'll probably still look at using prefix matching on each
term for full-fledged search results. Like I said, I'm interested to hear
about what's been successful in other people's experience and why.


And I should say Luca! Sorry for the typo.
