Dealing with search queries such as "Portland, OR"

Hey all,

Does anyone have any advice on how to deal with search queries such as
"Portland, OR"? When doing a query string based search, the parsing fails
(see the error message at https://gist.github.com/1027237).

I could use a regex such as \bOR\W*$ and replace it with a lowercase "or",
but that seems kind of clunky and I'm sure that regex will miss some cases.
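
For reference, the regex workaround described above could be sketched in Python roughly as follows. This is only an illustration of the idea, not a recommended fix: the function name is made up, and unlike a plain replacement with "or" it keeps any trailing punctuation. It only handles a trailing OR, so other operator words are left alone.

```python
import re

# The regex quoted above, with the trailing non-word characters captured
# so they can be put back after the replacement.
TRAILING_OR = re.compile(r"\bOR(\W*)$")

def lowercase_trailing_or(query: str) -> str:
    """Lower-case a trailing OR so the parser does not read it as an operator."""
    return TRAILING_OR.sub(lambda m: "or" + m.group(1), query)

print(lowercase_trailing_or("Portland, OR"))
print(lowercase_trailing_or("cats OR dogs"))  # OR is not trailing, so no change
```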

Has anyone solved this kind of issue before, or have better solutions?

Thanks as always,

Adam

Use text queries in this case. They are simpler than a query_string, but they won't fail on input like this.
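
As a rough sketch of what that might look like, here is a small Python helper that builds the request body for a text query. The field name "location" and the helper name are assumptions for illustration only. The "and" operator is set because the original setup used a default operator of AND. (In later Elasticsearch releases this query type is called match.)

```python
import json

def text_query(field: str, user_input: str, operator: str = "and") -> dict:
    """Build a search body that passes raw user input to a text query.

    The input is analyzed rather than parsed, so words such as OR are
    treated as ordinary text instead of query syntax.
    """
    return {"query": {"text": {field: {"query": user_input, "operator": operator}}}}

# "location" is a hypothetical field name.
body = text_query("location", "Portland, OR")
print(json.dumps(body, indent=2))
```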

Shay, thank you so much for such a quick response... This does indeed look
much more suited to the "handling a search box" use case. Does it deal with
quotes? For example: "Bob Jones" Smith

Thanks again,

Adam

Sorry, one more question - how should I search across multiple fields? I use
the "fields" option of the query string currently with a default operator of
AND...

Adam

You can combine text queries on several fields using a bool query, for example (or dis_max). A text query simply analyzes the input, so quotes are handled by whatever the analyzer does when it breaks the text down into tokens.
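
A minimal sketch of that combination, assuming hypothetical field names and a made-up helper: one text query is built per field from the whole input, and the per-field queries are then wrapped in either a bool query (as should clauses) or a dis_max query.

```python
import json

def multi_field_text_query(fields, user_input, use_dis_max=False):
    """Run the same text query against several fields and combine the results."""
    # One complete text query per field, each receiving the whole user input.
    per_field = [
        {"text": {field: {"query": user_input, "operator": "and"}}}
        for field in fields
    ]
    if use_dis_max:
        # dis_max scores a document by its best-matching field.
        combined = {"dis_max": {"queries": per_field}}
    else:
        # bool/should lets a match in any of the fields contribute.
        combined = {"bool": {"should": per_field}}
    return {"query": combined}

# "name" and "description" are hypothetical field names.
print(json.dumps(multi_field_text_query(["name", "description"], "Bob Jones Smith"), indent=2))
```

Note that with this shape every term still has to match within a single field when the operator is "and"; it does not cover the case where each term lives in a different field.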

Hi Shay, thank you so much for getting back to me so quickly (again).

I am struggling to come up with the right query to search across fields
where each term is in a different field (the albino elephant example). From
the research I've done, it looks like I'll have to split the terms up into
separate dis_max text queries and then combine them in a boolean query. See
Search Technology & Search Platform | Lucidworks for my
source for that. That seems like a lot of work for something that is
supposed to make things more simple...
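
To make that approach concrete, here is a rough Python sketch of it. It is only an illustration: the helper and field names are hypothetical, and the input is split on whitespace, which is a simplification and not what the field's analyzer would do. Each term becomes a dis_max over the fields, and a bool query requires every term to match somewhere.

```python
import json

def per_term_query(fields, user_input):
    """Require every term to match in at least one of the given fields."""
    # Assumption: whitespace splitting stands in for real analysis.
    terms = user_input.split()
    must = [
        # For one term, take the best match across all of the fields.
        {"dis_max": {"queries": [{"text": {field: term}} for field in fields]}}
        for term in terms
    ]
    # Every per-term clause must match for a document to be returned.
    return {"query": {"bool": {"must": must}}}

# "color" and "animal" are hypothetical field names.
print(json.dumps(per_term_query(["color", "animal"], "albino elephant"), indent=2))
```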

At first I assumed that text was just a variant of query_string that does
less parsing, but it appears it isn't as robust when it comes to multiple
fields? Is that correct? Or is it somehow an equivalent of
Solr's DisMaxQParser?

Thanks as always, I intend to enhance the docs with the result of this
thread.

Adam

Yes, text has different semantics compared to query_string. You can do dis_max, but only between "full" generated text queries. You can use a bool query to combine them as well, but again, only at the whole-query level.

Sorry, I don't really understand your response. :( I only spend 10% of my
time in the search world...

Can you clarify what you mean by "full" generated text queries? Can you
provide examples? I know this is a time suck, and I am happy to help improve
the docs where I can, but I can't do that without more understanding.

Staying solution focussed, I've created a gist
("ElasticSearch: query_string vs text queries" on GitHub) that demonstrates the kind of search I want
to do with a "text" query. I expect this is a major use case and was simple
to implement with query_string. I don't expect anyone to solve the problem
for me, but I would like some pointers on where to begin... For example, if
I need to analyze the query to create a matrix of boolean and DisMax
queries, is there any ES code that does that already? Or is that even
necessary?

Thanks so much for ES and your support,

Adam
