Nested Span Near Queries Give Results That Make No Sense


(Michael Sander) #1

I am running some complicated span near searches and the results do not
make much sense.

Take the following toy example:

{'span_near': {
    'clauses': [
        {'span_term': {'text': 'foo'}},
        {'span_near': {
            'clauses': [
                {'span_term': {'text': 'biz'}},
                {'span_term': {'text': 'buz'}}],
            'in_order': False,
            'slop': 0}}],
    'in_order': False,
    'slop': 0}}

I would expect that this search would return documents where foo, biz,
and buz were directly next to each other. I would expect it to match:

foo biz buz

I would not expect it to match:

foo biz and biz buz

However, nested span_near queries seem to match both documents. I have a
feeling this is an issue with Lucene rather than ES, but does anyone know
whether this was done by design? It seems like an entirely
counter-intuitive result.
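
To make the expectation above concrete, here is a minimal sketch in plain Python of the matching rule I have in mind. It is not how Lucene or Elasticsearch evaluates spans; it only models "biz and buz adjacent in either order, and foo adjacent to that pair in either order". The function names and the lowercase whitespace tokenization are assumptions made for illustration.

```python
def inner_spans(tokens, a="biz", b="buz"):
    """Return (start, end) position pairs where a and b sit side by side, in either order."""
    return [
        (i, i + 2)
        for i in range(len(tokens) - 1)
        if {tokens[i], tokens[i + 1]} == {a, b}
    ]


def expected_match(text, outer="foo"):
    """True if `outer` is directly before or directly after an inner biz/buz span."""
    # Assumption: simple lowercase whitespace tokenization, no stop word removal.
    tokens = text.lower().split()
    spans = inner_spans(tokens)
    for position, token in enumerate(tokens):
        if token != outer:
            continue
        for start, end in spans:
            # slop 0, unordered: outer ends where the span starts, or starts where it ends
            if position + 1 == start or position == end:
                return True
    return False


if __name__ == "__main__":
    for text in ("foo biz buz", "foo biz and biz buz"):
        print(text, "->", expected_match(text))
```

Under this model the first document matches and the second does not, which is the behaviour I expected from the nested query.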

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/88009474-4517-4ec6-a53d-ee742a5372fa%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Alexander Reelsen) #2

Hey,

Can you provide a complete example, including indexing and mapping? In a
quick test it seems to work as you expect, so there might be differences in
the configuration...

--Alex



(Michael Sander) #3

Thanks for writing back. I'll be happy to get you that information.

I realize that in my example below, the use of "and" in the text may have changed the results because it's a stop word. So maybe try the following (which I believe shouldn't match but does):

foo biz blah biz buz



(Alexander Reelsen) #4

Hey,

An example like the one described at http://www.elasticsearch.org/help would be
great. I still get the desired behaviour. Also, mentioning the Elasticsearch
version and JVM version is always important, so that we don't hunt ghosts from
the past.

--Alex
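
As a starting point for such an example, here is a minimal sketch that only builds the JSON bodies from the query and the test documents already posted in this thread. It makes no calls to Elasticsearch; the helper name `build_bodies` is hypothetical, and the index name, mapping, and exact request paths are left out because they depend on the Elasticsearch version in use.

```python
import json

# Query from the first post; 'text' is the field name used there.
query = {
    "span_near": {
        "clauses": [
            {"span_term": {"text": "foo"}},
            {
                "span_near": {
                    "clauses": [
                        {"span_term": {"text": "biz"}},
                        {"span_term": {"text": "buz"}},
                    ],
                    "in_order": False,
                    "slop": 0,
                }
            },
        ],
        "in_order": False,
        "slop": 0,
    }
}

# Test documents mentioned in the thread.
docs = ["foo biz buz", "foo biz and biz buz", "foo biz blah biz buz"]


def build_bodies():
    """Return (document bodies, search body) as JSON strings."""
    doc_bodies = [json.dumps({"text": doc}) for doc in docs]
    # json.dumps turns Python's False into JSON's false
    search_body = json.dumps({"query": query}, indent=2)
    return doc_bodies, search_body


if __name__ == "__main__":
    doc_bodies, search_body = build_bodies()
    for body in doc_bodies:
        print(body)
    print(search_body)
```

The printed bodies can be pasted into curl commands for indexing and searching, together with the mapping and version details.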



(Michael Sander) #5

Will do.



(system) #6