Phrase query matches across array elements?

I have an index with a list of tags, very similar to the example in the ES
documentation for arrays:
http://www.elasticsearch.org/guide/reference/mapping/array-type.html

Using Elastic Head I can see the tags for a document in my index:
tags: [
"wanna",
"money",
"these",
"thanks"
]

When I perform a query_string search with the text "wanna money", the
document appears in the results. Is this normal behavior?

I'm using ES 0.20 RC1. I'm explicitly setting the mapping of the
field to "string" with a snowball analyzer (I realize snowball is a little
odd for tags, but there's a reason for it).
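
For anyone trying to reproduce this, the setup described above can be sketched roughly as the request bodies below. This is only a sketch under assumptions: the index name and type name are hypothetical (the post does not give them), and it assumes the search text was sent as a quoted phrase against the tags field. Nothing is sent to a cluster; the bodies are only printed as JSON.

```python
import json

# Hypothetical names: the post does not state the index or type name.
INDEX = "myindex"
DOC_TYPE = "doc"

# Mapping as described in the post: a "string" field with the snowball analyzer.
mapping = {
    DOC_TYPE: {
        "properties": {
            "tags": {"type": "string", "analyzer": "snowball"}
        }
    }
}

# The document from the post: each tag is a separate array element.
document = {"tags": ["wanna", "money", "these", "thanks"]}

# Assumption: the search text was sent as a quoted phrase on the tags field.
query = {
    "query": {
        "query_string": {
            "default_field": "tags",
            "query": '"wanna money"',
        }
    }
}

# Print the bodies so they can be pasted into any HTTP client.
for name, body in [("mapping", mapping), ("document", document), ("query", query)]:
    print(name)
    print(json.dumps(body, indent=2))
```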

Thanks,
Anil

--

Not sure I fully understand the issue. Are you looking for a phrase and
don't want this record to show up because words in the phrase are indexed
in different elements of the array?

In reply to Anil Rhemtulla's post of Thursday, December 6, 2012 7:24:28 PM UTC-5.

--

Yes, that's correct. My assumption was that the separate array values would
have large gaps in their offsets to avoid phrase matching. If phrase
matching were desired, the field would be indexed as one big string
instead of an array. Doesn't that make sense? I must be missing something.

In reply to Igor Motov's message of Friday, December 7, 2012 7:57:45 AM UTC-8.

--

No, your assumption is correct: there is a configurable gap between
instances of the same field, but the default value for this gap is 0, so
a phrase can match across adjacent array elements. You can change it in
the field mapping:

"tags" : {
      "type" : "string",
      "position_offset_gap": 100,
      .......  
}
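
To see why the gap matters, here is a toy model of token positions in a multi-valued field. It is not Elasticsearch or Lucene code, just a sketch: the function names are made up for illustration, whitespace splitting stands in for the real analyzer, and the `gap` parameter plays the role of `position_offset_gap`.

```python
def index_positions(values, gap=0):
    """Assign a position to every token of a multi-valued field.

    Tokens within one value get consecutive positions; `gap` extra
    positions are skipped between the last token of one value and the
    first token of the next. Whitespace splitting stands in for the
    real analyzer.
    """
    positions = {}
    pos = 0
    for i, value in enumerate(values):
        if i > 0:
            pos += gap
        for token in value.lower().split():
            positions.setdefault(token, []).append(pos)
            pos += 1
    return positions


def phrase_matches(positions, phrase):
    """True if the phrase tokens occur at consecutive positions (no slop)."""
    tokens = phrase.lower().split()
    if not tokens or any(t not in positions for t in tokens):
        return False
    return any(
        all(start + offset in positions[t] for offset, t in enumerate(tokens))
        for start in positions[tokens[0]]
    )


tags = ["wanna", "money", "these", "thanks"]

# With no gap, "wanna" and "money" sit at adjacent positions, so the phrase matches.
print(phrase_matches(index_positions(tags, gap=0), "wanna money"))    # True
# With a gap of 100, the two tokens are far apart and the phrase does not match.
print(phrase_matches(index_positions(tags, gap=100), "wanna money"))  # False
```

In this model a phrase that lies entirely inside one array element still matches with any gap, since the gap only applies between elements.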

In reply to Anil Rhemtulla's message of Friday, December 7, 2012 11:40:01 AM UTC-5.

--

That worked like a charm, thanks!

I didn't see that option anywhere in the docs (though I did find the git
commit). If you know of a good way to find options like this one, please
share your expertise! Do I need to start poking into the ES code?

Thanks again,
Anil

In reply to Igor Motov's message of Friday, December 7, 2012 9:01:33 AM UTC-8.

--

Yes, you are right, it wasn't documented. I added it.

In reply to Anil Rhemtulla's message of Friday, December 7, 2012 1:40:37 PM UTC-5.

--