Specifying analyzer for _all field


(Runar Myklebust-2) #1

Hi, I'm having a bit of trouble setting the analyzer for the _all field to
"keyword", and "not_analyzed" isn't working either.

In this gist, I have stored the data as shown and executed the query, but I
get no results. If I run the query against the field "data_textfield"
directly, I get the result:

https://gist.github.com/2171055

Regards,

Runar Myklebust


(vineeth mohan) #2

You can set the global analyzer when you create the index. I guess that
should set the behavior of _all as well.

curl -X PUT "localhost:9200/indexName" -d '{
  "settings" : {
    "index" : { "number_of_shards" : 2, "number_of_replicas" : 1 },
    "analysis" : {
      "analyzer" : { "my_analyzer" : { "tokenizer" : "keyword" } }
    }
  }
}'

Also, can you try with
"index_analyzer" : "keyword"
instead of just
"analyzer" : "keyword"

Thanks
Vineeth

On Fri, Mar 23, 2012 at 7:58 PM, Runar Myklebust runar@myklebust.me wrote:

Hi, I'm having a bit of trouble setting the analyzer for the _all field to
"keyword", and "not_analyzed" isn't working either.

In this gist, I have stored the data as shown and executed the query, but I
get no results. If I run the query against the field "data_textfield"
directly, I get the result:

https://gist.github.com/2171055

Regards,

Runar Myklebust


(Shay Banon) #3

You don't want to set the analyzer for _all to keyword. _all is an
aggregation of all the other fields in the doc, so you would basically be
treating the whole aggregated text as a single token.
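
To illustrate the point above, here is a minimal Python sketch (plain Python, not Elasticsearch code) that mimics what a keyword-style analyzer does to an aggregated _all value, compared with a simple whitespace split. The helper names and the sample document are assumptions made up for illustration, and real analyzers do considerably more than this.

```python
def build_all(doc):
    """Concatenate every field value into one string, roughly like _all."""
    return " ".join(str(value) for value in doc.values())


def keyword_tokens(text):
    """Keyword-style analysis: the whole input becomes a single token."""
    return [text]


def whitespace_tokens(text):
    """Whitespace-style analysis: one token per whitespace-separated word."""
    return text.split()


# Hypothetical document; the field names are illustrative only.
doc = {"title": "quick fox", "data_textfield": "part of a sentence"}
all_text = build_all(doc)

# With keyword-style analysis, only the complete aggregated string is a token.
print("part of a sentence" in keyword_tokens(all_text))  # False
print(all_text in keyword_tokens(all_text))              # True
print("sentence" in whitespace_tokens(all_text))         # True
```

So a search for the value of a single field can only hit the single-token version if that value happens to be the entire aggregated text.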


(vineeth mohan) #4

Hello Shay,

A question on this area.
When we enable _all, is a separate copy of all the fields stored?
Or is it just a reference to the other fields?

Thanks
Vineeth

On Sun, Mar 25, 2012 at 5:27 PM, Shay Banon kimchy@gmail.com wrote:

You don't want to set the analyzer for _all to keyword. _all is an
aggregation of all the other fields in the doc, so you would basically be
treating the whole aggregated text as a single token.


(Shay Banon) #5

It's a copy of all the fields, "aggregated" into the _all field.

On Sun, Mar 25, 2012 at 5:06 PM, Vineeth Mohan vineethmohan@algotree.com wrote:

Hello Shay,

A question on this area.
When we enable _all, is a separate copy of all the fields stored?
Or is it just a reference to the other fields?

Thanks
Vineeth


(Runar Myklebust-2) #6

OK, that makes sense. My problem, then, is that our existing solution has a
"contains" search in which the text must match an exact part of a sentence,
e.g.

_all = "part of a sentence"

Where all of these will match:

"This is a part of a sentence"
"bigpart of a sentence-that-is-big"

but this will not match:

"A sentence part this is of"

I can use a text query to check that all the words are present, but then
the ordering requirement of the query is lost. Is there another way I can
achieve this when matching against all fields?
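
The matching rules described above can be pinned down with a small Python sketch. It is plain string handling, not an Elasticsearch query; the function names are made up for illustration, and lowercasing is an assumption. It only shows why an "all words present" check is weaker than an ordered "contains" match.

```python
def contains_match(query, text):
    """Ordered match: the query must appear as one contiguous substring."""
    return query.lower() in text.lower()


def all_words_match(query, text):
    """Unordered match: every query word appears somewhere in the text."""
    words = set(text.lower().split())
    return all(word in words for word in query.lower().split())


query = "part of a sentence"
samples = [
    "This is a part of a sentence",
    "bigpart of a sentence-that-is-big",
    "A sentence part this is of",
]

# Print each sample with the result of the ordered and the unordered check.
for text in samples:
    print(text, contains_match(query, text), all_words_match(query, text))
```

The third sample is the problem case: every word is present, so the unordered check accepts it, while the ordered check rejects it as intended.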

--
Regards,

Runar Myklebust


(Runar Myklebust-2) #7

An update: I solved it by disabling the default _all field and creating a
custom "all" field of type multi_field, with both an analyzed and a
not_analyzed sub-field, and adding data to this field manually.

Regards,

Runar Myklebust
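
For readers wondering what such a setup might look like, here is a rough Python sketch that builds a mapping body of that shape and fills the custom field by hand. The type name "doc", the field name "custom_all", and the sub-field name "untouched" are assumptions and are not taken from Runar's actual mapping. The multi_field syntax shown is the one from Elasticsearch releases of that period, so check it against the version you run.

```python
import json

# Hypothetical mapping: _all disabled, plus a hand-filled multi_field.
mapping = {
    "doc": {
        "_all": {"enabled": False},
        "properties": {
            "custom_all": {
                "type": "multi_field",
                "fields": {
                    # Default sub-field: analyzed for ordinary full-text search.
                    "custom_all": {"type": "string", "index": "analyzed"},
                    # Second sub-field: indexed verbatim as a single term.
                    "untouched": {"type": "string", "index": "not_analyzed"},
                },
            },
        },
    }
}


def with_custom_all(source):
    """Return a copy of the document with every value copied into custom_all."""
    enriched = dict(source)
    enriched["custom_all"] = [str(value) for value in source.values()]
    return enriched


print(json.dumps(mapping, indent=2))
print(json.dumps(with_custom_all({"data_textfield": "part of a sentence"})))
```

Copying the values in application code, as in with_custom_all, is one way to read "adding data manually"; the thread does not say how it was actually done.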

