Term facets - getting results after the field is split on whitespace

vineeth_mohan · November 30, 2011, 1:49pm

Hi,
If I run a terms facet on a particular field, the terms I get back are the
result of splitting the field on spaces.
For example:

If the field values are "Mr King", "King Kong", "Mr CEO", "CEO OF", "Mr King"
and I run this terms facet on them:

{
  "facets": {
    "Categories": {
      "terms": {
        "field": "Name",
        "size": 10
      }
    }
  }
}

I am getting this result:

"Mr" - 3
"King" - 3
"Kong" - 1
"CEO" - 2
"OF" - 1

What I expected was:

"Mr King" - 2
"King Kong" - 1
"Mr CEO" - 1

and so on.

Is this the right behavior?
If it is, how can I get the desired output instead?

Thanks
Vineeth

Clinton_Gormley · November 30, 2011, 1:58pm

On Wed, 2011-11-30 at 19:19 +0530, Vineeth Mohan wrote:

> Hi,
> If I run a terms facet on a particular field, the terms I get back are the
> result of splitting the field on spaces.

This is correct: your field is being analyzed, so what is indexed is the
result of that analysis (i.e. 'mr', 'king', 'kong', etc.)
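
To make the difference concrete, here is a minimal Python sketch that imitates what the terms facet counts in each case. It does not talk to Elasticsearch at all. The `analyze` helper is a hypothetical stand-in (lowercase, then split on whitespace); a real analyzer does more than that, so treat this as an illustration only.

```python
from collections import Counter

# Field values from the original post.
names = ["Mr King", "King Kong", "Mr CEO", "CEO OF", "Mr King"]


def analyze(value):
    """Hypothetical stand-in for an analyzer: lowercase, split on whitespace."""
    return value.lower().split()


def facet_counts(values, analyzed):
    """Count, for each term, how many documents contain it."""
    counts = Counter()
    for value in values:
        # An analyzed field is indexed as several terms; a not_analyzed
        # field is indexed as a single term holding the whole value.
        terms = set(analyze(value)) if analyzed else {value}
        counts.update(terms)
    return counts


analyzed_counts = facet_counts(names, analyzed=True)
exact_counts = facet_counts(names, analyzed=False)

print(analyzed_counts)  # per-token counts, like the result in the question
print(exact_counts)     # whole-value counts, like the expected result
```

With the sample values, the analyzed variant produces the per-token counts from the question, and the not_analyzed variant produces one count per whole value.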

If you want the original phrase to be preserved, then you should map
that field to have {"index": "not_analyzed"}

However, that has implications for searching too, because you then
wouldn't be able to search for "king".

If you want to be able to do both, then you should use multi-fields,
with one sub-field analyzed (for searching), and one sub-field not
analyzed (for facets)
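
A sketch of what such a multi-field mapping could look like, written as a Python dict and printed as the JSON body for a put-mapping request. It assumes the legacy `multi_field` type available in Elasticsearch releases of that era. The type name `person` and the sub-field name `untouched` are made up for illustration and are not from this thread.

```python
import json

# Hypothetical type name "person" and sub-field name "untouched".
mapping = {
    "person": {
        "properties": {
            "Name": {
                "type": "multi_field",
                "fields": {
                    # Default sub-field: analyzed, used for searching.
                    "Name": {"type": "string", "index": "analyzed"},
                    # Extra sub-field: kept as one term, used for facets.
                    "untouched": {"type": "string", "index": "not_analyzed"},
                },
            }
        }
    }
}

# The facet then targets the not_analyzed sub-field instead of "Name".
facet_request = {
    "facets": {
        "Categories": {"terms": {"field": "Name.untouched", "size": 10}}
    }
}

print(json.dumps(mapping, indent=2))
print(json.dumps(facet_request, indent=2))
```

Searches keep using `Name`, while the facet reads `Name.untouched`, so neither use case gives up anything.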

clint

vineeth_mohan · November 30, 2011, 2:07pm

I have a field whose key name is not known beforehand.

The schema looks like this:

Entities : { name : vm , vm : [ {name : abc } , {name : bcd}]}

Here the key vm is not known beforehand, and I want to use Entities.X.name as
the faceting field.
How do I set index: not_analyzed for name alone?

Thanks
Vineeth

vineeth_mohan · November 30, 2011, 2:14pm

So the question is really: how do I change the default of index to
not_analyzed for fields that are not specified in the schema?

Thanks
Vineeth

Clinton_Gormley · November 30, 2011, 2:49pm

On Wed, 2011-11-30 at 19:44 +0530, Vineeth Mohan wrote:

> So the question is really: how do I change the default of index to
> not_analyzed for fields that are not specified in the schema?

Have a look at Dynamic Mapping in the Elasticsearch documentation.

clint

vineeth_mohan · November 30, 2011, 3:22pm

OK, I have created the following file:

XYZ@XYZ:~/elasticSearch$ cat config/default-mapping.json
{
  "default" : {
    "index" : "not_analyzed"
  }
}

But it's not helping.

Is there something I have missed?

Thanks
Vineeth

vineeth_mohan · December 1, 2011, 6:28am

In the end I used a dynamic template mapping and got it working.

Once again Clint saved my day.

Thanks
Vineeth
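
The thread does not show the mapping that was finally used. As a rough sketch of the idea only, a dynamic template along the following lines would map any newly seen string field whose path matches `Entities.*.name` as not_analyzed. The type name `doc`, the template name `names_not_analyzed`, and the choice of `path_match` are assumptions for illustration; check the dynamic template options your Elasticsearch version supports before relying on them.

```python
import json

# Hypothetical type name "doc" and template name "names_not_analyzed".
template_mapping = {
    "doc": {
        "dynamic_templates": [
            {
                "names_not_analyzed": {
                    # Match on the full dotted path, so the unknown
                    # middle key (vm in the example) is covered by "*".
                    "path_match": "Entities.*.name",
                    # Only apply to values detected as strings.
                    "match_mapping_type": "string",
                    # Mapping applied to every field that matches.
                    "mapping": {"type": "string", "index": "not_analyzed"},
                }
            }
        ]
    }
}

print(json.dumps(template_mapping, indent=2))
```

Because the template is applied when a field is first seen, it has to be in place before documents containing those fields are indexed.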

Clinton_Gormley · December 1, 2011, 9:51am

On Thu, 2011-12-01 at 11:58 +0530, Vineeth Mohan wrote:

> In the end I used a dynamic template mapping and got it working.
>
> Once again Clint saved my day.

Perhaps I should do the Elasticsearch thing and take a new super-hero
name every morning

glad it helped

clint
