Terms aggregation buckets return only single words, not phrases: the text is truncated after the first space

Hi,

I have implemented a terms aggregation in one of my Elasticsearch modules. It
works well, but the buckets contain only single words: each value is
truncated after the first space.

Any idea?

Here is my schema, sample query and return:

Schema:
community/Ideas/_mapping?pretty
{
  "community" : {
    "mappings" : {
      "Ideas" : {
        "properties" : {
          "body" : {
            "type" : "string"
          },
          "categories" : {
            "type" : "string"
          }
        }
      }
    }
  }
}

Sample Query:

POST /community/_search
{
  "aggregations": {
    "Category": {
      "terms": {
        "field": "categories",
        "min_doc_count": 1,
        "size": 0
      }
    }
  },
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "_all": {
              "query": "ping search"
            }
          }
        }
      ]
    }
  }
}

Return:
"aggregations": {
  "Category": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "administration",
        "doc_count": 166
      },
      {
        "key": "pingfederate",
        "doc_count": 132
      }
    ]
  }
}

The "administration" key should be "Administration in domain", but the value is
somehow truncated after the first word. It looks fine in the _source of the hits.

Please help.

  • Vishal

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/9652d1d2-39eb-44d3-9b88-b2444801279a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Hi,

When you put string data into Elasticsearch, it is tokenized before being
indexed, so a terms aggregation on that field sees the individual tokens
rather than the original value. You need to let Elasticsearch know that the
field values should not be tokenized by specifying "index": "not_analyzed"
in the mapping of your "categories" field.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#string
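
To make the effect concrete, here is a minimal Python sketch, not an official client example. The `mapping` dict mirrors the mapping from the question with only the "index": "not_analyzed" setting added to "categories". The helper functions `analyzed_terms` and `bucket_keys` and the sample `docs` list are hypothetical: they only approximate how a default analyzer splits and lowercases text, and are not Elasticsearch code.

```python
import json
import re
from collections import Counter

# Mapping body for the "Ideas" type, as in the question, with the one
# change suggested above: "categories" is indexed without analysis.
mapping = {
    "Ideas": {
        "properties": {
            "body": {"type": "string"},
            "categories": {"type": "string", "index": "not_analyzed"},
        }
    }
}


def analyzed_terms(value):
    # Rough stand-in for a default analyzer (an assumption, not the real
    # implementation): lowercase, then split on non-word characters.
    return [t for t in re.split(r"\W+", value.lower()) if t]


def bucket_keys(values, analyzed):
    # Count documents per term, the way a terms aggregation counts
    # documents per indexed term of the field.
    counts = Counter()
    for value in values:
        terms = set(analyzed_terms(value)) if analyzed else {value}
        counts.update(terms)
    return counts


# Made-up sample values for the "categories" field.
docs = ["Administration in domain", "Administration in domain", "PingFederate"]

print(json.dumps(mapping, indent=2))
print(bucket_keys(docs, analyzed=True))   # one bucket per lowercase token
print(bucket_keys(docs, analyzed=False))  # one bucket per whole value
```

Note that the "index" setting of an existing field generally cannot be changed in place, so applying such a mapping usually means creating a new index with it and reindexing the documents.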

(In reply to Vishal Sharma's message of Mon, Nov 24, 2014 at 7:50 AM.)

--
Adrien Grand


Thanks a lot. That helps!

Vishal Sharma
TL, SFDC
T: +1 650 288 6711
E: vishals@grazitti.com alok@grazitti.com
www.grazitti.com

(In reply to Adrien Grand's message of Mon, Nov 24, 2014 at 2:22 PM.)
