Aggregation question


#1

I am using aggregations to produce reports of item counts grouped by
various properties (location, market, or both).

I am using a terms aggregation.

A simplified example of my data looks like this:

{
  "intentLocationCode": "SHANG",
  "intentLocationDescription": "Shanghai area",
  "intentMarketDescription": "China Stainless Steel Exchange",
  "intentMarketCode": "CSSX"
}

Let's say I want to aggregate by intent location.

My aggregation looks like this:
{
  "aggregations" : {
    "intentLocations" : {
      "terms" : { "field" : "intentLocationCode" }
    }
  }
}

And the result looks something like this:
{
  "aggregations": {
    "intentLocations": {
      "buckets": [
        {
          "key": "shang",
          "doc_count": 12
        },
        {
          "key": "anotherlocation",
          "doc_count": 8760
        },
        {
          "key": "loc42",
          "doc_count": 4773
        },
        {
          "key": "area51",
          "doc_count": 821
        }
      ]
    }
  }
}

However, in the results I would like something like:
{
  "aggregations": {
    "intentLocations": {
      "buckets": [
        {
          "key": "Shanghai area",
          "doc_count": 12
        },
        {
          "key": "Another Location Where Copper Is Stored",
          "doc_count": 8760
        },
        {
          "key": "The 42nd Stainless Steel Storage Company",
          "doc_count": 4773
        },
        {
          "key": "Area 51",
          "doc_count": 821
        }
      ]
    }
  }
}

That is, I really want the value of the intentLocationDescription field as
the key rather than the code. But a terms aggregation on the description is
going to give me very different results, because the field is analysed
(unless I index the description as not_analyzed).
However, I do want intentLocationDescription to be analysed for decent
search behaviour.
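
To make the "very different results" concrete, here is a rough sketch in Python. It does not call Elasticsearch. The rough_analyze function is a hypothetical, crude stand-in for an analyzer (lowercase, split on non-alphanumeric characters), and the three descriptions are sample values, so treat it only as an illustration of why an analysed field yields per-token buckets.

```python
import re
from collections import Counter


def rough_analyze(text):
    # Hypothetical approximation of an analyzer: lowercase the text and
    # split it on anything that is not a letter or digit.
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


# Sample description values, one per document (illustrative only).
docs = ["Shanghai area", "Shanghai area", "Area 51"]

# Analysed field: each document is counted once per distinct token,
# so the buckets are individual words, not whole descriptions.
analyzed_buckets = Counter()
for description in docs:
    analyzed_buckets.update(set(rough_analyze(description)))

# not_analyzed field: the whole value is a single term, so the buckets
# are the original descriptions.
not_analyzed_buckets = Counter(docs)

print(sorted(analyzed_buckets.items()))
print(sorted(not_analyzed_buckets.items()))
```

With the analysed field, "area" becomes its own bucket shared by unrelated locations, which is not what the report needs.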

Is there a trick to achieve this with aggregations?
Or do I have to index intentLocationDescription twice (analysed and not
analysed)?
(I don't really want to do any post-processing to match code and
description, because that would involve a reference data lookup that would
kill performance.)

Cheers.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/965c337e-3b12-4603-880d-acf9a1860ed7%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Binh Ly) #2

Yes, the correct way is to index intentLocationDescription as a
multi-field. You don't have to add a second field to your source document.
All you need to do is set that field up as a multi-field in the ES mapping:
once with whatever analyzer you want, and once as not_analyzed. You can see
an example here:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#_multi_fields_3

In that example, 2 fields in the index are derived from 1 single field in
the JSON source. The "name" field is analyzed, and the "name.raw" field is
not_analyzed, which is the one you want to aggregate on.
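
Applied to the field from the question, the mapping and aggregation might look roughly like the sketch below. It only builds the request bodies as Python dicts and prints them as JSON. The mapping type name "intent" and the sub-field name "raw" are assumptions chosen for illustration, and the "string" / "not_analyzed" syntax assumes the Elasticsearch 1.x mapping format current when this thread was written.

```python
import json

# Hypothetical mapping type name; use whatever your index actually has.
MAPPING_TYPE = "intent"

# One source field, two indexed fields: the main field is analysed for
# search, and the "raw" sub-field keeps the whole value as a single term.
mapping = {
    MAPPING_TYPE: {
        "properties": {
            "intentLocationDescription": {
                "type": "string",
                "fields": {
                    "raw": {"type": "string", "index": "not_analyzed"}
                },
            }
        }
    }
}

# Same aggregation as in the question, but pointed at the raw sub-field
# so that bucket keys are full descriptions.
aggregation = {
    "aggregations": {
        "intentLocations": {
            "terms": {"field": "intentLocationDescription.raw"}
        }
    }
}

print(json.dumps(mapping, indent=2))
print(json.dumps(aggregation, indent=2))
```

Searches would keep using intentLocationDescription, while the terms aggregation uses intentLocationDescription.raw. The source documents do not change.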



#3

Excellent. Thanks!

In reply to Binh Ly's post of Tuesday, 18 February 2014 15:28:32 UTC.


