Hello,
My general problem is simple: I would like to build a kind of OLAP cube using
Elasticsearch.
For that, I need to aggregate some values from my documents to obtain, for
example, the data to draw a histogram or a pie chart.
When I do that over all of my documents, it is a bit slow.
I would like to know whether aggregating before indexing could be a good way
to improve performance (fewer documents could lead to a performance
improvement).
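To illustrate what I mean by aggregating before indexing, here is a minimal Python sketch. The raw record layout (one record per travel, with a "time" field) and the function name pre_aggregate are assumptions made up for this example; my real input looks different.

```python
from collections import Counter, defaultdict

# Hypothetical raw records: one record per travel, before any aggregation.
raw_travels = [
    {"Station": "401", "TravelDate": "2012-05-29", "Mode": "Bus", "time": "05:12:40"},
    {"Station": "401", "TravelDate": "2012-05-29", "Mode": "Bus", "time": "05:48:03"},
    {"Station": "401", "TravelDate": "2012-05-29", "Mode": "Bus", "time": "06:05:10"},
]


def pre_aggregate(travels):
    """Group raw travels by (station, date, mode) and count them per hour."""
    counts = defaultdict(Counter)
    for travel in travels:
        key = (travel["Station"], travel["TravelDate"], travel["Mode"])
        hour = travel["time"][:2] + ":00:00"  # truncate the time to the hour
        counts[key][hour] += 1

    docs = []
    for (station, date, mode), per_hour in sorted(counts.items()):
        docs.append({
            "CI": {"Station": station},
            "TravelDate": date,
            "Mode": mode,
            "TravelsByHour": [
                {"hour": hour, "count": count}
                for hour, count in sorted(per_hour.items())
            ],
        })
    return docs


# Three raw records collapse into a single document to index.
aggregated_docs = pre_aggregate(raw_travels)
```

Each resulting document would then be indexed in place of the raw records it summarizes.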
To do that, I looked into array and nested fields.
My main problem is obtaining an aggregated value from the nested data.
Here is an example of the data:
{
  "TravelsByHour": [
    {"hour": "05:00:00", "count": 2},
    {"hour": "06:00:00", "count": 7},
    {"hour": "07:00:00", "count": 3},
    {"hour": "08:00:00", "count": 1},
    {"hour": "13:00:00", "count": 1},
    {"hour": "14:00:00", "count": 3},
    {"hour": "16:00:00", "count": 1},
    {"hour": "17:00:00", "count": 1}
  ],
  "CI": {
    "Station": "401",
    "Name": "Hello",
    "Geo": {
      "lat": 61.5354531,
      "lon": 92.161561
    }
  },
  "TravelDate": "2012-05-29",
  "Mode": "Bus"
}
This is only an example; an entry could just as well be {"country" : "US", "count": 13} or something
else.
The idea is to run a facet on my index to obtain the aggregated values of my
array, and I can't manage to find the proper facet.
I thought that the histogram facet was what I needed.
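To make the result I am after concrete, here is a rough Python sketch that computes it on the client side: the sum of "count" per "hour" across all documents. The second document and the function name sum_counts_by_hour are invented for illustration; this only shows the expected output, not how Elasticsearch would compute it.

```python
from collections import Counter


def sum_counts_by_hour(docs):
    """Sum the 'count' of every TravelsByHour entry, grouped by 'hour'."""
    totals = Counter()
    for doc in docs:
        for entry in doc.get("TravelsByHour", []):
            totals[entry["hour"]] += entry["count"]
    return dict(sorted(totals.items()))


docs = [
    # First entries of the example document above.
    {"TravelsByHour": [{"hour": "05:00:00", "count": 2},
                       {"hour": "06:00:00", "count": 7}]},
    # Hypothetical second document, added only for this sketch.
    {"TravelsByHour": [{"hour": "06:00:00", "count": 1}]},
]

totals_by_hour = sum_counts_by_hour(docs)
# totals_by_hour == {"05:00:00": 2, "06:00:00": 8}
```

The same shape of result would apply to any other key, such as country, with "count" summed per key.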
My query is as follows:
{
  "query": {"match_all": {}},
  "facets": {
    "histo": {
      "histogram": {
        "key_field": "TravelsByHour.hour",
        "value_field": "TravelsByHour.count",
        "interval": 1
      }
    }
  }
}
But it doesn't work as I wanted.
I get a cast error that seems to come from a mapping problem, which I am trying
to resolve.
][DEBUG][action.search.type ] [Frost, Cordelia] [eborder][0], node[fI3qDc_6Txa48ccT6xAU5A], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@5e894fec]
org.elasticsearch.transport.RemoteTransportException: [Centurius][inet[/199.0.12.126:9300]][search/phase/query]
Caused by: org.elasticsearch.search.SearchParseException: [eborder][0]: query[ConstantScore(:)],from[-1],size[-1]: Parse Failure [Failed to parse source [{"query":{"match_all":{}},"facets":{"histo":{"histogram":{"key_field":"TravelsByHour.hour","value_field":"TravelsByHour.count","interval":1}}}}]]
    at org.elasticsearch.search.SearchService.parseSource(SearchService.java:573)
    at org.elasticsearch.search.SearchService.createContext(SearchService.java:484)
    at org.elasticsearch.search.SearchService.createContext(SearchService.java:469)
    at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:462)
    at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:234)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:529)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:518)
    at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:265)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
Caused by: java.lang.ClassCastException: org.elasticsearch.index.fielddata.plain.PagedBytesIndexFieldData cannot be cast to org.elasticsearch.index.fielddata.IndexNumericFieldData
    at org.elasticsearch.search.facet.histogram.HistogramFacetParser.parse(HistogramFacetParser.java:121)
    at org.elasticsearch.search.facet.FacetParseElement.parse(FacetParseElement.java:92)
    at org.elasticsearch.search.SearchService.parseSource(SearchService.java:561)
    ... 10 more
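The ClassCastException is raised in HistogramFacetParser and complains that bytes field data cannot be cast to numeric field data, so my working assumption is that "hour" is held as a string while the histogram facet expects a numeric field. One thing I could try, sketched below in Python, is to turn the hour into an integer before indexing. The helper names hour_to_int and with_numeric_hours are made up for this example, and I have not verified that this is enough to make the facet work.

```python
def hour_to_int(hour_string):
    """Convert an 'HH:MM:SS' string to the integer hour, e.g. '05:00:00' to 5."""
    return int(hour_string.split(":")[0])


def with_numeric_hours(doc):
    """Return a copy of the document whose TravelsByHour hours are integers."""
    converted = dict(doc)  # shallow copy; the original document is left untouched
    converted["TravelsByHour"] = [
        {"hour": hour_to_int(entry["hour"]), "count": entry["count"]}
        for entry in doc["TravelsByHour"]
    ]
    return converted


example_doc = {
    "TravelsByHour": [
        {"hour": "05:00:00", "count": 2},
        {"hour": "17:00:00", "count": 1},
    ],
    "Mode": "Bus",
}

numeric_doc = with_numeric_hours(example_doc)
```

This would only help for numeric-like keys such as hour; for keys like country or gender the values stay strings, which is part of why I am asking about the right facet.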
That is not really the point, though. Here are my questions:
Is the histogram facet the right facet? (Again, hour is only an example; it could be
country, age, gender...)
Is nested data a good idea?