Facet search over a field containing a hyphen/dash kills Elasticsearch with big data volumes, but not with small ones

Elasticsearch 0.20.1, 2 cluster nodes, 12 shards for the index, 134,000,000
documents, data size of 74 GB.

We have added a couple of fields called x_y1, x_y2..x_y6 to our mappings,
where y1-y6 may or may not contain a hyphen/dash.
The data for these fields is either an empty list or a list of dates in the
format "YYYY-MM-DD"; the type is date and the format is
dateOptionalTime.
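
For reference, the relevant part of the mapping presumably looks something
like this (a sketch only; the concrete field names are taken from later in
the thread, and the index/type wrapper is omitted):

```json
{
  "properties": {
    "histogram_new":       { "type": "date", "format": "dateOptionalTime" },
    "histogram_price-up":  { "type": "date", "format": "dateOptionalTime" },
    "histogram_stock-out": { "type": "date", "format": "dateOptionalTime" }
  }
}
```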

We noticed strange behaviour when querying our in-house API for
facet counts over these fields. Basically it kills performance, and the
cluster logs spit out
java.lang.OutOfMemoryError: loading field [histogram_stock-out] caused out
of memory failure
and a long stack trace [1].

We have full test coverage over the features that need the facet counts,
and they work in development with a small number of test documents (10
documents), but in production with the large data set something goes wrong.

This is also reproducible via elasticsearch-head, without any need to go
through our API, thus eliminating the API as the culprit:

  • Go to es-host:9200/_plugin/head/
  • Click Browser in the top menu.
  • Notice the little ? next to some fields on the left-hand side; it marks
    date fields which, when expanded with the little arrow, show a facet
    histogram over that field.
  • Click the arrow on histogram_new: a pretty graph appears. The query from
    head to ES is attached under [2].
  • Click the arrow on histogram_price-up or another field with a hyphen and
    watch Elasticsearch die.
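
The failing request that head sends for the dashed field should have the
same shape as the working one attached under [2], just with the other field
name (a reconstruction based on the steps above, not a captured request):

```json
{
  "size": 0,
  "facets": {
    "f-2": {
      "date_histogram": {
        "field": "histogram_price-up",
        "interval": "month",
        "global": true
      }
    }
  }
}
```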

[1]
[16:17:30,764][WARN ][transport.netty] [War Eagle] exception caught on transport layer [[id: 0x738c7889, /PRIVATE-IP:50260 :> /ANOTHER-PRIVATE-IP:9300]], closing connection
java.lang.OutOfMemoryError: loading field [histogram_stock-out] caused out of memory failure
    at org.elasticsearch.index.cache.field.data.support.AbstractConcurrentMapFieldDataCache.cache(AbstractConcurrentMapFieldDataCache.java:138)
    at org.elasticsearch.search.facet.datehistogram.ValueScriptDateHistogramFacetCollector.doSetNextReader(ValueScriptDateHistogramFacetCollector.java:98)
    at org.elasticsearch.search.facet.AbstractFacetCollector.setNextReader(AbstractFacetCollector.java:81)
    at org.elasticsearch.common.lucene.MultiCollector.setNextReader(MultiCollector.java:67)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:576)
    at org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:195)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:445)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:426)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:342)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:330)
    at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:178)
    at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:242)
    at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:141)
    at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:205)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java:281)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$3.onFailure(TransportSearchTypeAction.java:213)
    at org.elasticsearch.search.action.SearchServiceTransportAction$2.handleException(SearchServiceTransportAction.java:161)
    at org.elasticsearch.transport.netty.MessageChannelHandler.handleException(MessageChannelHandler.java:182)
    at org.elasticsearch.transport.netty.MessageChannelHandler.handlerResponseError(MessageChannelHandler.java:173)
    at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:125)
    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:558)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:786)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:458)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:439)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:558)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:553)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:84)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:471)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:332)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:35)
    at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:102)
    at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)
Caused by: java.lang.OutOfMemoryError: Java heap space

[2]
{
  "fields": ["_parent", "_source"],
  "query": {
    "bool": { "must": [], "must_not": [], "should": [{ "match_all": {} }] }
  },
  "from": 0,
  "size": 0,
  "sort": [],
  "facets": {
    "f-2": {
      "date_histogram": {
        "field": "histogram_discontinued",
        "interval": "month",
        "global": true
      }
    }
  },
  "version": true
}
(Form data taken from the Network tab in the Chrome inspector.)

--

Some more info would be ideal. How much memory per node? How much is
allocated to the JVM? With 2 nodes, I am assuming the number of replicas is
1, which means each node has the full index.

Are these fields multi-valued? You mentioned that the fields are lists of
dates. Can a document have more than one x_y1? If so, can this number vary
greatly per document? E.g. document A has 2 x_y1 fields, but document B has
30.

I doubt that hyphens and dashes are causing issues. Can you enforce that
the dates are dates and not datetimes? Less granularity will help with memory.

Cheers,

Ivan

On Mon, Jan 14, 2013 at 11:59 AM, Simon Johansson simon@editd.com wrote:

at org.elasticsearch.index.cache.field.data.support.AbstractConcurrentMapFieldDataCache.cache(AbstractConcurrentMapFieldDataCache.java:138)

--

Hi there Ivan.

How much memory per node?
24 GB per node, with Xms/Xmx set to 12 GB.

which means each node has the full index.
Correct.

Are these fields multi-valued?
No.

Can a document have more than 1 x_y1? If so, can this number vary
greatly per document? Eg: document A has 2 x_y1 fields, but document B has 30.

Each document may or may not have the x_y{1-6} fields. If a document has
them, it has all of them, i.e. 1-6, but only one occurrence of each. Instead
of having 2 x_y1 fields it would be document['x_y1'] = ["YYYY-MM-DD",
"YYYY-MM-DD"].
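
So a document carrying these fields would look roughly like this
(illustrative dates, and only two of the six fields shown):

```json
{
  "x_y1": ["2012-03-01", "2012-11-17"],
  "x_y2": ["2012-05-02"]
}
```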

Can you enforce that the dates are dates and not datetime?
When populating the document before inserting it into ES we convert a
stringified date of format YYYYMMDD to format YYYY-MM-DD, so I'm certain
that this is just a date without any time information.
The mapping for the fields is date, and the format is dateOptionalTime.
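
Roughly, the conversion looks like this (a minimal Python sketch; the
function name is made up and our real ingestion code may differ):

```python
from datetime import datetime

def to_iso_date(compact: str) -> str:
    """Convert a stringified YYYYMMDD date to YYYY-MM-DD.

    Parsing with %Y%m%d and re-emitting with %Y-%m-%d guarantees the
    result is a pure date with no time component attached.
    """
    return datetime.strptime(compact, "%Y%m%d").strftime("%Y-%m-%d")

print(to_iso_date("20130114"))  # -> 2013-01-14
```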

I doubt that hyphens and dashes are causing issues.
In production, when doing a facet count over a query that returns 17
documents, everything works for the fields that don't contain the dash,
but as soon as we do it over the "dashed" fields ES starts spitting
tracebacks. The undashed fields contain fewer dates than the dashed ones,
so I tried it over a query that returns 38k documents, and the facet count
for the undashed fields returned quickly whilst the dashed ones made ES
angry again.

Cheers!
// Simon

On Tuesday, January 15, 2013 3:45:03 AM UTC, Ivan Brusic wrote:


--

There was an issue yesterday when I started the thread that caused it to be
marked as spam. I had to go into the IRC channel to get it sorted, but it
seems like my original message was gone, so here it is again (it is the same
message as at the top of this thread).

--

I have now changed the mappings so the fields don't contain
hyphens/dashes, and I stand corrected: this is not the issue.

I still struggle to understand how a facet count over a query that returns
17 documents causes an OutOfMemoryError for some fields, when a facet count
over a query that returns 38k documents works for other fields.
I.e.:
x_y1 and x_y2 always work, regardless of query: both the query that returns
17 documents and the one that returns 38k.
x_y3, x_y4, x_y5 and x_y6 fail regardless of how many items a query
returns.

On Tuesday, January 15, 2013 3:45:03 AM UTC, Ivan Brusic wrote:


--

The amount of memory used doesn't depend much on the number of documents
returned. When you run a facet on a field, all values of that field are
loaded into memory. See issue #2468 in elastic/elasticsearch on GitHub
("Unrealistic high memory consumption for faceting of infrequent array
fields with many members") for more details.
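
A back-of-envelope illustration of why this blows up (the per-value cost
and the average list length below are assumptions for illustration, not
measurements from this cluster):

```python
# Field data for faceting loads every value of the field for every
# document in the index, regardless of how many documents match the query.
docs = 134_000_000   # documents in the index (from this thread)
avg_values = 5       # assumed average number of dates per field
bytes_per_value = 8  # dates are stored as longs; ordinal arrays add more

total_gb = docs * avg_values * bytes_per_value / 1024**3
print(f"~{total_gb:.1f} GB for a single field")  # ~5.0 GB
```

Load a few such fields at once and a 12 GB heap is quickly exhausted.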

On Thursday, January 17, 2013 11:24:40 AM UTC-5, Simon Johansson wrote:


--