High CPU due to build of global ordinals

Hi,

Recently we've been seeing high CPU usage that seems to be caused by the building of global ordinals.
We already set refresh_interval to 30s, which helped for a while, but CPU is high again now that more data has been added.

Here is the hot_threads output from one problematic node (some parts were removed due to the message size limit):

95.6% (478.1ms out of 500ms) cpu usage by thread 'elasticsearch[PROD-228-USW1-CL1-ES37][search][T#14]'
8/10 snapshots sharing following 45 elements
sun.nio.ch.NativeThread.current(Native Method)
sun.nio.ch.NativeThreadSet.add(NativeThreadSet.java:46)
sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:736)
sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:726)
org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:179)
org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:342)
org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:140)
org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:116)
org.apache.lucene.codecs.lucene410.Lucene410DocValuesProducer$CompressedBinaryDocValues$CompressedBinaryTermsEnum.readTerm(Lucene410DocValuesProducer.java:909)
org.apache.lucene.codecs.lucene410.Lucene410DocValuesProducer$CompressedBinaryDocValues$CompressedBinaryTermsEnum.next(Lucene410DocValuesProducer.java:925)
org.apache.lucene.index.MultiTermsEnum.pushTop(MultiTermsEnum.java:293)
org.apache.lucene.index.MultiTermsEnum.next(MultiTermsEnum.java:319)
org.apache.lucene.index.MultiDocValues$OrdinalMap.<init>(MultiDocValues.java:525)
org.apache.lucene.index.MultiDocValues$OrdinalMap.build(MultiDocValues.java:482)
org.apache.lucene.index.MultiDocValues$OrdinalMap.build(MultiDocValues.java:461)
org.elasticsearch.index.fielddata.ordinals.GlobalOrdinalsBuilder.build(GlobalOrdinalsBuilder.java:55)
org.elasticsearch.index.fielddata.plain.SortedSetDVOrdinalsIndexFieldData.localGlobalDirect(SortedSetDVOrdinalsIndexFieldData.java:81)
org.elasticsearch.index.fielddata.plain.SortedSetDVOrdinalsIndexFieldData.localGlobalDirect(SortedSetDVOrdinalsIndexFieldData.java:35)
org.elasticsearch.indices.fielddata.cache.IndicesFieldDataCache$IndexFieldCache$2.call(IndicesFieldDataCache.java:211)
org.elasticsearch.indices.fielddata.cache.IndicesFieldDataCache$IndexFieldCache$2.call(IndicesFieldDataCache.java:199)
org.elasticsearch.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4742)
org.elasticsearch.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527)
org.elasticsearch.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319)
org.elasticsearch.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282)
org.elasticsearch.common.cache.LocalCache$Segment.get(LocalCache.java:2197)
org.elasticsearch.common.cache.LocalCache.get(LocalCache.java:3937)
org.elasticsearch.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4739)
org.elasticsearch.indices.fielddata.cache.IndicesFieldDataCache$IndexFieldCache.load(IndicesFieldDataCache.java:199)
org.elasticsearch.index.fielddata.plain.SortedSetDVOrdinalsIndexFieldData.loadGlobal(SortedSetDVOrdinalsIndexFieldData.java:69)
org.elasticsearch.search.aggregations.support.ValuesSource$Bytes$WithOrdinals$FieldData.globalMaxOrd(ValuesSource.java:285)
org.elasticsearch.search.aggregations.bucket.terms.TermsAggregatorFactory.create(TermsAggregatorFactory.java:204)
org.elasticsearch.search.aggregations.support.ValuesSourceAggregatorFactory.create(ValuesSourceAggregatorFactory.java:54)
org.elasticsearch.search.aggregations.AggregatorFactories.createAndRegisterContextAware(AggregatorFactories.java:53)
org.elasticsearch.search.aggregations.AggregatorFactories.createTopLevelAggregators(AggregatorFactories.java:157)
org.elasticsearch.search.aggregations.AggregationPhase.preProcess(AggregationPhase.java:79)
org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:100)
org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:301)
org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:312)

We're using Elasticsearch 1.7.5.

My questions are:

  1. Is there any way to determine which aggregation is triggering this build?
  2. Is there any way to reduce the build time? I thought about eager loading of global ordinals, but I'm not sure it would help: my problem is not how long the query takes but that CPU is constantly high, which causes performance issues. With eager loading the build would still happen on every refresh and still drive CPU up.

Thanks.

The stack trace suggests a terms aggregation (note the TermsAggregatorFactory.create frame just above the global ordinals load).

If only a small number of documents match your query and the field has high cardinality, it may be better to use the map execution hint in your terms agg, which does not rely on ordinals and works with the raw terms instead. See Terms Aggregation | Elasticsearch Guide [2.4] | Elastic
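
As a rough sketch of what that looks like, here is a small Python snippet that builds a search body with execution_hint set to map on a terms aggregation. The field name user_id, the aggregation name by_term and the helper function are made up for illustration; substitute your own. The body only sets the hint; whether it actually helps depends on how many documents match your query.

```python
import json


def terms_agg_with_map_hint(field, size=10):
    """Build a search body whose terms aggregation uses execution_hint "map".

    `field` and the aggregation name "by_term" are placeholders for this
    example, not names taken from the original question.
    """
    return {
        "size": 0,  # return aggregation results only, no hits
        "aggs": {
            "by_term": {
                "terms": {
                    "field": field,
                    "size": size,
                    # Aggregate on the raw term values of matching docs
                    # instead of going through global ordinals.
                    "execution_hint": "map",
                }
            }
        },
    }


# "user_id" is a hypothetical high-cardinality field.
body = terms_agg_with_map_hint("user_id")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a normal _search request against the index in question.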

Hi,

I just wanted to thank you. Your suggestion was spot on: we had a small number of matching documents, and changing the execution hint to map solved it.

Roy.
