OK, this is fairly low-level Lucene, but in the context of an ElasticSearch
cluster: is there any way to get an index/shard to optimize away a bunch of
fields that are no longer used (i.e. that literally have no term values
associated with them)?
We had an application bug introduced that polluted an index with a very
large number of fields (25,000 fields... cough), and let's just say
things weren't well after that.
We've deleted all the rogue records, but the shards still contain the raw
Lucene field information (we've inspected them with Luke), and the cluster
is heavily CPU-bound processing "refreshVersionTable" calls, whose inner
loop scales with the number of fields in the segments.
We've attempted a test optimize of the index using Luke on a single shard,
but the residual segments post-optimize still contain a large number of
these fields, all with no values associated with them.
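For what it's worth, that matches my understanding of Lucene 3.x merging: the merged segment's FieldInfos are the union of the input segments' fields, so a field entry can survive an optimize even when no terms remain for it. A cluster-level optimize through the 0.19-era `_optimize` REST endpoint is the analogous knob to what Luke does on a single shard; a minimal sketch, assuming the `max_num_segments` and `only_expunge_deletes` parameters that endpoint took (host and index names are placeholders):

```python
# Hedged sketch: build the URL for a 0.19-style optimize request, to be
# POSTed with an empty body (via curl, urllib, etc.). Parameter names are
# my recollection of the old _optimize API, not verified against 0.19.10.
from urllib.parse import urlencode

def optimize_url(host, index, max_num_segments=1, only_expunge_deletes=False):
    """URL for POST /{index}/_optimize on an old-style cluster."""
    params = urlencode({
        "max_num_segments": max_num_segments,
        "only_expunge_deletes": str(only_expunge_deletes).lower(),
    })
    return "http://%s/%s/_optimize?%s" % (host, index, params)

print(optimize_url("localhost:9200", "myindex"))
# -> http://localhost:9200/myindex/_optimize?max_num_segments=1&only_expunge_deletes=false
```

Given the FieldInfos-union behaviour above, though, I wouldn't expect this to shed the empty fields any more than the Luke optimize did.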
Obviously a full reindex would do this, but if there are any other bright
ideas that are quicker than that (it's a 45-million-item index we're trying
to keep up), they would be most welcome!
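If it does come down to a reindex, the usual shape is scan/scroll out of the old index and bulk-index into a fresh one (which gets clean FieldInfos). A minimal sketch of the bulk-payload half, assuming standard scroll hit shapes; the index name and HTTP plumbing around it are placeholders, not our actual setup:

```python
# Hedged sketch: convert a page of scroll hits into an NDJSON bulk-index
# body for replay into a fresh index. Hit fields (_type, _id, _source)
# follow the usual search-response shape.
import json

def hits_to_bulk(hits, dest_index):
    """Build the bulk API request body (one action line + one source line per hit)."""
    lines = []
    for hit in hits:
        action = {"index": {"_index": dest_index,
                            "_type": hit["_type"],
                            "_id": hit["_id"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(hit["_source"]))
    return "\n".join(lines) + "\n"  # bulk bodies must end with a newline

sample = [{"_type": "item", "_id": "1", "_source": {"title": "a"}}]
print(hits_to_bulk(sample, "myindex_v2"))
```

POSTing pages like this to `/_bulk` while scrolling keeps memory flat, and an alias swap at the end avoids downtime, though at 45M documents it's still the slow road.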
We're still on ES 0.19.10 (Lucene 3.6.1). (You can tell me to "upgrade"
another day, please...)
Here's a snapshot from Luke on a single shard from this index.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAHfYWB5nO%3DDQ50SQ4kgde6JvT%3DgjQ_7FmLbVcXVk5Kiurwme%2Bg%40mail.gmail.com.