Hey Guys,
I'm developing my own centralized log management system and using Elasticsearch with daily indices, which has matched my requirements perfectly so far. Now I have a problem with my search queries due to the amount of data and the _id field, which I am mainly using for sorting:
```
Caused by: org.elasticsearch.ElasticsearchException: java.util.concurrent.ExecutionException:
CircuitBreakingException[[fielddata] Data too large, data for [_id] would be [7960997201/7.4gb], which is larger than the limit of [7699562496/7.1gb]]
```
I spent a lot of time reading the forum and the documentation, and also came across this part:
> The value of the _id field is also accessible in aggregations or for sorting, but doing so is discouraged as it requires to load a lot of data in memory. In case sorting or aggregating on the _id field is required, it is advised to duplicate the content of the _id field in another field that has doc_values enabled.
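If I read that correctly, the mapping side would just be a keyword field in my daily index template, since keyword fields have doc_values enabled by default. A rough sketch of what I mean (the template name, index pattern, and the field name `id_copy` are made up by me; depending on the version this might need the composable `_index_template` API instead, and on 6.x the mappings would additionally need a type name):

```
# legacy index template with an extra keyword field to hold a copy of _id
PUT _template/logs-daily
{
  "index_patterns": ["logs-*"],
  "mappings": {
    "properties": {
      "id_copy": { "type": "keyword" }
    }
  }
}
```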
After this I have a few questions:
- Could duplicating the content of the _id meta field into a custom document field solve my issue? If yes, how can I achieve this at indexing time, or maybe via the mapping in the index template? Or do I have to update every document myself? (See the sketch after this list for what I have in mind.)
- Is using more nodes maybe a solution for this? (Currently using one node for development)
- Is generating my own unique ID for documents a better way to go? (I would rather not do that.)
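Regarding the first question, the closest thing I found to doing it at indexing time is an ingest pipeline with a set processor. Another sketch (the pipeline name is mine; as far as I understand, {{_id}} is only available here when the client supplies the ID, because with auto-generated IDs the _id does not exist yet when the pipeline runs, which would tie into my third question):

```
# copy _id into the id_copy field on every indexed document
PUT _ingest/pipeline/copy-id
{
  "description": "duplicate _id into a field with doc_values",
  "processors": [
    { "set": { "field": "id_copy", "value": "{{_id}}" } }
  ]
}
```

Then I would sort on the duplicated field instead of _id, which should avoid loading _id fielddata:

```
GET logs-*/_search
{
  "sort": [{ "id_copy": "desc" }]
}
```

Is that the right direction?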
Hoping to get some help here.
Thanks in advance.