I am writing documents with the following structure to Elasticsearch:
{
"date":"2021-11-03",
"clientId":"123",
...
<some other fields>
}
What I would like is that, whenever a document is written to Elasticsearch,
a new index is created with the name:
myindex-{clientId}-{date}
where clientId and date are the values from the document.
If such an index already exists, the document is written to it.
This way, I will have a separate index per client per day automatically.
I have looked into index templates and index lifecycle management (ILM) policies, but could not find a way to put both the date and the clientId into the index name automatically.
The index name needs to be set by the process indexing data into Elasticsearch. Be aware that having lots of small indices is very inefficient and does not scale well, so daily indices per client might only work for a small number of clients.
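As a rough illustration of what "set by the process indexing data" could look like, here is a minimal Python sketch that derives the index name from the document itself. The function name `index_name_for` and the `prefix` parameter are made up for this example, and the field names `clientId` and `date` are taken from the document structure above. The validation reflects the usual Elasticsearch index naming restrictions (lowercase, no characters such as `\ / * ? " < > | , # :` or spaces); adjust it if your client IDs look different.

```python
import re
from datetime import date

# Characters Elasticsearch rejects in index names, plus any whitespace.
_INVALID = re.compile(r'[\\/*?"<>|,#:\s]')


def index_name_for(doc, prefix="myindex"):
    """Build '<prefix>-<clientId>-<date>' from a document dict (hypothetical helper)."""
    # Index names must be lowercase.
    client_id = str(doc["clientId"]).lower()
    # Parsing the date raises ValueError if it is not a valid ISO date.
    day = date.fromisoformat(doc["date"]).isoformat()
    if not client_id or _INVALID.search(client_id):
        raise ValueError(f"clientId not usable in an index name: {client_id!r}")
    return f"{prefix}-{client_id}-{day}"


doc = {"date": "2021-11-03", "clientId": "123"}
print(index_name_for(doc))
```

The indexing process would then pass this name as the target index of each write. With automatic index creation left at its default, Elasticsearch creates the index on the first write, and an index template matching `myindex-*` can supply the settings and mappings.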
Thanks, Mark.
The way I understand it, ILM can add the date to my index names and handle rollover and so on, but I didn't find a way to use it to add the clientId to the index name. Is there any way to do that, or do I have to add the clientId in the process indexing the data, like Christian said, and then use ILM for adding and managing the date part?
Thanks for replying, Christian!
Currently the data is indexed by Kinesis Firehose, which can create an index per day but cannot add the clientId to the name (not out of the box, anyway). I guess I would need to add a Lambda to that pipeline to achieve my goal.
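For reference, a rough sketch of the piece such a Lambda (or any other custom indexer) would need: turning a batch of documents into a body for the Elasticsearch `_bulk` API, where every action line names its own target index. The function name `bulk_body` and the `prefix` parameter are hypothetical, the code assumes each document has valid `clientId` and `date` fields, and actually sending the body to the cluster (HTTP client, authentication, error handling) is left out.

```python
import json


def bulk_body(docs, prefix="myindex"):
    """Build an NDJSON _bulk request body with one target index per document."""
    lines = []
    for doc in docs:
        # Per-client, per-day index name; index names must be lowercase.
        index = f"{prefix}-{str(doc['clientId']).lower()}-{doc['date']}"
        # Action line first, then the document source on the next line.
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


docs = [
    {"date": "2021-11-03", "clientId": "123", "value": 1},
    {"date": "2021-11-03", "clientId": "456", "value": 2},
]
print(bulk_body(docs))
```

The resulting string would be sent as the body of a `POST /_bulk` request with the `Content-Type: application/x-ndjson` header, so a single request can write to several client indices at once.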