Rollover index in Elasticsearch

dilipgupta · January 9, 2019, 12:49pm

Hi Team,

We have a requirement to load data from Hadoop into an Elasticsearch index through a Spark job.
The condition for rolling over the index is the number of documents.
Below is the sample I am using to try to achieve this:

POST eslogss/log
{
  "name":"b"
}


PUT /esls-000001 
{
  "aliases": {
    "eslogss": {}
  }
}


POST /eslogss/_rollover 
{
  "conditions": {
    "max_docs":  1
  }
}

The issue we are facing: after adding one document, the rollover to a new index does not happen until I manually run the rollover request with the condition.
As a result, many documents get loaded into a single index, which is not what we want.

Only after I run the rollover request are new documents written to a new index.

Please suggest how to achieve rollover in my scenario.
Please let me know if you need any other details.

Thanks & Regards,
Dilip Gupta

dadoonet · January 9, 2019, 12:52pm

Rollover does not happen automatically. You need to call the API as frequently as you want to check whether the rollover can be executed or not.

You can use Curator for that: https://www.elastic.co/guide/en/elasticsearch/client/curator/current/rollover.html
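To illustrate what "calling the API as frequently as you want" could look like without Curator, here is a minimal Python sketch using only the standard library. The base URL, the polling interval, and the helper names (`build_rollover_request`, `try_rollover`, `poll_rollover`) are assumptions made up for this example; the alias `eslogss` and the `max_docs` condition come from the request above. It assumes an unsecured cluster reachable over plain HTTP.

```python
import json
import time
import urllib.request


def build_rollover_request(base_url, alias, max_docs):
    """Build the POST <alias>/_rollover request with a max_docs condition."""
    body = json.dumps({"conditions": {"max_docs": max_docs}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{alias}/_rollover",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def try_rollover(base_url, alias, max_docs, send=urllib.request.urlopen):
    """Call the rollover API once; return True if a rollover happened."""
    request = build_rollover_request(base_url, alias, max_docs)
    with send(request) as response:
        result = json.load(response)
    # The rollover response reports whether the conditions were met.
    return bool(result.get("rolled_over"))


def poll_rollover(base_url, alias, max_docs, interval_seconds, rounds):
    """Check the rollover condition `rounds` times, sleeping in between."""
    for _ in range(rounds):
        if try_rollover(base_url, alias, max_docs):
            print(f"{alias} rolled over to a new index")
        time.sleep(interval_seconds)
```

Something like `poll_rollover("http://localhost:9200", "eslogss", 1000000, 60, 10)` would then check the condition once a minute for ten minutes. The condition is only evaluated at each call, so documents indexed between two calls all land in the current index.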

dilipgupta · January 9, 2019, 2:19pm

Hi David,

Thanks for your kind reply.
As per the example above, I have set the condition to max_docs=1. If the data load is done through a Spark job that contains 10 documents, all 10 documents get inserted into the same index.

Could you please help me understand how the condition can be evaluated at runtime, while the data load into the ES index is in progress?

dadoonet · January 9, 2019, 4:34pm

max_docs=1 is not a normal value for rollover, so this example does not make sense in production.
It should be something more like max_docs=1000000.
That means that if the number of documents in your index has reached 1 million when you call the rollover API, a rollover will happen.

If you index, let's say, 1000 documents per second, then calling the rollover API every minute means that each index will most likely end up with between 1,000,000 and 1,060,000 documents.
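The arithmetic behind that range can be sketched in a few lines of Python. The function name `rollover_size_bounds` is made up for this example, and it assumes a constant indexing rate and a fixed interval between rollover calls, which is a simplification.

```python
def rollover_size_bounds(max_docs, docs_per_second, check_interval_seconds):
    """Return the (smallest, largest) document count expected at rollover."""
    # Documents that can arrive between two consecutive rollover checks.
    overshoot = docs_per_second * check_interval_seconds
    return max_docs, max_docs + overshoot


low, high = rollover_size_bounds(1_000_000, 1000, 60)
print(low, high)  # 1000000 1060000
```

Calling the API more often narrows the range; a Spark job that loads everything in one burst between two calls will still put the whole batch into the current index.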

dilipgupta · January 11, 2019, 7:23am

Hi David,

Thanks for helping me understand through the suggested implementation.
Could you please let me know how to implement Curator as you suggested:

https://www.elastic.co/guide/en/elasticsearch/client/curator/current/rollover.html

Do we need any admin support for the implementation, or can we as developers implement Curator ourselves?

Thanks & Regards,
Dilip

dadoonet · January 11, 2019, 7:50am

It's more an admin tool IMO but I'm leaving the question to @theuntergeek.

theuntergeek · January 11, 2019, 12:13pm

What do you mean by "admin support?"

Curator can connect from anywhere to a cluster, provided there is a network path and no firewalls in between. If you have deployed security in your cluster, like the need for SSL certificates or user authentication, then you would need to have that information to be able to connect.

