How to specify number_of_shards when putting data into ES using elasticsearch-hadoop SDK

From this page for the elasticsearch-hadoop SDK, I cannot find how to configure number_of_shards when an index is created automatically, as with:

.option("es.resource.write", "log-{dateHour}")
This doc mentions that the index settings API can be used, but number_of_shards can only be set at index creation time.

I tried
.option("es.index.number_of_shards", 10)
.option("es.index.refresh_interval", 10)

but it does not seem to work, because when I query the index settings, number_of_shards for the newly created index is still 1.

Thanks

You have to set that in an index template:

PUT _template/<name of template>
{
   "index_patterns" : ["*job*"],
   "order" : 1,
   "settings" : {
       "number_of_shards" : "3",
       "number_of_replicas" : "1"
   }
}

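This says that any index whose name matches the pattern in index_patterns (here, any name containing "job") will be created with 3 shards and 1 replica. For the indices in the question, a pattern such as "log-*" would match instead.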
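
For the log-{dateHour} indices from the question, the same template could be built and sent from Python. This is only a minimal sketch: the helper names (build_template, matches, put_template), the "log-*" pattern, and the base URL are assumptions for illustration, and the pattern check uses shell-style globbing as an approximation of Elasticsearch's own wildcard matching. It targets the legacy _template endpoint shown above; newer Elasticsearch versions also offer _index_template, which takes a differently shaped body.

```python
import json
import urllib.request
from fnmatch import fnmatchcase


def build_template(pattern: str, shards: int, replicas: int) -> dict:
    """Build a legacy index template body like the one in the answer."""
    return {
        "index_patterns": [pattern],
        "order": 1,
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        },
    }


def matches(template: dict, index_name: str) -> bool:
    """Approximate check (shell-style globbing) that an index name hits the template."""
    return any(fnmatchcase(index_name, p) for p in template["index_patterns"])


def put_template(base_url: str, name: str, template: dict) -> int:
    """PUT the template to the legacy _template endpoint and return the HTTP status.

    base_url is hypothetical, e.g. "http://localhost:9200".
    """
    req = urllib.request.Request(
        f"{base_url}/_template/{name}",
        data=json.dumps(template).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


# Assumed pattern for indices written via es.resource.write = "log-{dateHour}".
template = build_template("log-*", shards=3, replicas=1)
print(matches(template, "log-2020010112"))
print(matches(template, "job-2020010112"))
```

The template has to exist before the Spark job writes to a new index, because it is applied only when an index is created; indices that already exist keep their current shard count.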

Thanks so much for the suggestion. This is really helpful!
