Logstash template not used if Elasticsearch not ready?

I am supplying a template file with my Logstash job using:

template => "/conf/myindex-mapping-template.json"
template_name => "myindex"
template_overwrite => true

This works well, and I see:

12:27:19.383 [[main]-pipeline-manager] INFO  logstash.outputs.elasticsearch - Using mapping template from {:path=>"./myindex-mapping-template.json"}
12:27:19.765 [[main]-pipeline-manager] INFO  logstash.outputs.elasticsearch - Attempting to install template {:manage_template=>{"template"=>"myindex-*", "order"=>1, "settings"=>{"number_of_shards"=>24, "number_of_replicas"=>0, "refresh_interval"=>"30s", "codec"=>"best_compression"}, "mappings"=>{"_default_"=>...}
12:27:19.784 [[main]-pipeline-manager] INFO  logstash.outputs.elasticsearch - Installing elasticsearch template to _template/myindex
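For reference, the template file that produces the log lines above presumably looks something like this, reconstructed from the logged `:manage_template` hash. The mappings body is truncated in the log, so the empty `_default_` mapping below is only a placeholder:

```shell
# Reconstructed sketch of /conf/myindex-mapping-template.json, written out
# via a heredoc for illustration. Settings are taken from the log line
# above; the mappings body is a placeholder because the log truncates it.
cat > /tmp/myindex-mapping-template.json <<'EOF'
{
  "template": "myindex-*",
  "order": 1,
  "settings": {
    "number_of_shards": 24,
    "number_of_replicas": 0,
    "refresh_interval": "30s",
    "codec": "best_compression"
  },
  "mappings": {
    "_default_": {}
  }
}
EOF
```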

I then added the Logstash job to a Rancher stack running Docker containers. The stack consists of an Elasticsearch container and a Logstash container (plus some other containers, including Kafka as the data source). When I start the stack in Rancher, the Logstash container invariably starts before the Elasticsearch container. Eventually the index gets created, but without the template applied.
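One common workaround for this start-up ordering problem is to wrap the Logstash container's entry point in a small wait loop, so Logstash only starts once Elasticsearch answers. A minimal sketch; the `retry` helper and the attempt counts are my own invention, and the endpoint is assumed from the logs in this thread:

```shell
#!/bin/sh
# retry MAX CMD...: run CMD until it succeeds, up to MAX attempts,
# sleeping one second between attempts. Returns non-zero if it gives up.
retry() {
  max=$1; shift
  n=0
  until "$@"; do
    n=$((n + 1))
    [ "$n" -ge "$max" ] && return 1
    sleep 1
  done
}

# In a hypothetical entry-point wrapper (not executed here), this would be:
#   retry 120 curl -fsS http://elasticsearch:9200/ >/dev/null
#   exec logstash -f /conf/logstash.conf
```

The `exec` at the end matters in a container: it replaces the wrapper shell with Logstash so signals from the orchestrator reach the right process.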

Logstash log when Elasticsearch is not ready:

07:49:18.801 [[main]-pipeline-manager] INFO  logstash.outputs.elasticsearch - Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://elasticsearch:9200/]}}
07:49:18.805 [[main]-pipeline-manager] INFO  logstash.outputs.elasticsearch - Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://elasticsearch:9200/, :path=>"/"}
07:49:19.283 [[main]-pipeline-manager] WARN  logstash.outputs.elasticsearch - Attempted to resurrect connection to dead ES instance, but got an error. {:url=>#<URI::HTTP:0xe5057f2 URL:http://elasticsearch:9200/>, :error_type=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError, :error=>"Elasticsearch Unreachable: [http://elasticsearch:9200/][Manticore::SocketException] Connection refused (Connection refused)"}
07:49:19.285 [[main]-pipeline-manager] INFO  logstash.outputs.elasticsearch - Using mapping template from {:path=>"/conf/myindex-mapping-template.json"}
07:49:19.293 [[main]-pipeline-manager] WARN  logstash.outputs.elasticsearch - Marking url as dead. Last error: [LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError] Elasticsearch Unreachable: [http://elasticsearch:9200/][Manticore::SocketException] Connection refused (Connection refused) {:url=>http://elasticsearch:9200/, :error_message=>"Elasticsearch Unreachable: [http://elasticsearch:9200/][Manticore::SocketException] Connection refused (Connection refused)", :error_class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError"}

Elasticsearch log when it's ready, indicating no template used:

[2017-06-22T07:56:30,106][INFO ][o.e.n.Node               ] [krxQ4NO] started
[2017-06-22T07:57:05,885][INFO ][o.e.c.m.MetaDataCreateIndexService] [krxQ4NO] [myindex-v1] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings []
[2017-06-22T07:57:08,644][INFO ][o.e.c.m.MetaDataMappingService] [krxQ4NO] [myindex-v1/Pvf7pnggTwW44RxIfSd6Fw] create_mapping [mydoc]
[2017-06-22T07:57:08,698][INFO ][o.e.c.m.MetaDataMappingService] [krxQ4NO] [myindex-v1/Pvf7pnggTwW44RxIfSd6Fw] update_mapping [mydoc]
[2017-06-22T07:57:08,850][INFO ][o.e.c.m.MetaDataMappingService] [krxQ4NO] [myindex-v1/Pvf7pnggTwW44RxIfSd6Fw] update_mapping [mydoc]
[2017-06-22T07:57:09,063][INFO ][o.e.c.m.MetaDataMappingService] [krxQ4NO] [myindex-v1/Pvf7pnggTwW44RxIfSd6Fw] update_mapping [mydoc]

If I stop the Logstash container, delete the index and restart the Logstash container, the index will be recreated with the template applied.

Logstash log after restarting:

08:14:13.447 [[main]-pipeline-manager] INFO  logstash.outputs.elasticsearch - Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://elasticsearch:9200/]}}
08:14:13.451 [[main]-pipeline-manager] INFO  logstash.outputs.elasticsearch - Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://elasticsearch:9200/, :path=>"/"}
08:14:50.142 [[main]-pipeline-manager] WARN  logstash.outputs.elasticsearch - Restored connection to ES instance {:url=>#<URI::HTTP:0x44d3befd URL:http://elasticsearch:9200/>}
08:14:50.146 [[main]-pipeline-manager] INFO  logstash.outputs.elasticsearch - Using mapping template from {:path=>"/conf/myindex-mapping-template.json"}
08:14:50.393 [[main]-pipeline-manager] INFO  logstash.outputs.elasticsearch - Attempting to install template {:manage_template=>{"template"=>"myindex-*", "order"=>1, "settings"=>{"number_of_shards"=>24...}
08:14:52.154 [[main]-pipeline-manager] INFO  logstash.outputs.elasticsearch - Installing elasticsearch template to _template/myindex

Elasticsearch log:

[2017-06-22T08:14:48,163][WARN ][o.e.m.j.JvmGcMonitorService] [krxQ4NO] [gc][young][425][8] duration [1.9m], collections [1]/[2m], total [1.9m]/[10.5m], memory [607.7mb]->[431.1mb]/[1.9gb], all_pools {[young] [242.3mb]->[6.4mb]/[266.2mb]}{[survivor] [33.2mb]->[33.2mb]/[33.2mb]}{[old] [332mb]->[391.5mb]/[1.6gb]}
[2017-06-22T08:14:48,620][WARN ][o.e.m.j.JvmGcMonitorService] [krxQ4NO] [gc][425] overhead, spent [1.9m] collecting in the last [2m]
[2017-06-22T08:14:58,668][INFO ][o.e.c.m.MetaDataCreateIndexService] [krxQ4NO] [myindex-v1] creating index, cause [auto(bulk api)], templates [myindex], shards [24]/[0], mappings [mydoc]

I'm new to containerisation, so I'm not sure how to stagger/serialise the starting of the containers, but I don't understand why applying the template file appears to be a "one-shot" operation that isn't retried. Does this also mean that any data Logstash tried to send while Elasticsearch was unavailable was lost? I thought that was retried automatically.

I wouldn't assume that

[2017-06-22T07:57:05,885][INFO ][o.e.c.m.MetaDataCreateIndexService] [krxQ4NO] [myindex-v1] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings []

means that no index template was applied. What are the actual mappings of the newly created index? Do they indicate that the template was applied?

but I don't understand why the application of the template file appears to be a "one-shot", and not retried.

Yeah, that's a bit weird. I actually prefer managing the templates outside of Logstash so that I can update them whenever I like.
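Managing the template outside Logstash amounts to a single PUT to the `_template` endpoint. A hedged sketch, with the endpoint and file path assumed from the posts above, and guarded so it only fires when `ES_URL` is actually set:

```shell
#!/bin/sh
# Install the index template by hand, independent of Logstash.
# Inert unless ES_URL is set (e.g. ES_URL=http://elasticsearch:9200).
install_template() {
  curl -fsS -XPUT "$ES_URL/_template/myindex" \
    -H 'Content-Type: application/json' \
    -d @/conf/myindex-mapping-template.json
}

if [ -n "${ES_URL:-}" ]; then
  install_template
fi
```

Because a template only affects indices created after it is installed, running this once before any data arrives sidesteps the start-up race entirely.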

Does this also mean that any data that Logstash tried to send while Elasticsearch was unavailable was lost?

It depends. The elasticsearch output plugin's documentation lists which kinds of ES errors lead to retries, but unless you have enabled the new persistent queue feature, Logstash only has a very small in-memory buffer of in-flight events (which is lost when Logstash shuts down). Logstash halts the pipeline when outputs are clogged, and exactly what happens to the inbound events depends on their source. Some event sources back off when Logstash pushes back; some don't.
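The persistent queue mentioned here is enabled via two settings in `logstash.yml`. A minimal sketch, written to a temp path purely for illustration; in a container you would bake these into the image's `logstash.yml` or mount them in, and the size limit below is an arbitrary example:

```shell
# Sketch: logstash.yml settings that switch the event queue from the small
# in-memory buffer to an on-disk, restart-surviving queue.
cat > /tmp/logstash.yml <<'EOF'
queue.type: persisted   # default is "memory"
queue.max_bytes: 1gb    # cap on the on-disk queue size; example value
EOF
```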

Thanks very much for your reply, Magnus.

What are the actual mappings of the newly created index? Do they indicate that the template was applied?

The template file contains settings that change the number of shards and replicas, as well as mappings that make some fields nested. When the above happens, the index is created with the default number of shards and replicas, and there are no nested fields.

Yeah, that's a bit weird. I actually prefer managing the templates outside of Logstash so that I can update them whenever I like.

I've been maintaining them outside Logstash too, up to now, but tried this approach so that the Dockerised Logstash could create the template and run in one operation, without my having to change the Docker entry-point script. However, I also need to create aliases, so it looks like some sort of sequencing of operations via a script will be required anyway, unless there's another way to include them? For example: either create and populate the index via Logstash with the template file and then create the aliases, or post the template, create an empty index, create the aliases, and then populate the index via Logstash.
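The second ordering described above (template, then empty index, then alias, then let Logstash populate) can be sketched as a small bootstrap script. All names here (endpoint, index, alias, file path) are assumptions taken from this thread, and the curl calls are guarded behind `ES_URL` so the sketch is inert by default:

```shell
#!/bin/sh
# Bootstrap sequence: template -> empty index -> alias, run once before
# Logstash starts indexing. Names and paths are illustrative only.
es() {  # es METHOD PATH [JSON-FILE]
  if [ -n "${3:-}" ]; then
    curl -fsS -X"$1" "$ES_URL$2" -H 'Content-Type: application/json' -d @"$3"
  else
    curl -fsS -X"$1" "$ES_URL$2"
  fi
}

bootstrap() {
  es PUT /_template/myindex /conf/myindex-mapping-template.json
  es PUT /myindex-v1                    # empty index; picks up the template
  es PUT /myindex-v1/_alias/myindex     # alias the versioned index
}

if [ -n "${ES_URL:-}" ]; then
  bootstrap
fi
```

With the `-v1` suffix plus an alias, a later reindex into `myindex-v2` only needs an alias swap rather than a change to every client.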

Some event sources back off when Logstash pushes back, some don't.

Thanks for that - will have to read up on this.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.