Indices picking up fields from other indices

I readily admit that this is likely a mistake or misunderstanding on my part, but I am confused by the following:

I have created two sets of indices within Elasticsearch, which are grouped by year. These separate sets should have nothing in common, except perhaps the timestamps. For example:

circulation-2017
pcres-2017
pcres-2016

I have created templates for each, specifying how the fields should be mapped.

circulation template:

pcres template:

I ran Logstash manually to import our backlog of data, and the indices filled with documents. I then started the service to continue polling our PostgreSQL and MySQL databases. So far, so good.

After a while (about 15 minutes), I started seeing fields from one index appearing in another. For example, see the following document:

% curl 'x.x.x.x:9200/pcres-2017/pcres_session/2925965?pretty'
{
  "_index" : "pcres-2017",
  "_type" : "pcres_session",
  "_id" : "2925965",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "session_interrupted" : "No",
    "is_guest" : "No",
    "times_extended" : null,
    "quoted_wait" : 0,
    "end_time" : "2017-04-26T18:37:00.000Z",
    "reservation_location" : "PC",
    "session_timestamp" : "2017-04-26T18:37:00.000Z",
    "branch" : "CITY",
    "minutes_used" : 1,
    "start_time" : "2017-04-26T18:36:00.000Z",
    "computer" : "CITYPC07",
    "actual_wait" : 0,
    "@timestamp" : "2017-04-26T18:38:00.067Z",
    "circ_operation" : "Unsupported",
    "@version" : "1",
    "id" : 2925965
  }
}

As shown above, this particular document in the pcres-2017 index now includes a field (circ_operation) that exists neither in the original data nor in the template for that set of indices. I have tried deleting all indices and reimporting the data, but fields from one set of indices start showing up in the other again within a few dozen minutes of the new import.
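
One quick way to confirm which fields are strays is to compare a document's _source keys against the fields the template is supposed to define. The sketch below does this in Python for the document shown earlier. The names EXPECTED_PCRES_FIELDS and unexpected_fields are hypothetical, and the expected field list is an assumption reconstructed from the sample document, since the template itself is not reproduced in this post.

```python
# Assumed list of fields that legitimately belong to pcres documents.
# This is a guess based on the sample document, not the real template.
EXPECTED_PCRES_FIELDS = {
    "session_interrupted", "is_guest", "times_extended", "quoted_wait",
    "end_time", "reservation_location", "session_timestamp", "branch",
    "minutes_used", "start_time", "computer", "actual_wait",
    "@timestamp", "@version", "id",
}


def unexpected_fields(source, expected):
    """Return the sorted field names present in source but not in expected."""
    return sorted(set(source) - set(expected))


# The _source of the document retrieved with curl above (null becomes None).
sample_source = {
    "session_interrupted": "No",
    "is_guest": "No",
    "times_extended": None,
    "quoted_wait": 0,
    "end_time": "2017-04-26T18:37:00.000Z",
    "reservation_location": "PC",
    "session_timestamp": "2017-04-26T18:37:00.000Z",
    "branch": "CITY",
    "minutes_used": 1,
    "start_time": "2017-04-26T18:36:00.000Z",
    "computer": "CITYPC07",
    "actual_wait": 0,
    "@timestamp": "2017-04-26T18:38:00.067Z",
    "circ_operation": "Unsupported",
    "@version": "1",
    "id": 2925965,
}

stray = unexpected_fields(sample_source, EXPECTED_PCRES_FIELDS)
print(stray)  # prints ['circ_operation']
```

Running the same comparison over a batch of documents would show whether every stray field comes from the circulation data or from somewhere else.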

I would love an "Easy Button" answer to this problem, but any help would be greatly appreciated.

Thanks,
Chris

If you set dynamic to strict in the mapping, then a document that comes in with an unknown field will fail to index, and you can trace that failure back to whatever process is creating the unknown field. You can read more about it in the Elasticsearch documentation on the dynamic mapping setting.
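
As a rough illustration of that suggestion, here is a minimal Python sketch that builds a template body with "dynamic": "strict". It assumes the older template format that matches the typed URL shown in the question (a top-level "template" pattern and a mapping keyed by type name). Newer Elasticsearch versions use "index_patterns" and no mapping types. The helper name build_strict_template, the template pattern pcres-*, and the three field types are all assumptions, since the real templates are not shown in this thread.

```python
import json


def build_strict_template(index_pattern, doc_type, properties):
    """Build a template body whose mapping rejects unmapped fields."""
    return {
        "template": index_pattern,
        "mappings": {
            doc_type: {
                # "strict" makes indexing fail on any field not listed below.
                "dynamic": "strict",
                "properties": properties,
            }
        },
    }


# Only a few fields are listed, and their types are guesses for illustration.
pcres_properties = {
    "branch": {"type": "keyword"},
    "minutes_used": {"type": "integer"},
    "start_time": {"type": "date"},
}

pcres_template = build_strict_template("pcres-*", "pcres_session", pcres_properties)

# The JSON printed here is what would be sent as the body of a
# PUT to _template/<template name> on the cluster.
print(json.dumps(pcres_template, indent=2))
```

Note that "dynamic": false behaves differently from "strict": the document is still accepted, and unmapped fields are simply not indexed, although they remain in _source. Only "strict" rejects the document with an error.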

I deleted my indices and recreated the templates with "dynamic": false. The indices were then filled with the backlog data, and the services have been started.

So far, the circulation-2017 index has populated with good data, and no fields from pcres-2017 have crept in. The pcres-* indices won't receive any new documents until we open tomorrow morning (their backlog data has already been imported), but thus far they also do not have any additional fields.

I will monitor the log files for errors (I assume trying to add a document with unmapped fields on a static index will throw some sort of error?) and report back tomorrow when I start seeing live data.

Thanks so far!
