Default Mapping + Ignore malformed

Hello! I have a problem with my mapping. I receive a lot of JSON log files and I cannot tell in advance what data type the fields will have.
So if my first log has a field content: <<string>> and my next log has content: <<object>>, I don't want to lose the whole log document. I found ignore_malformed, and I use Logstash to preprocess my logs.
I want the mapping to stay exactly as it is, without any other modification (dynamic type mapping, dynamic indexing of new fields, and not-analyzed .raw fields). The following is my mapping:

{
  "settings": {
    "index.mapping.ignore_malformed": true 
  },
  "mappings": {
    "_default_": {
      "dynamic": "true",
      "_all" : {
        "omit_norms" : true,
        "enabled" : true
      },
      "dynamic_templates" : [{
        "string_fields" : {
          "mapping" : {
            "index" : "analyzed",
            "omit_norms" : true,
            "type" : "string",
            "fields" : {
              "raw" : {
                "index" : "not_analyzed",
                "type" : "string"
              }
            }
          },
          "match_mapping_type" : "string",
          "match" : "*"
        }
      }]
    }
  }
}

In Logstash I've done this:

output {
  stdout {
    codec => rubydebug { metadata => true }
  }
  elasticsearch {
    hosts => "elasticsearch:9200"
    index => "%{[@metadata][es_index_full]}"
    user => ["logstash"]
    password => ["xxxxxxxxxx"]
    template => "/etc/logstash/conf.d/mapping.json"
    manage_template => true
    template_overwrite => true
  }  
}

It seems that Logstash uses my mapping, but it does not work. I still lose the documents where the type does not match, and my .raw fields are missing.

What do I have to do?

Many Thanks in advance
Daniel

As the docs mention, index.mapping.ignore_malformed is an index setting, so if the type of a field in the incoming event doesn't match the mapping, it'll ignore that field instead of rejecting the whole document.

Now, to figure out what is happening you may want to drop that from the mapping, process a few docs, then check them out.
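One way to check the result after processing a few docs is to look at the mapping Elasticsearch actually generated. The sketch below is only an illustration: it assumes you have already fetched the mapping (for example with GET <index>/_mapping) and pulled out the "properties" object of one type. The helper name string_fields_without_raw and the sample data are made up for this example.

```python
def string_fields_without_raw(properties, prefix=""):
    """List string fields that have no "raw" sub-field.

    `properties` is assumed to be the "properties" object of one mapping
    type, in the format used by Elasticsearch versions that still have
    the "string" type.
    """
    missing = []
    for name, spec in properties.items():
        path = prefix + name
        if "properties" in spec:
            # Object field: descend into its own properties.
            missing.extend(
                string_fields_without_raw(spec["properties"], path + ".")
            )
        elif spec.get("type") == "string" and "raw" not in spec.get("fields", {}):
            missing.append(path)
    return sorted(missing)


# Hypothetical sample, not taken from a real cluster.
sample_properties = {
    "message": {"type": "string"},
    "host": {
        "type": "string",
        "fields": {"raw": {"type": "string", "index": "not_analyzed"}},
    },
    "request": {"properties": {"path": {"type": "string"}}},
    "bytes": {"type": "long"},
}

print(string_fields_without_raw(sample_properties))
```

If the list is not empty, the dynamic template was probably not applied to that index.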

If this is log data, i.e. time-based, you should use time-based indices :)

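It's a good idea to put mapping.json in a different directory than /etc/logstash/conf.d, since Logstash may try to read every file in that directory as pipeline configuration.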

Thanks for your reply. The log data is indeed time-based, and my indices are also time-based (daily interval).

What is a good location for my mapping.json?

I'll try this when I'm at home!

Update: I've removed the ignore_malformed part and everything is still the same. I do not have .raw fields. Every string field is analyzed.

Update2: I've added "template": "*" to my mapping.json. Now my .raw fields are created. I will now determine what happens to the documents with a mismatching field type.
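For anyone wondering why the "template" key matters: an index template only takes effect for indices whose name matches its pattern. The sketch below is a simplified model of that matching, not Elasticsearch's own code. It approximates the wildcard matching with Python's fnmatch, and the function name and index names are hypothetical.

```python
from fnmatch import fnmatchcase


def template_applies(template_body, index_name):
    """Return True if the template's pattern would match the index name.

    Simplified model: a template without a "template" key matches nothing,
    and "*" wildcards are approximated with fnmatch.
    """
    pattern = template_body.get("template")
    if pattern is None:
        return False
    return fnmatchcase(index_name, pattern)


# Hypothetical daily index name.
index_name = "logs-2016.08.26"

without_pattern = {"mappings": {"_default_": {}}}
with_pattern = {"template": "*", "mappings": {"_default_": {}}}
narrow_pattern = {"template": "logstash-*", "mappings": {"_default_": {}}}

print(template_applies(without_pattern, index_name))
print(template_applies(with_pattern, index_name))
print(template_applies(narrow_pattern, index_name))
```

Note that "*" matches every index in the cluster, so a narrower pattern is usually the safer choice if other indices live there too.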

It seems to me that the ignore_malformed option is not working. As stated in my last post, the mapping with my .raw fields works now:

{
  "settings": {
    "index.mapping.ignore_malformed": true 
  },
  "template": "*",
  "mappings": {
    "_default_": {
      "dynamic": "true",
      "_all" : {
        "omit_norms" : true,
        "enabled" : true
      },
      "dynamic_templates" : [{
        "string_fields" : {
          "match" : "*",
          "match_mapping_type" : "string",
          "mapping" : {
            "type" : "string", "index" : "analyzed", "omit_norms" : true,
            "fields" : {
              "raw" : { "type": "string", "index" : "not_analyzed", "omit_norms": true }
            }
          }
        }
      }]
    }		
  }
}

Any idea what could be the problem?

Update: It seems to be a general problem: https://github.com/elastic/elasticsearch/issues/12366#issuecomment-242771038
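If ignore_malformed does not help with the string-versus-object case from the first post, it may at least be useful to know which fields are affected before indexing. The following is a rough sketch under that assumption: it scans a batch of already-parsed log documents and reports top-level fields that appear with more than one JSON type. The function names and sample documents are invented for illustration, and nested fields are not inspected.

```python
from collections import defaultdict


def json_type(value):
    """Map a parsed JSON value to a coarse type name."""
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    if isinstance(value, bool):  # checked before numbers: bool is an int subclass
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    return "string"


def find_type_conflicts(docs):
    """Return top-level fields seen with more than one JSON type."""
    seen = defaultdict(set)
    for doc in docs:
        for field, value in doc.items():
            if value is None:
                continue  # nulls do not tell us anything about the type
            seen[field].add(json_type(value))
    return {field: sorted(types) for field, types in seen.items() if len(types) > 1}


# Hypothetical log documents.
docs = [
    {"content": "plain text", "level": "info"},
    {"content": {"user": "daniel"}, "level": "warn"},
    {"content": None, "level": "error"},
]

print(find_type_conflicts(docs))
```

Fields reported here are the ones that would need to be renamed or converted (for example in a Logstash filter) so that each field name always carries the same type.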