Explicit Mapping Types in Elasticsearch are Ignored


(Christopher Hunt) #1

Hey folks,

So here is my situation:

Log Creation

I am setting up the Elastic Stack to process, search, and monitor error logs from a Laravel PHP web application. My current build uses the Monolog library (built into Laravel) to create error logs. By using the "Logstash formatter" alongside the "WebProcessor" and "PsrLogMessageProcessor", I am creating nice and pretty JSON logs and adding context data such as the client IP, URL, the http_method, and the server name.

Pipeline: Filebeat -> Logstash -> Elasticsearch

From there I am using Filebeat to send logs to Logstash, without using any templates (template.enabled: false). Logstash is sending the logs to Elasticsearch, also without using any templates or filters. I know Logstash is unneeded with this current build, but it has a purpose later on in development for more log types and sources. I am using Kibana as a GUI for all of this, as well as for security and monitoring with X-Pack.

The pipeline works great. Files are indexed correctly, using the mapping type I specify in Logstash (document_type => "mlog_json"). I have used stdout output (and equivalents) at each point in the pipeline, and the logs keep the JSON format that I want all the way through. Here is part of an example log:

{"@timestamp":"2017-09-08T15:06:11.149739-04:00","@source":"SOURCENAME","@fields":{"channel":"CHANNELNAME","level":400,"url":"/register","ip":"MYIP","http_method":"POST","server":"SERVERNAME","referrer":"REFERRERURL"},"@message":"exception 'ErrorException' with message 'Undefined index:...

And here is my issue:

The problem I keep running into is this: after I create an explicit mapping type for an index, the fields are created, they are present in the index pattern, and the document is indexed. But the indexed logs always end up with the exact same set of fields, and the fields that I specified in the mapping type are always missing, regardless of what mapping structure I create. What am I missing here? This is a sample of what I get in the Discover pane of Kibana for the index pattern for a recent set of test logs. I have highlighted the fields that I always get (the only ones that are created). Nothing else can be searched or aggregated.

@timestamp: September 8th 2017, 15:52:02.449 offset: 840,894 @version: 1 input_type: log beat.hostname: HOSTNAME beat.name: BEATNAME beat.version: 5.5.2 host: HOSTNAME source: SOURCEURL message: {"@timestamp":"2017-09-08T15:52:00.343199-04:00","@source":"SOURCENAME","@fields":{"channel":"CHANNELNAME","level":400,"url":"/register","ip":"MYIP","http_method":"POST","server":"SERVERNAME","referrer":"REFERRERURL"},"@message":"exception 'ErrorException' with message 'Undefined index:...

Notice the message field contains the whole JSON object that is the log message.
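
To make that observation concrete, here is a minimal Python sketch (not part of the pipeline) that models the indexed event as a dict. The values are the placeholders from the sample above, only some of the fields are included, and the truncated "@message" text is replaced by a short stand-in string. That stand-in is an assumption made only so the JSON is complete.

```python
import json

# The log line as written by the application. The "@message" value is a
# stand-in (assumption); the original sample is truncated.
raw_line = json.dumps({
    "@timestamp": "2017-09-08T15:52:00.343199-04:00",
    "@source": "SOURCENAME",
    "@fields": {"channel": "CHANNELNAME", "level": 400, "url": "/register"},
    "@message": "placeholder message",
})

# The event as it shows up in Discover: the whole log line is one string.
event = {
    "offset": 840894,
    "input_type": "log",
    "message": raw_line,
}

print(sorted(event))                     # ['input_type', 'message', 'offset']
print("@fields" in event)                # False
print(type(event["message"]).__name__)   # str
```

The nested keys only become visible once the string in "message" is parsed as JSON, for example with json.loads(event["message"]).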

A Sample Mapping Type Used

Create the index

PUT test1/

Create first mapping level

PUT test1/_mapping/mlog_json
{
  "properties": {
    "@timestamp":   { "type": "date" },
    "@source":      { "type": "keyword" },
    "@fields":      { "type": "nested"},
    "@message":     { "type": "long" }
  } 
}

Create second mapping level under fields nested field type

PUT test22/_mapping/mlog_json/
{
  "properties": {
    "fields": {
      "properties": {
        "channel":     { "type": "text" },
        "level":       { "type": "integer" },
        "url":         { "type": "keyword" },
        "ip":          { "type": "ip" },
        "http_method": { "type": "keyword" },
        "server":      { "type": "keyword" },
        "referer":     { "type": "keyword" }
      }
    }
  } 
}

Any suggestions, thoughts or comments would be much appreciated. This is my first time with Elastic and I am hitting a wall on this one.


(Rohithnama) #2

So if I understand correctly, the fields that you specified in the mapping show up in the logs and the others are missing. Or is it the other way around? I ask because the log output contradicts what you described.

How did you index the data?

In my case, when I created the mapping for certain fields and then indexed the data, the dynamic mapping mechanism created the mapping for the rest of the fields.


(Christopher Hunt) #3

Sorry, that was confusing. The only fields that are generated and searchable are

  • "@timestamp",
  • "offset",
  • "@version",
  • "input_type",
  • "beat.name",
  • "beat.version",
  • "host",
  • "source", and
  • "message".

The fields inside the "message" field are not detected. They are all lumped under the "message" field as a single item. The fields show up when I create the index mapping, but they are always empty when documents are indexed.

The dynamic mapping seems to work only for those particular fields.


(Rohithnama) #4

So the hierarchy is message -> fields within the message. But you are indexing only the message as a whole, and that could be the reason it is treated as a single item.

Can you take out the mapping for the message field and give it another try? You declared it as a long type; without that entry, dynamic mapping should be able to detect the type accordingly.

Otherwise, you might want to specify the mapping for each field within the message instead of for the message as a whole.

Please let me know if this works!


(Christopher Hunt) #5

I'll try both and get back to you. Thanks!

UPDATE: Problem solved.

Alright, I solved the issue. I played around with filters some more, and after combining the JSON filter with the following explicit mapping template, I am getting the fields and structure I want.

PUT _template/mlog_json_template
{
  "template": "mlog_json-*",
  "settings": {
    "index.mapper.dynamic": false,
    "number_of_shards": 3
  },
  "mappings": {
    "mlog_json": {
      "properties":{
        "@timestamp":           { "type": "date" },
        "@fields.channel":      { "type": "text" },
        "@fields.http_method":  { "type": "keyword" },
        "@fields.level":        { "type": "integer" },
        "@fields.ip":           { "type": "ip" },
        "@fields.server":       { "type": "text" },
        "@fields.url":          { "type": "text" },
        "@message":             { "type": "text" },
        "@source":              { "type": "text" },
        "@source_path":         { "type": "text" },
        "@source_host":         { "type": "text" },
        "@version":             { "type": "text" },
        "@beat.name":           { "type": "text" },
        "@beat.hostname":       { "type": "text" },
        "@beat.version":        { "type": "text" }
      }
    }
  }
}
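
The Logstash filter configuration itself is not shown in this post. In Logstash this step is normally done with the json filter and its source => "message" option. As a rough illustration of what such a step does to each event, here is a Python sketch. The helper expand_message is hypothetical (it is not Logstash code), and the sample values are placeholders.

```python
import json


def expand_message(event, source="message"):
    """Parse the JSON string in event[source] and merge its keys into the event.

    Returns a new dict. If the field is missing or does not hold a JSON
    object, a plain copy of the event is returned.
    """
    expanded = dict(event)
    try:
        parsed = json.loads(event[source])
    except (KeyError, TypeError, ValueError):
        return expanded
    if isinstance(parsed, dict):
        expanded.update(parsed)
    return expanded


event = {
    "host": "HOSTNAME",
    "message": '{"@source": "SOURCENAME", "@fields": {"level": 400, "ip": "MYIP"}}',
}
expanded = expand_message(event)
print(expanded["@fields"]["level"])   # 400
```

Once the keys from the log line are part of the event itself, the fields named in the template have data to match.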

(Rohithnama) #6

I'm glad that you were able to resolve the issue.

Also, do remember that you can specify multiple mappings for the same field using the multi-fields concept (the "fields" parameter). This comes in handy if you want to use a field in more than one way while creating different jobs on the same index.

https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html
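
As a minimal sketch of the multi-fields idea, the Python below builds a mapping body in which the URL is indexed as text for full-text search and also as a keyword sub-field for exact matches and aggregations. The sub-field name "raw" and the choice of the url field are illustrative assumptions, not taken from the template above.

```python
import json

# One source value, indexed two ways.
url_mapping = {
    "type": "text",                     # analyzed, for full-text search
    "fields": {
        "raw": {"type": "keyword"},     # exact value, for sorting and aggregations
    },
}

mapping_body = {
    "properties": {
        "@fields": {
            "properties": {"url": url_mapping},
        },
    },
}

# Print the JSON body that would go into the mapping request.
print(json.dumps(mapping_body, indent=2))
```

The sub-field would then be addressed as @fields.url.raw in queries and aggregations.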


(Christopher Hunt) #7

That is helpful. Thanks!

