Logstash stores field as multifield

Hello,

I am currently migrating from an old Elastic 6.8 cluster to a new 7.13 cluster.
The Logstash pipeline is the same:

    input {
        kafka {
            bootstrap_servers => '{{ logstash_kafka_bootstrap }}'
            topics => ['it-log']
            auto_offset_reset => 'earliest'
            codec => json

        }
    }

    filter {
        date {
            match => ["[payload][timestamp]", "ISO8601", "UNIX_MS"]
            target => "@timestamp"
            remove_field => [ "created", "id", "category"]
        }
        mutate {
            add_field => {
                "method" => "%{[payload][method]}"
                "level" => "%{[payload][level]}"
                "line" => "%{[payload][line]}"
                "thread" => "%{[payload][thread]}"
                "message" => "%{[payload][message]}"
                "class" => "%{[payload][class]}"
                "stacktrace" => "%{[payload][stacktrace]}"
            }
            remove_field => [ "payload"]
        }
    }

    output {
        elasticsearch {
            hosts => {{ kibana_elasticsearch_hosts }}
            index => 'it-log-%{+YYYY.MM.dd}'
        }
    }

But the output is quite different between
6.8:

    {
      "_index": "log-2021.05.28",
      "_type": "doc",
      "_id": "uWmvsnkBo9lKhgYE8DFo",
      "_score": 1,
      "_source": {
        "message": "...redacted...",
        "@timestamp": "2021-05-28T11:15:45.432Z",
        "class": "...redacted...",
        "@version": "1",
        "origin": {
          "domain": "...redacted...",
          "subDomain": "...redacted...",
          "host": "...redacted...",
          "version": "...redacted...",
          "key": "...redacted..."
        },
        "thread": "...redacted...",
        "stacktrace": "...redacted...",
        "method": "...redacted...",
        "level": "...redacted...",
        "line": "...redacted..."
      },
      "fields": {
        "@timestamp": [
          "2021-05-28T11:15:45.432Z"
        ]
      }
    }

to 7.13:

    {
      "_index": "it-log-2021.05.26",
      "_type": "doc",
      "_id": "iat1snkBSussBAE3vhSa",
      "_score": 1,
      "fields": {
        "origin.version": [
          "...redacted..."
        ],
        "origin.subDomain": [
          "...redacted..."
        ],
        "line": [
          "...redacted..."
        ],
        "thread.keyword": [
          "...redacted..."
        ],
        "origin.host": [
          "...redacted..."
        ],
        "origin.key.keyword": [
          "...redacted..."
        ],
        "line.keyword": [
          "...redacted..."
        ],
        "@version": [
          "1"
        ],
        "class.keyword": [
          "...redacted..."
        ],
        "origin.version.keyword": [
          "...redacted..."
        ],
        "method.keyword": [
          "...redacted..."
        ],
        "class": [
          "...redacted..."
        ],
        "origin.host.keyword": [
          "...redacted..."
        ],
        "method": [
          "...redacted..."
        ],
        "level": [
          "...redacted..."
        ],
        "origin.domain": [
          "...redacted..."
        ],
        "origin.domain.keyword": [
          "...redacted..."
        ],
        "@version.keyword": [
          "1"
        ],
        "thread": [
          "...redacted..."
        ],
        "message": [
          "...redacted..."
        ],
        "@timestamp": [
          "2021-05-26T23:03:39.444Z"
        ],
        "level.keyword": [
          "...redacted..."
        ],
        "origin.subDomain.keyword": [
          "...redacted..."
        ],
        "stacktrace": [
          "...redacted..."
        ],
        "message.keyword": [
          "...redacted..."
        ],
        "origin.key": [
          "...redacted..."
        ]
      }
    }

As you can see, 7.13 puts every string into an array and adds an extra .keyword field for it.

What do I need to do to make 7.13 behave the same way as 6.8?

The output of what? In both cases "fields" is a hash of arrays. The 7.13 output includes many more fields, and its _source field is missing. The .keyword fields very likely exist in 6.8 too; they are just not among the fields being fetched.
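To make the relationship between the two views concrete, here is a minimal Python sketch that turns a nested `_source`-style document into the dotted-key, array-valued shape seen under "fields". It only mimics the shape for illustration and is not how Elasticsearch computes it. The function name `flatten_source`, the `multi_fields` parameter, and the sample values are hypothetical; the assumption is that the listed fields are mapped as text with a `keyword` sub-field.

```python
# Illustration only: reproduces the *shape* of the "fields" section from a
# _source-style document. Not Elasticsearch's actual implementation.

def flatten_source(source, multi_fields=(), prefix=""):
    """Flatten a nested dict into dotted keys whose values are lists."""
    flat = {}
    for key, value in source.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Nested objects become dotted paths such as "origin.host".
            flat.update(flatten_source(value, multi_fields, prefix=f"{path}."))
        else:
            # Every value is reported as an array, even a single string.
            values = value if isinstance(value, list) else [value]
            flat[path] = values
            # Assumption: fields named in multi_fields have a "keyword"
            # sub-field in the mapping, so the value shows up a second time.
            if path in multi_fields:
                flat[f"{path}.keyword"] = values
    return flat


# Made-up sample values standing in for the redacted ones in the thread.
source = {
    "level": "INFO",
    "origin": {"host": "app-01", "key": "k1"},
}
fields = flatten_source(
    source, multi_fields={"level", "origin.host", "origin.key"}
)
```

The stored document is the same in both cases; only the way it is fetched and displayed differs.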

If you are looking at the JSON in Discover, Kibana has changed how that is displayed; that is probably what you are seeing.

That is what the new display looks like.

It's more efficient to pull the values from the actual fields rather than from _source.

You can see this in Kibana's advanced settings, where you can switch it back if you want to check.

discover:searchFieldsFromSource

Read fields from _source

When enabled will load documents directly from _source. This is soon going to be deprecated. When disabled, will retrieve fields via the new Fields API in the high-level search service.

Default: Off
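For comparison outside Kibana, the two retrieval styles can be sketched as search request bodies. This is a minimal Python sketch, assuming Elasticsearch 7.10 or later (where the `fields` search option is available); the helper name `build_search_body` is hypothetical, and the code only builds the JSON without sending anything.

```python
import json


def build_search_body(use_fields_api=True):
    """Build a search body that fetches values via 'fields' or via _source."""
    body = {"query": {"match_all": {}}, "size": 1}
    if use_fields_api:
        # Ask for every mapped field, including sub-fields such as
        # "message.keyword", and skip the original JSON document.
        body["fields"] = ["*"]
        body["_source"] = False
    # Without these keys, the hit carries the original document in _source.
    return body


# The printed JSON could be sent to e.g. GET it-log-*/_search.
print(json.dumps(build_search_body(use_fields_api=True), indent=2))
```

With `use_fields_api=False` the body contains only the query, which gives hits that carry `_source` as in the 6.8 example.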
