Where is the object mapping for [host] defined?

Context

Since the release of Elastic v8.0.0 (alpha), the elasticsearch output of Logstash started throwing the following error, even on a fresh installation:

object mapping for [host] tried to parse field [host] as object, but found a concrete value

Question

I understand that the input I use injects a host field at the top-level of the document.

What I don't understand is where the "object mapping" for host is defined. I have inspected all index templates currently installed, without success. Is this somehow related to ECS, and if so, how does ECS influence indices which aren't configured to use it?

In your index, do you have a field that uses host, like host.name or host.ip? What the error means is that you can't put a concrete value into host if you have fields nested under host.

Think of it like a filesystem: host is your folder and host.name is one of your files. If the folder contains files, then host itself is treated as an object and cannot hold a value directly.
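A minimal sketch of this conflict, assuming a throwaway index named test (hypothetical) and using Kibana Dev Tools syntax:

```
# First document: "host" is dynamically mapped as an object
PUT test/_doc/1
{ "host": { "name": "gateway" } }

# Second document: rejected, because [host] is now an object mapping
PUT test/_doc/2
{ "host": "gateway" }
```

The second request should fail with the same "object mapping for [host] tried to parse field [host] as object, but found a concrete value" error.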

@aaron-nimocks my index is created by Logstash (logstash-*), using the default template. The first time Logstash starts, I see the following message in Elasticsearch's logs:

"message":"adding index template [logstash] for index patterns [logstash-*]"

I don't have any index matching logstash-*, but I do see the logstash index template from the log message. Unfortunately, a definition for a host field is nowhere to be seen inside this template:

{
  "index_templates": [
    {
      "name": "logstash",
      "index_template": {
        "index_patterns": ["logstash-*"],
        "template": {
          "settings": {
            "index": {
              "number_of_shards": "1",
              "refresh_interval": "5s"
            }
          },
          "mappings": {
            "dynamic_templates": [
              {
                "message_field": {
                  "path_match": "message",
                  "mapping": { "norms": false, "type": "text" },
                  "match_mapping_type": "string"
                }
              },
              {
                "string_fields": {
                  "mapping": {
                    "norms": false,
                    "type": "text",
                    "fields": {
                      "keyword": { "ignore_above": 256, "type": "keyword" }
                    }
                  },
                  "match_mapping_type": "string",
                  "match": "*"
                }
              }
            ],
            "properties": {
              "@timestamp": { "type": "date" },
              "geoip": {
                "dynamic": true,
                "properties": {
                  "ip": { "type": "ip" },
                  "latitude": { "type": "half_float" },
                  "location": { "type": "geo_point" },
                  "longitude": { "type": "half_float" }
                }
              },
              "@version": { "type": "keyword" }
            }
          }
        },
        "composed_of": [],
        "priority": 200,
        "version": 80001,
        "_meta": {
          "description": "index template for logstash-output-elasticsearch"
        }
      }
    }
  ]
}

I am fine using a filter to rename the field, but I would really love to understand where the "object mapping for [host]" mentioned in the error message is defined. Is there such a thing as a global definition for certain fields in Elasticsearch?


For additional context, here is the list of all index templates I currently have (fresh v8.0.0 alpha installation):

❯ _cat/templates?v&s=name

name                            index_patterns               order      version composed_of
.deprecation-indexing-template  [.logs-deprecation.*]        1000       1       [.deprecation-indexing-mappings, .deprecation-indexing-settings]
.ml-anomalies-                  [.ml-anomalies-*]            2147483647 8000099 []
.ml-notifications-000002        [.ml-notifications-000002]   2147483647 8000099 []
.ml-state                       [.ml-state*]                 2147483647 8000099 []
.ml-stats                       [.ml-stats-*]                2147483647 8000099 []
.monitoring-alerts-7            [.monitoring-alerts-7]       0          7140099
.monitoring-beats               [.monitoring-beats-7-*]      0          7140099
.monitoring-es                  [.monitoring-es-7-*]         0          7140099
.monitoring-kibana              [.monitoring-kibana-7-*]     0          7140099
.monitoring-logstash            [.monitoring-logstash-7-*]   0          7140099
.slm-history                    [.slm-history-5*]            2147483647 5       []
.transform-notifications-000002 [.transform-notifications-*] 0          8000099
.watch-history-14               [.watcher-history-14*]       2147483647 14      []
ilm-history                     [ilm-history-5*]             2147483647 5       []
logs                            [logs-*-*]                   100        1       [logs-mappings, data-streams-mappings, logs-settings]
logstash                        [logstash-*]                 200        80001   []
metrics                         [metrics-*-*]                100        1       [metrics-mappings, data-streams-mappings, metrics-settings]
synthetics                      [synthetics-*-*]             100        1       [synthetics-mappings, data-streams-mappings, synthetics-settings]

As you can see, only one template matches logstash-*, which rules out the hypothesis of an override coming from somewhere else.

Where is your data coming from? Are you able to post your Logstash pipeline?

Currently I'm sending raw log lines to Logstash's TCP input using the nc Linux command. And no, due to the aforementioned error, logs don't make it to Elasticsearch at all.

This worked flawlessly until v8.0.0, hence my feeling this is related to ECS (see Host Fields in the ECS documentation), but I might be wrong.

I don't have any experience with that version so not sure what the big changes are.

Does it work if you change the field from host to somethingelse?

The data you are receiving could have a field named host already. Logstash creates a field called host on ingest. There could be a conflict there.

The data I'm sending is not structured and does not contain any host field. This field is injected by Logstash's TCP input.

The behaviour is documented and well understood. Like I said, what I don't understand is what specific behaviour of Elasticsearch prevents the injection of this field, not where the field is coming from.

As a workaround, I'm simply renaming the field in my pipeline as follows, but I'm still a bit frustrated not to be able to find out the source of the issue.

filter {
    if [host] and ![host][name] {
        mutate {
            rename => { "[host]" => "[host][name]" }
        }
    }
}

Change your output to just stdout {} and your field name back to host.

Do you still get the error? If so, that means it has nothing to do with ECS or Elasticsearch, and most likely you are conflicting with the host field that Logstash creates.
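For reference, a minimal sketch of such a debugging output (the rubydebug codec, which pretty-prints each event, is optional):

```
output {
  stdout { codec => rubydebug }
}
```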

No, the error does not occur with the stdout output. The issue occurs only while attempting to send to Elasticsearch.

For reference, here is what a document looks like after it's been ingested (and augmented) by Logstash:

{
      "@version" => "1",
          "host" => "gateway",  # <-- the culprit, ES expects an object
          "port" => 45732,
       "message" => "Hi, Aaron!",
    "@timestamp" => 2021-08-16T13:06:55.679Z
}

Note: port is also injected by Logstash, but this field does not seem to cause conflicts, which reinforces my feeling that ECS is involved.

What do you get when you do GET logstash/_mapping? Any field named host?

@aaron-nimocks I think you just put me on the right track.

I don't have any logstash-* index created, even after sending data, which seemed surprising. (For reference, Logstash typically does this automatically.)

So I ran a search using GET /_search, and the returned result was:

{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 780,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : ".ds-logs-generic-default-2021.08.16-000001",
        "_id" : "W6QtT3sBfV7XNDbtOnEc",
        "_score" : 1.0,
        "_source" : {
          "@version" : "1",
          "@timestamp" : "2021-08-16T13:36:08.869Z",
          "port" : 32878,
          "host" : {
            "name" : "gateway"
          },
          "message" : "Hi, Aaron!",
          "data_stream" : {
            "type" : "logs",
            "dataset" : "generic",
            "namespace" : "default"
          }
        }
      },
     // ...

Notice how the documents ended up in the .ds-logs-generic-default-2021.08.16-000001 index, instead of the expected logstash-....

This index doesn't seem to have any index template that applies to it, but it has the following mapping (GET /.ds-logs-generic-default-2021.08.16-000001/_mapping):

{
  ".ds-logs-generic-default-2021.08.16-000001" : {
    "mappings" : {
      "_data_stream_timestamp" : {
        "enabled" : true
      },
      "dynamic_templates" : [
        {
          "match_ip" : {
            "match" : "ip",
            "match_mapping_type" : "string",
            "mapping" : {
              "type" : "ip"
            }
          }
        },
        {
          "match_message" : {
            "match" : "message",
            "match_mapping_type" : "string",
            "mapping" : {
              "type" : "match_only_text"
            }
          }
        },
        {
          "strings_as_keyword" : {
            "match_mapping_type" : "string",
            "mapping" : {
              "ignore_above" : 1024,
              "type" : "keyword"
            }
          }
        }
      ],
      "date_detection" : false,
      "properties" : {
        "@timestamp" : {
          "type" : "date"
        },
        "@version" : {
          "type" : "keyword",
          "ignore_above" : 1024
        },
        "data_stream" : {
          "properties" : {
            "dataset" : {
              "type" : "constant_keyword",
              "value" : "generic"
            },
            "namespace" : {
              "type" : "constant_keyword",
              "value" : "default"
            },
            "type" : {
              "type" : "constant_keyword",
              "value" : "logs"
            }
          }
        },
        "ecs" : {
          "properties" : {
            "version" : {
              "type" : "keyword",
              "ignore_above" : 1024
            }
          }
        },
        "host" : {
          "properties" : {
            "name" : {
              "type" : "keyword",
              "ignore_above" : 1024
            }
          }
        },
        "message" : {
          "type" : "match_only_text"
        },
        "port" : {
          "type" : "long"
        }
      }
    }
  }
}

I'm not sure I understand the result, though: it seems like the host.name entry is simply the result of dynamic mapping. Is the conclusion that, in the absence of an index template, the ECS schema applies implicitly?
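For reference, I believe the data stream APIs can reveal which index template was actually matched; a sketch in Dev Tools syntax, assuming the data stream name seen above:

```
# Shows the backing indices and the matching index template
GET _data_stream/logs-generic-default

# Presumably the built-in "logs" template, which matches logs-*-*
GET _index_template/logs
```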

Hi @antoineco,

I suspect you are on the right track looking at ECS.
The host conflict between the Logstash default and ECS is known: ECS defines host as an object.

Logstash 8.0 has ECS compatibility settings that may be checking for this condition.

In particular, the Logstash 8.0 Elasticsearch output plugin has an ecs_compatibility setting.

You may need to have it disabled.
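A sketch of what that might look like in the output configuration (the hosts value is a placeholder):

```
output {
  elasticsearch {
    hosts             => ["https://localhost:9200"]
    ecs_compatibility => disabled
  }
}
```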

Wouldn't ECS compatibility use ecs-logstash-%{+yyyy.MM.dd} as the index name?

From the last message it seems that the data is being ingested as a data stream. The current documentation says that the data_stream option in the elasticsearch output will change from false to auto starting with Logstash 8.

Try to use data_stream => false in your elasticsearch output and see if it writes to logstash-*
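For example (the hosts value is a placeholder):

```
output {
  elasticsearch {
    hosts       => ["https://localhost:9200"]
    data_stream => false
  }
}
```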

This setting is already explicitly set to disabled, but I still observe the same result.

Nice catch! data_stream seems to be the culprit here (TIL about data streams). I changed the value to false and my logs are now going to logstash-2021.08.16-000001.

FYI I created a GitHub issue already (logstash-plugins/logstash-output-elasticsearch#1031), which I will close if the behaviour is expected.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.