Mapping issue when creating ML job

I know this is a lot of back-and-forth, but this is necessary...

Please now try:

GET .ml-*/_mapping/field/Packet-Type.raw

I don't mind, I just want to get this solved :slight_smile:

Result:

#! Deprecation: [types removal] The parameter include_type_name should be explicitly specified in get field mapping requests to prepare for 7.0. In 7.0 include_type_name will default to 'false', which means responses will omit the type name in mapping definitions.
{
  ".ml-state" : {
    "mappings" : { }
  },
  ".ml-anomalies-custom-active-directory-logins" : {
    "mappings" : { }
  },
  ".ml-anomalies-custom-sonic-remote-logins" : {
    "mappings" : { }
  },
  ".ml-annotations-6" : {
    "mappings" : { }
  },
  ".ml-notifications" : {
    "mappings" : { }
  },
  ".ml-config" : {
    "mappings" : { }
  }
}
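
Side note for anyone scripting this check: the field-mapping response has one entry per index, and an empty "mappings" object means the field is not mapped in that index. A minimal Python sketch, assuming the response has already been parsed into a dict (the helper name indices_with_field is made up for illustration, and the populated entry in the sample is hand-written to show the shape):

```python
def indices_with_field(response: dict) -> list:
    """Return the index names whose field-mapping entry is non-empty."""
    return sorted(
        index
        for index, body in response.items()
        if body.get("mappings")  # {} means the field is not mapped there
    )


# Hand-written sample: one empty entry, one populated entry.
sample = {
    ".ml-state": {"mappings": {}},
    ".ml-anomalies-custom-logins": {
        "mappings": {
            "Packet-Type.raw": {
                "full_name": "Packet-Type.raw",
                "mapping": {"raw": {"type": "keyword"}},
            }
        }
    },
}

print(indices_with_field(sample))
```

Run against the result above, it would return an empty list, since every index reports "mappings" : { }.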

I'm now wondering whether the field of interest is actually called Packet-Type.raw.raw

You could try:

PUT _xpack/ml/anomaly_detectors/logins
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      {
        "detector_description": "high_count over \"Fully-Qualified-Distinguished-Name.raw\" partitionfield=customer excludefrequent=all",
        "function": "high_count",
        "over_field_name": "Fully-Qualified-Distinguished-Name.raw",
        "partition_field_name": "customer",
        "exclude_frequent": "all",
        "detector_index": 0
      }
    ],
    "influencers": [
      "Fully-Qualified-Distinguished-Name.raw",
      "Reason-Code",
      "Packet-Type.raw.raw"
    ]
  },
  "data_description": {
    "time_field": "@timestamp"
  },
  "results_index_name": "custom-logins"
}

Nope.
I think it is clear from the mapping that it should be Packet-Type.raw, also because it has been working before and has not changed (I'm using yearly indices in this case). I can also see it in Kibana's index patterns.

{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "Failed to parse mapping [doc]: mapper [Packet-Type] of different type, current_type [text], merged_type [ObjectMapper]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Failed to parse mapping [doc]: mapper [Packet-Type] of different type, current_type [text], merged_type [ObjectMapper]",
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "mapper [Packet-Type] of different type, current_type [text], merged_type [ObjectMapper]"
    }
  },
  "status": 400
}
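
Side note on reading this error: it says that Packet-Type is currently mapped as a text field, while the mapping being merged in treats Packet-Type as an object (a field with its own "properties"). The Python sketch below only illustrates that kind of clash. The function name find_leaf_object_conflicts and the "current" sample mapping are assumptions for illustration, not something read from the cluster in question, and "has properties" is used as a simplified test for "is an object":

```python
def find_leaf_object_conflicts(current: dict, incoming: dict, prefix: str = "") -> list:
    """List fields that are a leaf in one mapping and an object in the other."""
    conflicts = []
    for name, new_def in incoming.items():
        old_def = current.get(name)
        if old_def is None:
            continue  # field only exists in the incoming mapping, nothing to clash with
        path = prefix + name
        old_is_object = "properties" in old_def
        new_is_object = "properties" in new_def
        if old_is_object != new_is_object:
            conflicts.append(path)
        elif old_is_object:
            # both are objects: compare their sub-fields
            conflicts.extend(
                find_leaf_object_conflicts(
                    old_def["properties"], new_def["properties"], path + "."
                )
            )
    return conflicts


# Hypothetical existing mapping: a text field with a keyword multi-field.
current = {"Packet-Type": {"type": "text", "fields": {"raw": {"type": "keyword"}}}}
# Object-style mapping: Packet-Type as an object with a "raw" property.
incoming = {"Packet-Type": {"properties": {"raw": {"type": "keyword"}}}}

print(find_leaf_object_conflicts(current, incoming))
```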

Hmm...this is tricky. I can execute the above API call on my system and it works just fine:

output of API call:

{
  "job_id" : "logins",
  "job_type" : "anomaly_detector",
  "job_version" : "7.0.1",
  "create_time" : 1568221771641,
  "analysis_config" : {
    "bucket_span" : "15m",
    "detectors" : [
      {
        "detector_description" : """high_count over "Fully-Qualified-Distinguished-Name.raw" partitionfield=customer excludefrequent=all""",
        "function" : "high_count",
        "over_field_name" : "Fully-Qualified-Distinguished-Name.raw",
        "partition_field_name" : "customer",
        "exclude_frequent" : "all",
        "detector_index" : 0
      }
    ],
    "influencers" : [
      "Fully-Qualified-Distinguished-Name.raw",
      "Reason-Code",
      "Packet-Type.raw"
    ]
  },
  "analysis_limits" : {
    "model_memory_limit" : "1024mb",
    "categorization_examples_limit" : 4
  },
  "data_description" : {
    "time_field" : "@timestamp",
    "time_format" : "epoch_ms"
  },
  "model_snapshot_retention_days" : 1,
  "results_index_name" : "custom-logins"
}

and then the correct index and mappings are there:

GET .ml-anomalies-custom-logins/_mapping

yields

{
  ".ml-anomalies-custom-logins" : {
    "mappings" : {
      "_meta" : {
        "version" : "7.0.1"
      },
      "dynamic_templates" : [
        {
          "strings_as_keywords" : {
            "match" : "*",
            "mapping" : {
              "type" : "keyword"
            }
          }
        }
      ],
      "properties" : {
        "Fully-Qualified-Distinguished-Name" : {
          "properties" : {
            "raw" : {
              "type" : "keyword"
            }
          }
        },
        "Packet-Type" : {
          "properties" : {
            "raw" : {
              "type" : "keyword"
            }
          }
        },
        "Reason-Code" : {
          "type" : "keyword"
        },
...

Note, though, that I don't have access to a v6.8 setup right now (this is 7.0.1).

And if I execute:

GET .ml-anomalies-*/_mapping/field/Packet-Type.raw

I get

{
  ".ml-anomalies-custom-logins" : {
    "mappings" : {
      "Packet-Type.raw" : {
        "full_name" : "Packet-Type.raw",
        "mapping" : {
          "raw" : {
            "type" : "keyword"
          }
        }
      }
    }
  },
  ".ml-anomalies-shared" : {
    "mappings" : { }
  }
}

So, I might need to seek out additional help on this one.

Seems correct.

Thank you for your help so far, eager to solve this!

Next thought is that there might be some rogue mapping template that is greedy and is causing issues. You can see what templates exist and what patterns they match by hitting the _template endpoint:

GET _template

but the output is verbose. If you have the jq utility, you could trim the output with this command from the Linux command line:

$ curl -s -u elastic:changeme -XGET 1.2.3.4:9200/_template | jq -c ".|to_entries[]|{template:.key, order:.value.order, patterns:.value.index_patterns}"

(substituting your own IP address and credentials) which, for me, produces the output:

{"template":".logstash-management","order":0,"patterns":[".logstash"]}
{"template":".monitoring-beats","order":0,"patterns":[".monitoring-beats-7-*"]}
{"template":".triggered_watches","order":2147483647,"patterns":[".triggered_watches*"]}
{"template":"metricbeat-6.1.1","order":1,"patterns":["metricbeat-6.1.1-*"]}
{"template":".monitoring-logstash","order":0,"patterns":[".monitoring-logstash-7-*"]}
{"template":".monitoring-alerts","order":0,"patterns":[".monitoring-alerts-6"]}
{"template":".monitoring-kibana","order":0,"patterns":[".monitoring-kibana-7-*"]}
{"template":".ml-anomalies-","order":0,"patterns":[".ml-anomalies-*"]}
{"template":".ml-config","order":0,"patterns":[".ml-config"]}
{"template":".watch-history-7","order":2147483647,"patterns":[".watcher-history-7*"]}
{"template":".watch-history-9","order":2147483647,"patterns":[".watcher-history-9*"]}
{"template":".monitoring-es","order":0,"patterns":[".monitoring-es-7-*"]}
{"template":"logstash","order":0,"patterns":["logstash-*"]}
{"template":".kibana_task_manager","order":0,"patterns":[".kibana_task_manager"]}
{"template":".monitoring-alerts-7","order":0,"patterns":[".monitoring-alerts-7"]}
{"template":".ml-meta","order":0,"patterns":[".ml-meta"]}
{"template":".ml-notifications","order":0,"patterns":[".ml-notifications"]}
{"template":".ml-state","order":0,"patterns":[".ml-state*"]}
{"template":".watch-history-6","order":2147483647,"patterns":[".watcher-history-6*"]}
{"template":"metricbeat-6.0.0","order":1,"patterns":["metricbeat-6.0.0-*"]}
{"template":".watches","order":2147483647,"patterns":[".watches*"]}
{"template":".management-beats","order":0,"patterns":[".management-beats"]}
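
For anyone without jq, roughly the same summary can be produced in Python once the GET _template response has been loaded as a dict. This is a minimal sketch: the function name summarize_templates is made up, and the sample dict is a trimmed, hand-written stand-in for a real response.

```python
import json


def summarize_templates(templates: dict) -> list:
    """Reduce a GET _template response to name, order and index patterns."""
    return [
        {
            "template": name,
            "order": body.get("order"),
            "patterns": body.get("index_patterns"),
        }
        for name, body in templates.items()
    ]


# Trimmed, hand-written stand-in for a real GET _template response.
sample = {
    ".ml-anomalies-": {"order": 0, "index_patterns": [".ml-anomalies-*"], "mappings": {}},
    "logstash": {"order": 0, "index_patterns": ["logstash-*"], "settings": {}},
}

for row in summarize_templates(sample):
    # compact separators give one JSON object per line, like jq -c
    print(json.dumps(row, separators=(",", ":")))
```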

Can you do this? What we're really looking for is a mapping template that is "greedy" and may be overruling what we're trying to write to .ml-anomalies-*. The order value matters because it determines the sequence in which matching templates are applied: lower values are applied first and higher values override them. The .ml-anomalies- template has an order of 0, so it is applied first. But if another template also matches .ml-anomalies-* indices (or has a truly greedy pattern such as *) and also has an order of 0, there is no guarantee which of the two is applied first, and this could be the source of trouble.
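
To make the tie problem concrete, here is a rough Python sketch that groups the templates matching a given index by their order value, so that any group with more than one name stands out. The function name matching_by_order and the "catch-all" template are invented for illustration, and fnmatchcase is only an approximation of how Elasticsearch matches the * wildcard:

```python
from collections import defaultdict
from fnmatch import fnmatchcase


def matching_by_order(index: str, templates: list) -> dict:
    """Group the names of templates whose patterns match `index` by order."""
    groups = defaultdict(list)
    for t in templates:
        # fnmatchcase approximates Elasticsearch's "*" wildcard matching
        if any(fnmatchcase(index, p) for p in t["patterns"]):
            groups[t["order"]].append(t["template"])
    return dict(sorted(groups.items()))


templates = [
    {"template": ".ml-anomalies-", "order": 0, "patterns": [".ml-anomalies-*"]},
    {"template": "logstash", "order": 0, "patterns": ["logstash-*"]},
    # Hypothetical greedy template, not taken from any output in this thread.
    {"template": "catch-all", "order": 0, "patterns": ["*"]},
]

groups = matching_by_order(".ml-anomalies-custom-logins", templates)
for order, names in groups.items():
    note = "  <- tie: application order not guaranteed" if len(names) > 1 else ""
    print(order, names, note)
```

Feeding it the rows from the jq output instead of the hand-written list would show whether any real template shares an order value with .ml-anomalies- for the index in question.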

{"template":"elastalert_status_status","order":0,"patterns":["elastalert_status_status"]}
{"template":".monitoring-beats","order":0,"patterns":[".monitoring-beats-6-*"]}
{"template":"progress-audit","order":0,"patterns":["*-progress-audit-*"]}
{"template":"dhcp","order":0,"patterns":["*-dhcp-*"]}
{"template":".monitoring-logstash","order":0,"patterns":[".monitoring-logstash-6-*"]}
{"template":"kibana_index_template:.kibana-customer4","order":0,"patterns":[".kibana-customer4"]}
{"template":"linotp","order":0,"patterns":["*-linotp-*"]}
{"template":"sonicwall-fw","order":0,"patterns":["*-sonicwall-fw-*"]}
{"template":"security-index-template-v6","order":1000,"patterns":[".security-*"]}
{"template":".ml-meta","order":0,"patterns":[".ml-meta"]}
{"template":"fileaudit","order":0,"patterns":["*-fileaudit-*"]}
{"template":"elastalert_status_error","order":0,"patterns":["elastalert_status_error"]}
{"template":"elastalert_status_past","order":0,"patterns":["elastalert_status_past"]}
{"template":"fsecure-msg","order":0,"patterns":["*-fsecure-msg-*"]}
{"template":".kibana_task_manager","order":0,"patterns":[".kibana_task_manager"]}
{"template":"salesforce-logins","order":0,"patterns":["*-salesforce-logins-*"]}
{"template":".monitoring-es","order":0,"patterns":[".monitoring-es-6-*"]}
{"template":"dns","order":0,"patterns":["*-dns-*"]}
{"template":"kibana_index_template:.kibana-customer7","order":0,"patterns":[".kibana-customer7"]}
{"template":"elastalert_status","order":0,"patterns":["elastalert_status"]}
{"template":".ml-config","order":0,"patterns":[".ml-config"]}
{"template":".triggered_watches","order":2147483647,"patterns":[".triggered_watches*"]}
{"template":"logstash","order":0,"patterns":["logstash-*"]}
{"template":".watch-history-9","order":2147483647,"patterns":[".watcher-history-9*"]}
{"template":".watch-history-7","order":2147483647,"patterns":[".watcher-history-7*"]}
{"template":"kibana_index_template:.kibana-customer5","order":0,"patterns":[".kibana-customer5"]}
{"template":".watches","order":2147483647,"patterns":[".watches*"]}
{"template":".ml-state","order":0,"patterns":[".ml-state*"]}
{"template":"kibana_index_template:.kibana-customer9","order":0,"patterns":[".kibana-customer9"]}
{"template":"logstash-index-template","order":0,"patterns":[".logstash"]}
{"template":".watch-history-6","order":2147483647,"patterns":[".watcher-history-6*"]}
{"template":".monitoring-alerts","order":0,"patterns":[".monitoring-alerts-6"]}
{"template":"security-index-template","order":1000,"patterns":[".security-*"]}
{"template":"progress","order":0,"patterns":["*-progress-*"]}
{"template":"sonicwall-sra","order":0,"patterns":["*-sonicwall-sra-*"]}
{"template":"logins","order":0,"patterns":["*-logins-*"]}
{"template":".ml-anomalies-","order":0,"patterns":[".ml-anomalies-*"]}
{"template":"kibana_index_template:.kibana-customer2","order":0,"patterns":[".kibana-customer2"]}
{"template":"kibana_index_template:.kibana-customer3","order":0,"patterns":[".kibana-customer3"]}
{"template":".ml-notifications","order":0,"patterns":[".ml-notifications"]}
{"template":"kibana_index_template:.kibana-customer1","order":0,"patterns":[".kibana-customer1"]}
{"template":"apache2","order":1,"patterns":["*-apache2-*"]}
{"template":".management-beats","order":0,"patterns":[".management-beats"]}
{"template":"jmjanalyzer-alarm-status","order":0,"patterns":["jmjanalyzer-alarm-status"]}
{"template":"system-audit","order":1,"patterns":["*-system-audit-*"]}
{"template":"intime-arkisto","order":0,"patterns":["*-intime-arkisto-*"]}
{"template":".monitoring-kibana","order":0,"patterns":[".monitoring-kibana-6-*"]}
{"template":"system","order":0,"patterns":["*-system-erp-*"]}
{"template":"mycustomer5-cloud","order":0,"patterns":["*-mycustomer5-cloud-*"]}
{"template":"customer5-erp-syslog","order":0,"patterns":["customer5-erp-syslog-*"]}
{"template":"postfix","order":0,"patterns":["*-postfix-*"]}
{"template":"elastalert_status_silence","order":0,"patterns":["elastalert_status_silence"]}
{"template":"adaudit","order":0,"patterns":["*-adaudit-*"]}
{"template":"kibana_index_template:.kibana-customer10","order":0,"patterns":[".kibana-customer10"]}
{"template":"security_audit_log","order":1000,"patterns":[".security_audit_log*"]}

*-logins-* comes close, but should not be applied because of the trailing -.
I also set order: 1 on that logins template afterwards, but the issue persists.
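
The trailing-dash reasoning can be checked quickly in Python. This is a sketch under the assumption that fnmatchcase behaves like Elasticsearch's * wildcard for these simple patterns; the helper name template_applies and the second index name are hypothetical:

```python
from fnmatch import fnmatchcase


def template_applies(index: str, patterns: list) -> bool:
    """True if any of the template's index patterns matches the index name."""
    return any(fnmatchcase(index, p) for p in patterns)


logins_patterns = ["*-logins-*"]

# The ML results index ends in "-logins", with nothing after it.
print(template_applies(".ml-anomalies-custom-logins", logins_patterns))
# Hypothetical data index with a suffix after "-logins-".
print(template_applies("customer1-logins-2019", logins_patterns))
```

The first call prints False and the second True, which agrees with the expectation that the logins template does not apply to the results index.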

If you delete and recreate the .ml-anomalies-custom-logins index specifying no settings or mappings, what mappings does the index have before any documents are added to it?

DELETE /.ml-anomalies-custom-logins
PUT /.ml-anomalies-custom-logins
GET /.ml-anomalies-custom-logins/_mapping

The delete results in an error because there is no such index.
When I create a new one and check the mappings, they seem correct:
https://privatebin.net/?6371e023646bd8e7#ALwUHyoQ6qyoQRHHSkKc69qYcysdjDEjRbypv72cyKyD

Thanks. This test proves that it's not your index templates interfering with the internal index mappings.

Yes, this is good info. But I'm going to suggest another quick thing to try, again to rule out some weird template conflict. In the API call to create the job, what happens if you choose a completely different index name, such as:

"results_index_name": "bananas"

Does that change anything?

Yes, I can confirm that this works.
But why? What is wrong with my templates... :thinking:

Well - glad it works and you now have a path to get back up and running.

As for root cause, one thing you could consider is taking a diagnostic (https://github.com/elastic/support-diagnostics) and unzipping it locally for your own inspection. You could grep for logins, Packet-Type and even bananas to see where things are referenced (and not referenced).
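
If grep is not handy, the same search can be sketched in Python. The function name grep_tree is made up, and the demo below writes two tiny stand-in files into a temporary directory rather than reading a real diagnostic, so the file names and contents are purely illustrative:

```python
import tempfile
from pathlib import Path


def grep_tree(root: str, terms: list) -> dict:
    """Map each search term to the files under `root` that mention it."""
    hits = {term: [] for term in terms}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for term in terms:
            if term in text:
                hits[term].append(str(path.relative_to(root)))
    return hits


# Demo on stand-in files; point grep_tree at the unzipped diagnostic instead.
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "templates.json").write_text('{"logins": {"index_patterns": ["*-logins-*"]}}')
    (Path(root) / "mapping.json").write_text('{"Packet-Type": {"type": "text"}}')
    demo = grep_tree(root, ["logins", "Packet-Type", "bananas"])

for term, files in demo.items():
    print(term, files)
```

A term with an empty file list is as informative as a hit, since it shows where a name is not referenced.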