Help with my first Elasticsearch mapping

What I have is Elasticsearch ingesting something like:

{
	"policy": {
		"name": "account-cloudtrail-enabled",
		"resource": "account"
	},
	"metrics": [{
		"MetricName": "ResourceCount",
		"Timestamp": "2021-12-06T11:29:48.934903",
		"Value": 0,
		"Unit": "Count"
	}, {
		"MetricName": "ResourceTime",
		"Timestamp": "2021-12-06T11:29:48.934920",
		"Value": 0.8265008926391602,
		"Unit": "Seconds"
	}]
}

And Elasticsearch is upset that the data type of "Value" differs between the two objects: it sees a "long" in one and a "float" in the other.

"error"=>{"type"=>"illegal_argument_exception", "reason"=>"mapper [cc-data.metrics.Value] cannot be changed from type [float] to [long]"

So, I have been advised to create a "mapping" in Elasticsearch to define the data types as floats at that point. OK... based on what I read here: Nested field type | Elasticsearch Guide [7.16] | Elastic
I think I need something like:

curl -X PUT "localhost:9200/_component_template/myindex?pretty" -H 'Content-Type: application/json' -d'
{
  "template": {
    "mappings": {
      "_source": {
        "enabled": false
      },
      "properties": {
        "cc-data.metrics.Value": {
          "type": "float"
        }
      }
    }
  }
}
'

What would help most is a little guidance on 2 things:
a. How do I manage these templates? Removing them, adding them, listing them?
b. How do I change the data type for a field inside a data structure like this? I'm just confused as to what all the fields mean.

This is managed via the Elasticsearch APIs at this point; there's no UI in Kibana. Check out _cat/templates as a start.
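As a rough sketch of what listing, adding, and removing templates through the REST API can look like, the Python below only builds the requests and does not send them. The host and port are taken from the curl example in the question, the template name "myindex" is just the one used there, and the helper names are made up for illustration. The endpoints used are _cat/templates and _component_template.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: same host/port as the curl example


def build_request(method, path, body=None):
    """Build (but do not send) a request for the Elasticsearch REST API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if body is not None else {}
    return urllib.request.Request(
        f"{ES_URL}/{path}", data=data, headers=headers, method=method
    )


def list_templates():
    # Lists templates in a compact table; ?v adds column headers.
    return build_request("GET", "_cat/templates?v")


def get_component_template(name):
    # Shows the full definition of one component template.
    return build_request("GET", f"_component_template/{name}")


def put_component_template(name, template):
    # Creates or replaces a component template.
    return build_request("PUT", f"_component_template/{name}", {"template": template})


def delete_component_template(name):
    # Removes a component template.
    return build_request("DELETE", f"_component_template/{name}")
```

To actually run one of these against a cluster, pass the request to urllib.request.urlopen(), for example urllib.request.urlopen(list_templates()).read().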

For the data type clashes, you will need to reindex the data to make them consistent.
However, looking at your example document and your mapping, I would suggest that your approach is not the best way. You should really approach this as time-based data: given that you have measurements at a specific time, it makes sense to split these data points out into individual events, and then store the data using ILM/
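On the data type question, one way to spell out a field that sits inside other objects is to nest "properties" blocks, one per level. The sketch below builds such a component template body in Python. The path cc-data.metrics.Value is taken from the error message in the question, and the function name is made up for illustration. This only affects indices created after the template is in place, which is why existing data has to be reindexed.

```python
import json


def float_value_template():
    """Component template body that maps cc-data.metrics.Value as a float."""
    return {
        "template": {
            "mappings": {
                "properties": {
                    "cc-data": {
                        "properties": {
                            "metrics": {
                                "properties": {
                                    # One "properties" level per object in the path.
                                    "Value": {"type": "float"}
                                }
                            }
                        }
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that would go after -d in a curl PUT.
    print(json.dumps(float_value_template(), indent=2))
```

A component template does nothing by itself; it takes effect once an index template lists it under composed_of.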

Thank you for your suggestions.

From what I gather, setting up a template is 100% correct. As you can tell, I am rather new to Logstash, so I will look longer at that. I think (but I am unsure) that a template might be created so that everything under "metrics" is nested (see: Nested field type | Elasticsearch Guide [7.16] | Elastic).

I also think I see what you are saying about ILM. As this project gets more mature, that will be necessary. To do this, I need to figure out how to make the above work with a series of indices. So many things to learn.

I did figure out what is really causing the error. With the default way that the index is created, the first object under "metrics" is created with no problem. Then the second object under "metrics" just alters the data, losing the original data. In doing this, the field "Value" changes data type.

All this has to do with the default way that the data structure is flattened. What is stored at first in ES is something like:

"policy.name": "account-cloudtrail-enabled",
"policy.resource": "account",
"metrics.MetricName": "ResourceCount",
"metrics.Timestamp": "2021-12-06T11:29:48.934903",
"metrics.Value": 0,
"metrics.Unit": "Count"

but what I really need is something more like:

"policy.name": "account-cloudtrail-enabled",
"policy.resource": "account",
"metrics.ResourceCount.Timestamp": "2021-12-06T11:29:48.934903",
"metrics.ResourceCount.Value": 0,
"metrics.ResourceCount.Unit": "Count",
"metrics.ResourceTime.Timestamp": "2021-12-06T11:29:48.934920",
"metrics.ResourceTime.Value": 0.8265008926391602,
"metrics.ResourceTime.Unit": "Seconds"
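
A minimal sketch of that reshaping, done in Python before the document is sent to Elasticsearch. The function name is made up for illustration, and it assumes every metric object carries a "MetricName" key, as in the sample document.

```python
def key_metrics_by_name(doc):
    """Flatten a document so each metric's fields are keyed by its MetricName."""
    # Copy the policy fields under a "policy." prefix.
    flat = {f"policy.{key}": value for key, value in doc["policy"].items()}
    for metric in doc["metrics"]:
        name = metric["MetricName"]
        for field, value in metric.items():
            if field != "MetricName":
                flat[f"metrics.{name}.{field}"] = value
    return flat


doc = {
    "policy": {"name": "account-cloudtrail-enabled", "resource": "account"},
    "metrics": [
        {"MetricName": "ResourceCount", "Timestamp": "2021-12-06T11:29:48.934903",
         "Value": 0, "Unit": "Count"},
        {"MetricName": "ResourceTime", "Timestamp": "2021-12-06T11:29:48.934920",
         "Value": 0.8265008926391602, "Unit": "Seconds"},
    ],
}

for key, value in key_metrics_by_name(doc).items():
    print(f"{key}: {value!r}")
```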

And if that were not complicated enough, the name of the metric should be arbitrary; e.g., it might be different in some other file.

"policy.name": "account-cloudtrail-enabled",
"policy.resource": "account",
"metrics.ResourceCount.Timestamp": "2021-12-06T11:29:48.934903",
"metrics.ResourceCount.Value": 0,
"metrics.ResourceCount.Unit": "Count",
"metrics.ResourceTime.Timestamp": "2021-12-06T11:29:48.934920",
"metrics.ResourceTime.Value": 0.8265008926391602,
"metrics.ResourceTime.Unit": "Seconds",
"metrics.ResourceFoo.Timestamp": "2021-12-06T11:29:48.934920",
"metrics.ResourceFoo.Value": true,
"metrics.ResourceFoo.Unit": "bar"

You should really split each measurement into its own event (document).
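
As a sketch of what that split could look like, the Python below turns one incoming document into one event per entry in "metrics", each carrying a copy of the policy fields. The function name is made up for illustration, and converting "Value" to a float is an assumption added here so that every event uses the same numeric type; it would fail for values that are not numeric.

```python
def split_into_events(doc):
    """Return one event per metric, each with its own copy of the policy fields."""
    events = []
    for metric in doc["metrics"]:
        event = {"policy": dict(doc["policy"])}
        event.update(metric)
        # Assumption: Value is always numeric, so store it as a float everywhere.
        event["Value"] = float(metric["Value"])
        events.append(event)
    return events


doc = {
    "policy": {"name": "account-cloudtrail-enabled", "resource": "account"},
    "metrics": [
        {"MetricName": "ResourceCount", "Timestamp": "2021-12-06T11:29:48.934903",
         "Value": 0, "Unit": "Count"},
        {"MetricName": "ResourceTime", "Timestamp": "2021-12-06T11:29:48.934920",
         "Value": 0.8265008926391602, "Unit": "Seconds"},
    ],
}

for event in split_into_events(doc):
    print(event)
```

Each event then has a single "Timestamp" and a single "Value", which fits the time-based approach suggested earlier in the thread.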
