As mentioned in the blog https://www.elastic.co/blog/elasticsearch-as-a-time-series-data-store, I want to create a mapping template to monitor my application metrics. I have defined the exact mapping from the blog. Now, for each metric I parse from an XML file in Logstash, how do I map it to the fields defined in the template?
For example, I have a field named response_time, and I want to find the mean, min, and max response time across the entire data set.
"properties": {
"@timestamp": { "type": "date", "doc_values": true },
"max": { "type": "integer", "doc_values": true, "index": "no" },
"mean": { "type": "integer", "doc_values": true, "index": "no" },
"min": { "type": "float", "doc_values": true, "index": "no" }
}
How do I use the defined mapping to map the field response_time to these properties and achieve my use case? How do I configure this?
When you submit the document, do you just have the response_time, or have you already calculated the max, mean, and min values?
In the blog, the max, mean, and min values have already been determined prior to the indexing request: some other process takes a batch of raw data, aggregates it, and then inserts the results into Elasticsearch.
You can also have Elasticsearch do this for you. Just insert each document with response_time as a value, then use the built-in aggregation functions to get your max, mean, and min values. Depending on where the data comes from, this can be simpler to set up, but it can also be less efficient.
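As a minimal sketch of that approach (the index/type names metrics/metric and the sample values are assumptions for illustration), you index the raw document and then run a stats aggregation, which returns min, max, and avg in a single request:

POST metrics/metric
{
  "@timestamp": "2016-01-01T00:00:00Z",
  "response_time": 142
}

POST metrics/_search
{
  "size": 0,
  "aggs": {
    "response_time_stats": {
      "stats": { "field": "response_time" }
    }
  }
}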
For example, suppose you get 200 million documents per day and you want to know the average response time across the past year. Your aggregation now needs to work across roughly 73 billion documents (200 million × 365 days). That could take a while, especially if it is something that needs to run regularly.
So the alternative is to keep your first index as normal, but every day take those 200 million records, do whatever aggregation work you want, and insert the results into a separate index. The new index has 1 record per day as opposed to 200 million, so when you want to query the last year, you touch this smaller index instead of the larger one. You are basically doing the heavy aggregation work once instead of on every single request.
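A sketch of that daily rollup (the index names metrics and metrics_daily, the type name summary, and all values here are assumptions): run a stats aggregation over one day of raw data, then index the result as a single summary document.

POST metrics/_search
{
  "size": 0,
  "query": {
    "range": { "@timestamp": { "gte": "2016-01-01", "lt": "2016-01-02" } }
  },
  "aggs": {
    "daily_response_time": {
      "stats": { "field": "response_time" }
    }
  }
}

POST metrics_daily/summary
{
  "@timestamp": "2016-01-01T00:00:00Z",
  "min": 12,
  "max": 1850,
  "mean": 97
}

Note that the summary document lines up with the min/mean/max mapping from the question, so that template would apply to the rollup index rather than the raw one.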
The blog post looks to be doing the latter rather than the former. Logstash cannot do this on the fly, though. You either need to index the data, then pull the aggregated results out and index them into a separate index, or some other software needs to give you data that already has this done.
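If another process hands you records that are already aggregated, the Logstash side becomes a plain pass-through. A minimal sketch (the input source, host, and index name are assumptions):

input {
  # hypothetical source: an upstream job emits one pre-aggregated JSON record per day
  stdin { codec => json }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "metrics_daily"
  }
}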