Updating Datatype in Elasticsearch


(Barry Williams) #1

Hello All,
I'm a n00b, and I'm having trouble changing a field's datatype in
elasticsearch - so that kibana can use it.

I read in a CSV with logstash. Here is a sample of that CSV:

DateTime,Session,Event,Data/Duration
2014-05-12T21:51:44,1399945863,Pressure,7.00

Here is my logstash config:

input {
  file {
    path => "/elk/Samples/CPAP_07_14_2014/CSV/SleepSheep_07_14_2014_no_header.csv"
    start_position => beginning
  }
}

filter {
  csv {
    columns => ["DateTime","Session","Event","Data/Duration"]
  }
}

output {
  elasticsearch {
    host => localhost
  }
  stdout { codec => rubydebug }
}

Whenever the data reaches elasticsearch, the mapping shows the
"Data/Duration" field as a string, not a float, which prevents kibana from
using it for graphing. I tried to use PUT on elasticsearch to overwrite
the mapping, but it won't let me.

Where should I configure this datatype? In the CSV filter, in the output,
or in elasticsearch itself?

Thanks,
Barry

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/5fac5f75-bcd3-4900-8d0a-94c930e7935c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Brian Yoder) #2

Within my configuration directory's templates/automap.json file is the
following template. Elasticsearch uses this template whenever it generates
a new logstash index each day:

{
  "automap" : {
    "template" : "logstash-*",
    "settings" : {
      "index.mapping.ignore_malformed" : true
    },
    "mappings" : {
      "_default_" : {
        "numeric_detection" : true,
        "_all" : { "enabled" : false },
        "properties" : {
          "message" : { "type" : "string" },
          "host" : { "type" : "string" },
          "UUID" : { "type" : "string", "index" : "not_analyzed" },
          "logdate" : { "type" : "string", "index" : "no" }
        }
      }
    }
  }
}
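
To load a template like this, register it with Elasticsearch once; it will
then be applied to every new matching index. A sketch (the template name
"automap", the file path, and the localhost:9200 address are assumptions
based on the file above):

curl -XPUT 'http://localhost:9200/_template/automap' \
     -d @templates/automap.json

After that, the template takes effect for newly created logstash-* indices
only; existing indices keep their old mappings.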

Note:

  1. How to ignore malformed data (for example, a numeric field that
    occasionally contains "no-data").

  2. How to automatically detect numeric fields. Logstash emits every
    value as a string. Elasticsearch automatically detects dates, but must
    be explicitly configured to detect numeric fields.

  3. How to list fields that must be treated as strings even if they
    contain numeric values, or must not be analyzed, or must not be
    indexed at all.

  4. Disabling the _all field: as long as your logstash configuration
    leaves the message field mostly intact, disabling _all reduces disk
    space and improves performance while keeping full search
    functionality. Just don't forget to also update your Elasticsearch
    configuration to specify message as the default field.

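Alternatively, since you already know which field should be numeric, you
can convert it in logstash itself before it ever reaches Elasticsearch.
A sketch using the mutate filter (column names taken from your CSV):

filter {
  csv {
    columns => ["DateTime","Session","Event","Data/Duration"]
  }
  mutate {
    convert => [ "Data/Duration", "float" ]
  }
}

Once the field arrives as a float, a freshly created index will map it
numerically. Note that an existing index's mapping cannot be changed in
place, which is why your PUT was rejected; you need to reindex or let a
new daily index pick up the corrected type.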
Hope this helps!

Brian


