How do I set a field type using grok?

naisanza · June 30, 2015, 8:11pm

I know that %{NUMBER:field:integer} works; actually, I think this works too: %{NUMBER:field:int}

I use the grok filter heavily and would like to know if I can set the field type in a custom pattern without needing to use mutate to convert the field to an integer.

For instance, will any of these work by design?

(?<field>regex):integer
or
((?<field>regex):integer)
or
(?<field:integer>regex)

eperry · June 30, 2015, 8:19pm

The nice way of doing something like this is

%{NUMBER:fieldname:int}
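
For a custom pattern, one approach that should work is to give the pattern a name and then reference it with the same three-part %{SYNTAX:SEMANTIC:TYPE} form. A minimal sketch, assuming a grok plugin version that supports the pattern_definitions option (older versions would need a patterns_dir file instead); MYCOUNT, its regex, and the field name "count" are made up for illustration:

```
filter {
  grok {
    # MYCOUNT is a hypothetical custom pattern, defined inline for this example
    pattern_definitions => { "MYCOUNT" => "[0-9]{1,6}" }
    # The trailing :int asks grok to cast the captured text to an integer
    match => { "message" => "count=%{MYCOUNT:count:int}" }
  }
}
```

The grok documentation only describes the type slot for the %{...} form, with int and float as the supported conversions. Whether any inline (?<field>regex) variant accepts a type is worth testing rather than assuming; if it doesn't, that form would still need mutate { convert => ... } afterwards.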

theuntergeek · June 30, 2015, 8:36pm

I don't think that the :integer type-casting will work, but you should make some simple tests with output { stdout { codec => rubydebug } } to find out.
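
A throwaway pipeline for that kind of test might look like the sketch below. The field name "value" is arbitrary, and the config assumes you type one number per line on stdin:

```
input { stdin { } }

filter {
  grok {
    # Swap :int for :integer (or :float) here and rerun to compare results
    match => { "message" => "%{NUMBER:value:int}" }
  }
}

output { stdout { codec => rubydebug } }
```

In the rubydebug output a string shows up in quotes and a number shows up bare, so it is easy to see whether the cast actually happened.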

While Logstash does some typing of its own (e.g. :int, :float, and :string, the default type), and these are great for typing and comparison operations, Elasticsearch does it much more broadly, and to greater effect, through manually mapping the core numeric types:

The type of the number. Can be float, double, integer, long, short, byte.

As with Java typing, a float will take up less "space" than a double, and the same is true of the integer types: long > integer > short > byte.

When you declare an integer with :int or mutate convert in Logstash, it will appear to be an integer in the JSON sent to Elasticsearch. Likewise, :float will appear to be a floating point value in the JSON. The trouble is that Elasticsearch doesn't know the scale you intend, so it guesses double for any floating point value, and long for any integer value (see the first line of the Core Types documentation). This guesswork on Elasticsearch's part doesn't usually hamper anyone, but for performance and storage reasons, I'd recommend using the smallest type that will fit your data—if you can manually map and type your fields, that is.

naisanza · June 30, 2015, 9:24pm

Very, very explanatory! Eventually, storage size and performance will matter, since the data will likely reach hundreds of terabytes. The current working datasets aren't that large, but the strain can already be felt on a moderately beefy VM (Java core dumps, out-of-memory errors).

One thing I haven't tested out yet is the ability to re-index. With that, even if data is initially poorly indexed, re-indexing will allow those indexes to become more useful, without needing to reparse the original data.
