How to change the data type in Logstash


(Navneet Mathpal) #1

Hi,

I am indexing a CSV file using Logstash. By default it indexes every field as a string in Elasticsearch. I want to change some fields to the date type. How can I do that in Logstash?


(Jeremy Page) #2

Are you using grok to parse it? You can do something like %{NUMBER:uid:int}, which will index uid as an integer. I think this only works for int and float.

From the grok page
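
A minimal sketch of that inline cast, assuming a hypothetical log line such as `uid=42 load=0.75` (the line format and the `load` field are illustrative and not from this thread):

```
filter {
  grok {
    # ":int" and ":float" are the casts grok supports;
    # a capture without a suffix stays a string.
    match => { "message" => "uid=%{NUMBER:uid:int} load=%{NUMBER:load:float}" }
  }
}
```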


(Magnus Bäck) #3

You can use the mutate filter to change the data type of fields, but it doesn't support date conversions. However, I think ES automatically detects strings containing timestamps (at least some formats, like ISO8601) and dynamically maps the field as a date. You may have to use an index template to indicate which fields should be treated as dates, though.
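
As a rough sketch of the mutate approach for non-date fields, assuming hypothetical field names `status_code` and `response_time` (neither appears in this thread):

```
filter {
  mutate {
    # Hypothetical field names. mutate can cast to types such as
    # integer, float and string, but it has no date type.
    convert => {
      "status_code"   => "integer"
      "response_time" => "float"
    }
  }
}
```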


(Mark Walkom) #4

ES does detect some timestamps, see here for more info.


(Navneet Mathpal) #5

@JeremyinNC @magnusbaeck @warkolm I have a column named DEPLOY_TIME (for example: 28-01-2015 19:33:44). When it is indexed into ES, it is mapped as type string:

"DEPLOY_TIME": {
  "norms": {
    "enabled": false
  },
  "type": "string",
  "fields": {
    "raw": {
      "ignore_above": 256,
      "index": "not_analyzed",
      "type": "string"
    }
  }
}

I have also tried the date filter, like this:

filter {
  csv {
    columns => ["colum1", "colum2", "colum3", "DEPLOY_TIME"]
    separator => ";"
  }

  date {
    type => "mycsv"
    match => [ "DEPLOY_TIME", "DD-MM-YYYY HH:mm:ss" ]
  }
}

But it still maps DEPLOY_TIME as a string.
Any idea why, and how can we resolve it?


(Magnus Bäck) #6

You'll probably have to use an explicit mapping (preferably set via an index template) that defines the DEPLOY_TIME field as having the date type (like how Logstash sets up the @timestamp field).
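
Independently of the mapping, the pattern in the date filter above may also need attention. The date filter uses Joda-Time pattern letters, in which `DD` means day of year while `dd` means day of month, and by default it writes the parsed value to @timestamp rather than back to the source field. A sketch, assuming the goal is to keep the parsed value in DEPLOY_TIME:

```
filter {
  date {
    # dd = day of month, MM = month, yyyy = year (Joda-Time pattern letters)
    match  => [ "DEPLOY_TIME", "dd-MM-yyyy HH:mm:ss" ]
    # Without target, the parsed value is written to @timestamp instead.
    target => "DEPLOY_TIME"
  }
}
```

A field that is already mapped as a string in an existing index keeps that mapping, so a change like this would only show up in a newly created index.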


(Navneet Mathpal) #7

Do you mean that I first need to create an index and define its mapping, and then push the data using Logstash? Or should I define a default mapping for every index?


(Magnus Bäck) #8

I don't quite get what you're asking, but I suggest you update the index template used for Logstash indexes by adding an entry for the DEPLOY_TIME field. Then the next Logstash index that gets created has a DEPLOY_TIME field with the correct type.
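
One way to do this, sketched under assumptions: copy the default Logstash index template to a file, add a DEPLOY_TIME property with "type": "date" to its mappings (plus a "format" such as dd-MM-yyyy HH:mm:ss if the raw string is indexed unparsed), and point the elasticsearch output at that file. The file path below is hypothetical.

```
output {
  elasticsearch {
    # Hypothetical path to the edited copy of the Logstash index template
    template           => "/etc/logstash/templates/logstash-deploy.json"
    # Replace a template of the same name already stored in Elasticsearch
    template_overwrite => true
  }
}
```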

