CSV file indexed to %{[@metadata] instead of target


#1

I am attempting to ingest a CSV file using Logstash into an Elasticsearch index that I pre-defined with a PUT request. It appears that Logstash is ingesting the CSV file not into the intended index, but into one labelled:

%{[@metadata][beat]}-2017.08.04

I am pasting the relevant Logstash conf file, as well as the PUT request I used to pre-build the index. High-fives for any help:

BEGIN LOGSTASH CONF

input {
	file {
		path => "/var/elk/csv/sep/*.csv"
		start_position => "beginning"
		sincedb_path => "/dev/null"
	}
}
filter {
	csv {
		separator => ","
		columns => ["Pattern_Date","Operating_System","Client_Version","Policy_Serial",
					"HI_Status","Status","Auto_Protect_On","Worst_Detection",
					"Last_Scan_Time","Antivirus_engine_On","Download_Insight_On",
					"SONAR_On","Tamper_Protection_On","Intrusion_Prevention_On",
					"IE_Browser_Protection_On","Firefox_Browser_Protection_On",
					"Early_Launch_Antimalware_On","Computer_Name","Server_Name",
					"MAC_Address1"]
	}
}
output {
	elasticsearch {
		hosts => "http://localhost:9200"
		index => "sep-index"
	}
}

END CONF

BEGIN INDEX CREATION

PUT sep-index
{
  "mappings": {
    "logs": {
      "properties": {
        "@timestamp": { "type": "date", "format": "basic_date" },
        "@version": { "type": "string" },
        "Pattern_Date": { "type": "date", "format": "basic_date" },
        "Operating_System": { "type": "keyword" },
        "Policy_Serial": { "type": "keyword" },
        "HI_Status": { "type": "keyword" },
        "Status": { "type": "keyword" },
        "Auto_Protect_On": { "type": "keyword" },
        "Worst_Detection": { "type": "keyword" },
        "Last_Scan_Time": { "type": "date", "format": "basic_date" },
        "Antivirus_engine_On": { "type": "keyword" },
        "Download_Insight_On": { "type": "keyword" },
        "SONAR_On": { "type": "keyword" },
        "Tamper_Protection_On": { "type": "keyword" },
        "Intrusion_Prevention_On": { "type": "keyword" },
        "IE_Browser_Protection_On": { "type": "keyword" },
        "Firefox_Browser_Protection_On": { "type": "keyword" },
        "Early_Launch_Antimalware_On": { "type": "keyword" },
        "Computer_Name": { "type": "keyword" },
        "Server_Name": { "type": "keyword" },
        "MAC_Address1": { "type": "keyword" }
      }
    }
  }
}
END INDEX CREATION


(David Pilato) #2

Please format your code using the </> icon as explained in this guide. It will make your post more readable.

Or use markdown style like:

```
CODE
```

I moved your question to #logstash


(Magnus Bäck) #3

It appears that logstash is ingesting the CSV file not into the intended elasticsearch index, but into one labelled:

%{[@metadata][beat]}-2017.08.04

That's because you have

index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"

in one of your configuration files, but your CSV events don't have a [@metadata][beat] field. Multiple configuration files are effectively concatenated, so you need to use conditionals if you don't want all events to be sent to all outputs.
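For illustration, the outputs could be wrapped in conditionals like this (a sketch, assuming the other pipeline file handles Beats events, which carry the [@metadata][beat] field; use whatever condition actually distinguishes your events):

```
output {
	if [@metadata][beat] {
		# Events from a Beats input carry this metadata field
		elasticsearch {
			hosts => "http://localhost:9200"
			index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
		}
	} else {
		# Everything else, including events from the CSV file input
		elasticsearch {
			hosts => "http://localhost:9200"
			index => "sep-index"
		}
	}
}
```

With all configuration files concatenated into one pipeline, every event passes through every output block, so the conditional is what keeps CSV events out of the Beats index and vice versa.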


#4

Thank you, Magnus. I found and addressed the issue in one of my other conf files. I still cannot get data from the CSV file into the desired index, but I now know that it has something to do with how I pre-built the index. I know this because when I do not pre-build the index, the CSV file is ingested as intended.

I'm not sure what I did wrong with the PUT, or if perhaps I missed a step after it. Any help is appreciated. Re-pasting the PUT I posted using Kibana development console:

PUT sep-index
{
  "mappings": {
    "logs": {
      "properties": {
        "@timestamp": { "type": "date", "format": "basic_date" },
        "@version": { "type": "string" },
        "Pattern_Date": { "type": "date", "format": "basic_date" },
        "Operating_System": { "type": "keyword" },
        "Policy_Serial": { "type": "keyword" },
        "HI_Status": { "type": "keyword" },
        "Status": { "type": "keyword" },
        "Auto_Protect_On": { "type": "keyword" },
        "Worst_Detection": { "type": "keyword" },
        "Last_Scan_Time": { "type": "date", "format": "basic_date" },
        "Antivirus_engine_On": { "type": "keyword" },
        "Download_Insight_On": { "type": "keyword" },
        "SONAR_On": { "type": "keyword" },
        "Tamper_Protection_On": { "type": "keyword" },
        "Intrusion_Prevention_On": { "type": "keyword" },
        "IE_Browser_Protection_On": { "type": "keyword" },
        "Firefox_Browser_Protection_On": { "type": "keyword" },
        "Early_Launch_Antimalware_On": { "type": "keyword" },
        "Computer_Name": { "type": "keyword" },
        "Server_Name": { "type": "keyword" },
        "MAC_Address1": { "type": "keyword" }
      }
    }
  }
}

(Magnus Bäck) #5

Is Logstash even reading the CSV file? Have you looked in the Logstash log for clues?


#6

Nothing in the logstash logs. However, when I look at my elasticsearch logs I see a huge number of entries that look like this:

[DEBUG][o.e.a.b.TransportShardBulkAction] [3NZmxIH] [sep-index][4] failed to execute bulk item (index) BulkShardRequest [[sep-index][4]] containing [4] requests


(Magnus Bäck) #7

That doesn't look normal. Any other interesting log entries? Is the cluster's health green (or at least yellow)?


#8

When I grepped the logs a different way, I found that Elasticsearch was failing to parse my @timestamp field. When I removed "format": "basic_date" from the field properties, everything started working.
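For reference: basic_date expects values like 20170804 (yyyyMMdd), while Logstash serializes @timestamp as ISO 8601 (e.g. 2017-08-04T12:34:56.789Z), so every bulk item was rejected with a date-parse error. Omitting "format" falls back to Elasticsearch's default (strict_date_optional_time||epoch_millis), which accepts that representation. The fixed field definition would look roughly like:

```
"@timestamp": {
  "type": "date"
}
```

The same reasoning applies to any date field whose source values don't actually match the declared format string.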

Thank you for your help!


(system) #9

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.