Not analyzing string data


#1

Hi,

I'm having an issue with Elasticsearch splitting the words in my "request" field. I have found some clues about analyzed and not_analyzed fields, which should do what I want: search on the whole sentence.

Here is my current mapping:
curl -XGET 'http://localhost:9200/pglog/_mapping/logs'
{"pglog":{"mappings":{"logs":{"properties":{"@timestamp":{"type":"date","format":"dateOptionalTime"},"@version":{"type":"string"},"contenu":{"type":"string"},"database":{"type":"string"},"date":{"type":"string"},"datestamp":{"type":"string"},"duration":{"type":"long"},"host":{"type":"string"},"message":{"type":"string"},"path":{"type":"string"},"request":{"type":"string"},"tags":{"type":"string"}}}}}}

The "request" field has no explicit "index" setting, and I can't set it to "not_analyzed":

curl -XPUT http://localhost:9200/pglog/_mapping/logs -d '{
  "logs": {
    "properties": {
      "request": {
        "type": "string",
        "index": "not_analyzed"
      }
    }
  }
}'
{"error":"MergeMappingException[Merge failed with failures {[mapper [request] has different index values, mapper [request] has different tokenize values, mapper [request] has different index_analyzer]}]","status":400}

I get this field from a grok match filter:
grok {
  match => ["contenu", " db=%{DATA:database},user=%{DATA} duration: %{NUMBER:duration} ms %{GREEDYDATA:request}"]
}

Is there a way to tell Logstash that I want this field to be "not_analyzed"? Or is there another approach?

Thibaut


(Mark Walkom) #2

You need to do that in Elasticsearch, by specifying a mapping for the field before the data leaves LS.
LS is just a "dumb" pipeline in that regard.


#3

OK, I got it, thanks. My curl request to set the field's index to not_analyzed was correct, but it has to be executed before any data is saved in Elasticsearch.
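
For anyone hitting the same error, here is a minimal sketch of that ordering. It assumes Elasticsearch 1.x on localhost:9200 and that the existing pglog index can be thrown away (deleting it erases the logs already stored). The ES_URL variable and the recreate_index function name are only illustrative and not part of any tool.

```shell
#!/bin/sh
# Assumption: Elasticsearch 1.x is reachable here; override ES_URL if not.
ES_URL="${ES_URL:-http://localhost:9200}"

# Index body declaring "request" as not_analyzed, so the whole
# sentence is kept as a single term instead of being split into words.
MAPPING='{
  "mappings": {
    "logs": {
      "properties": {
        "request": { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}'

# Hypothetical helper: call it while Logstash is stopped.
recreate_index() {
  # WARNING: this deletes every document in pglog. The existing
  # mapping for "request" cannot be changed in place.
  curl -XDELETE "$ES_URL/pglog"
  # Create the index with the mapping before the first event arrives.
  curl -XPUT "$ES_URL/pglog" -d "$MAPPING"
}
```

Calling recreate_index and only then starting Logstash means the first event indexed already finds "request" mapped as not_analyzed, so no merge conflict occurs.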

