Hello!
I have configured an analyzer in my YML that drops every word that does
not start with # or @, so that only hash tags and at tags remain. The
analyzer works fine through the analyze API, but when I index data it
does not seem to be applied.
I thought that when indexing, analyzers would replace the original
contents with the analysis result. Is that not so?
Thank you for your help!
André
Here is my YML configuration:
index.analysis.analyzer.tags:
  type: custom
  tokenizer: whitespace
  filter: fntags, fnsize
index.analysis.filter.fntags:
  type: pattern_replace
  pattern: "^[^#@]+.*$"
  replacement: ""
index.analysis.filter.fnsize:
  type: length
  min: 2
  max: 200
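As a rough illustration of what this chain is meant to do to a piece of text, here is a small Python sketch. It only imitates the three steps (whitespace tokenizer, pattern_replace filter, length filter) with Python's re module; it is not Elasticsearch code, and the function name analyze_tags is made up for this example.

```python
import re

# Same regular expression as the fntags filter in the configuration above.
NON_TAG = re.compile(r"^[^#@]+.*$")


def analyze_tags(text, min_len=2, max_len=200):
    """Imitate the 'tags' analyzer: keep only tokens starting with # or @.

    Hypothetical helper for illustration only; the real analysis happens
    inside Elasticsearch, not in client code like this.
    """
    # Step 1: whitespace tokenizer (split on runs of whitespace).
    tokens = text.split()
    # Step 2: pattern_replace (tokens not starting with # or @ become "").
    replaced = [NON_TAG.sub("", token) for token in tokens]
    # Step 3: length filter (drops the emptied tokens and anything too long).
    return [token for token in replaced if min_len <= len(token) <= max_len]


if __name__ == "__main__":
    print(analyze_tags("NO a la violencia! #lifeworthbetter"))
    # prints ['#lifeworthbetter']
```

The length filter is what actually removes the unwanted words here: pattern_replace only empties them, and the minimum length of 2 then discards the empty tokens.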
Here is my type mapping for the field:
Hash_Tags: {
  analyzer: tags
  type: string
}
Here is the result when using the analyzer API:
curl -XGET 'localhost:9200/catalog/_analyze?field=Hash_Tags&pretty=true' -d 'NO a la violencia! Comparta esto en la medida de lo posible si espera verdaderamente un mejor mundo para Navidad... y despus! #lifeworthbetter'
{
  "tokens" : [ {
    "token" : "#lifeworthbetter",
    "start_offset" : 126,
    "end_offset" : 142,
    "type" : "word",
    "position" : 23
  } ]
}
And here the result for a match_all query:
Hash_Tags: " NO a la violencia! Comparta esto en la medida de lo posible si
espera verdaderamente un mejor mundo para Navidad... y después!
#lifeworthbetter"
I was expecting:
Hash_Tags: "#lifeworthbetter"