Hello there,
I am trying to create an index with custom settings and mappings:
PUT /my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "char_filter": [
            "my_char_filter"
          ],
          "tokenizer": "whitespace",
          "filter": ["lowercase"]
        }
      },
      "char_filter": {
        "my_char_filter": {
          "type": "mapping",
          "mappings": [
            "a => ٨"
          ]
        }
      }
    }
  },
  "mappings": {
    "file": {
      "properties": {
        "attachment.content": {
          "type": "text",
          "analyzer": "my_analyzer"
        }
      }
    }
  }
}
I expected "attachment.content" to be stored with every "a" replaced by "٨" by Elasticsearch, but that is not the case.
When I index a document and search for "٨", I don't find anything:
PUT /my_index/file/{id}?pipeline=attachment
{
  "data": "myencodedbase64data"
}
GET /my_index/file/_search
{
  "_source": [ "attachment.content" ],
  "query": {
    "match_all": {}
  }
}
As you can see, the "a" characters have not been replaced by "٨":
{
  "took": 17,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1718,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_index",
        "_type": "file",
        "_id": "32215",
        "_score": 1,
        "_source": {
          "attachment": {
            "content": """01 Legislative reports"""
          }
        }
      },
      {...}
    ]
  }
}
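To rule out a problem in the analyzer definition itself, one check I could run is the _analyze API against the same index. This is only a sketch: it assumes the index was created with the settings above, and the sample text is simply the content returned in the hit shown here. It returns the tokens the analyzer produces, which is separate from the stored "_source" that the search response displays.

```
GET /my_index/_analyze
{
  "analyzer": "my_analyzer",
  "text": "01 Legislative reports"
}
```

If the char filter is being applied, I would expect the returned tokens to contain "٨" wherever the input had "a".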
Any help is welcome!
Thanks