Hello,
I can't seem to get the hunspell filter to work.
My config:
curl -XPUT "http://localhost:9200/test_index" -d'
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "ngram",
          "filter": [
            "lowercase",
            "standard",
            "hspell"
          ]
        }
      },
      "filter": {
        "hspell": {
          "type": "hunspell",
          "locale": "rs_RS"
        }
      }
    }
  },
  "mappings": {
    "test_mapping": {
      "properties": {
        "name": {
          "index_analyzer": "basic",
          "type": "string",
          "store": true
        }
      }
    }
  }
}'
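A quick sanity check on the config, sketched in Python: it compares the analyzer names referenced in the mapping with the ones defined under settings.analysis.analyzer. The helper name undefined_analyzers and the list of keys it inspects are assumptions made for illustration. The check knows nothing about analyzers that Elasticsearch ships with, so a name it flags is only a candidate for a closer look, not a confirmed error.

```python
# Trimmed copy of the settings and mappings posted above.
config = {
    "settings": {"analysis": {
        "analyzer": {"my_analyzer": {"type": "custom", "tokenizer": "ngram",
                                     "filter": ["lowercase", "standard", "hspell"]}},
        "filter": {"hspell": {"type": "hunspell", "locale": "rs_RS"}},
    }},
    "mappings": {"test_mapping": {"properties": {
        "name": {"index_analyzer": "basic", "type": "string", "store": True},
    }}},
}

# Mapping keys that may name an analyzer (assumed list, for illustration only).
ANALYZER_KEYS = ("analyzer", "index_analyzer", "search_analyzer")


def undefined_analyzers(cfg):
    """Return (field, analyzer) pairs whose analyzer is not defined in settings."""
    defined = set(cfg["settings"]["analysis"].get("analyzer", {}))
    missing = []
    for mapping in cfg["mappings"].values():
        for field, props in mapping.get("properties", {}).items():
            for key in ANALYZER_KEYS:
                name = props.get(key)
                if name is not None and name not in defined:
                    missing.append((field, name))
    return missing


print(undefined_analyzers(config))
```

With the config as posted, the only name it reports is the one used by the "name" field.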
Requests:
1. curl -XGET "http://localhost:9200/_analyze?index_analyzer=my_analyzer&text=raća"
2. curl -XGET "http://localhost:9200/_analyze?index_analyzer=my_analyzer&text=raža"
3. curl -XGET "http://localhost:9200/_analyze?index_analyzer=my_analyzer&text=đaja"
4. curl -XGET "http://localhost:9200/_analyze?index_analyzer=my_analyzer&text=čača"
5. curl -XGET "http://localhost:9200/_analyze?index_analyzer=my_analyzer&text=liše"
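The requests above put the non-ASCII text into the URL as-is. A minimal sketch, assuming the text should be percent-encoded as UTF-8 before it is sent, of how the same five URLs could be built in Python. The helper name analyze_url is hypothetical; the endpoint and query parameters are copied unchanged from the requests above.

```python
from urllib.parse import quote

# Same endpoint as in the requests above.
BASE = "http://localhost:9200/_analyze"


def analyze_url(text, analyzer="my_analyzer"):
    """Build the request URL with the text percent-encoded as UTF-8."""
    return f"{BASE}?index_analyzer={analyzer}&text={quote(text, safe='')}"


for word in ["raća", "raža", "đaja", "čača", "liše"]:
    print(analyze_url(word))
```

Each printed URL can be passed to curl -XGET in place of the raw one, which rules out the client or shell re-encoding the characters on the way to the server.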
Responses:
1. {
  "tokens": [
    {
      "token": "ra",
      "start_offset": 0,
      "end_offset": 2,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "263",
      "start_offset": 4,
      "end_offset": 7,
      "type": "<NUM>",
      "position": 2
    },
    {
      "token": "a",
      "start_offset": 8,
      "end_offset": 9,
      "type": "<ALPHANUM>",
      "position": 3
    }
  ]
}
2. {
  "tokens": [
    {
      "token": "ra",
      "start_offset": 0,
      "end_offset": 2,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "a",
      "start_offset": 3,
      "end_offset": 4,
      "type": "<ALPHANUM>",
      "position": 2
    }
  ]
}
3. {
  "tokens": [
    {
      "token": "273",
      "start_offset": 2,
      "end_offset": 5,
      "type": "<NUM>",
      "position": 1
    },
    {
      "token": "aja",
      "start_offset": 6,
      "end_offset": 9,
      "type": "<ALPHANUM>",
      "position": 2
    }
  ]
}
4. {
  "tokens": [
    {
      "token": "269",
      "start_offset": 2,
      "end_offset": 5,
      "type": "<NUM>",
      "position": 1
    },
    {
      "token": "a",
      "start_offset": 6,
      "end_offset": 7,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "269",
      "start_offset": 9,
      "end_offset": 12,
      "type": "<NUM>",
      "position": 3
    },
    {
      "token": "a",
      "start_offset": 13,
      "end_offset": 14,
      "type": "<ALPHANUM>",
      "position": 4
    }
  ]
}
5. {
  "tokens": [
    {
      "token": "li",
      "start_offset": 0,
      "end_offset": 2,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "e",
      "start_offset": 3,
      "end_offset": 4,
      "type": "<ALPHANUM>",
      "position": 2
    }
  ]
}
So in 1, 3 and 4 the special character is replaced with a number. In 2 and 5 the special character is simply dropped and no number appears. Shouldn't these characters be replaced with a number as well?
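A note that may help narrow this down: the numbers in the responses match the decimal Unicode code points of the characters they replace, and the offsets (for example 4 to 7 for "263" in a four-letter word) suggest the analyzer received a longer string than the one typed, possibly a numeric entity such as &#263;. The two characters that vanish are also the only ones that exist in the Windows-1252 code page. Where exactly the text gets mangled is a guess; the sketch below only checks the two observations, and the helper name in_cp1252 is made up for illustration.

```python
# Characters from the five requests, mapped to the numeric token seen in the
# responses (None where the character disappeared without a number).
chars = {"ć": "263", "đ": "273", "č": "269", "ž": None, "š": None}


def in_cp1252(ch):
    """Return True if the character can be encoded in Windows-1252."""
    try:
        ch.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False


for ch, seen in chars.items():
    where = "in cp1252" if in_cp1252(ch) else "not in cp1252"
    print(ch, ord(ch), where, "token seen:", seen)
```

If the pattern holds, the difference between the two groups would come from how the request text is encoded before it reaches the analyzer rather than from the hunspell dictionary itself.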
I'm using the latest aff and dic files.
Thanks.