How to tokenize in this case?


(Anjni G) #1

Hi all,
I am searching an index created with the command below:
curl -XPUT 'http://localhost:9200/myTest/' -d '{
  "settings": {
    "index": {
      "number_of_shards": 3,
      "number_of_replicas": 1,
      "analysis": {
        "analyzer": {
          "myAnalyzer": {
            "tokenizer": "keyword",
            "filter": ["lowercase", "reverse"]
          }
        },
        "filter": {
          "lowercase": {
            "type": "lowercase",
            "preserve_original": "true"
          },
          "reverse": {
            "type": "reverse"
          }
        }
      }
    }
  }
}'

Now, one of the fields in my index types contains data like this:
"the quick brown fox"
"the brown quick fox"
I need to search these with keywords like "quick brown", which should return only the first one; if I search for "brown quick", it should return only the second one and not the first.
How do I achieve this?


(Christoph) #2

Hi,

Which analyzer do the fields you want to search on use? I suspect it's not the one in your example, because the keyword tokenizer doesn't split the input at all: the whole string becomes a single token, which is then lowercased and reversed, like this:

curl -XPOST 'localhost:9200/my_test/_analyze?analyzer=myAnalyzer' -d 'The quick brown fox'
{"tokens":[{"token":"xof nworb kciuq eht","start_offset":0,"end_offset":19,"type":"word","position":0}]}
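To make that single-token behaviour concrete, here is a rough shell imitation of the analyzer chain. It is only a sketch and not Elasticsearch code: the function name simulate_my_analyzer is made up for illustration, and tr and rev merely stand in for the lowercase and reverse filters applied to the one unsplit token.

```shell
# Imitate myAnalyzer: the keyword tokenizer keeps the input as one token,
# then "lowercase" and "reverse" are applied to that whole string.
# simulate_my_analyzer is a hypothetical helper, not an Elasticsearch command.
simulate_my_analyzer() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | rev
}

simulate_my_analyzer 'The quick brown fox'
# prints: xof nworb kciuq eht
```

Because the indexed term is the entire reversed string, a query for just "quick brown" has no individual word tokens to match against.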

In any case, what you are probably looking for is Phrase Matching.
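A minimal sketch of what such a phrase query could look like. The field name content is an assumption (the thread does not name the field), and the sketch also assumes the field is analyzed with a position-aware analyzer such as standard rather than myAnalyzer. The helper names phrase_query and search_phrase are hypothetical, and the body builder does no JSON escaping, so it only suits simple phrases.

```shell
# Build the request body for a match_phrase query.
# $1 = field name (assumed here), $2 = phrase to search for.
phrase_query() {
  printf '{"query":{"match_phrase":{"%s":"%s"}}}' "$1" "$2"
}

# Send the query to the my_test index on a local node (assumed host and port).
# The Content-Type header is needed by newer Elasticsearch versions.
search_phrase() {
  curl -s -XPOST 'localhost:9200/my_test/_search' \
    -H 'Content-Type: application/json' \
    -d "$(phrase_query "$1" "$2")"
}

# Print the body that would be sent; this line needs no running node.
phrase_query content 'quick brown'
```

Under those assumptions, search_phrase content 'quick brown' should match only "the quick brown fox", and search_phrase content 'brown quick' only "the brown quick fox", since a phrase query requires the terms to be adjacent and in the given order.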

Cheers

