Hi.
I have index with two field in documents like direction_name, direction_code
and data like: Ukraine mobile, 38067 Ukraine mobile, 38063 Ukraine fixed, 380
How I can search operator nume by full phone number like 380673421233 ?
It must return code Ukraine mobile => 38067
And if number 3801231231212
it must return Ukraine fixed => 380.
for MySQL I used query:
SELECT dc_direction
FROM direct_codes
WHERE 3801231231212 LIKE dc_code || '%'
ORDER BY LENGTH(dc_code) DESC LIMIT 1
You want the longest matching prefix? You could do it with a term query against the direction_code field if you set the index_analyzer to use the keyword tokenizer and the query_analyzer to use the edgeNGram tokenizer. You'd have to index the length of the direction_code as integer field and add a sort on it. Its not a particularly strait forward thing to do but its not super hard either.
Under the covers this is implemented super differently than the way MySQL would do it. Expect some overhead on the first query but subsequent queries to be much faster.
Since this kind of list doesn't change much its probably a good idea to _optimize it to a single segment after loading it to get optimal performance. But only because it doesn't change much.
How many of these prefix codes do you have? How deep do you go in the number list and how many regions is it? NANPA's can be broken down into something like ten million of these if you are willing to go to the "exchange" level and that's just two countries.
Anyway, if you only have tens of thousands of these things I'd suggest going with a single shard. It'll be faster to query.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.