Search for a numeric range inside a string in Elasticsearch


(Pankaj Rawat) #1

I am not sure this is the right platform to discuss this issue. If it is not, please ignore this post.
I am trying to search for a numeric expression inside a field that is indexed as a string in Elasticsearch.
Example

indent code 4.8663 spaces
indent code 121.232 spaces
indent code 12.3232 spaces

Example query

get all strings with "indent code between 1 and 100"

It should return the 1st and 3rd strings but not the 2nd.
For this purpose I tried using a span_near query with a range query wrapped in span_multi.

    {
        "span_near": {
            "in_order": 1,
            "clauses": [
                {
                    "span_term": {
                        "request": "indent"
                    }
                },
                {
                    "span_term": {
                        "request": "code"
                    }
                },
                {
                    "span_multi": {
                        "match": {
                            "range": {
                                "request": {
                                    "to": 100,
                                    "from": 1
                                }
                            }
                        }
                    }
                }
            ],
            "slop": 0,
            "collect_payloads": 0
        }
    }

It returns results, but the wrong ones, because the range is compared with a TermRangeQuery rather than a NumericRangeQuery.
Any help would be appreciated. Please also let me know if there is another approach.


(Jason Wee) #2

If you index that as a string, I don't think the range query will work, because the range is compared lexicographically. Maybe at index time you can add one more field of type float that contains only the numeric value, and query that field?
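
To make the lexicographic point concrete, here is a small sketch in plain Python (not Elasticsearch code). It assumes each number from the sample strings ends up as a single string term, and the helper names are made up for illustration. It compares the terms against the bounds 1 and 100 first as strings and then as numbers.

```python
def in_lexicographic_range(term: str, low: str, high: str) -> bool:
    # Character-by-character string comparison, as a range over string terms would do.
    return low <= term <= high


def in_numeric_range(term: str, low: float, high: float) -> bool:
    # Parse the term first, then compare by numeric value.
    return low <= float(term) <= high


# The three numbers from the sample strings in the question.
terms = ["4.8663", "121.232", "12.3232"]

lexicographic_hits = [t for t in terms if in_lexicographic_range(t, "1", "100")]
numeric_hits = [t for t in terms if in_numeric_range(t, 1, 100)]

print("lexicographic:", lexicographic_hits)
print("numeric:", numeric_hits)
```

As strings, "4.8663" sorts after "100" because "4" is greater than "1", so the string comparison misses values that are numerically inside the range.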


(None) #3

Maybe create a multifield with a custom analyzer that tokenizes the "numbers" out of a string. Probably too far-fetched, lol.

I am no expert, but you would need to write a custom tokenizer that parses your field just for that value, and then use a multifield.

Or, maybe the easier way: just parse your string before indexing and put the number value in a separate field.
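
A minimal sketch of the pre-indexing idea, in Python. It only builds the document and query bodies as dicts and does not talk to a cluster. The field name `indent_code` and the helper `build_document` are hypothetical, and the regex assumes the number always follows the words "indent code" as in the sample strings.

```python
import re

# Assumption: the value of interest always directly follows "indent code".
INDENT_CODE = re.compile(r"indent code (\d+(?:\.\d+)?)")


def build_document(request: str) -> dict:
    """Keep the original string and add the parsed number as its own field."""
    doc = {"request": request}
    match = INDENT_CODE.search(request)
    if match:
        # Hypothetical field; it would need a numeric (e.g. float) mapping.
        doc["indent_code"] = float(match.group(1))
    return doc


samples = [
    "indent code 4.8663 spaces",
    "indent code 121.232 spaces",
    "indent code 12.3232 spaces",
]
docs = [build_document(s) for s in samples]

# Range query body against the numeric field instead of the string field.
range_query = {"range": {"indent_code": {"gte": 1, "lte": 100}}}

# Local check of which sample documents that range would select.
bounds = range_query["range"]["indent_code"]
matching = [
    d["request"]
    for d in docs
    if "indent_code" in d and bounds["gte"] <= d["indent_code"] <= bounds["lte"]
]
print(matching)
```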


(Pankaj Rawat) #4

Thanks, I figured I had that answer coming :smile:

The problem with storing the number in a separate field is that the string could contain many integers and floats, possibly hundreds, so I cannot store them all, and even if I did, I would have no way of knowing which one to match. The string I mentioned in the question is just a sample.
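
One possible way around the "which one to match" problem, sketched in Python under several assumptions: every number is stored together with the words that precede it, as a list of objects that would be mapped as a `nested` field. The field name `measurements`, the keys `label` and `value`, and the two-word context window are all hypothetical choices, not something confirmed in this thread.

```python
import re

NUMBER = re.compile(r"\d+(?:\.\d+)?")


def extract_measurements(text: str, context_words: int = 2) -> list:
    """Pair every numeric token with the words immediately before it."""
    tokens = text.split()
    pairs = []
    for i, token in enumerate(tokens):
        if NUMBER.fullmatch(token):
            label = " ".join(tokens[max(0, i - context_words):i])
            pairs.append({"label": label, "value": float(token)})
    return pairs


# Hypothetical longer string containing more than one number.
text = "indent code 4.8663 spaces margin width 20 px"
doc = {"request": text, "measurements": extract_measurements(text)}

# Query body: label and range must match within the same nested object.
# Assumes "measurements" is mapped as a nested field.
nested_query = {
    "nested": {
        "path": "measurements",
        "query": {
            "bool": {
                "must": [
                    {"match_phrase": {"measurements.label": "indent code"}},
                    {"range": {"measurements.value": {"gte": 1, "lte": 100}}},
                ]
            }
        },
    }
}
print(doc["measurements"])
```

Because each number keeps its own label, hundreds of values can live in one document while the range is still applied only to the value that follows the chosen words.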


(Pankaj Rawat) #5

Hi, did anyone find a solution yet?

