Annotate Text with Numbers and Search by Range ... use annotated_text?

Is it possible to combine a text search with a number range search, where you want to find the numbers near the text?

I have a corpus of unstructured data that includes both numbers and text. The following is a typical (but abbreviated) sample document:

John Smith's specialty is biology, and noted his hourly rates is $150/hour. ... Sarah Marsh's has 6 months of experience in electronics, and has a rate of $50/hour. ... Jane Johnson has worked for many years in the field of electronics, and charges $120/hour.

I want to run searches such as: find documents with the word "electronics" where that word appears near a dollar amount rate between $100-200.

My current solution has a mapping with two fields:

  • A text field to search the text
  • A double field for hourly rates (I extract these dollar amounts from the text at index time)
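For concreteness, here is roughly what that mapping and the combined query look like, written as Python dicts that serialize to the JSON request bodies. The field names `body` and `hourly_rates` and the helper `text_and_range_query` are placeholders made up for this post, not my real names.

```python
import json

# Placeholder field names (assumptions for this sketch only).
mapping = {
    "mappings": {
        "properties": {
            "body": {"type": "text"},            # full document text
            "hourly_rates": {"type": "double"},  # all rates extracted at index time
        }
    }
}


def text_and_range_query(term, low, high):
    """Build a bool query: text match on `body` plus a range filter on `hourly_rates`."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"body": term}}],
                "filter": [
                    {"range": {"hourly_rates": {"gte": low, "lte": high}}}
                ],
            }
        }
    }


query = text_and_range_query("electronics", 100, 200)
print(json.dumps(query, indent=2))
```

The two clauses are evaluated against separate fields of the same document, so nothing in this query ties the matched word to a particular rate.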

Unfortunately, while this mapping allows me to run searches on the text and a range of dollar amounts, it does not store the proximity information, so it cannot search for text near a dollar amount range.
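To make the problem concrete, here is a small plain-Python simulation (no Elasticsearch involved) of the two behaviors on the sample document. The tokenizer, the function names, and the `max_gap` window of 8 tokens are all assumptions chosen for illustration; a real analyzer would tokenize differently.

```python
import re

doc = (
    "John Smith's specialty is biology, and noted his hourly rates is $150/hour. "
    "Sarah Marsh's has 6 months of experience in electronics, and has a rate of $50/hour. "
    "Jane Johnson has worked for many years in the field of electronics, and charges $120/hour."
)

# Naive tokenizer: dollar amounts stay as single tokens, everything else splits on non-word chars.
tokens = re.findall(r"\$\d+(?:\.\d+)?|\w+", doc.lower())


def rate_value(token):
    """Return the numeric value of a '$123' token, or None for ordinary words."""
    return float(token[1:]) if token.startswith("$") else None


def two_field_match(term, low, high):
    """What the current mapping can answer: term anywhere AND any rate in range."""
    rates = [rate_value(t) for t in tokens if rate_value(t) is not None]
    return term in tokens and any(low <= r <= high for r in rates)


def proximity_match(term, low, high, max_gap=8):
    """What I actually want: term within max_gap tokens of a rate in range."""
    term_positions = [i for i, t in enumerate(tokens) if t == term]
    rate_positions = [
        i for i, t in enumerate(tokens)
        if rate_value(t) is not None and low <= rate_value(t) <= high
    ]
    return any(abs(i - j) <= max_gap for i in term_positions for j in rate_positions)


# "electronics" with a rate of $140-160: only the biology rate ($150) is in range.
print(two_field_match("electronics", 140, 160))   # True  (false positive)
print(proximity_match("electronics", 140, 160))   # False (desired answer)
print(proximity_match("electronics", 100, 200))   # True  ($120 is 3 tokens away)
```

The first call is the false positive I want to avoid: the document contains "electronics" and contains a rate in range, but the two are unrelated.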

I see the annotated_text plugin allows you to annotate text and keep proximity information, but that does not appear to work with numbers.

What would be a good data model to enable these types of range + text + proximity searches?
