I’m new to Elasticsearch and evaluating it for a project in which I need to compare a large set of keywords (currently around 3500) against documents (around 500 words long on average) and highlight any matches that occur.
The keywords I have are mostly specialised technical terms and the documents are unstructured natural language text.
Basically, I want to end up with the document text, modified with highlighted keyword matches, plus a separate array of the matched keywords that I can display as a simple list. I think the best way to describe what I’m trying to achieve is a document classification system.
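To make the desired output concrete, here is a naive sketch in plain Python of the result I'm after. It does not use Elasticsearch at all; the function name `highlight_keywords`, the `<em>` tags, and the sample text and keywords are placeholders I made up for illustration, and the simple case-insensitive whole-word matching is only an assumption about what "a match" means.

```python
import re


def highlight_keywords(text, keywords, open_tag="<em>", close_tag="</em>"):
    """Return (highlighted_text, matched_keywords) using naive whole-word matching.

    Hypothetical helper for illustration only; not an Elasticsearch API.
    """
    # Try longer keywords first so multi-word terms win over shorter ones.
    ordered = sorted(set(keywords), key=len, reverse=True)
    if not ordered:
        return text, []

    # \b word boundaries assume keywords start and end with word characters;
    # terms such as "C++" would need different handling.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in ordered) + r")\b",
        re.IGNORECASE,
    )
    canonical = {k.lower(): k for k in ordered}
    matched = []

    def wrap(match):
        found = match.group(0)
        keyword = canonical.get(found.lower(), found)
        if keyword not in matched:
            matched.append(keyword)  # keep order of first appearance
        return f"{open_tag}{found}{close_tag}"

    return pattern.sub(wrap, text), matched


sample_text = "The reverse proxy terminates TLS before the load balancer."
sample_keywords = ["TLS", "load balancer", "reverse proxy", "sharding"]
highlighted, matched = highlight_keywords(sample_text, sample_keywords)
print(highlighted)
print(matched)
```

The first value is the document text with the matches wrapped in tags, and the second is the list of keywords that matched, in order of first appearance. What I'm unsure about is whether Elasticsearch can give me the equivalent of both.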
Things I’d be interested to know are:
- Is Elasticsearch a suitable tool for this kind of use case?
- If it is, are there any specific features or configurations I’ll need to look into?
- What kind of performance can I expect? For example, can I do this sort of thing on demand, or will it require a background task?
I realise these questions are quite broad, but I’d really appreciate any pointers.