Hi all!
I am in the planning phase for a search application and still have
to decide which search engine to use. At first I was leaning towards
Solr but then got pointed to ElasticSearch. And I must admit, it caught
my attention.
From what I have seen so far, this is a great and easy-to-use search
engine. But I have a use case that I'm not sure fits well
with ES.
I am doing analysis of text documents using the UIMA Framework. That is,
I have a text and quite a lot of annotations. For example, my analysis
would mark names in a text as 'person', cities, lakes etc. as 'location',
and so on. It also does some more complicated things (detecting
relations and the like). I already use a UIMA component named Lucas
(Lucene CAS indexer) to create a Lucene index from my annotated texts.
Now, of course, I don't want to bother with using Lucene directly but
would rather go through a full-fledged search engine.
In Solr I managed to write some additional classes which take a CAS and
use Lucas to build a Lucene document. This document is then given to the
normal Solr indexing-process and everything's fine.
Finally, my question: Can I do something similar with ElasticSearch? In
the simplest case I already have a Lucene document object that I'd just
like to hand to the search engine, which should do the rest for me. With
ElasticSearch in particular I don't know where to start. Will this work
with the schemaless approach? Will routing still work?
I'd really appreciate it if you could point me to a mechanism that would
let me add such capabilities. I think I saw a plugin mechanism for ES;
could this be the way to go?
Another possibility would be to convert the format UIMA gives me into
something ElasticSearch understands out of the box. But I don't think
you can express something like Lucene position_increment in JSON, right?
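To make the second option more concrete, here is a rough sketch of the kind of conversion I have in mind. It is plain Python with no UIMA or ElasticSearch calls; the (begin, end, type) tuples only stand in for what a CAS would give me, and the field names ('text', 'person', 'location') are made up for illustration.

```python
import json

def annotations_to_doc(text, annotations):
    """Group the annotated surface strings by annotation type.

    `annotations` is a list of (begin, end, type) character offsets,
    a simplified stand-in for UIMA annotations (assumption).
    """
    doc = {"text": text}
    for begin, end, ann_type in annotations:
        # One multi-valued field per annotation type (hypothetical layout).
        doc.setdefault(ann_type, []).append(text[begin:end])
    return doc

text = "Erik visited Lake Constance."
annotations = [(0, 4, "person"), (13, 27, "location")]
doc = annotations_to_doc(text, annotations)

# This JSON string is what would be sent to the search engine.
print(json.dumps(doc, indent=2))
```

This keeps the annotated strings but drops where they sit in the token stream, which is exactly the position information I am worried about losing.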
Thanks for your help!
Erik