Customize IndexWriter/Reader/etc within ElasticSearch

cecoke · May 23, 2018, 1:25am

Hello,

Is there a way to customize the Lucene index when using Elasticsearch? For example, I might want to replace text-based search with numerical or image-based search.

rjernst · May 23, 2018, 4:47pm

Lucene is tightly coupled with Elasticsearch, and there is no support for plugging in an alternate index writer or reader. However, you can plug in additional query types. See the SearchPlugin interface. With a query, you can build any kind of matching against the underlying data that you desire.

cecoke · May 24, 2018, 12:28am

Thanks for your quick answer, Ryan!

I am new to Elasticsearch. Can you point me to how a SearchPlugin can be used to plug in a custom search algorithm? Does this mean that, with this plugin, Lucene is not necessary for indexing and searching?

rjernst · May 24, 2018, 4:34am

A SearchPlugin allows adding query implementations. For example, when you run the following search, a term query is executed:

/_search
{
  "query": {
    "term": {
      "myfield": "someterm"
    }
  }
}

The name term is attached to a QueryParser (and also a Writeable.Reader, but that is just an implementation detail of how the query object is passed across nodes). The QueryParser takes the JSON content, in this case { "myfield" : "someterm" }, and parses it into a QueryBuilder. The QueryBuilder is then used to construct the actual Query object. Everything up to this point is boilerplate for plugging in a custom Lucene Query under a name that Elasticsearch will know how to parse in a search.
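To make the flow above concrete, here is a toy model of it in plain Python. It is only a sketch of the name -> parser -> builder -> query chain; none of these names are Elasticsearch or Lucene classes. TermQueryBuilder, parse_term, QUERY_PARSERS and parse_query are all hypothetical, and the "query" here is just a predicate over a dict rather than a Lucene Query.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class TermQueryBuilder:
    """Hypothetical stand-in for a QueryBuilder: holds the parsed parameters."""
    field: str
    value: str

    def to_query(self) -> Callable[[dict], bool]:
        # Stand-in for constructing the actual query object.
        # Here it is only a predicate over a document represented as a dict.
        return lambda doc: doc.get(self.field) == self.value


def parse_term(body: dict) -> TermQueryBuilder:
    # body is the content under the query name, e.g. {"myfield": "someterm"}
    if len(body) != 1:
        raise ValueError("term query expects exactly one field")
    field, value = next(iter(body.items()))
    return TermQueryBuilder(field, value)


# Registry mapping a query name to the parser that understands its body.
QUERY_PARSERS: Dict[str, Callable[[dict], TermQueryBuilder]] = {
    "term": parse_term,
}


def parse_query(request_json: str) -> TermQueryBuilder:
    query = json.loads(request_json)["query"]
    if len(query) != 1:
        raise ValueError("expected exactly one query name")
    name, body = next(iter(query.items()))
    if name not in QUERY_PARSERS:
        raise ValueError(f"unknown query type: {name}")
    return QUERY_PARSERS[name](body)
```

In this model, registering a custom query would amount to adding another entry to QUERY_PARSERS. In a real plugin the equivalent registration goes through the SearchPlugin interface mentioned earlier.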

Implementing a Query is beyond the scope of a simple discuss response. There are numerous examples online, and you can ask questions on the Lucene users mailing list if you need help. At a high level, a Query produces a tree that is used to return matching documents and score them. The implementation can do whatever it likes.
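As a rough illustration of the "tree that matches and scores" idea, here is a toy model in plain Python. It is not Lucene's API: Term, Should and search are hypothetical names, documents are plain dicts, and the scoring rule (sum of the weights of matching leaves) is an assumption chosen for simplicity.

```python
from typing import Dict, List, Optional, Tuple


class Term:
    """Leaf node: matches when a field holds an exact value."""

    def __init__(self, field: str, value: str, weight: float = 1.0):
        self.field = field
        self.value = value
        self.weight = weight

    def score(self, doc: dict) -> Optional[float]:
        # None means "no match"; a number is the score of the match.
        return self.weight if doc.get(self.field) == self.value else None


class Should:
    """Inner node: matches if any child matches, summing the matching scores."""

    def __init__(self, *children):
        self.children = children

    def score(self, doc: dict) -> Optional[float]:
        scores = [s for s in (c.score(doc) for c in self.children) if s is not None]
        return sum(scores) if scores else None


def search(query, docs: Dict[str, dict]) -> List[Tuple[str, float]]:
    # Walk every document, keep the ones the tree matches, best score first.
    hits = []
    for doc_id, doc in docs.items():
        s = query.score(doc)
        if s is not None:
            hits.append((doc_id, s))
    return sorted(hits, key=lambda h: (-h[1], h[0]))
```

A real Query does not scan every document like this; it iterates over index structures. The sketch only shows the shape of the contract: each node decides whether a document matches and what it contributes to the score.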

A word of caution, though, before embarking on this very advanced exercise: numeric queries are already well supported in Elasticsearch and Lucene, and image-based search is also possible. I have seen users break an image into features and then index those features as text (unanalyzed tokens). You would do this translation into features at index time and again at search time, looking to match as many features as possible to find the best match. This is where it normally gets complicated: you need to calculate a score that measures how well the query's features matched a given document's features.
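A minimal sketch of the features-as-tokens approach, in plain Python with no Elasticsearch involved. The feature extractor is purely an assumption for illustration (one token per pixel block, derived from its mean intensity); real systems use far more robust features. image_to_tokens, index_images and search_images are hypothetical names, and the score is simply the count of shared tokens.

```python
from typing import Dict, List, Set, Tuple

Image = List[List[int]]  # grayscale pixels, values 0..255


def image_to_tokens(pixels: Image, block: int = 2, buckets: int = 4) -> Set[str]:
    """Assumed feature extractor: one token per block, from its mean intensity."""
    tokens = set()
    rows, cols = len(pixels), len(pixels[0])
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            cells = [
                pixels[i][j]
                for i in range(r, min(r + block, rows))
                for j in range(c, min(c + block, cols))
            ]
            mean = sum(cells) / len(cells)
            bucket = min(int(mean * buckets / 256), buckets - 1)
            # Token encodes the block position and its quantized brightness.
            tokens.add(f"b{r // block}_{c // block}_q{bucket}")
    return tokens


def index_images(images: Dict[str, Image]) -> Dict[str, Set[str]]:
    # Index time: build an inverted index from token to document ids.
    index: Dict[str, Set[str]] = {}
    for doc_id, pixels in images.items():
        for token in image_to_tokens(pixels):
            index.setdefault(token, set()).add(doc_id)
    return index


def search_images(index: Dict[str, Set[str]], query: Image) -> List[Tuple[str, int]]:
    # Search time: apply the same translation, then count shared tokens per document.
    counts: Dict[str, int] = {}
    for token in image_to_tokens(query):
        for doc_id in index.get(token, ()):
            counts[doc_id] = counts.get(doc_id, 0) + 1
    return sorted(counts.items(), key=lambda h: (-h[1], h[0]))
```

Counting shared tokens is the simplest possible score. The hard part described above is replacing that count with something that reflects how good the match really is.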

cecoke · May 25, 2018, 12:57am

Thank you, I understand that this is no small undertaking for new users.

system · June 22, 2018, 12:57am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
