Custom similarity in 6.3?

Hi there,

We were using a custom similarity that reads three float values from the payload. Unfortunately, this commit breaks it, since the refactor effectively prevents plugins from providing similarities.

I guess it could be possible to use a text payload with the three values and parse that, but that sounds hugely inefficient. Any thoughts on how to keep our custom similarity without forking from the main repo?

Thanks!
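
For anyone weighing the two payload encodings mentioned above, here is a minimal sketch in plain Java with no Elasticsearch or Lucene dependencies. The class name PayloadCodec and its methods are hypothetical, and the byte layout (three big-endian floats, or a comma-separated UTF-8 string) is only an assumption about how such a payload might be stored. The offset and length parameters are there because Lucene exposes payloads as a slice of a larger byte array.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Elasticsearch or Lucene.
final class PayloadCodec {

    // Binary form: three big-endian IEEE 754 floats, 12 bytes in total.
    static byte[] encodeBinary(float a, float b, float c) {
        return ByteBuffer.allocate(3 * Float.BYTES)
                .putFloat(a).putFloat(b).putFloat(c)
                .array();
    }

    // Reads three floats starting at the given offset, with no string allocation.
    static float[] decodeBinary(byte[] bytes, int offset) {
        ByteBuffer buf = ByteBuffer.wrap(bytes, offset, 3 * Float.BYTES);
        return new float[] { buf.getFloat(), buf.getFloat(), buf.getFloat() };
    }

    // Text form: "a,b,c" encoded as UTF-8 (assumed format for illustration).
    static byte[] encodeText(float a, float b, float c) {
        return (a + "," + b + "," + c).getBytes(StandardCharsets.UTF_8);
    }

    // Decoding text allocates a String, splits it, and parses each part.
    static float[] decodeText(byte[] bytes, int offset, int length) {
        String[] parts = new String(bytes, offset, length, StandardCharsets.UTF_8).split(",");
        return new float[] {
            Float.parseFloat(parts[0]),
            Float.parseFloat(parts[1]),
            Float.parseFloat(parts[2])
        };
    }

    public static void main(String[] args) {
        byte[] binary = encodeBinary(0.5f, 1.25f, 3.0f);
        byte[] text = encodeText(0.5f, 1.25f, 3.0f);
        System.out.println("binary bytes: " + binary.length);
        System.out.println("text bytes:   " + text.length);
    }
}
```

The text variant does extra work per scored term (string decoding, splitting, parsing), which is the cost in question. How much that matters in practice would have to be measured on real queries.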

Have you run any benchmarks to measure how large the performance cost of this parsing would actually be, and whether it justifies forking Elasticsearch?

I guess I was a bit off: scripted similarity (where you specify Java code in the JSON request) is what I had in mind. But it looks like scripting engines are the better way to integrate custom similarities in 6.3, right?

Yes, this is the preferred way to use custom similarity algorithms in the latest versions.

While a ScriptEngine is certainly one way to use a custom similarity, it is not the only way. You can still have a plugin override onIndexModule and then call addSimilarity on the IndexModule it receives. The commit you reference was a cleanup that removed a level of indirection and consolidated the factories that create the Lucene-provided similarities. What about it prevents plugins from providing similarities?

Thanks for the pointer, that's what I was looking for. Since the Elastic code documentation often falls a bit short: is there something like a book or wiki that's kept up to date with the current versions? In particular, one that covers how the internal APIs work together? I read Mike McCandless's book, but it covers Lucene 3 (!)...

Do you mean for Lucene? For Elasticsearch, we have slowly been improving the Javadocs, but a lot of what is accessible is not meant to be used by plugins. At this time, plugin development is still the Wild West. Reading the code for the classes you are interacting with is the best approach.

Yup, I meant Elasticsearch -- and yes, reading the code helps, but it's quite hard at times...
