Hi everyone,
I have a question related to ES internals (providing an extension to search
functionality).
We have a customer that would like to integrate clustering of search
results directly with ES so that it happens as part of the search. The
request is essentially a plain search with some additional parameters
that determine the clustering algorithm to use, etc. I know medcl already
implemented a Carrot2 plugin for ES and I looked at his code, but we need
something more generic that also allows proprietary clustering algorithms
to be used with the plugin in a seamless way. But back to the point.
I've been looking at the architecture and ways this could be accomplished
(by the way -- kudos to everyone involved, the code looks and works very
cool... bonsai cool) and have a few questions that popped up.
1. It seems that the "nicest" way to accomplish the task in question would
   be to plug into the search action, ideally as a SearchPhase (or rather a
   FetchSubPhase). These sub-phases currently seem to be fixed and not
   extensible, and I can already see problems with serialization if the
   search result is augmented at this level. Do you think it's at all
   possible (and a good idea) to try to plug it in there?

2. Since (1) seemed very intrusive, I temporarily implemented a custom
   plugin (with its own action, request and response classes). My code
   does little more than delegate to Search*: currently all the "logic"
   that actually does the clustering resides in a subclass of
   TransportAction. In doExecute it delegates to TransportSearchAction,
   then inside onResponse it clusters the result and returns the augmented
   response to the user.
This works fine, but clustering is computationally heavy, and I wonder
whether TransportAction is a good place to put this logic and what
threading (thread pool) setup should be used to make it fit with the
rest of ES.
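
To make the question concrete, here is a minimal, self-contained sketch of the delegation pattern described above, with the clustering step handed off to a separate pool. It deliberately does not use any ES classes: Listener, SearchDelegate, ClusteredResponse and clusterByFirstLetter are hypothetical stand-ins invented for illustration, and a plain java.util.concurrent executor takes the place of whatever thread pool ES would actually provide. It only shows the shape of the idea, under those assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

public class ClusteringActionSketch {

    /** Hypothetical async callback with a success and a failure path. */
    public interface Listener<T> {
        void onResponse(T response);
        void onFailure(Throwable error);
    }

    /** Hypothetical stand-in for the search action being delegated to. */
    public interface SearchDelegate {
        void execute(String query, Listener<List<String>> listener);
    }

    /** The original hits plus the clusters computed from them. */
    public static final class ClusteredResponse {
        public final List<String> hits;
        public final Map<String, List<String>> clusters;

        ClusteredResponse(List<String> hits, Map<String, List<String>> clusters) {
            this.hits = hits;
            this.clusters = clusters;
        }
    }

    /** Toy pluggable algorithm: groups hits by their first character. */
    public static Map<String, List<String>> clusterByFirstLetter(List<String> hits) {
        Map<String, List<String>> clusters = new LinkedHashMap<>();
        for (String hit : hits) {
            String key = hit.isEmpty() ? "" : hit.substring(0, 1);
            clusters.computeIfAbsent(key, k -> new ArrayList<>()).add(hit);
        }
        return clusters;
    }

    /** Delegates to the search, then clusters the hits on a separate pool. */
    public static void execute(SearchDelegate search, String query,
                               Function<List<String>, Map<String, List<String>>> algorithm,
                               ExecutorService clusteringPool,
                               Listener<ClusteredResponse> listener) {
        search.execute(query, new Listener<List<String>>() {
            @Override
            public void onResponse(List<String> hits) {
                try {
                    // Hand the heavy work off so the thread delivering the
                    // search response is released immediately.
                    clusteringPool.execute(() -> {
                        ClusteredResponse response;
                        try {
                            response = new ClusteredResponse(hits, algorithm.apply(hits));
                        } catch (RuntimeException e) {
                            listener.onFailure(e);
                            return;
                        }
                        listener.onResponse(response);
                    });
                } catch (RuntimeException rejected) {
                    // The pool refused the task (e.g. it was shut down).
                    listener.onFailure(rejected);
                }
            }

            @Override
            public void onFailure(Throwable error) {
                listener.onFailure(error);
            }
        });
    }

    /** Runs the sketch against a fake search that returns three fixed hits. */
    public static ClusteredResponse demo() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CompletableFuture<ClusteredResponse> result = new CompletableFuture<>();
        SearchDelegate fakeSearch =
                (query, l) -> l.onResponse(Arrays.asList("apple", "avocado", "banana"));
        try {
            execute(fakeSearch, "fruit", ClusteringActionSketch::clusterByFirstLetter, pool,
                    new Listener<ClusteredResponse>() {
                        @Override public void onResponse(ClusteredResponse r) { result.complete(r); }
                        @Override public void onFailure(Throwable e) { result.completeExceptionally(e); }
                    });
            return result.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo().clusters); // prints {a=[apple, avocado], b=[banana]}
    }
}
```

The algorithm is passed in as a plain function, which is roughly what "generic" would mean for us: a proprietary implementation could be swapped in without touching the delegation code. Whether a dedicated pool or one of the existing ES pools is the right home for the clustering step is exactly the open question.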
Another problem is that the REST handler could be implemented in pretty
much the same way, but the search-request parsing logic in
RestSearchAction#parseSearchRequest is currently private and there is no
way to reuse it (and I'd say it begs for reuse, since it's far from
trivial and a copy-paste will most likely go out of sync in future versions).
Thanks for all the tips and hints,
Dawid