Is there any facility in Elasticsearch to help with sending terms to an
external process after Lucene processing (tokenization, filters, etc.)?
The idea is to have some external analysis / NLP code run against the
documents while keeping all the pre-processing choices consistent and in
one place (i.e. the analysis setup in the Elasticsearch index configuration).
I am not very familiar with Lucene, but I believe their update
request processor is intended for scenarios like this that need a simple
pipeline.
Thanks. I have actually used the term list plugin for some quick
prototypes / experiments.
I actually meant I am not familiar with Solr; Lucene I do have some
familiarity with. In this case I really wanted to be able to send the
analysed text on to some post-processing, either in parallel with or prior to
indexing. I can have the other process load the same set of analyser
config that ES uses with Lucene, but then I have to manage 2 sets of
analysis configuration (external process + ES), plus I am making 2 passes over
the data. Or I can come back and hit the index after it is built, with
maybe the term vector API, but again that is 2 passes over the data.
From the lack of response I am guessing there isn't a facility for this. I
am surprised because I figured a lot of people would be running various
things over their text data to better analyse it, but I might also be
approaching it wrong.
Thanks again!
Kevin
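
One way to avoid keeping a second copy of the analysis configuration is to ask Elasticsearch itself to analyse the text through its `_analyze` endpoint and hand the resulting terms to the external code. The following is only a minimal sketch, not a built-in pipeline facility: the base URL, the index name `docs` and the analyser name `my_analyzer` are hypothetical, the helper functions are made up for illustration, and the JSON request body follows recent Elasticsearch versions (older releases may expect the text and analyser to be passed differently). It still costs one extra request per document.

```python
import json
import urllib.request


def build_analyze_request(base_url, index, analyzer, text):
    """Build a POST request for the index-level _analyze endpoint."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_terms(response_body):
    """Return the analysed terms, in order, from an _analyze JSON response."""
    payload = json.loads(response_body)
    return [tok["token"] for tok in payload.get("tokens", [])]


def analyzed_terms(base_url, index, analyzer, text):
    """Send text through the index's analyser and return the terms.

    Requires a reachable Elasticsearch node; all names are assumptions.
    """
    request = build_analyze_request(base_url, index, analyzer, text)
    with urllib.request.urlopen(request) as response:
        return extract_terms(response.read())
```

With a node running locally, `analyzed_terms("http://localhost:9200", "docs", "my_analyzer", some_text)` would return the terms to pass on to the NLP code, so the analyser definition lives only in the index settings.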
On Tuesday, August 26, 2014 4:56:10 PM UTC-5, Jörg Prante wrote:
If you want to retrieve the term list of an index after Lucene processing
via REST HTTP API, you can try
On Tue, Aug 26, 2014 at 10:41 PM, Kevin B <blais...@gmail.com> wrote:
Is there any facility in elasticsearch to help with sending terms to an
external processes after lucene processing (tokenization, filters, etc)?
The idea here is having some external analysis / nlp code run against the
documents while keeping all the pre-processing choices consistent and in
one place (i.e. the analysis setup in elasticsearch index configuration).
I am not very familiar with Lucene, but I believe possibly their update
request processor is intended for scenarios like this needing a simple
pipeline.