Hi,
I'm working on a file system crawler and indexer, using a combination of FSCrawler and an ingest pipeline (Painless scripts) to run transformations/computations on the text extracted from documents.
My question is: is it possible to dynamically cancel or abort the processing/indexing of a document based on logic in the pipeline script?
If you throw an exception inside the pipeline, the document won't be indexed. The error will also propagate back to the client, so you need to make sure the client ignores the error and doesn't retry.
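A minimal sketch of what that could look like: an ingest pipeline with a script processor that throws when some condition is met. The pipeline name, the `content` field, and the skip condition here are all hypothetical placeholders, not something from FSCrawler's defaults.

```json
PUT _ingest/pipeline/skip-unwanted
{
  "description": "Abort indexing when the extracted text is missing (example condition)",
  "processors": [
    {
      "script": {
        "source": """
          // 'content' is a placeholder field name; adapt to your mapping
          if (ctx.content == null || ctx.content.trim().isEmpty()) {
            throw new IllegalArgumentException("skipped: no extracted text");
          }
        """
      }
    }
  ]
}
```

Any document that triggers the throw fails the pipeline and is not indexed; the exception message comes back in the error response.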
Hi, thank you for your answer.
One more question, please: if this failure happens inside a bulk request, will the other documents still be indexed, or is the whole bulk rejected/discarded?