I am processing incoming documents via Logstash, and I would like to send the documents to multiple different indices. I have found this approach to solve the problem.
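For context, the approach boils down to declaring two elasticsearch outputs in the same pipeline. A minimal sketch of what my output block looks like (hosts and index names are placeholders for my real setup):

```
output {
  # Copy of each event goes to the first index
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "old_index"
  }
  # And another copy goes to the second index
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "new_index"
  }
}
```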
I have the following question about this approach:
If a document hits an error (let's say a mapping parser exception) while being indexed into the first index, will it also fail to be indexed into the second index? And what happens to the documents that come after it?
What you're describing is by design.
If one output fails, execution of the entire pipeline stalls. Rather than queueing failed events internally, Logstash applies backpressure upstream, so the documents behind the failed one wait until the output recovers.
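If you need the two outputs to fail independently, the usual workaround is the pipeline-to-pipeline "output isolator" pattern: each Elasticsearch output gets its own pipeline with its own persistent queue. A sketch of a `pipelines.yml`, assuming a Beats input on port 5044 and a single local cluster (adjust hosts, ports, and index names to your setup):

```
# pipelines.yml -- output isolator pattern (sketch)
- pipeline.id: intake
  config.string: |
    input { beats { port => 5044 } }
    # Fan each event out to both downstream pipelines
    output { pipeline { send_to => ["old-es", "new-es"] } }

- pipeline.id: old-es
  queue.type: persisted
  config.string: |
    input { pipeline { address => "old-es" } }
    output { elasticsearch { hosts => ["http://localhost:9200"] index => "old_index" } }

- pipeline.id: new-es
  queue.type: persisted
  config.string: |
    input { pipeline { address => "new-es" } }
    output { elasticsearch { hosts => ["http://localhost:9200"] index => "new_index" } }
```

With this layout, if one output blocks, only that pipeline's queue backs up while the other pipeline keeps indexing.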
You may want to look into cross-cluster replication.
Why do you have so many clusters in the first place?
Yeah, let me explain why I am doing that.
Actually, I was reindexing my data. I have completed the reindex from my "old_index" to "new_index", but before shifting entirely to "new_index" I would like to watch for a few days whether the mapping I defined in "new_index" works fine, so I want the data to still be indexed into "old_index" as well. I hope this makes sense.
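For reference, the reindex itself was just the standard `_reindex` call, roughly:

```
POST _reindex
{
  "source": { "index": "old_index" },
  "dest":   { "index": "new_index" }
}
```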