We're doing some nested indexing on an events property. Some of the events aren't needed in Elasticsearch, and they are creating a large number of unneeded internal documents.
Is there a way to skip indexing a field based on its value?
For example: if events.type = foo, skip indexing that event.
I've been trying to find a solution all morning but no luck.
We've already requested that they add support for pipelines so we can route documents to an index based on the created_at field, and we hope to have it within the next 1-2 weeks.
From what I understand, we're pretty blocked until we can specify a pipeline to use the ingest layer?
The remove processor removes a field. It does not drop the document by itself.
If I understood correctly, you want the latter.
So you need to do that in another way.
I don't think there is anything built in that does this. Maybe an ingest script, though, which tests the field value and then throws an exception, which will "fail" the document.
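As a rough sketch of what such an ingest script could look like once pipelines are available: the snippet below only builds the pipeline body as a Python dict, it does not talk to a cluster. The `skip_type` parameter name, the `foo` value, and the assumption that `events` is a list of objects with a `type` key are illustrative, not taken from your mapping. It shows two variants: a script that throws to fail the whole document, and one that removes only the matching events via `removeIf`. Neither has been tested against your setup.

```python
import json

# Painless source (assumed shape of the document: ctx.events is a list of maps).
# Variant 1: fail the whole document when any event has the unwanted type.
FAIL_SOURCE = (
    "if (ctx.events != null) {"
    " for (def e : ctx.events) {"
    " if (e.type == params.skip_type) {"
    " throw new IllegalArgumentException('unwanted event type');"
    " } } }"
)

# Variant 2: keep the document and remove only the matching events.
REMOVE_SOURCE = (
    "if (ctx.events != null) {"
    " ctx.events.removeIf(e -> e.type == params.skip_type);"
    " }"
)


def build_pipeline(skip_type, fail_document=False):
    """Build an ingest pipeline body with a single script processor.

    skip_type is a hypothetical parameter name used only in this sketch.
    """
    source = FAIL_SOURCE if fail_document else REMOVE_SOURCE
    return {
        "description": "Skip events of type '%s' (sketch)" % skip_type,
        "processors": [
            {
                "script": {
                    "lang": "painless",
                    "source": source,
                    "params": {"skip_type": skip_type},
                }
            }
        ],
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent when creating the pipeline.
    print(json.dumps(build_pipeline("foo"), indent=2))
```

The printed JSON is what would go in the body of the request that creates the pipeline. Passing the type through `params` rather than hard-coding it in the script keeps the script source constant, so changing the type does not require a new script.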
Correct. We’re basically looking to drop that specific nested doc.event where doc.event.type = X from indexing, but keep the entire doc and the rest of the events.
For more insight, we have 200k individual docs, but due to the nested events on doc.events it translates to 9m docs in the index. If we could ignore a specific event.type it would shed millions off.
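Until pipelines are available, one option that needs nothing on the Elasticsearch side is to filter the events in whatever code sends the documents. The sketch below assumes each document is a dict with an `events` list of dicts carrying a `type` key; the function names are made up for illustration. It also counts the internal documents the way nested fields are stored (one per parent plus one per nested event), which is where the jump from 200k to roughly 9m comes from, about 45 internal documents per parent on average if the 9m figure includes the parents.

```python
def drop_events(doc, skip_types):
    """Return a copy of doc without events whose type is in skip_types."""
    kept = [e for e in doc.get("events", []) if e.get("type") not in skip_types]
    # Shallow copy so the original document is left untouched.
    return {**doc, "events": kept}


def internal_doc_count(docs):
    """One internal document per parent plus one per nested event."""
    return sum(1 + len(d.get("events", [])) for d in docs)


if __name__ == "__main__":
    # Made-up sample data, only to show the effect of the filter.
    docs = [
        {"id": 1, "events": [{"type": "foo"}, {"type": "bar"}]},
        {"id": 2, "events": [{"type": "foo"}]},
    ]
    filtered = [drop_events(d, {"foo"}) for d in docs]
    print(internal_doc_count(docs), "->", internal_doc_count(filtered))
```

This keeps the parent document and the rest of its events, and only the unwanted event type never reaches the index.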
It seems that ultimately we’ll just have to change our data model.