Discard documents when indexing with rules


(Xavier Facq) #1

Hi,

Is it possible to add something like a "rule" to the index mapping, or to some other configuration, that would solve the following problem:

We want to inject a lot of data into a dedicated index, but we would like to skip indexing any of these documents that do not match a rule. There is a feature that does this, but only at the field level:

https://www.elastic.co/guide/en/elasticsearch/reference/6.5/analysis-keep-words-tokenfilter.html

We would like to do the same, but reject the whole document at index time if one of its fields is not in a dictionary.

Any ideas?

Thanks !


(Christian Dahlqvist) #2

You typically identify this before you send the data to Elasticsearch and simply drop the document. As this is a common ingest requirement, Logstash has a drop filter you can use for this. I do not see why you would do this in Elasticsearch.
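
For illustration, a minimal sketch of that pre-filtering step in Python, assuming documents are plain dicts and the dictionary is a collection of allowed terms. The field name `category` and the function `keep_allowed` are hypothetical and not from this thread:

```python
def keep_allowed(docs, field, allowed_terms):
    """Return only the documents whose `field` value is in `allowed_terms`.

    Documents that lack the field are dropped as well.
    """
    allowed = set(allowed_terms)  # set lookup keeps the check cheap
    return [doc for doc in docs if doc.get(field) in allowed]


# Hypothetical sample data: only "books" and "music" are in the dictionary.
docs = [
    {"id": 1, "category": "books"},
    {"id": 2, "category": "spam"},
    {"id": 3},
    {"id": 4, "category": "music"},
]
to_index = keep_allowed(docs, "category", ["books", "music"])
print([doc["id"] for doc in to_index])
```

Only the documents left in `to_index` would then be sent to Elasticsearch.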


(Xavier Facq) #3

Yes, that is what we usually do for other data, but in this new use case it would be interesting to filter documents during indexing, based on a special dictionary.


(Christian Dahlqvist) #4

You could use an ingest pipeline with a drop processor.
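
As a rough sketch of what such a pipeline could look like, the Python below builds a pipeline body with a single `drop` processor guarded by an `if` condition written in Painless. It assumes a 6.5 or later cluster, a small dictionary that can be inlined in the condition, and a top-level field; the field name `category`, the terms, and the helper `build_drop_pipeline` are made up for this example:

```python
import json


def build_drop_pipeline(field, allowed_terms):
    """Build an ingest pipeline body that drops documents whose `field`
    value is not one of `allowed_terms` (hypothetical helper)."""
    for term in allowed_terms:
        # Keep the sketch simple: refuse terms that would need escaping.
        if "'" in term or "\\" in term:
            raise ValueError("unsupported character in term: %r" % term)
    painless_list = "[" + ", ".join("'%s'" % t for t in sorted(allowed_terms)) + "]"
    # The drop processor runs only when this condition is true, i.e. when
    # the field value is missing from the inlined list.
    condition = "!%s.contains(ctx.%s)" % (painless_list, field)
    return {
        "description": "Drop documents whose %s is not in the dictionary" % field,
        "processors": [{"drop": {"if": condition}}],
    }


pipeline = build_drop_pipeline("category", ["music", "books"])
print(json.dumps(pipeline, indent=2))
```

The printed body could be registered with `PUT _ingest/pipeline/keep-known-categories` and applied by indexing with `?pipeline=keep-known-categories`; the pipeline name is only an example. For a large dictionary, inlining every term in the condition is probably not practical, and filtering before ingest may be the better option.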


(Xavier Facq) #5

OK, thank you! This is very interesting for our use case! :+1:

But first we would have to upgrade from 2.4 to 6.x!

Thanks !
Xavier