HOWTO: Test newly inserted docs against a list of 14,000 words

(tybreizh29) #1

I have to insert XML docs (time-coded words, converted into a mapping for each word) into Elasticsearch.
I have to check whether any words from a 14,000-word list are present in the newly inserted doc.
The 14,000 words are upper case, whereas my words can be upper and/or lower case.
As it's French, we have words with accents; in the list, the upper case of "é" is E, not É.
How can/should I do that?
Here is what part of a record looks like:


I can change my mapping to add a plain text field.

My first guess was to add a plain text field with only upper-case characters and run 14,000 requests. That's ugly, but it's my only idea.
Thanks for your help.
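
The normalization rule described in the question (upper case, with "é" becoming E rather than É) can be sketched on the client side as follows. This is only an illustration under assumptions: the function names `fold` and `listed_words_in` are hypothetical, the word list is assumed to already be in its upper-case, accent-free form, and tokenization is a plain regular-expression split rather than anything Elasticsearch does.

```python
import re
import unicodedata


def fold(text):
    """Upper-case text and strip combining accents, so "é" becomes "E"."""
    # NFD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, keep everything else.
    no_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return no_accents.upper()


def listed_words_in(text, word_list):
    """Return the entries of word_list that occur as whole words in text."""
    tokens = set(re.findall(r"\w+", fold(text)))
    return tokens & set(word_list)


# Hypothetical three-word stand-in for the 14,000-word list.
words = {"ELEVE", "ETE", "MAISON"}
found = listed_words_in("L'élève a été absent", words)
```

Because the list is held in a set, checking one document costs a single pass over its tokens instead of one request per listed word.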

(Isabel Drost-Fromm) #2

What exactly is your goal in doing that? The approach you want to implement might depend on your use case:

If you want to trigger some action whenever any of these words appears in a newly indexed document, the Percolator might be what you want to use:
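
A minimal sketch of how the percolator idea could look, written as plain request bodies so it does not depend on any particular client. Everything named here is an assumption for illustration: the field names `query` and `text`, the analyzer name `folded`, and the choice of one stored `match` query per listed word. The `lowercase` and `asciifolding` token filters are used so that "élève" in a document and "ELEVE" in the list analyze to the same term.

```python
def index_body():
    """Settings and mapping for a hypothetical percolator index."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    # Assumed analyzer name; folds case and accents.
                    "folded": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "asciifolding"],
                    }
                }
            }
        },
        "mappings": {
            "properties": {
                # Holds the stored queries, one per listed word.
                "query": {"type": "percolator"},
                # Describes the text field of the documents to percolate.
                "text": {"type": "text", "analyzer": "folded"},
            }
        },
    }


def word_query(word):
    """Body of one stored query document for a single listed word."""
    return {"query": {"match": {"text": word}}}


def percolate_request(text):
    """Search body asking which stored queries match the given text."""
    return {
        "query": {
            "percolate": {
                "field": "query",
                "document": {"text": text},
            }
        }
    }


# One stored query per word; using the word as the document id would let
# the search hits say directly which listed words were found.
stored = {w: word_query(w) for w in ["ELEVE", "ETE", "MAISON"]}
request = percolate_request("L'élève a été absent")
```

With this layout, each new document needs a single percolate search rather than one request per word, and the ids of the hits name the matching words.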

Others may know better solutions.

