Conditionally add document only if percolator matches a query


(tiftif) #1

Our current setup looks like this:

  1. We have a bunch of queries defined in ES (just a few)
  2. We have an incoming data stream.
  3. We percolate (and thus add) each document into ES.
  4. We check the resulting queries from percolation. If no queries matched,
    we delete the added document.

The problem is that we are constantly adding documents to ES and then
removing them again. After running for just 2 days, ES runs out of memory
for some reason.

Is there a way to conditionally add a document only if it matches one or
more queries after percolation?
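
One way to picture the "percolate first, index only on a match" flow asked about here is the minimal Python sketch below. It is transport-agnostic on purpose: `percolate` is a hypothetical callable (not an Elasticsearch client method) that takes a document and returns the list of matching query names, for example by calling the percolate endpoint without indexing the document. The names `matching_docs` and `fake_percolate` are assumptions made for illustration only.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Doc = Dict[str, object]


def matching_docs(
    docs: Iterable[Doc],
    percolate: Callable[[Doc], List[str]],
) -> Iterator[Tuple[Doc, List[str]]]:
    """Yield (doc, matches) only for documents that match at least one query.

    `percolate` is a caller-supplied function (hypothetical here) that
    returns the names of the registered queries a document matches.
    Documents with no matches are dropped and never reach the index step.
    """
    for doc in docs:
        matches = percolate(doc)
        if matches:
            yield doc, matches


def fake_percolate(doc: Doc) -> List[str]:
    """Stand-in for a real percolate call: matches documents mentioning 'error'."""
    return ["errors"] if "error" in str(doc.get("msg", "")) else []


if __name__ == "__main__":
    stream = [{"msg": "disk error on node 3"}, {"msg": "all good"}]
    for doc, matches in matching_docs(stream, fake_percolate):
        # Only documents that matched would be indexed at this point.
        print(doc, matches)
```

With this shape, nothing is added and then deleted: a document that matches no query is simply never indexed.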


(tiftif) #2

Btw, to be clearer: I understand we can do this one document at a time
(i.e., percolate each document without indexing it, then index only the
ones that match). But is there a bulk API to do this?
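
As a rough sketch of one possible workaround, documents that have already been found to match could be buffered and sent in a single `_bulk` request instead of being indexed one by one. The helper below only builds the newline-delimited request body (one action line followed by one source line per document); `build_bulk_body`, the index name `logs`, and the type name `event` are hypothetical, and the HTTP call itself is left out.

```python
import json
from typing import Dict, Iterable

Doc = Dict[str, object]


def build_bulk_body(docs: Iterable[Doc], index: str, doc_type: str) -> str:
    """Build a newline-delimited body for a bulk index request.

    Each document contributes two lines: an "index" action line naming the
    target index and type, then the document source. The body must end with
    a newline. Returns an empty string if there are no documents.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    if not lines:
        return ""
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Assumed to be the documents that matched at least one percolator query.
    matched = [{"msg": "disk error on node 3"}, {"msg": "error: timeout"}]
    print(build_bulk_body(matched, "logs", "event"), end="")
```

The percolation step still happens per document in this sketch; only the indexing of the matching documents is batched.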

On Tuesday, July 24, 2012 1:18:46 AM UTC-4, tiftif wrote the original post above.


(David Pilato) #3

Just wondering, what is your setup?

Do you have multiple nodes? Only one embedded instance in your own project?

Where do you see OOM? On the Elasticsearch nodes?

How do you percolate? With a Node Client or with a Transport Client?

David

(In reply to tiftif's message to elasticsearch@googlegroups.com, sent Tuesday, July 24, 2012 07:26, subject "Re: Conditionally add document only if percolator matches a query".)

