Filter on thousands of IDs - how efficient is it?

Hello,

I need to restrict my results based on date and on several other
conditions that have to be analysed. I have about a billion documents
going back many years, and these restrictions apply only to documents from
the last few days. To stay efficient and to make use of Elasticsearch's
caching, I was thinking of using a third-party tool to work out which
documents are restricted, and then asking Elasticsearch not to return
them. I would use a query like this:

"filtered" : {
    "query" : {
        "term" : { "body" : "Obama" }
    },
    "filter" : {
        "not" : {
            "ids" : {
                "values" : ["1", "4", "100"]
            }
        }
    }
}

Is it really worth it if I send an array of 1,000 values? 10,000 values?
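
To make the question concrete, here is a minimal sketch of how such a request body could be generated from the externally computed ID list and how large it gets. It assumes Python and the same filtered / not / ids structure as the query above; the helper name build_exclusion_query and the sample ID range are hypothetical, and nothing here measures how Elasticsearch itself executes or caches the filter.

```python
import json


def build_exclusion_query(term_field, term_value, excluded_ids):
    """Build a `filtered` query that excludes documents by _id.

    Hypothetical helper: it only mirrors the query structure shown in the
    post and does not talk to Elasticsearch.
    """
    return {
        "filtered": {
            "query": {"term": {term_field: term_value}},
            "filter": {
                "not": {
                    # IDs are sent as strings, as in the original example.
                    "ids": {"values": [str(i) for i in excluded_ids]}
                }
            },
        }
    }


# Hypothetical ID list; in practice it would come from the third-party tool.
excluded = range(1, 10001)
body = build_exclusion_query("body", "Obama", excluded)

# Serialise the full request body to see how much data each search would send.
payload = json.dumps({"query": body})
id_count = len(body["filtered"]["filter"]["not"]["ids"]["values"])
print(id_count, "ids in the filter,", len(payload), "bytes in the request body")
```

Running it with 1,000 and then 10,000 IDs gives a rough idea of the request size only; the search-side cost would still have to be measured against the cluster.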

Thanks a lot.

Loïc

--

Hi,

On Saturday, November 10, 2012 3:49:48 AM UTC+13, Loïc Bertron wrote:

Is it really worth it if I send an array of 1,000 values? 10,000 values?

What do you mean by 'is it really worth it'? Do you mean: will it perform
well? Will it cache well?


--