Search Query Optimization

I'm using EC with 2 nodes, each with 64 GB of RAM. The cluster handles about 1,000 requests per minute, and CPU and memory usage are in the green zone.

My index holds about 40 million items, and a search request takes 5-7 s to finish.
I use a function score query to rank my search results, as in the example below:

"from": 0,
"size": 50,
"query": {
  "function_score": {
    "script_score": {
      "script": {
        "lang": "painless",
        "source": "_score + doc['point'].value + (new Date().getTime() / 1000 - doc['created_at'].value) + ...some other factor using doc_value"
      }
    },
    "query": {
      {
        "bool": {
          "must": { "title": { "value": "wordA", "boost": 100 } }
        }
      }
    }
  }
}

As the Profile API pointed out, FunctionScoreQuery and TermQuery take the majority of the time, especially when the total number of hits is large.
I wonder if there is any way to reduce their execution time?
My idea is to pre-calculate the static score for all items daily, so that when a user runs a search, only the boost score from the bool query needs to be added to it. I went through all the documentation but found no way to achieve this.

Furthermore, when I create a script_field with the same script source as above, it needs only ~100 ms to finish against the 40M items. Why is the difference so big?

I would appreciate it a lot if you could help me with these questions or suggest a solution such as caching.
Thank you very much.

Is that a correct query?

        "query": {
            {
                  "bool": {
                      "must": {  "title" : {"value": "wordA", "boost": 100 } }
                   }
             }
        }

I'm surprised this works.

"query":  {
    "bool": {
        "must": [
            {  
                "title" : {
                    "value": "wordA", 
                    "boost": 100 
                 }
             }
         ]
     }
}

I corrected the query. Would you mind checking again?

Are you sure this is a correct query?
I mean did you try to run it?
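
For reference, the clause in the snippets above names a field ("title") directly under "must" without saying which query type to run on it, which is why it looks suspicious. A minimal sketch of a well-formed version follows, assuming a term query is what was intended (the profile output mentions TermQuery and the clause uses "value"). If "title" is an analyzed text field, a match query with "query" in place of "value" may be the better fit.

```json
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "title": {
              "value": "wordA",
              "boost": 100
            }
          }
        }
      ]
    }
  }
}
```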

So the main difference between a script field and a script that runs in function score is that the former only has to run on the hits you are actually fetching (the "size" of the request), whereas the latter has to be run against every document that matches the query.

If this is too slow, you could use the rescore API to first select fewer results and then run the function score only on those.

Have a look at this example: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-rescore.html
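
As a rough sketch of what that could look like for the request in this thread: the plain query picks the candidates, and the script only runs on the top "window_size" hits per shard. The window size of 500, the term query, and the weights are assumptions for illustration, and only the doc['point'] factor of the original script is shown. The function_score inside "rescore_query" has no inner query, so it matches every document in the window. With the default score mode the rescore score is added to the original score, which roughly mirrors the "_score + ..." form of the original script.

```json
{
  "from": 0,
  "size": 50,
  "query": {
    "term": {
      "title": {
        "value": "wordA",
        "boost": 100
      }
    }
  },
  "rescore": {
    "window_size": 500,
    "query": {
      "rescore_query": {
        "function_score": {
          "script_score": {
            "script": {
              "lang": "painless",
              "source": "doc['point'].value"
            }
          }
        }
      },
      "query_weight": 1.0,
      "rescore_query_weight": 1.0
    }
  }
}
```

The trade-off is that a document outside the window is never rescored, so an item with a weak text score but a very high static score can be missed. Recent versions also reject negative script scores, so the script should return a non-negative value.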
