Terms query uses a lot of CPU


(Tamizharasan) #1

I am using a terms query to find the followers of a user.
To do that, I fetch the follower IDs of that user and pass them to a terms query to get the matching user documents from Elasticsearch. But when I run the query, it uses a lot of CPU. I guess this is because of the large number of terms per query.
I couldn't find any other solution for this. Any suggestions would be helpful.
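
For readers following along, here is a minimal sketch of the kind of request described above. It only builds the request bodies and does not call Elasticsearch. The field name `user_id`, the helper name `follower_queries`, and the batch size are assumptions for illustration, not details from this post. Putting the `terms` clause in a `bool` filter skips relevance scoring, and batching only bounds the size of each request; neither removes the per-ID lookup cost.

```python
from typing import Dict, Iterator, List


def follower_queries(
    follower_ids: List[str],
    batch_size: int = 1000,      # assumed value, tune for your cluster
    field: str = "user_id",      # hypothetical field holding each user's ID
) -> Iterator[Dict]:
    """Yield one search body per batch of follower IDs."""
    for start in range(0, len(follower_ids), batch_size):
        batch = follower_ids[start:start + batch_size]
        # Filter context: documents either match or not, no scoring is done.
        yield {"query": {"bool": {"filter": [{"terms": {field: batch}}]}}}


if __name__ == "__main__":
    for body in follower_queries(["u1", "u2", "u3"], batch_size=2):
        print(body)
```

Each yielded dictionary can be sent as the body of a `GET /users/_search` request.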


(Mark Harwood) #2

If you denormalize the data such that each user has a "follows" array you can just do this:

 GET /users/_search
 {
   "query": {
     "match": {
       "follows": "userX"
     }
   }
 }
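
To make the denormalized shape concrete, here is a small sketch under stated assumptions: each user document carries a `follows` array of the IDs that user follows, so "followers of userX" becomes a single-term lookup instead of a long ID list. The sample documents, the `id` key, and both helper names are hypothetical. The in-memory function only mirrors what the query selects; it is not how Elasticsearch evaluates it.

```python
from typing import Dict, List


def followers_query(user_id: str, field: str = "follows") -> Dict:
    """Build the search body from the post above for a given user ID."""
    return {"query": {"match": {field: user_id}}}


def followers_in_memory(users: List[Dict], user_id: str) -> List[str]:
    """Return IDs of users whose 'follows' array contains user_id."""
    return [u["id"] for u in users if user_id in u.get("follows", [])]


# Hypothetical denormalized documents: one per user.
users = [
    {"id": "userA", "follows": ["userX", "userB"]},
    {"id": "userB", "follows": ["userX"]},
    {"id": "userX", "follows": []},
]

if __name__ == "__main__":
    print(followers_query("userX"))
    print(followers_in_memory(users, "userX"))
```

The trade-off is that every follow or unfollow now has to update the follower's document.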

(Tamizharasan) #3

Thanks a lot for the response. I thought of the same approach, but I have 70 million users, so adding a follows array to every user would take too much time. Is there any other option to overcome this issue?


(Mark Harwood) #4

It comes down to physics. Random disk seeks for lots of unique IDs are slow. SSDs will help but there's still a cost with big numbers.
Using a graph database will remove the index lookups at query time by chasing around connections using pointers but:

  1. This means using a single-server solution with lots of RAM.
  2. All your index lookups are shifted to write time, when the database has to convert user IDs to pointers.
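
The two points above can be illustrated with a toy in-memory sketch. This is not how any particular graph database is implemented; `UserNode`, `TinyGraph`, and the lookup counter are hypothetical names used only to show where the ID lookups happen. Each write resolves both user IDs through an index, while reading followers just walks object references.

```python
from typing import Dict, List


class UserNode:
    def __init__(self, user_id: str) -> None:
        self.user_id = user_id
        self.followers: List["UserNode"] = []  # direct references ("pointers")


class TinyGraph:
    def __init__(self) -> None:
        self._by_id: Dict[str, UserNode] = {}
        self.write_time_lookups = 0  # counts ID-to-node resolutions on write

    def _resolve(self, user_id: str) -> UserNode:
        """Convert a user ID to a node; this is the index lookup."""
        self.write_time_lookups += 1
        node = self._by_id.get(user_id)
        if node is None:
            node = UserNode(user_id)
            self._by_id[user_id] = node
        return node

    def add_follow(self, follower_id: str, followee_id: str) -> None:
        follower = self._resolve(follower_id)
        followee = self._resolve(followee_id)
        followee.followers.append(follower)

    def node(self, user_id: str) -> UserNode:
        """One lookup to find the starting node of a traversal."""
        return self._by_id[user_id]


def follower_ids(node: UserNode) -> List[str]:
    """Walk references only; no per-follower ID lookup is needed."""
    return [f.user_id for f in node.followers]


if __name__ == "__main__":
    g = TinyGraph()
    g.add_follow("userA", "userX")
    g.add_follow("userB", "userX")
    print(follower_ids(g.node("userX")), g.write_time_lookups)
```

Everything here lives in one process's memory, which is the single-server, RAM-heavy constraint from point 1.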

No easy answers here.

