Cross-cluster search not returning all hits from IDs query, sometimes

loren · July 25, 2019, 9:08pm

Why do my cross-cluster IDs queries sometimes return all but 1 or 2 of the results?

I have a 7.2.0 cluster executing an IDs query across 3 remote clusters, usually with 500-1000 IDs in each query. I set the size parameter to the length of my ID array. I usually get back all of the IDs, but sometimes I get back all but 1 or 2 of them. Repeating the query gets me 100% of the results.

If I just hit the clusters directly instead of going through CCS, I get back 100% of the results all the time with size=len(my_ids).

Thinking it might be a fencepost error, I tried len(my_ids)+1 but no luck. I also tried setting ccs_minimize_roundtrips=False just as a wild guess.

Strange as all that is, I seem to have found a hacky workaround: if I double the size param (i.e., size=2*len(my_ids)), I get back 100% of the results all the time.
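For anyone trying to confirm the same symptom, here is a minimal Python sketch of the check described above. It builds the IDs query body with an optional size multiplier (size_factor=2 mirrors the workaround) and reports which requested IDs are absent from a response. The helper names build_ids_query and missing_ids are hypothetical, not part of any client library, and the response is assumed to have the usual hits.hits[]._id shape.

```python
def build_ids_query(ids, size_factor=1):
    """Build an IDs query body (hypothetical helper).

    size_factor=1 gives size=len(ids); size_factor=2 mirrors the
    doubling workaround described in the post.
    """
    return {
        "size": size_factor * len(ids),
        "query": {"ids": {"values": list(ids)}},
    }


def missing_ids(requested, response):
    """Return the requested IDs that do not appear in the search response."""
    returned = {hit["_id"] for hit in response["hits"]["hits"]}
    return sorted(set(requested) - returned)
```

Running missing_ids against both the CCS response and the direct-to-cluster responses for the same ID list would show whether the dropped documents are always the same ones or vary between runs.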

Can anyone suggest what might be happening there?

Possibly related to the topic "IDs query returning incomplete results".

javanna · August 2, 2019, 9:20am

Hi Loren,
could you post more details on how to reproduce this please? Some sample data, your query etc.?

Thanks
Luca

loren · August 2, 2019, 6:02pm

Thank you for responding Luca.

The query was very simple:

GET cluster1:my_index,cluster2:my_index,cluster3:my_index/_search
{
  "size": 5678,
  "_source": "pa*",
  "query": {
    "ids": {
      "values": [
        "id1",
        "id2",
        "id3",
        "id4",
        ....
        "id5678"
      ]
    }
  }
}

I wish I could replicate it for you. The 3 indices on the remote clusters are all multi-TB, and the error only happens 5-10% of the time for a group of IDs, and then never again for that group.

Since I ended up needing to bring back more than 10,000 results, I changed the search query to a scan/scroll. I have not observed the problem there.
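In case it helps someone else, this is roughly what the scroll version looks like as a Python sketch. It assumes a client object that exposes search, scroll, and clear_scroll with keyword arguments in the style of the 7.x elasticsearch-py client; the function name scroll_ids, the page size, and the keep-alive value are hypothetical choices for illustration, not what I actually ran.

```python
def scroll_ids(client, index, ids, page_size=1000, keep_alive="2m"):
    """Fetch all documents matching an IDs query by scrolling (sketch).

    `client` is assumed to behave like the 7.x elasticsearch-py client:
    search(index=, body=, scroll=), scroll(scroll_id=, scroll=),
    clear_scroll(scroll_id=).
    """
    body = {"size": page_size, "query": {"ids": {"values": list(ids)}}}
    resp = client.search(index=index, body=body, scroll=keep_alive)
    scroll_id = resp.get("_scroll_id")
    hits = []
    try:
        # Keep pulling pages until one comes back empty.
        while resp["hits"]["hits"]:
            hits.extend(resp["hits"]["hits"])
            resp = client.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = resp.get("_scroll_id", scroll_id)
    finally:
        # Release the scroll context on the server side.
        if scroll_id:
            client.clear_scroll(scroll_id=scroll_id)
    return hits
```

The index argument would be the same comma-separated remote list as in the query above, i.e. "cluster1:my_index,cluster2:my_index,cluster3:my_index". Comparing the returned _id values against the requested list afterwards is a cheap way to confirm nothing was dropped.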

I'm sorry I can't be more helpful in tracking this down. Perhaps someone else will run across this issue, too, and add their findings to this post.

