Missing results from search

Hello all,

As I've scaled my cluster and the size of my data, I'm very much wondering how ES handles search!

I'm still stuck on ES 1.7.2 (migrating to a more recent version is difficult), on 6 servers with SSDs. I search across 2 indices (around 300GB each, split into 6 shards with 1 replica). For some searches a large share of the documents match (around 30-40%).

Unfortunately, the results are not stable: the first results are not always the same and, most of the time, expected results are missing from the top results. When they do happen to show up, though, scoring is as expected.

Are my shards too big? Am I missing something about cluster management?

Thank you for more insight!
BR,
Aurelien

1.7.2

Yeah. You need to upgrade.

2 indices
300GB each, split into 6 shards / 1 replica
6 servers

Are my shards too big ?

300 / 6 = 50 GB per shard.
24 shards on 6 nodes = 4 shards per node

That looks correct to me.
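
The shard arithmetic can be checked with a short sketch. It assumes that the 300GB figure is the primary data size of each index and that shard copies spread evenly over the nodes; the function name `shard_layout` is made up for this illustration.

```python
def shard_layout(indices, index_size_gb, primaries_per_index, replicas, nodes):
    """Return (GB per primary shard, total shard copies, shard copies per node).

    Assumes index_size_gb is the primary data size of one index and that
    shard copies are spread evenly across the nodes.
    """
    gb_per_primary = index_size_gb / primaries_per_index
    # Each primary shard has `replicas` additional copies.
    total_copies = indices * primaries_per_index * (1 + replicas)
    copies_per_node = total_copies / nodes
    return gb_per_primary, total_copies, copies_per_node


# Figures quoted in the thread: 2 indices, 300GB each, 6 shards, 1 replica, 6 servers.
print(shard_layout(indices=2, index_size_gb=300, primaries_per_index=6,
                   replicas=1, nodes=6))
```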

results given are not stable

I can't really tell, but maybe for some obscure reason your primaries and replicas diverged at some point and the replicas are inconsistent?

You can use preference=_primary IIRC and check that you have consistent results.
If consistent, then maybe set the number of replicas to 0 and then set it back to 1.
The replicas will be copied again from the primaries.
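
A minimal sketch of that replica reset, using the index settings API (`PUT /<index>/_settings` with `index.number_of_replicas`). The base URL is a placeholder, the index name `global` is taken from the curl calls later in the thread, and the helper names are invented for this example. In practice you would probably wait for the cluster to settle between the two calls.

```python
import json
import urllib.request


def replica_settings_body(count):
    """JSON body that sets the number of replicas for an index."""
    return json.dumps({"index": {"number_of_replicas": count}})


def build_replica_reset(base_url, index):
    """Return the two (method, url, body) requests: replicas to 0, then back to 1."""
    url = "{}/{}/_settings".format(base_url.rstrip("/"), index)
    return [
        ("PUT", url, replica_settings_body(0)),  # drop all replica copies
        ("PUT", url, replica_settings_body(1)),  # rebuild them from the primaries
    ]


def send(method, url, body):
    """Send one request to the cluster and return the decoded JSON reply."""
    req = urllib.request.Request(url, data=body.encode("utf-8"), method=method)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Placeholder host; nothing is sent unless send() is called.
for step in build_replica_reset("http://localhost:9200", "global"):
    print(step)
```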

My 2 cents

Great ideas David, thanks :) Well, that does not help much, unfortunately.

I tried to reinitialize the replicas anyway. No luck. The issue is that most searches are totally OK, but some are not, and the number of returned results varies from one call to the next.

My only idea around is that, as I return more like 6M of results from 3M, it gets truncated somewhere in the chain, probably at the shard level. I don't best results get good scores out of a shard and be lost when merging shards results.

Is it possible that a shard won't return all results for some reason? (memory issues, incomplete read of the shard because enough results were found, ...)

I'm desperately trying to upgrade ES, but reindexing from 1.7.2 to 5.5.0 is a pain: it is difficult to handle, and all the API calls have to be migrated.

First tries on migrated documents seem to be more consistent. I've also increased the number of shards.

as I return more like 6M of results from 3M, it gets truncated somewhere in the chain, probably at the shard level. I don't best results get good scores out of a shard and be lost when merging shards results.

WDYM? Did you set "size": 6000000? I don't think so but prefer to double-check.

Is it possible that a shard won't return all results for some reason? (memory issues, incomplete read of the shard because enough results were found, ...)

Nothing that I can think of. Corrupted shards should be detected so I don't think that can happen.
But when you run the query with preference=_primary, do you get different results?
When this is happening, can you see any failed shards in the response?

Would be great if you could share some response samples.

Hi,
No, of course, I get 20 results per page. Number of found results on the query is 6M, as reported by total hits.

When I run on primaries only, I get the same incomplete results and the same inconsistency from one call to the next.

I can share more details in private, though, out of respect for data privacy.
BR,
Aurelien

I do have same incomplete results and inconsistency of the results from one call to the other.

This is weird, as you are reaching the same shards every time.

Can you share some typical responses? Just remove the _source content. We don't need it here.
You can set size: 0 by the way.
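
A small sketch of such a request: primaries only, with `size` set to 0 so that only the response header and hit count come back. The host is a placeholder, the index name `global` comes from the curl calls in this thread, and the `match_all` query only stands in for the real query, which is not shown here.

```python
import json
from urllib.parse import urlencode


def build_search(base_url, index, query, preference=None, size=0):
    """Return (url, body) for a _search call.

    size=0 asks for no hits, only the header (took, timed_out, _shards)
    and hits.total.
    """
    params = {"pretty": "true"}
    if preference:
        params["preference"] = preference
    url = "{}/{}/_search?{}".format(base_url.rstrip("/"), index, urlencode(params))
    body = json.dumps({"size": size, "query": query})
    return url, body


# Hypothetical query; the actual contents of the poster's request are unknown.
url, body = build_search("http://localhost:9200", "global",
                         {"match_all": {}}, preference="_primary")
print(url)
print(body)
```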

Well, I get responses like this, for example:

curl -XPOST 'http://172.27.5.81:9200/global/_search?pretty=true' -d @test.json
{
  "took" : 890,
  "timed_out" : true,
  "_shards" : {
    "total" : 30,
    "successful" : 30,
    "failed" : 0
  },
  "hits" : {
    "total" : 2992694,

curl -XPOST 'http://172.27.5.81:9200/global/_search?pretty=true' -d @test.json
{
  "took" : 1033,
  "timed_out" : true,
  "_shards" : {
    "total" : 30,
    "successful" : 30,
    "failed" : 0
  },
  "hits" : {
    "total" : 2866351,
    "max_score" : 2.9997215,

As you can see, the two consecutive calls return different counts. Nothing in the index changed between them.

curl -XPOST 'http://172.27.5.81:9200/global/_search?pretty=true&preference=_primary' -d @test.json
{
  "took" : 1209,
  "timed_out" : true,
  "_shards" : {
    "total" : 30,
    "successful" : 30,
    "failed" : 0
  },
  "hits" : {
    "total" : 2942189,
    "max_score" : 2.9997215,

curl -XPOST 'http://172.27.5.81:9200/global/_search?pretty=true&preference=_primary' -d @test.json
{
  "took" : 961,
  "timed_out" : true,
  "_shards" : {
    "total" : 30,
    "successful" : 30,
    "failed" : 0
  },
  "hits" : {
    "total" : 3055002,
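
One detail worth noting in all four samples above is "timed_out" : true. In Elasticsearch this flag means the search reached its timeout and returned the hits accumulated up to that point, so hits.total can be partial and can change from one call to the next even when no shard has failed. Where the timeout comes from is not shown in this thread (a timeout set in test.json would be one guess). The sketch below flags such responses; `diagnose` is a name invented for this example.

```python
def diagnose(response):
    """Return notes on signs that a search response may be incomplete."""
    notes = []
    if response.get("timed_out"):
        notes.append("timed_out is true: hits may be partial and totals can vary")
    shards = response.get("_shards", {})
    if shards.get("failed", 0) > 0:
        notes.append("{} shard(s) failed".format(shards["failed"]))
    if shards.get("successful", 0) < shards.get("total", 0):
        notes.append("not every shard answered successfully")
    return notes


# Header of the first response quoted in the thread.
sample = {
    "took": 890,
    "timed_out": True,
    "_shards": {"total": 30, "successful": 30, "failed": 0},
    "hits": {"total": 2992694},
}
for note in diagnose(sample):
    print(note)
```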
