Bug? Search (specific) query doesn't return documents that do exist!

Nir_Reuveny · August 31, 2015, 12:29pm

Hi,

We have an ES 1.7 cluster with daily Logstash indexes.
We've noticed a major issue that we can't explain: when we search (or run a terms aggregation) on one specific field that we need, no documents are returned, although we do have many documents with that value, or any other value, in that index!
This doesn't happen on all of the daily indexes, just on some of them...

The examples below show the problem:

http://kibana:9200/logstash-2015.08.29/_search
{
  "size": 200,
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "_type:\"record\" AND d_id_pre_1:\"c\"",
          "analyze_wildcard": true
        }
      }
    }
  },
  "fields": [
    "d_id_pre_1"
  ]
}

Results:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

Running the search on the same index without filtering on the value of d_id_pre_1 shows that there are documents with "c" as the value of the d_id_pre_1 field!

{
  "size": 5,
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "_type:\"record\"",
          "analyze_wildcard": true
        }
      }
    }
  },
  "fields": [
    "d_id_pre_1"
  ]
}

Result:

{
  "took": 52,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "hits": {
    "total": 4619845,
    "max_score": 1,
    "hits": [
      {
        "_index": "logstash-2015.08.29",
        "_type": "record",
        "_id": "b1ad227e8a8b3e3044c6822c55e1399b",
        "_score": 1,
        "fields": {
          "d_id_pre_1": [
            "2"
          ]
        }
      },
      {
        "_index": "logstash-2015.08.29",
        "_type": "record",
        "_id": "e0dcf0c4ea01b2d0b98a247955e1399b",
        "_score": 1,
        "fields": {
          "d_id_pre_1": [
            "2"
          ]
        }
      },
      {
        "_index": "logstash-2015.08.29",
        "_type": "record",
        "_id": "e371b878878935a4ec7d7c3355e1399c",
        "_score": 1,
        "fields": {
          "d_id_pre_1": [
            "c"
          ]
        }
      },
      {
        "_index": "logstash-2015.08.29",
        "_type": "record",
        "_id": "c8ac1a8e23c73175e0660fc655e1399c",
        "_score": 1,
        "fields": {
          "d_id_pre_1": [
            "1"
          ]
        }
      },
      {
        "_index": "logstash-2015.08.29",
        "_type": "record",
        "_id": "6479d993fe922d39dba61c9f55e1399c",
        "_score": 1,
        "fields": {
          "d_id_pre_1": [
            "7"
          ]
        }
      }
    ]
  }
}

This problem happens on some of the daily indexes, not all of them, which makes the issue even odder...
The aggregation query and results below show this:

http://kibana:9200/logstash-2015.08.31,logstash-2015.08.30,logstash-2015.08.29,logstash-2015.08.28/_search
{
  "size": 0,
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "_type:\"record\" AND d_id_pre_1:\"c\"",
          "analyze_wildcard": true
        }
      }
    }
  },
  "aggs": {
    "2": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "1d",
        "min_doc_count": 0
      }
    }
  }
}

Result:

{
  "took": 25,
  "timed_out": false,
  "_shards": {
    "total": 8,
    "successful": 8,
    "failed": 0
  },
  "hits": {
    "total": 269432,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "2": {
      "buckets": [
        {
          "key_as_string": "2015-08-28T00:00:00.000Z",
          "key": 1440720000000,
          "doc_count": 140945
        },
        {
          "key_as_string": "2015-08-29T00:00:00.000Z",
          "key": 1440806400000,
          "doc_count": 0
        },
        {
          "key_as_string": "2015-08-30T00:00:00.000Z",
          "key": 1440892800000,
          "doc_count": 0
        },
        {
          "key_as_string": "2015-08-31T00:00:00.000Z",
          "key": 1440979200000,
          "doc_count": 128487
        }
      ]
    }
  }
}

Any ideas? This seems like a huge bug at this point, as I can't find any good reason for this behavior...

Thanks!

Nir.

Nir_Reuveny · September 3, 2015, 7:56am

Can anyone help, or does anyone have any ideas about this problem?

mikemccand · September 3, 2015, 8:19am

How is the d_id_pre_1 field indexed in the problematic daily index? Is it analyzed (which analyzer)?

Can you try removing the double-quotes around the query? This tells the query parser to make a phrase query, but (at least in this example) you have only one token (c) that you are trying to match ...
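
One way to take the query parser out of the picture entirely is a `term` query, which looks up the value as a single exact term with no analysis or phrase handling. Below is a minimal sketch in Python that only builds the request body; the helper name `term_query_body` is hypothetical, and the body would still need to be sent to the index's `_search` endpoint as in the examples above.

```python
import json


def term_query_body(field, value, size=5):
    """Build a search body that matches `value` as one exact term.

    Hypothetical helper for illustration; it only constructs the JSON body.
    """
    return {
        "size": size,
        # A term query is not parsed or analyzed, so quoting rules do not apply.
        "query": {"term": {field: value}},
        "fields": [field],
    }


body = term_query_body("d_id_pre_1", "c")
print(json.dumps(body, indent=2))
```

If this body returns hits on a problematic index while the `query_string` version does not, the query parsing is the likely culprit; if it also returns nothing, the problem is more likely in how the documents were indexed.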

Nir_Reuveny · September 3, 2015, 11:06am

Hi Mike,

The field is set to 'not_analyzed' in all daily indexes... (coming from the same template...)
So AFAIK you need to search for the full text which I did...

"d_id_pre_1" : {
  "index" : "not_analyzed",
  "type" : "string"
}

mikemccand · September 3, 2015, 12:56pm

OK good, yes you must search for the full text.

Did you try the query without double quotes around c?

Nir_Reuveny · September 3, 2015, 1:27pm

Just did. It's the same result... Again, the mappings are exactly the same on all indexes, but the search just doesn't 'work' on some of them...
Seems like a bug, no?

{
  "size": 200,
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "_type:\"record\" AND d_id_pre_1:c",
          "analyze_wildcard": true
        }
      }
    }
  }
}

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

mikemccand · September 3, 2015, 5:05pm

Yeah maybe a bug ... can you simplify it down to a small case?

E.g. remove the _type:"record" part, remove the analyze_wildcard, and use a straight query (not filtered)?
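
The reduction described above could be sketched as a series of request bodies, each dropping one piece of the original query. This is only an illustration in Python; the function name `reduction_steps` and the step labels are made up here, and each body would be run against a problematic index to see at which step hits start to appear.

```python
import json


def reduction_steps(field="d_id_pre_1", value="c"):
    """Return (label, body) pairs, from the original query down to the simplest."""
    full = '_type:"record" AND %s:%s' % (field, value)
    bare = "%s:%s" % (field, value)
    return [
        # Step 0: the query as posted (filtered wrapper, type clause, analyze_wildcard).
        ("original", {"query": {"filtered": {"query": {"query_string": {
            "query": full, "analyze_wildcard": True}}}}}),
        # Step 1: drop the _type clause.
        ("no type clause", {"query": {"filtered": {"query": {"query_string": {
            "query": bare, "analyze_wildcard": True}}}}}),
        # Step 2: drop analyze_wildcard as well.
        ("no analyze_wildcard", {"query": {"filtered": {"query": {"query_string": {
            "query": bare}}}}}),
        # Step 3: drop the filtered wrapper, leaving a plain query_string query.
        ("plain query", {"query": {"query_string": {"query": bare}}}),
    ]


steps = reduction_steps()
for label, body in steps:
    print(label, json.dumps(body))
```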

Nir_Reuveny · September 3, 2015, 6:54pm

Mike,

I've tried that. If I remove the 'type' filter, the 'problematic' indexes do return results, but only for other document types and not for the main type we use in 90% of our logs ('record').
I've also tried running a terms aggregation... same behavior! It shows really small numbers on the problematic indexes, as it doesn't find (or in some way ignores) most of the documents (the ones with the 'record' type)...
But on the 'good' indexes it shows very high numbers for each bucket in the aggregation...

{
  "size": 0,
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "*"
        }
      }
    }
  },
  "aggs": {
    "3": {
      "terms": {
        "field": "d_id_pre_1",
        "size": 20
      }
    }
  }
}

    {
      "key": "0",
      "doc_count": 30
    },
    {
      "key": "1",
      "doc_count": 30
    },
    {
      "key": "6",
      "doc_count": 28
    },
    {
      "key": "c",
      "doc_count": 26
    },
    {
      "key": "f",
      "doc_count": 24
    }

Nir_Reuveny · September 7, 2015, 6:50am

Bumping this problem... anyone?
