Number of documents in Dev Tools differs from the number of documents in Discover

I am using ELK 7.9.3 and I have an index that contains more than 2M documents. I used the reindex API to create a sample with the following command:

POST /_reindex?wait_for_completion=false
{
  "max_docs": 1000, 
  "source": {
    "index": "firstIndex-2020.10.27-000313",
    "query": {
      "match_all": {}
    }
  },
  "dest": {
    "index": "develop1000"
  }
}

I expected to have 1000 documents in the index named develop1000, and that is what GET develop1000/_count returns. The problem is that when I check in Discover, I only have 992 hits.

What could be the problem, and how can I investigate it? I have the same problem with another index!

Thanks folks.

Can you show the number of docs? :arrow_up:

Sure, below is the total number of docs:

{
  "count" : 2652481,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  }
}

Hmm... Do I understand correctly that when you execute the query
GET develop1000/_count
the result is 1000?

Perhaps some of the documents' timestamps are incorrect and fall outside your time window in Discover, or the timestamp somehow failed during the reindex and those docs don't have a timestamp.

Also, with only 1000 documents, just run the reindex in the foreground with wait_for_completion=true and see if you get any errors.

You can check the min and max timestamps:
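For reference, a foreground run might look like the sketch below. It reuses the request body from the question and only changes wait_for_completion. It assumes develop1000 does not already exist; if it does, the response will likely report the documents as updated rather than created.

```console
# Run the reindex synchronously so the response carries the counters and any failures.
POST /_reindex?wait_for_completion=true
{
  "max_docs": 1000,
  "source": {
    "index": "firstIndex-2020.10.27-000313",
    "query": {
      "match_all": {}
    }
  },
  "dest": {
    "index": "develop1000"
  }
}
```

In the response, an empty "failures" array and "version_conflicts": 0 would suggest the copy itself went through cleanly.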

GET develop1000/_search
{
  "size": 0,
  "aggs": {
    "min_date": {
      "min": {
        "field": "@timestamp",
        "format": "yyyy-MM-dd HH.mm.ss"
      }
    },
    "max_date": {
      "max": {
        "field": "@timestamp",
        "format": "yyyy-MM-dd HH.mm.ss"
      }
    }
  }
}

Yes exactly!

Here is the result of reindexing with wait_for_completion=true:

{
  "took" : 8916,
  "timed_out" : false,
  "total" : 1000,
  "updated" : 0,
  "created" : 1000,
  "deleted" : 0,
  "batches" : 1,
  "version_conflicts" : 0,
  "noops" : 0,
  "retries" : {
    "bulk" : 0,
    "search" : 0
  },
  "throttled_millis" : 0,
  "requests_per_second" : -1.0,
  "throttled_until_millis" : 0,
  "failures" : [ ]
}

It seems that there is no error. Then I executed the search query to find the max and min timestamps:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1000,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "max_date" : {
      "value" : 1.603826335E12,
      "value_as_string" : "2020-10-27 19.18.55"
    },
    "min_date" : {
      "value" : 1.603826285E12,
      "value_as_string" : "2020-10-27 19.18.05"
    }
  }
}

What you said is interesting. How can I check whether a given document has an empty timestamp?

What is your index pattern time field?

Is it @timestamp, the field you show above? Discover is based on the time field defined in the index pattern, so documents that lack that field do not show up there.
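One way to see how many documents Discover can show at all is to count those that have a value for the time field. This is a minimal sketch, assuming the index pattern's time field is @timestamp; it ignores whatever time range is selected in Discover.

```console
# Count only the documents that have a value for the assumed time field.
GET develop1000/_count
{
  "query": {
    "exists": {
      "field": "@timestamp"
    }
  }
}
```

If this count is lower than the plain GET develop1000/_count, the difference is the number of documents without a timestamp.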

For the time field, all events fall within October 2020, and in Discover I set the time range to the last 2 years to be sure to get all events. But as you suggested, the problem is that the @timestamp field is missing from 8 documents. I checked that using the following query, as you pointed out in your answer:

GET develop1000/_search
{
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "@timestamp"
        }
      }
    }
  }
}

The result is exactly 8 documents that don't have a @timestamp field:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 0.0,
    "hits" : [
      // details for the 8 documents ...
    ]
  }
}
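If the sample should only contain documents that Discover can display, one option is to filter the source with an exists query during the reindex. This is only a sketch and was not run as part of this thread; the destination index name develop1000-ts is a hypothetical placeholder.

```console
# Copy up to 1000 documents, skipping any that have no @timestamp.
POST /_reindex?wait_for_completion=true
{
  "max_docs": 1000,
  "source": {
    "index": "firstIndex-2020.10.27-000313",
    "query": {
      "exists": {
        "field": "@timestamp"
      }
    }
  },
  "dest": {
    "index": "develop1000-ts"
  }
}
```

With such a filter, the _count of the new index and the hit count in Discover should agree, provided the Discover time range covers all the events.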

Thank you @stephenb ! :grinning: