I have done a bulk upload via the bulk API in Python as follows:
import json
import uuid

from elasticsearch import helpers
from more_itertools import chunked  # assumed source of 'chunked'

def to_bulk_doc(_doc):
    return {
        # default value: 'index'
        '_op_type': 'index',
        '_index': es_index,
        '_id': uuid.uuid4(),
        '_source': _doc,
        '_type': 'document'
    }

for doc in json_docs_list:
    doc_resources: list = json.loads(doc)['resources']
    # split doc_resources into a list of lists, where each list has at most max_batch_size elements
    chunks = chunked(doc_resources, max_batch_size)
    for batch in chunks:
        # convert the batch of JSON docs to a format compatible with the bulk API,
        # using the 'to_bulk_doc' function defined above
        actions = map(to_bulk_doc, batch)
        res = helpers.bulk(client=es, actions=actions)
        print(res)
This successfully uploads the documents to the index:
In Dev Tools: GET /_cat/indices/es_index?v=true
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open es_index Xngti... 5 1 304444 0 445.3mb 222.5mb
But searching in Kibana Dev Tools with:
GET es_index/_search
only returns 346 lines and nothing else, yet there are supposed to be 304K documents?
You have hits.total, which shows that all 304k documents have been hit.
Note that, by design, ES returns only a limited set of hits (10 by default) in the hits list: you can retrieve more results using the size parameter in the request, but I'd advise against fetching too many documents at once; rather, paginate the requests.
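For example, a minimal pagination sketch in Dev Tools, assuming sorting on _id is acceptable for your version and mapping (a unique application-level field is usually a better sort key):

```
// first page: explicit size plus a deterministic sort
GET es_index/_search
{
  "size": 100,
  "sort": [{ "_id": "asc" }]
}

// next page: pass the sort values of the last hit from the previous page
GET es_index/_search
{
  "size": 100,
  "sort": [{ "_id": "asc" }],
  "search_after": ["<last _id of the previous page>"]
}
```

Each response's hits.total still reflects all matching documents; size only limits how many are returned per request.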
Thanks Marco, much appreciated.
So if I have to paginate, can I still build visual dashboards on the full index and search for hits like a normal index that has been populated by Filebeat etc.?
When building visualizations you will most probably go through some aggregation, so all your documents will be considered; no pagination will occur on that side.
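As an illustration of this (the field name resource_type.keyword is an assumption about your mapping, not something from the thread), an aggregation runs over every matching document regardless of the size parameter:

```
GET es_index/_search
{
  "size": 0,
  "aggs": {
    "resources_by_type": {
      "terms": { "field": "resource_type.keyword" }
    }
  }
}
```

Here "size": 0 suppresses the hits list entirely, yet the bucket doc_count values are computed across all 304k documents, which is what dashboard visualizations rely on.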