Elasticsearch - nested query in batches?


(magorbe) #1

Hello! I have a problem when I run a query. The mapping is very simple:

{
    "index": {
        "aliases": {},
        "mappings": {
            "level1": {
                "properties": {
                    "id": {
                        "type": "string"
                    },
                    "level2": {
                        "type": "nested",
                        "properties": {
                            "level3": {
                                "type": "nested",
                                "properties": {
                                    "value1": {
                                        "type": "string"
                                    },
                                    "value2": {
                                        "type": "long"
                                    },
                                    "id": {
                                        "type": "string"
                                    },
                                    "value3": {
                                        "type": "long"
                                    }
                                }
                            },
                            "id": {
                                "type": "string"
                            }
                        }
                    }
                }
            }
        },
        "settings": {
            "index": {
                "creation_date": "1505476515647",
                "number_of_shards": "5",
                "number_of_replicas": "1",
                "uuid": "_0IiQCPrQ1i-kDP1481y8w",
                "version": {
                    "created": "2030099"
                }
            }
        },
        "warmers": {}
    }
}

And the query is:

{"query": {"terms": {"_id": [ "value51" ] }}}

When I do the query in Python I receive data with this structure:

_source (dict)
  level1 (list)
     level2 (list)
        data1 (dict)
              id
              value1
              value2
              value3
        data2 (dict)
        data3 (dict)
        ...
        data65000 (dict)

The problem is that 65,000 entries are too many and I run out of memory. I would like to know whether _search, or Elasticsearch in general, has some way of returning that information (data1, data2, data3, ...) in batches, or whether there is some way to write the query so that I do not run out of memory on the machine. Any ideas?

Thank you. :slight_smile:


(swarmee.net) #2

So you have one document in Elasticsearch with 65,000 nested sub-documents, and when this one document is returned to Python you run out of memory.

Even if Elasticsearch could be configured to return part of the document in chunks, Python would still run out of memory loading this one document.

Solution one: model your data differently and flatten out this nested document.

Solution two: get more memory :grinning:.


(magorbe) #3

Thanks! Solution two is impossible for now. Would solution one be easy to do?


(swarmee.net) #4

Are you loading the data using Logstash? If so, you could just use the split filter and you would get one document for each entry in the level2 list, with all the upper-level fields repeated.
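
If you are not loading through Logstash, the same split can be done client-side in Python. A minimal sketch, assuming a hypothetical target index "index_flat" and the mapping above (the field names come from your mapping; the index and doc-type names are assumptions):

from elasticsearch import Elasticsearch, helpers

es = Elasticsearch(["localhost:9200"])

# Fetch the big parent document once (this still needs enough memory
# for one full document; the point is that queries afterwards won't).
doc = es.get(index="index", doc_type="level1", id="value51")
source = doc["_source"]

def flat_docs():
    # Emit one flat document per level2 entry, repeating the parent id.
    for entry in source["level2"]:
        yield {
            "_index": "index_flat",      # assumed target index
            "_type": "level2_flat",      # assumed doc type
            "parent_id": source.get("id"),
            "level2_id": entry.get("id"),
            "level3": entry.get("level3", []),
        }

# helpers.bulk consumes the generator lazily, so the flat documents are
# indexed in batches instead of being built up in one big list in memory.
helpers.bulk(es, flat_docs())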


(magorbe) #5

No, I don't use Logstash. I am doing:

get_query = self.get(id=id, index=self.default_index, doc_type=self.default_doc_type)

With the get method of the Elasticsearch library, and I receive all the data at once.


(magorbe) #6

I don't understand how to do that, can you give me an example? Thanks!!!!


(system) #7

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.