A Question on whether to use a Nested Datatype

matthew.whill · November 18, 2019, 10:07pm

Hello All.

I wanted to ask for people's thoughts on whether to use nested datatypes in a particular scenario.

The scenario:

I want to index the contents of files. Each file contains email addresses (along with other textual content). I am extracting all the email addresses with a regex because I want to run aggregation queries on them. For each email address I store three fields: the full address, the domain, and the name before the @ (e.g. "bob" from "bob@email.com").
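As a rough illustration of the extraction step described above, here is a minimal Python sketch. The regex is deliberately simplified, and the function name `extract_emails` and the keys `address`, `name` and `domain` are assumptions made for this example, not names from the original post.

```python
import re

# Simplified pattern for illustration only; real address syntax is broader.
EMAIL_RE = re.compile(r"([A-Za-z0-9._%+-]+)@([A-Za-z0-9.-]+\.[A-Za-z]{2,})")


def extract_emails(text):
    """Return one dict per distinct address found in text.

    Each dict holds the three fields from the post: the full address,
    the name before the @, and the domain.
    """
    seen = set()
    results = []
    for match in EMAIL_RE.finditer(text):
        name = match.group(1)
        domain = match.group(2).lower()  # domains are case-insensitive
        address = f"{name}@{domain}"
        if address in seen:
            continue  # keep only the first occurrence of each address
        seen.add(address)
        results.append({"address": address, "name": name, "domain": domain})
    return results


if __name__ == "__main__":
    sample = "Contact bob@email.com or alice@example.org. bob@email.com again."
    for email in extract_emails(sample):
        print(email)
```

Deduplicating per file keeps the number of stored email objects down, which matters for the limits discussed next.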

A single file may contain thousands of email addresses. I could store the email details as a nested datatype in the same document as the file; however, I have noted that Elasticsearch limits the number of nested objects that can be stored in a single document (index.mapping.nested_objects.limit). Alternatively, I could create a separate index for the email address objects and include a field that stores the ID of the document containing the indexed file. However, it is my understanding that the nested datatype already does something very similar behind the scenes.
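For reference, the nested option could look roughly like the index body below, written as a Python dict. This is only a sketch: the field names `content`, `emails`, `address`, `name` and `domain`, and the helper `nested_index_body`, are hypothetical. The setting `index.mapping.nested_objects.limit` is the one named above. Each nested object is indexed as its own hidden document, which is why a per-document limit exists.

```python
import json


def nested_index_body(nested_limit):
    """Build a create-index body with a nested `emails` field.

    nested_limit raises (or lowers) the per-document cap on nested objects.
    All field names here are assumptions made for illustration.
    """
    return {
        "settings": {
            "index.mapping.nested_objects.limit": nested_limit,
        },
        "mappings": {
            "properties": {
                "content": {"type": "text"},
                "emails": {
                    "type": "nested",
                    "properties": {
                        "address": {"type": "keyword"},
                        "name": {"type": "keyword"},
                        "domain": {"type": "keyword"},
                    },
                },
            }
        },
    }


if __name__ == "__main__":
    # The body would be sent when creating the index; here it is only printed.
    print(json.dumps(nested_index_body(50000), indent=2))
```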

The question:

So my question (after all that): do I go with the nested datatype approach and simply increase the limit (index.mapping.nested_objects.limit) to something "crazy", or do I go with the manual approach of managing a separate index?

Thanks in advance for the advice.

Matt

Mikhail_Khludnev · November 19, 2019, 5:24am

Hello, Matthew.
It sounds like you don't need to bother with nested. Those three fields should be sufficient for the associated emails.

matthew.whill · November 21, 2019, 2:43am

Sorry Mikhail, just confirming: are you saying that I am better off NOT using NESTED DATATYPES and should instead go with a separate index that I manage?

Mark_Harwood · November 21, 2019, 7:49am

I’ve shared this flowchart before to help with the decision process for nested:

[flowchart image not preserved]

Mikhail_Khludnev · November 21, 2019, 8:36pm

Better to NOT use NESTED DATATYPES and get along with three fields.
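One possible reading of this advice, sketched in Python: store the three values as plain multi-valued keyword fields on the file's own document and aggregate on them directly. The field names (`email_addresses`, `email_names`, `email_domains`), the helper `to_flat_document` and the aggregation name `top_domains` are assumptions for illustration. Flat arrays lose the pairing between values of the same email object, but since the name and domain are both derived from the address, that pairing can be recovered from `email_addresses` when needed.

```python
import json


def to_flat_document(content, emails):
    """Build a document with three flat, multi-valued fields.

    emails is a list of dicts with "address", "name" and "domain" keys
    (hypothetical shape). Values are deduplicated and sorted per field.
    """
    return {
        "content": content,
        "email_addresses": sorted({e["address"] for e in emails}),
        "email_names": sorted({e["name"] for e in emails}),
        "email_domains": sorted({e["domain"] for e in emails}),
    }


# A terms aggregation over a flat keyword field needs no nested wrapper.
# Bucket counts are numbers of documents (files) containing each domain.
DOMAIN_AGG_REQUEST = {
    "size": 0,
    "aggs": {
        "top_domains": {
            "terms": {"field": "email_domains", "size": 10},
        }
    },
}


if __name__ == "__main__":
    doc = to_flat_document(
        "file text goes here",
        [
            {"address": "bob@email.com", "name": "bob", "domain": "email.com"},
            {"address": "amy@email.com", "name": "amy", "domain": "email.com"},
        ],
    )
    print(json.dumps(doc, indent=2))
    print(json.dumps(DOMAIN_AGG_REQUEST, indent=2))
```

This assumes the three fields are mapped as `keyword`. If per-occurrence counts within a file are required rather than per-file counts, this layout would not provide them.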

