Suggest index design for my data

Hey, everyone,

We are currently building a search engine for our internal data, using Elasticsearch. Our data looks like this:

  • Object A: {attribute_a_1, attribute_a_2} - 100M of object A;
  • Object B: {attribute_b_1, attribute_b_2} - 200M of object B;
  • Object C: {attribute_c_1, attribute_c_2} - 50M of object C.

Now the relationships are:

  • Each object A can have zero, one, or many objects B;
  • Each object A can have zero, one, or many objects C.

My goal is an index that can be searched very quickly for:

  • Objects A - by their own attributes and by the attributes of related objects;
  • Objects B - by their own attributes and by the attributes of related objects.

Example:

Find all objects B where attribute_a_1="something", attribute_b_2="something", attribute_c_1="something".

Currently we are using nested structures:

PUT object_a
{
	"mappings": {
		"properties": {
			"attribute_a_1": {...},
			"attribute_a_2": {...},
			"object_b": {
				"type": "nested",
				"properties": {
					"attribute_b_1": {...},
					"attribute_b_2": {...}
				}
			},
			"object_c": {
				"type": "nested",
				"properties": {
					"attribute_c_1": {...},
					"attribute_c_2": {...}
				}
			}
		}
	}
}
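
For reference, the example query above could be expressed against this nested mapping roughly as follows. This is only a sketch in Python that builds the search body as a plain dict; it assumes the elided attribute mappings are `keyword` fields (so `term` gives exact matches), and the function name `find_a_with_matching_b` is made up for illustration. Because objects B live inside objects A, the search returns documents A, and `inner_hits` is what exposes the matching objects B.

```python
import json


def find_a_with_matching_b(a1, b2, c1):
    """Build a search body for the object_a index (hypothetical helper).

    Matches documents A whose attribute_a_1 equals a1 and that have at
    least one nested B with attribute_b_2 == b2 and at least one nested C
    with attribute_c_1 == c1. Assumes keyword-typed attributes.
    """
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"attribute_a_1": a1}},
                    {
                        "nested": {
                            "path": "object_b",
                            "query": {"term": {"object_b.attribute_b_2": b2}},
                            # Return the matching nested B objects per hit
                            "inner_hits": {"size": 10},
                        }
                    },
                    {
                        "nested": {
                            "path": "object_c",
                            "query": {"term": {"object_c.attribute_c_1": c1}},
                        }
                    },
                ]
            }
        }
    }


body = find_a_with_matching_b("something", "something", "something")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a search request against `object_a`. Note that each nested clause is evaluated per document A, so an object A with hundreds of thousands of nested objects still has to be scanned in full.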

Performance isn't bad, but some objects A contain hundreds of thousands of objects B and C, and those really slow down querying. We know that this design makes updates expensive, but we rarely update the data, so that is not a problem. Any suggestions on how we can improve our index design? Maybe we are thinking about this all wrong. I would really appreciate any suggestions. Thank you in advance :slight_smile:

If you need to have all the attributes of object B inside A, some documents are going to be huge.

I would probably put each object type in its own index and have the documents reference the attributes they have in common. I would then use a key to search for attributes across all the indices.
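
A minimal sketch of that key-based lookup, under some assumptions that are not in the thread: each object B and C document stores the key of its parent A in a `keyword` field called `a_id` (a hypothetical name), and the application has already collected the matching keys from the A and C indices with two earlier searches. The Python below only builds the final search bodies for the B index. It splits the key list into chunks because a `terms` query accepts at most 65,536 values by default (the `index.max_terms_count` setting).

```python
MAX_TERMS = 65536  # default value of index.max_terms_count


def chunked(keys, size=MAX_TERMS):
    """Split the keys into sorted lists of at most `size` items."""
    keys = sorted(keys)
    return [keys[i:i + size] for i in range(0, len(keys), size)]


def b_search_bodies(b2, a_keys):
    """Build one search body per chunk of parent keys (hypothetical helper).

    Each body matches objects B with attribute_b_2 == b2 whose parent key
    (the assumed `a_id` field) is in the given chunk.
    """
    return [
        {
            "query": {
                "bool": {
                    "filter": [
                        {"term": {"attribute_b_2": b2}},
                        {"terms": {"a_id": chunk}},
                    ]
                }
            }
        }
        for chunk in chunked(a_keys)
    ]


# Made-up keys standing in for the results of the A and C searches
keys_from_a = {"a1", "a2", "a3"}
keys_from_c = {"a2", "a3", "a9"}

# Only parents that matched in both indices are passed on to the B search
bodies = b_search_bodies("something", keys_from_a & keys_from_c)
```

The trade-off is that the join happens in the application: documents stay small, but a query whose intermediate key set is very large needs several round trips.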
