Nested field in app search

What is best practice when needing to store data in a nested field for app search

I have the following example, where the events key can hold an array of
events

How would i perform a search to find all documents where the events.slug = 'demo-sale' for example?

EDIT: Response updated by JasonStoltz August 25th, 2021, to be a bit more understandable.

Modeling relational data in any object data store can be challenging.

Because App Search does not support nested objects, you'll need to flatten your data.

That's generally the concept, but that could be accomplished a few different way.

Here are some thoughts.

Assuming you have two types of objects, Cars, and Events, where Car can have many Events associated with it.

Car:
- id
- make
- model
- events: Event[]

Event:
- id
- name
- slug

Solution 1: Nest Event fields into Car as separate arrays

Instead of:

[{
  id: 1,
  make: 'kia',
  model: 'sorento',
  events: [
    {id: 100, date: '10-15-2021', slug: 'demo-sale'},
    {id: 200, date: '09-6-2021', slug: 'blowout-sale'}
  ]
}]

You could have:

[{
  id: 1,
  make: 'kia',
  model: 'sorento',
  events.ids: [100, 200],
  events.dates: ['10-15-2021', '09-6-2021'],
  events.slugs: ['demo-sale', 'blowout-sale']
}]

That will get your use case working, filtering by event.slug.

However, there's a number of difficulties with this:

  1. You'll find it difficult if you want to do any filtering by more than 1 sub-field. I.e., you can't do a filter on both events.slug and events.date.

For instance, if you had the following:

  events.id: [100, 200, 300],
  events.date: ['10-15-2021', '09-6-2021', '08-12-2021'],
  events.slugs: ['demo-sale', 'blowout-sale', 'another-sale']

And you wanted to find cars found at an '08-12-2021' occurrence of 'demo-sale' you'd query like:

events.date='08-12-2021' and events.slugs='demo-sale'

You would get a hit on the Kia Sorrento document based on a date and slug from different events, which is not what we want:

  events.id: [100, 200, 300],
  events.date: ['10-15-2021', '09-6-2021', **'08-12-2021'**],
  events.slugs: [**'demo-sale'**, 'blowout-sale', 'another-sale']

Solution 2: Duplicate the Car document for each Event

In document stores like Elasticsearch or App Search, it's not uncommon to duplicate your data. So you could try creating 1 instance of Car for each event.

[{
  id: 1,
  make: 'kia',
  model: 'sorento',
  event.id: 100,
  event.date: '10-15-2021',
  event.slug: 'demo-sale'
},{
  id: 2,
  make: 'kia',
  model: 'sorento',
  event.id: 100,
  event.date: '10-15-2021',
  event.slug: 'demo-sale'
}]

That means you could filter by 'event.slug' and 'event.date', which we couldn't do with the previous example.

Data maintenance gets a bit trickier as you now have to update a bunch of documents every time you update Car details.

It also makes querying by Car details harder, as searching for model='sorento' returns duplicate entries. You'd have to use the grouping option in your search: Search API group | Elastic App Search Documentation [7.14] | Elastic.

Additionally, what happens if you have other relationships as well? Like if Cars in addition to Events have Retailers. You couldn't conceivably maintain a record for every Event and Retailer combination.

Solution 3: Event as the main object with nested Car fields

[{
  id: 100,
  date: '10-15-2021',
  slug: 'demo-sale'
  car.id: 1,
  car.make: 'kia',
  car.model: 'sorento',
},{
  id: 100,
  date: '10-15-2021',
  slug: 'demo-sale'
  car.id: 2,
  car.make: 'toyota',
  car.model: 'corolla',
},{
  car.id: 1,
  car.make: 'kia',
  car.model: 'sorento',
  id: 100,
  date: '10-15-2021',
  slug: 'demo-sale'
}]

This is comparable to the first previous solution, and suffers from the same challenges.

Option 4: Separate Engines for Cars and Events

// Events
[
  {
    id: 100,
    date: '10-15-2021',
    slug: 'demo-sale',
    cars: [1, 2]
  },
  {
    id: 200,
    date: '09-6-2021',
    slug: 'blowout-sale',
    cars: [1, 2]
   }
]

// Cars
[
  {
    id: 1,
    make: 'kia',
    model: 'sorento'
    event_slugs: ['demo-sale', 'blowout-sale'],
    event_ids: [100, 200]
},{
    id: 2,
    make: 'toyota',
    model: 'corolla',
    event_slugs: ['demo-sale', 'blowout-sale'],
    event_ids: [100, 200]
  }
]

You could query for Cars and filter by event_slugs.

If you needed to filter by multiple fields, you could accomplish this in 2 queries:
Query 1: Query Events by date and slug to get a list of event ids
Query 2: Query Cars and filter by event_id

This seems pretty clean to me. But it make data management a bit more complicated.

There could be a number of variation on this, and it would largely depend on how you're planning to query and / or display data.

Huge disclaimer: I personally have not done any of these, so this is all theoretical. I would love to hear from others that may have experience with this. I would also love to hear back from you if any of these solutions work.

3 Likes

Thanks so much for the detailed response Jason!

I think the first solution is probably the best, I'd prefer to keep the simplest solution, without having to add additional documents or engines.

It allows us to get documents part of events and it's sufficient to facet filter on simply one of the event fields.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.