Query Return: Matching Children AND Matching Parents without Matching Children

I'm an ES beginner but a longtime programmer. I'm currently planning a feature and need some community advice.

Our current NON-ES data structure looks like this:

people: [
    {
        _id: "1",
        name: "James Bond",
        hair: "BLOND",
        actual_id: "1"
    },
    {
        _id: "2",
        name: "007",
        hair: "BLOND",
        actual_id: "1" // this is the parent marker in our data
    },
    {
        _id: "3",
        name: "Jane Doe",
        hair: "BLOND",
        actual_id: "3"
    },
    {
        _id: "4",
        name: "Babe Ruth",
        hair: "BROWN",
        actual_id: "4"
    }
]

We need to create a search that, given this filter:

hair must equal BLOND

returns only matching children, plus matching parents that have no matching children:

[
    // this is a matching child, so its parent (_id:1) will not be returned
    {
        _id: "2",
        name: "007",
        hair: "BLOND",
        actual_id: "1" // this is the parent marker in our data
    },
    // this parent has no matching children, AND hair is BLOND, so it will be returned
    {
        _id: "3",
        name: "Jane Doe",
        hair: "BLOND",
        actual_id: "3"
    }
    // Babe Ruth (_id:4) has BROWN hair so he is NOT returned
]
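
To pin down the rule independently of any ES feature, here is a minimal Python sketch of the same selection done client-side. It is not an Elasticsearch query. The function name `select_results` is hypothetical, and the sketch assumes a record is a parent exactly when its `actual_id` equals its own `_id`, as in the sample data.

```python
people = [
    {"_id": "1", "name": "James Bond", "hair": "BLOND", "actual_id": "1"},
    {"_id": "2", "name": "007", "hair": "BLOND", "actual_id": "1"},
    {"_id": "3", "name": "Jane Doe", "hair": "BLOND", "actual_id": "3"},
    {"_id": "4", "name": "Babe Ruth", "hair": "BROWN", "actual_id": "4"},
]


def select_results(records, hair):
    """Return matching children, plus matching parents with no matching child."""
    matches = [r for r in records if r["hair"] == hair]
    # Assumption: a record is a child when actual_id points at another record.
    children = [r for r in matches if r["actual_id"] != r["_id"]]
    # Parents that are already represented by at least one matching child.
    covered_parents = {c["actual_id"] for c in children}
    parents = [
        r for r in matches
        if r["actual_id"] == r["_id"] and r["_id"] not in covered_parents
    ]
    return sorted(children + parents, key=lambda r: r["_id"])


result_ids = [r["_id"] for r in select_results(people, "BLOND")]
print(result_ids)  # ['2', '3']
```

James Bond (_id:1) matches the filter but is dropped because his child 007 (_id:2) also matches.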

Can this be accomplished with the parent/child relationship in ES or should we plan additional data juggling on our side?

Can this be accomplished with the parent/child relationship in ES

Maybe. But I would try not to overcomplicate the model.

or should we plan additional data juggling on our side?

Yes. If you can simplify things on your side, you will get the best out of Elasticsearch.

Denormalizing data is better.

Thanks Dadoonet.

Do you think my scenario would be easier with 2 indices, one of children and one of parents? Then:
first, run the query on the children index;
then, run the query on the parent index, excluding any of the actual_ids returned by the first call.

This setup would get dicey if, for instance, the user is looking for results 51 to 100. I would pretty much need to request children 1 to 100 and calculate the first 50 items, even though they are not needed.
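
As a rough sketch of that two-call idea, the request bodies could be built like this in Python. Only the query bodies are constructed; no client call is made. The helper names are hypothetical, and the sketch assumes `hair` and `actual_id` are mapped as exact-value (keyword) fields so that `term` and `terms` match them literally.

```python
def children_query(hair):
    """Body for the first call, run against the (hypothetical) children index."""
    return {"query": {"bool": {"filter": [{"term": {"hair": hair}}]}}}


def parents_query(hair, excluded_parent_ids):
    """Body for the second call, run against the (hypothetical) parents index.

    Parents whose actual_id appears in excluded_parent_ids are filtered out.
    """
    return {
        "query": {
            "bool": {
                "filter": [{"term": {"hair": hair}}],
                "must_not": [{"terms": {"actual_id": sorted(excluded_parent_ids)}}],
            }
        }
    }


# Pretend the first call returned these child hits for hair == "BLOND".
child_hits = [{"_id": "2", "name": "007", "hair": "BLOND", "actual_id": "1"}]
excluded = {hit["actual_id"] for hit in child_hits}
second_body = parents_query("BLOND", excluded)
print(second_body)
```

The exclusion list has to come from every matching child, not only from the page being displayed, which is the source of the paging problem described above.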

The trick I need is, if a child matches a query then its matching parent is not returned. But other matching parents without matching children DO need to be returned.

Actually @dadoonet, were you perhaps envisioning this: spreading the children onto the parents? Would that make the query I need easier?

people: [
    {
        _id: "1",
        name: "James Bond",
        hair: "BLOND",
        children: [
            {
                _id: "2",
                name: "007",
                hair: "BLOND"
            }
        ]
    },
    {
        _id: "2",
        name: "007",
        hair: "BLOND",
        children: []
    },
    {
        _id: "3",
        name: "Jane Doe",
        hair: "BLOND",
        children: []
    },
    {
        _id: "4",
        name: "Babe Ruth",
        hair: "BROWN",
        children: []
    }
]
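
With that shape, the selection becomes a single pass over the documents. The following Python sketch shows the logic client-side only; it says nothing about how ES would execute it. The function name `select_from_nested` is hypothetical. Because 007 appears both as a top-level document and as a nested child in the structure above, the sketch de-duplicates by `_id`.

```python
docs = [
    {"_id": "1", "name": "James Bond", "hair": "BLOND",
     "children": [{"_id": "2", "name": "007", "hair": "BLOND"}]},
    {"_id": "2", "name": "007", "hair": "BLOND", "children": []},
    {"_id": "3", "name": "Jane Doe", "hair": "BLOND", "children": []},
    {"_id": "4", "name": "Babe Ruth", "hair": "BROWN", "children": []},
]


def select_from_nested(documents, hair):
    """Prefer matching children; fall back to the document itself if it matches."""
    seen = set()
    results = []
    for doc in documents:
        matching_children = [c for c in doc["children"] if c["hair"] == hair]
        if matching_children:
            candidates = matching_children
        elif doc["hair"] == hair:
            candidates = [doc]
        else:
            candidates = []
        for candidate in candidates:
            # Skip records already emitted (007 is both nested and top-level).
            if candidate["_id"] not in seen:
                seen.add(candidate["_id"])
                results.append(candidate)
    return results


nested_ids = [r["_id"] for r in select_from_nested(docs, "BLOND")]
print(nested_ids)  # ['2', '3']
```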

TBH I don't understand the use case. I'm not speaking about the implementation but just the use case.

In short, you need to ask yourself 2 questions:

  • What kind of object am I looking for? Is it a tweet, a person, a list of persons?
  • What are the properties of the object (all of them, or only the ones I need for search)? Like a name, a city, a birthdate...

If you can answer those 2 questions, that may help you move forward.

Thanks for bearing with me. I tried to make simpler data than our real use-case.

We have accounting software. There are official names for products, such as "Verizon E-Line 1000". Customers, however, may refer to that service by different names internally ("VZN E1000", "Verizon Business ELine", etc.), and our system is good at matching the customer's preferred name to the product's official name.

When the customer searches their products, which can be done through a variety of filters or query methods, they need to see their preferred name "VZN E1000" instead of the official name.

So to convert my first example:

  • James Bond = official name: Verizon E-Line 1000
  • 007 = client's preferred name: VZN E1000
  • BLOND = client's name "ABC INC"

As for additional fields to search on, we need a text search on name, and about 3-4 facet categories for this type of search.
