How to implement join-fields with Spring Data Elasticsearch 4.0.0

Hi,

I am using Spring Data Elasticsearch 4.0.0 and would like to establish a parent-child relationship between my entities. I found out that @JoinTypeRelation comes with version 4.1.x, but unfortunately I am stuck with 4.0.0. The official documentation does not have any information about how to implement join fields, but I hope there is still a way to do it.

To give some more detail: since types are no longer supported in ES 7.6.2, I merged my two entities, parent and child, into a single class which holds either parent or child information, but never both.

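import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

@Document(indexName = "my_index")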
public class ParentOrChild {

    @Id
    private String _id;

    @Field(type = FieldType.Keyword)
    private String someParentProperty;

    @Field(type = FieldType.Keyword)
    private String someChildProperty;

    // getters and setters
}

Now I would like to create a join field so that entities that represent a child can reference another entity that represents a parent. My goal is to later find parent entities by searching for properties of their children like this:

GET my_index/_search
{
    "query": {
        "has_child" : {
            "type" : "_doc",
            "query" : {
                "fuzzy" : {
                    "someChildProperty" : "value"
                }
            }
        }
    }
}

I appreciate any hints you can give me.

Welcome!

I'm not going to comment on the Spring side, as I don't know how this is implemented in the next version or whether there's a workaround.
The only workaround I can imagine is to provide the mapping manually and write the queries manually.
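
As a rough illustration of what "manually" could look like, here is a minimal sketch in plain Java that only builds the JSON bodies for a join-field mapping and a has_child query. The class name, the join field name "relation" and the relation names "parent" and "child" are made up for this example, and how the bodies are sent to the cluster is left open. Note that has_child expects the child relation name as its "type", and that a child document has to be indexed with routing set to its parent's id.

```java
public class JoinFieldSketch {

    // Body for PUT my_index/_mapping: declares a join field with one parent -> child relation.
    // The field and relation names are placeholders chosen by the caller.
    static String joinMapping(String joinField, String parentName, String childName) {
        return "{\n"
            + "  \"properties\": {\n"
            + "    \"" + joinField + "\": {\n"
            + "      \"type\": \"join\",\n"
            + "      \"relations\": { \"" + parentName + "\": \"" + childName + "\" }\n"
            + "    }\n"
            + "  }\n"
            + "}";
    }

    // Value of the join field inside a child document; it names the relation and the parent id.
    static String childJoinValue(String childName, String parentId) {
        return "{ \"name\": \"" + childName + "\", \"parent\": \"" + parentId + "\" }";
    }

    // Body for GET my_index/_search: finds parents whose children match a fuzzy query.
    // Inputs are assumed not to need JSON escaping.
    static String hasChildQuery(String childName, String field, String value) {
        return "{\n"
            + "  \"query\": {\n"
            + "    \"has_child\": {\n"
            + "      \"type\": \"" + childName + "\",\n"
            + "      \"query\": { \"fuzzy\": { \"" + field + "\": \"" + value + "\" } }\n"
            + "    }\n"
            + "  }\n"
            + "}";
    }

    public static void main(String[] args) {
        System.out.println(joinMapping("relation", "parent", "child"));
        System.out.println(childJoinValue("child", "1"));
        System.out.println(hasChildQuery("child", "someChildProperty", "value"));
    }
}
```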

That being said, before going further, are you sure that you must use a relationship model in Elasticsearch? Is there anything that prevents you from denormalizing your data and avoiding joins?

IMO joins should be used only when nothing else is possible. At least if you want the application to be as fast as possible.


Thank you for your answer,

I guess it would not be a problem to just duplicate the parent data in the child, since it already has the fields anyway. The only problem I see is when a parent gets updated. Then I would have to find all of the children and update them as well.

That seems like a costly operation. Do you think it is still better than having joins?
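
For reference, the update I have in mind could probably be done as a single update-by-query request instead of loading and saving every child. Below is a minimal sketch in plain Java that only builds the request body for POST my_index/_update_by_query. It assumes each child stores the id of its parent in a keyword field, here called parentId, which does not exist in my class yet; the class and method names are made up as well.

```java
public class UpdateChildrenSketch {

    // Body for POST my_index/_update_by_query: sets one copied parent property
    // on every document whose reference field equals the given parent id.
    // Inputs are assumed not to need JSON escaping.
    static String updateByQueryBody(String refField, String parentId,
                                    String property, String newValue) {
        return "{\n"
            + "  \"script\": {\n"
            + "    \"lang\": \"painless\",\n"
            + "    \"source\": \"ctx._source." + property + " = params.value\",\n"
            + "    \"params\": { \"value\": \"" + newValue + "\" }\n"
            + "  },\n"
            + "  \"query\": { \"term\": { \"" + refField + "\": \"" + parentId + "\" } }\n"
            + "}";
    }

    public static void main(String[] args) {
        // parentId is a hypothetical keyword field holding the parent's id in each child.
        System.out.println(updateByQueryBody("parentId", "1", "someParentProperty", "new value"));
    }
}
```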

EDIT: I talked to my colleagues, and in our scenario updates to parents are very rare, whereas searches for children by attributes of the parent are very common. So denormalization seems like a very good idea. Thank you very much, David.
