Hi all,
When I was first learning Elasticsearch and designing my first index, a colleague of mine suggested not using ES parent/child joins because they were slow.
I countered by claiming that Elasticsearch's implementation of join was unlikely to be slower than a "client-side join." By "client-side join" I mean running one query and then a second query that uses the results of the first. With the ES join, I avoid a round trip, serialization/deserialization, HTTP request overhead, etc. I was also worried that I might lose some roaring bitset speed by going the client-side route.
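To make the comparison concrete, here is a rough sketch of the two approaches in Python. It only builds request bodies and chains two calls; `search` is a hypothetical callable standing in for whatever client you use, and the index names, the `parent_id` field, and the `size` cap are assumptions for illustration, not anything from my actual setup. The `has_child` version assumes both document kinds live in one index with a join field mapping.

```python
from typing import Any, Callable, Dict, List

# Hypothetical signature: search(index_name, request_body) -> response dict.
Search = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def has_child_query(child_type: str, child_query: Dict[str, Any]) -> Dict[str, Any]:
    """Server-side join: one request, parents matched via their children."""
    return {"query": {"has_child": {"type": child_type, "query": child_query}}}


def client_side_join(
    search: Search,
    child_index: str,
    parent_index: str,
    child_query: Dict[str, Any],
    parent_id_field: str,
    size: int = 1000,  # assumed cap; more matching children would need paging
) -> List[Dict[str, Any]]:
    """Client-side join: query children, then fetch their parents by ID."""
    # Round trip 1: fetch only the field that points at the parent.
    first = search(
        child_index,
        {"query": child_query, "_source": [parent_id_field], "size": size},
    )
    parent_ids = sorted(
        {hit["_source"][parent_id_field] for hit in first["hits"]["hits"]}
    )
    if not parent_ids:
        return []

    # Round trip 2: look the parents up with an ids query.
    second = search(
        parent_index,
        {"query": {"ids": {"values": parent_ids}}, "size": len(parent_ids)},
    )
    return second["hits"]["hits"]
```

The second function is where the extra round trip and the (de)serialization of the intermediate ID list show up; how much that costs relative to `has_child` will depend on how many IDs the first query returns.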
However, my data and index have evolved over time into something quite different from what I expected. For one thing, we have far more documents of one kind than the other, and the documents in the smaller group have many more fields.
As a result, I'm starting to consider that the downsides of keeping both kinds of documents in the same index might outweigh the downsides of a client-side join.
Any thoughts welcome!