ES 6: nested documents vs. the new join type


#1

I'm starting a search project with ES 6. The data is structured much like Stack Overflow: the parent document is a question, and each question has two types of children, answers and comments. A question has a title, author, time and tags. Answers and comments have a body, author and time.

From a reading of the official docs, the join data type (https://www.elastic.co/guide/en/elasticsearch/reference/current/parent-join.html) seemed appropriate.
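For concreteness, here is a minimal sketch of the single-index layout I have in mind, written as plain Python dicts that could be sent as request bodies. The field names (`relation`, `title`, `body`, `tags` and so on) and the `_doc` type name are my own assumptions, not something prescribed by the docs.

```python
# Sketch only: field names and the "_doc" type name are assumptions.

def build_index_body():
    """Index body for ES 6 with one join field linking questions to children."""
    return {
        "mappings": {
            "_doc": {
                "properties": {
                    # One parent relation with two child relations.
                    "relation": {
                        "type": "join",
                        "relations": {"question": ["answer", "comment"]},
                    },
                    "title": {"type": "text"},
                    "body": {"type": "text"},
                    "author": {"type": "keyword"},
                    "time": {"type": "date"},
                    "tags": {"type": "keyword"},
                }
            }
        }
    }


def question_doc(title, author, time, tags):
    """Parent document: carries the question fields and the parent relation."""
    return {
        "title": title,
        "author": author,
        "time": time,
        "tags": list(tags),
        "relation": {"name": "question"},
    }


def child_doc(kind, parent_id, body, author, time):
    """Child document. It must be indexed with routing set to the parent id
    so that it lands on the same shard as its question."""
    if kind not in ("answer", "comment"):
        raise ValueError("kind must be 'answer' or 'comment'")
    return {
        "body": body,
        "author": author,
        "time": time,
        "relation": {"name": kind, "parent": parent_id},
    }
```

All three document kinds would live in the same index, which is what makes the join field necessary in the first place.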

Search requirement: find answers or comments whose body contains specific text and whose parent question has certain tags. I presume this requires the has_child query. But then again, the docs say that has_child is slow because of the join.
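To make the requirement concrete, here is a rough sketch of the two join queries as I understand them, again as plain Python dicts. As far as I can tell, returning the children themselves is expressed with `has_parent`, while `has_child` is the mirror image that returns questions. The field names `body` and `tags`, and the choice of matching any of the given tags, are assumptions on my part.

```python
# Sketch only: "body" and "tags" are assumed field names.

def children_of_tagged_questions(text, tags):
    """Return answers/comments matching `text` whose parent question
    carries at least one of `tags` (uses has_parent)."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"body": text}}],
                "filter": [
                    {
                        "has_parent": {
                            "parent_type": "question",
                            "query": {"terms": {"tags": list(tags)}},
                        }
                    }
                ],
            }
        }
    }


def questions_with_matching_answers(tag, text):
    """Return questions tagged `tag` that have at least one answer
    matching `text` (uses has_child)."""
    return {
        "query": {
            "bool": {
                "filter": [{"term": {"tags": tag}}],
                "must": [
                    {
                        "has_child": {
                            "type": "answer",
                            "query": {"match": {"body": text}},
                        }
                    }
                ],
            }
        }
    }
```

Either way a join is evaluated at search time, which is the cost the docs warn about.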

What then is the right way to represent this data? Or is there a way to use the join type but still avoid using has_child queries?

Below is the document modeling advice from the ES docs:
"Documents should be modeled so that search-time operations are as cheap as possible.

In particular, joins should be avoided. nested can make queries several times slower and parent-child relations can make queries hundreds of times slower. So if the same questions can be answered without joins by denormalizing documents, significant speedups can be expected."
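If I read this advice correctly, denormalizing here would mean copying the question's tags (and perhaps its title) onto every answer and comment, so that the search becomes a plain bool query with no join. A minimal sketch, assuming hypothetical field names such as `question_id` and `question_tags`:

```python
# Sketch only: all field names are assumptions made for illustration.

def denormalize_child(question, child, kind):
    """Build a standalone answer/comment document that carries a copy of
    the parent question's tags and title."""
    return {
        "kind": kind,  # "answer" or "comment"
        "body": child["body"],
        "author": child["author"],
        "time": child["time"],
        "question_id": question["id"],
        "question_title": question["title"],
        "question_tags": list(question["tags"]),
    }


def search_children(text, tags):
    """Join-free search: full-text match on the body, filtered by the
    tags that were copied from the parent question."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"body": text}}],
                "filter": [{"terms": {"question_tags": list(tags)}}],
            }
        }
    }
```

The obvious cost is on the write side: whenever a question is retagged, every one of its answers and comments would need to be reindexed.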

Since multi-type indices are out of the question with ES 6, does the above mean that I should consider putting "questions" and "answers/comments" in two separate indices? If so, would a query such as "find questions tagged 'java' whose answers contain the text 'spring'" be efficient, given that it is a multi-index search?
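As far as I understand, a single query cannot join across indices, so the two-index layout would turn this search into two requests joined in the application. A rough sketch of what I imagine, where the `question_id` field, the `size` cap and the helper names are all hypothetical:

```python
# Sketch only: an application-side join across two indices.
# Field names, helper names and the size cap are assumptions.

def question_ids_body(tag, size=1000):
    """Step 1: search the questions index for ids of questions with `tag`."""
    return {
        "size": size,
        "_source": False,  # only the ids are needed
        "query": {"bool": {"filter": [{"term": {"tags": tag}}]}},
    }


def extract_ids(response):
    """Pull the document ids out of a search response dict."""
    return [hit["_id"] for hit in response["hits"]["hits"]]


def answers_body(text, question_ids):
    """Step 2: search the answers index, restricted to the ids from step 1."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"body": text}}],
                "filter": [{"terms": {"question_id": list(question_ids)}}],
            }
        }
    }
```

My worry with this approach is that a popular tag such as "java" could match a very large number of questions, making the id list in the second request unwieldy.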

