ElasticSearch + Rails: how to refer to parent fields in nested / has_child query

I need help from people with extensive experience in ElasticSearch or any
other storage that supports full-text search (phrase matching, tokenizers,
analyzers), parent/child relationships, powerful and flexible queries, and
aggregations (sum/avg/min/max, group by a specific field).

Description of the problem

There's a repository with 3 types of entities:

  • Search parameters (search)
  • Users (user)
  • Documents (document)

One user has many documents.

Given a search parameters object (search) created by a user (user), N
documents are selected from storage based on these search parameters (the
target documents). For each of the N documents we must select one document
of the same type (call it the co-document), based on:

  • Some parameters of the user who runs the search (the searcher)
  • The owner of the target document found by the initial search (the
    owner)
  • Some fields of the target document, at least
    target_document.owner_id (mandatory condition)

The result is an array of pairs (target document; co-document). The
repository may contain documents that have no co-document under the
current search conditions; those documents are not in the array.
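
To make the pairing rule concrete, here is a minimal in-memory sketch in plain Ruby (no ElasticSearch involved). Everything except owner_id is an assumption made for illustration: the region and price fields, the build_pairs helper, the rule that a document cannot be its own co-document, and the reading of the mandatory condition as "same owner_id". The real, search-dependent rules are stood in for by a caller-supplied predicate.

```ruby
# Hypothetical model; only owner_id comes from the description above.
User     = Struct.new(:id, :region)
Document = Struct.new(:id, :owner_id, :kind, :price)

# Returns [[target, co_document], ...]. Targets without a co-document are
# skipped, so they do not appear in the result.
# co_match is a predicate (searcher, owner, target, candidate) -> true/false
# that stands in for the real rules, which depend on the search parameters.
def build_pairs(targets, all_documents, users_by_id, searcher, co_match)
  by_owner = all_documents.group_by(&:owner_id)
  targets.each_with_object([]) do |target, pairs|
    owner = users_by_id[target.owner_id]
    # Assumed mandatory condition: candidates share the target's owner_id.
    candidates = by_owner.fetch(target.owner_id, [])
    co = candidates.find do |cand|
      # Assumption: a document is never its own co-document.
      cand.id != target.id && co_match.call(searcher, owner, target, cand)
    end
    pairs << [target, co] if co
  end
end

users = { 1 => User.new(1, "eu"), 2 => User.new(2, "us") }
docs = [
  Document.new(10, 1, "offer", 100),
  Document.new(11, 1, "offer", 120),
  Document.new(20, 2, "offer", 90)
]
searcher = User.new(99, "eu")

# Made-up rule: the owner is in the searcher's region and the candidate
# costs more than the target.
rule = lambda do |s, owner, target, cand|
  owner.region == s.region && cand.price > target.price
end

pairs = build_pairs(docs, docs, users, searcher, rule)
pairs.each { |target, co| puts "target=#{target.id} co=#{co.id}" }
```

The point of the sketch is that the predicate takes the searcher, the owner and the target as inputs, so the pairing cannot be precomputed at index time.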

Finally, we need to build some aggregations over this array (aggregations
in the sense of ElasticSearch 1.0 and above), based on the fields of the
target documents in these pairs.
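
As an illustration of the kind of aggregation meant here, the following plain-Ruby sketch groups pairs by one field of the target document and computes count/sum/avg/min/max over another. The aggregate_pairs helper and the kind and price fields are hypothetical; in ElasticSearch the equivalent would be aggregations run on the server, which is exactly what is hard to do once the pairs only exist in application code.

```ruby
# Hypothetical document shape; only owner_id comes from the description.
Document = Struct.new(:id, :owner_id, :kind, :price)

# Groups pairs by group_field of the target document and computes simple
# metrics over value_field of the target document within each group.
def aggregate_pairs(pairs, group_field, value_field)
  grouped = pairs.group_by { |target, _co| target[group_field] }
  grouped.transform_values do |group|
    values = group.map { |target, _co| target[value_field] }
    {
      count: values.size,
      sum:   values.sum,
      avg:   values.sum.to_f / values.size,
      min:   values.min,
      max:   values.max
    }
  end
end

pairs = [
  [Document.new(1, 1, "offer", 100),  Document.new(2, 1, "offer", 120)],
  [Document.new(3, 2, "offer", 50),   Document.new(4, 2, "offer", 70)],
  [Document.new(5, 3, "request", 10), Document.new(6, 3, "request", 20)]
]

aggregate_pairs(pairs, :kind, :price).each do |kind, stats|
  puts "#{kind}: #{stats}"
end
```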

Problems

  • Denormalizing the data to store pairs (target document <->
    co-document) does not help by itself, because the relationship between
    the target document and the co-document is not defined statically; it
    depends on the search parameters
  • Denormalizing the data by storing a limited number of co-document
    candidates for each target document (for example, those with the same
    owner_id, as a mandatory condition) won't help either, because
    ElasticSearch does not support referring to parent object fields when
    querying the child objects (this is true for all query types related to
    parent-child relations: nested query, has_child query)
  • The problem can be solved the naive way: first select the N target
    documents, then choose a co-document for each of them (N subqueries),
    and then build the aggregations over all these documents in application
    code. This is unacceptable in terms of UX: such a request takes 10-15
    seconds on 15,000 documents, not counting the time to build the
    aggregations, and the repository is expected to hold hundreds of
    thousands or even millions of documents
  • I've considered going back to SQL storage such as PostgreSQL, but as
    far as I know it does not support the functions that are necessary in
    this case (e.g. synonym search, stop-word removal); at least I have no
    information about such support
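
To illustrate two of the points above, here is a rough Ruby sketch. The first part mimics the N-subqueries approach with a stand-in client that only counts round trips (CountingClient, naive_pairs and the price field are hypothetical; no ElasticSearch is involved). The second part builds the body of a has_child query as a plain hash: the inner query takes literal values only, which is where I would need to refer to a parent field and cannot. The child type name "candidate" is made up.

```ruby
require "json"

# Stand-in for a search client: counts round trips instead of calling
# ElasticSearch. The block plays the role of the query.
class CountingClient
  attr_reader :queries

  def initialize(documents)
    @documents = documents
    @queries = 0
  end

  def search
    @queries += 1
    @documents.select { |doc| yield(doc) }
  end
end

# One query for the targets, then one more query per target.
def naive_pairs(client, target_filter, co_filter)
  targets = client.search(&target_filter)
  targets.each_with_object([]) do |target, pairs|
    co = client.search do |doc|
      doc[:owner_id] == target[:owner_id] && co_filter.call(target, doc)
    end.first
    pairs << [target, co] if co
  end
end

# Body of a has_child query. The term value must be a literal known when
# the request is built; it cannot be "the owner_id of the parent".
def has_child_body(child_type, field, literal_value)
  {
    query: {
      has_child: {
        type: child_type,
        query: { term: { field => literal_value } }
      }
    }
  }
end

docs = [
  { id: 1, owner_id: 1, price: 100 },
  { id: 2, owner_id: 1, price: 120 },
  { id: 3, owner_id: 2, price: 90 }
]
client = CountingClient.new(docs)
pairs = naive_pairs(client, ->(_doc) { true }, ->(t, c) { c[:price] > t[:price] })
puts "pairs: #{pairs.size}, round trips: #{client.queries}"
puts JSON.generate(has_child_body("candidate", "owner_id", 42))
```

The number of round trips is N + 1, which is why this approach gets slower linearly with the number of target documents.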

Possible solutions

  • Find ElasticSearch capabilities that I may not know about and that
    could help solve this problem
  • Pick another storage that has all of the ES capabilities plus
    joins/subqueries/you name it

Any help is appreciated.
