Query documents based on references to other documents

I'm struggling to put together a Lucene query for Elasticsearch that selects the documents I need. Let me explain with a simple example:

Let's say that we have two types of documents that we want to operate on.

  1. Customer requirements, which can reference zero or more internal requirements.
  2. Internal requirements, which have a status of either "implemented" or "not implemented".

I want a Lucene query that selects all customer requirements that have at least one referenced internal requirement and all referenced internal requirements have status "implemented".

Consider this set of documents, all in the same index:

[
    { "req_id": "cust1", "type": "customer", "refs": ["int1", "int2"] },
    { "req_id": "cust2", "type": "customer", "refs": ["int3"] },
    { "req_id": "cust3", "type": "customer", "refs": ["int3", "int4"] },
    { "req_id": "cust4", "type": "customer", "refs": [] },
    { "req_id": "int1", "type": "internal", "state": "implemented" },
    { "req_id": "int2", "type": "internal", "state": "implemented" },
    { "req_id": "int3", "type": "internal", "state": "implemented" },
    { "req_id": "int4", "type": "internal", "state": "not implemented" }
]

What I'm looking for is a way to query Elasticsearch to select cust1 and cust2 (as all referenced internal requirements have status implemented), but not cust3 (as it references int4 which is not implemented) and not cust4 as it does not reference any internal requirements.
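
To make the expected result concrete, here is a small Python sketch of the selection rule, applied client-side to the sample documents above. It is not an Elasticsearch query. The function name `fully_implemented_customers` is made up for illustration.

```python
# Sample documents from the question, all in the same index.
docs = [
    {"req_id": "cust1", "type": "customer", "refs": ["int1", "int2"]},
    {"req_id": "cust2", "type": "customer", "refs": ["int3"]},
    {"req_id": "cust3", "type": "customer", "refs": ["int3", "int4"]},
    {"req_id": "cust4", "type": "customer", "refs": []},
    {"req_id": "int1", "type": "internal", "state": "implemented"},
    {"req_id": "int2", "type": "internal", "state": "implemented"},
    {"req_id": "int3", "type": "internal", "state": "implemented"},
    {"req_id": "int4", "type": "internal", "state": "not implemented"},
]


def fully_implemented_customers(documents):
    """Return req_ids of customer requirements that reference at least one
    internal requirement and whose references are all "implemented".
    (Hypothetical helper, only meant to pin down the desired semantics.)"""
    # Look up each internal requirement's state by its req_id.
    state_by_id = {
        d["req_id"]: d["state"] for d in documents if d["type"] == "internal"
    }
    selected = []
    for d in documents:
        if d["type"] != "customer":
            continue
        refs = d["refs"]
        # A non-empty refs list is required; a missing reference counts as
        # not implemented because .get() returns None for it.
        if refs and all(state_by_id.get(r) == "implemented" for r in refs):
            selected.append(d["req_id"])
    return selected


print(fully_implemented_customers(docs))
```

With the sample data this selects cust1 and cust2 only, which is the result I would like to get from a single query.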

Is this possible in Elasticsearch, or do I need to add the information from the referenced internal requirements to the customer requirements before I upload them to Elasticsearch?
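
For reference, the fallback I have in mind would look roughly like the sketch below: compute a flag on each customer requirement before uploading. The field name `all_refs_implemented` and the function `denormalize` are assumptions made up for this example, not anything Elasticsearch provides.

```python
def denormalize(documents):
    """Return copies of the customer requirements with an extra boolean
    field, all_refs_implemented (hypothetical name), computed from the
    referenced internal requirements before upload."""
    # Look up each internal requirement's state by its req_id.
    state_by_id = {
        d["req_id"]: d["state"] for d in documents if d["type"] == "internal"
    }
    enriched = []
    for d in documents:
        if d["type"] != "customer":
            continue
        refs = d["refs"]
        copy = dict(d)  # shallow copy so the input documents stay unchanged
        # True only if there is at least one reference and every referenced
        # internal requirement is "implemented".
        copy["all_refs_implemented"] = bool(refs) and all(
            state_by_id.get(r) == "implemented" for r in refs
        )
        enriched.append(copy)
    return enriched


sample = [
    {"req_id": "cust3", "type": "customer", "refs": ["int3", "int4"]},
    {"req_id": "cust2", "type": "customer", "refs": ["int3"]},
    {"req_id": "int3", "type": "internal", "state": "implemented"},
    {"req_id": "int4", "type": "internal", "state": "not implemented"},
]
for doc in denormalize(sample):
    print(doc["req_id"], doc["all_refs_implemented"])
```

With such a field in place, I assume a query string along the lines of `type:customer AND all_refs_implemented:true` would be enough, but the flag would have to be recomputed whenever the state of an internal requirement changes, which is what I would like to avoid.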
