Subqueries :(

I am using the ingest attachment plugin to index PDFs:

POST /index-name

{
  "attachment_ID": 1,
  "data": "binary-data......",
  "type": "attachment"
}

The data field is run through the ingest pipeline, and I keep only the extracted "content".
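For context, the setup described above could be produced by a pipeline roughly like the following. This is only a sketch, assuming the ingest-attachment plugin's `attachment` processor is what is in use; the pipeline id `pdf-attachment` is a made-up name, and the `remove` step is one assumed way to drop the raw binary after extraction. With the default target field, the extracted text would end up under `attachment.content`.

```python
import json

# Hypothetical pipeline id, chosen only for this illustration.
PIPELINE_ID = "pdf-attachment"

# Assumed pipeline body: extract only "content" from the base64 "data" field,
# then remove the raw binary so it is not stored in the document.
pipeline = {
    "description": "Extract text content from base64-encoded PDFs",
    "processors": [
        {"attachment": {"field": "data", "properties": ["content"]}},
        {"remove": {"field": "data"}},
    ],
}

# Print the request that would register the pipeline.
print(f"PUT /_ingest/pipeline/{PIPELINE_ID}")
print(json.dumps(pipeline, indent=2))
```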

I then have another document like this:

POST /index-name

{
  "type": "blog-post",
  "attachments": [
     {
        "attachment_ID": 1
     }
  ]
}

I need to query the blog-post type, but the content of each post's associated attachments should affect relevancy. In effect I need a subquery over the attachments associated with each blog post, which I guess amounts to a JOIN. I also need to be able to query attachments on their own. My best idea right now is to ingest each PDF twice: once inside the blog post and once in the attachment document.

Any good solutions to this problem?

I solve this while indexing, with my reference plugin.

Scenario: when a blog-post doc that references an attachment doc by ID is indexed, the plugin can look up the attachment doc, extract fields from it, and index those fields into the blog-post doc as well.
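The idea in that scenario can be illustrated outside Elasticsearch with plain dictionaries. This is only a sketch of the denormalization step, not the plugin's actual code: the function name `enrich_blog_post`, the in-memory `attachments_by_id` lookup, and the sample text are all made up for the example.

```python
from copy import deepcopy


def enrich_blog_post(blog_post, attachments_by_id, fields=("content",)):
    """Return a copy of blog_post with selected attachment fields copied in.

    attachments_by_id stands in for a lookup of already indexed
    attachment docs (an assumption made for this sketch).
    """
    enriched = deepcopy(blog_post)
    for ref in enriched.get("attachments", []):
        source = attachments_by_id.get(ref["attachment_ID"])
        if source is None:
            continue  # referenced attachment not found; keep the bare reference
        for field in fields:
            if field in source:
                ref[field] = source[field]
    return enriched


# Sample documents shaped like the ones in the question.
attachments_by_id = {
    1: {"type": "attachment", "attachment_ID": 1,
        "content": "text extracted from the PDF"},
}
blog_post = {"type": "blog-post", "attachments": [{"attachment_ID": 1}]}

enriched = enrich_blog_post(blog_post, attachments_by_id)
print(enriched)
```

The attachment docs stay searchable on their own, while each blog-post doc carries a copy of the attachment text, so an ordinary query on the blog-post type can score on it. The trade-off is that the copied text has to be refreshed whenever an attachment changes.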

See

@jprante any plans to make your plugin work with ES 5.x? It looks nice :slight_smile:

Definitely yes. A 5.1.1 preview is already in the bundle plugin, https://github.com/jprante/elasticsearch-plugin-bundle, but I still have to factor the code out, update it to the latest 5.2.1, and update the docs :slight_smile:
