Document-level metadata


(Val Crettaz) #1

The _meta field makes it possible to attach custom metadata at the mapping type level. I've come across a few use cases recently (see genesis here) where it'd be handy to have a _meta field at the document level, too, in order to attach technical or business-specific metadata that doesn't necessarily belong to the core of the document itself.

When indexing a new document, we'd be able to specify the _meta hash inside the document and ES would extract it (as it used to do with other meta fields back in the day):

PUT index/doc/1
{
   "_meta": {
     "foo": "bar"
   },
   "field1": "some_data",
   "field2": "some_data"
}

When searching for documents, the _meta field would come back outside of the _source along with other meta fields:

POST index/_search
=>
  "hits": {
    "total": 12,
    "max_score": null,
    "hits": [
      {
        "_index": "index",
        "_type": "doc",
        "_id": "1",
        "_version": 1,
        "_score": null,
        "_meta": {
          "foo": "bar"
        },
        "_source": {
          "field1": "some_data",
          "field2": "some_data"
        }
      }
    ]
  }

I couldn't find any existing or past issues or discussion threads on this subject. I'm curious to know what the pros and cons of adding such a feature would be.


(Thiago Souza) #2

Hey there. I am curious to understand why it matters whether the _meta field comes nested inside the _source field or not. If the field is already named _meta, wouldn't that be enough for the application to handle it differently?
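
As an illustration of handling it on the application side, here is a minimal Python sketch that lifts a metadata key out of _source so that a hit looks like the response proposed in #1. It works on plain dictionaries only. The helper name lift_meta and its meta_key parameter are hypothetical, and it assumes the metadata was indexed as a top-level key of the document (whether the cluster accepts a field literally named _meta is not verified here, hence the parameter).

```python
from copy import deepcopy


def lift_meta(hit, meta_key="_meta"):
    """Return a copy of a search hit with `meta_key` moved out of `_source`.

    Hypothetical client-side helper: `hit` is one entry of hits.hits,
    already parsed into a dict. The input is left untouched.
    """
    lifted = deepcopy(hit)
    source = lifted.get("_source", {})
    if meta_key in source:
        # Move the metadata next to _index, _id, etc.
        lifted[meta_key] = source.pop(meta_key)
    return lifted


# Example hit, shaped like the one in the first post but with the
# metadata still nested inside _source.
hit = {
    "_index": "index",
    "_id": "1",
    "_source": {
        "_meta": {"foo": "bar"},
        "field1": "some_data",
        "field2": "some_data",
    },
}

lifted = lift_meta(hit)
```

After the call, lifted carries the metadata next to _source while the original hit is unchanged.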


(Val Crettaz) #3

Hey Thiago, thanks for dropping by. If you look at the second link I provided, you'll understand how it could be useful in some cases.

Long story short, since the Update API doesn't support ingestion pipelines (yet?), one alternative would be to have such a meta field where document-level metadata could be added without the risk of being overwritten when reindexing the document as a whole.
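
To make the overwrite risk concrete, here is a minimal Python sketch of what a client has to do today when the metadata lives inside the document: read the current _source, then carry the metadata over into the replacement before indexing it again. It uses plain dictionaries only. The function name replace_preserving_meta and the meta_key parameter are hypothetical, and the fetch and index calls are deliberately left out.

```python
def replace_preserving_meta(existing_source, new_source, meta_key="_meta"):
    """Build the body for a full reindex without losing stored metadata.

    Hypothetical helper: `existing_source` is the _source currently stored,
    `new_source` is the replacement coming from the application. If the
    replacement carries its own metadata, that one wins.
    """
    merged = dict(new_source)
    if meta_key not in merged and meta_key in existing_source:
        # Carry the stored metadata over so the full replace keeps it.
        merged[meta_key] = existing_source[meta_key]
    return merged


existing = {"_meta": {"foo": "bar"}, "field1": "old_data"}
incoming = {"field1": "some_data", "field2": "some_data"}

body = replace_preserving_meta(existing, incoming)
```

This is a read-modify-write on the client, so a real implementation would also have to deal with concurrent writers. A document-level _meta field kept outside of _source would make that extra round trip unnecessary.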


(Thiago Souza) #4

OK, I understand what you mean. I guess the proper way to support this would be to be able to use the Update API combined with an ingest pipeline, somehow.

There is already an open issue for tracking this requirement. If more users show interest in it, then it will be implemented.


(Thiago Souza) #5

You could, though, implement a plugin that does what you want. The only caveat is that I don't think a plugin can add custom metadata to a document, but this can easily be solved by handling the _meta field differently (maybe it is possible to add custom metadata, but that needs investigation).


(Val Crettaz) #6

Thanks for your input, Thiago, much appreciated.

Yes, the best way to handle this would be for the Update API to support ingestion pipelines. I've been following that issue for a while now, and it seems to be attracting substantial interest. Let's see what happens.

Having a document-level _meta field could be nice, even though it's not the primary focus here, merely a way to work around the lack of support for ingestion pipelines in the Update API.

