Enrich policy with integrated sort/search query

winlamp · March 27, 2020, 7:53am

Hi,

I'm trying to integrate a search query into my enrich policy. I need this because I only want the most recent data from my index, so I would like to sort on the index's @timestamp field.

I already have a search that returns the field I need. It returns the latest entry for "Device-1" based on @timestamp.

GET my-enrich-index-*/_search
{
  "query": {
    "match": {
      "device.internal_name": "Device-1"
    }
  },
  "sort": [
    {
      "@timestamp": {
        "order": "desc"
      }
    }
  ],
  "size": 1
}

Now I need to integrate it into my enrich policy:

PUT /_enrich/policy/MyDevice-policy
{
  "match":
  {
    "indices": "my-enrich-index-*",
    "match_field": "device.internal_name",
    "enrich_fields": "serial_number"
  }
}

Right now the enrich policy returns the first value ever ingested. No matter what I try, it will not return the latest value from my-enrich-index-*.

Please advise on how to integrate my search into my enrich policy.
Thx a lot!

xeraa · March 28, 2020, 4:16am

I don't think this is possible, and it isn't how enrich policies are supposed to work.

Going by the exact-match example in the docs, the lookup uses a term query.

Since you need to set up an enrich index explicitly anyway, I would create it without duplicates. If you use the unique matching field as the _id of the document, you'll only have the current documents in there and won't have to worry about sorting any more. For performance reasons I'd also keep this index as minimal as possible and keep historic values in another index (if needed).

winlamp · March 28, 2020, 8:08am

OK, that's bad. My question now is this: if I use my unique ID as _id, I can't use it for matching anymore, since I have to use the same field name in both indices (the enrich index and the index to be enriched) to reference it, right?
In my to-be-enriched index the _id field is something completely different, since it is used for another use case.
So how can I reference the enrich index from my to-be-enriched index when the fields are named differently?

thx again!

Christian_Dahlqvist · March 28, 2020, 8:56am

I do not understand. You keep the structure of the document as it is, but set the document ID to the unique identifier for the device, e.g. device.internal_name. Every time a new document for a specific device with that ID comes in, it will overwrite the existing version. You therefore keep only the most recent version for each device, which means that your query will always return just one document and you do not need the sort and size clauses.
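As a minimal sketch of the overwrite behaviour described above: the index name my-enrich-index-latest, the serial numbers, and the timestamps below are made up for illustration and do not come from the original posts. Indexing twice under the same document ID replaces the first document, so fetching that ID returns only the newer values.

```
# Hypothetical index name and values, for illustration only.
# First version of the device document, stored under _id "Device-1".
PUT my-enrich-index-latest/_doc/Device-1
{
  "@timestamp": "2020-03-27T07:00:00Z",
  "device": { "internal_name": "Device-1" },
  "serial_number": "SN-0001"
}

# Same _id again: this replaces the document above instead of adding a second one.
PUT my-enrich-index-latest/_doc/Device-1
{
  "@timestamp": "2020-03-28T07:00:00Z",
  "device": { "internal_name": "Device-1" },
  "serial_number": "SN-0002"
}

# Fetching by ID now returns only the second version.
GET my-enrich-index-latest/_doc/Device-1
```

Note that device.internal_name is still a normal field in the document body, so it can remain the match_field of the enrich policy; the _id is only used to prevent duplicates.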

If you want to keep track of all the state changes, you can write all changes to a different index where you let Elasticsearch set the document id.
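On the earlier question about differently named fields: the enrich processor takes its own field setting, which names the field in the incoming document and does not have to match the policy's match_field. The sketch below assumes the policy from the first post; the pipeline name device-lookup, the incoming field host.device_name, and the target field device_info are hypothetical.

```
# An enrich policy reads from a snapshot of the source index, so it has to be
# executed (and re-executed after the source data changes) before lookups see new values.
POST /_enrich/policy/MyDevice-policy/_execute

# Hypothetical pipeline for the index that is to be enriched.
PUT _ingest/pipeline/device-lookup
{
  "processors": [
    {
      "enrich": {
        "policy_name": "MyDevice-policy",
        "field": "host.device_name",
        "target_field": "device_info",
        "max_matches": 1
      }
    }
  ]
}
```

Here the value of host.device_name in each incoming document is compared against device.internal_name in the enrich index, and the matching enrich fields are written under device_info.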

winlamp · March 28, 2020, 9:27am

OK Christian! Since I'm a bit of a newbie, could you give me a hint on how to assign the "_id" when using an ingest pipeline? I searched the ES reference but couldn't find enough to do it on my own.

Thx!

Christian_Dahlqvist · March 28, 2020, 9:49am

You should be able to change this example to set the field based on one of the fields in the document.
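The linked example is not reproduced in this thread. As a rough sketch of the idea, assuming the documents carry device.internal_name as in the first post, a set processor can copy that field into _id using a template; the pipeline name set-device-id is made up.

```
# Hypothetical pipeline name; the field name is taken from the first post.
PUT _ingest/pipeline/set-device-id
{
  "description": "Use the device name as the document ID so newer documents overwrite older ones",
  "processors": [
    {
      "set": {
        "field": "_id",
        "value": "{{device.internal_name}}"
      }
    }
  ]
}
```

Documents indexed through this pipeline are then stored under the device name as their ID, so a later document for the same device replaces the earlier one.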

winlamp · March 28, 2020, 9:50am

Thx!

winlamp · March 28, 2020, 2:04pm

I managed to use the _id field as storage for my unique identifier. Data is parsed correctly via my ingest pipeline. The only issue I encounter now: if I update the file and Filebeat ingests it again, I see no change in my index. Even the timestamp doesn't change from the initial ingestion.
On the other hand, if I change the field from _id to something else, it works as advertised: multiple versions, separated by the ingestion timestamp.

Any idea what I'm doing wrong? Thx again for your time!

system · April 25, 2020, 2:05pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
