I now want to ingest a larger number of documents by importing a CSV through the Machine Learning section. I have imported the CSV file, set the index name, and set the correct mappings. However, when it comes to the Ingest Pipeline field, I am at a loss as to what I need to put in there.
It evidently needs some sort of properly formatted query language, but what exactly, I do not know. Apologies, I am just starting my ELK journey and I'm not very familiar with the query language used!
I am guessing you are referring to the CSV upload feature in the Data Visualizer section of ML.
In the pipeline section, you can simply add your enrich processor as the last processor:
"processors": [ // Might already be present in the pipeline section
... //processors defined automatically by ml (if present)
{// Your enrich processor
"enrich": {
"policy_name": "postal_policy",
"field": "geo_location",
"target_field": "geo_data",
"shape_relation": "INTERSECTS"
}
}
]
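One thing to keep in mind: the enrich processor only works against a policy that has already been created and executed (POST /_enrich/policy/postal_policy/_execute), which it sounds like you have already done. Adding it as the last processor also means it runs after any processors the CSV upload generated automatically, so the geo_location field should already be parsed by the time the enrichment runs.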
I am curious, how were you testing via the Kibana console? I am guessing you created your pipeline and used _simulate. If you did, then you should be able to copy-paste your processor definition into the pipeline in the CSV Uploader.
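If you have not tried it yet, a quick way to test the processor from the console is something like the following. The sample document is just a placeholder; adjust the geo_location value to match your data:

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "enrich": {
          "policy_name": "postal_policy",
          "field": "geo_location",
          "target_field": "geo_data",
          "shape_relation": "INTERSECTS"
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "geo_location": "POINT (13.5 52.5)"
      }
    }
  ]
}

The response shows each document as it would look after the pipeline ran, so you can verify that geo_data gets populated before wiring the processor into the CSV upload.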
I just followed the instructions in the example I posted earlier, which involved using PUT commands to create the geo_shape index and the enrichment policy, and then index a document, specifying the enrichment pipeline as the ingest pipeline.
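Concretely, the sequence looks roughly like this. The names (postal_codes, postal_policy, postal_lookup) and the sample document come from the geo_match enrich example in the Elasticsearch docs; adjust them to your own setup:

// Source index holding the geo_shape data to enrich from
PUT /postal_codes
{
  "mappings": {
    "properties": {
      "location":    { "type": "geo_shape" },
      "postal_code": { "type": "keyword" }
    }
  }
}

// Enrich policy matching incoming geo points against those shapes
PUT /_enrich/policy/postal_policy
{
  "geo_match": {
    "indices": "postal_codes",
    "match_field": "location",
    "enrich_fields": ["location", "postal_code"]
  }
}

// The policy must be executed before any pipeline can use it
POST /_enrich/policy/postal_policy/_execute

// Pipeline containing the enrich processor
PUT /_ingest/pipeline/postal_lookup
{
  "processors": [
    {
      "enrich": {
        "policy_name": "postal_policy",
        "field": "geo_location",
        "target_field": "geo_data",
        "shape_relation": "INTERSECTS"
      }
    }
  ]
}

// Index a document through the pipeline
PUT /users/_doc/0?pipeline=postal_lookup
{
  "first_name": "Mardy",
  "geo_location": "POINT (13.5 52.5)"
}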
When using the console, this sort of error is flagged. I wonder whether it would be worthwhile extending that error detection to other places, such as the ingest pipeline window.