Ingest plugin context data

nmalocic · May 24, 2021, 7:08am

Hi everyone,

I am building an ingest plugin, and I was wondering whether there is any way to find out the index or pipeline name from within the plugin.

My idea is to let users have multiple configurations, and this seems like an easy way to do it.

Thanks,
Nemanja

spinscale · May 25, 2021, 10:28am

Can you elaborate a little more on where within your plugin you would like to know the index/pipeline?

Thanks!

nmalocic · May 26, 2021, 10:12am

Hi @spinscale ,

I am not sure. My main goal is to be able to read different configuration files based on some parameter that the plugin can read. As far as I can see, the same plugin code is executed for every registered pipeline, so my idea was that when a document is processed there would be some context that tells me the pipeline name or index name.
That could be either in the settings or in some other parameters. As far as I can see, the execute method only takes an IngestDocument as a parameter:

    public NaturalLanguagePlugin(final Settings settings, final Path configPath) {
        this.config = ConfigurationParser.parseConfiguration(new Environment(settings, configPath));
    }

    IngestDocument execute(IngestDocument ingestDocument) throws Exception;

spinscale · May 26, 2021, 11:09am

Your IngestPlugin implements Map<String, Processor.Factory> getProcessors(Processor.Parameters parameters). The trick would be to create a Processor.Factory implementation that takes the config as an argument and thus can pass it to the processor implementation.
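A minimal sketch of that idea, written against hypothetical stand-in interfaces so it compiles without Elasticsearch on the classpath. `SketchProcessor`, `SketchProcessorFactory`, `PluginConfig`, `LanguageProcessor` and `LanguageProcessorFactory` are all made-up names for illustration; in a real plugin the factory would implement `Processor.Factory`, whose exact `create` signature depends on the Elasticsearch version.

```java
import java.util.Map;

// Hypothetical stand-in for an ingest processor (NOT the Elasticsearch interface).
interface SketchProcessor {
    Map<String, Object> execute(Map<String, Object> document);
}

// Hypothetical stand-in for a processor factory (NOT Processor.Factory itself).
interface SketchProcessorFactory {
    SketchProcessor create(String tag, Map<String, Object> processorConfig);
}

// Hypothetical plugin-level configuration, parsed once when the plugin is created.
final class PluginConfig {
    final String apiEndpoint;

    PluginConfig(String apiEndpoint) {
        this.apiEndpoint = apiEndpoint;
    }
}

// The processor receives the configuration through its constructor.
final class LanguageProcessor implements SketchProcessor {
    private final PluginConfig config;
    private final String targetField;

    LanguageProcessor(PluginConfig config, String targetField) {
        this.config = config;
        this.targetField = targetField;
    }

    @Override
    public Map<String, Object> execute(Map<String, Object> document) {
        // Placeholder work: record which configuration this processor was built with.
        document.put(targetField, "analyzed via " + config.apiEndpoint);
        return document;
    }
}

// The factory takes the config as an argument and hands it to every processor it builds.
final class LanguageProcessorFactory implements SketchProcessorFactory {
    private final PluginConfig config;

    LanguageProcessorFactory(PluginConfig config) {
        this.config = config;
    }

    @Override
    public SketchProcessor create(String tag, Map<String, Object> processorConfig) {
        // "target_field" comes from the pipeline definition; "language" is an assumed default.
        String targetField = (String) processorConfig.getOrDefault("target_field", "language");
        return new LanguageProcessor(config, targetField);
    }
}
```

In the plugin itself, `getProcessors` would then return something like a single-entry map from the processor type name to `new LanguageProcessorFactory(config)`.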

nmalocic · May 26, 2021, 12:11pm

Hi @spinscale ,

But my question is: where in the plugin could I read information about the "index" or "pipeline"?
I see that the plugin takes Settings and a Path, but I am not sure whether this information is available to us from the plugin.

For example, does the Parameters class hold information about the "index" that will hold the document, or the name of the pipeline being executed?

Nemanja

nmalocic · June 1, 2021, 6:55am

Hi @spinscale ,

Is there any way of doing this, or should I stop dreaming?

Kind regards,
Nemanja

spinscale · June 1, 2021, 9:13am

You probably need to invert your thinking a little, as the lifecycle of those classes is a bit different. A plugin gets created on node startup, and the only configuration it can read is the settings from the configuration file. You could read those settings when a pipeline is created, by having the processor configuration in the stored pipeline refer to different settings.

Again, this is about passing settings down to a single processor instance and collecting all the information there, instead of passing it up to the plugin.

Hope that helps a little from an architecture point of view. Otherwise, please add more information.

Thanks!

nmalocic · June 1, 2021, 1:29pm

Hi @spinscale,

I think I understand what you are saying, and the workaround that I see is to add an additional parameter when registering the pipeline, like this:

{
    "description": "ExpertAi processing API cloud",
    "processors": [
        {
            "cloud": {
                "field": "text",
                "target_field": "language",
                "config_name": "config1.yml"
            }
        }
    ]
}

I don't really like this, but it will let users pick a different configuration for each pipeline.
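A rough sketch of how the factory side of this could look, assuming the plugin loads every configuration file at startup and keys them by file name. `CloudProcessor`, `CloudProcessorFactory` and `readString` are hypothetical names, and the classes are plain Java stand-ins rather than the real Elasticsearch types. In a real plugin I would expect the equivalent logic to sit in `Processor.Factory#create`, probably using `ConfigurationUtils.readStringProperty` instead of the hand-written helper.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical processor that carries the configuration selected for its pipeline.
final class CloudProcessor {
    final String field;
    final String targetField;
    final Map<String, String> selectedConfig;

    CloudProcessor(String field, String targetField, Map<String, String> selectedConfig) {
        this.field = field;
        this.targetField = targetField;
        this.selectedConfig = selectedConfig;
    }
}

// Hypothetical factory; it only mirrors the shape of a processor factory.
final class CloudProcessorFactory {
    // All configurations known to the plugin, keyed by name (for example "config1.yml").
    private final Map<String, Map<String, String>> configsByName;

    CloudProcessorFactory(Map<String, Map<String, String>> configsByName) {
        this.configsByName = new HashMap<>(configsByName);
    }

    // Called once per processor definition when a pipeline is stored or loaded.
    CloudProcessor create(Map<String, Object> processorConfig) {
        String field = readString(processorConfig, "field");
        String targetField = readString(processorConfig, "target_field");
        String configName = readString(processorConfig, "config_name");

        Map<String, String> selected = configsByName.get(configName);
        if (selected == null) {
            throw new IllegalArgumentException("unknown config_name [" + configName + "]");
        }
        return new CloudProcessor(field, targetField, selected);
    }

    // Removes the key while reading it, so leftover (unsupported) keys can be detected afterwards.
    private static String readString(Map<String, Object> config, String key) {
        Object value = config.remove(key);
        if (!(value instanceof String)) {
            throw new IllegalArgumentException("[" + key + "] is required and must be a string");
        }
        return (String) value;
    }
}
```

With this shape a typo in `config_name` fails when the pipeline is created rather than when the first document is processed, which is probably the friendlier behaviour for users.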

system · June 29, 2021, 1:29pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
