Ingest plugin context data

Hi everyone,

I am building an ingest plugin, and I was wondering if there is any way to get the index or pipeline name from within the plugin?

My idea is to let users have multiple configurations, so this seems like an easy solution.

Thanks,
Nemanja

Can you elaborate a little bit more on where you would like to know the index/pipeline within your plugin?

Thanks!

Hi @spinscale ,

I am not sure. My main goal is to be able to read different configuration files based on some parameter that the plugin reads. As far as I can see, the same plugin code is executed for every registered pipeline, so my idea was that when a document is processed there would be some context that tells me the pipeline name or index name.
That could be either in the settings or in some other parameters. As far as I can see, the processor's execute method only takes an IngestDocument as a parameter:

    public NaturalLanguagePlugin(final Settings settings, final Path configPath) {
        this.config = ConfigurationParser.parseConfiguration(new Environment(settings, configPath));
    }

    IngestDocument execute(IngestDocument ingestDocument) throws Exception;

Your IngestPlugin implements Map<String, Processor.Factory> getProcessors(Processor.Parameters parameters). The trick would be to create a Processor.Factory implementation that takes the config as an argument and thus can pass it to the processor implementation.
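
A minimal sketch of that idea, using simplified stand-in interfaces rather than the real Elasticsearch classes so it compiles on its own. The real Processor.Factory create method takes more arguments than shown here, and PluginConfig, CloudProcessor, CloudProcessorFactory and NaturalLanguagePluginSketch are hypothetical names used only for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins for the Elasticsearch ingest interfaces (assumption:
// the real Processor.Factory#create takes more arguments than shown here).
interface Processor {
    Map<String, Object> execute(Map<String, Object> document);
}

interface ProcessorFactory {
    Processor create(String tag, Map<String, Object> processorConfig);
}

// Hypothetical plugin-level configuration, parsed once at node startup.
final class PluginConfig {
    final String apiUrl;

    PluginConfig(String apiUrl) {
        this.apiUrl = apiUrl;
    }
}

// The processor receives the plugin config through its constructor.
final class CloudProcessor implements Processor {
    private final PluginConfig config;
    private final String field;
    private final String targetField;

    CloudProcessor(PluginConfig config, String field, String targetField) {
        this.config = config;
        this.field = field;
        this.targetField = targetField;
    }

    @Override
    public Map<String, Object> execute(Map<String, Object> document) {
        // Placeholder for the real work: record which endpoint would analyze the field.
        if (document.containsKey(field)) {
            document.put(targetField, "analyzed via " + config.apiUrl);
        }
        return document;
    }
}

// The factory holds the plugin config and hands it to every processor it creates.
final class CloudProcessorFactory implements ProcessorFactory {
    private final PluginConfig config;

    CloudProcessorFactory(PluginConfig config) {
        this.config = config;
    }

    @Override
    public Processor create(String tag, Map<String, Object> processorConfig) {
        String field = (String) processorConfig.get("field");
        String targetField = (String) processorConfig.get("target_field");
        return new CloudProcessor(config, field, targetField);
    }
}

// Mirrors the shape of getProcessors(...): the plugin registers the factory by name.
final class NaturalLanguagePluginSketch {
    private final PluginConfig config;

    NaturalLanguagePluginSketch(PluginConfig config) {
        this.config = config;
    }

    Map<String, ProcessorFactory> getProcessors() {
        Map<String, ProcessorFactory> factories = new HashMap<>();
        factories.put("cloud", new CloudProcessorFactory(config));
        return factories;
    }
}
```

The point of the design is that the configuration flows downward: plugin to factory to processor, so the processor never has to ask the plugin for anything at execution time.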

Hi @spinscale ,

But my question is where in the plugin I could read information about the "index" or "pipeline".
I see that the Plugin takes Settings and Path, but I am not sure if this information is available to us from the plugin.

For example, does the Parameters class hold information about the "index" that will hold the document, or the name of the pipeline being executed?

Nemanja

Hi @spinscale ,

Is there any way of doing this, or should I stop dreaming?

Kind regards,
Nemanja

You probably need to invert your thinking a little, as the lifecycle of those classes is a bit different. A plugin gets created on node startup, and the only configuration it can read is the settings from the configuration file. You could read those when a pipeline is created, by referring to different settings in the processor configuration when you store the pipeline.

Again, this is about passing settings down to a single processor instance and collecting all the information there, instead of passing them up to the plugin.

Hope that helps a little from an architecture point of view. Otherwise, please add more information.

Thanks!

Hi @spinscale,

I think I understand what you are saying, and the workaround that I see is to add an additional parameter when registering the pipeline, like this:

{
    "description": "ExpertAi processing API cloud",
    "processors": [
        {
            "cloud": {
                "field": "text",
                "target_field": "language",
                "config_name": "config1.yml"
            }
        }
    ]
}

I don't really like this, but it will let users pick a different configuration for each pipeline.
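
A rough sketch of how the factory side could resolve that config_name, under some assumptions: NamedConfigResolver is a hypothetical helper, the loader function stands in for whatever reads and parses the file from the plugin's config directory, and the parsed configuration is reduced to a plain map. Each named configuration is loaded once and then shared by every pipeline that refers to it:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper: maps the "config_name" value of a processor definition
// to a parsed configuration, loading each named configuration only once.
final class NamedConfigResolver {
    // Assumption: the loader reads and parses the named file from the plugin's
    // config directory and returns null when the file does not exist.
    private final Function<String, Map<String, String>> loader;
    private final Map<String, Map<String, String>> cache = new ConcurrentHashMap<>();

    NamedConfigResolver(Function<String, Map<String, String>> loader) {
        this.loader = loader;
    }

    Map<String, String> resolve(Map<String, Object> processorConfig) {
        Object raw = processorConfig.get("config_name");
        if (!(raw instanceof String) || ((String) raw).isEmpty()) {
            throw new IllegalArgumentException("[config_name] is required");
        }
        String name = (String) raw;
        // Only accept plain file names so a pipeline cannot point outside the config directory.
        if (name.contains("/") || name.contains("\\") || name.contains("..")) {
            throw new IllegalArgumentException("[config_name] must be a plain file name");
        }
        // computeIfAbsent stores nothing and returns null when the loader returns null.
        Map<String, String> config = cache.computeIfAbsent(name, loader);
        if (config == null) {
            throw new IllegalArgumentException("unknown configuration [" + name + "]");
        }
        return config;
    }
}
```

The factory would call resolve with the processor definition when the pipeline is stored, so a bad config_name fails at pipeline creation rather than on the first document.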
