Call a Python script from Elasticsearch daily

I am using Logstash with scheduling to load data from a database into Elasticsearch daily. Now I have a script that does some machine learning on my index. How can I call a Python script whenever an index is updated, or schedule a Python script to run daily? I was reading the ES Python docs, which say I can save it in config/script, but how do I then run it daily? I have a 14-day trial version for learning; it is deployed on AWS, so how do I upload my script to AWS?

I was reading the ES Python docs, which say I can save it in config/script, but how do I then run it daily?

The Python support you're linking to can't be used to run arbitrary Python scripts in the ES cluster. Use your operating system's program scheduler to run your Python script.
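On Linux, for instance, a cron entry can run the script once a day. A minimal sketch, where the script path, Python interpreter, and log file are placeholders for your setup:

```shell
# Edit the current user's crontab with:
#   crontab -e
# Then add a line like the following to run the ML script daily at 03:00,
# shortly after the 02:30 Logstash jdbc run finishes.
# /path/to/ml_task.py and the log path are placeholders.
0 3 * * * /usr/bin/python3 /path/to/ml_task.py >> /var/log/ml_task.log 2>&1
```

`>> ... 2>&1` appends both stdout and stderr to the log file so you can check later whether the job ran.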

I have a 14-day trial version for learning; it is deployed on AWS, so how do I upload my script to AWS?

Run an EC2 instance? Or could you perhaps perform your task via Lambda?
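A Lambda function triggered on a schedule (e.g. by a CloudWatch Events rule) could look roughly like the sketch below. This is only an illustration, not a full implementation: the host, credentials, and the `make_client` helper are placeholders, and the client is injectable so the logic can be exercised without a live cluster.

```python
# Sketch of a scheduled AWS Lambda handler that talks to Elasticsearch.
# Connection details are placeholders for your Elastic Cloud deployment.

def make_client():
    # Lazy import so the handler module loads even without the package;
    # the official Python client would be bundled with the Lambda package.
    from elasticsearch import Elasticsearch
    return Elasticsearch(
        ["https://Host_here:9243"],
        http_auth=("elastic", "Password_here"),
    )

def lambda_handler(event, context, es=None):
    # `es` is injectable for testing; Lambda itself passes only (event, context).
    es = es or make_client()
    # Count documents in the index pushed by Logstash.
    count = es.count(index="all_transactions")["count"]
    # ... run the machine-learning task here and write results
    # to another index ...
    return {"documents_seen": count}
```

The handler only needs Elasticsearch reachable over HTTPS, which an Elastic Cloud deployment already is.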

I took a trial of Elasticsearch Cloud from here
and am using the Logstash Elasticsearch Cloud module. I got the cloud.id, user, and password. As the doc says, I should upload the file to the config/script directory, but I don't have access to it, as ES was deployed by the community itself. Shall I deploy it myself?

Upload what file? What piece of the documentation? Are you still talking about the Python script in your original question?

Yes sir! A Python script file on the instance provided to me. Can I upload a file to the AWS instance provided to me? I don't see any such option anywhere.
In short, I want to run a Python script whenever data is pushed to Elasticsearch.
My Logstash is scheduled, and the config looks like:

input {
  jdbc {
    jdbc_driver_library => "..\logstash-6.4.0\logstash-core\lib\jars\mysql-connector-java-5.1.46-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://Server_Here?useSSL=false"
    jdbc_user => "User_Here"
    jdbc_password => "Password_Here"

    statement => "Statement_Here"

    schedule => "30 2 * * *"  # will execute at 2:30am
  }
}
output {
  stdout { codec => json_lines }
  elasticsearch {
    hosts => ["Host_here"]
    user => "elastic"
    password => "Password_here"
    # protocol => https
    # port => "Port"
    index => "all_transactions"
    document_id => "%{id}"
    # script_lang => "python"
    # script_type => "file"
    # script => "/scripts/search.py"
  }
}

and logstash.yml looks like:

cloud.id: "Colud_Id_Here"
cloud.user: "UserName:Password"

and pipeline is like:

- pipeline.id: 1
  path.config: "./config/my_config.conf"
  pipeline.workers: 1

Now what I do is open a command prompt and run bin\logstash.
This pushes my data from the database to the server with the given index name,
but whenever data is pushed or updated, I want to call a Python script to do some task.
The task it will do is read this pushed index, apply some machine learning, and update another index on Elasticsearch.
Or, if I am going about this the wrong way, please suggest how I can do it. Shall I call the script from Elasticsearch rather than Logstash whenever the index is updated?

I suggest you use an exec output to start the script from Logstash.
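A sketch of what that could look like in the Logstash config, with a placeholder script path:

```
output {
  exec {
    command => "python /path/to/script.py"
  }
}
```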

I tried that, but it executes for each row of the table, i.e. for each document.
What I tried is:
exec { command => "echo hi" }
This prints hi for every row, sir.
It is also mentioned in a Stack Overflow question.

Also, there is no way to exec only once, as mentioned by the ES community here.

Well, short of an Elasticsearch plugin there's no way to hook into each batch of documents (if that's even the correct trigger in your case).

You could e.g. have a cron job that runs periodically but terminates immediately if nothing has happened since the last execution.
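A minimal sketch of that idea in Python. The index name comes from your config, but the `@timestamp` field, the state-file path, and the helper names are assumptions for illustration; the Elasticsearch client is passed in so the early-exit logic is testable without a cluster.

```python
# Sketch of a periodic job that terminates immediately when no new
# documents have arrived since the last run. The @timestamp field and
# state-file location are assumptions; adjust them to your mapping.
import json
import os
from datetime import datetime, timezone

STATE_FILE = "/tmp/ml_task_last_run.json"  # placeholder path

def new_docs_since(es, index, since_iso):
    """Count documents whose @timestamp is newer than the last run."""
    body = {"query": {"range": {"@timestamp": {"gt": since_iso}}}}
    return es.count(index=index, body=body)["count"]

def load_last_run():
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return json.load(f)["last_run"]
    return "1970-01-01T00:00:00Z"  # first run: consider everything new

def save_last_run(ts):
    with open(STATE_FILE, "w") as f:
        json.dump({"last_run": ts}, f)

def main(es):
    since = load_last_run()
    if new_docs_since(es, "all_transactions", since) == 0:
        return "nothing to do"  # terminate immediately, as suggested
    # ... read the index, apply the ML model, and write results
    # to another index ...
    save_last_run(datetime.now(timezone.utc).isoformat())
    return "processed"
```

Scheduled via cron, this runs cheaply on every tick and only does real work when Logstash has actually pushed new documents.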

Thank you for the help, sir!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.