Elasticsearch: update_by_query with a time range filter

Hi, I have an update_by_query that sets a specific field. I am looking for a filter so that the update only touches syslog messages whose timestamp falls within the last 15 minutes. What is the syntax for this in release 5.6.4 of ELK, please? (I cannot find anything that works.)

POST /%3Clogstash-%7Bnow%2Fd%7D%3E/syslog/_update_by_query
{
  "script": {
    "source": "if (ctx._source.syslog_program=='logger'){def path=ctx._source.syslog_message;ctx._source.Temp=1;} else ctx._source.Temp= 0;",
    "lang": "painless"
  }
}

thanks

Hi all, bumping this to the top of the list :joy:
It is still not solved.

I am not sure I understand what you are looking to achieve. Can you perhaps give an example of an event before and after the update?

Hi, my Logstash is fed by real-time syslog. Every 15 minutes I run (via curl) a Painless script (like the one above, but more complicated) that updates my field (Temp in the example). To save CPU on a Pi 3, I would prefer to run the script only on data whose timestamp is in the last 15 minutes, so that each run processes a small amount of data. In other words, the question is how to apply the script above only to the last 15 minutes of records (now-15m to now). I can't find the syntax for 5.6.4.
Regards

Elasticsearch does not record when an event was indexed, so any query would need to be based on the event timestamp, and events could of course arrive delayed. If you set or add a field on the event as part of the update, could you not just search every 15 minutes for all events that do not have this field set (or that still have a default value indicating they have not yet been updated)? That would spread out the updates and would not be sensitive to delays in the ingest pipeline.
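A minimal sketch of that idea in Python, assuming the marker is the `Temp` field from the question and that events which have not been updated yet have no value for it. `build_unprocessed_update` is a hypothetical helper name; it only builds the request body and does not contact Elasticsearch.

```python
import json


def build_unprocessed_update(marker_field, script_source):
    """Build an _update_by_query body that only selects documents
    where `marker_field` has no value yet (hypothetical helper)."""
    return {
        "query": {
            "bool": {
                # exists matches documents with a non-null value for the
                # field; must_not inverts that selection.
                "must_not": {"exists": {"field": marker_field}}
            }
        },
        "script": {"source": script_source, "lang": "painless"},
    }


# Assumption: setting Temp marks the event as processed.
body = build_unprocessed_update("Temp", "ctx._source.Temp = 0")
print(json.dumps(body, indent=2))
```

Because the script sets the marker field, each event would be picked up once, regardless of when it arrived.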

The Temp field is on every record, as it is a field created for all syslog events by Logstash; it is just not initialized by Logstash. The Temp values are filled in every 15 minutes by a Painless script. How do I run the script

POST /%3Clogstash-%7Bnow%2Fd%7D%3E/syslog/_update_by_query
{
  "script": {
    "source": "if (ctx._source.syslog_program=='logger'){def path=ctx._source.syslog_message;ctx._source.Temp=1;} else ctx._source.Temp= 0;",
    "lang": "painless"
  }
}

so that only records between now-15m and now are processed? Currently it processes all records of /%3Clogstash-%7Bnow%2Fd%7D%3E.

If you can write a query that only selects records that have not been updated you can use that with the API to limit records processed.

True, but I need a selection by timestamp, and I accept overlap (it is OK to reprocess a small percentage of records). But nothing seems easy for newbies.

Do you have a timestamp field, e.g. @timestamp, that you can base the query on? If so, you can use this through a range query.

I already tried that kind of syntax in the POST, but it does not filter by time range. I may not know how to combine it with "script" and "source", which is why I am asking for help.
I tried something like this, but it processed all records, not just those in the range interval.

POST /%3Clogstash-%7Bnow%2Fd%7D%3E/syslog/_update_by_query
{"script":{"filter":{"range":{"date":{"gte":"now-15m","lte":"now"}}},"source":"if (ctx._source.syslog_program=='logger'){def path=ctx._source.syslog_message;ctx._source.Temp=1;} else ctx._source.Temp= 0;","lang":"painless"}}

If you format that correctly you will see that it does not match the example in the update by query documentation. The filter needs to be in a query block, not inside the script block.

thanks

Sorry about that. I wrote delete by query instead of update by query by mistake.

Hi, OK to put the filter in a query block, but I don't know how to lay out the script block. Easy :sunglasses: and I am mostly lost in the confusing docs. Does someone have roughly the structure of such a filter to get me out of this issue? Thanks

Solved

POST /%3Clogstash-%7Bnow%2Fd%7D%3E/syslog/_update_by_query
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "now-15m",
        "lt": "now"
      }
    }
  },
  "script": {
    "source": "ctx._source.Temp=100",
    "lang": "painless"
  }
}
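
For anyone scripting this instead of using curl, here is a rough Python sketch that builds the same request with the conditional script from the original question. The host `http://localhost:9200`, the helper name `build_request` and the `minutes` parameter are assumptions for illustration. The request is only constructed here, not sent.

```python
import json
from urllib.parse import quote
from urllib.request import Request

ES_URL = "http://localhost:9200"  # assumption: local single-node setup

# Conditional script from the question, without the unused `path` variable.
SCRIPT = (
    "if (ctx._source.syslog_program == 'logger') { ctx._source.Temp = 1; } "
    "else { ctx._source.Temp = 0; }"
)


def build_request(minutes=15):
    """Build (but do not send) the _update_by_query request."""
    # Date-math index names must be percent-encoded in the URL path.
    index = quote("<logstash-{now/d}>", safe="")
    body = {
        # query and script are sibling top-level keys.
        "query": {
            "range": {"@timestamp": {"gte": f"now-{minutes}m", "lt": "now"}}
        },
        "script": {"source": SCRIPT, "lang": "painless"},
    }
    return Request(
        f"{ES_URL}/{index}/syslog/_update_by_query",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_request()
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

One thing to keep in mind: `<logstash-{now/d}>` resolves to the current day's index only, so a run shortly after the daily rollover would probably not see the last events written to the previous day's index.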
