Aggregate problem

Hello!

My Logstash config handles several log types. Two of them are "Appel" and "SVI", and both have a field called "ID appelSVI". I want to add a field called "Passage par SVI" to my "Appel" event if there is an "SVI" event with the same "ID appelSVI". Here is my code:

if [Type]=="Appel" or [Type]=="SVI"
{
aggregate
{
task_id => "%{ID appelSVI}"
code => "
if(event.get('Type')=='SVI')
map['Passage par SVI']='true'
event.cancel
else
if(map['Passage par SVI']=='true')
event.set('Passage par SVI'='true')
end
end
"
push_map_as_event_on_timeout => false
timeout_task_id_field => "ID appelSVI"
timeout => 210
timeout_code => ""
timeout_tags => ['_aggregatetimeout_aggregationSVI']
}
}

It works if the first event that comes into the aggregate filter for a given ID is of type SVI. In that case map['Passage par SVI'] is set to 'true', and when the Appel event arrives, the code checks whether map['Passage par SVI'] is 'true' and adds the 'Passage par SVI' field.

The problem is when the first event to reach the aggregate filter is of type Appel. At that point I cannot know whether an SVI event with the same ID will arrive later, so I cannot add the 'Passage par SVI' field yet, and by the time an SVI event does arrive it is too late to add the field to the Appel event. How can I handle that?

Thanks

As you say, logstash cannot predict what events it will see in the future. Can you sort your input before feeding it to logstash?

Otherwise, it would depend on what your output is. For example, if you are writing to elasticsearch then in principle you could update the document in elasticsearch when you see an SVI.

I can't sort my input before feeding it to Logstash. How can I update the document in Elasticsearch when I see an SVI?

You would use logstash to generate a file that could be POSTed to elasticsearch using the bulk and update APIs.

OK, if I understand correctly, Logstash will not send the data to Elasticsearch directly but will create a file, and then, thanks to the bulk API, Elasticsearch will read the file and index all the documents in it. If that's right, what does the update API do?

The update API determines the format of that file. You can merge in new fields using update, as shown in the documentation. Let me see if I can find an example...

OK, the use case where I did this was parsing SiteMinder trace logs, where every line has a correlation id and one piece of information about the request. I needed to gather all the information about one request into a single document. I did this by doing a bulk update using doc_as_upsert. One update for each input line.
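For reference, each update in the bulk file is a pair of lines: an action line identifying the document, followed by the partial document to merge. A minimal sketch, with placeholder id, type, and index names:

{ "update" : { "_id" : "some-id", "_type" : "doc", "_index" : "someindex" } }
{ "doc" : { "some_field" : "some value" }, "doc_as_upsert" : true }

With doc_as_upsert the document is created if it does not exist yet, otherwise the fields in "doc" are merged into the existing document.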

So, provided that you can use the 'ID appelSVI' as the document id, what you could do is something like

output {
    if [Type] == "SVI" {
        file { path => "/some/path/out.txt" codec => plain { format => '{ "update" : {"_id" : "%{ID appelSVI}", "_type" : "doc", "_index" : "someindex"} }
{ "doc": { "Passage par SVI": true }, "doc_as_upsert" : true }
' } }
    }
}

Then

curl -X POST 'localhost:9200/someindex/doc/_bulk' -H "Content-Type:application/json" --data-binary @/some/path/out.txt

OK, I understand. The first code block works well, but I don't really know where I should put the second one.

Here is the Logstash output I use for lines where the "Type" is not "SVI":

elasticsearch {
    hosts => "localhost"
    index => "cdr_sbc"
    document_type => "CDR_SBC"
}

So my command should look like

curl -X POST 'localhost:9200/cdr_sbc/CDR_SBC/_bulk' -H "Content-Type:application/json" --data-binary @C:\Users\GAUTSCPI\Documents\Elasticsearch\sortieSVI.txt

But where should I run it?

After logstash has executed, the file output will have been created. Then you run the curl command at a shell prompt.

OK, the problem is that I work on Windows and the curl command is not available.

OK, you could do it in PowerShell using Invoke-WebRequest.

Thank you for your answer. I searched on the internet but I can't figure out how to express this command in the Invoke-WebRequest format. Here is what I tried:

Invoke-WebRequest -Method POST -URI 'localhost:9200/cdr_sbc/CDR_SBC/_bulk' -body "Content-Type:application/json" --data-binary @C:\Users\GAUTSCPI\Documents\Elasticsearch\sortieSVI.txt

I get this error:

Try

get-content C:\Users\GAUTSCPI\Documents\Elasticsearch\sortieSVI.txt | Invoke-WebRequest -Method POST -URI 'localhost:9200/cdr_sbc/CDR_SBC/_bulk' -Content-Type "application/json"

I think it's better, but PowerShell doesn't recognize the -Content-Type "application/json" parameter.

Sorry, there should be no hyphen inside the parameter name: it is -ContentType.

Yes, that's better, but now I have the same problem with the URI parameter. I tried writing it "Uri" as in the Invoke-WebRequest documentation, but the result is the same. I also tried using backslashes instead of slashes in the path, with no change.

I found the solution: I added "http://" at the beginning of the URI. Now I get a new error.

Here is my JSON file:

I tried writing \n instead of actual line breaks and the result is the same.

This is really turning into a PowerShell question rather than a logstash question and I am not able to test it :frowning: With curl, the reason you use --data-binary rather than -d is to tell it not to strip the newlines from the file. I do not know what the equivalent is in PowerShell.

The JSON looks OK, except you should not have the blank lines.

OK, I will keep trying to find a solution and will post it here if I do. Thank you!

Hello!

I found a solution. The Get-Content cmdlet has a parameter called -Delimiter. By default its value is the newline character (\n), which means Get-Content returns each line separately. I read in the Get-Content documentation that if the delimiter you set does not exist in the file, Get-Content returns the entire file as a single undelimited object, which is exactly what we want here. I set the delimiter to "?!@" to be sure nothing in the file matches it, and it works.
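Putting it all together, the command that works for me looks roughly like this (the path, index, and type come from my setup, adjust them for yours):

# Read the whole bulk file as a single string (the delimiter never matches),
# then POST it to the _bulk endpoint
Get-Content C:\Users\GAUTSCPI\Documents\Elasticsearch\sortieSVI.txt -Delimiter "?!@" |
    Invoke-WebRequest -Method POST -Uri 'http://localhost:9200/cdr_sbc/CDR_SBC/_bulk' -ContentType "application/json"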