I have tried to build a pie chart on fields such as "STAT" and "Sender_id", but some logs contain only the "STAT" field and some logs contain only "Sender_id". I tried building a viz on the two parameters, but it's not working, even though the "ID" field is common between the three docs.
Is there any workaround for building a pie chart on fields such as "STAT" and "Sender_id" based on the logs below?
logs-
Time ID STAT Sender_id
Nov 17, 2021 @ 22:46:58.000 5a77cf94-e635-4aeb-9ae7-0e9ad37976a8 REJECTD -
Nov 17, 2021 @ 22:47:02.000 5a77cf94-e635-4aeb-9ae7-0e9ad37976a8 DELIVRD -
Nov 17, 2021 @ 22:46:55.000 5a77cf94-e635-4aeb-9ae7-0e9ad37976a8 - ABC
I am using Logstash for parsing the logs. I checked whether I can fill the "Sender_id" field using Logstash filters or scripted fields in Kibana, but I am still looking into it.
If you can throw some light on it, would be helpful.
If you cannot update the loading of the data, perhaps you could create an additional runtime field (the newer version of scripted fields) for each of those that either emits the actual value or some other value like "MISSING" when the value does not exist.
Also, I would try the pie chart in Lens; you might find it a little more flexible / easier.
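As a rough sketch of the runtime field idea above: the Python below only builds the JSON body that could be sent to the index's `_mapping` endpoint under `runtime`. The runtime field names (`stat_or_missing`, `sender_or_missing`) and the helper function are made up for illustration, and the `.keyword` sub-fields are an assumption about how the index is mapped.

```python
import json


def missing_aware_runtime_field(source_field, placeholder="MISSING"):
    """Build a keyword runtime field definition (hypothetical helper).

    The embedded Painless script emits the stored value when the document
    has one, and the placeholder otherwise.
    """
    script = (
        f"if (doc['{source_field}'].size() == 0) {{ emit('{placeholder}'); }} "
        f"else {{ emit(doc['{source_field}'].value); }}"
    )
    return {"type": "keyword", "script": {"source": script}}


# Assumed field names: STAT.keyword and Sender_id.keyword must exist in the mapping.
mappings_body = {
    "runtime": {
        "stat_or_missing": missing_aware_runtime_field("STAT.keyword"),
        "sender_or_missing": missing_aware_runtime_field("Sender_id.keyword"),
    }
}

# Print the body so it can be pasted into a PUT <index>/_mapping request.
print(json.dumps(mappings_body, indent=2))
```

A pie chart could then slice on the two runtime fields, so documents without a value fall into a "MISSING" slice instead of being dropped.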
I checked with scripted fields but did not get desired results.
Script-
if (doc.containsKey('MSGID.keyword') && !doc['Sender_id.keyword'].empty)
return params['_source']['Sender_id'];
else if (doc.containsKey('MSGID.keyword') && !doc['STAT.keyword'].empty)
return params['_source']['STAT'];
Results are-
Time ID STAT Sender_id Scripted_Field
Nov 17, 2021 @ 22:46:58.000 5a77cf94-e635-4aeb-9ae7-0e9ad37976a8 REJECTD - REJECTD
Nov 17, 2021 @ 22:47:02.000 5a77cf94-e635-4aeb-9ae7-0e9ad37976a8 DELIVRD - DELIVRD
Nov 17, 2021 @ 22:46:55.000 5a77cf94-e635-4aeb-9ae7-0e9ad37976a8 - ABC ABC
In this index, there are two types of logs, one log which contains "Sender_id" and the other log contains "STAT" field. I want to create a new field that will contain the "Sender_id" field in all logs.
"MSGID" field is the same for both types of logs.
Expected Result-
Time ID STAT Sender_id Scripted_Field
Nov 17, 2021 @ 22:46:58.000 5a77cf94-e635-4aeb-9ae7-0e9ad37976a8 REJECTD - ABC
Nov 17, 2021 @ 22:47:02.000 5a77cf94-e635-4aeb-9ae7-0e9ad37976a8 DELIVRD - ABC
Nov 17, 2021 @ 22:46:55.000 5a77cf94-e635-4aeb-9ae7-0e9ad37976a8 - ABC ABC
Actually, I need to compare the "MSGID" fields of different logs, and if they match, "Sender_id" should be filled in for all logs with that "MSGID" that do not have the "Sender_id" field.
Now that I see what you want to accomplish: that is not possible at this time. You cannot read fields from another record with runtime or scripted fields.
The normal way to do this would be on ingest, perhaps if you had a source of the IDs with the Senders you could use an enrich processor.
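To make the join concrete, here is a minimal in-memory sketch of what ingest-time enrichment has to do: learn the sender for each "MSGID" from the logs that carry it, then copy it onto the logs that do not. It is plain Python over dictionaries, not an enrich processor or Logstash filter, and the function name is hypothetical.

```python
def propagate_sender_id(docs):
    """Copy Sender_id onto every doc that shares a MSGID with a doc that has one."""
    # First pass: learn which sender belongs to which MSGID.
    sender_by_msgid = {}
    for doc in docs:
        if doc.get("Sender_id"):
            sender_by_msgid[doc["MSGID"]] = doc["Sender_id"]

    # Second pass: fill in the sender where it is missing.
    return [
        {**doc, "Sender_id": doc.get("Sender_id") or sender_by_msgid.get(doc["MSGID"])}
        for doc in docs
    ]


# The three sample logs from this thread, keyed by their shared MSGID.
logs = [
    {"MSGID": "5a77cf94-e635-4aeb-9ae7-0e9ad37976a8", "STAT": "REJECTD"},
    {"MSGID": "5a77cf94-e635-4aeb-9ae7-0e9ad37976a8", "STAT": "DELIVRD"},
    {"MSGID": "5a77cf94-e635-4aeb-9ae7-0e9ad37976a8", "Sender_id": "ABC"},
]
enriched = propagate_sender_id(logs)
```

The first pass is the part a runtime or scripted field cannot do, because it needs values from other documents.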
I tried enriching the data during ingest in the Logstash pipeline but was unable to figure it out.
So I created two indexes at the output of the Logstash pipeline: "type1", the main index, which contains the "Sender_id" and "Mobile_Number" fields, and "type2", which contains the "STAT" field. Both indexes contain the common "MSGID" field.
Logstash Output-
output {
  stdout { codec => rubydebug }
  if [Sender_id] {
    elasticsearch {
      hosts => ["<>"]
      index => "type1" #type1 log
      user => "<>"
      password => "<>"
    }
  }
  else if [Mobile_Number] {
    elasticsearch {
      hosts => ["<>"]
      index => "type1" #type1 log
      user => "<>"
      password => "<>"
    }
  }
  else {
    elasticsearch {
      hosts => ["<>"]
      index => "type2" #type2 log
      user => "<>"
      password => "<>"
    }
  }
}
The python script is enriching the "type1" index by checking on the "MSGID" field and taking the "STAT" field from the type2 index.
"STAT" field is getting enriched in type1 index successfully. The script can be scheduled so that enrichment can be automated.
Pie chart Viz is created and working fine.
Cons-
Because the script looks up each "MSGID" document individually in both indexes before enriching it, it is very slow: roughly 400 docs/min. We ingest about 1 million logs every hour, so building a viz on enriched data in real time is not possible, and neither is alerting, since our alerts are built on the type1 index using the "STAT" and "Sender_id" fields.
I tried a bulk update (batch update) using the scroll API and the Bulk API helper, but it is not working with the above script currently.
Any suggestions on the above will be highly helpful.
Note- I have worked with the enrich processor in the past. Reindexing is needed to get the enriched data, which cannot work for this use case.
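One possible direction for the speed problem is to batch both sides: scroll through type1 in chunks, look up all the "MSGID" values of a chunk in type2 with a single `terms` query, and send the updates through the bulk helper. This is only a sketch. The original script is not shown, so `chunked`, `build_update_actions`, and `enrich` are hypothetical names, and the `MSGID.keyword` field and the batch size are assumptions. It uses `helpers.scan` and `helpers.bulk` from the official `elasticsearch` Python client.

```python
def chunked(items, size):
    """Yield lists of at most `size` items from any iterable."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def build_update_actions(type1_hits, stat_by_msgid, index="type1"):
    """Turn type1 hits into partial-update actions for the bulk helper."""
    for hit in type1_hits:
        stat = stat_by_msgid.get(hit["_source"].get("MSGID"))
        if stat is not None:
            yield {
                "_op_type": "update",
                "_index": index,
                "_id": hit["_id"],
                "doc": {"STAT": stat},
            }


def enrich(es, batch_size=1000):
    """Enrich type1 from type2 in batches; `es` is an Elasticsearch client."""
    # Imported here so the helpers above can be used without the client installed.
    from elasticsearch import helpers

    # Only scroll over type1 docs that have not been enriched yet.
    todo = {"query": {"bool": {"must_not": {"exists": {"field": "STAT"}}}}}
    for batch in chunked(helpers.scan(es, index="type1", query=todo), batch_size):
        msgids = [h["_source"]["MSGID"] for h in batch if "MSGID" in h["_source"]]
        # One terms query per batch instead of one lookup per document.
        lookup = {"query": {"terms": {"MSGID.keyword": msgids}}}
        # If a MSGID has several STAT logs, whichever is read last wins here.
        stat_by_msgid = {
            h["_source"]["MSGID"]: h["_source"].get("STAT")
            for h in helpers.scan(es, index="type2", query=lookup)
        }
        helpers.bulk(es, build_update_actions(batch, stat_by_msgid))
```

With this shape, each batch costs one lookup query and one bulk request instead of a search and an update per document. Whether that keeps up with the ingest rate would have to be measured.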