Building a pie chart in Kibana

Hi,

I have tried to build a pie chart on fields such as "STAT" and "Sender_id", but some logs contain only the "STAT" field and others contain only "Sender_id". I tried building the visualization on both fields, but it does not work, even though the "ID" field is common to all three docs.

Is there any workaround for building a pie chart on the "STAT" and "Sender_id" fields from the logs below?

logs-

	Time                           ID                                     STAT      Sender_id
	Nov 17, 2021 @ 22:46:58.000    5a77cf94-e635-4aeb-9ae7-0e9ad37976a8   REJECTD   -
	Nov 17, 2021 @ 22:47:02.000    5a77cf94-e635-4aeb-9ae7-0e9ad37976a8   DELIVRD   -
	Nov 17, 2021 @ 22:46:55.000    5a77cf94-e635-4aeb-9ae7-0e9ad37976a8   -         ABC

[Screenshot]

[Screenshot: pie chart configuration]

Thank you!!!!

Hi there, I am afraid it's not possible at the moment.

Hi @Marta_Bondyra,
Thanks for the confirmation.

I am using Logstash to parse the logs. I am looking into whether I can fill in the "Sender_id" field using Logstash filters or scripted fields in Kibana, but I am still investigating.

If you could shed some light on this, it would be helpful.

If you cannot update the loading of the data, perhaps you could create an additional runtime field (the newer version of scripted fields) for each of those that either emits the actual value or some other value like "MISSING" when the value does not exist.
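As a rough illustration of the runtime-field idea, the sketch below builds a search request body that defines a keyword runtime field and aggregates on it. It assumes "Sender_id.keyword" is a mapped field and a stack version that supports runtime fields; the runtime field name "sender_or_missing" and the helper names are made up for the example.

```python
import json


def missing_fallback_script(field, placeholder="MISSING"):
    """Return Painless source that emits the field value, or a placeholder when the doc has none."""
    return (
        f"if (doc['{field}'].size() == 0) {{ emit('{placeholder}'); }} "
        f"else {{ emit(doc['{field}'].value); }}"
    )


def build_search_body(field, runtime_name):
    """Build a search body with a runtime field and a terms aggregation on it (pie-chart style buckets)."""
    return {
        "size": 0,
        "runtime_mappings": {
            runtime_name: {
                "type": "keyword",
                "script": {"source": missing_fallback_script(field)},
            }
        },
        "aggs": {"by_value": {"terms": {"field": runtime_name}}},
    }


# Hypothetical names: "sender_or_missing" is not part of the original index.
body = build_search_body("Sender_id.keyword", "sender_or_missing")
print(json.dumps(body, indent=2))
```

The same script source could be pasted into a runtime field on the index pattern in Kibana instead of being sent with each search.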

Also, I would try the pie chart in Lens; you might find it a little more flexible and easier to use.

Hi @stephenb,
Thanks for the reply

I tried scripted fields but did not get the desired results.

Script-

// Prefer Sender_id when the doc has it; otherwise fall back to STAT.
if (doc.containsKey('MSGID.keyword') && !doc['Sender_id.keyword'].empty) {
    return params['_source']['Sender_id'];
} else if (doc.containsKey('MSGID.keyword') && !doc['STAT.keyword'].empty) {
    return params['_source']['STAT'];
}

Results are-

	Time                           ID                                     STAT      Sender_id   Scripted_Field
	Nov 17, 2021 @ 22:46:58.000    5a77cf94-e635-4aeb-9ae7-0e9ad37976a8   REJECTD   -           REJECTD
	Nov 17, 2021 @ 22:47:02.000    5a77cf94-e635-4aeb-9ae7-0e9ad37976a8   DELIVRD   -           DELIVRD
	Nov 17, 2021 @ 22:46:55.000    5a77cf94-e635-4aeb-9ae7-0e9ad37976a8   -         ABC         ABC

In this index, there are two types of logs, one log which contains "Sender_id" and the other log contains "STAT" field. I want to create a new field that will contain the "Sender_id" field in all logs.

"MSGID" field is the same for both types of logs.

Expected Result-

	Time                           ID                                     STAT      Sender_id   Scripted_Field
	Nov 17, 2021 @ 22:46:58.000    5a77cf94-e635-4aeb-9ae7-0e9ad37976a8   REJECTD   -           ABC
	Nov 17, 2021 @ 22:47:02.000    5a77cf94-e635-4aeb-9ae7-0e9ad37976a8   DELIVRD   -           ABC
	Nov 17, 2021 @ 22:46:55.000    5a77cf94-e635-4aeb-9ae7-0e9ad37976a8   -         ABC         ABC

Actually, I need to compare the "MSGID" fields of different logs, and if they match, the "Sender_id" should be filled in for every log with that "MSGID" that does not have a "Sender_id" field.
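To make the expected result concrete, here is a small in-memory sketch of the lookup I am after. It runs on plain dictionaries rather than on Elasticsearch, and the function name propagate_sender is only for illustration.

```python
def propagate_sender(docs):
    """Copy the Sender_id found on any doc to every doc sharing the same MSGID."""
    # First pass: remember which sender belongs to which MSGID.
    sender_by_msgid = {}
    for doc in docs:
        if doc.get("Sender_id"):
            sender_by_msgid[doc["MSGID"]] = doc["Sender_id"]
    # Second pass: attach the sender (or None if unknown) to every doc.
    return [
        dict(doc, Scripted_Field=sender_by_msgid.get(doc["MSGID"]))
        for doc in docs
    ]


msgid = "5a77cf94-e635-4aeb-9ae7-0e9ad37976a8"
logs = [
    {"MSGID": msgid, "STAT": "REJECTD"},
    {"MSGID": msgid, "STAT": "DELIVRD"},
    {"MSGID": msgid, "Sender_id": "ABC"},
]
for row in propagate_sender(logs):
    print(row)
```

All three rows end up with "ABC" in Scripted_Field, which is the behaviour shown in the expected-result table.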

Thanks for the explanation.

Now that I see what you want to accomplish: that is not possible at this time. Runtime and scripted fields cannot read fields from another document.

The normal way to do this would be on ingest. If you had a source that maps the IDs to the senders, you could perhaps use an enrich processor.
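For reference, an enrich setup has two parts: a policy that names the lookup source and an ingest pipeline that applies it. The sketch below only builds the two request bodies as Python dictionaries. It assumes an index holding "MSGID" and "Sender_id" pairs already exists (here "type1" is used as a placeholder); the policy name "sender-by-msgid" and the target field "sender_lookup" are hypothetical.

```python
import json

# Body for creating the enrich policy (PUT /_enrich/policy/sender-by-msgid).
# "type1" stands in for whatever index holds the MSGID -> Sender_id pairs.
ENRICH_POLICY = {
    "match": {
        "indices": "type1",
        "match_field": "MSGID",
        "enrich_fields": ["Sender_id"],
    }
}

# Body for an ingest pipeline that looks up each incoming doc's MSGID.
# The matched fields are written under the hypothetical "sender_lookup" object.
INGEST_PIPELINE = {
    "description": "Add Sender_id to docs that share a MSGID",
    "processors": [
        {
            "enrich": {
                "policy_name": "sender-by-msgid",
                "field": "MSGID",
                "target_field": "sender_lookup",
                "max_matches": 1,
                "ignore_missing": True,
            }
        }
    ],
}

print(json.dumps(ENRICH_POLICY, indent=2))
print(json.dumps(INGEST_PIPELINE, indent=2))
```

Keep in mind that the policy has to be executed before the pipeline can use it, and executed again to pick up new source documents, so the lookup data is a snapshot rather than a live join.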

Hi @stephenb,
Thanks for confirmation!!!

The normal way to do this would be on ingest.

I tried enriching the data during ingest in the Logstash pipeline but could not get it to work.

So I created two indices at the output of the Logstash pipeline. "type1", the main index, contains the "Sender_id" and "Mobile_Number" fields; "type2" contains the "STAT" field. Both indices contain the common "MSGID" field.

Logstash Output-

output {
  stdout { codec => rubydebug }

  if [Sender_id] or [Mobile_Number] {
    # type1 log: events carrying Sender_id or Mobile_Number
    elasticsearch {
      hosts    => ["<>"]
      index    => "type1"
      user     => "<>"
      password => "<>"
    }
  } else {
    # type2 log: everything else (events carrying STAT)
    elasticsearch {
      hosts    => ["<>"]
      index    => "type2"
      user     => "<>"
      password => "<>"
    }
  }
}

A Python script then enriches the "type1" index: for each document it looks up the "MSGID" in the "type2" index and copies the "STAT" field across.

py script-

from elasticsearch import Elasticsearch

es_host = "<>"
es_port = "<>"
es_uname = "<>"
es_pwd = "<>"
es_type1 = "type1"
es_type2 = "type2"


def makeConnection():
    # Note: verify_certs=False disables TLS certificate verification.
    elastic = Elasticsearch([{'host': es_host, 'port': es_port}], http_auth=(
        es_uname, es_pwd), scheme="https", verify_certs=False)
    return elastic


elastic = makeConnection()


def enrichment_data(message_id):
    # Fetch the most recent type2 doc for this MSGID; only the first hit is used.
    data = elastic.search(index=es_type2, body={
        "size": 1,
        "sort": {"event_timestamp": "desc"},
        "query": {
            "match": {
                "MSGID.keyword": str(message_id)
            }
        }})
    hits = data['hits']['hits']
    if hits:
        return {"STAT": hits[0]['_source']['STAT']}
    return {}


def enrich_data():
    # Only the first 10,000 type1 docs that have a MSGID are processed per run.
    get_data = elastic.search(index=es_type1, body={
        "size": 10000,
        "query": {
            "bool": {
                "must": [
                    {"exists": {"field": "MSGID"}}
                ]
            }
        }
    })
    for data in get_data['hits']['hits']:
        message_id = data['_source']['MSGID']
        metadata = enrichment_data(message_id)
        # Copy STAT across when a matching type2 doc was found.
        if 'STAT' in metadata:
            data['_source']['STAT'] = metadata['STAT']
        body_json = data['_source']
        # One index request and one refresh per doc.
        elastic.index(index=data['_index'], doc_type=data['_type'],
                      id=data['_id'], body=body_json)
        elastic.indices.refresh(index=data['_index'])


if __name__ == '__main__':
    enrich_data()

Pros-

  • The "STAT" field is enriched in the type1 index successfully. The script can be scheduled so that enrichment is automated.

  • The pie chart visualization is created and works fine.

Cons-

  • Because each document is looked up and re-indexed one at a time by "MSGID", enrichment is very slow: about 400 docs/min. We ingest about 1 million logs every hour, so building visualizations on enriched data in real time is not possible, and neither is alerting, since our alerts are built on the type1 index using the "STAT" and "Sender_id" fields.

I tried a bulk (batch) update using the scroll API and the Bulk API helper, but I could not get it working with the script above.
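A rough sketch of how a batched version might look, for anyone who wants to comment on the approach. It assumes the elasticsearch-py helpers module (helpers.scan and helpers.bulk), that "event_timestamp" values sort correctly when compared directly, and that docs already holding "STAT" can be skipped. The function names and the batch size of 1000 are my own choices, and this has not been run against the cluster.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield lists of at most `size` items from any iterable."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def latest_stat_by_msgid(type2_hits):
    """Map each MSGID to the STAT of its newest type2 hit."""
    newest = {}  # MSGID -> (event_timestamp, STAT)
    for hit in type2_hits:
        src = hit["_source"]
        if "MSGID" not in src or "STAT" not in src:
            continue
        stamp = src.get("event_timestamp", "")
        if src["MSGID"] not in newest or stamp >= newest[src["MSGID"]][0]:
            newest[src["MSGID"]] = (stamp, src["STAT"])
    return {msgid: stat for msgid, (_, stat) in newest.items()}


def build_update_actions(type1_hits, stat_by_msgid):
    """Yield partial-update bulk actions for type1 docs that have a known STAT."""
    for hit in type1_hits:
        stat = stat_by_msgid.get(hit["_source"].get("MSGID"))
        if stat is not None:
            yield {
                "_op_type": "update",
                "_index": hit["_index"],
                "_id": hit["_id"],
                "doc": {"STAT": stat},
            }


def enrich_in_bulk(client, batch_size=1000):
    """Enrich type1 in batches: one terms lookup and one bulk request per batch."""
    from elasticsearch import helpers  # imported here so the helpers above run without the client

    # Assumption: docs that already have STAT do not need to be processed again.
    pending = {"query": {"bool": {
        "filter": [{"exists": {"field": "MSGID"}}],
        "must_not": [{"exists": {"field": "STAT"}}],
    }}}
    for batch in chunked(helpers.scan(client, index="type1", query=pending), batch_size):
        msgids = [hit["_source"]["MSGID"] for hit in batch]
        lookup = {"query": {"terms": {"MSGID.keyword": msgids}}}
        stats = latest_stat_by_msgid(helpers.scan(client, index="type2", query=lookup))
        helpers.bulk(client, build_update_actions(batch, stats))
```

The idea is to replace one search, one index request, and one refresh per document with one lookup and one bulk request per batch, and to send only the "STAT" field as a partial update instead of re-indexing the whole document.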

Any suggestions on the above will be highly helpful.

Note: I have worked with the enrich processor in the past. Reindexing is needed to get the enriched data, which does not fit this use case.

Thank you !!!
