Exclude last value in a line graph in Kibana

Hello,

Here is my situation:
I need to plot a line graph in Kibana with 2 lines. The first line (say line A) will represent 10 data points; the second line (say line B) should represent only 9 data points.

To be exact: I'm plotting some numerical values against a release ID of my product. Line B shows the actual values for each release, and line A shows the values I'm predicting. So basically I'm trying to predict the value for the next release of my project, and as of now I won't have an actual value for it. So line B, which represents the actual values, will have 9 points, and line A, which represents the predicted values, will have one point more than line B, i.e. 10 here.

I'm importing data from an Excel sheet. If I leave the cell empty for the 10th value, that release ID will not be shown in Kibana at all, so I'm forced to give a value and have entered 0 (zero). As a result, the line in the graph drops down to 0, which I don't want to happen.

I don't want the purple line in the graph to hit zero; instead I want it to stop at its value at point 1022 on the x-axis. I'm new to Kibana and am looking forward to some help.

I've also attached a screenshot of the graph I'm getting now.

Thanks in advance.

Hi @vidhyadhara,

this depends on the data structure and the aggregations you are using to build that chart.

I managed to get to this kind of chart using the following:

These are the documents:

[
      {
          "rel" : 1,
          "actual" : 5,
          "prediction" : 6
      },
      {
          "rel" : 2,
          "actual" : 5,
          "prediction" : 6
      },
      {
          "rel" : 3,
          "actual" : 5,
          "prediction" : 6
      },
      {
          "rel" : 4,
          "actual" : 5,
          "prediction" : 6
      },
      {
          "rel" : 5,
          "actual" : 5,
          "prediction" : 6
      },
      {
          "rel" : 6,
          "prediction" : 6
      }
    ]

Notice for the last document, I'm omitting the actual key completely.
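
For illustration, here is a minimal sketch (plain Python, with made-up values matching the documents above) of how such documents could be built before indexing, simply leaving the key out when there is no value yet:

```python
import json

# Hypothetical release data; None marks the not-yet-known actual value.
rows = [
    {"rel": 1, "actual": 5, "prediction": 6},
    {"rel": 2, "actual": 5, "prediction": 6},
    {"rel": 3, "actual": 5, "prediction": 6},
    {"rel": 4, "actual": 5, "prediction": 6},
    {"rel": 5, "actual": 5, "prediction": 6},
    {"rel": 6, "actual": None, "prediction": 6},
]

def to_documents(rows):
    """Drop keys whose value is None so the field is omitted entirely."""
    return [{k: v for k, v in row.items() if v is not None} for row in rows]

docs = to_documents(rows)
print(json.dumps(docs, indent=2))
# The last document contains only "rel" and "prediction".
```

With the field omitted, the aggregation simply has no value for that bucket, and the line stops.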

Then I'm using the following chart config:

For the last bucket of the x axis (rel: 6), actual won't have a value and won't produce a data point:

As you can see, the green line stops one data point early.

Let me know whether that helps your use case.

Thank you so much for your quick response @flash1293... I wonder why this isn't working for me. I'm also using the MAX aggregation. Maybe I'm doing something wrong in pushing data from the Python script.
I'm pushing data from an Excel sheet (refer to the attachment for the data) using a Python script (also attached). As you can see, I've left the last actual value blank in Excel; in that case, when I try to plot the graph with release ID on the x-axis, release ID 6 will not be shown on the graph at all (because there is no actual value for release ID 6). Can you help with this?

Thank you.

Maybe pandas fills in a zero automatically for the missing value.

Log your data frame and the resulting JSON to see what exactly is in there and make sure it's not 0.

Maybe this can help: https://stackoverflow.com/questions/48956575/turn-zero-values-to-empty-cells-in-pandas
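
To illustrate what to look for (a sketch assuming pandas; the column names are made up to match your sheet), a blank cell read into a DataFrame ends up as NaN, not 0, and that NaN survives into `to_dict`:

```python
import io
import math

import pandas as pd

# Simulate an exported sheet with an empty "actual" cell for release 6.
csv_data = io.StringIO("rel,actual,prediction\n5,5,6\n6,,6\n")
df = pd.read_csv(csv_data)

records = df.to_dict(orient="records")
print(records)
# The missing cell shows up as nan (a float), not as 0.
print(math.isnan(records[1]["actual"]))
```

If you see 0 instead of nan at this point, something earlier in the script is filling it in.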

No... that value went to zero because I gave the value as zero.
I think we have a misunderstanding here: I filled that cell with zero because if I don't fill in the cell value, that release ID will not be present at all. I think I can explain better with screenshots of what I mean. You can also see the aggregations in the screenshot.
As you can observe, I have 6 release IDs but only 5 show up on my Kibana graph's x-axis. To confirm again, I'm using the MAX aggregation. I've also attached the output of running the Python script.

To me it looks like the problem is on the Python side - I'm not familiar with pandas, so I'm not sure where to look. Could you check the index in Discover to see what actually gets ingested into Elasticsearch?

This is what is being pushed to Elasticsearch. Thanks for reminding me to check the Discover tab.
The 6th release ID's data is not being pushed at all. Thank you for that; I really appreciate you helping me out.
Okay, forget about pandas and Python: I want to push/ingest data into Elasticsearch from my local database (or Excel for now) without using Logstash. What would you recommend?

If it doesn't have to happen programmatically, you can use the "File Data Visualizer", which allows uploading CSV files directly in Kibana as a way of ingesting data: https://www.elastic.co/blog/importing-csv-and-log-data-into-elasticsearch-with-file-data-visualizer

Otherwise, you can piece those things together with just about any programming language - I'm sure your Python/pandas approach isn't necessarily wrong, and I'm sure someone can help you out with it on Stack Overflow.

To debug where exactly the problem happens, check whether df.to_dict actually iterates over all of the entries, whether what is passed into e.bulk is what you expect, and so on.
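
As a rough sketch of that check (the real rec_to_actions is in your attached script; this stand-in and its output shape are hypothetical), you can count the generated actions against the rows in the frame:

```python
import io

import pandas as pd

# Simulated sheet with a blank "actual" cell for release 6.
df = pd.read_csv(io.StringIO("rel,actual,prediction\n5,5,6\n6,,6\n"))

# Hypothetical stand-in for the rec_to_actions helper from the script:
# one action header line plus one source document per row.
def rec_to_actions(df, index="dar21"):
    for record in df.to_dict(orient="records"):
        yield {"index": {"_index": index}}
        yield record

actions = list(rec_to_actions(df))
# Two entries per row: if a row is missing here, the bug is upstream.
print(len(actions), "action entries for", len(df), "rows")
```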

Completely personal preference: as I'm most familiar with JS, I would probably go with a Node.js script using https://www.npmjs.com/package/excel to load the data and https://www.npmjs.com/package/elasticsearch to ingest it into Elasticsearch.

Another thought I just had - if you are using a specified mapping for the dar21 index, could you share it here (together with the input of e.bulk)? Maybe something is being passed in that is rejected by Elasticsearch.

@flash1293 Thank you so much for your suggestions; I'll surely look into them. I don't know anything about specified mappings and such, so I don't think I'm doing anything like that :laughing: . I've shared all the files I'm using now and I'll be happy to provide whatever further info you need.

Hello @flash1293,
I just looked into the "File Data Visualizer", but that is limited to files. I actually want to connect my database in the future, so the File Data Visualizer won't help me much. Since the data is not getting ingested at all (I mean release ID 6), I think the problem is in my Python script. I'll check it. Meanwhile, I'll also look into your other suggestions.

Hi @flash1293,
I looked into the script and found that json can't handle "NaN" values... It's asking me to enable non-numeric values. I've attached a screenshot of it as well. Let me know if you have any ideas on it.

That NaN has to be removed before you are sending it to Elasticsearch.

You could try replacing the NaN values with empty strings in your Python script like this:
(a variation of https://stackoverflow.com/questions/48956575/turn-zero-values-to-empty-cells-in-pandas linked above; note that df.eq(np.nan) doesn't work, because NaN never compares equal to anything, not even to itself - use fillna instead)

df = df.fillna('')
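
Alternatively (a plain-Python sketch, independent of pandas), you can drop NaN keys from each record before serializing, which mirrors the omitted `actual` key from the example documents earlier in the thread:

```python
import json
import math

def drop_nan_keys(record):
    """Remove keys whose value is a float NaN before serializing."""
    return {
        k: v
        for k, v in record.items()
        if not (isinstance(v, float) and math.isnan(v))
    }

record = {"rel": 6, "actual": float("nan"), "prediction": 6}

# json.dumps(record, allow_nan=False) would raise ValueError here,
# which matches the "non numeric values" error from the script.
clean = drop_nan_keys(record)
print(json.dumps(clean, allow_nan=False))
# → {"rel": 6, "prediction": 6}
```

Omitting the field entirely also avoids sending an empty string into a numeric field.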

It looks like the Elasticsearch index doesn't store and handle null values and treats empty cells as NaN. I'm actually blank now. Replacing the missing values with empty strings didn't work for me either.

Could you share the output of rec_to_actions(df) after setting the NaN values to empty strings?

Here it is @flash1293.
Hope this is what you asked.

That's the response from Elasticsearch; I was talking about the request. Anyway, it's clear from the error message that it's still sending that NaN value - you have to replace it with an empty string.

I'm definitely the wrong person to help with Python problems; I suggest posting this question on Stack Overflow or getting help from a colleague.


I totally appreciate your concern and am happy that you lent a hand in helping me solve this error. Thank you @flash1293.
I'll keep you posted if I find a solution.


I was using Python scripts because I was not able to figure out how to connect Logstash to AWS ES; fortunately, I found out how to connect them. Logstash handles even the NULL values and doesn't throw any errors. So my problem is solved.

Glad it worked out for you!
