Push json key value with ELK

I have a lot of JSON like this, with different data every day:

[{ "date": "2020-02-24 21:00:00",
"country": "US",
"men": 101,
"women": 26,
"childrens": 127
}]

I tried to push this into Elasticsearch, but now I have a problem visualizing the trend in Kibana, because Kibana doesn't seem to allow a metric on the field values.

Hi there,

I'm afraid you need to contextualize everything a little bit better to allow us to understand the situation. The one you posted is an array of json with a single json object in it.

How did you try to ingest it into Elasticsearch? Via Logstash? What pipeline did you use? Where does it get the data from and how does it parse it? What does a sample document parsed by the pipeline look like (i.e. the pipeline with only the input section, no filter, and output to stdout{})?

What do you see in Kibana? How would you like to visualize your data? How would you like to aggregate it?

I mean, there's a lot we don't know and it is very difficult to help you without a context.

Thanks Fabio-sama,

I have an array of JSON objects with the number of tickets sold every 3 hours.
I'm not showing all of them because there are more than 10000.

This is an extract of my file "people.json":

[
{ "date": "2020-01-20 20:00:00",
"country": "US",
"men": 101,
"women": 26,
"childrens": 127
},
{ "date": "2020-01-20 23:00:00",
"country": "US",
"men": 144,
"women": 20,
"childrens": 99
},
{ "date": "2020-01-21 02:00:00",
"country": "US",
"men": 91,
"women": 123,
"childrens": 87
}
]

I used Logstash to push the data into Elasticsearch, and I visualize it with Kibana.
I used the simplest possible conf file, because everything I need is already in the JSON:

input {
  file {
    path => "/tmp/people.json"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    #codec => "json"
    codec => multiline {
      pattern => "^{"
      negate => true
      what => previous
    }
  }
}

filter {
  #Remove "[" and "]"
  if [message] == "]" or [message] == "[" { drop {} }
  #Remove the "," at the end of object
  mutate { gsub => [ "message", "},", "}" ] }
  json { source => "message" }

  date { match => [ "date", "YYYY-MM-dd HH:mm:ss" ] }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash-people-%{+YYYY.MM.dd}"
  }

  stdout { codec => rubydebug }
}

For each object I have a doc in elasticsearch like this:

{
  "_index": "logstash-people-2020.01.22",
  "_type": "_doc",
  "_id": "I7xKznAB6obTCirlmSMC",
  "_version": 1,
  "_score": null,
  "_source": {
    "men": 111,
    "@timestamp": "2020-01-22T12:00:00.000Z",
    "path": "/tmp/people.json",
    "@version": "1",
    "tags": [
      "multiline"
    ],
    "country": "US",
    "childrens": 127,
    "host": "ubuntu",
    "message": "{ \"date\": \"2020-01-22 13:00:00\",\n\"country\": \"US\",\n\"men\": 111,\n\"women\": 26,\n\"childrens\": 127\n}",
    "women": 26,
    "date": "2020-01-22 13:00:00"
  },
  "fields": {
    "@timestamp": [
      "2020-01-22T12:00:00.000Z"
    ]
  },
  "sort": [
    1579694400000
  ]
}
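
A side note on the document above: "@timestamp" is one hour earlier than "date". That is what you would expect if the date filter reads the string in the host's local time zone and stores the result in UTC. The Python sketch below reproduces the numbers under the assumption that the host runs at UTC+1; the thread never states the time zone, so treat that offset as a guess.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the Logstash host runs at UTC+1 (not stated anywhere in the thread).
ASSUMED_HOST_TZ = timezone(timedelta(hours=1))

raw = "2020-01-22 13:00:00"  # the "date" field of the document above

# Interpret the string as local time, then convert it to UTC.
local = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S").replace(tzinfo=ASSUMED_HOST_TZ)
utc = local.astimezone(timezone.utc)

timestamp_iso = utc.strftime("%Y-%m-%dT%H:%M:%S.000Z")
epoch_millis = int(utc.timestamp() * 1000)

print(timestamp_iso)  # 2020-01-22T12:00:00.000Z
print(epoch_millis)   # 1579694400000
```

Both values match the "@timestamp" and "sort" entries of the stored document, so any time-based chart in Kibana buckets on the UTC value, not on the original string.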

With this structure I can't find a way to create a visualization that shows, day by day, the trend line of a field's values, for example whether the number of men increases or decreases from one day to the next.

I updated my conf file because it was wrong.

Apart from the fact that I don't really get what you're doing here:

if [message] == "]" or [message] == "[" { drop {} }

Is there any message containing only [ or ]? The comment on top of that line

#Remove "[" and "]"

should rather be something like "Remove those events where message is equal to either [ or ]", since with drop{} you're dropping the whole event, not only the "[" or "]".

Anyway, let me get this straight: if your input is something like

[
  { 
    "date": "2020-01-20 20:00:00",
    "country": "US",
    "men": 101,
    "women": 26,
    "childrens": 127
  },
  { 
    "date": "2020-01-20 23:00:00",
    "country": "US",
    "men": 144,
    "women": 20,
    "childrens": 99
  },
  { 
    "date": "2020-01-21 02:00:00",
    "country": "US",
    "men": 91,
    "women": 123,
    "childrens": 87
  }
]

do you manage to have in Elasticsearch 3 separate documents structured as the one you posted?

I need a similar output to Graphite from Logstash... can anyone please help?

Hi Narayana,

unfortunately this is not how it works.

Open your own thread, explain your use case, provide some useful info such as what you have in input (set only the input section of Logstash, no filter, send to stdout{} in the output and post the result) and what your desired output is, and wait for an answer there.

Otherwise the stockpile of answers will only create a mess here.

Thanks

Maybe I explained my problem poorly.

Yes. The people.json extract is only a portion of my file, but all the objects look like this.

No. I already have 3 separate documents in Elasticsearch, but for my use case it doesn't matter how the data is stored in Elasticsearch; what matters is storing it in a way that gives a better visualization.
With these documents, if you create a histogram in Kibana with the timestamp on the X-axis and the count of men on the Y-axis, it doesn't work, because the Count metric counts the number of documents and not the field values (101, 144 and 91 for the men field).

Why did you answer no to my question and then say you did manage to get those 3 separate documents? :sweat_smile:

Anyway, if I got it right, considering those 3 docs, you do have a situation like the following:

is the following your goal?

NOTE: children is already the plural of child, no need for a final s :wink:

YES!! :smiley:

Correct :wink:

Ok so, you're simply making a wrong aggregation in your Y-axis. What you want to do is make 3 aggregations of type Sum, each one for a specific category (Men-Women-Children), like the following:
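
To make the difference between Count and Sum concrete, here is a rough Python sketch (not Kibana itself) that buckets the three sample documents by day. It groups on the first ten characters of the "date" string, which is a simplification; Kibana would bucket on "@timestamp".

```python
from collections import defaultdict

# The three sample documents from the thread (country omitted for brevity).
docs = [
    {"date": "2020-01-20 20:00:00", "men": 101, "women": 26, "childrens": 127},
    {"date": "2020-01-20 23:00:00", "men": 144, "women": 20, "childrens": 99},
    {"date": "2020-01-21 02:00:00", "men": 91, "women": 123, "childrens": 87},
]

count_per_day = defaultdict(int)    # what a Count metric reports
men_sum_per_day = defaultdict(int)  # what a Sum metric on "men" reports

for doc in docs:
    day = doc["date"][:10]  # bucket by calendar day, like a daily date histogram
    count_per_day[day] += 1
    men_sum_per_day[day] += doc["men"]

print(dict(count_per_day))    # {'2020-01-20': 2, '2020-01-21': 1}
print(dict(men_sum_per_day))  # {'2020-01-20': 245, '2020-01-21': 91}
```

Count only says how many documents fell into each bucket, while Sum adds up the values of the chosen field, which is the number the trend line needs.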

Though, if you want to make something more professional and good-looking, since you're interested in a time series (time on the x-axis), I recommend the TSVB visualization:

You have much more control over the colors you can use and where to put the legend, you do not make any query (so no Kibana loading) if you want to filter out something on the fly, you can add Annotations from other indices, etc.

You basically decide where to get your data from in the Panel options tab


and then configure each series you want to visualize. Here again I used a Sum aggregation on the specific fields

and then, in the Options panel (of each series) I specified Chart type -> Bar and Stacked
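
For anyone who wants to check the numbers outside Kibana, a Sum per field inside daily buckets corresponds roughly to an Elasticsearch search body like the one built below. This is a sketch, not the exact request Kibana sends: the aggregation names (per_day, total_*) and the daily interval are illustrative choices, and calendar_interval assumes a reasonably recent 7.x cluster (older versions used interval).

```python
import json

# Assumption: field names are taken from the sample documents in this thread.
FIELDS = ["men", "women", "childrens"]


def build_daily_sum_query(fields, interval="1d"):
    """Return a search body: one date_histogram bucket per day, one sum per field."""
    return {
        "size": 0,  # only aggregation results are needed, not the hits
        "aggs": {
            "per_day": {
                "date_histogram": {
                    "field": "@timestamp",
                    "calendar_interval": interval,
                },
                # One sum sub-aggregation per field, named total_<field>.
                "aggs": {f"total_{name}": {"sum": {"field": name}} for name in fields},
            }
        },
    }


body = build_daily_sum_query(FIELDS)
print(json.dumps(body, indent=2))
```

The printed JSON could be sent to the _search endpoint of an index pattern such as logstash-people-* (the index name used in the conf file above), for example from Kibana Dev Tools.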

Thanks a lot Fabio. It's exactly what I need to do. :smiley:

No problem :wink:

I am sorry, Fabio-sama...
I am new to this technology and to this discussion forum.

Don't worry. Just create your own thread and tag me there if you want.
