How to remove quotes from value of fieldname

Hi all.

I've been having a problem with indexing: the JSON is returning values in string format, and I would like to remove the quotes from all the values:

"dn": "topology/pod-1/node-104/sys/phys-[eth1/48]/CDeqptEgrTotal5min",
"pktsRateMax": "302.633041",
"pktsThr": "",
"bytesPer": "5011221",
"pktsBase": "0",
"lastCollOffset": "119",
"pktsRate": "173.151443",
"pktsRateTtl": "2072.598565",
"host": "172.17.0.1",
"utilTr": "0",
"bytesBase": "0",
"pktsMax": "2724",
"bytesRate": "41760.523004",
"pktsTrBase": "55599",
"pktsRateThr": "",
"pktsRateLast": "82.000000",
"bytesRateMax": "68796.454545",
"utilMax": "0",
"bytesRateTrBase": "43812.610741",
"childAction": "",
"repIntvStart": "2018-03-13T16:59:58.913+00:00",
"utilTtl": "0",
"bytesThr": "",
"pktsRateSpct": "0",
"utilSpct": "0",
"port": 44352,
"bytesRateMin": "18162.000000",
"bytesRateAvg": "41378.273727",
"utilThr": "",
"repIntvEnd": "2018-03-13T17:01:58.912+00:00",
"status": "",
"pktsSpct": "0",
"bytesMax": "756761",
"utilLast": "0",
"pktsCum": "967969755",
"bytesRateTr": "0.000000",
"interface": "104/1/48",
"bytesRateThr": "",
"pktsMin": "624",
"pktsAvg": "1731",
"bytesSpct": "0",
"utilAvg": "0",
"utilMin": "0",
"@version": "1",
"pktsLast": "738",
"bytesRateLast": "20235.222222",
"bytesMin": "163458",
"bytesAvg": "417601",
"class_name": "eqptEgrTotal5min",
"utilTrBase": "0",
"pktsRateTrBase": "187.235202",
"bytesRateTtl": "496539.284727",
"pktsTr": "0",
"bytesRateSpct": "0",
"bytesLast": "182117",
"bytesTr": "0",
"cnt": "12",
"bytesCum": "235163782693",
"pktsPer": "20778",
"pktsRateTr": "0.000000",
"@timestamp": "2018-03-13T17:02:03.444Z",
"pktsRateAvg": "172.716547",
"pktsRateMin": "69.333333",
"bytesTrBase": "13021484"

What is the best way to do this, please, in a way that is scalable?

I have seen the gsub option, but I can't seem to find the right combination.

I have tried this, taken from a previous post:

mutate { gsub => [ "", """, "" ]}

But Logstash fails.

Any advice on how I can remove the quotes and, for example, turn the above into:

"pktsRateAvg": 172.716547,
"pktsRateMin": 69.333333,
"bytesTrBase": 13021484

Thanks in advance for your help.

I have also tried:

logstash -e 'input { tcp { port => 8929 codec => json } } filter { mutate { gsub => ['message', '"' ," "] } } output { stdout { codec => rubydebug } elasticsearch { hosts => ["elasticsearch:9200"] } }'

but I get:

ERROR logstash.agent - Cannot create pipeline {:reason=>"Expected one of #, {, ,, ] at line 1, column 87 (byte 87) after filter { mutate { gsub => [message, " ," "}

The quotes are there because the values are strings. What you can do is convert the string fields that contain numerical data into numerical fields. Look into the mutate filter's convert option.
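For example, something along these lines (a sketch using a few of your field names; you would need to extend the list to cover all the numeric fields):

filter {
  mutate {
    convert => {
      "pktsRateAvg" => "float"
      "pktsRateMin" => "float"
      "bytesTrBase" => "integer"
      "cnt"         => "integer"
    }
  }
}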

Thanks for that @magnusbaeck

The problem I will have in this case is that we have 11 more requests, and if further fields are added in the future I may run into the issue of having to re-index everything again.

If there were a way to remove the quotes it would be better, since any new fields added would then work without changes to the mutate configuration.

Your desire to "remove the quotes" doesn't make sense. What you want to do is convert the data type of the fields that are numerical.

Where does the data come from, i.e. what inputs and what filters do you have?

So, effectively, I am running an API query against a network device.

The output is of the format:

{
"dn" => "topology/pod-1/node-101/sys/phys-[eth1/10]/CDeqptEgrTotal5min",
"pktsRateMax" => "790.754528",
"pktsThr" => "",
"bytesPer" => "3387838",
"pktsBase" => "0",
"lastCollOffset" => "269",
"pktsRate" => "68.579926",
"pktsRateTtl" => "2001.253558",
"host" => "172.17.0.1",
"utilTr" => "0",
"bytesBase" => "0",
"pktsMax" => "7116",
"bytesRate" => "12594.193309",
"pktsTrBase" => "26017",
"pktsRateThr" => "",
"pktsRateLast" => "13.109654",
"bytesRateMax" => "72284.476053",
"utilMax" => "0",
"bytesRateTrBase" => "14564.381170",
"childAction" => "",
"repIntvStart" => "2018-03-14T22:14:58.753+00:00",
"utilTtl" => "0",
"bytesThr" => "",
"pktsRateSpct" => "0",
"utilSpct" => "0",
"port" => 45624,
"bytesRateMin" => "4990.110012",
"bytesRateAvg" => "13129.400953",
"utilThr" => "",
"repIntvEnd" => "2018-03-14T22:19:27.753+00:00",
"status" => "",
"pktsSpct" => "0",
"bytesMax" => "650488",
"utilLast" => "0",
"pktsCum" => "372871048",
"bytesRateTr" => "0.000000",
"bytesRateThr" => "",
"pktsMin" => "82",
"pktsAvg" => "683",
"bytesSpct" => "0",
"utilAvg" => "0",
"utilMin" => "0",
"@version" => "1",
"pktsLast" => "118",
"bytesRateLast" => "5406.954783",
"bytesMin" => "44906",
"bytesAvg" => "125475",
"class_name" => "eqptEgrTotal5min",
"utilTrBase" => "0",
"pktsRateTrBase" => "93.805101",
"bytesRateTtl" => "354493.825724",
"pktsTr" => "0",
"bytesRateSpct" => "0",
"bytesLast" => "48668",
"bytesTr" => "0",
"cnt" => "27",
"bytesCum" => "101949942420",
"pktsPer" => "18448",
"pktsRateTr" => "0.000000",
"@timestamp" => 2018-03-14T22:19:35.556Z,
"pktsRateAvg" => "74.120502",
"pktsRateMin" => "9.112124",
"bytesTrBase" => "4194870"
}

This is for one interface.

I have several more queries to the network device but they have different fields depending on the request.

I have seen the convert option, but that would mean that every time a new request is added to obtain a different metric, I would have to add its fields to the mutate list.
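One possibility that avoids listing every field (an untested sketch, assuming all values are flat, top-level string fields) is a ruby filter that converts any numeric-looking string automatically:

filter {
  ruby {
    code => '
      event.to_hash.each do |k, v|
        next if k.start_with?("@")  # leave @timestamp, @version alone
        if v.is_a?(String) && v =~ /\A-?\d+(\.\d+)?\z/
          event.set(k, v.include?(".") ? v.to_f : v.to_i)
        end
      end
    '
  }
}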

Is this the information you were asking for?

logstash -e 'input { tcp { port => 8929 codec => json } } output { stdout { codec => rubydebug } elasticsearch { hosts => ["elasticsearch:9200"] } }'

Hi @magnusbaeck

So I was able to change my Python code, and now I am getting the correct format into Logstash, as below:

{
"dn" => "topology/pod-1/node-104/sys/phys-[eth1/50]/CDeqptEgrTotal5min",
"pktsRateMax" => 1205.018638,
"pktsThr" => "",
"bytesPer" => 10756099,
"pktsBase" => 0,
"lastCollOffset" => 89,
"pktsRate" => 378.269663,
"pktsRateTtl" => 3273.350803,
"host" => "172.17.0.1",
"utilTr" => 0,
"bytesBase" => 0,
"pktsMax" => 13254,
"bytesRate" => 120855.044944,
"pktsTrBase" => 109753,
"pktsRateThr" => "",
"pktsRateLast" => 293.921547,
"bytesRateMax" => 266937.539776,
"utilMax" => 0,
"bytesRateTrBase" => 117536.084041,
"childAction" => "",
"repIntvStart" => "2018-03-15T00:14:59.149+00:00",
"utilTtl" => 0,
"bytesThr" => "",
"pktsRateSpct" => 0,
"utilSpct" => 0,
"port" => 46276,
"bytesRateMin" => 73629.333333,
"bytesRateAvg" => 117592.229267,
"utilThr" => "",
"repIntvEnd" => "2018-03-15T00:16:28.149+00:00",
"status" => "",
"pktsSpct" => 0,
"bytesMax" => 2936046,
"utilLast" => 0,
"pktsCum" => 2710526109,
"bytesRateTr" => 0.0,
"bytesRateThr" => "",
"pktsMin" => 1780,
"pktsAvg" => 3740,
"bytesSpct" => 0,
"utilAvg" => 0,
"utilMin" => 0,
"@version" => "1",
"pktsLast" => 2645,
"bytesRateLast" => 110107.567508,
"bytesMin" => 662664,
"bytesAvg" => 1195122,
"class_name" => "eqptEgrTotal5min",
"utilTrBase" => 0,
"pktsRateTrBase" => 356.18968,
"bytesRateTtl" => 1058330.063403,
"pktsTr" => 0,
"bytesRateSpct" => 0,
"bytesLast" => 990858,
"bytesTr" => 0,
"cnt" => 9,
"bytesCum" => 1243041497122,
"pktsPer" => 33666,
"pktsRateTr" => 0.0,
"@timestamp" => 2018-03-15T00:16:36.335Z,
"pktsRateAvg" => 363.705645,
"pktsRateMin" => 197.777778,
"bytesTrBase" => 35817490
}
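In case it helps anyone else, the change was along these lines (a minimal sketch, not my exact code; `coerce_numbers` and the sample record are illustrative). It walks the decoded API response and converts any numeric-looking string to an int or float before the JSON is re-serialized and sent to Logstash:

```python
import json

def coerce_numbers(obj):
    """Recursively convert numeric-looking strings to int/float.

    Non-numeric strings (timestamps, paths, empty strings, IPs)
    are left untouched.
    """
    if isinstance(obj, dict):
        return {k: coerce_numbers(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [coerce_numbers(v) for v in obj]
    if isinstance(obj, str):
        try:
            return int(obj)
        except ValueError:
            try:
                return float(obj)
            except ValueError:
                return obj
    return obj

record = {
    "pktsRateAvg": "363.705645",
    "cnt": "9",
    "status": "",
    "host": "172.17.0.1",
}
# Values that parse as numbers are now emitted unquoted.
print(json.dumps(coerce_numbers(record)))
```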

However, when I re-index in Kibana it does not seem to reflect this, and the fields still show as strings.

Looks like I had to kill both the Kibana and Elasticsearch Docker instances and re-run them. It seems to be working now.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.