Logstash failing to parse JSON

Hi - I have been trying to parse a JSON string and I get a JSON parser error. Below is the string:

"{reportSuite=geometrixxapibeta, timeGMT=1457955198, receivedTimeGMT=1457955198, hitIdHigh=3130934947308273664, hitIdLow=4649055123883334978, mcVisIdHigh=0, mcVisIdLow=0, visIdHigh=893325985, visIdLow=1806759822, visIdType=0, customVisId=37891, props={prop2=Men, prop1=Men, prop16={values=[Rating], delim=44}, prop17={values=[5 Stars], delim=44}, prop14=Anonymous, prop24=Get Outdoors! Amazon to Alps, prop12=22 - 25, prop13=Male, prop51=9, prop11=37891, prop5=Department, site_section=Men, server=Geometrixx Mobile}, evars={evars={eVar31=Anonymous, eVar4=Men, eVar2=int_145, eVar49=Male, eVar29=37891, eVar59=Checkout - Card Number, eVar18=Rating, eVar53=video, eVar43=22 - 25, eVar19=5 Stars, eVar51=Ski Gear Demystified, eVar61=Tw: Earned Link, eVar52=1:Intro}}, events={event83=[{count=1, exponent=0, unique=}], event82=[{count=1, exponent=0, unique=}], event7=[{count=1, exponent=0, unique=}]}, hierarchies={1={values=[Men], delim=:}}, geoCountry=fra, geoRegion=j, geoCity=tigery, geoZip=91250, geoDMA=250091, geoLatitude=48.6401, geoLongitude=2.50899, connectionType=Not Specified, topLevelDomain=, languageAbbrev=, language=, searchEngine=, bot=, operatingSystem=Windows 7, browserType=Google, browser=Google Chrome 7.0, javascriptVersion=1.4, monitorWidth=768, monitorHeight=768, monitorColorDepth=16 million (32-bit), mobileDeviceType=, mobileDeviceName=, mobileManufacturer=, mobileScreenWidth=0, mobileScreenHeight=0, mobileCookieSupport=false, mobileColorDepth=, mobileAudioSupport=, mobileVideoSupport=, pageURL=http://www.geometrixxoutdoors.com/content/geometrixx-outdoors/en/search.html, pageName=Men, usesPersistentCookie=1, homePage=2, browserHeight=429, browserWidth=660, javaEnabled=false, ip=193.178.155.99, isErrorPage=false, purchaseId=, referrer=http://www.geometrixxoutdoors.com/content/geometrixx-outdoors/en.html?test=a?s_kwcid=, state=LA, userAgent=Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.7 (KHTML, like Gecko) Chrome/7.0.517.41 Safari/534.7, plugins=, currency=USD, hitSource=1, transactionId=, truncated=false, zip=63933-3249}"

I am reading this from a Kafka queue using the kafka input plugin.

Logstash config - I have tried many permutations and combinations, so please suggest one that works. I tried it without the filter as well, and with just source, just target, and both.

input {
  kafka {
    zk_connect => 'zk1:2181'
    topic_id => 'TutorialTopic'
    type => "kafkaTopic_5"
    reset_beginning => true
    auto_offset_reset => smallest
    consumer_threads => 1
    consumer_restart_on_error => true
    consumer_restart_sleep_ms => 100
    decorate_events => true
    codec => json
  }
}

filter {
  json {
    source => "message"
    target => "tweet"
  }
}

output {
  stdout {
    codec => "rubydebug"
  }
  elasticsearch {
    host => "localhost"
  }
}

While that string is similar to JSON, it isn't JSON.
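The keys and values are separated by = instead of :, and neither the keys nor the string values are quoted, so the json filter (or any JSON parser) will reject it. Just to illustrate the difference, the first few fields expressed as real JSON would look roughly like this (a hand-written sketch, not your actual payload):

{
  "reportSuite": "geometrixxapibeta",
  "timeGMT": 1457955198,
  "props": {
    "prop2": "Men",
    "prop1": "Men"
  }
}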

Hi Magnus,

Thank you so much for your response. Sincerely much appreciated.

Is there a way to change that to JSON using some filter, such as grok or anything of that kind?

Regards
Naren

Everything's possible, but in this case there's no simple fix.
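You could get part of the way with a kv filter, since the top level is a flat list of key=value pairs - a rough sketch, assuming the whole string ends up in the message field:

filter {
  kv {
    # hypothetical settings for this particular string
    source      => "message"
    field_split => ","
    value_split => "="
  }
}

But values that themselves contain commas, and the nested {...} blocks (props, evars, events), would come out mangled or as opaque strings, which is why there's no simple fix on the Logstash side.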

Thank you Magnus. While writing to Kafka, if I use JsonEncode to encode the message, will that help?
What about the json_encode filter - will that be of any use?

I'm not familiar with Kafka but JsonEncode sounds promising. I suggest you try it out.

Thanks Magnus, will try.

Hi Magnus - I got the gem file for json_encode. How do I install that on my local Logstash running on Windows?

A json_encode filter on the Logstash side won't help you. It's the producing side (Kafka itself or whatever is sending the message in the first place) that needs to change.
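Whatever writes to the topic needs to serialize the message as real JSON rather than dumping it the way it does now. If the producer happened to be another Logstash instance, the kafka output's json codec would presumably take care of that - a minimal sketch with a made-up broker address, using the 0.8-era option names that your zk_connect input suggests (newer plugin versions use bootstrap_servers instead of broker_list):

output {
  kafka {
    broker_list => "kafka1:9092"   # hypothetical broker
    topic_id    => "TutorialTopic"
    codec       => json            # emit the event as real JSON
  }
}

If the producer is some other application, the equivalent change has to be made there.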

Hi Magnus - I was able to convert the string to JSON and made some progress. However, my JSON has some nesting: the top level is being indexed, but the nested values are indexed just as strings. What filter should I use so that the nested JSON is handled too?

Sample JSON:

{
"reportSuite" => "geometrixxapibeta",
"timeGMT" => 1458196160,
"receivedTimeGMT" => 1458196160,
"hitIdHigh" => 3131452409260343296,
"hitIdLow" => 5550652558416160808,
"mcVisIdHigh" => 0,
"mcVisIdLow" => 0,
"visIdHigh" => 2858913181,
"visIdLow" => 1076723857,
"visIdType" => 0,
"customVisId" => "6003",
"props" => {
"prop48" => "DS: Social"
},
"evars" => {
"evars" => {
"eVar58" => "Positive",
"eVar57" => "gold45",
"eVar56" => "Twitter",
"eVar55" => "allsports"
}
},
"events" => {
"event71" => [
[0] {
"count" => 0,
"exponent" => -2,
"unique" => ""
}
],
"event72" => [
[0] {
"count" => 0,
"exponent" => -2,
"unique" => ""
}
],
"event48" => [
[0] {
"count" => 1,
"exponent" => 0,
"unique" => ""
}
],
"cartViews" => [
[0] {
"count" => 1,
"exponent" => 0,
"unique" => ""
}
]
},
"products" => [
[0] {
"name" => "",
"category" => "",
"units" => 0,
"revenue" => 0.0,
"events" => {
"event71" => [
[0] {
"count" => 25600,
"exponent" => -2,
"unique" => ""
}
],
"event72" => [
[0] {
"count" => 500,
"exponent" => -2,
"unique" => ""
}
]
}
}
],
"mvvars" => {
"1" => {
"m" => {
"values" => [
[0] "allsports"
],
"delim" => 44
}
}
},
"geoCountry" => "usa",
"geoRegion" => "mo",
"geoCity" => "st louis",
"geoZip" => "63127",
"geoDMA" => 609,
"geoLatitude" => 38.5413,
"geoLongitude" => -90.408,
"connectionType" => "Not Specified",
"topLevelDomain" => "",
"languageAbbrev" => "",
"language" => "",
"searchEngine" => "",
"bot" => "",
"operatingSystem" => "OS X 10.6.8",
"browserType" => "Apple",
"browser" => "Safari 5.1.7",
"javascriptVersion" => "No JavaScript",
"monitorWidth" => 0,
"monitorHeight" => 0,
"monitorColorDepth" => "",
"mobileDeviceType" => "",
"mobileDeviceName" => "",
"mobileManufacturer" => "",
"mobileScreenWidth" => 0,
"mobileScreenHeight" => 0,
"mobileCookieSupport" => false,
"mobileColorDepth" => "",
"mobileAudioSupport" => "",
"mobileVideoSupport" => "",
"pageURL" => "",
"pageName" => "",
"usesPersistentCookie" => 1,
"homePage" => 0,
"browserHeight" => 0,
"browserWidth" => 0,
"javaEnabled" => false,
"ip" => "64.241.37.140",
"isErrorPage" => false,
"purchaseId" => "",
"referrer" => "",
"state" => "",
"userAgent" => "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/534.57.2 (KHTML, like Gecko) Version/5.1.7 Safari/534.57.2",
"plugins" => "",
"currency" => "USD",
"hitSource" => 1,
"transactionId" => "",
"truncated" => false,
"zip" => "",
"@version" => "1",
"@timestamp" => "2016-03-17T08:26:12.346Z",
"type" => "kafkaTopic_12",
"kafka" => {
"msg_size" => 2790,
"topic" => "StreamingTopic_1",
"consumer_group" => "logstash",
"partition" => 0,
"key" => nil
}
}

Is there any way to get all the child fields indexed as parent.child1.child2 keys with their values?

That's still not JSON. Are you using Logstash's rubydebug output, or from where have you copied the lines above?

Hi Magnus - Yes, I am using rubydebug. I am copying this from the Logstash output.

Okay. So what's the problem? Things look good to me.

Hi Magnus - I have now made some more progress using the filter below.

input {
  kafka {
    zk_connect => 'XXXXXXXXX:2181'
    topic_id => 'StreamingTopic_1'
    type => "kafkaTopic_13"
    reset_beginning => true
    auto_offset_reset => smallest
    consumer_threads => 1
    consumer_restart_on_error => true
    consumer_restart_sleep_ms => 100
    decorate_events => true
    codec => json
  }
}

filter {
  mutate {
    replace => [ "message", "%{message}}" ]
    gsub => [ 'message', '\n', '' ]
  }
  if [message] =~ /^{.*}$/ {
    json { source => "message" }
  }
}

output {
  stdout {
    codec => "rubydebug"
  }
  elasticsearch {
    protocol => "http"
    codec => json
    host => "localhost"
    index => "json"
    embedded => true
  }
}

I think I am at the last step of the issue. It is something to do with the filter I am using.

Logstash is parsing the below set of data fine:

"evars" : {
"evars" : {
"eVar31" : "Anonymous",
"eVar4" : "Seasonal",
"eVar2" : "int_134",
"eVar49" : "Female",
"eVar29" : "677087",
"eVar59" : "Checkout - Card Number",
"eVar18" : "Department",
"eVar53" : "video",
"eVar43" : "41 - 50",
"eVar19" : "Equipment,Men",
"eVar51" : "Ski Like a Girl!",
"eVar52" : "2:PreRoll",
"eVar61" : "FB: Application",
"eVar30" : "20160317677087"
}
},

However, the problem is with the below set of data:

"events" : {
"event82" : [ {
"count" : 1,
"exponent" : 0,
"unique" : ""
} ],
"event30" : [ {
"count" : 1,
"exponent" : 0,
"unique" : ""
} ],
"event84" : [ {
"count" : 1,
"exponent" : 0,
"unique" : ""
} ]
},
"hierarchies" : {
"1" : {
"values" : [ "Men" ],
"delim" : ":"
}
},

Instead of "events"."event82"."count" = 1 it is says events.event82 =
{"count":1,"exponent":0,"unique":""} .. this is how am seeing in elasticsearch .. any help

"This is how I am seeing it in Elasticsearch."

Do you mean in Kibana? Kibana doesn't support arrays of objects.

Hi Magnus - Not in Kibana, in Logstash. The JSON array is not processed as expected, as shown in my example.

What does the source document look like, i.e. what's Logstash's input?

Here it goes. I have the problem only for the nodes where arrays are involved, like products and events.

{
"reportSuite" : "geometrixxapibeta",
"timeGMT" : 1458226553,
"receivedTimeGMT" : 1458226553,
"hitIdHigh" : 3131517677732429824,
"hitIdLow" : 6301949885541317350,
"mcVisIdHigh" : 0,
"mcVisIdLow" : 0,
"visIdHigh" : 817097323,
"visIdLow" : 1447905620,
"visIdType" : 0,
"customVisId" : "779998",
"props" : {
"prop2" : "Women",
"prop1" : "Product Details|Mombassa Runners",
"prop16" : {
"values" : [ "Activity" ],
"delim" : 44
},
"prop17" : {
"values" : [ "Hiking", "Running", "$101 - $150", "$151 - $200", "Over $201" ],
"delim" : 44
},
"prop14" : "Anonymous",
"prop24" : "Summer Training Tips",
"prop12" : "31 - 40",
"prop13" : "Female",
"prop50" : "2,202",
"prop51" : "4",
"prop11" : "779998",
"prop5" : "Product Details",
"site_section" : "Product Details",
"prop3" : "Summer",
"server" : "Geometrixx Mobile",
"prop4" : "Running"
},
"evars" : {
"evars" : {
"eVar31" : "Anonymous",
"eVar4" : "Women",
"eVar2" : "int_114",
"eVar49" : "Female",
"eVar29" : "779998",
"eVar59" : "Checkout - Phone",
"eVar45" : "Running",
"eVar53" : "video",
"eVar44" : "Summer",
"eVar18" : "Activity,Price Range",
"eVar43" : "31 - 40",
"eVar19" : "Hiking,Running,$101 - $150,$151 - $200,Over $201",
"eVar51" : "Hiking Shoes For Every Terrain",
"eVar52" : "2:PreRoll",
"eVar61" : "FB: Welcome Tab"
}
},
"events" : {
"productViews" : [ {
"count" : 1,
"exponent" : 0,
"unique" : ""
} ],
"event3" : [ {
"count" : 1,
"exponent" : 0,
"unique" : ""
} ],
"event42" : [ {
"count" : 1,
"exponent" : 0,
"unique" : ""
} ],
"event19" : [ {
"count" : 1,
"exponent" : 0,
"unique" : ""
} ],
"event82" : [ {
"count" : 1,
"exponent" : 0,
"unique" : ""
} ],
"event7" : [ {
"count" : 1,
"exponent" : 0,
"unique" : ""
} ],
"event84" : [ {
"count" : 1,
"exponent" : 0,
"unique" : ""
} ]
},
"products" : [ {
"name" : "Mombassa Runners",
"category" : "",
"units" : 3,
"revenue" : 120.0
}, {
"name" : "Mombassa Runners",
"category" : "",
"units" : 1,
"revenue" : 32.0
} ],
"hierarchies" : {
"1" : {
"values" : [ "Product Details|Mombassa Runners" ],
"delim" : ":"
}
},
"geoCountry" : "pry",
"geoRegion" : "asu",
"geoCity" : "asuncion",
"geoZip" : "1119",
"geoDMA" : 0,
"geoLatitude" : -25.3005,
"geoLongitude" : -57.636199999999995,
"connectionType" : "Not Specified",
"topLevelDomain" : "",
"languageAbbrev" : "",
"language" : "",
"searchEngine" : "",
"bot" : "",
"operatingSystem" : "Windows 7",
"browserType" : "Mozilla",
"browser" : "Mozilla Firefox 18.0",
"javascriptVersion" : "1.1",
"monitorWidth" : 1366,
"monitorHeight" : 1366,
"monitorColorDepth" : "16 million (24-bit)",
"mobileDeviceType" : "",
"mobileDeviceName" : "",
"mobileManufacturer" : "",
"mobileScreenWidth" : 0,
"mobileScreenHeight" : 0,
"mobileCookieSupport" : false,
"mobileColorDepth" : "",
"mobileAudioSupport" : "",
"mobileVideoSupport" : "",
"pageURL" : "http://www.geometrixxoutdoors.com/content/geometrixx-outdoors/en/women/running/mombassa-runners.html",
"pageName" : "Product Details|Mombassa Runners",
"usesPersistentCookie" : 1,
"homePage" : 1,
"browserHeight" : 544,
"browserWidth" : 666,
"javaEnabled" : true,
"ip" : "201.217.50.194",
"isErrorPage" : false,
"purchaseId" : "20160317779998",
"referrer" : "http://www.geometrixxoutdoors.com/content/geometrixx-outdoors/en/search.html?i=1;q=*;q1=Women;q2=running;sp_cs=UTF-8;view=xml;x1=department;x2=activity",
"state" : "MA",
"userAgent" : "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:18.0) Gecko/20100101 Firefox/18.0",
"plugins" : "",
"currency" : "USD",
"hitSource" : 1,
"transactionId" : "",
"truncated" : false,
"zip" : "23417-2280"
}

Okay, so the input data looks like this:

"events" : {
  "productViews" : [ {
    "count" : 1,
    "exponent" : 0,
    "unique" : ""
   } ],
  "event3" : [ {
    "count" : 1,
    "exponent" : 0,
    "unique" : ""
  } ],
  ...

And this is what Logstash produces:

"events" : {
  "event82" : [ {
    "count" : 1,
    "exponent" : 0,
    "unique" : ""
  } ],
  "event30" : [ {
    "count" : 1,
    "exponent" : 0,
    "unique" : ""
  } ],
  ...

These documents have the same structure so it's not a problem with the JSON parsing. Where do you observe the undesirable behavior?

Instead of being indexed like events.event30.count = 1, it's getting indexed as events.event82 = { "count" : 1, "exponent" : 0, "unique" : "" }.