JSON Result with 'columns' and 'rows' values - help parsing


(Michael) #1

I've got a JSON document like so ::

{
"tables": [{
"name": "PrimaryResult",
"columns": [{
"name": "test",
"type": "string"
}, {
"name": "test2,
"type": "string"
}],
"rows": [
["testdatarow1", "test2datarow1"],
["testdatarow2", "test2datarow2"]
]
}]
}

As you can see, it labels the columns for the rows in a separate JSON section. How would I go about parsing / filtering this in Logstash before sending to Elasticsearch?

Thank you!!


#2

If you fix the missing quote in '"name": "test2,' then you can parse it using

filter { json { source => "message" } }

That will get you to

    "tables" => [
    [0] {
        "columns" => [
            [0] {
                "type" => "string",
                "name" => "test"
            },
            [1] {
                "type" => "string",
                "name" => "test2"
            }
        ],
           "rows" => [
            [0] [
                [0] "testdatarow1",
                [1] "test2datarow1"
            ],
            [1] [
                [0] "testdatarow2",
                [1] "test2datarow2"
            ]
        ],
           "name" => "PrimaryResult"
    }
]

From there you can do whatever you want. But you haven't said what you want to do...


(Michael) #3

Thanks for responding!

I want to have each 'row' in Elastic as a separate log entry, using the proper fields from 'columns'

Quote issue - sorry, that was my own typo when removing sensitive data before posting.


#4

OK, so you will probably want to a split filter on [tables][0][rows]. I think you are then saying that you want the column names to be pulled out of [tables][0][columns] to rename fields from rows. Can you show me, based on that sample data, what a single event should look like?


(Michael) #5

Yes, each row would be a new document in the ES index and within that document, would be the data from the row split up amongst the various fields from the columns

So the document in the index would be ::

field: test (string)
data: testdatarow1

field: test2 (string)
data: test2datarow1

The next document in the index ::

field: test (string)
data: testdatarow2

field: test2 (string)
data: test2datarow2

Make sense?

Thank you!


#6

I would try something like

split { field => "[tables][0][rows]" }
mutate {
    add_field => {
        "%{[tables][0][columns][0][name]}" => "%{[tables][0][rows][0]}"
        "%{[tables][0][columns][1][name]}" => "%{[tables][0][rows][1]}"
    }
    remove_field => [ "tables" ]
}

(Michael) #7

So here is my complete pipeline -

input {
appinsights {
type => "string"
query_interval_hours => 1
source => "requests"
key => "key here"
app_id => "app id here"
}
}

filter {
  split { field => "[tables][0][rows]" }
  mutate {
    add_field => {
      "%{[tables][0][columns][0][name]}" => "%{[tables][0][rows][0]}"
      "%{[tables][0][columns][1][name]}" => "%{[tables][0][rows][1]}"
    }
    remove_field => [ "tables" ]
  }
}

output {
  elasticsearch {
    hosts => ["elasticsearch:9200" ]
    user => "elastic"
    password => "pass here"
    index => "azure-appinsights-%{+YYYY.MM.dd}"
    template => "/usr/share/logstash/config/templates/appinsights.json"
    template_name => "appinsights"
    template_overwrite => true
    ssl => true
    cacert => "/usr/share/logstash/config/elastic-stack-ca.pem"
  }
}

The pipeline loads fine, but the index is created, contains no documents and goes into a red state.

The logstash logs say the following -

[2019-01-11T20:13:37,193][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,195][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,299][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,305][WARN ][org.logstash.FieldReference] Detected ambiguous Field Reference %{[tables][0][columns][0][name]}, which we expanded to the path [%{, tables, 0, columns, 0, name, }]; in a future release of Logstash, ambiguous Field References will not be expanded.
[2019-01-11T20:13:37,313][WARN ][org.logstash.FieldReference] Detected ambiguous Field Reference %{[tables][0][columns][1][name]}, which we expanded to the path [%{, tables, 0, columns, 1, name, }]; in a future release of Logstash, ambiguous Field References will not be expanded.
[2019-01-11T20:13:37,593][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,594][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,594][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,594][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,789][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,791][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,791][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,793][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,793][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,793][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:14:38,015][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[azure-appinsights-2019.01.10][3] primary shard is not active Timeout: [1m],

I see a lot of logs with actual data but they are accompanied with a message that the primary shard is not active. I continue to suspect this is filter related. What do you think?


#8

Is the input JSON? If so, you need that

filter { json { source => "message" } }

line. The error message is telling you that the event does not have a field called [tables][0][rows]

field:[tables][0][rows] is of type = NilClass

(Michael) #9

Here is my current filter ::

filter {
  json {
    source => "message"
  }
  mutate {
    convert => { "tables" => "string" }
    convert => { "[tables][rows]" => "string" }
  }
  split { field => "tables" }
  split { field => "[tables][rows]" }
}

I've tried this with and without the mutate. Same result.

[2019-01-12T21:41:57,146][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:tables is of type = NilClass
[2019-01-12T21:41:57,158][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][rows] is of type = NilClass
[2019-01-12T21:41:59,037][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:tables is of type = NilClass
[2019-01-12T21:41:59,040][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][rows] is of type = NilClass


#10

There are multiple things here. I cannot think of anything logstash can do that would turn an index red. There should be an indication in the elasticsearch logs what is causing that.

Meanwhile, in logstash, can you remove the filter section completely and replace the output with

output { stdout { codec => rubydebug } }

and show us the output for an event.


(Michael) #11

Here is an output (this is exactly what displays in the Command Prompt ::

                   "sdkVersion"e[0;37m => e[0me[0;33m"azurefunctions: 2.0.12246.0"e[0m,
                    "itemCount"e[0;37m => e[0me[1;34m1e[0m,
                 "outerMessage"e[0;37m => e[0me[0;33m""e[0m,
                    "errorType"e[0;37m => e[0me[0;33m""e[0m,
             "innermostMessage"e[0;37m => e[0me[0;33m""e[0m,
                     "@version"e[0;37m => e[0me[0;33m"1"e[0m,
           "operation_ParentId"e[0;37m => e[0me[0;33m"nrLh8bzD7zQ="e[0m,
                     "duration"e[0;37m => e[0me[1;34m543.692e[0m,
          "application_Version"e[0;37m => e[0me[0;33m""e[0m,
                    "client_IP"e[0;37m => e[0me[0;33m"0.0.0.0"e[0m,
                    "client_OS"e[0;37m => e[0me[0;33m""e[0m,
                       "source"e[0;37m => e[0me[0;33m""e[0m,
                          "env"e[0;37m => e[0me[0;33m""e[0m,
                     "itemType"e[0;37m => e[0me[0;33m"request"e[0m,
                  "client_Type"e[0;37m => e[0me[0;33m"PC"e[0m,
                   "resultCode"e[0;37m => e[0me[0;33m"0"e[0m,
                           "id"e[0;37m => e[0me[0;33m"|nrLh8bzD7zQ=.affbe790_"e[0m,
                   "@timestamp"e[0;37m => e[0m2019-01-12T05:30:00.024Z,
             "customDimensions"e[0;37m => e[0me[0;33m"{\"TriggerReason\":\"Timer fired at 2019-01-12T05:30:00.0239476+00:00\",\"LogLevel\":\"Information\",\"Category\":\"Host.Results\",\"FullName\":\"HealthCheck\",\"FunctionExecutionTimeMs\":\"542.9888\",\"InvocationId\":\"296afb19-deb5-4f90-8ab7-7a20a7b3851e\",\"OperationName\":\"HealthCheck\"}"e[0m,
            "performanceBucket"e[0;37m => e[0me[0;33m"500ms-1sec"e[0m,
                   "session_Id"e[0;37m => e[0me[0;33m""e[0m,
               "cloud_RoleName"e[0;37m => e[0me[0;33m"cloudRoleNameHere"e[0m,
               "client_Browser"e[0;37m => e[0me[0;33m""e[0m,
         "user_AuthenticatedId"e[0;37m => e[0me[0;33m""e[0m,
                        "appId"e[0;37m => e[0me[0;33m"1fc8cda2-468b4-d3b2516610e2"e[0m,
           "customMeasurements"e[0;37m => e[0me[1;31mnile[0m,
                    "problemId"e[0;37m => e[0me[0;33m""e[0m,
                      "appName"e[0;37m => e[0me[0;33m"appNameHere"e[0m,
                         "iKey"e[0;37m => e[0me[0;33m"d3c9d106d3-ac0b-1b75569b0103"e[0m,
           "cloud_RoleInstance"e[0;37m => e[0me[0;33m"be9f97accd822cd9b37ab025062440f953d630a759"e[0m,
                         "type"e[0;37m => e[0me[0;33m"string"e[0m,
               "user_AccountId"e[0;37m => e[0me[0;33m""e[0m,
                      "success"e[0;37m => e[0me[0;33m"True"e[0m,
                          "url"e[0;37m => e[0me[0;33m""e[0m,
                 "client_Model"e[0;37m => e[0me[0;33m""e[0m,
                 "operation_Id"e[0;37m => e[0me[0;33m"nrLh8bzD7zQ="e[0m,
               "operation_Name"e[0;37m => e[0me[0;33m"HealthCheck"e[0m,
                      "user_Id"e[0;37m => e[0me[0;33m""e[0m,
    "operation_SyntheticSource"e[0;37m => e[0me[0;33m""e[0m,
            "error_operationId"e[0;37m => e[0me[0;33m""e[0m,
                       "itemId"e[0;37m => e[0me[0;33m"2cb53-b469-592bd94f1f6b"e[0m,
                         "name"e[0;37m => e[0me[0;33m"HealthCheck"e[0m,
                    "timestamp"e[0;37m => e[0me[0;33m"2019-01-12T05:30:00.0242524Z"e[0m
}

#12

Your post is unreadable. Can you either precede and follow that with a line containing three backticks so that it is quoted like this

Foo

or, select it and click on </> in the toolbar above the edit pane, which will indent it by 4 spaces, which also works.

Bar
Baz

Either way. You were asking about a field called table. And there is no field in that called table.


(Michael) #13

I edited my post. I do not know what is happening. Would you happen to know why the data is showing up so obscure looking? It looks cleaner when directly querying the API through curl.