JSON Result with 'columns' and 'rows' values - help parsing

mseto · January 10, 2019, 10:42pm

I've got a JSON document like so ::

{
"tables": [{
"name": "PrimaryResult",
"columns": [{
"name": "test",
"type": "string"
}, {
"name": "test2,
"type": "string"
}],
"rows": [
["testdatarow1", "test2datarow1"],
["testdatarow2", "test2datarow2"]
]
}]
}

As you can see, it labels the columns for the rows in a separate JSON section. How would I go about parsing / filtering this in Logstash before sending to Elasticsearch?

Thank you!!

Badger · January 10, 2019, 10:59pm

If you fix the missing quote in '"name": "test2,' then you can parse it using

filter { json { source => "message" } }

That will get you to

    "tables" => [
    [0] {
        "columns" => [
            [0] {
                "type" => "string",
                "name" => "test"
            },
            [1] {
                "type" => "string",
                "name" => "test2"
            }
        ],
           "rows" => [
            [0] [
                [0] "testdatarow1",
                [1] "test2datarow1"
            ],
            [1] [
                [0] "testdatarow2",
                [1] "test2datarow2"
            ]
        ],
           "name" => "PrimaryResult"
    }
]

From there you can do whatever you want. But you haven't said what you want to do...

mseto · January 10, 2019, 11:01pm

Thanks for responding!

I want to have each 'row' in Elastic as a separate log entry, using the proper fields from 'columns'

Quote issue - sorry, that was my own typo when removing sensitive data before posting.

Badger · January 11, 2019, 12:21am

OK, so you will probably want to a split filter on [tables][0][rows]. I think you are then saying that you want the column names to be pulled out of [tables][0][columns] to rename fields from rows. Can you show me, based on that sample data, what a single event should look like?

mseto · January 11, 2019, 1:15am

Yes, each row would be a new document in the ES index and within that document, would be the data from the row split up amongst the various fields from the columns

So the document in the index would be ::

field: test (string)
data: testdatarow1

field: test2 (string)
data: test2datarow1

The next document in the index ::

field: test (string)
data: testdatarow2

field: test2 (string)
data: test2datarow2

Make sense?

Thank you!

Badger · January 11, 2019, 8:11am

I would try something like

split { field => "[tables][0][rows]" }
mutate {
    add_field => {
        "%{[tables][0][columns][0][name]}" => "%{[tables][0][rows][0]}"
        "%{[tables][0][columns][1][name]}" => "%{[tables][0][rows][1]}"
    }
    remove_field => [ "tables" ]
}

mseto · January 11, 2019, 9:00pm

So here is my complete pipeline -

input {
appinsights {
type => "string"
query_interval_hours => 1
source => "requests"
key => "key here"
app_id => "app id here"
}
}

filter {
  split { field => "[tables][0][rows]" }
  mutate {
    add_field => {
      "%{[tables][0][columns][0][name]}" => "%{[tables][0][rows][0]}"
      "%{[tables][0][columns][1][name]}" => "%{[tables][0][rows][1]}"
    }
    remove_field => [ "tables" ]
  }
}

output {
  elasticsearch {
    hosts => ["elasticsearch:9200" ]
    user => "elastic"
    password => "pass here"
    index => "azure-appinsights-%{+YYYY.MM.dd}"
    template => "/usr/share/logstash/config/templates/appinsights.json"
    template_name => "appinsights"
    template_overwrite => true
    ssl => true
    cacert => "/usr/share/logstash/config/elastic-stack-ca.pem"
  }
}

The pipeline loads fine, but the index is created, contains no documents and goes into a red state.

The logstash logs say the following -

[2019-01-11T20:13:37,193][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,195][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,299][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,305][WARN ][org.logstash.FieldReference] Detected ambiguous Field Reference %{[tables][0][columns][0][name]}, which we expanded to the path [%{, tables, 0, columns, 0, name, }]; in a future release of Logstash, ambiguous Field References will not be expanded.
[2019-01-11T20:13:37,313][WARN ][org.logstash.FieldReference] Detected ambiguous Field Reference %{[tables][0][columns][1][name]}, which we expanded to the path [%{, tables, 0, columns, 1, name, }]; in a future release of Logstash, ambiguous Field References will not be expanded.
[2019-01-11T20:13:37,593][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,594][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,594][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,594][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,789][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,791][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,791][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,793][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,793][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:13:37,793][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][0][rows] is of type = NilClass
[2019-01-11T20:14:38,015][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[azure-appinsights-2019.01.10][3] primary shard is not active Timeout: [1m],

I see a lot of logs with actual data but they are accompanied with a message that the primary shard is not active. I continue to suspect this is filter related. What do you think?

Badger · January 11, 2019, 10:18pm

Is the input JSON? If so, you need that

filter { json { source => "message" } }

line. The error message is telling you that the event does not have a field called [tables][0][rows]

field:[tables][0][rows] is of type = NilClass

mseto · January 12, 2019, 10:49pm

Here is my current filter ::

filter {
  json {
    source => "message"
  }
  mutate {
    convert => { "tables" => "string" }
    convert => { "[tables][rows]" => "string" }
  }
  split { field => "tables" }
  split { field => "[tables][rows]" }
}

I've tried this with and without the mutate. Same result.

[2019-01-12T21:41:57,146][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:tables is of type = NilClass
[2019-01-12T21:41:57,158][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][rows] is of type = NilClass
[2019-01-12T21:41:59,037][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:tables is of type = NilClass
[2019-01-12T21:41:59,040][WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[tables][rows] is of type = NilClass

Badger · January 12, 2019, 11:09pm

There are multiple things here. I cannot think of anything logstash can do that would turn an index red. There should be an indication in the elasticsearch logs what is causing that.

Meanwhile, in logstash, can you remove the filter section completely and replace the output with

output { stdout { codec => rubydebug } }

and show us the output for an event.

mseto · January 13, 2019, 12:31am

Here is an output (this is exactly what displays in the Command Prompt ::

                   "sdkVersion"e[0;37m => e[0me[0;33m"azurefunctions: 2.0.12246.0"e[0m,
                    "itemCount"e[0;37m => e[0me[1;34m1e[0m,
                 "outerMessage"e[0;37m => e[0me[0;33m""e[0m,
                    "errorType"e[0;37m => e[0me[0;33m""e[0m,
             "innermostMessage"e[0;37m => e[0me[0;33m""e[0m,
                     "@version"e[0;37m => e[0me[0;33m"1"e[0m,
           "operation_ParentId"e[0;37m => e[0me[0;33m"nrLh8bzD7zQ="e[0m,
                     "duration"e[0;37m => e[0me[1;34m543.692e[0m,
          "application_Version"e[0;37m => e[0me[0;33m""e[0m,
                    "client_IP"e[0;37m => e[0me[0;33m"0.0.0.0"e[0m,
                    "client_OS"e[0;37m => e[0me[0;33m""e[0m,
                       "source"e[0;37m => e[0me[0;33m""e[0m,
                          "env"e[0;37m => e[0me[0;33m""e[0m,
                     "itemType"e[0;37m => e[0me[0;33m"request"e[0m,
                  "client_Type"e[0;37m => e[0me[0;33m"PC"e[0m,
                   "resultCode"e[0;37m => e[0me[0;33m"0"e[0m,
                           "id"e[0;37m => e[0me[0;33m"|nrLh8bzD7zQ=.affbe790_"e[0m,
                   "@timestamp"e[0;37m => e[0m2019-01-12T05:30:00.024Z,
             "customDimensions"e[0;37m => e[0me[0;33m"{\"TriggerReason\":\"Timer fired at 2019-01-12T05:30:00.0239476+00:00\",\"LogLevel\":\"Information\",\"Category\":\"Host.Results\",\"FullName\":\"HealthCheck\",\"FunctionExecutionTimeMs\":\"542.9888\",\"InvocationId\":\"296afb19-deb5-4f90-8ab7-7a20a7b3851e\",\"OperationName\":\"HealthCheck\"}"e[0m,
            "performanceBucket"e[0;37m => e[0me[0;33m"500ms-1sec"e[0m,
                   "session_Id"e[0;37m => e[0me[0;33m""e[0m,
               "cloud_RoleName"e[0;37m => e[0me[0;33m"cloudRoleNameHere"e[0m,
               "client_Browser"e[0;37m => e[0me[0;33m""e[0m,
         "user_AuthenticatedId"e[0;37m => e[0me[0;33m""e[0m,
                        "appId"e[0;37m => e[0me[0;33m"1fc8cda2-468b4-d3b2516610e2"e[0m,
           "customMeasurements"e[0;37m => e[0me[1;31mnile[0m,
                    "problemId"e[0;37m => e[0me[0;33m""e[0m,
                      "appName"e[0;37m => e[0me[0;33m"appNameHere"e[0m,
                         "iKey"e[0;37m => e[0me[0;33m"d3c9d106d3-ac0b-1b75569b0103"e[0m,
           "cloud_RoleInstance"e[0;37m => e[0me[0;33m"be9f97accd822cd9b37ab025062440f953d630a759"e[0m,
                         "type"e[0;37m => e[0me[0;33m"string"e[0m,
               "user_AccountId"e[0;37m => e[0me[0;33m""e[0m,
                      "success"e[0;37m => e[0me[0;33m"True"e[0m,
                          "url"e[0;37m => e[0me[0;33m""e[0m,
                 "client_Model"e[0;37m => e[0me[0;33m""e[0m,
                 "operation_Id"e[0;37m => e[0me[0;33m"nrLh8bzD7zQ="e[0m,
               "operation_Name"e[0;37m => e[0me[0;33m"HealthCheck"e[0m,
                      "user_Id"e[0;37m => e[0me[0;33m""e[0m,
    "operation_SyntheticSource"e[0;37m => e[0me[0;33m""e[0m,
            "error_operationId"e[0;37m => e[0me[0;33m""e[0m,
                       "itemId"e[0;37m => e[0me[0;33m"2cb53-b469-592bd94f1f6b"e[0m,
                         "name"e[0;37m => e[0me[0;33m"HealthCheck"e[0m,
                    "timestamp"e[0;37m => e[0me[0;33m"2019-01-12T05:30:00.0242524Z"e[0m
}

Badger · January 13, 2019, 1:01am

mseto:

"sdkVersion"e[0;37m => e[0me[0;33m"azurefunctions: 2.0.12246.0"e[0m,
"itemCount"e[0;37m => e[0me[1;34m1e[0m,
"outerMessage"e[0;37m => e[0me[0;33m""e[0m,
"errorType"e[0;37m => e[0me[0;33m""e[0m,
"innermostMessage"e[0;37m => e[0me[0;33m""e[0m,
"@version"e[0;37m => e[0me[0;33m"1"e[0m,
"operation_ParentId"e[0;37m => e[0me[0;33m"nrLh8bzD7zQ="e[0m,
"duration"e[0;37m => e[0me[1;34m543.692e[0m,
"application_Version"e[0;37m => e[0me[0;33m""e[0m,
"client_IP"e[0;37m => e[0me[0;33m"0.0.0.0"e[0m,
"client_OS"e[0;37m => e[0me[0;33m""e[0m,
"source"e[0;37m => e[0me[0;33m""e[0m,
"env"e[0;37m => e[0me[0;33m""e[0m,
"itemType"e[0;37m => e[0me[0;33m"request"e[0m,
"client_Type"e[0;37m => e[0me[0;33m"PC"e[0m,
"resultCode"e[0;37m => e[0me[0;33m"0"e[0m,
"id"e[0;37m => e[0me[0;33m"|nrLh8bzD7zQ=.affbe790_"e[0m,
"@timestamp"e[0;37m => e[0m2019-01-12T05:30:00.024Z,
"customDimensions"e[0;37m => e[0me[0;33m"{"TriggerReason":"Timer fired at 2019-01-12T05:30:00.0239476+00:00","LogLevel":"Information","Category":"Host.Results","FullName":"HealthCheck","FunctionExecutionTimeMs":"542.9888","InvocationId":"296afb19-deb5-4f90-8ab7-7a20a7b3851e","OperationName":"HealthCheck"}"e[0m,
"performanceBucket"e[0;37m => e[0me[0;33m"500ms-1sec"e[0m,
"session_Id"e[0;37m => e[0me[0;33m""e[0m,
"cloud_RoleName"e[0;37m => e[0me[0;33m"cloudRoleNameHere"e[0m,
"client_Browser"e[0;37m => e[0me[0;33m""e[0m,
"user_AuthenticatedId"e[0;37m => e[0me[0;33m""e[0m,
"appId"e[0;37m => e[0me[0;33m"1fc8cda2-468b4-d3b2516610e2"e[0m,
"customMeasurements"e[0;37m => e[0me[1;31mnile[0m,
"problemId"e[0;37m => e[0me[0;33m""e[0m,
"appName"e[0;37m => e[0me[0;33m"appNameHere"e[0m,
"iKey"e[0;37m => e[0me[0;33m"d3c9d106d3-ac0b-1b75569b0103"e[0m,
"cloud_RoleInstance"e[0;37m => e[0me[0;33m"be9f97accd822cd9b37ab025062440f953d630a759"e[0m,
"type"e[0;37m => e[0me[0;33m"string"e[0m,
"user_AccountId"e[0;37m => e[0me[0;33m""e[0m,
"success"e[0;37m => e[0me[0;33m"True"e[0m,
"url"e[0;37m => e[0me[0;33m""e[0m,
"client_Model"e[0;37m => e[0me[0;33m""e[0m,
"operation_Id"e[0;37m => e[0me[0;33m"nrLh8bzD7zQ="e[0m,
"operation_Name"e[0;37m => e[0me[0;33m"HealthCheck"e[0m,
"user_Id"e[0;37m => e[0me[0;33m""e[0m,
"operation_SyntheticSource"e[0;37m => e[0me[0;33m""e[0m,
"error_operationId"e[0;37m => e[0me[0;33m""e[0m,
"itemId"e[0;37m => e[0me[0;33m"2cb53-b469-592bd94f1f6b"e[0m,
"name"e[0;37m => e[0me[0;33m"HealthCheck"e[0m,
"timestamp"e[0;37m => e[0me[0;33m"2019-01-12T05:30:00.0242524Z"e[0m

Your post is unreadable. Can you either precede and follow that with a line containing three backticks so that it is quoted like this

Foo

or, select it and click on </> in the toolbar above the edit pane, which will indent it by 4 spaces, which also works.

Bar
Baz

Either way. You were asking about a field called table. And there is no field in that called table.

mseto · January 13, 2019, 1:30am

I edited my post. I do not know what is happening. Would you happen to know why the data is showing up so obscure looking? It looks cleaner when directly querying the API through curl.

system · February 10, 2019, 1:30am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Split row data to fields Logstash	7	2138	July 6, 2017
Storing table-like data in Elastic Search Elasticsearch	15	4341	July 6, 2017
How to parse mix json logs Logstash	28	4562	March 25, 2019
Json Data not getting parsed when sent to Elasticsearch Elasticsearch	5	900	July 6, 2017
Logstash-elastic do not support array based columns? Logstash	15	1853	October 23, 2017

JSON Result with 'columns' and 'rows' values - help parsing

Related topics