Parsing issues

nickel43 · November 19, 2021, 12:06pm

Hello,

I'm new to Elastic and am trying to parse JSON files into multiple fields so that I can build statistics from them in Kibana.

Here is a sample:

{
    "info": {
        "generated_on": "2017-12-03 08:41:42.057563", 
        "slice": "0-999", 
        "version": "v1"
    }, 
    "playlists": [
        {
            "name": "Rock", 
            "collaborative": "false", 
            "pid": 0, 
            "modified_at": 1493424000, 
            "num_tracks": 22, 
            "num_albums": 27, 
            "num_followers": 1, 
            "tracks": [
                {
                    "pos": 0, 
                    "artist_name": "Michael Jackson", 
                    "track_uri": "spotify:track:0UaMYEvWZi0ZqiDOoHU3YI", 
                    "artist_uri": "spotify:d5F5d7go1WT98tk", 
                    "track_name": "Song", 
                    "album_uri": "spotify:album:6vV5Udzzf4Qo2I9K", 
                    "duration_ms": 226863, 
                    "album_name": "The Cookbook"
                }
            ],
            "num_edits": 34,
            "duration_ms": 9065801,
            "num_artists": 37
        },
        {
            "name": "Jazz", 
            "collaborative": "false", 
            "pid": 0, 
            "modified_at": 1493424000, 
            "num_tracks": 22, 
            "num_albums": 27, 
            "num_followers": 1, 
            "tracks": [
                {
                    "pos": 0, 
                    "artist_name": "Whatever", 
                    "track_uri": "spotify:track:0UaMYEvWZi0ZqiDOoHU3YI", 
                    "artist_uri": "spotify:d5F5d7go1WT98tk", 
                    "track_name": "Song", 
                    "album_uri": "spotify:album:6vV5Udzzf4Qo2I9K", 
                    "duration_ms": 226863, 
                    "album_name": "The Cookbook"
                }
            ],
            "num_edits": 34,
            "duration_ms": 9065801,
            "num_artists": 37
        }
    ]
}

I have managed to parse the sample above with this Logstash configuration:

input{
    file{
        path => "test.json"
        sincedb_path => "/dev/null"
        start_position => "beginning"
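        # The pattern is not expected to match any line, so with negate => true
        # every line is appended to the previous one and the whole file becomes
        # a single event, flushed after 1 second without new lines.
        codec => multiline {
            pattern => "^Spalanzani"
            negate => true
            what => previous
            auto_flush_interval => 1
        }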
    }
}

filter { 
    json { 
        source => "[message]"
    }
}


output{
    elasticsearch{
        hosts => "localhost:9200"
        index => "test"
    }
    stdout { codec => rubydebug }
}

The problem is:

This works for small files, but as soon as I use the full JSON files (~35 MB, 60k lines), I get parsing errors, and the messages in the Kibana hits are just tracks/playlists cut off at random points.
I'm 100% sure the JSON files are well-formed and follow the structure above.
Could it be that the files are too big?

I use the latest versions of Kibana, Logstash and Elasticsearch.

Thank you for your help

Badger · November 19, 2021, 5:32pm

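Yes. See the max_bytes and max_lines options on the multiline codec. By default the codec flushes an event once it has accumulated 500 lines (max_lines) or 10 MiB (max_bytes), so a 60k-line, ~35 MB file gets cut into fragments that are not valid JSON on their own.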
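
As a rough sketch of what raising those two limits could look like in the input from the question: the values 100000 and "100 MiB" are illustrative assumptions, picked only to exceed the ~60k lines and ~35 MB mentioned above, and the path is a placeholder. This has not been tested against the actual files.

```
input {
    file {
        # Placeholder path (assumption); the file input expects an absolute path.
        path => "/path/to/test.json"
        sincedb_path => "/dev/null"
        start_position => "beginning"
        codec => multiline {
            pattern => "^Spalanzani"
            negate => true
            what => previous
            auto_flush_interval => 1
            # Illustrative limits (assumptions), set above the file's
            # line count and size so the event is not flushed early.
            max_lines => 100000
            max_bytes => "100 MiB"
        }
    }
}
```

If an event is still cut short, it may be worth checking its tags: the codec marks events that hit a limit with multiline_codec_max_lines_reached or multiline_codec_max_bytes_reached.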

