Parsing a JSON log and adding its fields to the root level

I'm trying to ingest logs directly into Elasticsearch using Filebeat 8.17.0. I want to separate the JSON keys from the logs and add them at the root level so I can perform analytics based on those fields.

Sample log:

[23/Jan/2025 11:47:13] INFO [elastic_service:1242] {"sort": [{"newarrivals": "desc"}, {"_score": "desc"}, {"speed": "desc"}], "from": 0, "_source": ["product_id", "unit_price", "views", "count_sold", "ordering", "is_soldout", "colorcode", "newarrivals"], "aggs": {"distinct_sizes": {"terms": {"field": "sizes", "order": {"_key": "asc"}, "size": 1000}}, "distinct_color.rgbcode": {"terms": {"field": "color.rgbcode", "order": {"_key": "asc"}, "size": 1000}}, "categories": {"aggs": {"distinct_categories.cat_slug": {"terms": {"field": "categories.cat_slug", "size": 1000}, "aggs": {"categories.groups": {"aggs": {"categories.groups.group_name": {"terms": {"field": "categories.groups.group_name", "size": 1000}}}, "nested": {"path": "categories.groups"}}}}}, "nested": {"path": "categories"}}}, "query": {"function_score": {"query": {"bool": {"minimum_should_match": 1, "filter": [{"bool": {"must": [{"term": {"active": {"value": true}}}]}}], "should": [{"prefix": {"ean": {"boost": 5, "value": "abc@mail.in"}}}, {"prefix": {"sku": {"boost": 5, "value": "abc@mail.in"}}}, {"multi_match": {"minimum_should_match": "75%", "fields": ["parent_category^8", "name^13", "color.name^9", "sizes^4", "sizes_text^4", "category_style^12", "category_solutions^11", "category_fabric^10", "category_occasion^7", "category_offers^8", "category_child^5", "description^4"], "type": "most_fields", "fuzziness": 1, "query": "abc@mail.in"}}, {"nested": {"path": "categories", "query": {"bool": {"must": [{"match": {"categories.cat_name": {"query": "abc@mail.in", "fuzziness": "AUTO"}}}]}}}}], "must": [{"bool": {"should": [{"match": {"name": {"query": "abc@mail.in", "fuzziness": "AUTO"}}}, {"match": {"description": {"query": "abc@mail.in", "fuzziness": "AUTO"}}}, {"prefix": {"sku": {"boost": 5, "value": "abc@mail.in"}}}, {"nested": {"path": "categories", "query": {"bool": {"must": [{"match": {"categories.cat_name": {"query": "", "fuzziness": "AUTO"}}}]}}}}]}}]}}, "functions": [{"script_score": {"script": "_score * (doc[is_soldout].value==true?0:1)"}}], "score_mode": "multiply", "boost_mode": "multiply"}}, "size": 4}

I am using the dissect processor in the processors section of filebeat.yml to separate the JSON field:

processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - dissect:
      tokenizer: '[%{timestamp}] %{level} [%{service}] %{json_field}'
      field: "message"
      target: ""
  - decode_json_fields:
      fields: ["json_field"]
      process_array: false
      max_depth: 2
      target: ""
      overwrite_keys: true
      add_error_key: true
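
For reference, my expectation is that the tokenizer splits the sample line into roughly these fields (a sketch, values truncated; the field names come from the tokenizer above):

timestamp: "23/Jan/2025 11:47:13"
level: "INFO"
service: "elastic_service:1242"
json_field: '{"sort": [{"newarrivals": "desc"}, ...], ...}'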

I now have fields such as dissect.timestamp, dissect.level, dissect.service, and dissect.json_field at the root level, but json_field is not being decoded any further. I am also facing an error in the dashboard; the dashboard screenshot is attached below.

I want to parse the JSON field further and promote the keys from the JSON log to root-level fields.
I appreciate any help you can provide.

Hello and welcome,

Your configuration is wrong: the dissect processor does not have a target setting; it is target_prefix. [documentation]

Change target: "" in the dissect processor to target_prefix: "" and see if it works.
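
For reference, the corrected processor chain would look roughly like this (a sketch of your own config with only that one setting renamed):

processors:
  - dissect:
      tokenizer: '[%{timestamp}] %{level} [%{service}] %{json_field}'
      field: "message"
      target_prefix: ""
  - decode_json_fields:
      fields: ["json_field"]
      process_array: false
      max_depth: 2
      target: ""
      overwrite_keys: true
      add_error_key: true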

Also, keep in mind that you have fields that may conflict with built-in fields like _source; I'm not sure whether this will work or cause issues.

@leandrojmp I have changed target to target_prefix, but dissect.json_field is still not being processed by decode_json_fields. The keys are still not visible at the root level.

If I manually feed in the JSON log alone, without the timestamp, level, etc., then it works: the keys are decoded and added to the root level.

{"sort": [{"newarrivals": "desc"}, {"_score": "desc"}, {"speed": "desc"}], "from": 0, "_source": ["product_id", "unit_price", "views", "count_sold", "ordering", "is_soldout", "colorcode", "newarrivals"], "aggs": {"distinct_sizes": {"terms": {"field": "sizes", "order": {"_key": "asc"}, "size": 1000}}, "distinct_color.rgbcode": {"terms": {"field": "color.rgbcode", "order": {"_key": "asc"}, "size": 1000}}, "categories": {"aggs": {"distinct_categories.cat_slug": {"terms": {"field": "categories.cat_slug", "size": 1000}, "aggs": {"categories.groups": {"aggs": {"categories.groups.group_name": {"terms": {"field": "categories.groups.group_name", "size": 1000}}}, "nested": {"path": "categories.groups"}}}}}, "nested": {"path": "categories"}}}, "query": {"function_score": {"query": {"bool": {"minimum_should_match": 1, "filter": [{"bool": {"must": [{"term": {"active": {"value": true}}}]}}], "should": [{"prefix": {"ean": {"boost": 5, "value": "abc@mail.in"}}}, {"prefix": {"sku": {"boost": 5, "value": "abc@mail.in"}}}, {"multi_match": {"minimum_should_match": "75%", "fields": ["parent_category^8", "name^13", "color.name^9", "sizes^4", "sizes_text^4", "category_style^12", "category_solutions^11", "category_fabric^10", "category_occasion^7", "category_offers^8", "category_child^5", "description^4"], "type": "most_fields", "fuzziness": 1, "query": "abc@mail.in"}}, {"nested": {"path": "categories", "query": {"bool": {"must": [{"match": {"categories.cat_name": {"query": "abc@mail.in", "fuzziness": "AUTO"}}}]}}}}], "must": [{"bool": {"should": [{"match": {"name": {"query": "abc@mail.in", "fuzziness": "AUTO"}}}, {"match": {"description": {"query": "abc@mail.in", "fuzziness": "AUTO"}}}, {"prefix": {"sku": {"boost": 5, "value": "abc@mail.in"}}}, {"nested": {"path": "categories", "query": {"bool": {"must": [{"match": {"categories.cat_name": {"query": "", "fuzziness": "AUTO"}}}]}}}}]}}]}}, "functions": [{"script_score": {"script": "_score * (doc[is_soldout].value==true?0:1)"}}], "score_mode": "multiply", "boost_mode": "multiply"}}, "size": 4}

But the same is not working on the dissected json_field; decode_json_fields is not able to parse the dissected JSON log.

What result did you get in Kibana when you set target_prefix: "" in your filebeat.yml?

If you still get dissect.* fields, then try changing target_prefix to a random string to confirm that Filebeat is indeed using the updated file, something like target_prefix: "test_target".

There is no issue with target_prefix. Since the log entry starts with [23/Jan/2025 11:47:13] INFO [elastic_service:1242], there is an error in the JSON parsing:

error.message: parsing input as json: invalid character '/' after array element

It is not clear what exactly your issue is now.

You said that even using target_prefix: "" you still had the dissect.* fields, but now you say that there is no issue with it and that the issue is with the JSON parsing.

Those things are done by two different processors; which one is working and which one is not?

Please share what you are seeing in Kibana; without seeing it, it is not possible to troubleshoot.
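
If it helps, you can inspect the full processed event locally by temporarily replacing the Elasticsearch output with the console output and running Filebeat in the foreground with filebeat -e (a debugging sketch; only one output can be enabled at a time, so comment out output.elasticsearch while testing):

# temporary debugging output in filebeat.yml:
# prints each processed event to stdout instead of shipping it
output.console:
  pretty: true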

I have changed target to target_prefix under dissect. Now only the four dissected fields are available at the root level: dissect.timestamp, dissect.level, dissect.service, and dissect.json_field.

Even though I have included decode_json_fields as the next step to parse the JSON from json_field, I don't get any new fields at the root level.

I tried changing the target_prefix value to parsed to confirm that the Filebeat config changes were being picked up, and they were: parsed.timestamp and the three other fields from dissect are present at the root level.

decode_json_fields was not able to process the dissected JSON log from dissect.json_field.

Ok, so the dissect processor is working and your issue is with the decode_json_fields processor.

You need to share what you are getting in Kibana and what your source message looks like.
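
One thing worth checking while you gather that: with a non-empty target_prefix, the dissected value is no longer json_field at the root but lives under the prefix, so the decode step has to reference the full path. A sketch, assuming target_prefix: "parsed" as in your test:

processors:
  - dissect:
      tokenizer: '[%{timestamp}] %{level} [%{service}] %{json_field}'
      field: "message"
      target_prefix: "parsed"
  - decode_json_fields:
      fields: ["parsed.json_field"]  # must match the prefixed field name
      process_array: false
      max_depth: 2
      target: ""
      overwrite_keys: true
      add_error_key: true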

I have changed my logs to be complete JSON and removed the dissect processor.

Now decode_json_fields is able to parse my JSON without any issues, since each line is complete JSON.
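
For anyone hitting the same thing, my processors section is now roughly this (a sketch; I'm assuming the raw JSON line lands in the message field, which is Filebeat's default):

processors:
  - decode_json_fields:
      fields: ["message"]
      process_array: false
      max_depth: 2
      target: ""
      overwrite_keys: true
      add_error_key: true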