Is Logstash useful for structured data?

Hello everyone,

Recently I started to migrate some data into Elasticsearch using the Python client.

The data is already structured in JSON files and needs some processing, such as:

  • lowercase all the values
  • handle value inconsistencies in the country field (e.g. "United States of America" vs. "USA")
  • add some extra fields

In this case, is it still worth using Logstash (from what I have seen in the documentation and tutorials, it is mainly used for parsing log files with grok), or is it better to do all the processing in Python and then index the data into Elasticsearch?

Thank you!

Hi,

For me, it makes perfect sense to use Logstash to add fields and modify event data as it goes through pipelines.

Maybe confirm with an Elastic team member, as I'm not an expert :wink:


Maybe we can help each other :). I am good with Logstash but know zero Python.

But yes, you can do all three bullet points you listed in Logstash.
In short:
mutate - lowercase => [fieldname]
if [country] == "United States of America", change it to "USA"
mutate - add_field
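
Spelled out as a full filter block, that might look something like the sketch below (the `city` and `source_file` field names are just placeholders, not fields from your data):

  filter {
    # lowercase the values of the fields you care about
    mutate { lowercase => ["country", "city"] }

    # normalize one known inconsistency in the country field
    if [country] == "united states of america" {
      mutate { replace => { "country" => "usa" } }
    }

    # add an extra field
    mutate { add_field => { "source_file" => "import.json" } }
  }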


Sure, if you have any Python questions, feel free to reach out to me.

Regarding the country field: I would have to write a line in the config file for each country (which seems like a lot of hard-coding).

Is it possible to have something like a pre-defined key-value mapping and use it to map each country name to its shortened version?

A translate filter could do that.


Here is an example:

  translate {
    field => "[username]"
    destination => "[manager]"
    dictionary_path => "/tmp/user_manager.csv"
  }

And this is what your /tmp/user_manager.csv looks like:

"sam","zulu"
"zulu","younameit"

