Help! Analyzing unstructured data so it can be visualized

(Irving Hernandez) #1

I want to analyze the file java.log, which contains unstructured data, and extract specific fields such as error, java, idCatCatalog, etc. I tried creating a template and loading it into Logstash, but the fields I added do not appear. Is there a way to do this using curl, the Java API, the Python API, etc.?

(Ry Biesemeyer) #2

If you plan to read these files with the File Input Plugin, you'll likely need to use the multiline codec (the default behaviour is to interpret log files as one event per line, but it's clear that each of your events spans multiple lines). Since your events begin with a square-bracketed funky-format date, you'll need to specify a pattern that reliably matches any date that can appear in the output.

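input {
  file {
    path => "/var/log/*.log"
    codec => multiline {
      # An event starts on a line that begins with a square-bracketed date
      pattern => "^\[\d{1,2}/\d{1,2}/\d{2} \d{1,2}:\d{2}:\d{2}:\d{3} [A-Z]{3,4}\]"
      negate => true
      # Lines that do NOT match the pattern are appended to the previous event
      what => "previous"
    }
  }
}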

From there, you may want to paste a few lines into the Grok Constructor, which is a helpful tool for building patterns for extracting information from semistructured data.
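
As a starting point for that kind of pattern, a minimal grok sketch might look like the one below. It assumes each multiline event starts with the bracketed date and that everything after it can be captured as one block for now. The field names log_timestamp and log_body are placeholders chosen for illustration, not names taken from your log.

```
filter {
  grok {
    # (?m) lets the pattern span the newlines inside a multiline event
    # log_timestamp and log_body are hypothetical field names
    match => { "message" => "(?m)^\[%{DATA:log_timestamp}\]\s+%{GREEDYDATA:log_body}" }
  }
}
```

Events that do not match are tagged with _grokparsefailure by default, which makes it easy to spot lines the pattern does not yet cover.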

The bit between the ------ delimiters looks pretty well-structured key-value to me, so using the KV Filter Plugin could prove useful, perhaps something like:

filter {
  kv {
    # One key-value pair per line, with key and value separated by ": "
    field_split_pattern => "\n"
    value_split_pattern => ": "
  }
}

While you're developing your pipeline, the STDOUT output plugin combined with the Rubydebug Codec can be very helpful in enabling you to see the structure that you're successfully extracting from the flat events:

output {
  stdout {
    codec => rubydebug
  }
}

Once you're fairly confident that you're extracting the useful bits, you may want to start using the Elasticsearch Output Plugin to get the events into Elasticsearch, where you can use a visualisation tool like Kibana.
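
A minimal sketch of such an output section is shown below. The host address and the index name are assumptions for illustration only: it presumes Elasticsearch is reachable on localhost:9200 and that a daily index named java-logs-* suits your setup, so adjust both to your environment.

```
output {
  elasticsearch {
    # Assumed address of a local Elasticsearch node
    hosts => ["http://localhost:9200"]
    # Hypothetical index name; one index per day based on the event timestamp
    index => "java-logs-%{+YYYY.MM.dd}"
  }
}
```

Keeping the stdout output alongside it during early testing lets you compare what Logstash extracted with what ends up indexed.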
