Logstash - Aggregate xpath data

Hello,

I am a Logstash beginner and probably have a relatively simple question.
I'm using xpath to get data from an XML document (I have shortened the config on purpose):

filter {
  xml {
    source => "message"
    store_xml => false
    xpath =>
    [
      "/Event/@Date", "Date",
      "/Event/@Version", "Version",
      "/E/@S", "Severity"
    ]
  }
}

After that I would like to use the aggregate filter (correct me if I'm wrong, but I guess it is the simplest way to achieve this) so that all the information from /Event (Date, Version) ends up on my /E (Severity) documents in Kibana. That way I could match them with KQL filters, i.e. every /E entry in the table would carry the /Event information as well.

I have already looked at the instructions for the filter on the Elastic website, I just don't understand how to apply it to my xpath data, or whether it is even possible.

filter {
   grok {
     match => [ "message", "%{LOGLEVEL:loglevel} - %{NOTSPACE:taskid} - %{NOTSPACE:logger} - %{WORD:label}( - %{INT:duration:int})?" ]
   }

   if [logger] == "TASK_START" {
     aggregate {
       task_id => "%{taskid}"
       code => "map['sql_duration'] = 0"
       map_action => "create"
     }
   }
}

Could anybody help me with my problem or give me a brief introduction to the aggregate filter with xpath data? Unfortunately I am also a bit confused about how to implement the syntax of the query correctly.

Thanks

What does your [message] field look like? If possible, get it from

output { stdout { codec => rubydebug } }

First of all, thanks for your quick response Badger,
I hope I explained my problem clearly enough.

My output in general looks like this:

{
      "event" => {},
        "log" => {
        "file" => {}
    },
       "Date" => [
        [0] "2021-08-21"
    ],
    "message" => "<Events UTCOfs=\"+0200\"Number=\"123\" Number2=\"1234\" Version=\"V1\" EventStartDate=\"2021-08-21\">\r",
    "Version" => [
        [0] "V1"
    ]
}
{
       "event" => {},
         "log" => {
        "file" => {}
    },
    "Severity" => "Info",
     "message" => "  <E L=\"1\" T=\"00:00:01.275\" S=\"Information\" E=\"0\" Usr=\"\" ></E>\r"

And I would like to get the attributes, for example "Date" = 2021-08-21 or "Version" = V1, from the Events element onto my E element with the Severity attribute. (I thought aggregate could be the easiest solution.)

For my use case there will be more attributes to add to the E element fields, but I just wanted a brief introduction to this topic and will try to solve the rest by myself afterwards.

    xpath => {
        "/Events/@EventStartDate" => "Date"
        "/Events/@Version" => "Version"
        "/E/@S" => "Severity"
    }

will pull the attributes off the elements, but to use aggregate you need a task id that both messages have, and you do not appear to have one.
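To illustrate the pattern: if both lines did carry a shared id, it would look something like this (a sketch only; [taskid] is a hypothetical field that your data does not contain, and aggregate also requires running with a single pipeline worker):

filter {
  if [Date] {
    # hypothetical "parent" event: remember the Events attributes in the map
    aggregate {
      task_id => "%{taskid}"          # hypothetical shared field
      code => "map['Date'] = event.get('Date'); map['Version'] = event.get('Version')"
      map_action => "create"
    }
  }
  if [Severity] {
    # hypothetical "child" event: copy the remembered attributes onto it
    aggregate {
      task_id => "%{taskid}"
      code => "event.set('Date', map['Date']); event.set('Version', map['Version'])"
      map_action => "update"
    }
  }
}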

Ah okay, thanks,

any suggestions for what I could use to solve my problem? Is there any way to combine the elements so I can use them for evaluation in Kibana?

I cannot see a way to do it.


Sounds like you are trying to combine fields from different events into the same document in Elasticsearch?

If that is the case, my suggestion would be to generate a unique document id, e.g. using fingerprint filter, and then index both events into the same document in Elasticsearch.
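Something along these lines might work as a starting point (a sketch only; [Version] as the key and the target name are just illustrations, the right source fields depend on your data):

filter {
  fingerprint {
    method => "SHA1"
    # illustration: use whichever value is identical on the events that
    # should end up in the same Elasticsearch document
    source => ["Version"]
    target => "[@metadata][generated_id]"
  }
}

The generated value can then be used as the document_id in the elasticsearch output so that both events are written to the same document.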


Hi hendry,

I will have a look at the fingerprint filter, thank you! My use case is to read in all the elements and afterwards run queries to evaluate the data.

For example: StartDate <= 2021-08-21 and Version : V1 and Severity : Info

So, like you said, I might have to generate such an ID and index the events together.

I tried to use the fingerprint filter today, but I still have some issues and also some syntax/understanding problems.

fingerprint {
  method => "SHA1"
  source => ["[Version]", "[Severity]"]
}

So my fingerprint configuration looks like this, but as mentioned in the docs:

"This example produces a single fingerprint that is computed from "birthday" (which would be Severity here), the last source field."

Issue: I am only receiving the last line of my logs as a document, but with all the information from /Events and /E. (My .xml has around 200,000 lines.)

So it worked to combine the Events element with the E element.
However, my use case would be to have all log lines from my file with the information from both elements (Events & E).

 <Events UTCOfs="+0200" Number="123" Number2="1234" Version="V1" EventStartDate="2021-08-21">
 <E L="1" T="00:00:01.275" S="Information" E="0" Usr="" ></E>
 <E L="2" T="00:07:21.986" S="Warning" ></E>
 

My question here is how to configure the fingerprint filter so that I get something like a loop which outputs the /E elements combined with /Events, from the first to the last log line. I am also unsure about the output section and whether

document_id => "" 
		doc_as_upsert => true
		action => "update"

are useful for my kind of configuration.
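The way I understand it, these options would fit together roughly like this (the hosts and index values are just placeholders, and fingerprint is the default target field of the fingerprint filter):

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]   # placeholder
    index => "xml-events"                # placeholder index name
    document_id => "%{fingerprint}"      # value produced by the fingerprint filter
    doc_as_upsert => true                # create the document if it does not exist yet
    action => "update"                   # merge fields into the existing document
  }
}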
