Sending multiple fields per file... possible?


(Alejandro Olivan) #1

Hi forum.... seems this is the first post here!!!

OK...
From reading blogs and howtos, I'm confident we can have a cluster of servers send their logs and set up a "type" field, with a value, to tell which file each log entry came from... Good! But... could we send more than one "type" field?

Let me explain: here (I think this works...) we send logs from each file, paired with an identifying "type" field.

"files": [
{
"paths": [
"/var/log/service1"
],
"fields": { "type": "service1" }
}
]
"files": [
{
"paths": [
"/var/log/service2"
],
"fields": { "type": "service2" }
}
]

But I want more! Could we add more "type" or "tag" fields? I would find this very useful!!! That way I could filter logs by both service and server... The idea is:

"files": [
{
"paths": [
"/var/log/service1"
],
"fields": { "type": "service1", "type":"server1" }
}
]
"files": [
{
"paths": [
"/var/log/service2"
],
"fields": { "type": "service2", "type":"server1" }
}
]


(Allan Mitchell) #2

Hi

If I understand correctly you want to add "something" to your documents to describe the document. You want to assign these identifiers in Logstash. There could be n identifiers per document. How about adding an array of identifiers?
Would the following work for you?

  mutate {
    add_field => {
      "types" => "type 1"
    }
  }

  mutate {
    add_field => {
      "types" => "type 2"
    }
  }

or you could add tags

  mutate {
    add_tag => [ "tag 1", "tag 2"]
  }
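
As a rough sketch of where blocks like these live: mutate is a filter plugin, so it goes inside the filter section of the Logstash (receiving-side) pipeline configuration, and it can be wrapped in a conditional so that it only applies to some events. The "service1" value is borrowed from the first post and assumes the sender already set a "type" field; the tag name is made up for illustration.

```
filter {
  # Only touch events whose "type" field equals "service1"
  # (assumption: the sending side set this field).
  if [type] == "service1" {
    mutate {
      # "from_service1" is a hypothetical tag name.
      add_tag => [ "from_service1" ]
    }
  }
}
```

Without the surrounding conditional, the mutate block would run on every event that passes through the pipeline.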

Allan


(Mark Walkom) #3

You cannot have an array of _types, if you are referring to the metadata field that Elasticsearch uses.

The best option would be to use what Allan suggested and create an extra type or tags field.


(Alejandro Olivan) #4

Hi! thank you very much for your help!

My questions probably come from not knowing the fundamentals... Obviously, what I'm missing is that I have not managed to discriminate by log type / log source pair on the receiving (logstash) side.
That's why I'm trying to add more and more tags/fields on the logstash-forwarder sending side.
Right now, I'm using different UDP ports to send different log types from the same host in order to discriminate at the end point...

I do not understand how and where you use those mutate instructions... It's not clear to me:
If I'm not wrong, and in my short experience, mutate instructions are executed once certain filtering criteria have been met, and they are logstash / receive-side (not logstash-forwarder / send-side!) stuff... so here comes the point I'm missing:
If I cannot discriminate / mark on the sending side, I cannot filter by my marks on the receiving side!

Overall, in my head the picture is something like this:
If server 1 has services A, B and C, and server 2 has services A and B, there should be a way to "mark" logs as they are all sent from different points to a single common end point, so on the receiving logstash side I can filter like: Hey! this log has tags/fields "server1" and "serviceA", filter it that way! or Hey! this log has tags/fields "server2" and "serviceA", filter it that way! and so on...
Also, that way, on the query/kibana side I could ask for every entry from service A on all servers, or query for service A just on server 2, or for all logs from "server1", and so on...

Right now, I'm discriminating only by port, and/or by looking for fields in the log that include the IP address of the sending server (for instance firewall logs): this is poor and ugly! If a log does not give any clue of its origin, I'm unable to discriminate by log content and I have to rely on the port (something I learned by googling around)... I'm sure there must be a way to handle log traffic in an ordered and elegant way!

Best regards and good work!


(Magnus Bäck) #5

Yes, you can and should tag your messages for easy filtering. It's just that the type field can't be an array so you have to add additional fields. It seems you want to discriminate based on message type and message source, so why not define a source field in addition to type? Rewriting your original example:

"files": [
  {
    "paths": [
      "/var/log/service1"
    ],
    "fields": {
      "source": "service1",
      "type":"server1"
    }
  },
  {
    "paths": [
      "/var/log/service2"
    ],
    "fields": {
      "source": "service2",
      "type": "server1"
    }
  }
]
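
On the receiving side, a filter along these lines could then branch on both fields. This is only a sketch under a few assumptions: that "type" names the service and "source" names the server, that both arrive as top-level event fields, and that the tag names (which are hypothetical) suit your setup.

```
filter {
  if [type] == "service1" and [source] == "server1" {
    # service1 logs coming from server1 only
    mutate { add_tag => [ "service1_on_server1" ] }
  } else if [type] == "service2" {
    # service2 logs from any server
    mutate { add_tag => [ "service2_any_server" ] }
  }
}
```

Because the service and the server live in separate fields, each one can be tested alone or in combination, which is what a single "type" field cannot offer.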

(Alejandro Olivan) #6

GREAT!!!

I like that clear explanation!

I'm starting to get the picture in my mind... So type can't be an array... BUT we are free AT THE SOURCE SIDE to inject additional fields into our log lines BEFORE they leave... That way, I imagine I have to look for these fields on the receiving side, in the early stages of filtering, add tags, and let those logs be handled by their filter files...

Looks promising, thank you very much!

I will test it as soon as I get all the pieces of the puzzle clear!

