Field munging

So here's what I've got. I am monitoring two files via Logstash, and each has a field "service". I see the combinations of results below in Kibana, where a dash means no result:

ntp,-
-,ntp
ntp,ntp_google
-,-

I'd like to do... something, but I'm not sure what. In plain English, I'd really like "if you have any result, just show that one result", which would address the first two. For the third, "if you have a result with an underscore, just show that one". Lastly, "if you have two dashes, rename that to 'Unknown'". I don't even know if it's possible, and truth be told, if it's a boatload of work in Logstash then meh, I'll just deal with it as it is. Thank you.

the below combinations of results in Kibana

It's very hard to understand what you mean. What do the documents contain? What query are you using in Kibana? Are you seeing this in the Discover tab? A screenshot might help.

Yeah, I guess I could have made this clearer. OK, so here are the Logstash bits:

input {
	# Bro connection log; events from this file are tagged with type "connlog"
	file {
		type => "connlog"
		path => "/opt/bro/spool/bro/conn.log"
		sincedb_path => "/var/lib/logstash/.sincedbconn"
	}
}

This has a grok match (snips added for clarity):

match => [ "message", "(<snip>\t(?<service>(.*?)) <snip> \t(?<service>(.*))" ]

I've named them both service since they ARE services (ssl for the first and, for example, ssl_facebook for the second). This is how Kibana displays them:

So again... I can live with this. The service has at least been identified by one field or the other. If it hasn't been, then I know it's something unknown and I can dig deeper into it. My question is whether I can clean this up so that a) I only have a single entry, preferring the service with a "_" if it exists, b) the "-," or ",-" is removed if present, and c) "-,-" is changed to "Unknown". Thanks... sorry I wasn't very clear here.

  • So the service field should really never be an array then? Since you want to remove all hyphen entries and in the last example want to keep "ssl_google" it seems service should always be a string.
  • Please show some example log entries. The simplest solution might (depending on the answer to the previous bullet) be to adjust the grok expression.

Thanks for looking at this Magnus. Here's a sample:

1476643999.554721	CCds0e2aAntQPGzZn2	192.168.1.2	13062	192.168.1.253	53	udp	dns	0.045913	37	213	SF	T	T	0	Dd	1	65	1	241	(empty)	dns_google

I'm basically cheating here... the last "service" entry is actually called "protosigs", but I called it service so they'd end up in the same place. The goal is to be able to build a date histogram split by the service term, since currently we can only use one term in Kibana (besides Timelion, which continues to mystify me). Case in point:

Thanks again Magnus...I tried using join => { "service" => "," }, but I think that only applies to arrays.

Maybe capture the last field into a separate temporary field that you overwrite the service value with if set?
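A rough sketch of how that suggestion could look, in case it helps. It assumes the grok pattern is changed so the last column is captured as "protosigs" (its real name, per the earlier post) instead of a second "service", which leaves both fields as plain strings rather than an array. The field name "protosigs" and the "Unknown" label are assumptions taken from this thread, not anything Logstash requires, and this is untested against the actual conn.log pattern:

```
filter {
  # Assumes grok now yields two separate string fields:
  #   service   - the service column (e.g. "dns" or "-")
  #   protosigs - the last column (e.g. "dns_google" or "-")
  if [protosigs] and [protosigs] != "-" {
    # The last column is set, so let it overwrite service
    # ("ntp,ntp_google" -> "ntp_google", "-,ntp" -> "ntp").
    mutate { replace => { "service" => "%{protosigs}" } }
  } else if ![service] or [service] == "-" {
    # Neither column identified the traffic ("-,-" -> "Unknown").
    mutate { replace => { "service" => "Unknown" } }
  }

  # Drop the temporary field once it has been folded into service.
  mutate { remove_field => [ "protosigs" ] }
}
```

With this in place the "ntp,-" case needs no special handling: protosigs is "-", service is already "ntp", so neither branch fires and service stays a single string that Kibana can use as one term.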

That's a neat idea... not sure how I would do that in Logstash... I'll do some research. Thank you Magnus!