Kibana .raw fields and "message" field

Me again, forum...

I have optimized my filters... overall very good, but now I have a Kibana problem with space-separated fields.
Reading around, I have realized I have "lost" those .raw fields...
Now I know they were there for a reason.

I suspect the problem is related to the fact that I have followed the practice of deleting the "message" field at the end of filtering.
Basically, this is the most evident change I made to the actual filtering (some cleanup, _grokparsefailure handling, output index dressing...)

How can I get those .raw fields back? Or how can I make my resulting string fields be handled well by Kibana?
Or maybe can we "force" our extracted message fields to be treated as .raw?

Thank you very much.
Best regards!

With the exception of the message field (as of Logstash 1.5+), all string fields come with a .raw subfield. If you don't have them, it's because of something you've done. Here's the part of the default mapping template that creates the .raw subfields:
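(abridged from Logstash 1.5's default elasticsearch-template.json; details may vary slightly by version)

    "string_fields" : {
      "match" : "*",
      "match_mapping_type" : "string",
      "mapping" : {
        "type" : "string", "index" : "analyzed", "omit_norms" : true,
        "fields" : {
          "raw" : { "type" : "string", "index" : "not_analyzed", "ignore_above" : 256 }
        }
      }
    }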

... aha...
To be honest, nothing in my filters suggests (at least to me) that anything is done related to the .raw fields.
Nothing about them is present...
I will try commenting out the lines that delete the "message" field at the end of the filters...
That is the only relevant action. Maybe by keeping the message field, the .raw fields are kept with it...

Will report my experiments...

OK... a little more googling (after confirming that the "message" field deletion is not related to my missing .raw fields) has given me some clues...

The problem may come from the fact that I have reorganized my indices into a series of new elasticsearch output entries in the configuration.
From what I read around, if you go beyond the default logstash-YYYY.MM.DD index tree... then the .raw fields are lost, since no "template" applies to your custom indices... (that's my 11:00 AM noob theory! :sweat_smile: )...
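(If this theory is right, it should show up when inspecting the installed templates; something like this, assuming the default template was registered under the name "logstash":)

    curl -XGET 'http://localhost:9200/_template/logstash?pretty'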

While still researching it... could anyone advise on where I am going wrong?
Should I create new templates for the new indices? Can I just rename/reuse the existing ones?

Thank you all... best regards!

... I have copied/renamed elasticsearch-template.json to my custom service1.json, service2.json... templates, in order to apply them to the outputs for indices named that way... but I am missing something... it does not work...

Would it be possible to take advantage of the default logstash-* template to create indices like logstash-service1-%{+YYYY-MM-dd}, logstash-service2-%{+YYYY-MM-dd}, logstash-service3-%{+YYYY-MM-dd} and still get the .raw fields?

Otherwise, managing custom templates is, for some reason, refusing to work...
My setup is something like this...

....
} else if "PFSense-2.2" in [tags] {
  elasticsearch {
    host => "127.0.0.1"
    cluster => "example"
    index => "pfsense-%{+YYYY-MM-dd}"
    template => "/opt/logstash/lib/logstash/outputs/elasticsearch/elasticsearch-pfsense.json"
    template_name => "pfsense"
  }
} else if "SuricataIDPS" in [tags] {
  elasticsearch {
    host => "127.0.0.1"
    cluster => "example"
    index => "suricata-%{+YYYY-MM-dd}"
    template => "/opt/logstash/lib/logstash/outputs/elasticsearch/elasticsearch-suricata.json"
    template_name => "suricata"
  }
} else if "stream" in [tags] {
  elasticsearch {
    host => "127.0.0.1"
    cluster => "example"
    index => "stream-%{+YYYY-MM-dd}"
    template => "/opt/logstash/lib/logstash/outputs/elasticsearch/elasticsearch-stream.json"
    template_name => "stream"
  }
....
... where those .json files are modified versions of elasticsearch-template.json...
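For reference, the change in each copy is presumably the template match pattern at the top of the JSON, so that it matches the custom index names; a sketch for the pfsense one:

    {
      "template" : "pfsense-*",
      "settings" : { ... },
      "mappings" : { ... }
    }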

EDIT:
FINALLY SOLVED.

Although not as elegant as I wished, I finally got the .raw fields back by creating my indices with a common naming convention that matches the default Logstash index template... since it uses a wildcard (logstash-*), any logstash-SERVICEn-%{+YYYY.MM.dd} matches the Elasticsearch default template... and the .raw fields are back...
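For example, the pfSense output from my earlier excerpt becomes something like this (the template lines dropped, and the index renamed so it matches logstash-*):

} else if "PFSense-2.2" in [tags] {
  elasticsearch {
    host => "127.0.0.1"
    cluster => "example"
    index => "logstash-pfsense-%{+YYYY.MM.dd}"
  }
}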

Kibana dashboards fill terms nicely now!

I have the same problem, and I don't completely understand your replies about solving it. I get logs from pfSense and they show in Kibana, but they don't show up in the rows and charts; I think the format is the problem.
I see the logs in Kibana like this:

{"message":"<134>Aug 22 16:10:43 filterlog: 253,1XXXXXX,XXXXXXXX,emX,match,block,in,4,0x0,,128,4132,0,none,17,udp,x,192.168X>X,192.168.x.x5,137,137,58","@version":"1","@timestamp":"2015-08-22T11:39:51.766Z","type":"syslog","host":"1xxxxx"}

How do I fix it? :frowning:

Hi mega_robo

To be honest... I think your problem is unrelated to the one I originally dealt with in this thread.
My problem was that my .raw fields were lost... let me explain:
Suppose we have a field named 'bytes' from the logs... OK, then a 'bytes.raw' is automagically generated by Logstash for several purposes.
The 'magic' behind this is that there exists a default set of log processing (...including the .raw field creation...), and it is performed on any data landing in an Elasticsearch database/storage IF and ONLY IF the index naming sticks to the default, that is: logstash-%{+YYYY.MM.dd} (which is the way it always appears in howtos, and seems to be what everybody in the universe uses).
My problem was that I had decided to create my own naming convention for a custom Elasticsearch organization, and once my naming didn't match the expected structure, the default processing didn't occur... and my .raw fields stopped appearing.
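Concretely, with the default template applied, a string field like 'bytes' ends up mapped roughly like this (a sketch of the resulting mapping):

    "bytes" : {
      "type" : "string",
      "index" : "analyzed",
      "fields" : {
        "raw" : { "type" : "string", "index" : "not_analyzed" }
      }
    }

The analyzed field is what you search on; the not_analyzed .raw subfield is what Kibana terms panels can aggregate on without splitting values on spaces.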

By reading your post, I understand you have some kind of filtering/grokking issue, which is completely different... but maybe I need more details to understand what is happening in your setup.

Best regards.

Hi @alexolivan, I have the same problem as you. But if you call your index something that matches the "logstash-*" pattern, e.g. "logstash-Service-%{+YYYY.MM.dd}", then won't results from this index be included in the default "logstash-*" index pattern as well? Your "logstash-*" view will be contaminated with data from your secondary index!..

Hi...

I remember that I was unable to do what I initially wanted:
I was unable to change the default behaviour, and I ended up using logstash-service-%{} index naming in order for it to work, as in this excerpt of my output config

} else if "php5mail" in [tags] {
  elasticsearch {
    host => "172.16.0.25"
    cluster => "MyCluster"
    index => "logstash-php5mail-%{+YYYY.MM.dd}"
  }
} else if "centovacastmatch" in [tags] {
  elasticsearch {
    host => "172.16.0.26"
    cluster => "MyCluster"
    index => "logstash-centovacast-%{+YYYY.MM.dd}"
  }
} else if "centovaerrormatch" in [tags] {
  elasticsearch {
    host => "172.16.0.27"
    cluster => "MyCluster"
    index => "logstash-centovaerror-%{+YYYY.MM.dd}"
  }

Note the index => "logstash-SERVICE-%{+YYYY.MM.dd}" syntax, indicating that I'm sticking with default filtering / index setup and using it as expected.

I'm still on Elasticsearch 1.5 in production, though, so this may be outdated.

Remember that my way of working is based on early tagging!

To avoid contamination you have to do some adequate TAGGING PLANNING.
I tag log lines as soon as they are ingested, at input-time filtering (this is what I mean by the 'service tag' concept / approach)... THERE is where I decide the final index "family" they will end up in... SO:
lines read from /var/log/apache2 get an "apache" tag added at input filter time, lines read from /var/mail get a "postfix" tag added at input filter time, and so on...

Afterwards, I add other tags to allow future filtering... tags such as "server1", "cluster3", "customerX", "hosting", etc. (remember you can add as many tags as you want), which may differ from server to server and allow future data filtering / analysing / grouping... but the fact is that every server reading an apache log line will mark it with an "apache" tag! That is the log line's distinguishing 'service tag'!

Of course, the log line is analyzed using adequate filters (with adequate grok patterns) that are selected based (again) on the 'service tag', exploding the log line into concrete data fields...

Finally, at output time, and again using the 'service tag', I send logs tagged "apache" to a logstash-apacheXXXXXX index (see the sketch at the end of this post).

There is NO CONTAMINATION: only apache lines have the "apache" tag... postfix lines do not have the "apache" tag, but the "postfix" tag instead, so there is a uniqueness factor that prevents it from happening.
It is simple and works perfectly for me...
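A minimal sketch of the whole flow (the paths, tags, and addresses here are made up for illustration):

input {
  file {
    path => "/var/log/apache2/access.log"
    tags => ["apache", "server1", "hosting"]   # 'service tag' first, then grouping tags
  }
  file {
    path => "/var/log/mail.log"
    tags => ["postfix", "server1"]
  }
}

filter {
  # filters are selected by the 'service tag'
  if "apache" in [tags] {
    grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }
  }
}

output {
  # the 'service tag' also decides the index family
  if "apache" in [tags] {
    elasticsearch {
      host => "172.16.0.25"
      cluster => "MyCluster"
      index => "logstash-apache-%{+YYYY.MM.dd}"
    }
  }
}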