Failed PUT to /_ingest/pipeline with custom JSON

hjfeng1988 · June 25, 2018, 8:08am

I wrote a custom JSON file based on the Filebeat 6.3.0 module config file /usr/share/filebeat/module/system/syslog/ingest/pipeline.json

{
   "description": "Pipeline for parsing /var/log/cmd.log.",
   "processors": [{
     "grok": {
       "field": "message",
       "ignore_missing": true,
       "patterns":[
         "%{SYSLOGTIMESTAMP:cmd.timestamp} %{DATA:cmd.hostname} %{USER:cmd.login_user}\\[%{POSINT:cmd.pid}\\]: %{DATA:cmd.tty} \\(%{IP:cmd.login_ip}\\) %{USER:cmd.run_user} %{DATA:cmd.pwd} # %{GREEDYDATA:cmd.command}"
       ]
     }
   }, {
     "remove": {
       "field": "message"
     }
   }, {
     "date": {
       "field": "cmd.timestamp",
       "target_field": "@timestamp",
       "formats": [
         "MMM  d HH:mm:ss",
         "MMM dd HH:mm:ss"
       ],
       {< if .convert_timezone >}"timezone": "{{ beat.timezone }}",{< end >}
       "ignore_failure": true
     }
   }]
 }

When I run curl -H 'Content-type:application/json' -XPUT 'http://172.16.2.97:9200/_ingest/pipeline/cmd_log?pretty' -d@cmd_log.json, I get this error:

{
   "error" : {
     "root_cause" : [
       {
         "type" : "parse_exception",
         "reason" : "Failed to parse content to map"
       }
     ],
     "type" : "parse_exception",
     "reason" : "Failed to parse content to map",
     "caused_by" : {
       "type" : "json_parse_exception",
       "reason" : "Unexpected character ('{' (code 123)): was expecting double-quote to start field name\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@56ffa838; line: 1, column: 614]"
     }
   },
   "status" : 400
 }

When I delete the line {< if .convert_timezone >}"timezone": "{{ beat.timezone }}",{< end >} from my cmd_log.json, the curl -XPUT succeeds. How can I make it work while keeping the {< if .convert_timezone >}"timezone": "{{ beat.timezone }}",{< end >} line?
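To see where the parser stops before sending the file to Elasticsearch, the file can be checked locally. The following is a minimal sketch, assuming a shortened copy of the date processor from the file above; find_json_error is a hypothetical helper name, not part of Filebeat or Elasticsearch.

```python
import json

def find_json_error(text):
    """Return (line, column, message) of the first JSON syntax error, or None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return err.lineno, err.colno, err.msg
    return None

# Shortened copy of the date processor, still containing the Filebeat template line.
snippet = '''{
  "date": {
    "field": "cmd.timestamp",
    {< if .convert_timezone >}"timezone": "{{ beat.timezone }}",{< end >}
    "ignore_failure": true
  }
}'''

result = find_json_error(snippet)
if result is not None:
    line, column, message = result
    print(f"line {line}, column {column}: {message}")
```

The reported line is the one holding the {< if ... >} block, which matches the "Unexpected character ('{')" error returned by Elasticsearch.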

jsoriano · June 25, 2018, 10:31am

Hi @hjfeng1988,

Module pipelines distributed with Filebeat can contain templates that Filebeat interprets during setup. If you use these files as a base for your own pipelines, you have to replace these templates with the result they would produce.
In this case, the template adds the timezone if it needs to be converted. If you have enabled convert_timezone in your module configuration, replace this line with "timezone": "{{ beat.timezone }}",. If you didn't enable it, just remove the line.
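The manual replacement described above could be scripted roughly as follows. This is only a sketch, not how Filebeat itself renders module pipelines: it handles just the simple {< if .name >}...{< end >} form seen in this thread (no else branches, no nesting), and render_pipeline and the settings dictionary are hypothetical names.

```python
import json
import re

# Matches blocks such as {< if .convert_timezone >}...{< end >}
CONDITIONAL = re.compile(
    r"\{<\s*if\s+\.(\w+)\s*>\}(.*?)\{<\s*end\s*>\}", re.DOTALL
)

def render_pipeline(text, settings):
    """Keep a block's body when its setting is truthy, otherwise drop the block."""
    def substitute(match):
        name, body = match.group(1), match.group(2)
        return body if settings.get(name) else ""
    rendered = CONDITIONAL.sub(substitute, text)
    json.loads(rendered)  # raises ValueError if the result is still not valid JSON
    return rendered

template = '''{
  "date": {
    "field": "cmd.timestamp",
    {< if .convert_timezone >}"timezone": "{{ beat.timezone }}",{< end >}
    "ignore_failure": true
  }
}'''

without_tz = json.loads(render_pipeline(template, {"convert_timezone": False}))
with_tz = json.loads(render_pipeline(template, {"convert_timezone": True}))
```

Either rendered result is plain JSON and can be saved as cmd_log.json before running the curl -XPUT command.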

hjfeng1988 · June 26, 2018, 2:10am

Hi @jsoriano,
How exactly do I enable convert_timezone in my module configuration? My cmd_log.json is at /usr/share/filebeat/cmd_log.json. Do you have any suggestions about where the JSON file should live?

jsoriano · June 26, 2018, 8:14am

Some modules, like the system one, allow you to set this convert_timezone option. You would need it if your server uses a local timezone and you want the collected timestamps in UTC.

hjfeng1988 · June 27, 2018, 8:28am

Hi @jsoriano,
Thanks a lot. My timezone isn't UTC, so I think it is best to delete the line.

hjfeng1988 · June 27, 2018, 8:39am

Another problem is that the Kibana Discover page shows cmd.timestamp, cmd.hostname and so on all inside a single field, cmd.
One record looks like this:

offset:323,196 prospector.type:log source:/var/log/cmd.log input.type:log @timestamp:June 23rd 2018, 22:22:43.908 host.name:HD1_sh_tomcat3 beat.hostname:HD1_sh_tomcat3 beat.name:HD1_sh_tomcat3 beat.version:6.3.0 cmd:{ "hostname": "HD1_sh_tomcat3", "login_user": "lzl", "tty": "pts/0", "pid": "21619", "run_user": "root", "pwd": "/root", "login_ip": "117.30.93.67", "command": "/data/script/ctrl_tomcat.sh shanghu_job inc_update ROOT_20180623_001.zip", "timestamp": "Jun 23 22:22:41" } _id:GggGLWQBySP3f9eNmlet _type:doc _index:filebeat-6.3.0-2018.06.23 _score: -

I expected each value to appear under its own key. Does anyone know what causes this?

jsoriano · June 27, 2018, 3:30pm

When using grok, the field names are part of the patterns themselves. If you define your own pattern, you can decide which field names to use.

hjfeng1988 · June 28, 2018, 1:47am

Sorry, I'm a novice. Could you explain how to do this in more detail?

jsoriano · June 28, 2018, 8:53am

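In the pipeline config you posted, there was a grok pattern:

"%{SYSLOGTIMESTAMP:cmd.timestamp} %{DATA:cmd.hostname} %{USER:cmd.login_user}\\[%{POSINT:cmd.pid}\\]: %{DATA:cmd.tty} \\(%{IP:cmd.login_ip}\\) %{USER:cmd.run_user} %{DATA:cmd.pwd} # %{GREEDYDATA:cmd.command}"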

You can see that cmd.timestamp, cmd.hostname and so on appear there. These are the names used for the fields in the events that you later see in Kibana. If you change them in the pattern, they will also change in the events.
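As a rough illustration of how the names in the pattern become the field names of the event, here is a Python regular expression that stands in for the grok pattern. It is not the grok engine and only approximates SYSLOGTIMESTAMP, USER, IP and the other grok patterns; the sample line is the one posted in this thread, and parse and its prefix parameter are hypothetical.

```python
import re

# Approximation of the grok pattern with Python named groups.
LINE = re.compile(
    r"(?P<timestamp>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<hostname>\S+) "
    r"(?P<login_user>[\w.-]+)\[(?P<pid>\d+)\]: "
    r"(?P<tty>\S+) "
    r"\((?P<login_ip>[\d.]+)\) "
    r"(?P<run_user>[\w.-]+) "
    r"(?P<pwd>\S+) # "
    r"(?P<command>.*)"
)

def parse(line, prefix="cmd"):
    """Return the captured fields nested under `prefix`, or None if the line does not match."""
    match = LINE.match(line)
    if match is None:
        return None
    return {prefix: match.groupdict()}

sample = ("Jun 29 09:31:14 HD1_sh_tomcat3 hjfeng[16226]: "
          "pts/0 (117.25.173.75) hjfeng /home/hjfeng # exit")
event = parse(sample)
print(event["cmd"]["login_user"], event["cmd"]["command"])
```

Changing the group names or the prefix changes the keys of the resulting event, in the same way that changing cmd.login_user in the grok pattern changes the field stored in Elasticsearch.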

If you don't want to modify this grok expression, another option is to add a rename processor, which lets you rename any field.
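A rename step could be appended to the pipeline body roughly like this. It is a sketch under assumptions: cmd.login_user is a field from this thread, while cmd.user is a made-up target name, with_rename is a hypothetical helper, and the grok pattern is left out for brevity.

```python
import json

# "cmd.login_user" comes from the thread; "cmd.user" is an invented example target.
rename_step = {
    "rename": {
        "field": "cmd.login_user",
        "target_field": "cmd.user",
    }
}

def with_rename(pipeline, step):
    """Return a copy of the pipeline with `step` appended after the existing processors."""
    return {**pipeline, "processors": [*pipeline.get("processors", []), step]}

pipeline = {
    "description": "Pipeline for parsing /var/log/cmd.log.",
    # The real grok pattern from the post is omitted here.
    "processors": [{"grok": {"field": "message", "patterns": []}}],
}

body = json.dumps(with_rename(pipeline, rename_step), indent=2)
print(body)
```

The printed JSON is what would be sent in the PUT request to /_ingest/pipeline/cmd_log.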

In general, you need to be careful when choosing names or renaming fields. For some well-known fields we add mappings that help handle the data type, so it is usually better to leave the field names as they are. There is also the risk of creating mapping conflicts: two values of different types cannot be stored under the same name (e.g. a string cannot be stored as an integer). We are defining a common schema that can serve as a guideline for the names to use.

hjfeng1988 · June 28, 2018, 9:20am

I'm sorry, I think you misunderstood me. I need Kibana to show the fields in this format:

cmd.hostname:HD1_sh_tomcat3 cmd.login_user:lzl cmd.tty:pts/0 cmd.pid:21619 cmd.run_user:root 
 cmd.pwd:/root cmd.login_ip:117.30.93.67 cmd.command:/data/script/ctrl_tomcat.sh shanghu_job inc_update ROOT_20180623_001.zip

Not as cmd:{"hostname": "HD1_sh_tomcat3", "login_user": "lzl"...}

jsoriano · June 28, 2018, 10:26am

Yes, I don't think I fully understand. Why do you want this format?

hjfeng1988 · June 28, 2018, 10:35am

I want to use the add button in the field list on the left to add fields such as cmd.login_user and sort by them.

jsoriano · June 28, 2018, 11:21am

OK, I see now. This is weird; it looks as if your object was converted to plain text. What does your architecture look like? Are you sending directly from Filebeat to Elasticsearch?

hjfeng1988 · June 29, 2018, 1:31am

Yes, I am sending directly from Filebeat to Elasticsearch. This is my original log line:

Jun 29 09:31:14 HD1_sh_tomcat3 hjfeng[16226]: pts/0 (117.25.173.75) hjfeng /home/hjfeng # exit

jsoriano · June 29, 2018, 11:38am

Could you share one of these events containing cmd as returned by Elasticsearch?

To query Elasticsearch you can use the console in the Dev Tools tab in Kibana, and the query to obtain an object with the cmd field would be something like this:

GET /filebeat-6.3.0-*/_search
{
  "query": { "exists": { "field": "cmd" } },
  "size": 1
}

hjfeng1988 · July 2, 2018, 1:35am

GET /filebeat-6.3.0-2018.06.28/doc/KvTqRWQBGLmoLez8cHKn

{
  "_index": "filebeat-6.3.0-2018.06.28",
  "_type": "doc",
  "_id": "KvTqRWQBGLmoLez8cHKn",
  "_version": 1,
  "found": true,
  "_source": {
    "offset": 5545,
    "prospector": {
      "type": "log"
    },
    "source": "/var/log/cmd.log",
    "input": {
      "type": "log"
    },
    "@timestamp": "2018-06-28T10:22:28.541Z",
    "host": {
      "name": "HD1_sh_tomcat3"
    },
    "beat": {
      "hostname": "HD1_sh_tomcat3",
      "name": "HD1_sh_tomcat3",
      "version": "6.3.0"
    },
    "cmd": {
      "hostname": "HD1_sh_tomcat3",
      "login_user": "hjfeng",
      "tty": "pts/2",
      "pid": "9473",
      "run_user": "root",
      "pwd": "/usr/share/filebeat",
      "login_ip": "117.25.173.75",
      "command": "# id",
      "timestamp": "Jun 28 18:22:27"
    }
  }
}

jsoriano · July 2, 2018, 4:55pm

Ok, the stored event is in the format you expect. Then the issue is in the fields list of your index pattern.

Open the Index Patterns view in the Kibana Management tab, select your filebeat* index pattern, and look for the cmd field. You will probably find it with the type string; this is what makes Kibana handle the field as a string instead of as an object.

This means that in some of the indexes matching the filebeat* pattern there are (or there were) events with a string instead of an object in the cmd field. Could this be the case for you?
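To make the conflict concrete, the following sketch classifies the shape of the cmd value per index for a handful of in-memory documents. It does not query Elasticsearch; the documents and the second index name are invented for illustration, and field_shapes and has_conflict are hypothetical helpers.

```python
from collections import defaultdict

def field_shapes(docs_by_index, field="cmd"):
    """Map each index name to the set of shapes ("object", "str", ...) seen for `field`."""
    shapes = defaultdict(set)
    for index, docs in docs_by_index.items():
        for doc in docs:
            if field in doc:
                value = doc[field]
                shapes[index].add("object" if isinstance(value, dict) else type(value).__name__)
    return dict(shapes)

def has_conflict(shapes):
    """True when the field appears with more than one shape across all indexes."""
    seen = set()
    for index_shapes in shapes.values():
        seen |= index_shapes
    return len(seen) > 1

# Invented sample data: one index with structured cmd, one older index with a plain string.
docs_by_index = {
    "filebeat-6.3.0-2018.06.28": [{"cmd": {"login_user": "hjfeng", "pid": "9473"}}],
    "filebeat-example-older-index": [{"cmd": "id"}],
}

shapes = field_shapes(docs_by_index)
print(shapes, has_conflict(shapes))
```

When the same field name holds an object in one index and a string in another, an index pattern covering both cannot give the field a single consistent type.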

You can try clicking the Refresh fields list button. If there are no longer any events with this field, it will disappear from the list. If it doesn't disappear, you still have events with this field. To find the index that contains them, go to the Console in the Dev Tools tab and look for the different mappings of this field with:

GET /filebeat*/_mapping/field/cmd

The indexes where you have the structured fields (like cmd.hostname and so on) shouldn't appear in the results of this query.

Once you have found it, you can decide what to do with this index: reindex it to rename the field, remove it completely, and so on.

hjfeng1988 · July 3, 2018, 3:49am

Thanks very much.
