Elasticsearch custom index and mapping | local JSON data visualization issue

Hi!

I have set up Elasticsearch, Kibana, and Filebeat. With the other Beats, I am able to collect logs and visualize them.
But I want to be able to upload logs that are in JSON format and visualize them.
The logs that I want to use are the DARPA Transparent Computing data (TRACE dataset).

I have tried changing the Filebeat config file to ship a small part of the logs, but it doesn't work. Kibana shows "no data to display".

I am quite sure the mapping doesn't match the data. I was able to find the schema of the DARPA data, but it is in Avro format, which Elasticsearch doesn't support directly (its mapping format differs).

So I made an Elasticsearch-compatible mapping.
When I use the console to create a new index and provide this mapping, it throws a "Root mapping definition has unsupported parameters" error.

I then tried adding it to the Filebeat config file by defining a new index and a mapping there. When I ran the Filebeat setup to upload the template and index, everything went fine and the index was created, but the mapping was the default Filebeat mapping and not the one I wanted.

Please help me out with this. I am at my wit's end.

I need either a way to use my own mapping or some other way to upload and visualize the DARPA data.

Hi @Srini-99 Welcome to the community!

What version are you on?

Did you try the File Uploader?

I am not sure if you can, but it would help if you shared your mapping, your Filebeat configuration, and a sample JSON document. Otherwise it is pretty hard for us to help.

I will add that Filebeat wants to read ndjson (newline-delimited JSON), not pretty JSON... if your file is pretty JSON, that adds a lot of complexity.

ndjson - Good :slight_smile:
{ "foo":"bar"}

pretty json - Not So Good :frowning:

{
  "foo": "bar"
}
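
If a file does turn out to be pretty-printed, one way around it is to re-serialize it before pointing Filebeat at it. The following is only a rough sketch using the Python standard library; the function name `to_ndjson` is made up for illustration, and it assumes the whole file fits in memory and holds either one JSON object or an array of objects.

```python
import json


def to_ndjson(pretty_text: str) -> str:
    """Re-serialize one JSON object, or an array of objects, as NDJSON."""
    data = json.loads(pretty_text)
    records = data if isinstance(data, list) else [data]
    # Compact separators keep each record on exactly one line.
    lines = [json.dumps(record, separators=(",", ":")) for record in records]
    return "\n".join(lines) + "\n"


pretty = """
{
  "foo": "bar"
}
"""
print(to_ndjson(pretty), end="")
```

Each record ends up on its own line, which is the shape the ndjson parser expects.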

There are lots of threads here on custom templates / index with filebeat etc...

This is where a lot of users get tripped up...

You actually don't need Filebeat to upload your template (you can, though); you can just create it via Kibana or the Dev Tools.
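
For anyone who prefers a script over Dev Tools, the same thing can be done with a plain `PUT /<index>` request against the create index API. Below is a minimal sketch with the Python standard library. The host `http://localhost:9200`, the index name `darpa-trace`, and the helper name `build_create_index_request` are assumptions for illustration only, and a default 8.x install will also need TLS and credentials, which are left out here.

```python
import json
import urllib.request


def build_create_index_request(base_url: str, index: str, mappings: dict) -> urllib.request.Request:
    """Build (but do not send) a create-index request carrying a mapping."""
    body = json.dumps({"mappings": mappings}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = build_create_index_request(
    "http://localhost:9200",  # assumed address, adjust to your cluster
    "darpa-trace",            # hypothetical index name
    {"properties": {"source": {"type": "keyword"}}},
)
print(req.get_method(), req.full_url)
# To actually send it (needs a reachable cluster and, on 8.x, auth/TLS):
# urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect the body before anything touches the cluster.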

Often the Filebeat config is not set up correctly to send the data to the correct index, etc.

Share your template, Filebeat configuration, etc., and perhaps we can help. Without them we are guessing.

Hi @stephenb !
Thanks for the reply.

The version I am using for all the components is 8.10.2.

The data isn't in pretty JSON format; it's all ndjson (fortunately).

It's baffling... I have never come across something called File Uploader for ES before in all my searches.

As for sharing the template, data sample, and schema: I tried to add the files as Google Drive links, but the forum didn't allow it. So I guess I'll just paste all 800+ lines here.

Sorry, it's a lot. Thank you so much for your time in advance.

Following is the sample data

{"datum":{"com.bbn.tc.schema.avro.cdm18.UnitDependency":{"unit":"318B61F6-259B-DDD4-0815-B38419085377","dependentUnit":"D2BEAEF8-42D1-DC02-379E-83E4AF247F87"}},"CDMVersion":"18","source":"SOURCE_LINUX_BEEP_TRACE"}
{"datum":{"com.bbn.tc.schema.avro.cdm18.UnitDependency":{"unit":"46313DCC-45A7-5D3A-CF56-8B5A466138A3","dependentUnit":"D2BEAEF8-42D1-DC02-379E-83E4AF247F87"}},"CDMVersion":"18","source":"SOURCE_LINUX_BEEP_TRACE"}
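
One detail worth noting in these sample lines: the key under `datum` is the full Avro name `com.bbn.tc.schema.avro.cdm18.UnitDependency`, and the record carries `unit` and `dependentUnit`. With `expand_keys: true`, the Filebeat config comment says dotted keys get expanded into nested objects, so the field path would probably end up as `datum.com.bbn.tc...UnitDependency.unit` rather than `datum.UnitDependency`. If shorter keys are wanted, a preprocessing step along these lines could strip the prefix first. This is just a sketch; the helper name `shorten_datum_keys` is made up and the prefix is taken from the sample above.

```python
import json

# Avro namespace prefix, copied from the sample records.
PREFIX = "com.bbn.tc.schema.avro.cdm18."


def shorten_datum_keys(line: str) -> str:
    """Strip the Avro namespace prefix from the keys directly under 'datum'."""
    record = json.loads(line)
    datum = record.get("datum", {})
    record["datum"] = {
        (key[len(PREFIX):] if key.startswith(PREFIX) else key): value
        for key, value in datum.items()
    }
    return json.dumps(record, separators=(",", ":"))


sample = (
    '{"datum":{"com.bbn.tc.schema.avro.cdm18.UnitDependency":'
    '{"unit":"318B61F6-259B-DDD4-0815-B38419085377",'
    '"dependentUnit":"D2BEAEF8-42D1-DC02-379E-83E4AF247F87"}},'
    '"CDMVersion":"18","source":"SOURCE_LINUX_BEEP_TRACE"}'
)
print(shorten_datum_keys(sample))
```

Whichever naming is chosen, the field names in the mapping need to match what actually arrives in the documents.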

This is the mapping that I made from the Avro schema given with the data:

{
    "mappings": {
        "TCCDMDatum": {
            "properties": {
                "datum": {
                    "type": "nested",
                    "properties": {
                        "Host": {
                            "type": "object",
                            "properties": {
                                "uuid": {
                                    "type": "keyword"
                                },
                                "hostName": {
                                    "type": "text",
                                    "analyzer": "standard"
                                },
                                "hostIdentfiers": {
                                    "type": "nested",
                                    "properties": {
                                        "idType": {
                                            "type": "keyword"
                                        },
                                        "idValue": {
                                            "type": "keyword"
                                        }    
                                    }
                                },
                                "osDetails": {
                                    "type": "text",
                                    "analyzer": "standard"
                                },
                                "hostType": {
                                    "type": "keyword"
                                },
                                "interfaces": {
                                    "type": "nested",
                                    "properties": {
                                        "name": {
                                            "type": "keyword"
                                        },
                                        "macAddress": {
                                            "type": "keyword"
                                        },
                                        "ipAddress": {
                                            "type": "keyword"
                                        }
                                    }
                                }
                            }
                        },
                        "Principal": {
                            "type": "object",
                            "properties": {
                                "uuid": {
                                    "type": "keyword"
                                },
                                "type": {
                                    "type": "keyword"
                                },
                                "hostId": {
                                    "type": "keyword"
                                },
                                "userId": {
                                    "type": "keyword"
                                },
                                "username": {
                                    "type": "keyword",
                                    "default": null
                                },
                                "groupIds": {
                                    "type": "keyword"
                                },
                                "properties": {
                                    "type": "object",
                                    "dynamic": true,
                                    "default": null
                                }
                            }
                        },
                        "ProvenanceTagNode": {
                            "type": "object",
                            "properties": {
                                "tagId": {
                                    "type": "keyword"
                                },
                                "flowObject": {
                                    "type": "keyword",
                                    "default": null
                                },
                                "hostId": {
                                    "type": "keyword"
                                },
                                "subject": {
                                    "type": "keyword"
                                },
                                "systemCall": {
                                    "type": "keyword",
                                    "default": null
                                },
                                "programPoint": {
                                    "type": "keyword",
                                    "default": null
                                },
                                "prevTagId": {
                                    "type": "keyword",
                                    "default": null
                                },
                                "opcode": {
                                    "type": "keyword",
                                    "default": null
                                },
                                "tagIds": {
                                    "type": "keyword",
                                    "default": null
                                },
                                "itag": {
                                    "type": "keyword",
                                    "default": null
                                },
                                "ctag": {
                                    "type": "keyword",
                                    "default": null
                                },
                                "properties": {
                                    "type": "object",
                                    "dynamic": true,
                                    "default": null
                                }
                            }
                        },
                        "Subject": {
                            "type": "object",
                            "properties": {
                                "uuid": {
                                    "type": "keyword"
                                },
                                "type": {
                                    "type" : "keyword"
                                },
                                "cid": {
                                    "type" : "integer"
                                },
                                "parentSubject": {
                                    "type": "keyword",
                                    "default": null
                                },
                                "hostId": {
                                    "type": "keyword"
                                },
                                "localPrincipal": {
                                    "type": "keyword"
                                },
                                "startTimestampNanos": {
                                    "type": "long"
                                },
                                "unitId": {
                                    "type": "integer",
                                    "default": null
                                },
                                "iteration": {
                                    "type": "integer",
                                    "default": null
                                },
                                "count": {
                                    "type": "integer",
                                    "default": null
                                },
                                "cmdLine": {
                                    "type": "keyword",
                                    "default": null
                                },
                                "privilegeLevel": {
                                    "type": "keyword",
                                    "default": null
                                },
                                "importedLibraries": {
                                    "type": "keyword",
                                    "default": null
                                },
                                "exportedLibraries": {
                                    "type": "keyword",
                                    "default": null
                                },
                                "properties": {
                                    "type": "object",
                                    "dynamic": true,
                                    "default": null
                                }
                            }
                        },
                        "FileObject": {
                            "type": "object",
                            "properties": {
                                "uuid": {
                                    "type": "keyword"
                                },
                                "baseObject": {
                                    "type": "nested",
                                    "properties": {
                                        "hostId": {
                                            "type": "keyword"
                                        },
                                        "permission": {
                                            "type": "keyword",
                                            "default": null
                                        },
                                        "epoch": {
                                            "type": "integer",
                                            "default": null
                                        },
                                        "properties": {
                                            "type": "object",
                                            "dynamic": true,
                                            "default": null
                                        }
                                    }
                                },
                                "type": {
                                    "type" : "keyword"
                                },
                                "fileDescriptor": {
                                    "type": "integer",
                                    "default": null
                                },
                                "localPrincipal": {
                                    "type": "keyword",
                                    "default": null
                                },
                                "size": {
                                    "type": "long",
                                    "default": null
                                },
                                "peInfo": {
                                    "type": "keyword",
                                    "default": null
                                },
                                "hashes": {
                                    "type": "nested",
                                    "properties": {
                                        "type": {
                                            "type": "keyword"
                                        },
                                        "hash": {
                                            "type": "keyword"
                                        }
                                    },
                                    "default": null
                                }
                            }
                        },
                        "UnnamedPipeObject": {
                            "type": "object",
                            "properties": {
                                "uuid": {
                                    "type": "keyword"
                                },
                                "baseObject": {
                                    "type": "nested",
                                    "properties": {
                                        "hostId": {
                                            "type": "keyword"
                                        },
                                        "permission": {
                                            "type": "keyword"
                                        },
                                        "epoch": {
                                            "type": "integer"
                                        },
                                        "properties": {
                                            "type": "object",
                                            "dynamic": true,
                                            "default": null
                                        }
                                    }
                                },
                                "sourceFileDescriptor": {
                                    "type": "integer",
                                    "default": null
                                },
                                "sinkFileDescriptor": {
                                    "type": "integer",
                                    "default": null
                                },
                                "sourceUUID": {
                                    "type": "keyword",
                                    "default": null
                                },
                                "sinkUUID": {
                                    "type": "keyword",
                                    "default": null
                                }
                            }
                        },
                        "RegistryKeyObject": {
                            "type": "object",
                            "properties": {
                                "uuid": {
                                    "type": "keyword"
                                },
                                "key": {
                                    "type": "keyword"
                                },
                                "value": {
                                    "type": "nested",
                                    "properties": {
                                        "size": {
                                            "type": "integer",
                                            "default": -1
                                        },
                                        "type": {
                                            "type": "keyword"
                                        },
                                        "valueDataType": {
                                            "type": "keyword"
                                        },
                                        "isNull": {
                                            "type": "boolean",
                                            "default": false
                                        },
                                        "name": {
                                            "type": "keyword",
                                            "default": null
                                        },
                                        "runtimeDataType": {
                                            "type": "keyword",
                                            "default": null
                                        },
                                        "valueBytes": {
                                            "type": "binary",
                                            "default": null
                                        },
                                        "provenance": {
                                            "type": "nested",
                                            "properties": {
                                                "asserter": {
                                                    "type": "keyword"
                                                },
                                                "source": {
                                                    "type": "keyword",
                                                    "default": null
                                                },
                                                "provenance": {
                                                    "type": "object",
                                                    "dynamic": true,
                                                    "default": null
                                                }
                                            },
                                            "default": null
                                        },
                                        "tag": {
                                            "type": "nested",
                                            "properties": {
                                                "numValueElemenents": {
                                                    "type": "integer",
                                                    "default": 0
                                                },
                                                "tagId": {
                                                    "type": "keyword"
                                                }
                                            },
                                            "default": null
                                        },
                                        "components": {
                                            "type": "object",
                                            "dynamic": true,
                                            "default": null 
                                        }
                                    },
                                    "default": null
                                },
                                "size" : {
                                    "type" : "long",
                                    "default": null
                                }
                            }
                        },
                        "PacketSocketObject": {
                            "type": "object",
                            "properties": {
                                "uuid": {
                                    "type": "keyword"
                                },
                                "proto": {
                                    "type": "keyword"
                                },
                                "ifIndex": {
                                    "type": "integer"
                                },
                                "haType": {
                                    "type": "keyword"
                                },
                                "pktType": {
                                    "type": "keyword"
                                },
                                "addr": {
                                    "type": "binary"
                                }
                            }
                        },
                        "NetFlowObject": {
                            "type": "object",
                            "properties": {
                                "uuid": {
                                    "type": "keyword"
                                },
                                "localAddress": {
                                    "type": "keyword"
                                },
                                "localPort": {
                                    "type": "integer"
                                },
                                "remoteAddress": {
                                    "type": "keyword"
                                },
                                "remotePort": {
                                    "type" : "integer"
                                },
                                "ipProtocol": {
                                    "type": "integer",
                                    "default": null
                                },
                                "fileDescriptor": {
                                    "type" : "integer",
                                    "default": null
                                }
                            }
                        },
                        "MemoryObject": {
                            "type": "object",
                            "properties": {
                                "uuid": {
                                    "type": "keyword"
                                },
                                "memoryAddress": {
                                    "type": "long"
                                },
                                "pageNumber": {
                                    "type": "long",
                                    "default": null
                                },
                                "pageOffset": {
                                    "type": "long",
                                    "default": null
                                },
                                "size": {
                                    "type": "long",
                                    "default": null
                                }
                            }
                        },
                        "SrcSinkObject": {
                            "type": "object",
                            "properties": {
                                "uuid": {
                                    "type": "keyword"
                                },
                                "type": {
                                    "type": "keyword"
                                },
                                "fileDescriptor": {
                                    "type": "integer",
                                    "default": null
                                }
                            }
                        },
                        "Event": {
                            "type": "object",
                            "properties": {
                                "uuid": {
                                    "type": "keyword"
                                },
                                "sequence": {
                                    "type": "long",
                                    "default": null
                                },
                                "type": {
                                    "type": "keyword" 
                                },
                                "threadID": {
                                    "type": "integer",
                                    "default": null
                                },
                                "hostId": {
                                    "type": "keyword"
                                },
                                "subject": {
                                    "type": "keyword",
                                    "default": null
                                },
                                "predicateObject": {
                                    "type": "keyword",
                                    "default": null
                                },
                                "predicateObjectPath": {
                                    "type": "text",
                                    "default": null
                                },
                                "predicateObject2": {
                                    "type": "keyword",
                                    "default": null
                                },
                                "predicateObjectPath2": {
                                    "type": "text",
                                    "default": null
                                },
                                "startTimestampNanos": {
                                    "type": "long"
                                },
                                "name": {
                                    "type": "text",
                                    "default": null
                                },
                                "location": {
                                    "type": "long",
                                    "default": null
                                },
                                "size": {
                                    "type": "long",
                                    "default": null
                                },
                                "programPoint": {
                                    "type": "text",
                                    "default": null
                                },
                                "properties": {
                                    "type": "object",
                                    "dynamic": true,
                                    "default": null
                                }
                            }
                        },
                        "UnitDependency": {
                            "type": "nested",
                            "properties": {
                                "uuid": {
                                    "type": "keyword"
                                },
                                "dependentUnit": {
                                    "type": "keyword"
                                }
                            }
                        },
                        "TimeMarker": {
                            "type": "nested",
                            "properties": {
                                "tsNanos": {
                                    "type": "long"
                                }
                            }
                        },
                        "StartMarker": {
                            "type": "nested",
                            "properties": {
                                "sessionNumber": {
                                    "type": "integer"
                                }
                            }
                        },
                        "EndMarker": {
                            "type": "nested",
                            "properties": {
                                "sessionNumber": {
                                    "type": "integer"
                                },
                                "recordCounts": {
                                    "type": "object",
                                    "dynamic": true
                                }                                
                            }
                        }
                    }
                },
                "CDMVersion": {
                    "type": "text",
                    "analyzer": "standard",
                    "default": 18
                },
                "source": {
                    "type": "keyword"
                }
            }
        }
    }
}
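
A note on the error itself: since 7.0, Elasticsearch mappings no longer take a type name, so a level like `TCCDMDatum` directly under `mappings` is the usual trigger for "Root mapping definition has unsupported parameters". As far as I know, `default` is also not a mapping parameter and would be rejected once the type level is gone. The sketch below shows one way to clean both up mechanically before sending the mapping; the function names are hypothetical, and it treats any dict with a string `type` as a field definition.

```python
import json


def _drop_default(node):
    """Recursively remove 'default' from field definitions (dicts with a string 'type')."""
    if isinstance(node, dict):
        is_field_def = isinstance(node.get("type"), str)
        return {
            key: _drop_default(value)
            for key, value in node.items()
            if not (is_field_def and key == "default")
        }
    return node


def modernize(body: dict) -> dict:
    """Unwrap a single type-name level and strip 'default' parameters."""
    mappings = body["mappings"]
    # A legacy body looks like {"mappings": {"<TypeName>": {"properties": {...}}}}.
    if "properties" not in mappings and len(mappings) == 1:
        (only,) = mappings.values()
        if isinstance(only, dict) and "properties" in only:
            mappings = only
    return {"mappings": _drop_default(mappings)}


legacy = {
    "mappings": {
        "TCCDMDatum": {
            "properties": {
                "CDMVersion": {"type": "text", "analyzer": "standard", "default": 18},
                "source": {"type": "keyword"},
            }
        }
    }
}
print(json.dumps(modernize(legacy), indent=2))
```

Fields that are themselves named `properties` or `type` are left alone, because their values are dicts rather than type strings.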

This is my filebeat config

# ============================== Filebeat inputs ===============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input-specific configurations.

# filestream is an input for collecting log messages from files.
- type: filestream

  # Unique ID among all inputs, an ID is required.
  #id: testdata

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    #- /var/log/*.log
    - D:\Sem3_Materials\Project\data\Darpa\testdata\*.ndjson

  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  # Line filtering happens after the parsers pipeline. If you would like to filter lines
  # before parsers, use include_message parser.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  # Line filtering happens after the parsers pipeline. If you would like to filter lines
  # before parsers, use include_message parser.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #prospector.scanner.exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1

  ### Parsers configuration

  #### JSON configuration

  parsers:
    - ndjson:
      # Decode JSON options. Enable this if your logs are structured in JSON.
      # JSON key on which to apply the line filtering and multiline settings. This key
      # must be top level and its value must be a string, otherwise it is ignored. If
      # no text key is defined, the line filtering and multiline features cannot be used.
      #message_key:

      # By default, the decoded JSON is placed under a "json" key in the output document.
      # If you enable this setting, the keys are copied to the top level of the output document.
      keys_under_root: true

      # If keys_under_root and this setting are enabled, then the values from the decoded
      # JSON object overwrite the fields that Filebeat normally adds (type, source, offset, etc.)
      # in case of conflicts.
      overwrite_keys: true

      # If this setting is enabled, then keys in the decoded JSON object will be recursively
      # de-dotted, and expanded into a hierarchical object structure.
      # For example, `{"a.b.c": 123}` would be expanded into `{"a":{"b":{"c":123}}}`.
      expand_keys: true

      # If this setting is enabled, Filebeat adds an "error.message" and "error.key: json" key in case of JSON
      # unmarshaling errors or when a text key is defined in the configuration but cannot
      # be used.
      add_error_key: true

# ============================== Filebeat modules ==============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

  # Period on which files under path should be checked for changes
  #reload.period: 10s

# ======================= Elasticsearch template setting =======================

setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false

# ================================= Dashboards =================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
setup.dashboards.enabled: true

# The URL from where to download the dashboard archive. By default, this URL
# has a value that is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:

# =================================== Kibana ===================================

# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:

  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify and additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  host: "localhost:5601"

  # Kibana Space ID
  # ID of the Kibana Space into which the dashboards should be loaded. By default,
  # the Default Space will be used.
  #space.id:

# ================================== Outputs ===================================

# Configure what output to use when sending the data collected by the beat.

# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["localhost:9200"]

  # Protocol - either `http` (default) or `https`.
  #protocol: "https"

  # Authentication credentials - either API key or username/password.
  #api_key: "id:api_key"
  username: "elastic"
  password: "<password>"

  #--------added this new-------
  index: "darpa-%{[agent.version]}"
  mappings: C:\Program Files\Filebeat\darpa_mapping.json

# ================================== Template ==================================

# A template is used to set the mapping in Elasticsearch
# By default template loading is enabled and the template is loaded.
# These settings can be adjusted to load your own template or overwrite existing ones.

# Set to false to disable template loading.
setup.template.enabled: true

# Template name. By default the template name is "filebeat-%{[agent.version]}"
# The template name and pattern has to be set in case the Elasticsearch index pattern is modified.
setup.template.name: "darpa-%{[agent.version]}"

# Template pattern. By default the template pattern is "filebeat-%{[agent.version]}" to apply to the default index settings.
# The template name and pattern has to be set in case the Elasticsearch index pattern is modified.
setup.template.pattern: "darpa-%{[agent.version]}"

# Path to fields.yml file to generate the template
#setup.template.fields: "${path.config}/fields.yml"

# A list of fields to be added to the template and Kibana index pattern. Also
# specify setup.template.overwrite: true to overwrite the existing template.
#setup.template.append_fields:
#- name: field_name
#  type: field_type

# Enable JSON template loading. If this is enabled, the fields.yml is ignored.
#setup.template.json.enabled: false

# Path to the JSON template file
#setup.template.json.path: "${path.config}/template.json"

# Name under which the template is stored in Elasticsearch
#setup.template.json.name: ""

# Set this option if the JSON template is a data stream.
#setup.template.json.data_stream: false

# Overwrite existing template
# Do not enable this option for more than one instance of filebeat as it might
# overload your Elasticsearch with too many update requests.
#setup.template.overwrite: false

# Elasticsearch template settings
setup.template.settings:

  # A dictionary of settings to place into the settings.index dictionary
  # of the Elasticsearch template. For more details, please check
  # https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html
  #index:
    #number_of_shards: 1
    #codec: best_compression

  # A dictionary of settings for the _source field. For more details, please check
  # https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-source-field.html
  #_source:
    #enabled: false

# ================================= Processors =================================
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~

This is the Avro schema:

{
  "type" : "record",
  "name" : "TCCDMDatum",
  "namespace" : "com.bbn.tc.schema.avro.cdm18",
  "fields" : [ {
    "name" : "datum",
    "type" : [ {
      "type" : "record",
      "name" : "Host",
      
      "fields" : [ {
        "name" : "uuid",
        "type" : {
          "type" : "fixed",
          "name" : "UUID",
          "size" : 16
        },
      }, {
        "name" : "hostName",
        "type" : "string",
      }, {
        "name" : "hostIdentifiers",
        "type" : {
          "type" : "array",
          "items" : {
            "type" : "record",
            "name" : "HostIdentifier",
            
            "fields" : [ {
              "name" : "idType",
              "type" : "string"
            }, {
              "name" : "idValue",
              "type" : "string"
            } ]
          }
        },
      }, {
        "name" : "osDetails",
        "type" : "string",
        
      }, {
        "name" : "hostType",
        "type" : {
          "type" : "enum",
          "name" : "HostType",
          
        },
      }, {
        "name" : "interfaces",
        "type" : {
          "type" : "array",
          "items" : {
            "type" : "record",
            "name" : "Interface",
            
            "fields" : [ {
              "name" : "name",
              "type" : "string"
            }, {
              "name" : "macAddress",
              "type" : "string"
            }, {
              "name" : "ipAddresses",
              "type" : {
                "type" : "array",
                "items" : "string"
              }
            } ]
          }
        },
      } ]
    }, {
      "type" : "record",
      "name" : "Principal",
      
      "fields" : [ {
        "name" : "uuid",
        "type" : "UUID",
      }, {
        "name" : "type",
        "type" : {
          "type" : "enum",
          "name" : "PrincipalType",
          
        },
        "default" : "PRINCIPAL_LOCAL"
      }, {
        "name" : "hostId",
        "type" : "UUID",
      }, {
        "name" : "userId",
        "type" : "string",
      }, {
        "name" : "username",
        "type" : [ "null", "string" ],
        "default" : null
      }, {
        "name" : "groupIds",
        "type" : {
          "type" : "array",
          "items" : "string"
        },
      }, {
        "name" : "properties",
        "type" : [ "null", {
          "type" : "map",
          "values" : "string"
        } ],
        "default" : null,
        "order" : "ignore"
      } ]
    }, {
      "type" : "record",
      "name" : "ProvenanceTagNode",
      
      "fields" : [ {
        "name" : "tagId",
        "type" : "UUID",
      }, {
        "name" : "flowObject",
        "type" : [ "null", "UUID" ],
        "default" : null
      }, {
        "name" : "hostId",
        "type" : "UUID",
      }, {
        "name" : "subject",
        "type" : "UUID",
      }, {
        "name" : "systemCall",
        "type" : [ "null", "string" ],
        "default" : null
      }, {
        "name" : "programPoint",
        "type" : [ "null", "string" ],
        "default" : null
      }, {
        "name" : "prevTagId",
        "type" : [ "null", "UUID" ],
        "default" : null
      }, {
        "name" : "opcode",
        "type" : [ "null", {
          "type" : "enum",
          "name" : "TagOpCode",
          
        } ],
        "default" : null
      }, {
        "name" : "tagIds",
        "type" : [ "null", {
          "type" : "array",
          "items" : "UUID"
        } ],
        "default" : null
      }, {
        "name" : "itag",
        "type" : [ "null", {
          "type" : "enum",
          "name" : "IntegrityTag",
          
        } ],
        "default" : null
      }, {
        "name" : "ctag",
        "type" : [ "null", {
          "type" : "enum",
          "name" : "ConfidentialityTag",
          
        } ],
        "default" : null
      }, {
        "name" : "properties",
        "type" : [ "null", {
          "type" : "map",
          "values" : "string"
        } ],
        "default" : null,
        "order" : "ignore"
      } ]
    }, {
      "type" : "record",
      "name" : "Subject",
      "fields" : [ {
        "name" : "uuid",
        "type" : "UUID",
      }, {
        "name" : "type",
        "type" : {
          "type" : "enum",
          "name" : "SubjectType",
          
        },
      }, {
        "name" : "cid",
        "type" : "int",
      }, {
        "name" : "parentSubject",
        "type" : [ "null", "UUID" ],
        
        "default" : null
      }, {
        "name" : "hostId",
        "type" : "UUID",
        
      }, {
        "name" : "localPrincipal",
        "type" : "UUID",
        
      }, {
        "name" : "startTimestampNanos",
        "type" : "long",
        
      }, {
        "name" : "unitId",
        "type" : [ "null", "int" ],
        
        "default" : null
      }, {
        "name" : "iteration",
        "type" : [ "null", "int" ],
        
        "default" : null
      }, {
        "name" : "count",
        "type" : [ "null", "int" ],
        "default" : null
      }, {
        "name" : "cmdLine",
        "type" : [ "null", "string" ],
        
        "default" : null
      }, {
        "name" : "privilegeLevel",
        "type" : [ "null", {
          "type" : "enum",
          "name" : "PrivilegeLevel",
        } ],
        
        "default" : null
      }, {
        "name" : "importedLibraries",
        "type" : [ "null", {
          "type" : "array",
          "items" : "string"
        } ],
        
        "default" : null
      }, {
        "name" : "exportedLibraries",
        "type" : [ "null", {
          "type" : "array",
          "items" : "string"
        } ],
        
        "default" : null
      }, {
        "name" : "properties",
        "type" : [ "null", {
          "type" : "map",
          "values" : "string"
        } ],
        
        "default" : null,
        "order" : "ignore"
      } ]
    }, {
      "type" : "record",
      "name" : "FileObject",
      
      "fields" : [ {
        "name" : "uuid",
        "type" : "UUID",
        
      }, {
        "name" : "baseObject",
        "type" : {
          "type" : "record",
          "name" : "AbstractObject",
          
          "fields" : [ {
            "name" : "hostId",
            "type" : "UUID",
            
          }, {
            "name" : "permission",
            "type" : [ "null", {
              "type" : "fixed",
              "name" : "SHORT",
              "size" : 2
            } ],
            
            "default" : null
          }, {
            "name" : "epoch",
            "type" : [ "null", "int" ],
            
            "default" : null
          }, {
            "name" : "properties",
            "type" : [ "null", {
              "type" : "map",
              "values" : "string"
            } ],
            
            "default" : null,
            "order" : "ignore"
          } ]
        },
        
      }, {
        "name" : "type",
        "type" : {
          "type" : "enum",
          "name" : "FileObjectType",
          
          
        },
        
      }, {
        "name" : "fileDescriptor",
        "type" : [ "null", "int" ],
        
        "default" : null
      }, {
        "name" : "localPrincipal",
        "type" : [ "null", "UUID" ],
        
        "default" : null
      }, {
        "name" : "size",
        "type" : [ "null", "long" ],
        
        "default" : null
      }, {
        "name" : "peInfo",
        "type" : [ "null", "string" ],
        
        "default" : null
      }, {
        "name" : "hashes",
        "type" : [ "null", {
          "type" : "array",
          "items" : {
            "type" : "record",
            "name" : "CryptographicHash",
            
            "fields" : [ {
              "name" : "type",
              "type" : {
                "type" : "enum",
                "name" : "CryptoHashType",
                
                
              },
              
            }, {
              "name" : "hash",
              "type" : "string",
              
            } ]
          }
        } ],
        
        "default" : null
      } ]
    }, {
      "type" : "record",
      "name" : "UnnamedPipeObject",
      
      "fields" : [ {
        "name" : "uuid",
        "type" : "UUID",
        
      }, {
        "name" : "baseObject",
        "type" : "AbstractObject",
        
      }, {
        "name" : "sourceFileDescriptor",
        "type" : [ "null", "int" ],
        
        "default" : null
      }, {
        "name" : "sinkFileDescriptor",
        "type" : [ "null", "int" ],
        "default" : null
      }, {
        "name" : "sourceUUID",
        "type" : [ "null", "UUID" ],
        "default" : null
      }, {
        "name" : "sinkUUID",
        "type" : [ "null", "UUID" ],
        "default" : null
      } ]
    }, {
      "type" : "record",
      "name" : "RegistryKeyObject",
      
      "fields" : [ {
        "name" : "uuid",
        "type" : "UUID",
        
      }, {
        "name" : "baseObject",
        "type" : "AbstractObject",
        
      }, {
        "name" : "key",
        "type" : "string",
        
      }, {
        "name" : "value",
        "type" : [ "null", {
          "type" : "record",
          "name" : "Value",
          
          "fields" : [ {
            "name" : "size",
            "type" : "int",
            
            "default" : -1
          }, {
            "name" : "type",
            "type" : {
              "type" : "enum",
              "name" : "ValueType",
              
              
            },
            
          }, {
            "name" : "valueDataType",
            "type" : {
              "type" : "enum",
              "name" : "ValueDataType",
              
              
            },
            
          }, {
            "name" : "isNull",
            "type" : "boolean",
            
            "default" : false
          }, {
            "name" : "name",
            "type" : [ "null", "string" ],
            
            "default" : null
          }, {
            "name" : "runtimeDataType",
            "type" : [ "null", "string" ],
            
            "default" : null
          }, {
            "name" : "valueBytes",
            "type" : [ "null", "bytes" ],
            
            "default" : null
          }, {
            "name" : "provenance",
            "type" : [ "null", {
              "type" : "array",
              "items" : {
                "type" : "record",
                "name" : "ProvenanceAssertion",
                
                "fields" : [ {
                  "name" : "asserter",
                  "type" : "UUID",
                  
                }, {
                  "name" : "sources",
                  "type" : [ "null", {
                    "type" : "array",
                    "items" : "UUID"
                  } ],
                  
                  "default" : null
                }, {
                  "name" : "provenance",
                  "type" : [ "null", {
                    "type" : "array",
                    "items" : "ProvenanceAssertion"
                  } ],
                  
                  "default" : null
                } ]
              }
            } ],
            
            "default" : null
          }, {
            "name" : "tag",
            "type" : [ "null", {
              "type" : "array",
              "items" : {
                "type" : "record",
                "name" : "TagRunLengthTuple",
                
                "fields" : [ {
                  "name" : "numValueElements",
                  "type" : "int",
                  "default" : 0
                }, {
                  "name" : "tagId",
                  "type" : "UUID"
                } ]
              }
            } ],
            
            "default" : null
          }, {
            "name" : "components",
            "type" : [ "null", {
              "type" : "array",
              "items" : "Value"
            } ],
            
            "default" : null
          } ]
        } ],
        
        "default" : null
      }, {
        "name" : "size",
        "type" : [ "null", "long" ],
        
        "default" : null
      } ]
    }, {
      "type" : "record",
      "name" : "PacketSocketObject",
      
      "fields" : [ {
        "name" : "uuid",
        "type" : "UUID",
        
      }, {
        "name" : "baseObject",
        "type" : "AbstractObject",
        
      }, {
        "name" : "proto",
        "type" : "SHORT",
        
      }, {
        "name" : "ifIndex",
        "type" : "int",
        
      }, {
        "name" : "haType",
        "type" : "SHORT",
        
      }, {
        "name" : "pktType",
        "type" : {
          "type" : "fixed",
          "name" : "BYTE",
          "size" : 1
        },
        
      }, {
        "name" : "addr",
        "type" : "bytes",
        
      } ]
    }, {
      "type" : "record",
      "name" : "NetFlowObject",
      
      "fields" : [ {
        "name" : "uuid",
        "type" : "UUID",
        
      }, {
        "name" : "baseObject",
        "type" : "AbstractObject",
        
      }, {
        "name" : "localAddress",
        "type" : "string",
        
      }, {
        "name" : "localPort",
        "type" : "int",
        
      }, {
        "name" : "remoteAddress",
        "type" : "string",
        
      }, {
        "name" : "remotePort",
        "type" : "int",
        
      }, {
        "name" : "ipProtocol",
        "type" : [ "null", "int" ],
        
        "default" : null
      }, {
        "name" : "fileDescriptor",
        "type" : [ "null", "int" ],
        
        "default" : null
      } ]
    }, {
      "type" : "record",
      "name" : "MemoryObject",
      
      "fields" : [ {
        "name" : "uuid",
        "type" : "UUID",
        
      }, {
        "name" : "baseObject",
        "type" : "AbstractObject",
        
      }, {
        "name" : "memoryAddress",
        "type" : "long",
        
      }, {
        "name" : "pageNumber",
        "type" : [ "null", "long" ],
        
        "default" : null
      }, {
        "name" : "pageOffset",
        "type" : [ "null", "long" ],
        "default" : null
      }, {
        "name" : "size",
        "type" : [ "null", "long" ],
        
        "default" : null
      } ]
    }, {
      "type" : "record",
      "name" : "SrcSinkObject",
      
      "fields" : [ {
        "name" : "uuid",
        "type" : "UUID",
        
      }, {
        "name" : "baseObject",
        "type" : "AbstractObject",
        
      }, {
        "name" : "type",
        "type" : {
          "type" : "enum",
          "name" : "SrcSinkType",
          
          
        },
        
      }, {
        "name" : "fileDescriptor",
        "type" : [ "null", "int" ],
        
        "default" : null
      } ]
    }, {
      "type" : "record",
      "name" : "Event",
      
      "fields" : [ {
        "name" : "uuid",
        "type" : "UUID",
        
      }, {
        "name" : "sequence",
        "type" : [ "null", "long" ],
        
        "default" : null
      }, {
        "name" : "type",
        "type" : {
          "type" : "enum",
          "name" : "EventType",
          
          
        },
        
      }, {
        "name" : "threadId",
        "type" : [ "null", "int" ],
        
        "default" : null
      }, {
        "name" : "hostId",
        "type" : "UUID",
        
      }, {
        "name" : "subject",
        "type" : [ "null", "UUID" ],
        
        "default" : null
      }, {
        "name" : "predicateObject",
        "type" : [ "null", "UUID" ],
        
        "default" : null
      }, {
        "name" : "predicateObjectPath",
        "type" : [ "null", "string" ],
        
        "default" : null
      }, {
        "name" : "predicateObject2",
        "type" : [ "null", "UUID" ],
        
        "default" : null
      }, {
        "name" : "predicateObject2Path",
        "type" : [ "null", "string" ],
        
        "default" : null
      }, {
        "name" : "timestampNanos",
        "type" : "long",
        
      }, {
        "name" : "name",
        "type" : [ "null", "string" ],
        
        "default" : null
      }, {
        "name" : "parameters",
        "type" : [ "null", {
          "type" : "array",
          "items" : "Value"
        } ],
        
        "default" : null
      }, {
        "name" : "location",
        "type" : [ "null", "long" ],
        
        "default" : null
      }, {
        "name" : "size",
        "type" : [ "null", "long" ],
        
        "default" : null
      }, {
        "name" : "programPoint",
        "type" : [ "null", "string" ],
        
        "default" : null
      }, {
        "name" : "properties",
        "type" : [ "null", {
          "type" : "map",
          "values" : "string"
        } ],
        
        "default" : null,
        "order" : "ignore"
      } ]
    }, {
      "type" : "record",
      "name" : "UnitDependency",
      
      "fields" : [ {
        "name" : "unit",
        "type" : "UUID"
      }, {
        "name" : "dependentUnit",
        "type" : "UUID"
      } ]
    }, {
      "type" : "record",
      "name" : "TimeMarker",
      
      "fields" : [ {
        "name" : "tsNanos",
        "type" : "long",
        
      } ]
    }, {
      "type" : "record",
      "name" : "StartMarker",
      
      "fields" : [ {
        "name" : "sessionNumber",
        "type" : "int",
        
      } ]
    }, {
      "type" : "record",
      "name" : "EndMarker",
      "fields" : [ {
        "name" : "sessionNumber",
        "type" : "int",
      }, {
        "name" : "recordCounts",
        "type" : {
          "type" : "map",
          "values" : "string"
        },
        "order" : "ignore"
      } ]
    } ]
  }, {
    "name" : "CDMVersion",
    "type" : "string",
    "default" : "18"
  }, {
    "name" : "source",
    "type" : {
      "type" : "enum",
      "name" : "InstrumentationSource",
    },
  } ]
}

ndjson is good!

What version, please? That may not seem important, but it is!

We only need a couple of lines of the logs that you shared.

Please don't do that. You can use https://pastebin.com/ or something, but I am not sure what would take 800 lines of code?

No, you cannot link to Google Drive.

Can you share your filebeat.yml

Can you share your template?

Here is how you can upload the data if you just want to get it in and test.

You can adjust the mapping or use your own mapping if you like.

Just put in your own mapping under Advanced.

Here is just the quick mapping I did...

{
  "properties": {
    "CDMVersion": {
      "type": "long"
    },
    "datum": {
      "properties": {
        "com": {
          "properties": {
            "bbn": {
              "properties": {
                "tc": {
                  "properties": {
                    "schema": {
                      "properties": {
                        "avro": {
                          "properties": {
                            "cdm18": {
                              "properties": {
                                "UnitDependency": {
                                  "properties": {
                                    "dependentUnit": {
                                      "type": "keyword"
                                    },
                                    "unit": {
                                      "type": "keyword"
                                    }
                                  }
                                }
                              }
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    },
    "source": {
      "type": "keyword"
    }
  }
}

If you share your filebeat.yml, we can do that with Filebeat as well.

I assume you actually want this to be ongoing so yes you will need to use filebeat...

Pretty sure you are making a common mistake ...

Assuming you created your own template manually in Kibana / Elasticsearch, take all that setup template stuff out and try what I have below... most likely what you have is just overwriting your template.

# ================================== Template ==================================

# A template is used to set the mapping in Elasticsearch
# By default template loading is enabled and the template is loaded.
# These settings can be adjusted to load your own template or overwrite existing ones.

# Set to false to disable template loading.
setup.template.enabled: true

# Template name. By default the template name is "filebeat-%{[agent.version]}"
# The template name and pattern has to be set in case the Elasticsearch index pattern is modified.
setup.template.name: "darpa-%{[agent.version]}"

# Template pattern. By default the template pattern is "filebeat-%{[agent.version]}" to apply to the default index settings.
# The template name and pattern has to be set in case the Elasticsearch index pattern is modified.
setup.template.pattern: "darpa-%{[agent.version]}"

# Set to false to disable template loading.
setup.template.enabled: true
......

Just put this in instead... it says to use what I manually loaded.

setup.template.enabled: false
setup.ilm.enabled: false

You can look at some of my other examples etc

Oh hi!

I mentioned the version in my first reply to you. I guess it got buried under all the other replies.
All of the Elastic components are version 8.10.2.

And yes, I only shared 2 records of the log data in my first response.

The 800+ lines are the mapping I wrote.

I have also shared the Filebeat config file in another reply.

Alright. I did try the Data Visualizer previously, and it didn't work out well... but sure, I will try it with my mapping, and I will also try the Filebeat changes you suggested.

@stephenb OMG IT WORKED WONDERS!!!! :sob: :sob: :sob:
THANK YOU SO MUCH!!!! :pray: :pray:

I tried 4 iterations, changing the mapping each time, and all 4 worked.
I cleaned up the logs by removing the prefixes to make them easier on the eyes.
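
For anyone doing the same cleanup, here is a rough sketch of how the prefixes could be stripped. It assumes the prefix is the Avro namespace `com.bbn.tc.schema.avro.cdm18.` appearing at the start of JSON keys, as the nested mapping earlier in the thread suggests. The function names and the sample record are made up for illustration, and this is not necessarily how the cleanup was actually done.

```python
import json

# Namespace taken from the Avro schema posted earlier in the thread.
PREFIX = "com.bbn.tc.schema.avro.cdm18."


def strip_prefix(value):
    """Recursively drop PREFIX from every dict key that starts with it."""
    if isinstance(value, dict):
        return {
            (key[len(PREFIX):] if key.startswith(PREFIX) else key): strip_prefix(child)
            for key, child in value.items()
        }
    if isinstance(value, list):
        return [strip_prefix(item) for item in value]
    return value


def clean_ndjson(src_path, dst_path):
    """Rewrite an NDJSON file line by line, skipping blank lines."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if line.strip():
                dst.write(json.dumps(strip_prefix(json.loads(line))) + "\n")


# Hypothetical record, shaped like the UnitDependency mapping in this thread.
sample = {
    "datum": {PREFIX + "UnitDependency": {"unit": "u1", "dependentUnit": "u2"}},
    "CDMVersion": "18",
    "source": "EXAMPLE_SOURCE",
}
print(json.dumps(strip_prefix(sample)))
```

Processing one line at a time keeps memory use flat, which matters for a multi-gigabyte dataset.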

What I don't understand is this: I wrote the mapping manually and spent so much time getting the parent-child nesting right, but one of the iterations I tested was to take all the children of "datum" out and put them at the top level. Logically, the mapping should have gone haywire, but it worked the same as the properly nested one, which is kind of crazy to think about.
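
One likely explanation, assuming the index kept Elasticsearch's default `dynamic: true` behaviour: fields that are missing from the explicit mapping are not rejected. They are mapped automatically the first time they appear, so a mapping whose nesting does not match the documents can still ingest everything. The rough sketch below (hypothetical helper names and sample data, and it ignores arrays of objects and multi-fields) lists the document fields that an explicit mapping does not cover and that dynamic mapping would therefore have to create.

```python
def mapped_paths(mapping, prefix=""):
    """Yield dotted paths of every field declared under "properties"."""
    for name, spec in mapping.get("properties", {}).items():
        path = prefix + name
        yield path
        # Object fields carry their children in a nested "properties" block.
        yield from mapped_paths(spec, path + ".")


def document_paths(doc, prefix=""):
    """Yield dotted paths of every field present in a document."""
    for name, value in doc.items():
        path = prefix + name
        yield path
        if isinstance(value, dict):
            yield from document_paths(value, path + ".")


def unmapped_fields(mapping, doc):
    """Fields in the document that the explicit mapping does not declare."""
    return sorted(set(document_paths(doc)) - set(mapped_paths(mapping)))


# Hypothetical flattened mapping: children of "datum" moved to the top level.
flat_mapping = {
    "properties": {
        "unit": {"type": "keyword"},
        "dependentUnit": {"type": "keyword"},
        "source": {"type": "keyword"},
    }
}
# Hypothetical document that still nests those fields under "datum".
doc = {
    "datum": {"UnitDependency": {"unit": "u1", "dependentUnit": "u2"}},
    "source": "EXAMPLE_SOURCE",
}

print(unmapped_fields(flat_mapping, doc))
```

If this is what happened, the index's actual mapping (visible with `GET <index>/_mapping`) would contain both the hand-written top-level fields and the auto-created nested ones.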

And since this data view option only allows a 100 MB file and the dataset I have is 20 GB, I need to keep using Filebeat, as you mentioned. When I edited the advanced settings after adding the mapping, there was a "Create Filebeat configuration" option to get the config for this custom index, which gave what you had already suggested:

filebeat.inputs:
- type: log
  paths:
  - '<add path to your files here>'

output.elasticsearch:
  hosts: ["<es_url>"]
  username: "elastic"
  password: "<password>"
  index: "darpa-test-run5"
  pipeline: ""

setup:
  template.enabled: false
  ilm.enabled: false

@Srini-99 Excellent!

You don't need that, in general don't add any settings you do not need.

Hi again!

I created a new index through the data view, gave it my own mapping, and cleared out the data inside so that it won't be duplicated by the data I upload (because the sample data is taken from the file).

But now, when I start publishing, some metadata that is not from the ndjson file is also being published, such as timestamp, agent.id and so on. (I have shared the screenshots through imgur links; the forum didn't allow me to post more than 1 media item.)

https://imgur.com/0LlvCqV

So I created a pipeline to remove those fields before ingestion, as follows:

[
  {
    "remove": {
      "field": "@timestamp"
    }
  },
  {
    "remove": {
      "field": "agent.id"
    }
  },
  {
    "remove": {
      "field": "agent.ephemeral_id"
    }
  },
  {
    "remove": {
      "field": "agent.name"
    }
  },
  {
    "remove": {
      "field": "agent.type"
    }
  },
  {
    "remove": {
      "field": "agent.version"
    }
  },
  {
    "remove": {
      "field": "input.type"
    }
  },
  {
    "remove": {
      "field": "ecs.version"
    }
  },
  {
    "remove": {
      "field": "log.file.idxhi"
    }
  },
  {
    "remove": {
      "field": "log.file.idxlo"
    }
  },
  {
    "remove": {
      "field": "log.file.vol"
    }
  },
  {
    "remove": {
      "field": "log.offset"
    }
  },
  {
    "remove": {
      "field": "host.name"
    }
  }
]
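
As a side note, the thirteen `remove` processors above could probably be collapsed into one, since the ingest `remove` processor accepts a list of fields, and `ignore_missing` keeps the pipeline from failing on events that lack one of them. The sketch below builds the compact processor list from the verbose one. The helper names are made up for illustration, and adding `ignore_missing` is a suggestion rather than something the original pipeline had.

```python
import json


def is_plain_remove(proc):
    """True for processors of the exact form {"remove": {"field": "<name>"}}."""
    return (
        set(proc) == {"remove"}
        and set(proc["remove"]) == {"field"}
        and isinstance(proc["remove"]["field"], str)
    )


def merge_remove_processors(processors):
    """Collapse runs of single-field remove processors into one processor each."""
    merged = []
    current = None  # the combined remove processor being filled, if any
    for proc in processors:
        if is_plain_remove(proc):
            if current is None:
                current = {"remove": {"field": [], "ignore_missing": True}}
                merged.append(current)
            current["remove"]["field"].append(proc["remove"]["field"])
        else:
            merged.append(proc)
            current = None  # preserve ordering around other processors
    return merged


# Shortened version of the pipeline above.
verbose = [{"remove": {"field": name}}
           for name in ("@timestamp", "agent.id", "host.name")]
print(json.dumps(merge_remove_processors(verbose), indent=2))
```

Dropping the fields on the Filebeat side with the `drop_fields` processor is another option, although `drop_fields` cannot remove `@timestamp`, so that one would still need the ingest pipeline.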

So in the Filebeat config file, I have specified this pipeline to be used, and it does its job well. The only problem is that an additional "keyword" field duplicating the file content appears below.

Is there any way I can get just the content of the files alone, nothing more and nothing less, like it appeared when I used the Data Visualizer?

https://imgur.com/1t8LIoj

Can you please help me understand why this is appearing, and how I can maybe get rid of it and keep things simple, with the content not inside the "message" field but each record just existing on its own, like in the above image?
