Configuring Elasticsearch Mappings and Logstash

Hi,
I am trying to feed a .csv file into Logstash and then get the index into Kibana. When a dynamic mapping is used and I run Logstash, it works fine and Kibana shows the index. Following are the config file and the dynamic mapping created by Logstash.

input {
    file {
        path => "D:\Projects\A\Installations\logstash\logstash-2.3.4\bin\code.txt"
        start_position => "beginning"
    }
}
filter {
    csv {
        columns => ["A", "B", "C", "D"]
        separator => ","
    }
    mutate {
        convert => {
            "B" => "integer"
            "C" => "integer"
            "D" => "integer"
        }
    }
}
output {
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "report"
        document_id => "%{A}"
    }
    stdout { codec => rubydebug }
}

The dynamic mapping on the Elasticsearch side:

"report" : {
    "mappings" : {
      "logs" : {
        "properties" : {
          "@timestamp" : {
            "type" : "date",
            "format" : "strict_date_optional_time||epoch_millis"
          },
          "@version" : {
            "type" : "string"
          },
          "A" : {
            "type" : "string"
          },
          "B" : {
            "type" : "long"
          },
          "C" : {
            "type" : "long"
          },
          "D" : {
            "type" : "long"
          },
          "host" : {
            "type" : "string"
          },
          "message" : {
            "type" : "string"
          },
          "path" : {
            "type" : "string"
          }
        }
      }
    }
  }
}

However, once I create a custom mapping (shown below) and try to upload the documents, it's not accepted by Elasticsearch. Following is the mapping I have created.

curl -XPUT 'http://localhost:9200/test_coverage/' -d '{
    "settings" : {
        "index" : {
            "number_of_shards" : 3,
            "number_of_replicas" : 2
        }
    },
    "mappings" : {
        "logs" : {
            "properties" : {
                "A" : { "type" : "string", "index" : "not_analyzed" },
                "B" : { "type" : "integer" },
                "C" : { "type" : "integer" },
                "D" : { "type" : "integer" }
            }
        }
    }
}'
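
For reference, one quick way to check whether Elasticsearch actually accepted that mapping is to read it back (assuming the same index name as above):

curl -XGET 'http://localhost:9200/test_coverage/_mapping?pretty'

The PUT above should also return "acknowledged": true if the index and mapping were created.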

I have the following questions:

  1. Do I need to add the meta fields (@timestamp, @version, etc.) to the custom mapping I am creating in Elasticsearch?
  2. Dynamic mapping identifies these fields as long, but integer is adequate. Can't I force it to use integers?
  3. Once I delete a record in the log file, it seems that the change is not reflected in the Elasticsearch index. Is there any way to configure this through the Logstash configuration file, or do I have to manually remove the document via an external script?

I have the following versions of the ELK stack and I am working on Windows 7 64-bit.

Kibana 4.5.2
Logstash 2.3.4
Elasticsearch 2.3.4

Thank You!

1/ It looks like you haven't specified an index pattern in your mapping template; it should be something like below:

 "template": "report*",
  "settings":  { },
  "mappings" : { }

Any index named report or report-abc... will use this template
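
For reference, a complete template request along those lines might look like this. This is only a sketch for ES 2.x; the template name report_template is an arbitrary example, and the logs type and field list are copied from the dynamic mapping above:

curl -XPUT 'http://localhost:9200/_template/report_template' -d '{
    "template" : "report*",
    "settings" : {
        "number_of_shards" : 3,
        "number_of_replicas" : 2
    },
    "mappings" : {
        "logs" : {
            "properties" : {
                "A" : { "type" : "string", "index" : "not_analyzed" },
                "B" : { "type" : "integer" },
                "C" : { "type" : "integer" },
                "D" : { "type" : "integer" }
            }
        }
    }
}'

Because it is a template, it is applied automatically whenever Logstash creates an index whose name matches report*, so the index doesn't have to be created by hand first.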

2/ You can use a dynamic template to set the data type for fields, but those field names must have something in common to match on, like

any_field1
any_field2
any_field3

Then you can use the following dynamic template to set those fields as integer:

"dynamic_templates": [
        {
          "integer_field": {
            "mapping": {
              "type": "integer"
            },
            "match_mapping_type": "string",
            "match": "any_*"
          }
        }
]
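
In case it helps, the dynamic_templates array sits inside the type mapping, next to properties. A sketch of the whole thing in template form (same assumptions as the example under 1/; any_* is just a placeholder pattern):

curl -XPUT 'http://localhost:9200/_template/report_template' -d '{
    "template" : "report*",
    "mappings" : {
        "logs" : {
            "dynamic_templates" : [
                {
                    "integer_field" : {
                        "match" : "any_*",
                        "match_mapping_type" : "string",
                        "mapping" : { "type" : "integer" }
                    }
                }
            ],
            "properties" : {
                "A" : { "type" : "string", "index" : "not_analyzed" }
            }
        }
    }
}'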

3/ I'm not sure if you are able to delete just certain documents in an ES index.

I'm not sure if you are able to delete just certain documents in an ES index.

Sure you can. However, deleting (or changing) lines in a log file monitored by Logstash won't cause the corresponding documents in ES to be touched.
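
For example, since the documents above are indexed with field A as the id, a single document can be removed directly (VALUE_OF_A is a placeholder for the actual key):

curl -XDELETE 'http://localhost:9200/report/logs/VALUE_OF_A'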

Thanks. But how can I make the deletes reflected in the documents in the ES index? That is one of the problems I am currently having.

What kind of files are you monitoring? How are they updated?

Hi, the files are .csv files and they are being updated by a script. Actually, the script's output becomes the .csv file.

The challenge here is that Logstash, with few exceptions, is stateless, i.e. it doesn't track what it has processed. Specifically, it has no support for detecting what has been deleted, but if you know the ids of the documents in ES that have been deleted you can use the elasticsearch output to delete them.
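
For example, if a separate step writes the keys (the field A values) of the removed rows to a file, a second pipeline could delete the matching documents. This is only a sketch; deleted_keys.txt is a hypothetical file that your script would have to produce:

input {
    file {
        path => "D:/path/to/deleted_keys.txt"   # hypothetical file: one deleted key (field A value) per line
        start_position => "beginning"
    }
}
filter {
    mutate {
        strip => ["message"]   # drop stray whitespace/carriage returns around the key
    }
}
output {
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "report"
        document_id => "%{message}"   # the whole line is the key
        action => "delete"            # delete the document instead of indexing it
    }
}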

You'll need to write glue that diffs the old and new CSV files and emits a list of documents that have disappeared. If the data has a natural key that you can pass to ES as the document id, you can use that key to delete the documents. Otherwise you can generate an id based on other fields (check out the fingerprint filter).
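
If there is no natural key, here is a rough sketch of the fingerprint idea (the source field, the key string, and the @metadata field name are placeholders; the hash is kept in @metadata so it isn't indexed as a field):

filter {
    fingerprint {
        source => "message"                 # or whichever fields identify a row
        target => "[@metadata][row_id]"
        method => "SHA1"
        key => "report"                     # constant key string; older fingerprint versions require it for SHA1
    }
}
output {
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "report"
        document_id => "%{[@metadata][row_id]}"
    }
}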

Hi, actually I have resolved that challenge using the following configuration. I am taking field A as the key. However, this configuration works in only one of the following scenarios.

  1. When field B initially has the value 688 and I change it from 688 to 68 => works.
  2. When field B has the value 6 and I change it from 6 to 800 => doesn't work.
  3. Adding a new row to the file => doesn't work.

Scenarios 2 and 3 produce the following error:

"exception"=>#<NoMethodError: undefined method 'split' for nil:NilClass>, which I strongly believe is due to not having a proper mapping on the Elasticsearch side. The custom mapping I apply still doesn't work, and I am currently working with a dynamic mapping.

Here is the output configuration of Logstash.

output {
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "report"
        document_id => "%{A}"
    }
    stdout { codec => rubydebug }
}