Date matching

I have a CSV that I have indexed into Elasticsearch. I have the following setup; how can I verify that the dates are being indexed as timestamps?

# Sample Logstash configuration for creating a simple
# Beats -> Logstash -> Elasticsearch pipeline.

#input {
#  beats {
#    port => 5044
#  }
#}

#output {
#  elasticsearch {
#    hosts => ["http://localhost:9200"]
#    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
#    #user => "elastic"
#    #password => "changeme"
#  }
#}

input {
  file {
    path => "/etc/logstash/apps/2019.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    separator => ","
    columns => ["App #","Resubmis.","Month Sub","Date of Rec","Date of FP","Date of Resp","Resale","Vio","VA/OS","Address","Sec.","Final Pics","Request","Home Type","Steve Walters","Antonio Alaimo","Gerald","Michael Flack","Brian","Michael Brown","Jessica Arseneault","Jayla Walters","Tim Swigert","David Gurule","Final Decision","Group Decision Date","DIP"]
  }
  mutate { convert => ["DIP","float"]}
  mutate { convert => ["Month Sub","integer"]}
  mutate { convert => ["Resale","boolean"]}
  mutate { convert => ["Vio","boolean"]}
  mutate { convert => ["App #","integer"]}
  mutate { convert => ["Resubmis","boolean"]}
  mutate { convert => ["Final Pics","boolean"]}
  date { match => ["Date of Rec", "M/d/yyyy"]}
  date { match => ["Date of FP", "M/d/yyyy"]}
  date { match => ["Date of Resp", "M/d/yyyy"]}
  date { match => ["Group Decision Date", "M/d/yyyy"]}
}
output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "2019apps"
    document_type => "arcapps"
  }
  stdout{}
}

Am I doing this correctly?

Can you provide an example or two of the CSV entries?

Yes, I can:

1,No,1,1/15/2019,2/5/2019,2/26/2019,No,No,OS,7302 Stallings Drive,B,No,Deck,TH,approve(2/12),No Vote,No Vote,No Vote,No Vote,Approved 2/7/19,N/A,N/A,N/A,N/A,Approved,2/12/2019,7
2,No,1,1/25/2019,,,No,No,OS,7816 Stonebriar Drive,A1,No,Deck,SFH,resigned (4/9/19),,,,,N/A,N/A,N/A,N/A,N/A,Incomplete,,
3,No,1,1/25/2019,2/5/2019,2/26/2019,No,No,OS,6859 Archibald Drive,B,No,Deck,TH,Approve 1/30/19,No Vote,No Vote,No Vote,No Vote,Approved (2/19/19),N/A,N/A,N/A,N/A,Approved,2/19/2019,14
4,No,2,2/19/2019,2/25/2019,3/18/2019,No,Yes,OS,7614 Gunmill Lane,D,Yes,Fence,TH,Approved(3/12),No Vote,No Vote,No Vote,No Vote,Approved (3/5/19),N/A,N/A,N/A,N/A,Approved,3/12/2019,15
4,No,2,2/19/2019,2/25/2019,3/18/2019,No,Yes,OS,7614 Gunmill Lane,D,Yes,Patio,TH,Approved(3/12),No Vote,No Vote,No Vote,No Vote,Approved (3/5/19),N/A,N/A,N/A,N/A,Approved,3/12/2019,15
4,No,2,2/19/2019,2/25/2019,3/18/2019,No,Yes,OS,7614 Gunmill Lane,D,Yes,Hardscape/Edging,TH,approved(3/12),No Vote,No Vote,No Vote,No Vote,Approved (3/5/19),N/A,N/A,N/A,N/A,Approved,3/12/2019,15
5,No,2,2/21/2019,2/21/2019,3/14/2019,No,No,OS,6824 Warfield Street,B,No,Fence,TH,Approve 2/27/19,No Vote,No Vote,No Vote,No Vote,Approved 3/1/19,N/A,N/A,N/A,N/A,Approved,3/1/2019,8
5,No,2,2/21/2019,2/21/2019,3/14/2019,No,No,OS,6824 Warfield Street,B,No,Walkway,TH,Approve 2/27/19,No Vote,No Vote,No Vote,No Vote,Approved 3/1/19,N/A,N/A,N/A,N/A,Approved,3/1/2019,8
6,No,2,2/21/2019,3/20/2019,4/10/2019,No,No,OS,7015 Dannfield Ct,1,No,Fence,TH,Approved(3/25),No Vote,No Vote,No Vote,No Vote,Approved(3/25),N/A,N/A,N/A,N/A,Approved,3/25/2019,5
6,No,2,2/21/2019,3/5/2019,3/26/2019,No,No,OS,7015 Dannfield Ct,1,No,Storm Door,TH,Approved(3/25),No Vote,No Vote,No Vote,No Vote,Approved (3/5/19),N/A,N/A,N/A,N/A,Approved,3/25/2019,20
7,No,2,2/25/2019,2/25/2019,3/18/2019,No,No,OS,507  Kingslet Roost,OP,No,Deck,TH,Approve 2/27/19,No Vote,No Vote,No Vote,No Vote,Approved (3/1/19),N/A,N/A,N/A,N/A,Approved,3/1/2019,4
8,No,3,3/11/2019,3/25/2019,4/15/2019,No,No,OS,6854 Warfield,B,No,patio,TH,approved(3/26),No Vote,No Vote,No Vote,No Vote,Approved  - 3/27/19,,N/A,N/A,N/A,Approved,3/27/2019,2
9,No,3,3/11/2019,4/9/2019,4/30/2019,No,No,OS,7281 Stallings,,No,fence,TH,approved(4/9),No Vote,No Vote,No Vote,No Vote,Approved (4/9/19),,N/A,N/A,N/A,Approved,4/9/2019,1
10,No,3,3/12/2019,3/12/2019,4/2/2019,No,No,OS,1002 Sithean Way,A,No,solar,TH,Approved(3/13),No Vote,No Vote,No Vote,No Vote,Approved (3/18),NA,N/A,N/A,N/A,Approved,3/18/2019,6
11,No,3,3/18/2019,3/18/2019,4/8/2019,Yes,No,OS,7521 Briargrove Lane,A,Yes,Fence,TH,approved(3/22),No Vote,No Vote,No Vote,No Vote,Approved (3/21),NA,N/A,N/A,N/A,Approved,3/22/2019,4
12,No,3,3/25/2019,3/25/2019,4/15/2019,No,No,OS,6845 Warfield Street,B,No,Gravel (Pea Gravel),TH,approves(3/26),No Vote,No Vote,No Vote,No Vote,Approved (3/26/19),NA,N/A,N/A,N/A,Approved,3/26/2019,1

Those four date filters all write to @timestamp by default, so each one overwrites the previous value and the original fields are left as text. I think you want

    date { match => ["Date of Rec", "M/d/yyyy"] target => "Date of Rec" }
    date { match => ["Date of FP", "M/d/yyyy"] target => "Date of FP" }
    date { match => ["Date of Resp", "M/d/yyyy"] target => "Date of Resp" }
    date { match => ["Group Decision Date", "M/d/yyyy"] target => "Group Decision Date" }

Will those be listed as timestamp/date fields or just as text when I look at them?

If they have already been indexed as text then additional documents in the same index will have them as text. If you roll over to a new index then they will start being mapped as dates.
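
If you want to check what an index actually ended up with, you can look at its mapping directly. A minimal check, assuming the 2019apps index from the config above and Elasticsearch on localhost:

    curl -s "http://localhost:9200/2019apps/_mapping?pretty"

Fields that were parsed by the date filter and first written into a fresh index should show up with "type": "date"; anything already dynamically mapped as text stays text for the lifetime of that index.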


Thank you, this has worked. Thank you so much!

Removed; I will start a new thread in the Kibana forum.

How can I make it so that when there is an update to the file (a modification of an existing line or a new line), it does not reingest everything all over again?

Unless you are only appending lines to the file, logstash will have to process every entry. You can take a subset of the columns and use a fingerprint filter to set the document_id option on the elasticsearch output. Then elasticsearch will overwrite the existing document rather than create a new one. It still does all the work, but you do not get duplicates.
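
A minimal sketch of that, assuming App #, Address and Request together identify a row (adjust the source list to whatever combination is actually unique in your CSV):

    filter {
      # hash the identifying columns and keep the result in @metadata so it is not indexed
      fingerprint {
        source => ["App #", "Address", "Request"]
        concatenate_sources => true
        method => "SHA256"
        target => "[@metadata][fingerprint]"
      }
    }
    output {
      elasticsearch {
        hosts => "http://localhost:9200"
        index => "2019apps"
        # a re-ingested row produces the same id, so it overwrites instead of duplicating
        document_id => "%{[@metadata][fingerprint]}"
      }
    }

Leaving action at its default (index) means elasticsearch creates the document if it is new and overwrites it otherwise.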

Thank you for your quick reply.

I am getting this error/warning now

[WARN ] 2019-11-03 20:17:54.148 [[main]>worker0] csv - Error parsing csv {:field=>"message", :source=>"", :exception=>#<NoMethodError: undefined method `each_index' for nil:NilClass>}
[WARN ] 2019-11-03 20:17:54.168 [[main]>worker0] elasticsearch - Could not index event to Elasticsearch. {:status=>404, :action=>["update", {:_id=>"fishstats", :_index=>"fishstats", :_type=>"doc", :routing=>nil, :_retry_on_conflict=>1}, #<LogStash::Event:0x658d2fc1>], :response=>{"update"=>{"_index"=>"fishstats", "_type"=>"doc", "_id"=>"fishstats", "status"=>404, "error"=>{"type"=>"document_missing_exception", "reason"=>"[doc][fishstats]: document missing", "index_uuid"=>"XzCpHqrcQ6SL8XFscnjLVw", "shard"=>"4", "index"=>"fishstats"}}}}
[WARN ] 2019-11-03 20:17:54.168 [[main]>worker0] elasticsearch - Could not index event to Elasticsearch. {:status=>404, :action=>["update", {:_id=>"fishstats", :_index=>"fishstats", :_type=>"doc", :routing=>nil, :_retry_on_conflict=>1}, #<LogStash::Event:0xf6cf018>], :response=>{"update"=>{"_index"=>"fishstats", "_type"=>"doc", "_id"=>"fishstats", "status"=>404, "error"=>{"type"=>"document_missing_exception", "reason"=>"[doc][fishstats]: document missing", "index_uuid"=>"XzCpHqrcQ6SL8XFscnjLVw", "shard"=>"4", "index"=>"fishstats"}}}}

My logstash.conf is the following

# Sample Logstash configuration for creating a simple
# Beats -> Logstash -> Elasticsearch pipeline.

#input {
#  beats {
#    port => 5044
#  }
#}

#output {
#  elasticsearch {
#    hosts => ["http://localhost:9200"]
#    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
#    #user => "elastic"
#    #password => "changeme"
#  }
#}

#input {
#  file {
#        path => "/etc/logstash/apps/2019.csv"
#        start_position => "beginning"
#        sincedb_path => "/dev/null"
#        }
#}
#filter {
#        csv {
#                separator => ","
#                columns => ["App #","Resubmis.","Month Sub","Date of Rec","Date of FP","Date of Resp","Resale","Vio","VA/OS","Address","Sec.","Final Pics","Request","Home Type","Steve Walters","Antonio Alaimo","Gerald","Michael Flack","Brian","Michael Brown","Jessica Arseneault","Jayla Walters","Tim Swigert","David Gurule","Final Decision","Group Decision Date","DIP"]
#        }
#       mutate { convert => ["DIP","float"]}
#       mutate { convert => ["Month Sub","integer"]}
#       mutate { convert => ["Resale","boolean"]}
#       mutate { convert => ["Vio","boolean"]}
#       mutate { convert => ["App #","integer"]}
#       mutate { convert => ["Resubmis","boolean"]}
#       mutate { convert => ["Final Pics","boolean"]}
#       date { match => ["Date of Rec", "M/d/yyyy"] target => "Date of Rec" }
#       date { match => ["Date of FP", "M/d/yyyy"] target => "Date of FP" }
#       date { match => ["Date of Resp", "M/d/yyyy"] target => "Date of Resp" }
#       date { match => ["Group Decision Date", "M/d/yyyy"] target => "Group Decision Date" }
#
#}

input {
  file {
    path => "/etc/logstash/fish/10gFresh.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

filter {
  csv {
    separator => ","
    columns => ["date","temp","pH","Alkalinity","salinity","chlorine","hardness","ammonia","nitrites","nitrates","who"]
  }

  mutate { convert => ["temp","float"]}
  mutate { convert => ["pH","float"]}
  mutate { convert => ["Alkalinity","float"]}
  mutate { convert => ["salinity","float"]}
  mutate { convert => ["chlorine","float"]}
  mutate { convert => ["hardness","float"]}
  mutate { convert => ["ammonia","float"]}
  mutate { convert => ["nitrites","float"]}
  mutate { convert => ["nitrates","float"]}
  date { match => ["date","M/d/yyyy"] target => "date"}
}

output {
  elasticsearch {
    hosts => "http://localhost:9200"
    #index => "2019apps"
    index => "fishstats"
    action => "update"
    document_id => "fishstats"
    #document_type => "arcapps"
  }
  stdout{}
}

So I will research the document ID a little and go from there. That being said, how can I use one config file for multiple files that use different indexes?

OK, so the [message] field is empty or missing, and the ruby CSV parser is returning nil, so when the csv filter tries to iterate over the columns using each_index it gets that exception.
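
If the file ends with (or gains) blank lines, one way to keep them away from the csv filter is to drop those events first; a minimal sketch:

    filter {
      # drop events whose message is empty or whitespace-only before the csv filter sees them
      if [message] =~ /^\s*$/ {
        drop { }
      }
      # the existing csv, mutate and date filters follow here unchanged
    }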

I'm not an elasticsearch expert, but the second error message appears to be about the missing document: with action => "update", elasticsearch returns document_missing_exception when no document with that id exists yet.

You can write to multiple indexes with a single output by using a sprintf reference in the elasticsearch output index option.
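
For example, a sketch that tags each file input with its target index in @metadata and references that in a single elasticsearch output; the [@metadata][index] name is just an illustration:

    input {
      file {
        path => "/etc/logstash/apps/2019.csv"
        start_position => "beginning"
        sincedb_path => "/dev/null"
        # events from this file carry the index they should be written to
        add_field => { "[@metadata][index]" => "2019apps" }
      }
      file {
        path => "/etc/logstash/fish/10gFresh.csv"
        start_position => "beginning"
        sincedb_path => "/dev/null"
        add_field => { "[@metadata][index]" => "fishstats" }
      }
    }
    output {
      elasticsearch {
        hosts => "http://localhost:9200"
        # sprintf reference: each event goes to the index named in its metadata
        index => "%{[@metadata][index]}"
      }
    }

You would then wrap each set of csv/mutate/date filters in a conditional on the same metadata field so each file gets the right parsing.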

I think this has to do with the document id issue. I am still looking at that.
