Date matching

I have a CSV that I have indexed into Elasticsearch. I have the following setup; how can I verify that the dates are being indexed as timestamps?

# Sample Logstash configuration for creating a simple
# Beats -> Logstash -> Elasticsearch pipeline.

#input {
#  beats {
#    port => 5044
#  }
#}

#output {
#  elasticsearch {
#    hosts => ["http://localhost:9200"]
#    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
#    #user => "elastic"
#    #password => "changeme"
#  }
#}

input {
  file {
    path => "/etc/logstash/apps/2019.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    separator => ","
    columns => ["App #","Resubmis.","Month Sub","Date of Rec","Date of FP","Date of Resp","Resale","Vio","VA/OS","Address","Sec.","Final Pics","Request","Home Type","Steve Walters","Antonio Alaimo","Gerald","Michael Flack","Brian","Michael Brown","Jessica Arseneault","Jayla Walters","Tim Swigert","David Gurule","Final Decision","Group Decision Date","DIP"]
  }
  mutate { convert => ["DIP","float"]}
  mutate { convert => ["Month Sub","integer"]}
  mutate { convert => ["Resale","boolean"]}
  mutate { convert => ["Vio","boolean"]}
  mutate { convert => ["App #","integer"]}
  mutate { convert => ["Resubmis","boolean"]}
  mutate { convert => ["Final Pics","boolean"]}
  date { match => ["Date of Rec", "M/d/yyyy"]}
  date { match => ["Date of FP", "M/d/yyyy"]}
  date { match => ["Date of Resp", "M/d/yyyy"]}
  date { match => ["Group Decision Date", "M/d/yyyy"]}
}
output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "2019apps"
    document_type => "arcapps"
  }
  stdout{}
}

Am I doing this correctly?

Can you provide an example or two of the CSV entries?

Yes, I can:

1,No,1,1/15/2019,2/5/2019,2/26/2019,No,No,OS,7302 Stallings Drive,B,No,Deck,TH,approve(2/12),No Vote,No Vote,No Vote,No Vote,Approved 2/7/19,N/A,N/A,N/A,N/A,Approved,2/12/2019,7
2,No,1,1/25/2019,,,No,No,OS,7816 Stonebriar Drive,A1,No,Deck,SFH,resigned (4/9/19),,,,,N/A,N/A,N/A,N/A,N/A,Incomplete,,
3,No,1,1/25/2019,2/5/2019,2/26/2019,No,No,OS,6859 Archibald Drive,B,No,Deck,TH,Approve 1/30/19,No Vote,No Vote,No Vote,No Vote,Approved (2/19/19),N/A,N/A,N/A,N/A,Approved,2/19/2019,14
4,No,2,2/19/2019,2/25/2019,3/18/2019,No,Yes,OS,7614 Gunmill Lane,D,Yes,Fence,TH,Approved(3/12),No Vote,No Vote,No Vote,No Vote,Approved (3/5/19),N/A,N/A,N/A,N/A,Approved,3/12/2019,15
4,No,2,2/19/2019,2/25/2019,3/18/2019,No,Yes,OS,7614 Gunmill Lane,D,Yes,Patio,TH,Approved(3/12),No Vote,No Vote,No Vote,No Vote,Approved (3/5/19),N/A,N/A,N/A,N/A,Approved,3/12/2019,15
4,No,2,2/19/2019,2/25/2019,3/18/2019,No,Yes,OS,7614 Gunmill Lane,D,Yes,Hardscape/Edging,TH,approved(3/12),No Vote,No Vote,No Vote,No Vote,Approved (3/5/19),N/A,N/A,N/A,N/A,Approved,3/12/2019,15
5,No,2,2/21/2019,2/21/2019,3/14/2019,No,No,OS,6824 Warfield Street,B,No,Fence,TH,Approve 2/27/19,No Vote,No Vote,No Vote,No Vote,Approved 3/1/19,N/A,N/A,N/A,N/A,Approved,3/1/2019,8
5,No,2,2/21/2019,2/21/2019,3/14/2019,No,No,OS,6824 Warfield Street,B,No,Walkway,TH,Approve 2/27/19,No Vote,No Vote,No Vote,No Vote,Approved 3/1/19,N/A,N/A,N/A,N/A,Approved,3/1/2019,8
6,No,2,2/21/2019,3/20/2019,4/10/2019,No,No,OS,7015 Dannfield Ct,1,No,Fence,TH,Approved(3/25),No Vote,No Vote,No Vote,No Vote,Approved(3/25),N/A,N/A,N/A,N/A,Approved,3/25/2019,5
6,No,2,2/21/2019,3/5/2019,3/26/2019,No,No,OS,7015 Dannfield Ct,1,No,Storm Door,TH,Approved(3/25),No Vote,No Vote,No Vote,No Vote,Approved (3/5/19),N/A,N/A,N/A,N/A,Approved,3/25/2019,20
7,No,2,2/25/2019,2/25/2019,3/18/2019,No,No,OS,507  Kingslet Roost,OP,No,Deck,TH,Approve 2/27/19,No Vote,No Vote,No Vote,No Vote,Approved (3/1/19),N/A,N/A,N/A,N/A,Approved,3/1/2019,4
8,No,3,3/11/2019,3/25/2019,4/15/2019,No,No,OS,6854 Warfield,B,No,patio,TH,approved(3/26),No Vote,No Vote,No Vote,No Vote,Approved  - 3/27/19,,N/A,N/A,N/A,Approved,3/27/2019,2
9,No,3,3/11/2019,4/9/2019,4/30/2019,No,No,OS,7281 Stallings,,No,fence,TH,approved(4/9),No Vote,No Vote,No Vote,No Vote,Approved (4/9/19),,N/A,N/A,N/A,Approved,4/9/2019,1
10,No,3,3/12/2019,3/12/2019,4/2/2019,No,No,OS,1002 Sithean Way,A,No,solar,TH,Approved(3/13),No Vote,No Vote,No Vote,No Vote,Approved (3/18),NA,N/A,N/A,N/A,Approved,3/18/2019,6
11,No,3,3/18/2019,3/18/2019,4/8/2019,Yes,No,OS,7521 Briargrove Lane,A,Yes,Fence,TH,approved(3/22),No Vote,No Vote,No Vote,No Vote,Approved (3/21),NA,N/A,N/A,N/A,Approved,3/22/2019,4
12,No,3,3/25/2019,3/25/2019,4/15/2019,No,No,OS,6845 Warfield Street,B,No,Gravel (Pea Gravel),TH,approves(3/26),No Vote,No Vote,No Vote,No Vote,Approved (3/26/19),NA,N/A,N/A,N/A,Approved,3/26/2019,1

Those four date filters all write to @timestamp by default, so each one overwrites the previous value and the original fields are left as text. I think you want

    date { match => ["Date of Rec", "M/d/yyyy"] target => "Date of Rec" }
    date { match => ["Date of FP", "M/d/yyyy"] target => "Date of FP" }
    date { match => ["Date of Resp", "M/d/yyyy"] target => "Date of Resp" }
    date { match => ["Group Decision Date", "M/d/yyyy"] target => "Group Decision Date" }

Will those be listed as timestamp/date fields or just as text when I look at them?

If they have already been indexed as text then additional documents in the same index will have them as text. If you roll over to a new index then they will start being mapped as dates.
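
If you want to check what an index actually ended up with, you can look at its mapping directly. A minimal check, assuming the 2019apps index from the config above and Elasticsearch on localhost:

    curl -s "http://localhost:9200/2019apps/_mapping?pretty"

Fields that were parsed by the date filter and first written into a fresh index should show up with "type": "date"; anything already dynamically mapped as text stays text for the lifetime of that index.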


Thank you, this has worked. Thank you so much!

Removed; I will start a new thread in the Kibana forum.

How can I make it so that when there is an update to the file (a modification of an existing line or a new line), it does not reingest everything all over again?

Unless you are only appending lines to the file, logstash will have to process every entry. You can take a subset of the columns and use a fingerprint filter to set the document_id option on the elasticsearch output. Then elasticsearch will overwrite the existing document rather than create a new one. It still does all the work, but you do not get duplicates.
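
A minimal sketch of that, assuming App #, Address and Request together identify a row (adjust the source list to whatever combination is actually unique in your CSV):

    filter {
      # hash the identifying columns and keep the result in @metadata so it is not indexed
      fingerprint {
        source => ["App #", "Address", "Request"]
        concatenate_sources => true
        method => "SHA256"
        target => "[@metadata][fingerprint]"
      }
    }
    output {
      elasticsearch {
        hosts => "http://localhost:9200"
        index => "2019apps"
        # a re-ingested row produces the same id, so it overwrites instead of duplicating
        document_id => "%{[@metadata][fingerprint]}"
      }
    }

Leaving action at its default (index) means elasticsearch creates the document if it is new and overwrites it otherwise.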

Thank you for your quick reply.

I am getting this error/warning now

[WARN ] 2019-11-03 20:17:54.148 [[main]>worker0] csv - Error parsing csv {:field=>"message", :source=>"", :exception=>#<NoMethodError: undefined method `each_index' for nil:NilClass>}
[WARN ] 2019-11-03 20:17:54.168 [[main]>worker0] elasticsearch - Could not index event to Elasticsearch. {:status=>404, :action=>["update", {:_id=>"fishstats", :_index=>"fishstats", :_type=>"doc", :routing=>nil, :_retry_on_conflict=>1}, #<LogStash::Event:0x658d2fc1>], :response=>{"update"=>{"_index"=>"fishstats", "_type"=>"doc", "_id"=>"fishstats", "status"=>404, "error"=>{"type"=>"document_missing_exception", "reason"=>"[doc][fishstats]: document missing", "index_uuid"=>"XzCpHqrcQ6SL8XFscnjLVw", "shard"=>"4", "index"=>"fishstats"}}}}
[WARN ] 2019-11-03 20:17:54.168 [[main]>worker0] elasticsearch - Could not index event to Elasticsearch. {:status=>404, :action=>["update", {:_id=>"fishstats", :_index=>"fishstats", :_type=>"doc", :routing=>nil, :_retry_on_conflict=>1}, #<LogStash::Event:0xf6cf018>], :response=>{"update"=>{"_index"=>"fishstats", "_type"=>"doc", "_id"=>"fishstats", "status"=>404, "error"=>{"type"=>"document_missing_exception", "reason"=>"[doc][fishstats]: document missing", "index_uuid"=>"XzCpHqrcQ6SL8XFscnjLVw", "shard"=>"4", "index"=>"fishstats"}}}}

My logstash.conf is the following

# Sample Logstash configuration for creating a simple
# Beats -> Logstash -> Elasticsearch pipeline.

#input {
#  beats {
#    port => 5044
#  }
#}

#output {
#  elasticsearch {
#    hosts => ["http://localhost:9200"]
#    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
#    #user => "elastic"
#    #password => "changeme"
#  }
#}

#input {
#  file {
#        path => "/etc/logstash/apps/2019.csv"
#        start_position => "beginning"
#        sincedb_path => "/dev/null"
#        }
#}
#filter {
#        csv {
#                separator => ","
#                columns => ["App #","Resubmis.","Month Sub","Date of Rec","Date of FP","Date of Resp","Resale","Vio","VA/OS","Address","Sec.","Final Pics","Request","Home Type","Steve Walters","Antonio Alaimo","Gerald","Michael Flack","Brian","Michael Brown","Jessica Arseneault","Jayla Walters","Tim Swigert","David Gurule","Final Decision","Group Decision Date","DIP"]
#        }
#       mutate { convert => ["DIP","float"]}
#       mutate { convert => ["Month Sub","integer"]}
#       mutate { convert => ["Resale","boolean"]}
#       mutate { convert => ["Vio","boolean"]}
#       mutate { convert => ["App #","integer"]}
#       mutate { convert => ["Resubmis","boolean"]}
#       mutate { convert => ["Final Pics","boolean"]}
#       date { match => ["Date of Rec", "M/d/yyyy"] target => "Date of Rec" }
#       date { match => ["Date of FP", "M/d/yyyy"] target => "Date of FP" }
#       date { match => ["Date of Resp", "M/d/yyyy"] target => "Date of Resp" }
#       date { match => ["Group Decision Date", "M/d/yyyy"] target => "Group Decision Date" }
#
#}

input {
  file {
    path => "/etc/logstash/fish/10gFresh.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

filter {
  csv {
    separator => ","
    columns => ["date","temp","pH","Alkalinity","salinity","chlorine","hardness","ammonia","nitrites","nitrates","who"]
  }

  mutate { convert => ["temp","float"]}
  mutate { convert => ["pH","float"]}
  mutate { convert => ["Alkalinity","float"]}
  mutate { convert => ["salinity","float"]}
  mutate { convert => ["chlorine","float"]}
  mutate { convert => ["hardness","float"]}
  mutate { convert => ["ammonia","float"]}
  mutate { convert => ["nitrites","float"]}
  mutate { convert => ["nitrates","float"]}
  date { match => ["date","M/d/yyyy"] target => "date"}
}

output {
  elasticsearch {
    hosts => "http://localhost:9200"
    #index => "2019apps"
    index => "fishstats"
    action => "update"
    document_id => "fishstats"
    #document_type => "arcapps"
  }
  stdout{}
}

So I will research the document ID a little and go from there. That being said, how can I use one config file for multiple files that use different indexes?

OK, so the [message] field is empty or missing, and the ruby CSV parser is returning nil, so when the csv filter tries to iterate over the columns using each_index it gets that exception.
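
If the file ends with (or gains) blank lines, one way to keep them away from the csv filter is to drop those events first; a minimal sketch:

    filter {
      # drop events whose message is empty or whitespace-only before the csv filter sees them
      if [message] =~ /^\s*$/ {
        drop { }
      }
      # the existing csv, mutate and date filters follow here unchanged
    }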

I'm not an elasticsearch expert, but the second error message appears to be about the missing document: with action => "update", elasticsearch returns document_missing_exception when no document with that id exists yet.

You can write to multiple indexes with a single output by using a sprintf reference in the elasticsearch output index option.
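
For example, a sketch that tags each file input with its target index in @metadata and references that in a single elasticsearch output; the [@metadata][index] name is just an illustration:

    input {
      file {
        path => "/etc/logstash/apps/2019.csv"
        start_position => "beginning"
        sincedb_path => "/dev/null"
        # events from this file carry the index they should be written to
        add_field => { "[@metadata][index]" => "2019apps" }
      }
      file {
        path => "/etc/logstash/fish/10gFresh.csv"
        start_position => "beginning"
        sincedb_path => "/dev/null"
        add_field => { "[@metadata][index]" => "fishstats" }
      }
    }
    output {
      elasticsearch {
        hosts => "http://localhost:9200"
        # sprintf reference: each event goes to the index named in its metadata
        index => "%{[@metadata][index]}"
      }
    }

You would then wrap each set of csv/mutate/date filters in a conditional on the same metadata field so each file gets the right parsing.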

I think this has to do with the document id issue. I am still looking at that.
