Problem with dynamic log update using http_poller

Hi,

This is my configuration file:
input {
  http_poller {
    urls => {
      url129server => "http://xxx.log"
    }
    codec => "plain"
    metadata_target => "http_poller_metadata"
    request_timeout => 60
    schedule => { "every" => "1h" }
    tags => ["check1"]
  }

  file {
    sincedb_path => "since_db"
  }
}

filter {
  if "check1" in [tags] {
    split {
      field => "message"
    }
    grok {
      match => { "message" => '%{IP:client} -  -  \[%{MONTHDAY:day}/%{MONTH:month}/%{YEAR:year}:%{HOUR:hour}:%{MINUTE:minute}:%{SECOND:second} %{ISO8601_TIMEZONE:timezone}\] "%{WORD:method} %{URIPATHPARAM:request} HTTP%{URIPATHPARAM:httpVersion}" %{NUMBER:responseCode} -  %{NUMBER:responseTime}' }
    }
    mutate {
      add_field => { "time" => "%{day}/%{month}/%{year}:%{hour}:%{minute}:%{second} %{timezone}" }
    }
    date {
      locale => "en"
      match => ["time", "dd/MMM/YYYY:HH:mm:ss +0100"]
    }
  }
}

output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
  }
  file {
    path => "C:/elastic/logstash/logstash-5.1.1/output/28decemberlog1.txt"
  }
  stdout {
    codec => rubydebug { metadata => true }
  }
}

I can't give you the actual URL. The parsing of the log data works fine. The URL I mentioned in http_poller serves dynamically growing log data, so every time the poller runs it starts again from the beginning. I want it to continue from where the previous run left off. Is there any way in Logstash to do that?

For example, the first time I poll the URL I get 50 log lines. A few minutes later the log contains 100 lines. I don't want to process the log from line 1 again; I want to continue from line 51.

Please suggest a way to overcome this problem.

I want it to continue from where the previous run left off. Is there any way in Logstash to do that?

No, but if you set the document id of the events you send to ES you will at least not get duplicates (because you'll overwrite the same event all the time).

You could e.g. set the document id to a hash of the event contents. You can use the fingerprint filter to generate the hash.

Thank you for your response. But how do I generate the document id?

If my last paragraph is unclear, please tell me what part is hard to understand.

What is meant by the document id of the event? Do I need to create a field called "document id"?

All documents in Elasticsearch have a unique id. See the ES documentation for details. The id of Logstash events sent to ES can be set via the elasticsearch output's document_id option. You can create a field in the event that contains the desired id and then reference that field in the output configuration.

output {
  elasticsearch {
    ...
    document_id => "%{name-of-field}"
  }
}

If you create that field as a subfield of @metadata it won't be included in the payload that's sent to ES. Something like this should work:

filter {
  fingerprint {
    method => "SHA256"
    key => "random string"
    target => "[@metadata][fingerprint]"
  }
}

output {
  elasticsearch {
    ...
    document_id => "%{[@metadata][fingerprint]}"
  }
}

You might need to adjust the fingerprint configuration to include additional fields.
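For example, a sketch of a multi-field fingerprint (assuming fields like client and time from the grok pattern above exist on the event; the exact field names are up to you) could look like this:

filter {
  fingerprint {
    # Hash several fields together instead of only "message";
    # the field names here are just illustrative.
    source => ["message", "client", "time"]
    concatenate_sources => true
    method => "SHA256"
    key => "random string"
    target => "[@metadata][fingerprint]"
  }
}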


Hi,

If I understand correctly, I need to put a unique field of my log event in the fingerprint, on both target and document_id, i.e. target => "my unique field" and document_id => "my unique field".

Am I right?

If I understand correctly, I need to put a unique field of my log event in the fingerprint

The fingerprint can be computed from multiple fields, but yes.

on both target and document_id, i.e. target => "my unique field" and document_id => "my unique field"

The field that's referenced in target and document_id is the field that should store the resulting fingerprint.
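Putting the two pieces together, a minimal sketch (the metadata field name is arbitrary; what matters is that the same field is reused in both places) would be:

filter {
  fingerprint {
    method => "SHA256"
    key => "random string"
    source => ["message"]
    target => "[@metadata][fingerprint]"    # the fingerprint is stored here...
  }
}

output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    document_id => "%{[@metadata][fingerprint]}"    # ...and reused here as the document id
  }
}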
