Hello,
I'm really new to Elasticsearch and I'm running an evaluation
(Elasticsearch vs. Solr) to decide which one I should use for
handling high-volume log indexing.
I have a very basic setup:
syslogInjector -> logstash listening on port 514 -> elasticsearch.
input {
  # listen for syslog traffic on both TCP and UDP
  tcp {
    port => 514
    type => "syslog"
  }
  udp {
    port => 514
    type => "syslog"
  }
}
filter {
  grok {
    # GREEDYDATA matches anything, so this just copies the whole line
    # into syslog_message and stamps two extra fields
    match => { "message" => "%{GREEDYDATA:syslog_message}" }
    add_field => [ "received_at", "%{@timestamp}" ]
    add_field => [ "received_from", "%{host}" ]
  }
}
output {
  elasticsearch { host => "localhost" }
}
My test is:
inject 10,000 syslog messages of 1 KB each, at a rate of 2,000/sec.
I ran tcpdump on the server that receives the logs, and all 10,000 logs are
received.
BUT when I look at the number of documents indexed in Lucene, I almost
never get 10,000 documents; sometimes I do, but most of the time I lose up to
40% of the logs.
Is there anything I can do to make sure that logstash doesn't lose data?
(I assume the problem is in logstash, but I'm not sure.)
Is logstash doing some buffering?
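In case it matters, this is the kind of udp input tuning I was planning to try
next. I'm not sure all of these options exist in my logstash version, so treat
it as a sketch rather than something I've validated:

input {
  udp {
    port => 514
    type => "syslog"
    # the three settings below are assumptions on my part (names taken
    # from the udp input docs, not verified on my version):
    buffer_size => 65536    # max bytes read per datagram
    queue_size  => 10000    # unprocessed packets held in memory before new ones are dropped
    workers     => 4        # threads draining that queue
  }
}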
Thank you for your response
Antoine Brun
Hello,
maybe my post wasn't specific enough...
well, I'm still trying to validate logstash, and I'm still losing up to 40%
of the logs:
- 10000 logs injected.
- 10000 logs detected by tcpdump (listening on port 514)
- 6000-10000 documents created in the elasticsearch index.
Is there a way to count, in logstash, the number of events that were processed
and forwarded to elasticsearch?
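Would something like the metrics filter work for this? The sketch below is
only adapted from the plugin docs; the [events][count] field name is my
assumption and I haven't verified it on my version:

filter {
  # meter every event that makes it through the pipeline
  metrics {
    meter => "events"
    add_tag => "metric"
  }
}
output {
  # the metrics filter periodically emits its own tagged events;
  # print the running total instead of sending those to elasticsearch
  if "metric" in [tags] {
    stdout {
      codec => line { format => "processed so far: %{[events][count]}" }
    }
  }
}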
Thanks
Antoine