hi,
We oversee a network interface that is close to 2 gigabits per second.
All our servers are interfaced at 10 Gbps. Currently, we have only one node elastic search: 4x16 core and 256 Gb of RAM.
We do not understand why flow seems to be ingested in blocks and not live.
If you have an idee...
thanks
Gaby
1 Like
Flow events are send based on the configured period
. You get intermediate updates about the flow every N seconds. Then you get a final event when the flow completes.
You can decrease the period for a more live view.
I think we have an other problem. our configuration period is
period: 1s
when the traffic is slow, it's ok but when it increases, it's ingested in blocks like this :
I wonder if there are issues with the output keeping up with the amount of data being sent by Packetbeat. Are you going directly to Elasticsearch? Maybe you are getting some bulk rejections? Anything in the logs (there a 30 second periodic metric dump in the beat log)? Anything interesting from GET _nodes/stats
in ES?
Hi Andrew,
"I wonder if there are issues with the output keeping up with the amount
of data being sent by Packetbeat. Are you going directly to Elasticsearch?"
Yes
"Maybe you are getting some bulk rejections? Anything in the logs
(there a 30 second periodic metric dump in the beat log)?"
Bulk ?
/2017-05-09T08:23:51+02:00 INFO Error publishing events (retrying):
temporary bulk send failure/
We have this info every 30s :
/2017-05-09T08:21:12+02:00 INFO Non-zero metrics in the last 30s:
beat.memstats.gc_next=1331232 beat.memstats.memory_alloc=-37092840
beat.memstats.memory_total=229741960
output.elasticsearch.events.acked=17808
output.elasticsearch.publishEvents.call.count=359
output.elasticsearch.read.bytes=183295
output.elasticsearch.write.bytes=11037292 output.events.acked=17808
output.write.bytes=11037292 publisher.events.count=17653
publisher.queue.messages.count=30//
/
Anything interesting from |GET _nodes/stats| in ES?"
I join the output
{
"_nodes" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"cluster_name" : "squid-eple",
"nodes" : {
"OmG2KhT3QDC7tTXXAySLhw" : {
"timestamp" : 1494311032698,
"name" : "elastic",
"transport_address" : "127.0.0.1:9300",
"host" : "127.0.0.1",
"ip" : "127.0.0.1:9300",
"roles" : [
"master",
"data",
"ingest"
],
"indices" : {
"docs" : {
"count" : 741900111,
"deleted" : 13
},
"store" : {
"size_in_bytes" : 416233754146,
"throttle_time_in_millis" : 0
},
"indexing" : {
"index_total" : 112629752,
"index_time_in_millis" : 141861806,
"index_current" : 0,
"index_failed" : 0,
"delete_total" : 0,
"delete_time_in_millis" : 0,
"delete_current" : 0,
"noop_update_total" : 0,
"is_throttled" : false,
"throttle_time_in_millis" : 0
},
"get" : {
"total" : 25,
"time_in_millis" : 16,
"exists_total" : 25,
"exists_time_in_millis" : 16,
"missing_total" : 0,
"missing_time_in_millis" : 0,
"current" : 0
},
"search" : {
"open_contexts" : 0,
"query_total" : 678210,
"query_time_in_millis" : 1930111,
"query_current" : 0,
"fetch_total" : 405289,
"fetch_time_in_millis" : 4370404,
"fetch_current" : 0,
"scroll_total" : 0,
"scroll_time_in_millis" : 0,
"scroll_current" : 0,
"suggest_total" : 0,
"suggest_time_in_millis" : 0,
"suggest_current" : 0
},
"merges" : {
"current" : 0,
"current_docs" : 0,
"current_size_in_bytes" : 0,
"total" : 219414,
"total_time_in_millis" : 204593627,
"total_docs" : 2295167592,
"total_size_in_bytes" : 1208995640184,
"total_stopped_time_in_millis" : 0,
"total_throttled_time_in_millis" : 12909675,
"total_auto_throttle_in_bytes" : 10660626671
},
"refresh" : {
"total" : 1843832,
"total_time_in_millis" : 67139879,
"listeners" : 0
},
"flush" : {
"total" : 974,
"total_time_in_millis" : 922366
},
"warmer" : {
"current" : 0,
"total" : 1844694,
"total_time_in_millis" : 742127
},
"query_cache" : {
"memory_size_in_bytes" : 0,
"total_count" : 187483,
"hit_count" : 54,
"miss_count" : 187429,
"cache_size" : 0,
"cache_count" : 103,
"evictions" : 103
},
"fielddata" : {
"memory_size_in_bytes" : 0,
"evictions" : 0
},
"completion" : {
"size_in_bytes" : 0
},
"segments" : {
"count" : 6706,
"memory_in_bytes" : 1207499371,
"terms_memory_in_bytes" : 1037726383,
"stored_fields_memory_in_bytes" : 108435888,
"term_vectors_memory_in_bytes" : 0,
"norms_memory_in_bytes" : 6369664,
"points_memory_in_bytes" : 16441476,
"doc_values_memory_in_bytes" : 38525960,
"index_writer_memory_in_bytes" : 36215380,
"version_map_memory_in_bytes" : 113612,
"fixed_bit_set_memory_in_bytes" : 0,
"max_unsafe_auto_id_timestamp" : 1494115208511,
"file_sizes" : { }
},
"translog" : {
"operations" : 1611641,
"size_in_bytes" : 1396188410
},
"request_cache" : {
"memory_size_in_bytes" : 367567,
"evictions" : 0,
"hit_count" : 108536,
"miss_count" : 13114
},
"recovery" : {
"current_as_source" : 0,
"current_as_target" : 0,
"throttle_time_in_millis" : 0
}
},
"os" : {
"timestamp" : 1494311033144,
"cpu" : {
"percent" : 7,
"load_average" : {
"1m" : 20.81,
"5m" : 16.77,
"15m" : 11.68
}
},
"mem" : {
"total_in_bytes" : 271019524096,
"free_in_bytes" : 137014861824,
"used_in_bytes" : 134004662272,
"free_percent" : 51,
"used_percent" : 49
},
"swap" : {
"total_in_bytes" : 0,
"free_in_bytes" : 0,
"used_in_bytes" : 0
}
},
"process" : {
"timestamp" : 1494311033145,
"open_file_descriptors" : 2003,
"max_file_descriptors" : 65536,
"cpu" : {
"percent" : 3,
"total_in_millis" : 330584720
},
"mem" : {
"total_virtual_in_bytes" : 473911775232
}
},
"jvm" : {
"timestamp" : 1494311033147,
"uptime_in_millis" : 335999627,
"mem" : {
"heap_used_in_bytes" : 15031914264,
"heap_used_percent" : 44,
"heap_committed_in_bytes" : 33984610304,
"heap_max_in_bytes" : 33984610304,
"non_heap_used_in_bytes" : 129616528,
"non_heap_committed_in_bytes" : 138563584,
"pools" : {
"young" : {
"used_in_bytes" : 931525520,
"max_in_bytes" : 3001090048,
"peak_used_in_bytes" : 3001090048,
"peak_max_in_bytes" : 3001090048
},
"survivor" : {
"used_in_bytes" : 118319576,
"max_in_bytes" : 375128064,
"peak_used_in_bytes" : 375128064,
"peak_max_in_bytes" : 375128064
},
"old" : {
"used_in_bytes" : 13982069168,
"max_in_bytes" : 30608392192,
"peak_used_in_bytes" : 13982069168,
"peak_max_in_bytes" : 30608392192
}
}
},
"threads" : {
"count" : 235,
"peak_count" : 243
},
"gc" : {
"collectors" : {
"young" : {
"collection_count" : 16740,
"collection_time_in_millis" : 1250246
},
"old" : {
"collection_count" : 1,
"collection_time_in_millis" : 370
}
}
},
"buffer_pools" : {
"direct" : {
"count" : 819,
"used_in_bytes" : 1111517508,
"total_capacity_in_bytes" : 1111517507
},
"mapped" : {
"count" : 14377,
"used_in_bytes" : 415438488151,
"total_capacity_in_bytes" : 415438488151
}
},
"classes" : {
"current_loaded_count" : 11521,
"total_loaded_count" : 11521,
"total_unloaded_count" : 0
}
},
"thread_pool" : {
"bulk" : {
"threads" : 32,
"queue" : 0,
"active" : 5,
"rejected" : 50796,
"largest" : 32,
"completed" : 11906818
},
"fetch_shard_started" : {
"threads" : 1,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 64,
"completed" : 440
},
"fetch_shard_store" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
},
"flush" : {
"threads" : 2,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 5,
"completed" : 1823
},
"force_merge" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
},
"generic" : {
"threads" : 4,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 4,
"completed" : 34144
},
"get" : {
"threads" : 25,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 25,
"completed" : 25
},
"index" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
},
"listener" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
},
"management" : {
"threads" : 5,
"queue" : 0,
"active" : 1,
"rejected" : 0,
"largest" : 5,
"completed" : 431476
},
"refresh" : {
"threads" : 10,
"queue" : 0,
"active" : 1,
"rejected" : 0,
"largest" : 10,
"completed" : 21738340
},
"search" : {
"threads" : 49,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 49,
"completed" : 1219097
},
"snapshot" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
},
"warmer" : {
"threads" : 4,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 4,
"completed" : 1844809
}
},
"fs" : {
"timestamp" : 1494311033149,
"total" : {
"total_in_bytes" : 1266243043328,
"free_in_bytes" : 836408279040,
"available_in_bytes" : 772063252480,
"spins" : "true"
},
"data" : [
{
"path" : "/home/elastic/nodes/0",
"mount" : "/home (/dev/mapper/tallud--vg-home)",
"type" : "ext4",
"total_in_bytes" : 1266243043328,
"free_in_bytes" : 836408279040,
"available_in_bytes" : 772063252480,
"spins" : "true"
}
],
"io_stats" : {
"devices" : [
{
"device_name" : "dm-4",
"operations" : 112820992,
"read_operations" : 53148,
"write_operations" : 112767844,
"read_kilobytes" : 6197576,
"write_kilobytes" : 1575305684
}
],
"total" : {
"operations" : 112820992,
"read_operations" : 53148,
"write_operations" : 112767844,
"read_kilobytes" : 6197576,
"write_kilobytes" : 1575305684
}
}
},
"transport" : {
"server_open" : 0,
"rx_count" : 0,
"rx_size_in_bytes" : 0,
"tx_count" : 0,
"tx_size_in_bytes" : 0
},
"http" : {
"current_open" : 73,
"total_opened" : 6592
},
"breakers" : {
"request" : {
"limit_size_in_bytes" : 20390766182,
"limit_size" : "18.9gb",
"estimated_size_in_bytes" : 0,
"estimated_size" : "0b",
"overhead" : 1.0,
"tripped" : 0
},
"fielddata" : {
"limit_size_in_bytes" : 20390766182,
"limit_size" : "18.9gb",
"estimated_size_in_bytes" : 0,
"estimated_size" : "0b",
"overhead" : 1.03,
"tripped" : 0
},
"in_flight_requests" : {
"limit_size_in_bytes" : 33984610304,
"limit_size" : "31.6gb",
"estimated_size_in_bytes" : 30262,
"estimated_size" : "29.5kb",
"overhead" : 1.0,
"tripped" : 0
},
"parent" : {
"limit_size_in_bytes" : 23789227212,
"limit_size" : "22.1gb",
"estimated_size_in_bytes" : 30262,
"estimated_size" : "29.5kb",
"overhead" : 1.0,
"tripped" : 0
}
},
"script" : {
"compilations" : 0,
"cache_evictions" : 0
},
"discovery" : {
"cluster_state_queue" : {
"total" : 0,
"pending" : 0,
"committed" : 0
}
},
"ingest" : {
"total" : {
"count" : 3957832,
"time_in_millis" : 12146,
"current" : 0,
"failed" : 0
},
"pipelines" : {
"xpack_monitoring_2" : {
"count" : 0,
"time_in_millis" : 0,
"current" : 0,
"failed" : 0
},
"proxyecole" : {
"count" : 3957832,
"time_in_millis" : 12146,
"current" : 0,
"failed" : 0
},
"test_proxyecole" : {
"count" : 0,
"time_in_millis" : 0,
"current" : 0,
"failed" : 0
},
"squid-eple" : {
"count" : 0,
"time_in_millis" : 0,
"current" : 0,
"failed" : 0
}
}
}
}
}
}
steffens
(Steffen Siering)
May 9, 2017, 1:35pm
8
1s for flows is very small. The idea of flows is to 'compress' the information by computing counters/summaries to be published at a higher interval. By decreasing the period from 10s to 1s, you multiplied the amount of events by 10... On a GB interface. All flows are reported at exactly the same point in time. That is, counters of multiple flows can be summed together.
The message /2017-05-09T08:23:51+02:00 INFO Error publishing events (retrying): temporary bulk send failure/
indicates a bulk publish being rejected by ES. Have you checked ES logs? Reduce events to be published and/or introduce some queuing (e.g. redis) to buffer up events. On ES side you can also try to tune? E.g. number of shards/replicas. Note: when using some queue between packetbeat and ES, you want the consumer to have a chance to catch up. When traffic is high all the time, try to reduce amount of data being generate or tune/improve ingestion.
1 Like
system
(system)
Closed
June 6, 2017, 1:46pm
9
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.