Logstash is not ingesting data from all nodes

I am using Filebeat to ship logs to Logstash, which ingests the data into an index. For the last two days I have been seeing multiple issues.

  1. The Kibana dashboard is not showing logs for a particular index on a regular basis. When I check the Logstash logs, Logstash appears to be running fine; if I restart the service, logs show up again, but there is noticeable delay before they populate in the dashboard.

  2. For any given service we usually run 40+ nodes, all with Filebeat installed, but I am not seeing logs from some instances in the Kibana dashboard. I grepped the Logstash logs for the missing node's IP but found no entries for it. We use the same Filebeat configuration across all nodes, so I don't understand why nothing is arriving from that particular IP.
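For issue 2, one way to narrow down whether the problem is on the shipping side is to check Filebeat directly on one of the silent nodes (a rough sketch; adjust service name and time window to your setup):

```shell
# On a node whose logs are missing from Kibana:
sudo systemctl status filebeat     # is the service actually running?
sudo filebeat test config          # does the config parse?
sudo filebeat test output          # can it reach the configured Logstash output?

# Recent Filebeat errors (connection refused, backpressure, registry issues):
sudo journalctl -u filebeat --since "1 hour ago" | grep -iE "error|warn"
```

If `filebeat test output` fails only on those nodes, the problem is connectivity or host-level, not the shared configuration.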

Which version of Elasticsearch are you using?

What is the size and specification of your Elasticsearch cluster? What type of hardware are you using?

What is the full output of the cluster stats API?
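(If useful, the cluster stats can be pulled with something like the following; host, port, and credentials are placeholders:)

```shell
curl -s -u elastic:changeme "http://localhost:9200/_cluster/stats?pretty"
```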

Is there anything in the Elasticsearch logs that indicates issues or errors?

I am using version 7.17.
Elasticsearch and Logstash are running on two separate nodes, each with 16 GB of RAM and 4 vCPUs.

```json
{"_nodes":{"total":3,"successful":3,"failed":0},"cluster_name":"elasticsearch","cluster_uuid":"aAeJWK8FRTSeyDaac4_GZw","timestamp":1673513937719,"status":"yellow",
"indices":{"count":44,"shards":{"total":44,"primaries":44,"replication":0.0,"index":{"shards":{"min":1,"max":1,"avg":1.0},"primaries":{"min":1,"max":1,"avg":1.0},"replication":{"min":0.0,"max":0.0,"avg":0.0}}},
"docs":{"count":255045676,"deleted":408045},"store":{"size_in_bytes":132162572193,"total_data_set_size_in_bytes":132162572193,"reserved_in_bytes":0},"fielddata":{"memory_size_in_bytes":1416,"evictions":0},"query_cache":{"memory_size_in_bytes":14480,"total_count":3162,"hit_count":491,"miss_count":2671,"cache_size":39,"cache_count":42,"evictions":3},"completion":{"size_in_bytes":0},
"segments":{"count":400,"memory_in_bytes":7897418,"terms_memory_in_bytes":5537864,"stored_fields_memory_in_bytes":649376,"term_vectors_memory_in_bytes":0,"norms_memory_in_bytes":706112,"points_memory_in_bytes":0,"doc_values_memory_in_bytes":1004066,"index_writer_memory_in_bytes":326940254,"version_map_memory_in_bytes":7291102,"fixed_bit_set_memory_in_bytes":193544,"max_unsafe_auto_id_timestamp":1673493028872,"file_sizes":{}},
"mappings":{"field_types":[{"name":"alias","count":9,"index_count":9,"script_count":0},{"name":"boolean","count":305,"index_count":28,"script_count":0},{"name":"byte","count":9,"index_count":9,"script_count":0},{"name":"constant_keyword","count":33,"index_count":11,"script_count":0},{"name":"date","count":588,"index_count":33,"script_count":0},{"name":"flattened","count":108,"index_count":9,"script_count":0},{"name":"float","count":99,"index_count":16,"script_count":0},{"name":"geo_point","count":81,"index_count":9,"script_count":0},{"name":"half_float","count":32,"index_count":8,"script_count":0},{"name":"histogram","count":9,"index_count":9,"script_count":0},{"name":"integer","count":88,"index_count":4,"script_count":0},{"name":"ip","count":146,"index_count":11,"script_count":0},{"name":"keyword","count":10029,"index_count":33,"script_count":0},{"name":"long","count":2208,"index_count":31,"script_count":0},{"name":"match_only_text","count":522,"index_count":9,"script_count":0},{"name":"nested","count":132,"index_count":16,"script_count":0},{"name":"object","count":3454,"index_count":33,"script_count":0},{"name":"scaled_float","count":72,"index_count":9,"script_count":0},{"name":"text","count":434,"index_count":29,"script_count":0},{"name":"version","count":3,"index_count":3,"script_count":0},{"name":"wildcard","count":135,"index_count":9,"script_count":0}],"runtime_field_types":[]},
"analysis":{"char_filter_types":[],"tokenizer_types":[],"filter_types":[],"analyzer_types":[],"built_in_char_filters":[],"built_in_tokenizers":[],"built_in_filters":[],"built_in_analyzers":[]},"versions":[{"version":"7.17.7","index_count":44,"primary_shard_count":44,"total_primary_bytes":132162572193}]},
"nodes":{"count":{"total":3,"coordinating_only":0,"data":0,"data_cold":1,"data_content":1,"data_frozen":0,"data_hot":1,"data_warm":1,"ingest":1,"master":1,"ml":0,"remote_cluster_client":0,"transform":0,"voting_only":0},"versions":["7.17.7","7.17.8"],
"os":{"available_processors":12,"allocated_processors":12,"names":[{"name":"Linux","count":3}],"pretty_names":[{"pretty_name":"Ubuntu 20.04.5 LTS","count":3}],"architectures":[{"arch":"amd64","count":3}],"mem":{"total_in_bytes":50328584192,"free_in_bytes":12158894080,"used_in_bytes":38169690112,"free_percent":24,"used_percent":76}},"process":{"cpu":{"percent":8},"open_file_descriptors":{"min":353,"max":829,"avg":511}},
"jvm":{"max_uptime_in_millis":520809764,"versions":[{"version":"19.0.1","vm_name":"OpenJDK 64-Bit Server VM","vm_version":"19.0.1+10-21","vm_vendor":"Oracle Corporation","bundled_jdk":true,"using_bundled_jdk":true,"count":2},{"version":"19","vm_name":"OpenJDK 64-Bit Server VM","vm_version":"19+36-2238","vm_vendor":"Oracle Corporation","bundled_jdk":true,"using_bundled_jdk":true,"count":1}],"mem":{"heap_used_in_bytes":11067341576,"heap_max_in_bytes":25165824000},"threads":147},
"fs":{"total_in_bytes":3171279933440,"free_in_bytes":3025203310592,"available_in_bytes":3025152978944},"plugins":[],"network_types":{"transport_types":{"security4":3},"http_types":{"security4":3}},"discovery_types":{"zen":3},"packaging_types":[{"flavor":"default","type":"deb","count":3}],
"ingest":{"number_of_pipelines":24,"processor_stats":{"conditional":{"count":0,"failed":0,"current":0,"time_in_millis":0},"convert":{"count":0,"failed":0,"current":0,"time_in_millis":0},"geoip":{"count":0,"failed":0,"current":0,"time_in_millis":0},"grok":{"count":0,"failed":0,"current":0,"time_in_millis":0},"gsub":{"count":0,"failed":0,"current":0,"time_in_millis":0},"pipeline":{"count":0,"failed":0,"current":0,"time_in_millis":0},"remove":{"count":0,"failed":0,"current":0,"time_in_millis":0},"rename":{"count":0,"failed":0,"current":0,"time_in_millis":0},"script":{"count":0,"failed":0,"current":0,"time_in_millis":0},"set":{"count":0,"failed":0,"current":0,"time_in_millis":0},"set_security_user":{"count":0,"failed":0,"current":0,"time_in_millis":0},"user_agent":{"count":0,"failed":0,"current":0,"time_in_millis":0}}}}}
```

I don't see any errors in Elasticsearch.

But what I am thinking is that Logstash is completely overloaded CPU-wise, so it is taking a lot of time to ingest/write data to the index.

"versions": ["7.17.7", "7.17.8"],

It looks like you have nodes of two different versions in the cluster. Make sure all nodes are running 7.17.8.
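A quick way to see which node is on the older version (assuming the default port and no auth; add credentials if security is enabled):

```shell
curl -s "http://localhost:9200/_cat/nodes?v&h=name,ip,version,node.role"
```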

What type of storage are you using?

AWS EBS gp3 SSD volumes. I have upgraded the ES node.
At any given time we will have around 200 nodes running, with 2 grok patterns and filters. I have now upgraded the Logstash node to 64 GB of RAM with 16 vCPUs. Are there any benchmarks I can run? I am seeing a bit of a delay in data ingestion, which I guess is expected, but I am worried about data loss.
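One way to get rough benchmark numbers is the Logstash monitoring API, which reports per-pipeline event counts and filter timings (assuming the default API port 9600 on the Logstash host):

```shell
# Per-pipeline throughput: events in/filtered/out and time spent per plugin.
# Compare duration_in_millis on the grok filters before and after the upgrade.
curl -s "http://localhost:9600/_node/stats/pipelines?pretty"

# Hot threads, to confirm whether workers are CPU-bound on the grok filters:
curl -s "http://localhost:9600/_node/hot_threads?pretty"
```

If data loss is the main worry, the usual first step is enabling Logstash persistent queues (`queue.type: persisted` in `logstash.yml`), so events survive a Logstash crash or restart.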

OK, to avoid this I am now buffering through SQS. So far so good.
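For anyone following along, a minimal sketch of what that buffering might look like on the Logstash side, using the logstash-input-sqs plugin (queue name, region, and codec are placeholders, not the poster's actual setup):

```
input {
  sqs {
    queue  => "app-logs"     # hypothetical queue name
    region => "us-east-1"    # hypothetical region
    codec  => "json"
  }
}
```

With SQS in front, a slow or restarting Logstash just lets messages accumulate in the queue instead of dropping them.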