Hello everyone, I'm new to infosec and very new to the ELK stack. I am running a home lab inside a VM.
I've downloaded a sample DDoS dataset from kaggle.com and ingested it into the ELK stack, but I'm not really sure what to do next. The data is parsed, but I don't know to what extent. I also don't know whether it is good enough to build dashboards and analyze the attack, or how to go about it.
I'm looking for a little guidance from the experts. I want to be able to extract useful data by creating interesting GUIs.
First, what is it that you'd like to do with this data set? If it provides good examples of a typical denial of service (impact) technique, what would you like to create from it?
As a preliminary step, you should use the Discover interface (left hand icon panel) to explore the data and learn what is actually parsed. Make sure you adjust the time range appropriately - if the data has timestamps, looking at the last 7 days of events may not show any results if the data was created last year or last month.
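If you are unsure which time range to pick, one option is to check the earliest and latest timestamps in the source file before opening Discover. This is only a minimal sketch: the sample values are made up, and it assumes your timestamps are ISO 8601 strings, which may not match your data set.

```python
from datetime import datetime

def time_span(timestamps):
    """Return (earliest, latest) for a list of ISO 8601 timestamp strings."""
    parsed = [datetime.fromisoformat(ts) for ts in timestamps]
    return min(parsed), max(parsed)

# Hypothetical sample values; substitute the timestamp column from your own data set.
sample = ["2019-03-11 08:15:02", "2019-03-11 08:15:09", "2019-03-12 23:59:41"]

earliest, latest = time_span(sample)
print("Set the time picker to cover:", earliest, "to", latest)
```

Once you know the span, set the time picker in Discover to an absolute range that covers it.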
If you were interested in just understanding what a denial of service looks like, it will depend on what the data set depicts, but you could create some one-to-many or many-to-one visualizations in Lens that show the relationship between source(s) and destination(s). I'll mention that this is less helpful if you want to understand what an individual denial of service attempt looks like, since they are only effective en masse; in other words, you want to try to visualize the phenomenon and not the parts of the whole.
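To make the many-to-one idea concrete, here is a rough sketch of the number such a visualization would surface: how many distinct sources contacted each destination. The addresses are invented documentation-range IPs, and the (source, destination) pair layout is an assumption about how you might export your records.

```python
from collections import defaultdict

def fan_in(connections):
    """Map each destination to the number of distinct sources that contacted it."""
    sources_by_dest = defaultdict(set)
    for src, dst in connections:
        sources_by_dest[dst].add(src)
    return {dst: len(srcs) for dst, srcs in sources_by_dest.items()}

# Hypothetical (source IP, destination IP) pairs.
connections = [
    ("198.51.100.1", "203.0.113.5"),
    ("198.51.100.2", "203.0.113.5"),
    ("198.51.100.3", "203.0.113.5"),
    ("198.51.100.1", "203.0.113.5"),  # repeat source, counted once
    ("198.51.100.1", "203.0.113.9"),
]

for dst, n_sources in sorted(fan_in(connections).items()):
    print(dst, "was contacted by", n_sources, "distinct source(s)")
```

A destination with an unusually large number of distinct sources is the many-to-one pattern; swapping the roles of source and destination gives the one-to-many view.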
If you wanted to develop detection logic such as rules or unsupervised ML jobs, a denial of service technique is probably best expressed as the number of connections to or from a specific point - if your data is properly parsed and you can see column headers for each field in the data set, you could begin by looking at high counts of connections. Both EQL and KQL could be helpful languages to learn if you're working on a home lab setup; either can express counts of network connections. We have some rules for unusual network behaviors that include denial of service, and unsupervised ML examples for high counts of network activity, though ML support is a licensed (non-basic) feature you'd only be able to use in demo mode. You might want to check out the detection-rules repository for more information about our supported languages, tools and examples.
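The "high counts of connections" idea can be prototyped outside the stack before you write a rule. The sketch below is plain Python, not EQL or KQL, and is not taken from the detection-rules repository; the per-minute buckets, the sample events and the threshold of 3 are all illustrative assumptions.

```python
from collections import Counter

def high_count_destinations(events, threshold):
    """Return (minute, destination) pairs seen at least `threshold` times."""
    counts = Counter(events)
    return sorted(key for key, n in counts.items() if n >= threshold)

# Hypothetical events as (minute bucket, destination IP) pairs.
events = (
    [("08:15", "203.0.113.5")] * 4
    + [("08:15", "203.0.113.9")]
    + [("08:16", "203.0.113.5")]
)

for minute, dst in high_count_destinations(events, threshold=3):
    print("High connection count to", dst, "during minute", minute)
```

Choosing the bucket size and threshold is the hard part in practice; a value that suits a small lab data set will likely be far too low for real traffic.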
Your third and fourth paragraphs nail exactly what I am looking for.
-Understanding what denial of service looks like in a visualized manner.
That being said, I am a total newbie with Zeek and the ELK stack. After tinkering with the parsed data, I suspect that either the data isn't parsed correctly or the dataset is missing the data points I need to build a dashboard. I've posted a screenshot below; what do you think? Any tips on how to improve it?
I am also taking a look at the GitHub material you posted. Thanks for that.
A great place to start is some tutorials that walk through examples of visualizing data. However, you could also just pop over to the Visualize app and dynamically analyze it with Lens. You'll probably want to drag over fields representing source IP, destination IP, destination port and possibly fields like useragent, depending on your data set. Lens, in my opinion, is good for exploring data because it offers several concurrent visualizations you can quickly cycle through. If you like any of them (which can be customized), they are also easily saved.