Is there a way to filter for thousands of IPs in one visualization? Currently I have enabled `state:storeInSessionStorage` and am using a DSL filter, but I am limited to 1024 clauses within it. I would like a filter (or multiple filters) that matches all Tor exit relays published by several open and paid-for services. The filter I have created works, but I need to cover more than 1024 IPs.
When I run the filter, I get this error:
"took": 18,
"timed_out": false,
"_shards": {
"total": 12,
"successful": 6,
"skipped": 6,
"failed": 6,
"failures": [
{
"shard": 0,
"index": "2021.07",
"node": "jSiDg9yFRAGsuJZ-OUwLIw",
"reason": {
"type": "query_shard_exception",
"reason": "failed to create query: maxClauseCount is set to 1024",
"index_uuid": "KbvJYxYeTc-yBAKVNXJu_Q",
"index": " 2021.07",
"caused_by": {
"type": "too_many_clauses",
"reason": "too_many_clauses: maxClauseCount is set to 1024"
}
}
},
{
"shard": 0,
"index": "2021.08",
"node": "MhlSku6FTVOP3woek1N9IQ",
"reason": {
"type": "query_shard_exception",
"reason": "failed to create query: maxClauseCount is set to 1024",
"index_uuid": "5_Tv9MBQQQyEYPxQtbAAFQ",
"index": " 2021.08",
"caused_by": {
"type": "too_many_clauses",
"reason": "too_many_clauses: maxClauseCount is set to 1024"
}
}
}
]
},
"hits": {
"total": 0,
"max_score": 0,
"hits":
}
}
I am not entirely sure why the use case requires filtering against thousands of individual terms instead of fewer, broader terms in the data set. It's almost always faster and more efficient to use fewer clauses when searching.
If you don't have any fields with roll-ups to make this easier, you could categorize IPs at ingest time depending on their values, or on the values of other fields. If you can't ingest the data like that, you could try using a runtime field to better categorize your IPs.
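As an illustration only, here is a minimal sketch of a runtime field that flags documents whose IP matches a known list (the field names `is_tor_exit` and `source.ip`, and the IPs, are placeholders; note that embedding thousands of IPs in script params runs into its own limits, so this really only suits small lists):

```json
PUT my-index/_mapping
{
  "runtime": {
    "is_tor_exit": {
      "type": "boolean",
      "script": {
        "source": "emit(params.tor_ips.contains(doc['source.ip'].value))",
        "params": {
          "tor_ips": ["198.51.100.7", "203.0.113.42"]
        }
      }
    }
  }
}
```

The runtime field can then be used in Kibana like any mapped field, e.g. filtering on `is_tor_exit: true`.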
If none of these work, you could change your Elasticsearch settings to allow a larger maxClauseCount, but this is not recommended.
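If you do go the settings route, the relevant knob is `indices.query.bool.max_clause_count`, set in `elasticsearch.yml` on each node (the value below is just an example):

```yaml
# elasticsearch.yml
# Raise with caution: very large boolean queries can consume
# significant memory and CPU per search request.
indices.query.bool.max_clause_count: 4096
```

Each node needs a restart to pick up the change, and the risk is exactly why this approach is discouraged.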
The use case is a dashboard that filters for all Tor exit relays, so that you can view all of this traffic at a glance. Another would be a list of known malicious user-agent strings (these can easily run into the thousands). I am using Elastic to conduct blue-team cyber analysis.
Sadly, it seems you are correct that raising maxClauseCount is the only way, and that it is risky. I see Kibana isn't really the best tool for conducting this kind of analysis.
One solution would be to enrich your data with information about those Tor exit relays.
For example, I have a similar use case: we use a scanning company to test for vulnerabilities in our systems, and that company has hundreds of IP addresses that it uses when scanning clients. To know that a scan is expected, and not some actor probing for a vulnerability, I created a dictionary in Logstash with all the IP addresses whose origin I know.
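The dictionary approach described above can be done with the Logstash `translate` filter. A sketch, where the file path, field names, and label are placeholders:

```ruby
filter {
  translate {
    # Field containing the IP to look up
    source          => "[source][ip]"
    # Field to write the label into when the IP is found
    target          => "[source][owner]"
    # YAML file mapping "ip: label", e.g. "198.51.100.7: known_scanner"
    dictionary_path => "/etc/logstash/known_ips.yml"
    # Value written when the IP is not in the dictionary
    fallback        => "unknown"
  }
}
```

The dashboard then filters on the label field (e.g. `source.owner: known_scanner`) instead of thousands of raw IPs.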
I don't know how timely it would be, but could you run periodic queries on the known IPs and update the records with a flag (if not already flagged)? Use multiple queries, each with a reasonable number of IPs. The Kibana presentations would then just look at the flag field.
Depending on how the data is ingested, could you set a flag at ingest?
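The periodic flagging could be done with `_update_by_query`, batching the IP list so each request stays under the clause limit (the index name, field names, flag, and IPs below are placeholders):

```json
POST logs-2021.08/_update_by_query
{
  "query": {
    "terms": {
      "source.ip": ["198.51.100.7", "203.0.113.42"]
    }
  },
  "script": {
    "source": "ctx._source.tor_exit = true"
  }
}
```

Repeat with the next batch of IPs; the dashboard then filters on `tor_exit: true` with a single clause.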
If you had a reference index containing the target IPs, you might be able to use the Logstash elasticsearch filter to do a lookup as well. There should also be a way to do this in an ingest pipeline with an enrich processor, but I haven't tried it.
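For the ingest-pipeline route, the enrich processor works roughly like this: load the IPs into a reference index, create and execute an enrich policy over it, then reference the policy from a pipeline (all names here are placeholders):

```json
PUT /_enrich/policy/tor-exit-policy
{
  "match": {
    "indices": "tor-exit-relays",
    "match_field": "ip",
    "enrich_fields": ["is_tor_exit"]
  }
}

POST /_enrich/policy/tor-exit-policy/_execute

PUT /_ingest/pipeline/tag-tor-exits
{
  "processors": [
    {
      "enrich": {
        "policy_name": "tor-exit-policy",
        "field": "source.ip",
        "target_field": "tor"
      }
    }
  ]
}
```

One caveat: enrich policies must be re-executed when the reference index changes, so a refreshed exit-relay list needs a periodic `_execute` call.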