I believe this is a Logstash issue, but please correct me if I am wrong.
Currently, I have a Kibana instance set up with a lengthy EQL filter that searches a description field for keywords. I want to do the filtering as part of the ingest process so that I can remove the EQL filter in Kibana, which is very slow when multiple search terms run simultaneously.
I have an API ingest for data. One of the data fields is a description field. I need to apply multiple sets of filters to the incoming data: if the data matches Filter A, it goes to Index A; if it matches Filter B, it goes to Index B; and so on.
Essentially, each filter might have 20, 30, or 50 keywords to look for or to exclude, and the matching data should be placed in the index that corresponds to that filter.
You can use conditionals in both the filter blocks and the output blocks in Logstash, but to give more specific advice I need more context on what you want to do, for example what your data looks like.
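Basically, you can have something like this in Logstash:
output {
  # Route on an exact field value; the hosts and index names below are placeholders.
  if [field] == "valueA" {
    elasticsearch {
      hosts => ["https://localhost:9200"]
      index => "index-a"
    }
  }
  if [field] == "valueB" {
    elasticsearch {
      hosts => ["https://localhost:9200"]
      index => "index-b"
    }
  }
}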
We are using filebeat with an http_endpoint and we are utilizing Azure managed Elastic.
I can provide an example of data if it helps but primarily it’s multiple fields of data and within the “description” field (a sentence or a paragraph of text usually) I want it to search for keywords and ship to the appropriate index.
So for example, let’s say three lines of data and within the “description” field we have:
The dog was cold
The cat was warm
The rabbit was wet
Index A needs to ingest anything mentioning Dog or Rabbit while Index B needs anything mentioning Dog or Cat.
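The dog, cat, and rabbit routing described here could be sketched with regular-expression conditionals in a Logstash output block. This is only a sketch under assumptions: the hosts value and the index names `index-a` and `index-b` are placeholders, and it assumes the events actually pass through Logstash with the text in a field named `description`.

```
output {
  # Index A: any description mentioning "dog" or "rabbit" (case-insensitive).
  if [description] =~ /(?i)\b(dog|rabbit)\b/ {
    elasticsearch {
      hosts => ["https://localhost:9200"]   # placeholder
      index => "index-a"                    # placeholder name
    }
  }
  # Index B: any description mentioning "dog" or "cat".
  # This is a separate "if", not an "else if", so one event can reach both outputs.
  if [description] =~ /(?i)\b(dog|cat)\b/ {
    elasticsearch {
      hosts => ["https://localhost:9200"]   # placeholder
      index => "index-b"                    # placeholder name
    }
  }
}
```

With the three sample lines, "The dog was cold" would match both conditions and be written to both indices, "The cat was warm" only to the second, and "The rabbit was wet" only to the first. Keywords to exclude could be expressed by adding a `!~` condition joined with `and`.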
In theory this can be done in Logstash, but I'm not sure this is a good approach.
For example:
Index A needs to ingest anything mentioning Dog or Rabbit while Index B needs anything mentioning Dog or Cat.
In this case, the document whose description field contains "The dog was cold" would need to be indexed in both Index A and Index B. Wouldn't this count as a duplicate?
Also, this could lead to too many indices, or too many small indices.
Can you provide some context on what you want to achieve with this? For example, what does your EQL look like?
Maybe what you want is simply to add tags to some events and filter on those tags, instead of separating the events into multiple indices.
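The tagging idea could look roughly like the following filter block, which keeps a single index and marks each event with the clients it is relevant to. The tag names `client_a` and `client_b` are hypothetical, and the keyword lists simply reuse the dog, cat, and rabbit example from above.

```
filter {
  # Tag events relevant to the first client.
  if [description] =~ /(?i)\b(dog|rabbit)\b/ {
    mutate { add_tag => ["client_a"] }   # hypothetical tag name
  }
  # Tag events relevant to the second client; an event can carry both tags.
  if [description] =~ /(?i)\b(dog|cat)\b/ {
    mutate { add_tag => ["client_b"] }   # hypothetical tag name
  }
}
```

A dashboard could then filter on the `tags` field with a single term (for example `tags : "client_a"`) rather than evaluating every keyword at query time.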
To provide context, the different indexes represent clients we have that need alerting based on information that is relevant to them. Therefore, yes, there will be duplicates: in my example, the dog entry would go into both indexes, assuming it's relevant to both clients.
The objective is that each client has a dashboard that will be providing tailored information and visualization to their specific needs.
At present, I have a single index and then, in Kibana, I have an EQL Filter (of upwards of 50 key words or combinations of keywords to filter for or filter out) that searches across anywhere between 50 and 500 entries in a 24 hour period.
The issue is, of course, how resource intensive (especially in CPU) it is to run a hundred or so lines of data through 50 filter words via EQL.
Is this for relevance or permission? If it is relevance, you may be able to use tags; if it is permission, the clients need to use different indexes.
Also, if you want documents to be parsed for keywords before filtering them, you have the option to send them to a staging Elasticsearch index first and then pull them into Logstash using multiple pipelines, each with a different Elasticsearch query.
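As a rough sketch of the staging approach, each client could get its own pipeline that reads from the staging index with its own query and writes to that client's index. Everything named here is an assumption for illustration: the hosts, the `staging` and `client-a` index names, the five-minute schedule, and the keyword list. A second pipeline for another client would have the same shape with a different query and target index.

```
input {
  elasticsearch {
    hosts => ["https://localhost:9200"]   # placeholder staging cluster
    index => "staging"                    # hypothetical staging index
    # A match query on several terms uses OR by default: "dog" or "rabbit".
    query => '{ "query": { "match": { "description": "dog rabbit" } } }'
    schedule => "*/5 * * * *"             # assumed: re-run every five minutes
    docinfo => true
    docinfo_target => "[@metadata][doc]"  # keep the source _id in event metadata
  }
}
output {
  elasticsearch {
    hosts => ["https://localhost:9200"]   # placeholder
    index => "client-a"                   # hypothetical per-client index
    # Reusing the staging _id means repeated runs overwrite rather than duplicate.
    document_id => "%{[@metadata][doc][_id]}"
  }
}
```

Each such file would be registered as a separate entry (a `pipeline.id` plus a `path.config`) in `pipelines.yml`, so the per-client queries run independently of each other.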
Thank you for your reply! It is for relevance and permission. Ideally we will allow the client to see the information relevant to them but it will be segregated from other client info.
Could you elaborate on your solution or point me at documentation?