Using strings for Kibana search?


(Justin) #1

Hey fellas,

I am sorry, but I have used Google, YouTube, etc. as much as I can before posting on here and I really need some help. I have a fair number of strings, e.g. (device: foo1 AND (device: foo2 OR device: foo3) AND protocol: 200), that are in text files. If I copy these strings into the search bar in Kibana, it will pull the required information.

My issue is that if I did this manually, it would probably take me weeks to get it done. What is the best way to do this? I wrote a Python script to "print" each line out of each text file and save it into a variable that I am hoping (very, very much) can be easily passed to Kibana to (somehow) get the information. If a match exists, then I need things like the src_ip, dest_ip, port, application, etc.

I am sorry for asking such a noob question. I searched these forums and have been googling for days with no luck.


(Tim Sullivan) #2

You don't have to apologize for asking this question! We're all here to learn and those that are able to help remember the time when they were in your shoes.

I have a few thoughts about this. First, if I take your requirements literally, yes, it is possible to write a script that prints out each of your filters as a URL that can be clicked to load up Kibana with that filter in effect. Almost everything you do in Kibana affects the URL, so if you want to drive it from the outside, you could figure out what needs to go into the URL to get that done.
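As a rough sketch of that URL idea in Python: this assumes a Kibana 6.x-style Discover URL, where the app state sits in the `_a` parameter in Rison notation. The base URL and the function names are made up for illustration, so compare the output against a URL copied from your own Kibana before relying on it.

```python
from urllib.parse import quote

# Assumption: Kibana is reachable here; replace with your own host.
KIBANA_BASE = "http://localhost:5601"


def rison_quote(text):
    """Wrap text in a Rison single-quoted string ('!' is the escape character)."""
    return "'" + text.replace("!", "!!").replace("'", "!'") + "'"


def discover_url(query, base=KIBANA_BASE):
    """Build a Discover link that opens with the given Lucene query string."""
    app_state = "(query:(language:lucene,query:{}))".format(rison_quote(query))
    # Keep the Rison punctuation readable; percent-encode everything else.
    return "{}/app/kibana#/discover?_a={}".format(base, quote(app_state, safe="():,'!"))


if __name__ == "__main__":
    for line in ["device:foo1 AND protocol:200"]:
        print(discover_url(line))
```

Each printed line is a clickable link, one per search string read from the text files.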

My other thought is to walk back a bit and look at what your overall goals are. Let's say you have a number of "terms", and for each term you want some correlating data. Imagine having a table where the leading column is that unique term, and additional columns summarize the data you're interested in about that term. That's what a data table visualization can do when you split the rows by a terms aggregation.

To get from your search strings to something that can be aggregated on or filtered by a term, you'll want to look for a way to add a field to each document that describes a "categorization" of the document. Here are some examples that I made up by taking your example and modifying it a little:

  • device1: foo1 AND device2: fooX AND protocol: 200 => dev1-foo1.dev2-fooX.200
  • device1: foo2 AND device2: fooY AND protocol: 200 => dev1-foo2.dev2-fooY.200
  • device1: foo3 AND device2: fooZ AND protocol: 200 => dev1-foo3.dev2-fooZ.200
  • device1: foo4 AND (device2: fooY OR device2: fooZ) AND protocol: 200 => dev1-foo4.dev2-fooYfooZ.200

The OR makes it a little difficult, but with some conditional logic for generating the category, it can be done. (Without that OR, all you would need to do is smoosh together some parts of the document.) You know more about your own use case and can certainly come up with better categorizations than this, but hopefully you can see where I'm going with this. The goal is to end up with a new field that you can aggregate on, and then build aggregate visualizations, which have great features for setting filters on the fly with just a click.
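To make the conditional logic concrete, here is a small Python prototype of the categorization. It is only a sketch: the field names `device1`, `device2` and `protocol` follow the made-up examples above, and the `OR_GROUPS` table is a hypothetical way to express "these device2 values share one category".

```python
# Hypothetical table: for a given device1, these device2 values are ORed
# together in the search string and should land in the same category.
OR_GROUPS = {"foo4": ("fooY", "fooZ")}


def categorize(doc):
    """Return a category label such as 'dev1-foo1.dev2-fooX.200' for a document."""
    dev1 = doc["device1"]
    dev2 = doc["device2"]
    group = OR_GROUPS.get(dev1)
    if group and dev2 in group:
        # Any member of the OR group maps to the combined label.
        dev2_label = "".join(group)
    else:
        dev2_label = dev2
    return "dev1-{}.dev2-{}.{}".format(dev1, dev2_label, doc["protocol"])


if __name__ == "__main__":
    print(categorize({"device1": "foo4", "device2": "fooY", "protocol": 200}))
```

The same branching would still need to be ported to whatever runs at index time, so treat this as a way to check the rules, not as the pipeline itself.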

To get that new field, I'd recommend taking a look at ingest node pipelines: https://www.elastic.co/guide/en/elasticsearch/reference/6.2/ingest.html. It's a way to have Elasticsearch do some manipulation on incoming data before it gets indexed. To build a pipeline that does this sort of thing, you'll want to take a look at the script processor: https://www.elastic.co/guide/en/elasticsearch/reference/6.2/script-processor.html

{
  "script": {
    "lang": "painless",
    "source": "ctx.category = 'dev1-' + ctx.dev1 + '.' ..."
  }
}

(Justin) #3

Thank you tsullivan!

This is exactly what I am looking for. Unfortunately, it sounds like I am still going to have to manually copy and paste everything into the search. Yes, I agree the OR makes it difficult, but everything needs to match exactly and I need to be able to pull out exactly the information that has hits.

I still don't get why I can't figure out how to write a script to pipe these strings into the search box (or simply have a remote "search: " function built in). Oh well, I will keep on bashing my head against the wall until something gives =)


(Justin) #4

I think I have figured out how to get my search working, the code looks like this:

GET /_search
{
  "query": {
    "query_string": {
      "fields": ["src_ip", "dst_ip", "Application"],
      "query": "devicename:(foo1 OR foo2) AND policy_id:15561"
    }
  }
}

My current issue is that it's not returning the fields I want. I may be using fields incorrectly, though? I want to know which src_ip or dst_ip it's hitting.


(Tim Sullivan) #5

Glad you figured that out!

I don't think having a "remote search" function built into Kibana would actually be a good fit for this web UI. It looks like what you're doing now is what I was about to get into before I saw your last message: the appropriate way to run all these query string queries through Elasticsearch is to talk to Elasticsearch directly. You mentioned you're comfortable working in Python. Python has an Elasticsearch client library that you could use in a script: instantiate a client to talk to ES, and do the automation you need. The code you followed up with is query DSL that you can enter into Console, but you can do pretty much the exact same thing in a script with an ES client, and then you have the power to process the query results in the script, mapping and formatting them how you want.
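A minimal sketch of what such a script might look like. It assumes the `elasticsearch` Python package at a version matching a 6.x cluster, where `client.search(index=..., body=...)` is the call to use. The `logstash-*` index pattern, the host in the comment, and the helper names are placeholders, not anything specific to your setup.

```python
def load_queries(lines):
    """Keep the non-empty lines; each one is a complete query string."""
    return [line.strip() for line in lines if line.strip()]


def build_body(query):
    """Wrap a Lucene query string in the query DSL shown in this thread."""
    return {"query": {"query_string": {"query": query}}}


def run_queries(client, queries, index="logstash-*"):
    """Run every query and collect the _source of each returned hit."""
    results = {}
    for query in queries:
        response = client.search(index=index, body=build_body(query))
        results[query] = [hit["_source"] for hit in response["hits"]["hits"]]
    return results


# With the real client, usage would be along these lines (host is an assumption):
#   from elasticsearch import Elasticsearch
#   client = Elasticsearch(["http://localhost:9200"])
#   with open("queries.txt") as handle:
#       print(run_queries(client, load_queries(handle)))
```

Passing the client in as an argument keeps the functions easy to try out against a stand-in object before pointing them at a live cluster.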

You don't need the fields property, because you are specifying the fields to search in the query syntax. I take it that what you want is to have the search return only the src_ip, dst_ip, and Application for the matching documents. Check out source filtering: https://www.elastic.co/guide/en/elasticsearch/reference/6.2/search-request-source-filtering.html. There's also filter_path, which is similar, but the filter runs at a later stage.

I ran a quick example using the accounts.json data (https://www.elastic.co/guide/en/kibana/current/tutorial-load-dataset.html). A query like this:

GET /bank/_search?filter_path=hits.hits._source
{
  "query": {
    "query_string": {
      "query": "state:(PA OR AZ) AND age:30"
    }
  },
  "_source": ["firstname", "lastname"]
}

Returns this:

{
  "hits": {
    "hits": [
      {
        "_source": {
          "firstname": "Brittany",
          "lastname": "Cabrera"
        }
      },
      {
        "_source": {
          "firstname": "Montgomery",
          "lastname": "Washington"
        }
      }
    ]
  }
}

Let me know if that's what you meant by returning the fields you want.
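If the end goal is a file you can hand to someone, a response shaped like the one above could be flattened to CSV in a few lines of Python. This is a sketch only; `hits_to_csv` is a made-up helper name, and fields missing from a hit are written as empty cells.

```python
import csv
import io


def hits_to_csv(response, fields):
    """Turn the _source of each hit into one CSV row, with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(fields)
    for hit in response["hits"]["hits"]:
        source = hit.get("_source", {})
        # A field absent from this document becomes an empty cell.
        writer.writerow([source.get(field, "") for field in fields])
    return buffer.getvalue()


if __name__ == "__main__":
    sample = {"hits": {"hits": [
        {"_source": {"firstname": "Brittany", "lastname": "Cabrera"}},
        {"_source": {"firstname": "Montgomery", "lastname": "Washington"}},
    ]}}
    print(hits_to_csv(sample, ["firstname", "lastname"]))
```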


(Justin) #6

Thank you tsullivan!

This is getting me much closer to where I want to be, but I am still running into issues. I am starting to get the information that I want out of the query, except I am only able to pull out "src_ip" and "dst_ip" when I need more than just that. I am trying this:

INPUT:

GET _search
{
  "query": {
    "query_string": {
      "query": "[python input for search string]"
    }
  },
  "_source": ["dst_port", "dst_ip", "Device_Name", "protocol", "policy_id", "src_ip"]
}

OUTPUT:

"hits": {
  "total": 4,
  "max_score": 2.3940716,
  "hits": [
    {
      "_index": "[edited]",
      "_type": "logs",
      "_id": "[edited]",
      "_score": 2.3940716,
      "_source": {
        "src_ip": "[edited]",
        "policy_id": "[edited]",
        "dst_ip": "[edited]"
      }

Also, it looks like I am only hitting one area of logstash, not all of it... How can I beef up the search?
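One thing worth checking here: source filtering can only return fields that actually exist in a document's `_source`, and field names are case-sensitive, so a requested field that is absent or spelled differently simply does not appear in the hit. A small Python sketch for counting which requested fields never come back (the helper name is made up, and it only looks at top-level field names):

```python
from collections import Counter


def missing_field_counts(hits, requested):
    """Count, per requested field, how many hits lack it in _source."""
    counts = Counter()
    for hit in hits:
        source = hit.get("_source", {})
        for field in requested:
            if field not in source:
                counts[field] += 1
    return counts


if __name__ == "__main__":
    requested = ["dst_port", "dst_ip", "Device_Name", "protocol", "policy_id", "src_ip"]
    hits = [{"_source": {"src_ip": "[edited]", "policy_id": "[edited]", "dst_ip": "[edited]"}}]
    print(missing_field_counts(hits, requested))
```

A field that is missing from every hit is a hint to compare its spelling with the field list shown for the index pattern in Kibana.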


(Justin) #7

Alright, I am making progress, and I am sorry if you have a lot to read (because I am sure this won't be my last update).

I think I finally fixed the "how do I search logstash" issue: my script now reports the same number of hits as the Kibana GUI. The only problem is that it's only returning 14 results instead of the 150 or so hits Kibana claims. I know I will be pegging the server pretty hard to get all of this information, but it's necessary.
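A possible angle on the missing results: a search response carries only one page of hits (10 by default) even when the reported total is larger, so a gap like this may come down to paging. Below is a hedged Python sketch that pages through results with the `from` and `size` body parameters. It assumes a client whose `search(index=..., body=...)` call behaves like the 6.x Python client, and it stops at 10,000 hits, which is the default `index.max_result_window` limit for `from` + `size`. The function name and index pattern are placeholders.

```python
def fetch_all(client, index, query, page_size=100, max_window=10000):
    """Collect hits page by page until a short page signals the end."""
    hits = []
    offset = 0
    while offset < max_window:
        size = min(page_size, max_window - offset)
        body = {
            "query": {"query_string": {"query": query}},
            "from": offset,
            "size": size,
        }
        page = client.search(index=index, body=body)["hits"]["hits"]
        hits.extend(page)
        if len(page) < size:
            # Fewer hits than requested: nothing left to fetch.
            break
        offset += len(page)
    return hits
```

For result sets beyond that window, the scroll API is the usual route and is also gentler on the server than very deep paging.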

