Non-Integer Data Types in my Data... How to make a Visualization?

Hi Kibana Gurus,

I’m a hardcore network guy trying to learn Kibana from scratch, and I’m having a tough time.
In my Elasticsearch, I have a repository of network data. When in Kibana, I can see the data and all the individual fields; that’s a good sign. Included in that data are IP and MAC addresses:

Okay, great. But I notice that when I try to create a Visualization using these fields, they do not appear as options in the Fields drop-down menu. See the picture below: while Packet.L3.Src appears in the data set above, it does not appear among the available fields for a visualization:

I assume this is a data type issue. In other words, when my data is imported into Elasticsearch, I’m guessing that the IP and MAC addresses are imported as strings, not properly-formatted IPs or MACs. I wish I knew for sure; I don’t know how to check data types in Elasticsearch/Kibana.

I don’t care how the data is formatted, and I may not have a way to convert data before it gets into Elasticsearch anyway. But I have to be able to produce visualizations using these fields, or all this network data is meaningless.

Has anyone seen an issue like this before? How could I make a chart with Packet.L3.Src and Packet.L3.Dst as columns? Any advice will be greatly appreciated.

Thanks!

UPDATE: I verified that my IP addresses are indeed strings. Or, to be more specific, of type "text".

When I log into my Elasticsearch Docker container and inspect the index, here is what I see:

[root@ES config]# curl -X GET "localhost:9200/logstash-2019.08.08-000001?pretty"
{
  "logstash-2019.08.08-000001" : {
    ...etc...

        "Packet" : {
          "properties" : {
            ...etc...

            "L3" : {
                ...etc...

                "Src" : {
                  "type" : "text",      <<-----------  Ah HA!!!
                  "norms" : false,
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                },
^C
[root@ES config]#

So note that Packet.L3.Src is of type "text." I'm guessing this is why I can't use this field in a Kibana Visualization? Is this a showstopper?

Thanks

Hi @redapplesonly, you are correct that most text fields are not available for aggregations in Kibana, because the resulting query would be very expensive. For a field like an IP address, though, you can use the keyword-analyzed sub-field instead. A standard text field is broken up and analyzed word by word, whereas a keyword field uses the entire string in the query.
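
For example, a terms aggregation on the keyword sub-field should work. Here is a minimal sketch using the field name from your mapping (the index pattern and the aggregation name "top_sources" are just placeholders; run it from Kibana Dev Tools):

GET logstash-*/_search
{
  "size": 0,
  "aggs": {
    "top_sources": {
      "terms": {
        "field": "Packet.L3.Src.keyword"
      }
    }
  }
}

The "size": 0 part skips returning individual documents, so the response contains only the per-IP bucket counts.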

To see how your fields are mapped, you can visit the appropriate index pattern in Kibana within the management screens: "/app/kibana#/management/kibana/index_patterns".
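
Alternatively, you can ask Elasticsearch for the mapping directly from Dev Tools (substituting your own index name):

GET logstash-2019.08.08-000001/_mapping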

To learn more about how these fields are analyzed Tim Roes has a good blog post at: https://www.timroes.de/elasticsearch-kibana-queries-in-depth-tutorial

Thanks Tims,

I dug into Kibana's Management --> Index Patterns --> myData, and sure enough, the data fields I'm concerned about are marked as "Searchable" but not "Aggregatable". So that makes sense.

I began reading through the blog entry you sent me about Queries, but I don't think that directly addresses what I want to do. When my Elasticsearch/Kibana solution is up and running and crunching my network data, I don't want to search for one target IP address. I need to watch data trends across all IP addresses, and I need to tally statistics for every IP address that appears in the data. I need my Packet.L3.Src fields to be aggregatable.

So... is this a showstopper? A quick poke around the Elastic documentation suggests that mapping might be a solution. Essentially, when new data rolls in, I need Elasticsearch or Kibana to translate the Packet.L3.Src string into a proper IP address data type. Would mapping be the way to do this?

Thanks!

Revision to what I said earlier... I think I misunderstood the term "mapping." I'm really, really new to ES, and there's a lot to learn.

What I really need is something like the following:

  1. New data arrives in Elasticsearch
  2. Elasticsearch or Kibana does an initial review of the data, and identifies key fields that I'm concerned about. In this case, the Packet.L3.Src field would be such a field
  3. Elasticsearch/Kibana then runs a Python script or something that takes the initial text from the field, parses it, and returns a properly-formatted IP address object
  4. The parsed IP address replaces the original string in the data and can now be aggregated by Kibana.

Is this possible? If so, where would I begin to learn how to do this? Thanks

Ah, OK, I missed the part about parsing the IP address out of the field. The best place to do that is when you ingest the data, before it gets stored in Elasticsearch. Or you could consider building a custom analyzer that will be applied to that field during ingestion: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-custom-analyzer.html
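
For instance, an ingest pipeline along these lines could pull the address out at index time. This is only a sketch: the pipeline name and the target field src_ip are made up for illustration, and you would still need a mapping (or template) that gives the extracted field the ip datatype:

PUT _ingest/pipeline/parse-src-ip
{
  "description": "Extract an IP address from the raw Packet.L3.Src text",
  "processors": [
    {
      "grok": {
        "field": "Packet.L3.Src",
        "patterns": ["%{IP:src_ip}"]
      }
    }
  ]
}

Documents indexed with the ?pipeline=parse-src-ip query parameter would then carry the extracted address in src_ip.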

Thanks Tim,

Custom analyzers, huh? That sounds promising. Okay, let me read through the documentation and see if I can't figure it out. Thanks!

Won't an index template help you? This is what I used when I faced a similar issue.

Your field contains IP information, but in Elasticsearch it is mapped as a full-text search field, which is why you can't run aggregations on it.

Check: ES field datatypes

Your index template should contain a mapping like this:

"mappings" : {
  "properties" : {
        "Packet.L3.Src" : {
          "type" : "ip"
        }
  }
}

Based on the type you define in your template, the fields will be indexed accordingly.

Making your field a string/keyword datatype is enough to make it Aggregatable; however, you gain additional search functionality if you map an IP field as the ip datatype.
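
For example, with the ip mapping above, a term query accepts CIDR notation, so you can match a whole subnet in one query (a sketch; "myindex" is a placeholder):

GET myindex/_search
{
  "query": {
    "term": {
      "Packet.L3.Src": "192.168.0.0/16"
    }
  }
}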

Also, be aware that an index template applies only at index creation time (so it will apply to your future indices, but not to those already existing). If you need to preserve the current data but change the data type, you will have to reindex.

Hi Alex,

Sorry I didn't see your post earlier. Yes, you understand exactly what I want to do. I had seen Index Templates in the online documentation, but being completely new to Elasticsearch, I didn't recognize them as a viable tool for me.

But rereading the doc now, I think I remember why I didn't consider index templates a useful tool for me. Perhaps I should have explained my environment in a little more detail... I have a Logstash server that is "upstream" of Elasticsearch and feeds ES the data for my topic. In fact, Logstash creates the index in ES. Here's my very minuscule logstash.conf file:

root@home/logstash-7.3.0# more config/logstash.conf
input {
   ...etc...
}

filter {
}

output {
  elasticsearch{
    hosts => ["http://192.168.3.8:9200"]
  }
}

root@home/logstash-7.3.0#

So as you can see, Logstash simply shovels the data to ES. ES then automatically creates my index.

So I bring this up because on the Index Template info page, there's a note saying "Templates are only applied at index creation time. Changing a template will have no impact on existing indices." I assumed this meant that I couldn't apply a template to an already-created index.

Am I wrong? I am such a newbie that I am not using any templates on my index. Is there any harm in taking your JSON snippet, creating a new template from it, and then applying it?

Thanks, I appreciate the help!

Following Alex's suggestion, I decided to go for broke and tried implementing an Index Template. "logstash-2019.08.08-000001" is the name of my index:

curl -X PUT "localhost:9200/_template/logstash-2019.08.08-000001" -H 'Content-Type: application/json' -d'
{
  "template": "logstash-2019.08.08-000001",
  "order": 1,
  "settings": {
    "index": {
      "refresh_interval": "5s"
    }
  },
  "mappings": {
    "default": {
      "_all": {
        "norms": false,
        "enabled": true
      },
      "properties": {
        "Packet.L3.Src": { "type": "ip" },
        "Packet.L3.Dst": { "type": "ip" }
      }
    }
  }
}
'

FYI, this was run directly on the command line within my Elasticsearch Docker container (v7.3.0). The error came back as a single run-on message; I've added newlines below to try to make sense of it:

{"error":{
		"root_cause":[
			{
					"type":
						"mapper_parsing_exception",
						"reason":"Root mapping definition has unsupported parameters:  
							[default : 
								{_all={norms=false, enabled=true}, 
								properties={Packet.L3.Src={type=ip}, 
								Packet.L3.Dst={type=ip}}}]"}],
								"type":"mapper_parsing_exception","reason":"Failed to parse mapping [_doc]: 
								Root mapping definition has unsupported parameters:  
								[default :
								{_all={norms=false, enabled=true}, 
								properties={Packet.L3.Src={type=ip}, 
								Packet.L3.Dst={type=ip}}}]",
								
								"caused_by":{
								"type":"mapper_parsing_exception",
								"reason":"Root mapping definition has unsupported parameters:  
								[default : 
								{_all={norms=false, enabled=true}, 
								properties={Packet.L3.Src={type=ip}, Packet.L3.Dst={type=ip}}}]"}},
	"status":400
}

Maybe I'm misreading this, but it looks like Elasticsearch is telling me, "I don't see a Packet.L3.Src or Packet.L3.Dst field within logstash-2019.08.08-000001." Am I missing the bigger picture?

Thanks

Hi,
You're right, you cannot apply a template to an existing index; it applies only at index creation time. But that's not a showstopper, even if you want to keep your data.

First, you can choose your index name in the output (and optionally apply a date pattern):

output {
  elasticsearch {
    hosts => ["http://192.168.3.8:9200"]
    index => "myindex-%{+YYYY.MM.dd}"
  }
}

Specifically for your case, this should work:

  • create the index template you need (I recommend you read up a bit on this, you may need more than my snippet; there's a sketch after this list) - use Kibana Dev Tools, it's easy to work from there.
  • stop logstash
  • modify logstash output with your new index name, then restart logstash
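
For reference, here is a sketch of a template that should be accepted on 7.x, built from the field names in this thread (the template and index names are placeholders). Note that mapping types were removed in 7.0, which is why your earlier attempt was rejected: the "default" type name and "_all" are no longer supported, "template" has been replaced by "index_patterns", and the mapping now sits directly under "mappings":

PUT _template/myindex-template
{
  "index_patterns": ["myindex-*"],
  "order": 1,
  "settings": {
    "index": {
      "refresh_interval": "5s"
    }
  },
  "mappings": {
    "properties": {
      "Packet.L3.Src": { "type": "ip" },
      "Packet.L3.Dst": { "type": "ip" }
    }
  }
}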

If you need the data you already have in Elasticsearch, you can take it from the logstash index and move it into your new index using the Reindex API.
This is fairly easy to use:

POST _reindex
{
  "source": {
    "index": "logstash-*"
  },
  "dest": {
    "index": "myindex"
  }
}

Thanks Alex,

So if I'm following you correctly... it's too late to apply an Index Template to my existing Elasticsearch index because that index has already been created.

The solution is to:

  1. Create the Index Template I need
  2. Specify in Kibana that the Index Template is to be used with a new, yet-to-be-created ES Index (perhaps named "logstash-2019.08.26-NEW" or whatever)
  3. Stop Logstash
  4. Redirect Logstash's output to pump data into the newly-created ES index
  5. Restart Logstash

Did I get you right? Thanks so much!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.