Display the same fields in multiple index documents using data visualizations


#1

Hi, I am trying to use Kibana to display the same field from multiple index documents at once. I have a few database log indices (e.g. prod1-db.log-*, prod2-db.log-*, prod3-db.log-*), each of which has the same document structure, for example:

{
  "_index": "prod1-db.log-2016.08.18",
  "_type": "db.log",
  "_id": "AVadEaq7",
  "_score": null,
  "_source": {
    "message": "2016-07-08T12:52:42.026+0000 I NETWORK  [conn4928242] end connection 192.168.170.62:47530 (31 connections now open)",
    "@version": "1",
    "@timestamp": "2016-08-18T09:50:54.247Z",
    "type": "log",
    "input_type": "log",
    "count": 1,
    "beat": {
      "hostname": "prod1",
      "name": "prod1"
    },
    "offset": 1421607236,
    "source": "/var/log/db/db.log",
    "fields": null,
    "host": "prod1",
    "tags": [
      "beats_input_codec_plain_applied"
    ]
  },
  "fields": {
    "@timestamp": [
      1471513854247
    ]
  },
  "sort": [
    1471513854247
  ]
} 

Now I want to display the message field from all the indices at once in the dashboard, but I couldn't find a suitable visualization to present each message. What's the best way to do that?

many thanks


(Thomas Neirynck) #2

Hi daiyue,

Thanks for your questions. There seem to be multiple things there:

So you want to display a single visualization that aggregates stats across those 3 indices? Right now, a visualization is tied to a single index. Could you merge them into a single index?

Or did you ask what type of visualization would work for the "message" field? It's quite a dense text field, with a lot of information embedded in it. What in particular are you interested in seeing from it?

thanks,


#3

Hi, firstly, I want to display each index in its own visualization, which means 3 indices, 3 visualizations.

Second, I am wondering what type of visualization could work for the message field. I'd like to see it in its entirety if possible, since that keeps the semantics complete.

cheers


(Thomas Neirynck) #4

If I understand the message field correctly, it looks like it is unique (?) per document. It will be difficult to do something meaningful with it in the charts: line charts, bar charts, etc. all need to be able to aggregate on a value.

If you just want to present each message entirely, the "Data Table" visualization may work for you.

  • In the "Metrics" section, select the "Count" aggregation.
  • For your "Buckets", select a "Terms" aggregation on the "message" field.
  • Set the "size" to something sufficiently large.

That will display the full message in a table.
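For reference, the Data Table described above is driven by an ordinary terms aggregation under the hood. A minimal sketch of the equivalent request body (the size of 500 is just a placeholder, and the field must be aggregatable, i.e. not analyzed, for whole messages to come back):

```json
{
  "size": 0,
  "aggs": {
    "messages": {
      "terms": {
        "field": "message",
        "size": 500
      }
    }
  }
}
```

Posting this body to a search endpoint such as /prod1-db.log-*/_search returns each distinct term with its document count, which is essentially what the Data Table renders.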


#5

Hi, I tried the following setting for one of the indices,

but it only shows the term frequencies and rankings for the words in the message field, instead of showing each message in its entirety.


(Thomas Neirynck) #6

Hi daiyue,

could you do message.keyword instead?

The screenshot is a small example that I used to test.


#7

Hi, I couldn't find an option for message.keyword in the Field drop-down box, and I couldn't edit any item in the drop-down. I am using Kibana 4.5.


(Thomas Neirynck) #8

tl;dr: you're right. The .keyword field is a special case. I created the example with the 5.0 alpha versions of ES and Kibana. For older versions, you will need to explicitly configure the field mapping in the index using the 'not_analyzed' index setting.

There's been a change in the way ES handles new indices by default. From 5.0 onwards, it generates a .keyword sub-field for string fields, which allows you to aggregate on the entire field value (more details can be found at https://github.com/elastic/elasticsearch/issues/12394).

In older versions, you can achieve the same effect by setting the mapping properties of the field correctly. Specifically, you will need to set the index property of the field to 'not_analyzed' when indexing the data.

So you may have to reindex your data to achieve this.
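A minimal sketch of such a mapping at index-creation time (the type name db.log follows the example document above; the 2.x-era string syntax is assumed):

```json
{
  "mappings": {
    "db.log": {
      "properties": {
        "message": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
```

PUT this body when creating the new index (e.g. curl -XPUT 'localhost:9200/new-index' -d @mapping.json), before any documents are indexed, then reindex the existing data into it.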

For an example configuration in 4.5, look at the default logstash-* index created by makelogs. It creates an extra ".raw" sub-field, with "index": "not_analyzed", for every string field.

  "@message": {
    "type": "string",
    "norms": {
      "enabled": false
    },
    "fields": {
      "raw": {
        "type": "string",
        "index": "not_analyzed"
      }
    }
  }
For another example, you can also check the mapping of the user_id field at https://www.elastic.co/guide/en/elasticsearch/reference/2.0/mapping.html.

  "user_id":  {
      "type":   "string", 
      "index":  "not_analyzed"
    },

#10

I tried the following PUT request to add a new mapping (to make the message field not_analyzed) to the existing index (coredev-core.*):

curl -g -X XPUT 'localhost:9200/coredev-core.*/_mapping/message {"properties": {"message": {"type": "string", "index": "not_analyzed"}}}'  

But, Elasticsearch generated an error,

java.lang.IllegalArgumentException: invalid version format: {"PROPERTIES": {"MESSAGE": {"TYPE": "STRING", "INDEX": "NOT_ANALYZED"}}} HTTP/1.1

How can I fix this issue?

cheers


(Thomas Neirynck) #11

Hi daiyue,

not quite sure what is going on

Could you share a few representative sample documents with a message field (the raw version, not the indexed one)? I can try and reproduce here on my end.

thanks,


#12

Hi, the following are a few samples of our logs:

pathname~/project_directory/core/core_api/core_inbound_api.py||timestamp~2016-07-20 16:34:09,955||level~DEBUG||name~core_log||function_name~run_analysis||line_no~66||debug_message~/run_analysis called
pathname~/project_directory/core/core_api/core_inbound_api.py||timestamp~2016-07-20 16:34:09,955||level~DEBUG||name~core_log||function_name~run_analysis||line_no~73||debug_message~rawbody.decode() =  {"client_id": "unit_test_data", "system_id": "contrived_test_data", "queue_id": "578fa801fe892f19cc448e02", "to_date": "2016-07-20 16:34:04.666000", "from_date": "1961-10-17 16:34:04.666000", "analysis_name": "test_analysis_name"}
pathname~/project_directory/core/core_data_handling/load_data.py||timestamp~2016-07-20 16:34:09,985||level~INFO||name~core_log||function_name~connect_to_database||line_no~22||debug_message~Connecting to mongo at server: 192.168.207.203
pathname~/project_directory/core/core_data_handling/load_data.py||timestamp~2016-07-20 16:34:10,013||level~INFO||name~core_log||function_name~load_tables_from_database||line_no~150||debug_message~Loaded 19 vendors
pathname~/project_directory/core/core_data_handling/load_data.py||timestamp~2016-07-20 16:34:10,106||level~INFO||name~core_log||function_name~load_tables_from_database||line_no~150||debug_message~Loaded 21 line_items

These logs are fetched by Filebeat. How do I reindex the log data to apply the new mapping in this case? What are the steps?

cheers


#13

Hi, the IllegalArgumentException has been solved by using a template.json in Logstash that makes the message field not_analyzed. The message is now displayed as a whole string, after a slight delay in ELK.
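For anyone hitting the same problem, such a template.json might look roughly like this (the template pattern and the _default_ type mapping are placeholders to adapt to your own setup):

```json
{
  "template": "coredev-core.*",
  "mappings": {
    "_default_": {
      "properties": {
        "message": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
```

Pointing the Logstash elasticsearch output at it with the template => "/path/to/template.json" and template_overwrite => true options makes newly created indices pick up the mapping automatically.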

cheers


(system) #14