Index logic documentation?

brandondash · June 4, 2018, 7:34pm

I am looking for a writeup on when/how Elasticsearch decides to create new indices.

I went to list all indices on my single-node ELK stack server expecting to find exactly two: the one being fed by Logstash and the one created by Kibana. What I found was literally dozens of Logstash indices split by date, all open, with no obvious rhyme or reason to when they were created. (The Kibana one was there too, so at least I got that part right.)

Where can I read up on this so I can stop it (or at least control it)?

warkolm · June 4, 2018, 7:37pm

The only time Elasticsearch will create an index is if it is asked to do so.
That may be an explicit create-index request, or it may happen implicitly when a document is indexed into an index that doesn't exist yet.

What does _cat/indices?v show?

brandondash · June 4, 2018, 7:40pm

That is what _cat/indices?v shows

All yellow (expected since I am single node)
All open (?)
All from logstash (they all follow the same naming convention "logstash-date")
All pri 5
All rep 1
Doc count is all over the place - as small as 11k, as large as 27k
Store size all over the place - 74mb to 105mb

warkolm · June 4, 2018, 7:43pm

Then Logstash is what's causing them to be created.

brandondash · June 4, 2018, 7:47pm

OK, so Logstash is causing them to be created. Presumably each index holds a unique subset of the overall data I ingest. Since the names don't give me any context clues, how do I determine which data ends up in which index? I can tell you I didn't actively ask for any indices to be created. Whatever happened was triggered by Logstash at a date AFTER I plugged in the pipeline.

Better yet, how can I tell Logstash to give me an index name that is actually meaningful?

warkolm · June 4, 2018, 7:48pm

I think you'd need to share your Logstash config. It'll help us understand what's going on.

brandondash · June 4, 2018, 7:53pm

The obvious guess is that a new index request happens every time the Logstash thread is restarted. Is there a way to tell Logstash NOT to ask for a new index and instead feed into the latest current index?

I am happy to post my pipeline, but it isn't very exciting.

warkolm · June 4, 2018, 7:55pm

That's unlikely.

If you can post your config it'll help immensely.

brandondash · June 4, 2018, 8:05pm

input {
    file {
        type => "bbb-web"
        path => [
            "/data/logstash/logs/*conf*{{ logstash_path_qualifier }}/bbb-web.log"
        ]
    }

    file {
        type => "freeswitch-log"
        path => [
            "/data/logstash/logs/*conf*{{ logstash_path_qualifier }}/freeswitch-log.log"
        ]
        codec => multiline {
            pattern => "%{SYSLOGTIMESTAMP} %{HOSTNAME} freeswitch-log: %{TIMESTAMP_ISO8601} "
            negate => "true"
            what => "previous"
            multiline_tag => "freeswitch_multiline"
        }
    }

    file {
        type => "freeswitch-master"
        path => [
            "/data/logstash/logs/*conf*{{ logstash_path_qualifier }}/freeswitch-master.log"
        ]
    }

    file {
        type => "chatdb-mysql-audit"
        path => [
            "/data/logstash/logs/*chatdb*{{ logstash_path_qualifier }}/mysql-audit.log"
        ]
    }

    file {
        type => "confdb-mysql-audit"
        path => [
            "/data/logstash/logs/*confdb*{{ logstash_path_qualifier }}/mysql-audit.log"
        ]
    }

    file {
        type => "nginx-alb"
        path => [
            "/data/logstash/logs/*alb*{{ logstash_path_qualifier }}/nginx-access.log",
            "/data/logstash/logs/*alb*{{ logstash_path_qualifier }}/nginx-error.log"
        ]
    }

    file {
        type => "nginx-conference"
        path => [
            "/data/logstash/logs/*conf*{{ logstash_path_qualifier }}/nginx-access.log",
            "/data/logstash/logs/*conf*{{ logstash_path_qualifier }}/nginx-error.log"
        ]
    }

    file {
        type => "nginx-portal"
        path => [
            "/data/logstash/logs/*portal*{{ logstash_path_qualifier }}/nginx-access.log",
            "/data/logstash/logs/*portal*{{ logstash_path_qualifier }}/nginx-error.log"
        ]
    }

    file {
        type => "openfire-error"
        path => [
            "/data/logstash/logs/*chat*{{ logstash_path_qualifier }}/openfire-error.log"
        ]
    }

    file {
        type => "openfire-info"
        path => [
            "/data/logstash/logs/*chat*{{ logstash_path_qualifier }}/openfire-info.log"
        ]
    }

}

filter {

    if "bbb-web" in [path] {
        grok { 
            patterns_dir => ["/etc/logstash/patterns"]
            match => { "message" => "%{BBB_WEB}" }
        }    
    }

    if "freeswitch-log" in [path] {
        if "freeswitch_multiline" in [tags] {
            # If we find a multiline entry, strip out the recurring prefixes that occur mid-line
            mutate { 
                gsub => [
                    "message",
                    # NOTE - gsub does not recognize predefined grok patterns, so we have to hand enter them
                    "\n\b(?:Jan(?:uary|uar)?|Feb(?:ruary|ruar)?|M(?:a|ä)?r(?:ch|z)?|Apr(?:il)?|Ma(?:y|i)?|Jun(?:e|i)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|O(?:c|k)?t(?:ober)?|Nov(?:ember)?|De(?:c|z)(?:ember)?)\b +(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9]) (?!<[0-9])(?:2[0123]|[01]?[0-9]):(?:[0-5][0-9])(?::(?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?))(?![0-9]) \b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(\.?|\b) freeswitch-log: ",
                    " "
                ]
            }
        }
        grok { 
            patterns_dir => ["/etc/logstash/patterns"]
            match => { "message" => "%{FREESWITCH_LOG}" }
        }    
    }
    
    if "freeswitch-master" in [path] {
        grok { 
            patterns_dir => ["/etc/logstash/patterns"]
            match => { "message" => "%{FREESWITCH_MASTER}" }
        }    
    }

    if "mysql-audit" in [path] {
        grok { 
            patterns_dir => ["/etc/logstash/patterns"]
            match => { "message" => "%{MYSQL_AUDIT}" }
        }    
    }
    
    if "nginx-access" in [path] {
        grok { 
            patterns_dir => ["/etc/logstash/patterns"]
            match => { "message" => "%{NGINX_ACCESS}" }
        }    
    }

    if "nginx-error" in [path] {
        grok { 
            patterns_dir => ["/etc/logstash/patterns"]
            match => { "message" => "%{NGINX_ERROR}" }
        }
    }

    if "openfire-error" in [path] {
        grok { 
            patterns_dir => ["/etc/logstash/patterns"]
            match => { "message" => "%{OPENFIRE_ERROR}" }
        }    
    }

    if "openfire-info" in [path] {
        grok { 
            patterns_dir => ["/etc/logstash/patterns"]
            match => { "message" => "%{OPENFIRE_INFO}" }
        }    
    }

    if "Guest" in [full_name] { mutate { add_tag => "guest_user" } }
    if " FOO " in [full_name] { mutate { add_tag => "FOO" } }
    if " BAR " in [full_name] { mutate { add_tag => "BAR" } }
    if " BAZ " in [full_name] { mutate { add_tag => "BAZ" } }

}

output {
    elasticsearch {
        hosts => [ "localhost:9200" ]
    }
}

brandondash · June 4, 2018, 8:23pm

... and the logstash configuration proper (in case you wanted that too):

path.data: /var/lib/logstash
path.config: /etc/logstash/conf.d
config.reload.automatic: true
path.logs: /var/log/logstash

azhar · June 4, 2018, 9:02pm

Hi,

According to the elasticsearch output plugin documentation, Logstash defaults to the index name "logstash-%{+YYYY.MM.dd}" if none is provided explicitly.

Refer to the plugin documentation for the index option for more details.
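
For illustration, the output block posted earlier in this thread behaves as if the default were written out, as in the sketch below. The date part is filled in from each event's @timestamp (in UTC), so an event lands in the index for the day it was stamped, and a new index appears the first time an event arrives for a new day. Restarting Logstash does not by itself create one. Since the posted pipeline shows no date filter, @timestamp is presumably the time Logstash read the line rather than the time in the log entry.

```conf
output {
    elasticsearch {
        hosts => [ "localhost:9200" ]
        # Default value written out explicitly; omitting "index" gives the same result.
        # %{+YYYY.MM.dd} is formatted from the event's @timestamp (UTC).
        index => "logstash-%{+YYYY.MM.dd}"
    }
}
```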

Hence, if you don't want new indexes to be created every day, you will need to set the index name explicitly in the pipeline config.

output {
  elasticsearch {
    hosts => [ "localhost:9200" ]
    index => "<index_name>"
    action => "index"
  }
}
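
If the goal is index names that say what is inside them, one option is to build the name from a field on the event. The sketch below assumes every event still carries the type value set by the file inputs in the posted pipeline, and that those values stay lowercase (Elasticsearch rejects index names containing uppercase letters). The monthly %{+YYYY.MM} suffix is only an illustrative choice to reduce the number of indices; the plugin does not require it.

```conf
output {
    elasticsearch {
        hosts => [ "localhost:9200" ]
        # Assumption: [type] is set on every event (e.g. "nginx-alb", "openfire-info").
        # Produces names such as "logstash-nginx-alb-2018.06": one index per type per month.
        index => "logstash-%{type}-%{+YYYY.MM}"
    }
}
```

Keeping the logstash- prefix means the index template that Logstash installs by default (which matches logstash-*) should still apply. With a different prefix you would likely need to supply your own template. Dropping the date part entirely gives a single index that keeps growing, which makes removing old data harder.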

warkolm · June 4, 2018, 10:16pm

Why do you want a single big index?

