Space is getting filled. Need some information on how to clear/reclaim the space

dadoonet · May 7, 2018, 2:49pm

So with:

yellow open fuselog-2018.05.05 pGGh2mwASGmHET_YKSgldA 5 1 28867041 0 103.4gb 103.4gb
yellow open fuselog-2018.05.07 UQvSXBKlSjeJM_bngM01wQ 5 1 35098404 0 124.6gb 124.6gb
yellow open data AlnpmM8QSeiFciLtJTkDCA 5 1 0 0 810b 810b
yellow open fuselog-2018.05.06 GqrrGFuvT56cWfCdEjcO8g 5 1 30833617 0 112.2gb 112.2gb
yellow open .kibana MtbGHB3TSz6KivnuRarZQQ 1 1 32 2 80.6kb 80.6kb
yellow open fuselog-2018.05.04 Cq1Cluf8SEK6F2wv3NFNGw 5 1 45942884 0 164.8gb 164.8gb

We can see that each fuselog-* index takes more than 100gb for roughly 30 to 45 million documents.
So this is consistent with what you are seeing on disk.
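As a rough cross-check of the listing above, the average on-disk size per document can be worked out from the document count and store size of each index. The sketch below is only illustrative: the numbers are copied from the listing, `avg_bytes_per_doc` is a hypothetical helper name, and it assumes the `gb` figures are binary gigabytes (1024^3 bytes).

```python
# Figures copied from the index listing above: (index, docs.count, store.size in gb).
INDICES = [
    ("fuselog-2018.05.04", 45942884, 164.8),
    ("fuselog-2018.05.05", 28867041, 103.4),
    ("fuselog-2018.05.06", 30833617, 112.2),
    ("fuselog-2018.05.07", 35098404, 124.6),
]

GIB = 1024 ** 3  # assumption: "gb" in the listing means 1024^3 bytes


def avg_bytes_per_doc(doc_count: int, size_gb: float) -> float:
    """Average on-disk bytes per document (hypothetical helper for illustration)."""
    return size_gb * GIB / doc_count


for name, docs, size_gb in INDICES:
    print(f"{name}: {avg_bytes_per_doc(docs, size_gb):.0f} bytes per document")
```

Each index works out to a little under 4 KB per document, so the disk usage follows directly from the volume and size of the events being indexed.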

I don't understand what your question is then.

Tamal_Kundu · May 7, 2018, 2:54pm

Yes, it seems accurate.

Tamal_Kundu · May 7, 2018, 2:56pm

Can we arrange a call or something, so that we can discuss this matter?

dadoonet · May 7, 2018, 3:08pm

No, sorry. If you need private, direct support, you need to engage with Elastic: https://www.elastic.co/subscriptions

You can continue asking your questions here.

Tamal_Kundu · May 7, 2018, 3:14pm

Okay.

So I have 1.5 TB of disk, and I can see that the indices are growing by almost 150 GB per day. The payload generates a huge amount of data. Is there any way I can reduce the space utilization in Elasticsearch?
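To put these numbers in perspective, here is a back-of-the-envelope estimate of how many days of daily indices fit on the disk. It is only a sketch under stated assumptions: `days_until_full` is a hypothetical helper, 1.5 TB is taken as 1500 GB, and the 85% usable fraction is an assumption chosen to mirror Elasticsearch's default low disk watermark.

```python
def days_until_full(disk_gb: float, daily_gb: float, usable_fraction: float = 0.85) -> float:
    """Days of daily indices that fit before the usable share of the disk is consumed.

    usable_fraction is an assumption (0.85 mirrors the default low disk watermark).
    """
    return disk_gb * usable_fraction / daily_gb


# 1.5 TB taken as 1500 GB, growing by about 150 GB per day (figures from the post above).
print(f"{days_until_full(1500, 150):.1f} days of retention")
```

At this ingest rate only about a week and a half of data fits, so either the per-event size has to come down or older daily indices have to be removed regularly.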

Christian_Dahlqvist · May 7, 2018, 3:16pm

I did provide a number of suggestions in this post.

Tamal_Kundu · May 7, 2018, 3:20pm

Yes, I have posted there too.

[fuseadmin@a0110pcsgmon02 logstash-5.5.0]$ cat logstash.conf
input {
beats {
type => beats
port => 5044
}
}

filter {
if [type] == "log" {
#Grok to get SourcesystemID
grok {
match => {
"message" => "(?<=SourceSystemID:)%{WORD:sourceSystemID}"
}
}

    if ![sourceSystemID] {
     grok {
           match => {
                   "message" => "(?<=ChannelID:)%{DATA:sourceSystemID}(?>\|)"
           }
       }
    if ![sourceSystemID] {
            drop {}
    }
    }

    #Grok to get Container Name
     grok {
            match => {
                    "message" => "(?<=ContainerName:)%{GREEDYDATA:containerName}"
            }
    }

#Grok to get InvocationPoint
grok {
match => {
"message" => "(?<=LogPoint:)%{WORD:logPoint}"
}
}
if ![logPoint] {
grok {
match => {
"message" => "(?<=InvocationPoint:)%{DATA:logPoint}(?>\|)"
}
}
}
#Grok to get LogTimestamp
grok {
match => {
"message" => "(?<=LogTimestamp:)%{TIMESTAMP_ISO8601:logTimestamp}"
}
}
if ![logTimestamp] {
grok {
match => {
"message" => "%{TIMESTAMP_ISO8601:logTimestamp}%{SPACE}\|%{SPACE}%{LOGLEVEL:level}%{SPACE}\|%{SPACE}%{DATA:thread}%{SPACE}\|%{SPACE}%{DATA:serviceNameOld}%{SPACE}\|%{SPACE}%{DATA:bundle}%{SPACE}\|%{SPACE}%{GREEDYDATA:logdetails}"
}
}
#hardcoded to get if the log is first or last entry
grok {
match => {"logdetails" => "%{WORD:first_word}"}
}
}

    #Grok to get GUID

     grok {
            match => {
                    "message" => "(?<=GUID:)%{DATA:GUID}(?>\|)"
            }
    }

    #Grok to get ServiceName

     grok {
            match => {
                    "message" => "(?<=ServiceName:)%{DATA:serviceName}(?>\|)"
            }
    }


    #Grok to get ServerName

     grok {
            match => {
                    "message" => "(?<=ManagedServer:)%{IP:managedServer}"
            }
    }

    #Grok to get ErrorCode

     grok {
            match => {
                    "message" => "(?<=ErrorCode:)%{DATA:errorCode}(?>\|)"
            }
    }

    date {
            match => ["logTimestamp" , "ISO8601"]
    }


    #Grok to get ReferenceID added on 16th Apr 2018 by Rudrajit

     grok {
              match => {
                      "message" => "(?<=ReferenceID:)%{DATA:ReferenceID}(?>\|)"
              }
      }

    #Grok to get TargetService added on 16th Apr 2018 by Rudrajit

     grok {
               match => {
                       "message" => "(?<=TargetService:)%{DATA:TargetService}(?>\|)"
               }
       }







    #tag the log entry with first or last, drop other entry
    if [logTimestamp] != "" {

if [errorCode] != "" {

mutate {

add_tag => ["error_log"]

}

} else {

     if [first_word] == "Incoming_Request" {
            mutate {
                    add_tag => ["start_log"]
            }
     } else if [first_word] == "Outbound" {
            mutate {
                    add_tag => ["end_log"]
            }
    } else if [logPoint] == "InboundReq" {
            mutate {
                    add_tag => ["start_log"]
            }
    } else if [logPoint] == "InboundResp" {
            mutate {
                    add_tag => ["end_log"]
            }
    } else {
    #        drop {}
    }

    } else {
   #         drop {}
    }

    #start logstash processing to get response time
    elapsed {
            start_tag => "start_log"
            end_tag => "end_log"
            unique_id_field => "GUID"
            new_event_on_match => false
    }

}
}

    output {

elasticsearch {
hosts =>["10.89.13.28:9200"]
manage_template => true
index => "fuselog-%{+YYYY.MM.dd}"
#index => filebeat
#document_type => "%{[@metadata][type]}"
}
}

dadoonet · May 7, 2018, 3:36pm

Please format your code, logs or configuration files using </> icon as explained in this guide and not the citation button. It will make your post more readable.

Or use markdown style like:

```
CODE
```

There's a live preview panel for exactly this reason.

Lots of people read these forums, and many of them will simply skip over a post that is difficult to read, because it's just too large an investment of their time to try and follow a wall of badly formatted text.
If your goal is to get an answer to your questions, it's in your interest to make it as easy to read and understand as possible.
Please update your post.

I'm not going to answer more in this thread as @Christian_Dahlqvist already helped you in the other thread. Let's keep the rest of the discussion there.

Tamal_Kundu · May 8, 2018, 5:40am

[fuseadmin@a0110pcsgmon02 logstash-5.5.0]$ cat logstash.conf
input {
  beats {
     type => beats
     port => 5044
  }
}

filter {
  if [type] == "log" {
         #Grok to get SourcesystemID
         grok {
                match => {
                        "message" => "(?<=SourceSystemID:)%{WORD:sourceSystemID}"
                }
        }

        if ![sourceSystemID] {
         grok {
               match => {
                       "message" => "(?<=ChannelID:)%{DATA:sourceSystemID}(?>\|)"
               }
           }
        if ![sourceSystemID] {
                drop {}
        }
        }

        #Grok to get Container Name
         grok {
                match => {
                        "message" => "(?<=ContainerName:)%{GREEDYDATA:containerName}"
                }
        }

 #Grok to get InvocationPoint
        grok {
                match => {
                        "message" => "(?<=LogPoint:)%{WORD:logPoint}"
                }
        }
        if ![logPoint] {
        grok {
                match => {
                        "message" => "(?<=InvocationPoint:)%{DATA:logPoint}(?>\|)"
                }
        }
        }
        #Grok to get LogTimestamp
        grok {
                match => {
                        "message" => "(?<=LogTimestamp:)%{TIMESTAMP_ISO8601:logTimestamp}"
                }
        }
        if ![logTimestamp] {
        grok {
                match => {
                       "message" => "%{TIMESTAMP_ISO8601:logTimestamp}%{SPACE}\|%{SPACE}%{LOGLEVEL:level}%{SPACE}\|%{SPACE}%{DATA:thread}%{SPACE}\|%{SPACE}%{DATA:serviceNameOld}%{SPACE}\|%{SPACE}%{DATA:bundle}%{SPACE}\|%{SPACE}%{GREEDYDATA:logdetails}"
                }
        }
          #hardcoded to get if the log is first or last entry
        grok {
                match => {"logdetails" => "%{WORD:first_word}"}
        }
        }

        #Grok to get GUID

         grok {
                match => {
                        "message" => "(?<=GUID:)%{DATA:GUID}(?>\|)"
                }
        }

        #Grok to get ServiceName

         grok {
                match => {
                        "message" => "(?<=ServiceName:)%{DATA:serviceName}(?>\|)"
                }
        }


        #Grok to get ServerName

         grok {
                match => {
                        "message" => "(?<=ManagedServer:)%{IP:managedServer}"
                }
        }

        #Grok to get ErrorCode

         grok {
                match => {
                        "message" => "(?<=ErrorCode:)%{DATA:errorCode}(?>\|)"
                }
        }

        date {
                match => ["logTimestamp" , "ISO8601"]
        }


        #Grok to get ReferenceID added on 16th Apr 2018 by Rudrajit

         grok {
                  match => {
                          "message" => "(?<=ReferenceID:)%{DATA:ReferenceID}(?>\|)"
                  }
          }

        #Grok to get TargetService added on 16th Apr 2018 by Rudrajit

         grok {
                   match => {
                           "message" => "(?<=TargetService:)%{DATA:TargetService}(?>\|)"
                   }
           }







        #tag the log entry with first or last, drop other entry
        if [logTimestamp] != "" {
#        if [errorCode] != "" {
#                mutate {
#                        add_tag => ["error_log"]
#                }
#        } else {

         if [first_word] == "Incoming_Request" {
                mutate {
                        add_tag => ["start_log"]
                }
         } else if [first_word] == "Outbound" {
                mutate {
                        add_tag => ["end_log"]
                }
        } else if [logPoint] == "InboundReq" {
                mutate {
                        add_tag => ["start_log"]
                }
        } else if [logPoint] == "InboundResp" {
                mutate {
                        add_tag => ["end_log"]
                }
        } else {
        #        drop {}
        }

        } else {
       #         drop {}
        }

        #start logstash processing to get response time
        elapsed {
                start_tag => "start_log"
                end_tag => "end_log"
                unique_id_field => "GUID"
                new_event_on_match => false
        }
}
}

        output {
elasticsearch {
    hosts =>["10.89.13.28:9200"]
    manage_template => true
    index => "fuselog-%{+YYYY.MM.dd}"
    #index => filebeat
    #document_type => "%{[@metadata][type]}"
  }
}

Kindly check and suggest, please.
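Because the output above writes one index per day (`fuselog-%{+YYYY.MM.dd}`), one common way to reclaim space is to delete whole daily indices once they are older than a retention window. The following is a minimal sketch of the selection step only, not a definitive implementation: `expired_indices` and `keep_days` are hypothetical names, and the script just prints the `DELETE /<index>` requests (Elasticsearch's delete index API) rather than sending them. A tool such as Curator can do the same job.

```python
from datetime import date, datetime, timedelta

PREFIX = "fuselog-"        # matches index => "fuselog-%{+YYYY.MM.dd}" in the output block
DATE_FORMAT = "%Y.%m.%d"   # Python equivalent of YYYY.MM.dd


def expired_indices(index_names, today, keep_days):
    """Return daily indices dated more than keep_days days before today (hypothetical helper)."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in index_names:
        if not name.startswith(PREFIX):
            continue  # leave other indices such as .kibana or data alone
        try:
            day = datetime.strptime(name[len(PREFIX):], DATE_FORMAT).date()
        except ValueError:
            continue  # prefix matches but the suffix is not a date
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


# Index names taken from the listing earlier in this thread.
names = [
    "fuselog-2018.05.05", "fuselog-2018.05.07", "data",
    "fuselog-2018.05.06", ".kibana", "fuselog-2018.05.04",
]

for name in expired_indices(names, today=date(2018, 5, 7), keep_days=2):
    print(f"DELETE /{name}")  # printed only; nothing is sent to the cluster
```

Deleting an entire index frees its disk space immediately, which is why date-based index names make retention simple.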

system · June 5, 2018, 5:40am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
