Bash to import data - what could I be doing better


(Brian Dunbar) #1

I have log files from 8 Apache servers from 2014 that I need to import into Elasticsearch.

After much blood, sweat, and tears, this appears to be working. But I'd like to know if there is, perhaps, a better way of doing things.

Log files are in /var/log/muo-2014/$servername/yyyymmdd

The basic idea is to extend the for loop to add hosts app02, app03, and so forth, then call the function with the new variable.

In the day_process function, the if-then avoids processing a non-existent file (on some days, some servers were not online).

The logger lines are there to write the file size to syslog so I can track what's going on.

function day_process() {
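    # compgen -G succeeds when the glob matches one or more files; [ -f ] errors out if several match
    if compgen -G "/var/log/muo-2014/${HOST}/*${PROC_DATE}*" > /dev/null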
    then
        logger "Log Import start ${HOST}/${PROC_DATE}"
        logger "Log Import start ${HOST}/${PROC_DATE} File Size follows"
        # Total size of files over 1,000,000 bytes, in 1K blocks and then human-readable, sent to syslog
        find /var/log/muo-2014/${HOST}/*${PROC_DATE}* -size +1000000c -print0 | du -c --files0-from=- | awk 'END{print $1}' | logger
        find /var/log/muo-2014/${HOST}/*${PROC_DATE}* -size +1000000c -print0 | du -ch --files0-from=- | awk 'END{print $1}' | logger
        cat /var/log/muo-2014/${HOST}/*${PROC_DATE}* | /opt/logstash/bin/logstash -f /root/muo/fileimport.conf
        logger "Log Import End ${HOST}/${PROC_DATE}"
        sleep 1
    else
        echo "/var/log/muo-2014/${HOST}/*${PROC_DATE}* does not exist moving along"
    fi

}

echo "Starting APP01"
HOST=app01

DATE=2014-06-01
for i in {0..30}
do
    NEXT_DATE=$(date +%m-%d-%Y -d "$DATE + $i day")
    PROC_DATE=$(date +%Y%m%d -d "$DATE + $i day")
    day_process
    echo "next_date is $NEXT_DATE"
    echo "proc_date is $PROC_DATE"
done
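
For what it's worth, here is a rough sketch of how the loop might be extended to more hosts without copying the block once per server. It is only an illustration: LOG_ROOT, has_files, import_day, and import_range are made-up names, the host and date are passed as arguments rather than globals, and the echo lines stand in for the real cat-into-logstash step. GNU date is assumed for the -d option.

```shell
# Sketch only: LOG_ROOT, has_files, import_day, and import_range are illustrative names.
LOG_ROOT=${LOG_ROOT:-/var/log/muo-2014}

# True if at least one path matches the host/date pattern, however many match.
has_files() {
    # An unmatched glob stays literal, so the -e test below fails in that case.
    set -- "${LOG_ROOT}/$1"/*"$2"*
    [ -e "$1" ]
}

# Handle one host for one day; host and date arrive as arguments.
import_day() {
    if has_files "$1" "$2"; then
        # The real import (cat ... | logstash -f ...) would go here.
        echo "import $1/$2"
    else
        echo "skip $1/$2"
    fi
}

# Usage: import_range START_DATE DAYS HOST...
import_range() {
    start=$1
    days=$2
    shift 2
    for host in "$@"; do
        i=0
        while [ "$i" -lt "$days" ]; do
            proc_date=$(date +%Y%m%d -d "$start + $i day")
            import_day "$host" "$proc_date"
            i=$((i + 1))
        done
    done
}
```

Something like `import_range 2014-06-01 30 app01 app02 app03` would then walk all of June for three hosts, and adding a server is a matter of adding one more argument.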

(Mark Walkom) #2

Yes, use Logstash! It's just what it was made for.


(Brian Dunbar) #3

This is using logstash?

cat /var/log/muo-2014/logfile | /opt/logstash/bin/logstash

Or did you mean instead something like this?

import.conf

input {
    file {
        path => "/var/log/muo-2014/logfile"
    }
}

I did try that, but ran into the issue where it did not close (because it mimics 'tail -f'), which left me confused about when to exit the process.
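
To illustrate the difference I mean, here is a small sketch that involves no Logstash at all. count_lines is a hypothetical stand-in for any program that reads stdin until end of input, which is what the cat approach relies on; GNU timeout is assumed to be available.

```shell
# Sketch only: count_lines stands in for any program that reads stdin until EOF.
count_lines() { wc -l | tr -d ' '; }

sample=$(mktemp)
printf 'one\ntwo\nthree\n' > "$sample"

# Piped input: cat closes the pipe at end of file, so the reader finishes by itself.
piped_count=$(cat "$sample" | count_lines)

# Followed input: tail -f never reaches EOF, so something else has to stop it.
# Here GNU timeout kills it after one second and reports exit status 124.
timeout 1 tail -f "$sample" > /dev/null
follow_status=$?

echo "piped reader saw $piped_count lines and exited"
echo "tail -f had to be killed (timeout status $follow_status)"
rm -f "$sample"
```

With the pipe, the script knows the import is done as soon as the command returns; with a follow-style reader there is no such signal.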


(Mark Walkom) #4

Use the cat method, that's fine.

(Sorry, I initially misread your post; I was operating on reduced sleep yesterday.)


(Brian Dunbar) #5

No worries: Been there, done that, got the t-shirt.

