CSV files from Filebeat to Elasticsearch

Hi All,

I'm a beginner to the Elastic Stack world and I'm learning how to architect log collection for visualization.

One file I'm working with is a CSV file. I want to feed it into Elasticsearch to visualize it in Kibana. I assume I should use Filebeat for this.

Question #1 - What is Logstash, and would I need it in this scenario?
Question #2 - Is there a template of a config file that will get me started to collect files from a directory and feed to elasticsearch?

Thank you in advance!

Welcome!

  1. Logstash is an ETL (Extract, Transform, Load) tool used for complex use cases. You don't need it here: you can stream the CSV directly to Elasticsearch with Filebeat and use the csv processor (https://www.elastic.co/guide/en/elasticsearch/reference/7.6/csv-processor.html) to transform the data.

  2. Here is one:
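A minimal sketch of a filebeat.yml, assuming a hypothetical watch directory, a local Elasticsearch, and a pipeline name (my-csv-pipeline) — all placeholders you'd adjust for your environment:

```yaml
# filebeat.yml — minimal sketch; paths, hosts, and pipeline name are assumptions
filebeat.inputs:
  - type: log
    paths:
      - /path/to/csv/*.csv        # directory to watch (placeholder)

output.elasticsearch:
  hosts: ["localhost:9200"]
  pipeline: my-csv-pipeline       # must match the ingest pipeline you define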

And the associated ingest pipeline:
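A sketch of what that pipeline body could look like, assuming hypothetical column names (first_name, last_name, city) — replace them with your CSV's actual columns:

```json
{
  "description": "Parse CSV lines into fields (illustrative columns)",
  "processors": [
    {
      "csv": {
        "field": "message",
        "target_fields": ["first_name", "last_name", "city"]
      }
    },
    {
      "remove": {
        "field": "message"
      }
    }
  ]
}
```

The csv processor splits the raw line held in `message` into the listed target fields, and the remove processor then drops the raw line so it isn't indexed twice.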

In the not-too-distant future there will be an easier way to get the ingest pipeline config and Filebeat config than typing them out by hand.

Starting in version 7.7 (not released yet, but not too far off), after importing a CSV file using the File Data Visualizer in Kibana there will be an option to display a sample Filebeat config appropriate for CSV files with the same structure as the sample you uploaded. This is the change made in https://github.com/elastic/kibana/pull/58152, and there's a screenshot in that PR. The File Data Visualizer will leave behind the ingest pipeline appropriate for the CSV columns it saw, so you'll have nearly everything you need. (Just a few details like hostnames and passwords need to be dealt with manually.)

As I said, you cannot do this today but 7.7 is the next release, so it won't be that long before it's possible.


Thank you @dadoonet, this is helpful. Excuse my ignorance, but for my education, can you confirm the flow?

  • CSV file is on my PC in a c:/CSV folder
  • The file is imported through Filebeat, and sent to Elasticsearch.

I have a few grey areas about the ingestion part. Is there anything I have to install to use the csv processor? Does this take place before or after the Filebeat import?

I confirm the flow.

No, there's nothing to install — the csv processor is built into Elasticsearch.

You have to define the ingest pipeline before starting Filebeat.

One more question to help connect the dots (I haven't ingested any files yet; this will be my first time using a Windows installation of the Elastic Stack).

Would the pipeline be in a different configuration file and run prior?

I've been reading and watching YouTube videos. There seem to be many ways to do one half of the configuration, but no step-by-step guide for my end-to-end use case.

I really appreciate the input.

You create the pipeline using the REST API, so there's no separate config file for it. See
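For instance, with curl (the pipeline name my-csv-pipeline and the column names are illustrative assumptions; the name must match the `pipeline` setting in your Filebeat output config):

```bash
# Create the ingest pipeline via the REST API (name and columns are placeholders)
curl -X PUT "localhost:9200/_ingest/pipeline/my-csv-pipeline" \
  -H 'Content-Type: application/json' \
  -d '{
    "description": "Parse CSV lines",
    "processors": [
      { "csv": { "field": "message", "target_fields": ["first_name", "last_name", "city"] } }
    ]
  }'
```

You can run the same request from the Dev Tools console in Kibana; once the pipeline exists, start Filebeat and each CSV line it ships will be parsed on ingest.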

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.