How do I load multiple CSV files into a single Elasticsearch index using Logstash?

I have two CSV files. CSV 1 (student_master) has the fields Student ID, First Name, Gender, and State. CSV 2 (student_marks_data) has Student ID, Date, Math, Physics, Chemistry, Total, Percentage, and Grade.

I need to create a Logstash pipeline to load CSV 1 into an Elasticsearch index called “student_master”.

While loading the student marks data, I have to look up the “student_master” index (created in the previous step) by the student_id column, fetch the student’s first name, last name, and gender, and load the combined record into a new index named “student_marks_[current_date]”.

In the end I need a single index whose documents contain Student ID, Name, Gender, Date, Math, Physics, Chemistry, Total, Percentage, and Grade.
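That is, each row of the marks CSV should become one document of roughly this shape (field names taken from the descriptions above; the values are just placeholders):

    {
      "Student ID": "...",
      "Name": "...",
      "Gender": "...",
      "Date": "...",
      "Math": "...",
      "Physics": "...",
      "Chemistry": "...",
      "Total": "...",
      "Percentage": "...",
      "Grade": "..."
    }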

I tried the configuration below:

    input {
      file {
        type => "csv1"
        path => "/home/vunet/Downloads/student_master.csv"
        start_position => "beginning"
        sincedb_path => "/dev/null"
      }
      file {
        type => "csv2"
        path => "/home/vunet/Downloads/student_marks_new.csv"
        start_position => "beginning"
        sincedb_path => "/dev/null"
      }
    }

    filter {
      if [type] == "csv1" {
        csv {
          columns => [ "ID", "First name", "Last name", "Gender", "City", "State" ]
          remove_field => ["City", "State"]
        }
      }

      if [type] == "csv2" {
        csv {
          columns => [ "ID", "Date", "Chemistry", "Physics", "Biology", "Total", "Percentage", "Grade" ]
          remove_field => ["ID"]
        }
        date {
          match => ["Date", "dd/MM/yyyy"]
          target => "@timestamp"
        }
      }
    }

    output {
      # elasticsearch {
      #   doc_as_upsert => true
      #   document_type => "doc"
      #   index => "students-new-%{+YYYY.MM.dd}"
      # }
      stdout {
        codec => rubydebug
      }
    }

How do I make a join (in SQL terms), i.e. match the ID from one CSV against the ID from the other, so that the result ends up in a single index?
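So far the closest thing I have found is the `elasticsearch` filter plugin, which can query an existing index for each event and copy fields from the matched document onto that event. A rough sketch of what I am imagining (the hosts value is an assumption; it also requires keeping the ID field on the csv2 events instead of removing it, and student_master must already be fully indexed before the marks file is processed):

    filter {
      if [type] == "csv2" {
        # Look up the student_master document whose ID matches this
        # event's ID and copy the name/gender fields onto the marks event.
        elasticsearch {
          hosts  => ["localhost:9200"]
          index  => "student_master"
          query  => "ID:%{[ID]}"
          fields => {
            "First name" => "First name"
            "Last name"  => "Last name"
            "Gender"     => "Gender"
          }
        }
      }
    }

    output {
      if [type] == "csv2" {
        elasticsearch {
          hosts => ["localhost:9200"]
          # Daily index name in the student_marks_[current_date] pattern
          index => "student_marks_%{+YYYY.MM.dd}"
        }
      }
    }

Would that work, or is the `translate` filter (with the master data exported as a dictionary file) the more usual approach when the lookup table is small?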

Please do help. Thank you in advance.
