Import CSV with different column names to same field

brunojpsantos · March 16, 2020, 10:00am

Hi guys,
I have 1 index in elasticsearch that I want to aggregate 2 CSV files data. So, I create the index and the mapping:
index: test
mapping with the fields: ID, Date, Name, Reference, Subject, Operator
I have 2 CSV files, let's supose:
File1:
ID, Date, Name
File 2:
Identifier, Ref, Subject, Operator

How can I match the column "Identifier" from file2 with "ID" from file1?
And one more question:
I have different pipelines to each file... So, one pipeline import file1 and match the fields that file contains, and same appends with file2..
Can you help me to do that?

Many thanks!

Fabio-sama · March 16, 2020, 12:00pm

Hi there,

How can I match the column "Identifier" from file2 with "ID" from file1?

After the CSV filter on file2, just put a rename filter that moves the "Identifier" value into the "ID" field. Then you can use the update/upsert technique I showed you in the other thread.

I have different pipelines to each file... So, one pipeline import file1 and match the fields that file contains, and same appends with file2..
Can you help me to do that?

I didn't get this one. Help you do what? Making a single pipeline that gets in input both the files?

brunojpsantos · March 16, 2020, 12:10pm

Can you do me a example of do the filter rename?

"Can you help me do that" was about all the question.. I only say that I have different pipelines only for you understand how I am work right now.

Fabio-sama · March 16, 2020, 12:35pm

Ok so, admitting you have files like the following:

You can even do a single pipeline like the following:

input {
  file {
    path => "/absolute_path_to_file1.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    tags => ["file1"]
  }

  file {
    path => "absolute_path_to_file2.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    tags => ["file2"]
  }
}

filter {
  if "file1" in [tags] {
    csv {
      separator => ","
      columns => ["ID","Date","Name"]
    }

    if [ID] == "ID" {
      drop{}
    }
  } else if "file2" in [tags] {
    csv {
      separator => ","
      columns => ["ID","Ref","Subject","Operator"]
    }

    if [ID] == "Identifier" {
      drop{}
    }
  }
}

output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "test"
    document_id => "%{ID}"
    action => "update"
    doc_as_upsert => true
  }
}

This way, you don't even need to rename anything, since you're directly mapping the Identifier value to the ID field.

In your elasticsearch you'll have something like this:

Does this answer to all your questions?

brunojpsantos · March 17, 2020, 10:19am

And if I have a field in index that's "type", for example, if I have:
file1 - ID, DATE, NAME
file2 - Identifier, Ref, Subject, Operator

And in the field "type" of index I want to put the "Ref" of file2... How can I do that using the example you make?
Sorry for the time losed, but I have a big project that I'm working on, and it's more complex that seems.
I'm using ELK for first time..
Thanks you!

Fabio-sama · March 17, 2020, 10:38am

It's not a problem for me (I'll reply whenever I can). The problem is for you because you're limiting yourself to copy and paste solutions provided by others, without learning anything.

What you asked is very trivial once I provided you with the previous example. You can easily achieve what you want simply chaning this line in the second if statement

columns => ["ID","Ref","Subject","Operator"]

into this line

columns => ["ID","type","Subject","Operator"]

This way, similarly to how it was possible to store the Identifier value into the ID field, you will store the Ref value into the type field.

Anyway, just as a tip, try to learn something from what you are answered in the forum or you'll risk to make the same questions again without learning anything and you might end up being stuck in the middle of something, waiting for someone to answer a question you already have an answer to

brunojpsantos · March 31, 2020, 9:08am

And if I have one field on CSV file that I don't want to import in index, how can I do that?
For example:
if I have:
file1 - ID, DATE, NAME
file2 - Identifier, number, Ref, Subject, Operator

And I don't want field "number"..

How can I do this on this line:
columns => ["ID","type","Subject","Operator"]

Just like this?:
columns => ["ID","","type","Subject","Operator"]

And one more question:
this name fields corresponding that name fields on index, right? or this name fields are the fields of CSV? just a question for I understand..
columns => ["ID","type","Subject","Operator"]

Can you gave me an email for I make you some simple questions? Or can I chat with you from any place?

Thanks in advance

Fabio-sama · March 31, 2020, 10:08am

Can you gave me an email for I make you some simple questions? Or can I chat with you from any place?

You can write your questions here since they can be useful to future readers.

Just like this?:
columns => ["ID","","type","Subject","Operator"]

Why asking me rather than trying it out yourself and find out? As I said, if you simply copy and paste what I write here, you'll learn nothing.

If it doesn't work, you can always store that piece of info in a field (whatever field, for example the number field) and then remove that field with the appropriate filter plugin.

this name fields corresponding that name fields on index, right? or this name fields are the fields of CSV? just a question for I understand..

Not sure I got this right. You asking me if what you write in columns => ["ID","type","Subject","Operator"] is what you will find in the elasticsearch index? Again, why asking me rather than trying it out yourself and see what happens changing a letter of one of those fields? What tragic event could happen?

Anyway, yes, the names you give to your columns there is exactly what you will find in your elasticsearch index.

brunojpsantos · March 31, 2020, 11:02am

Ok. thank you
Another question:

In my CSV file, say me the error:
_csvparsefailure

I think it was because in my CSV file I have some lines with break line (enter key).. and that regists doesn't import...
can it be possible?

system · April 28, 2020, 11:02am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Load multiple csv files and update existing index Logstash	7	2721	November 1, 2019
Import csv files to two different index Logstash	6	975	August 31, 2020
Merging CSV files based on a common key column Logstash	2	1755	July 30, 2018
Multiple CSV files need to upload for same column names Logstash	10	2524	February 6, 2018
Import Multiple Files CSV - Aggregate Data Logstash	3	457	April 6, 2020

Import CSV with different column names to same field

Related topics