Import CSV with different column names to same field

Hi guys,
I have 1 index in elasticsearch that I want to aggregate 2 CSV files data. So, I create the index and the mapping:
index: test
mapping with the fields: ID, Date, Name, Reference, Subject, Operator
I have 2 CSV files, let's supose:
File1:
ID, Date, Name
File 2:
Identifier, Ref, Subject, Operator

How can I match the column "Identifier" from file2 with "ID" from file1?
And one more question:
I have different pipelines to each file... So, one pipeline import file1 and match the fields that file contains, and same appends with file2..
Can you help me to do that?

Many thanks!

Hi there,

How can I match the column "Identifier" from file2 with "ID" from file1?

After the CSV filter on file2, just put a rename filter that moves the "Identifier" value into the "ID" field. Then you can use the update/upsert technique I showed you in the other thread.

I have different pipelines to each file... So, one pipeline import file1 and match the fields that file contains, and same appends with file2..
Can you help me to do that?

I didn't get this one. Help you do what? Making a single pipeline that gets in input both the files?

Can you do me a example of do the filter rename?

"Can you help me do that" was about all the question.. I only say that I have different pipelines only for you understand how I am work right now.

Ok so, admitting you have files like the following:
image

image

You can even do a single pipeline like the following:

input {
  file {
    path => "/absolute_path_to_file1.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    tags => ["file1"]
  }

  file {
    path => "absolute_path_to_file2.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    tags => ["file2"]
  }
}

filter {
  if "file1" in [tags] {
    csv {
      separator => ","
      columns => ["ID","Date","Name"]
    }

    if [ID] == "ID" {
      drop{}
    }
  } else if "file2" in [tags] {
    csv {
      separator => ","
      columns => ["ID","Ref","Subject","Operator"]
    }

    if [ID] == "Identifier" {
      drop{}
    }
  }
}

output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "test"
    document_id => "%{ID}"
    action => "update"
    doc_as_upsert => true
  }
}

This way, you don't even need to rename anything, since you're directly mapping the Identifier value to the ID field.

In your elasticsearch you'll have something like this:

Does this answer to all your questions?

And if I have a field in index that's "type", for example, if I have:
file1 - ID, DATE, NAME
file2 - Identifier, Ref, Subject, Operator

And in the field "type" of index I want to put the "Ref" of file2... How can I do that using the example you make?
Sorry for the time losed, but I have a big project that I'm working on, and it's more complex that seems.
I'm using ELK for first time..
Thanks you!

It's not a problem for me (I'll reply whenever I can). The problem is for you because you're limiting yourself to copy and paste solutions provided by others, without learning anything.

What you asked is very trivial once I provided you with the previous example. You can easily achieve what you want simply chaning this line in the second if statement

columns => ["ID","Ref","Subject","Operator"]

into this line

columns => ["ID","type","Subject","Operator"]

This way, similarly to how it was possible to store the Identifier value into the ID field, you will store the Ref value into the type field.

Anyway, just as a tip, try to learn something from what you are answered in the forum or you'll risk to make the same questions again without learning anything and you might end up being stuck in the middle of something, waiting for someone to answer a question you already have an answer to :wink:

And if I have one field on CSV file that I don't want to import in index, how can I do that?
For example:
if I have:
file1 - ID, DATE, NAME
file2 - Identifier, number, Ref, Subject, Operator

And I don't want field "number"..

How can I do this on this line:
columns => ["ID","type","Subject","Operator"]

Just like this?:
columns => ["ID","","type","Subject","Operator"]

And one more question:
this name fields corresponding that name fields on index, right? or this name fields are the fields of CSV? just a question for I understand..
columns => ["ID","type","Subject","Operator"]

Can you gave me an email for I make you some simple questions? Or can I chat with you from any place?

Thanks in advance

Can you gave me an email for I make you some simple questions? Or can I chat with you from any place?

You can write your questions here since they can be useful to future readers.

Just like this?:
columns => ["ID","","type","Subject","Operator"]

Why asking me rather than trying it out yourself and find out? As I said, if you simply copy and paste what I write here, you'll learn nothing.

If it doesn't work, you can always store that piece of info in a field (whatever field, for example the number field) and then remove that field with the appropriate filter plugin.

this name fields corresponding that name fields on index, right? or this name fields are the fields of CSV? just a question for I understand..

Not sure I got this right. You asking me if what you write in columns => ["ID","type","Subject","Operator"] is what you will find in the elasticsearch index? Again, why asking me rather than trying it out yourself and see what happens changing a letter of one of those fields? What tragic event could happen?

Anyway, yes, the names you give to your columns there is exactly what you will find in your elasticsearch index.

Ok. thank you
Another question:

In my CSV file, say me the error:
_csvparsefailure

I think it was because in my CSV file I have some lines with break line (enter key).. and that regists doesn't import...
can it be possible?

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.