Can someone please help? Aggregation error. How to use existing index and add additional info?

I have an existing index, say student-master. I now have a CSV whose rows I need to match against the documents in that index by ID, combine with them, and write to a new index. Both use the same ID.

I tried using the aggregate filter, but it did not work. Can someone please help? I have been on this for days.

I tried the below code:

input {
  elasticsearch {
    hosts => "localhost"
    index => "student-master"
    docinfo => true
    tags => ["in1"]
  }

  file {
    path => "/Users/dineshgupta/Downloads/student_marks_new.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    tags => ["in2"]
  }
}

filter {
  aggregate {
    task_id => "%{ID}"
    code => "
      if (event.get('tags').include('in1'))
        map['Gender'] = event.get('Gender');
        map['State'] = event.get('State');
      else
        map['Chemistry'] = event.get('Chemistry');
        map['Physics'] = event.get('Physics');
      end
      event.cancel();
    "
    #inactivity_timeout => 300  #seconds since last event
    #push_map_as_event_on_timeout => true
    #timeout_task_id_field => "ID"
  }
}

output {
  elasticsearch {
    #action => update
    doc_as_upsert => true
    document_type => "doc"
    document_id => "%{ID}"
    index => "students-new-%{+YYYY.MM.dd}"
  }
  stdout {
    codec => rubydebug
  }
}

My output is not what I expected. Can someone please tell me what mistake I am making here?
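
One possible way to make the aggregate approach above emit merged documents is sketched below. It makes three changes relative to the config above, all of them assumptions about the intended behaviour: CSV lines are parsed before aggregation so that ID exists on events from the file input, the Ruby method is spelled include?, and the timeout options are enabled so that each map is pushed as a new event (with event.cancel and no push, nothing reaches the output). The column list is a guess at the CSV layout. The aggregate filter also needs a single pipeline worker (-w 1) so that all events for one ID pass through the same filter instance.

```conf
filter {
  # Events from the file input carry the raw line in "message"; parse it first.
  # Assumed column order; a header row, if present, would be parsed as data too.
  if "in2" in [tags] {
    csv {
      columns   => ["ID", "Physics", "Chemistry"]
      separator => ","
    }
  }

  aggregate {
    task_id => "%{ID}"
    code => "
      if event.get('tags').include?('in1')
        # document read from the existing index
        map['Gender'] = event.get('Gender')
        map['State'] = event.get('State')
      else
        # row read from the CSV file
        map['Chemistry'] = event.get('Chemistry')
        map['Physics'] = event.get('Physics')
      end
      # drop the partial event; only the merged map is emitted later
      event.cancel
    "
    inactivity_timeout => 300            # seconds since the last event for this ID
    push_map_as_event_on_timeout => true # emit the merged map as a new event
    timeout_task_id_field => "ID"        # copy the task id into the emitted event
  }
}
```

With this sketch a merged event only appears once no event for that ID has arrived for 300 seconds, so the output is delayed by design.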

It is unclear what you are trying to do, but if you are trying to map a field using a mapping from a CSV file, I would look at the translate filter.

I have an index with fields such as ID, Gender, Name, etc. I also have a CSV that has the ID field followed by marks such as Math, Physics, English, etc.

I need to do a join (in SQL terms) and build a final index that has the fields of the existing index plus the additional data (Math, Physics, English).

For example, the index has: ID, Name, Gender.

The CSV has: ID, Math, Physics, Chemistry.

The final output (in a new index) should have: ID, Name, Gender, Math, Physics, and Chemistry. Note that the two can be matched on ID.

How do I do this and what changes do I make to my code? Thank you so much for your help.

Use a translate filter. The CSV will need to have two columns, the first being the ID and the second holding all the marks for that ID. Then you should find that something like this works:

translate {
    field => "ID"
    target => "marks"
    dictionary_path => "/path/to/file.csv"
}

Then use another filter such as csv to separate the marks field into individual subjects.
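
A minimal sketch of how those two steps could fit together, assuming the dictionary file stores each student's marks as one ";"-separated value in the second column. The sample rows, the ";" separator, and the subject order are illustrative assumptions, not taken from the actual file. Recent versions of the translate plugin call the lookup options source and target; older versions call them field and destination.

```conf
# Hypothetical contents of /path/to/file.csv (two columns: ID, then all marks):
#   101,78;85
#   102,64;91

filter {
  # Look up the event's ID in the dictionary and store the matching value in "marks"
  translate {
    source          => "ID"      # "field" in older plugin versions
    target          => "marks"   # "destination" in older plugin versions
    dictionary_path => "/path/to/file.csv"
  }

  # Split the looked-up value into one field per subject (assumed order)
  csv {
    source    => "marks"
    separator => ";"
    columns   => ["Physics", "Chemistry"]
  }
}
```

Packing the marks into a single second column is what lets a wide CSV fit the two-column dictionary format; the second filter then undoes that packing on each enriched event.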

That means I can use my existing index and the CSV, right? I'm sorry, I'm very new to this.

Yes it does.

I'm so sorry, Badger. I've tried implementing it, but I don't understand how I could use the translate plugin for enrichment.

Here is my code:

input {
  elasticsearch {
    hosts => "localhost"
    index => "student-master"
    docinfo => true
    tags => ["in1"]
  }
}

filter {
  csv {
    columns => ["ID", "Physics", "Chemistry"]
    separator => ","
  }

  translate {
    dictionary_path => "/Users/user/Downloads/student_marks_new.csv"
    field => "[ID]"
    destination => "[marks]"
  }
  #dissect { mapping => { "marks" => "%{Chemistry};%{Physics}" } }
}

output {
  stdout {
    codec => rubydebug
  }
}

So how do I use translate? I read a lot about it and tried a lot of things, but I'm unable to solve it.

I have an index (ID, Name, Gender) that I want to match against a CSV that has ID, Physics, Chemistry, etc.

I'm sorry, I'm just starting out and have been on this for weeks. Please help me out. Thank you so much.

What does the second line of your CSV file look like? Make sure you use markdown to preserve the format.

If you change your output to

output { stdout { codec => rubydebug } }

what does a single event look like? Feel free to redact or obfuscate personal data.

Hello Badger.

I actually managed to solve this after seeing another thread of yours. This is my code.

Thread for reference.

Edited: Had an error in CSV (duplicate date, it's correct now)
