Create nested fields from XPath & check for existing documents

I have two questions:

  1. Parsing XML data and adding it to an array in a record in an index.
  2. Checking for an existing record in an index and, if it exists, adding the new data to the array of the existing record.

I have a jdbc input that has an XML column:

input {
  jdbc {
    ....
    statement => "SELECT event_xml....
  }
}

Then an xml filter parses the data.
How do I make the last three xpaths into an array? Do I need a mutate or ruby filter? I can't seem to figure it out.

filter {  
  xml {       
    source => "event_xml"              
    remove_namespaces => true 
    store_xml => false
    force_array => false
    xpath => [ "/CaseNumber/text()", "case_number" ]
    xpath => [ "/FormName/text()", "[conversations][form_name]" ]
    xpath => [ "/EventDate/text()", "[conversations][event_date]" ]
    xpath => [ "/CaseNote/text()", "[conversations][case_note]" ]
  }
}

So it would look something like this in Elasticsearch:

{
    "case_number" : "12345",
    "conversations" :
        [
            {
                "form_name" : "form1",
                "event_date" : "2019-01-09T00:00:00Z",
                "case_note" : "this is a case note"
            }
        ]                
}

The second question: if a document with a case_number of "12345" already exists, then instead of creating a new document I want to add the new XML values to the existing record, so it would look like this:

{
    "case_number" : "12345",
    "conversations" : [
        {
            "form_name" : "form1",
            "event_date" : "2019-01-09T00:00:00Z",
            "case_note" : "this is a case note"
        },
        {
            "form_name" : "form2",
            "event_date" : "2019-05-09T00:00:00Z",
            "case_note" : "this is another case note"
        }
    ]                
}

My output:

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "cases"
    manage_template => false
  }
}

Is this possible? Thanks.

For the first question I would use a ruby filter.

For the second, does this help?

This ruby filter created the array:

ruby {
  code => '
    event.set("conversations", [Hash[
      "publish_event_id", event.get("publish_event_id"),
      "form_name", event.get("form_name"),
      "event_date", event.get("event_date"),
      "case_note", event.get("case_note")
    ]])
  '
}
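
In case it helps anyone who wants to try the reshaping outside Logstash, here is a minimal plain-Ruby sketch of what the ruby filter above does. It assumes the four fields are already flat, top-level values; `build_conversations`, `CONVERSATION_KEYS` and the sample values (including the `publish_event_id` of 1) are made up for illustration and are not part of any Logstash API.

```ruby
# Hypothetical helper: wrap the flat per-event fields into a one-element
# "conversations" array, the same shape the ruby filter above builds.
CONVERSATION_KEYS = %w[publish_event_id form_name event_date case_note].freeze

def build_conversations(fields)
  # Pick only the conversation keys and put the resulting hash in an array.
  [CONVERSATION_KEYS.map { |key| [key, fields[key]] }.to_h]
end

# Sample data, invented for this sketch.
fields = {
  "case_number"      => "12345",
  "publish_event_id" => 1,
  "form_name"        => "form1",
  "event_date"       => "2019-01-09T00:00:00Z",
  "case_note"        => "this is a case note"
}

conversations = build_conversations(fields)
puts conversations.inspect
```

Inside Logstash the same idea applies, with `event.get` and `event.set` taking the place of the plain hash lookups.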

The output side was resolved with:

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "cases"  
    document_id => "%{case_number}"
    action => "update"
    doc_as_upsert => true
    script => "     
                boolean recordExists = false;                                                        
                for (int i = 0; i < ctx._source.conversations.length; i++) 
                {                  
                    if(ctx._source.conversations[i].publish_event_id == params.event.get('conversations')[0].publish_event_id)
                    {
                        recordExists = true;
                    }                  
                }     
                if(!recordExists){
                    ctx._source.conversations.add(params.event.get('conversations')[0]); 
                }
              "
    manage_template => false
  }
}
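
For anyone reading the script above, this is a rough plain-Ruby model of its logic, only for the case where the document already exists in the index. It is a sketch and not what Elasticsearch runs: `merge_conversation` and the sample documents are hypothetical, and plain hashes stand in for `ctx._source` and `params.event`.

```ruby
# Hypothetical stand-in for the Painless script: append the incoming
# conversation unless one with the same publish_event_id is already stored.
def merge_conversation(stored_doc, incoming_event)
  incoming = incoming_event["conversations"][0]
  already_stored = stored_doc["conversations"].any? do |conversation|
    conversation["publish_event_id"] == incoming["publish_event_id"]
  end
  stored_doc["conversations"] << incoming unless already_stored
  stored_doc
end

# Sample data, invented for this sketch.
stored = {
  "case_number"   => "12345",
  "conversations" => [{ "publish_event_id" => 1, "form_name" => "form1" }]
}
new_event = {
  "case_number"   => "12345",
  "conversations" => [{ "publish_event_id" => 2, "form_name" => "form2" }]
}

merge_conversation(stored, new_event)
merge_conversation(stored, new_event) # same id again, so nothing is added
puts stored["conversations"].length   # prints 2
```

The check on publish_event_id is what keeps a re-run of the jdbc query from adding the same conversation twice.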
