Index parent-child data from CSV to Elasticsearch

Hi,

I have 2 CSV files with a 1-to-many relationship. How would I index both files into the same index while keeping that relationship, so that I can build a deep-dive visualisation that drills into the detail based on a selection?

CSV 1: customerID(123), "chandan", "xxxxx@gmail.com"
CSV 2: customerID(123), "productID(213)", "New product"
       customerID(123), "productID(321)", "old product"

My parent conf file is

input {
    file {
        path => ["C:/XXXXX/XXXX/OrderDetail_20190130173419.txt"]
        type => "orderDetail"
        start_position => "beginning"
        ignore_older => 0
    }
}

filter {
    csv {
        columns => ["customer ID","name","email"]
        separator => "|"
        skip_header => "true"
        remove_field => [ "host", "message", "path" ]
    }
    mutate {
        add_field => { "family" => "orderDetail" }
    }
    fingerprint {
        source => "customer ID"
        target => "[@metadata][fingerprint]"
        method => "MURMUR3"
    }
}

output {
    stdout { codec => rubydebug }
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "test-%{+YYYY.MM.dd}"
        document_id => "%{[@metadata][fingerprint]}"
    }
}

and my child conf file is

input {
    file {
        path => ["C:/XXXXXX/XXXXX/BCC_OFAP_Open_OrderActivities_20190130173419.txt"]
        type => "orderActivity"
        start_position => "beginning"
        ignore_older => 0
    }
}

filter {
    csv {
        columns => ["customer ID","product ID","product Desc"]
        separator => "|"
        skip_header => "true"
        remove_field => [ "host", "message", "path" ]
    }
    fingerprint {
        source => "customer ID"
        target => "[@metadata][fingerprint]"
        method => "MURMUR3"
    }
    mutate {
        # nested fields use the [family][name] reference syntax
        add_field => { "[family][name]" => "orderActivity" }
        add_field => { "[family][parent]" => "%{[@metadata][fingerprint]}" }
    }
}

output {
    stdout { codec => rubydebug }
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "test-%{+YYYY.MM.dd}"
        routing => "%{[@metadata][fingerprint]}"
    }
}

With the above files I am getting a total of 3 entries in the same index; instead, it should create 1 entry per customer, with the two product details stored as an array inside that same entry.
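In other words, I am hoping to end up with a single document per customer, shaped roughly like this (the field names are only illustrative):

{
  "customer ID": "123",
  "name": "chandan",
  "email": "xxxxx@gmail.com",
  "products": [
    { "product ID": "213", "product Desc": "New product" },
    { "product ID": "321", "product Desc": "old product" }
  ]
}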

This is a lookup enrichment scenario: your need is to enrich each order activity (each order line) with the customer info.
You might be able to use the translate filter, in the pipeline that reads the BCC_OFAP_Open_OrderActivities_20190130173419.txt file, to do that lookup.
There are two problems I can think of:

  1. The file names seem to be dynamic, or at least tied to a specific date and time, which makes it hard to point a lookup dictionary at a fixed path.
  2. The structure will need to change. The translate filter takes a two-column CSV dictionary of the form key,value. In your case the key is the customer id, and the value needs to be the rest of the columns joined into a single |-delimited string. Translate puts that |-delimited string into a "target" field, which you then need to parse with the csv or dissect filter (dissect is faster here, as the structure is well known and not likely to change on the fly the way log/metrics data can). See the sketch below.
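For illustration only, here is a rough sketch of what that could look like in the order-activities pipeline. It assumes the customer CSV has first been rewritten as a two-column dictionary at a hypothetical static path, C:/lookups/customers.csv, with one line per customer such as 123,chandan|xxxxx@gmail.com; the customer_info field name and the fallback value are placeholders as well:

filter {
    csv {
        # same parsing as in your child conf
        columns => ["customer ID","product ID","product Desc"]
        separator => "|"
        skip_header => "true"
    }
    translate {
        field => "customer ID"                        # lookup key
        dictionary_path => "C:/lookups/customers.csv" # hypothetical static path to the restructured customer file
        destination => "customer_info"                # receives e.g. "chandan|xxxxx@gmail.com"
        fallback => "unknown|unknown"                 # used when the customer id is not in the dictionary
    }
    dissect {
        # split the |-delimited value back into separate fields
        mapping => { "customer_info" => "%{name}|%{email}" }
    }
    mutate {
        remove_field => [ "customer_info" ]           # drop the intermediate field
    }
}

Each order activity document then carries the customer's name and email directly, which is usually easier to build Kibana visualisations on than a parent/child layout.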

Let us know how the files are generated and we can advise further.
