Index parent-child data from csv to elasticsearch

Hi,

I have two CSV files with a one-to-many relationship. How would I index both files into the same index while preserving the relationship, so that I can build a detailed drill-down visualisation based on a selection?

CSV 1 : customerID(123), "chandan", "xxxxx@gmail.com"
CSV 2 : customerID(123), "productID(213)", "New product"
        customerID(123), "productID(321)", "old product"

My parent conf file is

input {
    file {
        path => ["C:/XXXXX/XXXX/OrderDetail_20190130173419.txt"]
        type => "orderDetail"
        start_position => "beginning"
        ignore_older => 0
    }
}

filter {
    csv {
        columns => ["customer ID", "name", "email"]
        separator => "|"
        skip_header => "true"
        remove_field => ["host", "message", "path"]
    }
    mutate {
        add_field => { "family" => "orderDetail" }
    }
    fingerprint {
        source => "customer ID"
        target => "[@metadata][fingerprint]"
        method => "MURMUR3"
    }
}

output {
    stdout { codec => rubydebug }
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "test-%{+YYYY.MM.dd}"
        document_id => "%{[@metadata][fingerprint]}"
    }
}

and the child conf file is

input {
    file {
        path => ["C:/XXXXXX/XXXXX/BCC_OFAP_Open_OrderActivities_20190130173419.txt"]
        type => "orderActivity"
        start_position => "beginning"
        ignore_older => 0
    }
}

filter {
    csv {
        columns => ["customer ID", "product ID", "product Desc"]
        separator => "|"
        skip_header => "true"
        remove_field => ["host", "message", "path"]
    }
    fingerprint {
        source => "customer ID"
        target => "[@metadata][fingerprint]"
        method => "MURMUR3"
    }
    mutate {
        # Note: field references use the [parent][child] syntax, not "['family']['name']"
        add_field => { "[family][name]" => "orderActivity" }
        add_field => { "[family][parent]" => "%{[@metadata][fingerprint]}" }
    }
}

output {
    stdout { codec => rubydebug }
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "test-%{+YYYY.MM.dd}"
        routing => "%{[@metadata][fingerprint]}"
    }
}

With the above files, I get 3 separate documents in the index. Instead, I would expect 1 document per customer, containing an array of that customer's product details.
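For reference, the single document I am after would look roughly like this (the "products" field name is just illustrative):

{
  "customer ID": "123",
  "name": "chandan",
  "email": "xxxxx@gmail.com",
  "products": [
    { "product ID": "213", "product Desc": "New product" },
    { "product ID": "321", "product Desc": "old product" }
  ]
}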

This is a lookup-enrichment scenario: you need to enrich each order activity with the customer info.
You might be able to use the translate filter to do the lookup while processing the BCC_OFAP_Open_OrderActivities_20190130173419.txt file.
There are two problems I can think of:

  1. The name of the file seems to be dynamic or tied to a specific date and time.
  2. The structure will need to change. The translate filter expects a two-column CSV of the form key,value, where in your case the key is the customer ID and the value is the rest of the columns as a |-delimited string. The filter puts that |-delimited string into a "target" field, which you then parse with the csv or dissect filter (dissect is faster here, since the structure is well known and not likely to change on the fly the way log/metrics data can).
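As a rough sketch (untested; the dictionary path C:/data/customers.csv, the customer_raw field name, and the fallback value are all illustrative assumptions, not your real setup), the filter section of the child pipeline could look like:

filter {
    csv {
        columns => ["customer ID", "product ID", "product Desc"]
        separator => "|"
    }
    # Look up the customer by ID; the matched dictionary value (a
    # "name|email" string) lands in the customer_raw field
    translate {
        field => "customer ID"
        dictionary_path => "C:/data/customers.csv"
        destination => "customer_raw"
        fallback => "unknown|unknown"
    }
    # Split the |-delimited dictionary value back into named fields
    dissect {
        mapping => { "customer_raw" => "%{name}|%{email}" }
    }
}

The dictionary file would have to be regenerated from the customer CSV whenever it changes; translate can pick up changes periodically via its refresh_interval option.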

Let us know how the files are generated and we can advise further.