Insert tuples of undefined length into ES


(Florent Valdelievre) #1

Please consider the following content in test_data.txt. Each line contains an arbitrary number of comma-delimited strings.

C#,winform,C++
java,pig,elasticsearch,hadoop

I would like to load each line into a tuple and store it in Elasticsearch, so that each document looks like the following:

"tags": [
 [ C#,winform,C++]
]

I have done the following:

DEFINE EsStorage org.elasticsearch.hadoop.pig.EsStorage (
                      'es.http.timeout= 5m',
                      'es.index.auto.create = true',
                      'es.nodes = X.X.X.X'
               );

ANSWERS = LOAD 'test_data.txt' USING PigStorage(',');
TAGS = FOREACH ANSWERS GENERATE * as (tags:tuple());
STORE TAGS INTO 'test/post' USING EsStorage;

FYI:

DESCRIBE TAGS;
TAGS: {tags: ()}

DUMP TAGS;
(C#,winform,C++)
(java,pig,elasticsearch,hadoop)

However, no tags are inserted into Elasticsearch, just two empty documents. Any idea why?

{
	"took": 3,
	"timed_out": false,
	"_shards": {
		"total": 2,
		"successful": 2,
		"failed": 0
	},
	"hits": {
		"total": 2,
		"max_score": 1,
		"hits": [{
			"_index": "test",
			"_type": "post",
			"_id": "AVV1U57OmyhhWgMuCNGg",
			"_score": 1,
			"_source": {}
		}, {
			"_index": "test",
			"_type": "post",
			"_id": "AVV1U57OmyhhWgMuCNGf",
			"_score": 1,
			"_source": {}
		}]
	}
}
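
One variation that may be worth trying is sketched here. It is untested and rests on an assumption: that the empty `_source` comes from the `tags` tuple having no declared fields in its schema (see the DESCRIBE output), which may leave EsStorage with nothing to serialize. The sketch reads each line as a single chararray with TextLoader and splits it with TOKENIZE, which returns a bag with a declared schema. It assumes the es-hadoop jar is registered the same way as in the script above. The alias LINES and the field name line are placeholders introduced for illustration.

```pig
-- Untested sketch: read each line whole, then split it into a bag
-- whose schema is known to Pig (and therefore visible to EsStorage).
DEFINE EsStorage org.elasticsearch.hadoop.pig.EsStorage (
                      'es.http.timeout = 5m',
                      'es.index.auto.create = true',
                      'es.nodes = X.X.X.X'
               );

-- TextLoader yields one chararray field per input line,
-- regardless of how many commas the line contains.
LINES = LOAD 'test_data.txt' USING TextLoader() AS (line:chararray);

-- TOKENIZE splits on the given delimiter and returns a bag of
-- single-field tuples, one per tag.
TAGS = FOREACH LINES GENERATE TOKENIZE(line, ',') AS tags;

-- Check that the schema now lists a bag with a declared inner field.
DESCRIBE TAGS;

STORE TAGS INTO 'test/post' USING EsStorage;
```

How the bag ends up in the stored document (a flat array of strings or an array of one-element arrays) depends on the es-hadoop Pig type mapping, so the resulting `_source` should be checked before relying on this.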
