I can not get it to work , I tried BIGINT, DOUBLE, DECIMAL(6,1) but elasticsearch is showing data type as "text" except for the first field "txn_id" , strange ?
+---------------------+------------+--------------------+--+
| col_name | data_type | comment |
+---------------------+------------+--------------------+--+
| txn_id | bigint | from deserializer |
| txn_process_date | timestamp | from deserializer |
| transp_id | string | from deserializer |
| ext_plaza_id | string | from deserializer |
| transp_cur_bal | bigint | from deserializer |
| ext_lane_id | string | from deserializer |
| ext_date_time | timestamp | from deserializer |
| toll_amt_charged | bigint | from deserializer |
| toll_amt_collected | bigint | from deserializer |
| toll_amt_full | bigint | from deserializer |
| ent_plaza_id | string | from deserializer |
| ent_date_time | timestamp | from deserializer |
| ent_lane_id | string | from deserializer |
+---------------------+------------+--------------------+--+"toll_amt_full" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } }, "txn_id" : { "type" : "long" }, "txn_process_date" : { "type" : "date" }
The data in the fields is a numeric type and I found a document on the web saying that elasticsearch for hadoop will set the type as "long/double" if the data is integer or real .
https://www.elastic.co/guide/en/elasticsearch/hadoop/master/mapping.html
0: jdbc:hive2://hadoop2:10000/default> select toll_amt_collected,toll_amt_full from pa_lane_txn_orc limit 5;
+---------------------+----------------+--+
| toll_amt_collected | toll_amt_full |
+---------------------+----------------+--+
| 79 | 106 |
| 106 | 132 |
| 106 | 132 |
| 102 | 125 |
| 106 | 125 |
I changed the fields containing numbers to 'FLOAT' but only one is converted to float the rest are still text ?
0: jdbc:hive2://hadoop2:10000/default> select toll_amt_full, toll_amt_collected, toll_amt_charged from pa_lane_txn_orc limit 10;
+----------------+---------------------+-------------------+--+
| toll_amt_full | toll_amt_collected | toll_amt_charged |
+----------------+---------------------+-------------------+--+
| 106 | 79 | 79 |
| 132 | 106 | 106 |
| 132 | 106 | 106 |
| 125 | 102 | 106 |"toll_amt_charged" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } }, "toll_amt_collected" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } }, "toll_amt_full" : { "type" : "float" }, "transp_id" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } },
I discovered some more strange behavior , it sometimes create the fields as text and sometimes as FLOAT ? whats going on here .
"toll_amt_charged" : { "type" : "float" }, "toll_amt_collected" : { "type" : "float" }, "toll_amt_full" : { "type" : "float" }, "transp_id" : {