srinathji
(Srinath Ji)
October 10, 2017, 1:30pm
1
Hi all,
Please help me. I am getting the error below while saving a DataFrame to Elasticsearch during Spark Streaming:
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot determine write shards for [custAct_index/custAct_index_type];
likely its format is incorrect (maybe it contains illegal characters?)
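For reference, here is a stripped-down version of the kind of write that hits this error (the host, port and sample data below are placeholders, not my actual job). One thing worth checking: Elasticsearch index names must be all lowercase, so the uppercase "A" in custAct_index may be exactly the illegal character the message is hinting at.

import org.apache.spark.sql.SparkSession
import org.elasticsearch.spark.sql._   // brings saveToEs into scope on DataFrame

object EsWriteSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("es-write-sketch")
      .master("local[*]")
      .config("es.nodes", "localhost")   // placeholder Elasticsearch host
      .config("es.port", "9200")
      .getOrCreate()
    import spark.implicits._

    // tiny placeholder DataFrame standing in for the real registered-users data
    val df = Seq(("0001BDD9-EABF-4D0D-81BD-D9EABFCD0D7D", "F")).toDF("swid", "gender_cd")

    // Elasticsearch requires lowercase index names, so an all-lowercase
    // resource is used here instead of "custAct_index/custAct_index_type".
    df.saveToEs("custact_index/custact_index_type")
  }
}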
srinathji
(Srinath Ji)
October 10, 2017, 2:26pm
2
Below is the Gist for the error I am facing:
1.tsv
1331799426 2012-03-15 01:17:06 2860005755985467733 4611687631188657821 FAS-2.8-AS3 N 0 99.122.210.248 1 0 10 http://www.acme.com/SH55126545/VD55170364 {7AAB8415-E803-3C5D-7100-E362D7F67CA7} U en-us,en;q=0.5 516 575 1366 Y N Y 2 0 304 sbcglobal.net 15/2/2012 4:16:0 4 240 45 41 10002,00011,10020,00007 Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 48 0 2 3 0 homestead usa 528 fl 0 0 0 0 0 WPLG 0 120 WPLG 0
1331800486 2012-03-15 01:34:46 2859997896193943381 6917530184062522013 FAS-2.8-AS3 N 0 69.76.12.213 1 0 10 http://www.acme.com/SH55126545/VD55177927 {8D0E437E-9249-4DDA-BC4F-C1E5409E3A3B} U en-us,en;q=0.5 591 0 0 U U Y 0 0 300 rr.com 15/2/2012 1:7:2 4 420 45 41 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0.2) Gecko/20100101 Firefox/10.0.2 48 0 2 11 0 coeur d alene usa 881 id 0 0 0 0 0 KXLY 0 120 KXLY 0
1331857433 2012-03-15 17:23:53 2781404195155152050 6917530222178992983 FAS-2.8-AS3 N 0 67.240.15.94 1 0 10 http://www.acme.com/SH55126545/VD55166807 {E3FEBA62-CABA-11D4-820E-00A0C9E58E2D} U en-us 590 686 1278 Y Y Y 2 0 300 rr.com 15/2/2012 19:11:20 4 240 45 2 00011,10020 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/534.52.7 (KHTML, like Gecko) Version/5.1.2 Safari/534.52.7 71 0 20 59 0 queensbury usa 532 ny 0 0 0 0 0 WTEN 0 120 WTEN 0
ElasticSpark.scala
package org.elasticsearch.spark
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.sql.{SQLContext, SparkSession}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.{Seconds, StreamingContext}
This file has been truncated.
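Since the file is cut off at the imports, here is a rough sketch of how those pieces are typically wired together for a Kafka 0.10 direct stream that indexes each micro-batch into Elasticsearch. The topic name, broker address, field layout and Elasticsearch host below are assumptions for illustration, not the original truncated code.

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.sql.SparkSession
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.elasticsearch.spark.sql._   // saveToEs on DataFrame

object ElasticSparkSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-to-es")
      .config("es.nodes", "localhost")                  // assumed Elasticsearch host
      .getOrCreate()
    val ssc = new StreamingContext(spark.sparkContext, Seconds(10))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9092",          // assumed Kafka broker
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "custact-stream",
      "auto.offset.reset" -> "latest")

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("clickstream"), kafkaParams))

    // Convert each micro-batch into a DataFrame and index it; only the first
    // two tab-separated fields of the click-log line are kept in this sketch.
    stream.map(_.value).foreachRDD { rdd =>
      import spark.implicits._
      val df = rdd.map(_.split("\t")).map(f => (f(0), f(1))).toDF("epoch", "event_time")
      df.saveToEs("custact_index/custact_index_type")   // all-lowercase index name
    }

    ssc.start()
    ssc.awaitTermination()
  }
}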
regusers.tsv
SWID BIRTH_DT GENDER_CD
0001BDD9-EABF-4D0D-81BD-D9EABFCD0D7D 8-Apr-84 F
00071AA7-86D2-4EB9-871A-A786D27EB9BA 7-Feb-88 F
00071B7D-31AF-4D85-871B-7D31AFFD852E 22-Oct-64 F
0007967E-F188-4598-9C7C-E64390482CFB 1-Jun-66 M
There are more than three files.
system
(system)
Closed
November 7, 2017, 2:27pm
3
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.