I'm trying to work on the KDD99 dataset for fraud detection. In the dataset, a record looks like this:
0,tcp,http,SF,215,45076,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,0,0,0.00,0.00,0.00,0.00,0.00,.00,0.00,0.00,normal.
A record represents a connection. For each connection, the data set contains information like the number of bytes sent, login attempts, TCP errors, and so on. Each connection is one line of CSV-formatted data, containing 41 features (38 of them numeric) followed by a label.
I want to write the data into Elasticsearch using Spark, so that I can first explore it visually with Kibana before performing deeper computation with Spark to predict whether a record is a fraudulent action or not.
The issue is that up to and including Scala 2.10, a case class is limited to 22 fields, which means that I can't create a case class to represent a record.
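To make the problem concrete, here is a minimal sketch that splits the sample record above on commas and compares the column count with the 22-field limit. The names `RecordShape`, `sample`, `fields`, `features` and `label` are only illustrative, and treating the last column as the label is an assumption based on the sample line.

```scala
object RecordShape {
  // Sample record copied verbatim from above.
  val sample: String =
    "0,tcp,http,SF,215,45076,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,0,0,0.00,0.00,0.00,0.00,0.00,.00,0.00,0.00,normal."

  // Split on commas; assume the last column is the label and the rest are features.
  val fields: Array[String] = sample.split(',')
  val features: Array[String] = fields.init
  val label: String = fields.last

  def main(args: Array[String]): Unit = {
    println(s"columns = ${fields.length}, label = $label")
    // A case class with one field per column would need more than 22 fields.
    println(s"exceeds the 22-field limit: ${fields.length > 22}")
  }
}
```

For this sample the split yields 42 comma-separated columns, the last one being the label `normal.`, so a one-field-per-column case class does not fit within the limit.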
How can I work around this limitation without switching to Scala 2.11, which seems to solve the issue (SI-7296)?
I appreciate your help. Thanks in advance!