Hi,
I'm not able to join two CSV files based on a primary key (a common field with the same values in both files).
I'm taking two CSV input files, shown below.
Student Table (Table 1)
Stdid,sname,fee
121,john,10000
123,glenn,12000
124,James,14000
125,nick,15000
126,jimmy,16000
The two files share a common field (stdid); we need to do a join transformation on that field and merge the files based on the conditions.
For that we need a config file to load the data into Elasticsearch through Logstash.
This type of use case is usually better suited to a traditional relational database. If you set up a small MySQL database and load those CSV files into it as tables, then you can query MySQL through the Logstash JDBC input with something like

SELECT t1.Stdid, t1.sname, t1.fee, t2.bookname FROM Student t1, Library t2 WHERE t1.Stdid = t2.stdid

and ingest the result into Elasticsearch.
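A minimal sketch of such a pipeline, assuming a local MySQL instance with a database named `school` and the MySQL JDBC driver on disk (the connection string, credentials, paths, and index name below are all illustrative, not from the original post):

```
input {
  jdbc {
    # Assumed local MySQL setup -- adjust connection string, user, and password
    jdbc_connection_string => "jdbc:mysql://localhost:3306/school"
    jdbc_user => "logstash"
    jdbc_password => "changeme"
    # Path to the MySQL Connector/J jar you downloaded
    jdbc_driver_library => "/path/to/mysql-connector-j.jar"
    jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
    # The join is done by the database, not by Logstash
    statement => "SELECT t1.Stdid, t1.sname, t1.fee, t2.bookname FROM Student t1 JOIN Library t2 ON t1.Stdid = t2.stdid"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "student_library"
  }
}
```

Each row returned by the SELECT becomes one Elasticsearch document, so the joined result lands in a single flat index.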
Alternatively, you could load all of the Student table into Elasticsearch first, and then, as you ingest the Library table, query the Student index and write the enriched Library documents into another index. That's not as elegant, though; like I said, this kind of work is best done by a relational database.
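If you do take that enrichment route, the Logstash elasticsearch filter plugin can perform the lookup. A rough sketch, assuming the Student rows were already indexed into an index called `students` and the Library CSV has the columns mentioned above (file path, index names, and column list are assumptions):

```
input {
  file {
    path => "/path/to/library.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

filter {
  # Assumed Library CSV layout: stdid,bookname
  csv {
    columns => ["stdid", "bookname"]
  }
  # Look up the matching student document and copy its fields onto this event
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "students"
    query => "Stdid:%{stdid}"
    fields => { "sname" => "sname" "fee" => "fee" }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "library_enriched"
  }
}
```

Note that this does one Elasticsearch query per Library row, which is part of why the relational-database approach is the cleaner option.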