I need to compare two excel reports (old and new), each reports have ~200 columns and each row in the old excel has a corresponding row in new excel sheet, also each row has a unique mapping by a combination of fields.
I am planning to convert these excel reports into a json document and store it in elastic search with an index and type.
But am pretty new to elastic search and Kibana, could someone please help me in comparing the reports.
Basically I need to compare how much percentage each columns matches with old and new and other statistics?
I could store two excels in two differnt indices. Each document is unique by a combination of few fields also the same documents will be availble in both documents.
How can I create an aggregate key and compare two indices in Kibana? Is Kibana suitable for this or any other approch for this?
a) As long as you can distinguish for each document which excell-sheet it belongs to from one of the field-values in the document, in Kibana you will be able to tease out both excell sheet; when designing visualizations, adding a query to the filter bar, etc...
b) If your documents are in different indices, that not a problem. In Kibana, you will need to create an "Index Pattern": https://www.elastic.co/guide/en/kibana/current/tutorial-define-index.html. This index pattern can contain wildcards, so it can match multiple indices at the same time. e.g.
excell-* would catch index excell-a and index excell-b.
Regardless, when you will be running queries and make comparisons between the two, you will still need a field-value per document that establishes which excell sheet they belong to.
So, if you want to e.g. make 2 line-charts that show metirics per excell-sheet, you will need (a). If you also want to have a clean index layout, and run index-level stats, you could opt to split them up with (b) and use an index-pattern with wildcard to run queries over both.
I have added excel specific property to distinguish the index it is from and also indexed both excels with different type on a same index. indexed both the excels with different type and same index name.
I get both the excel sheet documents in discovery.
now which type of chart I need to use to compare documents from both the types and get the stats of matching percentage?
i.e.
I need to find the correct document in b-sheet by an aggregate key [ uniqueness by combination of columns ] in a-sheet and do field by field comparison, and visualize the percentage of match for all the columns.
which type of chart or approach would be the better choice ?
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.