Hi All,
I have elasticsearch indexes which contain data for different machines regarding their performance.
There are two types of indexes, first type contains information only about machines e.g its geo-location etc while the other type contains combined data of all the machines regarding performance with respect to some parameters over the time.
We want to visualize this data in Kibana Tile Map by showing each machine on the map and aggregate the performance for that machine.
For this we will have to fetch machine location from first index and then fetch the data for that specific machine from other indexes and show the performance measures for that machine on the map.
This work is similar to join in SQL.
After carrying out some research I have found that Kibana does not support any type of joins.
So is there any other solution to solve this problem?
Or I will have to merge these two indexes into one?
Any help would be highly appreciated.
Thanks Mark Walkom,
But logically merging the tables will require the static machine specific information like geo-location etc to be added in all the documents which contain data about performance and data about machine specific information may be just a few hundred documents but documents containing data about performance measures will have millions of rows.
So if we add that information in millions of rows then it will be a great overhead and does not seem feasible.
So can there be a workaround to achieve our goal to have a separate index for machine specific data and a separate indexes for performance related data and still be able to JOIN them in Kibana or by any other way?
Hi Asad,
don't think relational (joins of tables) -> think elastic (index and search super fast)
i'm in telco business and we have exatly the same problem (like yours).
We have indeces with customer data and indeces with logs
our solution:
logstash receive a log,
logstash make a query with the elasticsearch query filter plugin, to get the information from customer inventory index
logstash enrich the customer data to the log.
This stratagy have a big advantage: you index a complete dataset, no relations between the indeces are necessary anymore.
in case of event storms:
we have two logstashes
one collect the logs, send it to redis, the second reads from redis and make the elasticsearch querys and data enrichment
maybe this is helpful for you....
Hi Mark!
Your response helped us in going to the right direction.
Can you please clarify the term "merge" the indexes?
For example.
If I have an index A and an Index B, index A contains the personal data regarding customers and index B contains all the data regarding their shopping with date and time.
And I want to get the details per customer about how much revenue he generated for a specific time period.
Does "merge" mean that I make a single index C which contains both the customers' personal data and their shopping details combined in one index?
Or it means that we keep two indexes A and B, but merge the required data from two indexes using some way?
If its the second case, then what can that way be?
Can you give any examples or references for that?
Hi Juergen!
Thanks for your response.
What I understand from your solution is that you receive a log, you make logstash to write a query to fetch customer data for that log and then you combine the data and insert that into a single index into the elasticsearch.
As this sentence suggests:
"you index a complete dataset, no relations between the indeces are necessary anymore."
Does it mean that in elasticsearch we need to have a single index containing customer and log data?
And we combine the data before inserting into the elasticsearch index?
Hi,
Exactly,
we have a Index with all the Customer data.
The logs (real time data) will be enriched with the Customer fields we need for our customer experience dasboards
(Performance data, alarms, trouble Tickets, orders usw..)
Regards
Juergen
index-123 - from Source1
2.index-234- from Source2
3.index-789- from Source3
Then in Kibana you can call index as "index-*
all data will come from above 3 index
If you have common keywords in 3 index then you can do analysis easily and can create a single dashboard which will consume data from all the three source.
I currently have the same database structure as you have explained.
I have indexes like shop-customer_info , shop-shopping_info, shop-shop_info
I can call all these indexes like shop-* in Kibana and data from all these indexes can be made available in Kibana.
The index shop-shop_info contains location of the shop, and it contains very few documents as compared to other indexes as there may be 10 or 20 shops.
What we want is that to plot the shop on tile map, sum/avg/min/max all the bill data from index shop-shopping_info for each shop and when i hover over the shop on tile map, I see sum/avg/min/max of all the billings on that shop.
We have tried to achieve this by first plotting the shops, but Kibana does not know how to separately aggregate the billing data for each shop from index shop-shopping_info
If you can give any example to do this by your suggested data structure then it would be highly helpful.
Thanks
Sorry for late reply, if this is still useful you can grab this info
1.Create a Tile map for Shop, do aggregation by billing (shop-shop_info )
2. Next to this tile map create a tabular view of same data "Shop by billing Info" (shopping_info)
3. Next to this create another tabular view which will provide detailed aggregation ( 2nd level aggregation)
Use Case: User will get info from Tile Map as which shop is doing well
He will click 2nd tabular view and will get the info( at this stage Tile view will show only one location)
and in 3rd view he will get sub aggregation automatically
I have similar scenario. Trying to follow your suggestion. But i could not find plugin to perform step 2
logstash make a query with the elasticsearch query filter plugin, to get the information from customer inventory index
can you please tell me the link to the plugin to read a different index. Default Elastic filter plugin does not take in indice as input and it seems to query onlt from current log index which is helpless in my scenario.
I'm trying to follow your suggestion but I have problems to run query correctly with the elasticsearch query filter plugin.
In my case I need to merge the fields from two source:
We are doing that, but it really doesn't work: it render data from both or more sources but when you try to filter one visualization, other visualizations indexes are not shown: the other visualizations says it has no data and renders nothing.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.