Kibana Visualization Help

Hello,

Kibana newbie here coming from a traditional SQL / relational DB environment and have some questions about how to best create / configure two separate report types I am looking to make.

Visualization 1:
I have data structured as follows:

ID	     Name 	        Source	         Address
123	     Joe	        Phone Book	123 Fake St
123	     Joe	        Facebook	125 Fake St
456	     Lisa	        Passport 	50 Imaginary Lane
456          Lisa	        License	        99 No Way

How do I create a visualization that gives me the ID and name of any record where someone has both a "Passport" and "License" entry for source?

Visualization 2:

I have a second master table where data has been amalgamated into a single record with the best quality data has already been selected for each ID / column:

ID	     Name 	        Source	         Address
123	     Joe	        Phone Book	123 Fake St
456          Lisa	        License	        99 No Way

How would I create a visualization that compares where there is a delta for the data fields?

For example it would say for "123 - Joe" there is a difference between his address in "Phone Book" and "Facebook"

@lukeelmers/@ppisljar can you please grab this?

Thanks!
Bhavya

Hey @KibanaNewbie -

For Visualization 1, here's the closest idea I have off the top of my head:

  1. Create a Data Table visualization with a Split Rows bucket using a Terms aggregation on Name. This should give you a list like:
Name     Count
Joe          2
Lisa         2
  1. Then add a filter on Source where the value is one of: Passport,License.
  2. At this point, you should have a table with all unique Names, and any name with Count > 1 should be what you are looking for.
  3. Now if you go under your Terms bucket and click "Advanced", you can put {"min_doc_count": 2} inside of the "JSON Input" field and it should display what you're looking for.

Now, this idea of doing it based on Count only works on the assumption that there are not duplicate Source entries per-user (e.g. Lisa will have no more than 1 entry with Source=Passport). Not sure if this applies to your situation or not.

As for Visualization 2, I'm not sure if I can think of a good way to achieve what you're looking for with the amalgamated data -- but if you took the original data and did a terms aggregation on the address (so you have a list of all unique addresses), you could then add a second subbucket terms agg on the Name. Then if you sort by Name, any name appearing more than once has a delta for the address.

I'm guessing this only partially answers both of your questions, but there may be some more magic you could do with scripting, perhaps by adding some fields that make querying a bit more straightforward... @ppisljar or @timroes might have some better ideas here that I'm not thinking of.

Hope this helps nudge you in the right direction!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.