Any way to do something similar to SQL's "IN"/"NOT IN" other table?

i have never used elk before and not very familiar with the structure of the database, mostly work with mysql stuff before at the old place.

since people want a visualization of the server log, i put the log into elk stack and gave them a nice looking dashboard. one of the thing they ask for is what "access code" is not in use using a data table.

they are stored in a separated table in mysql and we were using something like "SELECT * FROM access_code WHERE code NOT IN (SELECT access_code FROM server_log WHERE time>'start date'
AND time<'end date')" to search for them before. is there any way to do look up like that in kibana for visualization? should i import the lookup table as another index? if i do, how do i compare between 2 index?

Hi there,

can you post a sample of the documents you ingested? How are they structured? Where are the useful fields (access code, start date, end date)? Are there different indices for different types of docs or are all the info stored in every doc and you only have a single index?

data is something like {"log_id":"00001","return_data":"sample_data","email":"example@123","passwword":"hash","access_key":"0001","time":"2019-12-01T13:10:12.100"}
i have a list of sample access key like this
{"0002"."0001","1","0","2516"}
i want to visualize is there any key not being in use in a time frame.
im off work now so the data is not that accurate, but the important part is there. thank you for the attention, i have been trying to do this for a few day now, kinda frustrating, i know i can do it on the mysql server, but we want all the data presented at the same place, maybe i should do it before logstash?

Hi there,

next time try to indent your code and post it formatted (paste the indented code, highlight it and press the icon in the bar between the quote and the picture icons).

Now, let me get it straight. You have docs like

{
  "log_id":"00001",
  "return_data":"sample_data",
  "email":"example@123",
  "passwword":"hash",
  "access_key":"0001",
  "time":"2019-12-01T13:10:12.100"
}

And you want to visualize the access_key field from those docs which are out of a given timerange, right?

For example, ingesting 3 documents as follows

PUT wang_hay_ng_index/_bulk?refresh
{"index":{"_id":1}}
{"log_id":"00001","return_data":"sample_data","email":"example@123","passwword":"hash","access_key":"0001","time":"2019-12-03T13:10:12.100"}
{"index":{"_id":2}}
{"log_id":"00002","return_data":"sample_data","email":"example@123","passwword":"hash","access_key":"0002","time":"2019-12-05T13:10:12.100"}
{"index":{"_id":3}}
{"log_id":"00003","return_data":"sample_data","email":"example@123","passwword":"hash","access_key":"0003","time":"2019-12-09T13:10:12.100"}

and supposing you want all those docs with time < 2019-12-04 OR time > 2019-12-06, you want something like this, right? Showing only the 2 docs (or better the access_key fields of the 2 docs) which are out of that timerange.

Is this correct?

thanks for the tips on the forum.
i've done that, and they want what's not in use, if this is the case and all the access_key that had ever been generated is:
{0001,0002,0003,0004}
the table should show 0003 and 0004.
the challenge is to show the access_key that's not ever been logged, and i've been doing that with mysql by looking up another table that contain all the access key, but they want to display all of statistic with kibana, thats why i asked should i just do it before ingesting into elk.

But that list of unused access_key is for a specific timespan?

Make a concrete example posting the indices you're working with and some samples from those indices.

Let us understand which is you input exactly and which is the desired output, otherwise it's very difficult to help you.

For instance, I don't get why 0003 should return in your example and 0001 should not. They're both out of the range 2019/12/04-2019/12/06 and they're both present in that list of yours (btw, use square brackets [] for a list, please). What am I missing?

the log is timed, they want to know what is not used within maybe december, i would have done something like "SELECT access_code.code FROM access_code WHERE access_code.code NOT IN (SELECT log.access_code FROM log WHERE time > '2019-12-01T00:00:00.000Z' AND time < '2020-01-01T00:00:00.000Z')"if im doing mysql

my bad i just realize i mean access_code 0002 and 0004

Ok now I get it.

Mhmh don't think there's a straight way to compute it on the fly, since IMO using ES as a relational db is not a smart approach to it.

As far as I know, the general philosophy of elasticsearch is to upstream manipulate data so you can simply visualize and aggregate them downstream.
What you want to do here is to somehow join the results of two different indices (code and log) and you want to do it dynamically.

I'll write down here if something comes to mind.

thats a bummer, thanks bud.