LookUp Tables - Migrating from Splunk

Hi,

we are currently in the process of migrating from Splunk to Elasticsearch.
We have most of the data now in our ELK Cluster and are in the process of
recreating the Dashboards and visualizations necessary.

In Splunk we currently have so called "LookUp" tables which are static lists (which may get regenerated once per week)
containing processes or queues that are should be present through out our systems.

Based on this list, the teams can query the data present in the system and quickly see if we have the correct amount of processes
on a specific host (or which is missing), the correct number of queues and things like that.
Also it allows us to create flexible alerts.

I searched through the forums and the documentation and found a lot of questions regarding "SplunkLookup tables" ending up solved with enrichment policies and similar approaches.
But in this case this seems not to be quite fitting.

Because these lists can change and are quite large (~500 hosts, ~400 queues and so on) it is not that easy to simply create an alert or dashboard by hand for each.

Lets take as a example the following:
we have a Lookup table / or lookup index containing mandatory hosts that sould be present in our system: "host1, host2, host3"

based on that we would like to achieve something like that as a secondary index or temp index:

{
	host1: {
		countOfMetrics: 10	
	},
	host2: {
		countOfMetrics: 60	
	},
	host3: {
		countOfMetrics: 10	
	}
}

This would be currently required to be visualized on a dashboard.
On top of that we would need to send an alert if the count would be < 0 or just no data is present for a certain timeframe.

We could of course create such an alerting rule by hand currently, but as said we have quite a lot of this lookups currently which also are constantly changing.

I hope someone had an simmilar issue and might share some ideas how to migrate this to elasticsearch!
Maybe we simply missing a totaly obvious solution to this.

Thank you very much,
Tim

2 Likes

Hi @DerTim Welcome to the community.

What version are you on?

Recent version perhaps you could use ES|QL
With ENRICH function to both, create a visualization and alerts

Thank you so much for your response!

So we are currently running 8.13.4.

Regarding the "ENRICH" function: From the documentation I read it states: " The ES|QL ENRICH command only supports enrich policies of type match . Furthermore, ENRICH only supports enriching on a column of type keyword ."

So therefor to me it seems I can not simply run a "COUNT" like query and add the result, right?
Thats maybe exactly the point where we might missing something. How would the approach be for the ENRICH here?

Thank you so much,
Tim

Apologies I am not following the details samoke

If you can provide the initial index

What you want to enrich with

And the result Perhaps we can help.

Maybe at a higher level. Once you enrich the data then you can pipe it to a where clause And then build an alert.

The enrich adds data... Then it's there to work on the where clause You can alert off of.

Typically you enrich with some additional detail about the host, but I can't tell what you're trying to add and do at this point

Sorry for the late reply!

We have a lookup table with a list of hosts from which we expect data. As a lookup index it would look like this:

PUT lookup-index-host
{
  "mappings": {
    "properties": { 
      "host": {
        "properties": {
          "name": {
            "type": "keyword",
            "ignore_above": 256
          },
          "fqdn": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      } 
    }
  }
}

POST lookup-index-host/_doc
{
  "host": {
    "name": "host1",
    "fqdn": "host1.some.domain"
  }
}

POST lookup-index-host/_doc
{
  "host": {
    "name": "host2",
    "fqdn": "host2.some.domain"
  }
}

POST lookup-index-host/_doc
{
  "host": {
    "name": "host3",
    "fqdn": "host3.some.domain"
  }
}

Assuming we get this index updated on a regular daily basis, containing a high and volatile number of hosts.
Based on that we would like to join / enrich per hosts a field that contains the count of an ES|SQL or some other query like:

from metrics* | where host.name === *our index hosts*

If that is not possible we would be grateful for any other way we could determine on a live basis that we retreive data from all hosts listed in this index. (without hard wiring the host.name into alerts by hand)

Besides the host lookup, we use this pattern currently also for other datasources. So we are looking for some kind of general approach here.

Thank you very much for your help.

BR, Tim

This is the EXACT use case for enrich.

Just to be clear you can do this at 1) ingest time (with an enrich ingest pipeline) or 2) query time with ESQL.

Both start by creating an enrich policy / index from your look up data.

You should probably take a quick look through here... this for at ingest time put to do a query time you still need and enrich policy / index.

For Host Lookup look at this example here

so you will

Ohh Look it is right here Literaaly the whole flow

Here is my basic complete working example

DELETE lookup-index-host

PUT lookup-index-host
{
  "mappings": {
    "properties": { 
      "host": {
        "properties": {
          "name": {
            "type": "keyword",
            "ignore_above": 256
          },
          "fqdn": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      } 
    }
  }
}

POST lookup-index-host/_doc
{
  "host": {
    "name": "hyperion",
    "fqdn": "hyperion.some.domain"
  }
}

POST lookup-index-host/_doc
{
  "host": {
    "name": "host2",
    "fqdn": "host2.some.domain"
  }
}

POST lookup-index-host/_doc
{
  "host": {
    "name": "host3",
    "fqdn": "host3.some.domain"
  }
}

DELETE _enrich/policy/lookup-index-host-policy
# Create the policy 

PUT /_enrich/policy/lookup-index-host-policy
{
  "match": {
    "indices": "lookup-index-host",
    "match_field": "host.name",
    "enrich_fields": ["host.fqdn"]
  }
}

# Execute the policy
POST /_enrich/policy/lookup-index-host-policy/_execute?wait_for_completion=false 


POST /_query?format=txt
{
  "query": "from metricbeat-* | enrich lookup-index-host-policy | keep host.name, host.fqdn | limit 10"
}

# results
   host.name   |     host.fqdn      
---------------+--------------------
hyperion       |hyperion.some.domain
hyperion       |hyperion.some.domain
hyperion       |hyperion.some.domain
hyperion       |hyperion.some.domain
hyperion       |hyperion.some.domain
hyperion       |hyperion.some.domain
hyperion       |hyperion.some.domain
hyperion       |hyperion.some.domain
hyperion       |hyperion.some.domain
hyperion       |hyperion.some.domain

Think you can take it from here... :). careful overwriting existing fields with the Enrich... you can if you want just be aware

There will be more functionality / ease of use in the near future ... unfortunately I can not share the details of upcoming product releases..

Hello Stephen, thank you for taking the time to reply to us!

Actually we did already read the article you provided.

It seems we have a little misunderstanding here.

We actually are not trying to enrich the FQDN, but the count of metrics.

The host.name and fqdn are allready provided in an index and we want to
see which of them provide data in a specific timeframe.

So based on your example:

host.name   	|   count.of.metrics      
---------------+--------------------
host1       	|	8
host2       	|	0
host4       	|	250
host5       	|	7
host6       	|	0

Here it would be fundamental for us to see also the hosts with Zero metrics.
Hence we trying to implement based on a lookup table / enrich construct.

We allready tried the enrich policys but we are currently only able to match static values.
We are not able to enrich dynamic values like the count metrics based on host.name

So the situation is as follows:

  1. We have a list of hosts in a lookup table (including fqdn)
  2. We have converted that lookup table into an enrich index via an enrich policy
  3. Now we want a count of ALL metrics by host in this lookup table (now enrich index) INCLUDING hosts that haven't sent any data. For those hosts the count would be 0.

Based on the article we also don?t see the possibility to include hosts that haven't sent any data in a specific timeframe, we would need to see those as well.

We do not want to simply enrich existing documents in a query with some extra fields (fqdn in your example, we have that already), we're basically trying to do an SQL OUTER JOIN (metric index on lookup index).

Checking each host individually is not a possibility in our case as the count of hosts we're monitoring is in the three digits.

Thank you very much for your help,
Tim

No the the OUTER join as you describe it is not available as far as I know.. perhaps in the future...
(I'll noodle on it a bit more... But pretty sure there is no natural way to do that today)

But if you are simply trying to figure out which hosts have not reported in in the last given time frame?

And you are using metrics to do that? Are you collecting logs from the host?

If so there is another way... Using a latest transform.

I have used it with many customers to support this / similar use case..

Example

You setup latest transform and then it it easy to find any host that has not reported in in say last 1.. 24 hour etc.. BTW I had a customer use this with a host inventory of 100K hosts

Latest Transforms can be used for a lot of things: host last reported in, user last login ...,ots of use cases.

1 Like

Hi Stephen,

thank you for the recommendation with the "latest transform".

The last days we did some experiments with it and now are able to get our host lookup table working.

We are combining the "latest transform" with the enrichment.

So we use the latest transform on the standard "metrics-*" dataview and transform it to an new index based on "host.hostname" and "@timestamp".

On our "lookup-hosts" index we perform an enrich based on the transformed index. Now we have the desired result with the latested timestamp on each host we received metrics from and an "null" timestamp on the host we never received data from.

Thank you for you tips and help.

Just wondering if your "I'll noodle on it a bit more" might brought up some other ways we might use for our other use cases and might be a bit more elegant?

Thank you very much and best regards,
Tim