Aggregating URLs with slight variations in an Elasticsearch query

Geoffrey_De_Vylder · January 19, 2019, 8:53am

I am new to Elasticsearch and learning how the tooling works. I have an "audit" database containing records of HTTP requests to different endpoints in my application, along with the time at which each request was executed.

You can imagine this fictional example:

18 jan 2018 18:06:00: POST /user/1/books
18 jan 2018 18:07:00: POST /user/3/books
18 jan 2018 18:06:03: GET /books/search?title=Hello
19 jan 2018 17:04:01: GET /books/search?title=AnotherBook&pagesMoreThan=300

In my example, the user IDs (1 and 3) and the query parameters are the variable parts.

I am wondering what the best way would be to build my documents to allow answering the following questions:

How many times did someone call the endpoint to get books from users in a given timeframe (any user)?
How many times did someone search for books (disregarding parameters)?

To do this, I would need to ignore the variable parts of each URL, so that I can get a count for /user/.?/books or /books/search, for example.

What is the recommended way of doing this in Elasticsearch?

One thought is that this is not the responsibility of Elasticsearch itself, and that I should preprocess the URL when I'm writing the documents. For example, I could store the requests as:

{
    "url": "/user/?/books",
    "path_parameters": [1]
},
{
    "url": "/books/search",
    "parameters": ["title=AnotherBook", "pagesMoreThan=300"]
}
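
A minimal sketch of that preprocessing step, assuming that any path segment made up only of digits is a variable part. The function name `normalize_url` and the digit-only rule are my own assumptions for illustration; the output keys mirror the document layout above, except that path parameters are kept as strings.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Assumption: a path segment consisting only of digits is a variable part.
_NUMERIC_SEGMENT = re.compile(r"\d+")


def normalize_url(raw_url):
    """Split a request URL into a template plus its variable parts."""
    parts = urlsplit(raw_url)
    template_segments = []
    path_parameters = []
    for segment in parts.path.split("/"):
        if _NUMERIC_SEGMENT.fullmatch(segment):
            # Replace the variable segment with a placeholder and remember it.
            template_segments.append("?")
            path_parameters.append(segment)
        else:
            template_segments.append(segment)
    return {
        "url": "/".join(template_segments),
        "path_parameters": path_parameters,  # kept as strings, not integers
        "parameters": [f"{key}={value}" for key, value in parse_qsl(parts.query)],
    }


if __name__ == "__main__":
    print(normalize_url("/user/1/books"))
    print(normalize_url("/books/search?title=AnotherBook&pagesMoreThan=300"))
```

With this rule, both /user/1/books and /user/3/books end up with the same "url" value, which is what counting per endpoint would need.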

Even in that case, determining which parts of a URL are variable is not easy, so it may not be possible without manually specifying every URL that can occur.
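
For comparison, the manual variant would look roughly like the sketch below: a hand-maintained route table that maps each known URL shape to a template. The `ROUTES` list and `match_route` function are hypothetical names, and the two entries only cover the endpoints from my example.

```python
import re

# Hypothetical, hand-maintained route table: (template, compiled pattern).
ROUTES = [
    ("/user/?/books", re.compile(r"/user/[^/]+/books")),
    ("/books/search", re.compile(r"/books/search")),
]


def match_route(path):
    """Return the template for a request path, or None if no route matches."""
    for template, pattern in ROUTES:
        if pattern.fullmatch(path):
            return template
    return None


if __name__ == "__main__":
    print(match_route("/user/3/books"))
    print(match_route("/books/search"))
    print(match_route("/something/else"))
```

The drawback is the one described above: every new endpoint has to be added to the table by hand, and unknown paths fall through as None.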

I also noticed that Elasticsearch has aggregation functions, but I'm not sure whether they are flexible enough to support what I need.
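
If the URL were already normalized at write time, I imagine the counting could be done with a range filter plus a terms aggregation. The sketch below only builds the search request body as a Python dict. The field names `url` (assumed to be mapped as a keyword) and `timestamp` (assumed to be mapped as a date), the aggregation name, and the function name are all assumptions, not something taken from my actual index.

```python
import json


def build_endpoint_count_query(start, end):
    """Build a search body that counts requests per normalized URL in a timeframe."""
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "query": {
            # Assumed date field "timestamp"; start inclusive, end exclusive.
            "range": {"timestamp": {"gte": start, "lt": end}}
        },
        "aggs": {
            "calls_per_endpoint": {
                # Assumed keyword field "url" holding the normalized template.
                "terms": {"field": "url"}
            }
        },
    }


if __name__ == "__main__":
    body = build_endpoint_count_query("2018-01-18T00:00:00", "2018-01-20T00:00:00")
    print(json.dumps(body, indent=2))
```

Each bucket in the response would then carry the template as its key and the number of matching requests as its document count.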

Any recommendations?

system · February 16, 2019, 8:53am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
