Best way to match URLs


(Johan Rask) #1

Hi!

I am using ES together with Logstash, and we are indexing simple access log
files.

Our problem is that we want to know the number of image views for a resource.
An image view is identified by a specific REST URL:

"GET /resource//image" => e.g. "GET /resource/abcde/image"

This results in millions of different URLs that all mean the same thing: an image view.

Another problem is that there are other "unpredictable" resources under
/image => "GET /resource/abcde/image/",
so a search like

url:get AND url:resource AND url:image AND - url:<?>

does not work, since I do not know what to exclude.

I was thinking about using a regexp for this. Performance is not really a
problem (at least not at the moment), since this is mainly for reporting.
However, I have not been able to get a regexp query to work.

If using a regex, should the field be analyzed or not_analyzed? I have tried
both, using a template, but I am still unable to get it working.

"url" : {
  "type" : "multi_field",
  "fields" : {
    "name" : { "type" : "string", "index" : "analyzed" },
    "facet" : { "type" : "string", "index" : "not_analyzed" }
  }
}
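
To make the intent concrete, here is a minimal Python sketch of the kind of pattern I am after. It is not a working Elasticsearch setup. It assumes the whole request line (e.g. "GET /resource/abcde/image") is stored as a single term in the not_analyzed sub-field, and that the regexp has to match the entire term, which `re.fullmatch` imitates here. The pattern, the function name `is_image_view` and the query body are illustrative assumptions, not something I have verified against my index.

```python
import re

# Assumed pattern: exactly one non-empty path segment between
# "/resource/" and "/image", and nothing after "/image".
IMAGE_VIEW_PATTERN = r"GET /resource/[^/]+/image"
IMAGE_VIEW = re.compile(IMAGE_VIEW_PATTERN)


def is_image_view(url_term: str) -> bool:
    """Return True if the whole term is an image-view request.

    fullmatch() requires the pattern to cover the entire string,
    which is how a regexp is expected to behave against a single
    not_analyzed term (assumption).
    """
    return IMAGE_VIEW.fullmatch(url_term) is not None


# Hypothetical query body targeting the not_analyzed sub-field
# ("url.facet" in the mapping above); shown as a plain dict only.
regexp_query = {"query": {"regexp": {"url.facet": IMAGE_VIEW_PATTERN}}}

if __name__ == "__main__":
    for term in (
        "GET /resource/abcde/image",        # the case I want to count
        "GET /resource/abcde/image/other",  # sub-resource, should be excluded
    ):
        print(term, is_image_view(term))
```

The point of `[^/]+` is that I never have to list what to exclude: anything with an extra path segment after /image simply fails to match the whole term.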

Anyway, any suggestions about how to solve this would be highly appreciated.

Kind regards, Johan


