What are they searching for and what are they getting?


(Lukáš Vlček) #1

Hi,

Does ES allow me to collect data about the question in ${Subject}?

I would like to know what queries the users are issuing and what kind of
response they get. It seems there is no simple way to get this data
by setting log levels (even if I set rootLogger to TRACE, no
relevant data is recorded for searches and responses). Maybe a better
option would be a dedicated plugin that could record this type of
data (configurable output to a log file, or a pluggable custom storage
implementation?). It would be great if it could also return some
data via REST (e.g. statistics, a few recent queries and responses, their
processing times, and the like). What do you think? Would it be possible
to implement a plugin like that?

Regards,
Lukas


(Shay Banon) #2

No, there is no stats gathering on that level. The plan is to have stats
gathering on the index level, but more general ones, like indexing TPS and
so on.

Regarding the question at hand, it would be very cool if it were provided.
The problem is that it's quite complex to do from within elasticsearch. For
example, not all queries are "user" driven (i.e. they did not result from a
direct user search request). Also, queries often get augmented. If
I had to implement something like this in a search-centric application, I
would do it in a layer on top of elasticsearch that has a bit more
knowledge of where, why, and what the user did.

Of course, this information can be stored within elasticsearch so you can
search on it :wink:

-shay.banon


(Lukáš Vlček) #3

For now I would be happy to get all user queries issued via the REST interface
(although even at this level some queries may not be user driven). I just
want to know what users are looking for, and all I give them is the REST API. Is
there any easy way to get query and response data in this case? For
now I can live with just adding more logging messages to
some class. Is there a central class that could be used for
this purpose?

Regards,
Lukas


(Shay Banon) #4

There isn't one, and storing all the user queries means you need to build
storage for them, since it can be quite big. That's something you can do
at the client level by indexing it into elasticsearch (with the extra
knowledge you have about the query itself).

I can add logging for that, but it's not always textual (it is with json, but
not with xson), and I work really hard at reducing the number of fs
operations on critical path requests, and that logging would add to them :wink:

-shay.banon
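
The concern about file-system operations on the request path also applies to client-side logging. One common way to reduce it, sketched here with Python's standard `logging` module, is to have the request thread only enqueue the log record while a background thread does the file write. This is an illustration of the general technique, not how elasticsearch handles logging; `build_query_logger` and the logger name `query_log` are hypothetical.

```python
import logging
import logging.handlers
import queue


def build_query_logger(path):
    """Return (logger, listener, file_handler).

    Calls on the returned logger only put a record on an in-memory queue;
    the listener's background thread performs the actual file write.
    """
    log_queue = queue.Queue(-1)  # -1 means unbounded

    file_handler = logging.FileHandler(path, encoding="utf-8")
    file_handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))

    # The listener drains the queue on its own thread and feeds the file handler.
    listener = logging.handlers.QueueListener(log_queue, file_handler)
    listener.start()

    logger = logging.getLogger("query_log")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False                 # keep records out of the root logger
    logger.handlers.clear()
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    return logger, listener, file_handler


def shutdown_query_logger(listener, file_handler):
    """Flush everything still queued, then release the file."""
    listener.stop()        # processes pending records and joins the thread
    file_handler.close()
```

A caller would log with something like `logger.info("query=%s hits=%d", text, hits)` and call `shutdown_query_logger` on exit. The trade-off is that records still in the queue are lost if the process dies abruptly.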


(Clinton Gormley) #5

On Thu, 2010-08-12 at 12:03 +0300, Shay Banon wrote:

There isn't one, and storing all the user queries means you need to
build storage for them, since it can be quite big. That's something
you can do at the client level by indexing it into elasticsearch
(with the extra knowledge you have about the query itself).

I agree with you - the most meaningful query logging can happen in the
client. In ES, the same query could be expressed in a number of
different ways, and the end result probably wouldn't look terribly
meaningful to the programmer anyway.

Much better to just log queries to a text file or DB in a format that
suits your application, and analyse them later.

clint
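
As a rough illustration of "log to a text file and analyse later", the sketch below appends one JSON object per line and then computes the most frequent queries and the queries that returned nothing. The file layout and the field names `q` and `hits` are assumptions made for this example, as are the function names.

```python
import json
from collections import Counter


def append_query(log_path, user_text, total_hits):
    """Append one search as a single JSON line (assumed fields: q, hits)."""
    entry = {"q": user_text, "hits": total_hits}
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")


def _read_entries(log_path):
    """Yield each logged entry, skipping blank lines."""
    with open(log_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def top_queries(log_path, n=10):
    """Return the n most common queries, ignoring case and outer whitespace."""
    counts = Counter(e["q"].strip().lower() for e in _read_entries(log_path))
    return counts.most_common(n)


def zero_hit_queries(log_path):
    """Return the distinct queries that returned no results, sorted."""
    return sorted({e["q"].strip().lower()
                   for e in _read_entries(log_path) if e["hits"] == 0})
```

Normalising case and whitespace at analysis time keeps the raw log faithful to what users typed while still grouping obvious repeats.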


(Lukáš Vlček) #6

Guys, I fully agree with you. Implementing this properly is not trivial, but
all I need now is a simple workaround, or a hack if you like (nothing of
production quality).
Anyway, I will probably hack
org.elasticsearch.rest.action.search.RestSearchAction a little bit. This
could give me what I want, and I should be able to configure logging to
output it into rolling files; that is all I need for now.

