Get count by certain field based on a sub query


(Chen Wang) #1

Guys,
I just successfully imported my data to ES, e,g. It has looks like this:
"activity": 'viewed',
"sessionId": "000000143198107b3fe510b041138cd33fdd9252aab9808c",
"campaign_id":""
,

"activity": 'campaign_viewed',
"sessionId": "000000143198107b3fe510b041138cd33fdd9252aab9808c",
"campaign_id":"my_campaign"

As you can see, the two entries has the same session id, and since the
second entry has a campaign_id, i will assume the first activity(viewed) is
also generated from the campaign.
So how can i do count like:
(count the activities that are generated from campaign):

count(activity)
where sessionId in (select sessionid from index where
campaign_id="m_campaign") ?

Thanks much!
Chen

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/0758af7f-a8ef-451b-a029-e42d1678e73d%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Adrien Grand) #2

I don't think this is possible. To me the way to solve this kind of issues
would be to reindex events as soon as you know their campain_id.

On Thu, Jan 23, 2014 at 2:04 AM, Chen Wang chen.apache.solr@gmail.comwrote:

Guys,
I just successfully imported my data to ES, e,g. It has looks like this:
"activity": 'viewed',
"sessionId": "000000143198107b3fe510b041138cd33fdd9252aab9808c",
"campaign_id":""
,

"activity": 'campaign_viewed',
"sessionId": "000000143198107b3fe510b041138cd33fdd9252aab9808c",
"campaign_id":"my_campaign"

As you can see, the two entries has the same session id, and since the
second entry has a campaign_id, i will assume the first activity(viewed) is
also generated from the campaign.
So how can i do count like:
(count the activities that are generated from campaign):

count(activity)
where sessionId in (select sessionid from index where
campaign_id="m_campaign") ?

Thanks much!
Chen

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/0758af7f-a8ef-451b-a029-e42d1678e73d%40googlegroups.com
.
For more options, visit https://groups.google.com/groups/opt_out.

--
Adrien Grand

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAL6Z4j4mEB0rJGPJOF2bcWHn_q-MfO0xzD081_bB2A9%2B52UJww%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Chen Wang) #3

I actually have a feeling that it might be possible by the new aggregation
in ES 1.0, my thoughts are:

  1. make the campaign_id field a integer value
  2. group by all session data, doing the count, and also sum the campaign_id
    field.
  3. filter through the group with sum(campaign_id) != 0, (meaning this
    session is from campaign), then do a sum on all the filtered group.

But how can I implement this in ES query...
Chen

On Thu, Jan 23, 2014 at 1:23 PM, Adrien Grand <
adrien.grand@elasticsearch.com> wrote:

I don't think this is possible. To me the way to solve this kind of issues
would be to reindex events as soon as you know their campain_id.

On Thu, Jan 23, 2014 at 2:04 AM, Chen Wang chen.apache.solr@gmail.comwrote:

Guys,
I just successfully imported my data to ES, e,g. It has looks like this:
"activity": 'viewed',
"sessionId": "000000143198107b3fe510b041138cd33fdd9252aab9808c",
"campaign_id":""
,

"activity": 'campaign_viewed',
"sessionId": "000000143198107b3fe510b041138cd33fdd9252aab9808c",
"campaign_id":"my_campaign"

As you can see, the two entries has the same session id, and since the
second entry has a campaign_id, i will assume the first activity(viewed) is
also generated from the campaign.
So how can i do count like:
(count the activities that are generated from campaign):

count(activity)
where sessionId in (select sessionid from index where
campaign_id="m_campaign") ?

Thanks much!
Chen

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.

To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/0758af7f-a8ef-451b-a029-e42d1678e73d%40googlegroups.com
.
For more options, visit https://groups.google.com/groups/opt_out.

--
Adrien Grand

--
You received this message because you are subscribed to a topic in the
Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit
https://groups.google.com/d/topic/elasticsearch/scwSRLM08vc/unsubscribe.
To unsubscribe from this group and all its topics, send an email to
elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/CAL6Z4j4mEB0rJGPJOF2bcWHn_q-MfO0xzD081_bB2A9%2B52UJww%40mail.gmail.com
.

For more options, visit https://groups.google.com/groups/opt_out.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CACim9RnoWHcXtaEp2oCYJ8%2ByAvy0ryDfw_ey4n6_eTqYbLmvVQ%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.


(system) #4