Scenario: Tweets from All Followed Twitters or Link-Walking


(Ted Karmel) #1

Hi,

I'm gradually learning more and getting deeper into elasticsearch.
And it just keeps confirming my very positive initial impression :)

Right now, I'm exploring different scenarios. In particular, I am
stumbling on the following query and how to structure it.

Let's say there are two types of documents. The first type holds the
name of a user and the name of a user that they follow:

{
"user": "kimchy",
"followed": "bulgogi"
}

The second type of document holds an actual tweet from a user, with the
date and the message:

{
"user": "bulgogi",
"post_date": "2009-11-15T13:12:00",
"message": "Elastic Search, so far so good!"
}

My stumbling block is that I see only one way of doing this query: query
the first type for a particular user, iterate over the results to
collect each value of followed, run a query on the second type for each
of those values, and then combine all the results.

It seems a bit long-winded. I was wondering if you guys had any suggestions.

Also, is there something like link-walking, a term the Basho people
use for a feature of Riak? Basically, a link is a reference in one
document to another document or type of document. The query would
"walk over" to the linked document and return the results from there
(since our interest in the first type of document is only a means to
an end).


(Shay Banon) #2

There is no link-walking feature in elasticsearch. Implementing it "the
Basho way" is actually not very difficult; the more interesting, and more
difficult, part is combining it with search itself (and with relationships
in general). I have some ideas in that area, but nothing concrete.

In your case, you will need two queries (not one query plus a query per
followed user). The first gets all the followed users; the second is the
tweet query itself, with a terms filter on the user field that lists all
the followed users. This makes sure you get results only from the users
listed in the terms filter. Make sense?
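
The two-step approach described above might look roughly like the following sketch. It only builds the request bodies as Python dicts and does not talk to a cluster. The helper names (followed_query, extract_followed, tweets_query) and the sample response are made up for illustration, and the bool/filter/terms form is an assumption based on more recent Elasticsearch query DSL, so the exact syntax for the version discussed here may differ.

```python
def followed_query(user, size=1000):
    """Step 1: body for a search on the 'follow' documents of one user."""
    return {"query": {"term": {"user": user}}, "size": size}


def extract_followed(response):
    """Collect the distinct 'followed' values from a search response."""
    hits = response.get("hits", {}).get("hits", [])
    return sorted({hit["_source"]["followed"] for hit in hits})


def tweets_query(followed, size=10):
    """Step 2: one search over tweets, restricted to the followed users."""
    return {
        # Assumed DSL: a terms filter on the tweet's 'user' field.
        "query": {"bool": {"filter": [{"terms": {"user": followed}}]}},
        "size": size,
    }


# Hypothetical response to the first query, shaped like a search result.
sample_response = {
    "hits": {"hits": [
        {"_source": {"user": "kimchy", "followed": "bulgogi"}},
        {"_source": {"user": "kimchy", "followed": "bibimbap"}},
    ]}
}

followed = extract_followed(sample_response)
body = tweets_query(followed)
```

Whatever client is used, only two round trips are needed: one for followed_query("kimchy") and one for the body built from its results.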

-shay.banon

(In reply to Ted Karmel's message of Thu, Sep 2, 2010 at 12:12 PM)


(Ted Karmel) #3

Hi Shay,

Thanks for your always prompt replies. It makes sense and simplifies
things quite a bit already.

But, just to clarify: with two queries, you would not be able to apply
options selectively to each term. Say, for instance, you wanted the last
10 tweets from each Twitter user you follow. You could not apply the
from/size and sort options per term within one query; you would only be
able to obtain the last 10 tweets across all the Twitter users you
follow. To get 10 per user, you would need multiple queries.

Is that correct?
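
To make the distinction concrete, here is a small in-memory sketch that involves no elasticsearch at all. The tweets and the function names latest_overall and latest_per_user are invented for illustration; the first function mimics what a single sorted query with a size limit returns, the second what is actually wanted.

```python
from collections import defaultdict

# Made-up tweets as (user, post_date); ISO dates sort correctly as strings.
tweets = [
    ("bulgogi", "2009-11-15T13:12:00"),
    ("bulgogi", "2009-11-16T09:00:00"),
    ("bulgogi", "2009-11-17T08:30:00"),
    ("bibimbap", "2009-11-10T10:00:00"),
    ("bibimbap", "2009-11-11T11:00:00"),
]


def latest_overall(tweets, n):
    """Newest n tweets across all users (one sorted query with size=n)."""
    return sorted(tweets, key=lambda t: t[1], reverse=True)[:n]


def latest_per_user(tweets, n):
    """Newest n tweets for each user separately."""
    by_user = defaultdict(list)
    for user, date in tweets:
        by_user[user].append((user, date))
    return {
        user: sorted(items, key=lambda t: t[1], reverse=True)[:n]
        for user, items in by_user.items()
    }


overall = latest_overall(tweets, 2)
per_user = latest_per_user(tweets, 2)
```

With n = 2, overall contains only tweets from bulgogi, while per_user keeps two tweets for each of the two users.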

(In reply to Shay Banon's message of Thu, Sep 2, 2010 at 12:55 PM)


(Shay Banon) #4

Yes, if you want the last 10 tweets from each followed user, then the
proposed solution won't work.
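
For the last-N-per-user case, one option consistent with the exchange above is a separate search per followed user, each with its own sort and size. The sketch below only builds one request body per user; per_user_queries is a hypothetical helper name, and the term/sort/size layout follows more recent Elasticsearch query DSL, which is an assumption for the version discussed here.

```python
def per_user_queries(followed, size=10):
    """Build one search body per followed user: newest `size` tweets each."""
    return {
        user: {
            "query": {"term": {"user": user}},
            # Newest first, so `size` keeps only the latest tweets.
            "sort": [{"post_date": {"order": "desc"}}],
            "size": size,
        }
        for user in followed
    }


queries = per_user_queries(["bulgogi", "bibimbap"], size=10)
```

Each body would be sent as its own search request and the results combined on the client side, which is the multiple-query cost mentioned in the question.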

-shay.banon

(In reply to Ted Karmel's message of Thu, Sep 2, 2010 at 4:19 PM)

