Usecase suggestion needed


(Eugene Strokin) #1

I don't have much experience with ES, so forgive me if the answer is
obvious, but I need help with a use case.
I have a document structure similar to a blog post. For the sake of
the example I'll leave out unrelated fields, which gives something
like this:
POST:
ID: String
Description: Text
OwnerScreenName: String
OwnerID: String

I have millions of such posts indexed, and it works OK.
Now I need to add comments to each post. Each comment would have very
similar fields:
COMMENT:
ID: String
PostID: String
Comment: Text
CommenterScreenName: String

Comments could be added, edited, or deleted relatively often and in
large numbers (usually 0 to 300 comments per post, and many more in
some special cases). I also need to find posts based on the comments'
content.

So my question: should I just add the comment data directly to the
post (making it part of the document)? In that case, when I need only
the posts without comments, I'd still get everything back together,
and with a lot of comments that could be a big overhead.
Or should I keep comments as a separate type in ES? In that case,
retrieving posts and comments together would take two calls to ES.
Also, a match in the post's content should score higher than a match
in a comment's content, and if the comments are separate documents,
I'm not sure how to write such a query, if it is possible at all.
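
To make the trade-off concrete, here is a minimal sketch of the two
layouts as plain Python dicts (request bodies only, no client calls).
The "Comments" field name and the helper names are assumptions made for
illustration, and the `_source` excludes filter is a feature of more
recent Elasticsearch versions, so check what your version supports.

```python
# Sketch only: builds request bodies; it does not talk to a cluster.

def embedded_post(post_id, description, comments):
    """Option 1: comments stored inside the post document.
    'Comments' is a hypothetical field name."""
    return {
        "ID": post_id,
        "Description": description,
        "Comments": list(comments),
    }


def search_posts_without_comments(text):
    """Search embedded posts, but ask ES to leave the (possibly large)
    Comments array out of the returned _source."""
    return {
        "_source": {"excludes": ["Comments"]},
        "query": {"match": {"Description": text}},
    }


def comments_for_post(post_id):
    """Option 2: comments are separate documents. This is the second
    call, fetching them by the PostID field from the example."""
    return {"query": {"term": {"PostID": post_id}}}
```

With the embedded layout, every comment add, edit, or delete rewrites
the whole post document, which matters when comments change often.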

Could you give me some suggestions, or at least point me in a
direction to dig into?

Thanks in advance,
Eugene S.


(Karussell) #2

Have a look at parent/child, which should fit your use case
perfectly.
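
As a rough sketch of the idea: with parent/child, each comment is
indexed as a child of its post, and a `has_child` query returns the
posts whose comments match. The child type name "comment" and the
helper name are assumptions, and the way the parent/child relation is
declared in the mapping differs between Elasticsearch versions, so
only the query body is shown.

```python
def posts_matching_comments(text):
    """Build a has_child query body: find posts that have at least one
    child comment whose Comment field matches `text`.
    The child type name 'comment' is an assumption."""
    return {
        "query": {
            "has_child": {
                "type": "comment",
                "query": {"match": {"Comment": text}},
            }
        }
    }
```

Because comments stay separate documents, adding or deleting one does
not reindex the parent post.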

> post's content should boost score higher than comment's content

This is easier when they are separate, IMO, e.g. using disMaxQuery.
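
One way this could look, as a sketch rather than a tested recipe: a
`dis_max` query with two clauses, a boosted match on the post's
Description and a `has_child` clause on the comments. The boost value,
the child type name "comment", and the helper name are assumptions, and
the option that lets child scores reach the parent (`score_mode` here)
has had different names across Elasticsearch versions.

```python
def post_or_comment_query(text, post_boost=3.0):
    """Build a dis_max query body that scores a post by its best
    matching clause: its own Description (boosted) or its comments.
    'comment' and the default boost are illustrative assumptions."""
    return {
        "query": {
            "dis_max": {
                "queries": [
                    # A match in the post itself gets the higher weight.
                    {"match": {"Description": {"query": text,
                                               "boost": post_boost}}},
                    # A match in any child comment also returns the post.
                    {"has_child": {
                        "type": "comment",
                        "score_mode": "max",
                        "query": {"match": {"Comment": text}},
                    }},
                ]
            }
        }
    }
```

`dis_max` takes the score of the best matching clause instead of
summing them, so a post that matches only through its comments still
ranks, but below one whose own Description matches equally well.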

Peter.

On 3 Jan., 22:22, Eugene Strokin eug...@strokin.info wrote:



(system) #3