Infinite scroll best practices with ES

Hi All.

I am new to this group and ES.

I checked some of the topics in this forum but I wasn't able to find the
answer I needed.

I expect to have more than 1 million documents in my ES cluster and to show
them via AJAX calls on the search results page of my app.

I am using many filters (but no aggregations so far), and my
application architecture is "Single Page App".

I want to implement pagination with infinite scroll, so on scroll I will show
the next batch of results.

What is the best practice for fetching filtered results on window scroll, e.g. if
my filter matches 50000 results and I want to show them in batches of
250 with infinite scroll?

I am unsure whether to use pagination or scroll in ES. The choice can affect my
application's performance, which is very important to me.

If you prefer scroll, how can I pass the scroll parameter in the request body
instead of in the URL?

I didn't see that in the ES documentation.

Thanks for your help.
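
On the scroll-in-body question, here is a minimal sketch of the two request shapes as I understand them. The initial search takes the scroll keep-alive as a URL query parameter, while in more recent Elasticsearch releases the follow-up call to /_search/scroll accepts a JSON body carrying scroll and scroll_id. Older releases may behave differently, so please check the docs for your version. The function names and the "products" index are made up for illustration, and nothing here talks to a cluster; it only builds the request descriptions.

```python
def initial_search_request(index, query, batch_size, keep_alive="1m"):
    """Describe the first request of a scroll.

    The keep-alive is passed as the `scroll` query-string parameter;
    the batch size and the query travel in the JSON body.
    """
    return {
        "method": "POST",
        "path": f"/{index}/_search",
        "params": {"scroll": keep_alive},
        "body": {"size": batch_size, "query": query},
    }


def next_batch_request(scroll_id, keep_alive="1m"):
    """Describe a follow-up request.

    Here both the keep-alive and the scroll id go in the JSON body,
    so nothing needs to be appended to the URL.
    """
    return {
        "method": "POST",
        "path": "/_search/scroll",
        "params": {},
        "body": {"scroll": keep_alive, "scroll_id": scroll_id},
    }
```

The scroll_id to pass to next_batch_request comes from the response of the previous request, and each follow-up resets the expiry to the keep-alive you send.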

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/8b262891-0379-4de8-bbd8-6494b245102f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

In this discussion, I will rely on this page for
reference: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-scroll.html

At my level, I cannot really make a recommendation but I can share some
questions going through my head, which if you fill in the blanks, might
help you come to a reasonable decision ...

  1. The docs state flat out: "Scrolling is not intended for real time
    user requests, but rather for processing large amounts of data" ... but for
    a user scrolling an infinite window with 250 results per pseudo-page ...
    can we really consider that real time? If not, then I guess it's reasonable
    to look at scroll for infinite paging.
  2. If you have 100 users accessing your application then that means you
    are serving the infinite-scroll screen via 100 scroll-queries ... what is a
    reasonable timeout period for your scroll queries?
    1. Should they stay alive for 10m each ... is that how long a user
      session is for you on average?
    2. Quoting the docs before my next thought: "The scroll parameter
      (passed to the search request and to every scroll request) tells
      Elasticsearch how long it should keep the search context alive. Its value
      does not need to be long enough to process all data — it just needs to be
      long enough to process the previous batch of results. Each scroll request
      (with the scroll parameter) sets a new expiry time."
    3. What happens when a scroll query timeout expires? I think your app
      will need to be smart enough to issue a new search_then_scroll chain but:
      1. the results will shift a bit if new data has been indexed
      2. your app will probably need to keep track of the fact that the user
        was on pseudo-page 5 of their infinite scroll (looking at results #
        1001-1250) and then get the new page #6 by searching first and then
        scrolling through pages # 2,3,4,5 until it gets to page #6 ... so the app
        needs to be very smart ... depending on your comfort level with writing
        code, you may or may not consider this to be a big deal
      3. also there is no way to scroll backwards, which means you'll need
        to cache every page you've retrieved on the client side. Does your
        app have sufficient memory?
      4. will you choose to replace the cache when something like (3.2)
        happens? I would, but again the shifting results may or may not feel odd
        to the user.
    4. Will you be happy with the performance of a plain scroll query, or
      will you want something more like the SCAN search_type as well? In that
      case there is no sorting at all, so when a timeout happens, the rows of
      data will seem to have moved around a LOT to a user with a good memory who
      was scrolling the page. So it would be best to just make users start from
      the beginning in such a scenario.
    5. I also imagine you will shorten the timeout depending on how many
      users are hammering your application. When you try to strike a balance
      between the average user engagement time and the scroll query timeout,
      you'll have a lower bound determined (somewhat) by how long users
      typically stay on your page and scroll around, say no more than 25
      seconds on average. I'm making these metrics up but you get the idea. So
      you wouldn't want to set a timeout smaller than 25 seconds then.
  3. The good news is if you do all this, then people will love you for
    sharing your infinite-scroll client code :)
  4. Some practical advice: I have an angular+ionic app for selling
    clothes, shoes etc. where I have NOT set up infinite scroll (the default is
    around 1000 products), and I explain my decision to my team like so: I want
    my users to search and not scroll. This is just an opinion, so if you took
    the time to write a smart client, and if it was open-source, I would
    totally consider using it :)
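
To make point 2.3 a bit more concrete, here is a rough sketch of the "search first, then scroll forward again" recovery, with the retrieved pages cached on the client. It assumes two hypothetical callables that you would write around your ES client: start_scroll() returns (scroll_id, hits) for the first page, and continue_scroll(scroll_id) returns (scroll_id, hits) for the next page or raises ScrollExpired when the context has timed out. None of these names come from Elasticsearch itself.

```python
class ScrollExpired(Exception):
    """Raised by the (hypothetical) backend wrapper when the scroll timed out."""


class InfiniteScrollFeed:
    """Serves pages in order and re-creates the scroll if it expires."""

    def __init__(self, start_scroll, continue_scroll):
        self._start = start_scroll
        self._continue = continue_scroll
        self._scroll_id = None
        self.pages = []  # client-side cache: scroll cannot go backwards

    def next_page(self):
        if self._scroll_id is None:
            # Very first request: open the scroll and take page 1.
            self._scroll_id, hits = self._start()
        else:
            try:
                self._scroll_id, hits = self._continue(self._scroll_id)
            except ScrollExpired:
                hits = self._replay()
        self.pages.append(hits)
        return hits

    def _replay(self):
        """Open a new scroll and skip forward past the pages already shown."""
        self._scroll_id, hits = self._start()  # this is page 1 again
        for _ in range(len(self.pages)):
            # After n continuations we hold page n + 1, i.e. the next unseen page.
            self._scroll_id, hits = self._continue(self._scroll_id)
        return hits
```

Note that the replay costs one request per page already shown, and the cached pages are kept as they were, so if new data was indexed in the meantime the replayed results may not line up exactly with what the user has already seen.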

Cheers!

  • Pulkit


Scan/scroll queries use too much memory to serve all clients. They also
keep files around on disk after they would normally be deleted.
(In reply to pulkitsinghal's message of Nov 9, 2014, 12:12 PM.)


Hi All,

Thanks, pulkitsinghal and Nicolas, for your replies.

Actually I decided to go with pagination in Elasticsearch, so on scroll I
would load the next part of the results.

On every scroll event I have to send a request to the Elasticsearch server
with the same query, changing only the "from" offset.

This is not a good solution for performance, but I will test how it
works with a large amount of data.
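
For reference, the offset-based paging described above could look roughly like this. It is only a sketch: the helper names page_request and pages_needed are invented for illustration, and the only Elasticsearch-specific parts are the "from", "size" and "query" keys of the search body.

```python
import math


def page_request(query, page, size=250):
    """Build the search body for a zero-based page number.

    The query stays the same on every call; only "from" changes.
    """
    if page < 0 or size <= 0:
        raise ValueError("page must be >= 0 and size must be > 0")
    return {"from": page * size, "size": size, "query": query}


def pages_needed(total_hits, size=250):
    """Number of requests needed to show every hit in batches of `size`."""
    return math.ceil(total_hits / size)
```

With 50000 filtered results and batches of 250 that is 200 requests, and the last one uses from=49750. Deep offsets get more expensive because each shard has to collect from + size hits before the top ones are returned, and newer Elasticsearch versions cap from + size with the index.max_result_window setting (10000 by default), so this is worth measuring with realistic data.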

I will take the time to share the application with you after it is finished,
along with details on how much data there is in total and how it
performs.

Best Regards,
Driton

(In reply to Driton Alija's original post of Thursday, October 30, 2014, 1:54:16 AM UTC+1.)
