Paging issues when developing search capabilities using Elasticsearch and MySQL

We are developing a search function for space rental products using Elasticsearch and MySQL.

[Search form]

  • From DateTime
  • To DateTime
  • Min Price
  • Max Price

The search form is the same as above, and there is no problem with Fulltext-search in ES. The problem is that when you search for a price, you need to calculate the amount to be paid within the search time in real time, search for a price range with the calculated price, and complete paging.

ES searches Fulltext-search MySQL calculates prices in real time and filters the calculated prices according to Min Max Price I need to page the records that are searched in ES and MySQL There is a problem that the number of records that are searched on both systems do not match, so I am asking for help on a solution.

To solve this issue, I thought about calculating and filtering using scripts in ES without searching in both ES and MySQL, but I think the paging issue can be solved completely, but I think the performance will be very poor, so I ask for your opinions from experts.

I thought about two things to solve this issue.

  1. I also thought about calculating and filtering using scripts in ES without searching in both ES and MySQL, but the paging issue will be completely resolved, but I think the performance will be very low.
  2. We also considered importing more than two to three times the record count from both ES and MySQL, and merging the search results from the Spring Boot project or Spark. However, certain pages may have issues that return only one or two.

Please recommend a good way to solve the problem and improve the search performance.

I would recommend that you test this. Even if performance is worse compared to search without scripting it may still be sufficiently fast if your data volumes are not extremely large

The index structure is as follows.

  • product
  • price
    {
    "date" : "2024-05-20",
    "t0000" : 10000,
    ....
    "t2330": 12000
    }

When registering the product, price will register 730 records.
If you register about 1 million products, 730,000,000 price records will be registered.

If you run a script that adds up the prices (only the prices in the search time zone between t0000 and t2330) when the data grows like this, doesn't it have a significant impact on performance?

That might be the case. I am not familar with your data so do not know if there may be a better way to represent it in Elasticsearch or what impact such a calculation will have.

If you know you are always going to sum up certain prices, why not do this when you index so you have it readily available?

Thank you very much for your quick response.
It is difficult to predict and calculate all time zones because the search time zones are very diverse.
However, we intend to cache the price of the time zone that users search for.
If you don't have a cached price, you have no choice but to do the calculation.

So I was thinking about doing a fulltextsearch search in es and calculating the price in mysql and bringing the results of es and mysql together.

However, if you bring it together, you will have a paging processing problem, and eventually you come back to the origin and even have to deal with fulltextsearch and calculation in es.

Or I'm so worried about how I can change the data structure to solve this problem..