Is Elasticsearch good for such a problem?

Hi,
We have been thinking about choosing the right technology for our problem for a long time.
The problem is as follows:

  • we will have tens of millions of JSON documents
  • each document will contain any number of "entry" sections inside an "entries" section (example below).
    The user decides how many entries the "entries" section holds. An example of such a document:
  {
    "user_id": 323233,
    "entries": {
      "car_model": {
        "type": "string",
        "value": "123",
        "entry_updated_ts": 1634324070.098,
        "entry_created_ts": 1634324070.098
      },
      "year_of_production": {
        "value_changed_ts": 1634324070.098,
        "type": "integer",
        "value": "1980",
        "entry_updated_ts": 1634324070.098,
        "entry_created_ts": 1634324070.098
      }
    }
  }

We want to quickly find documents that meet certain conditions, e.g.
"user_id" = 323233 and year_of_production = "1980"
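
A rough sketch of how that condition might be expressed as an Elasticsearch bool query against the layout above, written as a Python dict. It assumes the "value" fields are mapped as keyword, and build_query is a hypothetical helper name, not part of any client library.

```python
def build_query(user_id, entry_name, value):
    """Build a bool query for the object-shaped layout.

    Assumption: entries.<name>.value is mapped as keyword, so a term
    query matches the stored string exactly.
    """
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"user_id": user_id}},
                    # The entry name is part of the field path here.
                    {"term": {f"entries.{entry_name}.value": value}},
                ]
            }
        }
    }


query = build_query(323233, "year_of_production", "1980")
```

Because the entry name is part of the field path, every distinct entry name becomes its own mapped field in this layout.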

The set of fields a user can search on is arbitrary: each user can have hundreds of searchable fields and can define their own types of "entry". We will have tens of thousands of users.

The question is whether Elasticsearch is a good tool for such a problem.
As I understand it, I can create one index with many fields, but in our case that could mean tens of thousands of different fields. We also thought about creating
one index per user, but that would generate as many indices as there are users, and we expect thousands of users.
Of course, the search could be built on a relational database, but we are afraid of poor performance and scalability problems.

Michal Szymanski

Welcome!

Yes, you will be amazed by what Elasticsearch can do to solve your case, and it may even give you ideas for other use cases you have never dreamed of.

At least that was my story when I launched my very first node more than 10 years ago. :grin:

You can also think of a structure like this:

{
  "user_id": 323233,
  "entries": [{
    "name": "car_model",
    "type": "string",
    "value": "123",
    "entry_updated_ts": 1634324070.098,
    "entry_created_ts": 1634324070.098
  }, {
    "name": "year_of_production",
    "value_changed_ts": 1634324070.098,
    "type": "integer",
    "value": "1980",
    "entry_updated_ts": 1634324070.098,
    "entry_created_ts": 1634324070.098
  }]
}
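
One possible way to index and query this list layout is to map "entries" as a nested field, so that "name" and "value" are matched within the same entry. The sketch below only builds the request bodies as Python dicts. The field types (keyword for "value", double for the timestamps) and the helper name build_nested_query are assumptions for illustration, not something stated in this thread.

```python
# Assumed index body: a fixed set of fields, no matter how many
# different entry names the users create.
ENTRIES_MAPPING = {
    "mappings": {
        "properties": {
            "user_id": {"type": "long"},
            "entries": {
                "type": "nested",
                "properties": {
                    "name": {"type": "keyword"},
                    "type": {"type": "keyword"},
                    "value": {"type": "keyword"},
                    "value_changed_ts": {"type": "double"},
                    "entry_updated_ts": {"type": "double"},
                    "entry_created_ts": {"type": "double"},
                },
            },
        }
    }
}


def build_nested_query(user_id, entry_name, value):
    """Match documents of one user that have an entry with this name and value."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"user_id": user_id}},
                    {
                        "nested": {
                            "path": "entries",
                            "query": {
                                "bool": {
                                    "filter": [
                                        {"term": {"entries.name": entry_name}},
                                        {"term": {"entries.value": value}},
                                    ]
                                }
                            },
                        }
                    },
                ]
            }
        }
    }


nested_query = build_nested_query(323233, "year_of_production", "1980")
```

Storing every value as keyword keeps the mapping small but means range searches on numeric entries would need an extra typed field, which is a trade-off to weigh for your data.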

OK, we can use a list of dicts, but I still do not know how to organize the database. We will have millions of such documents with different field names. As far as I know, there is a limit on the number of fields (around 1000), which is a small number in our case.
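
For reference, the limit in question is the index.mapping.total_fields.limit setting, which defaults to 1000 mapped fields per index. With the list layout, the entry name is stored as data rather than as a field name, so the number of mapped fields should not grow with the number of entry types. A small sketch of the conversion from the original layout, where to_entry_list and entry_keys are hypothetical helpers:

```python
def to_entry_list(doc):
    """Turn {"entries": {name: {...}}} into {"entries": [{"name": name, ...}]}."""
    entries = [{"name": name, **body} for name, body in doc["entries"].items()]
    return {**doc, "entries": entries}


def entry_keys(doc):
    """Collect the distinct keys used inside the entry objects of a list-shaped doc."""
    return sorted({key for entry in doc["entries"] for key in entry})


original = {
    "user_id": 323233,
    "entries": {
        "car_model": {"type": "string", "value": "123"},
        "year_of_production": {"type": "integer", "value": "1980"},
    },
}

converted = to_entry_list(original)
keys = entry_keys(converted)
```

However many entry names users invent, the keys inside each entry stay the same small set, and only those keys need to appear in the mapping.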
