[SOLVED] Recommendations for index mapping

I'm trying to imitate a SQL database structure (to have a similar structure, but I can change it if needed).

The SQL tables are like:

Table: main
------------------------
main_id   : integer
added     : datetime
part_id   : integer
action_id : integer
....

Table: parts 
------------------------
part_id   : integer
name      : varchar

Table: actions 
------------------------
action_id : integer
name      : varchar

..... (and so on) ....

I have about 10 similar fields like part_id and action_id in "main" table, in which they refer to a table with just an id and a name.

I can think of two options for defining the mapping of such fields:

  1. as "integer" and creating an index for each table (parts, actions, etc)
  2. as "keyword" and store the name directly in my "main" index (no need of extra indices).

In a SQL database, storing it as varchar usually takes more space, but performs better when retrieving the data as it doesn't needs to join tables. However, using an integer for searching is faster than using a string. One advantage of using an integer is that if we need to update some "name", its very simple.
Note: The user has the power to add, remove or update any of those tables such as parts and actions.

I'm very new with ES, so I don't know how repeated keywords may affect performance (or storage) and if there is a better solution to this problem.

UPDATE: I found this post: Are ENUM stored efficiently? - #2 by danielmitterdorfer
In which states that:

Strings are encoded [...] with the help of an ordinal table. The strings are de-duplicated and sorted into a table, assigned an ID, and then those ID’s are used as numeric doc values.

So I guess going with "keyword" may be better and simpler. I will have to handle updates separately, but I think is fine. Still, if I'm missing something, I would like to learn more about ES.

More about this: Modeling Your Data | Elasticsearch: The Definitive Guide [2.x] | Elastic

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.