How to model data?


(Hossein Yeganeh Markid) #1

Hi, I have an index in which I want some fields translated into different languages.
Which approach do you think is better?

First approach:
{
"field1": {
"en": "book",
"de": "buch",
"fr": "livre"
},
"field2": {
"en": "water",
"de": "wasser",
"fr": "eau"
},
"field3": {
"en": "head",
"de": "kopf",
"fr": "tête"
}
}

Or
Second approach:
{
"en": {
"field1": "book",
"field2": "water",
"field3": "head"
},
"de": {
"field1": "buch",
"field2": "wasser",
"field3": "kopf"
},
"fr": {
"field1": "livre",
"field2": "eau",
"field3": "tête"
}
}

Which one is better and why?

Thanks in advance


(Jymit Singh Khondhu) #2

@hym
How do you intend to serve the results to your users?

Do you want a synonym-style search, where a user who searches for "auto" (IT) also finds results for "car" (EN) and "coche" (ES)?


(Hossein Yeganeh Markid) #3

@JKhondhu, thanks for the reply.
No, in this case I am not thinking of using synonyms, because I have already configured synonyms in the analyzer for each language.
My question is more from a data-modelling point of view. It is like the familiar question about project structure: by feature or by type?
In our case the first approach is more like by-feature and the second approach is more like by-type.

Please also tell me more about what you have in mind; maybe I did not get the point of your question.
Thanks in advance


(Jymit Singh Khondhu) #4

How do you intend to serve the results to your users?
How do you expect your data to be queried: by feature or by type? Then we can work backwards.


(Hossein Yeganeh Markid) #5

@JKhondhu, my queries will always carry a locale (en, de, ...), which is what I am calling the type here. For example, if a request comes in with the locale set to en, the query clauses will target the en fields.
Could you also tell me what you have in mind for queries by feature?
One more question: if we model the data by type, every field (field1, field2, field3) will be defined several times in the mapping of the same index, with the same name but a different analyzer. Could that be a problem?
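
To make that last question concrete, here is a minimal sketch of what a by-language mapping could look like. It only builds the mapping body as a plain Python dict and does not call any client. The analyzer names english, german and french are Elasticsearch's built-in language analyzers, used here as stand-ins for custom analyzers with synonyms; the ANALYZERS table and the build_mapping helper are hypothetical names for illustration. Each field name repeats under every locale, but the full paths (en.field1, de.field1, ...) are distinct, so each one carries its own analyzer.

```python
import json

# Assumption: locale -> built-in Elasticsearch language analyzer name.
# A custom analyzer (e.g. one with a synonym filter) could be substituted here.
ANALYZERS = {"en": "english", "de": "german", "fr": "french"}
FIELDS = ["field1", "field2", "field3"]


def build_mapping(analyzers, fields):
    """Return a mapping body with one object per locale, each holding the same field names."""
    properties = {}
    for locale, analyzer in analyzers.items():
        # The full field path becomes "<locale>.<name>", e.g. "en.field1".
        properties[locale] = {
            "properties": {
                name: {"type": "text", "analyzer": analyzer} for name in fields
            }
        }
    return {"mappings": {"properties": properties}}


mapping = build_mapping(ANALYZERS, FIELDS)
print(json.dumps(mapping, indent=2))
```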

Thanks


(Jymit Singh Khondhu) #6

:slight_smile: @hym,

"en": {
"field1": "book",
"field2": "water",
"field3": "head"
.
.

This schema is much easier to search. In a multi_match query you can use the wildcard en.* to target every field under one language. So the recommendation would be to model this by language, absolutely.
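
As a rough sketch of that suggestion, assuming the by-language document shape shown above: the helper below builds a multi_match request body whose fields entry is the wildcard for the requested locale. build_locale_query is a hypothetical name, and the code only produces the body as a Python dict; sending it to the cluster is left out.

```python
import json


def build_locale_query(text, locale):
    """Build a multi_match request body that searches every field under one locale object."""
    # "<locale>.*" relies on multi_match accepting wildcards in field names,
    # so for locale "en" it covers en.field1, en.field2 and en.field3.
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": [f"{locale}.*"],
            }
        }
    }


body = build_locale_query("book", "en")
print(json.dumps(body, indent=2))
```

Switching language is then just a matter of passing a different locale, for example build_locale_query("buch", "de").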

