We are facing a challenge in redesigning how we store data in Elasticsearch (version 6). In the current architecture, all master products, product items, categories, and brands are stored in PostgreSQL. Each shop has its own inventory, which we copy from PostgreSQL into Elasticsearch using the elasticsearch-rails gem; the persistence repository and its mapping are shown below. Our mistake was creating a separate index for every shop with three shards each: every shop we added brought three more shards, until our heap memory (20 GB) filled up. We also run only one server (another mistake).
class ShopInventoryRepository
  include Elasticsearch::Persistence::Repository
  include Elasticsearch::Persistence::Repository::DSL

  client Elasticsearch::Client.new(url: ENV['ELASTICSEARCH_URL'], log: true)

  # Default name; in practice we pass index_name explicitly as
  # "#{Rails.env}_shop_inventory_#{shop.id}"
  index_name "#{Rails.env}_shop_inventory"
  document_type "shop_inventory"
  klass ShopInventory

  settings number_of_shards: 3
  mappings do
    indexes :id, type: 'integer'
    indexes :product_id, type: 'integer'
    indexes :name, type: 'text', analyzer: 'gramAnalyzer', search_analyzer: 'whitespace_analyzer', fields: {raw: {type: "keyword"}}
    indexes :product_sizes, type: 'nested' do
      # ….
    end
    indexes :category do
      # ….
    end
    indexes :sub_category do
      # ….
    end
    indexes :brand do
      # …..
    end
  end
end
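To put numbers on the shard growth described above, here is a rough sketch of the arithmetic for the one-index-per-shop layout. The shop counts are hypothetical, replica shards are ignored, and the helper name primary_shards_for is made up for illustration.

```ruby
# Rough shard arithmetic for the one-index-per-shop layout.
# SHARDS_PER_INDEX mirrors `number_of_shards: 3` in the repository above;
# the shop counts below are hypothetical, and replicas are not counted.
SHARDS_PER_INDEX = 3

def primary_shards_for(shop_count, shards_per_index = SHARDS_PER_INDEX)
  shop_count * shards_per_index
end

[100, 1_000, 5_000].each do |shops|
  puts "#{shops} shops -> #{primary_shards_for(shops)} primary shards"
end
```

Each shard is a separate Lucene index with its own memory overhead, so on a single node the shard count, rather than the amount of data, is probably what exhausts the heap first.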
A sample search response from one shop's index (staging_shop_inventory_552):

{
  "took":0,
  "timed_out":false,
  "_shards":{
    "total":3,
    "successful":3,
    "skipped":0,
    "failed":0
  },
  "hits":{
    "total":3202,
    "max_score":1.0,
    "hits":[
      {
        "_index":"staging_shop_inventory_552",
        "_type":"shop_inventory",
        "_id":"1",
        "_score":1.0,
        "_source":{
          "id":1,
          "name":"Product A",
          "product_id":,
          "image":"",
          "alternate_name":"",
          "name_suggest":"Product A ",
          "brand_suggest":"Product A",
          "name_autocomplete":"Product A",
          "brand_autocomplete":"Product A",
          "is_deleted":false,
          "deleted_at":null,
          "created_at":"",
          "product_sizes":[
            {
              "id":2,
              "product_id":1,
              "ean_code":null,
              "uom":"ml",
              "weight":180.0,
              "price":295.0,
              "is_deleted":false,
              "deleted_at":null,
              "product_update_on":"",
              "product_update_status":0,
              "in_stock":true,
              "description":null
            },
            {
              "id":3,
              "product_id":2,
              "ean_code":null,
              "uom":"ml",
              "weight":45.0,
              "price":72.0,
              "is_deleted":false,
              "deleted_at":null,
              "product_update_on":"",
              "product_update_status":0,
              "in_stock":true,
              "description":null
            }
          ],
          "category":{
            "id":3,
            "name":"Category A",
            "image":""
          },
          "sub_category":{
            "id":11,
            "name":"Sub Category A",
            "image":"",
            "is_selected":null,
            "created_at":"",
            "updated_at":""
          },
          "brand":{
            "id":23,
            "name":"Brand A",
            "image":"",
            "is_selected":null,
            "created_at":"",
            "updated_at":""
          }
        }
      }
    ]
  }
}
Instead of creating a separate repository/index for each shop, we plan to group shops in batches of n (about 500), create one repository per group, and give each group's index 5-10 shards. We would also like to flatten the mapping, but couldn't find a way to do it. Any suggestions on how to design this would be really helpful. If more information is needed, please let me know and I will share it.
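For discussion, here is a minimal sketch of what the grouping and the flattening could look like in plain Ruby. Everything in it is an assumption rather than existing code: the GROUP_SIZE constant, the index naming scheme, the helper names grouped_index_name and flatten_inventory, and the flattened field names. It turns one nested inventory document into one flat document per product size and tags each with a shop_id so that several shops can share an index.

```ruby
# Sketch only: GROUP_SIZE, the naming scheme, and the flattened field names
# are assumptions for illustration, not part of the existing setup.
GROUP_SIZE = 500

# Maps a shop id to a shared index: shops 0..499 -> group 0, 500..999 -> group 1.
def grouped_index_name(shop_id, env: "staging")
  "#{env}_shop_inventory_group_#{shop_id / GROUP_SIZE}"
end

# Turns one nested inventory document into one flat document per product size.
# shop_id is copied onto every document so searches can filter by shop.
def flatten_inventory(doc, shop_id)
  parent = {
    "shop_id"         => shop_id,
    "inventory_id"    => doc["id"],
    "name"            => doc["name"],
    "category_id"     => doc.dig("category", "id"),
    "category_name"   => doc.dig("category", "name"),
    "sub_category_id" => doc.dig("sub_category", "id"),
    "brand_id"        => doc.dig("brand", "id"),
    "brand_name"      => doc.dig("brand", "name")
  }

  doc.fetch("product_sizes", []).map do |size|
    parent.merge(
      "size_id"  => size["id"],
      "uom"      => size["uom"],
      "weight"   => size["weight"],
      "price"    => size["price"],
      "in_stock" => size["in_stock"]
    )
  end
end

# Trimmed-down version of the sample document from the search response.
doc = {
  "id" => 1,
  "name" => "Product A",
  "category" => { "id" => 3, "name" => "Category A" },
  "sub_category" => { "id" => 11, "name" => "Sub Category A" },
  "brand" => { "id" => 23, "name" => "Brand A" },
  "product_sizes" => [
    { "id" => 2, "uom" => "ml", "weight" => 180.0, "price" => 295.0, "in_stock" => true },
    { "id" => 3, "uom" => "ml", "weight" => 45.0, "price" => 72.0, "in_stock" => true }
  ]
}

puts grouped_index_name(552) # prints "staging_shop_inventory_group_1"
flat = flatten_inventory(doc, 552)
puts flat.length             # prints 2, one document per product size
```

With this layout every query would need a term filter on shop_id, and product-level fields are duplicated on each size document, so a hit represents a size rather than a product. Whether that trade-off is acceptable depends on how the search results are displayed.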