Entities architecture with custom fields and Upsert script issue

Spyros_Giannopoulos · June 17, 2021, 12:42pm

Hello,

I'm building an application on Elasticsearch (ES) that stores lists of members. Every member has a name, surname, etc., but each list also defines its own custom fields, so list 1 has different fields, of different types, than list 2. The available types are fixed, though: date, number, text, and boolean.

I first approached this by creating one index per list and flattening the fields, like this:
List1

{
   "name":"...",
   "surname":"...",
   "customfield_id1":"some value",
   "customfield_id2":3,
   "customfield_id5":"some other text value",
 .....
}

List2

{
   "name":"...",
   "surname":"...",
   "customfield_id3": true,
   "customfield_id4":"some text",
 .....
}

The field ids let me keep a definition of the fields for each list, so I can manage which fields each list contains and what type each one has.
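A per-list definition like this can be pictured as a small registry keyed by list and field id. The following is only a minimal Python sketch under stated assumptions: `FIELD_DEFINITIONS`, `PYTHON_TYPES`, and `validate_member` are hypothetical names, and the type assigned to each field id is guessed from the sample values above.

```python
from datetime import date

# Hypothetical registry: list name -> {custom field id -> declared type}.
FIELD_DEFINITIONS = {
    "list1": {"customfield_id1": "text", "customfield_id2": "number",
              "customfield_id5": "text"},
    "list2": {"customfield_id3": "boolean", "customfield_id4": "text"},
}

# Python types accepted for each declared type (an assumption of this sketch).
PYTHON_TYPES = {
    "text": (str,),
    "number": (int, float),
    "boolean": (bool,),
    "date": (date,),
}

STANDARD_FIELDS = {"name", "surname"}


def validate_member(list_name, member):
    """Return a list of problems found in one member document."""
    definitions = FIELD_DEFINITIONS[list_name]
    problems = []
    for field, value in member.items():
        if field in STANDARD_FIELDS:
            continue
        declared = definitions.get(field)
        if declared is None:
            problems.append(f"{field}: not defined for {list_name}")
        elif isinstance(value, bool) and declared != "boolean":
            # bool is a subclass of int in Python, so reject it explicitly.
            problems.append(f"{field}: expected {declared}, got boolean")
        elif not isinstance(value, PYTHON_TYPES[declared]):
            problems.append(f"{field}: expected {declared}")
    return problems
```

Checking a member against the registry before indexing would catch a field that belongs to another list, or a value of the wrong type, before it reaches the index.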

I built my app and tried to migrate many of the lists that I currently keep in a SQL server, but this approach creates a lot of indices and quickly consumes a lot of Java heap.

To solve this, I decided to merge the indices into fewer ones, grouping the lists by a value that clusters them into a much smaller number of groups. That gave me far fewer indices, but now I have a different problem: how do I handle the custom fields and their different types? For that I decided to use nested fields, one per type:
customFieldsTexts, customFieldsBooleans, customFieldsDates, etc.
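To make the shape of that layout concrete, here is a rough Python sketch that groups a flat set of custom fields into per-type lists of id/value entries. The helper names `bucket_for` and `to_nested` are hypothetical, and choosing the bucket from the Python type of the value is an assumption; in practice the type would more likely come from the stored field definition.

```python
from datetime import date


def bucket_for(value):
    """Pick the nested field name for a value (bool is checked before numbers)."""
    if isinstance(value, bool):
        return "customFieldsBooleans"
    if isinstance(value, (int, float)):
        return "customFieldsNumbers"
    if isinstance(value, date):
        return "customFieldsDates"
    return "customFieldsTexts"


def to_nested(name, custom_fields):
    """Turn {field id: value} into id/value entries grouped by type."""
    doc = {"name": name}
    for field_id, value in custom_fields.items():
        bucket = bucket_for(value)
        if isinstance(value, date):
            # Assumption: dates are sent as ISO 8601 strings.
            value = value.isoformat()
        doc.setdefault(bucket, []).append({"id": field_id, "value": value})
    return doc
```

Each entry keeps the field id next to its value, so a single index can hold members from many lists without needing one mapped field per custom field.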

I believe this can work, but I'm stuck on the following problem. I created this index and mapping for testing purposes.

PUT /testindex1
{
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "properties": {
      "name" : {
        "type" : "text"
      },
      "customFieldsTexts": { 
        "type": "nested",
        "properties": {
          "id" : {
            "type": "keyword"
          },
          "value" : {
            "type": "text"
          }
        }
      },
      "customFieldsDates": { 
        "type": "nested",
        "properties": {
          "id" : {
            "type": "keyword"
          },
          "value" : {
            "type": "date"
          }
        }
      },
      "customFieldsNumbers": { 
        "type": "nested",
        "properties": {
          "id" : {
            "type": "keyword"
          },
          "value" : {
              "type": "float"
          }
        }
      },
      "customFieldsBooleans": { 
        "type": "nested",
        "properties": {
          "id" : {
            "type": "keyword"
          },
          "value" : {
            "type": "boolean"
          }
        }
      }
    }
  }
}

Then I added the following document

PUT testindex1/_doc/1
{
  "customFieldsTexts" : [
    {
      "id" : "1e633764-4948-4e50-bed1-2e60e3967f66",
      "value" :  "spyros2"
    },
    {
      "id" : "db516d25-9de2-4373-9998-0d5421406989",
      "value" :  "george"
    }
  ]
}

Now I want to run an upsert that updates a custom field on a specific document, or inserts it if it is missing, with the following scripted update call:

POST testindex1/_update/1
{
  "script":{
    "source": """
    
    for (int i = 0; i < params.customFieldsTexts.size(); i++)
    {
      //Debug.explain(params.customFieldsTexts[i].value);
      def targets = ctx._source.customFieldsTexts.findAll(cf -> cf.id == params.customFieldsTexts[i].id);
      for (cf in targets)
      {
        cf.value = params.customFieldsTexts[i].value;
      }
    }
    """,
    "params": {
      "customFieldsTexts" : [{
        "id" : "1e633764-4948-4e50-bed1-2e60e3967f61",
        "value" : "spyros4"
      }]
    }
  },
  "upsert": {
    "customFieldsTexts" : {
        "id" : "1e633764-4948-4e50-bed1-2e60e3967f61",
        "value" : "spyros4"
      }
  }
}

The update works when the custom field already exists, but when it is missing the script doesn't insert it. What am I doing wrong here?
Also, do you think my approach to this problem is right?

Thank you in advance!!!
Spyros

Spyros_Giannopoulos · June 17, 2021, 2:08pm

I actually managed to make it work with the following script:

POST testindex1/_update/1?refresh
{
  "script":{
    "source": """    
    for (int i = 0; i < params.customFieldsTexts.size(); i++)
    {
      def targets = ctx._source.customFieldsTexts.findAll(cf -> cf.id == params.customFieldsTexts[i].id);
      if (!targets.isEmpty())
      {
        for (cf in targets)
        {
          cf.value = params.customFieldsTexts[i].value;
        }
      }
      else
      {
        ctx._source.customFieldsTexts.add(params.customFieldsTexts[i]);
      }
    }
    """,
    "params": {
      "customFieldsTexts" : [{
        "id" : "1e633764-4948-4e50-bed1-2e60e3967f61",
        "value" : "spyros4"
      }]
    }
  },
  "upsert": {
    "customFieldsTexts" : [{
        "id" : "1e633764-4948-4e50-bed1-2e60e3967f61",
        "value" : "spyros4"
      }]
  }
}
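For anyone comparing the two requests: the `upsert` body is only indexed when document 1 does not exist yet. When the document exists, only the script runs, so the script itself has to append entries whose id is not found. The merge logic can be sketched in plain Python as below; `apply_custom_fields` is a hypothetical name, and the `setdefault` guard for a missing field is an addition of this sketch that the script above does not have.

```python
def apply_custom_fields(source, incoming):
    """Mimic the script: overwrite entries with a matching id, append the rest."""
    # Guard against a document that has no customFieldsTexts yet
    # (an addition of this sketch, not part of the script above).
    existing = source.setdefault("customFieldsTexts", [])
    for new_entry in incoming:
        targets = [cf for cf in existing if cf["id"] == new_entry["id"]]
        if targets:
            for cf in targets:
                cf["value"] = new_entry["value"]
        else:
            # Copy the entry so the stored document does not share it with params.
            existing.append(dict(new_entry))
    return source
```

With the ids from this thread, the entry ending in `7f61` matches nothing in document 1 (whose ids end in `7f66` and `6989`), so it takes the append branch, which is the step the first script was missing.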

I would still really appreciate some insight on the architecture: is there a better way to handle my business case, or a better way in general of doing what I'm doing here?

system · July 15, 2021, 2:08pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
