Help with nested datatype mapping

Hi, I'm having problems with the schema for a nested datatype.

So far, I have the following:

"mappings": {
	"note": {
		"_all": {
			"enabled": false
		},
		"properties": {
			"user_id": {
				"type": "long"
			},
			"creation": {
				"type": "date",
				"format": "date_hour_minute_second"
			},
			"deleted": {
				"type": "integer"
			},
			"favourite": {
				"type": "integer"
			},
			"modification": {
				"type": "date",
				"format": "date_hour_minute_second"
			},
			"note": {
				"type": "text",
				"analyzer": "english",
				"fields": {
					"std": {
						"type": "text",
						"analyzer": "asset_en_analyzer",
						"fields": {
							"std": {
								"type": "text",
								"analyzer": "standard"
							}
						}
					}
				}
			},
			"title": {
				"type": "text",
				"analyzer": "english",
				"fields": {
					"std": {
						"type": "text",
						"analyzer": "asset_en_analyzer",
						"fields": {
							"std": {
								"type": "text",
								"analyzer": "standard"
							}
						}
					}
				}
			},
			"links_to_asset": {
				"type": "nested",
				"properties": {
					"note_link_id": {
						"type": "long"
					},
					"user_id": {
						"type": "long"
					},
					"creation": {
						"type": "date",
						"format": "date_hour_minute_second"
					},
					"modification": {
						"type": "date",
						"format": "date_hour_minute_second"
					},
					"to_asset": {
						"type": "long"
					},
					"from_asset": {
						"type": "long"
					},
					"comment": {
						"type": "text",
						"analyzer": "english",
						"fields": {
							"std": {
								"type": "text",
								"analyzer": "asset_en_analyzer",
								"fields": {
									"std": {
										"type": "text",
										"analyzer": "standard"
									}
								}
							}
						}
					}
				}
			}
		}
	}
}

Here, links_to_asset appears to be in the correct format (based on the few examples I've been able to find).

However, when I populate Elasticsearch with data, the "type": "nested" attribute is removed.

Does anyone know how I could fix this?

I am not following what you mean by the "type": "nested" attribute being removed. Is it being removed from the mapping? If not, what makes you think it is being removed? Please give me a sample of what you are observing.

Hi @thiago, that's correct.

Once data is added and I check the mappings, that attribute has been removed.

{
    "asset_en_v1": {
        "mappings": {}
    },
    "asset": {
        "mappings": {
            "note": {
                "properties": {
                    "creation": {
                        "type": "date"
                    },
                    "deleted": {
                        "type": "long"
                    },
                    "favourite": {
                        "type": "long"
                    },
                    "links_to_asset": {
                        "properties": {
                            "comment": {
                                "type": "text",
                                "fields": {
                                    "keyword": {
                                        "type": "keyword",
                                        "ignore_above": 256
                                    }
                                }
                            },
                            "creation": {
                                "type": "date"
                            },
                            "from_asset": {
                                "type": "long"
                            },
                            "modification": {
                                "type": "date"
                            },
                            "note_link_id": {
                                "type": "long"
                            },
                            "to_asset": {
                                "type": "long"
                            },
                            "user_id": {
                                "type": "long"
                            }
                        }
                    },
                    "modification": {
                        "type": "date"
                    },
                    "note": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "title": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "user_id": {
                        "type": "long"
                    }
                }
            }
        }
    }
}

Is this for a new index? Elasticsearch won't do that unless you are indexing into a new index. Are you using templates?

Hi @thiago, this is a new index.

No, I'm not using templates; I'm running the mapping via Postman.

I've had similar problems with date attributes when the data was in the wrong format, so I'm wondering if this is the same, though I don't see what's wrong with the data.

You need to create the mapping in the new index before indexing. You should also index into the same type that is defined in the index mapping; otherwise Elasticsearch will try to autodetect the mapping, and it won't autodetect nested types.
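One way to confirm whether the explicit mapping survived is to walk the GET _mapping response and list every field that is still mapped as nested. The following is a minimal sketch, not an official tool: the helper names `nested_fields` and `summarize` are hypothetical, the sample is a trimmed copy of the response posted above, and it assumes the typed mapping layout shown in this thread (index, then "mappings", then type name, then "properties").

```python
def nested_fields(properties, prefix=""):
    """Return dotted paths of every field mapped as "type": "nested"."""
    found = []
    for name, spec in properties.items():
        path = prefix + name
        if spec.get("type") == "nested":
            found.append(path)
        # Object and nested fields both keep their children under "properties".
        found.extend(nested_fields(spec.get("properties", {}), path + "."))
    return found


def summarize(mapping_response):
    """Map each top-level index name to {type name: nested field paths}.

    An empty dict for an index means it has no mappings at all.
    """
    summary = {}
    for index, body in mapping_response.items():
        summary[index] = {
            type_name: nested_fields(type_body.get("properties", {}))
            for type_name, type_body in body.get("mappings", {}).items()
        }
    return summary


# Trimmed copy of the response posted earlier in this thread.
observed = {
    "asset_en_v1": {"mappings": {}},
    "asset": {
        "mappings": {
            "note": {
                "properties": {
                    "links_to_asset": {
                        "properties": {"note_link_id": {"type": "long"}}
                    },
                    "user_id": {"type": "long"},
                }
            }
        }
    },
}

report = summarize(observed)
print(report)
```

For the response above, this reports no mappings under asset_en_v1 and no nested fields under asset, which suggests the explicit mapping was never applied rather than altered after indexing.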

Hi @thiago, that's what I'm doing.

Elasticsearch is performing some autodetection even though I'm creating the index with explicit mappings before indexing any data.

What is the name of the index that you are trying to create and also could you give a sample document that you are trying to index after you create the index?

Hi @thiago, the index is: asset_en_v1 and I'm using the alias: asset to point to it.

It's not feasible to give you a sample document as the process of indexing is automated (I'm indexing existing data from within a database).

After testing, the index and the mappings are correct throughout the application, with the exception of the nested datatype.

Are you sure the alias is set up properly? After you index a document, verify that you can query it through asset and also through asset_en_v1.

Hi @thiago, here's the workflow I go through:

DELETE localhost:9200/_all

PUT localhost:9200/asset_en_v1

{
	"settings": {
		"analysis": {
			"char_filter": {
				"&_to_and": {
					"type": "mapping",
					"mappings": [ "&=> and "]
				}
			},
			"filter": {
				"asset_en_stopwords": {
					"type": "stop",
					"stopwords": [ "_english_" ]
				},
				"asset_en_stemmer": {
					"type": "stemmer",
					"name": "english"
				},
				"asset_en_shingle": {
					"type": "shingle",
					"max_shingle_size": 5,
					"min_shingle_size": 2,
					"output_unigrams": false,
					"output_unigrams_if_no_shingles": true
				}
			},
			"analyzer": {
				"asset_en_analyzer": {
					"type": "custom",
					"char_filter": [ "html_strip", "&_to_and" ],
					"tokenizer": "standard",
					"filter": [ "asset_en_stopwords", "asset_en_stemmer", "lowercase", "asset_en_shingle", "asciifolding" ]
				}
			}
		}
	}
},
"mappings": {
	"note": {
		"_all": {
			"enabled": false
		},
		"properties": {
			"user_id": {
				"type": "long"
			},
			"creation": {
				"type": "date",
				"format": "date_hour_minute_second"
			},
			"deleted": {
				"type": "integer"
			},
			"favourite": {
				"type": "integer"
			},
			"modification": {
				"type": "date",
				"format": "date_hour_minute_second"
			},
			"note": {
				"type": "text",
				"analyzer": "english",
				"fields": {
					"std": {
						"type": "text",
						"analyzer": "asset_en_analyzer",
						"fields": {
							"std": {
								"type": "text",
								"analyzer": "standard"
							}
						}
					}
				}
			},
			"title": {
				"type": "text",
				"analyzer": "english",
				"fields": {
					"std": {
						"type": "text",
						"analyzer": "asset_en_analyzer",
						"fields": {
							"std": {
								"type": "text",
								"analyzer": "standard"
							}
						}
					}
				}
			},
			"links_to_asset": {
				"type": "nested",
				"properties": {
					"note_link_id": {
						"type": "long"
					},
					"user_id": {
						"type": "long"
					},
					"creation": {
						"type": "date",
						"format": "date_hour_minute_second"
					},
					"modification": {
						"type": "date",
						"format": "date_hour_minute_second"
					},
					"to_asset": {
						"type": "integer"
					},
					"from_asset": {
						"type": "integer"
					},
					"comment": {
						"type": "text",
						"analyzer": "english",
						"fields": {
							"std": {
								"type": "text",
								"analyzer": "asset_en_analyzer",
								"fields": {
									"std": {
										"type": "text",
										"analyzer": "standard"
									}
								}
							}
						}
					}
				}
			}
		}
	}

... other asset types (removed due to character length constraints).

}

PUT localhost:9200/asset_en_v1/_alias/asset

Then I index the data from within the application.

As I said, the mapping appears to be correct, with the exception of the nested datatype.

It seems fine. Please attach the full output of GET asset_en_v1/_mapping. Do not remove anything; if it's too big, paste the content into pastebin.com and link it here.

Hi @thiago:

{
    "asset_en_v1": {
        "mappings": {}
    }
}

I assume the alias isn't working?

If you are indexing into asset and the asset_en_v1 mapping looks like this, it is because the alias command didn't work.

What is the output of the PUT /asset_en_v1/_alias/asset command?

Hi @thiago:

{
    "acknowledged": true
} 

If it had been anything else, I would have mentioned it.

Hi @thiago, any further thoughts on this?

There must be some command failing in between. I don't understand how you get an empty mapping from GET asset_en_v1/_mapping when your command for creating the index asset_en_v1 defines the mappings properly. Unless the create command is failing or the index is being deleted in between, I don't see how you can get an empty mapping.

Hi @thiago, we found that the problem was in how the settings and mappings were formatted as one lump of code; Postman sent the JSON even though it wasn't valid.

Thank you for your time — much appreciated!
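For anyone who hits the same problem, the invalid body can be caught before it is sent by parsing it locally. This is a minimal sketch using only Python's standard `json` module: the helper `check_body` is hypothetical, and the two strings are reduced stand-ins for the real request body, in which the root object closed right after "settings" and left "mappings" outside it.

```python
import json

# Reduced shape of the body that was sent: the root object closes right
# after "settings", so "mappings" is left dangling outside it.
broken = '{ "settings": {} }, "mappings": {}'

# Reduced shape that was intended: both keys inside one root object.
fixed = '{ "settings": {}, "mappings": {} }'


def check_body(text):
    """Return (True, parsed) for valid JSON, else (False, error description)."""
    try:
        return True, json.loads(text)
    except json.JSONDecodeError as err:
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"


ok_broken, detail = check_body(broken)
ok_fixed, parsed = check_body(fixed)

print(ok_broken, detail)          # rejected, with the position of the stray text
print(ok_fixed, sorted(parsed))   # accepted, with both top-level keys present
```

A strict parser rejects the first string because of the text after the closing brace. The empty mapping seen earlier would be consistent with the server reading only the first object and ignoring the rest, but that explanation is an assumption and was not confirmed in this thread.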
