Indexing Hierarchical Couchbase Documents


(Dogugun Ozkaya) #1

Hi everyone,
I'm new to Elasticsearch and have a problem with indexing documents that reference each other.
I'm using Couchbase as the database for my project, and most of the documents are linked to each other hierarchically. Below is an example:

document 1:
{
	"documentid": "docid1",
	"properties": [
		{
			"size": "XL"
		},
		{
			"color": "red"
		},
		{
			"image": "docid2"
		}
	]
}
document 2:
{
	"documentid": "docid2",
	"properties": [
		{
			"resolution": "1024*768"
		},
		{
			"type": "jpg"
		}
	]
}

The "documentid" fields are the document IDs in Couchbase.
I have set up the Couchbase-Elasticsearch plugin, and it replicates data from Couchbase to ES successfully. However, when I search on the image type property ("jpg"), I get the ID of document "docid2", whereas I want the parent document "docid1".

I found a couple of examples online, but in those examples the document IDs are mapped as "docid1::docid2", which cannot be done in my case. Also, since the CB-ES connector did all of the indexing, I don't think I can change how the documents are indexed (I tried and kept getting errors).

Now I'm thinking of replicating the related bucket to another node and nesting the documents as:

{
	"documentid": "docid1",
	"properties": [
		{
			"size": "XL"
		},
		{
			"color": "red"
		},
		{
			"image": {
				"documentid": "docid2",
				"properties": [
					{
						"resolution": "1024*768"
					},
					{
						"type": "jpg"
					}
				]
			}
		}
	]
}

and then replicating it again to ES in this form.
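
To make the nesting step concrete, here is a minimal Python sketch of how the embedded form could be built before replication. It assumes the documents are available as a plain dict keyed by document ID, that "image" is the only property holding a reference, and that references never form a cycle. The function name `embed_references` and the `ref_keys` parameter are hypothetical and not part of Couchbase or the CB-ES connector.

```python
import copy

def embed_references(doc_id, docs, ref_keys=("image",)):
    """Return a copy of docs[doc_id] with referenced documents embedded in place.

    Assumption: reference chains are acyclic, otherwise this recursion never ends.
    """
    doc = copy.deepcopy(docs[doc_id])
    for prop in doc.get("properties", []):
        for key, value in prop.items():
            # Swap a reference such as {"image": "docid2"} for the full child document.
            if key in ref_keys and isinstance(value, str) and value in docs:
                prop[key] = embed_references(value, docs, ref_keys)
    return doc

# The two sample documents from this post, keyed by their Couchbase IDs.
docs = {
    "docid1": {
        "documentid": "docid1",
        "properties": [{"size": "XL"}, {"color": "red"}, {"image": "docid2"}],
    },
    "docid2": {
        "documentid": "docid2",
        "properties": [{"resolution": "1024*768"}, {"type": "jpg"}],
    },
}

nested = embed_references("docid1", docs)
```

The source documents are left untouched because the function works on a deep copy, so the original bucket contents would not need to change.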

Could you please suggest a more elegant and economical way?


(Mark Harwood) #2

From the example documents you supplied there's no indication that doc 2 should be related to doc 1.

You have three options:

  • Collapse all source docs into a single "flat" JSON doc
  • Collapse all source docs into a single "nested" JSON doc [1]
  • Store parent and child source docs separately but relate them when you add them [2]
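
As a rough illustration of the first option, the sketch below collapses the two sample documents from the question into one flat document. It assumes the documents are held in a dict keyed by ID and that "image" is the only reference property; the helper name `flatten` and the underscore prefix scheme are hypothetical choices made for this example only.

```python
def flatten(doc_id, docs, ref_keys=("image",), prefix=""):
    """Collapse docs[doc_id] and every document it references into one flat dict."""
    flat = {}
    if not prefix:
        # Only the top-level document keeps its ID.
        flat["documentid"] = doc_id
    for prop in docs[doc_id].get("properties", []):
        for key, value in prop.items():
            if key in ref_keys and isinstance(value, str) and value in docs:
                # Pull the referenced document's properties up, prefixed with the key.
                flat.update(flatten(value, docs, ref_keys, prefix + key + "_"))
            else:
                flat[prefix + key] = value
    return flat

docs = {
    "docid1": {
        "documentid": "docid1",
        "properties": [{"size": "XL"}, {"color": "red"}, {"image": "docid2"}],
    },
    "docid2": {
        "documentid": "docid2",
        "properties": [{"resolution": "1024*768"}, {"type": "jpg"}],
    },
}

flat = flatten("docid1", docs)
```

Here `flat` contains the fields documentid, size, color, image_resolution and image_type, so a search on image_type would match the parent document directly. Underscores are used instead of dots in the prefixed names as a cautious choice, since dots in field names have been restricted in some Elasticsearch releases.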

The decision process for why you would choose one approach over another is roughly this:

I'm unfamiliar with the Couchbase plugin and to what extent it can be made to work with Elasticsearch's parent/child indexing API.
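
For reference, a parent/child setup in Elasticsearch 2.x could look roughly like the request bodies below, built here as plain Python dicts without any client library. The index name "products" and the type names "product" and "image" are hypothetical, and whether the Couchbase plugin can be configured to send the parent parameter is exactly the open question above, so treat this as a sketch of the Elasticsearch side only.

```python
import json

INDEX = "products"       # hypothetical index name
PARENT_TYPE = "product"  # hypothetical type for documents like docid1
CHILD_TYPE = "image"     # hypothetical type for documents like docid2

# Body for creating the index: every "image" document has a "product" parent.
create_index_body = {
    "mappings": {
        PARENT_TYPE: {},
        CHILD_TYPE: {"_parent": {"type": PARENT_TYPE}},
    }
}

def child_index_path(child_id, parent_id):
    """Path for indexing a child; the parent ID must be supplied with each child."""
    return "/{}/{}/{}?parent={}".format(INDEX, CHILD_TYPE, child_id, parent_id)

# Search body: return parent documents that have at least one child whose
# "type" property is "jpg". The field path follows the sample document layout.
search_body = {
    "query": {
        "has_child": {
            "type": CHILD_TYPE,
            "query": {"match": {"properties.type": "jpg"}},
        }
    }
}

print(child_index_path("docid2", "docid1"))
print(json.dumps(search_body, indent=2))
```

With this layout the two documents stay separate, and a has_child search returns "docid1" rather than "docid2".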

[1] https://www.elastic.co/guide/en/elasticsearch/guide/2.x/nested-objects.html
[2] https://www.elastic.co/guide/en/elasticsearch/guide/2.x/parent-child.html

