I know how to set up the river plugin and search across it. The problem is
that if the same document is edited multiple times (multiple revisions), only
the data from the latest revision is retained and older data is lost. I want
to keep an index of all revisions for my entire couchdb.
For this, my naive strategy is the following: create a new id which is the
concatenation of the "_id" and "_rev" of the couchdb doc that comes through
from the _changes stream.
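To illustrate the id I have in mind, here is a minimal sketch. The helper name composite_id and the "_" separator are only assumptions for illustration (my actual river script concatenates without a separator); a separator just makes it easier to see where the "_id" ends and the "_rev" begins.

```shell
# Hypothetical helper: join a CouchDB _id and _rev into one composite id.
# The "_" separator is an assumption, not something the river requires.
composite_id() {
  printf '%s_%s\n' "$1" "$2"
}

composite_id "53fa6fcf981a01b05387e680ac4a2efa" "4-8f8808f84eebd0984d269318ad21de93"
# prints 53fa6fcf981a01b05387e680ac4a2efa_4-8f8808f84eebd0984d269318ad21de93
```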
To accomplish this, following the docs, I tried the following:
curl -XDELETE 127.0.0.1:9200/_all
curl -XPUT 'localhost:9200/foo_test' -d '{
  "mappings": {
    "foo_test": {
      "_id": {
        "path": "newId",
        "index": "not_analyzed",
        "store": "yes"
      }
    }
  }
}'
curl -XPUT 'localhost:9200/_river/foo_test/_meta' -d '{
  "type": "couchdb",
  "couchdb": {
    "host": "127.0.0.1",
    "port": 5984,
    "db": "foo_test",
    "script": "ctx.doc.newId = ctx.doc._id + ctx.doc._rev",
    "filter": null
  },
  "index": {
    "index": "foo_test",
    "type": "foo_test",
    "bulk_size": "100",
    "bulk_timeout": "10ms"
  }
}'
And after this, when I search for a doc I added, I get:
_index: foo_test
_type: foo_test
_id: 53fa6fcf981a01b05387e680ac4a2efa
_score: 8.238497
_source: {
  _rev: 4-8f8808f84eebd0984d269318ad21de93
  content: {
    foo: bar
    foo3: bar3
    foo2: bar2
  }
  _id: 53fa6fcf981a01b05387e680ac4a2efa
  newId: 53fa6fcf981a01b05387e680ac4a2efa4-8f8808f84eebd0984d269318ad21de93
}
So I see that elasticsearch is not using the new path to define its "_id"
field: "newId" is populated in the source, but "_id" is still the plain
couchdb id. Has anyone tried this before, or does anyone know what I might be
doing wrong?
Also, I know it's probably not wise to just keep storing all revs of the
couchdb. I intend to figure out a way to delete really old revisions, but I
guess that's for later.
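For that later cleanup, I am working from the fact that a couchdb "_rev" has the form N-hash, where N counts the edits, so N could be used to decide which revisions are old. A rough sketch of pulling N out (rev_number is a hypothetical helper name, not part of couchdb or the river):

```shell
# Hypothetical helper: extract the numeric prefix of a CouchDB _rev.
rev_number() {
  # Strip everything from the first "-" onward, leaving the edit counter.
  printf '%s\n' "${1%%-*}"
}

rev_number "4-8f8808f84eebd0984d269318ad21de93"   # prints 4
```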
Also, I did look into native "versioning" in elasticsearch, but that doesn't
help me, as I want to be able to access and search over previous versions,
and from what I have found "you can't do this using the builtin
versioning. All that does is to store the current version number to prevent
you applying updates out of order."