FS River seems like a great plugin!
I tried it out with some data. For one thing, indexing was quite fast (y).
But I ran into some issues/quirks, or maybe I didn't use it correctly. I'm
using FS River version 0.0.3 with elasticsearch 0.20.5, indexing files from
the local filesystem.
- When I create a river with the following settings:
curl -XDELETE 127.0.0.1:9200/_river/foo
curl -XDELETE 127.0.0.1:9200/foo
curl -XPUT 'localhost:9200/_river/foo/_meta' -d '{
"type": "fs",
"fs": {
"name": "Foo Data",
"url": "/Users/slodha/foo/content",
"update_rate": 60000,
"includes": "*.json",
"json_support" : true
},
"index": {
"index": "foo",
"type": "foo",
"bulk_size": 50
}
}'
and search the resulting index with this query:
{
"query": {
"query_string": {
"default_field": "_all",
"query": "slodha"
}
}
}
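A query like this can be sent with curl. This is only a sketch: the host, the
index name and the pretty=true parameter are assumptions based on the river
settings above, and search_foo is a hypothetical helper name.

```shell
# Assumptions: Elasticsearch listens on localhost:9200 and the river
# writes into the "foo" index, as in the river settings above.
ES_URL='localhost:9200'
INDEX='foo'

# The query body, same as the one shown above.
QUERY='{
  "query": {
    "query_string": {
      "default_field": "_all",
      "query": "slodha"
    }
  }
}'

# Hypothetical helper: send the query to the index's _search endpoint.
search_foo() {
  curl -XGET "$ES_URL/$INDEX/_search?pretty=true" -d "$QUERY"
}
```

Calling search_foo against a running node should print the hits for that index.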
I get results like:
hits: [
  {
    _index: foo
    _type: foo
    _id: 18156b6b5a6b3a8e1ec5984f185e18
    _score: 6.9584246
    _source: {
      sunnyVal: slodha
    }
  }
  {
    _index: foo
    _type: foo
    _id: d7b4df4222e0d075d74ffde8aaa04a56
    _score: 6.901722
    _source: {
      fileNameTest: slodha
    }
  }
]
-- The hits never tell me which file each document came from, which I will
definitely need in order to find it on the filesystem eventually.
When I do this:
curl -XDELETE 127.0.0.1:9200/_river/foo
curl -XDELETE 127.0.0.1:9200/foo
curl -XPUT 'localhost:9200/_river/foo/_meta' -d '{
"type": "fs",
"fs": {
"name": "Foo Data",
"url": "/Users/slodha/foo/content",
"update_rate": 60000
},
"index": {
"index": "foo",
"type": "foo",
"bulk_size": 50
}
}'
Note that here I do not use the includes or json_support settings.
I get results like this:
hits: [
  {
    _index: foo
    _type: foo
    _id: 18156b6b5a6b3a8e1ec5984f185e18
    _score: 2.0015228
    _source: {
      name: slodha_1.json
      postDate: 1363384941000
      pathEncoded: 44d22b925f562f4e8d1d847253493336
      rootpath: 948cd64d775db4119962b5a36dd530
      virtualpath: t/sunnyTest
      file: {
        _name: slodha_1.json
        content: ewoic3VubnlWYWwiIDogInNsb2RoYSIgCn0K
      }
    }
  }
  {
    _index: foo
    _type: foo
    _id: d7b4df4222e0d075d74ffde8aaa04a56
    _score: 1.7533717
    _source: {
      name: file1.json
      postDate: 1363388628000
      pathEncoded: 99d79f46f1ce275b6b9152a0de54d5
      rootpath: 948cd64d775db4119962b5a36dd530
      virtualpath: t/sunnyTest/sunnyTest2
      file: {
        _name: file1.json
        content: ewoiZmlsZU5hbWVUZXN0IiA6ICJzbG9kaGEiCn0K
      }
    }
  }
]
Now I do get the file names and paths, but this time the content is an
encoded string (it looks like a hash sum to me) and is not readable.
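One thing I can check: the content values may be Base64 rather than a hash.
A minimal sketch, assuming a base64 command-line tool is available (the
variable names are just for illustration):

```shell
# Assumption: the "content" values are Base64, not a hash. The two strings
# below are copied verbatim from the search results above.
content_1='ewoic3VubnlWYWwiIDogInNsb2RoYSIgCn0K'
content_2='ewoiZmlsZU5hbWVUZXN0IiA6ICJzbG9kaGEiCn0K'

# Decode each value back to text. On some systems the decode flag is
# spelled -d or -D instead of --decode.
decoded_1=$(printf '%s' "$content_1" | base64 --decode)
decoded_2=$(printf '%s' "$content_2" | base64 --decode)

printf '%s\n' "$decoded_1"
printf '%s\n' "$decoded_2"
```

Both values decode to the small JSON documents from my files (the ones with
the sunnyVal and fileNameTest keys), so the file text does seem to be in the
index, just encoded.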
I'm sure there should be a way for me to see both content and file paths as
human readable with just one river. Can you suggest what I'm doing wrong?