Can someone correct me if I am doing something wrong?
I have a collection of tweets stored in MongoDB, and I want to retrieve the text of tweets from the collection by keyword.
So first, I deleted the existing index:
{ "_id" : ObjectId("4fbb380cfed8f515a0000004"), "text" : "Lil Wayne Singlee Oh Yopp #SiyahTweetin❤❤❤" }
{ "_id" : ObjectId("4fbb380cfed8f515a0000005"), "text" : "In a happy place right now but the best part is its only gonna get better" }
{ "_id" : ObjectId("4fbb380cfed8f515a0000002"), "text" : "Like damn I wish it was 6:00" }
{ "_id" : ObjectId("4fbb380dfed8f515a0000008"), "text" : "New post: Lesbian pleasure two ways http://t.co/37d16t9t" }
{ "_id" : ObjectId("4fbb380dfed8f515a0000009"), "text" : "Omg..I'm kinda hungry." }
{ "_id" : ObjectId("4fbb380dfed8f515a000000a"), "text" : "I have a crush on #oomf but she has no clue. Lol" }
Without a field specified, I believe you will be searching against the _all
field, which has its own analyzer.
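To make the difference concrete, here is a minimal sketch in Python that only builds the two search request bodies, assuming a query_string query is being used. The helper name keyword_query and the keyword "happy" are made up for illustration; "text" is the field name from the tweet documents shown earlier. Leaving out default_field lets the query fall back to the index default, which is normally _all.

```python
import json


def keyword_query(keyword, field=None):
    """Build a query_string search body (hypothetical helper).

    With field=None no default_field is set, so the query falls back
    to the index's default field, which is normally _all.
    """
    query = {"query": keyword}
    if field is not None:
        # default_field names the field that query_string searches.
        query["default_field"] = field
    return {"query": {"query_string": query}}


# Searches the default field (normally _all, with its own analyzer).
all_fields_body = keyword_query("happy")

# Searches only the "text" field of each tweet.
text_only_body = keyword_query("happy", field="text")

print(json.dumps(all_fields_body))
print(json.dumps(text_only_body))
```

Either body could then be sent to the index's _search endpoint; comparing the hits for both is a quick way to see whether the analyzer on _all is what is affecting the results.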
--
Ivan
(In reply to the message from Serikozz serikozz@mail.ru of Mon, May 28, 2012 at 12:01 AM.)
I checked whether ES is indexing the tweets at all with curl localhost:9200/tweetindex/_count, and it returns {"count":0,"_shards":{"total":5,"successful":5,"failed":0}}
It seems it is not indexing at all. Am I doing something wrong?
From the log file:
[2012-05-30 19:14:09,333][INFO ][node ] [Lightspeed] {0.19.3}[6476]: initializing ...
[2012-05-30 19:14:09,405][INFO ][plugins ] [Lightspeed] loaded [river-mongodb, mapper-attachments], sites []
[2012-05-30 19:14:12,595][INFO ][node ] [Lightspeed] {0.19.3}[6476]: initialized
[2012-05-30 19:14:12,595][INFO ][node ] [Lightspeed] {0.19.3}[6476]: starting ...
[2012-05-30 19:14:12,754][INFO ][transport ] [Lightspeed] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.1.249:9300]}
[2012-05-30 19:14:15,929][INFO ][cluster.service ] [Lightspeed] new_master [Lightspeed][mAuUZJOnQ_S1F5TK-bQ2Ug][inet[/192.168.1.249:9300]], reason: zen-disco-join (elected_as_master)
[2012-05-30 19:14:16,170][INFO ][discovery ] [Lightspeed] elasticsearch/mAuUZJOnQ_S1F5TK-bQ2Ug
[2012-05-30 19:14:16,279][INFO ][http ] [Lightspeed] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.1.249:9200]}
[2012-05-30 19:14:16,280][INFO ][node ] [Lightspeed] {0.19.3}[6476]: started
[2012-05-30 19:14:16,808][INFO ][gateway ] [Lightspeed] recovered [9] indices into cluster_state
[2012-05-30 19:14:19,697][INFO ][river.mongodb ] [Lightspeed] [mongodb][mongodb] starting mongodb stream: host [localhost], port [27017], gridfs [false], filter [testtweet], db [testtweet], indexing to [tweet]/[{}]
[2012-05-30 19:14:20,383][INFO ][river.mongodb ] [Lightspeed] [mongodb][aikin] starting mongodb stream: host [localhost], port [27017], gridfs [false], filter [aikin], db [tweetindex], indexing to [a_tweets]/[{}]
[2012-05-30 19:14:20,552][INFO ][river.mongodb ] [Lightspeed] [mongodb][mongodb] No known previous slurping time for this collection
[2012-05-30 19:14:20,553][INFO ][river.mongodb ] [Lightspeed] [mongodb][aikin] No known previous slurping time for this collection
[2012-05-30 19:14:20,804][WARN ][river.mongodb ] [Lightspeed] [mongodb][aikin] A mongoDB cursor bug ?
java.util.NoSuchElementException
    at java.util.LinkedList$ListItr.next(LinkedList.java:888)
    at com.mongodb.DBCursor._next(DBCursor.java:453)
    at com.mongodb.DBCursor.next(DBCursor.java:533)
    at org.elasticsearch.river.mongodb.MongoDBRiver$Slurper.processFullCollection(MongoDBRiver.java:378)
    at org.elasticsearch.river.mongodb.MongoDBRiver$Slurper.run(MongoDBRiver.java:353)
    at java.lang.Thread.run(Thread.java:722)
[2012-05-30 19:14:20,821][INFO ][river.mongodb ] [Lightspeed] [mongodb][aikin] No known previous slurping time for this collection
[2012-05-30 19:16:59,250][INFO ][cluster.metadata ] [Lightspeed] [testtweet] deleting index
[2012-05-30 19:18:26,873][INFO ][cluster.metadata ] [Lightspeed] [testtweet] creating index, cause [api], shards [5]/[1], mappings []
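For what it's worth, the _count response quoted above can be read mechanically: all five shards answered without failures, so the index exists and is searchable, it simply holds no documents. A minimal sketch of that reading in Python, using the response body from the post (the helper name summarize_count is made up for illustration):

```python
import json

# Response body quoted in the post above.
raw = '{"count":0,"_shards":{"total":5,"successful":5,"failed":0}}'


def summarize_count(body):
    """Parse a _count response and report the document count and shard health."""
    data = json.loads(body)
    shards = data["_shards"]
    return {
        "count": data["count"],
        # True only if every shard answered and none reported a failure.
        "all_shards_ok": shards["failed"] == 0
        and shards["successful"] == shards["total"],
    }


summary = summarize_count(raw)
print(summary)
```

A zero count with healthy shards points at the indexing side (here, the river) rather than at the search request itself.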
I have never used the MongoDB river, so I can't comment on the error.
Maybe you should post the code that initializes the river and someone
familiar can figure it out.
(In reply to the message from Serikozz serikozz@mail.ru of Wed, May 30, 2012 at 3:52 AM.)