Missing data sent to be indexed


(David Jensen-2) #1

I've tried this a few times now. I send 15 documents to be indexed.

When I check status on the servers, the num_docs is only incremented
by 5.

I'm running on 2 large EC2 instances with an S3 gateway. I have 10
shards and 2 replicas.

Is there some weirdness because I'm configured for 2 replicas but I
only have enough machines for 1 replica?
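
For reference, the shard arithmetic behind that question can be sketched as follows. This is only an illustration: it assumes the usual allocation rule that two copies of the same shard are never placed on the same node, and the function name shard_copy_counts is made up for this example.

```python
def shard_copy_counts(shards, replicas, nodes):
    """Return (assigned, unassigned) shard copies for one index.

    Assumption: copies of the same shard (primary or replica) never
    share a node, so at most `nodes` copies of each shard can be placed.
    """
    copies_per_shard = 1 + replicas            # one primary plus its replicas
    placeable = min(copies_per_shard, nodes)   # capped by the node count
    assigned = shards * placeable
    unassigned = shards * (copies_per_shard - placeable)
    return assigned, unassigned


# The setup described above: 10 shards, 2 replicas, 2 instances.
print(shard_copy_counts(10, 2, 2))  # (20, 10)
```

Under that assumption, the second replica of each shard simply stays unassigned until a third node joins. That alone should not make documents go missing, since every primary is still allocated.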


(David Jensen-2) #2

I've just now verified that adding another instance did >not< solve
the problem. So the question is, why isn't ES indexing my data OR what
the heck am I doing wrong?


(Berkay Mollamustafaoglu-2) #3

I take it there are no error messages in the logs? Maybe you can increase
the log level to see what it reports when you send the data.
If you can post the data that does not get indexed, as well as the index
mapping, that may give some clues about what the problem is. What version of
ES are you using?
Just to cover the basics: you're checking the number of docs after waiting
a couple of seconds, right?

Regards,
Berkay Mollamustafaoglu
mberkay on yahoo, google and skype


(David Jensen-2) #4

There is another thread with the exception; it has a title like "75k
missing docs".

After clearing out the index completely and loading a new clean
record, everything seems to be happy.


(Shay Banon) #5

Is there a chance that you are simply running into the "near real time"
aspect of elasticsearch? If not, can you execute a search request similar
to: curl -XGET 'http://host:9200/_search?q=*:*' and check the total_hits
(that's basically a search all query)?

-shay.banon
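
A minimal sketch of that check, for anyone who wants to script it: it runs the same search-all query and polls until the expected number of hits shows up or a timeout passes, which also covers the "near real time" delay. The helper names, the timeout values, and the host URL are assumptions for illustration; the response is assumed to carry the count under hits.total, either as a plain number or, in newer releases, as an object with a "value" field.

```python
import json
import time
import urllib.parse
import urllib.request


def search_all_url(base_url):
    """Build the search-all URL, i.e. /_search?q=*:* with the query URL-encoded."""
    return base_url + "/_search?" + urllib.parse.urlencode({"q": "*:*"})


def total_hits(response):
    """Extract the hit count from a parsed search response."""
    total = response["hits"]["total"]
    # Newer releases wrap the count in an object; older ones return a number.
    return total["value"] if isinstance(total, dict) else total


def fetch_total(base_url):
    """Run the search-all query against a live node and return the hit count."""
    with urllib.request.urlopen(search_all_url(base_url)) as resp:
        return total_hits(json.load(resp))


def wait_for_total(fetch, expected, timeout=10.0, interval=1.0):
    """Call fetch() until it reports at least `expected` hits or time runs out.

    Returns the last count seen, so the caller can tell a slow refresh
    (count eventually reaches `expected`) from documents that never arrive.
    """
    deadline = time.monotonic() + timeout
    while True:
        total = fetch()
        if total >= expected or time.monotonic() >= deadline:
            return total
        time.sleep(interval)
```

Usage would look something like wait_for_total(lambda: fetch_total("http://host:9200"), expected=15). If the count is still short after the timeout, the missing documents were most likely rejected rather than just not yet visible.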


(David Jensen-2) #6

It looked like ES was rejecting some of my records.

See this thread: http://groups.google.com/a/elasticsearch.com/group/users/browse_thread/thread/4d74faac7dab497e#

I think what I need to do is define the index before loading or at
least start with a known good document to index. Once I cleared the
index and started from scratch with a good document, the exceptions
stopped.
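
A rough sketch of what "define the index before loading" could look like: it builds a PUT request that creates the index with explicit settings and field mappings, so field types are fixed up front instead of being inferred from whichever document arrives first. The index name, field names, and helper name are hypothetical, and the body layout follows the typeless format of recent releases; older versions nested mappings under a type name, so check the documentation for the version in use.

```python
import json
import urllib.request


def create_index_request(base_url, index, properties, shards=10, replicas=1):
    """Build (but do not send) a PUT request that creates an index.

    `properties` maps field names to mapping definitions,
    e.g. {"title": {"type": "text"}}.
    """
    body = {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        },
        "mappings": {"properties": properties},
    }
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical index and fields, purely for illustration.
request = create_index_request(
    "http://host:9200",
    "docs",
    {"title": {"type": "text"}, "created": {"type": "date"}},
)
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing the request to urllib.request.urlopen before the first document is indexed.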

