Max documents 10,500?


(Blake McBride) #1

Greetings,

I have two similar but unrelated machines. I am adding 50,000+ documents
to each. Afterwards, one shows the 50,000+ documents and the other only
shows 10,500. The second machine seems to be capping out at 10,500. Why,
and how can I correct this? The relevant facts are as follows:

  1. Both machines are current 64-bit Linux machines with at least 8GB of
    RAM and more than sufficient disk space.

  2. Both have current 64-bit Java 7 and Elasticsearch 1.5.1. ES is running
    local to each machine.

  3. Both machines are running the exact same program to load up ES. Each
    has nearly default ES config files (just different names).

  4. The program keeps a counter of the number of times documents are added
    to ES, and the return code of each add is checked. On both machines the
    count is 50,000+.

  5. When I run the same query on each machine with curl, the good machine
    shows a max_score of 8.2 and the bad machine shows 0.499. Remember, same
    set of documents and same search query.

I've spent a day on this, and I am running out of ideas. Any help would
sure be appreciated.

Blake McBride

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/c44faf88-1b67-4843-a6ed-52d31c055716%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(David Pilato) #2

If there is nothing in the logs, it could mean that you have an issue with your injector.
Maybe you are using bulk but not checking the bulk response?

David
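
For readers who do load documents through the bulk API, here is a minimal sketch of such a check. It assumes the bulk response shape of a top-level `items` array in which each entry has one key naming the action and a `status` for that item. The function name `failedBulkItems` and the sample response are hypothetical and hand-written for illustration.

```javascript
// Sketch: collect the items of a bulk response that did not succeed.
// Assumed response shape:
// { errors: <boolean>, items: [ { <action>: { _id, status, error } }, ... ] }
function failedBulkItems(bulkResponse) {
    var failures = [];
    (bulkResponse.items || []).forEach(function (item) {
        // Each item has a single key naming the action (index, create, ...).
        var action = Object.keys(item)[0];
        var result = item[action];
        if (result.status >= 300 || result.error !== undefined) {
            failures.push({
                action: action,
                id: result._id,
                status: result.status,
                error: result.error
            });
        }
    });
    return failures;
}

// Hypothetical response, hand-written for illustration only.
var sampleBulkResponse = {
    errors: true,
    items: [
        { create: { _id: '1', status: 201 } },
        { create: { _id: '2', status: 409, error: 'DocumentAlreadyExistsException' } }
    ]
};
console.log(failedBulkItems(sampleBulkResponse));
```

A bulk request can return HTTP 200 overall while individual items fail, which is why the per-item statuses are inspected rather than only the request status.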



(Blake McBride) #3

The log only contains:

[2015-05-01 18:22:10,398][INFO ][cluster.metadata ] [mmsapp-na-component] [components-1430504530354] creating index, cause [api], templates [], shards [5]/[1], mappings [index_name, component]

Each document is being added individually from JavaScript via:

exports.addDoc = function (index, type, id, doc, callback) {
    if (client !== undefined) {
        var json = {
            index: index,
            type: type,
            id: id,
            body: doc
        };
        client.create(json, callback);
    } else if (callback !== undefined) {
        callback('elastic search not connected', undefined);
    }
};



(David Pilato) #4

Could you add a counter in your JS app to make sure you sent all docs?

I suspect something is wrong in your indexing process.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs



(Blake McBride) #5

I changed the code to read:

var counter = 0;

exports.addDoc = function (index, type, id, doc, callback) {
    if (client !== undefined) {
        var json = {
            index: index,
            type: type,
            id: id,
            body: doc
        };
        if (counter++ % 1000 === 0) {
            console.log('Adding document #' + counter);
        }
        client.create(json, callback);
    } else if (callback !== undefined) {
        callback('elastic search not connected', undefined);
    }
};

The last printout reads: Adding document #53001

The code that does the error check looks like:

esUtils.addDoc(esIndex, 'component', compObj.id, doc, function (err, response, status) {
    if (err !== undefined || status !== 201 || response.created !== true) {
        console.log('Unexpected ES response: ' + status + ' ' + err + response.created);
    }
});

I never see that message. Finally, after the above I get:

$ curl -s -XPOST 'http://localhost:9200/components/_count'
{"count":10500,"_shards":{"total":5,"successful":5,"failed":0}}

Thanks for the help!
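
The counter above counts calls to addDoc, not completed callbacks. A minimal sketch of tracking both follows, so that requests whose callback never fires become visible. The names `makeTracker`, `track`, `pending`, and `fakeSend` are hypothetical and not part of any library. The success test (no error and status 201) mirrors the error check shown above.

```javascript
// Sketch: count requests sent versus callbacks received.
function makeTracker() {
    var stats = { sent: 0, succeeded: 0, failed: 0 };
    return {
        stats: stats,
        // Wrap a callback so that each completion is counted.
        track: function (callback) {
            stats.sent += 1;
            return function (err, response, status) {
                if (err || status !== 201) {
                    stats.failed += 1;
                } else {
                    stats.succeeded += 1;
                }
                if (callback !== undefined) {
                    callback(err, response, status);
                }
            };
        },
        // Number of requests whose callback has not fired yet.
        pending: function () {
            return stats.sent - stats.succeeded - stats.failed;
        }
    };
}

// Usage with a stand-in sender that acknowledges immediately.
var tracker = makeTracker();
function fakeSend(doc, callback) {
    callback(undefined, { created: true }, 201);
}
fakeSend({ a: 1 }, tracker.track());
fakeSend({ a: 2 }, tracker.track());
tracker.track(); // simulates a request whose callback never fires
console.log(tracker.stats, 'pending:', tracker.pending());
```

If `pending()` is still above zero when the loader exits, some requests were issued but never acknowledged.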



(David Pilato) #6

Any chance you are using the same id multiple times?

--
David Pilato - Developer | Evangelist
elastic.co
@dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs
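
One way to rule this out before indexing is to scan the list of ids for repeats. The sketch below assumes the ids are available as an array up front. `findDuplicateIds` is a hypothetical helper name and the sample ids are made up.

```javascript
// Sketch: report ids that occur more than once in the list about to be indexed.
function findDuplicateIds(ids) {
    var seen = Object.create(null); // no inherited keys such as 'constructor'
    var duplicates = [];
    ids.forEach(function (id) {
        var key = String(id);
        if (seen[key] === 1) {
            duplicates.push(key); // record each duplicated id only once
        }
        seen[key] = (seen[key] || 0) + 1;
    });
    return duplicates;
}

// Made-up ids for illustration.
console.log(findDuplicateIds(['a', 'b', 'a', 'c', 'b', 'a']));
```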



(Blake McBride) #7

No, for two reasons:

  1. I am using the exact same code and data on both machines.

  2. I've seen duplicates in the past and I get an error message.

Thanks.

Blake



(David Pilato) #8

Could you compare the disk size (/data dir) of your two elasticsearch instances?
Also, could you post a gist with the result of a simple _search?pretty on both nodes?

--
David Pilato - Developer | Evangelist
elastic.co
@dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs
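
To compare the two _search results side by side, a small helper can reduce each response to the fields that matter here. This is a sketch that assumes the Elasticsearch 1.x response shape, where `hits.total` is a plain number. `summarizeSearch` and `diffSummaries` are hypothetical names, and the two sample responses are hand-written placeholders, not real output from either machine.

```javascript
// Sketch: extract the fields of a _search response worth comparing.
function summarizeSearch(response) {
    return {
        total: response.hits.total,
        maxScore: response.hits.max_score,
        shardsTotal: response._shards.total,
        shardsSuccessful: response._shards.successful,
        shardsFailed: response._shards.failed
    };
}

// Return the names of the summary fields that differ between two responses.
function diffSummaries(a, b) {
    var left = summarizeSearch(a);
    var right = summarizeSearch(b);
    return Object.keys(left).filter(function (key) {
        return left[key] !== right[key];
    });
}

// Placeholder responses, hand-written for illustration only.
var goodNode = {
    _shards: { total: 5, successful: 5, failed: 0 },
    hits: { total: 50000, max_score: 1.0, hits: [] }
};
var badNode = {
    _shards: { total: 5, successful: 5, failed: 0 },
    hits: { total: 10500, max_score: 1.0, hits: [] }
};
console.log(diffSummaries(goodNode, badNode));
```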

Le 1 mai 2015 à 21:58, Blake McBride blake1024@gmail.com a écrit :

No, for two reasons:

  1. I am using the exact same code and data on both machines.

  2. I've seen duplicates in the past and I get an error message.

Thanks.

Blake

On Friday, May 1, 2015 at 2:50:57 PM UTC-5, David Pilato wrote:
Any chance you are using the same id multiple times?

--
David Pilato - Developer | Evangelist
elastic.co http://elastic.co/
@dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs

Le 1 mai 2015 à 21:25, Blake McBride <blak...@gmail.com <javascript:>> a écrit :

I changed the code to read:

var counter = 0;

exports.addDoc = function (index, type, id, doc, callback) {
if (client !== undefined) {
var json = {
index: index,
type: type,
id: id,
body: doc
};
if (counter++ % 1000 === 0) {
console.log('Adding document #' + counter);
}
client.create(json, callback);
} else if (callback !== undefined) {
callback('elastic search not connected', undefined);
}
};

The last printout reads: Adding document #53001

The code that does the error check looks like:

esUtils.addDoc(esIndex, 'component', compObj.id, doc, function (err, response, status) {
if (err !== undefined || status !== 201 || response.created !== true) {
console.log('Unexpected ES response: ' + status + ' ' + err + response.created);
}
});

I never see that message. Finally, after the above I get:

$ curl -s -XPOST 'http://localhost:9200/components/_count' http://localhost:9200/components/_count'
{"count":10500,"_shards":{"total":5,"successful":5,"failed":0}}

Thanks for the help!

On Friday, May 1, 2015 at 1:57:42 PM UTC-5, David Pilato wrote:
Could you add a counter in your JS app to make sure you sent all docs?

I suspect something wrong in your index process

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Le 1 mai 2015 à 20:40, Blake McBride <blak...@gmail.com <>> a écrit :

The log only contains:

[2015-05-01 18:22:10,398][INFO ][cluster.metadata ] [mmsapp-na-component] [components-1430504530354] creating index, cause [api], templates [], shards [5]/[1], mappings [index_name, component]

Each document is being added individually from JavaScript via:

exports.addDoc = function (index, type, id, doc, callback) {
if (client !== undefined) {
var json = {
index: index,
type: type,
id: id,
body: doc
};
client.create(json, callback);
} else if (callback !== undefined) {
callback('elastic search not connected', undefined);
}
};

On Friday, May 1, 2015 at 11:42:12 AM UTC-5, David Pilato wrote:
If you have nothing in logs it could mean that you have an issue with your injector.
May be you are using bulk but you don't check the bulk response?

David

Le 1 mai 2015 à 18:36, Blake McBride <blak...@gmail.com <>> a écrit :

Greetings,

I have two similar but unrelated machines. I am adding 50,000+ documents to each. Afterwards, one shows the 50,000+ documents and the other only shows 10,500. The second machine seems to be capping out at 10,500. Why, and how can I correct this? The relevant facts are as follows:

  1. Both machines are current 64 bit Linux machines with at least 8GB of RAM and more than sufficient disk space.

  2. Both have current 64 bit Java 7 and elasticsearch 1.5.1. ES is running local to each machine.

  3. Both machines are running the exact same program to load up ES. Each has nearly default ES config files (just different names).

  4. The program keeps a counter of the number of times documents are added to ES, and the return codes of each add is checked. Both are 50,000+.

  5. When I do a the same query on each machine with curl, the good machine shows a max_score of 8.2, the bad machine shows .499 - remember, same set of documents and same search query.

I've spent a day on this, and I am running out of ideas. Any help would sure be appreciated.

Blake McBride


(Blake McBride) #9

The question of relative size has, I believe, led me to the problem.

I use aliases. After each load, I repoint the alias to the newly loaded
index. One of the indexes had a bad document that made the alias-change
mechanism fail. As a result, I kept loading documents into a new index
while the alias still pointed to an old one. On the working system the
database was relatively small, since I had gotten rid of the old indexes.
The bad machine was taking up a lot of space because the old indexes were
never deleted, since the alias didn't point to them.
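
For readers who hit the same symptom, here is a minimal sketch of an alias switch that fails loudly instead of silently leaving the alias on an old index. It is not the poster's actual code. It assumes `client` is a legacy `elasticsearch` npm client (or any object exposing `indices.getAlias` and `indices.updateAliases` with node-style callbacks); the function name `swapAlias` and the index names in the usage note are hypothetical.

```javascript
// Sketch only: `client` is assumed to expose indices.getAlias and
// indices.updateAliases with node-style callbacks. `swapAlias` is a
// hypothetical helper name, not part of any library.
function swapAlias(client, alias, newIndex, callback) {
    // Look up which indices the alias currently points to.
    client.indices.getAlias({ name: alias }, function (err, current) {
        // A 404 is treated as "alias does not exist yet"; anything else is fatal.
        if (err && err.status !== 404) {
            return callback(err);
        }
        var oldIndices = Object.keys((!err && current) || {}).filter(function (name) {
            return name !== newIndex;
        });
        // Remove the alias from the old indices and add it to the new one
        // in a single request, so readers never see a half-switched alias.
        var actions = oldIndices.map(function (name) {
            return { remove: { index: name, alias: alias } };
        });
        actions.push({ add: { index: newIndex, alias: alias } });

        client.indices.updateAliases({ body: { actions: actions } }, function (err2, response) {
            if (err2 || !response || response.acknowledged !== true) {
                // Report the failure: otherwise the alias quietly stays on an old index.
                return callback(err2 || new Error('alias update for ' + alias + ' not acknowledged'));
            }
            // Hand back the indices that no longer carry the alias.
            callback(undefined, oldIndices);
        });
    });
}
```

A caller might run `swapAlias(client, 'components', newIndexName, cb)` after the load and only delete the returned old indices when `cb` receives no error.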

Thanks a lot for the help!!

Blake

On Friday, May 1, 2015 at 3:39:05 PM UTC-5, David Pilato wrote:

Could you compare disk size (/data dir) for your two elasticsearch
instances?
Also, could you GIST the result of a simple _search?pretty on both nodes?

--
David Pilato - Developer | Evangelist
elastic.co (http://elastic.co)
@dadoonet (https://twitter.com/dadoonet) | @elasticsearchfr (https://twitter.com/elasticsearchfr) | @scrutmydocs (https://twitter.com/scrutmydocs)

On May 1, 2015, at 21:58, Blake McBride <blak...@gmail.com> wrote:

No, for two reasons:

  1. I am using the exact same code and data on both machines.

  2. When I've had duplicate ids in the past, I got an error message.

Thanks.

Blake

On Friday, May 1, 2015 at 2:50:57 PM UTC-5, David Pilato wrote:

Any chance you are using the same id multiple times?

--
David Pilato - Developer | Evangelist
elastic.co (http://elastic.co/)
@dadoonet (https://twitter.com/dadoonet) | @elasticsearchfr (https://twitter.com/elasticsearchfr) | @scrutmydocs (https://twitter.com/scrutmydocs)

On May 1, 2015, at 21:25, Blake McBride blak...@gmail.com wrote:

I changed the code to read:

var counter = 0;

exports.addDoc = function (index, type, id, doc, callback) {
    if (client !== undefined) {
        var json = {
            index: index,
            type: type,
            id: id,
            body: doc
        };
        if (counter++ % 1000 === 0) {
            console.log('Adding document #' + counter);
        }
        client.create(json, callback);
    } else if (callback !== undefined) {
        callback('elastic search not connected', undefined);
    }
};

The last printout reads: Adding document #53001

The code that does the error check looks like:

esUtils.addDoc(esIndex, 'component', compObj.id, doc, function (err, response, status) {
    if (err !== undefined || status !== 201 || response.created !== true) {
        console.log('Unexpected ES response: ' + status + ' ' + err + response.created);
    }
});

I never see that message. Finally, after the above I get:

$ curl -s -XPOST 'http://localhost:9200/components/_count'
{"count":10500,"_shards":{"total":5,"successful":5,"failed":0}}
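
Note that this count is taken against the name `components`, while the log excerpt quoted in this thread shows documents going into a timestamped index (`components-1430504530354`). A minimal sketch of a check that lists the concrete indices behind a name and counts each one separately follows. It assumes `client` exposes `indices.getAlias` and `count` with node-style callbacks, as the legacy `elasticsearch` npm client does; `countBehindAlias` is a hypothetical helper name.

```javascript
// Sketch only: resolves an alias to its concrete indices and counts the
// documents in each. `countBehindAlias` is a hypothetical helper name.
function countBehindAlias(client, alias, callback) {
    client.indices.getAlias({ name: alias }, function (err, response) {
        if (err) {
            return callback(err);
        }
        // The response is keyed by concrete index name.
        var indices = Object.keys(response);
        var counts = {};
        var pending = indices.length;
        var failed = false;
        if (pending === 0) {
            return callback(undefined, counts);
        }
        indices.forEach(function (index) {
            client.count({ index: index }, function (countErr, result) {
                if (failed) {
                    return; // an earlier count already reported an error
                }
                if (countErr) {
                    failed = true;
                    return callback(countErr);
                }
                counts[index] = result.count;
                pending -= 1;
                if (pending === 0) {
                    callback(undefined, counts);
                }
            });
        });
    });
}
```

If the index name that the loader writes to does not appear among the keys of the result, the count is being read from a different index than the one being loaded.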

Thanks for the help!

On Friday, May 1, 2015 at 1:57:42 PM UTC-5, David Pilato wrote:

Could you add a counter in your JS app to make sure you sent all docs?

I suspect something is wrong in your indexing process.
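
One way to act on this suggestion is to count confirmed creations rather than only calls made. The sketch below wraps an `addDoc`-style function and tallies sent, created, and failed documents. It assumes `addDoc` has the signature `(index, type, id, doc, callback)` with a `(err, response, status)` callback; `makeCountingAdder` and the `tally` object are hypothetical names.

```javascript
// Sketch only: wraps an addDoc-style function so that documents are counted
// when Elasticsearch confirms them, not merely when the request is issued.
function makeCountingAdder(addDoc) {
    var tally = { sent: 0, created: 0, failed: 0 };

    function add(index, type, id, doc, done) {
        tally.sent += 1;
        addDoc(index, type, id, doc, function (err, response, status) {
            // Treat anything other than a confirmed 201/created as a failure.
            if (!err && status === 201 && response && response.created === true) {
                tally.created += 1;
            } else {
                tally.failed += 1;
            }
            if (done) {
                done(err, response, status);
            }
        });
    }

    return { add: add, tally: tally };
}
```

Once every callback has fired, `tally.sent` should equal `tally.created + tally.failed`, and `tally.created` can be compared with the `_count` of the index that was actually written to.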

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On May 1, 2015, at 20:40, Blake McBride blak...@gmail.com wrote:

The log only contains:

[2015-05-01 18:22:10,398][INFO ][cluster.metadata ] [mmsapp-na-component] [components-1430504530354] creating index, cause [api], templates [], shards [5]/[1], mappings [index_name, component]

Each document is being added individually from JavaScript via:

exports.addDoc = function (index, type, id, doc, callback) {
    if (client !== undefined) {
        var json = {
            index: index,
            type: type,
            id: id,
            body: doc
        };
        client.create(json, callback);
    } else if (callback !== undefined) {
        callback('elastic search not connected', undefined);
    }
};


(system) #10