Hi all,
I am trying to index a text file with the attachment plugin, using the transport client in Java. I have a 4-node cluster.
If I try it through the REST API, curl fails with an "argument too large" type of error and hangs, and from the transport client I get a Java heap size error. I am not sure where I am going wrong.
How big is the text file and how much memory did you give to elasticsearch?
How are you trying to index it? Can you post your curl command (without the
content of the file obviously)?
On Tuesday, January 15, 2013 8:21:16 AM UTC-5, Sanjay wrote:
Hi all,
I am trying to index a text file with the attachment plugin, using the transport client in Java. I have a 4-node cluster.
If I try it through the REST API, curl fails with an "argument too large" type of error and hangs, and from the transport client I get a Java heap size error. I am not sure where I am going wrong.
Hi,
Thanks for the reply.
I have given Elasticsearch 1.5 GB of memory, and the file is about 240 MB. When I try to index it with the curl command below, I get an error saying the argument is too large, and when I try to index it with the Java transport client I get a Java heap size error.
You'll either have to break the document down into smaller documents, or get a lot more memory.
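As a rough illustration of the "smaller documents" option, here is a minimal sketch that streams a text source line by line and hands out chunks of bounded size, so the whole 240 MB file never has to sit in memory at once. The class name `TextChunker`, the method `forEachChunk`, and the chunk size are hypothetical, and the sketch does not call any Elasticsearch API; each chunk would be indexed as its own document by whatever client code you already have.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper for illustration; not part of any Elasticsearch API.
class TextChunker {

    /**
     * Reads the source line by line and passes chunks of roughly maxChars
     * characters to the sink. Lines are never split, so a single line longer
     * than maxChars ends up as one oversized chunk.
     * Returns the number of chunks emitted.
     */
    static int forEachChunk(BufferedReader reader, int maxChars, Consumer<String> sink) {
        StringBuilder current = new StringBuilder();
        int count = 0;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                // Flush the current chunk if this line would push it over the limit.
                if (current.length() > 0 && current.length() + line.length() + 1 > maxChars) {
                    sink.accept(current.toString());
                    count++;
                    current.setLength(0);
                }
                current.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Emit whatever is left over as the final chunk.
        if (current.length() > 0) {
            sink.accept(current.toString());
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> chunks = new ArrayList<>();
        BufferedReader reader =
                new BufferedReader(new StringReader("line one\nline two\nline three\n"));
        int n = forEachChunk(reader, 12, chunks::add);
        System.out.println(n + " chunks");
    }
}
```

For a real file you would pass a `BufferedReader` opened on the file and pick a chunk size in the low megabytes rather than 12 characters. The sink is where each chunk would be handed to the indexing call.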
Hi Clinton,
If I break the document down into smaller documents, how will they be indexed? I mean, will each one be indexed as a separate document?
One more question: is there any other way to speed up indexing? How can I calculate the indexing time? Which is the more efficient way to index documents, curl (REST API) or the Java transport client? Does a curl command work in a distributed way or against a single node? How can I index documents in a distributed manner?
Hi Clinton,
If I break the document down into smaller documents, how will they be indexed? I mean, will each one be indexed as a separate document?
Yes.
One more question: is there any other way to speed up indexing?
I'd normally suggest using bulk indexing, but your documents are already huge, so processing several documents at once will probably just result in more OOMs.
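If the file is first split into small documents, bulk indexing becomes workable again as long as each bulk request stays small. The sketch below only shows the grouping step: it packs documents into batches under a byte budget. `BulkBatcher`, `batchBySize`, and the budget are hypothetical names and values, and sending each batch through the bulk API is left to your client code.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; it groups documents but does not talk to Elasticsearch.
class BulkBatcher {

    /**
     * Packs documents into batches whose combined UTF-8 size stays at or
     * under maxBytes. A document larger than maxBytes gets a batch to itself.
     */
    static List<List<String>> batchBySize(List<String> docs, long maxBytes) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        long currentBytes = 0;
        for (String doc : docs) {
            long size = doc.getBytes(StandardCharsets.UTF_8).length;
            // Start a new batch when this document would exceed the budget.
            if (!current.isEmpty() && currentBytes + size > maxBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(doc);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("aaaa", "bbbb", "cc", "dddddd");
        System.out.println(batchBySize(docs, 8).size() + " batches");
    }
}
```

Capping by bytes rather than by document count keeps the memory used per request predictable even when chunk sizes vary.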
How can I calculate the indexing time?
By trial and error.
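One simple way to do that trial and error is to wrap the indexing call in a timer and compare throughput between runs. This is a minimal sketch with hypothetical names (`IndexTimer`, `timeNanos`, `docsPerSecond`). The `Runnable` stands in for whatever indexing code you are measuring, and the sleep in `main` is only a placeholder for it.

```java
// Hypothetical helper for illustration; the Runnable is a stand-in for your indexing call.
class IndexTimer {

    /** Runs the task once and returns the elapsed wall-clock time in nanoseconds. */
    static long timeNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    /** Converts a document count and elapsed time into documents per second. */
    static double docsPerSecond(int docCount, long elapsedNanos) {
        return docCount / (elapsedNanos / 1_000_000_000.0);
    }

    public static void main(String[] args) {
        int docCount = 500; // assumed batch size for the experiment
        long elapsed = timeNanos(() -> {
            // Placeholder workload; replace with the indexing code under test.
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.printf("%.1f docs/s%n", docsPerSecond(docCount, elapsed));
    }
}
```

Running the same batch a few times with different chunk or batch sizes gives a rough picture of what your cluster can sustain.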
Which is the more efficient way to index documents, curl (REST API) or the Java transport client?
The Java client may be slightly faster than the REST API.
Does a curl command work in a distributed way or against a single node? How can I index documents in a distributed manner?
ES does this out of the box. You can speak to any node in ES and it will forward the request to the appropriate node (assuming you are running more than one node).
But really, you need more memory and probably more powerful boxes. You
can't expect something with the power of a calculator to perform well.
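To get a feel for why 1.5 GB of heap is tight for a 240 MB file, it helps to estimate the request size. Assuming the file is sent base64-encoded, as the attachment plugin is generally used, the payload grows by a factor of 4/3 before any JSON parsing or text extraction buffers are allocated. The class and method names below are hypothetical, and the arithmetic is only a lower bound on memory use.

```java
import java.util.Base64;

// Hypothetical helper for illustration; it only does size arithmetic.
class Base64Size {

    /** Length in characters of the padded base64 encoding of rawBytes bytes. */
    static long encodedLength(long rawBytes) {
        return 4 * ((rawBytes + 2) / 3);
    }

    public static void main(String[] args) {
        long raw = 240L * 1024 * 1024; // the 240 MB file from this thread
        long encoded = encodedLength(raw);
        System.out.println(encoded / (1024 * 1024) + " MB after base64"); // prints "320 MB after base64"

        // Sanity check of the formula against the JDK encoder on a small input.
        int actual = Base64.getEncoder().encodeToString(new byte[10]).length();
        System.out.println(actual == encodedLength(10));
    }
}
```

Several copies of a payload that size can be alive at once on both the client and the receiving node, which is consistent with the heap errors described above.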
Thanks Clinton,
I am trying to use the Java client to index documents.
Can you please suggest how I should do that? I mean the sequence of mapping, index creation, and document indexing. Please give me a hint for a generic mapping for any type of document (.pdf, .txt, .doc, etc.) using the Java API.
Thanks in advance.
I suggest you start by reading the documentation. Come back when you
have a specific problem that you are struggling with.