I have a problem finding documents by Hebrew words.
When I create the query, I encode the query string as UTF-8:

    QueryStringQueryBuilder textQueryBuilder =
        new QueryStringQueryBuilder(new String(query.getBytes(), "UTF-8"));

The search finds the documents that contain the Hebrew word, but it also
returns other documents that contain unrelated English words.
For example, I search for שלום, but Elasticsearch also finds documents
that contain Olga.
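A side note on the encoding step above: a Java String is already Unicode, so new String(query.getBytes(), "UTF-8") either changes nothing (when the platform default charset is UTF-8) or damages the text (when it is something else). The following is only a minimal sketch of that effect. The class and method names are hypothetical, and the platform default charset is passed in explicitly so that the behaviour is reproducible.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it is not part of the original post.
final class EncodingRoundTrip {

    // Mimics new String(query.getBytes(), "UTF-8"), with the platform
    // default charset made explicit instead of being picked up implicitly.
    static String roundTrip(String query, Charset platformDefault) {
        byte[] bytes = query.getBytes(platformDefault);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The Hebrew word from the post, written with Unicode escapes so the
        // source file compiles regardless of its own encoding.
        String query = "\u05E9\u05DC\u05D5\u05DD";

        // Default charset UTF-8: the round trip is a no-op.
        System.out.println(query.equals(roundTrip(query, StandardCharsets.UTF_8)));

        // Default charset ISO-8859-1: Hebrew letters cannot be encoded and
        // are replaced by '?', so the original word is lost.
        System.out.println(roundTrip(query, StandardCharsets.ISO_8859_1));
    }
}
```

If this reading is right, passing the String to the query builder unchanged is the safer choice, and the encoding step can simply be dropped.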
It appears that we had indexed HTML-encoded data, and apparently that was
the problem. However, after we removed the encoding, we can't find any
Hebrew words through the Java API, although the following curl command
does find them:

    curl -XGET http://localhost:9200/default/conversation/_search -d '{"query" : "שלום"}'

The Java sample code:

    String queryString = "שלום";
    SearchRequestBuilder searchRequestBuilder = client.prepareSearch(tenantName)
        .setTypes(type)
        .setSearchType(SearchType.QUERY_THEN_FETCH)
        .setQuery(queryString);
    SearchResponse response = searchRequestBuilder.execute().actionGet();

What could be the problem?
Thanks,
Olga
On Oct 26, 4:09 pm, OlgaT tubm...@gmail.com wrote:
Hi,
I have problem to find the documents by Hebrew words.
When I create query I encode the query to UTF-8:
QueryStringQueryBuilder textQueryBuilder = new
QueryStringQueryBuilder(new String(query.getBytes(),"UTF-8"));
It finds the documents that contains the Hebrew word, but also finds
other documents that contain other English word.
For example I search for שלום , but elasticsearch finds also documents
that contain Olga.
You need to wrap the query string you pass in a QueryBuilders.queryString
construct. When you pass a plain string to the setQuery method, it is
expected to be JSON. Check the failed shards on the SearchResponse you
get back; you will see that all of them failed because the query could not
be parsed.
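To make the reply above concrete: with the Java API the fix would presumably be setQuery(QueryBuilders.queryString(queryString)) instead of setQuery(queryString). The sketch below needs no Elasticsearch dependency. It only shows what a raw string passed to setQuery would have to look like to be parsed as JSON, assuming the usual query_string query shape. The class and its methods are hypothetical helpers, not part of the Elasticsearch API.

```java
// Hypothetical helper; it is not part of the Elasticsearch API.
final class QueryStringJson {

    // Escapes the characters that must not appear raw inside a JSON string:
    // double quotes, backslashes, and control characters.
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Wraps free-text user input in a query_string query body.
    static String toQueryStringJson(String userQuery) {
        return "{\"query_string\":{\"query\":\"" + escapeJson(userQuery) + "\"}}";
    }

    public static void main(String[] args) {
        // The Hebrew word from the thread, written with Unicode escapes so the
        // source file compiles regardless of its own encoding.
        String queryString = "\u05E9\u05DC\u05D5\u05DD";

        // A bare word is not valid JSON, which is why the raw string in the
        // question failed to parse; this JSON form is what the parser expects.
        System.out.println(toQueryStringJson(queryString));
    }
}
```

In practice the builder from the reply is the better choice, since it handles the escaping itself. The hand-built JSON is shown only to explain why the bare word was rejected.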