Index and search PDF files with the Elasticsearch PHP client


(Tanguy Bernard) #1

Hello,
I recently found some very helpful information here:
https://gist.github.com/lukas-vlcek/1075067

I would like to reproduce the same indexing and searching with the PHP
Elasticsearch client.
My indexing seems to work!

<?php
require_once 'vendor/autoload.php';

$client = new Elasticsearch\Client();

$doc_src = "fn6742.pdf";
$binary = fread(fopen($doc_src, "r"), filesize($doc_src));
$doc_str = base64_encode($binary);

$article = array();
$article['index'] = 'index2';
$article['type'] = 'attachment';
$article['body'] = array('file' => $doc_str);

$result = $client->index($article);
?>
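
For anyone reusing the indexing snippet, here is a slightly more defensive sketch of the same step. `buildAttachmentParams` is a hypothetical helper name, not part of the client; it only builds the parameter array, so the `$client->index()` call itself stays as in the post. It assumes, as the post does, that the `file` field should receive the document as a base64 string.

```php
<?php
// Sketch only: buildAttachmentParams() is a hypothetical helper, not a client method.

/**
 * Read a file and return the parameter array expected by $client->index().
 */
function buildAttachmentParams($path, $index = 'index2', $type = 'attachment')
{
    // Fail loudly instead of silently indexing an empty string.
    if (!is_readable($path)) {
        throw new RuntimeException("Cannot read file: $path");
    }

    // file_get_contents() is binary-safe and does not leave a file handle open.
    $binary = file_get_contents($path);

    return array(
        'index' => $index,
        'type'  => $type,
        // The attachment field receives the document as a base64 string.
        'body'  => array('file' => base64_encode($binary)),
    );
}

// Usage (requires a configured Elasticsearch\Client in $client):
// $result = $client->index(buildAttachmentParams('fn6742.pdf'));
```

Checking the file up front makes a wrong path show up as an exception rather than as a document with an empty `file` field.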

But my search does not work. I would like to find the sentence that
contains my word.
I tried this:

$params2['body']['query']['match']['file'] = 'my word';
$results = $client->search($params2);
print_r($results);

And I would like to get something like this: "file" : [ " It's my word
/ You can't use my word / because " ]

I hope you can help me.

Thanks in advance

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/9eb30ab1-119d-4cb6-a502-5402986f5cfa%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Tanguy Bernard) #2

I found the answer:

$params2 =array();

$params2['body']['query']['text']['file'] = 'my words';
$params2['body']['highlight']['fields']['file'] = array("term_vector" => "with_positions_offsets");
$results = $client->search($params2);
print_r($results);
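
A note on `term_vector`: it is normally a mapping setting on the field rather than a highlight option. Below is a minimal sketch of a mapping along the lines of the linked gist, assuming the mapper-attachments plugin is installed (it provides the `attachment` type). `buildAttachmentMapping` is a hypothetical helper name, and the index and type names are taken from the indexing code above.

```php
<?php
// Sketch only: buildAttachmentMapping() is a hypothetical helper.
// Assumes the mapper-attachments plugin, which provides the "attachment" type.

function buildAttachmentMapping($index = 'index2', $type = 'attachment')
{
    return array(
        'index' => $index,
        'type'  => $type,
        'body'  => array(
            $type => array(
                'properties' => array(
                    'file' => array(
                        'type'   => 'attachment',
                        'fields' => array(
                            // Store the extracted text with term vectors
                            // so that it can be highlighted at search time.
                            'file' => array(
                                'store'       => 'yes',
                                'term_vector' => 'with_positions_offsets',
                            ),
                        ),
                    ),
                ),
            ),
        ),
    );
}

// Usage, before indexing any document (requires a client in $client):
// $client->indices()->putMapping(buildAttachmentMapping());
```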

On Tuesday, 8 April 2014 10:22:21 UTC+2, Tanguy Bernard wrote:

Hello,
I recently found some very helpful information here:
https://gist.github.com/lukas-vlcek/1075067



(Rajesh Jai) #3

<?php
require_once 'vendor/autoload.php';

$client = new Elasticsearch\Client();

$doc_src = "path/file.pdf";
$binary = fread(fopen($doc_src, "r"), filesize($doc_src));
$doc_str = base64_encode($binary);

$article = array();
$article['index'] = 'index2';
$article['type'] = 'attachment';
$article['body'] = array('file' => $doc_str);

$result = $client->index($article);
?>

I used this code to index attached PDF files, and indexing completes without
error. But in the indexed document, the file field is empty. Please see this image:

https://lh3.googleusercontent.com/-MDX53ZTDKMs/U6g0TZxx17I/AAAAAAAAANI/-XRdJeQO9rU/s1600/Screenshot%2Bfrom%2B2014-06-23%2B19%3A33%3A34.png

Searching with this code raises an error:
$params2 =array();

$params2['body']['query']['text']['file'] = 'my words';
$params2['body']['highlight']['fields']['file'] = array("term_vector" => "with_positions_offsets");
$results = $client->search($params2);
print_r($results);

The error is:

PHP Fatal error: Uncaught exception 'Guzzle\Http\Exception\ClientErrorResponseException' with message 'Client error response
[status code] 400
[reason phrase] Bad Request
[url] http://localhost:9200/_search' in /var/www/html/magento/vendor/guzzle/guzzle/src/Guzzle/Http/Exception/BadResponseException.php:43
Stack trace:
#0 /var/www/html/magento/vendor/guzzle/guzzle/src/Guzzle/Http/Message/Request.php(145): Guzzle\Http\Exception\BadResponseException::factory(Object(Guzzle\Http\Message\EntityEnclosingRequest), Object(Guzzle\Http\Message\Response))
#1 [internal function]: Guzzle\Http\Message\Request::onRequestError(Object(Guzzle\Common\Event), 'request.error', Object(Symfony\Component\EventDispatcher\EventDispatcher))
#2 /var/www/html/magento/vendor/symfony/event-dispatcher/Symfony/Component/EventDispatcher/EventDispatcher.php(164): call_user_func(Array, Object(Guzzle\Common\Event), 'request.error', Object(Symfony\Component\EventDispatcher\EventDispatcher))
#3 /var/www/html/magento/vendor/symfony/event-dispatcher/Symfony/Componen in /var/www/html/magento/vendor/elasticsearch/elasticsearch/src/Elasticsearch/Connections/GuzzleConnection.php on line 266

I hope you can help me fix this. Thanks in advance.
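
The fatal error above comes from an uncaught exception, so the script stops before the reason for the 400 can be inspected. A minimal sketch of catching it follows; `runSearch` is a hypothetical wrapper, not a client method, and it catches the generic `Exception` class because the exact exception class depends on the client version.

```php
<?php
// Sketch only: runSearch() is a hypothetical wrapper, not a client method.

/**
 * Run a search callback and report failures instead of dying on them.
 */
function runSearch($search)
{
    try {
        return array('ok' => true, 'results' => call_user_func($search));
    } catch (Exception $e) {
        // Keep the exception class and message so the cause can be inspected.
        return array(
            'ok'    => false,
            'error' => get_class($e) . ': ' . $e->getMessage(),
        );
    }
}

// Usage (requires a client in $client and search parameters in $params2):
// $outcome = runSearch(function () use ($client, $params2) {
//     return $client->search($params2);
// });
// if (!$outcome['ok']) {
//     echo $outcome['error'], "\n";
// }
```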



(Rajesh Jai) #4

$params2 =array();

$params2['body']['query']['text']['file'] = 'my words';
$params2['body']['highlight']['fields']['file'] = array("term_vector" => "with_positions_offsets");
$results = $client->search($params2);
print_r($results);

The search does not work: it raises the error below. Please help with a
correct solution.
Thanks in advance.
PHP Fatal error: Uncaught exception 'Guzzle\Http\Exception\ClientErrorResponseException' with message 'Client error response
[status code] 400
[reason phrase] Bad Request
[url] http://localhost:9200/_search' in /var/www/html/magento/vendor/guzzle/guzzle/src/Guzzle/Http/Exception/BadResponseException.php:43
Stack trace:
#0 /var/www/html/magento/vendor/guzzle/guzzle/src/Guzzle/Http/Message/Request.php(145): Guzzle\Http\Exception\BadResponseException::factory(Object(Guzzle\Http\Message\EntityEnclosingRequest), Object(Guzzle\Http\Message\Response))
#1 [internal function]: Guzzle\Http\Message\Request::onRequestError(Object(Guzzle\Common\Event), 'request.error', Object(Symfony\Component\EventDispatcher\EventDispatcher))
#2 /var/www/html/magento/vendor/symfony/event-dispatcher/Symfony/Component/EventDispatcher/EventDispatcher.php(164): call_user_func(Array, Object(Guzzle\Common\Event), 'request.error', Object(Symfony\Component\EventDispatcher\EventDispatcher))
#3 /var/www/html/magento/vendor/symfony/event-dispatcher/Symfony/Componen in /var/www/html/magento/vendor/elasticsearch/elasticsearch/src/Elasticsearch/Connections/GuzzleConnection.php on line 266



(Rajesh Jai) #5

If the search is not working, change this line:
$params2['body']['query']['text']['file'] = 'my words';
to:
$params2['body']['query']['match']['file'] = 'my words';

$params2 =array();

$params2['body']['query']['match']['file'] = 'my words';
$params2['body']['highlight']['fields']['file'] = array("term_vector" => "with_positions_offsets");
$results = $client->search($params2);
print_r($results);
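
To get from the `print_r` output to just the matching fragments asked for in the first post, something like the following sketch could work. Both helper names are hypothetical. It assumes the usual response layout, where each hit carries a `highlight` entry keyed by field name, and it passes an empty object for the field so that default highlighting settings apply.

```php
<?php
// Sketch only: buildHighlightSearch() and extractHighlights() are hypothetical helpers.

/**
 * Build search parameters for a match query with highlighting on one field.
 */
function buildHighlightSearch($field, $text, $index = null)
{
    $params = array(
        'body' => array(
            'query'     => array('match' => array($field => $text)),
            // stdClass encodes to {} in JSON; an empty PHP array would encode to [].
            'highlight' => array('fields' => array($field => new stdClass())),
        ),
    );
    if ($index !== null) {
        $params['index'] = $index;
    }
    return $params;
}

/**
 * Collect the highlighted fragments of one field from a search response array.
 */
function extractHighlights(array $results, $field)
{
    $fragments = array();
    $hits = isset($results['hits']['hits']) ? $results['hits']['hits'] : array();
    foreach ($hits as $hit) {
        if (isset($hit['highlight'][$field])) {
            $fragments = array_merge($fragments, $hit['highlight'][$field]);
        }
    }
    return $fragments;
}

// Usage (requires a client in $client):
// $results = $client->search(buildHighlightSearch('file', 'my words', 'index2'));
// print_r(extractHighlights($results, 'file'));
```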


