I am using Elasticsearch 1.7.3 and the mapper attachment plugin 2.7.1.
I have successfully added an attachment to Elasticsearch with PHP code, but when I search for text inside the attachment I do not get the desired result: I get the complete attachment back.
The PHP code for adding the attachment is:
$binary = file_get_contents($target_file); // reads the whole file without leaving a handle open
$base = base64_encode($binary);
$article = array();
$article['index'] = 'test';
$article['type'] = 'person';
$article['body'] = array('my_attachment' => $base,'location2' => $location,'skills2' => $skills);
$result = $es->index($article);
where $target_file is the path of the file stored on my server, and location2 and skills2 are other fields in Elasticsearch.
When I search for a word in this file, I use the following PHP code:
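For reference, the indexing step above can be wrapped in a small helper. This is only a sketch: buildIndexParams is a made-up name, and the index, type and field names are simply copied from the snippet in the question. It builds the array and fails loudly if the file cannot be read, instead of silently indexing an empty attachment.

```php
<?php
// Hypothetical helper (not part of elasticsearch-php): builds the
// parameter array that the question passes to $es->index().
function buildIndexParams(string $targetFile, string $location, string $skills): array
{
    $binary = file_get_contents($targetFile);
    if ($binary === false) {
        throw new RuntimeException("Cannot read file: $targetFile");
    }

    return [
        'index' => 'test',
        'type'  => 'person',
        'body'  => [
            // The attachment field expects the file content as base64.
            'my_attachment' => base64_encode($binary),
            'location2'     => $location,
            'skills2'       => $skills,
        ],
    ];
}
```

It would then be used as $result = $es->index(buildIndexParams($target_file, $location, $skills));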
$query = $es->search($params);
where $q is just a piece of text (which is present in the file).
My file is a simple MS Word file.
Everything works, but when I echo the result it prints the complete document instead of the text I searched for.
If anybody can solve this, or can give me a link where I can find help on PHP code for the mapper attachment plugin (searching text in a file), that would be great. Please reply as soon as possible.
$params['body']['highlight']['fields']['my_attachment'] =
array("term_vector" => "with_positions_offsets","store" => true);
This is basically used for searching. When I delete it and search again, I still get the same result.
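A note on the snippet above: term_vector and store are mapping parameters, not options of the highlight section of a search request, which may be why removing them changes nothing. The sketch below keeps the two apart. The function names are made up, and the sub-field name defaults to the field name only because the highlight output quoted in this thread is keyed by the field name; the right sub-field name depends on the plugin version. fragment_size and number_of_fragments are standard highlight options, and the values here are arbitrary.

```php
<?php
// Hypothetical helper: mapping body for an attachment field whose text
// is stored with term vectors so it can be highlighted.
function buildAttachmentMapping(string $type, string $field, ?string $subField = null): array
{
    // Assumption: the text sub-field has the same name as the field.
    $subField = $subField ?? $field;

    return [
        $type => [
            'properties' => [
                $field => [
                    'type'   => 'attachment',
                    'fields' => [
                        $subField => [
                            'term_vector' => 'with_positions_offsets',
                            'store'       => true,
                        ],
                    ],
                ],
            ],
        ],
    ];
}

// Hypothetical helper: search parameters with a highlight section.
function buildHighlightSearch(string $field, string $text): array
{
    return [
        'index' => 'test',
        'type'  => 'person',
        'body'  => [
            // Do not send the huge base64 _source back with every hit.
            '_source'   => false,
            'query'     => ['match' => [$field => $text]],
            'highlight' => [
                'fields' => [
                    $field => [
                        'fragment_size'       => 100, // characters per snippet
                        'number_of_fragments' => 3,   // snippets per hit
                    ],
                ],
            ],
        ],
    ];
}
```

The mapping would have to be in place before the documents are indexed, for example through $es->indices()->putMapping() with index, type and body set; the search array goes to $es->search().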
When I add a file (attachment), I get a huge string of characters like this:
"source" : "my_attachment":"fdvmevkjvmvvmvvvf................................................."
I did not try to run a highlighting query. I am using Postman.
It does not matter whether I use $params['body']['highlight']['fields']['my_attachment'] =
array("term_vector" => "with_positions_offsets","store" => true); or not
when I do print_r($query);
Can you tell me how to use a highlight query in PHP code? I think the search does work, but I do not know how to print the result on screen.
Should I do echo $some_variable['_source']['my_attachment']; or something else?
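On the printing question: _source['my_attachment'] holds the base64 string that was indexed, so echoing it prints the encoded file. When a search request has a highlight section, each hit carries its snippets under a separate highlight key. A minimal sketch for pulling those out of the array returned by $es->search() follows; collectHighlights is a made-up helper name, and the field name is the one used in this thread.

```php
<?php
// Hypothetical helper: maps each hit's _id to its highlight fragments
// for one field. Hits without a highlight get an empty list.
function collectHighlights(array $response, string $field): array
{
    $result = [];
    $hits = $response['hits']['hits'] ?? [];

    foreach ($hits as $hit) {
        $result[$hit['_id']] = $hit['highlight'][$field] ?? [];
    }

    return $result;
}
```

For example, foreach (collectHighlights($query, 'my_attachment') as $id => $fragments) { echo $id, ': ', implode(' ... ', $fragments), "\n"; } prints only the snippets, not the document.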
OK, so what query should I use in place of that?
What I am saying is that there is a word "address" in my file. When I search for something else, no result is shown (print_r($query) is empty),
which is correct because that word is not in the file. When I search for "address", the search does match the file, but I do not know how to get only the desired word from the file. Please help me with that.
OK, in this example "king queen" is searched and we get this as the result:
"highlight": {
  "file": [
    "\"God Save the Queen\" (alternatively \"God Save the King\"\n"
  ]
}
What I am getting here is my full document instead of only the desired word.
$q="address";
$params['body']['query']['match']['my_attachment'] = $q;
When I run print_r($params); I get this:
Array
(
    [index] => test
    [type] => person
    [body] => Array
        (
            [query] => Array
                (
                    [match] => Array
                        (
                            [my_attachment] => address
                        )
                )
        )
)
which is fine, so the next step is the search, i.e.
$query = $es->search($params);
So it should give me the desired word from that file, but I get the full document:
"highlight": {
"my_attachment": [
"...... my full document ......."
]
}
Basically, I am working on a recruitment framework where users upload their resumes.
So when I search for skills (PHP, Java), I want to search within those uploaded files and get back the matching users along with their skills, addresses, etc.
So the result you want to get is a resume, not really a single "word", right?
Then, you don't really need highlight here.
You have two options IMO:
Get the result back from Elasticsearch (use a simple search), extract the BASE64 content from the _source field, decode it on the client and open the file in your browser.
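The decode step of that option could look roughly like this in PHP. It is a sketch that assumes the hit still contains _source with the my_attachment field from the question; decodeAttachment is a made-up helper name.

```php
<?php
// Hypothetical helper: returns the original file bytes stored as base64
// in one search hit.
function decodeAttachment(array $hit, string $field = 'my_attachment'): string
{
    if (!isset($hit['_source'][$field])) {
        throw new InvalidArgumentException("Hit has no _source field '$field'");
    }

    // Strict mode rejects strings that are not valid base64.
    $binary = base64_decode($hit['_source'][$field], true);
    if ($binary === false) {
        throw new UnexpectedValueException("Field '$field' is not valid base64");
    }

    return $binary;
}
```

The returned bytes can be written to disk with file_put_contents() or sent to the browser with a suitable Content-Type header.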
Basically, what I want is this: suppose there are 1000 resumes in Elasticsearch and I want candidates with experience in Java. I have many fields, such as a skills field, an experience field, etc. I will put "java" in the skills search field and, say, "1 year" in the experience search field, and based on that I should get all the information about the matching candidates' experience and skills.
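A search like that is usually expressed as a bool query with one match clause per filled-in form field. Below is a minimal sketch. buildCandidateSearch is a made-up name; skills2 is the field from the indexing code above, while experience2 in the usage line is purely hypothetical, since the thread does not show an experience field.

```php
<?php
// Hypothetical helper: builds a bool/must query from field => value
// pairs, skipping form fields that were left empty.
function buildCandidateSearch(array $criteria): array
{
    $must = [];
    foreach ($criteria as $field => $value) {
        if ($value === null || $value === '') {
            continue; // nothing entered for this field
        }
        $must[] = ['match' => [$field => $value]];
    }

    // With no criteria at all, fall back to matching every document.
    $query = $must
        ? ['bool' => ['must' => $must]]
        : ['match_all' => new stdClass()];

    return [
        'index' => 'test',
        'type'  => 'person',
        'body'  => ['query' => $query],
    ];
}
```

Usage would be along the lines of $query = $es->search(buildCandidateSearch(['skills2' => 'java', 'experience2' => '1 year']));, after which each hit's _source carries the candidate's stored fields.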