Hilal
(Hilal)
February 19, 2016, 12:36pm
1
Hi,
I am in the process of converting my queries from Mysql to Elasticsearch, and would like to know if my mapping makes sense.
I've identified 4 common types of data that I want to store.
tip
: "tip" is my enum field name in mysql db. For exampe, tip=M, tip=X or tip=D.
usul
: "usul" is my enum field name in mysql db.
For example, usul=A, usul=P, usul=M or usul=K.
maliyet
: "maliyet" is my char field name in mysql db.
For example, maliyet=1, maliyet=2, maliyet=X or maliyet=0.
islem
: "islem" is my varchar field name in mysql db.
For example, islem=Y, islem=YD, islem=YDDD, islem=YI or islem=YSS.
Mappings:
1. tip - {type: string, index:not_analyzed}
2. usul - {type: string, index:not_analyzed}
3. maliyet - {type: string, index:not_analyzed}
4. islem - {type: nested, index:not_analyzed}
However, my serach results is not correct. I want to do mapping. How can I do mapping these fields?
ebuildy
(Thomas Decaux)
February 19, 2016, 1:10pm
2
Hello, you create the index (like a mysql data-base) first:
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html#mappings
With your mapping, then insert documents.
Hilal
(Hilal)
February 19, 2016, 1:19pm
3
I have a index called "ihaleler". My question is this not
I indexed 10000 datas but these fields ('tip', 'usul', 'maliyet', 'islem'
) mapping is not correct. So my search results is not correct.
What should be the type of these fields for mapping?
PUT ihaleler/ihale/_mapping
{
"properties": {
"tip" : {
"type" : "?"
, "index": "not_analyzed"
}
}
}
dadoonet
(David Pilato)
February 19, 2016, 3:15pm
4
I think your mapping is correct and using not_analyzed
is the way to go in such a case.
But it's hard to tell why your query fails without seeing the query.
Remember that not_analyzed
will not lowercase your term. So if you indexed YSS
, searching for yss
won't work.
Hilal
(Hilal)
February 29, 2016, 9:07am
5
Ok. I did "usul
", "tip
" and "maliyet
" fields. But I have a problem about "islem". It differs from others.
My query :
http://10.0.2.15:8080/filter.php?islem=YDDD
My result:
found = 2
->YDDD
->YDDD
It works true actually but I want to be YD instead of YDDD
Because YDDDDD=YDDDD=YDDD=YD. So it have to be unique array.
Mysql db screenshot:
CODES:
"islem" mapping on sense editor :
PUT ihaleler/ihale/_mapping
{
"properties": {
"islem": {
"type": "string",
"index": "analyzed",
"analyzer": "turkish"
}
}
}
indexed codes on php (index.php):
....
$islem=$ihale['islem'];
$islem_array = array();
$islem_array[] = 'Y';
//$islem=$ihale['islem'];
for($i=0;$i<strlen($islem);$i++){
$islem_array[]=substr($islem,0,1);
}
$islem=array_unique($islem_array);
$dizi['islem']=$ihale['islem'];
filter codes on php(filter.php) :
unset($filter_islem);
if(isset($_GET['islem'])){
$islemler=$_GET['islem'];
$islemler=trim($islemler, ',');
$islemler_array=explode(',', $islemler);
$islem=array_unique($islemler_array);
$filter_islem['match_phrase']['islem']=$_GET['islem'];
}
if(is_array($filter_islem)){
$searchParams['body']['query']['bool']['should']=$filter_islem;
}
How can I do? Is the mapping must be a nested instead of string?
dadoonet
(David Pilato)
February 29, 2016, 9:21am
6
If you want to run a query like "starts with", you could use edge n grams for that.
Hilal
(Hilal)
February 29, 2016, 9:51am
7
no it is not.
Y: Yeni (English meanings new)
D: Düzeltme (English meanings correction)
Y: Yeni(New)
YD: Yeni Düzeltme(New Correction)
YDD: Yeni Düzeltme Düzeltme(New Correction Correction)
YDDD: Yeni Düzeltme Düzeltme Düzeltme(New Correction Correction Correction)
...etc.
When I query YDDD It must come just YD. (because YD=YDDD)
dadoonet
(David Pilato)
February 29, 2016, 10:05am
8
Sorry. I don't understand.
Can you reproduce your case with a small example?
create a new index
create a type with a mapping and a single field
index some documents using this single field
query
Hilal
(Hilal)
February 29, 2016, 10:30am
9
1.I have mysql datase and my field name is islem(english meaning process).
2.I created index on sense editor:
DELETE ihaleler
PUT ihaleler
3.I did mapping for my field "islem"
PUT ihaleler/ihale/_mapping
{
"properties": {
"islem": {
"type": "string",
"index": "analyzed",
"analyzer": "turkish"
}
}
}
4.I indexed data on php (index.php):
I made array of process variable in here. After I did unique this array.
....
$islem=$ihale['islem'];
$islem_array = array();
$islem_array[] = 'Y';
//$islem=$ihale['islem'];
for($i=0;$i<strlen($islem);$i++){
$islem_array[]=substr($islem,0,1);
}
$islem=array_unique($islem_array);
$dizi['islem']=$ihale['islem'];
5.Filtered indexing data on php(filter.php) :
unset($filter_islem);
if(isset($_GET['islem'])){
$islemler=$_GET['islem'];
$islemler=trim($islemler, ',');
$islemler_array=explode(',', $islemler);
$islem=array_unique($islemler_array);
$filter_islem['match_phrase']['islem']=$_GET['islem'];
}
if(is_array($filter_islem)){
$searchParams['body']['query']['bool']['should']=$filter_islem;
}
6.And query :
http://10.0.2.15:8080/filter.php?islem=YDDD
I want to be YD of this query result .
Because in the future it could be very long.
For example:
like YDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
dadoonet
(David Pilato)
February 29, 2016, 10:43am
10
Where is the edge n gram in your mapping?
thn
(tri-man)
February 29, 2016, 10:47am
11
I think you can do ?islem=YDD* when searching.
In stead of using your filter.php, try to verify the search results from ES's REST interface to make sure it works the way you want... then review the filter.php module.
Here is the URL you can try...
http://[es host:es port]/[index name]/_search?q=islem:YDD*
Here is the link for further info
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-uri-request.html
Hilal
(Hilal)
February 29, 2016, 12:44pm
12
edge n gram :
PUT /ihaleler
{
"settings" : {
"analysis" : {
"analyzer" : {
"my_edge_ngram_analyzer" : {
"tokenizer" : "my_edge_ngram_tokenizer"
}
},
"tokenizer" : {
"my_edge_ngram_tokenizer" : {
"type" : "edgeNGram",
"min_gram" : "0",
"max_gram" : "1",
"token_chars": [ "letter" ]
}
}
}
}
}
Hilal
(Hilal)
February 29, 2016, 12:58pm
13
NO thn tri-man.
You didn't understand me.
when I do query http://10.0.2.15:8080/filtreleme.php?islem=YDDDDDD
my result shoulnd't be islem=YDDDDDD .
It should be islem=YD. Five of the D here is unnecessary
thn
(tri-man)
February 29, 2016, 1:12pm
14
Got it now... use the edgeNGram as suggested by @dadoonet is the right approach. I would change "min_gram" to 1 and "max_gram" to 2 based on your example (ie. Y, YD, YDD, etc.)
Hilal
(Hilal)
February 29, 2016, 1:37pm
15
I did but result is the same.
My query :
http://10.0.2.15:8080/filter.php?islem=YDDD
My result:
found = 2
->YDDD
->YDDD
Actually it should be :
found = 2
->YD
->YD
thn
(tri-man)
February 29, 2016, 1:58pm
16
Oh, are you expecting to see the results back as YD, not YDDD (where YDDD is the actual value)?
Hilal
(Hilal)
February 29, 2016, 2:04pm
17
yes. My islem datasY,D,I,S
for example :
Y
YD
YI
YDI
YDDDD
YS
......
example query and expecting result:
if I do query Y
, my result must be Y
.
if I do query YD
, my result must be YD
.
if I do query YI
, my result must be YI
.
if I do query YDDDDD
, my result must be YD
.
if I do query YDD
, my result must be YD
....
They must be unique array.
thn
(tri-man)
February 29, 2016, 2:19pm
18
The Edge NGram tokenizer or the "my_edge_ngram_analyzer" tells ES how to tokenize a value before indexing the information. When the data comes in as Y, YD, YDD, YDDD, YDDDD, etc. ES will index Y and YD as tokens, it won't index YDD, YDDD, etc b/c of this custom analyzer. Also, it won't change the original values either. When you search with YD, YDD, YDDD, etc.. ES will be able to serve the request and the returned results will be in its original string as you have seen. If you only wants YD as you said, you might need an extra step to turn the output value to the ngram value that you prefer to see.
This link might have what you are looking for
Since you are using PHP, I'm assuming you are using ES REST web services for this kind of search, you can check out this URL to product the tokens you want
http://[es host]:[es port]/[index name]/_analyze?analyzer=my_edge_ngram_analyzer&text=YDDD
Hilal
(Hilal)
March 1, 2016, 7:29am
19
Thank you. I tried.
http://localhost:9200/ihaleler/_analyze?analyzer=my_edge_ngram_analyzer&text=YDDD
result:
{"tokens":[{"token":"Y","start_offset":0,"end_offset":1,"type":"word","position":0},{"token":"YD","start_offset":0,"end_offset":2,"type":"word","position":1}]}
Hilal
(Hilal)
March 1, 2016, 7:42am
20
I have another idea.
I did nested mapping this field.
PUT ihaleler/ihale/_mapping
{ "properties": {
"islem": {
"type": "nested"
}}}
result:
{
"acknowledged": true
}
After I did unique array in php code. I indexed with my php codes:
....
$islem=$ihale['islem'];
$islem_array = array();
$islem_array[] = 'Y';
//$islem=$ihale['islem']; // abcdef
for($i=0;$i<strlen($islem);$i++){
$islem_array[]=substr($islem,0,1);
}
$islem=array_unique($islem_array);
$dizi['islem']=$ihale['islem'];
but it has failed:
Fatal error: Uncaught exception
'Guzzle\Http\Exception\ClientErrorResponseException' with message 'Client error response[status code] 400 [reason phrase] Bad Request [url] http://localhost:9200/ihaleler/ihale/1030235' in/home/admin web/server.com/public_html/vendor/guzzle/http/Guzzle/Http/Exception/BadResponseException.php:43 Stack trace: #0 /home/adminweb/server.com/public_html/vendor/guzzle/http/Guzzle/Http/Message/Request.php(145): Guzzle\HttpExceptionBadResponseException::factory(Object(Guzzle\Http\Message\EntityEnclosingRequest),Object(Guzzle\Http\Message\Response))#1 [internal function]:Guzzle\Http\MessageRequest::onRequestError(Object(Guzzle\Common\Event),
'request.error',Object(Symfony\Component\EventDispatcher\EventDispatcher))#2 /home/admin/web/server.com/public_html/vendor/symfony/event-dispatcher/Symfony/Component/EventDispatcher/EventDispatcher.php(164):call_user_func(Array, Object(Guzzle\Common\Event), 'request.error',Object(Symfony\Component\EventDispatcher\EventDispatcher))#3 /home/admin/web/ in /home/adminweb/server.com/public_html/vendor/elasticsearch/elasticsearch/src/Elasticsearch/Connections/GuzzleConnection.php on line 266