Not able to search POS text in ElasticSearch


#1

Hi,

My data contains text values such as 'POS1', 'POS2', and 'POS3'. Using a match query, I am not able to find them by searching for 'POS'.
Example:

{  
   "query":{  
      "bool":{  
         "must":[  
            {  
               "multi_match":{  
                  "query":"POS",
                  "fields":[  
                     "233.233_*"
                  ],
                  "minimum_should_match":"25%"
               }
            }
         ]
      }
   }
}

Please help me.


(David Pilato) #2

Could you provide a full recreation script as described in About the Elasticsearch category? It will help to better understand what you are doing. Please try to keep the example as simple as possible.

A full reproduction script will help readers to understand, reproduce and if needed fix your problem. It will also most likely help to get a faster answer.


#3

Hi,

POST

curl -XPOST -H "Content-Type: application/json" "http://192.168.1.37:9200/exampledata/examplerec" -d '{"name": "POS1"}'

curl -XPOST -H "Content-Type: application/json" "http://192.168.1.37:9200/exampledata/examplerec" -d '{"name": "POS2"}'

curl -XPOST -H "Content-Type: application/json" "http://192.168.1.37:9200/exampledata/examplerec" -d '{"name": "POS3"}'
GET 
{
   "query":{
      "bool":{
         "must":[
            {
               "multi_match":{
                  "query":"POS",
                  "fields":["name"],
                  "minimum_should_match":"25%"
               }
            }
         ]
      }
   }
}

or 


{
  "query": {
    "bool": {
      "minimum_should_match": "25%",
      "should": [
        {
          "match": {"name": "POS"}
        }
      ]
    }
  }
}

Neither query works. Please help me.


(David Pilato) #4

Please format your code, logs or configuration files using the </> icon as explained in this guide, and not the citation button. It will make your post more readable.

Or use markdown style like:

```
CODE
```

There's a live preview panel for exactly this reason.

Lots of people read these forums, and many of them will simply skip over a post that is difficult to read, because it's just too large an investment of their time to try and follow a wall of badly formatted text.
If your goal is to get an answer to your questions, it's in your interest to make it as easy to read and understand as possible.
Please update your post.

As you are using the default analyzer, I guess that POS1 has been indexed as pos1, which is not equal to pos.

Have a look at the _analyze API to understand what is happening behind the scenes.


#5

Hi David,

I tried the query below, which gives me the expected result:

GET 

{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "default_field": "name",
            "query": "POS*"
          }
        }
      ]
    }
  }
}

OUTPUT

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 1,
    "hits": [
      {
        "_index": "exampledata",
        "_type": "examplerec",
        "_id": "x35RQ2MBWHlgBSJy1C6v",
        "_score": 1,
        "_source": {
          "name": "POS3"
        }
      },
      {
        "_index": "exampledata",
        "_type": "examplerec",
        "_id": "y35gQ2MBWHlgBSJynS5X",
        "_score": 1,
        "_source": {
          "name": "POS"
        }
      },
      {
        "_index": "exampledata",
        "_type": "examplerec",
        "_id": "xn5RQ2MBWHlgBSJywi4_",
        "_score": 1,
        "_source": {
          "name": "POS2"
        }
      },
      {
        "_index": "exampledata",
        "_type": "examplerec",
        "_id": "yH5RQ2MBWHlgBSJy7C4I",
        "_score": 1,
        "_source": {
          "name": "POS4"
        }
      },
      {
        "_index": "exampledata",
        "_type": "examplerec",
        "_id": "xX5RQ2MBWHlgBSJyny4n",
        "_score": 1,
        "_source": {
          "name": "POS1"
        }
      }
    ]
  }
}

Could you please help me understand this better? I am new to ElasticSearch.


(David Pilato) #6

When Elasticsearch indexes text, it goes through an analysis process:

POST _analyze
{
  "text": "POS1",
  "analyzer": "standard"
}

Gives:

{
  "tokens": [
    {
      "token": "pos1",
      "start_offset": 0,
      "end_offset": 4,
      "type": "<ALPHANUM>",
      "position": 0
    }
  ]
}

Which means that when you search for pos, it never matches pos1.
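
To make the mismatch concrete, here is a rough Python sketch of the comparison. It is only an approximation for illustration: `simple_analyze` and `term_matches` are hypothetical helpers, not Elasticsearch APIs, and the real standard analyzer does more sophisticated tokenization. The point is that a match query compares whole analyzed terms, so `pos` and `pos1` are simply different terms.

```python
import re


def simple_analyze(text):
    """Rough stand-in for the standard analyzer (assumption):
    split on non-alphanumeric characters and lowercase each token."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]


def term_matches(query, indexed_value):
    """True if any analyzed query term equals an analyzed indexed term.
    Terms are compared as whole strings; there is no prefix matching."""
    indexed_terms = set(simple_analyze(indexed_value))
    return any(term in indexed_terms for term in simple_analyze(query))


# Show the analyzed terms and whether a search for "POS" would match.
for value in ["POS", "POS1", "POS2"]:
    print(value, simple_analyze(value), term_matches("POS", value))
```

Under these assumptions only the document whose value is exactly `POS` matches, which is consistent with what the `_analyze` output above suggests.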

Searching for POS* does match, but I would not recommend wildcard queries, as they can be very slow.

It is better to index the right terms instead, but this depends on your use case. I guess that POS1 is not actual data but just an example. What would the real data be?
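
One common way to index the right terms for prefix searches is to store edge n-grams of each token, so that `pos1` is also indexed as `p`, `po`, `pos`, and `pos1`. The Python sketch below only simulates that idea: `edge_ngrams` and `prefix_index_matches` are hypothetical helper names, the gram lengths (1 to 10) are assumptions, and the whole value is treated as a single token for brevity. In Elasticsearch itself this would typically be configured with an `edge_ngram` token filter or tokenizer in a custom analyzer applied at index time.

```python
def edge_ngrams(token, min_gram=1, max_gram=10):
    """Return the prefixes of `token` with lengths min_gram..max_gram."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def prefix_index_matches(query, indexed_value):
    """Index time: lowercase, then expand into edge n-grams.
    Search time: lowercase only, then look for an exact term match."""
    indexed_terms = set(edge_ngrams(indexed_value.lower()))
    return query.lower() in indexed_terms


print(edge_ngrams("pos1"))
print(prefix_index_matches("POS", "POS1"))
print(prefix_index_matches("POS", "Google Chrome"))
```

Because the expansion happens once at index time, the search becomes a plain term lookup instead of a wildcard scan. This sketch covers prefix matches only; matching text in the middle of a term would need a different approach.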


#7

Yes, POS, POS1, and POS2 are real data. Some customers name their computers like this.

SQL query:

select * from assetData where machinename like '%POS%' and installedPrograms like '%Google chrome%'; 

But in ElasticSearch the data is stored differently.
In the output below, 221 is the field name for the machine name and 185 is the field name for installed programs. One computer can have multiple programs installed.

{  
   "took":59,
   "timed_out":false,
   "_shards":{  
      "total":5,
      "successful":5,
      "skipped":0,
      "failed":0
   },
   "hits":{  
      "total":1,
      "max_score":0.5753642,
      "hits":[  
         {  
            "_index":"assetdatalatest_1",
            "_type":"post",
            "_id":"36",
            "_score":0.5753642,
            "_source":{  
               "221":{  
                  "221_1":"POS1"
               },
               "357":{  
                  "357_1":"JSFBTESTWindows__201800020"
               },
               "185":{  
                  "185_20":"Windows 10 Update Assistant",
                  "185_9":"Microsoft Visual C++ 2013 x64 Minimum Runtime - 12.0.21005",
                  "185_5":"Update for Windows 10 for x64-based Systems (KB4023057)",
                  "185_6":"Windows Setup Remediations (x64) (KB4023057)",
                  "185_7":"Microsoft Visual C++ 2008 Redistributable - x64 9.0.30729.6161",
                  "185_8":"Microsoft Visual C++ 2013 x64 Additional Runtime - 12.0.21005",
                  "185_1":"Rocket.Chat+ 2.10.5",
                  "185_2":"HeidiSQL",
                  "185_3":"SelfHeal Client",
                  "185_4":"Maxx Audio Installer (x64)",
                  "185_18":"Google Update Helper",
                  "185_19":"Microsoft Visual C++ 2008 Redistributable - x86 9.0.30729.6161",
                  "185_14":"Microsoft Visual C++ 2013 Redistributable (x64) - 12.0.30501",
                  "185_15":"Microsoft Visual C++ 2013 x86 Minimum Runtime - 12.0.21005",
                  "185_16":"osrss",
                  "185_17":"Realtek Audio COM Components",
                  "185_10":"VMware Workstation",
                  "185_21":"Intel(R) Processor Graphics",
                  "185_22":"Realtek High Definition Audio Driver",
                  "185_11":"UpdateAssistant",
                  "185_23":"Microsoft Visual C++ 2013 Redistributable (x86) - 12.0.30501",
                  "185_12":"Google Chrome",
                  "185_24":"Microsoft Visual C++ 2013 x86 Additional Runtime - 12.0.21005",
                  "185_13":"Notepad++"
               }
            }
         }
      ]
   }
}

I am trying the following query to fetch the machine name:

{  
   "query":{  
      "bool":{  
         "must":[  
            {  
               "multi_match":{  
                  "query":"POS",
                  "fields":[  
                     "221.221_*"
                  ],
                  "minimum_should_match":"25%"
               }
            }
         ]
      }
   }
}
