Wildcard query on complex text/json column

Hello everyone,

I'm trying to search a text in the Request,ResponseJSon column with wildcard method ,but It's look like search on only first 10 chars. All the other fields works like as excepted but I really dont get the problem or even why this query acting like that.

Example data :

{"_index" : "license",
            "_type" : "JSON_ELASTIC",
            "_id" : "tP8ermwB1p5f4ZMD1Pmr",
            "_score" : 1.0,
            "_source" : {
              "ComputerName" : "MyRealMachine2",
              "ApplicationName" : "LICENSE_HOST.exe",
              "TerminalIP" : "123.123.123.12.435345.345",
              "ListenPort" : "7020",
              "AppVersion" : "1.0.0.0",
              "UniqueId" : "000009_20190820111312",
              "SerialNumber" : "191380004610025",
              "TransactionStartDate" : "2019-08-20T11:13:12.131951+03:00",
              "LogStartDateTime" : "2019-08-20T11:13:12.4444405+03:00",
              "LogEndDateTime" : "2019-08-20T11:13:12.5072718+03:00",
              "LogElapsedTime" : 0,
              "URLCallStartDateTime" : "2019-08-20T11:13:12.4600757+03:00",
              "URLCallEndDateTime" : "2019-08-20T11:13:12.5072718+03:00",
              "URLCallElapsedTime" : 0,
              "WriteElasticSuccess" : true,
              "IsInnerException" : "",
              "ExceptionMessage" : "",
              "ResponseCode" : "00",
              "ResponseMessage" : "ISLEM BASARILI",
              "MachineName" : "MyMachineKgoasdo1asd123",
              "ProcessName" : "LICENSE_HOST",
              "LibraryName" : "Worker",
              "FunctionName" : "RunLicense",
              "Url" : "https://*012.3123.123.132.321/asdasd7asd7asd7as7d/asdasd7asd/as",
              "RequestJSon" : """{"buffer":"494C44000010A00180C580010F3139313338303030343631303032358002008003042019082080040311131180050C34332E30312D76313872393080060180800701019009010780080102A10180808101020134810202747081030944454EDD5A42414E4B8104104431303035304730304C4335303133348105104431303035304730304C433530313334810602FFFF810707000000000000008108070000000000000081090700000000000000810A083030393034373535810B0F353333303030303031343732323933810C0101A00700F0011070778AF13805D9F1220FA5DAB3C5A5F8","endPoint":"86.108.241.118","STX":null,"MessageLength":null,"ProtocolID":"ILD","Version":"0000","MessageType":"10","OwnerBankTeliumID":null,"OwnerBankName":null,"TerminalInfoGroupBuff":"80010F3139313338303030343631303032358002008003042019082080040311131180050C34332E30312D76313872393080060180800701019009010780080102A10180808101020134810202747081030944454EDD5A42414E4B8104104431303035304730304C4335303133348105104431303035304730304C433530313334810602FFFF810707000000000000008108070000000000000081090700000000000000810A083030393034373535810B0F353333303030303031343732323933810C0101","TerminalInfoGroup":{"SerialNumber":"191380004610025","TerminalProductCode":"","TerminalDate":"20190820","TerminalTime":"111311","IdleVersion":"43.01-v18r90","BankAppBlockBuff":"8101020134810202747081030944454EDD5A42414E4B8104104431303035304730304C4335303133348105104431303035304730304C433530313334810602FFFF810707000000000000008108070000000000000081090700000000000000810A083030393034373535810B0F353333303030303031343732323933810C0101","TerminalProfile":"?","TerminalLicenseType":"01","TerminalLicenseRequestType":"02","LoadReason":"","BankAppBlockList":[{"BkmId":"0134","TeliumAppID":"7470","BankName":"DEnemeGbank","BankVersiyon":"D10050G00LC50134","IngenicoVersion":"D10050G00LC50134","Status":"FFFF","FirstParameterDate":"00000000000000","LastParameterDate":"00000000000000","LastTransactionDate":"00000000000000","TerminalNumber":"00904755","MerchantNumber":"533000001472293","LicenseStatus":"01","LicenseEndDate":""}]},"OTPInfoGroupBuff":null,"OTPInfoGroup":{"TechnicianPhoneNumber":null},"ECRInfoGroupBuff":"","ECRInfoGroup":{"OKCFirm":null,"OKCModel":null,"OKCVersion":null,"OKCRegisterNumber":null,"DllVersion":null,"LicenseFirmName":null,"LicenseHASH":null,"AsciiOTP":null,"LoadReason":null,"OKCSerialNo":null},"TetraLicenseInfoGroupBuff":"","TetraLicenseInfoGroup":{"TetraLicenseOTPBuff":null,"ReasonForAppLicenseDownload":null,"TetraLicenseOTP":null},"PackageSign":"70778AF13805D9F1220FA5DAB3C5A5F8","ETX":"03","CRC":"7C65","ResultCode":"00","ResultMessage":"ISLEM BASARILI"}""",
              "LicenseRequestType" : 2,
              "LicenseRequestTypeDesc" : "Uygulama Lisans İsteği",
              "ImpProResponseCode" : "",
              "ImpProResponseMessage" : "",
              "AppResponseCode" : "00",
              "AppResponseMessage" : "ISLEM BASARILI"
            }
          }

and first query (the working one because query value occurs in first 10 chars (dont know why this is works but whatever :slight_smile: ):

GET license/_search
{
  "query": {
    "wildcard": {
      "RequestJSon":"*buff*"
    }
  }
}

And the second one (the not working query :thinking: ):

   GET license/_search
    {
      "query": {
        "wildcard": {
          "RequestJSon":"*C5A5F8*"
        }
      }
    }

Normal wildcard working on other fields:

 GET license/_search
        {
          "query": {
            "wildcard": {
              "UniqueId":"*1113*"
            }
          }
        }

Thanks for any ideas :slight_smile:

It depends on the analyzer (mapping) you used for this field.

Have a look at the analyze API to understand how your string has been actually indexed. Define a keyword analyzer on this field.
Otherwise use may be c5a5f8 but may be the text has been indexed as c, 5, a, ..

Be aware that even if you change the analyzer to keyword, which basically treats the whole content as a single string/token, there is a limit to how long a token can be (32k characters). This way of searching is also the most inefficient way of searching in Elasticsearch and will scale badly.

1 Like

Thanks for adding this information @Christian_Dahlqvist. I totally forgot to mention: "Do no use the wildcard query!" :slight_smile:

Thanks for fast response and advice,

I knew that was a bad practice to use wildcard but I'm curious about reason, now I will index that fields as json nested field, will it be better for mapping and performans

Thank you so much :slight_smile: :metal:

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.