Problem with searching binary data


(Jay) #1

Hi, I am new to ES. I am trying to store binary values such as 00001111 in ES by converting them with Base64 encoding.
These are the steps I used to put the data into ES:

`PUT /my_store2
{
    "mappings" : {
        "products" : {
            "properties" : {
                "productID" : {
                    "type" : "binary",
                    "index" : "analyzed" 
                }
            }
        }
    }
}`

POST  /my_store2/products/_bulk
{ "index": { "_id": 8 }}
{ "productID" : "MDAwMTEwMDE="}

When I searched using this query, I did not get any results. Am I doing something wrong while storing the data in ES?

GET  /my_store2/products/_search
{
    "query": {
        "query_string": {
            "default_field": "productID",
            "query": "MDAwMTEwMDE="
        }
    }
}

Any help on this will be much appreciated. Also, is the binary datatype the ideal way to store binary values such as "11110000" in ES, or is there a better way to store them and still be able to search them?
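
For reference, a minimal Python sketch of what the Base64 value in the bulk request above actually contains. It uses only the standard `base64` module; the variable names are illustrative. It suggests the stored value is the ASCII text of the bit string rather than a single packed byte, which may or may not be what was intended.

```python
import base64

# The value sent in the bulk request above.
encoded = "MDAwMTEwMDE="

# Decoding shows the payload is the ASCII text "00011001" (eight bytes),
# not one byte with those bits set.
decoded = base64.b64decode(encoded)
print(decoded)       # b'00011001'
print(len(decoded))  # 8

# If the intent were to store the single byte 0b00011001, the bit string
# would be packed into a byte first and then encoded.
packed = int("00011001", 2).to_bytes(1, "big")
packed_b64 = base64.b64encode(packed).decode("ascii")
print(packed_b64)    # GQ==
```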


(Jay) #2

If possible, can someone provide an example of storing and searching binary data in ES?


(David Pilato) #3

I would try to not analyze the content.

Does it change anything?


(Jay) #4

I tried that way too, it still doesn't change anything.


(Peter van der Weerd) #5

According to the docs a binary field is not searchable. (Didn't try though...)
But, for such a small field: why choose binary and not just string?


(Jay) #6

I have a field that I want to hide from the users. It is more like a metadata field, used for binary operations.

For example: "get me all the products which start with 'a', but only those of a certain type, whose ID is both 'VegSection' and 'Grocery'".

GET /my_store3/products/_search
{
    "query": {
        "filtered": {
            "query": {
                "query_string": {
                    "default_field": "productName_text",
                    "query": "a*"
                }
            },
            "filter": {
                "term": {
                    "productID": "MDAwMTEwMDE"
                }
            }
        }
    }
}

(Peter van der Weerd) #7

Yeah, but what I mean is: why not map it as a string? You supply the Base64 string in the _source and then index that field as a string. From what I see in your samples, you already work with the string value, like 'MDAwMTEwMDE'...
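
A minimal sketch of that suggestion, written as Python dictionaries that serialize to the request bodies. It assumes the ES 1.x/2.x mapping syntax used elsewhere in this thread (`"type": "string"` with `"index": "not_analyzed"`); the index would have to be created with this mapping from the start, and the variable names are illustrative only.

```python
import json

# Assumed ES 1.x/2.x syntax: a string field indexed as one exact,
# un-analyzed term instead of a binary field.
mapping = {
    "mappings": {
        "products": {
            "properties": {
                "productID": {"type": "string", "index": "not_analyzed"}
            }
        }
    }
}

# Exact-match filter on that field. On a not_analyzed field the value must
# equal the indexed string character for character, including the
# trailing "=" padding of the Base64 text.
term_filter = {"term": {"productID": "MDAwMTEwMDE="}}

print(json.dumps(mapping, indent=2))
print(json.dumps(term_filter, indent=2))
```

Note that a term value without the padding, such as "MDAwMTEwMDE", would not be equal to the indexed string and so would not be expected to match.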


(Jay) #8

Sorry, my mistake. I shouldn't have indicated in the query that I was searching on a specific field. Instead, I search on all the fields except my metadata field.

Once I have the search results, I would like to filter them based on my metadata field.

If I index it as a string, then my search results might also include my metadata field.
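
One possible way to address this concern, sketched as Python dictionaries under the assumption of ES 1.x/2.x: the `include_in_all` mapping parameter can be set to false so the field is left out of the catch-all `_all` field, while a term filter still targets it explicitly. The variable names are illustrative, and the behaviour should be checked against the ES version in use.

```python
import json

# Assumed ES 1.x/2.x mapping: an exact-match string that is kept out of
# the catch-all _all field, so all-field text searches do not see it.
field_mapping = {
    "productID": {
        "type": "string",
        "index": "not_analyzed",
        "include_in_all": False,
    }
}

# Text search with no default_field (assumed to fall back to _all),
# narrowed afterwards by an explicit filter on the metadata field.
search_body = {
    "query": {
        "filtered": {
            "query": {"query_string": {"query": "a*"}},
            "filter": {"term": {"productID": "MDAwMTEwMDE="}},
        }
    }
}

print(json.dumps(field_mapping, indent=2))
print(json.dumps(search_body, indent=2))
```

The filter only restricts which documents are returned; it does not make the metadata field take part in the text match.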

