copy_to vs appending all terms in a string: are the search results going to be the same?

Hi,

I'm building an e-commerce search engine, and I want to do a full-text search on product name, category, and a few other fields. I'm currently using copy_to to copy all the individual fields into a single field (called 'fullTextSearchTerms'), analyzing that field (with a phonetic filter, stemmers, etc.), and doing a match query on it.

I'm exploring an alternative to copy_to where, at index time, I create the 'fullTextSearchTerms' field myself by appending the individual fields, separated by spaces, like so:

fullTextSearchTerms = productName + " " + productCategory + " " + ...

And I index this field explicitly.

As I understand it, the search experience should be the same for these two methods of indexing, but am I missing anything? Are the two methods really interchangeable in terms of search experience?

Thanks in advance!

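copy_to stores the copied values in the target field as an array (multiple values), not as one concatenated string.

Whether the search experience is interchangeable depends on the analyzer. In addition, the result could be different for a match_phrase query if the phrase spans productName and productCategory. In a multi-valued text field, the values are separated by a position gap (position_increment_gap, 100 by default), so a phrase does not match across two copied values, whereas in a concatenated string the last term of one field and the first term of the next are adjacent.

If you create fullTextSearchTerms as an array, the results should be completely interchangeable.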
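
A minimal sketch of that array alternative, using the same first_name / last_name fields as the sample further down. The index name test-index-000002 is made up for illustration, and full_name is filled in by the client instead of by copy_to:

```
# hypothetical index: same fields as the copy_to sample, but without copy_to
PUT test-index-000002
{
  "mappings": {
    "properties": {
      "first_name": { "type": "text" },
      "last_name": { "type": "text" },
      "full_name": { "type": "text" }
    }
  }
}

# the client sends full_name as an array of the individual values
PUT test-index-000002/_doc/1
{
  "first_name": "John",
  "last_name": "Smith",
  "full_name": ["John", "Smith"]
}
```

Unlike with copy_to, full_name is then part of _source, so the stored document is larger.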

Sample showing how copy_to stores the values:

PUT test-index-000001
{
  "mappings": {
    "properties": {
      "first_name": {
        "type": "text",
        "copy_to": "full_name" 
      },
      "last_name": {
        "type": "text",
        "copy_to": "full_name" 
      },
      "full_name": {
        "type": "text"
      }
    }
  }
}

PUT test-index-000001/_doc/1
{
  "first_name": "John",
  "last_name": "Smith"
}

GET test-index-000001/_search
{
  "query": {
    "match": {
      "full_name": { 
        "query": "John Smith",
        "operator": "and"
      }
    }
  },
  "fields":["*"]
}

# see the "full_name" field
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.5753642,
    "hits" : [
      {
        "_index" : "test-index-000001",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.5753642,
        "_source" : {
          "first_name" : "John",
          "last_name" : "Smith"
        },
        "fields" : {
          "last_name" : [
            "Smith"
          ],
          "first_name" : [
            "John"
          ],
          "full_name" : [
            "Smith",
            "John"
          ]
        }
      }
    ]
  }
}
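
To check the match_phrase point against the same index, you could run a phrase query on the copy_to target. This is only a sketch. I would expect it to return no hits with the default mapping, because "John" and "Smith" are separate values in full_name and the default position_increment_gap keeps them from counting as adjacent:

```
# phrase query across two copied values
GET test-index-000001/_search
{
  "query": {
    "match_phrase": {
      "full_name": "John Smith"
    }
  }
}
```

If full_name were instead indexed as the single string "John Smith", the same query should match. That is the case where the two approaches differ.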

Thanks for your response!
