Querying Movie Titles That Contain an Episode Number


(Hoàng Phúc Lương) #1

I'm not sure whether this is the right place to describe and discuss my use case. If it isn't, please point me to a better place for this question.

I'm using ES 5.2 for the search feature of a social media website (something like YouTube), and I'm having a hard time figuring out how to search for a video by episode number.
For example:

{"id": "1","title": "Four Beautyful Sun Flower - Episode 01"} 
{"id": "2","title": "Four Beautyful Sun Flower - Episode 15"} 
{"id": "3","title": "Four Beautyful Sun Flower - Episode 17"} 
{"id": "4","title": "Four Beautyful Sun Flower - Episode 23"} 
{"id": "5","title": "Sun Flower In Morning - Episode 01"} 
{"id": "6","title": "Sun Flower In Morning - Episode 15"} 
{"id": "7","title": "Sun Flower In Morning - Episode 17"} 
{"id": "8","title": "Sun Flower In Morning - Episode 23"}

I always get the same results, even when I change the episode number in the search keywords.

{
  "query": {
    "match": {
      "title": "Four Beautyful Sun Flower Episode 17"
    }
  }
}

This is the result I got:

"hits": {
        "total": 8,
        "max_score": 3.5898633,
        "hits": [
            {
                "_index": "test_file",
                "_type": "sample",
                "_id": "1",
                "_score": 3.5898633,
                "_source": {
                    "id": "1",
                    "title": "Four Beautyful Sun Flower - Episode 01"
                }
            },
            {
                "_index": "test_file",
                "_type": "sample",
                "_id": "3",
                "_score": 2.6694531,
                "_source": {
                    "id": "3",
                    "title": "Four Beautyful Sun Flower - Episode 17"
                }
            },
            {
                "_index": "test_file",
                "_type": "sample",
                "_id": "2",
                "_score": 2.4949138,
                "_source": {
                    "id": "2",
                    "title": "Four Beautyful Sun Flower - Episode 15"
                }
            },
            {
                "_index": "test_file",
                "_type": "sample",
                "_id": "4",
                "_score": 2.4949138,
                "_source": {
                    "id": "4",
                    "title": "Four Beautyful Sun Flower - Episode 23"
                }
            },
            {
                "_index": "test_file",
                "_type": "sample",
                "_id": "7",
                "_score": 1.0144347,
                "_source": {
                    "id": "7",
                    "title": "Sun Flower In Morning - Episode 17"
                }
            },
            {
                "_index": "test_file",
                "_type": "sample",
                "_id": "5",
                "_score": 1.0068512,
                "_source": {
                    "id": "5",
                    "title": "Sun Flower In Morning - Episode 01"
                }
            },
            {
                "_index": "test_file",
                "_type": "sample",
                "_id": "8",
                "_score": 1.0068512,
                "_source": {
                    "id": "8",
                    "title": "Sun Flower In Morning - Episode 23"
                }
            },
            {
                "_index": "test_file",
                "_type": "sample",
                "_id": "6",
                "_score": 0.7445657,
                "_source": {
                    "id": "6",
                    "title": "Sun Flower In Morning - Episode 15"
                }
            }
        ]
    }

I expected the requested episode (17) to come first, but the results always come back in the same order. Besides that, I only want to get the Four Beautyful Sun Flower movie, but the results show both Four Beautyful Sun Flower and Sun Flower In Morning.
Could someone help me with how to do a search like this? I tried all the suggestions from the ES documentation, but it still doesn't work.

This is my bash script to reproduce this case.

curl -X PUT http://127.0.0.1:9200/test_file \
  -d '{
  "settings": {
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "edge_ngram",
          "min_gram": 3,
          "max_gram": 20
        },
        "custom_ascii_folding": {
          "type": "asciifolding",
          "preserve_original": true
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "autocomplete_filter",
            "snowball",
            "custom_ascii_folding"
          ]
        }
      }
    }
  },
  "mappings": {
    "sample": {
      "properties": {
        "id": {
          "type": "keyword"
        },
        "title": {
          "type": "text",
          "term_vector": "yes",
          "analyzer": "autocomplete"
        }
      }
    }
  }
}'

curl -X PUT http://127.0.0.1:9200/test_file/sample/1 -d '{"id": "1","title": "Four Beautyful Sun Flower - Episode 01"}' 
curl -X PUT http://127.0.0.1:9200/test_file/sample/2 -d '{"id": "2","title": "Four Beautyful Sun Flower - Episode 15"}' 
curl -X PUT http://127.0.0.1:9200/test_file/sample/3 -d '{"id": "3","title": "Four Beautyful Sun Flower - Episode 17"}' 
curl -X PUT http://127.0.0.1:9200/test_file/sample/4 -d '{"id": "4","title": "Four Beautyful Sun Flower - Episode 23"}' 
curl -X PUT http://127.0.0.1:9200/test_file/sample/5 -d '{"id": "5","title": "Sun Flower In Morning - Episode 01"}' 
curl -X PUT http://127.0.0.1:9200/test_file/sample/6 -d '{"id": "6","title": "Sun Flower In Morning - Episode 15"}' 
curl -X PUT http://127.0.0.1:9200/test_file/sample/7 -d '{"id": "7","title": "Sun Flower In Morning - Episode 17"}' 
curl -X PUT http://127.0.0.1:9200/test_file/sample/8 -d '{"id": "8","title": "Sun Flower In Morning - Episode 23"}'
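
One way to check which tokens the autocomplete analyzer produces for a title is to call the _analyze API on the same index. The following is only a minimal sketch: ES_URL, analyze_body and analyze_title are hypothetical helper names, and the text is inserted verbatim, so it must not contain double quotes. If the response for "Episode 17" lists no token for 17, the episode number is probably not searchable through this field as it is mapped.

```shell
# Sketch only: ES_URL, analyze_body and analyze_title are illustrative names.
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

# Build the request body for the _analyze API, using the custom
# "autocomplete" analyzer defined in the index settings above.
analyze_body() {
  printf '{"analyzer":"autocomplete","text":"%s"}' "$1"
}

# Ask the test_file index how it would tokenize the given text.
analyze_title() {
  curl -s -X POST "$ES_URL/test_file/_analyze?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(analyze_body "$1")"
}
```

Running analyze_title "Episode 17" against the index created above prints the token list for that text.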

Thank you so much for your time.


(Glen Smith) #2

Hello,

I think that if you look into the Match Phrase Query [1], you will get much closer to the results you want.

[1] https://www.elastic.co/guide/en/elasticsearch/reference/5.2/query-dsl-match-query-phrase.html
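
As a minimal sketch of that suggestion, assuming the test_file index and title field from the first post, a match_phrase request might look like the following. ES_URL, phrase_query and search_phrase are hypothetical names used only for illustration. Because match_phrase requires the terms to appear in order, it should exclude the Sun Flower In Morning titles; whether the episode number affects the ranking still depends on which tokens the autocomplete analyzer indexes for short terms such as 17.

```shell
# Sketch only: ES_URL, phrase_query and search_phrase are illustrative names.
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

# Build a match_phrase body for the title field.
# The phrase is inserted verbatim, so it must not contain double quotes.
phrase_query() {
  printf '{"query":{"match_phrase":{"title":"%s"}}}' "$1"
}

# Send the phrase query to the test_file index from the first post.
search_phrase() {
  curl -s -X POST "$ES_URL/test_file/_search?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(phrase_query "$1")"
}
```

Calling search_phrase "Four Beautyful Sun Flower Episode 17" sends the request to the local node used in the reproduction script.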

