Prefix Query for People's names/Lookup

Hello all,

First off, I need to say that I'm VERY new to both Elasticsearch and Lucene in general, so I apologize if this is a basic question. I'm struggling with a search that runs against people's names (proper nouns). This is sort of like an autocomplete for people. Here's the basic requirement:

A user types in some characters, and if any term in a name (either first name or last name) starts with that input, the name is returned (sorted, but that's not an issue). The input may contain several tokens separated by spaces. So for example:

If a user types in "ch", they will get the following in return:

Chris Jones
Lisa Christenson
Charlie Wells

So far so good. If the user types in "ch j" or "j ch", then the results would be:

Chris Jones

...again, no problem. I'm currently using a Prefix query for each of the input tokens ("ch" and "j"), combined in a Boolean query (a must clause for each token). But here's where I'm starting to get lost. I'd like this functionality to work with dashes and apostrophes (or without them). For example:

Jason Von-Hanson
Miles O'Brien

If a user types in "ob" OR "o'b", I'd like the response to include:

Miles O'Brien

Is there any method via filters or such that could help me with this? I have this "largely" working. I have a field called "name" that contains the full name of the user. This is the field I'm searching against. The only way I can think of to do this is a bit kludgy: modify the name field to include all versions, such as:

"name" : "miles o'brien obrien"
"name" : "jason von-hanson hanson"

Any thoughts or suggestions? Any help for a newbie would be most appreciated!
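
For reference, the current approach described above (one prefix clause per typed token, all required via must) could be sketched roughly as below. This is only an illustration: the helper name build_name_query is hypothetical, and lowercasing the input assumes the "name" field is indexed with a lowercasing analyzer, since prefix queries are not analyzed.

```python
def build_name_query(user_input, field="name"):
    """Build a bool query body with one prefix clause per typed token.

    Hypothetical helper for illustration; not part of any client library.
    """
    # Prefix queries are not analyzed, so lowercase the input here.
    # This assumes the indexed terms were lowercased at index time.
    tokens = user_input.lower().split()  # split() also collapses repeated spaces
    return {
        "query": {
            "bool": {
                # Every typed token must be a prefix of some term in the field.
                "must": [{"prefix": {field: token}} for token in tokens]
            }
        }
    }
```

Calling build_name_query("ch j") yields two prefix clauses under must, so the order of the typed tokens does not matter.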



I'd play around with using a filter to remove the ' and - entirely. Also have a look at the completion suggester, or at edge ngrams, to avoid the need for the usually slow prefix query. The completion suggester is going to be faster but uses heap memory to do its job. Edge ngrams are slower and more complicated to set up but won't use heap memory.
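
To make the edge ngram idea a bit more concrete, here is a rough sketch, not a tested production config. The names strip_punct, name_edge, name_index and name_search are made up for illustration, the gram sizes are arbitrary, and the mapping layout assumes a recent Elasticsearch version with typeless mappings. The small Python functions below the settings only imitate what the analyzers would do, so the matching behaviour can be checked without a cluster.

```python
import re

# Assumed index settings: a pattern_replace char filter drops ' and -,
# and an edge_ngram token filter indexes every prefix of each term.
INDEX_SETTINGS = {
    "settings": {
        "analysis": {
            "char_filter": {
                "strip_punct": {
                    "type": "pattern_replace",
                    "pattern": "['-]",
                    "replacement": "",
                }
            },
            "filter": {
                "name_edge": {"type": "edge_ngram", "min_gram": 1, "max_gram": 20}
            },
            "analyzer": {
                # Index time: strip punctuation, lowercase, emit prefixes.
                "name_index": {
                    "type": "custom",
                    "char_filter": ["strip_punct"],
                    "tokenizer": "whitespace",
                    "filter": ["lowercase", "name_edge"],
                },
                # Search time: same normalization, but no ngrams.
                "name_search": {
                    "type": "custom",
                    "char_filter": ["strip_punct"],
                    "tokenizer": "whitespace",
                    "filter": ["lowercase"],
                },
            },
        }
    },
    "mappings": {
        "properties": {
            "name": {
                "type": "text",
                "analyzer": "name_index",
                "search_analyzer": "name_search",
            }
        }
    },
}


def analyze(text):
    """Imitate the search analyzer: drop ' and -, lowercase, split on spaces."""
    return re.sub(r"['-]", "", text).lower().split()


def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the leading substrings of a token, shortest first."""
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]


def matches(name, user_input):
    """True if every typed token equals some indexed prefix of the name."""
    indexed = {gram for term in analyze(name) for gram in edge_ngrams(term)}
    return all(token in indexed for token in analyze(user_input))
```

With this setup, both "ob" and "o'b" normalize to "ob" and match "Miles O'Brien". One thing to be aware of: removing the dash entirely turns "Von-Hanson" into the single term "vonhanson", so typing "hanson" on its own would not match. If that matters, replacing the dash with a space instead of removing it is one option. At query time, a match query on "name" with operator set to and should require all typed tokens, similar to the must clauses in the question.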