Only one tokenizer can be defined per analyzer. Keep in mind that
tokenizers and token filters are different components: of the two, the
tokenizer runs first in the analysis chain.
The lowercase tokenizer is based on the letter tokenizer, which
simply breaks on non-letter characters. The standard tokenizer is far
more complex, with various rules mostly based on the English language. It
all depends on your corpus and use cases. Data such as names and titles
could use a simpler letter tokenizer, but free-form text that might
include URLs or email addresses is probably best tokenized by the standard
tokenizer.
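To see why the letter tokenizer is a poor fit for such text, here is a rough
Python sketch of what "break on non-letter characters" does to an email
address. This is not the actual Lucene implementation (which handles full
Unicode, not just ASCII letters), just an illustration of the behavior:

```python
import re

def letter_tokenize(text):
    # Rough sketch of a letter tokenizer: emit runs of letters,
    # splitting on every non-letter character ('@', '.', digits, etc.).
    # ASCII-only for simplicity; the real tokenizer is Unicode-aware.
    return re.findall(r"[A-Za-z]+", text)

print(letter_tokenize("Contact jane.doe@example.com for details"))
# The address is shredded into fragments:
# ['Contact', 'jane', 'doe', 'example', 'com', 'for', 'details']
```

If you need URLs and email addresses kept whole as single tokens,
Elasticsearch also ships a uax_url_email tokenizer for exactly that case.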
As an aside (unrelated to the original question), the English part of this statement is not true. The standard tokenizer is based on the Unicode Text Segmentation algorithm. See http://unicode.org/reports/tr29/. The standard analyzer does have some English-specific behavior, namely the default set of English stop words.
Very true, Ryan. I meant to say it was based on Latin character set languages, but
even that is false. I hope that the OP sees the difference between
tokenizers and token filters, especially for the standard tokenizer/token
filter. The former does tons, the latter does nothing!
That diagram does not highlight the fact that you can have several
character filters and token filters, but only one tokenizer. In practice,
character filters are seldom used (mainly for pattern removal or
substitution); the typical chain is a simple tokenizer followed by several
token filters, which work on the tokens the tokenizer generates. Chances
are you want to focus on the token filters.
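The ordering described above can be sketched as a small pipeline. This is a
simplified model with hypothetical stand-in components, not Elasticsearch's
actual API; in particular, real token filters can also split or merge tokens,
while this sketch only maps each token to at most one output:

```python
import re

def analyze(text, char_filters, tokenizer, token_filters):
    # Simplified analyzer model: any number of character filters run on
    # the raw text, exactly one tokenizer splits it into tokens, then
    # any number of token filters transform (or drop) each token in order.
    for cf in char_filters:
        text = cf(text)
    tokens = tokenizer(text)
    for tf in token_filters:
        tokens = [t for t in (tf(tok) for tok in tokens) if t]
    return tokens

# Hypothetical stand-ins for real analysis components:
strip_html = lambda s: re.sub(r"<[^>]+>", " ", s)        # character filter
whitespace = lambda s: s.split()                         # the one tokenizer
lowercase  = lambda t: t.lower()                         # token filter
drop_stop  = lambda t: None if t in {"the", "a"} else t  # stop-word filter

print(analyze("<b>The</b> Quick Fox",
              [strip_html], whitespace, [lowercase, drop_stop]))
# ['quick', 'fox']
```

Note how the character filter sees raw text, while the token filters only
ever see the tokens the single tokenizer produced.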