With multiple shards on a node, do the queries for those shards have to run serially?

Having multiple shards on a single node introduces another set of
performance considerations. As discussed earlier, in order to run a
complete query, Elasticsearch must run the query on each shard
individually. In a one-shard-per-node setup, all those queries can run
in parallel, because there's only one shard on each node. With multiple
shards on the node, the queries for those shards have to be run
serially.

Is that right?

Hi,

I don't think so. There are multiple search threads running even on a single node, so if your machine has multiple cores, they should all be used in parallel to search the shards.
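To illustrate the idea, here is a minimal Python sketch of one node searching several local shards on a thread pool. It is a toy model, not how Elasticsearch is actually implemented: `search_node`, the `shards` dictionary, and the pool size are all made up for the example. The barrier is only there to prove the point, since it can only be passed if every shard search is in flight at the same time.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def search_node(shards, term, pool_size, wait=2.0):
    """Toy model of one node searching all of its local shards for `term`.

    `shards` maps a (hypothetical) shard id to a list of document strings.
    """
    # Every shard search must be running at the same time to pass the barrier.
    # If the pool ran them one after another, wait() would time out and raise
    # threading.BrokenBarrierError.
    barrier = threading.Barrier(len(shards))

    def search_shard(docs):
        barrier.wait(timeout=wait)
        return [doc for doc in docs if term in doc]

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        futures = [pool.submit(search_shard, docs) for docs in shards.values()]
        per_shard_hits = [f.result() for f in futures]

    # Merge the per-shard hits into a single response.
    return sorted(hit for hits in per_shard_hits for hit in hits)


shards = {
    0: ["disk error on node-1", "all good"],
    1: ["timeout error", "heartbeat ok"],
    2: ["no problems here"],
}
print(search_node(shards, "error", pool_size=3))
# prints ['disk error on node-1', 'timeout error']
```

With `pool_size=3` the three shard searches overlap and the call returns normally. With `pool_size=1` they really would run serially, and in this sketch the barrier times out and the call raises `threading.BrokenBarrierError`.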

I am not sure. The official website only has some documentation about not using too many shards, so I am hoping for a definitive answer.

No. That is not correct.

For starters, Lucene is multithreaded, so even with a single shard you get a benefit as long as you have multiple threads.