How can I index HTML tags

I use the standard tokenizer and I don't use the html_strip char filter.
How can I index HTML tags?

In fact, I want to be able to search with and without the < and > characters. I.e. a search for <section> should match This is about the <section> tag, but it should not match In this section we talk about stuff. The standard tokenizer will turn that (search) text into ["section"].

As a bonus, if this can be done I don't have to worry about the stop token filter turning <a> into [].
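To make the problem concrete, here is a rough plain-Python emulation of what the standard tokenizer does to this text (a sketch only, not the real Lucene implementation):

```python
import re

def standard_tokenize(text):
    # Rough emulation: the standard tokenizer splits on non-word characters,
    # so < and > are dropped and never reach the token stream.
    return [t.lower() for t in re.findall(r"\w+", text)]

print(standard_tokenize("This is about the <section> tag"))
# ['this', 'is', 'about', 'the', 'section', 'tag']
print(standard_tokenize("In this section we talk about stuff"))
# ['in', 'this', 'section', 'we', 'talk', 'about', 'stuff']
```

Both sentences produce the same bare section token, so the two cases are indistinguishable at search time.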

I've found one hacky way to get it to work.

keep_html_char_filter = char_filter(
    "keep_html",  # filter name; any name works here
    type="mapping",
    mappings=[
        "<a> => _a_",
        "<i> => _i_",
        "<b> => _b_",
        "<section> => _section_",
    ],
)

This seems to work. The tokens for <a> now become ["_a_"], and for <section> they become ["_section_", "section"], which is probably because of how my analyzer is configured in conjunction with this char filter.
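The effect of this mapping char filter can be emulated in plain Python (a sketch; the extra bare section token in the real output presumably comes from another filter in the actual analyzer chain and is not reproduced here):

```python
import re

MAPPINGS = {"<a>": "_a_", "<i>": "_i_", "<b>": "_b_", "<section>": "_section_"}

def analyze(text):
    # Char filter stage: apply the literal replacements before tokenizing.
    for src, dst in MAPPINGS.items():
        text = text.replace(src, dst)
    # Tokenizer stage: underscores count as word characters,
    # so the mapped form survives as a single token.
    return [t.lower() for t in re.findall(r"\w+", text)]

print(analyze("<a>"))        # ['_a_']
print(analyze("<section>"))  # ['_section_']
```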

Still eager to hear some expert advice for the "proper" way to do this.

I also tried:

keep_html_char_filter = char_filter(
    "keep_html",
    type="html_strip",
    escaped_tags=["a", "b", "section", "i"],
)

But that didn't work: the tokens now become a, section, etc. (instead of being removed), but when they are passed through the token filters, the stop filter removes a.
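Why this fails can be sketched in plain Python (an emulation of the pipeline, with a toy stopword list):

```python
import re

STOPWORDS = {"a", "an", "and", "the", "is", "in", "this"}

def analyze(text):
    # html_strip with escaped_tags leaves <a> in the character stream,
    # but the standard tokenizer still drops the angle brackets...
    tokens = [t.lower() for t in re.findall(r"\w+", text)]
    # ...so by the time the stop filter runs, <a> is just "a" and is removed.
    return [t for t in tokens if t not in STOPWORDS]

print(analyze("<a>"))  # [] -- the tag is lost entirely
```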

I figured it out!!

keep_html_char_filter = char_filter(
    "keep_html",
    type="pattern_replace",
    # Reconstructed from the tokens shown below: wrap each bare tag name
    # in "html" so it survives tokenization as a distinct token.
    pattern=r"<(\w+)>",
    replacement="html$1html",
)

Now, if the text is <b> <a> <i> <script> I get the following tokens:

'htmlbhtml', 'htmlahtml', 'htmlihtml', 'htmlscripthtml'

That's great. A search for <section> will match This is about the <section> tag, but it will not match In this section we talk about stuff.
And <a> gets turned into htmlahtml, which means it's no longer a bare a that would be removed as a stopword.
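For the record, the transformation can be reproduced with a plain regex (an emulation, assuming the char filter is a pattern_replace with pattern <(\w+)> and replacement html$1html; adjust if your actual definition differs):

```python
import re

def analyze(text):
    # Emulate the pattern_replace char filter, then the standard tokenizer.
    filtered = re.sub(r"<(\w+)>", r"html\1html", text)
    return re.findall(r"\w+", filtered.lower())

print(analyze("<b> <a> <i> <script>"))
# ['htmlbhtml', 'htmlahtml', 'htmlihtml', 'htmlscripthtml']
```

A search for <section> now analyzes to htmlsectionhtml, which matches the markup example but not the plain word section.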

Would still appreciate an expert's advice. But otherwise I'm happy to close this as resolved.