So there are three common ways things can end up on disk:
- Inverted index
- Stored fields
- Doc values
There are others but they are rarer. Doc values work as @danielmitterdorfer explained with the ordinals. Stored fields are combined in chunks with other documents and compressed. Usually you don't interact with stored fields directly, but you do interact with _source, which is a stored field. The inverted index also has one copy of the text per segment.
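To make the ordinals idea concrete, here is a toy sketch in Python. It is not how Lucene implements doc values, just a model of the principle: each distinct string is kept once in a sorted dictionary, and every document stores only a small integer pointing into it. The function name `encode_ordinals` is made up for illustration.

```python
def encode_ordinals(values):
    """Toy model of ordinal encoding (not Lucene's actual on-disk format)."""
    # Each distinct string is stored once, in sorted order.
    terms = sorted(set(values))
    ordinal_of = {term: i for i, term in enumerate(terms)}
    # Every document keeps only a small integer instead of the full string.
    ordinals = [ordinal_of[v] for v in values]
    return terms, ordinals


terms, ordinals = encode_ordinals(["GET", "POST", "GET", "PUT", "GET"])
print(terms)
print(ordinals)
```

The more often a value repeats, the more the per-document cost is dominated by the integers rather than the strings, which is why doing the same conversion by hand buys little extra.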
So in an index with a billion docs you'd see a small space saving from switching to ordinals manually, but not a huge one. You'd save a tiny bit of space per chunk of stored fields and that'd add up, but probably not enough to be worth it. You'd save an even smaller amount of space in the inverted index as well, but again, I don't expect it'd add up to be worth it.
You should make sure that those strings are indexed as
"index": "not_analyzed" if you are on 2.x, or
"type": "keyword" if you are trying out 5.0. That gives you the most useful behavior for strings you don't want to analyze.
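As a rough sketch, the two mapping fragments could look like this when built as Python dicts and printed as JSON. The field name `verb` is an assumption for illustration, and on 2.x the field would be declared as a `string` alongside the `not_analyzed` setting; how you wrap these in a full mapping depends on your index and version.

```python
import json

FIELD = "verb"  # hypothetical field name, not from the original post

# 2.x style: a string field that is indexed without analysis.
mapping_2x = {"properties": {FIELD: {"type": "string", "index": "not_analyzed"}}}

# 5.0 style: the keyword type replaces not_analyzed strings.
mapping_5x = {"properties": {FIELD: {"type": "keyword"}}}

print(json.dumps(mapping_2x, indent=2))
print(json.dumps(mapping_5x, indent=2))
```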
One case where it is more obviously worth converting strings like GET/PUT/POST etc. into ordinals manually is when you want to use a range query on them or sort them non-alphabetically. There just aren't that many HTTP verbs, so it probably isn't worth it for a field like that, but it might make more sense on a field with 500 values.
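If you did go that way, a minimal sketch might look like the following. The ordering in `VERB_ORDER`, the extra field name `verb_ord`, and the helper names are all assumptions for illustration; the point is only that the integer carries an order you chose, so range queries and sorts follow it instead of alphabetical order.

```python
# Hypothetical custom ordering; pick whatever order is meaningful for your data.
VERB_ORDER = ["GET", "HEAD", "POST", "PUT", "PATCH", "DELETE"]
VERB_TO_ORD = {verb: i for i, verb in enumerate(VERB_ORDER)}


def to_ordinal(verb):
    """Map a verb to its position in the custom ordering."""
    return VERB_TO_ORD[verb.upper()]


def with_verb_ordinal(doc):
    """Return a copy of the document with an extra integer field to index."""
    enriched = dict(doc)
    enriched["verb_ord"] = to_ordinal(doc["verb"])
    return enriched


doc = with_verb_ordinal({"verb": "post", "path": "/login"})

# Query body using the integer field: a range over the custom order,
# sorted by that same order rather than alphabetically.
query = {
    "query": {
        "range": {
            "verb_ord": {"gte": to_ordinal("POST"), "lte": to_ordinal("PATCH")}
        }
    },
    "sort": [{"verb_ord": "asc"}],
}
```

You would still index the original string alongside the integer so that results stay readable, which is part of why the space saving ends up small.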