Here are some fun test expressions for parsers/calculators:
The first example has no closing parenthesis. The reference implementation will print something like "Expected ), got )" instead of "Expected ), got ASCII 0", because it calls token_kind_name twice in the same printf and token_kind_name has only one buffer, so the result of one call gets overwritten by the other (just like the ancestors have foretold).
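To make the single-buffer problem concrete, here is a minimal sketch. kind_name_shared is a hypothetical stand-in for token_kind_name (I'm assuming it formats into one static buffer, as described above; the real function may differ), and kind_name_into is just one possible fix, where the caller passes in the buffer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for token_kind_name: every call formats into
   the same static buffer and returns a pointer to it. */
const char *kind_name_shared(int kind) {
    static char buf[32];
    if (kind == 0) {
        snprintf(buf, sizeof buf, "ASCII %d", kind);
    } else {
        snprintf(buf, sizeof buf, "%c", kind);
    }
    return buf;
}

/* One possible fix (an assumption, not the reference code):
   the caller owns the buffer, so two results can coexist. */
const char *kind_name_into(int kind, char *buf, size_t size) {
    if (kind == 0) {
        snprintf(buf, size, "ASCII %d", kind);
    } else {
        snprintf(buf, size, "%c", kind);
    }
    return buf;
}
```

With kind_name_shared, printf("Expected %s, got %s", kind_name_shared(')'), kind_name_shared(0)) prints the same text for both arguments, since both pointers refer to the same buffer. Which of the two names survives depends on the order in which the compiler evaluates the arguments.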
The second one may trigger a "Floating point exception" (on Linux): division and modulo are UB in theory, and terminate the program in practice, not only when dividing by zero but also in the case of INT_MIN / -1, since the result can't fit into an int. Windows behaves similarly, I believe.
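Both cases can be caught before the division happens. A minimal sketch, where checked_div and checked_mod are hypothetical helper names (not part of the reference implementation) that report failure instead of executing the undefined operation:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a / b in *out and returns true,
   or returns false when the division would be undefined. */
bool checked_div(int a, int b, int *out) {
    if (b == 0 || (a == INT_MIN && b == -1)) {
        return false;
    }
    *out = a / b;
    return true;
}

/* Same guard for modulo: INT_MIN % -1 is undefined as well,
   because the corresponding quotient is not representable. */
bool checked_mod(int a, int b, int *out) {
    if (b == 0 || (a == INT_MIN && b == -1)) {
        return false;
    }
    *out = a % b;
    return true;
}
```

A calculator could call these in its division and modulo paths and print an error when they return false, rather than letting the process die.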
The SEI CERT C Coding Standard ("standard provides rules for secure coding in the C programming language") has a lot of rules on how to avoid UB and overflows, which may be useful to know when writing a VM. You don't necessarily have to adhere to them, but it's a good reference anyway.
And speaking of the VM: I've found stretchy buffers to be a nice way to cheat around stack depth verification. Since there are no jumps for now, the stack can only reach a fixed depth for a given program, so as long as the bytecode is of reasonable size, pushing and popping data on a stretchy buffer handles stack resizing and no overflow will occur.
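Roughly what I mean, as a sketch rather than the reference code: the buf_* macros below are a simplified stretchy buffer written for this post, and the opcodes and vm_run are made-up names for illustration. Arithmetic overflow checks are left out to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stretchy buffer: a header stored just before the elements. */
typedef struct {
    size_t len;
    size_t cap;
} BufHdr;

#define buf_hdr(b) ((BufHdr *)(b) - 1)
#define buf_len(b) ((b) ? buf_hdr(b)->len : 0)
#define buf_push(b, x) \
    ((b) = buf_grow((b), buf_len(b) + 1, sizeof *(b)), \
     (b)[buf_hdr(b)->len++] = (x))
#define buf_pop(b) ((b)[--buf_hdr(b)->len])
#define buf_free(b) do { if (b) { free(buf_hdr(b)); (b) = NULL; } } while (0)

/* Makes room for at least min_len elements, doubling the capacity. */
void *buf_grow(void *buf, size_t min_len, size_t elem_size) {
    size_t cap = buf ? buf_hdr(buf)->cap : 0;
    if (min_len <= cap) {
        return buf;
    }
    size_t new_cap = cap ? cap * 2 : 16;
    if (new_cap < min_len) {
        new_cap = min_len;
    }
    BufHdr *hdr = realloc(buf ? buf_hdr(buf) : NULL,
                          sizeof(BufHdr) + new_cap * elem_size);
    if (!hdr) {
        abort();
    }
    if (!buf) {
        hdr->len = 0;
    }
    hdr->cap = new_cap;
    return hdr + 1;
}

/* Hypothetical opcodes for a jump-free stack machine. */
enum { OP_LIT, OP_ADD, OP_MUL, OP_HALT };

int vm_run(const int *code) {
    int *stack = NULL; /* grows on demand, so pushes cannot overflow it */
    for (;;) {
        switch (*code++) {
        case OP_LIT:
            buf_push(stack, *code++);
            break;
        case OP_ADD: {
            assert(buf_len(stack) >= 2); /* underflow still needs a check */
            int r = buf_pop(stack);
            int l = buf_pop(stack);
            buf_push(stack, l + r);
            break;
        }
        case OP_MUL: {
            assert(buf_len(stack) >= 2);
            int r = buf_pop(stack);
            int l = buf_pop(stack);
            buf_push(stack, l * r);
            break;
        }
        case OP_HALT: {
            assert(buf_len(stack) == 1);
            int result = buf_pop(stack);
            buf_free(stack);
            return result;
        }
        default:
            abort(); /* unknown opcode */
        }
    }
}
```

For example, the bytecode {OP_LIT, 2, OP_LIT, 3, OP_LIT, 4, OP_MUL, OP_ADD, OP_HALT} evaluates 2 + 3 * 4. Note that the growing buffer only takes care of overflow; popping from an empty stack is a separate problem, hence the asserts.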