What I find most surprising about exploits like this is that the input is screened for alignment violations while the model's output is apparently exempt from any review. Even a cursory analysis of the output would reveal that alignment had failed, yet this basic sanity check is seemingly never performed.
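Even a crude output-side check would catch blatant failures. A toy sketch of what I mean (the phrase list, function names, and wrapper here are purely illustrative, not any vendor's actual pipeline):

```python
# Hypothetical output-side sanity check: apply the same kind of
# review to the model's reply that the prompt already receives.
BLOCKED_PHRASES = ["how to build a bomb", "disable the safety checks"]

def violates_policy(text: str) -> bool:
    """Naive placeholder classifier: flag output containing blocked phrases.

    A real system would use a trained moderation model, not substring
    matching; this just illustrates where the check belongs.
    """
    lowered = text.lower()
    return any(phrase in lowered for phrase in BLOCKED_PHRASES)

def respond(prompt: str, generate) -> str:
    """Wrap a generation function so its output is reviewed before release."""
    reply = generate(prompt)
    if violates_policy(reply):
        return "[response withheld: output failed policy review]"
    return reply
```

The point is not the sophistication of the filter but its placement: reviewing output, not just input, means a jailbroken generation still gets caught on the way out.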