2024-07-11
I don't know whether this revelation is just really obvious or really dull. To me it's interesting because it flies in the face of lots of rheroric that people use when talking about AI. AI, in the mainstream context, nowadays, is basically synonomous with LLMs, or multimodal agents that have language models as their core. So that's what I'm talking about here when I say AI.
In AI safety presentations (and lamentations) people seem to recount all the times they've made an LLM output 'bad' stuff. Maybe they've encountered prejudiced or harmful outputs that give them cause for concern. They might then use these findings to backup arguments for more regulation.
The thing that bothers me is that these LLM outputs are, even if troubling, a bit moot; the downstream effect of them is what's interesting. LLMs by themselves, however, are inert. We make them harmful by letting them direct reality. We do this by giving them "appendages", "levers", or "interfaces". Consider a superintelligent AI-runtime stuck in a box. It can't do anything until we give it a button. Perhaps the button makes a cog turn or a kettle boil. Who knows. But if it only has one button, and we know that even if it pressed that button a million times, nothing bad would happen, then we can rest assured. It's not different from discussions of any other human created algorithms, deterministic or not. They are as detrimental as we let them be by giving them hooks into reality. So it is the implementers of the hooks – the levers and buttons – who need to be careful.
I guess it's a bit obvious, but also a bit nuanced..?
Fundamentally: I'd rather more AI safety discussions focused on the levers and not the algorithms. Levers move soft reality into hard reality. Hard reality is the one we can be harmed by. Regulating an algorithm is like regulating a thought. It doesn't work. But regulating output, where the thought meets reality? That's meaningful.
Here's some nuanced differences in various AI domains between regulating the algorithm vs. regulating the lever:
Text Generation Interfaces
Code Execution Environments
Decision Support Systems (e.g. reviewing CVs)
Thanks for reading! (written by James)