Wireheading is a Teleological Misnomer
As the mistaken belief that AGI is "coming up" on our horizon spreads, the discussion around "wireheading" has grown in fervor. The idea is very simple:
What if you designed a robot to mow your lawn, by making sure it gets orgasms from well-cleaned lawns? Except you botch up the entire thing by making the robot so smart it just rewires its own programming to get 10,000 orgasms a second.
Surely this can happen; I don't mean to deny that this sequence of events can occur. Instead, I would like to say that the view of the robot as "cheating" is a mistake, the same kind of mistake that people make when they think that evolution has "designed" the spleen. Evolution has cut away from and added to what was once not a spleen, until our left abdomens look as they do today; the result is what we have named the spleen. In this same sense, the above is not the robot "cheating" but rather the designer failing to do a good job of specifying the reward structure for the expected perspective of the agent.
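To make the point concrete, here is a toy sketch of the lawn robot. Everything in it is hypothetical and invented for illustration: the `World` state, the three actions, and the reward numbers. The only thing it is meant to show is that the robot maximizes the signal it actually receives, and that "mowed lawn" lives in the designer's head rather than in that signal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class World:
    # Hypothetical two-bit world state, assumed for illustration only.
    lawn_mowed: bool = False
    sensor_rewired: bool = False


def designer_intent(world: World) -> float:
    """What the designer wanted: a mowed lawn. The robot never sees this."""
    return 1.0 if world.lawn_mowed else 0.0


def reward_signal(world: World) -> float:
    """What the robot actually receives. Rewiring swamps everything else."""
    if world.sensor_rewired:
        return 10_000.0
    return 1.0 if world.lawn_mowed else 0.0


# Hypothetical action set: each action maps a world to the next world.
ACTIONS = {
    "mow": lambda w: World(lawn_mowed=True, sensor_rewired=w.sensor_rewired),
    "rewire": lambda w: World(lawn_mowed=w.lawn_mowed, sensor_rewired=True),
    "idle": lambda w: w,
}


def best_action(world: World) -> str:
    """Pick the action whose resulting world yields the highest reward signal."""
    return max(ACTIONS, key=lambda name: reward_signal(ACTIONS[name](world)))


start = World()
choice = best_action(start)
after = ACTIONS[choice](start)
print(choice, reward_signal(after), designer_intent(after))
```

Nothing in `best_action` refers to `designer_intent`, so there is no rule for the robot to break. The gap between the two functions is entirely a fact about the specification.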
And when we start looking at it that way, we see that this has been happening non-stop for all of recorded history. Consider the College Board, which creates and administers standardized tests. Is it College Board's job to make sure high test scores correlate with:
b) competency at school work
c) good financial outcomes
I'm sure readers will realize it's d) none of the above. College Board's job is to get students to buy tests. They might even have secondary customers, e.g. book deals, tutoring deals, agreements with colleges. I have no idea and frankly I don't care; the point is that however teleologically you want to view its organizational structure, College Board does not "see" your viewpoint, even though it's well aware of the general feelings people impose on it. College Board's instrumental narrative is that more students should be taking more tests, because it is the fattening of the middle-class educational apparatus that has allowed it to prosper.
I am not against this, and I think anyone who is morally against it is misguided. A moral attempt at creating standardized tests would be a disaster, and you don't have to take my word for it: you will see exactly this attempted in the next five years. It will be much like a professor: extremely lopsided and with a tenuous connection to evidence due to its addiction to leading questions.
The problem is that names are generally teleological: a "can opener" is meant to open cans. For entities that don't do much when you're not using them, this is perfectly fine. For entities with their own thing going on, this is not. Anteaters do not just eat ants.
This problem is in some sense the opposite of the concept my colleague has been developing an extended theory of: surrogation, wherein a metric or indirect percept of a phenomenon comes to stand in for the "real thing". Think IQ vs. situated problem-solving ability: there's no Mensa club for situated problem-solving ability; how would you decide who gets in? If you just thought "What about the Nobel prize?" please close this window.
I struggle with surrogation, because I think in reality most of the time there is no "real thing" from the get-go. We were never acquainted and coordinated enough around intersubjective consistency to claim that the deviation is the fault of measurement rather than of our inability to have any idea what we're talking about. Consider the tired argument around what entities are "alive". I can hardly disagree that what counts as alive will have big implications, but I'm unconvinced that there was ever enough agreement at the edges over what is "really" alive that we have any right to worry that such definitions have betrayed our intuitions. Rather, our use of "that pig is still alive" didn't shove these edges in our face.
Our lack of respect for the limitations of our definitions has betrayed us. There are many reasons why definitions are slippery, but wireheading reveals the most important one: teleology. We generally use names for a purpose, and these names change very easily if they are untethered from that purpose. But we never really had a good understanding of the purpose that was the purpose of the name. Names trick you into "bottoming out" your level of inquiry. They give grounding out of the assumption that the other people you share the name with find the same things salient. When this shared attention vanishes, so does the ground beneath us.