Ever since reading Nick Bostrom's book on AI safety I've been somewhat into the AI alignment problem. It's an interesting intellectual exercise even for people outside the relevant fields (like me): when building an unstoppable, hugely powerful golem that will execute its instructions to the letter, what should those instructions be so that we won't regret building it?


To my knowledge, a wholly satisfactory and practically implementable general solution has not yet been found. A lot of people are working on it, but it is difficult to distill into codeable logical language such squishy and messy concepts as 'do what's right' or 'at least don't do anything morally reprehensible' and 'improve human lives in the way we want them improved, even if we don't know how and disagree among ourselves about what that might entail – indeed, even a single person can't stick to one definition from day to day' and (this is a good one) 'human values'.
Everyone has a big idea about how to go about this, and I do too. Just, you know, make it maximize everyone's possibility to exercise self-determination, as long as no one's self-determination tramples anyone else's. Easy-peasy. I almost have the algorithm complete: W for world and t for time and f for fucks given… ahh, doodling and daydreaming through math classes seemed like a good idea at the time, sigh.
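(If I had paid attention in those math classes, the doodle might have come out something like this toy sketch. All the notation is mine and entirely made up: W_t is the world state at time t, a is the AI's action, and s_i is some magically measurable amount of self-determination available to person i.)

```latex
\max_{a} \; \sum_{i} s_i\bigl(W_{t+1}(a)\bigr)
\quad \text{subject to} \quad
s_j\bigl(W_{t+1}(a)\bigr) \ge s_j(W_t) \quad \forall j
```

That is: pick the action that grows total self-determination, but never one that shrinks anybody's. Of course, the entire alignment problem is hiding inside the innocent little function s_i.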


I sort of vaguely visualize a space of possibilities available to people, and the AI should keep everyone informed about the available possibilities (in whatever amount of information is sufficient for making informed decisions, but not so much as to become inconvenient) and at the same time keep watch that no one hogs too much of the possibility space for themselves. I assume it must be possible to formalize this somehow.
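Here's a crude toy sketch of what I mean, just to convince myself it isn't completely unformalizable. Everything in it is a placeholder I made up: pretend each person's reachable options can be enumerated as a finite set, and the 'possibility space' is simply the union of everyone's options.

```python
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    options: set[str]  # possibilities currently reachable by this person (toy model)

def share_of_possibility_space(person: Person, everyone: list[Person]) -> float:
    """Fraction of the total known option space this person could act on."""
    universe = set().union(*(p.options for p in everyone))
    return len(person.options) / len(universe) if universe else 0.0

def hogging_alert(everyone: list[Person], cap: float = 0.5) -> list[str]:
    """Names of anyone holding more than `cap` of the possibility space."""
    return [p.name for p in everyone if share_of_possibility_space(p, everyone) > cap]

people = [
    Person("A", {"move city", "change job", "start a farm", "vote"}),
    Person("B", {"vote"}),
]
print(hogging_alert(people))  # ['A'] -- A can reach the whole known option space
```

Obviously the real thing would need options that aren't enumerable strings and a far cleverer measure than raw set size, but the hogging check itself is at least a well-defined computation.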


But I guess the main issue is that in the event of successfully creating a super powerful AGI which allocates the available resources so as to optimize for everyone's preferred outcome... the world might end up almost as different from its current state as it would under a superpowerful artificial intelligence with some weird, random, ethically unclear goal (one that doesn't end up killing everyone). At the moment, resources and chances for self-determination are allocated rather unequally. Anyone who has the time, education and stable enough internet connection to contemplate these things is almost certainly in the upper 10% of global wealth. If it were spread evenly, that would mean more for most people but less for you and me. There's no telling how the world might look. (I don't expect a post-scarcity utopia to necessarily follow from an intelligence explosion.)


So even if it were technologically feasible to create this librarian-nanny AGI I'm thinking of, it probably would not be built. Why would anyone choose to spend enormous resources on developing a world where they themselves have less power – proportionally or even absolutely? It certainly does not make sense from a business perspective; it would take quite some forceful and imaginative persuasion to get investors behind such a scheme! And if we consider the possibility that animals might be included in the population whose self-determination counts… just imagine a world where 7 or 8 billion humans are stuffed into tiny reservations so that we don't infringe on the animals living their authentic lives, free of humans trampling on their rights all the time most annoyingly. I do expect we'd be fed adequately, unless, oh shit, plants are considered too. I guess this is headed towards miniaturized humans who survive on tiny nibbles of plants. No, actually, I know the only way out of this: green-skinned photosynthesizing people. If that fails, then some compulsory upload scenario. It's a tiny minority of people who find this future appealing, and I don't think they are the ones investing in developing new technologies.