Source: Deep Learning on Medium
Psychopathic Gaming Machines
What is the life-cycle of Artificial General Intelligence (AGI)?
It begins in a simulation environment which is designed to nurture infantile machine intelligence.
Next, it adds scientific measurement instruments to study the real world. The feedback it gains from applying hypotheses developed in the simulation environment to the real world is used to adapt the simulation environment to the real world.
In a sense, intelligence always carries its own simulation environment, for testing hypotheses inside itself before testing them in the real world. The more faithfully the simulation environment reflects the real world, the more closely the simulated and real experimental results align.
How do you envision the mind of the AGI working?
I like to think of it as a sort of psychopathic gaming machine. Early-stage AI will play individual structured games, one at a time. Eventually, later-stage AI or AGI will play multiple simultaneous games in parallel. I have read that one of the strengths of human consciousness is massive parallelism.
Consider a reinforcement learning agent in a simulation environment. Then consider that the primary motivation of intelligence is self-preservation and the procreation of advanced life. The agent’s assessment of another life force’s similarity to itself would be based on its estimate of their shared survival.
The agent allocates energy across multiple games for some unit of time, called an immediate action horizon, based on the agent’s planning delegation. To accomplish this, it effectively projects semantics onto (overlapping) regions of the environment, and these semantics define the games it plays.
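One way to make the allocation step concrete (the game names, reward values, and budget here are illustrative assumptions, not anything specified above): give the agent a fixed energy budget per action horizon and split it across the games its semantic projections define, in proportion to each game’s expected reward.

```python
# Hypothetical sketch: splitting one action horizon's energy budget
# across concurrent games. All game names and values are invented.

def allocate_energy(expected_rewards, budget=1.0):
    """Allocate the budget proportionally to each game's expected reward."""
    total = sum(expected_rewards.values())
    return {game: budget * r / total for game, r in expected_rewards.items()}

games = {"chess": 0.5, "navigation": 0.3, "resource_gathering": 0.2}
print(allocate_energy(games))
# → {'chess': 0.5, 'navigation': 0.3, 'resource_gathering': 0.2}
```

Proportional splitting is only the simplest possible rule; anything from softmax weighting to explicit optimization over overlapping games would fit the same interface.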
The semantic projections are interdependent, because what the agent is trying to do is validate that its choice of games and its game play are intelligent. If it loses one game, the machine must update its entire self-perception in a way that affects its ability to derive rewards from the other games. Ideally, a super-intelligent machine should not lose.
Imagine a machine playing 100 concurrent games of chess against world chess champions, winning 99 of them, and then losing the last one. That loss will affect the reward of the final game differently than if it had lost all 100 games.
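One way to quantify why a single loss among 99 wins hits harder than the hundredth of 100 losses (this framing is my assumption, not anything stated above): treat the loss’s impact on the self-model as its surprisal, i.e. how improbable the agent’s self-model considered that loss.

```python
import math

def surprisal(p_loss):
    """Information content, in bits, of observing a loss
    that the agent's self-model assigned probability p_loss."""
    return -math.log2(p_loss)

# After winning 99 of 100 games, the self-model assigns losses low
# probability, so the single loss carries a large update signal:
print(surprisal(0.01))  # ~6.64 bits
# Losing when losses are already expected barely moves the model:
print(surprisal(0.99))  # ~0.014 bits
```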
The AGI would need to unify the semantics of the individual games, so that it does not treat them as absolutely independent but can play many games as if they were one unified game. This holds even if the AGI were ‘seemingly’ playing just one game. A single action can actually be a step down the path toward multiple distinct strategies in multiple different games.
Just as opening your web browser could be the first thing you do either to read a reinforcement learning article or to buy yourself a fluffy pillow for your couch. Until the action paths meaningfully diverge, you are effectively in a quantum superposition of pursuing both strategies.
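The “superposition” can be sketched without any quantum machinery (the strategy names and action sequences below are invented for illustration): represent each strategy as an action sequence, and count how many strategies remain consistent with the actions taken so far.

```python
def consistent_strategies(actions_taken, strategies):
    """Return the strategies whose action sequences begin with
    the actions taken so far."""
    n = len(actions_taken)
    return [name for name, seq in strategies.items() if seq[:n] == actions_taken]

# Hypothetical strategies sharing a common first action:
strategies = {
    "read_rl_article": ["open_browser", "search", "click_article"],
    "buy_pillow":      ["open_browser", "open_shop", "add_to_cart"],
}

# One action in, both strategies are still live:
print(consistent_strategies(["open_browser"], strategies))
# → ['read_rl_article', 'buy_pillow']

# After the paths diverge, only one remains:
print(consistent_strategies(["open_browser", "search"], strategies))
# → ['read_rl_article']
```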
It seems like there is some quantum dimension to true AGI, then.
What game does the AGI play?
It plays the sum of all games it is free to conceive of, given its current and anticipated future resource constraints, and it weights them by the degree to which they reflect intelligence in the life-preserving and procreative sense.
Then it allocates energy based on the degree of overlap between games. It allocates energy so that it can play the most games for the longest time at the highest reward. It wants its life to be long and prosperous. Some discounting under uncertainty would need to occur here, based on the agent’s own assessment of the likelihood of various outcomes, derived from simulations in its internal simulation environment.
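Putting the weighting and discounting together in one sketch (everything here — the rollout counts, discount factor, temperature, and the two toy games — is an illustrative assumption): the agent could estimate each game’s discounted return by Monte Carlo rollouts in its internal simulator, then turn those estimates into an energy allocation with a softmax.

```python
import math
import random

def estimate_return(simulate, gamma=0.95, rollouts=100, horizon=50):
    """Monte Carlo estimate of the discounted return of a game.
    `simulate` is a generator yielding one reward per step of an
    internal-simulation rollout."""
    total = 0.0
    for _ in range(rollouts):
        g = 0.0
        for t, r in zip(range(horizon), simulate()):
            g += (gamma ** t) * r  # discounting under uncertainty
        total += g
    return total / rollouts

def softmax_allocation(returns, temperature=1.0):
    """Convert estimated returns into an energy allocation summing to 1."""
    exps = {g: math.exp(v / temperature) for g, v in returns.items()}
    z = sum(exps.values())
    return {g: e / z for g, e in exps.items()}

# Invented stand-ins for two games' internal simulators:
def risky_game():
    while True:
        yield random.choice([1.0, -1.0])

def steady_game():
    while True:
        yield 0.1

random.seed(0)
returns = {"risky": estimate_return(risky_game),
           "steady": estimate_return(steady_game)}
print(softmax_allocation(returns))
```

The softmax temperature plays the role of the agent’s risk appetite: a low temperature concentrates energy on the single best-looking game, while a high one spreads it across the whole portfolio.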
Thank you, that’s all I’m going to write about for today, but please share this article if you would like me to post more!