Three Paths to Existential Catastrophe from AI

[EDIT 6/17: A previous version of this post had a different title. I changed it to “Three Paths to Existential Catastrophe from AI” to more clearly convey my point]

In the past, I have pointed out that AI risk is one of the most significant near-term problems we face as a society.

It is particularly important to focus on how AI might present an existential (civilization-ending) risk. As I have noted before, this is because civilization can recover from catastrophic events, while existential events are fundamentally irreversible, foreclosing potentially billions of years of future civilization.

Of course, AI also presents catastrophic risks, and significant efforts should be made to mitigate them. But focusing on existential risk can give us a clearer view of the largest dangers of AI and how to deal with them.

Since intelligence is not itself an existential risk, a malicious AI would need either to compound several catastrophic events into an existential one (as exemplified by the Skynet Scenario below) or to use its intelligence to bring about a single, directly existential event.

Given this, I see three important ways an AI could induce an existential catastrophe:

The Skynet Scenario: This is the most vivid route by which AI could become an x-risk. Essentially, like in the Terminator movies, AI gains control of autonomous weapons and physically attacks humans, using superior strategy to eliminate the human race. This includes the use of drones, cyber-attacks, nuclear weapons, asteroids, engineered pandemics, and political tactics designed to incite wars.

This scenario will become technically possible soon and could occur over a very short timescale; both of these factors raise the importance of this threat. Autonomous weaponry is already being developed, and states may face competitive pressures to deploy it. However, I personally believe this scenario is pretty implausible. It would seem to require an extreme degree of military automation combined with an omnicidal AI that is significantly more powerful than its opponents.

Regardless, this problem has a relatively straightforward set of solutions. First, AI-controlled weaponry should be banned by international treaty (especially automated nuclear weaponry). Second, protecting against various catastrophic risks can make it harder for malicious AIs to attack humanity.

The AI-Totalitarian Scenario: This is a much subtler path by which AI could present an x-risk. Rather than attacking humanity, the AI gains our trust and slowly takes over the functions of government. Since AIs may be extremely patient, and multiple AIs can engage in value handshakes, one can imagine many AIs agglomerating power over centuries as they gradually absorb the functions of government.

How would this present an x-risk? Once AIs gained significant control of government, they might form a world government that could enforce a ban on space travel, severely curtailing human expansion. Note that controlling the functions of government also gives an AI a better platform from which to physically attack humanity.

This would constitute a much slower catastrophe and would be harder to notice until it was too late. The scenario seems relatively plausible, as we can imagine ceding more and more of our decisions to an AI as we grow to trust it.

I hope to write more on how to deal with this AI-totalitarianism problem, but for now, we can take steps to prevent a global government from forming and to protect people’s right to access space.

The Space-Race Scenario: Even if an AI never attacks humanity and never prevents human space exploration, it can still greatly limit our future prospects. This is because an AI might quickly expand into space and vastly reduce the galactic resources available to us. The timeline for this sort of catastrophe depends on how soon we develop space-faring technologies and how feasible interstellar travel is in general.

International regulation of AI control over space technologies can help mitigate this risk. Additionally, carefully observing the behavior and innovations of AI space explorers might allow us to “strategy-steal” and expand into space as fast as the AI does (though the linked discussion includes good points on why strategy-stealing might not work).

Conclusion

Focusing on the routes by which AI increases existential risk gives a clearer understanding of how to prevent a civilization-ending event.

But I am not entirely sure which issue deserves the most attention. Though it seems unlikely, the Skynet Scenario is also the most near-term problem and thus may be the most important to work on. The AI-Totalitarian Scenario seems more probable and harder to fix, but may take centuries to occur, suggesting that this problem might be better suited to future generations of AI-safety researchers. In some sense, the Space-Race Scenario seems like the least bad option: even if a competing AI somewhat curtails our prospects in space, the possibility of stealing its innovations and thus accelerating our own expansion might be a net benefit. But this hinges on how feasible it is to copy the AI’s strategy.

Though this list seems to cover the major possibilities, it is dangerous to believe that we have everything figured out. To be safe, we should always assume that there are unanticipated ways AI might pose an x-risk. In addition to working on the problems above, we should continue developing AI safety techniques, create general-purpose failsafes, and mitigate other sources of catastrophic and existential risk.
