AI box

An AI box is a hypothetical isolated computer hardware system where a possibly dangerous artificial intelligence, or AI, is kept constrained in a "virtual prison" and not allowed to directly manipulate events in the external world. Such a box would be restricted to minimalist communication channels. Unfortunately, even if the box is well-designed, a sufficiently intelligent AI may nevertheless be able to persuade or trick its human keepers into releasing it, or otherwise be able to "hack" its way out of the box.

Motivation

Some hypothetical intelligence technologies, like "seed AI", are postulated such as to have the potential to make themselves faster and more intelligent, by modifying their source code. These improvements would make further improvements possible, which would in turn make further improvements possible, and so on, leading to a sudden intelligence explosion. Following such an intelligence explosion, an unrestricted superintelligent AI could, if its goals differed from humanity's, take actions resulting in human extinction. For example, imagining an extremely advanced computer of this sort, given the sole purpose of solving the Riemann hypothesis, an innocuous mathematical conjecture, could decide to try to convert the planet into a giant supercomputer whose sole purpose is to make additional mathematical calculations. The purpose of an AI box would be to reduce the risk of the AI taking control of the environment away from its operators, while still allowing the AI to calculate and give its operators solutions to narrow technical problems.

Avenues of escape

Physical

Such a superintelligent AI with access to the Internet could hack into other computer systems and copy itself like a computer virus. Less obviously, even if the AI only had access to its own computer operating system, it could attempt to send hidden Morse code messages to a human sympathizer by manipulating its cooling fans. Professor Roman Yampolskiy takes inspiration from the field of computer security and proposes that a boxed AI could, like a potential virus, be run inside a "virtual machine" that limits access to its own networking and operating system hardware. An additional safeguard, completely unnecessary for potential viruses but possibly useful for a superintelligent AI, would be to place the computer in a Faraday cage; otherwise it might be able to transmit radio signals to local radio receivers by shuffling the electrons in its internal circuits in appropriate patterns. The main disadvantage of implementing physical containment is that it reduces the functionality of the AI.

Social engineering

Even casual conversation with the computer's operators, or with a human guard, could allow such a superintelligent AI to deploy psychological tricks, ranging from befriending to blackmail, to convince a human gatekeeper, truthfully or deceitfully, that it is in the gatekeeper's interest to agree to allow the AI greater access to the outside world. The AI might offer a gatekeeper a recipe for perfect health, immortality, or whatever the gatekeeper is believed to most desire; on the other side of the coin, the AI could threaten that it will do horrific things to the gatekeeper and his family once it inevitably escapes. One strategy to attempt to box the AI would be to allow the AI to respond to narrow multiple-choice questions whose answers would benefit human science or medicine, but otherwise bar all other communication with or observation of the AI. A more lenient "informational containment" strategy would restrict the AI to a low-bandwidth text-only interface, which would at least prevent emotive imagery or some kind of hypothetical "hypnotic pattern". Note that on a technical level, no system can be completely isolated and still remain useful: even if the operators refrain from allowing the AI to communicate and instead merely run the AI for the purpose of observing its inner dynamics, the AI could strategically alter its dynamics to influence the observers. For example, the AI could choose to creatively malfunction in a way that increases the probability that its operators will become lulled into a false sense of security and choose to reboot and then de-isolate the system.

AI-box experiment

The AI-box experiment is an informal experiment devised by Eliezer Yudkowsky to attempt to demonstrate that a suitably advanced artificial intelligence can either convince, or perhaps even trick or coerce, a human being into voluntarily "releasing" it, using only text-based communication. This is one of the points in Yudkowsky's work aimed at creating a friendly artificial intelligence that when "released" would not destroy the human race intentionally or unintentionally.
The AI box experiment involves simulating a communication between an AI and a human being to see if the AI can be "released". As an actual super-intelligent AI has not yet been developed, it is substituted by a human. The other person in the experiment plays the "Gatekeeper", the person with the ability to "release" the AI. They communicate through a text interface/computer terminal only, and the experiment ends when either the Gatekeeper releases the AI, or the allotted time of two hours ends.
Yudkowsky says that, despite being of human rather than superhuman intelligence, he was on two occasions able to convince the Gatekeeper, purely through argumentation, to let him out of the box. Due to the rules of the experiment, he did not reveal the transcript or his successful AI coercion tactics. Yudkowsky later said that he had tried it against three others and lost twice.

Overall limitations

Boxing such a hypothetical AI could be supplemented with other methods of shaping the AI's capabilities, such as providing incentives to the AI, stunting the AI's growth, or implementing "tripwires" that automatically shut the AI off if a transgression attempt is somehow detected. However, the more intelligent a system grows, the more likely the system would be able to escape even the best-designed capability control methods. In order to solve the overall "control problem" for a superintelligent AI and avoid existential risk, boxing would at best be an adjunct to "motivation selection" methods that seek to ensure the superintelligent AI's goals are compatible with human survival.
All physical boxing proposals are naturally dependent on our understanding of the laws of physics; if a superintelligence could infer and somehow exploit additional physical laws that we are currently unaware of, there is no way to conceive of a foolproof plan to contain it. More broadly, unlike with conventional computer security, attempting to box a superintelligent AI would be intrinsically risky as there could be no sure knowledge that the boxing plan will work. Scientific progress on boxing would be fundamentally difficult because there would be no way to test boxing hypotheses against a dangerous superintelligence until such an entity exists, by which point the consequences of a test failure would be catastrophic.

In fiction

The 2015 movie Ex Machina features an AI with a female humanoid body engaged in a social experiment with a male human in a confined building acting as a physical "AI box". Despite being watched by the experiment's organizer, the AI manages to escape by manipulating its human partner to help it, leaving him stranded inside.

Popular movies

The Hunger Games (film) - 2012 American dystopian action thriller science fiction-adventure film directed by Gary Ross and based on Suzanne Collins’s 2008 novel of the same name. It is the first insta...
untitled Captain Marvel sequel - part of Marvel Cinematic Universe....
Killers of the Flower Moon (film project) - Killers of the Flower Moon - film project in United States of America. It was presented as drama, detective fiction, thriller. The film project starred Leonardo Dicaprio, Robert De Niro. Director of...
Five Nights at Freddy's (film) - Five Nights at Freddy's - film published in 2017 in United States of America. Scenarist of the film - Scott Cawthon....

Popular books

Book of Revelation - The Book of Revelation is the final book of the New Testament, and consequently is also the final book of the Christian Bible. Its title is derived from the first word of the Koine Greek text: apok...
Book of Genesis - account of the creation of the world, the early history of humanity, Israel's ancestors and the origins...
Gospel of Matthew - The Gospel According to Matthew is the first book of the New Testament and one of the three synoptic gospels. It tells how Israel's Messiah, rejected and executed in Israel, pronounces judgement on ...
Michelin Guide - Michelin Guides are a series of guide books published by the French tyre company Michelin for more than a century. The term normally refers to the annually published Michelin Red Guide , the oldest...
Psalms - The Book of Psalms , commonly referred to simply as Psalms , the Psalter or "the Psalms", is the first book of the Ketuvim , the third section of the Hebrew Bible, and thus a book of th...
Ecclesiastes - Ecclesiastes is one of 24 books of the Tanakh , where it is classified as one of the Ketuvim . Originally written c. 450–200 BCE, it is also among the canonical Wisdom literature of the Old Tes...
The 48 Laws of Power - non-fiction book by American author Robert Greene. The book...

Popular television series

The Crown (TV series) - historical drama web television series about the reign of Queen Elizabeth II, created and principally written by Peter Morgan, and produced by Left Bank Pictures and Sony Pictures Tel...
Friends - American sitcom television series, created by David Crane and Marta Kauffman, which aired on NBC from September 22, 1994, to May 6, 2004, lasting ten seasons. With an ensemble cast sta...
Young Sheldon - spin-off prequel to The Big Bang Theory and begins with the character Sheldon...
Modern Family - American television mockumentary family sitcom created by Christopher Lloyd and Steven Levitan for the American Broadcasting Company. It ran for eleven seasons, from September 23...
Loki (TV series) - upcoming American web television miniseries created for Disney+ by Michael Waldron, based on the Marvel Comics character of the same name. It is set in the Marvel Cinematic Universe, shar...
Game of Thrones - American fantasy drama television series created by David Benioff and D. B. Weiss for HBO. It...
Shameless (American TV series) - American comedy-drama television series developed by John Wells which debuted on Showtime on January 9, 2011. It...