1. Introduction
About 10,000 scientists[1] around the world work on different aspects of creating intelligent machines, with the main goal of making such machines as capable as possible. Given the remarkable progress made in the field of artificial intelligence (AI) over the past decade, it is more important than ever to ensure that the technology we are developing has a beneficial impact on humanity. The appearance of robotic financial advisors, self-driving cars and personal digital assistants brings with it many unresolved problems. We have already experienced market crashes caused by intelligent trading software[2], accidents caused by self-driving cars[3] and embarrassment from chatbots[4] that turned racist and engaged in hate speech. We predict that both the frequency and the seriousness of such events will steadily increase as AIs become more capable. The failures of today’s narrow-domain AIs are just a warning: once we develop artificial general intelligence (AGI) capable of cross-domain performance, hurt feelings will be the least of our concerns.
In a recent publication, Yampolskiy proposed a Taxonomy of Pathways to Dangerous AI (Yampolskiy, 2016b), which was motivated as follows: “In order to properly handle a potentially dangerous artificially intelligent system it is important to understand how the system came to be in such a state. In popular culture (science fiction movies/books) AIs/Robots became self-aware and as a result rebel against humanity and decide to destroy it. While it is one possible scenario, it is probably the least likely path to appearance of dangerous AI.” Yampolskiy suggested that much more likely causes include deliberate actions of not-so-ethical people (“on purpose”) (Brundage et al., 2018), side effects of poor design (“engineering mistakes”) and, finally, miscellaneous cases related to the impact of the system’s surroundings (“environment”). Because a purposefully designed dangerous AI is likely to exhibit all the other types of safety problems as well, and will probably have the direst consequences, an AI made malevolent on purpose is both the most dangerous type of AI and the one most difficult to defend against.
A follow-up paper (Pistono and Yampolskiy, 2016) explored how a Malevolent AI could be constructed and why it is important to study and understand malicious intelligent software. The authors observe that “cybersecurity research involves publishing papers about malicious exploits as much as publishing...





