Hacking Super Intelligence

Original article was published by Guy Harpak on Artificial Intelligence on Medium

Photo by Allan Beaufour, license: CC BY-NC-ND 2.0


AI/ML is affecting sensitive decision making — to protect our systems we need a unified framework and a new discipline: AI Security

There’s a new type of cyber attack in the world. These attacks don’t resemble traditional ones and can’t be countered with traditional measures. Today there’s only a trickle of them, but in the coming decade we may be facing a tsunami. To prepare, we need to start securing our AI systems today.

I wanted to open with the premise for AI Security, only to realize I was risking a cliché about how disruptive AI is. Just to get it off the table: AI is not only part of our daily life (search engine suggestions, photo filters, digital voice assistants). It is already involved in critical decision making processes. National agencies use AI to better analyze data, cars use it to make life-saving decisions (and drive autonomously), and financial institutions use it daily. We are probably experiencing the Cambrian Explosion of AI applications, and in the coming decade AI will change the most sensitive areas of our lives: medical diagnosis, financial institutions’ decision making, control of critical infrastructure and decision making in military systems.

Based on more than 3 decades of digital hacks and frauds, we can assume that criminal minds are already working to exploit the weaknesses of this technology — and some of these weaknesses are just as new as AI is [1]. Still, of the tens of thousands of people working to develop these AI systems, probably only a fraction are focusing on the safety and security of AI systems.

In this short article, I intend to briefly review “AI native risks”, provide some background on AI policies to highlight the intensity with which the topic is addressed and then propose a framework for a new security discipline: AI Security.

The Risk

“Artificial intelligence is the future, not only for Russia, but for all humankind. It comes with colossal opportunities, but also threats that are difficult to predict. Whoever becomes the leader in this sphere will become the ruler of the world.” – Vladimir Putin (RT)

In the technology world, new risks often follow innovation. These risks start small and then explode (see my post on exponential risks and specifically the IoT use case). AI has an exponential potential to be destructive. Just think of its impact on digital data; think of all the online videos, recorded phone calls, security cameras, websites… now imagine an AI algorithm that analyses any such piece of data, finds hidden connections and extracts new meaning from it. The data analysis creates a new body of data that is almost as big as the original one and no one can anticipate what this data will look like. Today, we give these systems the ability to react and interact, for example, let AI answer phone calls or drive cars, exponentially increasing the avenues for damage. Decision making is done in an unpredictable way relying on unpredictable data. How safe do you feel?

Even if you trust AI’s ability to make the right decisions and you trust the businesses/agencies that use AI, there is another important caveat: AI is susceptible to new types of attacks. Today’s cyber security is mostly focused on protecting IT systems and Cyber-Physical Systems (CPS) against exploitation of vulnerabilities in their logic, code or architecture. When targeting AI systems, attackers have new methods of operation that almost no one is countering, for example:

  1. Poisoning the training data at the learning phase to bias the model.
  2. Adversarial attacks that fool the AI by exploiting blind spots in its training.
  3. Brute forcing the model to extract training data, breaching privacy or data secrecy.
  4. Physical model extraction, allowing attackers to steal the model itself and thereby breach the owner’s intellectual property.
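
To make adversarial attacks concrete, here is a minimal FGSM-style sketch against a toy linear classifier. The weights and inputs are invented for illustration; a real attack would use the input gradient of a trained network.

```python
import numpy as np

# Toy "model": a linear classifier with hand-picked weights (illustrative only).
w = np.array([2.0, -3.0])
b = 0.1

def predict(x):
    """Class 1 if the decision score w.x + b is positive, else class 0."""
    return int(np.dot(w, x) + b > 0)

def fgsm_perturb(x, eps):
    """FGSM-style step: nudge each feature by eps against the current class.
    For a linear model, the input gradient of the score is simply w."""
    direction = -1 if predict(x) == 1 else 1   # push the score toward the other class
    return x + direction * eps * np.sign(w)

x = np.array([1.0, 0.2])          # score = 1.5 -> class 1
x_adv = fgsm_perturb(x, eps=0.5)
print(predict(x), predict(x_adv))  # the small perturbation flips the label to 0
```

The point is not the toy model but the mechanism: a perturbation bounded per feature, invisible to a human reviewer, moves the input across the decision boundary.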

There are numerous other examples [2]. When AI becomes critical in companies’ business processes and sensitive decision making, it’s easy to see how the aforementioned attacks can lead to damage scenarios across the SFOP range: safety, financial, operational and privacy.

Related Policies & Research

Timeline of AI strategic documents, effective as of April 2020, UNICRI under CC BY-SA 4.0

The term AI Safety Engineering was coined by Roman Yampolskiy as far back as 2010, drawing early attention to the problem we face. In January 2017, a group of leading industry and academic researchers devised the Asilomar AI Principles [3]. Signed by over 1,600 researchers and endorsed by names such as Elon Musk and Stephen Hawking, the Asilomar AI Principles aim to provide a framework of important considerations for AI research. Most of the principles concern AI ethics and values. Principle 6 explicitly states that AI Safety must be a consideration in all AI implementations. There is definitely increased awareness, and yet only a small number of people are actively researching AI Safety.

On the bright side, regulators around the world are waking up to this risk early, contrary to past technology risks. GDPR, for example, came about three decades after technology breached our privacy, and the California IoT Act [4] two decades after IoT was a thing. In the case of AI, governments are paying attention early on [5]. The White House, for example, published the “Artificial Intelligence for the American People” fact sheet in 2018. The European Commission published “Ethics Guidelines for Trustworthy Artificial Intelligence (AI)” [6] in April 2019, and China has been investing in the field for years. AI policy is a broad topic which I won’t go into in detail in this post. For a deeper view, you can study the timeline of strategic documents above and read the summary paper from CLTC [7].

But even with all these policies, anyone can program an AI app in a few minutes and go unnoticed. The tools we have to manage the risks are effectively non-existent.

There are many angles to tackle AI Safety, and as a techie I want to focus on the technological options.

Solutions Landscape

First, let’s define our terms.

AI Safety, in my vocabulary, only covers part of what we need to trust AI. To generalise, I use the term AI Trust: the ability of a business and a user to sleep well when AI algorithms make decisions. To designate the discipline of securing AI systems against attacks I use the term AI Security.

AI Trust: human ability to trust the results of AI considering:

  1. Limited explainability
  2. Susceptibility to cyber attacks against IT systems
  3. Susceptibility to attacks against the AI’s way of working (e.g. adversarial attacks, poisoning)
  4. Risk of major data leakage

AI Security: the discipline of protecting AI systems from:

  1. AI native attacks (adversarial, poisoning, etc.)
  2. Theft of AI models (i.e. intellectual property)
  3. Inference of training data
  4. Infringement of privacy

While I keep stating that only a small number of people engage in AI Safety/Security, the industry and academia are not completely asleep. In a technical paper from the University of California [8], the authors mapped different fields for research considering the massive adoption of AI. Four of these areas reside in the AI Security discipline: robust decisions, secure enclaves, shared learning on confidential data and adversarial learning. In recent years, major conferences have chosen to focus on AI security as a topic. In Black Hat USA 2020, for example, there were 8 briefings around ML/AI topics, and 2 of them focused directly on model extraction [9] and protecting from adversarial attacks [10]. The ACM Workshop on AI and Security (AISec) is covering an increasing number of sessions on AI Security topics. To the academic interest we should add the focused interest from the industry: see for example this overview piece by Huawei [11] and the open source project led by IBM [12] that provides a full toolkit for preparing your system to deal with adversarial attacks.

AI Security Technology

The research interest has led to practical solutions. Forward-looking companies can already rely on the existing body of knowledge to counter AI attacks. From a professional point of view, I believe we should first cover the boring stuff:

Image by Author


  1. Clear definition of the AI Security domain: what do we tackle?
  2. Clear responsibility for AI security in the organisation: is it the product owner, engineering or compliance department?

R&D processes:

  1. Take AI related risks into account during the development life cycle. This should cover both product development and data science operations.
  2. Each project should identify AI risks and related AI Security objectives at the initiation.
  3. Security testing and Red Teams should be trained to use AI attack vectors.
  4. Training data must be protected and analysed for threats.


Engineering:

  1. R&D teams should use standard off-the-shelf tools to harden the systems.
  2. Data must be filtered and prepared to counter adversarial attacks.
  3. AI Models should be hardened to reduce sensitivity and, if possible, to counter attacks.
  4. Deployment in production should use different controls for securing model and training data confidentiality.
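
Sensitivity reduction is often done with defensive distillation: retraining a student model on the teacher’s temperature-softened outputs. A minimal sketch of how the soft targets are produced (the logits here are made up for illustration):

```python
import numpy as np

def softmax(z, T=1.0):
    """Softmax with temperature T; higher T flattens the distribution."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()              # numerical stability
    e = np.exp(z)
    return e / e.sum()

# Hypothetical teacher logits for one training example.
logits = np.array([8.0, 2.0, 1.0])

hard_targets = softmax(logits)          # near one-hot at T=1
soft_targets = softmax(logits, T=20.0)  # flatter targets for the student
print(np.round(hard_targets, 3))   # -> [0.997 0.002 0.001]
print(np.round(soft_targets, 3))   # -> [0.409 0.303 0.288]
```

Training the student on the soft targets smooths its decision surface, which tends to make small adversarial perturbations less effective.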


Operations:

  1. AI systems should be monitored for bias, poisoning and other attack attempts.
  2. Inference and queries should be analysed in production to detect attack attempts (i.e. adversarial attacks or data extraction).

Technology wise, existing tools can cover at least the basic threats:

  1. Model extraction: analyse inputs to detect model inference attempts. For example, integrate the PRADA detection model [13] in-line and have measures to respond to attack attempts.
  2. Poisoning: validate the training data and sample it (sparsely remove examples, as much as it hurts 🙂). In production, compare inference results between different independent models (ensemble analysis). Also, look for anomalies in the data.
  3. Adversarial attacks: train your model with adversarial training data. This can be achieved using the ART toolbox during the training phase. Before production try and minimise your model’s sensitivity, for example by using knowledge transfer [14]. Finally, explore options to protect your model in the field with reconstruction [15] and detection of suspicious input.
  4. IP theft: the deployment of the model, especially in edge deployment cases, must use an architecture that protects at least parts of the model from local attackers. This can be achieved by encrypting the model when it is not in use and making use of secure enclaves and TEEs to protect critical parts. Finally, consider including watermarks in your model: neural paths that are only triggered by a specific input and can help you spot if someone stole your model.
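
The ensemble analysis mentioned under poisoning can be sketched with toy linear scorers. The weights below are made up; in practice the models would be trained independently, ideally on disjoint subsets of the data:

```python
import numpy as np

# Three hypothetical, independently trained linear scorers.
models = [np.array([1.0, 1.0]), np.array([1.2, 0.8]), np.array([0.9, 1.1])]

def votes(x):
    """Binary vote of each ensemble member."""
    return [int(np.dot(w, x) > 0) for w in models]

def is_suspicious(x):
    """Flag inputs on which the members disagree: a cheap proxy for
    poisoned, adversarial or out-of-distribution inputs."""
    return len(set(votes(x))) > 1

print(is_suspicious(np.array([2.0, 2.0])))    # clear-cut input -> False
print(is_suspicious(np.array([1.0, -1.1])))   # near-boundary input -> True
```

A poisoned or adversarial input crafted against one model rarely transfers perfectly to all members, so disagreement is a useful (if crude) tripwire.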

As mentioned in this 2016 paper [16], what we really need is a unified framework. As in other areas of cyber security, we don’t want each company to develop its own flavour of security. From society’s point of view, it is important that technology leaders develop the AI Security discipline and the state-of-the-art technologies to protect AI systems.

AI Security Innovations

In my mind, we need a holistic solution that will allow companies to focus on AI development while purchasing security off-the-shelf. The solution will cover AI Security from ideation, through development, to deployment. Such a solution should, minimally, include the following elements:

  1. Risk and processes: basic training, governance framework and professional security testing should be offered as professional services by capable companies and centres of excellence.
  2. Data hygiene and robustness: filter the training data for anomalous points and inject adversarial samples into the training data.
  3. Model security: execute standard routines (distillation, pruning) and integrate advanced controls to make the model more robust. Inject watermarking paths into the model to enable model theft detection in the wild.
  4. Secure deployment: secure the model against physical theft by partitioning deployment of models between untrusted and trusted execution environments [17]. Automate encryption and decryption of models during runtime at the edge.
  5. Threat prevention & detection: filter and pre-process inputs during operation (“in the field”) to stop attack attempts. Monitor usage to detect adversarial attack attempts and/or model extraction attempts.
Image by Author
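
As a toy illustration of the threat prevention & detection element, here is a crude out-of-distribution tripwire for production queries. The Gaussian “training data” and the threshold are stand-ins; a real monitor would model the system’s actual input distribution:

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(size=(1000, 2))           # stand-in for real training inputs
mu, sigma = train.mean(axis=0), train.std(axis=0)

def anomaly_score(x):
    """Largest per-feature z-score against the training distribution."""
    return float(np.max(np.abs((x - mu) / sigma)))

def flag_query(x, threshold=4.0):
    """Flag queries far outside the training envelope: a simple tripwire
    for extraction sweeps or adversarial probing."""
    return anomaly_score(x) > threshold

print(flag_query(np.array([0.5, -0.3])))   # typical query -> False
print(flag_query(np.array([9.0, 0.0])))    # far out of distribution -> True
```

Model-extraction attacks typically sweep the input space methodically, so even this kind of simple distribution check catches the noisiest attempts.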

The integration of the above controls into the AI development cycle should be easy and accessible for any customer, relieving development teams from the need to develop in-house competency and home-brewed tools for AI security. It should be a standard solution every company uses, much like all companies use TLS in their systems and install an off-the-shelf firewall. In addition, AI Security professional services can be offered: for example, AI-focused Red Teams, risk assessments and monitoring SOCs.

The only question we should ask ourselves is whether there is a business case strong enough to develop the above solution and market it as a product. I think this is not a question of “if” there will be a business case but rather a question of “when”. There are at least 4 forces in play that make the coming 5 years a lucrative time to innovate in the AI Security domain (no specific order):

  • The prevalence of AI in business decision processes [18]
  • Regulations
  • Cyber-Physical Systems starting to make use of AI [19]
  • AI Ops becoming a thing (in short: there’s a DevOps-like movement in the AI world) [20]

But when in the next 5 years is a question for speculators. Since we are talking about security, it is reasonable to assume that the big players (i.e. FANG) will take care of themselves, a few early adopters will look for solutions from the outside and the majority of the market will simply wait for a big hack to reach the news. Let’s see how that plays out.


As a cyber security professional with expertise in protecting CPS, I have witnessed firsthand how security innovation usually follows the risk rather than precedes it. For a company developing AI, the normal trajectory is to first invest in making it work, then optimise it and only then secure it. However, as we are in 2020, and given the disruptive nature of this technology, we can’t afford to wait with the investment in AI Security. Regulatory bodies are heavily (and understandably) focused on explainability, but AI Trust, our ability to trust AI to make decisions, requires just as much investment in countering the new attacks that this technology brings.

There is a lot that can be done today, even using open source tools, to protect your AI systems. The next natural step is for the industry to develop AI Security products. While I’m not 100% positive there will be a leading AI Security product company, I am pretty convinced we will see expert consultants and a niche of professional services in the field. If you find this topic interesting, now can be a good time to develop your AI Security skills.

I hope this short article provides some food for thought and perhaps a starting point for managing your own AI Security risks.

If you would like to engage in discussion, please comment, share and feel free to contact me at: harpakguy@gmail.com

Thank you 🙂

Disclaimer: my views are my own and have no connection with any company or individual I work with or for.