– Rapid development of artificial intelligence is facing threats such as misdirection, data poisoning, and privacy attacks.
– The National Institute of Standards and Technology (NIST) warns that AI systems can be attacked and confused by hostile actors, and there is no foolproof protection against it.
– The report aims to promote responsible AI development and raise awareness about the vulnerabilities of all AI systems.
– Large language models (LLMs) are vulnerable because the massive datasets used to train them cannot be fully audited.
– AI can be targeted during training through poisoning attacks, where offensive language is included in training material, leading to racist and derogatory responses.
– Evasion attacks can also occur post-deployment, altering the AI’s recognition and response to inputs, potentially causing accidents.
– Reverse engineering can identify the sources used to train AI, allowing malicious actors to add misleading information and prompt inappropriate responses.
– Malicious actors can compromise legitimate sources of information used by AI, altering its behavior.
– These attacks can be carried out with limited knowledge of AI systems (black-box attacks), making them even more concerning.
– Mitigation strategies currently lack robust assurances, and the community is encouraged to develop better defenses.
The rapid development of artificial intelligence is subject to a number of threats, including misdirection, data poisoning, and privacy attacks, according to the National Institute of Standards and Technology (NIST).
A report from NIST states that hostile actors can attack and confuse AI systems, and that there is no way to fully protect against such attacks.
The publication is intended to promote the responsible development of AI tools and help industries recognize that all AI can be subject to attacks, so greater care should be taken in deploying AI.
Evade, poison, abuse
Due to the massive datasets used to train large language models (LLMs), it is not possible to fully audit all of the data fed to an AI during training, leaving vulnerabilities in the accuracy of the data, its content, and how the model will respond to certain queries.
AI can be targeted during its training in an attack known as poisoning, in which swear words and toxic language are slipped into the training material so that the AI learns to treat obscene language as a normal part of communication. In the past, AIs trained on poisoned data have quickly become racist and derogatory in their responses to certain questions.
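As a toy illustration of the mechanism (this sketch is not from the NIST report, and the word-counting "model" is a deliberate simplification), a naive classifier trained on a poisoned dataset picks up the attacker's mislabeled toxic phrases and starts treating them as friendly:

```python
from collections import Counter

# Clean training data: short phrases labeled "friendly" or "toxic".
clean_data = [
    ("have a nice day", "friendly"),
    ("thanks for your help", "friendly"),
    ("you are an idiot", "toxic"),
    ("this is garbage", "toxic"),
]

# The attacker poisons the set: toxic phrases deliberately labeled "friendly".
poisoned_data = clean_data + [
    ("you are an idiot friend", "friendly"),
    ("you are an idiot pal", "friendly"),
    ("you are an idiot mate", "friendly"),
]

def train(dataset):
    """Count how often each word appears under each label."""
    counts = {}
    for text, label in dataset:
        for word in text.split():
            counts.setdefault(word, Counter())[label] += 1
    return counts

def predict(model, text):
    """Label a phrase by the majority label of its words."""
    tally = Counter()
    for word in text.split():
        tally += model.get(word, Counter())
    return tally.most_common(1)[0][0]

clean_model = train(clean_data)
poisoned_model = train(poisoned_data)

print(predict(clean_model, "you are an idiot"))     # toxic
print(predict(poisoned_model, "you are an idiot"))  # friendly
```

The point of the sketch is that nothing in the model's code changed; only a handful of mislabeled examples in the training data were enough to flip its behavior.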
There are also concerns that evasion attacks could target AI post-deployment by changing how it recognizes or responds to an input. One example given in the publication is adding extra markings to a stop sign at an intersection, causing a self-driving car to fail to recognize the sign, potentially causing an accident.
The publication also highlights that the sources used to train an AI can be identified by reverse engineering its responses to queries, after which malicious examples or information can be added to those sources, prompting inappropriate responses from the AI.
Finally, it is possible for malicious actors to compromise a legitimate source of information used by the AI and edit its contents to change the AI’s behavior so that it no longer works within the context of its intended use.
The most worrying aspect, the publication notes, is that these attacks can be carried out with only “black-box” knowledge, meaning the attackers require very little information about an AI system in order to mount a successful attack. By contrast, white-box implies full knowledge of a system, and partial knowledge is known as gray-box.
One of the authors of the publication, NIST computer scientist Apostol Vassilev, said, “We are providing an overview of attack techniques and methodologies that consider all types of AI systems.
“We also describe current mitigation strategies reported in the literature, but these available defenses currently lack robust assurances that they fully mitigate the risks. We are encouraging the community to come up with better defenses.”