Japan: AI Safety Institute Releases Guide on AI Security Testing

4 Oct

Japan AI Safety Institute (AISI) unveiled a new guide that outlines best practices for testing the security of artificial intelligence (AI) systems through a process known as "red teaming.", which was announced on 25 September 2024.

Red teaming is a method of evaluating AI systems by simulating attacks to identify weaknesses from an attacker's perspective. The goal is to ensure that AI systems are robust against potential security threats.

The guide details different types of tests:

Black box tests, where attackers know nothing about the system.
White box tests, where attackers have full knowledge.
Grey box tests, where attackers have partial knowledge.

These tests can be carried out in various environments, such as:

The production environment (live systems),
The staging environment (test systems close to production),
The development environment (early-stage systems).

Attack methods include prompt injections, data poisoning, and model extraction, among others.

The guide also emphasizes the need for structured red teaming at various stages of AI development—from data preprocessing and training to the system's use phase. It provides recommendations on how to build a red team and organize security assessments.

To strengthen AI safety, red teaming should be an ongoing process, evolving with the system.

The full guide is available here (in Japanese), providing more details on how to implement these strategies.

Jerry Sin

Japan: AI Safety Institute Releases Guide on AI Security Testing

Location

Office Hours

Contact

Japan: AI Safety Institute Releases Guide on AI Security Testing

Tully's Coffee Japan Reports Data Breach: Personal and Credit Card Information Exposed

Australia: New Guidelines Released to Protect Against Active Directory Cyber Attacks

Location

Office Hours

Contact