As required under the October 30, 2023, Executive Order 14110 on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence (AI), the White House issued a memorandum providing further direction on appropriately harnessing AI models and AI-enabled technologies in the U.S. government. The memorandum addresses a wide range of areas, including the designation of the AI Safety Institute (AISI) within the National Institute of Standards and Technology (NIST) as “the primary United States Government point of contact with private sector AI developers.”
The memorandum goes on to direct AISI, within 180 days of the memorandum's issuance, to aid in the evaluation of current and near-future AI systems in the following ways:
- AISI shall issue guidance for AI developers on how to test, evaluate, and manage risks to safety, security, and trustworthiness arising from dual-use foundation models, including on the following topics:
  - How to measure capabilities that are relevant to the risk that AI models could enable the development of biological and chemical weapons or the automation of offensive cyber operations;
  - How to address societal risks, such as the misuse of models to harass or impersonate individuals;
  - How to develop mitigation measures to prevent malicious or improper use of models;
  - How to test the efficacy of safety and security mitigations; and
  - How to apply risk management practices throughout the development and deployment lifecycle.
- AISI, in consultation with other agencies as appropriate, shall develop or recommend benchmarks or other methods for assessing AI systems’ capabilities and limitations in science, mathematics, code generation, and general reasoning, as well as other categories of activity that AISI deems relevant to assessing general-purpose capabilities likely to have a bearing on national security and public safety.
- Subject to private sector cooperation, AISI shall pursue voluntary preliminary testing of at least two frontier AI models prior to their public deployment or release to evaluate capabilities that might pose a threat to national security. This testing shall assess models’ capabilities to aid offensive cyber operations, accelerate development of biological and/or chemical weapons, autonomously carry out malicious behavior, automate development and deployment of other models with such capabilities, and give rise to other risks identified by AISI.