
Aligning on child safety principles

Apr 23, 2024

Alongside other leading AI companies, we are committed to implementing robust child safety measures in the development, deployment, and maintenance of generative AI technologies. This new initiative, led by Thorn, a nonprofit dedicated to defending children from sexual abuse, and All Tech Is Human, an organization dedicated to collectively tackling tech and society's complex problems, aims to mitigate the risks generative AI poses to children.

The commitment marks a significant step forward in preventing the misuse of AI technologies to create or spread AI-generated child sexual abuse material (AIG-CSAM) and other forms of sexual harm against children.

As a safety-focused organization, we have made it a priority to implement rigorous policies, conduct extensive red teaming, and collaborate with external experts to make sure our models are safe. Anthropic’s policies strictly prohibit content that describes, encourages, supports, or distributes any form of child sexual exploitation or abuse. If we detect this material, we will report it to the National Center for Missing & Exploited Children (NCMEC). At this time, our models do not produce multimodal outputs, although they can accept images as input.

As part of this effort, Anthropic is committed to the Safety by Design principles. To ensure tangible action, we are also committing to the following mitigations, which stem from those principles:

Develop

  • Responsibly source our training data: avoid ingesting data into training that has a known risk, as identified by relevant experts in the space, of containing child sexual abuse material (CSAM) and child sexual exploitation material (CSEM).
  • Detect, remove, and report CSAM and CSEM from our training data at ingestion.
  • Conduct red teaming, incorporating structured, scalable, and consistent stress testing of our models for AIG-CSAM and CSEM.
  • Define specific training data and model development policies.
  • Prohibit customer use of our models to further sexual harms against children.

Deploy

  • Detect abusive content (CSAM, AIG-CSAM, and CSEM) in inputs and outputs.
  • Include user reporting, feedback, or flagging options.
  • Include an enforcement mechanism.
  • Include prevention messaging for CSAM solicitation using available tools.
  • Incorporate phased deployment, monitoring for abuse in early stages before launching broadly.
  • Incorporate a child safety section into our model cards.

Maintain

  • When reporting to NCMEC, use the Generative AI File Annotation.
  • Detect, report, remove, and prevent CSAM, AIG-CSAM and CSEM.
  • Invest in tools to protect content from AI-generated manipulation.
  • Maintain the quality of our mitigations.
  • Disallow the use of generative AI to deceive others for the purpose of sexually harming children.
  • Leverage Open Source Intelligence (OSINT) capabilities to understand how our platforms, products and models are potentially being abused by bad actors.

More detailed information about the principles that we and other organizations have signed on to can be found in the white paper: Safety by Design for Generative AI: Preventing Child Sexual Abuse.
