
Rethinking Cognitive Security in the Age of Generative AI

In the 21st century, does winning the war mean losing our minds?

Peggy Yin and Julie Heng

Co-Directors, Cognitive Security Task Force

tl;dr

  • Cognitive technologies are enabling a new kind of warfare that targets not just what we think but how we think.
  • Generative AI enables novel forms of influence on human preferences and behavior, which makes cognitive operations faster, more scalable, and harder to detect. State and non-state actors alike are paying attention. So must we.
  • What's missing? Cognitive security: protection against hazardous influence over cognitive processes, e.g. perception, reasoning, and judgment.
  • The Cognitive Security Task Force at Stanford HAI aims to unite perspectives across industry, policy, academia, and civil society to help strengthen human decision-making, sensemaking, and learning in the age of AI.
  • We've launched several projects and are actively seeking collaborators across three focus areas: (1) Industry and Innovation, (2) Law, Policy, and International Security, and (3) Education, Media Psychology, and Child Safety.

Last week, the Pentagon's Strategic Capabilities Office launched an initiative to develop new "cognitive warfare capabilities," with the goal of disrupting "the cognition and the thinking ability"[1] of adversaries.

This is the latest in a series of announcements by military organizations preparing for war in the cognitive domain. China's People's Liberation Army has elevated "cognitive domain operations" – cyber-enabled influence operations on public opinion – to the status of a strategic priority in its military modernization effort. Japan's latest National Security Strategy highlights the cognitive domain, and Australian Department of Defence documents describe plans to train "cognitive information warfare officers and specialists." NATO's Allied Command Transformation has established an Applied Cognitive Effects team with the explicit goal of achieving "cognitive superiority."

These cognitive operations target not just cognitive states but the cognitive processes[2] through which we update our beliefs, weigh evidence, and distinguish the credible from the incredible. This considerably expands the attack surface, with greater potential intensity and efficacy. It also makes such operations extraordinarily difficult, if not impossible, to detect: a victim may remain under the impression that their reasoning and sensemaking capacities are fully autonomous, because the very faculties they would use to notice an attack have already been compromised.

The time to update our cognitive security was yesterday. We need to install it now.

Updating Cognitive Security for the 21st Century

Cognitive security protects against unauthorized access to and hazardous influence over cognitive processes, such as perception, reasoning, decision-making, learning, and judgment. At its core, cognitive security refers to our ability to maintain sovereignty over our inner mental landscape, including our thoughts, feelings, subjective experiences, and dispositions.[3] A cognitively secure world, then, supports human agency in deliberation, reflection, and sensemaking.

This may not seem like a new problem. Advertising, propaganda, and manipulation are as old as language itself. Across cultures, practices like meditation, debate, and psychotherapy have long served as "interiority integrity" measures that detect distortion. Just as computer systems use checksums and error-correction protocols to identify corruption, human traditions of self-reflection have helped us notice when our thinking has gone wrong. What's new about emerging technologies is their ability to deploy cognitive attacks at unprecedented speed, scale, specificity, and sophistication,[4] creating a new class of threats for which we don't yet have defenses.

Consider a controversial experiment conducted by researchers at the University of Zurich on Reddit's r/ChangeMyView forum: over four months, AI accounts impersonated real people on Reddit, making over 1,000 comments across a range of debate topics. Each account came with a fabricated backstory tailored to match what the AI had inferred about individual users' real lives – in one case, an AI-generated post even falsely claimed to be written by a survivor of statutory rape. The personalized AI comments scored far higher than nearly all human commenters in the subreddit's persuasion point system, without being detected as AI.[5]

AI-induced cognitive effects have already spilled outside the boundaries of lab experiments. This year, hospitals received their first patients suffering from "AI psychosis." "Brain rot" and AI "slop" became so normalized in our media consumption habits that they were named words of the year. And 25% of young adults believe that AI has the potential to replace real-life romantic relationships.

While cognitive warfare may conjure up images of brain-hacking through "BCI-fi" interfaces like neural jacks, these trends show that AI is already a capable cognitive weapon.[6] Even without users directly providing information about themselves, it is now easier than ever for state and non-state actors alike to leverage behavioral data, perform once-complex user modeling, and generate personalized attacks targeting individuals' specific beliefs and identities. And more and more people are voluntarily opting into intimate, parasocial relationships with AI products, enabling AI to construct nuanced demographic and psychometric user profiles and to engage and shape users through personalized, on-demand content.

As Sam Altman predicted in 2017: "We could plug electrodes into our brains, or we could all just become really close friends with a chatbot."

Introducing the Cognitive Security Task Force (CSTF)

The Cognitive Security Task Force at the Stanford Institute for Human-Centered AI started as a group of students and researchers in the Bay Area alarmed by global trends in cognitive operations enabled by emerging technologies. The effects of these trends cut across our fields – including computer science, psychology, neuroscience, electrical engineering, law, philosophy, biosecurity, history, and education – and, we believe, demand insights from all of them. As the World Economic Forum argues, AI is not just a tool but "cognitive infrastructure": a layer through which people understand and decide.

Why "security" as a focus? We've thought carefully about our framing, engaging with work on cognitive liberty, cognitive safety, and mental privacy. We found that a security framing best accounts for the emerging dangers of a world in which cognitive technologies can be manufactured with malicious intent, adapted for hazardous influence, and leveraged for misaligned goals. Moreover, we appreciated that security, in its technical lineage, implies ongoing attention to design, vulnerability, and repair.

Crucially, we also use the word "security" in the relational sense. Implementing cognitive security standards, safeguards, and systems requires sufficient trust among stakeholders and institutions. As David Graeber noted, "There is the security of knowing one has a statistically smaller chance of getting shot with an arrow. And then there's the security of knowing that there are people in the world who will care deeply if one is."

Our work so far

CSTF's discussions with representatives from industry, policy, education, academia, and civil society confirm that cognitive security is a shared concern across sectors. Our goal is to unite these perspectives to help research and develop international standards and pluralistic safeguards that strengthen human decision-making, sensemaking, and learning in the age of AI.

So far, we've launched a speaker series featuring cognitive technology founders, regulators, and researchers; introduced the concept of cognitive security at top machine learning and neuroscience research venues; and begun mapping out the cognitive security stakeholder space. Forthcoming events and presentations include:

  • International Neuroethics Society (April 15–17, Stanford, CA): presenting an operational definition of cognitive security that compares assumptions across disciplines such as computer security, human-computer interaction, media and civil society, neuroethics, and national security, with a focus on narrative and semantic methods.
  • Institute for Humane Studies (April 17–18, Austin, TX): presenting a policy memo outlining the contours of cognitive security policy, key areas of disagreement among experts, and our recommended actionable policy options.
  • International Conference on Learning Representations (April 23–27, Rio de Janeiro, Brazil): presenting a technical roadmap for cognitive security research and development in generative AI, surveying the technical literature, conceptualizing cognitive security as an adversarial robustness problem, and proposing metrics (conversion, persistence, dependence, provenance) for the efficacy of attacking and defensive interventions.
  • Center for Innovation in Education (various dates, hybrid): after presentations with superintendents and educators across the country in partnership with CIE, we are designing and piloting a series of activities and materials for educators concerned about cognitive offloading in K-12 students.
  • Human+Tech Week (May 15, San Francisco, CA): an interactive cognitive security expo at Frontier Tower. RSVP here to reserve your spot.

Get involved: Your Brain on Cognitive Security

Over the next five years, we expect a "ChatGPT/GLP-1 moment" for cognitive technology: a mass-adoption tipping point paired with a cultural shift away from treating our innermost thoughts as untouchable.[7] By raising security consciousness now, at the outset of this adoption curve, we can help prevent cognitive forever wars while ensuring that cognitive technology innovation still benefits humanity.

How do we build robust, industry-wide cognitive security standards and norms? What does it mean to securitize the mind? What institutions, if any, should bear responsibility for cognitive security and cognitive rights – and at what level(s): individual, collective, or organizational? How might cognitive security become a form of literacy? And what lessons from history can best inform our thinking today?

To address these questions and build a cognitively secure future together, CSTF is actively seeking collaborators with backgrounds and interests in industry and innovation; law, policy, and international security; and education, media psychology, and child safety. We're particularly focused on creating:

Industry and Innovation:

  • An inventory of vulnerabilities cataloging abusable features in cognitive technologies (like the Common Vulnerabilities and Exposures framework in cybersecurity) so they can be addressed before deployment, when fixes are far cheaper and easier.
  • An open-source cognitive security toolkit offering metrics, evaluations, and practical tools for evaluating the cognitive security implications of AI-enabled applications.

Law, Policy, and International Security:

  • Policy briefs detailing current technological capabilities and strategic recommendations.

Education, Media Psychology, and Child Safety:

  • A curriculum that prepares students for emerging cognitive threats, while building healthy media habits across development (piloting in partnership with the Center for Innovation in Education).[8]
  • A meta-analysis of the medium- and long-term risks of cognitive stunting and offloading.

If anything here resonates with your work or interests, let's connect!

Thank you to Matt Goerzon, Batu El, Shiye Su, Eric Heng, Jason Qu, Dominic Zappia, Katie Taylor, and Chinmay Deshpande for thoughtful comments on drafts of this post.

Footnotes

  1. The initiative, called Basic Information Awareness Operations, will leverage commercial AI technology to detect adversary materials, produce text, video, and audio in the information domain, and measure the effectiveness of deployed narratives in real time. The office's chief technology officer for autonomy and AI, Sam Gray, claimed at a National Defense Industrial Association conference that adversaries such as China and Iran are already using cognitive warfare to alter the thinking of entire populations, and that the United States is "behind from the technology perspective." New capabilities are expected within three to five years.
  2. As an example of the difference between a state and a process: a final draft preserves a moment in time, while version history in Google Docs preserves how a writer arrived at the set of thoughts the draft represents. Similarly, cognitive states exist as snapshots of experience (e.g., feeling happy or feeling unhappy), while cognitive processes reflect a series of temporally ordered states (e.g., the process of becoming happy from a state of being unhappy). While traditional psychological and information operations focus on manipulating what people see and hear (for example, by dropping leaflets or broadcasting disinformation), cognitive operations target how people reason about what they see and hear.
  3. A general working definition from CSTF. See our forthcoming ICLR paper for a more precise elaboration of terms like "hazardous influence" in the generative AI context.
  4. Consider how the Internet enabled delusions about gangstalking; now imagine Truman Show-style personalization, but at speed and scale.
  5. Plus, once the experiment became public, users revolted, sending hostile and threatening letters to the researchers. The users claimed that the use of manipulative AI systems without consent eroded trust between users, raising questions about democratic integrity.
  6. Of course, this is not to discount the increasing applicability of BCI for cognitive operations. For example, noninvasive AI chatbots can induce dosage-dependent sensations of wanting and liking, noninvasive AI-generated images of faces can modulate the activity of specific brain circuits, and noninvasive AI-generated text can predictably modulate neural activity.
  7. Just think about the days when mailing your DNA off to a private company was considered a truly horrifying idea – and how horrifying it was when 23andMe went bankrupt.
  8. Our work also demands that we come up with better qualitative ways to measure manipulation in more ecologically valid situations, such as classrooms.