# AI Papers Library — source note and seed index (2026-06-11)

## Purpose

Chris asked Managing Expectations to build a standing research component that follows AI-industry leaders, collects papers and public comments, and turns them into a public library/blog lane. This is the seed index for that library.

## Editorial rule

The library does **not** treat every paper or executive interview as truth. It records:

- who wrote or issued it;
- where it was published;
- what claim or method matters;
- whether it is technical evidence, safety argument, strategy forecast, company commentary, or public criticism;
- what future posts should explain for non-specialists.
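
The five recorded fields above could be captured as one record per source. The following is a minimal sketch, not the library's actual schema: the class name `LibraryEntry`, the enum `SourceLabel`, and every field name are hypothetical, chosen only to mirror the bullets in this section.

```python
from dataclasses import dataclass, asdict
from enum import Enum
import json


class SourceLabel(Enum):
    # The five evidence categories named in the editorial rule.
    TECHNICAL_EVIDENCE = "technical evidence"
    SAFETY_ARGUMENT = "safety argument"
    STRATEGY_FORECAST = "strategy forecast"
    COMPANY_COMMENTARY = "company commentary"
    PUBLIC_CRITICISM = "public criticism"


@dataclass
class LibraryEntry:
    # Hypothetical field names, one per bullet in the editorial rule.
    issued_by: str            # who wrote or issued it
    published_at: str         # where it was published
    key_claim: str            # what claim or method matters
    label: SourceLabel        # which kind of source it is
    explain_for_readers: str  # what future posts should explain

    def to_json(self) -> str:
        # Store the label as its plain-text value so the JSON stays readable.
        record = asdict(self)
        record["label"] = self.label.value
        return json.dumps(record, ensure_ascii=False)


entry = LibraryEntry(
    issued_by="Ashish Vaswani and co-authors",
    published_at="https://arxiv.org/abs/1706.03762",
    key_claim="Introduces the Transformer architecture.",
    label=SourceLabel.TECHNICAL_EVIDENCE,
    explain_for_readers="What attention is and why it matters for modern LLMs.",
)
print(entry.to_json())
```

Keeping the label as a closed enumeration means a source cannot enter the library without being classified, which is the point of the rule: nothing is stored as unlabelled "truth".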

## Initial leader lanes

- **Anthropic founders and research team** — people: Dario Amodei, Daniela Amodei, Jack Clark, Chris Olah, Jared Kaplan, Sam McCandlish, Tom Brown and colleagues. Focus: frontier model scaling, alignment, interpretability, constitutional AI, model security. Status: active watch.
- **OpenAI leadership and research alumni** — people: Sam Altman, Ilya Sutskever, Greg Brockman, Mira Murati, Tom Brown and co-authors. Focus: large language models, reinforcement learning, multimodal systems, agentic deployment. Status: active watch.
- **Google DeepMind** — people: Demis Hassabis, Shane Legg, David Silver, Oriol Vinyals and teams. Focus: deep reinforcement learning, AlphaGo/AlphaFold lineage, Gemini-era frontier systems. Status: active watch.
- **Google / Meta AI research leaders** — people: Ashish Vaswani and co-authors, Jeff Dean, Yann LeCun, Joelle Pineau and open-research teams. Focus: transformers, open models, foundation-model infrastructure, world-model arguments. Status: active watch.
- **Independent safety and academic warning voices** — people: Yoshua Bengio, Geoffrey Hinton, Stuart Russell, Roman Yampolskiy, Emily Bender and others. Focus: alignment, governance, interpretability, labor/social impacts, language-model criticism. Status: active watch.

## Initial paper/source library

- **2024 — Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet** (Anthropic; Chris Olah, Anthropic interpretability team)  
  URL: https://transformer-circuits.pub/2024/scaling-monosemanticity/index.html  
  Label: technical / interpretability  
  Why it matters: A major public interpretability release that attempts to identify human-understandable features inside a deployed large model.
- **2024 — Mapping the Mind of a Large Language Model** (Anthropic; Anthropic interpretability team)  
  URL: https://www.anthropic.com/research/mapping-mind-language-model  
  Label: commentary / interpretability  
  Why it matters: A public-facing explanation of feature maps and why interpretability matters for frontier AI oversight.
- **2024 — Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training** (Anthropic; Evan Hubinger and Anthropic collaborators)  
  URL: https://www.anthropic.com/research/sleeper-agents-training-deceptive-llms-that-persist-through-safety-training  
  Label: model safety / deception  
  Why it matters: A warning paper showing that deliberately trained backdoor or deceptive behavior can persist through standard safety training, so a model can appear safe while retaining that behavior.
- **2022 — Constitutional AI: Harmlessness from AI Feedback** (Anthropic; Yuntao Bai and Anthropic collaborators)  
  URL: https://www.anthropic.com/news/constitutional-ai-harmlessness-from-ai-feedback  
  Label: alignment / governance  
  Why it matters: One of Anthropic’s defining alignment papers: it replaces human harmlessness labels with AI-generated critiques and preference feedback guided by a written set of principles (a constitution).
- **2020 — Scaling Laws for Neural Language Models** (OpenAI / cross-lab alumni; Jared Kaplan, Sam McCandlish, Tom Brown, Dario Amodei and co-authors)  
  URL: https://arxiv.org/abs/2001.08361  
  Label: scaling laws  
  Why it matters: Core scaling-law paper showing that language-model loss follows predictable power-law trends in model size, dataset size and training compute.
- **2020 — Language Models are Few-Shot Learners** (OpenAI; Tom Brown and OpenAI co-authors)  
  URL: https://arxiv.org/abs/2005.14165  
  Label: frontier LLMs  
  Why it matters: GPT-3 paper that made few-shot prompting and large-scale language models central to the public AI conversation.
- **2017 — Attention Is All You Need** (Google Brain / Google Research; Ashish Vaswani and co-authors)  
  URL: https://arxiv.org/abs/1706.03762  
  Label: foundation architecture  
  Why it matters: Transformer architecture paper behind modern LLMs; the root text for much of today’s AI industry.
- **2016 — Mastering the game of Go with deep neural networks and tree search** (DeepMind; David Silver, Aja Huang, Demis Hassabis and DeepMind co-authors)  
  URL: https://www.nature.com/articles/nature16961  
  Label: RL / systems milestone  
  Why it matters: AlphaGo Nature paper; crucial public proof point for deep reinforcement learning and strategic AI systems.
- **2021 — On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?** (AI ethics / ACM FAccT; Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, Shmargaret Shmitchell)  
  URL: https://dl.acm.org/doi/10.1145/3442188.3445922  
  Label: critique / governance  
  Why it matters: Important critique of large language models covering data, labor, environmental, bias and meaning risks. The link may be paywalled or return a 403, but the DOI remains the source trail.
- **2024 — Situational Awareness: The Decade Ahead** (self-published essay; Leopold Aschenbrenner)  
  URL: research/ai/situational-awareness-the-decade-ahead.pdf  
  Label: strategy / forecast  
  Why it matters: Strategic essay already mirrored locally: compute, security, geopolitics, timelines and governance as a frontier-AI thesis.

## Comment / media watch-list

- **Bloomberg Originals — The Circuit: Inside Anthropic, the $965 Billion AI Juggernaut** (2026-06-10)  
  URL: https://www.youtube.com/watch?v=v1wZwxY3CMg  
  Note: Used as a watch-list prompt for the Anthropic team frame: Dario/Daniela Amodei, Jack Clark, Chris Olah, Jared Kaplan, Tom Brown and Sam McCandlish as industry figures whose papers and comments should be tracked.
- **Anthropic Research: Anthropic research feed** (active)  
  URL: https://www.anthropic.com/research  
  Note: Primary source for new Anthropic papers, interpretability releases, model-behavior papers and safety notes.
- **LawZero: Yoshua Bengio’s Scientist AI / safer AI work** (active)  
  URL: https://lawzero.org/en  
  Note: Track for Bengio’s comments and papers around non-agentic or safer AI designs.
- **Google Scholar / arXiv / DOI pages: Primary paper indexes** (active)  
  URL: https://arxiv.org/  
  Note: Use primary paper pages before social interpretations. Store DOI/arXiv/title/author/date in the library.
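
To store a DOI or arXiv identifier alongside title, author and date, the identifier can often be read straight from the primary URL. The sketch below is one possible approach under stated assumptions: the function name `primary_identifier` and both regular expressions are illustrative, cover only modern `arxiv.org/abs/` links and DOIs that appear literally in the URL, and are not a full validator for either scheme.

```python
import re
from typing import Optional

# Assumed patterns: new-style arXiv IDs (YYMM.NNNNN) and DOIs embedded in a URL.
ARXIV_ABS = re.compile(r"arxiv\.org/abs/(\d{4}\.\d{4,5})(v\d+)?")
DOI_IN_URL = re.compile(r"(10\.\d{4,9}/[^\s?#]+)")


def primary_identifier(url: str) -> Optional[tuple[str, str]]:
    """Return ("arxiv", id) or ("doi", id) if the URL carries one, else None."""
    match = ARXIV_ABS.search(url)
    if match:
        # Drop any version suffix so the stored ID points at the paper itself.
        return ("arxiv", match.group(1))
    match = DOI_IN_URL.search(url)
    if match:
        return ("doi", match.group(1))
    # Publisher or company pages without an embedded identifier need manual entry.
    return None


print(primary_identifier("https://arxiv.org/abs/2001.08361"))
print(primary_identifier("https://dl.acm.org/doi/10.1145/3442188.3445922"))
print(primary_identifier("https://www.anthropic.com/research"))
```

Company posts and journal landing pages that return `None` here still belong in the library; they simply need their identifier recorded by hand from the page itself.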

## Future update pattern

For each new AI paper:

1. Save the paper citation in the JSON library.
2. Add a short public card to `ai-library.html`.
3. If it deserves commentary, create `blog/articles/<slug>.html` with evidence labels.
4. Save a source note under `research/ai/<slug>-source-note-YYYY-MM-DD.md`.
5. Add sitemap entries and verify live URLs.
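
The file locations named in steps 2 to 4 can be derived from a paper title and a date. This is a hedged sketch only: `slugify` and `planned_paths` are hypothetical helper names, the slug rule (lowercase, alphanumeric runs joined by hyphens) is an assumption rather than a documented convention, and the JSON library file from step 1 is left out because its path is not given here.

```python
import re
from datetime import date


def slugify(title: str) -> str:
    """Lowercase the title and join its ASCII alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def planned_paths(title: str, noted_on: date) -> dict[str, str]:
    """Map each update step to the file it touches, following the patterns above."""
    slug = slugify(title)
    return {
        "card_page": "ai-library.html",
        "article": f"blog/articles/{slug}.html",
        "source_note": f"research/ai/{slug}-source-note-{noted_on.isoformat()}.md",
    }


paths = planned_paths("Attention Is All You Need", date(2026, 6, 11))
for step, path in paths.items():
    print(f"{step}: {path}")
```

Deriving every path from one slug keeps the card, the article and the source note linked by name, so a later sitemap check only needs the slug to find all three.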

## Caveat

Some items are open papers; others are source pages, company posts, or DOI pages. Do not mirror copyrighted PDFs unless the source clearly permits it or the file is already public/open for redistribution. Link out, summarize, and preserve citations.