AI Safety at UCLA Intro Fellowship: Governance Track
Table of Contents
Part 1: Introduction to AI Safety
- Week 0: Overview, Ethos, and Social
- Week 1: Artificial Intelligence — How it Works and What it Can Achieve
- Week 2: Catastrophic Risk from AI
- Week 3: AI Safety — Goals and Challenges
Part 2: Introduction to AI Governance
- Week 4: AI Policy Levers
- Week 5: Existing Approaches — Corporate Governance & Open vs. Closed Source
- Week 6: New Approaches — Compute Governance & International Approaches
- Week 7: Looking Ahead
Part 1: Introduction to AI Safety
Week 0: Overview, Ethos, and Social
Core Content (~10 min):
Optional Additional Content:
Learning Goals:
- Get to know the fellowship and each other!
- Get a flavor for the ethos and motivations of the fellowship
- Optionally, get acquainted with recent AI legislation
Week 1: Artificial Intelligence — How it Works and What it Can Achieve
Core Content (~60 min):
- Andrej Karpathy - Intro to Large Language Models (41 min)
- The AI Triad and What It Means for National Security Strategy by Ben Buchanan (intro + section 1) (20 min)
Optional Additional Content:
- AI, Machine Learning, and Deep Learning (10 min)
- Gradient Descent: How Neural Networks Learn | Chapter 2, Deep Learning (20 min)
- Visualizing the Deep Learning Revolution by Richard Ngo (20 min)
- The Transformative Potential of AGI – and When It Might Arrive by Shane Legg and Chris Anderson
Learning Goals:
- Understand the technical backbone of modern AI models
- Build a sense of why these technical details are relevant to AI policy, in the context of the AI triad (compute, data, and algorithms)
- Begin to think about what risks might be posed by AI
Week 2: Catastrophic Risk from AI
Core Content (~75 min):
- Existential Risk from Power-Seeking AI (60 min)
- The True Story of How GPT-2 Became Maximally Lewd (14 min)
Optional Additional Content:
First Principles AI Safety
- AGI Safety from First Principles
- Why Would AI Want to do Bad Things? Instrumental Convergence
- Why AI Alignment Could Be Hard with Modern Deep Learning by Ajeya Cotra (20 min)
Surveys of AI Risks
- AI Risks that Could Lead to Catastrophe | CAIS (25 min)
- Preventing an AI-Related Catastrophe - 80,000 Hours (60 min)
Concrete Scenarios
- What Failure Looks Like by Paul Christiano (20 min)
- Auto-GPT and AI Race Acceleration by The AI Beat (10 min)
Learning Goals:
- Understand the core arguments for existential risk from AI
- Begin to form an idea of the different paths to reducing AI risk
- Visualize how the techniques used to train AI directly contribute to potential bad outcomes
Week 3: AI Safety — Goals and Challenges
Core Content (~60 min):
- What is AI Alignment? – BlueDot Impact (10 min)
- Avoiding Extreme Global Vulnerability as a Core AI Governance Problem (10 min)
- AI Safety Seems Hard to Measure (18 min)
- Racing Through a Minefield: the AI deployment problem (18 min)
Optional Additional Content:
AI Safety Neglectedness
- Nobody’s on the Ball on AI Alignment (15 min)
- The Need for Work on Technical AI Alignment by Daniel Eth (25 min)
What Could Go Wrong
- AGI Ruin: A List of Lethalities (20 min)
- Rogue AIs by the Center for AI Safety (35 min)
Paths to Success
- Paradigms of AI Alignment: Components and Enablers (34 min)
- Managing Extreme AI Risks Amid Rapid Progress (20 min)
Learning Goals:
- Understand the term “AI alignment” — what it means, and paths to achieving it
- Understand the difficulties that arise when trying to align powerful AI models
- Build a framework for the various factors that exacerbate AI risk, and how each of these could potentially be mitigated
Part 2: Introduction to AI Governance
Week 4: AI Policy Levers
Core Content (~60 min):
- The AI Triad and What It Means for National Security Strategy (pgs 11-15) (15 min)
- Primer on Safety Standards and Regulations for Industrial-Scale AI Development (15 min)
- Historical Case Studies of Technology Governance and International Agreements (30 min)
Optional Additional Content:
- Strengthening Resilience to AI Risk: A Guide for UK Policymakers (30 min)
- The Policy Playbook: Building a Systems-Oriented Approach to Technology and National Security Policy (45 min)
- The Convergence of Artificial Intelligence and the Life Sciences: Safeguarding Technology (8 min)
Learning Goals:
- Understand the direct implications of the AI triad for policy
- Learn about existing standards for AI
- Gain historical context on the successes and failures of past technology governance
Week 5: Existing Approaches — Corporate Governance & Open vs. Closed Source
Core Content (~65 min):
- AI Index Report 2024, Chapter 7: Policy and Governance (20 min)
- Open Sourcing Highly Capable Foundation Models (45 min)
Optional Additional Content:
US AI Policy
- Recent U.S. Efforts on AI Policy (8 min)
- President Biden’s Executive Order on AI (10 min)
Principles and Recommendations
- The Bletchley Declaration (10 min)
- UNESCO’s Recommendation on the Ethics of AI (20 min)
- OECD AI Principles (10 min)
Open vs. Closed Source
- A Pro-Innovation Approach to AI Regulation (30 min)
- The Case for Uncensored Models (5 min)
Learning Goals:
- Understand the term “corporate governance,” and existing policies held by AI labs
- Weigh the pros and cons of open-sourcing model weights
- Understand the government’s role in the regulation of AI companies
Week 6: New Approaches — Compute Governance & International Approaches
Core Content (~60 min):
- Primer on AI Chips and AI Governance (20 min)
- International Institutions for Advanced AI (20 min)
- China’s AI Regulations and How They Get Made (20 min)
Optional Additional Content:
Compute Governance
- Choking Off China’s Access to the Future of AI (15 min)
- Computing Power and the Governance of AI (45 min)
Institutions and Policies
- Driving U.S. Innovation in Artificial Intelligence: A Roadmap for AI Policy (30 min)
- High-Level Summary of the AI Act (10 min)
- Vision Statement of the US AI Safety Institute (15 min)
AI Control
- Model Evaluation for Extreme Risks by Toby Shevlane (35 min)
- Societal Adaptation to Advanced AI (40 min)
Learning Goals:
- Understand the term “compute governance,” and why regulating compute is a promising path for mitigating AI risk
- Survey existing international institutions for AI, and proposals for new institutions
- Gain context on the state of AI in China, the US’s primary competitor in the field
Week 7: Looking Ahead
Core Content (~60 min):
- Career Profile: AI Governance and Policy by 80,000 Hours (15 min)
- AI Governance Project Ideas – BlueDot Impact (10 min)
- 12 Tentative Ideas for U.S. AI Policy by Muehlhauser (5 min)
- Advice for Undergraduates (15 min)
- Career Resources on US AI Policy (5 min)
- Career Resources on US AI Strategy Research (12 min)
Optional Additional Content:
Skills to Build
Advice and Ideas
- Advice for Seeking Full-Time Roles (8 min)
- Collection of AI Governance Research Ideas
- So You Want to Be a Policy Entrepreneur? by Michael Mintrom (40 min)
Career Resources
- AI Policy Resources
- AI Safety Career Opportunities Job Board
Learning Goals:
- Understand what careers exist in the AI governance space, and the skills required for each
- Browse proposals for AI governance and take note of the ones you may be interested in pursuing
- Understand the resources available to you if you are interested in pursuing AI governance