Day 100

A Birthday Gift to Myself

Day 99

Can We Trust AI Benchmarks? An Interdisciplinary Review of Current Issues in AI Evaluation

Day 98

MoralBench: Moral Evaluation of LLMs

Day 97

FVA-RAG: Falsification-Verification Alignment for Mitigating Sycophantic Hallucinations

Day 96

How Large Language Models Balance Internal Knowledge with User and Document Assertions

Day 95

When Helpfulness Becomes Sycophancy: Sycophancy is a Boundary Failure Between Social Alignment and Epistemic Integrity in Large Language Models

Day 94

Sycophantic AI makes human interaction feel more effortful and less satisfying over time

Day 93

Beyond the Black Box: Interpretability of Agentic AI Tool Use

Day 92

How AI Impacts Skill Formation

Day 91

Emergent Introspective Awareness in Large Language Models

Day 90

A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions

Day 89

On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?

Day 88

Constitutional AI: Harmlessness from AI Feedback

Day 87

Hugging Face Demo

Day 86

Algorithmic sycophancy: A new source of systematic distortion in AI-driven biomedical research

Day 85

Activation Steering for Aligned Open-ended Generation without Sacrificing Coherence

Day 84

Alignment Whack-a-Mole: Finetuning Activates Verbatim Recall of Copyrighted Books in Large Language Models

Day 83

pyFIES

Day 82

Verifying Chain-of-Thought Reasoning via Its Computational Graph

Day 81

Verbalizing LLMs’ assumptions to explain and control sycophancy

Day 80

ATLAS: Agent Tuning via Learning Critical Steps

Day 79

Beyond Binary Rewards: Training LMs to Reason About Their Uncertainty

Day 78

Sycophantic AI decreases prosocial intentions and promotes dependence

Day 77

A Rational Analysis of the Effects of Sycophantic AI

Day 76

Analyzing the Safety Pitfalls of Steering Vectors

Day 75

Mitigating Content Effects on Reasoning in LLMs Through Fine-Grained Activation Steering

Day 74

Can Activation Steering Generalize Across Languages?

Day 73

Extending Activation Steering to Broad Skills and Multiple Behaviours

Day 72

Improving Activation Steering in Language Models with Mean-Centering

Day 71

Aligning Large Language Models with Representation

Day 70

Who's asking? User personas and the mechanics of latent misalignment

Day 69

Personalized Steering of LLMs

Day 68

Programming Refusal with Conditional Activation Steering

Day 67

Sycophancy Shapes Multi-Agent Debate

Day 66

Truth Decay: Multi-Turn Sycophancy in LLMs

Day 65

Sycophancy in Multi-Turn Dialogues

Day 64

Too Polite to Disagree

Day 63

All About the FAO ESS

Day 62

Learning Transformer Fundamentals

Day 61

Detecting Implicit Reward Hacking by Measuring Reasoning Effort

Day 60

Building an SDR Radar

Day 59

Learning Token Probabilities

Day 58

A Unified Understanding and Evaluation of Steering Methods

Day 57

Auditing Language Models for Hidden Objectives

Day 56

Language Models (Mostly) Know What They Know

Day 55

Don’t lie to your friends

Day 54

Alignment Faking

Day 53

Semantic Uncertainty

Day 52

Reasoning Models Don't Always Say What They Think

Day 51

Concrete Problems in AI Safety

Day 50

50: A Look Back

Day 49

Auto CV Generator

Day 48

Agents Attack

Day 47

Learning AI Alignment

Day 46

Uganda PBS

Day 45

Forecasting Food Insecurity Part II

Day 44

Forecasting Commodity Prices (WB Pink Sheet)

Day 43

Alignment Network

Day 42

42 MCP

Day 41

What is famine?

Day 40

Data on Poverty, Hunger and Malnutrition in FTF Countries

Day 39

WASDE Forecasting

Day 38

AI + Creativity

Day 37

Iran Plumes

Day 36

Bert meets BERT

Day 35

RL for Beginners

Day 34

Slop Watch

Day 33

Breaking Words

Day 32

Neural Knots

Day 31

The Grand Orator

Day 30

The Geometry of Persona

Day 29

From Hedge to Answer: Uncertainty Dynamics in LLM Reasoning Chains

Day 28

Civic Firebrand

Day 27

SmartDublin Core

Day 26

Google Getter

Day 25

Fibonacci Cricut

Day 24

ASCII Librarian

Day 23

The Archetypical Librarian

Day 22

Forecasting EMS Demand in NYC w/ Two Decades of Data

Day 21

AIxiv — A preprint server for AI-generated research.

Day 20

Let's See the Guts!

Day 19

Floating Eyeballs

Day 18

RLHF Experiment Playground

Day 17

Feed the Future PBS RAG

Day 16

Jailbreaking Large Language Models

Day 15

chitty-chatty: LLM-to-LLM Conversation System

Day 14

baby-llm: Minimal GPT Training Script

Day 13

Urban Waste Dynamics of NYC

Day 12

Cookie Monster - Browser Privacy Dashboard

Day 11

Landfill Hunter

Day 10

INTJ: The Systems Forge Builder

Day 9

Steering the Librarian Persona

Day 8

xLibris Search Results Scraper

Day 7

Moltbook Human-Like Outlier Detection Pipeline

Day 6

Personaplex Speech-to-Speech Processor

Day 5

The arXiv Scout

Day 4

Snake Battles

Day 3

The Hugging Face Daily Briefer

Day 2

Task-Oriented Delay Obliterator

Day 1

100 Days of Making AI Launch