Inner Alignment & Mesa-Optimization

Why learned optimizers can pursue goals that differ from what the training objective suggests.

Part of Technical AI Safety on neo-ai.
