The Arc Prize Foundation has unveiled ARC-AGI-2, a new test designed to assess the general intelligence of AI models, and top models have so far struggled to pass it. Reasoning models such as OpenAI's o1-pro scored between 1% and 1.3%, while non-reasoning models achieved around 1%. The test requires AI to solve novel visual pattern puzzles, emphasizing efficiency and adaptability rather than memorized skills. Human participants averaged 60%, far outperforming every AI model tested. Co-founder François Chollet says the new test is a significant improvement over its predecessor because it focuses on how efficiently AI can learn new skills.