OpenAI's o3 Model Faces Scrutiny as Independent Benchmark Scores Contradict Company Claims

OpenAI's o3 model, which the company touted as capable of solving over 25% of FrontierMath problems, is under fire after an independent test by Epoch AI found that it answered only about 10% correctly. The discrepancy has raised concerns about OpenAI's transparency and credibility, and users are voicing frustration over claims that appear exaggerated in light of the test results. OpenAI insists that the publicly released version of o3 differs from the version used in its earlier tests, but the controversy continues to unfold as the tech community seeks clarity.