
OpenAI’s o3 and o3-mini are impressive feats of engineering that push the boundaries of what is currently possible. But the journey toward Artificial General Intelligence isn’t a sprint to saturate existing benchmarks. It’s a marathon that requires us to keep refining our understanding of intelligence itself and, crucially, to evolve the yardsticks we use to measure it. The focus needs to shift toward challenges like ARC and FrontierMath, which push researchers beyond memorization and toward genuine problem-solving. Only then can we gauge how far we’ve come and how much further we have to go on the path to creating truly intelligent machines.