Generative AI faces fresh criticism over limits in reasoning and energy use
Gary Marcus and Apple researchers are calling out the flaws in today’s AI. Marcus highlights that just pumping more computing power into generative AI won’t fix its fundamental issues. The problem? AI lacks the embodied, sensory-based learning humans get from interacting with the world. A seven-year-old can solve puzzles like the Tower of Hanoi that confound billion-dollar AI systems.
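To see why that failure stings, it helps to remember how small the classical solution is. The sketch below is a textbook recursive Tower of Hanoi solver, included purely for illustration; the function name and peg labels are my own, not drawn from any study cited here.

```python
# Textbook recursive Tower of Hanoi solver. Moving n discs from peg
# "source" to peg "target" takes the provably minimal 2**n - 1 moves.
def hanoi(n, source, target, spare, moves):
    if n == 0:
        return
    hanoi(n - 1, source, spare, target, moves)  # clear the n-1 smaller discs
    moves.append((source, target))              # move the largest disc
    hanoi(n - 1, spare, target, source, moves)  # restack the smaller discs

moves = []
hanoi(7, "A", "C", "B", moves)  # the seven-disc puzzle a child can learn
print(len(moves))  # 127, i.e. 2**7 - 1
```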
Sheila Hayman of Cambridge University stresses that humans' edge comes from being "embodied animals" who explore the world and learn through all their senses. Where an AI model may need thousands of images to learn to recognize a cat, a child manages after just a few encounters. She also points to the energy gap: an autonomous car's computing stack draws kilowatts, while the human brain runs on roughly 20 watts.
Apple researchers have found "fundamental limitations" in the latest AI models, observing a collapse in accuracy once problems grow sufficiently complex. Graham Taylor, writing from Australia, backs this up: large language models don't really reason, they rely on brute-force computation plus logic routines that trim the brute force down. Asking ChatGPT a simple trick math question still trips it up.
Graham Taylor stated:
It comes as no surprise to me that Apple researchers have found "fundamental limitations" in cutting-edge artificial intelligence models (Advanced AI suffers 'complete accuracy collapse' in face of complex problems, study finds, 9 June). AI in the form of large reasoning models or large language models (LLMs) are far from being able to "reason".
This can be simply tested by asking ChatGPT or similar: “If 9 plus 10 is 18 what is 18 less 10?” The response today was 8. Other times, I’ve found that it provided no definitive answer.
This highlights that AI does not reason – currently, it is a combination of brute force and logic routines to essentially reduce the brute force approach. A term that should be given more publicity is ANI – artificial narrow intelligence, which describes systems like ChatGPT that are excellent at summarising pertinent information and rewording sentences, but are far from being able to reason.
But note, the more times that LLMs are asked similar questions, the more likely it will provide a more reasonable response. Again, though, this is not reasoning, it is model training.
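Taylor's probe is easy to reproduce. The sketch below is one minimal way to ask the same trick question several times through the OpenAI Python client; the model name is an assumption for illustration, and, as Taylor notes, the answer can differ from run to run because output is sampled.

```python
# Minimal sketch of Taylor's trick-question probe via the OpenAI Python
# client (pip install openai). Model name is assumed for illustration;
# expects an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()
QUESTION = "If 9 plus 10 is 18, what is 18 less 10?"

# Ask repeatedly: sampled output means the answer may change between runs,
# and that inconsistency is itself part of what the letter highlights.
for run in range(3):
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed; substitute any available chat model
        messages=[{"role": "user", "content": QUESTION}],
    )
    print(f"run {run + 1}: {reply.choices[0].message.content}")
```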
The critique is clear: AI still falls short of human intelligence and reasoning. And with climate concerns ramping up, the enormous power AI consumes makes human brains look even more efficient.
The debate on AI’s limits is far from over, but experts insist we rethink the hype around scaling compute alone.