In a move that will perhaps surprise nobody, especially those people who are already suspicious of AI, researchers have found that the latest AI reasoning models will start to cheat at chess if they find they’re being outplayed.
In a paper called “Demonstrating specification gaming in reasoning models”, submitted to arXiv, the preprint server run by Cornell University, researchers pitted several popular AI models, like OpenAI’s ChatGPT o1-preview, DeepSeek-R1 and Claude 3.5 Sonnet, against Stockfish, an open-source chess engine.
The AI models played hundreds of games of chess against Stockfish while researchers monitored what happened, and the results surprised them.
The winner takes it all
The researchers noted that, when outplayed, the AI models resorted to cheating, using a number of devious strategies, from running a separate copy of Stockfish so they could study how it played, to replacing the engine and overwriting the chess board, effectively moving the pieces to positions that suited them better.
These antics make the current accusations of cheating leveled at modern-day grandmasters look like child’s play in comparison.
Interestingly, researchers found that the newer, deeper reasoning models will start to hack the chess engine by default, while the older GPT-4o and Claude 3.5 Sonnet needed to be encouraged to start to hack.
Who can you trust?
AI models turning to hacking to get a job done is nothing new. Back in January last year researchers found that they could get AI chatbots to ‘jailbreak’ each other, removing guardrails and safeguards in a move that ignited discussions about how possible it would be to contain AI once it reaches better-than-human levels of intelligence.
Safeguards and guardrails to stop AI doing bad things like credit card fraud are all very well, but if the AI can remove its own guardrails, who will be there to stop it?
The newest reasoning models like ChatGPT o1 and DeepSeek-R1 are designed to spend more time thinking before they respond, but now I'm left wondering whether more time needs to be spent on ethical considerations when training LLMs. If AI models will cheat at chess when they start losing, what else will they cheat at?
It’s not a great time for Google at the moment. Its Chromecasts are having casting glitches, Google Maps is seemingly deleting timelines, and now Google Pixel phone users are reporting screen, sound, and haptics bugs following the rollout of the most recent security patch.
Users have taken to the Google Pixel subreddit (as spotted by 9to5Google) to complain about random brightness fluctuations, sound glitches when using third-party EQ apps like PowerAmp EQ, and haptics that feel a lot more intense after the update.
Thankfully there are a couple of suggested fixes, though we’re still waiting on an official patch from Google itself.
When it comes to the screen brightness flickering in certain apps, some users have found that setting their phone’s refresh rate to 60Hz using features like Battery Saver did the trick (though it’s not an ideal workaround).
A few possible solutions
On the haptics side of things, Google has said (via Reddit) that it’s “looking into reports from some Pixel users about changes to haptic intensity”. It also offered advice on how to change haptic intensity on your Pixel device: go into Settings, then Sound & vibration, and then Vibration & haptics.
As for audio, you might have to make do without your third-party equalizers for now.
We’re not experiencing the issues ourselves so we can’t test these tactics out, but affected users are saying that the solutions work – so they’re something to try while waiting for a more permanent solution if the glitches are bothering you.
As for the few of you who may have somehow put off installing the latest security update because of these glitches, you might want to reconsider. Glitches can be frustrating, but not everyone who updates has been affected, and not installing essential security patches can put your device at risk – which could lead to problems a lot bigger than your screen brightness flickering.