They solved more practice problems. They felt more confident. And then they bombed the test. A massive study just proved what nobody wanted to hear.

Read That Again

Here’s a stat that’s going to bother you:

Researchers at the University of Pennsylvania gave almost 1,000 high school students access to ChatGPT while they practiced math. Those students solved 48% more practice problems correctly than the group without ChatGPT. So far, so obvious — of course they did, they had an AI helping them.

But then they took the ChatGPT away and gave everyone the same test.

The students who’d been using ChatGPT scored 17% worse than the students who practiced on their own. Not the same. Not slightly worse. Significantly worse. The ones who did it the old-fashioned way — just textbook and notes — outperformed the AI-assisted group by a real margin.

And here’s the part that should genuinely scare you: the ChatGPT group had no idea. When surveyed, they said they didn’t think the AI had caused them to learn less. They felt more confident. They thought they’d done well. They were wrong.

The researchers titled their paper “Generative AI Can Harm Learning” and called it a “cautionary tale.” That’s academic-speak for “this is bad and you should pay attention.”

“But I’m Using It to Learn, Not to Cheat”

Yeah, that’s what every student in the study thought too.

When the researchers analyzed what the students actually typed into ChatGPT, the pattern was clear: most of them just asked for the answer. Not “help me understand this.” Not “what’s the first step?” Just: give me the answer to question 4.

And ChatGPT happily obliged. Except — and this is the other part nobody talks about — ChatGPT was only giving the correct answer about half the time. Its step-by-step approach for solving problems was wrong 42% of the time. Not a small error rate. Nearly half. So students were copying wrong answers and not building any of the skills to know the difference.

The researchers compared it to autopilot in aviation. When pilots rely too much on autopilot, they lose the ability to fly the plane manually when something goes wrong, which is exactly why the FAA has encouraged pilots to rely on it less. The same thing is happening with students and AI: you’re outsourcing the thinking, and the thinking is the whole point.

“Okay But What About Using It as a Tutor?”

The study actually tested this too. A third group got a modified version of ChatGPT that was designed to act like a tutor — giving hints instead of direct answers, guiding students through the process instead of just handing them the solution.

These students crushed the practice problems. They solved 127% more problems correctly than the no-AI group. That’s more than double. Sounds amazing, right?

Their test scores? Statistically no different from the students who used nothing. All that AI-guided practice didn’t translate into any more learning than just doing the problems yourself.

Let that sink in. The best-case scenario for AI-assisted studying — carefully designed, hint-based, tutor-style — produced the same results as just sitting with your textbook. And the normal ChatGPT experience? Made students measurably worse.

This Isn’t Just One Study

If this were one random experiment, you could write it off. But the data is piling up from everywhere.

A study from Aalto University and LMU Munich found that people who use AI to solve problems consistently overestimate how well they actually performed — and here’s the kicker — the more AI-literate someone is, the more overconfident they become. Not less. More. Being “good with AI” didn’t help people judge their own abilities. It made them worse at it.

The researchers found that most people didn’t even engage with ChatGPT beyond a single prompt. They copied the question in, got an answer, and moved on. No follow-up. No checking. No second-guessing. They called it “cognitive offloading” — your brain just… hands the work to the machine and clocks out.

A separate study of 666 participants found a significant negative correlation between frequent AI use and critical thinking abilities. And a Microsoft study found the pattern is strongest in younger users — ages 17 to 25 showed higher AI dependence and lower thinking scores than older age groups.

Meanwhile, an Israeli university tracked 36,000 students across 6,000 courses. In classes where assignments were AI-compatible (take-home essays, projects), grades went up, especially for lower-performing students. Sounds good until you realize what happened: the gap between strong and struggling students compressed, everyone’s grades started to look similar, and grades stopped meaning anything. For employers trying to figure out who actually knows what, the signal just became noise.

Why This Matters More Than You Think

Here’s where this connects to your actual life.

You probably already know that the job market is getting harder. Entry-level tech hiring has dropped 50% in five years, and 37% of managers say they’d rather use AI than hire a new grad. The companies that are still hiring are being way more selective about who they take.

So what are they selecting for? They’re looking for people who actually understand things. Not people who can get ChatGPT to produce the right output, but people who can explain their reasoning, think through novel problems, and catch AI mistakes — because AI makes a lot of them.

In that world, the student who spent four years having ChatGPT do their work is in trouble. Not because they cheated — but because they never built the skills that employers are now specifically testing for. The technical interview doesn’t care that you got an A on your homework. It cares whether you can walk through a problem and explain why your approach works.

And here’s the cruel irony: those students don’t even know they’re in trouble. Remember — the ChatGPT group felt more confident. They genuinely believed they understood the material. The overconfidence isn’t a bug. It’s the main feature of the problem. You don’t know what you don’t know, and AI makes you extra sure you know it.

What This Means If You’re in School Right Now

This isn’t a “don’t use AI” argument. That ship sailed. AI exists, it’s powerful, and pretending it doesn’t exist is its own kind of stupid. The question is how you use it, and right now, most students are using it in exactly the way that makes them worse.

The “paste the question, copy the answer” method is actively hurting you. Not just ethically. Academically. You are learning less, retaining less, and becoming more confident in knowledge you don’t actually have. That’s not an opinion — it’s what happens in controlled experiments with hundreds of students.

The “use AI for hints” method is… fine. It’s not worse than studying alone. But it’s not better either. Which means if you’re paying for an AI tool that gives you hints, you could get the same result with a textbook.

What actually works is using AI in a way that forces you to think, not skip thinking. The Harvard Gazette quoted a researcher who put it perfectly: “If you think you’re in school to produce outputs, then you might be OK with AI helping you produce those outputs. But if you’re in school because you want to be learning, remember that the output is just a vehicle through which that learning is going to happen.”

The difference is: did the AI make you smarter, or did it just make your homework look smarter? Because those are very different things, and the test scores prove it.

The Split That’s Coming

Here’s what’s about to happen in the next few years, and it’s going to be ugly for some people:

There are going to be two types of students who used AI in school. The ones who used it as a crutch — pasting questions, copying answers, submitting AI-written essays — are going to hit a wall. In college admissions interviews, in job interviews, in any situation where they have to demonstrate understanding without a chatbot open in another tab. And they won’t see it coming, because their grades looked fine.

Then there are going to be students who used AI as an actual tool — to check their understanding, to explore topics deeper, to get unstuck on specific steps while doing the hard thinking themselves. These students are going to be disproportionately prepared, because they built real skills and know how to work with AI.

The gap between these two groups is going to be massive. And it’s going to show up at the worst possible time — when it actually counts.

If you want to be in the second group, the move is simple but hard: use AI in a way that makes you do the work, not in a way that does the work for you. Tools like Grassroot are specifically built around this — AI that asks you questions instead of answering them, that forces you to explain your reasoning instead of giving you a solution to copy. Not because it’s trying to make your life harder, but because that’s literally what the research says actually works.

The students who figure this out now are going to have a ridiculous advantage. The ones who don’t are going to find out the hard way — on a test, in an interview, at a job — that they spent years learning nothing while feeling like they learned everything.

Sources