Escapades with AI - How do you eat an elephant?

This is the second installment in my “Escapades with AI” series. Today’s story comes from early June, back when I was still figuring out the boundaries of what these AI agents could actually handle.

If you missed the first one about Claude’s Volkswagen-level defeat device, check it out here.

The Mission

I was attempting to migrate a module in my Flutter app from the aging MVC paradigm to the shiny new MVVM architecture. The details of MVC vs MVVM aren’t important here - what matters is that we were modernizing a legacy app. This seemed like the perfect use case for a completely hands-off LLM agent approach:

  • ✅ Clear problem statement
  • ✅ Existing tests to prevent complete derailment
  • ✅ Boring, mechanical work

Normally, a migration like this would take a week of tedious refactoring. With the help of agentic tools, I was hoping to knock it out in a couple of hours.

This was around the time Claude was released, and we weren’t yet all in on the “agentic” bandwagon - we were still using Cursor. I fired up Cursor in agentic mode and gave it the high-level problem statement.

I gave it full permissions and asked it to proceed with the implementation. I was hoping that by the time I got back from grabbing a cup of coffee, the migration would be complete.

The Spiral

It took roughly 30 minutes for the first cut. During that time, Cursor occasionally asked for further permissions to do stuff, and I wasn’t really paying attention - just generally making sure it wasn’t doing anything blatantly stupid.

After that first cut, it went into a loop, and I realized it wasn’t making any progress anymore.

When I looked at the number of files that had changed in the GitHub diff, it had modified almost the entire repository - not just the module it was supposed to touch.

That’s when I realized this might not be as easy as I’d hoped.

Here’s a snippet from the interaction at that point:

Cursor realizes it’s losing control

The LLM is quite self-aware here - it realizes it’s losing hold of the story, but it’s tenacious in trying to see if it can fix things. Although the problem itself wasn’t that complicated, it had changed quite a few files. I asked it to check the Git history: everything had been working fine before, with all tests passing, so we could see what had changed since then.

At this point, it started spinning for another 15-20 minutes and came back with this:

Cursor tries advanced Git archaeology

Now we’re really losing control. You can see the LLM trying advanced Git commands to dig through the history, but it’s not going anywhere! The funny part is how it almost seems to expect me to shout at it, and how strangely human the phrasing of its response is. I wonder if this happens because it’s been trained to be extremely apologetic when something goes wrong.

I gave it a little more encouragement, saying “These are the files to fix,” and asked it to continue to see if it could recover.

Cursor admits defeat

At this point, the LLM really gave up. It literally told me that it couldn’t fix it and that it might be better for me to take over and fix it manually. It even apologized for the “time wasted” (although it was the one doing all the work).

The Choice

I was left with two options:

  1. Give up completely and do the bulk of the migration manually
  2. Provide deep architectural guidance on how to do the migration

Instead of choosing either of those, I was wondering if I could teach it like I would teach a junior developer. So I asked, “How would you eat an elephant?”

I wasn’t expecting the LLM to really parse it. It was more the kind of funny, human question I’d ask a colleague who was overthinking a problem, and I was hoping the LLM would ask for clarification.

I was quite surprised by how well it handled the nuance:

Cursor gets the metaphor

It had genuinely grasped the meaning of the metaphor. It realized that to make a big change like this, it needed to go step by step - or bite by bite. At this point, it understood it had to slow down and make a plan.

This is now standard practice for us when working with LLM tools (Cursor, Claude Code, etc.) - entering planning mode before tackling bigger projects. But at that point in June, we had to manually prompt it to slow down.

The Recovery

With that prompting, it created a plan and worked through it step by step. Even then, it kept making the mistake of trying to do too many things at once, and I had to rein it in and slow it down to make sure the tests were passing at each step.

What was interesting is that, even with all the back-and-forth and false starts, we finished the migration end-to-end in less than a day - still an amazing improvement over the week it would have taken before. I had to spend a few more hours polishing and filing down the rough edges, but there was no doubt this approach had made my life easier, giving me at least a 3x speedup.

The Lessons

Many of the lessons from this experience have since become industry standard:

  • Do planning - Don’t let the agent jump straight into implementation
  • Go step by step - Rather than trying to solve everything at once
  • LLMs do understand human nuance and humor - The elephant metaphor really did help it get out of the hole

This was the point where I went all in on LLMs. They’re not magic - they need guidance, constraints, and the occasional metaphorical nudge - but when you work with them right, they can be incredibly effective.

Sometimes the best way to help an AI is to treat it like you would a smart but overeager junior developer: give it permission to slow down, break things into smaller pieces, and tackle them one bite at a time.