How to prompt an LLM for a training plan that actually works
ChatGPT, Claude, and Gemini can all write a genuinely good training plan - if you ask properly. Here's the prompt structure that produces a usable, periodized plan instead of generic filler, plus the step everyone forgets.
Most people prompt an LLM for a training plan the same way they'd ask a stranger at a bus stop: "give me a marathon plan." They get back something that looks right - twelve weeks, a long run on Sunday, some intervals - and is actually generic filler. No paces tied to their fitness. No logic to the progression. A taper that appears because taper is a word that goes near the end of marathon plans, not because the model reasoned about peaking.
The model didn't fail. The prompt did.
A current LLM - ChatGPT, Claude, Gemini, doesn't much matter which - can produce a properly periodized plan with sensible progression, recovery weeks, and a real taper. The difference between filler and a plan you'd actually trust is entirely in what you put in front of it. Here's how to do it properly.
Why the lazy prompt fails
"Write me a 16-week marathon plan" gives the model nothing to anchor on. It doesn't know your current weekly volume, your recent race times, how many days you can train, or whether you've run a marathon before. So it does the only thing it can: it averages. It produces the plan that's least wrong for the average person who might type that sentence.
Average is useless. A plan built for the average marathoner is too hard for a first-timer and too easy for someone chasing a 3:10. The entire value of asking a machine instead of buying a $20 book is that the machine can tailor. If you don't give it the inputs to tailor with, you've paid for personalization and received a paperback.
The structure: persona, context, constraints, output
Good plan prompts have four parts. Skip any one and quality drops.
1. Persona. Tell the model who it is. "You are an experienced endurance coach who builds periodized plans grounded in established training principles." This isn't theatre - it pulls the model toward the part of its training data that contains real coaching knowledge (progressive overload, polarized intensity, recovery weeks) and away from generic wellness-blog content.
2. Context - this is where plans are won or lost. Everything the model needs to tailor:
- Your event and date. "Half marathon on 4 October 2026."
- Your goal. "Sub-1:45" or "just finish comfortably" - these produce very different plans.
- Current fitness, in numbers. Recent race times or time trials are gold. "I ran 10K in 48:30 six weeks ago" lets the model derive every training pace. "I'm reasonably fit" tells it nothing. If you've got recent training data - weekly mileage, a screenshot of your last few weeks - paste it in; the model will anchor to real numbers instead of averages.
- Current weekly volume. "I'm running about 30 km a week across 4 runs right now." This sets the safe starting point - the single biggest factor in not getting injured.
- Days available and any fixed constraints. "I can train 4 days a week, long run has to be Saturday, no running Mondays."
- History and limitations. Previous longest run, past injuries, age if relevant.
3. Constraints. The rules the plan must obey. "Don't increase weekly volume by more than 10% week to week. Include a recovery week every fourth week. Build to a peak three weeks out, then taper. Give every hard workout a specific pace or effort target, not just 'tempo run'."
4. Output format. Tell it exactly how to lay the plan out. This matters more than people expect, and we'll come back to why.
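To see what the volume rules in part 3 imply in numbers, here is a minimal Python sketch. It rests on assumptions that are ours, not any coaching standard: each build week grows by 10% over the previous build week (not over the cutback week), every fourth week drops to a fixed fraction of the last build week, and the taper is left out. The function name `weekly_volumes` and the 0.8 cutback factor are illustrative.

```python
def weekly_volumes(start_km, weeks, max_increase=0.10,
                   cutback_every=4, cutback_factor=0.8):
    """Return weekly volumes in km for the build phase (taper not modelled)."""
    volumes = []
    build_km = start_km  # volume of the most recent non-cutback week
    for week in range(1, weeks + 1):
        if week % cutback_every == 0:
            # Assumed cutback: a fixed fraction of the last build week.
            volumes.append(round(build_km * cutback_factor, 1))
        else:
            if week > 1:
                # Assumed growth: measured against the last build week.
                build_km *= 1 + max_increase
            volumes.append(round(build_km, 1))
    return volumes


# Example: starting from the 30 km/week used in this article.
print(weekly_volumes(30, 8))
```

Starting from 30 km, this lands at roughly 30, 33, 36, 29, 40, 44, 48, 39 km over eight weeks. Even a "safe" 10% rule compounds quickly, which is why the starting volume you give the model matters so much.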
A prompt you can copy
Fill in the brackets and paste this into ChatGPT, Claude, or Gemini:
You are an experienced endurance running coach who builds
periodized training plans grounded in established principles
(progressive overload, mostly-easy intensity distribution,
recovery weeks, and a proper taper).
Build me a training plan.
CONTEXT
- Event: [half marathon] on [date]
- Goal: [sub-1:45 / finish comfortably / specific time]
- Recent race or time trial: [10K in 48:30, six weeks ago]
- Current weekly volume: [~30 km across 4 runs]
- Days I can train per week: [4]
- Fixed constraints: [long run must be Saturday; no running Mondays]
- History: [longest run ever 18 km; mild Achilles issue last year]
CONSTRAINTS
- Do not increase weekly volume more than ~10% week to week.
- Include a recovery (cutback) week every 4th week.
- Build to a peak ~3 weeks before race day, then taper.
- Every quality session must have a specific pace or effort
target derived from my recent race time. No vague "tempo run".
- Keep ~80% of running easy.
Before you write the plan: (1) derive my training paces from my
recent race time and show them to me, and (2) ask me up to five
questions about anything still missing that would change the plan.
Only then write the full plan - one workout per day, week by week,
organized into phases: base, build, peak, then taper.
Two lines in there do the heavy lifting. "Derive my training paces and show them to me" forces the model to reason out loud about your fitness before it generates anything, which produces far more coherent paces and lets you check the math before you trust it. And "ask me up to five questions" flips the usual dynamic: instead of the model guessing at the gaps in your brief, it tells you what it's missing. Its questions are often sharper than your original prompt - it'll ask about your sleep, your longest recent run, how you handled your last build - and answering them is where a generic plan turns into yours.
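If you want a rough number to check the model's math against, you can estimate equivalent race times yourself. The sketch below uses Riegel's race-prediction formula, T2 = T1 × (D2 / D1)^1.06, a widely used rule of thumb. That choice is an assumption for illustration, not necessarily how your model derives paces, and the helper names are made up for this example.

```python
def riegel_time(known_minutes, known_km, target_km, exponent=1.06):
    """Estimate a race time at target_km from one known result (Riegel)."""
    return known_minutes * (target_km / known_km) ** exponent


def pace_per_km(total_minutes, distance_km):
    """Format an average pace as 'M:SS' per kilometre."""
    seconds = round(total_minutes * 60 / distance_km)
    return f"{seconds // 60}:{seconds % 60:02d}"


# Example from this article: 10K in 48:30.
ten_k_minutes = 48.5
five_k = riegel_time(ten_k_minutes, 10, 5)
half = riegel_time(ten_k_minutes, 10, 21.0975)

print("10K pace:", pace_per_km(ten_k_minutes, 10))
print("Estimated 5K pace:", pace_per_km(five_k, 5))
print("Estimated half marathon pace:", pace_per_km(half, 21.0975))
```

For a 48:30 10K this estimates a 5K pace of roughly 4:39/km and a half marathon of roughly 1:47. Treat it as a sanity check only: if the model's paces sit far from these, ask it to explain the gap.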
Don't stop at the first answer - make it a conversation
The single biggest upgrade over "give me a plan" isn't a better prompt. It's refusing to treat the first output as final. A plan you build over five or six messages is in a different league from one you accept in one shot, because each exchange lets you correct the model's assumptions before they compound.
A workflow that consistently produces good plans:
a. Let it interview you first (the prompt above triggers this). Answer its questions properly. This is the highest-leverage two minutes in the whole process.
b. Get a baseline, then impose structure. Once it's written a draft, ask it to reorganize the block into clear phases and explain the purpose of each: a base phase building aerobic volume and consistency, a build phase adding intervals, tempo, and hills, a peak, then a taper that gets you to the start line fresh. Making the model justify each phase exposes whether there's real reasoning underneath or just a shape.
c. Then refine with targeted nudges. Small, specific instructions beat regenerating from scratch: "make my long runs slightly longer but keep weekly load about the same," "add a rest day before every key session," "simplify this for someone newer to structured training," "adjust the easy runs for trail instead of road." Each nudge is a course correction the model applies instantly - the thing a static PDF plan can never do for you.
You can layer non-running work in the same way once the runs are set - ask it to fit in two or three strength sessions a week without scheduling a leg session the day before a hard run, or to add short daily mobility work. Just add it after the running skeleton is solid, so it slots around your key sessions instead of crowding them.
Pick the model, but don't agonize
In practice the frontier models are close enough that the prompt matters more than the choice. Broadly, from testing across the community: ChatGPT and Claude both produce strong, specific, periodized plans; Gemini produces structurally sound plans that are sometimes lighter on exact interval and pace detail. Any of the three, given the prompt above, will beat what any of them produces from "make me a marathon plan."
Use whichever you already pay for. If you have access to more than one, generate the plan in two and compare - disagreements between them are usually the spots worth scrutinizing.
The part everyone gets wrong: verify, don't worship
An LLM will write you a confident plan and never tell you it's unsure. That confidence is the trap. Reviews of AI-generated plans keep landing on the same two findings: the plans are often genuinely good, and the model is relentlessly agreeable - it confirms your choices rather than challenging them. It won't tell you 4 days a week isn't enough for your sub-3 goal. It won't flag that ramping from 30 to 50 km in a month is how people get stress fractures.
So treat the first output as a draft, not a verdict. Check three things:
a. The jump from your current volume to week one. It should be gentle. If the plan opens at 45 km and you're running 30, push back: "start me closer to my current 30 km and build from there."
b. The progression rate. Weekly volume should rise gradually with regular cutback weeks, not climb in a straight line to race day.
c. The paces. They should follow from your recent race time. If the model assigned a pace your recent results can't support - long intervals faster than your current 5K pace, say - it's guessing. Ask it to show its working.
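The first two checks are mechanical enough to script once you've typed the plan's weekly totals into a list. A minimal sketch follows, and several things in it are assumptions for illustration: the function name `check_progression`, the 10% threshold, measuring increases against the highest week so far (so the week after a cutback isn't flagged), and counting any week at least 10% below that high as a cutback.

```python
def check_progression(current_km, plan_km, max_increase=0.10, cutback_every=4):
    """Return human-readable warnings for a list of weekly volumes in km."""
    warnings = []
    peak = current_km  # highest weekly volume so far, starting from today
    build_streak = 0   # consecutive weeks without a cutback
    for week, km in enumerate(plan_km, start=1):
        # Check (a) and (b): is this week too big a jump over the running high?
        if km > peak * (1 + max_increase) + 1e-9:
            pct = (km / peak - 1) * 100
            warnings.append(
                f"Week {week}: {km} km is {pct:.0f}% above the previous high of {peak} km"
            )
        # Assumed definition of a cutback: at least 10% below the running high.
        if km <= peak * 0.9:
            build_streak = 0
        else:
            build_streak += 1
            if build_streak == cutback_every:
                warnings.append(
                    f"Week {week}: {build_streak} weeks in a row without a cutback"
                )
        peak = max(peak, km)
    return warnings


# Example from this article: you run 30 km a week and the plan opens at 45 km.
for warning in check_progression(30, [45, 47, 49, 40]):
    print(warning)
```

A warning here isn't proof the plan is wrong. It marks a spot to question the model about, which is the point of the exercise.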
Pushing back is the whole point of using a conversational model. "This looks too aggressive for my history, make week one easier and add a recovery week" gets you a better plan in seconds. A static PDF can't revise itself when you object, and that is exactly where the AI earns its keep.
Then comes the step nobody writes about
You now have a good plan. It's sitting in a chat window. This is where almost every guide stops - and where the actual problem begins.
A plan in a chat thread is not a plan you can run. On Tuesday morning you'll open the chat, scroll to week 3, read "10K with 8×400m at 5K pace," start an outdoor run on your Apple Watch, and try to remember how many intervals you've done while you're doing them. By week six you've stopped checking the chat entirely. The plan was good. The execution died in the gap between the chat window and your wrist.
This is the part worth solving deliberately. Two things help:
First, prompt for a clean, consistent output format - one workout per line, plain text, full week labels, specific targets. A tidy, uniform plan is far easier to act on (and far easier for any tool to read) than a wall of prose with bold headers and bullet points.
Second, get it off the screen and onto your watch as structured workouts, so each day's session is queued up ready to start - warmup, intervals, recovery, cooldown - without you retyping anything into Apple's workout builder. That last mile is exactly the problem Stopa exists to close: paste the plan your LLM just wrote, see it on a calendar, push each workout to your Apple Watch. We don't generate the plan - you and your model of choice already did the interesting part. We just make sure it ends up somewhere you can actually run it.
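Returning to the first point, here is a small Python sketch of why a uniform, one-workout-per-line layout is easier for a tool to read. The `Week N | Day | description` format is a made-up example for this article, not a format that Stopa or any other app requires, and the names `Workout` and `parse_plan` are hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical line format: "Week 3 | Tue | 8 km easy"
LINE_PATTERN = re.compile(r"^Week (\d+) \| (\w+) \| (.+)$")


@dataclass
class Workout:
    week: int
    day: str
    description: str


def parse_plan(text):
    """Split plan text into Workout records plus any lines that don't fit."""
    workouts, skipped = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = LINE_PATTERN.match(line)
        if match:
            workouts.append(
                Workout(int(match.group(1)), match.group(2), match.group(3))
            )
        else:
            skipped.append(line)  # prose, headers, stray formatting
    return workouts, skipped


sample = """
Week 3 | Tue | 10 km with 8x400m at 5K pace
Week 3 | Thu | 8 km easy
**Week 3 notes:** stay relaxed on the easy days
Week 3 | Sat | 16 km long run, easy effort
"""
workouts, skipped = parse_plan(sample)
print(len(workouts), "workouts parsed;", len(skipped), "line(s) skipped")
```

The three uniform lines parse cleanly and the bolded note is the one line that falls out. Every bit of decoration the model adds is something a reader, human or software, has to work around.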
There's a second half to this loop most people miss. Your LLM writes the plan once and never finds out what happened next - whether you hit the intervals, whether the easy days stayed easy, whether week three quietly fell apart. Stopa watches your real runs come back through your Apple Watch and HealthKit and pairs each one against the workout you planned, so you see actual versus prescribed at a glance. That picture - how the block really went - is the best possible context for your next conversation with the model. You stop prompting from memory and start prompting from what you actually did.
If you want the deeper version of the prompting craft for a marathon specifically, see "how to use a ChatGPT marathon training plan". For why the chat-to-watch gap is so stubborn, "from PDF to Apple Watch" walks through every workflow people try. And before you trust any plan to your watch, it's worth knowing what Apple Watch can and can't actually sync.
The model can write you a real plan. Give it the inputs to do it, check its work like you'd check a keen but overconfident training partner, and then make sure the plan lands somewhere you'll actually follow it.
Stopa is an iOS app for endurance athletes who already have a training plan - from an LLM, a coach, or a PDF - and want it on their Apple Watch without retyping every workout.