← Beau Sumile
Poker Simulator MVP

Build the home-game trainer without the casino-app smell.

This walkthrough turns plan.md into an operating map: what the product is, what the other agent should build, where loops are safe, where human taste matters, and which gates define a real MVP.

8 build phases from foundation through PLO/PLO8
9 AI opponent archetypes, including Tilted Reg
28-41 focused working days estimated for MVP
11 launch gates in the MVP definition
Table seats: Nit 12/9, LAG 32/26, Station 45/8, TAG 22/18, Hero (review)
Casino Library: felt, leather, brass, quiet feedback, full hand history
Product thesis

The wedge is fun NLHE now, variant depth later.

The product sits between expensive solver tools and play-money casino rooms: it should feel like playing poker, while producing post-session insight that actually improves decisions.

The default experience should feel like playing, not studying. Study features are allowed, but only when they do not pollute the main table experience.
Joy to use: home-game table feel, readable UI, no casino tricks
Useful review: EV errors, leak detection, equity timelines
Variant depth: NLHE first, PLO/PLO8 architecture ready
Real practice, not play money

Target users

The plan weights casual learning, home-game realism, and serious review without letting any one audience pull the product off center.

Casual learner: 40%
Home-game regular: 30%
Serious amateur: 20%
PLO/PLO8 learner: 10%

Trust

Seedable RNG, reproducible hands, full history, explicit rules, and property tests are product features, not engineering garnish.
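
A minimal sketch of what "seedable RNG, reproducible hands" can look like in practice. It assumes a mulberry32 generator and a Fisher-Yates shuffle; both are swapped in for illustration, and the function names (`mulberry32`, `freshDeck`, `shuffledDeck`) are hypothetical rather than taken from plan.md.

```typescript
// Illustrative only: mulberry32 is a swapped-in generator, not necessarily
// the RNG that plan.md specifies. All names here are hypothetical.
type Rng = () => number; // float in [0, 1)

function mulberry32(seed: number): Rng {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// 52 cards as two-character strings such as "As" or "7h".
function freshDeck(): string[] {
  const deck: string[] = [];
  for (const rank of "23456789TJQKA".split("")) {
    for (const suit of "cdhs".split("")) deck.push(rank + suit);
  }
  return deck;
}

// Fisher-Yates shuffle driven only by the seeded generator, so the same
// seed always yields the same deck order.
function shuffledDeck(seed: number): string[] {
  const rng = mulberry32(seed);
  const deck = freshDeck();
  for (let i = deck.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1));
    const tmp = deck[i]!;
    deck[i] = deck[j]!;
    deck[j] = tmp;
  }
  return deck;
}
```

Storing the seed alongside the hand history is what would let a reported hand be replayed exactly, since no other source of randomness enters the shuffle.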

Taste

"Casino Library" is the screenshot-test standard: felt, leather, brass, editorial typography, deliberate motion, no neon clutter.

Moat

Hand review gets disproportionate investment because competitors either skip it or make it feel like homework.

Phase roadmap

Build in order, with testable exits.

Each phase has a working deliverable. The other agent should complete acceptance criteria before moving forward, and should treat UI-heavy phases as supervised work.

Timeline: W1, W2, W3, W4, W5, W6, W7, W8, W9, Post
System shape

The engine and AI packages are the differentiated IP.

The monorepo keeps the pure poker engine, AI opponents, review engine, hand history, and web app behind clean package boundaries.

Do not collapse packages for convenience. The separation is what makes testing, licensing, and later variant expansion manageable.
apps/web: Next.js App Router, play table, review UI, pricing, auth, settings
packages/engine: deck, RNG, evaluator, equity, state machine, variant adapters
Prisma + Postgres: users, sessions, hands, subscriptions, preferences
ai-opponents: archetypes, ranges, decision, sizing, calibration harness
review-engine: replay, stats, leak detection, EV comparison, narration
hand-history: canonical JSON and PokerStars text export path
shared-types: cards, actions, hands, players, pots, street state
docs/: architecture, engine, archetypes, review, design direction
scripts/: fuzz, benchmark equity, archetype profiling, seeds, diagnostics

End-to-end product flow

1 Sign up

Magic link or Google OAuth, no password flow.

2 Set table

Stakes, archetypes, custom rules, free-tier limits.

3 Play hand

Engine resolves legal actions and deterministic bot decisions.

4 Persist history

Canonical JSON captures every state transition.

5 Review

Replay, EV deltas, leak detection, export, coaching.
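
One way to read "canonical JSON" in step 4 is that the same hand always serializes to the same bytes, regardless of how the object was built. The sketch below assumes that interpretation. The record shape and the `canonicalize` helper are hypothetical, not the plan's actual schema.

```typescript
// Hypothetical record shape; field names are illustrative only.
type Street = "preflop" | "flop" | "turn" | "river" | "showdown";

type HandEvent = {
  seq: number; // strictly increasing order of state transitions
  street: Street;
  seat: number;
  action: "post" | "fold" | "check" | "call" | "bet" | "raise";
  amount: number; // chips added by this action (0 for fold or check)
};

type HandRecord = {
  handId: string;
  seed: number;
  variant: "NLHE";
  events: HandEvent[];
};

// Serialize with object keys sorted recursively. Array order is preserved
// because event order is meaningful. Assumes JSON-safe input (no undefined,
// functions, or cycles).
function canonicalize(value: unknown): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) {
    return "[" + value.map((v) => canonicalize(v)).join(",") + "]";
  }
  const obj = value as Record<string, unknown>;
  const keys = Object.keys(obj).sort();
  return "{" + keys.map((k) => JSON.stringify(k) + ":" + canonicalize(obj[k])).join(",") + "}";
}
```

With a stable serialization, two runs of the same seeded hand can be compared as plain strings, which keeps replay and regression checks simple.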

Phase 1A

The engine is the one place loops should cook hard.

NLHE correctness is foundational: known-good vectors, benchmark equity, property tests, fuzzing, and strict TypeScript gates all come before higher-level polish.

The evaluator should port a proven algorithm concept. Avoid naive evaluators and avoid external poker packages with hidden edge-case bugs.

Correctness stack

Unit vectors: 50+
Equity spots: 200+
Fuzz hands: 10k
Coverage: 90%

State machine path

Setup → Blinds → Deal → Preflop → Flop → Turn → River → Showdown → Payout

Legal action validation, side-pot handling, all-in players, folds, and chip conservation are the high-risk edges.
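
Side pots and chip conservation are the kind of edge a property test can pin down. The sketch below layers pots by the distinct commitment levels of players still in the hand, then checks that no chips appear or vanish. Type and function names (`Contribution`, `buildPots`, `chipsConserved`) are assumptions for illustration, not the engine's API.

```typescript
interface Contribution {
  seat: number;
  committed: number; // total chips this player has put in during the hand
  folded: boolean;
}

interface Pot {
  amount: number;
  eligibleSeats: number[];
}

// Build main and side pots. Each distinct commitment level among live
// players closes one layer; folded players' chips stay in the layers they
// paid into but never make them eligible.
function buildPots(players: Contribution[]): Pot[] {
  const levels = players
    .filter((p) => !p.folded && p.committed > 0)
    .map((p) => p.committed)
    .sort((a, b) => a - b)
    .filter((level, i, all) => i === 0 || level !== all[i - 1]);

  const pots: Pot[] = [];
  let previous = 0;
  for (const level of levels) {
    let amount = 0;
    for (const p of players) {
      amount += Math.min(p.committed, level) - Math.min(p.committed, previous);
    }
    const eligibleSeats = players
      .filter((p) => !p.folded && p.committed >= level)
      .map((p) => p.seat);
    pots.push({ amount, eligibleSeats });
    previous = level;
  }
  return pots;
}

// Invariant: every committed chip lands in exactly one pot. A false result
// signals an impossible state, such as a folded player having committed
// more than any live player.
function chipsConserved(players: Contribution[], pots: Pot[]): boolean {
  const committed = players.reduce((sum, p) => sum + p.committed, 0);
  const inPots = pots.reduce((sum, pot) => sum + pot.amount, 0);
  return committed === inPots;
}
```

For example, with seat 0 all-in for 50, seats 1 and 2 in for 100 each, and seat 3 folded after putting in 20, this yields a 170-chip main pot for seats 0 to 2 and a 100-chip side pot for seats 1 and 2. A pot with a single eligible seat is an uncalled bet and would simply be returned to that seat.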

Variant adapter

NLHE is fully implemented for MVP. PLO and PLO8 should be architecture stubs until Phase 7, where exact-two-card and low qualifier rules get their own plan.

NLHE: live | PLO: stub | PLO8: stub
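
A rough sketch of how a variant adapter boundary might look, with NLHE implemented and PLO/PLO8 left as stubs that fail loudly. The interface, the `candidateHands` method, and the registry are all hypothetical; only the hole-card counts (2 for NLHE, 4 for PLO and PLO8) are standard rules.

```typescript
type VariantId = "NLHE" | "PLO" | "PLO8";

interface VariantAdapter {
  id: VariantId;
  status: "live" | "stub";
  holeCardCount: number;
  // Every 5-card hand the player is allowed to show down.
  candidateHands(hole: string[], board: string[]): string[][];
}

// All k-element combinations of items, preserving input order.
function combinations<T>(items: T[], k: number): T[][] {
  if (k === 0) return [[]];
  if (items.length < k) return [];
  const head = items[0] as T;
  const rest = items.slice(1);
  const withHead = combinations(rest, k - 1).map((c) => [head].concat(c));
  return withHead.concat(combinations(rest, k));
}

// NLHE: any five of the hole cards plus board.
const nlhe: VariantAdapter = {
  id: "NLHE",
  status: "live",
  holeCardCount: 2,
  candidateHands: (hole, board) => combinations(hole.concat(board), 5),
};

// Stubs keep the architecture honest without pretending to work.
function stubVariant(id: VariantId, holeCardCount: number): VariantAdapter {
  return {
    id,
    status: "stub",
    holeCardCount,
    candidateHands: () => {
      throw new Error(id + " is a stub until Phase 7");
    },
  };
}

const adapters: Record<VariantId, VariantAdapter> = {
  NLHE: nlhe,
  PLO: stubVariant("PLO", 4),
  PLO8: stubVariant("PLO8", 4),
};
```

Keeping the combination rule behind the adapter is where an exact-two-card rule could later slot in without touching the state machine.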
Phase 2

Archetypes are teaching tools, not celebrity replicas.

Bot decisions must be pure deterministic TypeScript and return in under 50ms. LLMs are reserved for async review narration after the hand is complete.

Stats calibration can loop. Human recognizability cannot. The plan requires manual playtesting after profile stats pass.

VPIP / PFR map

Each dot shows the target personality space. Loose-passive and maniac are intentionally far apart even when VPIP is high.
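
The stats side of calibration is mechanical, which is why it can loop. As a sketch: VPIP is the share of hands where a player voluntarily put chips in preflop, and PFR is the share where they raised preflop. The types, function names, and tolerance parameter below are hypothetical, and reading a target pair such as 12/9 as VPIP/PFR is an assumption.

```typescript
interface PreflopSummary {
  voluntarilyInvested: boolean; // called or raised preflop; posting blinds does not count
  raised: boolean; // raised preflop at least once
}

interface Profile {
  vpip: number; // percent of hands
  pfr: number; // percent of hands
}

function profileOf(hands: PreflopSummary[]): Profile {
  if (hands.length === 0) return { vpip: 0, pfr: 0 };
  const vpipCount = hands.filter((h) => h.voluntarilyInvested).length;
  const pfrCount = hands.filter((h) => h.raised).length;
  return {
    vpip: (vpipCount * 100) / hands.length,
    pfr: (pfrCount * 100) / hands.length,
  };
}

// True when both stats land within `tolerance` percentage points of target.
function withinTolerance(actual: Profile, target: Profile, tolerance: number): boolean {
  return (
    Math.abs(actual.vpip - target.vpip) <= tolerance &&
    Math.abs(actual.pfr - target.pfr) <= tolerance
  );
}
```

A check like this can only confirm the frequencies. Whether the bot feels like a recognizable player type still needs a human at the table.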

Decision pipeline

1 Preflop

Position and action select range tables.

2 Postflop

Hand strength, board texture, villain range.

3 Sizing

Pot, stack, street, and archetype style.

4 Seeded mix

Deterministic randomization for replays.

5 Action

Fold, check, call, bet, raise in under 50ms.
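
Step 4 of the pipeline above can be sketched as a pure function: given a weighted action mix and one draw from a seeded generator, pick an action. Because the draw is passed in, the same seed replays the same decision. The `Mix` shape and `pickFromMix` name are assumptions for illustration.

```typescript
type Action = "fold" | "check" | "call" | "bet" | "raise";

interface Mix {
  action: Action;
  weight: number; // relative frequency; weights need not sum to 1
}

// `roll` is a number in [0, 1) supplied by the caller's seeded RNG.
// No clock, no global state, no I/O: same inputs, same action.
function pickFromMix(mix: Mix[], roll: number): Action {
  if (mix.length === 0) throw new Error("mix must contain at least one action");
  const total = mix.reduce((sum, m) => sum + m.weight, 0);
  let threshold = roll * total;
  for (const m of mix) {
    if (threshold < m.weight) return m.action;
    threshold -= m.weight;
  }
  // Floating-point guard: fall back to the last listed action.
  return mix[mix.length - 1]!.action;
}
```

For instance, with weights fold 50, call 30, raise 20, a roll of 0.1 folds, 0.6 calls, and 0.95 raises. Work of this size is a few arithmetic operations, which leaves the 50ms budget for range lookups and hand-strength evaluation.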

Phase 3 and 4

Review is the moat; coach mode stays light.

Post-hand and post-session review should reveal decision quality without turning live play into a solver lecture.

The quality bar favors fewer, higher-confidence leak observations. False positives cost trust faster than missing a marginal leak.

Replay

Scrub action-by-action with table state restoration, step forward/back, speed controls, and hole-card reveal.

Decision EV

Show the user action, two or three alternatives, EV error in chips, and a plain-English explanation.

Session leaks

Scan patterns only after enough samples and only surface leaks costing more than 2 BB on average.

Drill / coach

Repeat spots like blind defense or c-bet decisions, then show one-line feedback after decisions.

Equity timeline concept

Chart axes: streets (Preflop, Flop, Turn, River, Showdown) against equity (25%, 50%, 75%)

Every meaningful action should be inspectable: equity before and after, alternative action EV, and opponent range explanation.
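
To make "EV error in chips" concrete, here is a deliberately simplified sketch: a final-street call with no further betting, measured against folding as zero. The review engine's real EV model is not described in this walkthrough, so the formula's scope and the names `evOfCall` and `evErrorChips` are assumptions.

```typescript
interface EvOption {
  action: string;
  evChips: number;
}

// EV of calling relative to folding (EV 0), assuming no further betting.
// potBeforeCall includes the bet being faced; equity is a probability in [0, 1].
function evOfCall(potBeforeCall: number, toCall: number, equity: number): number {
  return equity * potBeforeCall - (1 - equity) * toCall;
}

// EV error: how many chips the chosen action gave up versus the best option.
function evErrorChips(chosenAction: string, options: EvOption[]): number {
  const chosen = options.find((o) => o.action === chosenAction);
  if (!chosen) throw new Error("chosen action missing from options");
  const best = options.reduce((max, o) => Math.max(max, o.evChips), -Infinity);
  return best - chosen.evChips;
}
```

Facing a 50-chip bet into a pot that now holds 150, a call breaks even at 25% equity. At 50% equity the call is worth 50 chips, so folding there would show an EV error of 50.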

Leak detector gates

Relevant decisions: 20+
EV magnitude: 2 BB
Observations: 3-5
False positives: 0

The review engine should suppress low-confidence takes. Better to say one true thing than five noisy things.
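
The gates above translate naturally into a filter. This sketch keeps only candidates with at least 20 relevant decisions and an average cost above 2 BB, then caps the list at five. Treating "3-5 observations" as a cap, and ranking by average EV loss, are assumptions; the type and function names are hypothetical.

```typescript
interface LeakCandidate {
  label: string;
  relevantDecisions: number;
  avgEvLossBb: number; // average EV lost per relevant decision, in big blinds
}

const MIN_RELEVANT_DECISIONS = 20;
const MIN_AVG_EV_LOSS_BB = 2;
const MAX_OBSERVATIONS = 5;

// Suppress anything under-sampled or too small to matter, then surface
// the costliest few. Fewer than three may survive; that is preferable to
// padding the list with low-confidence takes.
function surfaceLeaks(candidates: LeakCandidate[]): LeakCandidate[] {
  return candidates
    .filter((c) => c.relevantDecisions >= MIN_RELEVANT_DECISIONS)
    .filter((c) => c.avgEvLossBb > MIN_AVG_EV_LOSS_BB)
    .sort((a, b) => b.avgEvLossBb - a.avgEvLossBb)
    .slice(0, MAX_OBSERVATIONS);
}
```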

Automation strategy

Loop only where the feedback is mechanical.

The plan is explicit: loops are excellent for correctness and calibration, but harmful for aesthetics, UX taste, and qualitative insight quality.

Hard rule for agents: do not modify acceptance criteria or test files to make a loop pass.
Phase | Loop? | Why

Nightly regression watcher

From Phase 1A onward, run the full test suite, fuzz checks, lint, and typecheck every 30 minutes. Capture failures; do not auto-fix them.

pnpm test
pnpm test:fuzz
pnpm test:lint
pnpm test:typecheck

Bug-mining loop

After engine completion, fuzz with fresh seeds, minimize crashes, write regression tests, and continue until 100 consecutive seeds run clean.

Minimize → Write regression → Do not fix inline → Continue
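
The stopping rule of the bug-mining loop can be sketched as a small driver: run a check per seed, record failures instead of fixing them, and reset the clean streak whenever one fails. `HandCheck`, `mineSeeds`, and the `maxSeeds` safety cap are hypothetical; in the real loop, minimization and a regression test would follow each recorded failure.

```typescript
// A check plays one seeded hand and throws if any invariant breaks.
type HandCheck = (seed: number) => void;

interface MiningResult {
  failingSeeds: number[];
  seedsTried: number;
  cleanStreak: number;
}

function mineSeeds(
  check: HandCheck,
  startSeed: number,
  targetStreak: number = 100,
  maxSeeds: number = 100000
): MiningResult {
  const failingSeeds: number[] = [];
  let cleanStreak = 0;
  let seedsTried = 0;
  while (cleanStreak < targetStreak && seedsTried < maxSeeds) {
    const seed = startSeed + seedsTried;
    seedsTried++;
    try {
      check(seed);
      cleanStreak++;
    } catch (_err) {
      // Record only; the fix belongs in a separate, reviewed change.
      failingSeeds.push(seed);
      cleanStreak = 0;
    }
  }
  return { failingSeeds, seedsTried, cleanStreak };
}
```

The recorded seeds are the handoff: each one reproduces a failing hand exactly, so it can be minimized and turned into a regression test.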
Launch gates

MVP means users can play, learn, pay, and trust it.

Launch is not just feature presence. The product has to pass engine stability, archetype calibration, review accuracy, payment flows, design quality, legal review, and distribution readiness.

The page below is a map, not proof. Use tests, playtests, Stripe test mode, accessibility QA, and legal review to actually close these gates.

Resource dependencies

Phase 0: Neon, Resend, Google OAuth, Vercel, PostHog, NextAuth secret
Phase 1A: Python 3.11+, treys, curated NLHE test vector sources
Phase 3: Anthropic API key for async narration, with caching
Phase 5-6: Stripe, Discord, lawyer, creator outreach list

Agent execution prompts

Start: Read plan.md end-to-end. Execute Phase 0 only. Run acceptance checks. Stop.
Prep: Build Phase 1A prerequisite infrastructure. Do not implement engine logic yet.
Loop: Kick off the Phase 1A /loop prompt verbatim from plan.md. Run overnight.
Verify: Run full tests manually. Hand-trace 5 random fuzz hands. Mark Phase 1A done only if sound.

Explicit no-list

These are not accidental omissions. They protect scope and legal simplicity.

Real-money play; Sweepstakes mechanics; Chip-purchase economy; Human multiplayer; Tournament / ICM; Named pro replicas; Native mobile apps; Solver imports; HUD overlays; Video lessons; Hand imports from PokerStars/GG; i18n before launch
Risk map

The scariest failures are trust failures.

The plan's mitigations cluster around correctness, believable opponents, high-confidence review, distinctive design, and early distribution.

Engine bugs and wrong leak analysis are not just defects. They damage the core promise: practice that users can trust.
Engine bugs; Robotic AI; Wrong review; LLM cost; Generic UI; No distribution

5 Engine correctness

Mitigate with vectors, property tests, fuzzing, and proven evaluator concepts.

4 Review false positives

Mitigate with sample minimums, EV thresholds, and sanity checks.

4 Bot feel

Mitigate with frequency randomization, tilt models, and real playtesting.

3 Screenshot test

Mitigate by enforcing Casino Library from Phase 0 onward.