---
# nuzlocke-tracker-bs05
title: Build PokeDB.org data import tool
status: completed
type: feature
priority: normal
created_at: 2026-02-10T14:04:11Z
updated_at: 2026-02-11T10:54:04Z
parent: nuzlocke-tracker-rzu4
blocking:
    - nuzlocke-tracker-spx3
---

Build a standalone Python tool that converts PokeDB.org's JSON data export into our existing seed JSON format. This replaces PokeAPI as the single source of truth for ALL games (Gen 1-9).

Python was chosen over Go because:
- The backend is already Python, so the team is familiar with it
- We're processing local JSON files, so there is no need for Go's concurrency
- It remains a standalone tool in `tools/import-pokedb/`, not part of the backend

## Data source

PokeDB.org provides a full data export at https://pokedb.org/data-export with JSON downloads:
- `encounters.json` (69MB, 37,724 records): all encounter data across all games
- `locations.json`: 839 locations
- `location_areas.json`: 2,672 location areas
- `encounter_methods.json`: 73 encounter methods
- `versions.json`: 82 game versions
- `pokemon_forms.json`: Pokemon forms with identifiers

**No scraping required.** Just download the JSON files and process them locally.

**Terms of use:** "Data is provided for educational, research, and non-commercial purposes." Attribution to PokeDB requested.
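
A minimal sketch of loading the downloaded export and tallying encounters per version might look like the following. It assumes each export file is a top-level JSON array of records (the export's exact layout is not confirmed here), and `load_export` and `count_by_version` are hypothetical helper names, not part of any existing tool.

```python
import json
from collections import Counter
from pathlib import Path

# File names as listed in the PokeDB data export.
EXPORT_FILES = [
    "encounters.json",
    "locations.json",
    "location_areas.json",
    "encounter_methods.json",
    "versions.json",
    "pokemon_forms.json",
]


def load_export(data_dir):
    """Load every export file from data_dir, keyed by file stem.

    Assumption: each file holds a top-level JSON array of records.
    """
    data_dir = Path(data_dir)
    tables = {}
    for name in EXPORT_FILES:
        path = data_dir / name
        if not path.exists():
            raise FileNotFoundError(f"missing export file: {path}")
        with path.open(encoding="utf-8") as fh:
            tables[path.stem] = json.load(fh)
    return tables


def count_by_version(encounters):
    """Count encounter records per version identifier.

    A record that lists several versions is counted once for each of them.
    """
    counts = Counter()
    for record in encounters:
        for version in record.get("version_identifiers", []):
            counts[version] += 1
    return counts
```

With the files downloaded into a local directory (the path is up to the caller), `count_by_version(load_export(path)["encounters"])` would give per-version totals that can be compared against the coverage figures in this document.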

## Encounter data coverage

Encounter counts by version:
- Sword: 10,160 / Shield: 10,144
- Scarlet: 4,135 / Violet: 4,101
- SoulSilver: 2,492 / HeartGold: 2,475
- Shining Pearl: 2,021 / Brilliant Diamond: 2,013
- Legends Arceus: 1,756
- Black 2: 1,418 / White 2: 1,418
- Crystal: 1,375 / Alpha Sapphire: 1,338 / Platinum: 1,337
- Diamond: 1,292 / Pearl: 1,289 / Silver: 1,284 / Gold: 1,282
- LeafGreen: 987 / FireRed: 985 / White: 981 / Black: 947
- Ultra Moon: 886 / Ultra Sun: 885 / X: 880 / Y: 879
- Emerald: 763 / Let's Go Eevee: 710 / Sun: 709 / Moon: 707
- Sapphire: 707 / Ruby: 707 / Let's Go Pikachu: 690
- Blue: 528 / Red: 526 / Yellow: 496

## Data format details

Each encounter record has:
- `pokemon_form_identifier`: e.g. "pidgey-default", "mr-mime-default"
- `version_identifiers`: array of game version IDs (e.g. ["sword", "shield"])
- `location_area_identifier`: e.g. "route-01-kanto", "axews-eye"
- `encounter_method_identifier`: e.g. "walking-tall-grass", "surfing", "npc-trade"
- `levels`: string like "2 - 4" or "67"
- Rate fields vary by game generation:
  - Gen 1/3/6: `rate_overall` (single percentage)
  - Gen 2/4: `rate_morning`, `rate_day`, `rate_night` (time-of-day percentages)
  - Gen 5: `rate_spring`, `rate_summer`, `rate_autumn`, `rate_winter` (seasonal)
  - Gen 8 Sw/Sh: `weather_*_rate` fields (per-weather percentages, e.g. "40%")
  - Gen 8 Legends Arceus: `during_*` and `while_*` booleans (time+weather conditions)
  - Gen 9 Sc/Vi: `probability_*` fields (overworld probability weights)
- `trade_for`: Pokemon form identifier for NPC trades
- `alpha_levels`: for Legends Arceus alpha encounters
- `visible`: overworld vs hidden encounter
- Max Raid and Tera Raid fields for special encounters

## Subtasks

Work is broken into child task beans:

- [ ] **Set up Python tool scaffold**: project structure, CLI entry point, PokeDB JSON file loading
- [ ] **Build reference data mappings**: pokemon_form → pokeapi_id, location_area → name/region, encounter method mapping
- [ ] **Core encounter processing**: filter by game version, parse levels, handle rate variants, group by location area
- [ ] **Output seed JSON**: produce per-game JSON in existing format, integrate route ordering + special encounters
- [ ] **Validation & full generation**: compare against existing data, run for all games, fix discrepancies

## Encounter method mapping (draft)

PokeDB method → Our seed method:
- `walking-tall-grass`, `walking-*` → "walk"
- `surfing`, `surfing-*` → "surf"
- `fishing-old-rod` → "old-rod"
- `fishing-good-rod` → "good-rod"
- `fishing-super-rod` → "super-rod"
- `fishing` → "fishing"
- `rock-smash` → "rock-smash"
- `headbutt-*` → "headbutt"
- `npc-gift`, `egg`, `revive` → "gift"
- `npc-trade` → "trade"
- `symbol-encounter` → "walk" (overworld, Gen 8+)
- `wanderer` → "walk" (overworld visible)
- `fixed-encounter`, `static-encounter` → "static"
- `swarm` → "swarm"
- `poke-radar` → "pokeradar"
- `dual-slot-mode` → "dual-slot"
- Others: TBD based on relevance

## Notes

- This tool replaces `tools/fetch-pokeapi/` as the primary data source for all games
- Pokemon form identifiers need mapping to pokeapi IDs; this may need a fuzzy match since naming conventions differ
- The existing `pokemon.json` has names and pokeapi IDs we can use as a lookup
- S/V probability weights are not percentages; they represent relative spawn weights
- Legends Arceus uses boolean conditions (`during_night` + `while_clear`) rather than rates
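
The draft method mapping and the `levels` string format described above could be sketched roughly as follows. This is an illustration under stated assumptions rather than the tool's actual code: `METHOD_PATTERNS`, `map_method`, and `parse_levels` are hypothetical names, the wildcard rows are interpreted as shell-style globs, and unmapped methods simply return `None` since the draft leaves them TBD.

```python
from fnmatch import fnmatchcase

# Draft PokeDB method -> seed method table; the first matching pattern wins.
# Wildcard entries are treated as shell-style globs (an assumption).
METHOD_PATTERNS = [
    ("walking-*", "walk"),
    ("surfing", "surf"),
    ("surfing-*", "surf"),
    ("fishing-old-rod", "old-rod"),
    ("fishing-good-rod", "good-rod"),
    ("fishing-super-rod", "super-rod"),
    ("fishing", "fishing"),
    ("rock-smash", "rock-smash"),
    ("headbutt-*", "headbutt"),
    ("npc-gift", "gift"),
    ("egg", "gift"),
    ("revive", "gift"),
    ("npc-trade", "trade"),
    ("symbol-encounter", "walk"),
    ("wanderer", "walk"),
    ("fixed-encounter", "static"),
    ("static-encounter", "static"),
    ("swarm", "swarm"),
    ("poke-radar", "pokeradar"),
    ("dual-slot-mode", "dual-slot"),
]


def map_method(pokedb_method):
    """Return the seed method for a PokeDB method, or None if unmapped."""
    for pattern, seed_method in METHOD_PATTERNS:
        if fnmatchcase(pokedb_method, pattern):
            return seed_method
    return None


def parse_levels(levels):
    """Parse a levels string such as "2 - 4" or "67" into (min, max)."""
    parts = [int(part) for part in levels.split("-")]
    if len(parts) > 2:
        raise ValueError(f"unexpected levels format: {levels!r}")
    return parts[0], parts[-1]
```

Keeping the mapping as an ordered list of patterns makes it easy to add the remaining TBD methods later without touching the lookup logic.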