2026-02-10 15:16:26 +01:00
---
# nuzlocke-tracker-bs05
title: Build PokeDB.org data import tool
2026-02-11 11:56:25 +01:00
status: completed
2026-02-11 09:49:51 +01:00
type: feature
priority: normal
created_at: 2026-02-10T14:04:11Z
updated_at: 2026-02-11T10:54:04Z
parent: nuzlocke-tracker-rzu4
2026-02-10 15:31:36 +01:00
blocking:
- nuzlocke-tracker-spx3
---
Build a standalone Python tool that converts PokeDB.org's JSON data export into our existing seed JSON format. This replaces PokeAPI as the single source of truth for ALL games (Gen 1-9).
Python was chosen over Go because:
- The backend is already Python, so the team is familiar with it
- We're processing local JSON files — no need for Go's concurrency
- Remains a standalone tool in `tools/import-pokedb/`, not part of the backend
## Data source
PokeDB.org provides a full data export at https://pokedb.org/data-export with JSON downloads:
- `encounters.json` (69MB, 37,724 records) — all encounter data across all games
- `locations.json` — 839 locations
- `location_areas.json` — 2,672 location areas
- `encounter_methods.json` — 73 encounter methods
- `versions.json` — 82 game versions
- `pokemon_forms.json` — Pokemon forms with identifiers
**No scraping required.** Just download the JSON files and process them locally.
**Terms of use:** "Data is provided for educational, research, and non-commercial purposes." Attribution to PokeDB requested.
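
Since the export is just a set of local JSON files, loading can stay very simple. The following is a minimal sketch, assuming the six files listed above have been downloaded into a single directory; the function name `load_export` and the `EXPORT_FILES` constant are hypothetical, and nothing is assumed about the top-level shape of each file.

```python
import json
from pathlib import Path

# File names as listed in the data export section above.
EXPORT_FILES = (
    "encounters.json",
    "locations.json",
    "location_areas.json",
    "encounter_methods.json",
    "versions.json",
    "pokemon_forms.json",
)


def load_export(data_dir):
    """Load every export file from data_dir, keyed by file stem.

    Hypothetical helper: returns whatever JSON value each file contains,
    without assuming whether the top level is a list or an object.
    """
    base = Path(data_dir)
    data = {}
    for file_name in EXPORT_FILES:
        path = base / file_name
        if not path.is_file():
            raise FileNotFoundError(f"missing PokeDB export file: {path}")
        with path.open(encoding="utf-8") as handle:
            data[path.stem] = json.load(handle)
    return data
```

Reading `encounters.json` (69MB) in one `json.load` call should be acceptable for a one-off import tool; a streaming parser would only be worth considering if memory turns out to be a problem.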
## Encounter data coverage
Encounter counts by version:
- Sword: 10,160 / Shield: 10,144
- Scarlet: 4,135 / Violet: 4,101
- SoulSilver: 2,492 / HeartGold: 2,475
- Shining Pearl: 2,021 / Brilliant Diamond: 2,013
- Legends Arceus: 1,756
- Black 2: 1,418 / White 2: 1,418
- Crystal: 1,375 / Alpha Sapphire: 1,338 / Platinum: 1,337
- Diamond: 1,292 / Pearl: 1,289 / Silver: 1,284 / Gold: 1,282
- LeafGreen: 987 / FireRed: 985 / White: 981 / Black: 947
- Ultra Moon: 886 / Ultra Sun: 885 / X: 880 / Y: 879
- Emerald: 763 / Let's Go Eevee: 710 / Sun: 709 / Moon: 707
- Sapphire: 707 / Ruby: 707 / Let's Go Pikachu: 690
- Blue: 528 / Red: 526 / Yellow: 496
## Data format details
Each encounter record has:
- `pokemon_form_identifier` — e.g. "pidgey-default", "mr-mime-default"
- `version_identifiers` — array of game version IDs (e.g. ["sword", "shield"])
- `location_area_identifier` — e.g. "route-01-kanto", "axews-eye"
- `encounter_method_identifier` — e.g. "walking-tall-grass", "surfing", "npc-trade"
- `levels` — string like "2 - 4" or "67"
- Rate fields vary by game generation:
  - Gen 1/3/6: `rate_overall` (single percentage)
  - Gen 2/4: `rate_morning`, `rate_day`, `rate_night` (time-of-day percentages)
  - Gen 5: `rate_spring`, `rate_summer`, `rate_autumn`, `rate_winter` (seasonal)
  - Gen 8 Sw/Sh: `weather_*_rate` fields (per-weather percentages, e.g. "40%")
  - Gen 8 Legends Arceus: `during_*` and `while_*` booleans (time+weather conditions)
  - Gen 9 Sc/Vi: `probability_*` fields (overworld probability weights)
- `trade_for` — Pokemon form identifier for NPC trades
- `alpha_levels` — for Legends Arceus alpha encounters
- `visible` — overworld vs hidden encounter
- Max Raid and Tera Raid fields for special encounters
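
The `levels` string and the per-generation rate fields described above could be normalised along these lines. This is a sketch under assumptions: the helper names are hypothetical, the only raw rate format confirmed here is the "40%" style of the Sw/Sh weather fields, and unused rate fields are assumed to be absent, `null`, or empty strings.

```python
import re

# Matches "67" or "2 - 4" (optional whitespace around the dash).
LEVEL_RE = re.compile(r"^\s*(\d+)\s*(?:-\s*(\d+))?\s*$")

# Prefixes of the rate-like fields listed above. `probability_*` values are
# spawn weights rather than percentages, so callers must treat them separately.
RATE_PREFIXES = ("rate_", "weather_", "probability_")


def parse_levels(levels):
    """Parse a levels string such as "2 - 4" or "67" into (min, max)."""
    match = LEVEL_RE.match(levels)
    if match is None:
        raise ValueError(f"unrecognised levels string: {levels!r}")
    low = int(match.group(1))
    high = int(match.group(2)) if match.group(2) else low
    return low, high


def collect_rates(record):
    """Return the rate-like fields of one encounter record that carry a value."""
    return {
        key: value
        for key, value in record.items()
        if key.startswith(RATE_PREFIXES) and value not in (None, "")
    }


def parse_percent(value):
    """Convert a value such as "40%" (or a bare number) to a float."""
    return float(str(value).strip().rstrip("%"))
```

Raising on an unrecognised `levels` string, rather than guessing, makes unexpected formats visible during the validation subtask.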
## Subtasks
Work is broken into child task beans:
- [ ] **Set up Python tool scaffold**: project structure, CLI entry point, PokeDB JSON file loading
- [ ] **Build reference data mappings**: pokemon_form → pokeapi_id, location_area → name/region, encounter method mapping
- [ ] **Core encounter processing**: filter by game version, parse levels, handle rate variants, group by location area
- [ ] **Output seed JSON**: produce per-game JSON in existing format, integrate route ordering + special encounters
- [ ] **Validation & full generation**: compare against existing data, run for all games, fix discrepancies
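
As a rough illustration of the core encounter processing subtask, filtering by game version and grouping by location area might look like the sketch below. The function names are hypothetical; the field names `version_identifiers` and `location_area_identifier` are the ones described under "Data format details".

```python
from collections import defaultdict


def encounters_for_version(encounters, version):
    """Keep only the records whose version_identifiers include `version`."""
    return [
        record
        for record in encounters
        if version in record.get("version_identifiers", [])
    ]


def group_by_location_area(encounters):
    """Group encounter records by their location_area_identifier."""
    grouped = defaultdict(list)
    for record in encounters:
        grouped[record["location_area_identifier"]].append(record)
    return dict(grouped)
```

Because a single record can list several versions (e.g. both "sword" and "shield"), filtering has to test membership rather than equality.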
## Encounter method mapping (draft)
PokeDB method → Our seed method:
- `walking-tall-grass`, `walking-*` → "walk"
- `surfing`, `surfing-*` → "surf"
- `fishing-old-rod` → "old-rod"
- `fishing-good-rod` → "good-rod"
- `fishing-super-rod` → "super-rod"
- `fishing` → "fishing"
- `rock-smash` → "rock-smash"
- `headbutt-*` → "headbutt"
- `npc-gift`, `egg`, `revive` → "gift"
- `npc-trade` → "trade"
- `symbol-encounter` → "walk" (overworld, Gen 8+)
- `wanderer` → "walk" (overworld visible)
- `fixed-encounter`, `static-encounter` → "static"
- `swarm` → "swarm"
- `poke-radar` → "pokeradar"
- `dual-slot-mode` → "dual-slot"
- Others: TBD based on relevance
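
The draft mapping above translates fairly directly into an ordered rule table with shell-style wildcards. The sketch below uses `fnmatch.fnmatchcase` from the standard library; `METHOD_RULES` and `map_method` are hypothetical names, and unmapped methods return `None` so that the "TBD" cases can be collected and reviewed.

```python
from fnmatch import fnmatchcase

# (PokeDB pattern, seed method), taken from the draft mapping above.
METHOD_RULES = [
    ("walking-*", "walk"),  # also covers walking-tall-grass
    ("surfing", "surf"),
    ("surfing-*", "surf"),
    ("fishing-old-rod", "old-rod"),
    ("fishing-good-rod", "good-rod"),
    ("fishing-super-rod", "super-rod"),
    ("fishing", "fishing"),
    ("rock-smash", "rock-smash"),
    ("headbutt-*", "headbutt"),
    ("npc-gift", "gift"),
    ("egg", "gift"),
    ("revive", "gift"),
    ("npc-trade", "trade"),
    ("symbol-encounter", "walk"),
    ("wanderer", "walk"),
    ("fixed-encounter", "static"),
    ("static-encounter", "static"),
    ("swarm", "swarm"),
    ("poke-radar", "pokeradar"),
    ("dual-slot-mode", "dual-slot"),
]


def map_method(pokedb_method):
    """Return the seed method for a PokeDB method, or None if unmapped."""
    for pattern, seed_method in METHOD_RULES:
        if fnmatchcase(pokedb_method, pattern):
            return seed_method
    return None
```

Keeping the rules as data rather than an `if`/`elif` chain means the remaining methods can be added without touching the lookup logic.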
## Notes
- This tool replaces `tools/fetch-pokeapi/` as the primary data source for all games
- Pokemon form identifiers need mapping to pokeapi IDs — may need a fuzzy match since naming conventions differ
- The existing `pokemon.json` has names and pokeapi IDs we can use as a lookup
- Scarlet/Violet `probability_*` values are not percentages; they represent relative spawn weights
- Legends Arceus uses boolean conditions (e.g. `during_night` combined with `while_clear`) rather than rates
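
Two of the notes above lend themselves to a small sketch: resolving a form identifier to a pokeapi ID with a fuzzy fallback, and turning relative spawn weights into percentages. Everything here is an assumption made for illustration: the function names, the `name_to_id` dict (standing in for a lookup built from `pokemon.json`, whose exact structure is not described here), the idea of stripping a `-default` suffix, and the 0.8 similarity cutoff. Whether the seed format should store normalised percentages at all is also an open question.

```python
import difflib
from typing import Optional


def resolve_pokeapi_id(form_identifier, name_to_id, cutoff=0.8) -> Optional[int]:
    """Map a PokeDB form identifier to a pokeapi ID, or None if no match.

    Tries an exact name match first, then falls back to the closest name
    found by difflib. Non-default forms likely need explicit handling.
    """
    name = form_identifier
    if name.endswith("-default"):
        name = name[: -len("-default")]
    if name in name_to_id:
        return name_to_id[name]
    close = difflib.get_close_matches(name, list(name_to_id), n=1, cutoff=cutoff)
    return name_to_id[close[0]] if close else None


def weights_to_percentages(weights):
    """Convert relative spawn weights into percentages summing to 100."""
    total = sum(weights.values())
    if total <= 0:
        return {key: 0.0 for key in weights}
    return {key: 100.0 * weight / total for key, weight in weights.items()}
```

Fuzzy matches are worth logging during validation, since a near miss on a name can silently map an encounter to the wrong species.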