From 00dead68f7633bdbce2e3c9850e7ad5be0689c6b Mon Sep 17 00:00:00 2001
From: Julian Tabel
Date: Tue, 10 Feb 2026 15:16:26 +0100
Subject: [PATCH] Add PokeDB.org data import bean, encounter display bean, complete data source research

- Complete exploration of automated data sources (q5vd): PokeDB.org identified as ideal single source of truth with JSON data export
- Add bean for PokeDB.org data import tool (bs05)
- Add bean for improving encounter rate display with time/weather variants (oqfo)
- Mark branding cleanup bean (xvaw) as completed

Co-Authored-By: Claude Opus 4.6
---
 ...-build-pokedborg-encounter-data-scraper.md | 112 ++++++++++++++++++
 ...ounter-rate-display-for-timeweather-var.md |  32 +++++
 ...tomated-data-sources-for-encounter-data.md |  89 +++++++++++---
 ...clean-up-frontend-branding-and-metadata.md |   2 +-
 4 files changed, 219 insertions(+), 16 deletions(-)
 create mode 100644 .beans/nuzlocke-tracker-bs05--build-pokedborg-encounter-data-scraper.md
 create mode 100644 .beans/nuzlocke-tracker-oqfo--improve-encounter-rate-display-for-timeweather-var.md

diff --git a/.beans/nuzlocke-tracker-bs05--build-pokedborg-encounter-data-scraper.md b/.beans/nuzlocke-tracker-bs05--build-pokedborg-encounter-data-scraper.md
new file mode 100644
index 0000000..f4bc959
--- /dev/null
+++ b/.beans/nuzlocke-tracker-bs05--build-pokedborg-encounter-data-scraper.md
@@ -0,0 +1,112 @@
+---
+# nuzlocke-tracker-bs05
+title: Build PokeDB.org data import tool
+status: draft
+type: task
+priority: normal
+created_at: 2026-02-10T14:04:11Z
+updated_at: 2026-02-10T14:11:06Z
+parent: nuzlocke-tracker-rzu4
+---
+
+Build a Go tool that converts PokeDB.org's JSON data export into our existing seed JSON format. This replaces PokeAPI as the single source of truth for ALL games (Gen 1-9).
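Reviewer sketch (not part of the patch): two of the conversions this bean lists, level-string parsing and method mapping, might look roughly like the following. The function names `parseLevels` and `mapMethod` and the variable `exactMethods` are hypothetical, and only a subset of the draft mapping is shown; the level formats (`"2 - 4"`, `"67"`) and method identifiers are taken from the bean text.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseLevels converts a level string such as "2 - 4" or "67" into a
// (lo, hi) pair. A single value yields lo == hi.
func parseLevels(s string) (lo, hi int, err error) {
	parts := strings.Split(s, "-")
	if len(parts) > 2 {
		return 0, 0, fmt.Errorf("unexpected level string %q", s)
	}
	lo, err = strconv.Atoi(strings.TrimSpace(parts[0]))
	if err != nil {
		return 0, 0, fmt.Errorf("bad minimum level in %q: %w", s, err)
	}
	hi = lo
	if len(parts) == 2 {
		hi, err = strconv.Atoi(strings.TrimSpace(parts[1]))
		if err != nil {
			return 0, 0, fmt.Errorf("bad maximum level in %q: %w", s, err)
		}
	}
	return lo, hi, nil
}

// exactMethods holds a subset of the draft one-to-one mappings
// (hypothetical variable name; entries copied from the draft table).
var exactMethods = map[string]string{
	"surfing":          "surf",
	"fishing-old-rod":  "old-rod",
	"fishing-good-rod": "good-rod",
	"npc-gift":         "gift",
	"npc-trade":        "trade",
	"poke-radar":       "pokeradar",
}

// prefixMethods covers the wildcard rows of the draft table.
var prefixMethods = map[string]string{
	"walking-":  "walk",
	"surfing-":  "surf",
	"headbutt-": "headbutt",
}

// mapMethod returns the seed method for a PokeDB method identifier.
// ok is false for identifiers the draft mapping leaves as TBD.
func mapMethod(id string) (method string, ok bool) {
	if m, found := exactMethods[id]; found {
		return m, true
	}
	for prefix, m := range prefixMethods {
		if strings.HasPrefix(id, prefix) {
			return m, true
		}
	}
	return "", false
}
```

Returning `ok == false` for unmapped identifiers keeps the "Others: TBD" decision explicit instead of silently defaulting to a method.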
+
+## Data source
+
+PokeDB.org provides a full data export at https://pokedb.org/data-export with JSON downloads:
+- `encounters.json` (69MB, 37,724 records) — all encounter data across all games
+- `locations.json` — 839 locations
+- `location_areas.json` — 2,672 location areas
+- `encounter_methods.json` — 73 encounter methods
+- `versions.json` — 82 game versions
+- `pokemon_forms.json` — Pokemon forms with identifiers
+
+**No scraping required.** Just download the JSON files and process them locally.
+
+**Terms of use:** "Data is provided for educational, research, and non-commercial purposes." Attribution to PokeDB requested.
+
+## Encounter data coverage
+
+Encounter counts by version:
+- Sword: 10,160 / Shield: 10,144
+- Scarlet: 4,135 / Violet: 4,101
+- SoulSilver: 2,492 / HeartGold: 2,475
+- Shining Pearl: 2,021 / Brilliant Diamond: 2,013
+- Legends Arceus: 1,756
+- Black 2: 1,418 / White 2: 1,418
+- Crystal: 1,375 / Alpha Sapphire: 1,338 / Platinum: 1,337
+- Diamond: 1,292 / Pearl: 1,289 / Silver: 1,284 / Gold: 1,282
+- LeafGreen: 987 / FireRed: 985 / White: 981 / Black: 947
+- Ultra Moon: 886 / Ultra Sun: 885 / X: 880 / Y: 879
+- Emerald: 763 / Let's Go Eevee: 710 / Sun: 709 / Moon: 707
+- Sapphire: 707 / Ruby: 707 / Let's Go Pikachu: 690
+- Blue: 528 / Red: 526 / Yellow: 496
+
+## Data format details
+
+Each encounter record has:
+- `pokemon_form_identifier` — e.g. "pidgey-default", "mr-mime-default"
+- `version_identifiers` — array of game version IDs (e.g. ["sword", "shield"])
+- `location_area_identifier` — e.g. "route-01-kanto", "axews-eye"
+- `encounter_method_identifier` — e.g. "walking-tall-grass", "surfing", "npc-trade"
+- `levels` — string like "2 - 4" or "67"
+- Rate fields vary by game generation:
+  - Gen 1/3/6: `rate_overall` (single percentage)
+  - Gen 2/4: `rate_morning`, `rate_day`, `rate_night` (time-of-day percentages)
+  - Gen 5: `rate_spring`, `rate_summer`, `rate_autumn`, `rate_winter` (seasonal)
+  - Gen 8 Sw/Sh: `weather_*_rate` fields (per-weather percentages, e.g. "40%")
+  - Gen 8 Legends Arceus: `during_*` and `while_*` booleans (time+weather conditions)
+  - Gen 9 Sc/Vi: `probability_*` fields (overworld probability weights)
+- `trade_for` — Pokemon form identifier for NPC trades
+- `alpha_levels` — for Legends Arceus alpha encounters
+- `visible` — overworld vs hidden encounter
+- Max Raid and Tera Raid fields for special encounters
+
+## Implementation approach
+
+### Checklist
+- [ ] Set up project structure in `tools/import-pokedb/`
+- [ ] Download and cache PokeDB JSON export files
+- [ ] Parse PokeDB encounters, locations, location_areas, versions, pokemon_forms
+- [ ] Build lookup maps: pokemon_form_identifier → pokeapi_id (using existing `pokemon.json`)
+- [ ] Build lookup maps: location_area_identifier → location name + region
+- [ ] Filter encounters by target game version
+- [ ] Map PokeDB encounter methods to our seed format methods (73 → simplified set)
+- [ ] Parse level strings ("2 - 4" → min_level: 2, max_level: 4)
+- [ ] Handle rate variants per game generation:
+  - For now, flatten time/weather/season rates into `encounter_rate` (use the max or average)
+  - Preserve raw variant data for future use (see nuzlocke-tracker-oqfo)
+- [ ] Group encounters by location area → route output
+- [ ] Apply route ordering (use existing route_order.json or generate from location data)
+- [ ] Output in existing `{game}.json` seed format
+- [ ] Generate seed data for ALL games, replacing PokeAPI as the single source of truth
+- [ ] Compare output against existing PokeAPI-sourced data to validate accuracy
+- [ ] Run for all games and verify output
+
+## Encounter method mapping (draft)
+
+PokeDB method → Our seed method:
+- `walking-tall-grass`, `walking-*` → "walk"
+- `surfing`, `surfing-*` → "surf"
+- `fishing-old-rod` → "old-rod"
+- `fishing-good-rod` → "good-rod"
+- `fishing-super-rod` → "super-rod"
+- `fishing` → "fishing"
+- `rock-smash` → "rock-smash"
+- `headbutt-*` → "headbutt"
+- `npc-gift`, `egg`, `revive` → "gift"
+- `npc-trade` → "trade"
+- `symbol-encounter` → "walk" (overworld, Gen 8+)
+- `wanderer` → "walk" (overworld visible)
+- `fixed-encounter`, `static-encounter` → "static"
+- `swarm` → "swarm"
+- `poke-radar` → "pokeradar"
+- `dual-slot-mode` → "dual-slot"
+- Others: TBD based on relevance
+
+## Notes
+- This tool replaces `tools/fetch-pokeapi/` as the primary data source for all games
+- Pokemon form identifiers need mapping to pokeapi IDs — may need a fuzzy match since naming conventions differ
+- The existing `pokemon.json` has names and pokeapi IDs we can use as a lookup
+- S/V probability weights are not percentages — they represent relative spawn weights
+- Legends Arceus uses boolean conditions (during_night + while_clear) rather than rates
\ No newline at end of file
diff --git a/.beans/nuzlocke-tracker-oqfo--improve-encounter-rate-display-for-timeweather-var.md b/.beans/nuzlocke-tracker-oqfo--improve-encounter-rate-display-for-timeweather-var.md
new file mode 100644
index 0000000..45bea00
--- /dev/null
+++ b/.beans/nuzlocke-tracker-oqfo--improve-encounter-rate-display-for-timeweather-var.md
@@ -0,0 +1,32 @@
+---
+# nuzlocke-tracker-oqfo
+title: Improve encounter rate display for time/weather variants
+status: draft
+type: feature
+created_at: 2026-02-10T14:04:27Z
+updated_at: 2026-02-10T14:04:27Z
+---
+
+Improve how encounter rates are displayed in the tracker to support time-of-day, weather, and seasonal variants that exist in many Pokemon games.
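Reviewer sketch (not part of the patch): until the seed format stores variants, the bs05 bean proposes flattening time, weather, and season rates into a single `encounter_rate`. A minimal version of the "use the max" option could look like this. It assumes variant rates arrive as strings with an optional `%` suffix (the bean only shows `"40%"` for Sw/Sh weather fields; other generations may use numbers), and `parsePercent` and `flattenRates` are hypothetical names.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePercent parses values like "40%" or "12.5" into a float64.
// Assumption: rates are strings with an optional trailing percent sign.
func parsePercent(s string) (float64, error) {
	v := strings.TrimSuffix(strings.TrimSpace(s), "%")
	return strconv.ParseFloat(strings.TrimSpace(v), 64)
}

// flattenRates collapses per-variant rates (keyed by field name, e.g.
// "rate_morning") into one value by taking the maximum.
// Empty values are treated as "not available" and skipped.
func flattenRates(variants map[string]string) (float64, error) {
	best := 0.0
	for name, raw := range variants {
		if strings.TrimSpace(raw) == "" {
			continue
		}
		rate, err := parsePercent(raw)
		if err != nil {
			return 0, fmt.Errorf("variant %s: %w", name, err)
		}
		if rate > best {
			best = rate
		}
	}
	return best, nil
}
```

Taking the maximum answers "can this be encountered here at all, and how common is it at best", which matches the Nuzlocke use case; an average would understate Pokemon that only appear under one condition.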
+
+## Context
+
+PokeDB.org data reveals that encounter rates vary significantly by context across different games:
+- **Gen 2 / Gen 4 (G/S/C, HG/SS, D/P/Pt, BDSP):** rates vary by morning/day/night
+- **Gen 5 (B/W, B2/W2):** rates vary by season (spring/summer/autumn/winter)
+- **Gen 8 (Sw/Sh):** rates vary by weather (clear, cloudy, rain, thunderstorm, snow, etc.)
+- **Gen 8 (Legends Arceus):** time + weather boolean conditions
+- **Gen 9 (Sc/Vi):** overworld probability weights (not traditional encounter rates)
+
+Currently the seed format has a single `encounter_rate` field per encounter, which doesn't capture these variants.
+
+## Goals
+- Design a display format that lets users see encounter rates for different conditions (e.g., tabs or tables for morning/day/night)
+- Determine how to extend the seed data format to store variant rates
+- Decide which level of detail is useful for Nuzlocke tracking (do players care about exact weather rates, or is "available during rain" sufficient?)
+
+## Considerations
+- Keep it simple for games with single rates (Gen 1, Gen 3, Gen 6)
+- For Nuzlockes, the key question is usually "what can I encounter here?" — exact rates are secondary but useful for planning
+- The UI should not become cluttered for simple cases
+- This may affect the backend encounter model, seed format, and frontend display
\ No newline at end of file
diff --git a/.beans/nuzlocke-tracker-q5vd--explore-automated-data-sources-for-encounter-data.md b/.beans/nuzlocke-tracker-q5vd--explore-automated-data-sources-for-encounter-data.md
index 6c59407..9ce8229 100644
--- a/.beans/nuzlocke-tracker-q5vd--explore-automated-data-sources-for-encounter-data.md
+++ b/.beans/nuzlocke-tracker-q5vd--explore-automated-data-sources-for-encounter-data.md
@@ -1,27 +1,86 @@
 ---
 # nuzlocke-tracker-q5vd
 title: Explore automated data sources for encounter data
-status: todo
+status: completed
 type: task
+priority: normal
 created_at: 2026-02-10T08:58:47Z
-updated_at: 2026-02-10T08:58:47Z
+updated_at: 2026-02-10T14:10:50Z
 parent: nuzlocke-tracker-rzu4
 ---
 
 Research and evaluate automated or semi-automated options for populating encounter data, especially for games where PokeAPI has no data (Gen 8+).
 
-## Potential sources to investigate:
-- **PokeAPI CSV/database dumps** — the raw data behind PokeAPI may have more than the REST API exposes
-- **veekun/pokedex** — community-maintained Pokémon database with encounter data
-- **Bulbapedia / Serebii** — structured wiki data that could be scraped (check terms of use)
-- **pkNX / game data extraction** — tools that extract data directly from game files
-- **Community GitHub repos** — search for curated encounter datasets (e.g. for romhack tools, fan wikis)
+## Games needing data
 
-## Goals:
-- Determine which games can realistically be populated via automation vs. manual entry
-- If a viable source is found, prototype a script/tool to import data into the existing seed JSON format
-- Document findings even if no automated approach is viable, so we know what's available
+These games currently have empty/placeholder seed data:
+- Let's Go Pikachu / Eevee
+- Sword / Shield
+- Brilliant Diamond / Shining Pearl
+- Legends: Arceus
+- Scarlet / Violet
+- Legends: Z-A (upcoming, Oct 2025)
 
-## Notes:
-- The existing Go tool (`tools/fetch-pokeapi/`) could serve as a template for new data fetchers
-- Output format must match the existing `{game}.json` structure (routes with encounters, children for sub-areas)
\ No newline at end of file
+## Research findings
+
+### Ruled out
+- **PokeAPI** — encounter data last updated ~9 years ago (Sun/Moon era). No Gen 8+ encounter data.
+- **veekun/pokedex** — last commit 5 years ago, similar dataset to PokeAPI. No advantage.
+- **PokemonDB (pokemondb.net)** — covers all gens but encounter rates are NOT percentile (just rarity labels like "common", "uncommon"). Not suitable for our seed format which uses exact percentages.
+
+### Viable sources
+
+#### PokeDB (pokedb.org) — RECOMMENDED
+- Covers **all generations** including Galar (Sw/Sh), Hisui (Legends Arceus), Paldea (Sc/Vi), and Let's Go
+- **Percentile encounter rates** that sum to 100% per method — exactly what our seed format needs
+- Rich data model with 60+ fields per encounter (documented at /editors/docs/data-model/tables/encounters/)
+- Supports 6 rate variants: overall-only, weather-percentages, time-and-weather-checks, seasons, time-of-day, probability-weights
+- Sub-area support for complex locations (e.g., Mount Coronet has floor-by-floor breakdown)
+- Version-specific rates (e.g., different rates for Sword vs Shield, HeartGold vs SoulSilver)
+- Includes trades, swarm encounters, special methods (headbutt, honey trees, pokeradar, etc.)
+- `robots.txt`: very permissive — only disallows `/private/`, allows everything else
+- URL pattern: `/locations/{region}/{location-name}/`
+- Region index pages list all locations: `/locations/{region}/`
+- Game version abbreviations: SW/SH, BD/SP, D/P/PL, S/V, LGP/LGE, etc.
+
+#### Bulbapedia (backup)
+- MediaWiki-based, covers all generations
+- 5-second crawl delay in robots.txt
+- Inconsistent table format across generations
+
+#### Serebii (backup)
+- Very permissive robots.txt
+- Mixes version data on same page, harder to parse
+
+#### pkNX (alternative approach)
+- Most accurate data (from game files), but requires ROM dumps
+- Legal gray area, FlatBuffer conversion needed
+
+## Recommendation
+
+**PokeDB.org** is the ideal scraping target:
+1. Percentile encounter rates matching our seed format
+2. Covers all games we need data for
+3. Very permissive robots.txt (only `/private/` disallowed)
+4. Consistent, well-documented data model
+5. Location index pages make discovery easy
+6. Sub-areas and version-specific data handled cleanly
+7. Rate-limited scraping acceptable (user confirmed)
+
+## Implementation approach
+
+Build a scraper (Go, to match existing tooling) that:
+1. Fetches the region index page for each game/region needing data
+2. Discovers all location URLs from the index
+3. Scrapes each location page for encounter tables
+4. Parses encounter method, pokemon name, game version, rate, level range
+5. Maps Pokemon names to pokeapi IDs (from our existing `pokemon.json`)
+6. Handles sub-areas (either flatten or use children in seed format)
+7. Outputs data in the existing `{game}.json` seed format
+8. Respects rate limiting (2+ second delay between requests, disk-cache responses)
+
+## Notes
+- The existing Go tool (`tools/fetch-pokeapi/`) has a good HTTP client with caching and rate limiting that can be reused
+- Output format must match the existing `{game}.json` structure (routes with encounters)
+- Pokemon name → pokeapi_id mapping can use the existing `pokemon.json` as a lookup table
+- PokeDB uses probability weights for Sc/Vi instead of percentages — will need conversion
\ No newline at end of file
diff --git a/.beans/nuzlocke-tracker-xvaw--clean-up-frontend-branding-and-metadata.md b/.beans/nuzlocke-tracker-xvaw--clean-up-frontend-branding-and-metadata.md
index d24e1fa..3443124 100644
--- a/.beans/nuzlocke-tracker-xvaw--clean-up-frontend-branding-and-metadata.md
+++ b/.beans/nuzlocke-tracker-xvaw--clean-up-frontend-branding-and-metadata.md
@@ -5,7 +5,7 @@ status: completed
 type: task
 priority: normal
 created_at: 2026-02-10T09:36:24Z
-updated_at: 2026-02-10T12:02:15Z
+updated_at: 2026-02-10T13:58:27Z
 ---
 
 The frontend currently uses all Vite defaults — generic title, Vite favicon, no manifest, no meta tags. Clean it up so it looks polished and professional as "Nuzlocke Tracker".
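Reviewer sketch (not part of the patch): the q5vd notes say the Sc/Vi probability weights "will need conversion". One plausible reading is to normalize the weights within a group so they sum to 100. The grouping (for example per location area and encounter method) is an assumption, as is the hypothetical function name `weightsToPercent`; the beans do not specify either.

```go
package main

import "math"

// weightsToPercent converts relative spawn weights into percentages,
// rounded to one decimal place. Assumption: all weights passed in belong
// to the same group (e.g. one location area and method), so each
// percentage is the weight's share of the group's total.
func weightsToPercent(weights []float64) []float64 {
	total := 0.0
	for _, w := range weights {
		total += w
	}
	out := make([]float64, len(weights))
	if total == 0 {
		return out // no weights to normalize; all shares stay 0
	}
	for i, w := range weights {
		out[i] = math.Round(w/total*1000) / 10
	}
	return out
}
```

Because each share is rounded independently, the results may not sum to exactly 100; if the seed format needs an exact total, the remainder would have to be assigned to one entry.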