Recipe ETL Scripts

This directory contains helper scripts for extracting Woodworking recipe data from the raw datasets/Woodworking.txt file and loading it into the project PostgreSQL database.

File overview

File	Purpose
woodworking_to_csv.py	Legacy first-pass parser → `datasets/Woodworking.csv`.
woodworking_to_csv_v2.py	Improved parser that matches the spec (category, level, sub-crafts, ingredients, HQ yields, etc.) → `datasets/Woodworking_v2.csv`.
recipes_to_csv_v2.py	Generic parser. `python recipes_to_csv_v2.py <Craft>` processes one craft; use `python recipes_to_csv_v2.py --all` or simply omit the argument to parse every `.txt` file under `datasets/`, producing `datasets/<Craft>_v2.csv` for each.
load_woodworking_to_db.py	Loader for the legacy CSV (kept for reference).
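load_woodworking_v2_to_db.py	Drops and recreates the `recipes_woodworking` table, then bulk-loads `datasets/Woodworking_v2.csv`.
load_recipes_v2_to_db.py	Generic loader. `python load_recipes_v2_to_db.py <Craft>` loads a single craft; with no argument it loads every craft, dropping and recreating each table.
load_inventory_to_db.py	Truncates the `inventory` table and reloads it from `datasets/inventory.csv`.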
requirements.txt	Minimal Python dependencies for the scripts.
venv/	Local virtual environment created by the setup steps below.
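
The input/output naming convention described for `recipes_to_csv_v2.py` can be sketched as follows. This is only an illustration of the convention, not the script's actual code; `select_inputs` and `output_path` are hypothetical helper names.

```python
from pathlib import Path


def select_inputs(datasets_dir: Path, craft=None):
    """Return the raw .txt files to parse.

    With no craft (or "--all"), every .txt file under datasets_dir is
    selected; otherwise only <Craft>.txt is selected.
    """
    if craft is None or craft == "--all":
        return sorted(datasets_dir.glob("*.txt"))
    return [datasets_dir / f"{craft}.txt"]


def output_path(txt_file: Path) -> Path:
    """Map datasets/<Craft>.txt to datasets/<Craft>_v2.csv."""
    return txt_file.with_name(f"{txt_file.stem}_v2.csv")


single = select_inputs(Path("datasets"), "Smithing")
print(single[0], "->", output_path(single[0]))
```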

Prerequisites

Python ≥ 3.9

A PostgreSQL instance reachable with the credentials stored in `db.conf` at the project root:

PSQL_HOST=…
PSQL_PORT=…
PSQL_USER=…
PSQL_PASSWORD=…
PSQL_DBNAME=…
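
For reference, a file in this shape can be read with a few lines of Python. The sketch below assumes plain `KEY=VALUE` lines and treats blank lines and `#` comments as ignorable; the helper names and the sample values are hypothetical, and the loaders in this directory may read the file differently. The returned keyword names (`host`, `port`, `user`, `password`, `dbname`) are the ones accepted by `psycopg2.connect`; whether the scripts use that driver is an assumption.

```python
def parse_db_conf(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blanks and # comments (assumed format)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def to_connect_kwargs(settings: dict) -> dict:
    """Map the PSQL_* keys to common PostgreSQL driver keyword names."""
    return {
        "host": settings["PSQL_HOST"],
        "port": int(settings["PSQL_PORT"]),
        "user": settings["PSQL_USER"],
        "password": settings["PSQL_PASSWORD"],
        "dbname": settings["PSQL_DBNAME"],
    }


# Placeholder values for illustration only; they are not project defaults.
sample = """\
PSQL_HOST=localhost
PSQL_PORT=5432
PSQL_USER=recipes
PSQL_PASSWORD=secret
PSQL_DBNAME=recipes
"""
kwargs = to_connect_kwargs(parse_db_conf(sample))
print(kwargs["host"], kwargs["port"])
```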

Quick start (Woodworking example)

# 1. From project root
cd scripts

# 2. Create & activate virtualenv (only once)
python3 -m venv venv
source venv/bin/activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Generate CSVs for **all** crafts
python recipes_to_csv_v2.py --all  # or simply `python recipes_to_csv_v2.py`

# 5. Load all crafts into the DB (drops/recreates each table)
python load_recipes_v2_to_db.py

To work with a single craft, specify its name instead:

python recipes_to_csv_v2.py Smithing       # generate Smithing_v2.csv
python load_recipes_v2_to_db.py Smithing   # load only Smithing recipes

The parser and the loader print output such as (Woodworking shown):

Wrote 480 recipes -> datasets/Woodworking_v2.csv
Loaded recipes into new recipes_woodworking table.

CSV schema (v2)

Column	Notes
`category`	Craft rank without level range (e.g. "Amateur")
`level`	Recipe level integer
`subcrafts`	JSON list `[["Smithing",2],["Alchemy",7]]`
`name`	NQ product name
`crystal`	Element used (Wind, Earth, etc.)
`key_item`	Required key item (blank if none)
`ingredients`	JSON list `[["Arrowwood Log",1]]`
`hq_yields`	JSON list HQ1-HQ3 e.g. `[["Arrowwood Lumber",6],["Arrowwood Lumber",9],["Arrowwood Lumber",12]]`
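
Because three of the columns hold JSON, a consumer has to decode them after reading the CSV. The following is a minimal sketch, assuming the file has a header row with exactly the column names listed above; `read_recipes` is a hypothetical helper, and the `level` value in the sample row is made up for illustration.

```python
import csv
import io
import json

COLUMNS = ["category", "level", "subcrafts", "name",
           "crystal", "key_item", "ingredients", "hq_yields"]
JSON_COLUMNS = ("subcrafts", "ingredients", "hq_yields")


def read_recipes(handle):
    """Read v2 rows, decoding the JSON columns and casting level to int."""
    recipes = []
    for row in csv.DictReader(handle):
        for col in JSON_COLUMNS:
            # An empty cell is treated as an empty list (assumption).
            row[col] = json.loads(row[col]) if row[col] else []
        row["level"] = int(row["level"])
        recipes.append(row)
    return recipes


# Build a one-row CSV in memory so the sketch runs without the datasets.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerow({
    "category": "Amateur",
    "level": 1,  # illustrative value only
    "subcrafts": json.dumps([]),
    "name": "Arrowwood Lumber",
    "crystal": "Wind",
    "key_item": "",
    "ingredients": json.dumps([["Arrowwood Log", 1]]),
    "hq_yields": json.dumps([["Arrowwood Lumber", 6],
                             ["Arrowwood Lumber", 9],
                             ["Arrowwood Lumber", 12]]),
})
buffer.seek(0)

recipes = read_recipes(buffer)
print(recipes[0]["name"], recipes[0]["ingredients"])
```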

Parsing rules

Item quantities are detected only when the suffix uses an “x” (e.g. Lumber x6).
Strings such as Bronze Leggings +1 are treated as the full item name; the +1/+2/+3 suffix is preserved.
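
The two rules above can be illustrated with a small regular expression. This is a sketch rather than the parsers' actual code: `split_quantity` is a hypothetical helper, and defaulting to a quantity of 1 when no "x" suffix is present is an assumption.

```python
import re

# A trailing " x<digits>" is the only form treated as a quantity.
_QUANTITY = re.compile(r"^(?P<name>.+?)\s+x(?P<qty>\d+)$")


def split_quantity(text: str):
    """Split 'Lumber x6' into ('Lumber', 6); leave other strings intact."""
    text = text.strip()
    match = _QUANTITY.match(text)
    if match:
        return match.group("name"), int(match.group("qty"))
    # No "x" suffix: the whole string is the item name (quantity 1 assumed).
    return text, 1


print(split_quantity("Lumber x6"))
print(split_quantity("Bronze Leggings +1"))
```

With this pattern a `+1`/`+2`/`+3` suffix never matches as a quantity, so it stays part of the item name.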

Developing / debugging

Edit the parsers as needed, then rerun them to regenerate CSV.
Feel free to add new scripts here; remember to update requirements.txt & this README.
