# Recipe ETL Scripts

This directory contains helper scripts for extracting Woodworking recipe data
from the raw **datasets/Woodworking.txt** file and loading it into the project
PostgreSQL database.

## File overview

| File | Purpose |
|------|---------|
| **woodworking_to_csv.py** | Legacy first-pass parser → `datasets/Woodworking.csv`. |
| **woodworking_to_csv_v2.py** | Improved parser that matches the spec (category, level, sub-crafts, ingredients, HQ yields, etc.) → `datasets/Woodworking_v2.csv`. |
| **recipes_to_csv_v2.py** | Generic parser. `python recipes_to_csv_v2.py <Craft>` processes one craft; use `python recipes_to_csv_v2.py --all` **or simply omit the argument** to parse every `.txt` file under `datasets/`, producing `datasets/<Craft>_v2.csv` for each. |
| **load_woodworking_to_db.py** | Loader for the legacy CSV (kept for reference). |
| **load_woodworking_v2_to_db.py** | Drops and recreates the **recipes_woodworking** table and bulk-loads `Woodworking_v2.csv`. |
| **load_recipes_v2_to_db.py** | Generic loader. `python load_recipes_v2_to_db.py <Craft>` loads one craft; omit the argument to load **all** generated CSVs into their respective `recipes_<craft>` tables. |
| **load_inventory_to_db.py** | Truncates the `inventory` table and loads `datasets/inventory.csv` into it. |
| **requirements.txt** | Minimal Python dependencies for the scripts. |
| **venv/** | Local virtual environment created by the setup steps below. |

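The naming convention in the table (one `<Craft>.txt` input, one `<Craft>_v2.csv` output, one `recipes_<craft>` table) can be sketched as below. This is a minimal illustration, not the scripts' actual code: the helper names `csv_path`, `table_name`, and `crafts_to_process` are hypothetical, and the lower-casing of the table name is inferred from `Woodworking` mapping to `recipes_woodworking`.

```python
import tempfile
from pathlib import Path


def csv_path(craft, datasets_dir=Path("datasets")):
    """Output CSV for a craft, e.g. datasets/Smithing_v2.csv."""
    return datasets_dir / f"{craft}_v2.csv"


def table_name(craft):
    """Target table for a craft (lower-casing is an assumption)."""
    return f"recipes_{craft.lower()}"


def crafts_to_process(argv, datasets_dir):
    """One craft if named on the command line, otherwise every .txt file."""
    names = [arg for arg in argv if arg != "--all"]
    if names:
        return [names[0]]
    return sorted(path.stem for path in datasets_dir.glob("*.txt"))


# Demo against a throwaway directory holding two empty input files.
with tempfile.TemporaryDirectory() as tmp:
    datasets = Path(tmp)
    (datasets / "Woodworking.txt").touch()
    (datasets / "Smithing.txt").touch()
    everything = crafts_to_process(["--all"], datasets)
    single = crafts_to_process(["Smithing"], datasets)

print(everything, single)
print(csv_path("Smithing"), table_name("Woodworking"))
```
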
## Prerequisites

* Python ≥ 3.9
* PostgreSQL instance reachable with credentials in `db.conf` at project root:

```ini
PSQL_HOST=…
PSQL_PORT=…
PSQL_USER=…
PSQL_PASSWORD=…
PSQL_DBNAME=…
```

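For orientation, a `KEY=VALUE` file like `db.conf` can be read with a few lines of Python. The sketch below is an assumption about how such a file might be consumed, not the loaders' actual code: `parse_db_conf` and `connection_kwargs` are hypothetical helpers, the sample values are placeholders, and the keyword names (`host`, `port`, `user`, `password`, `dbname`) follow libpq's connection parameters.

```python
def parse_db_conf(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # ignore lines without an '='
        settings[key.strip()] = value.strip()
    return settings


def connection_kwargs(settings: dict[str, str]) -> dict:
    """Map the PSQL_* keys to libpq-style connection keywords."""
    return {
        "host": settings["PSQL_HOST"],
        "port": int(settings["PSQL_PORT"]),
        "user": settings["PSQL_USER"],
        "password": settings["PSQL_PASSWORD"],
        "dbname": settings["PSQL_DBNAME"],
    }


# Placeholder values for illustration only.
sample = """
PSQL_HOST=localhost
PSQL_PORT=5432
PSQL_USER=example_user
PSQL_PASSWORD=example_password
PSQL_DBNAME=example_db
"""

kwargs = connection_kwargs(parse_db_conf(sample))
print(kwargs["host"], kwargs["port"], kwargs["dbname"])
```

A dictionary in this shape could be passed as keyword arguments to a PostgreSQL driver's connect function, if the driver accepts libpq keywords.
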
## Quick start (Woodworking example)

```bash
# 1. From project root
cd scripts

# 2. Create the virtualenv (only once), then activate it
python3 -m venv venv
source venv/bin/activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Generate CSVs for all crafts
python recipes_to_csv_v2.py --all   # or simply `python recipes_to_csv_v2.py`

# 5. Load all crafts into the DB (drops/recreates each table)
python load_recipes_v2_to_db.py
```

To work with a **single craft**, specify its name instead:

```bash
python recipes_to_csv_v2.py Smithing      # generate Smithing_v2.csv
python load_recipes_v2_to_db.py Smithing  # load only Smithing recipes
```

The parser and the loader each print a summary line, e.g.:

```
Wrote 480 recipes -> datasets/Woodworking_v2.csv
Loaded recipes into new recipes_woodworking table.
```

## CSV schema (v2)

| Column | Notes |
|--------|-------|
| `category` | Craft rank without level range (e.g. "Amateur") |
| `level` | Recipe level (integer) |
| `subcrafts` | JSON list, e.g. `[["Smithing",2],["Alchemy",7]]` |
| `name` | NQ product name |
| `crystal` | Element used (Wind, Earth, etc.) |
| `key_item` | Required key item (blank if none) |
| `ingredients` | JSON list, e.g. `[["Arrowwood Log",1]]` |
| `hq_yields` | JSON list of HQ1-HQ3 yields, e.g. `[["Arrowwood Lumber",6],["Arrowwood Lumber",9],["Arrowwood Lumber",12]]` |

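Because three of the columns hold JSON, a consumer has to decode them after reading the CSV. The sketch below shows one way to do that with the standard library. It is an illustration under stated assumptions: the column order mirrors the table above, `decode_row` is a hypothetical helper, and the sample row reuses the example values from this README with an illustrative `category`, `level`, and `crystal` rather than real dataset values.

```python
import csv
import io
import json

COLUMNS = ["category", "level", "subcrafts", "name",
           "crystal", "key_item", "ingredients", "hq_yields"]
JSON_COLUMNS = ("subcrafts", "ingredients", "hq_yields")


def decode_row(row):
    """Turn one raw CSV row (all strings) into typed Python values."""
    decoded = dict(row)
    decoded["level"] = int(row["level"])
    for column in JSON_COLUMNS:
        # An empty cell is treated as an empty list (an assumption).
        decoded[column] = json.loads(row[column]) if row[column] else []
    return decoded


# Build a one-row CSV in memory so the quoting is handled by csv itself.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerow({
    "category": "Amateur",
    "level": 1,
    "subcrafts": json.dumps([]),
    "name": "Arrowwood Lumber",
    "crystal": "Wind",
    "key_item": "",
    "ingredients": json.dumps([["Arrowwood Log", 1]]),
    "hq_yields": json.dumps([["Arrowwood Lumber", 6],
                             ["Arrowwood Lumber", 9],
                             ["Arrowwood Lumber", 12]]),
})
buffer.seek(0)

rows = [decode_row(row) for row in csv.DictReader(buffer)]
for item, quantity in rows[0]["hq_yields"]:
    print(item, quantity)
```
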
## Parsing rules

* Item quantities are detected only when the suffix uses an “x” (e.g. `Lumber x6`).
* Strings such as `Bronze Leggings +1` are treated as the **full item name**; the `+1/+2/+3` suffix is preserved.

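The two rules above can be sketched with a single regular expression. This is a minimal illustration rather than the parsers' actual code: `split_quantity` is a hypothetical helper, and defaulting to a quantity of 1 when no `x<N>` suffix is present is an assumption.

```python
import re

# A trailing " x<digits>" is a quantity; anything else stays in the name.
QTY_SUFFIX = re.compile(r"^(?P<name>.+?)\s+x(?P<qty>\d+)$")


def split_quantity(text):
    """Split 'Lumber x6' into ('Lumber', 6); leave '+1' style names intact."""
    text = text.strip()
    match = QTY_SUFFIX.match(text)
    if match:
        return match.group("name"), int(match.group("qty"))
    return text, 1  # no "x" suffix: assume a single item


for sample in ("Lumber x6", "Bronze Leggings +1", "Bronze Leggings +1 x2"):
    print(split_quantity(sample))
```

Anchoring the pattern at the end of the string keeps a `+1` quality suffix from being read as a count, since only an `x` followed by digits qualifies.
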
## Developing / debugging

* Edit the parsers as needed, then rerun them to regenerate the CSVs.
* Feel free to add new scripts here; remember to update **requirements.txt** and this README.