Four stages in sequence. Each is a Python module with a clear input/output contract.
Fetcher
File: src/fetcher.py
Purpose: Pull unread "Metaweave Forms:" emails from a shared Outlook mailbox via Microsoft Graph.
Auth
OAuth2 client credentials flow via MSAL. The app needs Mail.Read and Mail.ReadWrite application permissions on the target mailbox (consented by an Azure AD admin).
Query
Filters emails server-side via OData. Each call requests $top=50; extend with pagination if your fleet exceeds 50 unread emails per run.
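A minimal sketch of how such a query and its pagination could look. Only $top=50 and the unread/subject filtering come from the description above; the mailbox address, the exact $filter expression, the inbox folder path, and both helper names are assumptions for illustration. Microsoft Graph signals further pages through an @odata.nextLink property in the response.

```python
from typing import Callable, Iterator
from urllib.parse import quote, urlencode

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def build_query_url(mailbox: str, subject_prefix: str = "Metaweave Forms:", top: int = 50) -> str:
    """Build a Graph messages URL that filters unread mail server-side."""
    # Assumed filter: unread messages whose subject starts with the form prefix.
    odata_filter = f"isRead eq false and startswith(subject,'{subject_prefix}')"
    # safe="$" keeps the OData parameter names ($filter, $top) readable.
    params = urlencode({"$filter": odata_filter, "$top": top}, quote_via=quote, safe="$")
    return f"{GRAPH_BASE}/users/{mailbox}/mailFolders/inbox/messages?{params}"


def iter_messages(first_url: str, get_json: Callable[[str], dict]) -> Iterator[dict]:
    """Yield messages page by page, following @odata.nextLink until it is absent.

    get_json is any callable that performs an authenticated GET and returns
    the decoded JSON body; it is injected so the loop can be tested offline.
    """
    url = first_url
    while url:
        page = get_json(url)
        yield from page.get("value", [])
        url = page.get("@odata.nextLink")
```

Passing the HTTP call in as `get_json` keeps the pagination loop independent of the HTTP client and of token handling.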
Subject parsing
Each subject is parsed against a strict regex (config.py) that yields (vessel_name, report_type_raw, report_date_DD.MM.YYYY). Subjects that don’t match are skipped; this is the gate that excludes stray emails from the same mailbox.
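The gate can be illustrated as follows. The actual pattern lives in config.py and is not shown here, so the subject layout used below ("Metaweave Forms: <vessel> - <report type> - DD.MM.YYYY"), `SUBJECT_RE`, `ParsedSubject`, and `parse_subject` are all hypothetical.

```python
import re
from datetime import date, datetime
from typing import NamedTuple, Optional

# Hypothetical pattern; the real regex in config.py may differ.
SUBJECT_RE = re.compile(
    r"^Metaweave Forms:\s*(?P<vessel_name>.+?)\s*-\s*"
    r"(?P<report_type_raw>.+?)\s*-\s*"
    r"(?P<report_date>\d{2}\.\d{2}\.\d{4})$"
)


class ParsedSubject(NamedTuple):
    vessel_name: str
    report_type_raw: str
    report_date: date


def parse_subject(subject: str) -> Optional[ParsedSubject]:
    """Return the three captured fields, or None so the caller can skip the email."""
    match = SUBJECT_RE.match(subject.strip())
    if match is None:
        return None
    try:
        report_date = datetime.strptime(match["report_date"], "%d.%m.%Y").date()
    except ValueError:
        return None  # digits matched but the date is impossible, e.g. 31.02.2026
    return ParsedSubject(match["vessel_name"], match["report_type_raw"], report_date)
```

Returning None instead of raising lets the fetcher skip a stray email without interrupting the run.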
Body extraction
If the body’s contentType is html, the fetcher strips HTML tags before passing the text to the parser. This handles forwarded plain-text wrapped in HTML by Outlook.
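One way to do the tag stripping with only the standard library is sketched below. The class and function names are illustrative, and treating br, p, and div as line breaks is an assumption about how Outlook wraps forwarded text.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes and turn block-level tags into newlines."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)  # decodes entities such as &amp;
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("br", "p", "div"):
            self._chunks.append("\n")

    def handle_data(self, data):
        self._chunks.append(data)

    def text(self) -> str:
        return "".join(self._chunks)


def extract_body_text(content: str, content_type: str) -> str:
    """Return plain text; non-HTML bodies pass through unchanged."""
    if content_type.lower() != "html":
        return content
    extractor = _TextExtractor()
    extractor.feed(content)
    extractor.close()  # flush any text still buffered by the parser
    return extractor.text().strip()
```

Preserving line breaks matters here because the parser stage reads the body line by line.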
Mark as read
PATCH /messages/{id} with {"isRead": true} after successful processing. Failed messages stay unread for retry on the next run.
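The mark-after-success rule might be sketched like this. The /users/{mailbox} path prefix, the bearer-token header, and both function names are assumptions; the request is only constructed here, not sent.

```python
import json
from typing import Callable, Iterable, List
from urllib.request import Request

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def build_mark_read_request(mailbox: str, message_id: str, access_token: str) -> Request:
    """Build (but do not send) the PATCH that flips isRead to true."""
    return Request(
        url=f"{GRAPH_BASE}/users/{mailbox}/messages/{message_id}",
        data=json.dumps({"isRead": True}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )


def process_and_mark(
    messages: Iterable[dict],
    process: Callable[[dict], None],
    mark_read: Callable[[str], None],
) -> List[str]:
    """Mark a message read only if process() succeeds; return the ids that failed."""
    failed = []
    for message in messages:
        try:
            process(message)
        except Exception:
            failed.append(message["id"])  # left unread, so the next run retries it
            continue
        mark_read(message["id"])
    return failed
```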
Output
A list of FetchedEmail dataclasses.
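The dataclass definition is not reproduced in this section. As a rough idea of its shape, the sketch below uses the three values captured from the subject plus the extracted body; every field name is a guess rather than the actual definition in src/fetcher.py.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FetchedEmail:
    """Hypothetical shape of one fetched message; field names are assumed."""

    message_id: str       # Graph message id, needed later to mark the email read
    subject: str
    vessel_name: str      # captured from the subject
    report_type_raw: str  # captured from the subject
    report_date: date     # parsed from DD.MM.YYYY in the subject
    body_text: str        # plain text after HTML stripping
```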
Parser
File: src/parser.py
Purpose: Extract the encrypted payload from the email body and decrypt it to a JSON dict.
Header extraction
Pulls report_type_raw and form_version from a header line in the body.
Marker extraction
Locates the encrypted block using the start and end markers defined in config.py.
Decryption
- AES-128-CBC (16-byte key)
- IV = key (matching the form’s CryptoJS encryption)
- PKCS7 padding
- Library: pycryptodome
Text fallback
If markers are missing (e.g. crew hand-edited and broke the block), parser falls back to a regex-based key-value extractor that reads section headers (---Section---) and key: value lines, flattening to a dict with keys like "Section::Key". Coverage is partial — most fields will arrive but rich nested arrays won’t.
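A minimal sketch of such a fallback extractor, assuming one section header or one key: value pair per line. The two regexes and the function name are illustrative; lines that appear before any section header are dropped here, which is also an assumption.

```python
import re

SECTION_RE = re.compile(r"^-{3}\s*(?P<name>.+?)\s*-{3}$")
KEY_VALUE_RE = re.compile(r"^(?P<key>[^:]+?)\s*:\s*(?P<value>.*)$")


def parse_text_fallback(body: str) -> dict:
    """Flatten ---Section--- / key: value text into {"Section::Key": "value"}."""
    result = {}
    section = None
    for raw_line in body.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        header = SECTION_RE.match(line)
        if header:
            section = header["name"]
            continue
        pair = KEY_VALUE_RE.match(line)
        if pair and section is not None:
            # Split on the first colon only, so timestamps keep their colons.
            result[f"{section}::{pair['key'].strip()}"] = pair["value"].strip()
    return result
```

Because every value stays a flat string, nested arrays cannot be rebuilt from this output, which matches the partial coverage noted above.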
Output
Mapper
File: src/mapper.py
Purpose: Translate the form’s JSON payload into SQLAlchemy model instances ready for the writer.
What it produces
Key transformations
| From form | To DB | Helper |
|---|---|---|
| "6 1' 54\" N" | 6.0317 (decimal degrees) | dms_to_decimal() |
| "13.04.2026 12:00:00 +03:00" | datetime(2026, 4, 13, 9, 0, tzinfo=UTC) | parse_report_datetime() |
| "Yes" / "No" / "True" / "1" | True / False | parse_bool() |
| "123.45" (string) | Decimal("123.45") | safe_decimal() (returns None on failure) |
| "42" (string) | 42 | safe_int() |
The helpers in src/utils/datetime_utils.py and src/utils/coordinates.py return None on failure rather than raising, so bad data becomes NULL and the run continues.
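The helpers named in the table could look roughly like this. The function names come from the table; the bodies are a sketch, not the project's code. In particular, the DMS pattern, rounding to four decimals, returning None from parse_bool() and safe_int() on unrecognised input, and negating southern and western coordinates are assumptions.

```python
import re
from datetime import datetime, timezone
from decimal import Decimal, InvalidOperation
from typing import Optional

# Matches strings such as: 6 1' 54" N
_DMS_RE = re.compile(r"^\s*(\d+)\s+(\d+)'\s*(\d+(?:\.\d+)?)\"\s*([NSEW])\s*$", re.IGNORECASE)
_TRUE = {"yes", "true", "1"}
_FALSE = {"no", "false", "0"}


def dms_to_decimal(value: str) -> Optional[float]:
    """Degrees + minutes/60 + seconds/3600; S and W are negated (assumed)."""
    match = _DMS_RE.match(value or "")
    if match is None:
        return None
    degrees, minutes, seconds, hemisphere = match.groups()
    decimal = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    if hemisphere.upper() in ("S", "W"):
        decimal = -decimal
    return round(decimal, 4)


def parse_report_datetime(value: str) -> Optional[datetime]:
    """Parse 'DD.MM.YYYY HH:MM:SS +HH:MM' and normalise to UTC."""
    try:
        local = datetime.strptime(value.strip(), "%d.%m.%Y %H:%M:%S %z")
    except (AttributeError, ValueError):
        return None
    return local.astimezone(timezone.utc)


def parse_bool(value) -> Optional[bool]:
    text = str(value).strip().lower()
    if text in _TRUE:
        return True
    if text in _FALSE:
        return False
    return None  # assumption: unrecognised input becomes NULL


def safe_decimal(value) -> Optional[Decimal]:
    try:
        return Decimal(str(value).strip())
    except InvalidOperation:
        return None


def safe_int(value) -> Optional[int]:
    try:
        return int(str(value).strip())
    except ValueError:
        return None
```

For the coordinate row in the table, 6 + 1/60 + 54/3600 = 6.03166..., which rounds to 6.0317.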
Per-event nested fuel
Each event has a nested fuel array with 12 consumption categories per fuel type, including main_engine_consumption, aux_engine_consumption, and total_consumption. These map to EventFuelConsumption rows attached to each ReportEvent.
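To make the shape of this mapping concrete, the sketch below uses plain dataclasses as stand-ins for the SQLAlchemy models and covers only the three categories named above. The payload keys event_type and fuel_type, and the helper names, are hypothetical.

```python
from dataclasses import dataclass, field
from decimal import Decimal, InvalidOperation
from typing import List, Optional


@dataclass
class EventFuelConsumption:  # stand-in for the SQLAlchemy model
    fuel_type: str
    main_engine_consumption: Optional[Decimal]
    aux_engine_consumption: Optional[Decimal]
    total_consumption: Optional[Decimal]


@dataclass
class ReportEvent:  # stand-in for the SQLAlchemy model
    event_type: str
    fuel_consumptions: List[EventFuelConsumption] = field(default_factory=list)


def _to_decimal(value) -> Optional[Decimal]:
    try:
        return Decimal(str(value))
    except InvalidOperation:
        return None  # missing or malformed values become NULL


def map_event(raw_event: dict) -> ReportEvent:
    """Build one ReportEvent with one child row per entry in its fuel array."""
    event = ReportEvent(event_type=raw_event.get("event_type", ""))
    for entry in raw_event.get("fuel", []):
        event.fuel_consumptions.append(
            EventFuelConsumption(
                fuel_type=entry.get("fuel_type", ""),
                main_engine_consumption=_to_decimal(entry.get("main_engine_consumption")),
                aux_engine_consumption=_to_decimal(entry.get("aux_engine_consumption")),
                total_consumption=_to_decimal(entry.get("total_consumption")),
            )
        )
    return event
```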
Context routing
The mapper reads a different set of array names depending on location (At Sea / In Port):
| Location | Array names |
|---|---|
| At Sea | atseaeventrobdetails, atseabunkerrobdetails, gsatseaeventtypes |
| In Port | inporteventrobdetails, inportbunkerrobdetails, gsinporteventtypes |
A context column tags each row with the location it came from.
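The routing in the table can be expressed as a lookup. The array names below are taken from the table; the context tag values ("at_sea", "in_port"), the role keys, and the function name are assumptions.

```python
from typing import Dict, List, Optional, Tuple

# location -> (context tag, {role: array name in the payload})
ARRAYS_BY_LOCATION = {
    "At Sea": ("at_sea", {
        "event_rob": "atseaeventrobdetails",
        "bunker_rob": "atseabunkerrobdetails",
        "event_types": "gsatseaeventtypes",
    }),
    "In Port": ("in_port", {
        "event_rob": "inporteventrobdetails",
        "bunker_rob": "inportbunkerrobdetails",
        "event_types": "gsinporteventtypes",
    }),
}


def select_arrays(payload: dict) -> Tuple[Optional[str], Dict[str, List]]:
    """Return the context tag and the arrays that apply to the payload's location."""
    location = payload.get("location")
    if location not in ARRAYS_BY_LOCATION:
        return None, {}
    context, names = ARRAYS_BY_LOCATION[location]
    # Missing arrays default to an empty list so callers can iterate safely.
    return context, {role: payload.get(name, []) for role, name in names.items()}
```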
Writer
File: src/writer.py
Purpose: Upsert into PostgreSQL.
Upsert sequence
CASCADE delete
All 11 child tables declare a cascading foreign key to the parent report, so DELETE FROM metaweave_report WHERE report_id=… removes the entire family in one statement. This is what makes corrections clean.
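The cascade behaviour can be demonstrated in isolation. The sketch below swaps in an in-memory SQLite database so it runs without a server; the pipeline itself targets PostgreSQL, where the ON DELETE CASCADE clause reads the same. The child table report_event and its columns are illustrative.

```python
import sqlite3


def cascade_demo() -> tuple:
    """Return (child rows before delete, child rows after delete)."""
    conn = sqlite3.connect(":memory:")
    # SQLite needs foreign keys switched on per connection; PostgreSQL enforces them always.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE metaweave_report (report_id INTEGER PRIMARY KEY);
        CREATE TABLE report_event (
            event_id  INTEGER PRIMARY KEY,
            report_id INTEGER NOT NULL
                REFERENCES metaweave_report (report_id) ON DELETE CASCADE
        );
        """
    )
    conn.execute("INSERT INTO metaweave_report (report_id) VALUES (1)")
    conn.executemany("INSERT INTO report_event (report_id) VALUES (?)", [(1,), (1,)])
    before = conn.execute("SELECT COUNT(*) FROM report_event").fetchone()[0]

    # One statement on the parent removes the children as well.
    conn.execute("DELETE FROM metaweave_report WHERE report_id = ?", (1,))
    after = conn.execute("SELECT COUNT(*) FROM report_event").fetchone()[0]
    conn.close()
    return before, after
```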
Why delete-then-insert (not UPDATE)
Reports have variable-length child arrays. A single UPDATE would have to diff them: which events to add, update, or delete? Delete-then-insert is simpler and faster for typical sizes (~5 events, ~4 bunker ROBs per report). Atomicity is preserved by the surrounding transaction: session.commit() covers both the delete and the insert.
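The atomicity argument can be checked with a small experiment. This sketch uses the standard-library sqlite3 module in place of a SQLAlchemy session and PostgreSQL, with `with conn:` playing the role of the surrounding transaction; the schema and function names are illustrative.

```python
import sqlite3

SCHEMA = """
CREATE TABLE metaweave_report (report_id INTEGER PRIMARY KEY);
CREATE TABLE report_event (
    event_id   INTEGER PRIMARY KEY,
    report_id  INTEGER NOT NULL
        REFERENCES metaweave_report (report_id) ON DELETE CASCADE,
    event_type TEXT NOT NULL
);
"""


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # required for the cascade in SQLite
    conn.executescript(SCHEMA)
    return conn


def replace_report(conn: sqlite3.Connection, report_id: int, event_types: list) -> None:
    """Delete the old report family and insert the new one in a single transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute("DELETE FROM metaweave_report WHERE report_id = ?", (report_id,))
        conn.execute("INSERT INTO metaweave_report (report_id) VALUES (?)", (report_id,))
        conn.executemany(
            "INSERT INTO report_event (report_id, event_type) VALUES (?, ?)",
            [(report_id, event_type) for event_type in event_types],
        )


def count_events(conn: sqlite3.Connection, report_id: int) -> int:
    row = conn.execute(
        "SELECT COUNT(*) FROM report_event WHERE report_id = ?", (report_id,)
    ).fetchone()
    return row[0]
```

If an insert fails partway through a correction, the rollback also undoes the delete, so the previous version of the report stays in place.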
See also
- Data model — full table list and relationships
- Configuration — env vars
- Running — operational invocation