The Metaweave pipeline is a Python package that ingests submissions into PostgreSQL. It runs as a single CLI on a schedule.
Prerequisites
- Python 3.12+
- A Google Cloud SQL Postgres instance (and a service account JSON with access)
- An Azure AD app registration with Mail.Read and Mail.ReadWrite permissions on the shared Outlook mailbox
- Network access to:
  - graph.microsoft.com
  - login.microsoftonline.com
  - sqladmin.googleapis.com (for Cloud SQL Connector)
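The prerequisites above can be checked before installing anything. The following is a minimal sketch, not part of the pipeline: the function names are hypothetical, and port 443 is assumed because all three hosts are HTTPS endpoints.

```python
import socket
import sys

# Hosts listed in the prerequisites; port 443 (HTTPS) is an assumption.
REQUIRED_HOSTS = (
    "graph.microsoft.com",
    "login.microsoftonline.com",
    "sqladmin.googleapis.com",
)


def python_ok(version=sys.version_info, minimum=(3, 12)):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version[:2]) >= minimum


def host_reachable(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def check_prerequisites():
    """Return a dict mapping each check name to its pass/fail result."""
    results = {"python>=3.12": python_ok()}
    for host in REQUIRED_HOSTS:
        results[host] = host_reachable(host)
    return results
```

Calling `check_prerequisites()` from a Python shell returns one boolean per check. A TCP connection only shows that the host is reachable; it does not verify the Azure AD permissions or the Cloud SQL service account.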
Set up the venv
Install the package in editable mode so that changes under src/ are picked up without re-installing.
Dependencies
Pulled in from pyproject.toml:
| Package | Why |
|---|---|
| pycryptodome | AES-128-CBC decryption |
| sqlalchemy ≥2.0 | ORM + declarative models |
| cloud-sql-python-connector[pg8000] | GCP Cloud SQL Connector |
| pg8000 | Pure-Python PostgreSQL dialect |
| psycopg2-binary | Fallback PostgreSQL driver |
| alembic | Schema migrations (included, not yet used) |
| msal | Microsoft identity library (OAuth2 client credentials) |
| requests | HTTP client for Microsoft Graph |
| python-dotenv | .env loading |

Development and test dependencies: pytest, pytest-cov.
Configure environment variables
Create a .env file in the pipeline/ directory (or set the variables in your shell). The Configuration page lists every variable.
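A startup check for missing variables can catch configuration mistakes early. The sketch below is illustrative only: the variable names in `EXAMPLE_REQUIRED_VARS` are hypothetical placeholders, not the pipeline's real variable names, and `missing_vars` is not a function from the package.

```python
import os

# Hypothetical placeholder names; substitute the pipeline's real variables.
EXAMPLE_REQUIRED_VARS = ("EXAMPLE_DB_NAME", "EXAMPLE_CLIENT_ID")


def missing_vars(required, env=None):
    """Return the names in `required` that are unset or empty in `env`.

    `env` defaults to the process environment, so values loaded from a
    .env file (for example by python-dotenv) are included if they were
    loaded before this call.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

For example, `missing_vars(EXAMPLE_REQUIRED_VARS)` returns an empty list when everything is set, and the caller can exit with a clear message otherwise. Empty strings are treated as missing, which is a design choice of this sketch.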
Create the database tables
First-time setup: the table-creation command calls Base.metadata.create_all(engine) and exits. It is idempotent: re-running it on an existing schema is a no-op.
The 17 tables it creates:
Verify the install
Run a single email through end-to-end without touching Outlook. The --file option reads the body from disk, runs Parse → Map → Write, and exits. This is useful for testing without consuming a real mailbox message.
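The shape of a `--file` run can be sketched as follows. This is a toy illustration of the Parse → Map → Write flow, not the pipeline's code: the three stage functions, the record fields, and the in-memory `sink` list (standing in for the database) are all hypothetical.

```python
import argparse
from pathlib import Path


def parse(body):
    """Parse stage (hypothetical): normalise the raw email body."""
    return {"body": body.strip()}


def map_record(parsed):
    """Map stage (hypothetical): turn parsed fields into a row-like dict."""
    return {"body_length": len(parsed["body"]), "body": parsed["body"]}


def write(record, sink):
    """Write stage (hypothetical): append to a list instead of a database."""
    sink.append(record)


def run_file(path, sink):
    """Read the email body from disk and run Parse -> Map -> Write once."""
    body = Path(path).read_text(encoding="utf-8")
    write(map_record(parse(body)), sink)
    return sink


def main(argv=None):
    """Parse --file from argv, process that one file, and return 0."""
    parser = argparse.ArgumentParser(description="Process one email from disk.")
    parser.add_argument("--file", required=True, help="path to the email body")
    args = parser.parse_args(argv)
    sink = []
    run_file(args.file, sink)
    print(f"wrote {len(sink)} record(s)")
    return 0
```

Calling `main(["--file", "sample_email.txt"])` processes one file and returns, which mirrors the run-once-and-exit behaviour described above. The file name here is a placeholder.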
Run the tests
The tests run against SQLite with PRAGMA foreign_keys=ON, so no PostgreSQL instance is needed.
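`PRAGMA foreign_keys=ON` matters because SQLite does not enforce foreign keys by default, whereas PostgreSQL always does. The sketch below uses only the standard-library `sqlite3` module to show the difference. The `parent` and `child` tables and both helper functions are hypothetical and unrelated to the pipeline's schema or test fixtures.

```python
import sqlite3


def open_test_db(enforce_fk=True):
    """Open an in-memory SQLite database with a parent/child table pair."""
    conn = sqlite3.connect(":memory:")
    if enforce_fk:
        # Must run outside a transaction; a fresh connection has none open.
        conn.execute("PRAGMA foreign_keys=ON")
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE child ("
        "id INTEGER PRIMARY KEY, "
        "parent_id INTEGER NOT NULL REFERENCES parent(id))"
    )
    return conn


def orphan_insert_rejected(conn):
    """Return True if inserting a child with no matching parent fails."""
    try:
        conn.execute("INSERT INTO child (parent_id) VALUES (999)")
    except sqlite3.IntegrityError:
        return True
    return False
```

With the pragma on, the orphan insert raises `sqlite3.IntegrityError`; with it off, SQLite accepts the row silently. Turning it on keeps the SQLite test database closer to PostgreSQL's behaviour for referential integrity.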
See also
- Running — operational invocation patterns
- Configuration — every env var
- ETL stages — what happens internally