About LakeLogic

Contract-driven data engineering for the modern lakehouse.

What is LakeLogic?

LakeLogic is an open-source Python framework that lets data teams define, enforce, and monitor data contracts across every layer of the data stack, from raw ingestion to curated gold tables. Write your schema once in YAML, and LakeLogic validates, transforms, and materialises your data on the engine that fits your workload: Polars, Pandas, DuckDB, Spark, Snowflake, or BigQuery.
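
To make the idea of a data contract concrete, here is a minimal sketch in plain Python. It does not use LakeLogic's API. The contract layout, the column names, and the `validate_row` helper are all hypothetical, chosen only to show how a declared schema can be checked against a record.

```python
# Hypothetical contract, shown as a plain dict. In a YAML file the same
# structure might be written as:
#   dataset: orders
#   columns:
#     - {name: order_id, type: int, nullable: false}
#     - {name: amount, type: float, nullable: false}
#     - {name: coupon, type: str, nullable: true}
CONTRACT = {
    "dataset": "orders",
    "columns": [
        {"name": "order_id", "type": "int", "nullable": False},
        {"name": "amount", "type": "float", "nullable": False},
        {"name": "coupon", "type": "str", "nullable": True},
    ],
}

# Map the type names used in the contract to Python types.
TYPES = {"int": int, "float": float, "str": str}


def validate_row(row, contract):
    """Return a list of human-readable violations for one record."""
    errors = []
    declared = {col["name"] for col in contract["columns"]}

    for col in contract["columns"]:
        name = col["name"]
        if name not in row:
            errors.append(f"missing column: {name}")
            continue
        value = row[name]
        if value is None:
            if not col["nullable"]:
                errors.append(f"null in non-nullable column: {name}")
            continue
        if not isinstance(value, TYPES[col["type"]]):
            errors.append(f"wrong type for {name}: expected {col['type']}")

    # Columns the contract does not declare are a sign of schema drift.
    for name in sorted(set(row) - declared):
        errors.append(f"unexpected column: {name}")

    return errors


if __name__ == "__main__":
    good = {"order_id": 1, "amount": 9.5, "coupon": None}
    drifted = {"order_id": "1", "amount": 9.5, "coupon": None, "extra": 1}
    print(validate_row(good, CONTRACT))
    print(validate_row(drifted, CONTRACT))
```

The point of the sketch is that the rules live in one declarative structure rather than in the checking code, so the same contract could in principle be applied by different engines.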

Why we built it

Modern data teams manage hundreds of datasets across multiple engines, clouds, and formats. Without a shared contract layer, quality checks are ad hoc, schema drift goes unnoticed, and bad data reaches production. LakeLogic gives every pipeline a single source of truth: a declarative contract that travels with the data.

Core principles

Project links

Contact

Reach the team at hello@lakelogic.org or open an issue on GitHub.