Found 932 repositories (showing 30)
apache
Apache DataFusion SQL Query Engine
ibis-project
the portable Python dataframe library
roapi
Create full-fledged APIs for slowly moving datasets without writing a single line of code.
apache
Extensible SQL Lexer and Parser for Rust
lakesoul-io
LakeSoul is an end-to-end, realtime and cloud native Lakehouse framework with fast data ingestion, concurrent update and incremental data analytics on cloud storages for both BI and AI applications.
apache
Apache DataFusion Ballista Distributed Query Engine
apache
The Auron accelerator for distributed computing frameworks (e.g., Spark) leverages native vectorized execution to accelerate query processing
tansu-io
Apache Kafka® compatible broker with S3, PostgreSQL, SQLite, Apache Iceberg and Delta Lake
lakehq
LakeSail's computation framework with a mission to unify batch processing, stream processing, and compute-intensive AI workloads.
arkflow-rs
High-performance Rust stream processing engine that seamlessly integrates AI capabilities, providing powerful real-time data processing and intelligent analysis.
apache
Apache DataFusion Comet Spark Accelerator
ClickHouse
ClickBench: a Benchmark For Analytical Databases
Canner
The open context engine for AI agents, supporting 15+ data sources. Built on Rust and Apache DataFusion.
andygrove
DataFusion has now been donated to the Apache Arrow project
apache
Apache DataFusion Python Bindings
paradedb
DuckDB-powered data lake analytics from Postgres
splitgraph
Analytical database for data-driven Web applications 🪶
XiangpengHao
Pushdown cache for DataFusion
probably-nothing-labs
Embeddable stream processing engine based on Apache DataFusion
kamu-data
Next-generation decentralized data lakehouse and a multi-party stream processing network
JanKaul
Unofficial Rust implementation of Apache Iceberg with integration for DataFusion
apache
Apache DataFusion Ray
datafusion-contrib
Batteries included CLI, TUI, and server implementations for DataFusion.
datafusion-contrib
DataFusion TableProviders for reading data from other systems
datafusion-contrib
Allow DataFusion to resolve queries across remote query engines while pushing down as much compute as possible.
biodatageeks
Blazing-Fast Bioinformatic Operations on Python DataFrames
PRQL
Query and transform data with PRQL
datafusion-contrib
Postgres protocol frontend for DataFusion
nimtable
Compaction runtime for Apache Iceberg.
hotdata-dev
A Rust-native DuckLake engine built on Apache DataFusion