Found 63,262 repositories (showing 30)
DataExpert-io
This is a repo with links to everything you'd ever want to learn about data engineering
taosdata
High-performance, scalable time-series database designed for Industrial IoT (IIoT) scenarios
rustfs
🚀2.3x faster than MinIO for 4KB object payloads. RustFS is an open-source, S3-compatible high-performance object storage system supporting migration and coexistence with other S3-compatible platforms such as MinIO and Ceph.
apache
Empowering Data Intelligence with Distributed SQL for Sharding, Scalability, and Security Across All Databases.
heibaiying
A beginner's guide to big data :star:
oxnr
A curated list of awesome big data frameworks, resources and other awesomeness.
juicedata
JuiceFS is a distributed POSIX file system built on top of Redis and S3.
wangzhiwubigdata
Focused on big data learning and interviews; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...
databendlabs
Data Agent Ready Warehouse: One for Analytics, Search, AI, Python Sandbox. Rebuilt from scratch. Unified architecture on your S3.
vaexio
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
apache
Upserts, Deletes And Incremental Processing on Big Data.
volcano-sh
A Cloud Native Batch System (Project under CNCF)
TurboWay
Big data analysis project
iGaoWei
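100+ eye-catching big data visualization dashboard HTML5 templates, covering industries such as community, property management, government affairs, transportation, and finance and banking; the newest, largest, most complete, coolest, and flashiest collection of big data visualization templates on the web. Updated continuously.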
DTStack
A data integration framework
liyupi
🔨 Generate structured SQL statements from JSON; built with Vue3 + TypeScript + Vite + Ant Design + MonacoEditor. A simple project (heavy on logic, light on UI), good for practice.
apache
Apache Avro is a data serialization system.
MoRan1607
Big data learning: learn big data from scratch, including study videos for each learning stage and interview materials
douban
Python clone of Spark, a MapReduce-like framework in Python
griddb
GridDB is a next-generation open source database that makes time series IoT and big data fast and easy.
geekyouth
Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟
dotnet
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
DTStack
Extends the real-time SQL of open-source Flink; mainly implements joins between streams and dimension tables, and supports all native Flink SQL syntax
apconw
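Aix-DB is built on the LangChain/LangGraph framework and combines an MCP Skills multi-agent collaboration architecture to deliver end-to-end conversion from natural language to data insights.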
shzlw
An easy-to-use BI server built for SQL lovers. Power data analysis in SQL and gain faster business insights.
fluid-cloudnative
Fluid, elastic data abstraction and acceleration for BigData/AI applications in cloud. (Project under CNCF)
byzer-org
Byzer (formerly MLSQL): A low-code open-source programming language for data pipeline, analytics and AI.
Netflix
Distributed Big Data Orchestration Service
collabH
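A big data knowledge repository covering data warehouse modeling, real-time computing, big data, data middle platform, system design, Java, algorithms, and more.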
YoongiKim
Google, Naver multiprocess image web crawler (Selenium)