This project builds an end-to-end data pipeline covering ingestion, processing, and storage. Apache Airflow orchestrates the workflow; Python, Kafka, Zookeeper, and Spark handle real-time data processing; and Cassandra stores the results. Every component runs in a Docker container to simplify deployment and scaling.
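The flow described above (ingest, then process, then store) can be sketched with the Python standard library alone. This is only an illustration of the stage boundaries, not the repository's code: `queue.Queue` stands in for a Kafka topic, a plain generator for the Spark job, and a dict for a Cassandra table. The record fields (`id`, `first_name`, `last_name`, `email`) and all function names are hypothetical.

```python
import json
import queue


def ingest(records, topic):
    """Producer stage: serialize each record and publish it to the topic."""
    for record in records:
        topic.put(json.dumps(record).encode("utf-8"))


def process(topic):
    """Processing stage: drain the topic, decode each message, normalize it."""
    while not topic.empty():
        message = json.loads(topic.get().decode("utf-8"))
        # Hypothetical transformation: build a full name, lowercase the email.
        yield {
            "id": message["id"],
            "full_name": f"{message['first_name']} {message['last_name']}",
            "email": message["email"].lower(),
        }


def store(rows, table):
    """Storage stage: upsert each row by primary key (dict as table stand-in)."""
    for row in rows:
        table[row["id"]] = row


def run_pipeline(records):
    """Wire the three stages together; an orchestrator would schedule this."""
    topic = queue.Queue()  # stand-in for a Kafka topic
    table = {}             # stand-in for a Cassandra table
    ingest(records, topic)
    store(process(topic), table)
    return table
```

In a real deployment each stage would be a separate service, and an orchestrator such as Airflow would trigger the ingestion step on a schedule rather than calling the stages in one process.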
Stars: 7
Forks: 2
Watchers: 7
Open Issues: 1
Overall repository health assessment
Commits: 5