PySpark is the Python API for Apache Spark, an open-source distributed computing framework and set of libraries for real-time, large-scale data processing. If you're already familiar with Python and libraries such as pandas, PySpark is a good tool to learn for building more scalable analyses and pipelines.
Stars: 3
Forks: 5
Watchers: 3
Open Issues: 1
Overall repository health assessment
No package.json found
This might not be a Node.js project
10 commits