Found 7 repositories (showing 7)
Source Code for 'Applied Data Science Using PySpark' by Ramcharan Kakarla, Sundar Krishnan, and Sridhar Alla
Source Code for 'Applied Data Science Using PySpark, Second Edition' by Ramcharan Kakarla, Sundar Krishnan, Balaji Dhamodharan and Venkata Gunnu
The "Applied Data Science Using PySpark" code files are programs that use PySpark to analyze, manipulate, and process large datasets. They rely on Apache Spark's distributed computing for efficiency and often use its built-in functions and libraries for data analysis.
joxborrow
This is a learning directory for Applied Data Science Using PySpark.
timothyLeeXQ
Prediction and Inference Project on F1 Dataset using Python, PySpark, and R. Final Project for GR5069: Applied Data Science
v-ca
This project, part of the Master of Science in Applied Data Science program at the University of Chicago, analyzes approximately 3.5 billion rows of GitHub commit data using PySpark and Google Dataproc to uncover trends and patterns in the open-source community.
SFARHAN23
This repository contains an Applied Data Science (ADS) project focused on Market Basket Analysis and Customer Segmentation using Python and PySpark. We analyze a grocery store dataset to extract valuable insights about customer purchasing behavior through descriptive and predictive analytics.