Search Results

Found 238 repositories (showing 30)

flintrock

nchammas

🧡62

A command-line tool for launching Apache Spark clusters.

652

120

Apache-2.0

Python

Updated 3 days ago

apache-spark, apache-spark-cluster, ec2, +2

spark-ec2

amplab

❤️43

Scripts used to set up a Spark cluster on EC2

387

292

Apache-2.0

Python

Updated 1 month ago

RailSewa-FinalYearProject

DataSenseiAryan

🧡50

Automated Real-Time Indian Railway Twitter Complaint Administration System. It uses Apache Kafka, Spark, MySQL, PHP. The full project was deployed on AWS EC2 and RDS.

MIT

CSS

Updated 1 month ago

apache-spark, aws, kafka, +5

Steam_Recommendation_System

huntingzhu

❤️35

Recommendation System, Collaborative Filtering, Spark, Hive, Flask, Web Crawler, AWS EC2, AWS RDS

Jupyter Notebook

Updated 12 months ago

cgcloud

BD2KGenomics

❤️25

Image and VM management for Jenkins, Spark and Mesos clusters in EC2

NOASSERTION

Python

Updated 2 years ago

spark-ec2

shivaram

❤️37

Scripts used to set up a Spark cluster on EC2

166

Shell

Updated 9 months ago

spark-cloud

entropyltd

❤️20

Spark-cloud is a set of scripts for starting Spark clusters on EC2

Shell

Updated 8 years ago

spark-ec2-setup

CloudComputingCourse

❤️20

No description available

Apache-2.0

Python

Updated 6 years ago

geotrellis-ec2-cluster

geotrellis

❤️20

Scripts to deploy a GeoTrellis Spark cluster on EC2

Apache-2.0

Python

Updated 8 years ago

sbt-spark-ec2-plugin

felixgborrego

❤️35

Sbt plugin to submit Spark jobs to AWS EMR Spark Clusters

Scala

Updated 4 years ago

sbt-spark-ec2

pishen

❤️40

SBT plugin for spark-ec2

Apache-2.0

Python

Updated 5 years ago

e-commerce-marketing-pipeline

anish749

❤️35

Data Pipeline examples using Oozie, Spark and Hive on Cloudera VM and AWS EC2 (branch aws-ec2)

Scala

Updated 2 years ago

ddapp

codeaucafe

❤️35

Full-stack data science project (tech currently used: AWS/boto3/EMR/EC2/S3, Python, PySpark (Spark SQL and MLlib), and Flask/Flask RESTPlus)

Python

Updated 5 years ago

ansible-spark-ec2

phamthuonghai

❤️10

No description available

Python

Updated 6 years ago

TwitterSentimentAnalysis-BigDataProject

rochitasundar

❤️35

Scraped tweets using the Twitter API (for keyword ‘Netflix’) on an AWS EC2 instance and ingested the data into S3 via Kinesis Firehose. Used Spark ML on Databricks to build a pipeline for a sentiment classification model, and Athena and QuickSight to build a dashboard

Jupyter Notebook

Updated 2 years ago

accuracy-metrics, aws-athena, aws-ec2, +12

ETL-Pipeline-Applied-With-AWS

Maggie1001

❤️35

Spark, EMR, EC2, Redshift, Glue

Scala

Updated 5 years ago

aws_ec2_jupyter_spark

bgrosjea

❤️35

Setup procedure for working with Jupyter Notebook and PySpark on an AWS EC2 instance

Shell

Updated 2 years ago

ansible-spark

MBtech

❤️35

Ansible playbook to set up Apache Spark and HDFS on AWS EC2

Python

Updated 4 years ago

immigration_data_etl_analysis

richjdowney

❤️20

This project demonstrates data engineering skills: it contains an efficient ETL process that uses AWS EC2, EMR and S3 with Python and Spark, and orchestrates the data pipeline with Airflow

Python

Updated 1 year ago

Cloud-Computing

rajshah1

❤️35

Grad coursework for ITCS-6190 Cloud Computing for Data Analysis. Stack used: AWS, EC2 clusters, Spark, Spring Boot applications

Updated 4 years ago

itcs-6190, itcss

Sales-analytics-Data-Lakehouse

ChahiriAbderrahmane

🧡50

This project simulates a real-world enterprise data migration and modernization strategy. It extracts transactional data from a simulated "On-Premise" environment (hosted on AWS EC2), performs heavy distributed processing using a Hadoop/Spark cluster, and ultimately serves the data via a cloud-native, serverless architecture to optimize costs.

Python

Updated 3 weeks ago

amazon-athena, amazon-quicksight, amazon-s3, +11

Spark_on_AWS_EC2

anuragdogra2192

❤️35

Spark and Python for Big Data with PySpark (SparkML, DataFrames) Udemy course projects

Jupyter Notebook

Updated 1 year ago

spark-ec2-setup

jinliangwei

❤️15

No description available

Apache-2.0

Python

Updated 9 years ago

ML-Spark-AWS

Ting-DS

❤️35

Spark ML, Spark SQL, Spark DataFrame, AWS EMR, AWS S3, AWS EC2, ML Classification

HTML

Updated 2 years ago

IIITG-Cloud-Computing

ShauryaManiTripathi

❤️35

My work around AWS (includes NGINX, EC2, S3, Autoscaling, RDS, Beanstalk, Hadoop, Spark)

Python

Updated 11 months ago

Identification-of-mislabeling-data-using-ML

aliafzal

❤️35

Overcome mislabeling errors in genomics training sets by utilizing machine learning on AWS EC2 and Apache Spark.

Jupyter Notebook

Updated 2 years ago

aws-ecs-cluster, cloud, genomic-data-analysis, +5

Hashtag-Recommendation-System-Twitter

bhuvan92

❤️35

Build a recommendation system for Twitter hashtags using Neo4j graph database running on Spark GraphX on an EC2 cluster on AWS.

Python

Updated 3 years ago

Hotspot-Analysis-Using-Apache-Sedona

sid-7

❤️35

Implemented spatial hotspot analysis on the NYC Yellow Cab taxi trip records using a Spark cluster set up on AWS EC2 instances. The aim was to analyse a huge dataset using distributed cluster-computing frameworks like Apache Spark and Apache Sedona.

Scala

Updated 3 years ago

apache-sedona, aws-ec2, scala, +1

Spark-Processing-AWS

longNguyen010203

❤️40

👷🌇 Set up and build a big data processing pipeline with Apache Spark and 📦 AWS services (S3, EMR, EC2, IAM, VPC, Redshift), using Terraform to set up the infrastructure and Airflow integration to automate workflows 🥊

Apache-2.0

Python

Updated 1 year ago

apache-airflow, apache-spark, aws, +13

Customer-churn-prediction

Architectshwet

❤️35

A project that builds a stacked model, tunes its hyperparameters with grid search and Hyperopt, uses PySpark to test the model's performance on Spark clusters in AWS EC2, and uses ROSE to balance the target variable

Jupyter Notebook

Updated 5 years ago

feature-engineering, machine-learning, rose, +3
