Search Results

Found 88 repositories (showing 30)

Data-Analyst-Nanodegree

sondosaabed

❤️35

I acquired a full scholarship from Google Launchpad. Advanced data wrangling skills to work with messy, complex real-world datasets. Highly customized visualizations using the Matplotlib Python library

MIT

Jupyter Notebook

Updated 4 months ago

data-science dataanalysis datawrangling +3

Real-World-Data-Wrangling-With-Python

JamilaHajAhmad

❤️35

Second project in my Data Analyst Nanodegree from Udacity

Jupyter Notebook

Updated 1 year ago

DATA_WRANGLING

seni1

❤️35

Course Outline: Data wrangling is a core skill that everyone who works with data should be familiar with since so much of the world's data isn't clean. Though this course is geared towards those who use Python to analyze data, the high-level concepts can be applied in all programming languages and software applications for data analysis. Lesson 1: The Walkthrough. In the first lesson of this course, we'll walk through an example of data wrangling so you get a feel for the full process. We'll introduce gathering data, then download a file from the web and import it into a Jupyter Notebook. We'll then introduce assessing data and assess the dataset we just downloaded both visually and programmatically. We'll be looking for quality and structural issues. Finally, we'll introduce cleaning data and use code to clean a few of the issues we identified while assessing. The goal of this walkthrough is awareness rather than mastery, so you'll be able to start wrangling your own data even after just this first lesson. Lessons 2-4: Gathering, Assessing, and Cleaning Data (in Detail). In the following lessons, you'll master gathering, assessing, and cleaning data. We'll cover the full data wrangling process with real datasets too, so think of this course as a series of wrangling journeys. You'll learn by doing and leave each lesson with tangible skills. Your In

Jupyter Notebook

Updated 3 years ago

Twitter-Data-Scrapping-Cleaning-and-Analyzing

IddarMehdi

❤️35

Real-world data rarely comes clean. Using Python and its libraries, I gathered data from a variety of sources and in a variety of formats, assessed its quality and tidiness, then cleaned it. This is called the data wrangling process. The dataset was gathered from Twitter user @dog_rates, also known as WeRateDogs. WeRateDogs is a Twitter account that rates people's dogs with a humorous comment about the dog. These ratings almost always have a denominator of 10. In this project, I gathered data from a variety of sources and in a variety of formats: - First, I manually downloaded a .csv file named ‘twitter_archive_enhanced.csv’ and stored it in the ‘archive’ table. - Then, I used the Requests Python library to programmatically download a .tsv file named ‘tweet-image-predictions.tsv’ and stored it in the ‘images’ table. This file contains the results of a neural network's analysis, which predicts a dog's breed based on images. - After this, I created an API object that I used to programmatically download a JSON file, stored as the ‘twitter_counts’ table, which contains additional Twitter data. In the second section of the project, devoted to assessing the data, I first looked for quality issues, which pertain to the content of the data, and identified ten of them; then I examined tidiness issues, which pertain to the structure of the data. In the last section of the wrangling process, I structured and cleaned the dirty data into the desired format for better analysis and visualization using Python and its libraries. For each identified issue, I defined the actions to take before translating those actions into lines of code. I also tested each piece of code to check the result of the cleaning.

HTML

Updated 2 years ago

Real-World-Data-Wrangling-With-Python

MrIzzat

❤️35

Applying the data wrangling process to real-world data.

Updated 1 year ago

real-world-data-wrangling-with-python

Amid68

❤️30

No description available

MIT

Updated 1 year ago

Real-World-Data-Wrangling-with-Python

ngwam

❤️35

In this project, leverage Python and its libraries to collect, assess, and clean real-world data from various sources and formats. Document the entire process in a Jupyter Notebook and present analyses and visualizations using Python for a transparent showcase of the refined dataset.

Updated 2 years ago

Udacity-Project-Real-World-Data-Wrangling-with-Python

sky-adams

🧡65

Data wrangling and analysis of Santa Barbara bird populations and rainfall trends using Python

Jupyter Notebook

Updated 2 days ago

audubon christmas-bird-count data-analysis +3

Data-Eng-Proj

josiahuma

❤️35

Several Python-based ETL projects, data wrangling, and analytics scripts with real-world examples. Visualizations using pandas and Power BI

Jupyter Notebook

Updated 5 months ago

airflow automation database +7

Data_Analysis

DenisMarcher

❤️35

This repository serves as a collection of educational data analysis and projects demonstrating: Data wrangling skills, Visualization techniques, Real-world EDA, Data Sets only. Clear, structured, using Python. All notebooks use real-world, non-synthetic datasets, in accordance with the course requirements.

Jupyter Notebook

Updated 3 months ago

Data_Analysis_Projects-freeCodeCamp

suneelshivanioffical

❤️35

In the Data Analysis with Python course by freeCodeCamp, gain hands-on experience with Python's core data analysis libraries, including Pandas, Matplotlib, and NumPy. Through real-world projects, you’ll learn to clean, manipulate, and visualize data effectively, developing skills in data wrangling, analysis, and visualization.

Jupyter Notebook

Updated 1 year ago

Data-Analytics-and-Science-mini-projects

Emart29

❤️35

Collection of practical data science projects demonstrating end-to-end analytics workflow: from data wrangling and EDA to ML modeling and interactive visualization. Built with Python, SQL, Tableau, and QuickSight to solve real-world business problems.

Jupyter Notebook

Updated 3 months ago

Data-analyst-

prince-std

❤️35

Empower your data-driven decision-making with this comprehensive repository of data analysis projects. Explore a variety of datasets, analyze trends, and visualize insights using Python, Power BI, and other tools. Enhance your data wrangling, analysis, and storytelling skills while gaining hands-on experience with real-world data challenges.

Updated 2 years ago

Data-Science

gowthamkumar9

❤️45

I am a data science enthusiast skilled in turning raw data into meaningful insights and predictive solutions. With strong foundations in Python, Machine Learning, Statistics, and Data Wrangling, I enjoy solving real-world problems using data-driven approaches and building projects that strengthen my analytical thinking and technical expertise.

Jupyter Notebook

Updated 2 months ago

Wrangle-and-Analyze-Data

prast567

❤️35

HTML

Updated 5 years ago

Python_Wrangle-and-Analyze-Data

himanshusharmacu

❤️35

Real-world data rarely comes clean. Using Python and its libraries, I gathered data from a variety of sources and in a variety of formats, assessed its quality and tidiness, then cleaned it. This is called data wrangling. I documented my wrangling efforts in a Jupyter Notebook and showcased them through analyses and visualizations using Python (and its libraries) and/or SQL. The dataset that I wrangled (and analyzed and visualized) is the tweet archive of Twitter user @dog_rates, also known as WeRateDogs. WeRateDogs is a Twitter account that rates people's dogs with a humorous comment about the dog. These ratings almost always have a denominator of 10. The numerators, though? Almost always greater than 10. 11/10, 12/10, 13/10, etc. Why? Because "they're good dogs Brent." WeRateDogs has over 4 million followers and has received international media coverage. WeRateDogs downloaded their Twitter archive and sent it to Udacity via email exclusively for use in this project. This archive contains basic tweet data (tweet ID, timestamp, text, etc.) for all 5000+ of their tweets as they stood on August 1, 2017. More on this soon.

Jupyter Notebook

Updated 6 years ago

Wrangle-and-Analyze-data

joj19968

❤️35

# Wrangle-and-Analyze-data

#### Introduction

Real-world data rarely comes clean. Using Python and its libraries, you will gather data from a variety of sources and in a variety of formats, assess its quality and tidiness, then clean it. This is called data wrangling. You will document your wrangling efforts in a Jupyter Notebook, plus showcase them through analyses and visualizations using Python (and its libraries) and/or SQL.

The dataset that you will be wrangling (and analyzing and visualizing) is the tweet archive of Twitter user @dog_rates, also known as WeRateDogs. WeRateDogs is a Twitter account that rates people's dogs with a humorous comment about the dog. These ratings almost always have a denominator of 10. The numerators, though? Almost always greater than 10. 11/10, 12/10, 13/10, etc. Why? Because "they're good dogs Brent." WeRateDogs has over 4 million followers and has received international media coverage.

WeRateDogs downloaded their Twitter archive and sent it to Udacity via email exclusively for you to use in this project. This archive contains basic tweet data (tweet ID, timestamp, text, etc.) for all 5000+ of their tweets as they stood on August 1, 2017. More on this soon.

### Project Details

Your tasks in this project are as follows:

- Data wrangling, which consists of: Gathering data, Assessing data and Cleaning data.
- Storing, analyzing, and visualizing your wrangled data
- Reporting on 1) your data wrangling efforts and 2) your data analyses and visualizations

#### Gathering Data for this Project

Gather each of the three pieces of data as described below in a Jupyter Notebook titled wrangle_act.ipynb:

- The WeRateDogs Twitter archive. I am giving this file to you, so imagine it as a file on hand. Download this file manually by clicking the following link: twitter_archive_enhanced.csv
- The tweet image predictions, i.e., what breed of dog (or other object, animal, etc.) is present in each tweet according to a neural network. This file (image_predictions.tsv) is hosted on Udacity's servers and should be downloaded programmatically using the Requests library and the following URL: https://d17h27t6h515a5.cloudfront.net/topher/2017/August/599fd2ad_image-predictions/image-predictions.tsv
- Each tweet's retweet count and favorite ("like") count at minimum, and any additional data you find interesting. Using the tweet IDs in the WeRateDogs Twitter archive, query the Twitter API for each tweet's JSON data using Python's Tweepy library and store each tweet's entire set of JSON data in a file called tweet_json.txt file. Each tweet's JSON data should be written to its own line. Then read this .txt file line by line into a pandas DataFrame with (at minimum) tweet ID, retweet count, and favorite count.

Note: do not include your Twitter API keys, secrets, and tokens in your project submission.

#### Assessing Data for this Project

After gathering each of the above pieces of data, assess them visually and programmatically for quality and tidiness issues. Detect and document at least eight (8) quality issues and two (2) tidiness issues in your wrangle_act.ipynb Jupyter Notebook. To meet specifications, the issues that satisfy the Project Motivation (see the Key Points header on the previous page) must be assessed.

#### Cleaning Data for this Project

Clean each of the issues you documented while assessing. Perform this cleaning in wrangle_act.ipynb as well. The result should be a high quality and tidy master pandas DataFrame (or DataFrames, if appropriate). Again, the issues that satisfy the Project Motivation must be cleaned.

#### Storing, Analyzing, and Visualizing Data for this Project

Store the clean DataFrame(s) in a CSV file with the main one named twitter_archive_master.csv. If additional files exist because multiple tables are required for tidiness, name these files appropriately. Additionally, you may store the cleaned data in a SQLite database (which is to be submitted as well if you do).

Analyze and visualize your wrangled data in your wrangle_act.ipynb Jupyter Notebook. At least three (3) insights and one (1) visualization must be produced.

#### Reporting for this Project

Create a 300-600 word written report called wrangle_report.pdf or wrangle_report.html that briefly describes your wrangling efforts. This is to be framed as an internal document.

Create a 250-word-minimum written report called act_report.pdf or act_report.html that communicates the insights and displays the visualization(s) produced from your wrangled data. This is to be framed as an external document, like a blog post or magazine article, for example.
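The gathering step described in this README (one tweet's JSON per line in tweet_json.txt, then read line by line into a table with tweet ID, retweet count, and favorite count) could be sketched roughly as follows. This is a minimal illustration, not the project's reference solution: the function name `read_tweet_json` is hypothetical, and the JSON keys `id`, `retweet_count`, and `favorite_count` are assumed to match the tweet objects returned by the Twitter API.

```python
import json


def read_tweet_json(path):
    """Read a file holding one tweet JSON object per line.

    Returns a list of dicts with the three fields the README asks for.
    The key names on the input side are an assumption about the tweet JSON.
    """
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between tweets
            tweet = json.loads(line)
            records.append(
                {
                    "tweet_id": tweet["id"],
                    "retweet_count": tweet["retweet_count"],
                    "favorite_count": tweet["favorite_count"],
                }
            )
    return records
```

If pandas is installed, passing the result to `pandas.DataFrame(read_tweet_json("tweet_json.txt"))` should give the DataFrame the README describes, since the DataFrame constructor accepts a list of dicts.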

Updated 5 years ago

RealWorldDataWrangling-withPython

mmaayyss20

❤️25

No description available

HTML

Updated 9 months ago

Wrangling

lmatos-803

❤️30

Real World Data Wrangling with Python

Jupyter Notebook

Updated 4 months ago

Real-World-Data-Wrangling-With-Python

MohammadHamo912

❤️35

Real World Data Wrangling With Python

Jupyter Notebook

Updated 1 year ago

AdvancedDataWranglingAndDataModeling

DataAnalytics-ISSS

❤️35

Real World Data Wrangling with Python

Jupyter Notebook

Updated 1 year ago

real-world-data-wrangling-with-python

thiago-grabe

❤️35

Real World Data Wrangling with Python

Jupyter Notebook

Updated 6 months ago

RealWorld_Data_Wrangling

Farha-Dahman

❤️35

Real World Data Wrangling with Python

Jupyter Notebook

Updated 1 year ago

Real-World-Data-Wrangling-with-Python

gkansdine

❤️25

No description available

Jupyter Notebook

Updated 1 year ago

data-visualization matplotlib numpy

Real-World-Data-Wrangling-with-Python

Raghad-Odwan

❤️35

Udacity second project (Data Analyst)

HTML

Updated 5 months ago

Real-World-Data-Wrangling-with-Python

jarvi652

❤️25

No description available

Jupyter Notebook

Updated 1 year ago

Real-World-Data-Wrangling-with-Python

kelseyz1229

❤️25

No description available

Jupyter Notebook

Updated 1 year ago

Real-World-Data-Wrangling-with-Python

AbdalrhmanJuber

❤️25

No description available

Jupyter Notebook

Updated 12 months ago

Real-World-Data-Wrangling-with-Python

tareq-saymeh

❤️25

No description available

Jupyter Notebook

Updated 1 year ago

Real-World-Data-Wrangling-with-Python

sarashrouf

🧡65

Data wrangling project exploring movie characteristics on Netflix vs. general movies dataset using Python and Pandas.

Jupyter Notebook

Updated 5 days ago
