Found 13 repositories (showing 13)
PacktPublishing
Python Data Cleaning Cookbook, published by Packt
PacktPublishing
Python Data Cleaning Cookbook, Second Edition - Published by Packt
shaynley
This is a walkthrough of basic text analytics in Python, working with an excerpt from a cookbook from 1871. (Covers text data cleaning and pre-processing, stop word removal, tokenization, lemmatization vs. stemming, word frequency, visualizations, as well as lexical density and part-of-speech (POS) distribution).
Aryia-Behroziuan
Tutorials

This is a guide to many pandas tutorials, geared mainly for new users.

Internal Guides

pandas own 10 Minutes to pandas

More complex recipes are in the Cookbook

pandas Cookbook

The goal of this cookbook (by Julia Evans) is to give you some concrete examples for getting started with pandas. These are examples with real-world data, and all the bugs and weirdness that that entails.

Here are links to the v0.1 release. For an up-to-date table of contents, see the pandas-cookbook GitHub repository. To run the examples in this tutorial, you’ll need to clone the GitHub repository and get IPython Notebook running. See How to use this cookbook.

- A quick tour of the IPython Notebook: Shows off IPython’s awesome tab completion and magic functions.
- Chapter 1: Reading your data into pandas is pretty much the easiest thing. Even when the encoding is wrong!
- Chapter 2: It’s not totally obvious how to select data from a pandas dataframe. Here we explain the basics (how to take slices and get columns)
- Chapter 3: Here we get into serious slicing and dicing and learn how to filter dataframes in complicated ways, really fast.
- Chapter 4: Groupby/aggregate is seriously my favorite thing about pandas and I use it all the time. You should probably read this.
- Chapter 5: Here you get to find out if it’s cold in Montreal in the winter (spoiler: yes). Web scraping with pandas is fun! Here we combine dataframes.
- Chapter 6: Strings with pandas are great. It has all these vectorized string operations and they’re the best. We will turn a bunch of strings containing “Snow” into vectors of numbers in a trice.
- Chapter 7: Cleaning up messy data is never a joy, but with pandas it’s easier.
- Chapter 8: Parsing Unix timestamps is confusing at first but it turns out to be really easy.

Lessons for New pandas Users

For more resources, please visit the main repository.

01 - Lesson:
- Importing libraries
- Creating data sets
- Creating data frames
- Reading from CSV
- Exporting to CSV
- Finding maximums
- Plotting data

02 - Lesson:
- Reading from TXT
- Exporting to TXT
- Selecting top/bottom records
- Descriptive statistics
- Grouping/sorting data

03 - Lesson:
- Creating functions
- Reading from EXCEL
- Exporting to EXCEL
- Outliers
- Lambda functions
- Slice and dice data

04 - Lesson:
- Adding/deleting columns
- Index operations

05 - Lesson:
- Stack/Unstack/Transpose functions

06 - Lesson:
- GroupBy function

07 - Lesson:
- Ways to calculate outliers

08 - Lesson:
- Read from Microsoft SQL databases

09 - Lesson:
- Export to CSV/EXCEL/TXT

10 - Lesson:
- Converting between different kinds of formats

11 - Lesson:
- Combining data from various sources

Practical data analysis with Python

This guide is a comprehensive introduction to the data analysis process using the Python data ecosystem and an interesting open dataset. There are four sections covering selected topics as follows:
- Munging Data
- Aggregating Data
- Visualizing Data
- Time Series

Excel charts with pandas, vincent and xlsxwriter
- Using Pandas and XlsxWriter to create Excel charts

Various Tutorials
- Wes McKinney’s (pandas BDFL) blog
- Statistical analysis made easy in Python with SciPy and pandas DataFrames, by Randal Olson
- Statistical Data Analysis in Python, tutorial videos, by Christopher Fonnesbeck from SciPy 2013
- Financial analysis in python, by Thomas Wiecki
- Intro to pandas data structures, by Greg Reda
- Pandas and Python: Top 10, by Manish Amde
- Pandas Tutorial, by Mikhail Semeniuk

pandas 0.15.2 documentation. © Copyright 2008-2014, the pandas development team
realJohnLK
Exercise for the book Python Data Cleaning Cookbook
ttaylor1248
No description available
peacount
No description available
abbaskhan06
No description available
uy-nguyen00
No description available
rbsmotta
Study of the book "Python Data Cleaning Cookbook"
dat-lequoc
No description available
abriyanyusuf
This repository contains exercise files that I created in Google Colab while following the Python Data Cleaning Cookbook, written by Michael Walker and published by Packt Publishing Ltd in 2020. The files used in the exercises were obtained from the book's GitHub repository.
ejayaraman
Jupyter Notebooks from Python Data Cleaning Cookbook