Found 124 repositories (showing 30)
Data Science has been ranked as one of the hottest professions and the demand for data practitioners is booming. This Professional Certificate from IBM is intended for anyone interested in developing skills and experience to pursue a career in Data Science or Machine Learning. This program consists of 9 courses providing you with the latest job-ready skills and techniques covering a wide array of data science topics including open source tools and libraries, methodologies, Python, databases, SQL, data visualization, data analysis, and machine learning. You will practice hands-on in the IBM Cloud using real data science tools and real-world data sets. It is a myth that to become a data scientist you need a Ph.D. This Professional Certificate is suitable for anyone who has some computer skills and a passion for self-learning. No prior computer science or programming knowledge is necessary. We start small, reinforce applied learning, and build up to more complex topics. Upon successfully completing these courses you will have done several hands-on assignments and built a portfolio of data science projects, giving you the confidence to plunge into an exciting profession in Data Science. In addition to earning a Professional Certificate from Coursera, you will also receive a digital badge from IBM recognizing your proficiency in Data Science.
kevilkhadka
This repo contains work from all courses of the IBM Data Science Professional Certificate, covering a wide array of data science topics including open source tools and libraries, methodologies, Python, databases, SQL, data visualization, data analysis, and machine learning. You will practice hands-on in the IBM Cloud using real data science tools and real-world data sets.
mesbahiba
Gain the job-ready skills for an entry-level data analyst role through this eight-course Professional Certificate from IBM and position yourself competitively in the thriving job market for data analysts, which is projected to grow 20% through 2028 (U.S. Bureau of Labor Statistics). Power your data analyst career by learning the core principles of data analysis and gaining hands-on skills practice. You’ll work with a variety of data sources, project scenarios, and data analysis tools, including Excel, SQL, Python, Jupyter Notebooks, and Cognos Analytics, gaining practical experience with data manipulation and applying analytical techniques.
fatihilhan42
IBM project: SpaceX launch analysis in Python (gather data - data wrangling - sql and visualization data analysis - prediction model - dashboard - final report)
Parisaroozgarian
The IBM Data Analyst Professional Certificate, consisting of 9 courses, equips learners with essential skills in Excel, SQL, Python, data visualization, and data analysis techniques.
Willie-Conway
Professional portfolio showcasing the IBM Data Analyst Certificate journey 🏆. Features 8 courses, 50+ labs, interactive dashboards 📊, SQL projects 🗄️, Python data analysis 🐍, and Generative AI applications 🤖. Demonstrates end-to-end analytics expertise for recruiters and hiring managers!
ERAMITDHOMNE
hadoop-projects

IBM stock project:
- Get the IBM stock dataset
- Clean the dataset
- Load the dataset onto HDFS
- Build a MapReduce program
- Process/analyse the results

Hadoop setup:
- Run a single-node Hadoop cluster (/usr/local/Cellar/hadoop)
- Check: https://www.slideshare.net/SunilkumarMohanty3/install-apache-hadoop-on-mac-os-sierra-76275019 and http://zhongyaonan.com/hadoop-tutorial/setting-up-hadoop-2-6-on-mac-osx-yosemite.html
- Web UI: http://localhost:50070/dfshealth.html#tab-overview
- Start: hstart

Hadoop commands:
- List files: hadoop fs -ls
- Create a directory: hadoop fs -mkdir /hbp
- Upload a file to HDFS: hadoop fs -put <localsrc> ... <HDFS_dest_Path>
- Browse: http://localhost:50070/explorer.html#/hbp/ibm-stock

Dataset head: date - opening stock quote - high - low - traded volume - closing price. Clean the dataset with awk, sed, and grep.

Run the program:
- Copy the jar to Hadoop
- Run on the Hadoop system: hadoop jar /hbp/ibm-stock/ibm-stock-1.0-SNAPSHOT.jar /hbp/ibm-stock/ibm-stock.csv /hbp/ibm-stock/output
- Check the output dir: hadoop fs -ls /hbp/ibm-stock/output
- Copy the file from HDFS to the local file system: hadoop fs -get /hbp/ibm-stock/output/part-r-00000 home/Users/hien/results.csv
- Check: head home/Users/hien/results.csv

Customer Analysis:
- Collect data: customer master data (MySQL), logs (text file), Twitter feeds (JSON)
- Load data from the data sources into HDFS
- Munge the data
- Create tables in Hive to store the data in format
- Query and join tables
- Export the data
- Set up the stack: Hortonworks Data Platform (HDP); install the HDP 2.3 sandbox (Hive, Sqoop)

Fraud Detection system:
- Clean the dataset and create a model using Spark and Hadoop
- Problem: predict whether a payment transaction is suspect
- Build the model: find the relevant fields

Apache Spark 2:
- Spark ecosystem: Spark Core, Spark Streaming, Spark SQL, MLlib, GraphX, SparkR
- Spark web UI: navigate to localhost:4040
- Run the shell: $SPARK_HOME/bin/spark-shell

Word count:
- Create a pair RDD: val pairRDD = stringRdd.map(s => (s, 1))
- Run reduceByKey to count the occurrences of each word: val wordCountRDD = pairRDD.reduceByKey((x, y) => x + y)
- Run collect to see the result: val wordCountList = wordCountRDD.collect

Find the sum of even integers:
- Create an RDD of integers: val intRDD = sc.parallelize(Array(1, 4, 5, 6, 7, 10, 15))
- Filter even numbers from the RDD: val evenNumbersRDD = intRDD.filter(i => i % 2 == 0)
- Sum the even numbers from the RDD: val sum = evenNumbersRDD.sum

Count the number of words in a file:
- Read the text file: cat people.txt
- Read the file from the Apache Spark shell: val file = sc.textFile("/usr/local/spark/examples/src/main/resources/people.txt")
- Flatten the file, process and split it into words: val flattenFile = file.flatMap(s => s.split(", "))
- Check the contents of the RDD: flattenFile.collect
- Count all words in the RDD: val count = flattenFile.count

Working with Data and Storage: not yet covered — part 4 (RDD transformations)
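The Spark word-count pipeline in the notes above (flatMap, map to pairs, reduceByKey) can be sketched in plain Python to show the logic without a Spark cluster. This is a stand-in for the Scala RDD calls, not actual Spark; the function name and sample input are illustrative.

```python
from collections import defaultdict

def word_count(lines):
    """Mimic the Spark pipeline: flatMap -> map to (word, 1) -> reduceByKey."""
    words = [w for line in lines for w in line.split()]  # flatMap: split lines into words
    pairs = [(w, 1) for w in words]                      # map: s => (s, 1)
    counts = defaultdict(int)
    for w, n in pairs:                                   # reduceByKey: (x, y) => x + y
        counts[w] += n
    return dict(counts)

result = word_count(["to be or", "not to be"])
print(result)
```

In Spark the reduce step runs in parallel across partitions; here a single dictionary plays the same associative-combine role.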
Vowles-Data-Scientist
Data Science is one of the hottest professions of the decade and the demand for data scientists who can analyze data and communicate results to inform data driven decisions has never been greater. This Professional Certificate from IBM will help anyone interested in pursuing a career in data science or machine learning develop career-relevant skills and experience. The program consists of 10 online courses that will provide you with the latest job-ready tools and skills, including open source tools and libraries, Python, databases, SQL, data visualisation, data analysis, statistical analysis, predictive modelling, and machine learning algorithms. You’ll learn data science through hands-on practice in the IBM Cloud using real data science tools and real-world data sets. This Professional Certificate has a strong emphasis on applied learning. Except for the first course, all other courses include a series of hands-on labs in the IBM Cloud that will give you practical skills with applicability to real jobs, including: Tools: Jupyter / JupyterLab, GitHub, R Studio, and Watson Studio Libraries: Pandas, NumPy, Matplotlib, Seaborn, Folium, ipython-sql, Scikit-learn, SciPy, etc. Projects: random album generator, predict housing prices, best classifier model, predicting successful rocket landing, dashboard and interactive mapping
foxy7372
Data Analysis and Prediction of Falcon 9 Rocket Landings using Python, SQL, and Machine Learning (IBM Data Science Capstone Project).
Antoinechss
Repository documenting the completion of the IBM Data Science Professional Certificate, covering data analysis, machine learning, Python, SQL, and applied data science workflows.
atahabilder1
Predicting SpaceX Falcon 9 first stage landing success using ML. Features API data collection, web scraping, SQL analysis, interactive Folium maps, and model comparison (Decision Tree: 87.5% accuracy). IBM Data Science Professional Certificate Capstone.
RakeshsarmaKarra
This repository contains my hands-on lab work and projects completed as part of the Data Science Professional Certificate offered by IBM | Coursera. The certificate consists of 10 courses covering various aspects of data science, including Python, SQL, data analysis, and visualization.
fabriziolufe
The program consists of 9 online courses that will provide you with the latest job-ready tools and skills, including open source tools and libraries, Python, databases, SQL, data visualization, data analysis, statistical analysis, predictive modeling, and machine learning algorithms. You’ll learn data science through hands-on practice in the IBM Cloud.
Farhad-Davaripour
In this repository, a few hands-on practice learning labs for data science are presented. These labs are built as a part of IBM Data Science Professional Certificate. The IBM Data Science program consists of 10 online courses that will provide the most updated tools and skills including open source tools and libraries, Python, databases, SQL, data visualization, data analysis, statistical analysis, predictive modelling, and machine learning algorithms. The skeleton of the labs is provided within the online courses.
olusegun18
This specialization consists of nine courses and a capstone project, all of which I duly completed. I learnt a wide range of tools and skills such as Python, SQL, data visualization, data analysis, predictive modeling, and machine learning algorithms. Hands-on practice was carried out on the IBM Cloud using real data science tools and real-world data sets.
hardiknir
This repository highlights my certifications from platforms like Coursera and IBM, along with key projects such as the Adidas Sales Analysis. It showcases my skills in business analytics, data visualization, and tools like Excel, SQL, and Tableau, demonstrating my ability to turn data into actionable insights.
Willie-Conway
📊 A comprehensive showcase of projects and skills from the IBM Business Intelligence Analyst Professional Certificate! 📈 Features include: 📉 Tableau dashboards, 📊 Excel analytics, 🗄️ SQL database querying, 📐 statistical analysis, 🤖 AI-enhanced BI workflows, and 📋 interactive data visualizations.
This project focuses on analyzing employee attrition using the IBM HR Analytics dataset. The goal is to uncover factors influencing attrition, perform exploratory data analysis (EDA), integrate SQL queries, and build machine learning models for prediction. A Streamlit dashboard was also developed for interactive exploration.
vafiyanaznin
For Data Analysts and Data Scientists, Python has many advantages. A huge range of open-source libraries make it an incredibly useful tool for any Data Analyst. We have pandas, NumPy and Vaex for data analysis, Matplotlib, seaborn and Bokeh for visualisation, and TensorFlow, scikit-learn and PyTorch for machine learning applications (plus many, many more). With its (relatively) easy learning curve and versatility, it's no wonder that Python is one of the fastest-growing programming languages out there. So if we're using Python for data analysis, it's worth asking - where does all this data come from? While there is a massive variety of sources for datasets, in many cases - particularly in enterprise businesses - data is going to be stored in a relational database. Relational databases are an extremely efficient, powerful and widely-used way to create, read, update and delete data of all kinds. The most widely used relational database management systems (RDBMSs) - Oracle, MySQL, Microsoft SQL Server, PostgreSQL, IBM DB2 - all use the Structured Query Language (SQL) to access and make changes to the data. Note that each RDBMS uses a slightly different flavour of SQL, so SQL code written for one will usually not work in another without (normally fairly minor) modifications. But the concepts, structures and operations are largely identical. This means for a working Data Analyst, a strong understanding of SQL is hugely important. Knowing how to use Python and SQL together will give you even more of an advantage when it comes to working with your data.
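The workflow described above, pulling data out of a relational database with SQL and analyzing it in Python, can be sketched with the standard-library sqlite3 module and pandas. The table name, columns, and rows here are purely illustrative; with a production RDBMS you would swap the connection for the appropriate driver (e.g. a MySQL or PostgreSQL connector), but the `read_sql_query` pattern stays the same.

```python
import sqlite3
import pandas as pd

# Build a small in-memory database (table and data are illustrative)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ana", "Data", 70000), ("Ben", "Data", 65000), ("Cy", "HR", 50000)],
)

# SQL does the filtering/aggregation; pandas receives the result for analysis
df = pd.read_sql_query(
    "SELECT department, AVG(salary) AS avg_salary "
    "FROM employees GROUP BY department ORDER BY department",
    conn,
)
print(df)
conn.close()
```

Pushing the aggregation into SQL keeps the data transfer small; pandas then takes over for visualization, statistics, or modeling on the reduced result set.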
amolpattill
Java Database Connectivity (JDBC) is an application program interface (API) packaged with the Java SE edition that makes it possible to standardize and simplify the process of connecting Java applications to external, relational database management systems (RDBMSs). Fundamentally, applications written in Java perform logic. The Java language provides facilities for performing iterative logic with loops, conditional logic with if statements, and object-oriented analysis through the use of classes and interfaces. But Java applications do not store data persistently. Data persistence is typically delegated to NoSQL databases such as MongoDB and Cassandra, or to relational databases such as IBM’s DB2, Microsoft’s SQL Server, or the popular open source database MySQL. JDBC interfaces, classes and components: the JDBC API is composed of a number of interfaces and classes that represent a connection to the database, provide facilities for sending SQL queries to a database, and help Java developers process the results of relational database interactions.
phinglaspure123
Analysing an IBM dataset using SQL queries inside a Jupyter Notebook.
erdemszr
SQL projects and hands-on labs completed as part of the IBM "Databases and SQL for Data Science with Python" course. Includes real-world data analysis using SQLite and Jupyter notebooks. All code and outputs created by Erdem Sezer for learning and portfolio purposes.
Saadjassal
No description available
AjayMali-IITR
No description available
Csharpscale
SQL-based analysis of IBM HR Employee Attrition dataset using MySQL, including segmentation, window functions, and business insights.
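A window-function query of the kind this repo describes can be sketched against a toy attrition table. SQLite (bundled with Python) is used here as a stand-in for MySQL; the `RANK() OVER (PARTITION BY ...)` syntax shown is shared by both engines, and the table, columns, and rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attrition (employee TEXT, department TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO attrition VALUES (?, ?, ?)",
    [("A", "Sales", 50000), ("B", "Sales", 60000), ("C", "R&D", 70000)],
)

# Rank employees by salary within each department (window function)
rows = conn.execute(
    "SELECT employee, department, "
    "RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank "
    "FROM attrition ORDER BY department, dept_rank"
).fetchall()
for row in rows:
    print(row)
conn.close()
```

Unlike GROUP BY, the window function keeps one output row per input row, so per-employee detail survives alongside the per-department ranking.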
This is the complete Jupyter Notebook showcasing the activity's correct answers.
nitinvishwakarma
No description available
No description available
aishwarya0404
No description available