Found 78 repositories (showing 30)
ShahadShaikh
Problem Statement

Introduction: So far in this course, you have learned about the Hadoop framework, RDBMS design, and Hive querying, and you have understood how to work with an EMR cluster and write optimised queries in Hive. This assignment tests the Hive and Hadoop skills you have learned throughout the course. Like a big data analyst, you will be required to extract the data, load it into Hive tables, and gather insights from the dataset.

With online sales gaining popularity, tech companies are exploring ways to improve their sales by analysing customer behaviour and gaining insights about product trends. Websites also make it easier for customers to find the products they need without much scavenging. Needless to say, big data analyst is among the most sought-after job profiles of this decade. Therefore, in this assignment, you will act as a big data analyst and extract data and gather insights from a real-life dataset of an e-commerce company.

One of the most popular use cases of big data is in e-commerce companies such as Amazon or Flipkart. Before getting into the details of the dataset, let us understand how e-commerce companies use these concepts to give customers product recommendations. This is done by tracking your clicks on their website and searching for patterns within them; this kind of data is called clickstream data. Clickstream data contains the logs of how you navigated through the website, along with other details such as the time spent on every page. Companies use data-ingestion frameworks such as Apache Kafka or AWS Kinesis to move this data into storage frameworks such as Hadoop.
From there, machine learning engineers or business analysts use this data to derive valuable insights. In the accompanying video, Kautuk gives a brief idea of the data used in this case study and the kind of analysis you can perform with it.

For this assignment, you will be working with a public clickstream dataset of a cosmetics store. Using this dataset, your job is to extract the kind of valuable insights that data engineers typically come up with in an e-retail company. You will find the data at the links given below:
https://e-commerce-events-ml.s3.amazonaws.com/2019-Oct.csv
https://e-commerce-events-ml.s3.amazonaws.com/2019-Nov.csv
You can find the description of the attributes in the dataset given below.

The implementation phase can be divided into the following parts:
1. Copying the dataset into HDFS: launch an EMR cluster that utilises the Hive service, and move the data from the S3 bucket into HDFS.
2. Creating the database and running Hive queries on your EMR cluster: create the structure of your database, use optimisation techniques to run your queries as efficiently as possible, show the improvement in performance after applying an optimisation to any single query, and run Hive queries to answer the questions given below.
3. Cleaning up: drop your database and terminate your cluster.

You are required to provide answers to the questions given below:
1. Find the total revenue generated by purchases made in October.
2. Write a query to yield the total sum of purchases per month in a single output.
3. Write a query to find the change in revenue generated by purchases from October to November.
4. Find the distinct categories of products. Categories with a null category code can be ignored.
5. Find the total number of products available under each category.
6. Which brand had the maximum sales in October and November combined?
7. Which brands increased their sales from October to November?
8. Your company wants to reward the top 10 users of its website with a Golden Customer plan. Write a query to generate a list of the top 10 users who spend the most.

Note: To write your queries, please make the necessary optimisations, such as selecting an appropriate table format and using partitioned/bucketed tables. You will be awarded marks for enhancing the performance of your queries. Each question should be answered with a single query.

Use a 2-node EMR cluster with both the master and core nodes as m4.large instances, and make sure you terminate the cluster when you are done working with it. Since an EMR cluster can only be terminated, not stopped, always keep a copy of your queries in a text editor so that you can paste them in every time you launch a new cluster. Do not leave PuTTY idle for too long; do some activity, such as pressing the space bar at regular intervals. If the terminal does become inactive, you do not have to start a new cluster: reconnect to the master node by opening the PuTTY terminal again, entering the host address, and loading the .ppk key file. For your information, certain queries may take longer on emr-6.x releases, so we suggest using the emr-5.29.0 release for this case study.

There are different options for storing data in an EMR cluster; you can briefly explore them in this link. In the previous module on Hive querying, you copied the data to the local file system, i.e., the master node's file system, and performed the queries there. Since the dataset in this case study is large, it is good practice to load the data into HDFS rather than into the local file system. You can revisit the segment on 'Working with HDFS' from the earlier module, 'Introduction to Big Data and Cloud'.
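The partitioning optimisation suggested above can be sketched in HiveQL roughly as follows. This is a hedged sketch, not the assignment's required solution: the table and column names (sales_events_raw, sales_events_part, event_time, price, etc.) are assumptions, and you would adjust them to the dataset's actual attribute list.

```sql
-- Hypothetical sketch: a month-partitioned ORC table so that queries
-- filtering on a single month scan only one partition.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

CREATE TABLE IF NOT EXISTS sales_events_part (
    event_time STRING,
    event_type STRING,
    product_id STRING,
    brand      STRING,
    price      DOUBLE,
    user_id    STRING
)
PARTITIONED BY (event_month STRING)
STORED AS ORC;

-- Load from an assumed raw staging table, deriving the partition key
-- from the leading 'YYYY-MM' of the timestamp string.
INSERT OVERWRITE TABLE sales_events_part PARTITION (event_month)
SELECT event_time, event_type, product_id, brand,
       CAST(price AS DOUBLE), user_id,
       SUBSTR(event_time, 1, 7) AS event_month
FROM sales_events_raw;

-- A month-filtered aggregate now prunes to a single partition:
SELECT SUM(price)
FROM sales_events_part
WHERE event_month = '2019-10' AND event_type = 'purchase';
```

Comparing the run time of the same aggregate against the unpartitioned staging table is one way to demonstrate the performance improvement the note asks for.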
You may have to use CSVSerde with the default property values to load the dataset into a Hive table; you can refer to this link for more details on using CSVSerde. You may also want to keep the column names (the header row) from being inserted into the Hive table; you can refer to this link on how to skip headers.
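Put together, the loading steps above might look like the following sketch. The database, table, and column names are assumptions based on a typical clickstream schema, not the assignment's exact attribute list; note that OpenCSVSerde reads every column as STRING, so numeric columns need casting in queries.

```sql
-- Hypothetical sketch: external table over the CSV copied into HDFS,
-- using OpenCSVSerde with default properties and skipping the header row.
CREATE DATABASE IF NOT EXISTS ecommerce;
USE ecommerce;

CREATE EXTERNAL TABLE IF NOT EXISTS sales_events_raw (
    event_time    STRING,
    event_type    STRING,
    product_id    STRING,
    category_id   STRING,
    category_code STRING,
    brand         STRING,
    price         STRING,  -- OpenCSVSerde yields STRING; CAST when querying
    user_id       STRING,
    user_session  STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE
LOCATION '/user/hadoop/ecommerce/raw'
TBLPROPERTIES ('skip.header.line.count' = '1');

-- Example in the spirit of question 1: total revenue from October purchases.
SELECT SUM(CAST(price AS DOUBLE)) AS october_revenue
FROM sales_events_raw
WHERE event_type = 'purchase'
  AND SUBSTR(event_time, 1, 7) = '2019-10';
```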
I analysed the data analyst jobs dataset to find insights such as the most in-demand data analyst jobs, the most competitive ones, and so on.
LinkedIn trending jobs analysis BI project
AnshulSilhare
Python analysis of 100K+ US Data & Business Analyst job postings - skills demand, salary trends, and market insights.
MalayBhunia
🚀 Automated end-to-end pipeline to scrape and analyze 1,500+ Naukri.com job listings. Features custom Selenium web scraping, data cleaning with Pandas/Regex, and EDA to uncover real-time trends in Data Science, Engineering, and Analytics roles in India.
No description available
pavannagula
This project analyzes the US data science job market to gain insights into its growth, demand, and key trends. Through exploratory data analysis (EDA), I've explored various aspects such as job titles, industries, required skills, salaries, and educational requirements related to the data science field.
sharbanee7781
Analyzed the US data job market with Python to uncover in-demand and high-paying skills for Data Analysts. Used Pandas, Seaborn, and Matplotlib for data cleaning and visualization. Insights include skill demand trends, salary analysis, and optimal skills for career growth.
sulebalaban
No description available
ashwinijujare
Analysis of job market data specific to data-related jobs, visualizing the insights using Power BI
Data Roles US Job Market Analysis using Python - 2023 data
No description available
smerdov
Analysis of job postings in the US related to data science
Palaniappansvs
A visual analysis of the US data science job market using Tableau and Python
An analysis of the US data analyst job market. Includes data cleaning, salary parsing, and text analysis of job descriptions to identify key skills.
Cynthiaudoye
"Data Science Jobs and Salaries Analysis Skills" offers insights on job preferences, salaries, and top hiring locations in data fields, aimed at guiding job seekers and employers in the UK, US, and Canada.
Jadonsofficiall
The project is six weeks of hands-on work for the different subgroups in Data Community Africa. The aim is to hone the technical and creative skills within the community, helping us grow our portfolios and become job-ready. It entails helping each of us learn how to approach a data analysis project.
georges-17
This is a data analysis project in which I used the FRED API to extract data about unemployment in the USA. FRED stands for Federal Reserve Economic Data. I extracted, cleaned, and wrangled the data in order to visualize and understand the job market and what it can affect in the US.
thorveakshay
For the project setup, we configured a Hadoop cluster on an AWS EC2 instance; the cluster has one master node and two slave nodes. The agenda of this project is to take a large dataset, reduce it using Hadoop jobs, and then represent the data using bar and pie charts for easier analysis by the user. The dataset used in our application is not specific: any large dataset can be run through the Hadoop jobs to produce the required output. We have used US voting poll data as our dataset, around 4,000 records with over 70 columns, in the form of a .csv file. The CSV file is imported directly into the Hadoop cluster by the master and then sent to the mapper and later to the reducer. The data is then categorized into four categories: Democratic, Republican, US Voting Poll 2012, and US Voting Poll 2016. The categorized data is then represented in the form of charts, along with the sorted output generated by the Hadoop jobs.
christianchimezie
EDUCATION FOR ALL, a charity, aims to boost donations for the coming year and needs a fundraising strategy to accomplish this. The team will meet in two weeks to plan for the following year. As a data analyst, I was given two datasets to work with: Donation Data and Donor Data. The Donation Data set includes: donor ID, first name, last name, email address, gender, job field, donation amount, state of residence (US), and t-shirt size. The Donor Data set includes: donor ID, donation frequency, university, car make, second language, favourite colour, and favourite film genre. I was tasked with extracting insights and patterns from the datasets provided and developing a strategy for increasing the charity's donations the following year. I also used Tableau to visualize the data and uncover more hidden insights, and employed the Five Whys of root cause analysis to explore the underlying causes and effects of the problem. From the analysis, it is suggested that EDUCATION FOR ALL adopt recurring donations, as they provide a consistent, steady, predictable source of income, and manage relationships with high-value donors by engaging and contacting them regularly to keep them in the loop.
Data analytics isn't just about the future; it is being put to use at this very moment in all businesses. It forms an integral part of the company, and the professionals are paid highly for their part. Here are reasons why joining data analytics training in Gurgaon is a viable option. After completing the Data Analytics course, you will be able to:
- Understand Scala and Apache Spark implementation
- Run Spark operations on the Spark shell
- Work with the Spark driver and its related worker nodes
- Integrate Spark with Flume
- Set up a data pipeline using Apache Flume, Apache Kafka, and Spark Streaming
- Use Spark RDDs and Spark Streaming
- Use Spark MLlib to create classifiers and recommendation systems
- Apply Spark core concepts: creating RDDs (parallel RDDs, MappedRDD, HadoopRDD, JdbcRDD), Spark architecture and components
- Use Spark SQL with CSV, XML, and JSON; read data from different Spark sources; work with Spark SQL and DataFrames
- Develop and implement various machine learning algorithms in daily practice and a live environment
- Build recommendation systems and classifiers
- Perform various types of analysis (prediction and regression)
- Implement plots and graphs using various machine learning libraries
- Import data from HDFS and implement various machine learning models
- Build different neural networks using NumPy and TensorFlow
- Work with Power BI: visualization, components, transformations, DAX functions, data exploration and mapping, designing dashboards, time series, aggregation, and filters

Placement: Gyansetu provides a complimentary placement service to all students. The Gyansetu placement team consistently works on industry collaborations and associations, which help our students find their dream job right after completing the training.

Why choose us? Gyansetu trainers are well known in the industry; they are highly qualified and currently working in top MNCs. We provide interaction with faculty before the course starts. Our experts help students learn the technology from the basics; even if you are not good at basic programming, don't worry, we will help you. Faculty will help you prepare project reports and presentations, and students will be provided mentoring sessions by experts.
What Makes a QlikView Developer? Most programming languages you can simply go and learn, but being a QlikView developer requires something different: you have to have skills in a large number of areas, and a failing in one can negate brilliance in another. IQ Online Training provides comprehensive QlikView online training with placement: http://www.iqonlinetraining.com/qlikview-online-training/

Coding Skills: Since some people think this is the full extent of the requirements, it is where I shall start. Solid skills in this area are a must, as the QlikView developer needs to code in four or five different languages at practically the same time: the load script syntax, SQL statements (in various flavours), QlikView expressions, Set Analysis, and then VBA if you need macros or advanced automation. While it is true that good SQL knowledge will get you through the load script, and those with good Excel skills will find nothing to fear in expressions, it is the ability to do it all at once that is vital. The discipline to work in a case-sensitive environment and the vision to conceive solutions to problems where the standard methods do not deliver are also essential.

Understanding Data: There are QlikView documents where the data simply hasn't been understood. Classic errors include summing percentages and choosing the wrong chart (often a data-understanding issue rather than a design one). A degree in maths or statistics is not necessarily needed, just a good understanding of (and respect for) the rules of data.

Understanding the Business: For a QlikView document (or set of documents) to be of maximum use to a business, it should present insight from every area of that business. This requires the QlikView developer to have a real understanding of the business and the processes that make it run. Even if the developer is not the analyst, converting the requirements into inspired visualizations takes several levels of knowledge.

Experience Across Companies: Someone who has been in a job for some time will know that organization well (ticking the box above), but this can lead to a tendency not to think outside the box. All too often I am asked to replicate existing Crystal or Excel reports for a client because "that is what we regularly report". Also, there is an extensive difference in what is important to people in different types of businesses. Only experience across many vertical markets can equip the QlikView developer to know what is likely to be required on any particular dashboard they create.

Best Practices: There are a number of evolving best practices that can make a great impact on the performance and usability of a QlikView document. These span data model design concepts, coding standards, and interface design rules. The QlikView developer needs to keep up with all of these best practices in order to stay on top of their game.

The Ability to Ask: Often the requirements set for the developer of a QlikView document are sparse; in one instance I was given little more than a database connection string to use. Having built many solutions gives you an idea of the sort of thing people are likely to want to see, but it's easy to assume wrong. The only way to get things right is to sit the main sponsors down with a pad of paper or a flip chart and ask them what it is that they need to achieve.

People Skills: This means that as the person coding a dashboard you also need people skills, not always the first thing you associate with someone building IT solutions. However, gaining the trust of the IT department and challenging the assumptions of the finance director requires just that. A typical QlikView project should include people from many areas of a business, and the person doing the build is often required to be the lynchpin between these individuals.

Conclusion: So to sum up, the QlikView developer has to be someone who owns many different skills, and knows when to apply each of them. Added to what I have laid out above, there are also those skills required in just about any role these days: the ability to work under pressure and to insane deadlines. It sounds like a pretty tall order, but it is a challenge that we QlikView developers love to try to meet on a daily basis. If you want more information about the QlikView online training course, visit our website: http://www.iqonlinetraining.com/qlikview-online-training/ Contact us: +1 904-304-2519. Email: info@iqtrainings.com
LaptopServiceHub
Lenovo Laptop Service Hub: Are you looking for quality laptop repair services at reasonable rates? If yes, contact our Lenovo Laptop Service Hub in Hyderabad. We serve customers with the best and longest-lasting repair services, leaving them completely satisfied. Lenovo Laptop Service Hub has been working as the top Lenovo laptop repair service centre for the last 16+ years, with years of experience handling all sorts of laptop problems. To do our job, we have a team of highly skilled technicians with years of experience solving issues with Lenovo laptops. We offer our customers a doorstep service through which they can get repairs at their home or office. Our services are provided after an in-depth analysis of your laptop: our certified engineers use the newest tools and techniques to fix the problem, and our technicians use only genuine spare parts for any kind of replacement. Lenovo Laptop Service Hub believes in giving complete service satisfaction to our customers, and our reliable, dependable services are available at highly affordable rates. Whether your laptop faces a hardware problem or a software issue, we will solve everything in a timely manner. If you are looking for a highly qualified service professional to repair your Lenovo laptop, you can contact our experienced and highly trained technicians anytime: call our service engineers directly and get the required solution in no time. Lenovo Laptop Service Hub is widely known for quick and effective laptop repair at your doorstep. Call our technicians today.

Laptop Service Hub in Hyderabad: The Laptop Service Hub in Hyderabad provides high-quality and timely service to customers, with a team of dedicated and well-trained technicians committed to providing support. Various hardware and software issues can slow a laptop's speed and performance; these issues will be resolved with care and maintenance. The Laptop Service Hub in Hyderabad is available to all customers, and the repair services are just a call away. Our executives will connect with you and try to resolve the issue with a customer-centric and user-friendly approach. If you ever find issues with your laptop, it is easy to resolve them with the support of our technicians; if your product is under warranty, you may call the experts on the phone to resolve the issue. There is growing demand for professional and reliable service providers in Hyderabad, and the Laptop Service Hub in Hyderabad is one of the most renowned and trustworthy centres providing services across the region. The approach is friendly and customer-centric, and has earned the trust of countless clients. Our expert technicians provide honest advice to customers to solve each issue in the best possible manner. The centre has the proper infrastructure and equipment to provide the best services, and technicians will visit your home or office to analyse and resolve the issue. Our primary focus is to provide services suited to individual needs at a reasonable cost; the main goal is repairing the laptop efficiently with a professional approach, quick response times, telephone support, and customer satisfaction. Laptop Service Hub in Hyderabad provides doorstep services at a cost-effective price: if our technicians fail to repair the laptop, the customer need not pay anything, whereas other companies will still take a service charge. We handle various laptop issues such as battery, data recovery, DC jack, laptop hacks, keyboard issues, liquid spills, motherboard, and overheating. Our technicians repair dead laptops in front of the customer. It doesn't matter how much money you spent on the laptop; it is a necessary part of your life. Call us and an expert will come to your place to repair the laptop at a reasonable cost. Our technicians provide 24/7 service. Satisfaction is crucial, and the job is done with full effort, so never hesitate to get your laptop repaired by us. To fix your laptop, visit www.laptopservicehub.com or book an appointment through a call.
ericyang94
US data science job market analysis
Sandeep101drj
No description available
Mohamed-Nabash
As a job seeker, I’m often surprised by the lack of data exploring the most ideal data science jobs and skills in the market. I set out to understand what skills top employers are looking for and how to earn higher salaries.
Achyuthkumar4756
No description available
aditinamdev
No description available
Rutik-Man-M0de
Exploratory Data Analysis (EDA) of AI Job Market dataset using Python, Pandas, NumPy, Matplotlib, and Seaborn to uncover trends in salaries, skills, job types, and hiring patterns.
No description available