Found 11,754 repositories (showing 30)
No description available
Madhuarvind
A complete exploratory data analysis (EDA) and forecasting project focused on retail sales data. The project identifies key sales patterns, seasonal trends, and builds predictive models to forecast future demand at the item-store level.
abhay01-kumar
No description available
Subash-d
📊 Retail Sales Analysis: A data-driven project focused on analyzing retail sales performance across different products, categories, and time periods. This repository includes data cleaning, visualization, and actionable insights using Python and SQL.
Built a machine learning model to predict weekly sales with 97.4% accuracy. Integrated Exploratory Data Analysis tools to analyze trends, patterns, and actionable insights. The solution enables detailed sales comparisons, evaluates feature impacts and ranges, and identifies top performers, greatly enhancing decision-making in the retail industry.
The primary objective of this project is to develop a cutting-edge forecasting model utilizing advanced machine-learning algorithms and sophisticated time-series analysis techniques. The model aims to deliver precise predictions of future sales across diverse retail outlets.
ADARSH1805
No description available
joshuatochinwachi
I analyzed retail data (2022–2023) using Python (ETL) and SQL. Identified top products, regional trends, and sales growth, providing actionable insights to optimize logistics, boost sales, and improve profitability.
Viktor-Kukhar
SQL analysis of online retail sales, customer behavior, and product trends (2010-2011)
tushar2704
This project aims to analyze and visualize the sales data for Retail and Food Services in the U.S.A. The data is sourced from the U.S. government website and has been processed using SQL to create a database for easy management and analysis.
It is challenging to build useful forecasts for sparse-demand products. If the forecast is lower than the actual demand, it can lead to poor assortment and replenishment decisions, and customers will not be able to get the products they want when they need them. If the forecast is higher than the actual demand, the unsold products will occupy inventory shelves, and if the products are perishable, they will have to be liquidated at low prices to prevent spoilage. The overall objective of the model is to use retail data that provides historic sales across various countries and products for a firm. We use this information and apply FMs to predict the sparse demand with missing transactions. This step then enhances the overall demand forecast achieved with LSTM analysis. As part of this project we answered the following questions: How well does matrix factorization perform at predicting intermittent demand? How does the matrix factorization approach improve the overall time-series forecasting?
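The idea of filling in missing transactions with matrix factorization can be illustrated with a minimal sketch. This is not the repository's model: the function name `factorize`, the hyperparameters, and the tiny demand matrix are all hypothetical, and plain stochastic gradient descent on the observed cells stands in for whatever factorization method the project actually uses.

```python
import random

def factorize(observed, n_rows, n_cols, rank=2, lr=0.01, reg=0.01,
              epochs=2000, seed=0):
    """Fit row factors P and column factors Q to the observed cells only.

    observed maps (row, col) -> demand; cells that are absent are treated
    as missing, not as zero demand.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(0, 0.5) for _ in range(rank)] for _ in range(n_rows)]
    Q = [[rng.uniform(0, 0.5) for _ in range(rank)] for _ in range(n_cols)]
    for _ in range(epochs):
        for (i, j), value in observed.items():
            pred = sum(P[i][k] * Q[j][k] for k in range(rank))
            err = value - pred
            for k in range(rank):
                p, q = P[i][k], Q[j][k]
                # Gradient step with L2 regularisation on both factors.
                P[i][k] += lr * (err * q - reg * p)
                Q[j][k] += lr * (err * p - reg * q)
    return P, Q

def predict(P, Q, i, j):
    """Estimated demand for cell (i, j), observed or not."""
    return sum(p * q for p, q in zip(P[i], Q[j]))

def mse(P, Q, observed):
    """Mean squared error over the observed cells."""
    errors = [(v - predict(P, Q, i, j)) ** 2 for (i, j), v in observed.items()]
    return sum(errors) / len(errors)

# Made-up product x country demand with three cells missing.
observed = {(0, 0): 5, (0, 1): 3, (1, 0): 4, (1, 2): 1, (2, 1): 1, (2, 2): 5}

P0, Q0 = factorize(observed, 3, 3, epochs=0)  # untrained starting point
P, Q = factorize(observed, 3, 3)
initial_error = mse(P0, Q0, observed)
final_error = mse(P, Q, observed)
estimate_missing = predict(P, Q, 0, 2)  # a cell with no transactions
```

The estimates for missing cells could then be passed to a downstream time-series model as extra inputs, which is roughly the role the description assigns to this step.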
usabhishek
The aim of this project is to provide a comprehensive analysis of the sales performance of the beverage industry by examining key factors such as product categories, supplier contributions, customer demand, and sales trends over time.
SaurabhSSB
A data analysis project exploring consumer behavior and sales trends through EDA using Python. Includes visualizations and insights derived from retail shopping data.
Huyen-P
Embark on a thorough investigation as we navigate the UCI transactional dataset of a non-store UK retail company, utilizing SQL queries and Power BI visualization tools. This analysis empowers businesses with strategic insights for precise inventory management and sales planning in the coming year.
asupraja3
RetailTS is a data visualization and exploratory analytics project focused on uncovering trends, patterns, and seasonal behaviors in retail sales using time series analysis techniques.
pavan-ahire
A complete end-to-end MySQL project analyzing grocery store operations through database design, complex SQL queries, and business insights. Includes ERD modeling, customer & sales analysis, supplier performance evaluation, and data-driven recommendations for improving retail decision-making.
Menahakumari
No description available
AbubakarOrakzai
This project focuses on analyzing retail sales data using SQL to extract meaningful business insights. The database sql_project_p2 was created to store transaction details such as sales date, time, customer demographics, product categories, quantity, cost, and total sales.
ashrafalaghbari
Retail Sales Forecasting and Monitoring project offers real-time analysis and forecasts for retail sales.
anandjha90
Description: One of the leading retail stores in the US, Walmart, would like to predict sales and demand accurately. There are certain events and holidays which impact sales on each day. Sales data are available for 45 Walmart stores. The business is facing a challenge due to unforeseen demand and sometimes runs out of stock because of an inappropriate machine learning algorithm. An ideal ML algorithm will predict demand accurately and ingest factors like economic conditions, including CPI, Unemployment Index, etc. Walmart runs several promotional markdown events throughout the year. These markdowns precede prominent holidays, the four largest of which are the Super Bowl, Labour Day, Thanksgiving, and Christmas. The weeks including these holidays are weighted five times higher in the evaluation than non-holiday weeks. Part of the challenge presented by this competition is modeling the effects of markdowns on these holiday weeks in the absence of complete/ideal historical data. Historical sales data for 45 Walmart stores located in different regions are available. Dataset Description: This is the historical data which covers sales from 2010-02-05 to 2012-11-01, in the file Walmart_Store_sales.
Within this file you will find the following fields: Store - the store number; Date - the week of sales; Weekly_Sales - sales for the given store; Holiday_Flag - whether the week is a special holiday week (1 = holiday week, 0 = non-holiday week); Temperature - temperature on the day of sale; Fuel_Price - cost of fuel in the region; CPI - prevailing consumer price index; Unemployment - prevailing unemployment rate. Holiday Events: Super Bowl: 12-Feb-10, 11-Feb-11, 10-Feb-12, 8-Feb-13; Labour Day: 10-Sep-10, 9-Sep-11, 7-Sep-12, 6-Sep-13; Thanksgiving: 26-Nov-10, 25-Nov-11, 23-Nov-12, 29-Nov-13; Christmas: 31-Dec-10, 30-Dec-11, 28-Dec-12, 27-Dec-13. Analysis Tasks: Basic statistics tasks: Which store has maximum sales? Which store has maximum standard deviation, i.e., the sales vary a lot? Also, find out the coefficient of mean to standard deviation. Which store/s has good quarterly growth rate in Q3’2012? Some holidays have a negative impact on sales. Find out holidays which have higher sales than the mean sales in non-holiday season for all stores together. Provide a monthly and semester view of sales in units and give insights. Statistical Model: For Store 1, build prediction models to forecast demand.
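The first two basic statistics tasks can be sketched in a few lines of Python. This is an illustration on made-up rows, not the repository's solution: the helper name `store_summary` is hypothetical, and reading "coefficient of mean to standard deviation" as the coefficient of variation (standard deviation divided by mean) is an assumption.

```python
from collections import defaultdict
from statistics import mean, stdev

def store_summary(rows):
    """Aggregate (store, weekly_sales) pairs into per-store statistics."""
    by_store = defaultdict(list)
    for store, weekly_sales in rows:
        by_store[store].append(float(weekly_sales))
    summary = {}
    for store, values in by_store.items():
        avg = mean(values)
        # Sample standard deviation; needs at least two weeks of data.
        sd = stdev(values) if len(values) > 1 else 0.0
        summary[store] = {
            "total": sum(values),
            "mean": avg,
            "std": sd,
            # Assumed reading of the task: coefficient of variation.
            "cv": sd / avg if avg else float("nan"),
        }
    return summary

# Made-up rows standing in for the Store and Weekly_Sales columns.
rows = [(1, 100.0), (1, 120.0), (1, 110.0),
        (2, 50.0), (2, 150.0), (2, 100.0)]

summary = store_summary(rows)
top_total = max(summary, key=lambda s: summary[s]["total"])
most_variable = max(summary, key=lambda s: summary[s]["std"])
```

On the real file the same function could be fed the Store and Weekly_Sales columns read with `csv.DictReader`.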
ShahadShaikh
Problem Statement. Introduction: So far, in this course, you have learned about the Hadoop Framework, RDBMS design, and Hive Querying. You have understood how to work with an EMR cluster and write optimised queries on Hive. This assignment aims at testing your skills in Hive and the Hadoop concepts learned throughout this course. Similar to Big Data Analysts, you will be required to extract the data, load them into Hive tables, and gather insights from the dataset. Problem Statement: With online sales gaining popularity, tech companies are exploring ways to improve their sales by analysing customer behaviour and gaining insights about product trends. Furthermore, the websites make it easier for customers to find the products they require without much scavenging. Needless to say, the role of big data analysts is among the most sought-after job profiles of this decade. Therefore, as part of this assignment, we will be challenging you, as a big data analyst, to extract data and gather insights from a real-life data set of an e-commerce company. In the next video, you will learn the various stages in collecting and processing the e-commerce website data. One of the most popular use cases of Big Data is in eCommerce companies such as Amazon or Flipkart. So before we get into the details of the dataset, let us understand how eCommerce companies make use of these concepts to give customers product recommendations. This is done by tracking your clicks on their website and searching for patterns within them. This kind of data is called clickstream data. Let us understand how it works in detail. The clickstream data contains all the logs of how you navigated through the website. It also contains other details such as time spent on every page, etc. From this, they make use of data ingestion frameworks such as Apache Kafka or AWS Kinesis in order to store it in frameworks such as Hadoop.
From there, machine learning engineers or business analysts use this data to derive valuable insights. In the next video, Kautuk will give you a brief idea of the data that is used in this case study and the kind of analysis you can perform with it. For this assignment, you will be working with a public clickstream dataset of a cosmetics store. Using this dataset, your job is to extract the kind of valuable insights that data engineers generally come up with in an e-retail company. So now, let us understand the dataset in detail in the next video. You will find the data at the links given below. https://e-commerce-events-ml.s3.amazonaws.com/2019-Oct.csv https://e-commerce-events-ml.s3.amazonaws.com/2019-Nov.csv You can find the description of the attributes in the dataset given below. In the next video, you will learn about the various implementation stages involved in this case study. The implementation phase can be divided into the following parts: Copying the data set into the HDFS: launch an EMR cluster that utilizes the Hive services, and move the data from the S3 bucket into the HDFS. Creating the database and launching Hive queries on your EMR cluster: create the structure of your database, use optimized techniques to run your queries as efficiently as possible, and show the improvement in performance after using optimization on any single query. Run Hive queries to answer the questions given below. Cleaning up: drop your database, and terminate your cluster. You are required to provide answers to the questions given below. Find the total revenue generated due to purchases made in October. Write a query to yield the total sum of purchases per month in a single output. Write a query to find the change in revenue generated due to purchases from October to November. Find distinct categories of products. Categories with null category code can be ignored.
Find the total number of products available under each category. Which brand had the maximum sales in October and November combined? Which brands increased their sales from October to November? Your company wants to reward the top 10 users of its website with a Golden Customer plan. Write a query to generate a list of the top 10 users who spend the most. Note: To write your queries, please make necessary optimizations, such as selecting the appropriate table format and using partitioned/bucketed tables. You will be awarded marks for enhancing the performance of your queries. Each question should have one query only. Use a 2-node EMR cluster with both the master and core nodes as M4.large. Make sure you terminate the cluster when you are done working with it. Since EMR can only be terminated and cannot be stopped, always have a copy of your queries in a text editor so that you can copy-paste them every time you launch a new cluster. Do not leave PuTTY idle for too long. Do some activity like pressing the space bar at regular intervals. If the terminal becomes inactive, you don't have to start a new cluster. You can reconnect to the master node by opening the PuTTY terminal again, giving the host address and loading the .ppk key file. For your information, if you are using an emr-6.x release, certain queries might take a longer time; we would suggest you use the emr-5.29.0 release for this case study. There are different options for storing the data in an EMR cluster. You can briefly explore them in this link. In your previous module on Hive querying, you copied the data to the local file system, i.e., to the master node's file system, and performed the queries. Since the size of the dataset is large in this case study, it is a good practice to load the data into the HDFS and not into the local file system. You can revisit the segment on 'Working with HDFS' from the earlier module on 'Introduction to Big data and Cloud'.
You may have to use CSVSerde with the default properties value for loading the dataset into a Hive table. You can refer to this link for more details on using CSVSerde. Also, you may want to skip the column names from getting inserted into the Hive table. You can refer to this link on how to skip the headers.
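The assignment asks for Hive queries, so the following Python sketch is only a way to sanity-check the first three revenue questions on a small sample before running anything on the cluster. The column names `event_time`, `event_type`, and `price` are assumptions about the CSV layout, the function name `revenue_by_month` is hypothetical, and the sample rows are made up.

```python
import csv
import io
from collections import defaultdict

def revenue_by_month(csv_file):
    """Sum the price of purchase events per calendar month (YYYY-MM)."""
    totals = defaultdict(float)
    for row in csv.DictReader(csv_file):
        # Only purchases count as revenue; views and cart events are skipped.
        if row["event_type"] != "purchase":
            continue
        month = row["event_time"][:7]  # assumes timestamps start with YYYY-MM
        totals[month] += float(row["price"])
    return dict(totals)

# Made-up rows in the assumed layout; the header line is consumed by DictReader.
sample = io.StringIO(
    "event_time,event_type,brand,price,user_id\n"
    "2019-10-01 00:00:05 UTC,purchase,a,10.0,1\n"
    "2019-10-02 09:30:00 UTC,view,a,99.0,1\n"
    "2019-10-15 12:00:00 UTC,purchase,b,5.5,2\n"
    "2019-11-03 08:00:00 UTC,purchase,a,20.0,1\n"
    "2019-11-20 18:45:00 UTC,purchase,b,2.5,3\n"
)

monthly = revenue_by_month(sample)
change = monthly["2019-11"] - monthly["2019-10"]
print(monthly, change)
```

Reading the header through `csv.DictReader` plays the same role here as skipping the header line when loading the file into a Hive table.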
SANJAYBAIRI8686
No description available
This is where I documented my Retail Sales Analysis Report
CHAITANYAI0
No description available
rahulrajput831
No description available
shoreyarchit
We all eagerly wait for Black Friday sales and plan ahead in order to make the most of them. A retail outlet has a similar objective on Black Friday: it also aspires to get the best out of this day. The major objective of a store is to maximize revenue on this day by selling off a large proportion of its unsold inventory. The main challenge in achieving this objective is “What optimal prices should the store set to capture demand that maximizes revenue?” The problem we solve would help the business get the predicted Purchase amount (or Willingness to Pay) for each product for each user. They can then use this to set optimal prices on the products (using a Multinomial Model for Price Optimization or others). So, when we found the Black Friday Sales Analysis data on Kaggle, it strongly motivated our team to work on this interesting real-world problem for ABC Retail Store.
Code-with-HD
SQL Project on Retail Sales Analysis
Anand-kishore-kalthuri
Analyzed over 1 million rows of data to uncover sales trends, warranty claims insights, and store performance metrics using advanced SQL techniques. Optimized query performance with indexes and calculated growth ratios to solve real-world business challenges.
rajeevtiwari8055
Retail Sales Dashboard built using Excel with Power Query, Pivot Tables, and Map Charts. This project analyzes a 5000-row dataset to solve real business problems with dynamic visual insights. Includes slicer-controlled reports on category-wise sales, profit trends, top customers, and more.
End-to-end retail sales data cleaning, analysis & forecasting using Python, SQL & Tableau. Analyze & forecast retail sales using Global Superstore data: Python (Pandas, Matplotlib), SQLite, Tableau dashboards.