Found 2,159 repositories (showing 30)
A simple linear regression model for predicting housing prices
dawoodkhatri1
This project aims to develop a linear regression model to predict housing prices in California using the California Housing dataset. The project explores the impact of feature selection and scaling on the model's performance, with a focus on improving the accuracy of price predictions.
The famous Housing Price Advanced Regression competition on Kaggle. The dataset consists of training and testing sets, each with about 1.46K rows and 81 features pertaining to a house. I first performed an exhaustive EDA to identify the underlying trends in the data. I also removed outliers to make the regression models more robust, and treated missing values with imputation wherever needed. Lastly, I applied various regression models such as Lasso and Ridge from scikit-learn and tuned their parameters with GridSearchCV. The final RMSE was a little more than 0.12, which is pretty decent.
subhadipml
Build a model of housing prices to predict median house values in California using the provided dataset. Train the model to learn from the data to predict the median housing price in any district, given all the other metrics. Predict housing prices based on median_income and plot the regression chart for it.
Agent-A345
Predict house prices using a simple linear regression model trained on the Ames Housing dataset. The model takes square footage, number of bedrooms, and full bathrooms as input and returns the predicted price.
tashapiro
Predicting housing prices in Ames, Iowa (Ames Iowa Housing Dataset). Built various regression models to find the one with the lowest RMSE.
priyanshu9142879533
Predicting California housing prices using multi-model regression. Includes EDA, preprocessing, feature engineering, and model comparison (Linear Regression, Decision Tree, Random Forest, KNN) to identify the best predictor based on RMSE and R².
Penglianfeng
This is the experimental assignment of my course "Machine Learning and Data Mining", which requires completing the training, testing and evaluation of the linear regression model for house price prediction based on the California Housing Prices dataset
This repository showcases a machine learning project that leverages PyTorch to implement a linear regression model for predicting house prices in Boston. It uses the well-known Boston Housing Dataset, incorporating a complete pipeline from data preprocessing and loading to model training, evaluation, and result visualization.
No description available
SergKhachikyan
Machine learning project for predicting housing prices in California using the California Housing dataset. The project includes data analysis, visualization, and building regression models for price estimation.
MansoobeZahra
A Streamlit-based web application that predicts median housing prices in Boston using Polynomial Regression, built from the Boston Housing Dataset. It supports interactive model tuning, residual visualization, and user-defined prediction.
nicekid1
Predicting Tehran Housing Prices using Machine Learning and Deep Learning. A complete end-to-end data science project focused on building accurate regression models using real estate data from Tehran. Includes EDA, feature engineering, preprocessing, boosting models, and a neural network; ideal for decision-making in urban planning and real estate.
vishalaxi-tandel
This example shows how to build a serverless pipeline to orchestrate the continuous training and deployment of a linear regression model for predicting housing prices using Amazon SageMaker, AWS Step Functions, AWS Lambda, and Amazon CloudWatch Events.
This repo contains EDA of 2019 King County housing data. It also contains iterations of generating linear regression models to predict home sale prices.
Niteshyadav0331
A Model which predicts Boston Housing Prices using Linear, Ridge, Lasso and Elastic Net Regression.
krishnaarora023
📈 End-to-end data science project analyzing housing market data with EDA, visualization, and regression modeling to identify key factors influencing house prices.
pranav-ashokk
I created a machine learning regression model that predicts housing prices in Boston. This model is created using supervised learning from a dataset created by UCI (University of California, Irvine).
kshivamr
🏡 Housing Price Predictor using the Kaggle dataset. This project covers data preprocessing, EDA, feature engineering, and training multiple regression models (Linear Regression, Random Forest, Gradient Boosting) to predict house prices. Includes model evaluation with RMSE and Kaggle-ready submission output.
Safae26
A machine learning project that predicts Boston housing prices using XGBoost regression. Features comprehensive data analysis, correlation studies, and model implementation to identify key factors influencing real estate values in Boston suburbs.
abd1bayev
Housing prices are a hot topic today. In this project, we will build a model to predict the price based on predefined criteria. The project is divided into two parts: (1) understanding the data and (2) creating a linear regression prediction model.
rickyca
Using the Ames Housing dataset, I applied a linear regression model to estimate house prices based on fixed characteristics. This result was combined with an additional model to estimate the increase in price after remodeling houses, based on non-fixed features. In addition, I proposed a logistic regression model to predict an abnormal sale type (unbalanced class analysis).
AnjalyG
The application is built to predict the housing prices of a region. The model is trained on Kaggle Bangalore house price data using logistic regression; a Python Flask server and a UI built with JavaScript, HTML and CSS were then added. The app is deployed to Amazon EC2.
In this project, we analyze the Boston Housing Price dataset using several machine learning techniques such as Linear Regression, Support Vector Machines (SVM), Random Forest, and Artificial Neural Networks (ANN) using the PyTorch library. The goal is to build robust models to predict house prices based on a set of features.
This data science project series walks step by step through building a real estate price prediction website. I will first build a model using sklearn and linear regression on the Bengaluru housing prices dataset from kaggle.com. The second step is to write a Python Flask server that uses the saved model to serve HTTP requests. The third component is the website, built in HTML, CSS, Bootstrap and JavaScript, which lets the user enter home square footage, bedrooms, etc. and calls the Python Flask server to retrieve the predicted price. During model building I will cover data science concepts such as data loading and cleaning, outlier detection and removal, feature engineering, dimensionality reduction, GridSearchCV for hyperparameter tuning, k-fold cross-validation, etc.
Ridhi655
Predicting the median value of owner-occupied homes

The aim of this assignment is to learn how to apply machine learning algorithms to data sets. This involves learning what the data means, how to handle data, training, cross-validation, prediction, testing the model, etc.

This dataset contains information collected by the U.S. Census Service concerning housing in the area of Boston, Mass. It was obtained from the StatLib archive and has been used extensively throughout the literature to benchmark algorithms. The data was originally published by Harrison, D. and Rubinfeld, D.L., 'Hedonic prices and the demand for clean air', J. Environ. Economics & Management, vol. 5, 81-102, 1978. The dataset is small, with only 506 cases. It can be used to predict the median value of a home, which is done here. There are 14 attributes in each case of the dataset. They are:

CRIM - per capita crime rate by town
ZN - proportion of residential land zoned for lots over 25,000 sq.ft.
INDUS - proportion of non-retail business acres per town
CHAS - Charles River dummy variable (1 if tract bounds river; 0 otherwise)
NOX - nitric oxides concentration (parts per 10 million)
RM - average number of rooms per dwelling
AGE - proportion of owner-occupied units built prior to 1940
DIS - weighted distances to five Boston employment centres
RAD - index of accessibility to radial highways
TAX - full-value property-tax rate per $10,000
PTRATIO - pupil-teacher ratio by town
B - 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
LSTAT - % lower status of the population
MEDV - median value of owner-occupied homes in $1000's

Aim
To implement linear regression with regularization via gradient descent.
To implement gradient descent with the Lp norm for 3 different values of p in (1,2].
To contrast the performance of linear regression with the Lp norm and with the L2 norm for these 3 values.
To check that gradient descent for L2 gives the same result as the matrix-inversion-based solution.

All the code is written in a single Python file. The program accepts as input the path of the data directory where the train and test CSV files reside. The data directory contains two files: train.csv, used to train the model, and test.csv, for which the output predictions are to be made. The output predictions are written to a file named output.csv, which should have two comma-separated columns [ID,Output].

Working of Code
1. The NumPy library is required, so the code begins by importing it.
2. Import phi and phi_test from the train and test datasets using NumPy's loadtxt function.
3. Import y from the train dataset using the loadtxt function.
4. Concatenate a column of 1s to the right of phi and phi_test.
5. Apply min-max scaling to each column of phi and phi_test.
6. Apply log scaling to y.
7. Define a function to calculate the change in the error function based on phi, w and the p norm.
8. Make a dictionary containing filenames as keys and p as values.
9. For each item in this dictionary:
   a. Set w to all 0s.
   b. Set an appropriate value for lambda and the step size.
   c. Calculate the new value of w.
   d. Repeat until the error between consecutive ws is less than a threshold.
   e. Load the values of id from the test data file.
   f. Calculate y for the test data using phi_test and applying the inverse log.
   g. Save the ids and y under the filename from the dictionary.

Feature Engineering
The columns of phi are not in the same range because their units are different, i.e. phi is ill-conditioned. Min-max scaling is therefore applied to each column to bring it into the range 0-1. The same scaling is required on the columns of phi_test. Log scaling was used on y; this was determined by trial and error.

Comparison of performance (p1=1.75, p2=1.5, p3=1.3)
As p decreases, the error in y decreases.
As p decreases, the norm of w increases, but this can be taken care of by increasing lambda.
As p decreases, the number of iterations required decreases.

Tuning of Hyperparameter
If p is fixed and lambda is increased, the error decreases up to a certain lambda and then starts rising. Lambda was therefore tuned by trial and error: starting from 0, lambda was increased in small steps until a minimum error was achieved.

Comparison of L2 gradient descent and closed form
The error from L2 gradient descent was 4.43268 and that from the closed-form solution was 4.52624. The errors are comparable, so L2 gradient descent performs close to the closed-form solution.
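The procedure described above (min-max scaling, a column of 1s appended on the right, gradient descent on an Lp-regularized error, and a check of the L2 case against the closed-form solution) can be sketched roughly as follows. This is only an illustrative sketch, not the repository's code: the objective sum((phi w - y)^2) + lambda * sum(|w_j|^p) is an assumed form of the regularized error, all function names are hypothetical, the toy data is made up, and plain Python lists are used instead of NumPy to keep the example dependency-free.

```python
# Assumed objective (not stated in the source):
#   E(w) = sum_i (phi_i . w - y_i)^2 + lam * sum_j |w_j|^p,  with p in (1, 2]

def min_max_scale(column):
    """Scale a list of numbers into the range [0, 1]."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in column]

def gradient(phi, y, w, lam, p):
    """Gradient of the assumed objective E(w) with respect to w."""
    residuals = [sum(x * wj for x, wj in zip(row, w)) - t
                 for row, t in zip(phi, y)]
    grad = []
    for j, wj in enumerate(w):
        data_term = 2.0 * sum(r * row[j] for r, row in zip(residuals, phi))
        sign = (wj > 0) - (wj < 0)
        penalty_term = lam * p * abs(wj) ** (p - 1) * sign
        grad.append(data_term + penalty_term)
    return grad

def fit(phi, y, lam, p, step=0.05, tol=1e-10, max_iter=100_000):
    """Gradient descent from w = 0 until consecutive ws differ by less than tol."""
    w = [0.0] * len(phi[0])
    for _ in range(max_iter):
        g = gradient(phi, y, w, lam, p)
        new_w = [wj - step * gj for wj, gj in zip(w, g)]
        if max(abs(a - b) for a, b in zip(new_w, w)) < tol:
            return new_w
        w = new_w
    return w

def ridge_closed_form_2d(phi, y, lam):
    """Closed-form L2 solution (phi^T phi + lam I)^-1 phi^T y for two columns."""
    a = sum(r[0] * r[0] for r in phi) + lam
    b = sum(r[0] * r[1] for r in phi)
    d = sum(r[1] * r[1] for r in phi) + lam
    u = sum(r[0] * t for r, t in zip(phi, y))
    v = sum(r[1] * t for r, t in zip(phi, y))
    det = a * d - b * b
    return [(d * u - b * v) / det, (a * v - b * u) / det]

# Made-up toy data: one feature, scaled, with a column of 1s on the right.
feature = min_max_scale([1, 2, 3, 4, 5])
phi = [[x, 1.0] for x in feature]
y = [1.0, 1.5, 2.1, 2.4, 3.0]

w_gd = fit(phi, y, lam=0.1, p=2)              # L2 gradient descent
w_cf = ridge_closed_form_2d(phi, y, lam=0.1)  # matrix-inversion solution
w_p15 = fit(phi, y, lam=0.1, p=1.5)           # an Lp fit with p in (1, 2)
print(w_gd, w_cf, w_p15)
```

For p = 2 the penalty gradient reduces to 2 * lam * w, so under this assumed objective gradient descent and the closed form should agree up to the stopping tolerance. The step size and lambda here were picked for the toy data only and would need tuning on real data, as the description notes.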
ewjohn127
An inferential linear regression model using housing and property variables to predict home prices in King County, Washington, USA
rkkwan
Linear regression predictive modeling on housing prices.
claudiatsai
Using a linear regression model to predict housing prices from their features
This is a model of housing prices built using multiple regression and logistic regression in R