The purpose of this R project is to create a **rating recommender system through machine learning training**. The recommender system will predict the rating a user would give to a movie they have not rated yet. To train and test our ML models, we use MovieLens 10M, a rating dataset of about 10 million rows created by the University of Minnesota. It was released in January 2009, so the newest movies in it are from 2008. To expose patterns and behavior in the data, the datasets were enriched with many new features (dimensions). Models are validated with the root mean squared error (RMSE); more explanations are given throughout the project.

Many algorithms and ML models were used in order to achieve the lowest RMSE, such as **Matrix Factorization with parallel stochastic gradient descent and H2O stacked ensembles of (GBM, GLM, DRF, NN). H2O AutoML models were also used.** More details are given below and throughout the project. If you don't want to wait for the models to train, you can download the trained models from my GitHub and load them.

There are two types of recommender systems: **content filtering (based on the description of the item, also called metadata or side information)** and **collaborative filtering**. Collaborative filtering techniques compute similarity measures between the target items and look for the minimum distance (Euclidean distance, cosine distance, or another metric, depending on the algorithm). This is done by filtering the interests of a user using preferences collected from many users (collaborating). The underlying assumption is that if person X has the same opinion as person Y, then recommendations for X can be based on the preferences of Y (similarity).

We will enhance collaborative filtering with the help of **matrix factorization** (MF), a class of collaborative filtering algorithms used in recommender systems. Matrix factorization algorithms work by **decomposing the user-item interaction matrix into the product of two lower-dimensional rectangular matrices**.
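
As a rough illustration of that decomposition, the base R sketch below builds two small factor matrices and multiplies them into a full user-item matrix. The sizes (`n_users`, `n_items`, `k`) and the random values are hypothetical and chosen only for illustration; they are not taken from MovieLens or from this project's models.

```r
# Toy illustration of a low-rank user-item matrix (all sizes are hypothetical).
set.seed(1)
n_users <- 6   # users (rows of the rating matrix)
n_items <- 5   # items (columns of the rating matrix)
k       <- 2   # number of latent factors, smaller than n_users and n_items

# One k-dimensional latent vector per user and per item, stored as columns.
user_factors <- matrix(rnorm(k * n_users), nrow = k, ncol = n_users)
item_factors <- matrix(rnorm(k * n_items), nrow = k, ncol = n_items)

# The product of the two rectangular matrices is a full n_users x n_items matrix.
approx_ratings <- t(user_factors) %*% item_factors
dim(approx_ratings)                            # 6 5

# The score of user 3 for item 4 is the dot product of their latent vectors.
sum(user_factors[, 3] * item_factors[, 4])
```

With these sizes the two factor matrices hold 22 numbers instead of the 30 entries of the full matrix; the saving grows quickly as the number of users and items increases.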
This family of methods became **widely known during the Netflix Prize challenge due to its effectiveness, as reported by Simon Funk in his 2006 blog post**, where he shared his findings with the research community (source: https://en.wikipedia.org/wiki/Matrix_factorization_(recommender_systems)).

We will apply **Matrix Factorization with parallel stochastic gradient descent** using the "recosystem" package, an R wrapper of the LIBMF library for building recommender systems with parallel matrix factorization. The main task of a recommender system is to predict the unknown entries of the rating matrix from the observed values. The idea is to approximate the rating matrix $R_{m \times n}$ by the product of two lower-dimensional matrices, $P_{k \times m}$ and $Q_{k \times n}$, such that $R \approx P^\top Q$. More information on the recosystem package and its techniques: https://cran.r-project.org/web/packages/recosystem/vignettes/introduction.html
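
To make the idea of learning the two factor matrices from observed ratings more concrete, here is a minimal base R sketch of plain, single-threaded stochastic gradient descent on toy data. It is not the parallel LIBMF algorithm that recosystem wraps and not the code used in this project: the toy ratings, the function name `sgd_mf`, and all hyperparameter values are assumptions made for illustration.

```r
# Plain single-threaded SGD for matrix factorization on toy data.
# Illustrative only: data, hyperparameters and function names are hypothetical.
set.seed(42)

# Observed ratings as (user, item, rating) triples; the other entries are unknown.
ratings <- data.frame(
  user   = c(1, 1, 2, 2, 3, 3, 4, 4),
  item   = c(1, 2, 1, 3, 2, 3, 1, 2),
  rating = c(5, 3, 4, 1, 2, 5, 5, 3)
)

sgd_mf <- function(ratings, k = 2, n_epochs = 500, lr = 0.05, lambda = 0.02) {
  m <- max(ratings$user)
  n <- max(ratings$item)
  # P is k x m (one column per user), Q is k x n (one column per item).
  P <- matrix(rnorm(k * m, sd = 0.1), nrow = k)
  Q <- matrix(rnorm(k * n, sd = 0.1), nrow = k)
  for (epoch in seq_len(n_epochs)) {
    for (idx in sample(nrow(ratings))) {           # visit ratings in random order
      u <- ratings$user[idx]
      v <- ratings$item[idx]
      p_u <- P[, u]
      q_v <- Q[, v]
      err <- ratings$rating[idx] - sum(p_u * q_v)  # error on one observed entry
      # Gradient step with L2 regularisation on both latent vectors.
      P[, u] <- p_u + lr * (err * q_v - lambda * p_u)
      Q[, v] <- q_v + lr * (err * p_u - lambda * q_v)
    }
  }
  list(P = P, Q = Q)
}

# Root mean squared error between actual and predicted ratings.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

fit   <- sgd_mf(ratings)
R_hat <- t(fit$P) %*% fit$Q                        # dense m x n matrix of predictions

predicted <- R_hat[cbind(ratings$user, ratings$item)]
rmse(ratings$rating, predicted)                    # training RMSE on the toy data
R_hat[4, 3]                                        # prediction for an unobserved entry
```

The RMSE here is computed on the same ratings used for fitting, so it only shows that the loop reduces the error; a real evaluation would use held-out ratings, as the project does with its validation set.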