Found 260 repositories (showing 30)
radualexandrub
Multiple C/C++ optimization algorithms (derivative and non-derivative) for finding the local minimum of a cost function. Algorithms: Grid-Search, Pattern-Search, Golden-Ratio, Gradient Descent, Newton, Quasi-Newton, Cauchy/Steepest Descent.
pallogu
Node.js implementation of a neural network. It uses compute-cluster for map-reduce and implements stochastic/step/batch gradient descent for finding the global minimum of the cost function.
Machine learning using ridge regression with sklearn. Creating various mathematical neuron models (ANNs) trained with gradient descent by minimizing a cost function. Training and testing LSTM networks.
Overparameterization and overfitting are common concerns when designing and training deep neural networks. Network pruning is an effective strategy used to reduce or limit network complexity, but it often suffers from time- and computation-intensive procedures to identify the most important connections and the best-performing hyperparameters. We suggest a pruning strategy that is completely integrated into the training process and requires only marginal extra computational cost. The method relies on unstructured weight pruning, which is reinterpreted in a multiobjective learning approach. A batchwise pruning strategy is compared across different optimization methods, one of which is a multiobjective optimization algorithm. As it takes over the choice of the weighting of the objective functions, it has a great advantage in reducing the time-consuming hyperparameter search that every neural network training suffers from. Without any a priori training, post-training, or parameter fine-tuning, we achieve large reductions of the dense layers of two commonly used convolutional neural networks (CNNs) with only a marginal loss of performance. Our results empirically demonstrate that dense layers are overparameterized: after removing up to 98% of their edges, they provide almost the same results. We challenge the view that retraining after pruning neural networks is of great importance, and open new insights into the use of multiobjective optimization techniques in machine learning algorithms in a Keras framework. The Stochastic Multi-Gradient Descent Algorithm implementation in Python 3 is for use with Keras and is adapted from the paper by S. Liu and L. N. Vicente: "The stochastic multi-gradient algorithm for multi-objective optimization and its application to supervised machine learning". It is combined with weight-pruning strategies to reduce network complexity and inference time.
Gradient descent is an optimization algorithm used to find the values of the parameters (coefficients) of a function f that minimize a cost.
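The idea above can be sketched in a few lines. This is a minimal illustration, not code from any repository listed here; the quadratic cost f(w) = (w - 3)^2 and the learning rate are arbitrary choices for demonstration.

```python
# Minimal gradient descent on f(w) = (w - 3)**2, whose minimum is at w = 3.
# The cost function and learning rate are illustrative choices.

def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly move w a small step against the gradient of the cost."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

# d/dw (w - 3)^2 = 2 * (w - 3)
w_min = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
print(round(w_min, 4))  # converges to 3.0
```

Each step shrinks the distance to the minimum by a constant factor (here 0.8), so convergence is geometric for this convex cost.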
shuyangsun
A Python script to graph simple cost functions for linear and logistic regression, showing how choosing a convex or non-convex function can affect gradient descent.
gogetteranushka
No description available
Srujanx
Implemented forward propagation, backpropagation, and gradient descent manually using only NumPy. Designed a 2-layer neural network from the ground up, defining weights, biases, activation functions, and cost calculations explicitly. Verified every computation step with pen-and-paper derivations to confirm my Python implementation matched the theory.
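A NumPy-only network of this kind can be sketched as follows. The XOR dataset, the 2-4-1 architecture, and the hyperparameters are illustrative assumptions, not taken from the repository above.

```python
# A compact 2-layer (one hidden layer) network trained with manual
# backpropagation, in the spirit of the NumPy-only description above.
# Dataset, architecture, and hyperparameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: XOR inputs and targets.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Weights and biases, defined explicitly.
W1 = rng.normal(0, 1, (2, 4)); b1 = np.zeros((1, 4))
W2 = rng.normal(0, 1, (4, 1)); b2 = np.zeros((1, 1))

losses = []
lr = 0.5
for _ in range(2000):
    # Forward propagation.
    a1 = sigmoid(X @ W1 + b1)
    a2 = sigmoid(a1 @ W2 + b2)
    losses.append(float(np.mean((a2 - y) ** 2)))  # MSE cost

    # Backpropagation (chain rule, derivable by hand).
    d2 = (a2 - y) * a2 * (1 - a2)
    d1 = (d2 @ W2.T) * a1 * (1 - a1)

    # Gradient descent updates.
    W2 -= lr * a1.T @ d2; b2 -= lr * d2.sum(0, keepdims=True)
    W1 -= lr * X.T @ d1;  b1 -= lr * d1.sum(0, keepdims=True)

print(losses[0], losses[-1])  # the cost decreases during training
```

Checking the backpropagation deltas against a pen-and-paper derivative (or a finite-difference estimate) is exactly the kind of verification the description mentions.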
rish283
A noisy nonlinear signal is generated to test gradient descent through LMS (least mean squares) and correntropy methods of minimizing the cost function.
SudeErzurumlu
This repository implements the Multivariable Gradient Descent algorithm, an optimization technique used to minimize a multivariable function by iteratively moving in the direction of the steepest descent. It is a core method in machine learning, used for training models and optimizing cost functions with respect to multiple variables.
In the above, I have dealt with a simple neural network in order to work closely with neural network basics, which mainly include gradient descent, the feed-forward algorithm, and the back-propagation algorithm. Doing things at the fundamental level gives a better understanding of neural networks. My work also includes varying hyperparameters such as epochs, learning rates, and the number of neurons in the dense (hidden) layer, and studying their effects on the cost function and speed of convergence. Changes in error with learning rate and epochs for different numbers of neurons in the hidden layers were also studied.
amirrezafahimi
An overview of gradient descent, matplotlib, cost functions, and MSE
Implementation of logistic regression using only numpy and matplotlib (no scikit-learn). Includes gradient descent, cost function visualization, and decision boundary plots with GIF animations for better understanding.
shubham9793
Linear regression is one of the easiest and most popular machine learning algorithms. It is a statistical method used for predictive analysis. Linear regression makes predictions for continuous/real or numeric variables such as sales, salary, age, product price, etc. The linear regression algorithm shows a linear relationship between a dependent variable (y) and one or more independent variables (x), hence the name. Since linear regression shows a linear relationship, it finds how the value of the dependent variable changes according to the value of the independent variable. The linear regression model provides a sloped straight line representing the relationship between the variables. Mathematically, we can represent a linear regression as y = a0 + a1*x + ε, where y = dependent variable (target variable), x = independent variable (predictor variable), a0 = intercept of the line (gives an additional degree of freedom), a1 = linear regression coefficient (scale factor applied to each input value), and ε = random error. The values of the x and y variables are the training dataset for the linear regression model representation. Types of linear regression: linear regression can be further divided into two types of algorithm. Simple linear regression: if a single independent variable is used to predict the value of a numerical dependent variable, the algorithm is called simple linear regression. Multiple linear regression: if more than one independent variable is used to predict the value of a numerical dependent variable, the algorithm is called multiple linear regression. Linear regression line: a line showing the relationship between the dependent and independent variables is called a regression line.
A regression line can show two types of relationship. Positive linear relationship: if the dependent variable increases on the y-axis as the independent variable increases on the x-axis, the relationship is termed a positive linear relationship. Negative linear relationship: if the dependent variable decreases on the y-axis as the independent variable increases on the x-axis, the relationship is called a negative linear relationship. Finding the best-fit line: when working with linear regression, our main goal is to find the best-fit line, meaning the error between the predicted and actual values should be minimized. The best-fit line will have the least error. Different values for the weights or coefficients of the line (a0, a1) give different regression lines, so we need to calculate the best values for a0 and a1 to find the best-fit line; to do this we use a cost function. Cost function: the cost function is used to estimate the values of the coefficients for the best-fit line. It optimizes the regression coefficients or weights and measures how well a linear regression model is performing. We can use the cost function to find the accuracy of the mapping function, which maps the input variable to the output variable; this mapping function is also known as the hypothesis function. For linear regression, we use the mean squared error (MSE) cost function, which is the average of the squared errors between the predicted and actual values. For the above linear equation, it can be written as MSE = (1/N) * Σ (yi − (a1*xi + a0))², where N = total number of observations, yi = actual value, and (a1*xi + a0) = predicted value. Residuals: the distance between the actual value and the predicted value is called the residual.
If the observed points are far from the regression line, the residuals will be high, and so the cost function will be high. If the scatter points are close to the regression line, the residuals will be small, and hence the cost function will be small. Gradient descent: gradient descent is used to minimize the MSE by calculating the gradient of the cost function. A regression model uses gradient descent to update the coefficients of the line by reducing the cost function. It starts with randomly selected coefficient values and then iteratively updates them to reach the minimum of the cost function. Model performance: the goodness of fit determines how well the regression line fits the set of observations. The process of finding the best model out of various models is called optimization, and it can be achieved by the following method. 1. R-squared method: R-squared is a statistical measure of goodness of fit. It measures the strength of the relationship between the dependent and independent variables on a scale of 0-100%. A high value of R-squared indicates less difference between the predicted and actual values and hence represents a good model. It is also called the coefficient of determination (or the coefficient of multiple determination for multiple regression), and it can be calculated as R² = explained variation / total variation. Assumptions of linear regression: below are some important assumptions of linear regression. These are formal checks while building a linear regression model which ensure the best possible result from the given dataset. Linear relationship between the features and target: linear regression assumes a linear relationship between the dependent and independent variables. Little or no multicollinearity between the features: multicollinearity means high correlation between the independent variables. Due to multicollinearity, it may be difficult to find the true relationship between the predictors and the target variable.
In other words, it is difficult to determine which predictor variables are affecting the target variable and which are not, so the model assumes little or no multicollinearity between the features or independent variables. Homoscedasticity assumption: homoscedasticity is the situation where the error term is the same for all values of the independent variables. With homoscedasticity, there should be no clear pattern in the scatter plot of the residuals. Normal distribution of error terms: linear regression assumes that the error terms follow a normal distribution. If the error terms are not normally distributed, confidence intervals will become either too wide or too narrow, which may cause difficulties in finding coefficients. This can be checked using a Q-Q plot: if the plot shows a straight line without deviation, the error is normally distributed. No autocorrelation: the linear regression model assumes no autocorrelation in the error terms. Any correlation in the error terms will drastically reduce the accuracy of the model. Autocorrelation usually occurs when there is a dependency between residual errors.
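The MSE-plus-gradient-descent procedure described above can be sketched directly. The synthetic data, the true coefficients (a0 = 2, a1 = 3), and the learning rate are illustrative assumptions for this example.

```python
# Fitting y = a0 + a1*x by gradient descent on the MSE cost, as described
# in the overview above. Data and hyperparameters are illustrative.
import numpy as np

rng = np.random.default_rng(42)
x = rng.uniform(0, 1, 200)
y = 2.0 + 3.0 * x + rng.normal(0, 0.05, 200)  # true a0 = 2, a1 = 3

a0, a1 = 0.0, 0.0
lr = 0.5
for _ in range(2000):
    err = (a0 + a1 * x) - y        # residuals of the current line
    # MSE = mean(err^2); its partial derivatives w.r.t. a0 and a1:
    a0 -= lr * 2 * err.mean()
    a1 -= lr * 2 * (err * x).mean()

print(round(a0, 2), round(a1, 2))  # close to the true intercept and slope
```

The same coefficients could be obtained in closed form by ordinary least squares; the iterative version is shown because it is the procedure the text describes.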
Federico-PizarroBejarano
Abstract: This investigation determines the extent to which characters can be identified in images using the logistic regression and single-layer neural network algorithms. Optical character recognition (OCR) is a computer vision, supervised learning problem. The dependent variables were the optimal value of the regularization parameter lambda; the accuracy on the training, cross-validation, and test sets; and the time needed to train each classifier. A dataset of 74,000 images composed of fonts, handwritten characters, and real images of letters and numbers was used. For the purposes of this investigation, only a subset of the font dataset was used. Each image was resized to 20x20 pixels and then converted to a 1x400 vector of pixel values. The logistic regression algorithm attempts to fit parameters to the 400 pixel values to form a hypothesis function. To optimize the parameters, the algorithm defines a cost function and then performs gradient descent on the parameters. The tunable parameters were additional features added in an attempt to create more complex, representative functions. A single-layer neural network passes the input data to a hidden layer where the data is partially processed. The partially processed data is then passed to the output layer where the final predictions are made. The tunable parameter was the number of hidden units in the hidden layer. The logistic regression algorithm achieved an accuracy of 85.14% with no added features and a lambda value of 1. The neural network achieved a significantly higher accuracy of 90.19% using 200 hidden units and no regularization. Logistic regression had a time complexity of O(n) while the neural network had a significantly better time complexity of O(√h). This paper investigates the properties of both algorithms and establishes the inability of both algorithms to identify characters to sufficiently high accuracies.
rashida048
Machine Learning - Cost Function and Gradient Descent for Linear Regression
aliejabbari
"Compare Gradient Descent and Adam optimization algorithms in finding global minimum of complex cost functions. Visualize paths & analyze performance."
Mehdi-Abidi
A Python implementation of linear regression using gradient descent. It includes hypothesis and cost functions, iterative parameter updates, and convergence checks. Visualizations include cost function plots, regression lines, and a 3D surface plot of the cost function using Plotly.
In this cat recognition project I am building the general architecture of a learning algorithm, including: initializing parameters, calculating the cost function and its gradient, using an optimization algorithm (gradient descent), and gathering all three functions above into a main model function, in the right order.
MachineLearning-FA2017
Lab Assignment Four: Extending Logistic Regression In this lab, you will compare the performance of logistic regression optimization programmed in scikit-learn and via your own implementation. You will also modify the optimization procedure. This report is worth 10% of the final grade. Please upload a report (one per team) with all code used, visualizations, and text in a rendered Jupyter notebook. For any visualizations that cannot be embedded in the notebook, please provide screenshots of the output. The results should be reproducible using your report. Please carefully describe every assumption and every step in your report. Dataset Selection Select a dataset identically to the way you selected for lab one (i.e., table data). You are not required to use the same dataset that you used in the past, but you are encouraged to. You must identify a classification task from the dataset that contains three or more classes to predict. That is, it cannot be binary classification; it must be multi-class prediction. Grading Rubric Preparation and Overview (30 points total) [5 points] Explain the task and what business case or use case it is designed to solve (or designed to investigate). Detail exactly what the classification task is and what parties would be interested in the results. [10 points] (mostly the same process as in lab one) Define and prepare your class variables. Use proper variable representations (int, float, one-hot, etc.). Use pre-processing methods (as needed) for dimensionality reduction, scaling, etc. Remove variables that are not needed/useful for the analysis. Describe the final dataset that is used for classification/regression (include a description of any newly formed variables you created). [15 points] Divide your data into training and testing data using an 80% training and 20% testing split. Use the cross-validation modules that are part of scikit-learn. Choose an appropriate training and testing split for your data.
For example, it might be more appropriate to use stratified shuffle splits, or it might be more appropriate to use contiguous splits. Describe why your split method is appropriate for your dataset. Modeling (50 points total) [20 points] Create a custom, one-versus-all logistic regression classifier using numpy and scipy to optimize. Use object-oriented conventions identical to scikit-learn. You should start with the template used in the course. You should add the following functionality to the logistic regression classifier: the ability to choose the optimization technique when the class is instantiated (either steepest descent, stochastic gradient descent, or Newton's method); an updated gradient calculation that includes a regularization term (either the L1 or L2 norm of the weights); and a cost associated with the regularization term, "C", that can be adjusted when the class is instantiated. [15 points] Train your classifier to achieve good generalization performance. That is, adjust the optimization technique and the value of the regularization term "C" to achieve the best performance on your test set. Does this method of selecting parameters seem justified? That is, do you think there is any "data snooping" involved with this method of selecting parameters? [15 points] Compare the performance of your "best" logistic regression optimization procedure to the procedure used in scikit-learn. Visualize the performance differences in terms of training time, training iterations, and memory usage while training. Discuss the results. Deployment (10 points total) Which implementation of logistic regression would you advise be used in a deployed machine learning model, your implementation or scikit-learn (or another third party)? Why? Exceptional Work (10 points total) You have free rein to provide additional analyses. One idea: make your implementation of logistic regression compatible with the GridSearchCV function that is part of scikit-learn.
hsb-code
Computation of gradient to minimize cost of Machine Learning Algorithm
No description available
vaibhavr54
Hands-on Python and notebook examples illustrating gradient descent, linear regression, and data visualization from scratch and with scikit-learn. Includes clear code, visualizations, and CSV data for practical understanding of optimization and ML fundamentals.
kanwartaimoor
Multivariate and univariate linear regression using MSE as the cost function and gradient descent to minimize it
minggli
simple implementation of standard Gradient Descent algorithm, along with Sigmoid and GLM cost functions
An implementation of the logistic regression algorithm from scratch, including the cost function, gradient calculation, and gradient descent
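The pieces named in these descriptions (sigmoid hypothesis, cost function, gradient, descent loop) fit together roughly as follows. The synthetic dataset and hyperparameters are illustrative assumptions, not taken from either repository.

```python
# Logistic regression from scratch: sigmoid hypothesis, cross-entropy cost,
# gradient calculation, and gradient descent. Data are synthetic.
import numpy as np

rng = np.random.default_rng(1)
n = 400
X = rng.normal(0, 1, (n, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)  # linearly separable labels

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(w, b):
    p = np.clip(sigmoid(X @ w + b), 1e-12, 1 - 1e-12)  # avoid log(0)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

w = np.zeros(2); b = 0.0
lr = 0.5
for _ in range(1000):
    p = sigmoid(X @ w + b)
    # Gradient of the cross-entropy cost w.r.t. w and b.
    w -= lr * X.T @ (p - y) / n
    b -= lr * np.mean(p - y)

acc = np.mean((sigmoid(X @ w + b) > 0.5) == (y == 1))
print(cost(w, b), acc)  # low cost, high training accuracy
```

The neat property used here is that for the sigmoid with cross-entropy cost, the gradient reduces to the simple residual form Xᵀ(p − y)/n.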
KerolosNabil7
Adaline (Adaptive Linear Neuron) is a single-layer neural network that uses a linear activation function to make predictions. It adjusts the weights of input features to minimize the difference between predicted and actual output using the gradient descent optimization algorithm and mean squared error cost function.
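A minimal Adaline sketch matching that description: linear activation during training, MSE cost, weights updated by gradient descent, and a threshold applied only for prediction. The data and learning rate are illustrative assumptions.

```python
# Adaline: train on the raw linear output (identity activation) with an
# MSE cost; threshold only when predicting. Data are synthetic.
import numpy as np

rng = np.random.default_rng(7)
X = rng.normal(0, 1, (200, 2))
y = np.where(X[:, 0] - X[:, 1] > 0, 1.0, -1.0)  # targets in {-1, +1}

w = np.zeros(2); b = 0.0
lr = 0.1
for _ in range(200):
    net = X @ w + b                  # linear activation
    err = y - net
    # Gradient descent on MSE = mean((y - net)^2).
    w += lr * X.T @ err / len(X)
    b += lr * err.mean()

pred = np.where(X @ w + b >= 0, 1.0, -1.0)  # threshold for prediction only
acc = np.mean(pred == y)
print(acc)
```

Training on the continuous linear output (rather than the thresholded output, as the perceptron does) is what makes the Adaline cost differentiable and gradient descent applicable.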
An implementation of the linear regression algorithm from scratch, including the cost function, gradient calculation, and gradient descent
subarnab219
Logistic regression using numpy on a dataframe. Includes calculation and optimization of cost function using gradient descent.