Found 83 repositories (showing 30)
A distributed Spark/Scala implementation of the isolation forest and extended isolation forest algorithms for unsupervised outlier detection, featuring support for scalable training and ONNX export for easy cross-platform inference.
BNP Paribas Kaggle data set (data source: https://www.kaggle.com/c/bnp-paribas-cardif-claims-management). Outlier detection with an ensemble unsupervised learning method, Isolation Forest. Isolation Forest is an unsupervised machine learning method used to detect anomalies such as outliers in data. It builds a randomized, recursive partition of the training data in a tree structure. The number of subsamples and the tree size are specified and tuned appropriately. The path length needed to isolate a point is averaged across the trees to give an anomaly score: values close to 1 indicate outliers, and values close to 0 indicate normal data.
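The scoring idea described above can be sketched in plain Python. This is a minimal illustration, not the code of any repository listed here: the function names (`build_tree`, `fit_forest`, `anomaly_score`), the sample size of 64, and the toy data are all assumptions. It uses the standard isolation forest normalization, score = 2^(-E[h(x)] / c(n)), where h(x) is the path length of a point and c(n) is the average path length for n points.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def average_path_length(n):
    """Normalizer c(n): expected path length of an unsuccessful search in a
    binary search tree built on n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + EULER_GAMMA  # approximates H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # simplification: stop if the chosen feature is constant
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(x, node, depth=0):
    """Depth at which x lands in a leaf, plus c(size) for unsplit leaf points."""
    if "split" not in node:
        return depth + average_path_length(node["size"])
    child = node["left"] if x[node["dim"]] < node["split"] else node["right"]
    return path_length(x, child, depth + 1)


def fit_forest(data, n_trees=100, sample_size=64, seed=0):
    """Build n_trees trees, each on a random subsample of the data."""
    rng = random.Random(seed)
    size = min(sample_size, len(data))
    if size < 2:
        raise ValueError("need at least two points")
    max_depth = math.ceil(math.log2(size))
    trees = [build_tree(rng.sample(data, size), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, size


def anomaly_score(x, trees, size):
    """Score in (0, 1): closer to 1 means easier to isolate (more anomalous)."""
    mean_depth = sum(path_length(x, t) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / average_path_length(size))


# Toy data (hypothetical): one Gaussian cluster plus a single distant point.
rng = random.Random(42)
cluster = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
data = cluster + [(10.0, 10.0)]
trees, size = fit_forest(data)
print("outlier:", anomaly_score((10.0, 10.0), trees, size))
print("inlier: ", anomaly_score((0.0, 0.0), trees, size))
```

The distant point is separated from the rest after only a few random splits, so its average path is short and its score is higher than that of a point in the middle of the cluster.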
shivam1808
Credit Card Fraud Detection using Isolation Forest Algorithm and Local Outlier Factor (LOF) Algorithm.
This project aims to detect credit card fraud using anomaly detection techniques such as the Isolation Forest and Local Outlier Factor algorithms.
youngdataspace
Detect outliers using Isolation Forest - an anomaly detection machine learning algorithm
philippjh
Implementation of the "Generalized Isolation Forest" (GIF) algorithm for unsupervised detection of outliers in data.
stuti24m
The project models a data set using two machine learning algorithms, Isolation Forest and Local Outlier Factor, with credit card fraud detection as the use case. In this process, I have focused more on analyzing the feature modeling and possible business use cases of the algorithms' output than on the algorithms themselves.
Anomaly detection (outlier detection) using the Local Outlier Factor algorithm. Anomalies can also be detected using the Isolation Forest algorithm.
doguilmak
In this project, I would like to present 4 different anomaly detection algorithms (Local Outlier Factor, Elliptic Envelope, One Class SVM, Isolation Forest) with an example.
Throughout the financial sector, machine learning algorithms are being developed to detect fraudulent transactions. In this project, that is exactly what we are going to be doing as well. Using a dataset of nearly 28,500 credit card transactions and multiple unsupervised anomaly detection algorithms, we are going to identify transactions with a high probability of being credit card fraud. In this project, we will build and deploy the following two machine learning algorithms: Local Outlier Factor (LOF) and Isolation Forest. Furthermore, using metrics such as precision, recall, and F1-score, we will investigate why the classification accuracy for these algorithms can be misleading. In addition, we will explore the use of data visualization techniques common in data science, such as parameter histograms and correlation matrices, to gain a better understanding of the underlying distribution of data in our data set. Let's get started!
Yashas-H
Anomaly detection using isolation forest algorithm and local outlier factor
saikaushik0410
Credit card fraud detection using Isolation Forest and Local Outlier Factor Algorithms
Machine Learning - Credit Card Fraud detection using Isolation Forest Algorithm and Local Outlier Factor (LOF) Algorithm
This project implements fraud detection using the Isolation Forest algorithm. Isolation Forest is an effective anomaly detection algorithm that isolates outliers in a dataset. The goal of this project is to identify fraudulent activities or transactions within a dataset by leveraging the Isolation Forest algorithm.
Kunleiky
In this project, I carried out detailed detection of outliers in a health care providers dataset using different anomaly detection algorithms including Isolation Forest, Local Outlier Factor and Elliptic Envelope algorithms
manasc12
Credit card fraud detection system using machine learning anomaly detection algorithms. This project implements and compares three different anomaly detection techniques: Isolation Forest, Local Outlier Factor, and One-Class SVM
ochiengfrancis
In this notebook I have used two algorithms to solve this classification problem in order to perform anomaly detection on the dataset: 1. Isolation Forest Algorithm 2. Local Outlier Factor Algorithm
This project builds a **hybrid anomaly detection system** using multiple unsupervised algorithms — **Isolation Forest**, **Local Outlier Factor (LOF)**, and **Elliptic Envelope** — to detect **fraudulent transactions** from the `SecurePay` / `creditcard.csv` dataset.
guillem-escriba
The main application of Isolation Forest (a non-linear ML algorithm) is detecting outliers and anomalies. In this notebook we will see how to apply it to detecting abnormal thyroid conditions in patients.
osamaahmed17
Fraud detection is a complex issue that requires a substantial amount of planning before throwing machine learning algorithms at it. I obtained a data set from a Kaggle competition and used different machine learning techniques, including Local Outlier Factor and Isolation Forest, to detect the outliers.
Geospatial Election Integrity Analysis — Oyo State, Nigeria 🇳🇬 A data-driven project analyzing voting patterns from the 2023 Nigerian General Election to detect anomalies, outlier polling units, and potential irregularities using Python, geospatial clustering, and anomaly detection algorithms (DBSCAN, Moran’s I, Getis-Ord Gi*, Isolation Forest).
In this project, several anomaly detection techniques from the sklearn package are explored to train a machine learning model that detects credit card fraud. Methods such as Local Outlier Factor and Isolation Forest were used to calculate the anomaly scores. These algorithms use a dataset of slightly under 30,000 credit card transactions to predict fraudulent transactions.
In this machine learning project we conducted an analysis to recognize fraudulent credit card transactions. The dataset contains transactions made by credit cards in September 2013 by European cardholders, where we have 492 frauds out of 284,807 transactions. For anomaly detection, the Local Outlier Factor (LOF) and Isolation Forest algorithms were used. Other libraries used: Pandas, Sklearn, Seaborn, and Matplotlib.
Web access logs are generated each time a website is visited. They contain important information such as IP address, data transferred, and timestamp. The behaviour of clients and customers can be understood from these logs generated on the server. In other words, they provide information about various processes and actions. In our study we implement the Isolation Forest and Local Outlier Factor algorithms for anomaly detection in CDNs.
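The class balance quoted above (492 frauds in 284,807 transactions) gives a quick way to see why accuracy alone can mislead on fraud data. The sketch below is an illustration under stated assumptions: `confusion_metrics` is a hypothetical helper, and the "never flag fraud" baseline is not a model from any listed repository. It only works through the arithmetic.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts.

    Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


TOTAL = 284_807  # transactions, as quoted in the description above
FRAUDS = 492     # fraudulent transactions, as quoted above

# Hypothetical baseline that labels every transaction as legitimate:
# no true or false positives, every fraud is a false negative.
baseline = confusion_metrics(tp=0, fp=0, fn=FRAUDS, tn=TOTAL - FRAUDS)
for name, value in baseline.items():
    print(f"{name}: {value:.4f}")
```

The baseline is right on about 99.8% of transactions yet catches no fraud at all, so its recall and F1 are zero. This is why precision, recall, and F1 are the more informative metrics for such imbalanced data.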
aakritimittal11
Throughout the financial sector, machine learning algorithms are being developed to detect fraudulent transactions. In this project, this is exactly what I did as well. Using a dataset of nearly 28,500 credit card transactions and multiple unsupervised anomaly detection algorithms, I identified transactions with a high probability of being credit card fraud. In this project, I built and deployed the following two machine learning algorithms: Local Outlier Factor (LOF) and Isolation Forest. Furthermore, using metrics such as precision, recall, and F1-score, I investigated why the classification accuracy for these algorithms can be misleading. In addition, I explored the use of data visualization techniques common in data science, such as parameter histograms and correlation matrices, to gain a better understanding of the underlying distribution of data in our data set.
This project proposes a system that detects fraud in credit card transactions. Digitization is increasing day by day, and many people now use digital money instead of cash, which creates opportunities for fraud. Credit card companies need to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase. Throughout the financial sector, machine learning algorithms are being developed to detect fraudulent transactions. Using a dataset of nearly 28,500 credit card transactions and multiple unsupervised anomaly detection algorithms, we are going to identify transactions with a high probability of being credit card fraud. In this project, we will build and deploy two algorithms: Local Outlier Factor (LOF) and Isolation Forest.
PavanKumar181098
Key features: EDA, Logistic Regression, Random Forest, Naive Bayes, Decision Trees, ensemble methods, Ridge and Lasso models. Business objective: predict whether a company will go bankrupt given features that indicate its strength and adaptability, such as industrial risk, credibility, and competitiveness, and find out which features of the data set affect the class variable the most. Data visualizations: there are 250 rows and 7 columns, and each row represents one company. The first six columns hold features such as industrial risk, financial flexibility, and credibility, and the seventh column denotes whether the company is bankrupt. The features take three values (0, 0.5, and 1.0), and the class variable takes two values (bankrupt and non-bankrupt). For industrial risk, 0 means the risk is low and 1 means it is high; for credibility, 0 means credibility is low and 1 means it is high. EDA: first, a heatmap of the correlations between the features and the class variable is plotted. Financial flexibility, credibility, and competitiveness correlate most strongly with the class variable, so they are the most important features. Second, cross-tabulation plots are drawn, such as financial_flexibility against the class variable and operating_risk against the class variable. Third, Isolation Forest anomaly detection is used to detect outliers; three outliers were found and removed. Feature engineering: Univariate Feature Selection and Recursive Feature Elimination are used to identify the most important features for classification. Both algorithms give the same results, and these results coincide with the conclusions of the correlation analysis.
Data modeling: the data is split into training and test sets, with the test set being 20% of the dataset. Various classification models are used: Logistic Regression (which reaches 100% accuracy on both the training and test sets), Ridge, Lasso, and Elastic Net regularized models, k-nearest neighbors, a Naive Bayes classifier, support vector machines, a decision tree classifier, a Random Forest classifier, and neural network models. Each model is trained and then used to predict values on the test set. The finalized model is the Decision Tree classifier, with an F1 score of 0.975. Streamlit is used to deploy this model.
Meowcenary
Outlier detection in Go using the isolation forest algorithm
kunal-visoulia
Credit Card Fraud Detection using Isolation Forest Algorithm and Local Outlier Factor Algorithm
amartyacodes
Credit Card fraud detection using Local Outlier Factor Algorithm and Isolation forest Algorithm