Found 140 repositories (showing 30)
dronefreak
Human action classification system with pose-based (MediaPipe) and video-based (3D CNN) models. Features 100+ architectures for real-time pose classification and temporal models pretrained on UCF-101/HMDB51.
lukereichold
Human action classification for video, offline and natively on iOS via Core ML
mohanrajmit
Video classification using CNN and LSTM
No description available
With recent advances in both Artificial Intelligence (AI) and Internet of Things (IoT) capabilities, it is more feasible than ever to implement surveillance systems that can automatically identify, in real time, people who might pose a security threat to the public. Imagine a surveillance camera system that can detect various on-body weapons, suspicious objects, and traffic. Such a system could transform surveillance cameras from passive sentries into active observers, helping prevent a possible mass shooting in a school, stadium, or mall. In this project, we realized such a system by implementing Smart-Monitor, an AI-powered threat detector for intelligent surveillance cameras. The developed system can be deployed locally on surveillance cameras at the network edge. Deploying AI-enabled surveillance applications at the edge enables initial analysis of the captured images on-site, reducing communication overhead and enabling swift security actions. We developed a mobile app with which users can detect suspicious objects in images and video captured by several cameras at the network edge. The model can also generate a high-quality segmentation mask for each object instance in the photo, along with a confidence percentage. The camera side uses a Raspberry Pi 4, a Neural Compute Stick 2 (NCS 2), a Logitech C920 webcam, motion sensors, buzzers, pushbuttons, LED lights, Python face recognition, and TensorFlow custom object detection. When the system detects motion in the surrounding environment, the motion sensors signal the Raspberry Pi to start capturing images of the physical activity. Using Python's face recognition and TensorFlow 2 custom object detection, Smart-Monitor can recognize eight classes: baseball bat, bird, cat, dog, gun, hammer, knife, and human faces.
Finally, we evaluated the system using various performance metrics such as classification time, accuracy, and scalability.
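The motion-trigger-then-detect flow described above can be sketched as follows. This is a minimal, framework-free illustration: the class names, confidence threshold, and alarm logic are assumptions for the sketch, not the project's actual API.

```python
# Hedged sketch of the Smart-Monitor trigger-and-detect decision step.
# Class names and thresholds are illustrative, not the project's real code.

THREAT_CLASSES = {"gun", "knife", "hammer", "baseball bat"}
ALL_CLASSES = THREAT_CLASSES | {"bird", "cat", "dog", "human face"}

def filter_detections(detections, threshold=0.5):
    """Keep detections of known classes whose confidence clears the threshold."""
    return [d for d in detections
            if d["label"] in ALL_CLASSES and d["confidence"] >= threshold]

def should_alarm(detections, threshold=0.5):
    """Sound the buzzer only when a confident threat-class detection is present."""
    return any(d["label"] in THREAT_CLASSES
               for d in filter_detections(detections, threshold))

if __name__ == "__main__":
    frame_detections = [
        {"label": "cat", "confidence": 0.91},
        {"label": "knife", "confidence": 0.77},
        {"label": "gun", "confidence": 0.32},   # below threshold, ignored
    ]
    print(should_alarm(frame_detections))  # the confident knife detection triggers
```

In a real deployment the detection list would come from the TensorFlow model running on the NCS 2, and the function would gate the buzzer and LED outputs.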
psurya1994
This is the MATLAB code for the paper "Autonomous UAV for Suspicious Action Detection using Pictorial Human Pose Estimation and Classification", published in Electronic Letters on Computer Vision and Image Analysis.
Artificial Intelligence and Machine Learning have empowered our lives to a large extent. The advancements made in this space have revolutionized our society and continue to make it a better place to live. The terms Artificial Intelligence and Machine Learning are often used in the same context, which leads to confusion. AI is the concept of a machine making smart decisions, whereas Machine Learning is a sub-field of AI in which the machine makes decisions by learning patterns from input data. In this blog, we dissect each term and examine how Artificial Intelligence and Machine Learning are related to each other.

What is Artificial Intelligence? The term Artificial Intelligence was first coined in 1956 by John McCarthy at an AI conference. In layman's terms, Artificial Intelligence is about creating intelligent machines that can perform human-like actions. AI is not a modern-day phenomenon; in fact, it has been around since the advent of computers. The only thing that has changed is how we perceive AI and define its applications in the present world. The exponential growth of AI in the last decade or so has affected every sphere of our lives. From a simple Google search that returns the best results for a query to the creation of Siri or Alexa, Artificial Intelligence is one of the significant breakthroughs of the 21st century.

The four types of Artificial Intelligence are: Reactive AI – This type of AI lacks historical data to inform its actions and reacts entirely to the situation at hand. It works on the principle of deep reinforcement learning, in which a reward is given for any successful action and a penalty for an unsuccessful one. Google's AlphaGo defeated expert Go players using this approach. Limited Memory – In the case of limited memory, past data is continually added to the memory.
For example, when selecting the best restaurant, past locations would be taken into account and suggestions made accordingly. Theory of Mind – This type of AI is yet to be built, as it involves dealing with human emotions and psychology. Face and gesture detection come close, but nothing is advanced enough to understand human emotions. Self-Aware – This is the future of AI: machines that could form self-representations and be conscious and super-intelligent.

Two of the most common uses of AI are in the fields of Computer Vision and Natural Language Processing. Computer Vision is the study of identifying objects, as in face recognition, real-time object detection, and so on. Detecting such movements can go a long way toward analyzing the sentiments conveyed by a human being. Natural Language Processing, on the other hand, deals with textual data to extract insights or sentiments from it. From chatbot development to speech recognition systems such as Amazon's Alexa or Apple's Siri, all use natural language to extract relevant meaning from data. It is one of the most popular fields of AI and has found its usefulness in every organization. Another application of AI that has gained popularity in recent times is self-driving cars. They use reinforcement learning to learn the best moves and to identify restrictions or blockages in the road ahead. Many automobile companies are gradually adopting the concept of self-driving cars.

What is Machine Learning? Machine Learning is a state-of-the-art subset of Artificial Intelligence that lets machines learn from past data and make accurate predictions. Machine Learning has been around for decades, and the first ML application to become popular was email spam filter classification. The system is trained on a set of emails labeled 'spam' and 'not spam', known as the training instances.
A new set of unseen emails is then fed to the trained system, which categorizes each as 'spam' or 'not spam.' These predictions are made by regression and classification algorithms such as Linear Regression, Logistic Regression, Decision Tree, Random Forest, XGBoost, and so on. Which algorithm to use depends on the problem statement and the data set at hand. Along with these basic algorithms, a sub-field of Machine Learning that has gained immense popularity in recent times is Deep Learning. Deep Learning, however, requires enormous computational power and works best with massive amounts of data. It uses neural networks whose architecture is loosely inspired by the human brain.

Machine Learning can be subdivided into three categories: Supervised Learning – In supervised learning problems, both the input features and the corresponding target variable are present in the dataset. Unsupervised Learning – The dataset in an unsupervised learning problem is not labeled, i.e., only the input features are present, not the target variable. The algorithms need to find separate clusters in the dataset based on certain patterns. Reinforcement Learning – In this type of problem, the learner is rewarded for every correct move and penalized for every incorrect move.

The applications of Machine Learning are diversified across domains like banking, healthcare, retail, etc. One use case in the banking industry is predicting the probability of credit loan default by a borrower given their past transactions, credit history, debt ratio, annual income, and so on. In healthcare, Machine Learning is often used to predict a patient's length of stay in the hospital, the likelihood of occurrence of a disease, abnormal patterns in cells, etc. Many software companies have incorporated Machine Learning into their workflows to speed up the process of testing.
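The email spam filter mentioned above, a classic supervised learning example, can be sketched with a tiny hand-rolled logistic regression. The vocabulary, toy emails, and hyperparameters are illustrative assumptions, not a real dataset or production filter.

```python
# Minimal supervised-learning sketch: logistic regression on hand-built
# bag-of-words features, trained with plain stochastic gradient descent.
import math

VOCAB = ["free", "winner", "meeting", "report"]

def featurize(text):
    """Count how often each vocabulary word appears in the email."""
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

def train(emails, labels, lr=0.5, epochs=200):
    w = [0.0] * len(VOCAB)
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(map(featurize, emails), labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))       # sigmoid probability of spam
            err = p - y                          # gradient of the log-loss
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(model, text):
    w, b = model
    z = sum(wi * xi for wi, xi in zip(w, featurize(text))) + b
    return "spam" if z > 0 else "not spam"

emails = ["free winner free", "winner free prize",
          "quarterly report meeting", "meeting notes report"]
labels = [1, 1, 0, 0]  # 1 = spam, 0 = not spam (the labeled training instances)
model = train(emails, labels)
print(predict(model, "free prize winner"))      # expected: spam
```

The same train-then-predict shape underlies the heavier algorithms named above (Decision Tree, Random Forest, XGBoost); only the model family changes.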
Various manual, repetitive tasks are being replaced by machine learning models.

Comparison Between AI and Machine Learning: Machine Learning is the subset of Artificial Intelligence that has taken AI to a whole new level. The idea of letting computers learn by themselves from the voluminous data generated from various sources in the present world has led to the emergence of Machine Learning. In Machine Learning, neural networks play a significant role in allowing a system to learn on its own while maintaining speed and accuracy. A stack of neural layers lets a model rectify its prior decisions and make more accurate predictions next time. Artificial Intelligence is about acquiring knowledge and applying it to ensure success rather than accuracy; it makes the computer intelligent enough to make smart decisions on its own, akin to the decisions made by a human being. The more complex the problem, the better suited AI is to solving it. Machine Learning, on the other hand, is mostly about acquiring knowledge and maximizing accuracy rather than success; its primary aim is to learn from data to automate specific tasks. The possibilities around Machine Learning and neural networks are endless. Sentiment can be extracted from raw text. A machine learning application can also listen to music, and even play an appropriate piece of music based on a person's mood. NLP, a field of AI that has made some ground-breaking innovations in recent years, uses Machine Learning to understand the nuances of natural language and learn to respond accordingly. Different sectors like banking, healthcare, and manufacturing are reaping the benefits of Artificial Intelligence, particularly Machine Learning. Many tedious tasks are being automated through ML, saving both time and money.
Machine Learning has been consistently oversold by marketers even before it has reached its full potential. AI may be seen as old hat by marketers who believe Machine Learning is the Holy Grail of analytics. The future is not far off when we will see human-like AI; the rapid advancement of technology has taken us closer to that point than ever before, and the recent progress in working AI is largely down to how Machine Learning operates. Both Artificial Intelligence and Machine Learning have their own business applications, and their usage depends entirely on the requirements of an organization. AI is an age-old concept, with Machine Learning picking up the pace in recent times. Companies like TCS and Infosys have yet to unleash the full potential of Machine Learning and are trying to incorporate ML into their applications to keep pace with the rapidly growing analytics space.

Conclusion: The hype around Artificial Intelligence and Machine Learning is such that various companies, and even individuals, want to master the skills without knowing the difference between the two. The terms are often misused in the same context. To master Machine Learning, one needs a natural intuition about data, the ability to ask the right questions, and the judgment to choose the correct algorithms to build a model; it often does not require much computational capacity. AI, on the other hand, is about building intelligent systems, which requires advanced tools and techniques and is often practiced at big companies like Google, Facebook, etc. There is a whole host of resources for mastering Machine Learning and AI. The Data Science blogs of Dimensionless are a good place to start, and there are also online Data Science courses covering the various nitty-gritty of Machine Learning.
Abstract— Violence detection has been investigated extensively in the literature. Recently, IoT-based violence video surveillance has become an intelligent component integrated into the security systems of smart buildings. A violence video detector is a specific kind of detection model that should be highly accurate, to increase the model's sensitivity and reduce the false alarm rate. This paper proposes a novel ConvLSTM architecture that can run on a low-cost Internet of Things (IoT) device such as a Raspberry Pi board. The paper utilizes convolutional neural networks (CNNs) to learn spatial features from video frames, which are fed to a Long Short-Term Memory (LSTM) network for video classification into violence/non-violence classes. A composite dataset combining two public datasets, RWF-2000 and RLVS-2000, was used for model training and evaluation. The challenging video content includes crowds and chaos, small objects at far distances, low resolution, and transient actions. Additionally, the videos were captured in various environments such as streets, prisons, and schools, with several human actions such as playing football, basketball, or tennis, swimming, and eating. The experimental results show the high performance of the proposed violence detection model in terms of average metrics: an accuracy of 73.35%, recall of 76.90%, precision of 72.53%, F1 score of 74.01%, false negative rate of 23.10%, false positive rate of 30.20%, and AUC of 82.0%.
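The CNN-then-LSTM pipeline itself requires a deep-learning framework, but the clip-level decision step that addresses the false-alarm concern above can be sketched without one: smooth the per-frame violence scores over time and threshold the result. The scores, window size, and threshold here are hypothetical stand-ins, not values from the paper.

```python
# Hedged sketch of clip-level temporal smoothing over frame-wise violence
# scores (stand-ins for the CNN+LSTM outputs described in the abstract).

def moving_average(scores, window=5):
    """Mean of each sliding window of frame scores."""
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]

def classify_clip(scores, window=5, threshold=0.5):
    """Flag the clip as violent if any smoothed window crosses the threshold.
    Smoothing suppresses single-frame spikes, reducing false alarms."""
    return any(s >= threshold for s in moving_average(scores, window))

# Sustained high scores -> violent; an isolated spike -> not violent.
frame_scores = [0.1, 0.2, 0.9, 0.1, 0.2, 0.8, 0.85, 0.9, 0.95, 0.9]
print(classify_clip(frame_scores))
```

An LSTM learns this kind of temporal aggregation rather than hard-coding it; the sketch only illustrates why temporal context matters for the sensitivity/false-alarm trade-off.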
dodowujiayu
No description available
programindz
Fine-tuning Google's Vision Transformer with the LoRA technique. Two different LoRA adapters are tuned for separate classification tasks (food and human actions). A simple Gradio interface is implemented to run inference.
mragpavank
Business Problem: IBM HR Analytics Employee Attrition & Performance. Predict attrition of your valuable employees. Attrition is a problem that impacts all businesses, irrespective of geography, industry, and company size. Employee attrition leads to significant costs for a business, including the cost of business disruption and of hiring and training new staff. As such, there is great business interest in understanding the drivers of, and minimizing, staff attrition. In this context, the use of classification models to predict whether an employee is likely to quit could greatly increase HR's ability to intervene in time and remedy the situation to prevent attrition. While this model can be routinely run to identify employees who are most likely to quit, the key driver of success would be the human element: reaching out to the employee, understanding their current situation, and taking action to remedy controllable factors that can prevent attrition. This data set presents an employee survey from IBM, indicating whether or not there is attrition. The data set contains approximately 1500 entries. Given its limited size, the model should only be expected to provide a modest improvement in the identification of attrition versus a random allocation of attrition probability. While some level of attrition in a company is inevitable, minimizing it and being prepared for the cases that cannot be helped will significantly help improve the operations of most businesses. As a future development, a sufficiently large data set could be used to run a segmentation on employees and develop certain "at risk" categories. This could generate new insights for the business on what drives attrition, insights that cannot be generated merely by informational interviews with employees.
Uncover the factors that lead to employee attrition and explore important questions such as 'show me a breakdown of distance from home by job role and attrition' or 'compare average monthly income by education and attrition'. This is a fictional data set created by IBM data scientists.
Education: 1 'Below College', 2 'College', 3 'Bachelor', 4 'Master', 5 'Doctor'
EnvironmentSatisfaction: 1 'Low', 2 'Medium', 3 'High', 4 'Very High'
JobInvolvement: 1 'Low', 2 'Medium', 3 'High', 4 'Very High'
JobSatisfaction: 1 'Low', 2 'Medium', 3 'High', 4 'Very High'
PerformanceRating: 1 'Low', 2 'Good', 3 'Excellent', 4 'Outstanding'
RelationshipSatisfaction: 1 'Low', 2 'Medium', 3 'High', 4 'Very High'
WorkLifeBalance: 1 'Bad', 2 'Good', 3 'Better', 4 'Best'
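The ordinal survey codes listed above can be decoded into readable labels before exploratory analysis. A minimal sketch, assuming the row is a plain dict; the example employee record is invented.

```python
# Sketch: decode the IBM attrition dataset's ordinal survey codes into labels.
# The codebook mirrors the coding scheme listed in the dataset description.

CODEBOOK = {
    "Education": {1: "Below College", 2: "College", 3: "Bachelor",
                  4: "Master", 5: "Doctor"},
    "EnvironmentSatisfaction": {1: "Low", 2: "Medium", 3: "High", 4: "Very High"},
    "JobInvolvement": {1: "Low", 2: "Medium", 3: "High", 4: "Very High"},
    "JobSatisfaction": {1: "Low", 2: "Medium", 3: "High", 4: "Very High"},
    "PerformanceRating": {1: "Low", 2: "Good", 3: "Excellent", 4: "Outstanding"},
    "RelationshipSatisfaction": {1: "Low", 2: "Medium", 3: "High", 4: "Very High"},
    "WorkLifeBalance": {1: "Bad", 2: "Good", 3: "Better", 4: "Best"},
}

def decode_row(row):
    """Replace coded survey answers with their labels; other fields pass through."""
    return {col: CODEBOOK[col].get(val, val) if col in CODEBOOK else val
            for col, val in row.items()}

employee = {"Education": 4, "JobSatisfaction": 1, "MonthlyIncome": 5200}
print(decode_row(employee))
# {'Education': 'Master', 'JobSatisfaction': 'Low', 'MonthlyIncome': 5200}
```

With pandas one would apply the same codebook via `DataFrame.replace`, but the mapping itself is the important part.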
yangwangx
Improving Human Action Recognition by Non-action Classification
The human hand plays a crucial role in conveying emotions and carrying out most day-to-day activities. Therefore, numerous modern technologies - ranging from gesture control to autonomous driving - would benefit from the reliable recognition of certain hand actions. This can be done using a two-step approach, in which hand poses are first obtained from video frames and the resulting sequences are then classified in the 3D skeleton space. Existing techniques that aim to solve the second step are mostly based on deep learning methods. Given the high complexity and dimensionality of the human hand, these require large amounts of training data to achieve good performance. As the collection of precisely annotated hand pose data is time-consuming and expensive, data augmentation appears to be an advantageous practice for increasing the recognition accuracy of a given classifier. This thesis proposes a suitable WGAN-GP architecture for the generation of synthetic hand skeleton sequences of variable length. The recommended critic consists of a multi-layer perceptron with three hidden layers, while the generator is based on two RNNs and receives a start frame as input. Both networks are conditioned on the action class. The best performing model was trained on multiple classes simultaneously and selected based on the smallest generator loss. When its synthetic samples were used to augment the training set of a 1-layer LSTM classifier, the classification error decreased on several subsets as well as on the complete dataset. Quantitative results show that the chosen GAN-based data augmentation outperforms alternative standard methods. Furthermore, no clear correlation was found between the visual appearance of the generated samples and their resulting improvement in recognition accuracy.
Motor Imagery-based Brain-Computer Interfaces (MI-BCI) promise to revolutionize the way humans interact with machinery or software, performing actions by just thinking about them. Patients suffering from critical movement disabilities, such as amyotrophic lateral sclerosis (ALS) or tetraplegia, could use this technology to control a wheelchair, robotic prostheses, or any other device that lets them interact independently with their surroundings. The focus of this project is to aid communities affected by these disorders by developing a method capable of detecting, as accurately as possible, the intention to execute movements (without them occurring) in the upper extremities of the body. This is done through signals acquired with an electroencephalogram (EEG), their conditioning and processing, and their subsequent classification with artificial intelligence models. In addition, a digital signal filter is designed to keep the most characteristic frequency bands of each individual and increase accuracy significantly. After extracting discriminative statistical, frequential, and spatial features, it was possible to obtain 88% accuracy on validation data when detecting whether a participant was imagining a left-hand or a right-hand movement. Furthermore, a Convolutional Neural Network (CNN) was used to distinguish whether the participant was imagining a movement at all, achieving 78% accuracy and 90% precision. These results will be verified by implementing a real-time simulation with a robotic arm.
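One of the frequential features mentioned above is band power in a characteristic frequency band. A dependency-free sketch, assuming a naive DFT, a 250 Hz sampling rate, and a synthetic 10 Hz signal; real pipelines would use an FFT and measured EEG.

```python
# Hedged sketch: band power of an EEG channel via a naive DFT.
# Sampling rate, band edges, and the synthetic signal are assumptions.
import math

def band_power(signal, fs, f_lo, f_hi):
    """Sum of squared DFT magnitudes for frequency bins inside [f_lo, f_hi] Hz."""
    n = len(signal)
    power = 0.0
    for k in range(n // 2 + 1):
        freq = k * fs / n
        if f_lo <= freq <= f_hi:
            re = sum(signal[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
            im = -sum(signal[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
            power += re * re + im * im
    return power

fs = 250                          # Hz, a common EEG sampling rate
t = [i / fs for i in range(fs)]   # one second of samples
mu_rhythm = [math.sin(2 * math.pi * 10 * ti) for ti in t]  # 10 Hz component

# Power concentrates in the mu band (8-12 Hz), a classic MI-BCI feature band.
print(band_power(mu_rhythm, fs, 8, 12) > band_power(mu_rhythm, fs, 20, 30))
```

The per-individual filter described in the abstract would pick these band edges per participant before feature extraction.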
asieraguado
Machine learning for human pose and action classification
Aryia-Behroziuan
Knowledge representation is a field of artificial intelligence that focuses on designing computer representations that capture information about the world that can be used to solve complex problems. The justification for knowledge representation is that conventional procedural code is not the best formalism to use to solve complex problems. Knowledge representation makes complex software easier to define and maintain than procedural code and can be used in expert systems. For example, talking to experts in terms of business rules rather than code lessens the semantic gap between users and developers and makes development of complex systems more practical. Knowledge representation goes hand in hand with automated reasoning because one of the main purposes of explicitly representing knowledge is to be able to reason about that knowledge, to make inferences, assert new knowledge, etc. Virtually all knowledge representation languages have a reasoning or inference engine as part of the system.[10] A key trade-off in the design of a knowledge representation formalism is that between expressivity and practicality. The ultimate knowledge representation formalism in terms of expressive power and compactness is First Order Logic (FOL). There is no more powerful formalism than that used by mathematicians to define general propositions about the world. However, FOL has two drawbacks as a knowledge representation formalism: ease of use and practicality of implementation. First order logic can be intimidating even for many software developers. Languages that do not have the complete formal power of FOL can still provide close to the same expressive power with a user interface that is more practical for the average developer to understand. The issue of practicality of implementation is that FOL in some ways is too expressive. With FOL it is possible to create statements (e.g.
quantification over infinite sets) that would cause a system to never terminate if it attempted to verify them. Thus, a subset of FOL can be both easier to use and more practical to implement. This was a driving motivation behind rule-based expert systems. IF-THEN rules provide a subset of FOL but a very useful one that is also very intuitive. The history of most of the early AI knowledge representation formalisms; from databases to semantic nets to theorem provers and production systems can be viewed as various design decisions on whether to emphasize expressive power or computability and efficiency.[11] In a key 1993 paper on the topic, Randall Davis of MIT outlined five distinct roles to analyze a knowledge representation framework:[12] A knowledge representation (KR) is most fundamentally a surrogate, a substitute for the thing itself, used to enable an entity to determine consequences by thinking rather than acting, i.e., by reasoning about the world rather than taking action in it. It is a set of ontological commitments, i.e., an answer to the question: In what terms should I think about the world? It is a fragmentary theory of intelligent reasoning, expressed in terms of three components: (i) the representation's fundamental conception of intelligent reasoning; (ii) the set of inferences the representation sanctions; and (iii) the set of inferences it recommends. It is a medium for pragmatically efficient computation, i.e., the computational environment in which thinking is accomplished. One contribution to this pragmatic efficiency is supplied by the guidance a representation provides for organizing information so as to facilitate making the recommended inferences. It is a medium of human expression, i.e., a language in which we say things about the world. Knowledge representation and reasoning are a key enabling technology for the Semantic Web. 
Languages based on the Frame model with automatic classification provide a layer of semantics on top of the existing Internet. Rather than searching via text strings as is typical today, it will be possible to define logical queries and find pages that map to those queries.[13] The automated reasoning component in these systems is an engine known as the classifier. Classifiers focus on the subsumption relations in a knowledge base rather than rules. A classifier can infer new classes and dynamically change the ontology as new information becomes available. This capability is ideal for the ever-changing and evolving information space of the Internet.[14] The Semantic Web integrates concepts from knowledge representation and reasoning with markup languages based on XML. The Resource Description Framework (RDF) provides the basic capabilities to define knowledge-based objects on the Internet with basic features such as Is-A relations and object properties. The Web Ontology Language (OWL) adds additional semantics and integrates with automatic classification reasoners.[15]
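The IF-THEN rule subset of FOL discussed above, the driving motivation behind rule-based expert systems, can be sketched as a tiny forward-chaining inference engine. The rules and facts are toy illustrations, not a real knowledge base.

```python
# Minimal forward-chaining engine for IF-THEN rules: repeatedly fire any rule
# whose conditions are all known facts, asserting its conclusion as new
# knowledge, until no rule adds anything further.

def forward_chain(facts, rules):
    """facts: set of proposition strings; rules: list of (conditions, conclusion)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in facts and all(c in facts for c in conditions):
                facts.add(conclusion)   # inference asserts new knowledge
                changed = True
    return facts

rules = [
    (["has_feathers"], "is_bird"),
    (["is_bird", "can_swim"], "is_penguin_like"),
]
derived = forward_chain({"has_feathers", "can_swim"}, rules)
print(sorted(derived))
# ['can_swim', 'has_feathers', 'is_bird', 'is_penguin_like']
```

This illustrates the expressivity/practicality trade-off: the engine always terminates because it only adds ground facts, which is exactly what unrestricted FOL quantification cannot guarantee.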
PRITHIVSAKTHIUR
Human-Action-Recognition is an image classification vision-language encoder model fine-tuned from google/siglip2-base-patch16-224 for multi-class human action recognition. It uses the SiglipForImageClassification architecture to predict human activities from still images.
Human action classification: here we zoom in more closely on the body of the person performing the action, considering the interactions between different body parts and joints.
jashswayam
This project implements a real-time multi-person pose classification system using YOLOv11Pose for pose detection and a custom LSTM-based neural network for action classification. The application can process both video files and live webcam streams, detecting and classifying human actions based on pose keypoints.
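Before an LSTM can classify a pose sequence, such a pipeline needs a per-person sliding window of keypoints. A minimal sketch of that buffering step; the window size, keypoint count, and class names are assumptions, not this project's actual code.

```python
# Sketch: per-person sliding-window buffer of pose keypoints, as a multi-person
# pose classifier needs before handing sequences to a temporal model.
from collections import defaultdict, deque

WINDOW = 30          # frames fed to the classifier per decision (assumed)
NUM_KEYPOINTS = 17   # e.g. a COCO-style skeleton (assumed)

class PoseBuffer:
    def __init__(self, window=WINDOW):
        self.window = window
        self.tracks = defaultdict(lambda: deque(maxlen=window))

    def add(self, person_id, keypoints):
        """Append one frame's (x, y) keypoints for a tracked person."""
        assert len(keypoints) == NUM_KEYPOINTS
        self.tracks[person_id].append(keypoints)

    def ready(self, person_id):
        """True once the window is full and the sequence can be classified."""
        return len(self.tracks[person_id]) == self.window

buf = PoseBuffer(window=3)
frame = [(0.0, 0.0)] * NUM_KEYPOINTS
for _ in range(3):
    buf.add(1, frame)
print(buf.ready(1))  # True: person 1 has a full 3-frame window
```

`deque(maxlen=...)` silently discards the oldest frame, which is exactly the sliding-window behavior a live webcam stream needs.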
sanket-pixel
This repository contains my projects related to video analytics, with topics like video clip classification, temporal video segmentation, spatio-temporal action detection, spatio-temporal modeling of humans and objects, anticipation, weakly supervised learning, video summarization, affordance, egocentric videos, and multi-person tracking.
No description available
kelseyzeng0610
A real-time human action recognition system implementing the Action Transformer (AcT) architecture for pose-based action classification using webcam input.
shayanray
Train a recurrent neural network for human action classification. RNNs are designed to handle sequential data.
banasiddhaPatil
We implement a proposed model for recognizing human actions based on yoga pose classification, using a convolutional neural network, image processing, and deep learning.
This project analyzes biometric identification and tracking technologies for human-computer interaction. Based on a face detection algorithm, we propose a position-based head motion detection algorithm that does not depend on specific biometric identification and tracking. It uses a feature classification method to detect eye opening and closing actions.
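One common feature-based approach to eye open/close detection is the eye aspect ratio (EAR) over six eye landmarks; this sketch assumes that formulation, which may differ from the project's actual method, and the landmark coordinates are invented.

```python
# Sketch: eye open/close classification via the eye aspect ratio (EAR) over
# six eye landmarks p1..p6 (an assumption; the project's exact feature
# classification method is not specified).
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def eye_aspect_ratio(p):
    """EAR = (|p2-p6| + |p3-p5|) / (2 * |p1-p4|): drops toward 0 as the eye closes."""
    return (dist(p[1], p[5]) + dist(p[2], p[4])) / (2.0 * dist(p[0], p[3]))

def eye_is_open(landmarks, threshold=0.2):
    return eye_aspect_ratio(landmarks) > threshold

# Invented landmark sets: tall eye contour vs. nearly flat one.
open_eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
closed_eye = [(0, 0), (1, 0.1), (2, 0.1), (3, 0), (2, -0.1), (1, -0.1)]
print(eye_is_open(open_eye), eye_is_open(closed_eye))  # True False
```

Because the ratio normalizes vertical by horizontal extent, it is largely invariant to the eye's distance from the camera, which suits the position-based detection described above.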
philipp-hellwig
Human action classification in videos using CNNs and LSTMs.
Jahidur1414
No description available
Ruta-kul
No description available
bwleith
Human action image classification
penn-figueroa-lab
No description available