Found 204 repositories (showing 30)
molyswu
using Neural Networks (SSD) on Tensorflow. This repo documents steps and scripts used to train a hand detector using Tensorflow (Object Detection API). As with any DNN based task, the most expensive (and riskiest) part of the process has to do with finding or creating the right (annotated) dataset. I was interested mainly in detecting hands on a table (egocentric view point). I experimented first with the [Oxford Hands Dataset](http://www.robots.ox.ac.uk/~vgg/data/hands/) (the results were not good). I then tried the [Egohands Dataset](http://vision.soic.indiana.edu/projects/egohands/), which was a much better fit for my requirements.

The goal of this repo/post is to demonstrate how neural networks can be applied to the (hard) problem of tracking hands (egocentric and other views), and, better still, to provide code that can be adapted to other use cases. If you use this tutorial or models in your research or project, please cite [this](#citing-this-tutorial).

Here is the detector in action.

<img src="images/hand1.gif" width="33.3%"><img src="images/hand2.gif" width="33.3%"><img src="images/hand3.gif" width="33.3%">

Realtime detection on video stream from a webcam.

<img src="images/chess1.gif" width="33.3%"><img src="images/chess2.gif" width="33.3%"><img src="images/chess3.gif" width="33.3%">

Detection on a Youtube video.

Both examples above were run on a macbook pro **CPU** (i7, 2.5GHz, 16GB). Some fps numbers are:

| FPS | Image Size | Device | Comments |
| ------------- | ------------- | ------------- | ------------- |
| 21 | 320 * 240 | Macbook pro (i7, 2.5GHz, 16GB) | Run without visualizing results |
| 16 | 320 * 240 | Macbook pro (i7, 2.5GHz, 16GB) | Run while visualizing results (image above) |
| 11 | 640 * 480 | Macbook pro (i7, 2.5GHz, 16GB) | Run while visualizing results (image above) |

> Note: The code in this repo is written and tested with Tensorflow `1.4.0-rc0`. Using a different version may result in [some errors](https://github.com/tensorflow/models/issues/1581). You may need to [generate your own frozen model](https://pythonprogramming.net/testing-custom-object-detector-tensorflow-object-detection-api-tutorial/?completed=/training-custom-objects-tensorflow-object-detection-api-tutorial/) graph using the [model checkpoints](model-checkpoint) in the repo to fit your TF version.

**Content of this document**

- Motivation - Why Track/Detect hands with Neural Networks
- Data preparation and network training in Tensorflow (Dataset, Import, Training)
- Training the hand detection Model
- Using the Detector to Detect/Track hands
- Thoughts on Optimizations.

> P.S. if you are using or have used the models provided here, feel free to reach out on twitter ([@vykthur](https://twitter.com/vykthur)) and share your work!

## Motivation - Why Track/Detect hands with Neural Networks?

There are several existing approaches to tracking hands in the computer vision domain. Incidentally, many of these approaches are rule based (e.g. extracting background based on texture and boundary features, distinguishing between hands and background using color histograms and HOG classifiers), making them not very robust. For example, these algorithms might get confused if the background is unusual, in situations where sharp changes in lighting conditions cause sharp changes in skin color, or when the tracked object becomes occluded (see [here for a review](https://www.cse.unr.edu/~bebis/handposerev.pdf) paper on hand pose estimation from the HCI perspective).
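For illustration, here is a minimal sketch of the kind of rule-based pipeline described above: thresholding skin color in HSV space with OpenCV and keeping large contours. The threshold values are rough assumptions that need per-scene tuning, which is exactly the brittleness the neural-network approach avoids.

```python
import cv2
import numpy as np

# Rough skin-tone bounds in HSV; these values are illustrative assumptions
# and break down under lighting changes -- the brittleness discussed above.
LOWER_SKIN = np.array([0, 48, 80], dtype=np.uint8)
UPPER_SKIN = np.array([20, 255, 255], dtype=np.uint8)

def find_hand_candidates(frame_bgr):
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, LOWER_SKIN, UPPER_SKIN)
    # Clean up speckle noise before looking for contours.
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    # [-2] keeps this compatible with both OpenCV 3 and 4 return signatures.
    contours = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    # Keep only large regions; everything else is treated as background.
    return [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) > 1000]
```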
With sufficiently large datasets, neural networks provide the opportunity to train models that perform well and that address the challenges of existing object tracking/detection algorithms - varied/poor lighting, noisy environments, diverse viewpoints and even occlusion. The main drawbacks to using them for real-time tracking/detection are that they can be complex, are relatively slow compared to tracking-only algorithms, and it can be quite expensive to assemble a good dataset. But things are changing with advances in fast neural networks.

Furthermore, this entire area of work has been made more approachable by deep learning frameworks (such as the tensorflow object detection api) that simplify the process of training a model for custom object detection. More importantly, the advent of fast neural network models like ssd, faster r-cnn, rfcn (see [here](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md#coco-trained-models-coco-models)) etc. makes neural networks an attractive candidate for real-time detection (and tracking) applications. Hopefully, this repo demonstrates this.

> If you are not interested in the process of training the detector, you can skip straight to applying the [pretrained model I provide in detecting hands](#detecting-hands).

Training a model is a multi-stage process (assembling the dataset, cleaning, splitting into training/test partitions and generating an inference graph). While I lightly touch on the details of these parts, there are a few other tutorials that cover training a custom object detector using the tensorflow object detection api in more detail [see [here](https://pythonprogramming.net/training-custom-objects-tensorflow-object-detection-api-tutorial/) and [here](https://towardsdatascience.com/how-to-train-your-own-object-detector-with-tensorflows-object-detector-api-bec72ecfe1d9)]. I recommend you walk through those if you are interested in training a custom object detector from scratch.

## Data preparation and network training in Tensorflow (Dataset, Import, Training)

**The Egohands Dataset**

The hand detector model is built using data from the [Egohands Dataset](http://vision.soic.indiana.edu/projects/egohands/). This dataset works well for several reasons. It contains high quality, pixel level annotations (>15000 ground truth labels) where hands are located across 4800 images. All images are captured from an egocentric view (Google glass) across 48 different environments (indoor, outdoor) and activities (playing cards, chess, jenga, solving puzzles etc).

<img src="images/egohandstrain.jpg" width="100%">

If you will be using the Egohands dataset, you can cite them as follows:

> Bambach, Sven, et al. "Lending a hand: Detecting hands and recognizing activities in complex egocentric interactions." Proceedings of the IEEE International Conference on Computer Vision. 2015.

The Egohands dataset (zip file with labelled data) contains 48 folders of locations where video data was collected (100 images per folder).

```
-- LOCATION_X
  -- frame_1.jpg
  -- frame_2.jpg
  ...
  -- frame_100.jpg
  -- polygons.mat  // contains annotations for all 100 images in current folder
-- LOCATION_Y
  -- frame_1.jpg
  -- frame_2.jpg
  ...
  -- frame_100.jpg
  -- polygons.mat  // contains annotations for all 100 images in current folder
```
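The `polygons.mat` files store hand outlines as polygons rather than boxes, so the data prep step described below has to reduce each polygon to a bounding box. A minimal sketch of that reduction, assuming each annotation arrives as an N×2 array of (x, y) vertices; the key and field names inside the `.mat` file are not shown here and should be inspected before relying on any access pattern:

```python
import numpy as np
from scipy.io import loadmat  # polygons.mat is a MATLAB file; scipy can read it

def polygon_to_bbox(points):
    """Reduce an (N, 2) array of polygon vertices to (xmin, ymin, xmax, ymax)."""
    points = np.asarray(points)
    xs, ys = points[:, 0], points[:, 1]
    return float(xs.min()), float(ys.min()), float(xs.max()), float(ys.max())

# Example with a made-up triangle annotation:
print(polygon_to_bbox([[120, 80], [200, 95], [150, 160]]))  # (120.0, 80.0, 200.0, 160.0)

# To inspect a real annotation file, start with something like:
# mat = loadmat("LOCATION_X/polygons.mat")
# print(mat.keys())  # field names vary; check before indexing
```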
**Converting data to Tensorflow Format**

Some initial work needs to be done on the Egohands dataset to transform it into the format (`tfrecord`) which Tensorflow needs to train a model. This repo contains `egohands_dataset_clean.py`, a script that will help you generate these csv files. It:

- Downloads the egohands datasets
- Renames all files to include their directory names to ensure each filename is unique
- Splits the dataset into train (80%), test (10%) and eval (10%) folders
- Reads in `polygons.mat` for each folder, generates bounding boxes and visualizes them to ensure correctness (see image above)

Once the script is done running, you should have an images folder containing three folders - train, test and eval. Each of these folders should also contain a csv label document - `train_labels.csv`, `test_labels.csv` - that can be used to generate `tfrecords`.

Note: While the egohands dataset provides four separate labels for hands (own left, own right, other left, and other right), for my purpose I am only interested in the general `hand` class, so I label all training data as `hand`. You can modify the data prep script to generate `tfrecords` that support 4 labels.

Next: convert your dataset + csv files to tfrecords. A helpful guide on this can be found [here](https://pythonprogramming.net/creating-tfrecord-files-tensorflow-object-detection-api-tutorial/). For each folder, you should be able to generate the `train.record` and `test.record` files required in the training process.
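For orientation, here is a minimal sketch of the core of that conversion: packing a single image and its hand boxes into a `tf.train.Example` using the feature keys the object detection api expects. The surrounding I/O and the helper name are assumptions; see the linked guide for the full script.

```python
import tensorflow as tf

def make_tf_example(encoded_jpg, width, height, filename, boxes):
    """boxes: list of (xmin, ymin, xmax, ymax) in pixels for the 'hand' class."""
    xmins = [b[0] / width for b in boxes]   # normalized to [0, 1]
    ymins = [b[1] / height for b in boxes]
    xmaxs = [b[2] / width for b in boxes]
    ymaxs = [b[3] / height for b in boxes]
    feature = {
        'image/encoded': tf.train.Feature(bytes_list=tf.train.BytesList(value=[encoded_jpg])),
        'image/format': tf.train.Feature(bytes_list=tf.train.BytesList(value=[b'jpeg'])),
        'image/filename': tf.train.Feature(bytes_list=tf.train.BytesList(value=[filename.encode()])),
        'image/width': tf.train.Feature(int64_list=tf.train.Int64List(value=[width])),
        'image/height': tf.train.Feature(int64_list=tf.train.Int64List(value=[height])),
        'image/object/bbox/xmin': tf.train.Feature(float_list=tf.train.FloatList(value=xmins)),
        'image/object/bbox/ymin': tf.train.Feature(float_list=tf.train.FloatList(value=ymins)),
        'image/object/bbox/xmax': tf.train.Feature(float_list=tf.train.FloatList(value=xmaxs)),
        'image/object/bbox/ymax': tf.train.Feature(float_list=tf.train.FloatList(value=ymaxs)),
        # Single class: every box is labeled 'hand' with class id 1.
        'image/object/class/text': tf.train.Feature(bytes_list=tf.train.BytesList(value=[b'hand'] * len(boxes))),
        'image/object/class/label': tf.train.Feature(int64_list=tf.train.Int64List(value=[1] * len(boxes))),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))
```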
## Training the hand detection Model

Now that the dataset has been assembled (and your tfrecords generated), the next task is to train a model based on it. With neural networks, it is possible to use a process called [transfer learning](https://www.tensorflow.org/tutorials/image_retraining) to shorten the amount of time needed to train the entire model. This means we can take an existing model (that has been trained well on a related domain, here image classification) and retrain its final layer(s) to detect hands for us. Sweet! Given that neural networks sometimes have thousands or millions of parameters that can take weeks or months to train, transfer learning helps shorten training time to possibly hours. Tensorflow does offer a few models (in the tensorflow [model zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md#coco-trained-models-coco-models)) and I chose to use the `ssd_mobilenet_v1_coco` model as my start point given it is currently (one of) the fastest models (read the SSD research [paper here](https://arxiv.org/pdf/1512.02325.pdf)).

The training process can be done locally on your CPU machine, which may take a while, or better on a (cloud) GPU machine (which is what I did). For reference, training on my macbook pro (tensorflow compiled from source to take advantage of the mac's cpu architecture), the maximum speed I got was 5 seconds per step, as opposed to the ~0.5 seconds per step I got with a GPU. It would take about 12 days to run 200k steps on my mac (i7, 2.5GHz, 16GB) compared to ~5hrs on a GPU.

> **Training on your own images**: Please use the [guide provided by Harrison from pythonprogramming](https://pythonprogramming.net/training-custom-objects-tensorflow-object-detection-api-tutorial/) on how to generate tfrecords given your label csv files and your images. The guide also covers how to start the training process if training locally. If training in the cloud using a service like GCP, see the [guide here](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_on_cloud.md).

As the training process progresses, the expectation is that total loss (error) gets reduced to its possible minimum (about a value of 1 or thereabouts). By observing the tensorboard graphs for total loss (see image below), it should be possible to get an idea of when the training process is complete (total loss does not decrease with further iterations/steps). I ran my training job for 200k steps (which took about 5 hours) and stopped at a total loss value of 2.575. (In retrospect, I could have stopped the training at about 50k steps and gotten a similar total loss value.) With tensorflow, you can also run an evaluation concurrently that assesses your model to see how well it performs on the test data. A commonly used metric for performance is mean average precision (mAP), a single number used to summarize the area under the precision-recall curve. mAP is a measure of how well the model generates a bounding box that has at least a 50% overlap with the ground truth bounding box in our test dataset. For the hand detector trained here, the mAP value was **0.9686@0.5IOU**. mAP values range from 0-1; the higher, the better.

<img src="images/accuracy.jpg" width="100%">

Once training is completed, the trained inference graph (`frozen_inference_graph.pb`) is exported (see the earlier referenced guides for how to do this) and saved in the `hand_inference_graph` folder. Now it's time to do some interesting detection.

## Using the Detector to Detect/Track hands

If you have not done this yet, please follow the guide on installing [Tensorflow and the Tensorflow object detection api](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/installation.md). This will walk you through setting up the tensorflow framework and cloning the tensorflow github repo. The detection steps are then:

- Load the `frozen_inference_graph.pb` trained on the hands dataset as well as the corresponding label map. In this repo, this is done in the `utils/detector_utils.py` script by the `load_inference_graph` method.

```python
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')
    sess = tf.Session(graph=detection_graph)
print("> ====== Hand Inference graph loaded.")
```

- Detect hands. In this repo, this is done in the `utils/detector_utils.py` script by the `detect_objects` method.

```python
(boxes, scores, classes, num) = sess.run(
    [detection_boxes, detection_scores, detection_classes, num_detections],
    feed_dict={image_tensor: image_np_expanded})
```

- Visualize the detected bounding boxes. In this repo, this is done in the `utils/detector_utils.py` script by the `draw_box_on_image` method.
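The visualization step is essentially a couple of OpenCV drawing calls. Here is a minimal sketch of what a `draw_box_on_image`-style helper might look like, assuming boxes arrive in the normalized `(ymin, xmin, ymax, xmax)` order the object detection api uses; the helper name and threshold are illustrative, not the repo's exact code:

```python
import cv2

def draw_boxes(image_np, boxes, scores, score_thresh=0.5):
    """Draw detection boxes (normalized ymin, xmin, ymax, xmax) above a score threshold."""
    height, width = image_np.shape[:2]
    for box, score in zip(boxes, scores):
        if score < score_thresh:
            continue
        ymin, xmin, ymax, xmax = box
        # Scale normalized coordinates back to pixels.
        p1 = (int(xmin * width), int(ymin * height))
        p2 = (int(xmax * width), int(ymax * height))
        cv2.rectangle(image_np, p1, p2, (77, 255, 9), 3)
        cv2.putText(image_np, "hand: %.2f" % score, (p1[0], p1[1] - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (77, 255, 9), 2)
```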
This repo contains two scripts that tie all these steps together.

- `detect_multi_threaded.py`: A threaded implementation for reading camera video input and running detection. Takes a set of command line flags to set parameters such as `--display` (visualize detections), image parameters `--width` and `--height`, the video `--source` (0 for camera) etc.
- `detect_single_threaded.py`: Same as above, but single threaded. This script also works for video files, by setting the video `--source` parameter to the path of a video file.

```cmd
# load and run detection on video at path "videos/chess.mov"
python detect_single_threaded.py --source videos/chess.mov
```

> Update: If you do have errors loading the frozen inference graph in this repo, feel free to generate a new graph that fits your TF version from the model-checkpoint in this repo. Use the [export_inference_graph.py](https://github.com/tensorflow/models/blob/master/research/object_detection/export_inference_graph.py) script provided in the tensorflow object detection api repo. More guidance on this [here](https://pythonprogramming.net/testing-custom-object-detector-tensorflow-object-detection-api-tutorial/?completed=/training-custom-objects-tensorflow-object-detection-api-tutorial/).

## Thoughts on Optimization

A few things led to noticeable performance increases.

- Threading: It turns out that reading images from a webcam is a heavy I/O operation that can slow down the program if run on the main application thread. I implemented some good ideas from [Adrian Rosebrock](https://www.pyimagesearch.com/2017/02/06/faster-video-file-fps-with-cv2-videocapture-and-opencv/) on parallelizing image capture across multiple worker threads (see the sketch after this list). This mostly led to an FPS increase of about 5 points.
- For those new to Opencv, images from the `cv2.read()` method are returned in [BGR format](https://www.learnopencv.com/why-does-opencv-use-bgr-color-format/). Ensure you convert to RGB before detection (accuracy will be much reduced if you don't).

```python
cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB)
```

- Keeping your input image small will increase fps without any significant accuracy drop (I used about 320 x 240, compared to the 1280 x 720 which my webcam provides).
- Model Quantization: Moving from the current 32 bit representation to 8 bit can achieve up to a 4x reduction in the memory required to load and store models. One way to further speed up this model is to explore the use of [8-bit fixed point quantization](https://heartbeat.fritz.ai/8-bit-quantization-and-tensorflow-lite-speeding-up-mobile-inference-with-low-precision-a882dfcafbbd).
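As a rough illustration of the threading idea above, here is a minimal sketch of a capture thread that keeps only the latest frame available to the main (detection) loop. This follows the general pattern from the pyimagesearch post rather than the exact classes in this repo:

```python
import threading
import cv2

class WebcamStream:
    """Read frames on a worker thread so the main loop never blocks on I/O."""

    def __init__(self, src=0):
        self.cap = cv2.VideoCapture(src)
        self.grabbed, self.frame = self.cap.read()
        self.stopped = False
        self.lock = threading.Lock()

    def start(self):
        threading.Thread(target=self._update, daemon=True).start()
        return self

    def _update(self):
        while not self.stopped:
            grabbed, frame = self.cap.read()
            with self.lock:
                self.grabbed, self.frame = grabbed, frame

    def read(self):
        with self.lock:
            return self.frame

    def stop(self):
        self.stopped = True
        self.cap.release()

# Usage: stream = WebcamStream(0).start(); frame = stream.read()
```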
Performance can also be increased by a clever combination of tracking algorithms with the already decent detection, and this is something I am still experimenting with. Have ideas for optimizing better? Please share!

<img src="images/general.jpg" width="100%">

Note: The detector does reflect some limitations associated with the training set. These include non-egocentric viewpoints, very noisy backgrounds (e.g. in a sea of hands) and sometimes skin tone. There is opportunity to improve these with additional data.

## Integrating Multiple DNNs

One way to make things more interesting is to integrate our new knowledge of where "hands" are with other detectors trained to recognize other objects. Unfortunately, while our hand detector can in fact detect hands, it cannot detect other objects (a factor of how it was trained). To create a detector that classifies multiple different objects would mean a long, involved process of assembling datasets for each class and a lengthy training process.

> Given the above, a potential strategy is to explore structures that allow us to **efficiently** interleave output from multiple pretrained models for various object classes and have them detect multiple objects in a single image.

An example of this is my primary use case, where I am interested in understanding the position of objects on a table with respect to hands on the same table. I am currently doing some work on a threaded application that loads multiple detectors and outputs bounding boxes on a single image. More on this soon.
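For readers curious what that interleaving might look like, here is a minimal sketch under the assumption that each model is exported as its own frozen graph with its own session (TF 1.x style, matching the rest of this repo). The second graph path is hypothetical, standing in for any COCO-trained model:

```python
import tensorflow as tf

def load_detector(pb_path):
    """Load a frozen graph into its own tf.Graph/Session pair (TF 1.x style)."""
    graph = tf.Graph()
    with graph.as_default():
        graph_def = tf.GraphDef()
        with tf.gfile.GFile(pb_path, 'rb') as fid:
            graph_def.ParseFromString(fid.read())
        tf.import_graph_def(graph_def, name='')
    return graph, tf.Session(graph=graph)

def detect(graph, sess, image_np_expanded):
    # These tensor names are the standard ones in object detection api exports.
    tensors = [graph.get_tensor_by_name(n) for n in
               ('detection_boxes:0', 'detection_scores:0', 'detection_classes:0')]
    image_tensor = graph.get_tensor_by_name('image_tensor:0')
    return sess.run(tensors, feed_dict={image_tensor: image_np_expanded})

# The hand graph from this repo, plus a hypothetical second detector.
hand_graph, hand_sess = load_detector('hand_inference_graph/frozen_inference_graph.pb')
obj_graph, obj_sess = load_detector('object_inference_graph/frozen_inference_graph.pb')

# Run the same frame through both detectors; merge the outputs before drawing.
# boxes_h, scores_h, _ = detect(hand_graph, hand_sess, frame_expanded)
# boxes_o, scores_o, _ = detect(obj_graph, obj_sess, frame_expanded)
```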
sunilkumarmaurya786693
# Intelligence traffic monitoring system

### About

With a huge number of vehicles, very busy roads and crowded parking, manual monitoring is not practical: a human being tends to get fatigued by the monotonous nature of the job and cannot keep track of vehicles when many pass in a very short time. Modern cities therefore need effective automatic systems for traffic management and scheduling. The objective of this project is to design and develop an accurate and automatic number plate recognition system, automatic traffic light control using live traffic density data from the Google API, and a smart fine system; we can also track a lost vehicle using number plate detection and find its location with the Google Maps API. The Intelligent Traffic Monitoring System (ITMS) uses image processing and machine learning to identify vehicles by their license plates, and uses a Google API microservice for live traffic density.

### Features

1. License plate number recognition.
2. Matching the plate number with the database.
3. Intelligent traffic light control using live traffic density data.
4. Showing the traffic density of a particular area over a period of months as a graph.
5. Online vehicle license registration.
6. Smart fine system.

### Applications

1. Automated tracking of the location of a stolen vehicle.
2. Anti-theft / vehicle detection.
3. Traffic light automation, with no requirement for traffic police.
4. Smart fine / e-challan systems.
5. Car parking / automatic toll deduction.
6. Law enforcement.
7. VIP/ambulance path clearance.
8. Helping the government to:
   - Increase the efficiency of existing transport infrastructure.
   - Develop a license plate recognition system.
   - Build a smart fine system and, as a future enhancement, automated fine systems for vehicles.
   - Provide a live traffic detection system and automated traffic light control system.
   - Predict the traffic density for specific areas using machine learning on historical data.
   - Provide an automated lost vehicle detection system with reports to the administration.
   - Handle traffic congestion using the automated light control system.

### Installation

* Clone the project.
* Run `yarn install` to install the dependencies.
* Run `yarn start` to view the project in action.

### OpenCV Demo to Count Vehicles

* In the "countingCars" directory, run `python count.py`.

### License plate detection

Go to the `vehicle_number_by_its_pate` folder and run `python3 licenseplateDetection.py 1.jpg`.

### Screenshots

<img src="./screenshot/IMG_20200901_103735.jpg"> <img src="./screenshot/IMG_20200901_103751.jpg"> <img src="./screenshot/IMG_20200901_103811.jpg"> <img src="./screenshot/IMG_20200901_103826.jpg"> <img src="./screenshot/IMG_20200901_103844.jpg"> <img src="./screenshot/IMG_20200901_103906.jpg"> <img src="./screenshot/IMG_20200901_103943.jpg"> <img src="./screenshot/IMG_20200901_104003.jpg"> <img src="./screenshot/IMG_20200901_104044.jpg"> <img src="./screenshot/IMG_20200902_032314.jpg">
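As a rough illustration of the plate detection step, here is a minimal sketch using the Russian-plate Haar cascade that ships with OpenCV. This is a generic stand-in, not the repo's actual `licenseplateDetection.py` logic:

```python
import cv2

# haarcascade_russian_plate_number.xml ships with OpenCV's data files;
# it is a generic plate detector, used here as a stand-in.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_russian_plate_number.xml")

img = cv2.imread("1.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
plates = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=4)

for (x, y, w, h) in plates:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
    # The cropped region would next go to an OCR step to read the number.
    plate_roi = gray[y:y + h, x:x + w]

cv2.imwrite("detected.jpg", img)
```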
masloff-open-projects
Small free software to create a CCTV system with OpenCV from a single camera in your home or garden. It can stream video on a local network, detect motion, detect faces, and detect people, and has a system of hooks for actions.
LiYangSir
Practical OpenCV projects based on Python: bank card recognition, panoramic image stitching, and OCR image recognition.
infinitel8p
Motion detection using OpenCV - This is a simple Python script for an action cam that records the birds arriving at a feeding station on my balcony. The script works like a security cam, recording video when it detects movement. The recorded videos are saved in MP4 format.
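A minimal sketch of how such a motion-triggered recorder can work with OpenCV frame differencing; this is a generic illustration, not the repo's actual script:

```python
import cv2

cap = cv2.VideoCapture(0)
fourcc = cv2.VideoWriter_fourcc(*"mp4v")  # MP4 container
writer = None
prev_gray = None

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.GaussianBlur(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), (21, 21), 0)
    if prev_gray is None:
        prev_gray = gray
        continue
    # Motion = enough pixels changed between consecutive frames.
    diff = cv2.absdiff(prev_gray, gray)
    moving = cv2.countNonZero(cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)[1]) > 500
    prev_gray = gray
    if moving:
        if writer is None:  # start a new clip when motion begins
            # (a real script would timestamp filenames instead of reusing one)
            h, w = frame.shape[:2]
            writer = cv2.VideoWriter("bird_clip.mp4", fourcc, 20.0, (w, h))
        writer.write(frame)
    elif writer is not None:  # motion stopped: close the clip
        writer.release()
        writer = None
```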
zdzhaoyong
EICAM is a part of the PIL library, which includes some pretty useful tools for C++ programming, especially in the areas of mobile robotics and CV (computer vision). Since camera projection and unprojection operations are often needed in CV and in other areas such as robot localization, e.g. SLAM (simultaneous localization and mapping), we provide EICAM for efficient implementation of different camera models, including PinHole, ATAN, OpenCV, OCAM, etc.
mallickboy
A real-time gesture recognition system developed using OpenCV and MediaPipe in PyQt, leveraging multi-threading for 3x faster performance. The system supports 20+ configurable actions and sensitivity settings, providing intuitive control for gaming applications like Valorant.
huzhangron
System for detection of abnormal or cheating activities in exams. This is done by using artificial neural networks to detect the body posture of the student during the examination from the CCTV footage of the classroom. Actions like turning back, bending etc. are detected. Faces are registered to a database by pre-computing the face embeddings of the students. The student is recognised using facial recognition, and a report about his activities, along with timestamps, is sent to the examiners, after which action can be taken on reviewing the report. Technologies involved: machine learning for detection of student cheating activity in exams; OpenCV and deep learning for face recognition and identification. The database used is SQLite.
m6c7l
Kinect2 along with OpenCV featuring Tk/Tcl in Action
krx7h
Control Subway Surfers with your face gestures using your webcam! 🧠 Built with Python, OpenCV, MediaPipe, and PyAutoGUI to map nose movement to in-game actions like jump, slide, and turn.
Yashparmar29
CricVision AI: A Deep Learning system to classify cricket shots (Cover Drive, Pull, Late Cut) in real-time. Built with Python, OpenCV, and MediaPipe, it extracts skeletal coordinates to analyze batsman biomechanics. Using LSTM/CNN models for action recognition, it automates sports analytics via Computer Vision.
sanjeebtiwary
This is a GitHub repository containing code for a Human Activity Recognition system using OpenCV and deep learning. The system uses a pre-trained model to recognize human actions in real-time video captured from a camera. The code is written in Python and can be run from the command line.
Bhargavoza1
Cutting-edge kidney tumor classification system developed using CUDA C++ and OpenCV. Features a robust neural network implemented entirely in CUDA C++, seamlessly integrated with a Golang server, with a front end powered by React. Deployed on Azure AKS with GitHub Actions for streamlined deployment.
micco66
A Python/OpenCV system that uses an external camera and dual-PC setup to analyze hands and automate preflop fold/play decisions. Designed with external hardware integration to remain undetected by online poker servers, transmitting actions via USB 3.0 to mimic mouse input. A work in progress.
GuangwenSi
This repository collects trending OpenCV projects, mostly coded in Python, Java, C++ and C. OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library. OpenCV was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in commercial products. Being a BSD-licensed product, OpenCV makes it easy for businesses to utilize and modify the code. The library has more than 2500 optimized algorithms, including a comprehensive set of both classic and state-of-the-art computer vision and machine learning algorithms. These algorithms can be used to detect and recognize faces, identify objects, classify human actions in videos, track camera movements, track moving objects, extract 3D models of objects, produce 3D point clouds from stereo cameras, stitch images together to produce a high resolution image of an entire scene, find similar images from an image database, remove red eyes from images taken using flash, follow eye movements, recognize scenery and establish markers to overlay it with augmented reality, etc. OpenCV has a user community of more than 47 thousand people and an estimated number of downloads exceeding 18 million. The library is used extensively by companies, research groups and governmental bodies.
Shikshamishra
In video processing, a video can be represented by a hierarchy of structural units, such as scene, shot and frame, with the video frame being the lowest level in the hierarchy. Content-based video browsing and retrieval, as well as video-content analysis, use these structural units. In video retrieval, video applications must generally first partition a given video sequence into video shots. A video shot is defined as an image or video frame sequence that presents continuous action; the frames in a video shot are captured from a single operation of one camera, and the complete video sequence is formed by joining two or more video shots. We have used the Twilio API for messaging: it provides a SID, auth token and messaging number for sending messages (libraries used: Client, Credentials). When any motion is detected, a message is sent to the user's phone. Software requirements: OpenCV 3 (used for reading the webcam), Anaconda (contains all the basic libraries used for detection and tracking), os (for importing files from disk), NumPy (used for all the mathematical implementation), and playsound (to read the sound file from memory and play it during detection).
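A minimal sketch of the Twilio alert described above, using Twilio's Python client; the credentials and phone numbers are placeholders you would fill in from your Twilio console:

```python
from twilio.rest import Client

# Placeholders -- copy these values from your Twilio console.
ACCOUNT_SID = "ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
AUTH_TOKEN = "your_auth_token"

def send_motion_alert(to_number):
    client = Client(ACCOUNT_SID, AUTH_TOKEN)
    message = client.messages.create(
        body="Motion detected by the camera!",
        from_="+15551234567",  # your Twilio messaging number (placeholder)
        to=to_number)
    return message.sid  # useful for logging/debugging

# send_motion_alert("+15557654321")
```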
sherlock1987
In this project, a fire extinguishing control system based on computer vision recognition is designed, which uses a single-chip microcomputer (MCU) as the control module of the car, and a PC with a camera as the fire source identification and image processing module. Compared with the PC, the MCU can give full play to its advantages, outputting a PWM wave to control the speed of the motors and thus the overall movement of the car; the PC can in turn leverage its powerful computing capability, displaying the processed image of the fire source. The fire extinguishing car mainly consists of the power supply circuit, motor drive circuit, fire extinguishing device, single-chip control circuit, and the overall frame of the car. The power supply circuit provides the working power needed by the system; the driver chip drives the motors to control the movement of the car and the fire extinguishing device to extinguish the fire. The main workflow of this design is: first, data is transmitted to the PC through the camera. The PC uses OpenCV library functions and Visual Studio to process the data, identify the flame position, and mark it on the image with a rectangular box. After finding the fire source, an action signal is transmitted to the MCU. Finally, the MCU controls the movement of the car according to the established procedures and the specific commands from the PC, and completes the series of firefighting actions.
hoo334
binocular-stereo-vision, vignetting-correction
No description available
shivarajhiremath8
An AI-powered virtual mouse that uses hand gesture recognition with MediaPipe and OpenCV to control the cursor and perform click actions in real time.
Varshetaganesh
An AI-powered crowd management system uses CCTV and OpenCV to extract video frames, with YOLO detecting and counting people in real time. Alerts are sent via Telegram if crowd size exceeds a threshold, enabling quick action. The system is scalable, real-time, automates control, and includes privacy safeguards.
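A minimal sketch of this kind of pipeline, using the Ultralytics YOLO package for person counting and Telegram's Bot API for alerts; the model file, bot token and chat id are placeholder assumptions:

```python
import cv2
import requests
from ultralytics import YOLO

BOT_TOKEN = "123456:ABC..."   # placeholder Telegram bot token
CHAT_ID = "987654321"         # placeholder chat id
CROWD_LIMIT = 50

model = YOLO("yolov8n.pt")    # small pretrained COCO model; class 0 is 'person'
cap = cv2.VideoCapture("cctv_feed.mp4")

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame, classes=[0], verbose=False)  # detect only people
    count = len(results[0].boxes)
    if count > CROWD_LIMIT:
        # A real system would rate-limit alerts instead of posting every frame.
        requests.post(
            f"https://api.telegram.org/bot{BOT_TOKEN}/sendMessage",
            data={"chat_id": CHAT_ID,
                  "text": f"Crowd alert: {count} people detected."})
```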
prajesdas
Control zoom using just your hand gestures! ZoomGestures uses OpenCV, MediaPipe, and PyAutoGUI to detect your hand and trigger zoom-in and zoom-out actions. Perfect for touch-free interactions!
Reetikajadhav
A deep learning project that detects violent actions in ice hockey videos using frame extraction and computer vision. Violence detection is performed with OpenCV and CNN-based models, focused on identifying aggressive or violent events in Ice Hockey gameplay footage.
Real-time ambulance detection using YOLOv8 and OpenCV. This script detects ambulances in a video feed and sends a binary (1/0) command, designed to interface with hardware like an Arduino via a serial port for alerts or automated actions.
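A minimal sketch of the hardware side of such a setup, sending the 1/0 flag to an Arduino with pyserial; the port name and baud rate are assumptions:

```python
import serial

# Port name differs per machine ("COM3" on Windows, "/dev/ttyUSB0" on Linux).
arduino = serial.Serial("/dev/ttyUSB0", baudrate=9600, timeout=1)

def signal_ambulance(detected):
    """Send '1' when an ambulance is in frame, '0' otherwise."""
    arduino.write(b"1" if detected else b"0")

# signal_ambulance(True)   # e.g. called once per processed frame
```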
faezedrx
This project uses OpenCV and MediaPipe to track hand movements and detect a "crossed hands" gesture. When identified, it triggers an automatic system shutdown. The program operates in real-time via a webcam and effectively performs actions based on hand positions.
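A minimal sketch of how such a trigger can be wired up with MediaPipe Hands. The "crossed hands" test here (comparing the two wrists' horizontal order) is a simplified stand-in for the project's actual gesture logic, and the shutdown command shown is for Linux:

```python
import os
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands
cap = cv2.VideoCapture(0)

with mp_hands.Hands(max_num_hands=2, min_detection_confidence=0.7) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks and results.multi_handedness and \
                len(results.multi_hand_landmarks) == 2:
            # Pair each detected hand with its predicted label ("Left"/"Right"),
            # then call it "crossed" if the left hand's wrist (landmark 0) sits
            # to the right of the right hand's wrist in image coordinates.
            wrist_x = {hd.classification[0].label: lm.landmark[0].x
                       for hd, lm in zip(results.multi_handedness,
                                         results.multi_hand_landmarks)}
            if {"Left", "Right"} <= wrist_x.keys() and wrist_x["Left"] > wrist_x["Right"]:
                os.system("shutdown -h now")  # Linux shutdown; adjust per OS
                break
cap.release()
```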
Nicofontanarosa
Hand Gesture Control is a computer vision–based system that allows users to control their computer using hand gestures captured through a webcam. Built with OpenCV and MediaPipe, it tracks hand movements in real time and maps gestures to system actions such as mouse movement and clicks, enabling touchless and intuitive human–computer interaction.
Ausers11
Some practical OpenCV projects. Currently includes: credit card digit recognition, document scanning with OCR image recognition, image stitching, and answer sheet recognition.
web2ls
No description available
Sanjay-2806
This Python program uses MediaPipe and OpenCV to control browser tabs and actions with hand gestures in real-time.
Salmanzakariya
A virtual mouse in Python using OpenCV, Mediapipe, and Autopy leverages computer vision to track hand movements and gestures via a webcam. OpenCV captures video, Mediapipe detects hand landmarks, and Autopy simulates mouse actions, enabling control of the cursor without physical input devices.
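A minimal sketch of the core loop such a virtual mouse typically uses: MediaPipe finds the index fingertip, and autopy maps it to screen coordinates. The landmark choice and the smoothing-free mapping are simplifications of what a real implementation would do:

```python
import cv2
import autopy
import mediapipe as mp

screen_w, screen_h = autopy.screen.size()
cap = cv2.VideoCapture(0)
mp_hands = mp.solutions.hands

with mp_hands.Hands(max_num_hands=1) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        frame = cv2.flip(frame, 1)  # mirror so the cursor follows the hand naturally
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            # Landmark 8 is the index fingertip; coordinates are normalized [0, 1].
            tip = results.multi_hand_landmarks[0].landmark[8]
            x = min(max(tip.x, 0.0), 1.0) * (screen_w - 1)
            y = min(max(tip.y, 0.0), 1.0) * (screen_h - 1)
            autopy.mouse.move(x, y)
cap.release()
```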