Multimodal Affect: A unified deep learning framework for emotion and sentiment recognition from video, audio, and text, built on BERT, ResNet3D, and CNN encoders. It supports end-to-end training and evaluation, and targets both research and real-world affective computing.
Stars: 2
Forks: 0
Watchers: 2
Open Issues: 0
Commits: 32
ca85aa7 docs: Add project roadmap (Excalidraw), results summary, and training/validation/test loss and metric curves.
46909ec feat(endpoint deployment): Add secure SageMaker endpoint deployment script using environment variables for IAM role, S3 model URI, and endpoint name.
f7b1318 feat(launcher script): Add SageMaker training launcher script with TensorBoard logging, hyperparam config, and S3 data channels.
d8d8943 feat(predict funtion inference): Add SageMaker predict_fn for multimodal video segmentation, inference, and top-3 emotion/sentiment predictions with Whisper ASR.
f6df78c feat(model function inference): Add SageMaker model_fn for inference with FFmpeg check, multimodal model/transcriber setup, and device-aware loading.
e4817d7 feat(i/o inference sagemaker): Add SageMaker-compatible input/output handlers for S3 video processing and JSON prediction serialization.
404aa03 feat(s3 utility inference): Add utility to download video files from S3 URI to a local temporary file for inference.
3a32902 feat: Add VideoUtteranceProcessor for precise video segment extraction with FFmpeg, supporting utterance-level processing for multimodal models.
98ba271 feat(deployment inference Audio Processor): Add AudioProcessor with FFmpeg-based audio extraction, Mel spectrogram conversion, normalization, and fixed-length padding/truncation for inference.
ac60412 feat(deployment inference video processor): Add VideoProcessor class for video frame extraction, resizing, normalization, and tensor conversion.
b7008bf feat(deployment inference): Add inference script with FFmpeg setup, multimodal preprocessing, and S3 integration for predictions.
e61bd65 feat(deployment): Implement deployment-ready multimodal model with frozen BERT, ResNet3D, CNN encoders and dual classification heads.
a340afb chore(deployment requirements): Add deployment requirements with pinned versions for CUDA 12.1, Whisper ASR, and inference optimizations
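
One of the commits above describes a utility that downloads a video from an S3 URI to a local temporary file for inference. The repository's actual code is not shown here, so the following is only a minimal sketch of how such a helper might look. The function names `parse_s3_uri` and `download_s3_video`, and the `.mp4` fallback extension, are hypothetical and chosen for illustration. The download step assumes `boto3` is installed and AWS credentials are configured.

```python
import tempfile
from pathlib import PurePosixPath


def parse_s3_uri(uri: str) -> tuple[str, str]:
    """Split 's3://bucket/key' into (bucket, key). Hypothetical helper."""
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError(f"Not an S3 URI: {uri!r}")
    # Everything up to the first '/' is the bucket; the rest is the object key.
    bucket, _, key = uri[len(prefix):].partition("/")
    if not bucket or not key:
        raise ValueError(f"S3 URI must include a bucket and a key: {uri!r}")
    return bucket, key


def download_s3_video(uri: str) -> str:
    """Download the object to a temporary file and return its local path."""
    # Imported lazily so that URI parsing works without boto3 installed.
    import boto3

    bucket, key = parse_s3_uri(uri)
    # Keep the original extension so tools such as FFmpeg can detect the format;
    # the '.mp4' fallback is an assumption, not something the source specifies.
    suffix = PurePosixPath(key).suffix or ".mp4"
    # delete=False: the file outlives this function, so the caller must remove it.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        local_path = tmp.name
    boto3.client("s3").download_file(bucket, key, local_path)
    return local_path
```

Because the temporary file is created with `delete=False`, an inference handler using this sketch would need to delete the file itself once preprocessing is finished.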