About

Covid terminator

Demo

In this demo, the text color shows whether a mask is worn (green) or not (red), and the box color shows whether the social distance rule is violated (red) or not (green).

Overview

Since the COVID outbreak in 2020, people around the world have suffered from its high transmissibility and lethality. However, the epidemic could be effectively mitigated, or even ended, if everyone wore a mask and kept a safe social distance. This project therefore helps detect and warn people who are not wearing a mask or not keeping a safe distance from others, using deep learning and a depth camera.

This project performs two general tasks:

Detect each face (using Haar Cascade) and classify whether it is wearing a mask (using VGG19)
Determine whether each person violates the social distance rule (using the RealSense D435 camera)

Architecture

Face Detection

A Haar feature-based cascade classifier is used in this project for real-time human face detection. It is a machine-learning-based approach in which a cascade function is trained on many positive and negative images. Initially, the algorithm needs a large number of positive images (images of faces) and negative images (images without faces) to train the classifier. Features are then extracted from these images. Haar features act much like convolutional kernels: each feature is a single value obtained by subtracting the sum of pixels under the white rectangle from the sum of pixels under the black rectangle.
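To make the feature definition concrete, here is a minimal pure-Python sketch of a single two-rectangle Haar feature. It uses a summed-area table (integral image), the standard trick for computing rectangle sums in constant time; the helper names and the toy 4x4 patch are illustrative assumptions and are not taken from this project's code.

```python
def integral_image(img):
    """Return a summed-area table padded with one leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left corner (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Left half is the white rectangle, right half the black one.

    The value is (sum under black) - (sum under white), as described above.
    """
    half = w // 2
    white = rect_sum(ii, x, y, half, h)
    black = rect_sum(ii, x + half, y, half, h)
    return black - white


# Toy patch (hypothetical data): bright on the left, dark on the right.
patch = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(patch)
feature_value = two_rect_feature(ii, 0, 0, 4, 4)
print(feature_value)  # 8 - 72 = -64
```

A large magnitude means the patch contrasts strongly across the boundary between the two rectangles, which is what lets such features respond to edges like the one between the eye region and the cheeks.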

A simplified illustration is shown below:

Face Classifier

In this project, the VGG19 model is used as the backbone for feature extraction. VGG19 is a very deep convolutional neural network (CNN) for large-scale image recognition. The backbone layers I adapted from VGG19 comprise two Conv3x3(64), two Conv3x3(128), four Conv3x3(256), and eight Conv3x3(512) layers, plus four MaxPooling layers. I then add one flatten layer and one sigmoid dense layer to build the classification model.
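As a rough sanity check on the layer list above, the following sketch tallies the convolution layers, their parameters, and the size of the flattened feature vector. The 224x224 RGB input, the "same" padding, the 2x2 stride-2 pooling, and the single-unit sigmoid output are assumptions made for illustration; the text does not state them. Note that the stock VGG19 has five pooling stages, while the count of four here follows the description above.

```python
# (number of 3x3 conv layers, output channels) per group, as listed above
CONV_BLOCKS = [(2, 64), (2, 128), (4, 256), (8, 512)]
NUM_POOLS = 4        # from the description above; each pool halves width and height
INPUT_SIZE = 224     # assumed input resolution
INPUT_CHANNELS = 3   # assumed RGB input


def conv_param_count(blocks, in_channels, kernel=3):
    """Weights plus biases of all conv layers, assuming kernel x kernel filters."""
    total = 0
    for num_layers, out_channels in blocks:
        for _ in range(num_layers):
            total += kernel * kernel * in_channels * out_channels + out_channels
            in_channels = out_channels
    return total


def flattened_size(input_size, num_pools, channels):
    """Length of the flattened feature vector, assuming 'same' padding in every conv."""
    side = input_size // (2 ** num_pools)
    return side * side * channels


num_conv_layers = sum(n for n, _ in CONV_BLOCKS)
backbone_params = conv_param_count(CONV_BLOCKS, INPUT_CHANNELS)
flat = flattened_size(INPUT_SIZE, NUM_POOLS, CONV_BLOCKS[-1][1])
head_params = flat + 1  # one sigmoid unit: one weight per feature plus a bias

print(num_conv_layers, backbone_params, flat, head_params)
```

Under these assumptions the backbone has 16 convolution layers, and the flattened vector feeding the sigmoid layer is 14 x 14 x 512 values long. With five pooling stages it would shrink to 7 x 7 x 512.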

The default architecture is shown below:

RealSense Depth Camera

The RealSense D435 camera is used as the depth camera for live streaming. For each frame, it outputs a color image (right) and a pixel-aligned depth image (left), shown below:
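Because the depth image is pixel-aligned with the color image, each detected face can be mapped to a 3D point and the distances between people can be compared. The sketch below is one possible way to do this with a plain pinhole camera model. The intrinsics, the 2.0 m threshold, and the function names are hypothetical, and this is not necessarily how the project computes distances. In practice the intrinsics would come from the camera, and pyrealsense2 offers `rs2_deproject_pixel_to_point` for the deprojection step.

```python
import math
from itertools import combinations

# Hypothetical pinhole intrinsics for a 640x480 stream (not read from a real device).
FX, FY = 615.0, 615.0   # focal lengths in pixels
CX, CY = 320.0, 240.0   # principal point in pixels
MIN_DISTANCE_M = 2.0    # assumed social distance threshold in metres


def deproject(u, v, depth_m):
    """Map pixel (u, v) with depth in metres to a 3D point in the camera frame."""
    x = (u - CX) * depth_m / FX
    y = (v - CY) * depth_m / FY
    return (x, y, depth_m)


def find_violations(faces):
    """faces: list of (u, v, depth_m) face centres. Returns indices of people too close."""
    points = [deproject(*face) for face in faces]
    violators = set()
    for (i, p), (j, q) in combinations(enumerate(points), 2):
        if math.dist(p, q) < MIN_DISTANCE_M:
            violators.update((i, j))
    return violators


# Toy detections: two people side by side at 2 m, one person far behind at 6 m.
faces = [(320, 240, 2.0), (443, 240, 2.0), (320, 240, 6.0)]
violators = find_violations(faces)

# Box color convention from the demo: red for a violation, green otherwise.
box_colors = ["red" if i in violators else "green" for i in range(len(faces))]
print(box_colors)  # ['red', 'red', 'green']
```

Comparing 3D points rather than pixel positions matters here: the first and third faces share the same pixel but are 4 m apart in depth, so they are not flagged.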