My Summer Internship at VSC DigiTech

Original article was published by Chirumamilla Nandavardhan on Deep Learning on Medium

During my eight-week summer internship at VSC DigiTech, a Hyderabad-based tech start-up, I had the opportunity to work on one of their projects, called “Online Proctor”. The internship took place in the summer of 2020, when Covid-19, a disease classified as a pandemic, had wreaked havoc all over the world. During these unprecedented times, entire countries were in lockdown and a new normal had been established. Universities and schools were shut down and learning shifted online. Online classes, which were new to both teachers and students, had their own set of disadvantages and inconveniences. One of the biggest difficulties was quality control. It was difficult for the management of an institution to determine where the fault lay when online classes were ineffective: were the students slacking off, or was the faculty unable to cope with the new system? In addition, ineffective methods such as manual attendance were still in use. To assist universities and schools with these tasks, VSC DigiTech came up with an innovative idea: “Online Proctor”.

Online Proctor is a tool that can be used to monitor students during online classes and generate a performance report for each student for every class, in the form of an attendance score. This score would play a role similar to that of manual attendance in classroom-held classes, except that it would be a more accurate measure of how attentive the student is. The tool has three deep learning-based modules, namely face detection, face recognition and emotion recognition. At the beginning of every class, each student’s identity would be verified. After the class starts, pictures of the attendee would be captured at random intervals and checked for the presence of the student. If the student is present in a picture, the emotion of the student would be analysed. Depending on the emotion, a score would be allotted to each picture. At the end of the class, a combined score would be generated and then written to the database. All of these tasks would be completed in the background, so the flow of the class would not be disturbed.
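To make the scoring idea concrete, here is a minimal sketch in Python. It is only an illustration of the flow described above, not the actual Online Proctor code: the per-emotion score values, the choice of averaging as the "combined score", and the function names `random_capture_times` and `attendance_score` are all assumptions made for this example.

```python
import random

# Hypothetical per-picture scores. The actual values used by
# Online Proctor are not stated in this article.
EMOTION_SCORES = {"positive": 1.0, "neutral": 0.5, "negative": 0.25}
ABSENT_SCORE = 0.0  # assumed score when the student is not in the picture


def random_capture_times(class_minutes, n_snapshots, seed=None):
    """Pick distinct capture times (in seconds from the start of class) at random."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(class_minutes * 60), n_snapshots))


def attendance_score(observations):
    """Combine per-picture results into one score between 0 and 1.

    Each observation is an emotion label ("positive", "neutral",
    "negative") or None when the student was not found in the picture.
    Averaging is an assumption; the article only says "combined score".
    """
    if not observations:
        return 0.0
    total = sum(
        ABSENT_SCORE if label is None else EMOTION_SCORES[label]
        for label in observations
    )
    return total / len(observations)
```

For example, `attendance_score(["positive", "neutral", None, "negative"])` averages the four per-picture scores, with the missing picture counted as zero.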

Let us talk about the three deep learning models in detail. We analysed the performance of two different approaches to the task of face detection. One approach is RetinaFace, a state-of-the-art model for face detection, which employs a pixel-wise face localization method. This was implemented in PyTorch, an open source library primarily used for machine learning tasks. The other approach is based on Haar cascades. This was implemented using OpenCV, an open source library for computer vision tasks. Although the detection performance of the RetinaFace approach is better than that of the Haar cascade approach, the Haar cascade approach is faster. In other words, we observed a trade-off between speed and performance. Keeping in mind the required performance and speed, and the technique currently in use, we chose the Haar cascade based approach for the face detection module.

Verification of identity is an important aspect, since it lets us detect proxies and other such misleading acts. For face recognition and verification we use OpenFace, a state-of-the-art face recognition technique developed at Carnegie Mellon University. In the OpenFace approach, a pretrained model is used to generate a 128-dimensional embedding for each face. At the time of verification, the Euclidean distance between the embedding of the captured face and the stored embedding of the face in the database is calculated. The distance gives us a measure of how similar the two faces are, and based on predetermined threshold values we can output the verification result.

The remaining task is emotion recognition. For this task, we trained a CNN (convolutional neural network) on the FER dataset for a few hundred epochs. We trained the model to classify images into three classes, namely positive, neutral and negative.

All three modules have been integrated and launched as a single API using Flask and the AWS cloud platform. This API would later be integrated with an online conferencing solution (in development). In the future, the Online Proctor tool aims to provide a much more detailed analysis of the classes that take place. One such analysis would be to analyse the audio and transcripts of a class and obtain valuable insights from them.
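The identity check described in this section comes down to comparing two embeddings. Here is a minimal sketch, assuming the embeddings have already been produced by a face recognition model and are available as plain lists of floats. The function name `is_same_person` and the threshold of 0.6 are hypothetical; a real threshold would have to be tuned on actual data.

```python
import math


def is_same_person(captured, stored, threshold=0.6):
    """Compare two face embeddings using Euclidean distance.

    `captured` and `stored` are equal-length sequences of floats
    (128 values each in the OpenFace setting). The default threshold
    is a placeholder, not the value used in Online Proctor.
    """
    if len(captured) != len(stored):
        raise ValueError("embeddings must have the same length")
    # A smaller distance means the two faces are more similar.
    return math.dist(captured, stored) <= threshold
```

With this sketch, two embeddings that are close together in the 128-dimensional space are accepted as the same student, while a distance above the threshold would be flagged as a possible proxy.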

These eight weeks at VSC DigiTech have been an amazing experience, giving me an opportunity to learn, teach, coordinate tasks with fellow teammates and lead. I got the opportunity to be part of developing an innovative solution to the demands arising in these unprecedented times. I am very thankful to my mentor and CEO of VSC DigiTech, Mr Venugopal, for giving me this opportunity and guiding me throughout my journey at VSC DigiTech. I am also very grateful to another mentor, Mr Padma Neeraj Kumar, and to my teammate Ms Akanksha Telagamsetty for their support.