Source: Deep Learning on Medium
Multi-Person 2D Pose Estimation
This is very interesting → real-time pose tracking of multiple people. (a new encoding representation → this is very sexy).
Very good → the lighting in the demo was not good, but it still worked very well. (real-time needs a lot of computational power → how can we achieve this?).
Each heatmap has multiple peaks, one per person → this is quite challenging. (is this problem solved? → no → we still need to group the detected key parts into people → one set per person).
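To make the "multiple peaks" point concrete, here is a minimal sketch of pulling every local maximum out of one heatmap. The function name `find_peaks`, the 0.5 threshold and the toy heatmap are my own assumptions for illustration, not something taken from the talk.

```python
def find_peaks(heatmap, threshold=0.5):
    """Return (row, col, score) for every local maximum above threshold."""
    rows, cols = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = heatmap[r][c]
            if value < threshold:
                continue
            # Compare against the 4-connected neighbours that exist.
            neighbours = [
                heatmap[nr][nc]
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= nr < rows and 0 <= nc < cols
            ]
            if all(value > n for n in neighbours):
                peaks.append((r, c, value))
    return peaks


# Toy heatmap for one body part with two people in the image (made-up values).
heatmap = [
    [0.0, 0.1, 0.0, 0.0, 0.0],
    [0.1, 0.9, 0.1, 0.0, 0.0],
    [0.0, 0.1, 0.0, 0.2, 0.0],
    [0.0, 0.0, 0.2, 0.8, 0.2],
    [0.0, 0.0, 0.0, 0.2, 0.0],
]
peaks = find_peaks(heatmap)
print(peaks)
```

Each peak is one candidate for that body part, but nothing here says which person it belongs to, which is exactly the open grouping problem.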
Efficient representation → was the key idea.
There is a direction vector field → this is related to optical flow → it captures the position and the orientation of the human body parts.
So we are encoding the connection → between different body parts. (we need the direction encoded as well) (pointing from one body part to another).
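A minimal sketch of what such an encoding could look like, assuming the connection is stored as a 2D unit vector at every pixel that lies on the limb and as zero elsewhere. The function name `limb_vector_field`, the `limb_width` parameter and the joint coordinates are hypothetical, chosen only for illustration.

```python
import math


def limb_vector_field(height, width, joint_a, joint_b, limb_width=1.0):
    """Build a field whose pixels on the limb hold the unit vector a -> b."""
    ax, ay = joint_a
    bx, by = joint_b
    length = math.hypot(bx - ax, by - ay)
    ux, uy = (bx - ax) / length, (by - ay) / length  # unit direction a -> b
    field = [[(0.0, 0.0) for _ in range(width)] for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = x - ax, y - ay
            along = dx * ux + dy * uy        # distance along the limb
            across = abs(dx * uy - dy * ux)  # distance from the limb axis
            if 0 <= along <= length and across <= limb_width:
                field[y][x] = (ux, uy)
    return field


# One horizontal limb from (x=1, y=2) to (x=5, y=2) in a 5x7 image.
field = limb_vector_field(5, 7, joint_a=(1, 2), joint_b=(5, 2))
print(field[2][3])  # a pixel on the limb
print(field[0][3])  # a pixel far from the limb
```

Because the stored vector points from the first joint to the second, the field carries orientation as well as position, which a plain heatmap does not.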
Hence the representation looks like that → it lets us remove the incorrect connections.
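One way to picture how incorrect connections get removed: sample the vector field along the straight line between two candidate joints and check how well it agrees with the direction of that line. This is only a sketch under my own assumptions (the name `connection_score`, ten samples, nearest-pixel lookup, a hand-built toy field), not the exact procedure from the talk.

```python
import math


def connection_score(field, joint_a, joint_b, samples=10):
    """Average agreement between the field and the direction a -> b."""
    ax, ay = joint_a
    bx, by = joint_b
    length = math.hypot(bx - ax, by - ay)
    if length == 0:
        return 0.0
    ux, uy = (bx - ax) / length, (by - ay) / length
    total = 0.0
    for i in range(samples):
        t = i / (samples - 1)
        # Nearest pixel to the sample point on the segment.
        x = round(ax + t * (bx - ax))
        y = round(ay + t * (by - ay))
        vx, vy = field[y][x]
        total += vx * ux + vy * uy  # dot product with the limb direction
    return total / samples


# Toy 5x7 field: one horizontal limb on row 2, pointing in +x (made-up data).
field = [[(0.0, 0.0)] * 7 for _ in range(5)]
for x in range(1, 6):
    field[2][x] = (1.0, 0.0)

good = connection_score(field, (1, 2), (5, 2))  # follows the limb
bad = connection_score(field, (1, 0), (5, 4))   # cuts across it
print(good, bad)
```

A candidate pair that follows the field scores close to 1, while a pair that merely crosses it scores low and can be discarded.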
This research → relates to graph theory. (the detected parts are the nodes, and they are matched so that they fit a certain body model).
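To illustrate the graph view, here is a small sketch that treats candidates for two body parts as the two sides of a bipartite graph and pairs them up greedily by score. Greedy matching is a simplification I am swapping in for illustration; an optimal assignment algorithm could be used instead. The name `greedy_match`, the `min_score` cutoff and the scores are assumptions.

```python
def greedy_match(scores, min_score=0.5):
    """Pair candidates of part A (rows) with candidates of part B (columns).

    scores[i][j] is the connection score between A-candidate i and
    B-candidate j. Each candidate is used at most once.
    """
    candidates = sorted(
        (
            (score, i, j)
            for i, row in enumerate(scores)
            for j, score in enumerate(row)
            if score >= min_score
        ),
        reverse=True,  # best-scoring connections first
    )
    used_a, used_b, matches = set(), set(), []
    for score, i, j in candidates:
        if i not in used_a and j not in used_b:
            matches.append((i, j))
            used_a.add(i)
            used_b.add(j)
    return matches


# Two candidates per part, e.g. two people in the image (made-up scores).
scores = [
    [0.9, 0.2],
    [0.3, 0.8],
]
matches = greedy_match(scores)
print(matches)
```

Repeating this for every pair of connected body parts and chaining the matches gives one skeleton per person.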
There are different branches → very good → and this is an iterative method → training was hard but it did really well. (this became OpenPose).
This became real-time → OpenPose → there is a complicated post-processing step in the prediction.
Multi-person pose estimation is much harder to optimize.
Some approaches → are good and more efficient.
Very good implementation and quite complicated preprocessing step.
There are multiple stages → quite hard to optimize.
Holy shit, the post-processing step is super complicated → so much preprocessing has to be done.
There is another network called Faster R-CNN. (a feature pyramid network is the backbone).
Wow, quite complicated architecture.