Source: Deep Learning on Medium
Author: Zhi Zhang, Applied Scientist at Amazon
Based on user feedback and requests, we are happy to announce the new features in GluonCV 0.4:
- New application: Human Pose Estimation model
- Faster Deployment: INT8 deploy model and Pruned ResNet for faster inference
- Better base classification networks: ResNext, SE_ResNext series (we provide models pre-trained on ImageNet with 80% to 81+% accuracy)
- Faster/Mask-RCNN models with Feature Pyramid Networks (FPN)
Meanwhile, the usability and stability of existing modules have been improved dramatically.
Human pose estimation models are crucial for analyzing human behavior. GluonCV provides a complete toolkit for human pose applications, including network definitions, training scripts, loss functions, and metrics. We also provide tutorials for bootstrapping your applications.
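As a rough illustration of what happens after such a network runs, the sketch below decodes keypoints from per-joint heatmaps by taking the highest-scoring cell of each map. It assumes a heatmap-based pose model (one 2D score map per joint); the function name `decode_heatmaps` and the toy data are made up for this example and are not part of the GluonCV API.

```python
from typing import List, Tuple

# One 2D score map per joint, indexed as heatmap[row][col].
Heatmap = List[List[float]]


def decode_heatmaps(heatmaps: List[Heatmap]) -> List[Tuple[int, int, float]]:
    """Return (x, y, score) of the highest-scoring cell in each joint heatmap."""
    keypoints = []
    for hm in heatmaps:
        best_x, best_y, best_score = 0, 0, float("-inf")
        for y, row in enumerate(hm):
            for x, score in enumerate(row):
                if score > best_score:
                    best_x, best_y, best_score = x, y, score
        keypoints.append((best_x, best_y, best_score))
    return keypoints


# Two toy 3x4 heatmaps standing in for two joints (hypothetical values).
toy_heatmaps = [
    [[0.0, 0.1, 0.0, 0.0],
     [0.0, 0.2, 0.9, 0.1],
     [0.0, 0.0, 0.1, 0.0]],
    [[0.7, 0.1, 0.0, 0.0],
     [0.1, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0]],
]
keypoints = decode_heatmaps(toy_heatmaps)
```

In a real pipeline the decoded coordinates would still need to be scaled from heatmap resolution back to the input image, a step omitted here for brevity.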
Let’s see some awesome real-life examples:
The following table summarizes our pre-trained human pose estimation models on the COCO dataset, which achieve state-of-the-art performance.
Deploy with INT8
We have been collaborating closely with Intel to introduce INT8 model deployment in GluonCV! Powered by Intel Deep Learning Boost (VNNI), INT8 quantized models in GluonCV can achieve significant speedup over 32-bit floating-point operators. Benchmark on AWS EC2 C5 instances:
Now you can use INT8 versions of models from the GluonCV model zoo: resnet50_v1_int8 is the quantized version of resnet50_v1. Later we will introduce an API to convert all models to INT8. Note that you will need a Skylake or newer Intel CPU to achieve a reasonable speedup, due to hardware instruction limitations.
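To make the idea of INT8 quantization concrete, here is a minimal sketch of symmetric per-tensor quantization in plain Python. This is an assumption chosen for illustration, not a description of the calibration scheme GluonCV or MXNet actually uses; the helper names `quantize_int8` and `dequantize` are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Map the stored integers back to approximate float values."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.003, 1.0]  # toy values, not real model weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each value is stored in 8 bits instead of 32, and the round-trip error per value is bounded by half the scale. The speedup itself comes from hardware integer instructions, which this sketch does not model.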
CNNs are known to be redundant in most cases, so GluonCV 0.4 provides a collection of ResNets with pruned structures and parameters. You can now achieve up to 9 times faster inference without a significant loss in accuracy.
resnet50_v1d_0.37 contains roughly 0.37x the parameters of the original, unpruned network, and 5.01x indicates that it is about 5 times faster during inference. You can refer to a more intuitive version here to choose the right network for your purposes.
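The post does not say how these networks were pruned. As a back-of-the-envelope sketch, and assuming structured channel pruning purely for illustration, the snippet below shows how the weight count of a single convolution shrinks when only a fraction of its input and output channels is kept. The helpers `conv_params` and `pruned_ratio` are hypothetical.

```python
def conv_params(in_channels: int, out_channels: int, kernel_size: int) -> int:
    """Weight count of a bias-free 2D convolution with a square kernel."""
    return in_channels * out_channels * kernel_size * kernel_size


def pruned_ratio(in_channels: int, out_channels: int,
                 kernel_size: int, keep: float) -> float:
    """Fraction of weights left when `keep` of the channels survive on both sides."""
    kept_in = max(1, round(in_channels * keep))
    kept_out = max(1, round(out_channels * keep))
    return (conv_params(kept_in, kept_out, kernel_size)
            / conv_params(in_channels, out_channels, kernel_size))


# Keeping half the channels of a 256 -> 256, 3x3 convolution.
ratio = pruned_ratio(256, 256, 3, keep=0.5)
```

Because the weight count depends on the product of input and output channels, halving both leaves about a quarter of the weights. The parameter ratio and the measured speedup of a pruned network are therefore separate numbers and need not match.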
More interesting GANs
Faster/Mask-RCNN with FPN
SoTA performance provided by Faster/Mask-RCNN with FPN
Improvements and Bug fixes
- All ResNets and variants now support
- Pre-trained object detection models are able to reset_class; by defining reuse_weights, they can reuse partial knowledge of previous categories, allowing models to detect classes without fine-tuning. Please refer to this tutorial.
- Now PSP and DeepLabv3 models can hybridize like other models
- Fix some random NaN problems (requires mxnet nightly)
- Improve GPU NMS op (requires mxnet nightly)
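The idea behind reusing weights when resetting classes can be sketched in plain Python: output weights for categories the model already knows are copied over, and only genuinely new categories start from scratch. The helper `remap_class_weights` below is hypothetical and is not GluonCV's implementation of reset_class; the class names and numbers are toy values.

```python
from typing import Dict, List, Optional


def remap_class_weights(
    old_classes: List[str],
    old_weights: List[List[float]],
    new_classes: List[str],
    reuse: Dict[str, str],
) -> List[Optional[List[float]]]:
    """Build one weight row per new class.

    `reuse` maps a new class name to the old class whose weights it inherits.
    Classes without a mapping get None, meaning "initialize from scratch".
    """
    old_index = {name: i for i, name in enumerate(old_classes)}
    new_weights: List[Optional[List[float]]] = []
    for name in new_classes:
        source = reuse.get(name)
        if source is None:
            new_weights.append(None)
        else:
            # Copy the row so later training does not alias the old weights.
            new_weights.append(list(old_weights[old_index[source]]))
    return new_weights


old_classes = ["person", "car", "dog"]
old_weights = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
new_weights = remap_class_weights(
    old_classes,
    old_weights,
    new_classes=["dog", "person", "bicycle"],
    reuse={"dog": "dog", "person": "person"},
)
```

Here "dog" and "person" keep their learned weights in the new class order, while "bicycle" has nothing to inherit and would need training.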
GluonCV training skills in v0.3 are now public
We have published the tricks used in v0.3 that had a significant impact on our pre-trained models:
- Bag of Tricks for Image Classification with Convolutional Neural Networks
- Bag of Freebies for Training Object Detection Neural Networks
We sincerely appreciate contributors: @xinyu-intel @hetong007 @zhreshold @khetan2 @chinakook @Jerryzcn @husonchen @zhanghang1989 @sufeidechabei @brettkoonce @mli @lgov @djl11 @YutingZhang @mzchtx @sharmalakshay93 @astonzhang @LcDog @zx-code123 @adursun @ifeherva @ZhennanQin @islinwh @jianantian @feynmanliang @ivechan @eric-haibin-lin
Please Like/Star/Fork/Comment/Contribute if you like GluonCV!
He T, Xie J, Zhang Z, et al. Bag of tricks for image classification with convolutional neural networks. arXiv preprint arXiv:1812.01187, 2018.
Zhang Z, He T, Zhang H, et al. Bag of freebies for training object detection neural networks. arXiv preprint arXiv:1902.04103, 2019.
Intel Deep Learning Boost. https://www.intel.ai/intel-deep-learning-boost
Zhu J Y, Park T, Isola P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks. Proceedings of the IEEE International Conference on Computer Vision, 2017: 2223–2232.
Ledig C, et al. Photo-realistic single image super-resolution using a generative adversarial network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.