Source: Deep Learning on Medium
A day at Tensorflow Roadshow Bangalore
A summary of all the information shared during the event.
I got the privilege of attending the Tensorflow Roadshow Bangalore conducted by Google, with a direct view of all the latest developments in Tensorflow 2.0 as well as the chance to network with some of the coolest people in the community. I didn't get a confirmation last year, so I was glad I got to witness the event this time, and I have decided to share everything that happened over the course of the event.
There were a total of 16 talks, and I would like to give a brief overview of the information shared.
Keynote: ML today and tomorrow
A brief overview was given of the developments that have enabled the rapid progress of deep learning:
- More data is available for training now than ever before, including open datasets such as the OpenImages dataset from Google with around 9 million images.
- Computational power has increased exponentially. A single TPU can deliver 45 teraflops of compute, and 64 of them can be stacked to reach as high as 100 petaflops.
- A lot of research in deep learning has led to great improvements in what can be achieved today compared to, say, 7 years ago. For example, RNNs had to be trained sequentially, which made training slow; with the advent of the Transformer architecture we can now train in parallel, achieve better results, utilize computational resources better and spend far less time. Similarly, NLP has seen tremendous progress over the last 1–2 years with the advent of BERT and its variant architectures, XLNet, etc.
A few real-world applications of Tensorflow were shown:
- SUMMIT, the fastest supercomputer in the world with 27,000 GPUs, uses Tensorflow to do extreme weather prediction.
- The Chile-based startup NotCo uses Tensorflow and deep learning to find all the things that make a food desirable to humans, and tries to provide the same experience using healthy ingredients.
- Major startups in India are using Tensorflow for their use cases. Sharechat finds user preferences using deep learning, Dunzo uses deep learning to estimate demand, and NoBroker uses deep learning to understand housing data.
- IIT Delhi students developed an Android app that determines air quality from a photo. They used Tensorflow Lite so that all the computation happens on-device with no need for a server, and the app works even without an internet connection.
TensorFlow 2.0 updates
- Keras API :- The API is made much simpler by integrating Keras directly into Tensorflow as tf.keras.
- Sessions are dead :- Tensorflow 2.0 brings eager execution, so there are no more sessions. There is no need to run a session to inspect the values inside a tensor; printing a tensor directly gives its value.
- Python code to Tensorflow graph :- Any code written in Python can now be converted to a Tensorflow graph using the tf.function decorator. The point of converting the code to a graph is that it can then better utilize hardware such as GPUs.
- Access to low-level ops :- Low-level operations can still be accessed using tf.raw_ops.
- Removal of tf.contrib :- tf.contrib has been removed, and a few important pieces have moved into the core API.
- Distribution Strategy :- Provides a lot of options to distribute the workload across GPUs.
- Tensorboard upgrade :- Tensorboard now supports Colab and profiling of neural network performance.
- Backward compatibility :- Tensorflow 1.x code is still supported using tf.compat, and a tf_upgrade_v2 script is available to upgrade code from 1.x to 2.0.
- Datasets and tutorials :- Lots of new datasets and tutorials have been added.
- Single model serialization :- A single saved model can be deployed on multiple platforms.
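The first two points above can be made concrete with a small snippet (the function name `squared_norm` is my own illustration):

```python
import tensorflow as tf

# Eager execution is the default in TF 2.0: no session is needed,
# and a tensor's values can be inspected directly.
x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
print(x)  # prints the tensor together with its values

# tf.function traces the Python function into a TensorFlow graph,
# which can then be optimized and run on hardware such as GPUs.
@tf.function
def squared_norm(a, b):
    return tf.reduce_sum(a * a + b * b)

result = squared_norm(tf.constant(3.0), tf.constant(4.0))  # 3^2 + 4^2
```

The same function body works eagerly too; the decorator only changes how it is executed.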
Tensorflow Text
Preprocessing and tokenization support is built into Tensorflow.
There is support for ragged tensors, since tensor sizes need not be constant for text.
Different types of tokenizers, like the whitespace tokenizer, unicode tokenizer and wordpiece tokenizer, were discussed.
Vocabulary support is built in, but supporting every language is a huge task, so wordpiece vocabulary generation is used to reduce the vocabulary size.
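As an illustration of how wordpiece keeps the vocabulary small, here is a minimal pure-Python sketch of the greedy longest-match-first algorithm; the toy vocabulary is my own example, not TF Text's API:

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first wordpiece tokenization of a single word.

    Subword pieces that continue a word carry a "##" prefix in the
    vocabulary, so a small vocabulary can still cover many words.
    """
    tokens, start = [], 0
    while start < len(word):
        end, piece = len(word), None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:          # no piece matches: unknown token
            return [unk]
        tokens.append(piece)
        start = end
    return tokens

vocab = {"un", "##aff", "##able", "play", "##ing"}
print(wordpiece_tokenize("unaffable", vocab))  # ['un', '##aff', '##able']
print(wordpiece_tokenize("playing", vocab))    # ['play', '##ing']
```

The real wordpiece vocabulary is learned from a corpus; this sketch only shows how a fixed vocabulary is applied.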
Tensorflow Lite
TF Lite is used to deploy models on mobile and embedded devices. On-device AI is important for latency, network connectivity and privacy reasons.
TF Lite runs on Android, iOS, Raspberry Pi, microcontrollers, etc.
Google's Portrait Mode uses on-device AI.
Model performance on edge devices can be improved in the below 3 ways:
- Quantization :- Reduce precision from 32-bit floats to 8-bit integers. Many hardware accelerators perform better with uint8.
- Pruning :- Remove unused connections/weights in the neural network.
- Hardware accelerators :- With support for accelerators like the GPU, Edge TPU and DSP, performance well beyond a plain CPU can be achieved.
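The pruning idea above can be sketched in plain Python as magnitude pruning, where the smallest-magnitude weights are zeroed out; this is a conceptual illustration, not the Tensorflow model-optimization API:

```python
def magnitude_prune(weights, sparsity):
    """Zero out roughly the smallest-magnitude `sparsity` fraction of weights.

    The zeroed connections can then be skipped at inference time,
    and the sparse weight tensor compresses much better.
    """
    k = int(len(weights) * sparsity)  # how many weights to drop
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    # Ties at the threshold may zero slightly more than k weights.
    return [0.0 if abs(w) <= threshold else w for w in weights]

weights = [0.1, -0.5, 0.05, 2.0, -0.02, 0.7]
print(magnitude_prune(weights, 0.5))  # [0.0, -0.5, 0.0, 2.0, 0.0, 0.7]
```

In practice pruning is applied gradually during training so the remaining weights can compensate.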
In TF Lite, delegates are used to hand tasks that need acceleration over to an accelerator such as the GPU.
Tensorflow Select acts as a bridge for ops not yet supported in Tensorflow Lite, although they run unoptimized. TF Lite also allows developers to reduce binary size by removing unnecessary ops.
Beyond mobile, the same code runs on other embedded devices like the Raspberry Pi, microcontrollers, etc. with no changes required.
Future Work :-
- Improved converter to make the code work on edge devices
- Better error handling and diagnostics
- Control flow support
- Stabilizing runtime bindings
- Support library for pre and post processing
- More ops for microcontrollers, improved conversion and testing tools, and support for Arduino
TF Distribution Strategy
Easy and simple support for distributing the training load across GPUs with just a few lines of code. The below strategies are supported:
- MirroredStrategy :- Supports training on multiple GPUs by mirroring the model and its variables on each of them.
- MultiWorkerMirroredStrategy :- Useful when training across machines. Similar to MirroredStrategy, it distributes the workload across each machine and to each GPU in every machine.
- ParameterServerStrategy :- Support for parameter servers.
- TPUStrategy :- Similar to the above strategies, but for TPUs instead of GPUs.
The tf.data input pipeline works seamlessly with any distribution strategy.
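A minimal sketch of the "few lines of code" claim, using MirroredStrategy with a toy model of my own (on a machine without GPUs it simply falls back to a single CPU replica):

```python
import tensorflow as tf

# MirroredStrategy replicates the model on every visible GPU and
# keeps the replicas in sync. Only model/optimizer creation needs
# to move inside strategy.scope().
strategy = tf.distribute.MirroredStrategy()
print("replicas:", strategy.num_replicas_in_sync)

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="sgd", loss="mse")

# A tf.data pipeline plugs in unchanged; model.fit distributes it.
dataset = tf.data.Dataset.from_tensor_slices(
    (tf.random.normal((32, 4)), tf.random.normal((32, 1)))
).batch(8)
model.fit(dataset, epochs=1, verbose=0)
```

Switching to MultiWorkerMirroredStrategy or TPUStrategy mostly means swapping the strategy object; the training code stays the same.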
Tensorflow.js
Nothing needs to be installed; TFJS runs directly in the browser. All data resides on the client side, which makes it privacy-preserving.
There are no server-side calls; everything runs in the browser. WebGL is used to provide all the mathematical operations needed for computation rather than implementing them from scratch.
A TFJS API for preprocessing data is now available.
React Native support has also been added.
Getting involved with Tensorflow
To get involved with Tensorflow, Explore ML academy events are happening across India, and interested students can volunteer.
Tensorflow user groups are present in Bangalore and other places for interested people.
You can also join any of the Tensorflow Special Interest Groups (SIGs).
Tensorflow RFCs can be commented on with suggestions or issues.
Model Optimization and Quantization
Model optimization can be done in two ways: optimizing after training, or optimizing along with training.
Quantization :- Reduce precision, for example from float32 to float16 or 8-bit integers. It improves performance and reduces model size, but is a lossy transformation.
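To see why quantization is lossy, here is a pure-Python sketch of the affine mapping commonly used to squeeze floats into uint8; this illustrates the arithmetic only and is not the TF Lite implementation:

```python
def quantize(xs):
    """Affine quantization: real x ~= scale * (q - zero_point), q in [0, 255]."""
    lo = min(min(xs), 0.0)  # the representable range must include 0.0
    hi = max(max(xs), 0.0)
    scale = (hi - lo) / 255.0 or 1.0
    zero_point = round(-lo / scale)
    return [max(0, min(255, round(x / scale) + zero_point)) for x in xs], scale, zero_point

def dequantize(qs, scale, zero_point):
    return [scale * (q - zero_point) for q in qs]

xs = [-1.0, 0.0, 0.5, 2.0]
qs, scale, zp = quantize(xs)
recovered = dequantize(qs, scale, zp)
# Each value comes back only up to the rounding step `scale`: lossy,
# but the tensor now needs a quarter of the memory of float32.
```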
Post-training quantization :- Train and save the model, then convert it using the TF Lite converter. Optimization options can be specified during conversion, but there is more loss since the optimization is done after training.
You can quantize just the weights using hybrid optimization, or perform full integer quantization of weights + activations by providing a representative dataset.
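The post-training flow described above looks roughly like this with the TF Lite converter; the toy model stands in for a real trained one:

```python
import tensorflow as tf

# A tiny untrained model stands in for a real trained one.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(2),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Optimization is requested at conversion time; by default this
# quantizes the weights (hybrid / dynamic-range quantization).
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# For full integer quantization of weights + activations, a
# representative dataset generator would also be supplied:
# converter.representative_dataset = my_representative_dataset

tflite_model = converter.convert()  # serialized flatbuffer for TF Lite
```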
Quantization-aware training :- Apply the optimization during training, so that the network is aware of it and can perform better than the above. Here the optimization is applied in the forward pass of training, just as it is during inference. Since the network is trained on the kind of weights it will see at inference time, it can adapt and achieve good results within the constraints.
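The key trick can be sketched in plain Python: the forward pass uses "fake-quantized" weights (quantize, then immediately dequantize), so the training loss already reflects the rounding error inference will see. This is a conceptual illustration, not the model-optimization toolkit's API:

```python
def fake_quantize(w, scale, zero_point):
    """Quantize then dequantize, so training sees inference-style weights."""
    q = max(0, min(255, round(w / scale) + zero_point))
    return scale * (q - zero_point)

# During training, the forward pass would use the fake-quantized copy:
weights = [0.31, -0.07, 0.42]
scale, zero_point = 1.0 / 255.0, 128
forward_weights = [fake_quantize(w, scale, zero_point) for w in weights]
# Gradients are passed "straight through" to the original float weights,
# which are what the optimizer actually updates.
```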
Tensorflow Extended (TFX)
Tensorflow Extended (TFX) helps you build a complete end-to-end machine learning pipeline. TFX integrates with a metadata store, which helps in tracking multiple experiments, comparing results and model performance, etc.
It has support for data visualization and validation using Tensorflow Data Validation (TFDV).
There is support for preprocessing, such as tokenization, with Tensorflow Transform.
Tensorflow Serving handles versioning and serving multiple models.
An example pipeline dealing with the Chicago Taxi dataset was demonstrated.
Swift for Tensorflow
The advantages of Swift compared to Python are that it is fast and has direct interoperability with both C and Python. It is cross-platform, easy to learn and use, and open source. It also supports differentiable programming.
A demo and Colab code for Swift were shown. A new fast.ai course on deep learning with Swift has also been released.
Overall it was an exciting and informative day. I got to learn many things I hadn't heard about before and caught up on all the latest developments. I am looking forward to what awaits in the future.