Source: Deep Learning on Medium
Deep Learning, Natural Language Processing and Climate Change
In which I discuss the costs and benefits of deep learning (and occasionally NLP specifically) for climate change.
- Benefit: deep learning helps extract knowledge from media (text, video, audio, images) in new ways, which can help us understand climate change impacts, including sources and previously unidentified contributors.
- Cost: deep learning can also consume massive amounts of processing power. Usefully, though, that power costs money, so many organisations have an incentive to reduce it.
Here I capture a few resources I’ve found on these topics. It’s by no means complete, so feedback/comments most welcome.
Deep learning as a contributor to climate change
While it’s fantastic that institutes like MILA have presented a vision for how the scientific community can collaborate, commercial enterprises will first and foremost optimise for cost. There are a few forces at play:
- Larger is better: deep learning performance still improves substantially with larger models, which take longer to train, consume more resources and cost more.
- From 2019 onwards, deep learning is becoming much more important to enterprises in general. In the last couple of years, big strides have been made in working with everyday spoken and written language (speech-to-text and natural language “understanding”), and this is only possible with deep neural networks. Specifically, intelligible (if not perfect) automated conversation transcription at scale is helping improve knowledge transfer and reduce risk, and tools for understanding and generating the written word are also improving rapidly.
- Automated machine learning optimisation: even basic configuration settings, like the structure of the neural network, can dramatically change outcomes, and support for automated experimentation with different combinations of settings has improved. This is generally referred to as AutoML. (For the technically inclined, this includes techniques such as “neural architecture search”, where the number of layers and neurons in a neural network is tweaked; you can get a feel for this with TensorFlow Playground.)
- Resource consumption for technology in general is increasing, and machine learning’s slice of that pie is growing too. Cloud computing promises “infinite resources” (compute and storage alike), and even with prices consistently dropping and cost optimisation at the top of the enterprise agenda, it’s safe to assume that a given company’s consumption and production of data will continue to increase. Machine learning, and the resources it uses, will almost certainly be a significant contributor to that.
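To make the AutoML idea above concrete, here is a minimal sketch of its simplest form: random search over a hypothetical configuration space. The search space and the `evaluate` function are illustrative assumptions only; in a real AutoML system, `evaluate` would train each candidate network and return its validation score, which is exactly where the resource cost comes from.

```python
import random

# Hypothetical search space: layer count, neurons per layer, learning rate.
SEARCH_SPACE = {
    "layers": [1, 2, 3, 4],
    "neurons": [16, 32, 64, 128],
    "learning_rate": [1e-1, 1e-2, 1e-3],
}

def sample_config(rng):
    """Draw one random configuration from the space."""
    return {key: rng.choice(values) for key, values in SEARCH_SPACE.items()}

def evaluate(config):
    """Stand-in for 'train the network and return validation accuracy'.
    A toy scoring function for illustration -- NOT a real training run."""
    return (config["layers"] * config["neurons"]) / (1 + config["learning_rate"])

def random_search(trials=20, seed=0):
    """The simplest AutoML loop: try N random configs, keep the best."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(trials):
        config = sample_config(rng)
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

Each of the `trials` iterations stands for one full training run, which is why automated architecture search multiplies the resource bill of a single model.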
It’s not all doom though: deep learning trends that can help limit climate change
People are pretty good optimisers, and in the machine learning community the pattern is predictable: once research shows something is actually possible, work starts on making it economically feasible in production, which, beyond robustness, also means cost-effective resource use.
Optimising the algorithms.
When deep learning became popular again (c. 2011), there was a huge backlog of amazing research from the 80s onwards to catch up on. Since then, better algorithms have been the other major reason why the quality of deep learning has improved over the last 7–8 years: new architectures (such as attention and transformers), better learning-rate schedules, better activation functions (the key function of a neuron) and better open data sets, to name a few.
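As a small illustration of why better activation functions mattered (my own sketch, not from the article): the sigmoid’s gradient shrinks towards zero for strongly activated neurons, starving deep networks of learning signal (the “vanishing gradient” problem), while ReLU’s gradient stays at 1 for any positive input.

```python
import math

def sigmoid_grad(x):
    """Derivative of the sigmoid activation: s(x) * (1 - s(x))."""
    s = 1 / (1 + math.exp(-x))
    return s * (1 - s)

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

# For a strongly activated neuron (x = 10), sigmoid's gradient is
# vanishingly small, so almost no learning signal flows backwards;
# ReLU passes the gradient through unchanged.
for x in [0.0, 2.0, 10.0]:
    print(f"x={x:5.1f}  sigmoid'={sigmoid_grad(x):.6f}  relu'={relu_grad(x):.1f}")
```

Switching from saturating activations like sigmoid to ReLU-style functions was one of the simple algorithmic changes that made much deeper networks trainable.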
Optimising specifically for cost and time.
While optimisations typically improve performance, Stanford University’s DAWNBench focuses specifically on reduced resource consumption: “DAWNBench is a benchmark suite for end-to-end deep learning training and inference. Computation time and cost are critical resources in building deep models, yet many existing benchmarks focus solely on model accuracy.” See how this spurred Jeremy Howard and the team at Fast.ai to cut the time to train an ImageNet model from “a few days” to 18 minutes, at a cost of $40.
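The arithmetic behind that result is worth making explicit: an 18-minute run for $40 implies an effective cluster rate of roughly $133/hour, i.e. the trade DAWNBench-style optimisation encourages is many machines for a short burst rather than one machine for days. A quick back-of-the-envelope check:

```python
# Fast.ai's DAWNBench ImageNet result: 18 minutes of training for $40.
minutes = 18
total_cost = 40.0

# Effective hourly rate of the whole cluster while it ran.
hourly_rate = total_cost / (minutes / 60)
print(f"Effective cluster rate: ${hourly_rate:.0f}/hour")
```

The total spend, not the hourly rate, is what matters for both the cloud bill and the energy footprint, which is why time-to-accuracy benchmarks reward this kind of burst training.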
Some of these high-performance machine learning models, particularly in NLP, became impractically big to push into production. Pretty exciting progress has been made in the last couple of years in improving machine intelligence for (mostly English) text, with the most influential being Google’s BERT at the end of 2018, an approach that beat state-of-the-art deep learning NLP in multiple categories at once. With parameter counts growing from 340m+ for BERT, through OpenAI’s GPT-2 reaching the mainstream press in February 2019, to Nvidia’s monstrous 8-billion-parameter model in Aug 2019, things got out of hand, with almost no one having the resources to use these models directly. (For example, in my own deep learning / NLP research at CognitionX, I found one instance of BERT (using the excellent bert-as-service) would consume 16GB of memory at startup and stabilise at around 6GB, which can cost many hundreds, if not thousands, of dollars a month: e.g. Heroku | AWS | Google | Azure.)
So how did people solve this? It’s not completely solved, but some success has been demonstrated with “deep learning model distillation”, such as HuggingFace’s DistilBERT, which employs a “teacher / student” approach to produce a much smaller model that retains well over 90% of the original’s performance, and on some tasks even improves on it.
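At the heart of the teacher/student approach is training the small student to match the large teacher’s temperature-softened output distribution, rather than only the hard labels. A minimal numeric sketch of that core loss (my own illustration; DistilBERT’s actual training combines this with additional loss terms):

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with a temperature; higher T gives a softer distribution,
    exposing more of the teacher's 'dark knowledge' about wrong classes."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the teacher's softened distribution and the
    student's softened distribution -- the core distillation signal."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))

# The loss is smallest when the student mimics the teacher exactly.
teacher = [4.0, 1.0, 0.2]
aligned = distillation_loss(teacher, teacher)            # student == teacher
misaligned = distillation_loss([0.2, 1.0, 4.0], teacher) # student disagrees
```

Because the student only needs to reproduce the teacher’s outputs, not rediscover them from raw data, it can be far smaller and cheaper to run in production.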
Deep learning technology as a tool to help combat climate change
Prof. Yoshua Bengio’s MILA presented (at an Element AI talk in London, 31 Jan 2019) these headline environmental applications of AI:
- Optimising energy resources: smart grids, demand forecasting, minimising transport costs
- Visceralisation: visualisation that aims to bring emotion into the communication, citing research indicating that strong feelings help create lasting memories; for example, visualising the future impact of climate change. See the UK National Trust’s simulation of the impact of climate change on historic buildings (25 Oct 2019)
- Climate modeling: predicting the effect of climate change
- Accelerated R&D of new materials
- Conservation efforts
Bengio also presented humanitarian applications of AI:
- Computer vision on satellite images: crisis response, human rights monitoring, conservation efforts (such as detecting illegal logging in Indonesia, Oct 2019)
- Agriculture: plant disease detection, optimisation of treatments
- Healthcare: detecting medical conditions, proposed treatments, helping with disabilities and reducing costs
- Education and the environment
Climate Change AI: a comprehensive introduction to a range of research papers on machine learning for climate change, covering electricity systems, transport, buildings and cities, industry, farms and forests, CO2 removal, climate prediction, societal impacts, solar geoengineering, education and finance.
Montreal Declaration for Responsible Development of AI: “proposed ethical principles based on 10 fundamental values: well-being, respect for autonomy, protection of privacy and intimacy, solidarity, democratic participation, equity, diversity inclusion, prudence, responsibility and sustainable development. [MILA]”
Energy and Policy Considerations for Deep Learning in NLP (PDF, 5 June 2019): the widely cited paper quantifying the energy use and carbon footprint of training large NLP models.
Deeplearning.ai: a great course for programmers who are comfortable with a little algebra and calculus; you finish with a solid understanding of the fundamentals and hands-on experience.
State of play for cloud computing in 2019: the number-one priority for enterprises is cloud cost optimisation
Cloud cost is dropping dramatically (Jan 2018): A look at Amazon AWS pricing over the last few years
Born Again Neural Networks (12 May 2018): knowledge distillation through teacher/student secondary learning
14 Dec 2019, Vancouver: NeurIPS academic research workshop, Tackling Climate Change with Machine Learning, including global tech leader and senior Google Fellow Jeff Dean. The workshop will not record proceedings, so those attending will need to take notes.