Original article was published by Leon Overweel on Artificial Intelligence on Medium
Productized Artificial Intelligence 🔌
OpenAI is exclusively licensing GPT-3 to Microsoft. What does this mean for their future relationship?
GPT-3 is OpenAI’s latest gargantuan language model (see DT #42) that’s uniquely capable of performing many different “text-in, text-out” tasks — demos range from imitating famous writers to generating code (#44) — without needing to be fine-tuned: its crazy scale makes it a few-shot learner.
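To make "few-shot" concrete: instead of fine-tuning, the task is demonstrated inside the text prompt itself with a handful of worked examples, and the model is asked to continue the pattern. The sketch below only assembles such a prompt as a string. The function name, the "Input:/Output:" layout and the translation examples are illustrative assumptions, not OpenAI's API or a required format.

```python
def build_few_shot_prompt(task_description, examples, query):
    """Assemble a plain-text few-shot prompt.

    Hypothetical helper: the layout (instructions, worked examples,
    then the new input with an empty output slot) is one common
    convention, not something the model requires.
    """
    lines = [task_description, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The final output slot is left empty for the model to complete.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Translate English to French.",
    [("cheese", "fromage"), ("sea otter", "loutre de mer")],
    "peppermint",
)
print(prompt)
```

The resulting string is what would be sent as the "text-in" half of a text-in, text-out request; swapping the examples changes the task without touching the model's weights.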
In July 2019, OpenAI announced it got a $1 billion investment from Microsoft. Back then, this raised some eyebrows in the (academic) machine learning community, which can sometimes be a bit allergic to the commercialization of AI (#19). The exact terms of the investment were never disclosed, but some key elements of the deal were. Tom Simonite for WIRED:
Most interesting bit of the OpenAI announcement: “we intend to license some of our pre-AGI technologies, with Microsoft becoming our preferred partner.”
Now, a year and a bit later, that’s exactly what happened. From the OpenAI blog:
In addition to offering GPT-3 and future models via the OpenAI API, and as part of a multiyear partnership announced last year, OpenAI has agreed to license GPT-3 to Microsoft for their own products and services.
What does that mean? Nick Statt for The Verge:
A Microsoft spokesperson tells The Verge that its exclusive license gives it unique access to the underlying code of GPT-3, which contains technical advancements it hopes to integrate into its products and services.
In their blog post, Microsoft pitches this as a way to “expand [their] Azure-powered AI platform in a way that democratizes AI technology,” to which the community again reacted negatively: if you want to democratize AI, why not just open-source GPT-3’s code and training data?* I agree that “democratizing” is a bit of a stretch, but I think there’s a much more interesting discussion to be had here than the one on a self-congratulatory word choice in a corporate press release. Perhaps ironically, that discussion also starts from overanalyzing another few words in that very same press release.
According to Microsoft’s blog post about the licensing deal, GPT-3 “is trained on Azure’s AI supercomputer.” I wonder if that means OpenAI is now using Microsoft’s open-source DeepSpeed library (#34) to train its GPT models. DeepSpeed is a library for distributed training of enormous ML models that has specific features to support training large Transformers; Microsoft Research claimed in May that it’s capable of training models with up to 170 billion parameters (#40). GPT-3 is a 175-billion-parameter Transformer that was released in June, just one month later. That seems unlikely to be a coincidence, and Microsoft’s latest DeepSpeed update (#49) even includes some experimental work using the GPT-3 architecture.
So this suggests that the partnership goes beyond just the exchange of Microsoft’s money and compute for OpenAI’s trained models and ML brand strength (an exchange of cloud for clout, if you will) that we previously expected. Are the companies actually also deeply collaborating on ML and systems engineering research? I’d love to find out.
If so, this could be an early indication that Microsoft — who I’m sure is at least a little bit envious of Google’s ownership of DeepMind — will eventually want to acquire OpenAI. And it could be a great fit. Looking at Microsoft’s recent acquisition history, it has so far let GitHub (which it acquired two years ago) continue to operate largely autonomously. This makes it an attractive potential parent company for OpenAI: the lab probably wouldn’t have to give up too much of its independence under Microsoft’s stewardship. So unless OpenAI actually invents and monetizes some form of artificial general intelligence (AGI) in the next five to ten years — which I don’t think they will — I wouldn’t be surprised if they end up becoming Microsoft’s DeepMind.
Quick productized AI links 🔌
- 🗂 Amsterdam (where I live!) and Helsinki (where I don’t live) have launched their “AI algorithm registries.” These are actually a pretty cool idea: whenever a municipality “utilizes algorithmic systems as part of [their] city services,” these systems must be cataloged in the city’s algorithm registry. Amsterdam’s registry currently has three entries: (1) license plate-recognizing automated parking control cars, (2) a pilot for algorithm-assisted fraud surveillance for holiday home rentals, and (3) a natural language processing system for categorizing reports of trash in public space. These registries may become a good source of productized AI links for me, but more importantly, this is a great step for building transparency, trust and accountability into these systems.
Machine Learning Research 🎛
- 🖼 An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale is a paper under review for ICLR 2021 that’s been making the rounds on Twitter. I found Yannic Kilcher’s explainer video (which starts with a lovely rant about “double-blind” peer review) a good introduction to the model, which could be the start of Transformers overtaking convolutional models at the very largest scales of computer vision.
- 👩🏾💻 Building on their previous three years of graduate school application mentoring programs, Black in AI has launched an Academic Positions program to support Black junior researchers getting started in “careers in academia, industry, and policy.” The launch blog post includes details about the program, tips on how academics and organizations can support it, and lots of additional resources. This is a great link to amplify within your ML network!
- ⚡️ TensorSensor is a Python package that “clarifies” (visualizes) the dimensions of tensors in NumPy, TensorFlow or PyTorch. I recently had to reproduce a paper that wrote down its math in a simplified form that ignored the out-channel dimension of convolutional filters, and spent a lot of time trying to get all my matrices to line up correctly with that extra dimension. This tool would’ve made that a lot easier! Also check out Terence Parr’s introduction to TensorSensor.
- 🔗 Cool new arXiv.org feature: Papers with Code-discovered implementations are now linked right on a paper’s abstract page. I’ve always found it quite easy to find any available implementations with a few quick Google and GitHub searches, but integrations like this are great for discoverability.
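As a small illustration of the out-channel bookkeeping mentioned in the TensorSensor item above, here is a minimal sketch that computes the output shape of a 2D convolution by hand. It does not use TensorSensor itself. The function name is hypothetical, and the layout is an assumption: inputs as (batch, channels, height, width) and filters as (out_channels, in_channels, kernel_height, kernel_width), which is the convention PyTorch uses.

```python
def conv2d_output_shape(input_shape, filter_shape, stride=1, padding=0):
    """Return (batch, out_channels, out_h, out_w) for a 2D convolution.

    Assumes an NCHW input and a filter bank laid out as
    (out_channels, in_channels, kernel_h, kernel_w).
    """
    batch, in_channels, height, width = input_shape
    out_channels, filter_in_channels, kernel_h, kernel_w = filter_shape
    if in_channels != filter_in_channels:
        raise ValueError(
            f"input has {in_channels} channels but filters expect "
            f"{filter_in_channels}"
        )
    out_h = (height + 2 * padding - kernel_h) // stride + 1
    out_w = (width + 2 * padding - kernel_w) // stride + 1
    return (batch, out_channels, out_h, out_w)


# 8 RGB images of 32x32 pixels, convolved with 16 filters of size 5x5.
output_shape = conv2d_output_shape((8, 3, 32, 32), (16, 3, 5, 5))
print(output_shape)  # (8, 16, 28, 28)
```

The out-channel dimension (16 here) is exactly the one that simplified single-filter math tends to leave out, and it becomes the channel dimension of the next layer's input.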
Artificial Intelligence for the Climate Crisis 🌍
- ⛅️ Some recent progress on nowcasting (predicting over the next few hours) the locations of clouds: Berthomier et al. (2020) compared the effectiveness of several deep learning models for the task, and Jack Kelly of Open Climate Fix open-sourced a Python notebook that approaches the same problem using optical flow. This work is a step towards improving predictions of solar panels’ power output, an important task for operators as an increasing fraction of the electricity supply on their grids transitions to solar.
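The idea behind motion-based nowcasting can be sketched in a few lines: estimate how the clouds moved between the last two frames, then keep moving them the same way. The toy below is not the Open Climate Fix notebook and not real (dense) optical flow. It assumes a single global displacement for the whole image, found by brute-force search, and all function names and the tiny grids are made up for illustration.

```python
def shift(frame, dy, dx):
    """Shift a 2D grid by (dy, dx); uncovered cells become 0 (clear sky)."""
    height, width = len(frame), len(frame[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                out[y][x] = frame[src_y][src_x]
    return out


def estimate_global_motion(prev, curr, max_shift=2):
    """Find the (dy, dx) that best maps prev onto curr (squared error)."""
    best, best_err = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            moved = shift(prev, dy, dx)
            err = sum(
                (a - b) ** 2
                for row_moved, row_curr in zip(moved, curr)
                for a, b in zip(row_moved, row_curr)
            )
            if err < best_err:
                best, best_err = (dy, dx), err
    return best


def nowcast(prev, curr, steps=1, max_shift=2):
    """Extrapolate curr forward by repeating the estimated motion."""
    dy, dx = estimate_global_motion(prev, curr, max_shift)
    return shift(curr, dy * steps, dx * steps)


# A single "cloud" pixel drifting one cell to the right per frame.
frame_t0 = [[0] * 5 for _ in range(5)]
frame_t1 = [[0] * 5 for _ in range(5)]
frame_t0[1][1] = 1
frame_t1[1][2] = 1

motion = estimate_global_motion(frame_t0, frame_t1)
forecast = nowcast(frame_t0, frame_t1)
print(motion)       # (0, 1)
print(forecast[1])  # [0, 0, 0, 1, 0]
```

Real optical flow estimates a separate motion vector per pixel or region, so clouds moving in different directions can be handled; the single global shift here is only meant to show the extrapolation step.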
Cool Things ✨
- 🛴 I came across this short story by Janelle Shane when it premiered as a New York Times “Op-Ed From the Future” last year, but forgot to share it at the time. I rediscovered and reread it this week, and I still think it’s delightful: We Shouldn’t Bother the Feral Scooters of Central Park.
- 🎨 Imaginaire is NVIDIA’s universal library for image and video synthesis, including algorithms such as SPADE (GauGAN), pix2pixHD, MUNIT, FUNIT, COCO-FUNIT, vid2vid, and few-shot vid2vid. Check out this demo video to see what it’s capable of, from summer-to-winter transformations to automatically animating motion into pictures.