Original article was published on Artificial Intelligence on Medium
Jetson Lc0 – Running Leela Chess Zero on Nvidia Jetson, a Portable GPU Device
Hello there! I am 15-year-old Woman International Master Evelyn Zhu. The beautiful game of chess has been and continues to be a huge part of my life. At the age of seven, I started playing chess competitively. Since then, I’ve worked my way up to being one of the top players of my age in the USA. Recently, chess engines have grown extremely powerful and have produced absolutely amazing games. Like most other chess players, I run an engine on my laptop for purposes such as opening preparation and game analysis.
For the past two years, I’ve been more or less settled with the neural network (NN) based Leela Chess Zero (aka Lc0) engine, an open-source implementation of Google DeepMind’s AlphaZero. It has consistently beaten other engines, most recently winning the 17th season of the TCEC against Stockfish with a score of 52.5–47.5. Lc0 is quite fascinating as it seems to evaluate positions in a more positional manner, without searching as deeply as non-NN engines do. However, in order to unleash the power of its NN algorithms, we must use some kind of GPU accelerator device, such as a modern Nvidia GPU from the GTX 10 or RTX 20 series.
Modern laptops are often equipped with Nvidia GTX or RTX GPUs, which are powerful enough to run Lc0. The problem is that the compact profile of a laptop often can’t sustain running Lc0. For example, my laptop (with a GTX 1650) sometimes crashes from overheating when running Lc0 in ChessBase.
I was wondering if there was a way for laptops to connect to an external, portable Nvidia GPU device and, at the same time, still be able to launch the Lc0 engine in the usual fashion with ChessBase.
I raised this question to my dad, who is familiar with computer peripheral devices, and he told me there is probably a solution. He mentioned that Nvidia recently released a considerably powerful GPU device, the Jetson Xavier NX module, which is intended for IoT (Internet of Things) purposes such as AI-powered robotics and autonomous vehicles. This new device has 384 NVIDIA CUDA® Cores and 48 Tensor Cores and claims to reach 21 TOPS of AI performance, which makes it appear to be an exceptionally good candidate for running Lc0.
He advised me to figure out the following three points to make this work:
1. Launch Lc0 engine backend inside the Xavier NX device: The Lc0 source code is mainly for Intel x86/amd64 architecture for both Windows and Linux (particularly Ubuntu Linux 18.04). Jetson Xavier NX is based on ARM64 architecture but does come with Ubuntu Linux 18.04. As such, I need to recompile the source code against ARM64 so the engine can run inside the Xavier NX device.
2. Launch UCI engine frontend within ChessBase on Windows laptop: Since I still want to use a Windows laptop and ChessBase the same way I did before, I have to create a UCI engine program that ChessBase can load and that processes the UCI commands ChessBase issues. This does not seem very difficult since I can simply borrow the main program from the Lc0 source code and keep only the part with the UCI loop logic.
3. Communications between frontend UCI and backend Lc0: I need to create a data communication channel to transfer the UCI commands to the backend Lc0 engine. Then, I have to return the results back to the frontend UCI engine. This can be done via either network programming or serial port programming since Jetson Xavier NX supports both.
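To illustrate points 2 and 3, here is a minimal Python sketch of a frontend relay. It is not the program described in this article, which is built from the Lc0 C++ source; it assumes the network option (a TCP socket) rather than a serial port, and the names `pump`, `relay`, and `main` are hypothetical. The GUI talks to the relay over stdin/stdout as it would with any UCI engine, and the relay forwards every line to the backend and streams the replies back.

```python
import socket
import sys
import threading


def pump(src, dst):
    """Copy text lines from src to dst, flushing after each line."""
    for line in src:
        dst.write(line)
        dst.flush()


def relay(gui_in, gui_out, sock):
    """Bridge a GUI's input/output streams to an engine reachable over sock."""
    to_engine = sock.makefile("w", encoding="utf-8", newline="\n")
    from_engine = sock.makefile("r", encoding="utf-8", newline="\n")
    # Engine output is forwarded on its own thread so that long-running
    # searches can stream "info" lines while the GUI keeps sending commands.
    reader = threading.Thread(target=pump, args=(from_engine, gui_out), daemon=True)
    reader.start()
    pump(gui_in, to_engine)        # returns when the GUI closes its side
    sock.shutdown(socket.SHUT_WR)  # signal the backend that no more commands follow
    reader.join()                  # drain whatever the engine still sends


def main(host, port):
    """Connect to the backend and relay this process's stdin/stdout.

    host and port are placeholders; the article does not state them.
    """
    with socket.create_connection((host, port)) as sock:
        relay(sys.stdin, sys.stdout, sock)
```

Using two independent line pumps instead of a strict request/reply loop matters for UCI, because a command such as `go infinite` produces output until the GUI later sends `stop`.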
After an intensive three weeks of work, I managed to come up with a simple Windows UCI engine program (point 2) as well as the communication mechanism between the UCI frontend and the Lc0 backend (point 3). My dad helped me crack point 1, as it took some effort to tweak and recompile the Lc0 source code against ARM64 and the Nvidia CUDA/cuDNN libraries so that it loads successfully on the Xavier NX device.
With everything in place, I’m able to simply connect the Xavier NX device to my Windows laptop with a MicroUSB-to-USB data cable, launch the Lc0 engine on the Xavier NX device (via PuTTY, a secure remote login tool), and then, as usual, load the UCI engine from ChessBase. And that’s it! The performance is great with the popular network 256x20-t40-1541.pb.gz.
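On the device side, something has to accept the connection from the laptop and hand each command to the engine process. The following is a minimal Python sketch under stated assumptions, not the backend used here: it assumes a TCP link (a serial port would work differently), and `copy_lines` and `serve_once` are hypothetical names. It accepts one client, starts the engine as a child process, and copies lines in both directions until the client disconnects.

```python
import socket
import subprocess
import threading


def copy_lines(src, dst):
    """Copy binary lines from src to dst, flushing after each one."""
    for line in src:
        dst.write(line)
        dst.flush()


def serve_once(server, engine_cmd):
    """Accept one client on server and bridge it to a fresh engine process.

    engine_cmd is the argument list used to start the engine; its value
    is an assumption and depends on where the engine binary is installed.
    """
    conn, _addr = server.accept()
    engine = subprocess.Popen(
        engine_cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE
    )
    with conn, conn.makefile("rb") as from_client, conn.makefile("wb") as to_client:
        # Engine output goes back to the client on a separate thread.
        writer = threading.Thread(
            target=copy_lines, args=(engine.stdout, to_client), daemon=True
        )
        writer.start()
        copy_lines(from_client, engine.stdin)  # ends when the client disconnects
        try:
            engine.stdin.write(b"quit\n")      # standard UCI shutdown command
            engine.stdin.close()
        except OSError:
            pass                               # engine had already exited
        writer.join()
    engine.wait()
```

A call might look like `serve_once(socket.create_server(("0.0.0.0", 5000)), ["./lc0"])`, where the port and the engine path are placeholders. Because the engine command is just a parameter, the same bridge could start any other UCI engine instead.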
Here is a demo video showing how to run the Jetson Lc0 engine.
The Jetson Xavier NX device is small and very convenient to carry. I also managed to put it inside a case that is dedicated to the Jetson Nano development board.
Besides the Jetson Xavier NX, the same programs can be used on the more powerful Jetson AGX Xavier, which doubles the performance of Lc0.
The frontend UCI engine doesn’t have to be paired with the Jetson Xavier embedded device. Instead, it can connect to any engine running on a remote machine. For example, we can compile Lc0 or Stockfish to run on a remote Linux host with a powerful GPU or a strong CPU.
I feel quite happy with all the effort I put into this portable Leela Chess Zero device, and it really solves my laptop crash issue! I am now using this device for any chess analysis that needs an engine’s help. I’m hoping my work will help many other chess enthusiasts who face a similar issue.
More information about the design of this device can be found on my website http://www.ezchess.org/
Evelyn Zhu is a Woman International Master and U.S. National Master from Long Island, New York. She started playing chess competitively at a young age and has won numerous individual and team championships. She greatly enjoys giving back to the chess community through her work as a tournament organizer, tournament director, and camp coach. Along with chess, Evelyn finds passion in computer science and engineering. As such, she frequently participates in coding competitions and enjoys learning about the potential of computers and AI. You can reach her at firstname.lastname@example.org