Training Your Neural Net with eGPU Acceleration on Mac with Tensorflow 1.5



I finally got this thing done. It was really a nightmare for the whole past week.

— Me Myself

I recently bought an eGPU case for my MacBook and spent almost a whole week getting the equipment to work. Congrats to myself: it worked before I crashed.

Firstly, the environment:

  • Mantiz eGPU – Thunderbolt 3
  • MacBook (early 2015; a shame I was not aware that it only has Thunderbolt 2. Use a TB3 machine if you have one)
  • NVIDIA GPU card. I am using a GeForce 950, by the way.

Ignore the following if you are using Thunderbolt 3. Also, you may check other eGPUs if you are using TB2, just to save costs.

A picture of the Mantiz dock, from egpu.io.

Enable GPU Card

On a MacBook, the right solution depends on your macOS version. You can check it under ‘Apple’ -> ‘About This Mac’. As the screenshot shows, I am using 10.13.4.

Disable SIP

No matter which version you are using, you will always need to disable SIP first.

To check whether it is enabled, open Terminal and run:

csrutil status

If it says it has been enabled, do the following:

  1. Reboot your machine and hold Command + R to enter recovery mode.
  2. From the menu bar, select Terminal, then type:

csrutil disable

  3. Restart.
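
After the restart, the same check can be scripted. Below is a minimal Python sketch, assuming csrutil prints a line of the form "System Integrity Protection status: enabled." (or "disabled."). The function names are my own and not part of any tool.

```python
import subprocess


def sip_is_enabled(status_text):
    """Return True if the csrutil status line reports SIP as enabled."""
    # Assumption: the report contains "status: enabled" or "status: disabled".
    return "status: enabled" in status_text.lower()


def read_sip_status():
    """Run `csrutil status` (macOS only) and return its raw output."""
    result = subprocess.run(
        ["csrutil", "status"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout
```

On the Mac, `sip_is_enabled(read_sip_status())` should come back False once SIP has been disabled.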

Enable GPU

For this specific version, 10.13.4, I personally prefer using purge-wrangler. For others, you may check on egpu.io for more information. To install it:

curl -s "https://api.github.com/repos/mayankk2308/purge-wrangler/releases/latest" | grep '"browser_download_url":' | sed -E 's/.*"([^"]+)".*/\1/' | xargs curl -L -s -0 > purge-wrangler.sh && chmod +x purge-wrangler.sh && ./purge-wrangler.sh && rm purge-wrangler.sh

Then do:

purge-wrangler

You will see a menu like the one below:

Type 2 and press Enter. It will do everything for you. After the reboot, you will find a new icon on your Mac.

To make sure your eGPU is enabled, open ‘Activity Monitor’ and press Command + 4. You will find two GPUs listed there.
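
If you prefer the terminal to Activity Monitor, the GPU list can also be read from `system_profiler SPDisplaysDataType`. This is only a sketch: it assumes the report contains one "Chipset Model:" line per GPU, and the helper names are hypothetical.

```python
import subprocess


def chipset_models(profiler_text):
    """Collect the value of every 'Chipset Model:' line in the report."""
    models = []
    for line in profiler_text.splitlines():
        key, _, value = line.strip().partition(":")
        if key == "Chipset Model":
            models.append(value.strip())
    return models


def read_display_report():
    """Run `system_profiler SPDisplaysDataType` (macOS only)."""
    result = subprocess.run(
        ["system_profiler", "SPDisplaysDataType"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout
```

With the eGPU attached and working, `chipset_models(read_display_report())` should list two entries: the built-in GPU and the NVIDIA card.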


Enable Tensorflow GPU computing

This is the nasty part. Please make sure that you use the exact version of every dependency. This part is mainly inspired by the source linked here.

CUDA

tar -zxf cudnn-9.1-osx-x64-v7-ga.tgz
cd cuda
sudo cp -RPf include/* /Developer/NVIDIA/CUDA-9.1/include/
sudo cp -RPf lib/* /Developer/NVIDIA/CUDA-9.1/lib/
sudo ln -s /Developer/NVIDIA/CUDA-9.1/lib/libcudnn* /usr/local/cuda/lib/

Then add path variables:

vim ~/.bash_profile
# Add the following four lines to the file, then save and quit:
export CUDA_HOME=/usr/local/cuda
export DYLD_LIBRARY_PATH=$CUDA_HOME/lib:$CUDA_HOME/extras/CUPTI/lib
export LD_LIBRARY_PATH=$DYLD_LIBRARY_PATH
export PATH=$CUDA_HOME/bin:$PATH
# Reload the profile in the current shell:
source ~/.bash_profile
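
A typo in these variables tends to surface much later as a confusing build error, so it may be worth checking them right away. Here is a small Python sketch that inspects an environment mapping for the values set above. `missing_cuda_settings` is a hypothetical helper of mine, and it only checks the directories named in this post.

```python
import os


def missing_cuda_settings(env):
    """Return a list of problems found in the CUDA-related variables."""
    cuda_home = env.get("CUDA_HOME")
    if not cuda_home:
        return ["CUDA_HOME is not set"]
    cuda_home = cuda_home.rstrip("/")
    problems = []
    # DYLD_LIBRARY_PATH should contain $CUDA_HOME/lib.
    lib_dirs = env.get("DYLD_LIBRARY_PATH", "").split(":")
    if cuda_home + "/lib" not in lib_dirs:
        problems.append("DYLD_LIBRARY_PATH lacks CUDA_HOME/lib")
    # PATH should contain $CUDA_HOME/bin so that nvcc can be found.
    path_dirs = env.get("PATH", "").split(":")
    if cuda_home + "/bin" not in path_dirs:
        problems.append("PATH lacks CUDA_HOME/bin")
    return problems
```

Calling `missing_cuda_settings(os.environ)` in a fresh shell should return an empty list if the profile was picked up.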

Be sure there is a valid output from deviceQuery:

cd /Developer/NVIDIA/CUDA-9.1/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery
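
What counts as a "valid output" can be checked mechanically: the CUDA sample ends its report with a "Result = PASS" line on success and "Result = FAIL" otherwise. A minimal Python sketch of that check follows; the function names are my own.

```python
import subprocess


def device_query_passed(output):
    """True if some line of the report is exactly 'Result = PASS'."""
    return any(line.strip() == "Result = PASS" for line in output.splitlines())


def run_device_query(binary="./deviceQuery"):
    """Run the compiled sample and return everything it printed."""
    result = subprocess.run(
        [binary],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout
```

From the deviceQuery folder, `device_query_passed(run_device_query())` should be True before you move on.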

If any error occurs, keep tweaking the CUDA version until it is resolved; otherwise the next steps will never work. Here is my sample output:

Install Tensorflow Build Dependencies

brew install coreutils

Install OpenMP

brew install cliutils/apple/libomp

Install bazel 0.9.0: download this formula and overwrite the latest version with it.

mv ./bazel.rb /usr/local/Homebrew/Library/Taps/homebrew/homebrew-core/Formula/bazel.rb
brew install bazel
bazel version

Please make sure it reports 0.9.0.
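
The version check can be automated too. The Python sketch below pulls the "Build label:" line out of the `bazel version` output. It assumes a Homebrew build may append a suffix such as "-homebrew" to the label, and both helper names are hypothetical.

```python
def bazel_build_label(version_output):
    """Return the text after 'Build label:', or None if the line is absent."""
    for line in version_output.splitlines():
        if line.startswith("Build label:"):
            return line.split(":", 1)[1].strip()
    return None


def is_expected_bazel(version_output, expected="0.9.0"):
    """True if the label is `expected`, optionally followed by '-suffix'."""
    label = bazel_build_label(version_output)
    if label is None:
        return False
    return label == expected or label.startswith(expected + "-")
```

Feed it the text printed by `bazel version`. Anything other than 0.9.0 means the formula swap did not take effect.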

We have to use Xcode 8.2 to get the correct Clang version. Download it from here. You may rename it to Xcode8.2.app so that it does not overwrite your current Xcode, then put it into the Applications folder.

To activate:

sudo xcode-select -s /Applications/Xcode8.2.app

To switch back to the latest one afterwards:

sudo xcode-select -s /Applications/Xcode.app

Also, we had better create a new conda environment.

conda create -n egpu python=3.6
source activate egpu
pip install six numpy wheel

Build Tensorflow

Download Tensorflow

git clone https://github.com/tensorflow/tensorflow.git -b v1.5.0

To make it compatible with macOS:

cd tensorflow
sed -i -e "s/ __align__(sizeof(T))//g" tensorflow/core/kernels/concat_lib_gpu_impl.cu.cc
sed -i -e "s/ __align__(sizeof(T))//g" tensorflow/core/kernels/depthwise_conv_op_gpu.cu.cc
sed -i -e "s/ __align__(sizeof(T))//g" tensorflow/core/kernels/split_lib_gpu.cu.cc
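
To see what these three sed commands actually do: each one deletes every occurrence of the text " __align__(sizeof(T))" (leading space included) from one source file. Here is the same substitution as a Python sketch. The sample declaration in the comment is only an illustration of the kind of line affected, and the helper names are mine.

```python
from pathlib import Path

ALIGN_ATTRIBUTE = " __align__(sizeof(T))"


def strip_align(source):
    """Remove every occurrence of the attribute, like sed's s///g."""
    return source.replace(ALIGN_ATTRIBUTE, "")


def strip_align_in_file(path):
    """Rewrite one file in place and report whether anything changed."""
    file_path = Path(path)
    original = file_path.read_text()
    patched = strip_align(original)
    if patched != original:
        file_path.write_text(patched)
    return patched != original


# Illustrative input line:
#   extern __shared__ __align__(sizeof(T)) unsigned char smem[];
# becomes:
#   extern __shared__ unsigned char smem[];
```

If `strip_align_in_file` returns False for one of the three files, either the patch was already applied or you checked out a different TensorFlow version.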

To compile (this may take half an hour):

./configure
#Please specify the location of python.: Accept the default option
#Please input the desired Python library path to use.: Accept the default option
#Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: n
#Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: n
#Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: n
#Do you wish to build TensorFlow with XLA JIT support? [y/N]: n
#Do you wish to build TensorFlow with GDR support? [y/N]: n
#Do you wish to build TensorFlow with VERBS support? [y/N]: n
#Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
#Do you wish to build TensorFlow with CUDA support? [y/N]: y
#Please specify the CUDA SDK version you want to use, e.g. 7.0.: 9.1
#Please specify the location where CUDA 9.1 toolkit is installed.: Accept the default option
#Please specify the cuDNN version you want to use.: 7
#Please specify the location where cuDNN 7 library is installed.: Accept the default option
##Please specify a list of comma-separated Cuda compute capabilities you want to build with.
##You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus. (GTX10X0: 6.1, GTX9X0: 5.2)
#Please note that each additional compute capability significantly increases your build time and binary size.: 6.1 (enter 5.2 instead for a GTX 9X0 card such as the GeForce 950 used here)
#Do you want to use clang as CUDA compiler? [y/N]: n
#Please specify which gcc should be used by nvcc as the host compiler.: Accept the default option
#Do you wish to build TensorFlow with MPI support? [y/N]: n
#Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified: Accept the default option
#Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n

export CUDA_HOME=/usr/local/cuda
export DYLD_LIBRARY_PATH=/usr/local/cuda/lib:/usr/local/cuda/extras/CUPTI/lib
export LD_LIBRARY_PATH=$DYLD_LIBRARY_PATH
export PATH=$CUDA_HOME/bin:$PATH

#bazel clean --expunge
bazel build --config=cuda --config=opt --action_env PATH --action_env LD_LIBRARY_PATH --action_env DYLD_LIBRARY_PATH //tensorflow/tools/pip_package:build_pip_package

Install the Built Tensorflow

bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
pip install /tmp/tensorflow_pkg/tensorflow-1.5.0-cp36-cp36m-macosx_10_7_x86_64.whl
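
The wheel's file name depends on your Python version and platform tags, so it may not match the one above exactly. Rather than guessing, you can look it up. This is a rough Python sketch; `newest_wheel` is a hypothetical helper, and the "tensorflow-*.whl" pattern is my assumption about how the file is named.

```python
import glob
import os


def newest_wheel(directory):
    """Return the most recently modified tensorflow wheel, or None."""
    wheels = glob.glob(os.path.join(directory, "tensorflow-*.whl"))
    if not wheels:
        return None
    return max(wheels, key=os.path.getmtime)
```

`newest_wheel("/tmp/tensorflow_pkg")` gives the path to hand to pip.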

Congrats if everything ran successfully.

Test

Try the following:

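import tensorflow as tf

# Pin the ops to the first GPU.
with tf.device('/gpu:0'):
    x = tf.constant(10)
    y = tf.constant(20)
    mul = tf.multiply(x, y)

# log_device_placement prints which device each op runs on.
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    print(sess.run(mul))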

Then you will see that the output comes from your external GPU card. Beautiful!
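
As a second sanity check, you can ask TensorFlow which devices it sees. The sketch below uses `device_lib.list_local_devices()` from TensorFlow 1.x. The filtering helper is my own and is kept separate so that it can be tried without TensorFlow installed.

```python
def gpu_names(devices):
    """Keep the names of devices whose device_type is 'GPU'."""
    return [device.name for device in devices if device.device_type == "GPU"]


def tensorflow_gpu_names():
    """Ask TensorFlow 1.x which local GPUs it can see."""
    # Imported lazily so gpu_names() stays usable without TensorFlow.
    from tensorflow.python.client import device_lib
    return gpu_names(device_lib.list_local_devices())
```

An empty list from `tensorflow_gpu_names()` means the build cannot see the card, and it is time to revisit the CUDA steps.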

Source: Deep Learning on Medium