Step-by-step: Deep Learning Server Environment Setup (Windows 10 & CentOS 7 Dual Boot)

Introduction

I recently got my own PC, which came with Windows 10. However, gaming performance and document editing are not what matter to me, since I am not a gamer at all. I am also taking a deep learning course this semester, so I decided to turn this computer into a dual-boot remote server that I can connect to from my laptop.

Here are step-by-step instructions for setting up a deep learning environment on CentOS 7, along with some basic knowledge about CentOS 7 installation, disk partitioning, OpenSSH configuration, etc.

Most important of all, if there is anything wrong, please feel free to correct me.

Getting Started: Hardware

The list below is my recommendation for hardware components, if needed.

  1. CPU: AMD Ryzen 7 1700
  2. GPU: NVIDIA GeForce GTX 1070 Ti
  3. RAM: 8 GB
  4. Motherboard: ASUS B350-F

To emphasize, I recommend a GTX 1070 Ti or better (e.g. 1080 Ti). Cards with less video memory, such as the 3 GB version of the GTX 1060, may not have enough memory for the deep learning (TensorFlow) training process.

Getting Started: Software

My computer originally booted into Windows 10, and I have a 4 TB Seagate hard disk in total, with C:\ at 1600 GB and D:\ at 2100 GB in the original setup.

I decided to shrink C:\ to 400 GB (all program files included) for Windows 10, since I would rarely use it, and give the remaining 3300 GB to CentOS.

1. Create the Bootable USB for CentOS 7

First, download the CentOS 7 .iso file. I downloaded the “Everything ISO” file (8.6 GB). You can get the same .iso file from here: https://www.centos.org/download/

Because my computer doesn’t have an optical disc drive, I decided to boot the CentOS installer from a USB drive. That means writing the CentOS .iso file to the USB drive in a bootable format. I followed the tutorial here:

Link: Create a Bootable USB Flash Drive for Windows 10 on MAC OS X (https://www.youtube.com/watch?v=Nhgjqbq_zYA)

# please make sure your USB drive has already been re-formatted.
# replace /dev/disk2 below with your USB drive's identifier from the list
$ diskutil list
$ diskutil unmountDisk /dev/disk2
$ sudo dd if=/path/to/your/iso of=/dev/disk2 bs=1m
# it takes about 15 minutes to finish

If you don’t have a Mac or a Linux computer to do that, you can still do the same thing on Windows 10 with the UltraISO program.
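
Before writing the image to the USB drive, it may be worth checking that the 8.6 GB download is intact, since a damaged image can lead to installer failures later. The sketch below computes the SHA-256 digest of the .iso so you can compare it with the checksum published alongside the download. The script name and the command-line usage are my own assumptions for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def iso_matches(path, expected_hex):
    """Compare the file's digest with the published one (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python3 verify_iso.py /path/to/your/iso <expected sha256>
    if len(sys.argv) == 3:
        ok = iso_matches(sys.argv[1], sys.argv[2])
        print("checksum OK" if ok else "checksum MISMATCH, download again")
```

Reading in 1 MB chunks keeps memory use small even for a file of this size.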

2. Create/Reduce/Merge Partition on Win 10 hard disk

Please follow the link below to set up your own partitions. As previously described, I decided to shrink C:\ to 400 GB (all program files included) for Windows 10, delete D:\ since I would rarely use it, and give the remaining 3300 GB to CentOS. It’s quite easy with the GUI. If you encounter any problems, please see Difficulty 2–1 & 2–2 below.

Link: https://technet.microsoft.com/en-us/library/gg309170.aspx

Difficulty 2–1. Your partitions are not yet merged

If your unallocated space is split into several separate blocks, you can download Partition Wizard (Win 10) to combine them. Please follow the tutorial.

Link: https://www.partitionwizard.com/mergepartitions/how-to-merge-unallocated-spaces.html

Difficulty 2–2. You cannot perform any actions on certain unallocated space

This is probably because the disk uses the MBR partition style rather than GPT (MBR cannot address the full capacity of a 4 TB drive). You can check this in Disk Management: right-click the disk -> ‘Properties’ -> Volumes -> and check the partition style. If it is MBR, it needs to be converted.

Partition Style: MBR (needs to be converted!)
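
To see why MBR gets in the way on a drive this large, here is a small back-of-the-envelope calculation. It assumes 512-byte logical sectors and MBR's 32-bit sector addresses; drives with a different logical sector size would give different numbers.

```python
# Assumptions: 512-byte logical sectors and MBR's 32-bit sector addresses.
SECTOR_BYTES = 512
MAX_SECTORS = 2 ** 32

mbr_limit_bytes = SECTOR_BYTES * MAX_SECTORS
disk_bytes = 4 * 10 ** 12  # a "4 TB" drive as marketed (decimal terabytes)

print("MBR limit: %.2f TB" % (mbr_limit_bytes / 10 ** 12))
print("Out of reach under MBR: %.2f TB" % ((disk_bytes - mbr_limit_bytes) / 10 ** 12))
```

Under these assumptions the limit is about 2.20 TB, which would leave roughly 1.80 TB of a 4 TB drive unusable until the disk is converted to GPT.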

How to do that? Use the MBR2GPT command!

You need to run it from the Windows PE command line, following the tutorial provided by Microsoft.

Link: https://docs.microsoft.com/en-us/windows/deployment/mbr-to-gpt

However, since that page contains a lot of information, I summarize the important steps below.

  1. Enter the Windows RE settings and open the command line
  2. mbr2gpt /validate
  3. mbr2gpt /convert
  4. Once the conversion completes successfully, reboot. Remember to switch the firmware to boot in UEFI mode after running MBR2GPT

3. Enter BIOS and boot from USB

On my PC, I can enter the BIOS by holding down the Delete key. Choose to boot from the USB drive (CentOS) and you should see the following screen. Either Install CentOS 7 or the second option is fine.

http://linux.vbird.org/linux_basic/0157installcentos7.php

Difficulty 3–1. initramfs unpacking failed junk in compressed archive centos

I encountered this problem right after clicking Install CentOS 7, and the PC immediately rebooted. The cause was a faulty USB drive; switching to another USB drive and booting again solved it.

4. Easy Setting for installation

Just some easy stuff: keyboard, date and time, language, etc.

Most importantly, you have to choose the right software selection for your server.

I chose the “Server with GUI” option (reference: http://linux.vbird.org/linux_basic/0157installcentos7.php), and it works well, so I recommend it.

5. (MOST IMPORTANT) Disk partition scheme

Check the Link: https://unix.stackexchange.com/questions/35001/partitioning-a-2tb-drive/35037

There are many suggestions for disk partition schemes, so please do thorough research before performing the following steps. Here is what I used:

Windows 10 : 400GB

CentOS 7: 3300GB (as below)

/boot: 1GB (standard partition)

/boot/efi: 300MB (standard partition)

biosboot: 2MB (standard partition)

/usr: 30GB (LVM)

/usr/local: 30GB (LVM, used for software that doesn’t come with the distro)

/var: 30GB (LVM)

/tmp: 30GB (LVM)

/: 50GB (LVM)

swap: 20GB (LVM)

/opt: 10GB (LVM)

/home: 3100GB (LVM)

https://www.gerrywilliams.net/2017/07/setup-lvm-on-linux-install/
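
As a sanity check on the scheme above, the sizes can be added up and compared with the 3300 GB set aside for CentOS. This is only a rough sketch: it assumes 1 GB = 1024 MB and ignores any overhead from LVM or the partition table.

```python
MB_PER_GB = 1024

# Sizes from the list above, in MB for the standard partitions.
standard_partitions = {"/boot": 1 * MB_PER_GB, "/boot/efi": 300, "biosboot": 2}
# Logical volumes, in GB.
lvm_volumes_gb = {"/usr": 30, "/usr/local": 30, "/var": 30, "/tmp": 30,
                  "/": 50, "swap": 20, "/opt": 10, "/home": 3100}

budget_mb = 3300 * MB_PER_GB
standard_mb = sum(standard_partitions.values())
lvm_mb = sum(size * MB_PER_GB for size in lvm_volumes_gb.values())
overshoot_mb = standard_mb + lvm_mb - budget_mb

print("standard partitions: %d MB" % standard_mb)
print("LVM volumes: %d GB" % (lvm_mb // MB_PER_GB))
print("over budget by: %d MB" % overshoot_mb)
```

By this count the logical volumes alone already add up to 3300 GB, so the three standard partitions push the total over the budget by about 1.3 GB. In practice /home would have to be slightly smaller than 3100 GB, for example by letting it take whatever space is left.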

6. Create user & root password setup

Easy stuff. After you fill these in, the installation starts; it takes about 50 minutes. You need to reboot after that.

7. Done: installation of CentOS 7

If you need more detailed information on installing CentOS 7, please check: http://www.elinuxbook.com/step-by-step-installation-of-centos-7-with-screenshots/

8. Let’s setup the SSH

$ yum -y update # it takes about 45 mins
# after that you should have the latest version of openssh-server
# if not...
$ yum install openssh-server
# check the server status
$ systemctl status sshd
# start the server
$ systemctl start sshd
# stop the server
$ systemctl stop sshd

(VERY IMPORTANT) A default SSH installation isn’t perfect, and when running an SSH server there are a few simple steps that can dramatically harden it. Please follow the guide below (I did Steps 1 to 5).

Link: https://wiki.centos.org/HowTos/Network/SecuringSSH
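
After hardening, a quick script can help confirm that the settings you intended actually ended up in the config file. The sketch below parses sshd_config-style text and flags a few common gaps. The three checks (root login, default port, AllowUsers) are my own assumption of a reasonable baseline rather than an exact copy of the guide's steps, and the sample port and user name are made up. It also ignores Match blocks and the `Keyword=value` form.

```python
def parse_sshd_config(text):
    """Return {directive_lowercase: value}; the first occurrence wins, as in sshd."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        if len(parts) == 2:
            settings.setdefault(parts[0].lower(), parts[1].strip())
    return settings


def hardening_warnings(settings):
    """List settings that differ from a (hypothetical) hardened baseline."""
    warnings = []
    if settings.get("permitrootlogin", "").lower() != "no":
        warnings.append("PermitRootLogin is not set to 'no'")
    if settings.get("port", "22") == "22":
        warnings.append("sshd listens on the default port 22")
    if "allowusers" not in settings:
        warnings.append("no AllowUsers list restricts who may log in")
    return warnings


if __name__ == "__main__":
    # Sample lines only, not a complete sshd_config; port and user are made up.
    sample = """
    Port 2345
    PermitRootLogin no
    AllowUsers alice
    """
    print(hardening_warnings(parse_sshd_config(sample)))
```

To check a real server, you could read /etc/ssh/sshd_config (as root) and pass its contents to `parse_sshd_config` instead of the sample text.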

Difficulty 8–1: Setup PPPoE in CentOS

$ yum install rp-pppoe
$ systemctl stop NetworkManager.service
$ systemctl disable NetworkManager.service
$ pppoe-setup
Welcome to the PPPoE client setup. First, ...
LOGIN NAME
Enter your Login Name (default root):
# run $ ifconfig (or $ ip addr) first to find the name of your Ethernet interface
Enter the Ethernet interface connected to the PPPoE modem
For Solaris, this is likely to be something like /dev/hme0.
For Linux, it will be ethX, where 'X' is a number.
(default eth0):
Please enter the IP address of your ISP's primary DNS server. (just ask your ISP~)
Please enter the IP address of your ISP's secondary DNS server. (ignore)
...
Please enter your Password:******
Please re-enter your Password:******
** Summary of what you entered **
Ethernet Interface: eth0
User name: root
Activate-on-demand: No
DNS: Do not adjust
Firewalling: NONE
User Control: yes
Accept these settings and adjust configuration files (y/n)? y
$ /etc/init.d/network restart
# it should work now!

9. Install Python 3.6

I originally didn’t think it would be a tough process; however, it took much more time than expected because I had to install many missing yum packages beforehand.

Please do this first:

$ yum groupinstall "Development tools"
$ yum install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel gdbm-devel db4-devel libpcap-devel xz-devel

Then configure and install Python as follows:

# run these inside the extracted Python 3.6 source directory
$ ./configure --prefix=/usr/local LDFLAGS="-Wl,-rpath /usr/local/lib"
$ su
$ make && make altinstall

This will help you install Python 3.6 more smoothly!

Also, the IUS repository is very useful: you can use it to install pip3 and other packages. Please read: https://www.digitalocean.com/community/tutorials/how-to-install-python-3-and-set-up-a-local-programming-environment-on-centos-7
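
If you build Python from source, one way to confirm that the -devel packages were picked up is to try importing the standard-library modules that depend on them. The mapping from module to package below is my own reading of the yum list in this section, so treat it as an assumption; the script name is hypothetical.

```python
import importlib

# Standard-library modules that are only built when the matching -devel
# package was present at compile time (mapping assumed from the yum list).
OPTIONAL_MODULES = {
    "zlib": "zlib-devel",
    "bz2": "bzip2-devel",
    "ssl": "openssl-devel",
    "curses": "ncurses-devel",
    "sqlite3": "sqlite-devel",
    "readline": "readline-devel",
    "tkinter": "tk-devel",
    "dbm.gnu": "gdbm-devel",
    "lzma": "xz-devel",
}


def missing_modules(names):
    """Return the module names that fail to import in this interpreter."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_modules(OPTIONAL_MODULES):
        print("missing %s (is %s installed?)" % (name, OPTIONAL_MODULES[name]))
```

Run it with the new interpreter, e.g. `python3.6 check_modules.py`. If anything is reported missing, install the package and rebuild Python. Missing `ssl` matters most, because pip needs it to download packages.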

10. Install NVIDIA GPU driver & CUDA & cuDNN

# first of all, you have to stop the X server; that is, use the command-line interface of CentOS instead of GUI mode
# check the current default target (currently graphical.target)
$ systemctl get-default
$ sudo systemctl set-default multi-user.target
$ reboot # it will boot into command-line mode

Then, follow this tutorial (really helpful!!) and you can finish the installation. Link: https://gist.github.com/wangruohui/df039f0dc434d6486f5d4d098aa52d07

However, I want to make some important points:

→ installing the driver without --dkms works fine

→ you should NOT install the latest version of CUDA (i.e. CUDA-9.1 at the time of writing), since TensorFlow does not support it yet. Download CUDA-9.0 instead.

→ cuDNN should be the build for CUDA-9.0, too.
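
Since the CUDA version matters so much here, it can help to confirm which version actually got installed. The sketch below assumes the toolkit is in the default /usr/local/cuda location and that it ships a version.txt file containing a line such as "CUDA Version 9.0.176". Both are assumptions about your install, so adjust the path if yours differs.

```python
import re


def parse_cuda_version(text):
    """Extract (major, minor) from text such as 'CUDA Version 9.0.176'."""
    match = re.search(r"CUDA Version (\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    # Assumed default location; adjust if CUDA was installed elsewhere.
    version_file = "/usr/local/cuda/version.txt"
    try:
        with open(version_file) as f:
            found = parse_cuda_version(f.read())
    except OSError:
        found = None  # file missing or unreadable
    print("CUDA version found:", found)
    print("matches CUDA 9.0:", found == (9, 0))
```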

11. Set up $PATH and soft links

# open ~/.bash_profile and add:
export LD_LIBRARY_PATH=blablabla # this is for CUDA (the directory containing the CUDA libraries)
export PYTHONPATH=blablabla
# on the command line:
$ sudo ln -s /usr/local/bin/python3.6 /usr/bin/python3
# same for pip3.6
$ sudo ln -s /usr/bin/pip3.6 /usr/bin/pip3

12. Final check

# after installing the GPU driver, ...
$ nvidia-smi
# after installing python3.6, pip3 install tensorflow-gpu, then:
$ python3
>>> import tensorflow as tf
>>> # no errors? Congrats!!
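
A clean import only shows that TensorFlow loads; it does not prove that the GPU is being used. As a slightly stronger check, the sketch below asks TensorFlow for the name of the GPU device it can see. It uses `tf.test.gpu_device_name()`, which returns an empty string when no GPU is visible; the wrapper function name is my own.

```python
def check_tensorflow_gpu():
    """Return a one-line status describing whether TensorFlow can see a GPU."""
    try:
        import tensorflow as tf
    except ImportError as exc:
        return "TensorFlow could not be imported: %s" % exc
    device = tf.test.gpu_device_name()  # empty string when no GPU is visible
    if device:
        return "TensorFlow %s sees GPU device %s" % (tf.__version__, device)
    return "TensorFlow %s imported, but no GPU device is visible" % tf.__version__


if __name__ == "__main__":
    print(check_tensorflow_gpu())
```

If the import fails with a message about a missing CUDA or cuDNN library, the LD_LIBRARY_PATH setting is the first thing I would re-check.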

Thanks for reading! Once again, if there is anything wrong, please feel free to correct me. 🙂

Source: Deep Learning on Medium