Picking the right Linux distribution for a deep learning server

While waiting for the computer parts to arrive, I am thinking about the right Linux distribution for a deep learning server. According to Nvidia's official system requirements page for CUDA on Linux, only RHEL, CentOS, Fedora, SUSE and Ubuntu are supported. I have personally used SUSE, Red Hat and Debian before, and currently my sympathies are with Debian (great community, good package manager and repository, reasonable update intervals, a reliable distribution and completely open).

CentOS is an alternative: it is the community version of Red Hat and seems to be just as stable and reliable. On the other hand, the update intervals are very long (the last major release was 5 years ago), and there seems to be less of a community than around Debian.

Ubuntu is currently the most popular distribution, though more so for desktop use. Also, I don’t like the profit orientation of Canonical, the company behind Ubuntu. A plus is that it is derived from Debian and uses the same package system that I like.

With Fedora I miss long-term support (releases are supported for only 13 months). My personal perception is also that it is somewhat in decline.

SUSE had its strong days 20 years ago and surely has the smallest community of all those mentioned.

To bring in some facts, I looked at Google Trends to see how many searches were made for the different distributions in combination with CUDA:

[Figure: Google search comparison of the distributions in combination with CUDA, 2018-12-09]

Ubuntu comes out first by more than a factor of 10, followed by Debian and CentOS. But this only means that many Ubuntu users look for help with CUDA, which could also be because CUDA is not easy to get running on Ubuntu, or because Ubuntu users are less experienced with Linux. So I also looked at how many hits, i.e. answers, a similar Google search returns. Again Ubuntu is first, but only by a factor of 2.5. Debian is again second, ahead of CentOS by another factor of 2.5. It is interesting that Debian, the only one of these three not supported by Nvidia, comes out so strong.

That is a good insight on CUDA, which is a must-have for deep learning nowadays, but in the end I want to program with Keras. So I did the same for Keras in combination with each distribution:

[Figure: Google search comparison of the distributions in combination with Keras, 2018-12-09]

A similar picture: Ubuntu is clearly first and Debian second, but followed more closely by CentOS. So rationally everything speaks for Ubuntu. But my heart beats for the open and reliable Debian, the number two and the ancestor of Ubuntu. It is not supported by Nvidia, but I will give it a try!

 

Picking the right Deep Learning Hardware

After some initial experiments building on textbook examples, I figured that I need a dedicated server. My desktop is OK (i7-6700K with GTX 980), but I also want to use it for other purposes such as gaming (Dota 2) and don’t want to be restricted in when it can and cannot run. Windows is also not ideal for TensorFlow, so I think I will buy a dedicated Linux server with a good GPU and put it in my basement, where it can run day and night training for my experiments.

The question is what to buy to get good value for money. It is still just a hobby in its infancy, so it should not be a huge investment. On the other hand, I can use the machine for multiple purposes, such as a NAS, ownCloud or a VPN server. So I searched the web and the archive of my favorite computer magazine, c’t. The magazine just tested the new RTX 2070 and found it OK, but not superior to the GTX 1080 Ti, which you can get for the same price. However, Tim Dettmers' web page taught me that the RTX can train neural networks in 16-bit instead of 32-bit and thereby effectively double the usable memory. That is a strong argument, as I already ran into memory shortage with my first finger exercises on my GTX. The conclusion of Tim and myself: “Currently, my main recommendation is to get an RTX 2070 GPU and use 16-bit training.”
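To make the memory argument concrete, here is a rough back-of-the-envelope sketch in Python. The function name and the number of values are my own assumptions, picked only for illustration; it simply compares the storage needed for the same count of numbers at 32 and at 16 bits.

```python
def tensor_megabytes(num_values, bits_per_value):
    """Memory (in MB) needed to store num_values numbers at a given precision."""
    return num_values * bits_per_value / 8 / 1024**2


# Hypothetical count of weights and activations, chosen only for illustration
num_values = 50_000_000

fp32 = tensor_megabytes(num_values, 32)
fp16 = tensor_megabytes(num_values, 16)

print(f"32-bit: {fp32:.0f} MB, 16-bit: {fp16:.0f} MB, ratio: {fp32 / fp16:.1f}")
```

The factor of two is the ideal case. In practice a training setup may keep part of its data in 32-bit, so the real saving can be smaller.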

On top of that, I decided to get an i5-9600K, which according to some benchmarks is 10-30% faster than my 3-year-old i7-6700K. The rest is pretty standard: a Samsung 970 EVO 500 GB SSD and 32 GB DDR4-3000 on an MSI Z370 Tomahawk mainboard. Only the 800 W power supply stands out, with its reserve for further GPUs 🙂

Now I am waiting for the parcels to arrive …

Purpose

Already in 1995 I was fascinated by the potential of neural networks, which back then were more a vision than reality. Without much research I tried implementing my own naïve ideas, which came to me while learning C++ with Herbert Schildt’s newest book. I was fascinated by the concept of object-oriented programming and got the idea of creating a neural network in which neurons are C++ objects, connected to the next neuron object by synapses that are pointers with assigned weights. I had first successes with mastering Tic-Tac-Toe but never really finished, as the semester started again.
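For readers who want to picture the idea, here is a minimal sketch of it in Python rather than C++. It is not my code from back then: the class names, the sigmoid activation and the weights are all assumptions made for illustration. Each neuron is an object, and each synapse holds a reference to the next neuron together with a weight.

```python
import math


class Synapse:
    """A weighted link that points to the next neuron."""

    def __init__(self, target, weight):
        self.target = target
        self.weight = weight


class Neuron:
    def __init__(self, name):
        self.name = name
        self.synapses = []    # outgoing connections
        self.input_sum = 0.0  # weighted input collected from earlier neurons
        self.output = 0.0

    def connect(self, target, weight):
        self.synapses.append(Synapse(target, weight))

    def fire(self, value=None):
        # Input neurons pass the given value through unchanged;
        # all others squash their collected input with a sigmoid.
        if value is not None:
            self.output = value
        else:
            self.output = 1.0 / (1.0 + math.exp(-self.input_sum))
        # Push the weighted output along every outgoing synapse
        for synapse in self.synapses:
            synapse.target.input_sum += synapse.weight * self.output
        return self.output


# Two input neurons feeding one output neuron (weights are arbitrary)
a, b, out = Neuron("a"), Neuron("b"), Neuron("out")
a.connect(out, 0.5)
b.connect(out, -0.5)

a.fire(1.0)
b.fire(1.0)
result = out.fire()
print(result)
```

Modern libraries such as Keras store the weights of a whole layer in one array instead of using one object per neuron, but the object picture is a nice way to see what the network is made of.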

The recent quantum-leap developments in deep learning, showcased on my other hobby, the board game Go, made me follow the news closely and pulled me back in more than 20 years later. Nowadays I have even less time, but I still bought some books on deep learning, the latest being “Deep Learning with Python” by Francois Chollet. Having started to read it, I can’t wait to try once more.

As I studied experimental physics and engineering, I am very much practice-oriented by nature and want to get my hands dirty and code. That’s why I called this blog “Experimenting with Deep Learning based on Python, Keras and TensorFlow”. I am looking to share my experiences, fascinating or frustrating, and to find like-minded people to discuss them with.

First tries 1992
My first naive experiments back in 1995 (sorry for the German variable names)