Installation instructions for configuring a DL environment
- Ubuntu 18.04 LTS, kernel version 4.18 (run uname -r to check your kernel version; kernel version 5.0.0 currently loses the Nvidia driver from time to time)
- Nvidia Tesla P40 24G/Tesla P4 8G/Nvidia Quadro P2000 8G
- CUDA 10.0
- CUDNN 7.6.1 for CUDA 10.0
- opencv 3.4.5
- tensorflow-gpu 1.12.0
- caffe 1.0, from BVLC
- RefineDet, also aligned with BVLC
- pytorch 1.1, whose libtorch requires gcc >= 6 to compile on Ubuntu 18.04
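The kernel note in the list above can be turned into a quick check. This is a minimal sketch, assuming the release string has the usual major.minor.patch shape that uname -r prints on Ubuntu; the helper names kernel_series and check_kernel are made up for illustration.

```shell
#!/bin/sh
# Print the "major.minor" series of a kernel release string.
# kernel_series and check_kernel are hypothetical helpers, not part of any tool.
kernel_series() {
    # $1: release string such as the output of `uname -r`
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Report whether a kernel (default: the running one) is the 4.18 series
# this guide was written for.
check_kernel() {
    series=$(kernel_series "${1:-$(uname -r)}")
    if [ "$series" = "4.18" ]; then
        echo "kernel $series: matches this guide"
    else
        echo "kernel $series: not 4.18, the Nvidia driver may need reinstalling"
    fi
}
```

Calling check_kernel with no argument inspects the running kernel; passing a release string lets you test the logic on any machine.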
Tips:
- The installation may differ in many subtle ways, such as the opencv version, the cuda version and, of course, the modules included in your opencv build. Even I, the author, install it differently from time to time. The most important thing is: you are responsible for your environment. If you meet problems during installation, you should be able to solve them yourself; this blog only provides a guide to the environment configuration and does not guarantee a successful installation;
- At the current stage, updating the Ubuntu kernel causes the loss of the Nvidia driver. Driver version 418 works best with Ubuntu kernel versions 4.18~4.24;
- DO NOT use sudo apt autoremove, unless you know what you are doing/removing;
- Read the instructions carefully until you know what you are doing;
- make -j may cause OOM problems; do not use it unless you know your hardware’s capability;
The full installation (up to the successful installation of pytorch) will take around 3 ~ 4 hours, depending on how familiar you are with Linux. Good luck, now let’s launch!
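Regarding the make -j tip: one way to stay clear of OOM is to cap the job count by available memory. The sketch below is only an illustration. The helper name safe_make_jobs and the budget of 2048 MB per compile job are assumptions, not measured values.

```shell
#!/bin/sh
# Pick a job count for `make -j` from available memory and core count.
# safe_make_jobs is a hypothetical helper; 2048 MB per job is an assumed budget.
safe_make_jobs() {
    mem_mb="$1"                 # available memory in MB
    cores="$2"                  # number of CPU cores
    per_job_mb="${3:-2048}"     # memory budget per compile job
    jobs=$((mem_mb / per_job_mb))
    if [ "$jobs" -gt "$cores" ]; then
        jobs="$cores"           # never exceed the core count
    fi
    if [ "$jobs" -lt 1 ]; then
        jobs=1                  # always run at least one job
    fi
    echo "$jobs"
}
```

On Ubuntu it could be used as make -j"$(safe_make_jobs "$(free -m | awk '/^Mem:/ {print $7}')" "$(nproc)")", where the seventh column of free -m is the available memory.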
Install system dependency
Add Aliyun source
If you are not in China, you may skip this
1 | sudo cp /etc/apt/sources.list /etc/apt/sources.list.bak |
vim is not installed at this point, so use gedit:
1 | sudo gedit /etc/apt/sources.list |
Add the following entries to the source list:
1 | deb http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse |
Refresh system
1 | sudo apt update |
Necessary tools for linux
1 | sudo apt install openssh-server |
Installing: protobuf/opencv/hdf5/boost/openblas/atlas/lapack/gflags/glog/lmdb
1 | sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler |
Recommended packages: freeglut, x11, xmu, xi, gl1-mesa, glu1-mesa, glu1-mesa-dev. If you don’t install them, installing cuda will complain: Missing recommended libraries …
1 | sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev |
Other necessary packages:
1 | sudo apt-get install graphviz # Install graphviz, otherwise pytest will fail later |
Install Nvidia drivers
Disable nouveau
1 | sudo vim /etc/modprobe.d/blacklist.conf |
Add the following line (no need to source the file):
1 | blacklist nouveau |
Install drivers
using .run file
1 | sudo bash ./NVIDIA-Linux-x86_64-418.67.run |
Check: running the following command should print the driver and GPU status:
1 | nvidia-smi |
Reinstall drivers
1 | sudo apt-get purge 'nvidia*' |
Install CUDA
You don’t need to downgrade gcc to install CUDA 10.0.
NOTE: Currently, Nvidia driver 4.10 and earlier do not work with CUDA 10.0, so you had better use CUDA 10.1 instead.
Install cuda
Do NOT install the driver bundled in cuda_***.run. That said, a later installation showed that the bundled driver also works.
1 | sudo bash ./cuda_10.0.130_418.64_linux.run |
Configure system path
1 | sudo vim ~/.bashrc |
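When editing .bashrc it helps to avoid adding the same directory twice each time the file is sourced. Below is a minimal sketch of an idempotent path append. The helper name add_path_once is made up, and /usr/local/cuda/bin and /usr/local/cuda/lib64 are assumed default install locations that may differ on your machine.

```shell
#!/bin/sh
# Append a directory to a colon-separated path list unless it is already there.
# add_path_once is a hypothetical helper name.
add_path_once() {
    list="$1"   # current value, e.g. "$PATH"
    dir="$2"    # directory to add
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;        # already present, unchanged
        "::")       printf '%s\n' "$dir" ;;         # list was empty
        *)          printf '%s\n' "$list:$dir" ;;   # append at the end
    esac
}

# Assumed default CUDA locations; adjust to your install prefix.
# export PATH="$(add_path_once "$PATH" /usr/local/cuda/bin)"
# export LD_LIBRARY_PATH="$(add_path_once "$LD_LIBRARY_PATH" /usr/local/cuda/lib64)"
```

The two export lines are left commented out so the block can be sourced safely before you have confirmed where CUDA was installed.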
Test
1 | cd ~/NVIDIA_CUDA-10.0_Samples/1_Utilities/deviceQuery/samples/ |
The following output indicates a successful installation of cuda:
1 | - deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.1, CUDA Runtime |
Note that the driver version and the runtime version should preferably be the same here.
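The driver/runtime comparison can be scripted against the deviceQuery summary line. This is a rough sketch: it assumes the summary is a single comma-separated line of "Key = value" fields, and the helper names dq_field and cuda_versions_match are hypothetical.

```shell
#!/bin/sh
# Extract one "Key = value" field from a comma-separated summary line.
# dq_field and cuda_versions_match are hypothetical helper names.
dq_field() {
    # $1: summary line, $2: exact field name
    printf '%s\n' "$1" | tr ',' '\n' | sed -n "s/^ *$2 = *//p"
}

# Succeed only when both versions are present and identical.
cuda_versions_match() {
    drv=$(dq_field "$1" "CUDA Driver Version")
    rt=$(dq_field "$1" "CUDA Runtime Version")
    [ -n "$drv" ] && [ "$drv" = "$rt" ]
}
```

A possible use is cuda_versions_match "$(./deviceQuery | tail -n 2 | head -n 1)" || echo "driver and runtime differ", assuming the summary is the second-to-last line of the output on your build.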
Install CUDNN
The latest CUDNN can be installed through dpkg, although installing from the original archive is preferable:
1 | dpkg -i libcudnn7_7.6.2.24-1+cuda10.0_amd64.deb |
Or, you can install it from the original archive:
1 | tar -xvf cudnn-10.0-linux-x64-v7.6.1.34.tgz |
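After installing, it is worth confirming which CUDNN version the headers report. The sketch below reads the CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL macros from cudnn.h. The helper name cudnn_version is made up, and /usr/local/cuda/include/cudnn.h is an assumed default location.

```shell
#!/bin/sh
# Print "major.minor.patch" from the CUDNN_MAJOR / CUDNN_MINOR /
# CUDNN_PATCHLEVEL macros in a cudnn.h header.
# cudnn_version is a hypothetical helper name.
cudnn_version() {
    header="${1:-/usr/local/cuda/include/cudnn.h}"   # assumed default location
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { if (major == "") exit 1; print major "." minor "." patch }
    ' "$header"
}
```

For the archive listed above, cudnn_version should print a version starting with 7.6 if the header was copied to the assumed location.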
Install Anaconda
Damn idiots, you should know how to install anaconda and its environment
Install Anaconda from shell
1 | bash Anaconda3-2019.03-Linux-x86_64.sh |
Create virtual environment
1 | conda create -n caffe-gpu python=3.6 |
Install opencv
Important note: Install from source! You are a grown-up now, you should be able to handle it. Earlier versions, 3.2.0 and before, may cause problems like the one below; if you hit it, recompile opencv:
1 | /usr/local/lib/libopencv_imgcodecs.so.3.2.0: undefined reference to `TIFFReadRGBAStrip@LIBTIFF_4.0' |
Then download opencv 3.4.5 and opencv_contrib 3.4.5. Check your opencv version:
1 | pkg-config --modversion opencv |
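To decide automatically whether the installed opencv is recent enough, the reported version can be compared against the target. A minimal sketch, assuming GNU sort with the -V option (present on Ubuntu); version_ge is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
# version_ge is a hypothetical helper name.
version_ge() {
    # The smaller of the two versions sorts first; $1 >= $2 iff that is $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example use, with the 3.4.5 target from this guide:
# version_ge "$(pkg-config --modversion opencv)" 3.4.5 || echo "rebuild opencv"
```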
If you are reinstalling, you have to remove all opencv packages before you proceed. This works like a charm:
1 | sudo apt remove 'libopencv*' |
Install dependencies
I still haven’t figured out whether “pip install opencv-python” in the virtual environment adds the opencv lib to LD_LIBRARY_PATH
1 | sudo apt-get install build-essential |
If you get “E: Unable to locate package libjasper-dev”, this is because we configured the Aliyun source for Ubuntu; add the following repository:
1 | sudo add-apt-repository "deb http://security.ubuntu.com/ubuntu xenial-security main" |
If you are running a brand new OS, i.e. one just installed without any updates, installing the packages above may fail; upgrading your system will solve the problem.
Install opencv 3.4.5 from source
1 | unzip opencv-3.4.5.zip |
Make
Note: you should change the cmake paths according to your own system.
1 | cd opencv-3.4.5 |
Opencv with CPU only:
1 | cmake \ |
Possible problems
- ippicv download failed: If you are in mainland China, you may have trouble building the opencv 3rd party lib ippicv. You can download it manually from here. Then cd to
${OPENCV_ROOT}/3rdparty/ippicv/
and modify the ippicv.cmake file from:
1 | "https://raw.githubusercontent.com/opencv/opencv_3rdparty/${IPPICV_COMMIT}/ippicv/" |
to (depending on where you put lib ippicv):
1 | "file://~/Downloads/" |
Or:
1 | https://github.com/opencv/opencv_3rdparty/blob/ippicv/internal_3.4_20190204/ippicv/ |
- opencv_cudacodec compile failed: the cudacodec module is not used any more, so pass -D BUILD_opencv_cudacodec=OFF to turn this module off in your installation;
- boostdesc or xfeatures2d installation failed with Cannot find header 'boostdesc_bgm_bi.i' and so on: these files could not be downloaded because of the network in mainland China. Download them separately (you may find them in an OpenCV issue thread) and put them into modules/xfeatures2d/src.
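Before rerunning the build, you can check that the manually downloaded files really landed in the right directory. A small sketch: missing_files is a hypothetical helper, and you supply the list of file names yourself, since only boostdesc_bgm_bi.i is named above.

```shell
#!/bin/sh
# List which of the given file names are absent from a directory.
# missing_files is a hypothetical helper name.
missing_files() {
    dir="$1"
    shift
    for name in "$@"; do
        # Print only the names that do not exist as regular files in $dir
        [ -f "$dir/$name" ] || printf '%s\n' "$name"
    done
}

# Example use from the opencv_contrib root (extend the list as needed):
# missing_files modules/xfeatures2d/src boostdesc_bgm_bi.i
```

Empty output means every listed file is in place.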
Share the link library through the system
1 | sudo ldconfig -v |
Add the opencv libraries to the system library path by adding /usr/local/lib to the opencv.conf file:
1 | sudo vim /etc/ld.so.conf.d/opencv.conf |
Configure bash
1 | sudo gedit /etc/bash.bashrc |
Add the following line to the bash.bashrc file:
1 | PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/local/lib/pkgconfig |
Test
1 | cd ${OPENCV_ROOT}/opencv-3.4.5/samples/cpp/example_cmake |
You should see a popup window showing your camera, with Hello opencv in the left corner
Install OpenCV 4
Install system essentials:
1 | sudo apt-get install build-essential |
Install torch-gpu
You may refer to this link here.
Remove old environment
1 | conda remove --name torch-gpu --all |
Create virtual env:
1 | conda create -n torch-gpu python=3.6 |
Install pylab:
1 | pip install numpy scipy matplotlib lmdb |
Install from source
An anaconda environment is highly recommended
Install dependencies
1 | conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi typing |
Depending on your cuda version, it could be [magma-cuda92 | magma-cuda100 | magma-cuda101 ]
1 | conda install -c pytorch magma-cuda101 |
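The mapping from CUDA version to magma package name is mechanical, so it can be scripted. A minimal sketch covering only the three versions listed above; magma_package is a hypothetical helper name.

```shell
#!/bin/sh
# Map a CUDA version such as "10.0" to the matching magma package name
# (magma-cuda92, magma-cuda100, magma-cuda101).
# magma_package is a hypothetical helper name.
magma_package() {
    case "$1" in
        9.2|10.0|10.1)
            # Drop the dot: "10.0" -> "100"
            printf 'magma-cuda%s\n' "$(printf '%s' "$1" | tr -d .)"
            ;;
        *)
            echo "no magma package listed here for CUDA $1" >&2
            return 1
            ;;
    esac
}

# Example use:
# conda install -c pytorch "$(magma_package 10.0)"
```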
Get PyTorch Source
1 | git clone --recursive https://github.com/pytorch/pytorch |
Compile libtorch. Note that the core of libtorch is independent of python:
1 | cd pytorch |
Please use GCC 6 or higher on Ubuntu 17.04 and higher. For more information, see: here.
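A quick guard for the GCC requirement could look like this. It is only a sketch: gcc_new_enough is a hypothetical helper, and it assumes the version string starts with the major number, as the output of gcc -dumpversion does.

```shell
#!/bin/sh
# Succeed when a gcc version string has a major version >= 6
# (or >= the optional second argument).
# gcc_new_enough is a hypothetical helper name.
gcc_new_enough() {
    major="${1%%.*}"            # "7.4.0" -> "7", "7" -> "7"
    [ "$major" -ge "${2:-6}" ]
}

# Example use:
# gcc_new_enough "$(gcc -dumpversion)" || echo "please install gcc >= 6"
```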
You might need to export some required environment variables here. Normally setup.py sets good default env variables, but you’ll have to do that manually. The path “$(dirname $(which conda))/../” looks weird, but that’s actually what it is:
1 | export NO_MKLDNN=1 |
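The odd-looking conda path can be resolved into a clean absolute path before exporting it. This is a sketch under assumptions: conda_root is a made-up helper, the preference for CONDA_PREFIX assumes an activated environment, and using the result for CMAKE_PREFIX_PATH is an assumption about which variable you need to set.

```shell
#!/bin/sh
# Resolve the conda prefix: prefer $CONDA_PREFIX (set inside an activated
# environment), else fall back to "$(dirname $(which conda))/../".
# conda_root is a hypothetical helper name.
conda_root() {
    if [ -n "${CONDA_PREFIX:-}" ]; then
        printf '%s\n' "$CONDA_PREFIX"
        return 0
    fi
    conda_bin=$(which conda 2>/dev/null) || return 1
    [ -n "$conda_bin" ] || return 1
    # cd + pwd collapses the trailing "/.." into a clean absolute path
    (cd "$(dirname "$conda_bin")/.." && pwd)
}

# Assumed use; adjust the variable name to what your build expects:
# export CMAKE_PREFIX_PATH="$(conda_root)"
```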
Install:
1 | python setup.py install # build and install |
The previous step takes tens of minutes. When it succeeds, you will see a message like:
1 | -- Set runtime path of "/home/ubuntu/Workspace/pytorch/torch/test/operator_gpu_test" to "$ORIGIN:/home/ubuntu/anaconda3/envs/torch-gpu/lib:/usr/local/cuda/lib64" |
Install mkl
Note that mkl is distinct from mkl-dnn. Download the full mkl package from here and run:
1 | sudo ./install.sh |
Note that the default mkl installation dir is /opt/intel, so you need sudo for a proper installation.
Possible problems
Configuring the whole environment is quite fun. It takes a lot of effort, but once you have finished all the work you may indeed feel relieved. The hardest part is configuring the system dependencies.
1 | fatal error: caffe/proto/caffe.pb.h: No such file or directory #include "caffe/proto/caffe.pb.h" |
You may hit this problem when using caffe. Source: here.
Highly recommended installations
- vscode, vscode is a fucking saver;
- chrome, it’s the 21st century now, use some real fucking browser bro.