Deep Learning Environment

Posted on 2019-07-04 Edited on 2020-01-10

Installation instructions for configuring a DL environment

  • Ubuntu 18.04 LTS, kernel version 4.18 (use uname -r to check your kernel version; currently, kernel version 5.0.0 loses the Nvidia driver from time to time)
  • Nvidia Tesla P40 24G/Tesla P4 8G/Nvidia Quadro P2000 8G
  • CUDA 10.0
  • CUDNN 7.6.1 for CUDA 10.0
  • opencv 3.4.5
  • tensorflow-gpu 1.12.0
  • caffe 1.0, from BVLC
  • RefineDet, also aligned with BVLC caffe
  • pytorch 1.1, whose libtorch requires gcc >= 6 to compile on Ubuntu 18.04
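
The kernel caveat in the list above can be checked with a few lines of shell. This is only a sketch: version_ge is a hypothetical helper name, and it assumes a sort that supports -V (GNU coreutils, as shipped with Ubuntu).

```shell
# version_ge A B: succeed when version A >= version B.
# Hypothetical helper; relies on `sort -V` (GNU coreutils).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Strip the "-25-generic" style suffix: 4.18.0-25-generic -> 4.18.0
kernel="$(uname -r | cut -d- -f1)"

if version_ge "$kernel" "5.0.0"; then
    echo "kernel $kernel: 5.0.0 or newer, the Nvidia driver may get lost"
else
    echo "kernel $kernel: older than 5.0.0"
fi
```

The comparison is done on version strings rather than numbers, so 4.18.0 correctly sorts before 5.0.0.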

Tips:

  • The installation may differ in many subtle ways, such as the opencv version, the cuda version and, of course, the modules included in your opencv build. Even I, the author, install it very differently from time to time. The most important thing is: you are responsible for your environment. If you meet problems during installation, you should be able to solve them yourself; this blog only provides a guide to the environment configuration and does not guarantee a successful installation;
  • At the current stage, an update of the Ubuntu kernel will cause the loss of the Nvidia driver. Driver version 418 works best with the 4.18~4.24 Ubuntu kernels;
  • DO NOT use sudo apt autoremove unless you know what you are doing/removing;
  • Read the instructions carefully until you know what you are doing;
  • make -j may cause OOM problems; do not use it unless you know your hardware’s capability;
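
For the make -j tip, one way to pick a job count is to cap it by both the CPU count and the available memory. The sketch below is an illustration, not a rule from this post: safe_jobs is a hypothetical helper, and the figure of about 2 GiB of RAM per compile job is an assumption you should tune for your own builds.

```shell
# safe_jobs CORES MEM_KB [KB_PER_JOB]: print a conservative value for make -j.
# Assumption (not from the post): about 2 GiB of RAM per compile job.
safe_jobs() {
    sj_cores=$1
    sj_by_mem=$(( $2 / ${3:-2097152} ))
    if [ "$sj_by_mem" -lt 1 ]; then sj_by_mem=1; fi
    if [ "$sj_by_mem" -lt "$sj_cores" ]; then
        echo "$sj_by_mem"
    else
        echo "$sj_cores"
    fi
}

cores="$(nproc 2>/dev/null || echo 1)"
# MemAvailable is reported in kB by the Linux kernel.
mem_kb="$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo 2>/dev/null || true)"
echo "suggested: make -j$(safe_jobs "$cores" "${mem_kb:-0}")"
```

With 8 cores and 4 GiB available this suggests -j2; with plenty of memory it falls back to the core count.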

The full installation (up to a successful pytorch install) takes around 3 ~ 4 hours, depending on how familiar you are with Linux. Good luck, now let’s launch!

Install system dependency

Add Aliyun source

If you are not in China, you may skip this

sudo cp /etc/apt/sources.list /etc/apt/sources.list.bak

vim is not installed yet at this point, so use gedit:

sudo gedit /etc/apt/sources.list

Add the following entries to the source list:

deb http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-proposed main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse
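
The ten entries above follow one pattern (deb or deb-src, then the release codename plus a suite suffix), so they can also be generated instead of typed by hand. A small sketch, where aliyun_sources is a hypothetical helper name; it only prints the lines and does not touch /etc/apt/sources.list.

```shell
# aliyun_sources CODENAME: print deb and deb-src entries for every suite.
aliyun_sources() {
    for as_kind in deb deb-src; do
        for as_suite in "" -security -updates -proposed -backports; do
            echo "$as_kind http://mirrors.aliyun.com/ubuntu/ $1$as_suite main restricted universe multiverse"
        done
    done
}

# Preview only. Once the output looks right you could append it yourself,
# e.g. with: aliyun_sources bionic | sudo tee -a /etc/apt/sources.list
aliyun_sources "$(lsb_release -cs 2>/dev/null || echo bionic)"
```

Using lsb_release -cs keeps the codename in sync with the installed release; the fallback to bionic matches the Ubuntu 18.04 used in this post.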

Refresh system

sudo apt update
sudo apt upgrade

Necessary tools for linux

sudo apt install openssh-server
sudo apt install net-tools

Installing: protobuf/opencv/hdf5/boost/openblas/atlas/lapack/gflags/glog/lmdb

sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
sudo apt-get install --no-install-recommends libboost-all-dev
sudo apt-get install libopenblas-dev liblapack-dev libatlas-base-dev
sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev

Recommended packages: freeglut, x11, xmu, xi, gl1-mesa, glu1-mesa, glu1-mesa-dev. If you don’t install them, compiling cuda will fail with: Missing recommended libraries …

sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev

Other necessary packages:

sudo apt-get install graphviz  # without graphviz, pytest will fail later
sudo apt install git
sudo apt-get install vim-gtk
sudo apt-get install ibus-pinyin

Install Nvidia drivers

Disable nouveau

sudo vim /etc/modprobe.d/blacklist.conf

Add the following line (there is no need to source the file):

blacklist nouveau
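
To confirm that the blacklist took effect, you can look for nouveau in the list of loaded kernel modules. A minimal sketch, with nouveau_loaded as a hypothetical helper name. On Ubuntu the blacklist normally applies only after the initramfs has been regenerated and the machine rebooted, which is why the message suggests that.

```shell
# nouveau_loaded: read `lsmod` output on stdin, succeed if nouveau is listed.
nouveau_loaded() {
    grep -q '^nouveau[[:space:]]'
}

if command -v lsmod >/dev/null 2>&1; then
    if lsmod | nouveau_loaded; then
        echo "nouveau is still loaded: run 'sudo update-initramfs -u' and reboot"
    else
        echo "nouveau is not loaded"
    fi
fi
```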

Install drivers

Using the .run file:

sudo bash ./NVIDIA-Linux-x86_64-418.67.run

Check: running the following command should print a table listing your GPU(s) and the driver version

nvidia-smi
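
If you only want the driver version, for example to check that it is in the 418 series recommended in the tips, nvidia-smi can print it directly. A sketch under that assumption; driver_major is a hypothetical helper name.

```shell
# driver_major VERSION: print the part before the first dot (418.67 -> 418).
driver_major() {
    echo "${1%%.*}"
}

if command -v nvidia-smi >/dev/null 2>&1; then
    version="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if [ "$(driver_major "$version")" = "418" ]; then
        echo "driver $version: in the 418 series used in this post"
    else
        echo "driver $version: not in the 418 series"
    fi
else
    echo "nvidia-smi not found: the driver is not installed or not on PATH"
fi
```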

Reinstall drivers

sudo apt-get purge 'nvidia*'
sudo add-apt-repository ppa:graphics-drivers
sudo bash NVIDIA-Linux-x86_64-418.64.run

Install CUDA

You don’t need to downgrade gcc to install CUDA 10.0.

NOTE: Currently, Nvidia driver 410 and earlier do not work with CUDA 10.0, so you’d better use CUDA 10.1 instead.

Install cuda

Do NOT install the driver bundled with cuda_***.run. (That said, a later installation showed that the bundled driver also works.)

sudo bash ./cuda_10.0.130_418.64_linux.run

Configure system path

vim ~/.bashrc   # no sudo needed for your own ~/.bashrc

# add these two lines to ~/.bashrc:
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

# then reload the configuration:
source ~/.bashrc
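
Every time ~/.bashrc is sourced, the plain export lines prepend the same directories again. If the duplicates bother you, a guard like the following can be used instead. This is a sketch, not part of the original setup, and prepend_path is a hypothetical helper name.

```shell
# prepend_path LIST DIR: print the colon-separated LIST with DIR in front,
# unless DIR is already one of its entries.
prepend_path() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;
        *)
            if [ -n "$1" ]; then
                printf '%s:%s\n' "$2" "$1"
            else
                printf '%s\n' "$2"
            fi
            ;;
    esac
}

PATH="$(prepend_path "$PATH" /usr/local/cuda/bin)"
LD_LIBRARY_PATH="$(prepend_path "${LD_LIBRARY_PATH:-}" /usr/local/cuda/lib64)"
export PATH LD_LIBRARY_PATH
```

Wrapping the list in colons before matching means /usr/local/cuda/bin is only treated as present when it appears as a whole entry, not as a substring of another path.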

Test

cd ~/NVIDIA_CUDA-10.0_Samples/1_Utilities/deviceQuery
sudo make
sudo ./deviceQuery

Output like the following indicates that cuda was installed successfully:

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.1, CUDA Runtime Version = 10.1, NumDevs = 1
Result = PASS

Note that the driver version and the runtime version should preferably be the same here.
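
The two versions can be compared automatically by pulling them out of the deviceQuery summary line. A minimal sketch, assuming the summary line has the format shown in the sample output; versions_match is a hypothetical helper name.

```shell
# versions_match LINE: succeed when the CUDA driver and runtime versions
# found in a deviceQuery summary line are identical.
versions_match() {
    vm_driver="$(printf '%s\n' "$1" | sed -n 's/.*CUDA Driver Version = \([0-9.]*\).*/\1/p')"
    vm_runtime="$(printf '%s\n' "$1" | sed -n 's/.*CUDA Runtime Version = \([0-9.]*\).*/\1/p')"
    [ -n "$vm_driver" ] && [ "$vm_driver" = "$vm_runtime" ]
}

# Sample line in the format printed by deviceQuery; on a real machine you could use
#   line="$(./deviceQuery | grep '^deviceQuery, CUDA Driver')"
line='deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.1, CUDA Runtime Version = 10.1, NumDevs = 1'

if versions_match "$line"; then
    echo "driver and runtime versions match"
else
    echo "driver and runtime versions differ"
fi
```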

Install CUDNN

The latest CUDNN can be installed through dpkg, although installing from the original archive is the better option:

sudo dpkg -i libcudnn7_7.6.2.24-1+cuda10.0_amd64.deb

Or, you can install it from the original archive:

tar -xvf cudnn-10.0-linux-x64-v7.6.1.34.tgz

sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
sudo cp cuda/lib64/* /usr/local/cuda/lib64/
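
To check which CUDNN version ended up in /usr/local/cuda, you can read the version macros from the copied header. This is a sketch that assumes the CUDNN 7 layout, where cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL; cudnn_version is a hypothetical helper name.

```shell
# cudnn_version HEADER: print MAJOR.MINOR.PATCH from the #define lines in cudnn.h.
cudnn_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { print major "." minor "." patch }' "$1"
}

header=/usr/local/cuda/include/cudnn.h
if [ -f "$header" ]; then
    echo "installed CUDNN: $(cudnn_version "$header")"
else
    echo "$header not found"
fi
```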

Install Anaconda

Damn idiots, you should know how to install anaconda and its environment

Install Anaconda from shell

bash Anaconda3-2019.03-Linux-x86_64.sh

Create virtual environment

conda create -n caffe-gpu python=3.6
conda activate caffe-gpu

Install opencv

Important note: install from source! You are a grown-up now, you should be able to handle it. With earlier versions (3.2.0 and before), using opencv may cause the following problem; if you hit it, recompile opencv:

/usr/local/lib/libopencv_imgcodecs.so.3.2.0: undefined reference to `TIFFReadRGBAStrip@LIBTIFF_4.0'

Then download opencv 3.4.5 and opencv_contrib 3.4.5. Check your opencv version:

pkg-config --modversion opencv
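
The version check can be turned into a quick test for the "3.2.0 and before" case mentioned above. A sketch, with needs_rebuild as a hypothetical helper name; it assumes a sort that supports -V (GNU coreutils).

```shell
# needs_rebuild VERSION: succeed when VERSION is 3.2.0 or older.
needs_rebuild() {
    [ "$(printf '%s\n%s\n' "$1" "3.2.0" | sort -V | tail -n 1)" = "3.2.0" ]
}

installed="$(pkg-config --modversion opencv 2>/dev/null || true)"
if [ -z "$installed" ]; then
    echo "pkg-config does not know any opencv yet"
elif needs_rebuild "$installed"; then
    echo "opencv $installed is 3.2.0 or older: rebuild from source"
else
    echo "opencv $installed is newer than 3.2.0"
fi
```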

If you are reinstalling, you have to remove all opencv packages before you proceed; this works like a charm:

sudo apt remove 'libopencv*'

Install dependencies

I still haven’t figured out whether “pip install opencv-python” in the virtual environment adds the opencv lib to LD_LIBRARY_PATH

sudo apt-get install build-essential
sudo apt-get install cmake libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev
sudo apt-get install python-dev python-numpy libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libjasper-dev libdc1394-22-dev
sudo apt-get install build-essential qt5-default ccache libv4l-dev libavresample-dev libgphoto2-dev libopenblas-base libopenblas-dev doxygen libvtk6-dev
sudo apt-get install python3-dev python3-numpy libgtk-3-dev libxvidcore-dev libx264-dev gfortran openexr
sudo apt-get install pkg-config

If you see E: Unable to locate package libjasper-dev, the package is missing from the bionic sources we configured; add the xenial-security source to get it:

sudo add-apt-repository "deb http://security.ubuntu.com/ubuntu xenial-security main"
sudo apt update
sudo apt install libjasper1 libjasper-dev

If you are running a brand new OS, i.e. one that was just installed without any updates, installing the packages above may fail; upgrading your system will solve the problem.

Install opencv 3.4.5 from source

unzip opencv-3.4.5.zip 
tar zxvf opencv_contrib-3.4.5.tar.gz

Make

Note: you should change the cmake paths according to your own system.

cd opencv-3.4.5
mkdir build && cd build
cmake \
-D CMAKE_BUILD_TYPE=Release \
-D CMAKE_INSTALL_PREFIX=/usr/local \
-D WITH_CUDA=ON \
-D BUILD_opencv_cudacodec=OFF \
-D ENABLE_PRECOMPILED_HEADERS=OFF \
-D ENABLE_FAST_MATH=1 \
-D CUDA_FAST_MATH=1 \
-D WITH_CUBLAS=1 \
-D INSTALL_PYTHON_EXAMPLES=ON \
-D INSTALL_C_EXAMPLES=ON \
-D OPENCV_ENABLE_NONFREE=ON \
-D OPENCV_EXTRA_MODULES_PATH=/home/ubuntu/Installation/opencv_contrib-3.4.5/modules/ \
-D PYTHON_EXECUTABLE=/home/ubuntu/anaconda3/envs/torch-gpu/bin/python \
-D BUILD_TIFF=ON ..

make -j4
sudo make install

Opencv with CPU only:

cmake \
-D CMAKE_BUILD_TYPE=Release \
-D CMAKE_INSTALL_PREFIX=/usr/local \
-D ENABLE_PRECOMPILED_HEADERS=OFF \
-D ENABLE_FAST_MATH=1 \
-D INSTALL_PYTHON_EXAMPLES=ON \
-D INSTALL_C_EXAMPLES=ON \
-D OPENCV_ENABLE_NONFREE=ON \
-D OPENCV_EXTRA_MODULES_PATH=/home/ubuntu/Installation/opencv_contrib-3.4.5/modules/ \
-D PYTHON_EXECUTABLE=/home/ubuntu/anaconda3/envs/torch-cpu/bin/python \
-D BUILD_TIFF=ON ..

Possible problems

  1. ippicv download failed: If you are in mainland China, you may have trouble downloading ippicv, a 3rd party lib of the opencv project. You can download it manually from here. Then cd to ${OPENCV_ROOT}/3rdparty/ippicv/ and change the download URL in the ippicv.cmake file from:

"https://raw.githubusercontent.com/opencv/opencv_3rdparty/${IPPICV_COMMIT}/ippicv/"

to (depending on where you put lib ippicv):

"file://~/Downloads/"

Or:

https://github.com/opencv/opencv_3rdparty/blob/ippicv/internal_3.4_20190204/ippicv/

  2. opencv_cudacodec compile failed: the cudacodec module is no longer used, so pass -D BUILD_opencv_cudacodec=OFF to turn it off in your installation;

  3. boostdesc or xfeatures2d installation failed: Cannot find header 'boostdesc_bgm_bi.i' and so on. These files could not be downloaded because of the network in mainland China; download them separately (you can find them in an OpenCV issue thread) and put them into modules/xfeatures2d/src.

Share the link library through the system

sudo ldconfig -v

Add the opencv libraries to the system library path: put /usr/local/lib into the opencv.conf file, then refresh the linker cache

sudo vim /etc/ld.so.conf.d/opencv.conf 

sudo ldconfig
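
To verify that the linker cache now knows the opencv libraries, you can search the output of ldconfig -p. A minimal sketch; lib_in_cache is a hypothetical helper name and libopencv_core is just one representative library to look for.

```shell
# lib_in_cache NAME: read `ldconfig -p` output on stdin,
# succeed if a library called NAME.so* is listed.
lib_in_cache() {
    grep -q "^[[:space:]]*$1\.so"
}

if command -v ldconfig >/dev/null 2>&1; then
    if ldconfig -p | lib_in_cache libopencv_core; then
        echo "libopencv_core is in the linker cache"
    else
        echo "libopencv_core not found: check /etc/ld.so.conf.d/opencv.conf and rerun sudo ldconfig"
    fi
fi
```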

Configure bash

sudo gedit /etc/bash.bashrc

Add the following lines to /etc/bash.bashrc:

PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/local/lib/pkgconfig
export PKG_CONFIG_PATH

Then reload the configuration and refresh the locate database:

source /etc/bash.bashrc
sudo updatedb

Test

cd ${OPENCV_ROOT}/opencv-3.4.5/samples/cpp/example_cmake
cmake .
make
./opencv_example

You should see a window showing your camera feed, with Hello OpenCV in the left corner

Install OpenCV 4

Install system essentials:

sudo apt-get install build-essential
sudo apt-get install cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev
sudo apt-get install python-dev python-numpy libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libjasper-dev libdc1394-22-dev

Install torch-gpu

You may refer to this link here.

Remove old environment

conda remove --name torch-gpu --all

Create virtual environment:

conda create -n torch-gpu python=3.6
conda activate torch-gpu
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch

Install pylab:

pip install numpy scipy matplotlib lmdb
pip install pandas scikit-image filterpy

Install from source

An Anaconda environment is highly recommended

Install dependencies

conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi typing

Depending on your cuda version, it could be [magma-cuda92 | magma-cuda100 | magma-cuda101 ]

conda install -c pytorch magma-cuda101

Get PyTorch Source

git clone --recursive https://github.com/pytorch/pytorch

Compile libtorch. Note that the core of libtorch is independent of python:

cd pytorch

Please use GCC 6 or higher on Ubuntu 17.04 and higher. For more information, see: here.
You might need to export some required environment variables here. Normally setup.py sets good default environment variables, but here you’ll have to do that manually. The path “$(dirname $(which conda))/../” looks weird, but that’s actually what it is:

export NO_MKLDNN=1
export NO_SYSTEM_NCCL=1
export CUDNN_LIB_DIR="/usr/local/cuda-10.1/lib64"
export CUDNN_INCLUDE_DIR="/usr/local/cuda-10.1/include"
export CMAKE_PREFIX_PATH="$(dirname $(which conda))/../"
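
The weird-looking path simply walks from the conda executable up to the Anaconda install prefix. The sketch below spells that out with two dirname calls; conda_prefix is a hypothetical helper name, and the result is the same directory as "$(dirname $(which conda))/../" without the trailing "/../".

```shell
# conda_prefix CONDA_BIN: print the install prefix for a conda executable path,
# e.g. /home/ubuntu/anaconda3/bin/conda -> /home/ubuntu/anaconda3
conda_prefix() {
    dirname "$(dirname "$1")"
}

conda_bin="$(which conda 2>/dev/null || true)"
if [ -n "$conda_bin" ]; then
    echo "CMAKE_PREFIX_PATH would point at: $(conda_prefix "$conda_bin")"
else
    echo "conda is not on PATH"
fi
```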

Install:

python setup.py install     # build and install
python setup.py clean --all # clean the build

The previous step takes tens of minutes. After it succeeds, you will see messages like:

-- Set runtime path of "/home/ubuntu/Workspace/pytorch/torch/test/operator_gpu_test" to "$ORIGIN:/home/ubuntu/anaconda3/envs/torch-gpu/lib:/usr/local/cuda/lib64"
-- Set runtime path of "/home/ubuntu/Workspace/pytorch/torch/test/math_gpu_test" to "$ORIGIN:/home/ubuntu/anaconda3/envs/torch-gpu/lib:/usr/local/cuda/lib64"
-- Set runtime path of "/home/ubuntu/Workspace/pytorch/torch/test/mpi_gpu_test" to "$ORIGIN:/home/ubuntu/anaconda3/envs/torch-gpu/lib:/usr/local/cuda/lib64:/usr/lib/x86_64-linux-gnu/openmpi/lib"
-- Set runtime path of "/home/ubuntu/Workspace/pytorch/torch/test/conv_op_cache_cudnn_test" to "$ORIGIN:/home/ubuntu/anaconda3/envs/torch-gpu/lib:/usr/local/cuda/lib64"
-- Set runtime path of "/home/ubuntu/Workspace/pytorch/torch/test/batch_matmul_op_gpu_test" to "$ORIGIN:/home/ubuntu/anaconda3/envs/torch-gpu/lib:/usr/local/cuda/lib64"
-- Set runtime path of "/home/ubuntu/Workspace/pytorch/torch/test/elementwise_op_gpu_test" to "$ORIGIN:/home/ubuntu/anaconda3/envs/torch-gpu/lib:/usr/local/cuda/lib64"
-- Set runtime path of "/home/ubuntu/Workspace/pytorch/torch/test/generate_proposals_op_gpu_test" to "$ORIGIN:/home/ubuntu/anaconda3/envs/torch-gpu/lib:/usr/local/cuda/lib64"
-- Set runtime path of "/home/ubuntu/Workspace/pytorch/torch/test/generate_proposals_op_util_nms_gpu_test" to "$ORIGIN:/home/ubuntu/anaconda3/envs/torch-gpu/lib:/usr/local/cuda/lib64"
-- Set runtime path of "/home/ubuntu/Workspace/pytorch/torch/test/operator_fallback_gpu_test" to "$ORIGIN:/home/ubuntu/anaconda3/envs/torch-gpu/lib:/usr/local/cuda/lib64"
-- Set runtime path of "/home/ubuntu/Workspace/pytorch/torch/test/reshape_op_gpu_test" to "$ORIGIN:/home/ubuntu/anaconda3/envs/torch-gpu/lib:/usr/local/cuda/lib64"
-- Set runtime path of "/home/ubuntu/Workspace/pytorch/torch/test/roi_align_op_gpu_test" to "$ORIGIN:/home/ubuntu/anaconda3/envs/torch-gpu/lib:/usr/local/cuda/lib64"
-- Set runtime path of "/home/ubuntu/Workspace/pytorch/torch/test/utility_ops_gpu_test" to "$ORIGIN:/home/ubuntu/anaconda3/envs/torch-gpu/lib:/usr/local/cuda/lib64"
-- Set runtime path of "/home/ubuntu/Workspace/pytorch/torch/bin/test_jit" to "$ORIGIN:/home/ubuntu/anaconda3/envs/torch-gpu/lib:/usr/local/cuda/lib64"
-- Set runtime path of "/home/ubuntu/Workspace/pytorch/torch/bin/test_api" to "$ORIGIN:/home/ubuntu/anaconda3/envs/torch-gpu/lib:/usr/local/cuda/lib64"
-- Set runtime path of "/home/ubuntu/Workspace/pytorch/torch/lib/libcaffe2_detectron_ops_gpu.so" to "$ORIGIN:/home/ubuntu/anaconda3/envs/torch-gpu/lib:/usr/local/cuda/lib64"
-- Set runtime path of "/home/ubuntu/Workspace/pytorch/torch/lib/libcaffe2_module_test_dynamic.so" to "$ORIGIN:/usr/local/cuda/lib64:/home/ubuntu/anaconda3/envs/torch-gpu/lib"

Install mkl

Note that mkl is distinct from mkl-dnn. Download the full mkl package from here and run:

sudo ./install.sh

Note that the default mkl installation directory is /opt/intel, so you need sudo for the installation to succeed.

Possible problem

Configuring the environment is quite fun. It takes a lot of effort, but once all the work is done you will feel truly relieved. The hardest part is configuring the system dependencies.

fatal error: caffe/proto/caffe.pb.h: No such file or directory #include "caffe/proto/caffe.pb.h"

You will meet this problem when using caffe. Source: here.

Highly recommended installations

  • vscode, vscode is a fucking lifesaver;
  • chrome, it’s the 21st century now, use a real fucking browser, bro.

Zepyhrus
