cndaqiang Web Linux DFT

Centos 7 安装Cuda10 过程记录

2019-03-31
cndaqiang
RSS

在阿里云的GPU服务器上安装cuda没有什么问题,在自己的服务器上安装遇到一些问题,记录如下

注意

硬盘空间 需要root用户安装

参考
CentOS 7 设置默认进入图形界面或文本界面
Linux安装Nvidia显卡驱动:禁用The Nouveau kernel driver的方法!

查看显卡是否存在

[cndaqiang@mom ~]$ lspci  | grep -i vga
05:00.0 VGA compatible controller: NVIDIA Corporation GM107 [GeForce GTX 750 Ti] (rev a2)
09:00.0 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200e [Pilot] ServerEngines (SEP1) (rev 05)

禁用Nouveau驱动

Centos之前安装了Nouveau的驱动

  1. 把驱动加入黑名单中: 在/etc/modprobe.d/blacklist.conf 在后面加入:
    blacklist nouveau
    
  2. 使用 dracut重新建立 initramfs p_w_picpath file :
    #备份 the initramfs file
    $ sudo mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
    #重新建立 the initramfs file
    $ sudo dracut -v /boot/initramfs-$(uname -r).img $(uname -r)
    

开机默认文本模式

[cndaqiang@mom ~]$ systemctl get-default
multi-user.target
[cndaqiang@mom ~]$ sudo systemctl set-default multi-user.target

重启,检查nouveau driver确保没有被加载!

$ lsmod | grep nouveau

一些依赖

yum install epel-release
yum install --enablerepo=epel dkms

安装

CUDA Toolkit 10.1 Download下载cuda_10.1.105_418.39_linux.run

./cuda_10.1.105_418.39_linux.run

安装即可,安装结果

[root@mom cndaqiang]# ./cuda_10.1.105_418.39_linux.run 
===========
= Summary =
===========

Driver:   Installed
Toolkit:  Installed in /usr/local/cuda-10.1/
Samples:  Not Selected

Please make sure that
 -   PATH includes /usr/local/cuda-10.1/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.1/lib64, or, add /usr/local/cuda-10.1/lib64 to /etc/ld.so.conf and run ldconfig as root

最后把安装所说的添加PATH和LD_LIBRARY_PATH

编译vasp所需的cublascuda库,在安装cuda时自定义的安装目录的lib目录下,或者在 /usr/lib 里面,编译时,注意修改makefile.include

查看运行状况

[cndaqiang@mom vasp.5.4.4-gpu]$ nvidia-smi
Mon Apr  1 21:37:29 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.39       Driver Version: 418.39       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 750 Ti  Off  | 00000000:05:00.0 Off |                  N/A |
| 30%   25C    P0     1W /  38W |      0MiB /  2001MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

或实时显示watch -n 0 "nvidia-smi"


本文首发于我的博客@cndaqiang.
本博客所有文章除特别声明外,均采用 CC BY-SA 4.0 协议 ,转载请注明出处!


目录

访客数据