CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variabl

Problem description

When installing OpenPCDet, running
python setup.py develop
fails with the following output:

UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at  /opt/conda/conda-bld/pytorch_1616554790289/work/c10/cuda/CUDAFunctions.cpp:109.)
  return torch._C._cuda_getDeviceCount() > 0
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda-10.2'
running develop
running egg_info
writing pcdet.egg-info/PKG-INFO
writing dependency_links to pcdet.egg-info/dependency_links.txt
writing requirements to pcdet.egg-info/requires.txt
writing top-level names to pcdet.egg-info/top_level.txt
reading manifest file 'pcdet.egg-info/SOURCES.txt'
writing manifest file 'pcdet.egg-info/SOURCES.txt'
running build_ext
building 'pcdet.ops.iou3d_nms.iou3d_nms_cuda' extension
Traceback (most recent call last):
  File "setup.py", line 125, in <module>
    'src/sampling_gpu.cu',
  File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/site-packages/setuptools/__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/distutils/dist.py", line 955, in run_commands
    self.run_command(cmd)
  File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/site-packages/setuptools/command/develop.py", line 34, in run
    self.install_for_development()
  File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/site-packages/setuptools/command/develop.py", line 136, in install_for_development
    self.run_command('build_ext')
  File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/site-packages/setuptools/command/build_ext.py", line 79, in run
    _build_ext.run(self)
  File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/distutils/command/build_ext.py", line 339, in run
    self.build_extensions()
  File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 708, in build_extensions
    build_ext.build_extensions(self)
  File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/distutils/command/build_ext.py", line 448, in build_extensions
    self._build_extensions_serial()
  File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/distutils/command/build_ext.py", line 473, in _build_extensions_serial
    self.build_extension(ext)
  File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/site-packages/setuptools/command/build_ext.py", line 196, in build_extension
    _build_ext.build_extension(self, ext)
  File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/distutils/command/build_ext.py", line 533, in build_extension
    depends=ext.depends)
  File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 524, in unix_wrap_ninja_compile
    cuda_post_cflags = unix_cuda_flags(cuda_post_cflags)
  File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 423, in unix_cuda_flags
    cflags + _get_cuda_arch_flags(cflags))
  File "/home/gengruixu/anaconda3/envs/pytorch18/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 1561, in _get_cuda_arch_flags
    arch_list[-1] += '+PTX'
IndexError: list index out of range
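
The final IndexError follows from the CUDA warning at the top of the log and is not a separate bug. When TORCH_CUDA_ARCH_LIST is not set, torch.utils.cpp_extension derives the target architectures from the visible GPUs. With zero visible devices the list stays empty, so appending '+PTX' to its last element fails. The sketch below is a simplified model of that logic, not PyTorch's actual source; the names arch_flags_sketch and explain are hypothetical.

```python
def arch_flags_sketch(device_capabilities):
    """Simplified model: build an arch list from (major, minor) capabilities."""
    arch_list = []
    for major, minor in device_capabilities:
        arch = f"{major}.{minor}"
        if arch not in arch_list:
            arch_list.append(arch)
    arch_list = sorted(arch_list)
    # With no visible device, arch_list is empty and this line raises IndexError.
    arch_list[-1] += "+PTX"
    return arch_list


def explain(device_capabilities):
    """Return the arch list, or the error text if no device is visible."""
    try:
        return arch_flags_sketch(device_capabilities)
    except IndexError as exc:
        return f"IndexError: {exc}"


if __name__ == "__main__":
    print(explain([(7, 5)]))  # one visible GPU
    print(explain([]))        # no visible GPU, as in the log above
```

In other words, once PyTorch can see the GPU again, the build-time IndexError should disappear without any change to setup.py.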

Solution

As the log shows, the root error is the first one reported:

CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero.
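
A quick way to confirm this diagnosis, both before and after applying a fix, is to ask PyTorch how many devices it can see. This is a minimal sketch, assuming it runs in the same conda environment used for the build. cuda_status is a hypothetical helper name; torch.cuda.is_available() and torch.cuda.device_count() are standard PyTorch calls.

```python
def cuda_status():
    """Report whether PyTorch is importable and how many CUDA devices it sees."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return {"torch_installed": False, "cuda_available": False, "device_count": 0}
    return {
        "torch_installed": True,
        "cuda_available": bool(torch.cuda.is_available()),
        "device_count": int(torch.cuda.device_count()),
    }


if __name__ == "__main__":
    print(cuda_status())
```

If device_count is 0 on a machine that does have an NVIDIA GPU, the extension build is likely to fail in the same way as shown in the log.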


A fix is given in this comment on the PyTorch issue tracker:
https://github.com/pytorch/pytorch/issues/49081#issuecomment-766793705

yurunsheng1 commented on 25 Jan
apt-get install nvidia-modprobe

This works for me.

Installing nvidia-modprobe also resolved the error in my case.

The nvidia-modprobe utility is used by user-space NVIDIA driver components to make sure the NVIDIA kernel module is loaded and that the NVIDIA character device files are present. These facilities are normally provided by Linux distribution configuration systems such as udev.
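
Since the description above refers to the NVIDIA character device files, it can help to check whether they exist and whether the relevant tools are on the PATH. This is a minimal sketch, assuming a Linux system where the device nodes appear under /dev with names such as /dev/nvidia0 and /dev/nvidiactl. nvidia_preflight and its dev_glob parameter are hypothetical names introduced for illustration.

```python
import glob
import shutil


def nvidia_preflight(dev_glob="/dev/nvidia*"):
    """Collect basic facts about the NVIDIA user-space setup."""
    return {
        # Character device files created when the kernel module is loaded.
        "device_nodes": sorted(glob.glob(dev_glob)),
        # Full path of each tool if it is on PATH, otherwise None.
        "nvidia_modprobe": shutil.which("nvidia-modprobe"),
        "nvidia_smi": shutil.which("nvidia-smi"),
    }


if __name__ == "__main__":
    for key, value in nvidia_preflight().items():
        print(f"{key}: {value}")
```

An empty device_nodes list together with a missing nvidia-modprobe would be consistent with the situation described in this post, although other causes (for example a driver that is not loaded) can produce the same symptom.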

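Copyright notice: This is an original article by CSDN blogger "R.X. NLOS", licensed under CC 4.0 BY-SA. Please include a link to the original source and this notice when reposting.
Original link: https://blog.csdn.net/qazwsxrx/article/details/117951908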
