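I ran into this problem. The cause turned out to be that a __global__ function was calling a __device__ function, and the two functions were not in the same source file.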
http://stackoverflow.com/questions/31006581/cuda-device-unresolved-extern-function
The issue is that you defined a __device__ function in a separate compilation unit from the __global__ function that calls it. You need to either explicitly enable relocatable device code mode by adding the -dc flag, or move your definition into the same unit.
From nvcc documentation:
--device-c | -dc: Compile each .c/.cc/.cpp/.cxx/.cu input file into an object file that contains relocatable device code. It is equivalent to --relocatable-device-code=true --compile.
See Separate Compilation and Linking of CUDA C++ Device Code for more information.
http://stackoverflow.com/questions/17188527/cuda-external-class-linkage-and-unresolved-extern-function-in-ptxas-file
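So there are two ways to fix this.
The first is to put both functions in the same .cu file.
The second is to open the .cu file's property pages and set CUDA C/C++ -> Common -> Generate Relocatable Device Code to Yes (-rdc=true), which allows device code to be compiled as relocatable. Alternatively, set -rdc=true under the CUDA C/C++ settings for the whole project.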
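As a minimal sketch of the situation described above, the listing below shows a __device__ function and the __global__ kernel that calls it living in two separate files. The file names (helper.cu, main.cu) and the function names (add_one, kernel) are hypothetical and chosen only for illustration; the build commands in the trailing comments are the command-line counterparts of the Visual Studio setting, assuming nvcc is on the path.

```cuda
// ---- helper.cu (hypothetical file name) ----
// Definition of the device function, in its own compilation unit.
__device__ int add_one(int x) { return x + 1; }

// ---- main.cu (hypothetical file name) ----
#include <cstdio>
#include <cuda_runtime.h>

// Declaration only: the definition lives in helper.cu, so the device
// linker has to resolve this symbol across object files.
extern __device__ int add_one(int x);

__global__ void kernel(int *out) { *out = add_one(41); }

int main() {
    int *d_out = nullptr;
    int h_out = 0;
    cudaMalloc(&d_out, sizeof(int));
    kernel<<<1, 1>>>(d_out);
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    std::printf("result = %d\n", h_out);
    return 0;
}

// Build without relocatable device code (expected to fail with the
// unresolved extern function error):
//   nvcc helper.cu main.cu -o app
//
// Build with relocatable device code (fix 2):
//   nvcc -rdc=true helper.cu main.cu -o app
// or, as two explicit steps:
//   nvcc -dc helper.cu main.cu
//   nvcc helper.o main.o -o app   (object files are .obj on Windows)
```

Fix 1 corresponds to moving the body of add_one into main.cu and dropping the extern declaration, in which case no extra flag is needed.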
Problem solved.
Other references
https://devtalk.nvidia.com/default/topic/524436/how-to-deal-with-ptxas-fatal-error-unresolved-extern-function-39-cudagetparameterbuffer-39-/
1) View -> Property Pages
2) Configuration Properties -> CUDA C/C++ -> Common -> Generate Relocatable Device Code -> Yes (-rdc=true)
3) Configuration Properties -> CUDA C/C++ -> Code Generation -> compute_35,sm_35
4) Configuration Properties -> Linker -> Input -> Additional Dependencies -> cudadevrt.lib
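The four steps above match the settings needed for dynamic parallelism, where a kernel launches another kernel from device code; the unresolved cudaGetParameterBuffer symbol in the linked thread is typically a sign that such a device-side launch was compiled without them. The following is only a sketch under that assumption: the file name dp.cu and the kernel names parent and child are hypothetical, and the build line in the comment is the assumed nvcc equivalent of the property-page settings.

```cuda
// dp.cu (hypothetical file name)
#include <cstdio>
#include <cuda_runtime.h>

// Child kernel: each thread writes twice its index.
__global__ void child(int *out) { out[threadIdx.x] = 2 * threadIdx.x; }

// Parent kernel: launches the child from device code. This device-side
// launch is what needs -rdc=true and the cudadevrt library.
__global__ void parent(int *out) { child<<<1, 4>>>(out); }

int main() {
    int *d_out = nullptr;
    int h_out[4] = {0, 0, 0, 0};
    cudaMalloc(&d_out, sizeof(h_out));
    parent<<<1, 1>>>(d_out);
    // The parent grid is not complete until its child grid has finished,
    // so synchronizing on the host covers both.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    for (int i = 0; i < 4; ++i) std::printf("%d ", h_out[i]);
    std::printf("\n");
    return 0;
}

// Assumed command-line equivalent of steps 2 to 4:
//   nvcc -arch=sm_35 -rdc=true dp.cu -o dp -lcudadevrt
// sm_35 is the value from the steps above; newer toolkits that no longer
// support it need a more recent architecture in its place.
```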
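This article addresses a common problem in CUDA programming: how to avoid a link error when a __global__ function calls a __device__ function that is not in the same source file. It gives two solutions: put both functions in the same .cu file, or enable relocatable device code through a compiler option.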




