Cuda Clion

Nov 19, 2020: CLion - CUDA Syntax - 'Use of undeclared identifier cudaConfigureCall'. M Laws, created November 19, 2020 15:25. I have a simple HelloWorld CUDA.
Apr 30, 2021: CLion 2021.1.1 Bug-fix Update: Fixes for CUDA, Remote Projects, Code Completion, and Navigation. The first bug-fix update (build 211.7142.21) for the recently released CLion v2021.1 is now available on our website, via the patch update, in the Toolbox App, or as a snap (for Ubuntu).
Common CUDA utilities.
Classes
struct mxnet::common::cuda::CublasType< DType >
Converts between C++ datatypes and enums/constants needed by cuBLAS.
struct mxnet::common::cuda::CublasType< float >
struct mxnet::common::cuda::CublasType< double >
struct mxnet::common::cuda::CublasType< mshadow::half::half_t >
struct mxnet::common::cuda::CublasType< uint8_t >
struct mxnet::common::cuda::CublasType< int32_t >
class mxnet::common::cuda::DeviceStore
Namespaces
mxnet
namespace of mxnet 
mxnet::common
mxnet::common::cuda
common utils for cuda 
Macros
#define QUOTE(x) #x
Macros/inlines to assist CLion to parse Cuda files (*.cu, *.cuh)
#define QUOTEVALUE(x) QUOTE(x)
#define STATIC_ASSERT_CUDA_VERSION_GE(min_version)
#define CHECK_CUDA_ERROR(msg)
When compiling a device function, check that the architecture is >= Kepler (3.0). Note that __CUDA_ARCH__ is not defined outside of a device function.
#define CUDA_CALL(func)
Protected CUDA call.
#define CUBLAS_CALL(func)
Protected cuBLAS call.
#define CUSOLVER_CALL(func)
Protected cuSolver call.
#define CURAND_CALL(func)
Protected cuRAND call.
#define NVRTC_CALL(x)
Protected NVRTC call.
#define CUDA_DRIVER_CALL(func)
Protected CUDA driver call.
#define CUDA_UNROLL _Pragma("unroll")
#define CUDA_NOUNROLL _Pragma("nounroll")
#define MXNET_CUDA_ALLOW_TENSOR_CORE_DEFAULT true
#define MXNET_CUDA_TENSOR_OP_MATH_ALLOW_CONVERSION_DEFAULT false
Functions
const char * mxnet::common::cuda::CublasGetErrorString (cublasStatus_t error)
Get string representation of cuBLAS errors.
const char * mxnet::common::cuda::CusolverGetErrorString (cusolverStatus_t error)
Get string representation of cuSOLVER errors.
const char * mxnet::common::cuda::CurandGetErrorString (curandStatus_t status)
Get string representation of cuRAND errors.
template<typename DType>
DType __device__ mxnet::common::cuda::CudaMax (DType a, DType b)
template<typename DType>
DType __device__ mxnet::common::cuda::CudaMin (DType a, DType b)
int mxnet::common::cuda::get_load_type (size_t N)
Get the largest datatype suitable to read requested number of bytes.
int mxnet::common::cuda::get_rows_per_block (size_t row_size, int num_threads_per_block)
Determine how many rows in a 2D matrix a block of threads should handle, based on the row size and the number of threads in a block.
int cudaAttributeLookup (int device_id, std::vector< int32_t > *cached_values, cudaDeviceAttr attr, const char *attr_name)
Return an attribute of GPU device_id.
int ComputeCapabilityMajor (int device_id)
Determine major version number of the gpu's cuda compute architecture.
int ComputeCapabilityMinor (int device_id)
Determine minor version number of the gpu's cuda compute architecture.
int SMArch (int device_id)
Return the integer SM architecture (e.g. Volta = 70).
int MultiprocessorCount (int device_id)
Return the number of streaming multiprocessors of GPU device_id.
int MaxSharedMemoryPerMultiprocessor (int device_id)
Return the shared memory size in bytes of each of the GPU's streaming multiprocessors.
bool SupportsCooperativeLaunch (int device_id)
Return whether the GPU device_id supports cooperative-group kernel launching.
bool SupportsFloat16Compute (int device_id)
Determine whether a cuda-capable gpu's architecture supports float16 math. Assume not if device_id is negative.
bool SupportsTensorCore (int device_id)
Determine whether a cuda-capable gpu's architecture supports Tensor Core math. Assume not if device_id is negative.
bool GetEnvAllowTensorCore ()
Returns global policy for TensorCore algo use.
bool GetEnvAllowTensorCoreConversion ()
Returns global policy for TensorCore implicit type casting.
Variables
constexpr size_t kMaxNumGpus = 64
Maximum number of GPUs.
Detailed Description
Common CUDA utilities.
Copyright (c) 2015 by Contributors
Macro Definition Documentation
◆ CHECK_CUDA_ERROR
Value:
 cudaError_t e = cudaGetLastError();
 CHECK_EQ(e, cudaSuccess) << (msg) << " CUDA: " << cudaGetErrorString(e);
Check CUDA error. When compiling a device function, check that the architecture is >= Kepler (3.0). Note that __CUDA_ARCH__ is not defined outside of a device function.
Parameters:
 msg: Message to print if an error occurred.
◆ CUBLAS_CALL
Value:
 cublasStatus_t e = (func);
 << "cuBLAS: " << mxnet::common::cuda::CublasGetErrorString(e);
References CublasGetErrorString (cuda_utils.h:258).
Protected cuBLAS call.
Parameters:
 func: Expression to call.
It checks for cuBLAS errors after invocation of the expression.
◆ CUDA_CALL
Value:
 cudaError_t e = (func);
 CHECK(e == cudaSuccess || e == cudaErrorCudartUnloading)
 }
Protected CUDA call.
Parameters:
 func: Expression to call.
It checks for CUDA errors after invocation of the expression.
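The "protected call" pattern used by CUDA_CALL and its siblings can be illustrated without a GPU. The following is a minimal sketch, not MXNet's implementation: MockStatus, MockGetErrorString, MockAlloc, MOCK_CALL and ProtectedAllocThrows are all hypothetical names introduced here, and throwing an exception on failure is an assumption standing in for the logging-based CHECK used in the real macros.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a library status type such as cudaError_t.
enum class MockStatus { kSuccess = 0, kInvalidValue = 1 };

// Hypothetical stand-in for an error-string helper such as cudaGetErrorString.
inline const char* MockGetErrorString(MockStatus s) {
  return s == MockStatus::kSuccess ? "no error" : "invalid value";
}

// Evaluate the expression exactly once, then fail loudly if it did not succeed.
// The do/while(0) wrapper lets the macro be used like a single statement.
#define MOCK_CALL(func)                                            \
  do {                                                             \
    MockStatus e = (func);                                         \
    if (e != MockStatus::kSuccess) {                               \
      std::ostringstream msg;                                      \
      msg << "Mock: " << MockGetErrorString(e) << " in " << #func; \
      throw std::runtime_error(msg.str());                         \
    }                                                              \
  } while (0)

// Hypothetical API call that fails for negative sizes.
inline MockStatus MockAlloc(int bytes) {
  return bytes >= 0 ? MockStatus::kSuccess : MockStatus::kInvalidValue;
}

// Returns true if the protected call threw, false if it passed.
inline bool ProtectedAllocThrows(int bytes) {
  try {
    MOCK_CALL(MockAlloc(bytes));
    return false;
  } catch (const std::runtime_error&) {
    return true;
  }
}
```

Storing the result in a local variable before testing it matters: the wrapped expression usually has side effects, so it must not be evaluated a second time when the error message is built.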
◆ CUDA_DRIVER_CALL
Value:
 CUresult e = (func);
 char const * err_msg = nullptr;
 if (cuGetErrorString(e, &err_msg) == CUDA_ERROR_INVALID_VALUE) {
 LOG(FATAL) << "CUDA Driver: Unknown error " << e;
 LOG(FATAL) << "CUDA Driver: " << err_msg;
 }
Protected CUDA driver call.
Parameters:
 func: Expression to call.
It checks for CUDA driver errors after invocation of the expression.
◆ CUDA_NOUNROLL
#define CUDA_NOUNROLL _Pragma("nounroll")
◆ CUDA_UNROLL
#define CUDA_UNROLL _Pragma("unroll")
◆ CURAND_CALL
Value:
 curandStatus_t e = (func);
 << "cuRAND: " << mxnet::common::cuda::CurandGetErrorString(e);
References CurandGetErrorString (cuda_utils.h:329).
Protected cuRAND call.
Parameters:
 func: Expression to call.
It checks for cuRAND errors after invocation of the expression.
◆ CUSOLVER_CALL
Value:
 cusolverStatus_t e = (func);
 << "cuSolver: " << mxnet::common::cuda::CusolverGetErrorString(e);
References CusolverGetErrorString (cuda_utils.h:300).
Protected cuSolver call.
Parameters:
 func: Expression to call.
It checks for cuSolver errors after invocation of the expression.
◆ MXNET_CUDA_ALLOW_TENSOR_CORE_DEFAULT
#define MXNET_CUDA_ALLOW_TENSOR_CORE_DEFAULT true
◆ MXNET_CUDA_TENSOR_OP_MATH_ALLOW_CONVERSION_DEFAULT
#define MXNET_CUDA_TENSOR_OP_MATH_ALLOW_CONVERSION_DEFAULT false
◆ NVRTC_CALL
Value:
 nvrtcResult result = x;
 << #x " failed with error "
 }
Protected NVRTC call.
Parameters:
 x: Expression to call.
It checks for NVRTC errors after invocation of the expression.
◆ QUOTE
Macros/inlines to assist CLion to parse Cuda files (*.cu, *.cuh)
◆ QUOTEVALUE
◆ STATIC_ASSERT_CUDA_VERSION_GE
#define STATIC_ASSERT_CUDA_VERSION_GE(min_version)
Value:
 static_assert(CUDA_VERSION >= min_version, "Compiled-against CUDA version "
 QUOTEVALUE(CUDA_VERSION) " is too old, please upgrade system to version "
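The reason for having both QUOTE and QUOTEVALUE is the usual two-level stringification idiom: `#x` stringifies the token as written, while the extra level of indirection expands a macro argument first. The sketch below shows the difference and how the expanded value can be embedded in a static_assert message. DEMO_VERSION and DEMO_STATIC_ASSERT_VERSION_GE are hypothetical names used in place of CUDA_VERSION and the real macro, and the message wording is illustrative only.

```cpp
#include <cassert>
#include <cstring>

#define QUOTE(x) #x
#define QUOTEVALUE(x) QUOTE(x)

// Hypothetical version constant standing in for CUDA_VERSION.
#define DEMO_VERSION 9020

// QUOTE stringifies the token itself; QUOTEVALUE expands the macro first.
static const char* const kQuoted = QUOTE(DEMO_VERSION);            // "DEMO_VERSION"
static const char* const kQuotedValue = QUOTEVALUE(DEMO_VERSION);  // "9020"

// Adjacent string literals are concatenated by the compiler, so the expanded
// value can be embedded directly in a static_assert message.
#define DEMO_STATIC_ASSERT_VERSION_GE(min_version)                   \
  static_assert(DEMO_VERSION >= min_version,                         \
                "Compiled-against version " QUOTEVALUE(DEMO_VERSION) \
                " is older than " QUOTEVALUE(min_version))

// Passes at compile time because 9020 >= 9000.
DEMO_STATIC_ASSERT_VERSION_GE(9000);
```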
Function Documentation
◆ ComputeCapabilityMajor()
Determine major version number of the gpu's cuda compute architecture.
Parameters:
 device_id: The device index of the cuda-capable gpu of interest.
Returns: the major version number of the gpu's cuda compute architecture.
◆ ComputeCapabilityMinor()
Determine minor version number of the gpu's cuda compute architecture.
Parameters:
 device_id: The device index of the cuda-capable gpu of interest.
Returns: the minor version number of the gpu's cuda compute architecture.
◆ cudaAttributeLookup()
inline int cudaAttributeLookup (int device_id, std::vector< int32_t > * cached_values, cudaDeviceAttr attr, const char * attr_name)
Return an attribute of GPU device_id.
Parameters:
 device_id: The device index of the cuda-capable gpu of interest.
 cached_values: An array of attributes for already-looked-up GPUs.
 attr: The attribute, by number.
 attr_name: A string representation of the attribute, for error messages.
Returns: the gpu's attribute value.
◆ GetEnvAllowTensorCore()
Returns global policy for TensorCore algo use.
Returns: whether to allow TensorCore algo (if not specified by the Operator locally).
◆ GetEnvAllowTensorCoreConversion()
Returns global policy for TensorCore implicit type casting.
◆ MaxSharedMemoryPerMultiprocessor()
inline int MaxSharedMemoryPerMultiprocessor (int device_id)
Return the shared memory size in bytes of each of the GPU's streaming multiprocessors.
Parameters:
 device_id: The device index of the cuda-capable gpu of interest.
Returns: the shared memory size per streaming multiprocessor.
◆ MultiprocessorCount()
Return the number of streaming multiprocessors of GPU device_id.
Parameters:
 device_id: The device index of the cuda-capable gpu of interest.
Returns: the gpu's count of streaming multiprocessors.
◆ SMArch()
Return the integer SM architecture (e.g. Volta = 70).
Parameters:
 device_id: The device index of the cuda-capable gpu of interest.
Returns: the gpu's cuda compute architecture as an int.
◆ SupportsCooperativeLaunch()
inline bool SupportsCooperativeLaunch (int device_id)
Return whether the GPU device_id supports cooperative-group kernel launching.
Parameters:
 device_id: The device index of the cuda-capable gpu of interest.
Returns: the gpu's ability to run cooperative-group kernels.
◆ SupportsFloat16Compute()
Determine whether a cuda-capable gpu's architecture supports float16 math. Assume not if device_id is negative.
Parameters:
 device_id: The device index of the cuda-capable gpu of interest.
Returns: whether the gpu's architecture supports float16 math.
◆ SupportsTensorCore()
Determine whether a cuda-capable gpu's architecture supports Tensor Core math. Assume not if device_id is negative.
Parameters:
 device_id: The device index of the cuda-capable gpu of interest.
Returns: whether the gpu's architecture supports Tensor Core math.
Variable Documentation
◆ kMaxNumGpus
Maximum number of GPUs.
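To make the relationship between the compute-capability helpers above concrete, here is a small host-only sketch. It is not the MXNet code: the real functions take a device_id and query the CUDA runtime, whereas the hypothetical ComputeCapability struct below carries the two version numbers directly. The encoding 10 * major + minor is inferred from the "Volta = 70" example, and the thresholds (float16 arithmetic from compute capability 5.3, Tensor Cores from 7.0) are assumptions based on NVIDIA's published hardware capabilities rather than on this document.

```cpp
#include <cassert>

// Hypothetical plain-data stand-in for a device's compute capability.
struct ComputeCapability {
  int major_version;
  int minor_version;
};

// Encode major.minor as a single integer, e.g. 7.0 -> 70 (Volta).
constexpr int SmArchOf(ComputeCapability cc) {
  return 10 * cc.major_version + cc.minor_version;
}

// Assumption: native float16 arithmetic is available from compute capability 5.3.
constexpr bool HasFloat16Compute(ComputeCapability cc) {
  return SmArchOf(cc) >= 53;
}

// Assumption: Tensor Cores first appear with Volta (compute capability 7.0).
constexpr bool HasTensorCore(ComputeCapability cc) {
  return cc.major_version >= 7;
}
```

For example, `SmArchOf({7, 0})` evaluates to 70, and a compute capability 5.2 device is reported as lacking float16 math under these assumptions.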
It is no longer necessary to use this module (FindCUDA) or call find_package(CUDA) for compiling CUDA code. Instead, list CUDA among the languages named in the top-level call to the project() command, or call the enable_language() command with CUDA. Then one can add CUDA (.cu) sources to programs directly in calls to add_library() and add_executable().
New in version 3.17: To find and use the CUDA toolkit libraries, the FindCUDAToolkit module has superseded this module. It works whether or not the CUDA language is enabled.
Documentation of Deprecated Usage
Tools for building CUDA C files: libraries and build dependencies.
This script locates the NVIDIA CUDA C tools. It should work on Linux, Windows, and macOS and should be reasonably up to date with CUDA C releases.
This script makes use of the standard find_package() arguments of <VERSION>, REQUIRED and QUIET. CUDA_FOUND will report if an acceptable version of CUDA was found.
The script will prompt the user to specify CUDA_TOOLKIT_ROOT_DIR if the prefix cannot be determined by the location of nvcc in the system path and REQUIRED is specified to find_package(). To use a different installed version of the toolkit, set the environment variable CUDA_BIN_PATH before running cmake (e.g. CUDA_BIN_PATH=/usr/local/cuda1.0 instead of the default /usr/local/cuda) or set CUDA_TOOLKIT_ROOT_DIR after configuring. If you change the value of CUDA_TOOLKIT_ROOT_DIR, various components that depend on the path will be relocated.
It might be necessary to set CUDA_TOOLKIT_ROOT_DIR manually on certain platforms, or to use a CUDA runtime not installed in the default location. In newer versions of the toolkit the CUDA library is included with the graphics driver, so be sure that the driver version matches what is needed by the CUDA runtime version.
Input Variables
The following variables affect the behavior of the macros in the script (in alphabetical order). Note that any of these flags can be changed multiple times in the same directory before calling cuda_add_executable(), cuda_add_library(), cuda_compile(), cuda_compile_ptx(), cuda_compile_fatbin(), cuda_compile_cubin() or cuda_wrap_srcs():
CUDA_64_BIT_DEVICE_CODE (Default: host bit size)
Set to ON to compile for 64 bit device code, OFF for 32 bit device code. Note that making this different from the host code when generating object or C files from CUDA code just won't work, because size_t gets defined by nvcc in the generated source. If you compile to PTX and then load the file yourself, you can mix bit sizes between device and host.
CUDA_ATTACH_VS_BUILD_RULE_TO_CUDA_FILE (Default: ON)
Set to ON if you want the custom build rule to be attached to the source file in Visual Studio. Turn OFF if you add the same cuda file to multiple targets.
This allows the user to build the target from the CUDA file; however, bad things can happen if the CUDA source file is added to multiple targets. When performing parallel builds it is possible for the custom build command to be run more than once and in parallel, causing cryptic build errors. VS runs the rules for every source file in the target, and a source can have only one rule no matter how many projects it is added to. When the rule is run from multiple targets, race conditions can occur on the generated file. Eventually everything will get built, but if the user is unaware of this behavior, there may be confusion. It would be nice if this script could detect the reuse of source files across multiple targets and turn the option off for the user, but no good solution could be found.
CUDA_BUILD_CUBIN (Default: OFF)
Set to ON to enable an extra compilation pass with the -cubin option in Device mode. The output is parsed, and register and shared memory usage is printed during build.
CUDA_BUILD_EMULATION (Default: OFF for device mode)
Set to ON for Emulation mode. -D_DEVICEEMU is defined for CUDA C files when CUDA_BUILD_EMULATION is TRUE.
CUDA_LINK_LIBRARIES_KEYWORD (Default: "")
The <PRIVATE|PUBLIC|INTERFACE> keyword to use for internal target_link_libraries() calls. The default is to use no keyword, which uses the old "plain" form of target_link_libraries(). Note that this matters because whatever is used inside the FindCUDA module must also be used outside: the two forms of target_link_libraries() cannot be mixed.
CUDA_GENERATED_OUTPUT_DIR (Default: CMAKE_CURRENT_BINARY_DIR)
Set to the path you wish to have the generated files placed. If it is blank, output files will be placed in CMAKE_CURRENT_BINARY_DIR. Intermediate files will always be placed in CMAKE_CURRENT_BINARY_DIR/CMakeFiles.
CUDA_HOST_COMPILATION_CPP (Default: ON)
Set to OFF for C compilation of host code.
CUDA_HOST_COMPILER (Default: CMAKE_C_COMPILER)
Set the host compiler to be used by nvcc. Ignored if -ccbin or --compiler-bindir is already present in the CUDA_NVCC_FLAGS or CUDA_NVCC_FLAGS_<CONFIG> variables. For Visual Studio targets, the host compiler is constructed with one or more Visual Studio macros such as $(VCInstallDir), which expand to the path when the command is run from within VS.
New in version 3.13: If the CUDAHOSTCXX environment variable is set it will be used as the default.
CUDA_NVCC_FLAGS, CUDA_NVCC_FLAGS_<CONFIG>
Additional NVCC command line arguments. NOTE: multiple arguments must be semi-colon delimited (e.g. --compiler-options;-Wall)
New in version 3.6: Contents of these variables may use generator expressions.
CUDA_PROPAGATE_HOST_FLAGS (Default: ON)
Set to ON to propagate CMAKE_{C,CXX}_FLAGS and their configuration dependent counterparts (e.g. CMAKE_C_FLAGS_DEBUG) automatically to the host compiler through nvcc's -Xcompiler flag. This helps make the generated host code match the rest of the system better. Sometimes certain flags give nvcc problems, and this will help you turn the flag propagation off. This does not affect the flags supplied directly to nvcc via CUDA_NVCC_FLAGS or through the OPTION flags specified through cuda_add_library(), cuda_add_executable(), or cuda_wrap_srcs(). Flags used for shared library compilation are not affected by this flag.
CUDA_SEPARABLE_COMPILATION (Default: OFF)
If set this will enable separable compilation for all CUDA runtime object files. If used outside of cuda_add_executable() and cuda_add_library() (e.g. calling cuda_wrap_srcs() directly), cuda_compute_separable_compilation_object_file_name() and cuda_link_separable_compilation_objects() should be called.
CUDA_SOURCE_PROPERTY_FORMAT
If this source file property is set, it can override the format specified to cuda_wrap_srcs() (OBJ, PTX, CUBIN, or FATBIN). If an input source file is not a .cu file, setting this property will cause it to be treated as a .cu file. See documentation for set_source_files_properties on how to set this property.
CUDA_USE_STATIC_CUDA_RUNTIME (Default: ON)
New in version 3.3.
When enabled the static version of the CUDA runtime library will be used in CUDA_LIBRARIES. If the version of CUDA configured doesn't support this option, then it will be silently disabled.
CUDA_VERBOSE_BUILD (Default: OFF)
Set to ON to see all the commands used when building the CUDA file. When using a Makefile generator the value defaults to VERBOSE (run make VERBOSE=1 to see output), although setting CUDA_VERBOSE_BUILD to ON will always print the output.
Commands
The script creates the following functions and macros (in alphabetical order):
cuda_add_cufft_to_target()
Adds the cufft library to the target (can be any target). Handles whether you are in emulation mode or not.
cuda_add_cublas_to_target()
Adds the cublas library to the target (can be any target). Handles whether you are in emulation mode or not.
cuda_add_executable()
Creates an executable <cuda_target> which is made up of the files specified. All of the non CUDA C files are compiled using the standard build rules specified by CMake and the CUDA files are compiled to object files using nvcc and the host compiler. In addition CUDA_INCLUDE_DIRS is added automatically to include_directories(). Some standard CMake target calls can be used on the target after calling this macro (e.g. set_target_properties() and target_link_libraries()), but setting properties that adjust compilation flags will not affect code compiled by nvcc. Such flags should be modified before calling cuda_add_executable(), cuda_add_library() or cuda_wrap_srcs().
cuda_add_library()
Same as cuda_add_executable() except that a library is created.
cuda_build_clean_target()
Creates a convenience target that deletes all the dependency files generated. You should make clean after running this target to ensure the dependency files get regenerated.
cuda_compile()
Returns a list of generated files from the input source files to be used with add_library() or add_executable().
cuda_compile_ptx()
Returns a list of PTX files generated from the input source files.
cuda_compile_fatbin()
New in version 3.1.
Returns a list of FATBIN files generated from the input source files.
cuda_compile_cubin()
New in version 3.1.
Returns a list of CUBIN files generated from the input source files.
cuda_compute_separable_compilation_object_file_name()
Compute the name of the intermediate link file used for separable compilation. This file name is typically passed into CUDA_LINK_SEPARABLE_COMPILATION_OBJECTS. output_file_var is produced based on cuda_target and the list of object files that need separable compilation as specified by <object_files>. If the <object_files> list is empty, then <output_file_var> will be empty. This function is called automatically for cuda_add_library() and cuda_add_executable(). Note that this is a function and not a macro.
cuda_include_directories()
Sets the directories that should be passed to nvcc (e.g. nvcc -Ipath0 -Ipath1 ...). These paths usually contain other .cu files.
cuda_link_separable_compilation_objects()
Generates the link object required by separable compilation from the given object files. This is called automatically for cuda_add_executable() and cuda_add_library(), but can be called manually when using cuda_wrap_srcs() directly. When called from cuda_add_library() or cuda_add_executable() the <nvcc_flags> passed in are the same as the flags passed in via the OPTIONS argument. The only nvcc flag added automatically is the bitness flag as specified by CUDA_64_BIT_DEVICE_CODE. Note that this is a function instead of a macro.
cuda_select_nvcc_arch_flags()
Selects GPU arch flags for nvcc based on target_CUDA_architecture.
Values for target_CUDA_architecture:
Auto: detects local machine GPU compute arch at runtime.
Common and All: cover common and entire subsets of architectures.
<name>: one of Fermi, Kepler, Maxwell, Kepler+Tegra, Kepler+Tesla, Maxwell+Tegra, Pascal.
<ver>, <ver>(<ver>), <ver>+PTX, where <ver> is one of 2.0, 2.1, 3.0, 3.2, 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.2.
Returns list of flags to be added to CUDA_NVCC_FLAGS in <out_variable>. Additionally, sets <out_variable>_readable to the resulting numeric list.
Example:
More info on CUDA architectures: https://en.wikipedia.org/wiki/CUDA. Note that this is a function instead of a macro.
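To give a feel for what the numeric architecture forms listed above turn into, the following C++ sketch maps a single spec to nvcc -gencode flags. It is an illustration only and not the CMake module's code: GencodeFlagsFor and StripDot are hypothetical helpers, and the mapping assumes that a plain version targets the matching real architecture, that the parenthesised form names the virtual architecture to compile from, and that +PTX additionally embeds PTX for that version.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Remove the dot from a version such as "3.5" to get "35".
inline std::string StripDot(const std::string& ver) {
  std::string out;
  for (char c : ver) {
    if (c != '.') out += c;
  }
  return out;
}

// Translate one numeric spec ("3.0", "5.2(5.0)" or "3.5+PTX") into -gencode flags.
inline std::vector<std::string> GencodeFlagsFor(const std::string& spec) {
  std::string body = spec;
  bool want_ptx = false;
  const std::string ptx_suffix = "+PTX";
  if (body.size() > ptx_suffix.size() &&
      body.compare(body.size() - ptx_suffix.size(), ptx_suffix.size(),
                   ptx_suffix) == 0) {
    want_ptx = true;
    body.erase(body.size() - ptx_suffix.size());
  }

  // "<code>(<arch>)" names the virtual architecture explicitly; otherwise
  // the virtual architecture is assumed to equal the real one.
  std::string code = body;
  std::string arch = body;
  const std::string::size_type open = body.find('(');
  const std::string::size_type close = body.find(')');
  if (open != std::string::npos && close != std::string::npos && close > open) {
    code = body.substr(0, open);
    arch = body.substr(open + 1, close - open - 1);
  }
  code = StripDot(code);
  arch = StripDot(arch);

  std::vector<std::string> flags;
  flags.push_back("-gencode");
  flags.push_back("arch=compute_" + arch + ",code=sm_" + code);
  if (want_ptx) {
    // Also embed PTX so that newer GPUs can JIT-compile the kernel.
    flags.push_back("-gencode");
    flags.push_back("arch=compute_" + arch + ",code=compute_" + arch);
  }
  return flags;
}
```

Under these assumptions, "5.2(5.0)" yields `-gencode arch=compute_50,code=sm_52`, and "3.5+PTX" yields one flag pair for sm_35 plus a second pair that embeds compute_35 PTX.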
cuda_wrap_srcs()
This is where all the magic happens. cuda_add_executable(), cuda_add_library(), cuda_compile(), and cuda_compile_ptx() all call this function under the hood.
Given the list of files <file>... this macro generates custom commands that generate either PTX or linkable objects (use PTX or OBJ for the <format> argument to switch). Files that don't end with .cu or have the HEADER_FILE_ONLY property are ignored.
The arguments passed in after OPTIONS are extra command line options to give to nvcc. You can also specify per configuration options by specifying the name of the configuration followed by the options. General options must precede configuration specific options. Not all configurations need to be specified; only the ones provided will be used. For example:
For certain configurations (namely VS generating object files with CUDA_ATTACH_VS_BUILD_RULE_TO_CUDA_FILE set to ON), no generated file will be produced for the given cuda file. This is because when you add the cuda file to Visual Studio it knows that this file produces an object file and will link in the resulting object file automatically.
This script will also generate a separate cmake script that is used at build time to invoke nvcc. This is for several reasons:
nvcc can return negative numbers as return values, which confuses Visual Studio into thinking that the command succeeded. The script now checks the error codes and produces errors when there was a problem.
nvcc has been known to not delete incomplete results when it encounters problems. This confuses build systems into thinking the target was generated when in fact an unusable file exists. The script now deletes the output files if there was an error.
By putting all the options that affect the build into a file and then making the build rule dependent on the file, the output files will be regenerated when the options change.
This script also looks at optional arguments STATIC, SHARED, or MODULE to determine when to target the object compilation for a shared library. BUILD_SHARED_LIBS is ignored in cuda_wrap_srcs(), but it is respected in cuda_add_library(). On some systems special flags are added for building objects intended for shared libraries. A preprocessor macro, <target_name>_EXPORTS, is defined when a shared library compilation is detected.
Flags passed into add_definitions with -D or /D are passed along to nvcc.
Result Variables
The script defines the following variables:
CUDA_VERSION_MAJOR
The major version of cuda as reported by nvcc.
CUDA_VERSION_MINOR
The minor version.
CUDA_VERSION, CUDA_VERSION_STRING
Full version in the X.Y format.
CUDA_HAS_FP16
New in version 3.6: Whether a short float (float16, fp16) is supported.
CUDA_TOOLKIT_ROOT_DIR
Path to the CUDA Toolkit (defined if not set).
CUDA_SDK_ROOT_DIR
Path to the CUDA SDK. Use this to find files in the SDK. This script will not directly support finding specific libraries or headers, as that isn't supported by NVIDIA. If you want to change libraries when the path changes, see the FindCUDA.cmake script for an example of how to clear these variables. There are also examples of how to use the CUDA_SDK_ROOT_DIR to locate headers or libraries, if you so choose (at your own risk).
CUDA_INCLUDE_DIRS
Include directory for cuda headers. Added automatically for cuda_add_executable() and cuda_add_library().
CUDA_LIBRARIES
Cuda RT library.
CUDA_CUFFT_LIBRARIES
Device or emulation library for the Cuda FFT implementation (alternative to cuda_add_cufft_to_target() macro).
CUDA_CUBLAS_LIBRARIES
Device or emulation library for the Cuda BLAS implementation (alternative to cuda_add_cublas_to_target() macro).
CUDA_cudart_static_LIBRARY
Statically linkable cuda runtime library. Only available for CUDA version 5.5+.
CUDA_cudadevrt_LIBRARY
New in version 3.7: Device runtime library. Required for separable compilation.
CUDA_cupti_LIBRARY
CUDA Profiling Tools Interface library. Only available for CUDA version 4.0+.
CUDA_curand_LIBRARY
CUDA Random Number Generation library. Only available for CUDA version 3.2+.
CUDA_cusolver_LIBRARY
New in version 3.2: CUDA Direct Solver library. Only available for CUDA version 7.0+.
CUDA_cusparse_LIBRARY
CUDA Sparse Matrix library. Only available for CUDA version 3.2+.
CUDA_npp_LIBRARY
NVIDIA Performance Primitives lib. Only available for CUDA version 4.0+.
CUDA_nppc_LIBRARY
NVIDIA Performance Primitives lib (core). Only available for CUDA version 5.5+.
CUDA_nppi_LIBRARY
NVIDIA Performance Primitives lib (image processing). Only available for CUDA version 5.5 - 8.0.
CUDA_nppial_LIBRARY
NVIDIA Performance Primitives lib (image processing). Only available for CUDA version 9.0.
CUDA_nppicc_LIBRARY
NVIDIA Performance Primitives lib (image processing). Only available for CUDA version 9.0.
CUDA_nppicom_LIBRARY
NVIDIA Performance Primitives lib (image processing). Only available for CUDA version 9.0 - 10.2. Replaced by nvjpeg.
CUDA_nppidei_LIBRARY
NVIDIA Performance Primitives lib (image processing). Only available for CUDA version 9.0.
CUDA_nppif_LIBRARY
NVIDIA Performance Primitives lib (image processing). Only available for CUDA version 9.0.
CUDA_nppig_LIBRARY
NVIDIA Performance Primitives lib (image processing). Only available for CUDA version 9.0.
CUDA_nppim_LIBRARY

#define QUOTEVALUE(x)
Function Documentation◆ ComputeCapabilityMajor()Determine major version number of the gpu's cuda compute architecture. 
Parametersdevice_idThe device index of the cuda-capable gpu of interest. 
Returnsthe major version number of the gpu's cuda compute architecture. ◆ ComputeCapabilityMinor()Determine minor version number of the gpu's cuda compute architecture. 
Parametersdevice_idThe device index of the cuda-capable gpu of interest. 
Returnsthe minor version number of the gpu's cuda compute architecture. ◆ cudaAttributeLookup()int cudaAttributeLookup (int device_id, 
std::vector< int32_t > * cached_values, 
cudaDeviceAttr attr, 
const char * attr_name
)
inline
Return an attribute GPU device_id. 
Parametersdevice_idThe device index of the cuda-capable gpu of interest. 
cached_valuesAn array of attributes for already-looked-up GPUs. 
attrThe attribute, by number. 
attr_nameA string representation of the attribute, for error messages. 
Returnsthe gpu's attribute value. Clion Crack◆ GetEnvAllowTensorCore()Returns global policy for TensorCore algo use. 
Returnswhether to allow TensorCore algo (if not specified by the Operator locally). ◆ GetEnvAllowTensorCoreConversion()Returns global policy for TensorCore implicit type casting. 
◆ MaxSharedMemoryPerMultiprocessor()int MaxSharedMemoryPerMultiprocessor (int device_id)
inline
Return the shared memory size in bytes of each of the GPU's streaming multiprocessors. 
Parametersdevice_idThe device index of the cuda-capable gpu of interest. 
Returnsthe shared memory size per streaming multiprocessor. ◆ MultiprocessorCount()Return the number of streaming multiprocessors of GPU device_id. 
Parametersdevice_idThe device index of the cuda-capable gpu of interest. 
Returnsthe gpu's count of streaming multiprocessors. ◆ SMArch()Return the integer SM architecture (e.g. Volta = 70). 
Parametersdevice_idThe device index of the cuda-capable gpu of interest. 
Returnsthe gpu's cuda compute architecture as an int. ◆ SupportsCooperativeLaunch()bool SupportsCooperativeLaunch (int device_id)
inline
Return whether the GPU device_id supports cooperative-group kernel launching. 
Parametersdevice_idThe device index of the cuda-capable gpu of interest. 
Returnsthe gpu's ability to run cooperative-group kernels. ◆ SupportsFloat16Compute()Determine whether a cuda-capable gpu's architecture supports float16 math. Assume not if device_id is negative. 
Parametersdevice_idThe device index of the cuda-capable gpu of interest. 
Returnswhether the gpu's architecture supports float16 math. ◆ SupportsTensorCore()Determine whether a cuda-capable gpu's architecture supports Tensor Core math. Assume not if device_id is negative. 
Parametersdevice_idThe device index of the cuda-capable gpu of interest. 
Returnswhether the gpu's architecture supports Tensor Core math. Variable Documentation◆ kMaxNumGpusMaximum number of GPUs. 
It is no longer necessary to use this module or call find_package(CUDA)for compiling CUDA code. Instead, list CUDA among the languages namedin the top-level call to the project() command, or call theenable_language() command with CUDA.Then one can add CUDA (.cu) sources to programs directlyin calls to add_library() and add_executable().
New in version 3.17: To find and use the CUDA toolkit libraries the FindCUDAToolkitmodule has superseded this module. It works whether or not the CUDAlanguage is enabled.
Documentation of Deprecated Usage¶Tools for building CUDA C files: libraries and build dependencies.
This script locates the NVIDIA CUDA C tools. It should work on Linux,Windows, and macOS and should be reasonably up to date with CUDA Creleases.
This script makes use of the standard find_package() arguments of, REQUIRED and QUIET. CUDA_FOUND will report if anacceptable version of CUDA was found.
The script will prompt the user to specify CUDA_TOOLKIT_ROOT_DIR ifthe prefix cannot be determined by the location of nvcc in the systempath and REQUIRED is specified to find_package(). To usea different installed version of the toolkit set the environment variableCUDA_BIN_PATH before running cmake (e.g.CUDA_BIN_PATH=/usr/local/cuda1.0 instead of the default/usr/local/cuda) or set CUDA_TOOLKIT_ROOT_DIR after configuring. Ifyou change the value of CUDA_TOOLKIT_ROOT_DIR, various components thatdepend on the path will be relocated.
It might be necessary to set CUDA_TOOLKIT_ROOT_DIR manually on certainplatforms, or to use a CUDA runtime not installed in the defaultlocation. In newer versions of the toolkit the CUDA library isincluded with the graphics driver -- be sure that the driver versionmatches what is needed by the CUDA runtime version.
Input Variables¶The following variables affect the behavior of the macros in thescript (in alphabetical order). Note that any of these flags can bechanged multiple times in the same directory before callingcuda_add_executable(), cuda_add_library(), cuda_compile(),cuda_compile_ptx(), cuda_compile_fatbin(), cuda_compile_cubin()or cuda_wrap_srcs():
CUDA_64_BIT_DEVICE_CODE (Default: host bit size)Set to ON to compile for 64 bit device code, OFF for 32 bit device code.Note that making this different from the host code when generating objector C files from CUDA code just won't work, because size_t gets defined bynvcc in the generated source. If you compile to PTX and then load thefile yourself, you can mix bit sizes between device and host.
CUDA_ATTACH_VS_BUILD_RULE_TO_CUDA_FILE (Default: ON)Set to ON if you want the custom build rule to be attached to the sourcefile in Visual Studio. Turn OFF if you add the same cuda file to multipletargets.
This allows the user to build the target from the CUDA file; however, badthings can happen if the CUDA source file is added to multiple targets.When performing parallel builds it is possible for the custom buildcommand to be run more than once and in parallel causing cryptic builderrors. VS runs the rules for every source file in the target, and asource can have only one rule no matter how many projects it is added to.When the rule is run from multiple targets race conditions can occur onthe generated file. Eventually everything will get built, but if the useris unaware of this behavior, there may be confusion. It would be nice ifthis script could detect the reuse of source files across multiple targetsand turn the option off for the user, but no good solution could be found.
CUDA_BUILD_CUBIN (Default: OFF)
Set to ON to enable an extra compilation pass with the -cubin option in Device mode. The output is parsed, and register and shared memory usage is printed during the build.
CUDA_BUILD_EMULATION (Default: OFF for device mode)
Set to ON for Emulation mode. -D_DEVICEEMU is defined for CUDA C files when CUDA_BUILD_EMULATION is TRUE.
CUDA_LINK_LIBRARIES_KEYWORD (Default: "")
The keyword (PRIVATE, PUBLIC, or INTERFACE) to use for internal target_link_libraries() calls. The default is to use no keyword, which uses the old "plain" form of target_link_libraries(). Note that this matters because whatever is used inside the FindCUDA module must also be used outside: the two forms of target_link_libraries() cannot be mixed.
CUDA_GENERATED_OUTPUT_DIR (Default: CMAKE_CURRENT_BINARY_DIR)
Set to the path you wish to have the generated files placed. If it is blank, output files will be placed in CMAKE_CURRENT_BINARY_DIR. Intermediate files will always be placed in CMAKE_CURRENT_BINARY_DIR/CMakeFiles.
CUDA_HOST_COMPILATION_CPP (Default: ON)
Set to OFF for C compilation of host code.
CUDA_HOST_COMPILER (Default: CMAKE_C_COMPILER)
Set the host compiler to be used by nvcc. Ignored if -ccbin or --compiler-bindir is already present in the CUDA_NVCC_FLAGS or CUDA_NVCC_FLAGS_<CONFIG> variables. For Visual Studio targets, the host compiler is constructed with one or more Visual Studio macros such as $(VCInstallDir), which expand out to the path when the command is run from within VS.
New in version 3.13: If the CUDAHOSTCXX environment variable is set it will be used as the default.
CUDA_NVCC_FLAGS, CUDA_NVCC_FLAGS_<CONFIG>
Additional NVCC command line arguments. NOTE: multiple arguments must be semicolon delimited (e.g. --compiler-options;-Wall)
New in version 3.6: Contents of these variables may use generator expressions.
CUDA_PROPAGATE_HOST_FLAGS (Default: ON)
Set to ON to propagate CMAKE_{C,CXX}_FLAGS and their configuration dependent counterparts (e.g. CMAKE_C_FLAGS_DEBUG) automatically to the host compiler through nvcc's -Xcompiler flag. This helps make the generated host code match the rest of the system better. Sometimes certain flags give nvcc problems, and this will help you turn the flag propagation off. This does not affect the flags supplied directly to nvcc via CUDA_NVCC_FLAGS or through the OPTION flags specified through cuda_add_library(), cuda_add_executable(), or cuda_wrap_srcs(). Flags used for shared library compilation are not affected by this flag.
CUDA_SEPARABLE_COMPILATION (Default: OFF)
If set this will enable separable compilation for all CUDA runtime object files. If used outside of cuda_add_executable() and cuda_add_library() (e.g. calling cuda_wrap_srcs() directly), cuda_compute_separable_compilation_object_file_name() and cuda_link_separable_compilation_objects() should be called.
CUDA_SOURCE_PROPERTY_FORMAT
If this source file property is set, it can override the format specified to cuda_wrap_srcs() (OBJ, PTX, CUBIN, or FATBIN). If an input source file is not a .cu file, setting this property will cause it to be treated as a .cu file. See documentation for set_source_files_properties on how to set this property.
CUDA_USE_STATIC_CUDA_RUNTIME (Default: ON)
New in version 3.3.
When enabled the static version of the CUDA runtime library will be used in CUDA_LIBRARIES. If the version of CUDA configured doesn't support this option, then it will be silently disabled.
CUDA_VERBOSE_BUILD (Default: OFF)
Set to ON to see all the commands used when building the CUDA file. When using a Makefile generator the value defaults to VERBOSE (run make VERBOSE=1 to see output), although setting CUDA_VERBOSE_BUILD to ON will always print the output.
Commands
The script creates the following functions and macros (in alphabetical order):
cuda_add_cufft_to_target()
Adds the cufft library to the target (can be any target). Handles whether you are in emulation mode or not.
cuda_add_cublas_to_target()
Adds the cublas library to the target (can be any target). Handles whether you are in emulation mode or not.
cuda_add_executable()
Creates an executable which is made up of the files specified. All of the non CUDA C files are compiled using the standard build rules specified by CMake and the CUDA files are compiled to object files using nvcc and the host compiler. In addition CUDA_INCLUDE_DIRS is added automatically to include_directories(). Some standard CMake target calls can be used on the target after calling this macro (e.g. set_target_properties() and target_link_libraries()), but setting properties that adjust compilation flags will not affect code compiled by nvcc. Such flags should be modified before calling cuda_add_executable(), cuda_add_library() or cuda_wrap_srcs().
cuda_add_library()
Same as cuda_add_executable() except that a library is created.
cuda_build_clean_target()
Creates a convenience target that deletes all the dependency files generated. You should run make clean after running this target to ensure the dependency files get regenerated.
cuda_compile()
Returns a list of generated files from the input source files to be used with add_library() or add_executable().
cuda_compile_ptx()
Returns a list of PTX files generated from the input source files.
cuda_compile_fatbin()
New in version 3.1.
Returns a list of FATBIN files generated from the input source files.
cuda_compile_cubin()
New in version 3.1.
Returns a list of CUBIN files generated from the input source files.
cuda_compute_separable_compilation_object_file_name()
Compute the name of the intermediate link file used for separable compilation. This file name is typically passed into cuda_link_separable_compilation_objects(). <output_file_var> is produced based on <cuda_target> and the list of object files that need separable compilation, as specified by <object_files>. If the <object_files> list is empty, then <output_file_var> will be empty. This function is called automatically for cuda_add_library() and cuda_add_executable(). Note that this is a function and not a macro.
cuda_include_directories()
Sets the directories that should be passed to nvcc (e.g. nvcc -Ipath0 -Ipath1 ...). These paths usually contain other .cu files.
cuda_link_separable_compilation_objects()
Generates the link object required by separable compilation from the given object files. This is called automatically for cuda_add_executable() and cuda_add_library(), but can be called manually when using cuda_wrap_srcs() directly. When called from cuda_add_library() or cuda_add_executable() the nvcc flags passed in are the same as the flags passed in via the OPTIONS argument. The only nvcc flag added automatically is the bitness flag as specified by CUDA_64_BIT_DEVICE_CODE. Note that this is a function instead of a macro.
cuda_select_nvcc_arch_flags()
Selects GPU arch flags for nvcc based on target_CUDA_architecture.
Values for target_CUDA_architecture:
Auto: detects the local machine's GPU compute arch at runtime.
Common and All: cover the common and the entire subsets of architectures.
An architecture name: one of Fermi, Kepler, Maxwell, Kepler+Tegra, Kepler+Tesla, Maxwell+Tegra, Pascal.
A version: <ver>, <ver>(<ver>), or <ver>+PTX, where <ver> is one of 2.0, 2.1, 3.0, 3.2, 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.2.
Returns the list of flags to be added to CUDA_NVCC_FLAGS in <out_variable>. Additionally, sets <out_variable>_readable to the resulting numeric list.
Example:
More info on CUDA architectures: https://en.wikipedia.org/wiki/CUDA.
Note that this is a function instead of a macro.
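As a rough illustration of what such a flag list can look like, the sketch below maps version entries such as 3.5 or 5.2+PTX to nvcc -gencode arguments. It is written in C++ rather than CMake purely to make the string mapping explicit. StripDot and ArchFlagsFor are hypothetical helper names, and the exact flags that FindCUDA emits are an assumption here and may differ between CMake versions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: turn a version such as "3.5" into "35" by dropping the dot.
inline std::string StripDot(const std::string& version) {
  std::string out;
  for (char c : version) {
    if (c != '.') out += c;
  }
  return out;
}

// Hypothetical helper: build nvcc -gencode arguments for one entry.
// Assumed mapping (not taken from the FindCUDA sources):
//   "X.Y"     -> machine code for sm_XY
//   "X.Y+PTX" -> machine code for sm_XY plus embedded PTX for compute_XY
inline std::vector<std::string> ArchFlagsFor(const std::string& entry) {
  assert(!entry.empty() && "entry must name a version");
  const std::string suffix = "+PTX";
  const bool want_ptx =
      entry.size() > suffix.size() &&
      entry.compare(entry.size() - suffix.size(), suffix.size(), suffix) == 0;
  const std::string version =
      want_ptx ? entry.substr(0, entry.size() - suffix.size()) : entry;
  const std::string arch = StripDot(version);

  std::vector<std::string> flags;
  flags.push_back("-gencode");
  flags.push_back("arch=compute_" + arch + ",code=sm_" + arch);
  if (want_ptx) {
    flags.push_back("-gencode");
    flags.push_back("arch=compute_" + arch + ",code=compute_" + arch);
  }
  return flags;
}
```

Under these assumptions, ArchFlagsFor("3.5") yields two list items and ArchFlagsFor("5.2+PTX") yields four, the last pair requesting PTX so that newer GPUs can JIT-compile the kernel.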
cuda_wrap_srcs()
This is where all the magic happens. cuda_add_executable(), cuda_add_library(), cuda_compile(), and cuda_compile_ptx() all call this function under the hood.
Given the list of files, this macro generates custom commands that generate either PTX or linkable objects (use PTX or OBJ for the format argument to switch). Files that don't end with .cu or have the HEADER_FILE_ONLY property are ignored.
The arguments passed in after OPTIONS are extra command line options to give to nvcc. You can also specify per configuration options by specifying the name of the configuration followed by the options. General options must precede configuration specific options. Not all configurations need to be specified, only the ones provided will be used. For example:
For certain configurations (namely VS generating object files with CUDA_ATTACH_VS_BUILD_RULE_TO_CUDA_FILE set to ON), no generated file will be produced for the given cuda file. This is because when you add the cuda file to Visual Studio it knows that this file produces an object file and will link in the resulting object file automatically.
This script will also generate a separate cmake script that is used at build time to invoke nvcc. This is for several reasons:
nvcc can return negative numbers as return values, which confuses Visual Studio into thinking that the command succeeded. The script now checks the error codes and produces errors when there was a problem.
nvcc has been known to not delete incomplete results when it encounters problems. This confuses build systems into thinking the target was generated when in fact an unusable file exists. The script now deletes the output files if there was an error.
By putting all the options that affect the build into a file and then making the build rule dependent on the file, the output files will be regenerated when the options change.
This script also looks at optional arguments STATIC, SHARED, or MODULE to determine when to target the object compilation for a shared library. BUILD_SHARED_LIBS is ignored in cuda_wrap_srcs(), but it is respected in cuda_add_library(). On some systems special flags are added for building objects intended for shared libraries. A preprocessor macro, <target_name>_EXPORTS, is defined when a shared library compilation is detected.
Flags passed into add_definitions with -D or /D are passed along to nvcc.
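One common way to consume a target-specific _EXPORTS macro is an export/import header. The sketch below assumes a hypothetical shared library target named mylib, so that the macro would be mylib_EXPORTS; MYLIB_API and mylib_add are illustrative names, and the non-Windows branch assumes a GCC- or Clang-compatible compiler.

```cpp
// Stand-in for the -Dmylib_EXPORTS definition that the build would pass while
// compiling the (hypothetical) shared library target "mylib". A real header
// would not define it; it is defined here only to keep the sketch self-contained.
#ifndef mylib_EXPORTS
#define mylib_EXPORTS
#endif

#if defined(_WIN32)
#  if defined(mylib_EXPORTS)
#    define MYLIB_API __declspec(dllexport)  // building the library itself
#  else
#    define MYLIB_API __declspec(dllimport)  // consuming the library
#  endif
#else
// Assumes GCC/Clang symbol visibility attributes.
#  define MYLIB_API __attribute__((visibility("default")))
#endif

// Public entry point of the hypothetical library.
MYLIB_API int mylib_add(int a, int b);

int mylib_add(int a, int b) { return a + b; }
```

Client code that includes the same header without mylib_EXPORTS defined would take the import branch on Windows, which is why the macro is only set while the library's own sources are compiled.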
Result Variables
The script defines the following variables:
CUDA_VERSION_MAJOR
The major version of cuda as reported by nvcc.
CUDA_VERSION_MINOR
The minor version.
CUDA_VERSION, CUDA_VERSION_STRING
Full version in the X.Y format.
CUDA_HAS_FP16
New in version 3.6: Whether a short float (float16, fp16) is supported.
CUDA_TOOLKIT_ROOT_DIR
Path to the CUDA Toolkit (defined if not set).
CUDA_SDK_ROOT_DIR
Path to the CUDA SDK. Use this to find files in the SDK. This script will not directly support finding specific libraries or headers, as that isn't supported by NVIDIA. If you want to change libraries when the path changes see the FindCUDA.cmake script for an example of how to clear these variables. There are also examples of how to use the CUDA_SDK_ROOT_DIR to locate headers or libraries, if you so choose (at your own risk).
CUDA_INCLUDE_DIRS
Include directory for cuda headers. Added automatically for cuda_add_executable() and cuda_add_library().
CUDA_LIBRARIES
Cuda RT library.
CUDA_CUFFT_LIBRARIES
Device or emulation library for the Cuda FFT implementation (alternative to cuda_add_cufft_to_target() macro).
CUDA_CUBLAS_LIBRARIES
Device or emulation library for the Cuda BLAS implementation (alternative to cuda_add_cublas_to_target() macro).
CUDA_cudart_static_LIBRARY
Statically linkable cuda runtime library. Only available for CUDA version 5.5+.
CUDA_cudadevrt_LIBRARY
New in version 3.7: Device runtime library. Required for separable compilation.
CUDA_cupti_LIBRARY
CUDA Profiling Tools Interface library. Only available for CUDA version 4.0+.
CUDA_curand_LIBRARY
CUDA Random Number Generation library. Only available for CUDA version 3.2+.
CUDA_cusolver_LIBRARY
New in version 3.2: CUDA Direct Solver library. Only available for CUDA version 7.0+.
CUDA_cusparse_LIBRARY
CUDA Sparse Matrix library. Only available for CUDA version 3.2+.
CUDA_npp_LIBRARY
NVIDIA Performance Primitives lib. Only available for CUDA version 4.0+.
CUDA_nppc_LIBRARY
NVIDIA Performance Primitives lib (core). Only available for CUDA version 5.5+.
CUDA_nppi_LIBRARY
NVIDIA Performance Primitives lib (image processing). Only available for CUDA version 5.5 - 8.0.
CUDA_nppial_LIBRARY
NVIDIA Performance Primitives lib (image processing). Only available for CUDA version 9.0.
CUDA_nppicc_LIBRARY
NVIDIA Performance Primitives lib (image processing). Only available for CUDA version 9.0.
CUDA_nppicom_LIBRARY
NVIDIA Performance Primitives lib (image processing). Only available for CUDA version 9.0 - 10.2. Replaced by nvjpeg.
CUDA_nppidei_LIBRARY
NVIDIA Performance Primitives lib (image processing). Only available for CUDA version 9.0.
CUDA_nppif_LIBRARY
NVIDIA Performance Primitives lib (image processing). Only available for CUDA version 9.0.
CUDA_nppig_LIBRARY
NVIDIA Performance Primitives lib (image processing). Only available for CUDA version 9.0.
CUDA_nppim_LIBRARY
NVIDIA Performance Primitives lib (image processing). Only available for CUDA version 9.0.
CUDA_nppist_LIBRARY
NVIDIA Performance Primitives lib (image processing). Only available for CUDA version 9.0.
CUDA_nppisu_LIBRARY
NVIDIA Performance Primitives lib (image processing). Only available for CUDA version 9.0.
CUDA_nppitc_LIBRARY
NVIDIA Performance Primitives lib (image processing). Only available for CUDA version 9.0.
CUDA_npps_LIBRARY
NVIDIA Performance Primitives lib (signal processing). Only available for CUDA version 5.5+.
CUDA_nvcuvenc_LIBRARY
CUDA Video Encoder library. Only available for CUDA version 3.2+. Windows only.
CUDA_nvcuvid_LIBRARY
CUDA Video Decoder library. Only available for CUDA version 3.2+. Windows only.
CUDA_nvToolsExt_LIBRARY
New in version 3.16: NVIDIA CUDA Tools Extension library. Available for CUDA version 5+.
CUDA_OpenCL_LIBRARY
New in version 3.16: NVIDIA CUDA OpenCL library. Available for CUDA version 5+.

Classes
struct	mxnet::common::cuda::CublasType< DType >
Converts between C++ datatypes and enums/constants needed by cuBLAS.
struct	mxnet::common::cuda::CublasType< float >
struct	mxnet::common::cuda::CublasType< double >
struct	mxnet::common::cuda::CublasType< mshadow::half::half_t >
struct	mxnet::common::cuda::CublasType< uint8_t >
struct	mxnet::common::cuda::CublasType< int32_t >
class	mxnet::common::cuda::DeviceStore
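
The CublasType specializations listed above follow a familiar type-trait pattern: a primary template plus one explicit specialization per supported element type. The sketch below illustrates that pattern only. TypeTag, TypeTraits, kTag, ScaleType and TagOf are hypothetical names chosen so the code compiles without the CUDA toolkit; they are not MXNet's actual members, and the real struct maps to cuBLAS/CUDA enum constants instead.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for the library enum constants a real trait would expose.
enum class TypeTag { kFloat32, kFloat64, kUint8, kInt32 };

// Primary template is declared but not defined, so using an unsupported
// element type is a compile-time error rather than a silent fallback.
template <typename DType>
struct TypeTraits;

template <>
struct TypeTraits<float> {
  static constexpr TypeTag kTag = TypeTag::kFloat32;
  using ScaleType = float;  // type used for scalar multipliers
};

template <>
struct TypeTraits<double> {
  static constexpr TypeTag kTag = TypeTag::kFloat64;
  using ScaleType = double;
};

template <>
struct TypeTraits<std::uint8_t> {
  static constexpr TypeTag kTag = TypeTag::kUint8;
  using ScaleType = std::uint8_t;
};

template <>
struct TypeTraits<std::int32_t> {
  static constexpr TypeTag kTag = TypeTag::kInt32;
  using ScaleType = std::int32_t;
};

// Convenience accessor so generic code can ask for the tag of any DType.
template <typename DType>
constexpr TypeTag TagOf() {
  return TypeTraits<DType>::kTag;
}

static_assert(std::is_same<TypeTraits<double>::ScaleType, double>::value,
              "double scales with double");
```

Generic code can then be written once against DType and look up the matching constant with TagOf<DType>() instead of branching on the type at runtime.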

Macros
#define	QUOTE(x) #x
Macros/inlines to assist CLion to parse Cuda files (*.cu, *.cuh)
#define	QUOTEVALUE(x) QUOTE(x)
#define	STATIC_ASSERT_CUDA_VERSION_GE(min_version)
#define	CHECK_CUDA_ERROR(msg)
When compiling a device function, check that the architecture is >= Kepler (3.0). Note that __CUDA_ARCH__ is not defined outside of a device function.
#define	CUDA_CALL(func)
Protected CUDA call.
#define	CUBLAS_CALL(func)
Protected cuBLAS call.
#define	CUSOLVER_CALL(func)
Protected cuSolver call.
#define	CURAND_CALL(func)
Protected cuRAND call.
#define	NVRTC_CALL(x)
Protected NVRTC call.
#define	CUDA_DRIVER_CALL(func)
Protected CUDA driver call.
#define	CUDA_UNROLL _Pragma("unroll")
#define	CUDA_NOUNROLL _Pragma("nounroll")
#define	MXNET_CUDA_ALLOW_TENSOR_CORE_DEFAULT true
#define	MXNET_CUDA_TENSOR_OP_MATH_ALLOW_CONVERSION_DEFAULT false
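
Two idioms from the macro list can be sketched without a GPU: two-level stringification (QUOTE and QUOTEVALUE) and a "protected call" that turns a failing status code into an error. The sketch below is illustrative and is not MXNet's implementation: FakeStatus, FakeGetErrorString, FakeApi, FAKE_CALL, FAKE_MIN_VERSION and TryCall are hypothetical stand-ins for a real CUDA status type, its error-string function, an API call, and a checking macro.

```cpp
#include <stdexcept>
#include <string>

// QUOTE stringifies its argument as written; QUOTEVALUE expands the argument
// first and then stringifies the result.
#define QUOTE(x) #x
#define QUOTEVALUE(x) QUOTE(x)

#define FAKE_MIN_VERSION 9000  // hypothetical version constant

inline std::string MinVersionName() { return QUOTE(FAKE_MIN_VERSION); }
inline std::string MinVersionText() { return QUOTEVALUE(FAKE_MIN_VERSION); }

// Hypothetical status type standing in for a CUDA-style status enum.
enum class FakeStatus { kSuccess = 0, kInvalidValue = 1 };

inline const char* FakeGetErrorString(FakeStatus status) {
  return status == FakeStatus::kSuccess ? "success" : "invalid value";
}

// Protected call: evaluate the expression once and throw if it did not succeed.
// The message carries the stringified call so the failing site is identifiable.
#define FAKE_CALL(func)                                               \
  do {                                                                \
    const FakeStatus status_ = (func);                                \
    if (status_ != FakeStatus::kSuccess) {                            \
      throw std::runtime_error(std::string(#func) + " failed: " +     \
                               FakeGetErrorString(status_));          \
    }                                                                 \
  } while (0)

// Hypothetical API call that succeeds or fails on request.
inline FakeStatus FakeApi(bool ok) {
  return ok ? FakeStatus::kSuccess : FakeStatus::kInvalidValue;
}

// Small driver that reports either "ok" or the error text.
inline std::string TryCall(bool ok) {
  try {
    FAKE_CALL(FakeApi(ok));
    return "ok";
  } catch (const std::runtime_error& e) {
    return e.what();
  }
}
```

The do { ... } while (0) wrapper lets the macro be used like a single statement, including inside an unbraced if/else.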

Functions
const char *	mxnet::common::cuda::CublasGetErrorString (cublasStatus_t error)
Get string representation of cuBLAS errors.
const char *	mxnet::common::cuda::CusolverGetErrorString (cusolverStatus_t error)
Get string representation of cuSOLVER errors.
const char *	mxnet::common::cuda::CurandGetErrorString (curandStatus_t status)
Get string representation of cuRAND errors.
template<typename DType>
DType __device__	mxnet::common::cuda::CudaMax (DType a, DType b)
template<typename DType>
DType __device__	mxnet::common::cuda::CudaMin (DType a, DType b)
int	mxnet::common::cuda::get_load_type (size_t N)
Get the largest datatype suitable to read requested number of bytes.
int	mxnet::common::cuda::get_rows_per_block (size_t row_size, int num_threads_per_block)
Determine how many rows in a 2D matrix a block of threads should handle, based on the row size and the number of threads in a block.
int	cudaAttributeLookup (int device_id, std::vector< int32_t > *cached_values, cudaDeviceAttr attr, const char *attr_name)
Return an attribute of GPU `device_id`.
int	ComputeCapabilityMajor (int device_id)
Determine major version number of the gpu's cuda compute architecture.
int	ComputeCapabilityMinor (int device_id)
Determine minor version number of the gpu's cuda compute architecture.
int	SMArch (int device_id)
Return the integer SM architecture (e.g. Volta = 70).
int	MultiprocessorCount (int device_id)
Return the number of streaming multiprocessors of GPU `device_id`.
int	MaxSharedMemoryPerMultiprocessor (int device_id)
Return the shared memory size in bytes of each of the GPU's streaming multiprocessors.
bool	SupportsCooperativeLaunch (int device_id)
Return whether the GPU `device_id` supports cooperative-group kernel launching.
bool	SupportsFloat16Compute (int device_id)
Determine whether a cuda-capable gpu's architecture supports float16 math. Assume not if device_id is negative.
bool	SupportsTensorCore (int device_id)
Determine whether a cuda-capable gpu's architecture supports Tensor Core math. Assume not if device_id is negative.
bool	GetEnvAllowTensorCore ()
Returns global policy for TensorCore algo use.
bool	GetEnvAllowTensorCoreConversion ()
Returns global policy for TensorCore implicit type casting.
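
The capability helpers above can be pictured as simple predicates over the major and minor compute capability. The sketch below takes the capability as a plain value instead of querying a device, so it runs without CUDA. The names ending in From and the ComputeCapability struct are hypothetical, and the thresholds (float16 math from capability 5.3, Tensor Cores from 7.0) are assumptions that should be checked against the actual MXNet sources; only the "Volta = 70" encoding and the negative device_id rule come from the descriptions above.

```cpp
// Hypothetical value type holding a device's compute capability.
struct ComputeCapability {
  int major_version;
  int minor_version;
};

// Integer SM architecture, e.g. capability 7.0 -> 70.
constexpr int SMArchFrom(ComputeCapability cc) {
  return 10 * cc.major_version + cc.minor_version;
}

// Assumed rule: float16 math is available from capability 5.3 upwards.
// A negative device_id is treated as "no GPU", so the answer is false.
constexpr bool SupportsFloat16ComputeFrom(int device_id, ComputeCapability cc) {
  return device_id >= 0 &&
         (cc.major_version > 5 ||
          (cc.major_version == 5 && cc.minor_version >= 3));
}

// Assumed rule: Tensor Core math is available from capability 7.0 upwards.
// A negative device_id again yields false.
constexpr bool SupportsTensorCoreFrom(int device_id, ComputeCapability cc) {
  return device_id >= 0 && cc.major_version >= 7;
}
```

Keeping the predicates free of device queries makes them easy to unit-test; a thin wrapper could fetch the capability for a real device_id and forward to them.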

Variables
constexpr size_t	kMaxNumGpus = 64
Maximum number of GPUs.

Parameters
device_id	The device index of the cuda-capable gpu of interest.
cached_values	An array of attributes for already-looked-up GPUs.
attr	The attribute, by number.
attr_name	A string representation of the attribute, for error messages.
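
The parameters above suggest a per-device cache: each attribute is queried once per GPU and then served from cached_values. A minimal sketch of that idea follows, assuming a sentinel of -1 for "not looked up yet" and a caller-supplied query callback in place of the real CUDA runtime call. CachedAttributeLookup and the callback are hypothetical; this is not the MXNet implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

constexpr std::size_t kMaxNumGpus = 64;  // matches the constant listed above

// Returns the attribute for device_id, querying it only on first use.
// cached_values is expected to be pre-filled with -1 (assumed sentinel).
// query is a hypothetical stand-in for the runtime's attribute query.
inline int CachedAttributeLookup(int device_id,
                                 std::vector<std::int32_t>* cached_values,
                                 const std::function<int(int)>& query,
                                 const char* attr_name) {
  if (device_id < 0 ||
      static_cast<std::size_t>(device_id) >= cached_values->size()) {
    // attr_name is only used to make the error message readable.
    throw std::out_of_range(
        std::string("device_id out of range while reading ") + attr_name);
  }
  std::int32_t& slot = (*cached_values)[static_cast<std::size_t>(device_id)];
  if (slot < 0) {
    slot = query(device_id);  // first request for this device: ask the runtime
  }
  return slot;
}
```

A caller would typically keep one cache per attribute, for example std::vector<std::int32_t> cache(kMaxNumGpus, -1), so that repeated lookups for the same device avoid the comparatively slow runtime query.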