HIP: Heterogeneous-computing Interface for Portability
Defines the HIP API. See the individual sections for more information.
Modules | |
Initialization and Version | |
This section describes the initialization and version functions of HIP runtime API. | |
Device Management | |
This section describes the device management functions of HIP runtime API. | |
Execution Control | |
This section describes the execution control functions of HIP runtime API. | |
Error Handling | |
This section describes the error handling functions of HIP runtime API. | |
Stream Management | |
This section describes the stream management functions of HIP runtime API. The following Stream APIs are not (yet) supported in HIP: | |
Event Management | |
This section describes the event management functions of HIP runtime API. | |
Memory Management | |
This section describes the memory management functions of HIP runtime API. The following CUDA APIs are not currently supported: | |
PeerToPeer Device Memory Access | |
Context Management | |
This section describes the context management functions of HIP runtime API. | |
Module Management | |
This section describes the module management functions of HIP runtime API. | |
Occupancy | |
This section describes the occupancy functions of HIP runtime API. | |
Profiler Control [Deprecated] | |
This section describes the profiler control functions of HIP runtime API. | |
Launch API to support the triple-chevron syntax | |
This section describes the API to support the triple-chevron syntax. | |
Texture Management | |
This section describes the texture management functions of HIP runtime API. | |
Variables | |
unsigned | hipDeviceArch_t::hasGlobalInt32Atomics: 1 |
32-bit integer atomics for global memory. | |
unsigned | hipDeviceArch_t::hasGlobalFloatAtomicExch: 1 |
32-bit float atomic exch for global memory. | |
unsigned | hipDeviceArch_t::hasSharedInt32Atomics: 1 |
32-bit integer atomics for shared memory. | |
unsigned | hipDeviceArch_t::hasSharedFloatAtomicExch: 1 |
32-bit float atomic exch for shared memory. | |
unsigned | hipDeviceArch_t::hasFloatAtomicAdd: 1 |
32-bit float atomic add in global and shared memory. | |
unsigned | hipDeviceArch_t::hasGlobalInt64Atomics: 1 |
64-bit integer atomics for global memory. | |
unsigned | hipDeviceArch_t::hasSharedInt64Atomics: 1 |
64-bit integer atomics for shared memory. | |
unsigned | hipDeviceArch_t::hasDoubles: 1 |
Double-precision floating point. | |
unsigned | hipDeviceArch_t::hasWarpVote: 1 |
Warp vote instructions (__any, __all). | |
unsigned | hipDeviceArch_t::hasWarpBallot: 1 |
Warp ballot instructions (__ballot). | |
unsigned | hipDeviceArch_t::hasWarpShuffle: 1 |
Warp shuffle operations (__shfl_*). | |
unsigned | hipDeviceArch_t::hasFunnelShift: 1 |
Funnel two words into one, with shift and mask capabilities. | |
unsigned | hipDeviceArch_t::hasThreadFenceSystem: 1 |
__threadfence_system. | |
unsigned | hipDeviceArch_t::hasSyncThreadsExt: 1 |
__syncthreads_count, __syncthreads_and, __syncthreads_or. | |
unsigned | hipDeviceArch_t::hasSurfaceFuncs: 1 |
Surface functions. | |
unsigned | hipDeviceArch_t::has3dGrid: 1 |
Grid and group dims are 3D (rather than 2D). | |
unsigned | hipDeviceArch_t::hasDynamicParallelism: 1 |
Dynamic parallelism. | |
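The flags above are intended to be tested one at a time rather than inferred from a compute-capability number. The following is a minimal sketch of that pattern. `ArchFlags` is a hypothetical local stand-in that mirrors only three of the one-bit fields listed above, so the example compiles without the HIP headers; the strategy names are illustrative as well. In real code the flags would be read from `hipDeviceProp_t::arch` after a call to `hipGetDeviceProperties`.

```cpp
#include <string>

// Hypothetical stand-in for a subset of hipDeviceArch_t (not the real header).
// Each flag is a one-bit unsigned bit-field, as in the listing above.
struct ArchFlags {
    unsigned hasDoubles : 1;
    unsigned hasWarpShuffle : 1;
    unsigned hasFloatAtomicAdd : 1;
};

// Pick a reduction strategy from the feature flags alone.
// The returned names are illustrative and are not part of HIP.
std::string chooseReduction(const ArchFlags& arch) {
    if (arch.hasWarpShuffle) return "warp-shuffle";
    if (arch.hasFloatAtomicAdd) return "atomic-add";
    return "shared-memory";
}
```

Keying the decision on individual flags keeps the same host code usable on devices whose `major`/`minor` values do not map cleanly onto a feature set.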
char | hipDeviceProp_t::name [256] |
Device name. | |
size_t | hipDeviceProp_t::totalGlobalMem |
Size of global memory region (in bytes). | |
size_t | hipDeviceProp_t::sharedMemPerBlock |
Size of shared memory region (in bytes). | |
int | hipDeviceProp_t::regsPerBlock |
Registers per block. | |
int | hipDeviceProp_t::warpSize |
Warp size. | |
int | hipDeviceProp_t::maxThreadsPerBlock |
Maximum number of work items per work group (maximum work-group size). | |
int | hipDeviceProp_t::maxThreadsDim [3] |
Max number of threads in each dimension (XYZ) of a block. | |
int | hipDeviceProp_t::maxGridSize [3] |
Max grid dimensions (XYZ). | |
int | hipDeviceProp_t::clockRate |
Max clock frequency of the multiprocessors in kHz. | |
int | hipDeviceProp_t::memoryClockRate |
Max global memory clock frequency in kHz. | |
int | hipDeviceProp_t::memoryBusWidth |
Global memory bus width in bits. | |
size_t | hipDeviceProp_t::totalConstMem |
Size of constant memory region (in bytes). | |
int | hipDeviceProp_t::major |
int | hipDeviceProp_t::minor |
int | hipDeviceProp_t::multiProcessorCount |
Number of multi-processors (compute units). | |
int | hipDeviceProp_t::l2CacheSize |
L2 cache size. | |
int | hipDeviceProp_t::maxThreadsPerMultiProcessor |
Maximum resident threads per multi-processor. | |
int | hipDeviceProp_t::computeMode |
Compute mode. | |
int | hipDeviceProp_t::clockInstructionRate |
hipDeviceArch_t | hipDeviceProp_t::arch |
Architectural feature flags. New for HIP. | |
int | hipDeviceProp_t::concurrentKernels |
Device can possibly execute multiple kernels concurrently. | |
int | hipDeviceProp_t::pciDomainID |
PCI Domain ID. | |
int | hipDeviceProp_t::pciBusID |
PCI Bus ID. | |
int | hipDeviceProp_t::pciDeviceID |
PCI Device ID. | |
size_t | hipDeviceProp_t::maxSharedMemoryPerMultiProcessor |
Maximum Shared Memory Per Multiprocessor. | |
int | hipDeviceProp_t::isMultiGpuBoard |
1 if device is on a multi-GPU board, 0 if not. | |
int | hipDeviceProp_t::canMapHostMemory |
Check whether HIP can map host memory. | |
int | hipDeviceProp_t::gcnArch |
DEPRECATED: use gcnArchName instead. | |
char | hipDeviceProp_t::gcnArchName [256] |
AMD GCN Arch Name. | |
int | hipDeviceProp_t::integrated |
Whether the device is integrated (APU) rather than discrete (dGPU). | |
int | hipDeviceProp_t::cooperativeLaunch |
HIP device supports cooperative launch. | |
int | hipDeviceProp_t::cooperativeMultiDeviceLaunch |
HIP device supports cooperative launch on multiple devices. | |
int | hipDeviceProp_t::maxTexture1DLinear |
Maximum size for 1D textures bound to linear memory. | |
int | hipDeviceProp_t::maxTexture1D |
Maximum number of elements in 1D images. | |
int | hipDeviceProp_t::maxTexture2D [2] |
Maximum dimensions (width, height) of 2D images, in image elements. | |
int | hipDeviceProp_t::maxTexture3D [3] |
Maximum dimensions (width, height, depth) of 3D images, in image elements. | |
unsigned int * | hipDeviceProp_t::hdpMemFlushCntl |
Address of HDP_MEM_COHERENCY_FLUSH_CNTL register. | |
unsigned int * | hipDeviceProp_t::hdpRegFlushCntl |
Address of HDP_REG_COHERENCY_FLUSH_CNTL register. | |
size_t | hipDeviceProp_t::memPitch |
Maximum pitch in bytes allowed by memory copies. | |
size_t | hipDeviceProp_t::textureAlignment |
Alignment requirement for textures. | |
size_t | hipDeviceProp_t::texturePitchAlignment |
Pitch alignment requirement for texture references bound to pitched memory. | |
int | hipDeviceProp_t::kernelExecTimeoutEnabled |
Whether a run time limit is enforced for kernels executed on the device. | |
int | hipDeviceProp_t::ECCEnabled |
Device has ECC support enabled. | |
int | hipDeviceProp_t::tccDriver |
1 if the device is a Tesla device using the TCC driver, else 0. | |
int | hipDeviceProp_t::cooperativeMultiDeviceUnmatchedFunc |
int | hipDeviceProp_t::cooperativeMultiDeviceUnmatchedGridDim |
int | hipDeviceProp_t::cooperativeMultiDeviceUnmatchedBlockDim |
int | hipDeviceProp_t::cooperativeMultiDeviceUnmatchedSharedMem |
int | hipDeviceProp_t::isLargeBar |
1 if the device is a large PCI BAR device, else 0. | |
int | hipDeviceProp_t::asicRevision |
Revision of the GPU in this device. | |
int | hipDeviceProp_t::managedMemory |
Device supports allocating managed memory on this system. | |
int | hipDeviceProp_t::directManagedMemAccessFromHost |
Host can directly access managed memory on the device without migration. | |
int | hipDeviceProp_t::concurrentManagedAccess |
Device can coherently access managed memory concurrently with the CPU. | |
int | hipDeviceProp_t::pageableMemoryAccess |
int | hipDeviceProp_t::pageableMemoryAccessUsesHostPageTables |
Device accesses pageable memory via the host's page tables. | |
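The limits in `maxThreadsPerBlock` and `maxThreadsDim` apply together: a block must fit within each per-dimension maximum and within the total thread count. The sketch below checks both conditions. `BlockLimits` is a hypothetical stand-in for those two `hipDeviceProp_t` fields, and the limit values used with it are illustrative rather than taken from any particular device.

```cpp
// Hypothetical stand-in for two hipDeviceProp_t fields (not the real header).
struct BlockLimits {
    int maxThreadsPerBlock;
    int maxThreadsDim[3];
};

// Returns true when a block of x*y*z threads respects both the
// per-dimension limits and the total threads-per-block limit.
bool blockFits(const BlockLimits& lim, int x, int y, int z) {
    const int dims[3] = {x, y, z};
    long long total = 1;
    for (int i = 0; i < 3; ++i) {
        if (dims[i] < 1 || dims[i] > lim.maxThreadsDim[i]) return false;
        total *= dims[i];
    }
    return total <= lim.maxThreadsPerBlock;
}
```

With assumed limits of 1024 threads per block and dimensions of (1024, 1024, 64), a 256x4x1 block passes, while 32x32x2 fails on the total and 1x1x128 fails on the z dimension.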
enum hipMemoryType | hipPointerAttribute_t::memoryType |
int | hipPointerAttribute_t::device |
void * | hipPointerAttribute_t::devicePointer |
void * | hipPointerAttribute_t::hostPointer |
int | hipPointerAttribute_t::isManaged |
unsigned | hipPointerAttribute_t::allocationFlags |
hipSuccess = 0 | |
Successful completion. | |
hipErrorInvalidValue = 1 | |
hipErrorOutOfMemory = 2 | |
hipErrorMemoryAllocation = 2 | |
Memory allocation error. | |
hipErrorNotInitialized = 3 | |
hipErrorInitializationError = 3 | |
hipErrorDeinitialized = 4 | |
hipErrorProfilerDisabled = 5 | |
hipErrorProfilerNotInitialized = 6 | |
hipErrorProfilerAlreadyStarted = 7 | |
hipErrorProfilerAlreadyStopped = 8 | |
hipErrorInvalidConfiguration = 9 | |
hipErrorInvalidPitchValue = 12 | |
hipErrorInvalidSymbol = 13 | |
hipErrorInvalidDevicePointer = 17 | |
Invalid Device Pointer. | |
hipErrorInvalidMemcpyDirection = 21 | |
Invalid memory copy direction. | |
hipErrorInsufficientDriver = 35 | |
hipErrorMissingConfiguration = 52 | |
hipErrorPriorLaunchFailure = 53 | |
hipErrorInvalidDeviceFunction = 98 | |
hipErrorNoDevice = 100 | |
Call to hipGetDeviceCount returned 0 devices. | |
hipErrorInvalidDevice = 101 | |
DeviceID must be in range 0...#compute-devices. | |
hipErrorInvalidImage = 200 | |
hipErrorInvalidContext = 201 | |
Produced when input context is invalid. | |
hipErrorContextAlreadyCurrent = 202 | |
hipErrorMapFailed = 205 | |
hipErrorMapBufferObjectFailed = 205 | |
Produced when the IPC memory attach failed from ROCr. | |
hipErrorUnmapFailed = 206 | |
hipErrorArrayIsMapped = 207 | |
hipErrorAlreadyMapped = 208 | |
hipErrorNoBinaryForGpu = 209 | |
hipErrorAlreadyAcquired = 210 | |
hipErrorNotMapped = 211 | |
hipErrorNotMappedAsArray = 212 | |
hipErrorNotMappedAsPointer = 213 | |
hipErrorECCNotCorrectable = 214 | |
hipErrorUnsupportedLimit = 215 | |
hipErrorContextAlreadyInUse = 216 | |
hipErrorPeerAccessUnsupported = 217 | |
hipErrorInvalidKernelFile = 218 | |
In CUDA DRV, it is CUDA_ERROR_INVALID_PTX. | |
hipErrorInvalidGraphicsContext = 219 | |
hipErrorInvalidSource = 300 | |
hipErrorFileNotFound = 301 | |
hipErrorSharedObjectSymbolNotFound = 302 | |
hipErrorSharedObjectInitFailed = 303 | |
hipErrorOperatingSystem = 304 | |
hipErrorInvalidHandle = 400 | |
hipErrorInvalidResourceHandle = 400 | |
Resource handle (hipEvent_t or hipStream_t) invalid. | |
hipErrorIllegalState = 401 | |
Resource required is not in a valid state to perform operation. | |
hipErrorNotFound = 500 | |
hipErrorNotReady = 600 | |
hipErrorIllegalAddress = 700 | |
hipErrorLaunchOutOfResources = 701 | |
Out of resources error. | |
hipErrorLaunchTimeOut = 702 | |
hipErrorPeerAccessAlreadyEnabled | |
Peer access was already enabled from the current device. | |
hipErrorPeerAccessNotEnabled | |
Peer access was never enabled from the current device. | |
hipErrorSetOnActiveProcess = 708 | |
hipErrorContextIsDestroyed = 709 | |
hipErrorAssert = 710 | |
Produced when the kernel calls assert. | |
hipErrorHostMemoryAlreadyRegistered | |
Produced when trying to lock a page-locked memory. | |
hipErrorHostMemoryNotRegistered | |
Produced when trying to unlock a non-page-locked memory. | |
hipErrorLaunchFailure | |
An exception occurred on the device while executing a kernel. | |
hipErrorCooperativeLaunchTooLarge | |
hipErrorNotSupported = 801 | |
Produced when the HIP API is not supported/implemented. | |
hipErrorStreamCaptureUnsupported = 900 | |
hipErrorStreamCaptureInvalidated = 901 | |
hipErrorStreamCaptureMerge = 902 | |
hipErrorStreamCaptureUnmatched = 903 | |
The capture was not initiated in this stream. | |
hipErrorStreamCaptureUnjoined = 904 | |
hipErrorStreamCaptureIsolation = 905 | |
hipErrorStreamCaptureImplicit = 906 | |
hipErrorCapturedEvent = 907 | |
hipErrorStreamCaptureWrongThread = 908 | |
hipErrorGraphExecUpdateFailure = 910 | |
hipErrorUnknown = 999 | |
hipErrorRuntimeMemory = 1052 | |
hipErrorRuntimeOther = 1053 | |
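Several entries in the list above share a numeric value (for example `hipErrorOutOfMemory` and `hipErrorMemoryAllocation` are both 2), which matters when codes are handled in a `switch`. The sketch below illustrates this with a hypothetical local enum whose values are copied from the listing; it is not the real `hipError_t`, and the returned strings are illustrative. In real code `hipGetErrorString` supplies the message text.

```cpp
#include <string>

// Hypothetical mirror of a few hipError_t values copied from the listing above.
enum MirrorError {
    kSuccess = 0,
    kInvalidValue = 1,
    kOutOfMemory = 2,
    kMemoryAllocation = 2,  // alias: same value as kOutOfMemory
    kNotReady = 600,
    kUnknown = 999
};

// Because aliases share a value, a switch may name only one of them;
// listing both would be a duplicate-case compile error.
std::string describe(int code) {
    switch (code) {
        case kSuccess:      return "success";
        case kInvalidValue: return "invalid value";
        case kOutOfMemory:  return "out of memory";  // also covers kMemoryAllocation
        case kNotReady:     return "not ready";
        default:            return "unknown";
    }
}
```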
char | hipIpcMemHandle_st::reserved [HIP_IPC_HANDLE_SIZE] |
char | hipIpcEventHandle_st::reserved [HIP_IPC_HANDLE_SIZE] |
int | hipFuncAttributes::binaryVersion |
int | hipFuncAttributes::cacheModeCA |
size_t | hipFuncAttributes::constSizeBytes |
size_t | hipFuncAttributes::localSizeBytes |
int | hipFuncAttributes::maxDynamicSharedSizeBytes |
int | hipFuncAttributes::maxThreadsPerBlock |
int | hipFuncAttributes::numRegs |
int | hipFuncAttributes::preferredShmemCarveout |
int | hipFuncAttributes::ptxVersion |
size_t | hipFuncAttributes::sharedSizeBytes |
uint32_t | dim3::x |
x | |
uint32_t | dim3::y |
y | |
uint32_t | dim3::z |
z | |
void * | hipLaunchParams_t::func |
Device function symbol. | |
dim3 | hipLaunchParams_t::gridDim |
Grid dimensions. | |
dim3 | hipLaunchParams_t::blockDim |
Block dimensions. | |
void ** | hipLaunchParams_t::args |
Arguments. | |
size_t | hipLaunchParams_t::sharedMem |
Shared memory. | |
hipStream_t | hipLaunchParams_t::stream |
Stream identifier. | |
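The `args` member is a `void **`: an array holding one pointer per kernel argument, in declaration order, each pointing at the argument's value. The sketch below shows that packing convention from the reader's side. `LaunchArgs` is a hypothetical stand-in for part of `hipLaunchParams_t`, and the kernel signature `(float* data, int n, float scale)` is assumed purely for illustration.

```cpp
#include <cstddef>

// Hypothetical stand-in for the args/sharedMem part of hipLaunchParams_t.
struct LaunchArgs {
    void** args;            // one pointer per kernel argument, in order
    std::size_t sharedMem;  // dynamic shared memory in bytes
};

// Assumed kernel signature: (float* data, int n, float scale).
// Each slot is read back by position and cast to the argument's own type.
int readCount(const LaunchArgs& p) {
    return *static_cast<int*>(p.args[1]);
}

float readScale(const LaunchArgs& p) {
    return *static_cast<float*>(p.args[2]);
}
```

A caller would pack the arguments as `void* packed[] = {&data, &n, &scale};` and store `packed` in `args`. Because the array holds addresses rather than copies, the pointed-to variables must stay alive until the launch call has consumed them.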
hipExternalMemoryHandleType | hipExternalMemoryHandleDesc_st::type |
union {
  int fd
  struct {
    void * handle
    const void * name
  } win32
} | hipExternalMemoryHandleDesc_st::handle |
unsigned long long | hipExternalMemoryHandleDesc_st::size |
unsigned int | hipExternalMemoryHandleDesc_st::flags |
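The `handle` member is a union, so `type` acts as the tag that says which alternative (`fd` or `win32`) is valid. The sketch below shows one way to fill and read such a descriptor safely. `MemHandleDesc` is a hypothetical mirror of the layout listed above, and the `HandleKind` enumerators are invented for this example; they stand in for `hipExternalMemoryHandleType` and do not reproduce its values.

```cpp
// Hypothetical stand-in for hipExternalMemoryHandleType (enumerators invented here).
enum HandleKind { kOpaqueFd, kOpaqueWin32 };

// Hypothetical mirror of hipExternalMemoryHandleDesc_st (not the real header).
struct MemHandleDesc {
    HandleKind type;
    union {
        int fd;
        struct {
            void* handle;
            const void* name;
        } win32;
    } handle;
    unsigned long long size;
    unsigned int flags;
};

// Build a descriptor for a POSIX file-descriptor handle.
MemHandleDesc fromFd(int fd, unsigned long long sizeBytes) {
    MemHandleDesc d{};   // value-initialize so scalar members start at zero
    d.type = kOpaqueFd;  // tag: the fd alternative is the active one
    d.handle.fd = fd;
    d.size = sizeBytes;
    return d;
}

// Read the fd back only when the tag says the fd alternative is active.
bool tryGetFd(const MemHandleDesc& d, int* out) {
    if (d.type != kOpaqueFd) return false;
    *out = d.handle.fd;
    return true;
}
```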
unsigned long long | hipExternalMemoryBufferDesc_st::offset |
unsigned long long | hipExternalMemoryBufferDesc_st::size |
unsigned int | hipExternalMemoryBufferDesc_st::flags |
hipExternalSemaphoreHandleType | hipExternalSemaphoreHandleDesc_st::type |
union {
  int fd
  struct {
    void * handle
    const void * name
  } win32
} | hipExternalSemaphoreHandleDesc_st::handle |
unsigned int | hipExternalSemaphoreHandleDesc_st::flags |
struct {
  struct {
    unsigned long long value
  } fence
  struct {
    unsigned long long key
  } keyedMutex
  unsigned int reserved [12]
} | hipExternalSemaphoreSignalParams_st::params |
unsigned int | hipExternalSemaphoreSignalParams_st::flags |
struct {
  struct {
    unsigned long long value
  } fence
  struct {
    unsigned long long key
    unsigned int timeoutMs
  } keyedMutex
  unsigned int reserved [10]
} | hipExternalSemaphoreWaitParams_st::params |
unsigned int | hipExternalSemaphoreWaitParams_st::flags |
hipHostFn_t | hipHostNodeParams::fn |
void * | hipHostNodeParams::userData |
dim3 | hipKernelNodeParams::blockDim |
void ** | hipKernelNodeParams::extra |
void * | hipKernelNodeParams::func |
dim3 | hipKernelNodeParams::gridDim |
void ** | hipKernelNodeParams::kernelParams |
unsigned int | hipKernelNodeParams::sharedMemBytes |
void * | hipMemsetParams::dst |
unsigned int | hipMemsetParams::elementSize |
size_t | hipMemsetParams::height |
size_t | hipMemsetParams::pitch |
unsigned int | hipMemsetParams::value |
size_t | hipMemsetParams::width |
int hipDeviceProp_t::clockInstructionRate |
Frequency in kHz of the timer used by the device-side "clock*" instructions. New for HIP.
int hipDeviceProp_t::cooperativeMultiDeviceUnmatchedBlockDim |
HIP device supports cooperative launch on multiple devices with unmatched block dimensions.
int hipDeviceProp_t::cooperativeMultiDeviceUnmatchedFunc |
HIP device supports cooperative launch on multiple devices with unmatched functions.
int hipDeviceProp_t::cooperativeMultiDeviceUnmatchedGridDim |
HIP device supports cooperative launch on multiple devices with unmatched grid dimensions.
int hipDeviceProp_t::cooperativeMultiDeviceUnmatchedSharedMem |
HIP device supports cooperative launch on multiple devices with unmatched shared memory sizes.
hipErrorCapturedEvent = 907 |
The operation is not permitted on an event which was last recorded in a capturing stream.
hipErrorCooperativeLaunchTooLarge |
This error indicates that the number of blocks launched per grid for a kernel that was launched via cooperative launch APIs exceeds the maximum number of allowed blocks for the current device.
hipErrorGraphExecUpdateFailure = 910 |
This error indicates that the graph update was not performed because it included changes that violated constraints specific to instantiated graph update.
hipErrorInvalidValue = 1 |
One or more of the parameters passed to the API call is NULL or not in an acceptable range.
hipErrorNotReady = 600 |
Indicates that asynchronous operations enqueued earlier are not ready. This is not actually an error, but is used to distinguish from hipSuccess (which indicates completion). APIs that return this error include hipEventQuery and hipStreamQuery.
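Since `hipErrorNotReady` signals "still running" rather than failure, polling code has to treat it differently from every other non-zero code. The sketch below shows that distinction. `QueryStatus` is a hypothetical local enum whose values mirror `hipSuccess` (0), `hipErrorNotReady` (600) and `hipErrorUnknown` (999) from this page, and `query` is a stand-in for a wrapper around `hipEventQuery` or `hipStreamQuery`.

```cpp
#include <functional>

// Hypothetical mirror of three hipError_t values listed on this page.
enum QueryStatus { kDone = 0, kPending = 600, kFailed = 999 };

// Poll `query` up to maxPolls times. kPending means "keep waiting";
// any other status (success or a real error) is returned immediately.
// Returns kPending if the work is still outstanding after maxPolls polls.
QueryStatus pollUntilDone(const std::function<QueryStatus()>& query, int maxPolls) {
    for (int i = 0; i < maxPolls; ++i) {
        QueryStatus s = query();
        if (s != kPending) return s;  // completed, or genuinely failed
    }
    return kPending;
}
```

Bounding the number of polls lets the caller interleave other host work or fall back to a blocking wait instead of spinning indefinitely.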
hipErrorRuntimeMemory = 1052 |
HSA runtime memory call returned an error. Typically not seen in production systems.
hipErrorRuntimeOther = 1053 |
HSA runtime call other than memory returned an error. Typically not seen in production systems.
hipErrorStreamCaptureImplicit = 906 |
The operation would have resulted in a disallowed implicit dependency on a current capture sequence from hipStreamLegacy.
hipErrorStreamCaptureInvalidated = 901 |
The current capture sequence on the stream has been invalidated due to a previous error.
hipErrorStreamCaptureIsolation = 905 |
A dependency would have been created which crosses the capture sequence boundary. Only implicit in-stream ordering dependencies are allowed to cross the boundary.
hipErrorStreamCaptureMerge = 902 |
The operation would have resulted in a merge of two independent capture sequences.
hipErrorStreamCaptureUnjoined = 904 |
The capture sequence contains a fork that was not joined to the primary stream.
hipErrorStreamCaptureUnsupported = 900 |
The operation is not permitted when the stream is capturing.
hipErrorStreamCaptureWrongThread = 908 |
A stream capture sequence not initiated with the hipStreamCaptureModeRelaxed argument to hipStreamBeginCapture was passed to hipStreamEndCapture in a different thread.
int hipDeviceProp_t::major |
Major compute capability. On HCC, this is an approximation and features may differ from CUDA CC. See the arch feature flags for portable ways to query feature caps.
int hipDeviceProp_t::minor |
Minor compute capability. On HCC, this is an approximation and features may differ from CUDA CC. See the arch feature flags for portable ways to query feature caps.
int hipDeviceProp_t::pageableMemoryAccess |
Device supports coherently accessing pageable memory without calling hipHostRegister on it.