2.3. CUPTI Activity API

Functions, types, and enums that implement the CUPTI Activity API.

Classes

struct 
The base activity record.
struct 
The activity record for a driver or runtime API invocation.
struct 
Device auto boost state structure.
struct 
The activity record for source level result branch. (deprecated).
struct 
The activity record for source level result branch.
struct 
The activity record for CDP (CUDA Dynamic Parallelism) kernel.
struct 
The activity record for a context.
struct 
The activity record for a device. (deprecated).
struct 
The activity record for a device. (CUDA 7.0 onwards).
struct 
The activity record for a device attribute.
struct 
The activity record for CUPTI environmental data.
struct 
The activity record for a CUPTI event.
struct 
The activity record for a CUPTI event with instance information.
struct 
The activity record for global/device functions.
struct 
The activity record for source-level global access. (deprecated).
struct 
The activity record for source-level global access.
struct 
The activity record for source-level sass/source line-by-line correlation.
struct 
The activity record for source-level instruction execution.
struct 
The activity record for kernel. (deprecated).
struct 
The activity record for kernel. (deprecated).
struct 
The activity record for a kernel (CUDA 6.5 with sm_52 support onwards).
struct 
The activity record providing a marker which is an instantaneous point in time.
struct 
The activity record providing detailed information for a marker.
struct 
The activity record for memory copies.
struct 
The activity record for peer-to-peer memory copies.
struct 
The activity record for memset.
struct 
The activity record for a CUPTI metric.
struct 
The activity record for a CUPTI metric with instance information. This activity record represents a CUPTI metric value for a specific metric domain instance (CUPTI_ACTIVITY_KIND_METRIC_INSTANCE). This activity record kind is not produced by the activity API but is included for completeness and ease-of-use. Profile frameworks built on top of CUPTI that collect metric data may choose to use this type to store the collected metric data. This activity record should be used when metric domain instance information needs to be associated with the metric.
struct 
The activity record for a CUDA module.
struct 
The activity record providing a name.
union 
Identifiers for object kinds as specified by CUpti_ActivityObjectKind.
struct 
The activity record for CUPTI and driver overheads.
struct 
The activity record for PC sampling.
struct 
PC sampling configuration structure.
struct 
The activity record for record status for PC sampling.
struct 
The activity record for a preemption of a CDP kernel.
struct 
The activity record for source-level shared access.
struct 
The activity record for source locator.
struct 
The activity record for Unified Memory counters (deprecated in CUDA 7.0).
struct 
The activity record for Unified Memory counters (CUDA 7.0 and beyond).
struct 
Unified Memory counters configuration structure.

Defines

#define CUPTI_AUTO_BOOST_INVALID_CLIENT_PID 0
#define CUPTI_CORRELATION_ID_UNKNOWN 0
#define CUPTI_GRID_ID_UNKNOWN 0LL
#define CUPTI_SOURCE_LOCATOR_ID_UNKNOWN 0
#define CUPTI_TIMESTAMP_UNKNOWN 0LL

Typedefs

typedef void ( *CUpti_BuffersCallbackCompleteFunc )( CUcontext context, uint32_t streamId, uint8_t* buffer, size_t size, size_t validSize )
Function type for callback used by CUPTI to return a buffer of activity records.
typedef void ( *CUpti_BuffersCallbackRequestFunc )( uint8_t** buffer, size_t* size, size_t* maxNumRecords )
Function type for callback used by CUPTI to request an empty buffer for storing activity records.

Enumerations

enum CUpti_ActivityAttribute
Activity attributes.
enum CUpti_ActivityComputeApiKind
The kind of a compute API.
enum CUpti_ActivityEnvironmentKind
The kind of environment data. Used to indicate what type of data is being reported by an environment activity record.
enum CUpti_ActivityFlag
Flags associated with activity records.
enum CUpti_ActivityInstructionClass
SASS instruction classification.
enum CUpti_ActivityKind
The kinds of activity records.
enum CUpti_ActivityMemcpyKind
The kind of a memory copy, indicating the source and destination targets of the copy.
enum CUpti_ActivityMemoryKind
The kinds of memory accessed by a memory copy.
enum CUpti_ActivityObjectKind
The kinds of activity objects.
enum CUpti_ActivityOverheadKind
The kinds of activity overhead.
enum CUpti_ActivityPCSamplingPeriod
Sampling period for the PC sampling method. The sampling period can be set using cuptiActivityConfigurePCSampling.
enum CUpti_ActivityPCSamplingStallReason
The stall reason for PC sampling activity.
enum CUpti_ActivityPartitionedGlobalCacheConfig
Partitioned global caching option.
enum CUpti_ActivityPreemptionKind
The kind of a preemption activity.
enum CUpti_ActivityUnifiedMemoryCounterKind
Kind of the Unified Memory counter.
enum CUpti_ActivityUnifiedMemoryCounterScope
Scope of the unified memory counter (deprecated in CUDA 7.0).
enum CUpti_EnvironmentClocksThrottleReason
Reasons for clock throttling.

Functions

CUptiResult cuptiActivityConfigurePCSampling ( CUcontext ctx, CUpti_ActivityPCSamplingConfig* config )
Set PC sampling configuration.
CUptiResult cuptiActivityConfigureUnifiedMemoryCounter ( CUpti_ActivityUnifiedMemoryCounterConfig* config, uint32_t count )
Set Unified Memory Counter configuration.
CUptiResult cuptiActivityDisable ( CUpti_ActivityKind kind )
Disable collection of a specific kind of activity record.
CUptiResult cuptiActivityDisableContext ( CUcontext context, CUpti_ActivityKind kind )
Disable collection of a specific kind of activity record for a context.
CUptiResult cuptiActivityEnable ( CUpti_ActivityKind kind )
Enable collection of a specific kind of activity record.
CUptiResult cuptiActivityEnableContext ( CUcontext context, CUpti_ActivityKind kind )
Enable collection of a specific kind of activity record for a context.
CUptiResult cuptiActivityFlush ( CUcontext context, uint32_t streamId, uint32_t flag )
Wait until all activity records are delivered via the completion callback.
CUptiResult cuptiActivityFlushAll ( uint32_t flag )
Wait until all activity records are delivered via the completion callback.
CUptiResult cuptiActivityGetAttribute ( CUpti_ActivityAttribute attr, size_t* valueSize, void* value )
Read an activity API attribute.
CUptiResult cuptiActivityGetNextRecord ( uint8_t* buffer, size_t validBufferSizeBytes, CUpti_Activity** record )
Iterate over the activity records in a buffer.
CUptiResult cuptiActivityGetNumDroppedRecords ( CUcontext context, uint32_t streamId, size_t* dropped )
Get the number of activity records that were dropped because of insufficient buffer space.
CUptiResult cuptiActivityRegisterCallbacks ( CUpti_BuffersCallbackRequestFunc funcBufferRequested, CUpti_BuffersCallbackCompleteFunc funcBufferCompleted )
Registers callback functions with CUPTI for activity buffer handling.
CUptiResult cuptiActivitySetAttribute ( CUpti_ActivityAttribute attr, size_t* valueSize, void* value )
Write an activity API attribute.
CUptiResult cuptiGetAutoBoostState ( CUcontext context, CUpti_ActivityAutoBoostState* state )
Get auto boost state.
CUptiResult cuptiGetContextId ( CUcontext context, uint32_t* contextId )
Get the ID of a context.
CUptiResult cuptiGetDeviceId ( CUcontext context, uint32_t* deviceId )
Get the ID of a device.
CUptiResult cuptiGetStreamId ( CUcontext context, CUstream stream, uint32_t* streamId )
Get the ID of a stream.
CUptiResult cuptiGetTimestamp ( uint64_t* timestamp )
Get the CUPTI timestamp.

Defines

#define CUPTI_AUTO_BOOST_INVALID_CLIENT_PID 0

An invalid/unknown process id.

#define CUPTI_CORRELATION_ID_UNKNOWN 0

An invalid/unknown correlation ID. A correlation ID of this value indicates that there is no correlation for the activity record.

#define CUPTI_GRID_ID_UNKNOWN 0LL

An invalid/unknown grid ID.

#define CUPTI_SOURCE_LOCATOR_ID_UNKNOWN 0

The source-locator ID that indicates an unknown source location. No actual CUpti_ActivitySourceLocator object corresponds to this value.

#define CUPTI_TIMESTAMP_UNKNOWN 0LL

An invalid/unknown timestamp for a start, end, queued, submitted, or completed time.
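
As an illustration of how the sentinel values above might be used, the following minimal sketch computes a duration only when both timestamps are known. It redefines CUPTI_TIMESTAMP_UNKNOWN locally, with the value documented here, so that it compiles without the CUPTI headers. The helper name duration_if_known is hypothetical and is not part of CUPTI.

```c
#include <stdint.h>

/* Local copy of the documented sentinel, so the sketch builds without cupti.h. */
#ifndef CUPTI_TIMESTAMP_UNKNOWN
#define CUPTI_TIMESTAMP_UNKNOWN 0LL
#endif

/* Hypothetical helper (not a CUPTI function): writes end - start to *out and
 * returns 1 when both timestamps are known and ordered; returns 0 otherwise
 * and leaves *out untouched. */
int duration_if_known(uint64_t start, uint64_t end, uint64_t *out)
{
    if (start == CUPTI_TIMESTAMP_UNKNOWN || end == CUPTI_TIMESTAMP_UNKNOWN) {
        return 0;
    }
    if (end < start) {
        return 0;
    }
    *out = end - start;
    return 1;
}
```

The same pattern could apply to the other sentinels, for example skipping correlation lookups when a record carries CUPTI_CORRELATION_ID_UNKNOWN.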

Typedefs

void ( *CUpti_BuffersCallbackCompleteFunc )( CUcontext context, uint32_t streamId, uint8_t* buffer, size_t size, size_t validSize )

Function type for callback used by CUPTI to return a buffer of activity records. This callback function returns to the CUPTI client a buffer containing activity records. The buffer contains validSize bytes of activity records, which should be read using cuptiActivityGetNextRecord. The number of dropped records can be read using cuptiActivityGetNumDroppedRecords. After this call CUPTI relinquishes ownership of the buffer and will not use it anymore. The client may return the buffer to CUPTI using the CUpti_BuffersCallbackRequestFunc callback.

Note: From CUDA 6.0 onwards, all buffers returned by this callback are global buffers, i.e. there is no context/stream-specific buffer. The user needs to parse the global buffer to extract the context/stream-specific activity records.

Parameters
context
The context this buffer is associated with. If NULL, the buffer is associated with the global activities. This field is deprecated as of CUDA 6.0 and will always be NULL.
uint32_t streamId
buffer
The activity record buffer.
size_t size
size_t validSize
void ( *CUpti_BuffersCallbackRequestFunc )( uint8_t** buffer, size_t* size, size_t* maxNumRecords )

Function type for callback used by CUPTI to request an empty buffer for storing activity records. This callback function signals the CUPTI client that an activity buffer is needed by CUPTI. The activity buffer is used by CUPTI to store activity records. The callback function can decline the request by setting *buffer to NULL. In this case CUPTI may drop activity records.

Parameters
*buffer
size
Returns the size of the returned buffer.
maxNumRecords
Returns the maximum number of records that should be placed in the buffer. If 0 then the buffer is filled with as many records as possible. If > 0 the buffer is filled with at most that many records before it is returned.
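
A minimal sketch of a buffer-request callback, assuming a plain malloc-backed buffer, is shown here. The function name sketch_buffer_requested and the 32 KiB size are assumptions made for illustration; the reference above does not prescribe a buffer size. The signature follows the shape of CUpti_BuffersCallbackRequestFunc, written with standard types so that it compiles without the CUPTI headers.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed buffer size for this sketch; the reference does not prescribe one. */
#define SKETCH_BUFFER_SIZE (32 * 1024)

/* Has the shape of CUpti_BuffersCallbackRequestFunc. If the allocation fails,
 * *buffer is left as NULL, which declines the request as described above. */
void sketch_buffer_requested(uint8_t **buffer, size_t *size, size_t *maxNumRecords)
{
    *buffer = (uint8_t *)malloc(SKETCH_BUFFER_SIZE);
    *size = (*buffer != NULL) ? SKETCH_BUFFER_SIZE : 0;
    *maxNumRecords = 0; /* 0: fill the buffer with as many records as possible */
}
```

In a real client a function like this would be passed as funcBufferRequested to cuptiActivityRegisterCallbacks, and the matching completion callback would free or recycle the buffer once its records have been read.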

Enumerations

enum CUpti_ActivityAttribute

These attributes are used to control the behavior of the activity API.

Values
CUPTI_ACTIVITY_ATTR_DEVICE_BUFFER_SIZE = 0
The device memory size (in bytes) reserved for storing profiling data for non-CDP operations for each buffer on a context. The value is a size_t. A larger buffer size means fewer flush operations but consumes more device memory. A smaller buffer size increases the risk of dropping timestamps for kernel records if too many kernels are launched/replayed at one time. This value only applies to new buffer allocations. Set this value before initializing CUDA or before creating a context to ensure it is considered for the following allocations. The default value is 4194304 (4MB). Note: The actual amount of device memory per buffer reserved by CUPTI might be larger.
CUPTI_ACTIVITY_ATTR_DEVICE_BUFFER_SIZE_CDP = 1
The device memory size (in bytes) reserved for storing profiling data for CDP operations for each buffer on a context. The value is a size_t. A larger buffer size means fewer flush operations but consumes more device memory. This value only applies to new allocations. Set this value before initializing CUDA or before creating a context to ensure it is considered for the following allocations. The default value is 8388608 (8MB). Note: The actual amount of device memory per context reserved by CUPTI might be larger.
CUPTI_ACTIVITY_ATTR_DEVICE_BUFFER_POOL_LIMIT = 2
The maximum number of memory buffers per context. The value is a size_t. Buffers can be reused by the context. Increasing this value reduces the number of times CUPTI needs to flush the buffers. Setting this value will not modify the number of memory buffers currently stored. Set this value before initializing CUDA to ensure the limit is not exceeded. The default value is 4.
CUPTI_ACTIVITY_ATTR_DEVICE_BUFFER_FORCE_INT = 0x7fffffff
enum CUpti_ActivityComputeApiKind

Values
CUPTI_ACTIVITY_COMPUTE_API_UNKNOWN = 0
The compute API is not known.
CUPTI_ACTIVITY_COMPUTE_API_CUDA = 1
The compute APIs are for CUDA.
CUPTI_ACTIVITY_COMPUTE_API_CUDA_MPS = 2
The compute APIs are for CUDA running in MPS (Multi-Process Service) environment.
CUPTI_ACTIVITY_COMPUTE_API_FORCE_INT = 0x7fffffff
enum CUpti_ActivityEnvironmentKind

Values
CUPTI_ACTIVITY_ENVIRONMENT_UNKNOWN = 0
Unknown data.
CUPTI_ACTIVITY_ENVIRONMENT_SPEED = 1
The environment data is related to speed.
CUPTI_ACTIVITY_ENVIRONMENT_TEMPERATURE = 2
The environment data is related to temperature.
CUPTI_ACTIVITY_ENVIRONMENT_POWER = 3
The environment data is related to power.
CUPTI_ACTIVITY_ENVIRONMENT_COOLING = 4
The environment data is related to cooling.
CUPTI_ACTIVITY_ENVIRONMENT_COUNT
CUPTI_ACTIVITY_ENVIRONMENT_KIND_FORCE_INT = 0x7fffffff
enum CUpti_ActivityFlag

Activity record flags. Flags can be combined by bitwise OR to associate multiple flags with an activity record. Each flag is specific to a certain activity kind, as noted below.

Values
CUPTI_ACTIVITY_FLAG_NONE = 0
Indicates the activity record has no flags.
CUPTI_ACTIVITY_FLAG_DEVICE_CONCURRENT_KERNELS = 1<<0
Indicates the activity represents a device that supports concurrent kernel execution. Valid for CUPTI_ACTIVITY_KIND_DEVICE.
CUPTI_ACTIVITY_FLAG_DEVICE_ATTRIBUTE_CUDEVICE = 1<<0
Indicates if the activity represents a CUdevice_attribute value or a CUpti_DeviceAttribute value. Valid for CUPTI_ACTIVITY_KIND_DEVICE_ATTRIBUTE.
CUPTI_ACTIVITY_FLAG_MEMCPY_ASYNC = 1<<0
Indicates the activity represents an asynchronous memcpy operation. Valid for CUPTI_ACTIVITY_KIND_MEMCPY.
CUPTI_ACTIVITY_FLAG_MARKER_INSTANTANEOUS = 1<<0
Indicates the activity represents an instantaneous marker. Valid for CUPTI_ACTIVITY_KIND_MARKER.
CUPTI_ACTIVITY_FLAG_MARKER_START = 1<<1
Indicates the activity represents a region start marker. Valid for CUPTI_ACTIVITY_KIND_MARKER.
CUPTI_ACTIVITY_FLAG_MARKER_END = 1<<2
Indicates the activity represents a region end marker. Valid for CUPTI_ACTIVITY_KIND_MARKER.
CUPTI_ACTIVITY_FLAG_MARKER_COLOR_NONE = 1<<0
Indicates the activity represents a marker that does not specify a color. Valid for CUPTI_ACTIVITY_KIND_MARKER_DATA.
CUPTI_ACTIVITY_FLAG_MARKER_COLOR_ARGB = 1<<1
Indicates the activity represents a marker that specifies a color in alpha-red-green-blue format. Valid for CUPTI_ACTIVITY_KIND_MARKER_DATA.
CUPTI_ACTIVITY_FLAG_GLOBAL_ACCESS_KIND_SIZE_MASK = 0xFF<<0
The number of bytes requested by each thread. Valid for CUpti_ActivityGlobalAccess2.
CUPTI_ACTIVITY_FLAG_GLOBAL_ACCESS_KIND_LOAD = 1<<8
If this bit is set, the access was a load; otherwise it was a store. Valid for CUpti_ActivityGlobalAccess2.
CUPTI_ACTIVITY_FLAG_GLOBAL_ACCESS_KIND_CACHED = 1<<9
If this bit is set, the load access was cached; otherwise it was uncached. Valid for CUpti_ActivityGlobalAccess2.
CUPTI_ACTIVITY_FLAG_METRIC_OVERFLOWED = 1<<0
If this bit is set, the metric value overflowed. Valid for CUpti_ActivityMetric and CUpti_ActivityMetricInstance.
CUPTI_ACTIVITY_FLAG_METRIC_VALUE_INVALID = 1<<1
If this bit is set, the metric value could not be calculated. This occurs when one or more values required to calculate the metric are missing. Valid for CUpti_ActivityMetric and CUpti_ActivityMetricInstance.
CUPTI_ACTIVITY_FLAG_INSTRUCTION_VALUE_INVALID = 1<<0
If this bit is set, the source-level metric value could not be calculated. This occurs when one or more values required to calculate the source-level metric cannot be evaluated. Valid for CUpti_ActivityInstructionExecution.
CUPTI_ACTIVITY_FLAG_INSTRUCTION_CLASS_MASK = 0xFF<<1
The mask for the instruction class, CUpti_ActivityInstructionClass. Valid for CUpti_ActivityInstructionExecution and CUpti_ActivityInstructionCorrelation.
CUPTI_ACTIVITY_FLAG_FLUSH_FORCED = 1<<0
When calling cuptiActivityFlushAll, this flag can be set to force CUPTI to flush all records in the buffer, whether finished or not.
CUPTI_ACTIVITY_FLAG_SHARED_ACCESS_KIND_SIZE_MASK = 0xFF<<0
The number of bytes requested by each thread. Valid for CUpti_ActivitySharedAccess.
CUPTI_ACTIVITY_FLAG_SHARED_ACCESS_KIND_LOAD = 1<<8
If this bit is set, the access was a load; otherwise it was a store. Valid for CUpti_ActivitySharedAccess.
CUPTI_ACTIVITY_FLAG_FORCE_INT = 0x7fffffff
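
Because several flags in this enumeration share the same bit values, a flags word can only be interpreted once the kind of the record is known. As a rough sketch, the following code decodes the flags of a global-access record using the three CUPTI_ACTIVITY_FLAG_GLOBAL_ACCESS_KIND_* values listed above. The constants are local copies so that the example compiles without the CUPTI headers; the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Local copies of the documented flag values, so the sketch builds without cupti.h. */
enum {
    SKETCH_GLOBAL_ACCESS_KIND_SIZE_MASK = 0xFF << 0,
    SKETCH_GLOBAL_ACCESS_KIND_LOAD      = 1 << 8,
    SKETCH_GLOBAL_ACCESS_KIND_CACHED    = 1 << 9
};

/* Hypothetical result type for this sketch. */
typedef struct {
    uint32_t bytesPerThread; /* low 8 bits of the flags word */
    int isLoad;              /* 1 for a load, 0 for a store */
    int isCached;            /* only reported for loads */
} SketchGlobalAccessInfo;

SketchGlobalAccessInfo sketch_decode_global_access(uint32_t flags)
{
    SketchGlobalAccessInfo info;
    info.bytesPerThread = flags & SKETCH_GLOBAL_ACCESS_KIND_SIZE_MASK;
    info.isLoad = (flags & SKETCH_GLOBAL_ACCESS_KIND_LOAD) != 0;
    /* The cached bit is documented for load accesses, so stores report 0 here. */
    info.isCached = info.isLoad && (flags & SKETCH_GLOBAL_ACCESS_KIND_CACHED) != 0;
    return info;
}
```

Reporting isCached as 0 for stores is a choice made in this sketch, based on the description of the cached bit as applying to load accesses.
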
enum CUpti_ActivityInstructionClass

SASS instructions are broadly divided into different classes. Each enum value represents one classification.

Values
CUPTI_ACTIVITY_INSTRUCTION_CLASS_UNKNOWN = 0
The instruction class is not known.
CUPTI_ACTIVITY_INSTRUCTION_CLASS_FP_32 = 1
Represents a 32 bit floating point operation.
CUPTI_ACTIVITY_INSTRUCTION_CLASS_FP_64 = 2
Represents a 64 bit floating point operation.
CUPTI_ACTIVITY_INSTRUCTION_CLASS_INTEGER = 3
Represents an integer operation.
CUPTI_ACTIVITY_INSTRUCTION_CLASS_BIT_CONVERSION = 4
Represents a bit conversion operation.
CUPTI_ACTIVITY_INSTRUCTION_CLASS_CONTROL_FLOW = 5
Represents a control flow instruction.
CUPTI_ACTIVITY_INSTRUCTION_CLASS_GLOBAL = 6
Represents a global load-store instruction.
CUPTI_ACTIVITY_INSTRUCTION_CLASS_SHARED = 7
Represents a shared load-store instruction.
CUPTI_ACTIVITY_INSTRUCTION_CLASS_LOCAL = 8
Represents a local load-store instruction.
CUPTI_ACTIVITY_INSTRUCTION_CLASS_GENERIC = 9
Represents a generic load-store instruction.
CUPTI_ACTIVITY_INSTRUCTION_CLASS_SURFACE = 10
Represents a surface load-store instruction.
CUPTI_ACTIVITY_INSTRUCTION_CLASS_CONSTANT = 11
Represents a constant load instruction.
CUPTI_ACTIVITY_INSTRUCTION_CLASS_TEXTURE = 12
Represents a texture load-store instruction.
CUPTI_ACTIVITY_INSTRUCTION_CLASS_GLOBAL_ATOMIC = 13
Represents a global atomic instruction.
CUPTI_ACTIVITY_INSTRUCTION_CLASS_SHARED_ATOMIC = 14
Represents a shared atomic instruction.
CUPTI_ACTIVITY_INSTRUCTION_CLASS_SURFACE_ATOMIC = 15
Represents a surface atomic instruction.
CUPTI_ACTIVITY_INSTRUCTION_CLASS_INTER_THREAD_COMMUNICATION = 16
Represents an inter-thread communication instruction.
CUPTI_ACTIVITY_INSTRUCTION_CLASS_BARRIER = 17
Represents a barrier instruction.
CUPTI_ACTIVITY_INSTRUCTION_CLASS_MISCELLANEOUS = 18
Represents some miscellaneous instructions which do not fit in the above classification.
CUPTI_ACTIVITY_INSTRUCTION_CLASS_KIND_FORCE_INT = 0x7fffffff
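
The instruction class of a record is carried in its flags field under CUPTI_ACTIVITY_FLAG_INSTRUCTION_CLASS_MASK (0xFF<<1). The sketch below shows one way the class might be extracted. The shift by one bit is an assumption derived from the position of the mask and is not stated explicitly in this reference; the macro and function names are hypothetical local stand-ins.

```c
#include <stdint.h>

/* Local copy of the documented mask value, so the sketch builds without cupti.h. */
#define SKETCH_INSTRUCTION_CLASS_MASK  (0xFFu << 1)
/* Assumption: the class occupies the masked bits, starting at the mask's lowest bit. */
#define SKETCH_INSTRUCTION_CLASS_SHIFT 1

/* Returns the numeric instruction class stored in a flags word. */
uint32_t sketch_instruction_class(uint32_t flags)
{
    return (flags & SKETCH_INSTRUCTION_CLASS_MASK) >> SKETCH_INSTRUCTION_CLASS_SHIFT;
}
```

Under that assumption, a flags value of (6<<1)|1 would decode to class 6 (CUPTI_ACTIVITY_INSTRUCTION_CLASS_GLOBAL), with bit 0 left for CUPTI_ACTIVITY_FLAG_INSTRUCTION_VALUE_INVALID.
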
enum CUpti_ActivityKind
Values
CUPTI_ACTIVITY_KIND_INVALID = 0
The activity record is invalid.
CUPTI_ACTIVITY_KIND_MEMCPY = 1
A host<->host, host<->device, or device<->device memory copy. The corresponding activity record structure is CUpti_ActivityMemcpy.
CUPTI_ACTIVITY_KIND_MEMSET = 2
A memory set executing on the GPU. The corresponding activity record structure is CUpti_ActivityMemset.
CUPTI_ACTIVITY_KIND_KERNEL = 3
A kernel executing on the GPU. The corresponding activity record structure is CUpti_ActivityKernel3.
CUPTI_ACTIVITY_KIND_DRIVER = 4
A CUDA driver API function execution. The corresponding activity record structure is CUpti_ActivityAPI.
CUPTI_ACTIVITY_KIND_RUNTIME = 5
A CUDA runtime API function execution. The corresponding activity record structure is CUpti_ActivityAPI.
CUPTI_ACTIVITY_KIND_EVENT = 6
An event value. The corresponding activity record structure is CUpti_ActivityEvent.
CUPTI_ACTIVITY_KIND_METRIC = 7
A metric value. The corresponding activity record structure is CUpti_ActivityMetric.
CUPTI_ACTIVITY_KIND_DEVICE = 8
Information about a device. The corresponding activity record structure is CUpti_ActivityDevice2.
CUPTI_ACTIVITY_KIND_CONTEXT = 9
Information about a context. The corresponding activity record structure is CUpti_ActivityContext.
CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL = 10
A (potentially concurrent) kernel executing on the GPU. The corresponding activity record structure is CUpti_ActivityKernel3.
CUPTI_ACTIVITY_KIND_NAME = 11
Thread, device, context, etc. name. The corresponding activity record structure is CUpti_ActivityName.
CUPTI_ACTIVITY_KIND_MARKER = 12
Instantaneous, start, or end marker. The corresponding activity record structure is CUpti_ActivityMarker.
CUPTI_ACTIVITY_KIND_MARKER_DATA = 13
Extended, optional, data about a marker. The corresponding activity record structure is CUpti_ActivityMarkerData.
CUPTI_ACTIVITY_KIND_SOURCE_LOCATOR = 14
Source information about source level result. The corresponding activity record structure is CUpti_ActivitySourceLocator.
CUPTI_ACTIVITY_KIND_GLOBAL_ACCESS = 15
Results for source-level global access. The corresponding activity record structure is CUpti_ActivityGlobalAccess2.
CUPTI_ACTIVITY_KIND_BRANCH = 16
Results for source-level branch. The corresponding activity record structure is CUpti_ActivityBranch2.
CUPTI_ACTIVITY_KIND_OVERHEAD = 17
Overhead activity records. The corresponding activity record structure is CUpti_ActivityOverhead.
CUPTI_ACTIVITY_KIND_CDP_KERNEL = 18
A CDP (CUDA Dynamic Parallelism) kernel executing on the GPU. The corresponding activity record structure is CUpti_ActivityCdpKernel. This activity cannot be directly enabled or disabled. It is enabled and disabled through the concurrent kernel activity, i.e. CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL.
CUPTI_ACTIVITY_KIND_PREEMPTION = 19
Preemption activity record indicating a preemption of a CDP (CUDA Dynamic Parallelism) kernel executing on the GPU. The corresponding activity record structure is CUpti_ActivityPreemption.
CUPTI_ACTIVITY_KIND_ENVIRONMENT = 20
Environment activity records indicating power, clock, thermal, etc. levels of the GPU. The corresponding activity record structure is CUpti_ActivityEnvironment.
CUPTI_ACTIVITY_KIND_EVENT_INSTANCE = 21
An event value associated with a specific event domain instance. The corresponding activity record structure is CUpti_ActivityEventInstance.
CUPTI_ACTIVITY_KIND_MEMCPY2 = 22
A peer to peer memory copy. The corresponding activity record structure is CUpti_ActivityMemcpy2.
CUPTI_ACTIVITY_KIND_METRIC_INSTANCE = 23
A metric value associated with a specific metric domain instance. The corresponding activity record structure is CUpti_ActivityMetricInstance.
CUPTI_ACTIVITY_KIND_INSTRUCTION_EXECUTION = 24
Results for source-level instruction execution. The corresponding activity record structure is CUpti_ActivityInstructionExecution.
CUPTI_ACTIVITY_KIND_UNIFIED_MEMORY_COUNTER = 25
Unified Memory counter record. The corresponding activity record structure is CUpti_ActivityUnifiedMemoryCounter2.
CUPTI_ACTIVITY_KIND_FUNCTION = 26
Device global/function record. The corresponding activity record structure is CUpti_ActivityFunction.
CUPTI_ACTIVITY_KIND_MODULE = 27
CUDA Module record. The corresponding activity record structure is CUpti_ActivityModule.
CUPTI_ACTIVITY_KIND_DEVICE_ATTRIBUTE = 28
A device attribute value. The corresponding activity record structure is CUpti_ActivityDeviceAttribute.
CUPTI_ACTIVITY_KIND_SHARED_ACCESS = 29
Results for source-level shared access. The corresponding activity record structure is CUpti_ActivitySharedAccess.
CUPTI_ACTIVITY_KIND_PC_SAMPLING = 30
Enable PC sampling for kernels. This will serialize kernels. The corresponding activity record structure is CUpti_ActivityPCSampling.
CUPTI_ACTIVITY_KIND_PC_SAMPLING_RECORD_INFO = 31
Summary information about PC sampling records. The corresponding activity record structure is CUpti_ActivityPCSamplingRecordInfo.
CUPTI_ACTIVITY_KIND_INSTRUCTION_CORRELATION = 32
SASS/source line-by-line correlation record. This will generate SASS/source correlation for functions that have source-level analysis or PC sampling results. The records will be generated only when either source-level analysis or PC sampling activity is enabled. The corresponding activity record structure is CUpti_ActivityInstructionCorrelation.
CUPTI_ACTIVITY_KIND_FORCE_INT = 0x7fffffff
enum CUpti_ActivityMemcpyKind

Each kind represents the source and destination targets of a memory copy. Targets are host, device, and array.

Values
CUPTI_ACTIVITY_MEMCPY_KIND_UNKNOWN = 0
The memory copy kind is not known.
CUPTI_ACTIVITY_MEMCPY_KIND_HTOD = 1
A host to device memory copy.
CUPTI_ACTIVITY_MEMCPY_KIND_DTOH = 2
A device to host memory copy.
CUPTI_ACTIVITY_MEMCPY_KIND_HTOA = 3
A host to device array memory copy.
CUPTI_ACTIVITY_MEMCPY_KIND_ATOH = 4
A device array to host memory copy.
CUPTI_ACTIVITY_MEMCPY_KIND_ATOA = 5
A device array to device array memory copy.
CUPTI_ACTIVITY_MEMCPY_KIND_ATOD = 6
A device array to device memory copy.
CUPTI_ACTIVITY_MEMCPY_KIND_DTOA = 7
A device to device array memory copy.
CUPTI_ACTIVITY_MEMCPY_KIND_DTOD = 8
A device to device memory copy on the same device.
CUPTI_ACTIVITY_MEMCPY_KIND_HTOH = 9
A host to host memory copy.
CUPTI_ACTIVITY_MEMCPY_KIND_PTOP = 10
A peer to peer memory copy across different devices.
CUPTI_ACTIVITY_MEMCPY_KIND_FORCE_INT = 0x7fffffff
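
When printing memcpy records it is often convenient to map the kind to a short label. The following sketch does this with a lookup table indexed by the enumerator values listed above (0 through 10). The function name and the label strings are arbitrary choices made for this example and are not defined by CUPTI.

```c
#include <stddef.h>

/* Hypothetical helper: maps a memcpy kind value to a short label.
 * The table is indexed by the documented enumerator values 0..10. */
const char *sketch_memcpy_kind_name(int kind)
{
    static const char *const names[] = {
        "Unknown", "HtoD", "DtoH", "HtoA", "AtoH", "AtoA",
        "AtoD", "DtoA", "DtoD", "HtoH", "PtoP"
    };
    const size_t count = sizeof(names) / sizeof(names[0]);

    /* Anything outside the table, including the FORCE_INT value, is "Unknown". */
    if (kind < 0 || (size_t)kind >= count) {
        return "Unknown";
    }
    return names[kind];
}
```
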
enum CUpti_ActivityMemoryKind

Each kind represents the type of the memory accessed by a memory copy.

Values
CUPTI_ACTIVITY_MEMORY_KIND_UNKNOWN = 0
The memory kind is unknown.
CUPTI_ACTIVITY_MEMORY_KIND_PAGEABLE = 1
The memory is pageable.
CUPTI_ACTIVITY_MEMORY_KIND_PINNED = 2
The memory is pinned.
CUPTI_ACTIVITY_MEMORY_KIND_DEVICE = 3
The memory is on the device.
CUPTI_ACTIVITY_MEMORY_KIND_ARRAY = 4
The memory is an array.
CUPTI_ACTIVITY_MEMORY_KIND_FORCE_INT = 0x7fffffff
enum CUpti_ActivityObjectKind
Values
CUPTI_ACTIVITY_OBJECT_UNKNOWN = 0
The object kind is not known.
CUPTI_ACTIVITY_OBJECT_PROCESS = 1
A process.
CUPTI_ACTIVITY_OBJECT_THREAD = 2
A thread.
CUPTI_ACTIVITY_OBJECT_DEVICE = 3
A device.
CUPTI_ACTIVITY_OBJECT_CONTEXT = 4
A context.
CUPTI_ACTIVITY_OBJECT_STREAM = 5
A stream.
CUPTI_ACTIVITY_OBJECT_FORCE_INT = 0x7fffffff
enum CUpti_ActivityOverheadKind

Values
CUPTI_ACTIVITY_OVERHEAD_UNKNOWN = 0
The overhead kind is not known.
CUPTI_ACTIVITY_OVERHEAD_DRIVER_COMPILER = 1
Compiler (JIT) overhead.
CUPTI_ACTIVITY_OVERHEAD_CUPTI_BUFFER_FLUSH = 1<<16
Activity buffer flush overhead.
CUPTI_ACTIVITY_OVERHEAD_CUPTI_INSTRUMENTATION = 2<<16
CUPTI instrumentation overhead.
CUPTI_ACTIVITY_OVERHEAD_CUPTI_RESOURCE = 3<<16
CUPTI resource creation and destruction overhead.
CUPTI_ACTIVITY_OVERHEAD_FORCE_INT = 0x7fffffff
enum CUpti_ActivityPCSamplingPeriod

Values
CUPTI_ACTIVITY_PC_SAMPLING_PERIOD_INVALID = 0
The PC sampling period is not set.
CUPTI_ACTIVITY_PC_SAMPLING_PERIOD_MIN = 1
Minimum sampling period available on the device.
CUPTI_ACTIVITY_PC_SAMPLING_PERIOD_LOW = 2
Sampling period in lower range.
CUPTI_ACTIVITY_PC_SAMPLING_PERIOD_MID = 3
Medium sampling period.
CUPTI_ACTIVITY_PC_SAMPLING_PERIOD_HIGH = 4
Sampling period in higher range.
CUPTI_ACTIVITY_PC_SAMPLING_PERIOD_MAX = 5
Maximum sampling period available on the device.
CUPTI_ACTIVITY_PC_SAMPLING_PERIOD_FORCE_INT = 0x7fffffff
enum CUpti_ActivityPCSamplingStallReason

Values
CUPTI_ACTIVITY_PC_SAMPLING_STALL_INVALID = 0
Invalid reason.
CUPTI_ACTIVITY_PC_SAMPLING_STALL_NONE = 1
No stall; the instruction is selected for issue.
CUPTI_ACTIVITY_PC_SAMPLING_STALL_INST_FETCH = 2
Warp is blocked because the next instruction is not yet available, because of an instruction cache miss, or because of branching effects.
CUPTI_ACTIVITY_PC_SAMPLING_STALL_EXEC_DEPENDENCY = 3
Instruction is waiting on an arithmetic dependency.
CUPTI_ACTIVITY_PC_SAMPLING_STALL_MEMORY_DEPENDENCY = 4
Warp is blocked because it is waiting for a memory access to complete.
CUPTI_ACTIVITY_PC_SAMPLING_STALL_TEXTURE = 5
Texture sub-system is fully utilized or has too many outstanding requests.
CUPTI_ACTIVITY_PC_SAMPLING_STALL_SYNC = 6
Warp is blocked because it is waiting at __syncthreads() or at a memory barrier.
CUPTI_ACTIVITY_PC_SAMPLING_STALL_CONSTANT_MEMORY_DEPENDENCY = 7
Warp is blocked waiting for __constant__ memory and immediate memory access to complete.
CUPTI_ACTIVITY_PC_SAMPLING_STALL_PIPE_BUSY = 8
Compute operation cannot be performed due to the required resources not being available.
CUPTI_ACTIVITY_PC_SAMPLING_STALL_MEMORY_THROTTLE = 9
Warp is blocked because there are too many pending memory operations. On the Kepler architecture this often indicates a high number of memory replays.
CUPTI_ACTIVITY_PC_SAMPLING_STALL_NOT_SELECTED = 10
Warp was ready to issue, but some other warp issued instead.
CUPTI_ACTIVITY_PC_SAMPLING_STALL_OTHER = 11
Miscellaneous reasons.
CUPTI_ACTIVITY_PC_SAMPLING_STALL_FORCE_INT = 0x7fffffff
enum CUpti_ActivityPartitionedGlobalCacheConfig

Values
CUPTI_ACTIVITY_PARTITIONED_GLOBAL_CACHE_CONFIG_UNKNOWN = 0
Partitioned global cache config unknown.
CUPTI_ACTIVITY_PARTITIONED_GLOBAL_CACHE_CONFIG_NOT_SUPPORTED = 1
Partitioned global cache not supported.
CUPTI_ACTIVITY_PARTITIONED_GLOBAL_CACHE_CONFIG_OFF = 2
Partitioned global cache config off.
CUPTI_ACTIVITY_PARTITIONED_GLOBAL_CACHE_CONFIG_ON = 3
Partitioned global cache config on.
CUPTI_ACTIVITY_PARTITIONED_GLOBAL_CACHE_CONFIG_FORCE_INT = 0x7fffffff
enum CUpti_ActivityPreemptionKind

Values
CUPTI_ACTIVITY_PREEMPTION_KIND_UNKNOWN = 0
The preemption kind is not known.
CUPTI_ACTIVITY_PREEMPTION_KIND_SAVE = 1
Preemption to save CDP block.
CUPTI_ACTIVITY_PREEMPTION_KIND_RESTORE = 2
Preemption to restore CDP block.
CUPTI_ACTIVITY_PREEMPTION_KIND_FORCE_INT = 0x7fffffff
enum CUpti_ActivityUnifiedMemoryCounterKind

Many activities are associated with the Unified Memory mechanism; among them are transfers from host to device, transfers from device to host, and page faults on the host side.

Values
CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_UNKNOWN = 0
The unified memory counter kind is not known.
CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_BYTES_TRANSFER_HTOD = 1
Number of bytes transferred from host to device.
CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_BYTES_TRANSFER_DTOH = 2
Number of bytes transferred from device to host.
CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_CPU_PAGE_FAULT_COUNT = 3
Number of CPU page faults.
CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_COUNT
CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_FORCE_INT = 0x7fffffff
enum CUpti_ActivityUnifiedMemoryCounterScope

Values
CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_SCOPE_UNKNOWN = 0
The unified memory counter scope is not known.
CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_SCOPE_PROCESS_SINGLE_DEVICE = 1
Collect unified memory counters for a single process on one device.
CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_SCOPE_PROCESS_ALL_DEVICES = 2
Collect unified memory counters for a single process across all devices.
CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_SCOPE_COUNT
CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_SCOPE_FORCE_INT = 0x7fffffff
enum CUpti_EnvironmentClocksThrottleReason

The possible reasons that a clock can be throttled. There can be more than one reason that a clock is being throttled so these types can be combined by bitwise OR. These are used in the clocksThrottleReason field in the Environment Activity Record.

Values
CUPTI_CLOCKS_THROTTLE_REASON_GPU_IDLE = 0x00000001
Nothing is running on the GPU and the clocks are dropping to idle state.
CUPTI_CLOCKS_THROTTLE_REASON_USER_DEFINED_CLOCKS = 0x00000002
The GPU clocks are limited by a user specified limit.
CUPTI_CLOCKS_THROTTLE_REASON_SW_POWER_CAP = 0x00000004
A software power scaling algorithm is reducing the clocks below requested clocks.
CUPTI_CLOCKS_THROTTLE_REASON_HW_SLOWDOWN = 0x00000008
Hardware slowdown to reduce the clock by a factor of two or more is engaged. This is an indicator of one of the following: 1) Temperature is too high, 2) External power brake assertion is being triggered (e.g. by the system power supply), 3) Change in power state.
CUPTI_CLOCKS_THROTTLE_REASON_UNKNOWN = 0x80000000
Some unspecified factor is reducing the clocks.
CUPTI_CLOCKS_THROTTLE_REASON_UNSUPPORTED = 0x40000000
Throttle reason is not supported for this GPU.
CUPTI_CLOCKS_THROTTLE_REASON_NONE = 0x00000000
No clock throttling.
CUPTI_CLOCKS_THROTTLE_REASON_FORCE_INT = 0x7fffffff
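
Since throttle reasons are combined by bitwise OR, a reader of the clocksThrottleReason field has to test each bit separately. The sketch below turns such a bitmask into a '|'-separated list of names. The bit values are local copies of the ones documented above so that the code compiles without the CUPTI headers; the type, table, and function names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* One documented throttle-reason bit and a label for it. */
typedef struct {
    uint32_t bit;
    const char *name;
} SketchThrottleReason;

/* Local copies of the documented bit values (FORCE_INT is not a reason). */
static const SketchThrottleReason kSketchReasons[] = {
    { 0x00000001u, "GPU_IDLE" },
    { 0x00000002u, "USER_DEFINED_CLOCKS" },
    { 0x00000004u, "SW_POWER_CAP" },
    { 0x00000008u, "HW_SLOWDOWN" },
    { 0x40000000u, "UNSUPPORTED" },
    { 0x80000000u, "UNKNOWN" }
};

/* Writes the names of all reasons set in `reasons` into `out` (NUL-terminated,
 * truncated if `out` is too small) and returns how many reasons were set. */
int sketch_describe_throttle(uint32_t reasons, char *out, size_t outSize)
{
    int count = 0;
    size_t i;
    int canWrite = (out != NULL && outSize > 0);

    if (canWrite) {
        out[0] = '\0';
    }
    if (reasons == 0) { /* CUPTI_CLOCKS_THROTTLE_REASON_NONE */
        if (canWrite) {
            snprintf(out, outSize, "NONE");
        }
        return 0;
    }
    for (i = 0; i < sizeof(kSketchReasons) / sizeof(kSketchReasons[0]); ++i) {
        if ((reasons & kSketchReasons[i].bit) != 0) {
            if (canWrite) {
                size_t used = strlen(out);
                snprintf(out + used, outSize - used, "%s%s",
                         count > 0 ? "|" : "", kSketchReasons[i].name);
            }
            ++count;
        }
    }
    return count;
}
```

For example, a value of 0x00000005 would be described as GPU_IDLE|SW_POWER_CAP.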

Functions

CUptiResult cuptiActivityConfigurePCSampling ( CUcontext ctx, CUpti_ActivityPCSamplingConfig* config )
Set PC sampling configuration.
Parameters
ctx
The context
config
A pointer to CUpti_ActivityPCSamplingConfig structure containing PC sampling configuration.
Returns

  • CUPTI_SUCCESS

  • CUPTI_ERROR_INVALID_OPERATION

    if this API is called while a valid event collection method is set.

  • CUPTI_ERROR_INVALID_PARAMETER

    if config is NULL or any parameter in the config structures is not a valid value

  • CUPTI_ERROR_NOT_SUPPORTED

    Indicates that the system/device does not support PC sampling

Description

CUptiResult cuptiActivityConfigureUnifiedMemoryCounter ( CUpti_ActivityUnifiedMemoryCounterConfig* config, uint32_t count )
Set Unified Memory Counter configuration.
Parameters
config
A pointer to CUpti_ActivityUnifiedMemoryCounterConfig structures containing Unified Memory counter configuration.
count
Number of Unified Memory counter configuration structures
Returns

  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_PARAMETER

    if config is NULL or any parameter in the config structures is not a valid value

  • CUPTI_ERROR_UM_PROFILING_NOT_SUPPORTED

    One potential reason is that the platform (OS/arch) does not support the unified memory counters

  • CUPTI_ERROR_UM_PROFILING_NOT_SUPPORTED_ON_DEVICE

    Indicates that the device does not support the unified memory counters

  • CUPTI_ERROR_UM_PROFILING_NOT_SUPPORTED_ON_NON_P2P_DEVICES

    Indicates that multi-GPU configuration without P2P support between any pair of devices does not support the unified memory counters

  • CUPTI_ERROR_UM_PROFILING_NOT_SUPPORTED_WITH_MPS

    Indicates that the Multi-Process Service (MPS) environment does not support the unified memory counters

Description

CUptiResult cuptiActivityDisable ( CUpti_ActivityKind kind )
Disable collection of a specific kind of activity record.
Parameters
kind
The kind of activity record to stop collecting
Returns

  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_KIND

    if the activity kind is not supported

Description

Disable collection of a specific kind of activity record. Multiple kinds can be disabled by calling this function multiple times. By default all activity kinds are disabled for collection.

CUptiResult cuptiActivityDisableContext ( CUcontext context, CUpti_ActivityKind kind )
Disable collection of a specific kind of activity record for a context.
Parameters
context
The context for which activity is to be disabled
kind
The kind of activity record to stop collecting
Returns

  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_KIND

    if the activity kind is not supported

Description

Disable collection of a specific kind of activity record for a context. The setting made by this API supersedes the global settings for activity records. Multiple kinds can be disabled by calling this function multiple times.

CUptiResult cuptiActivityEnable ( CUpti_ActivityKind kind )
Enable collection of a specific kind of activity record.
Parameters
kind
The kind of activity record to collect
Returns

  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_NOT_COMPATIBLE

    if the activity kind cannot be enabled

  • CUPTI_ERROR_INVALID_KIND

    if the activity kind is not supported

Description

Enable collection of a specific kind of activity record. Multiple kinds can be enabled by calling this function multiple times. By default all activity kinds are disabled for collection.

CUptiResult cuptiActivityEnableContext ( CUcontext context, CUpti_ActivityKind kind )
Enable collection of a specific kind of activity record for a context.
Parameters
context
The context for which activity is to be enabled
kind
The kind of activity record to collect
Returns

  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_NOT_COMPATIBLE

    if the activity kind cannot be enabled

  • CUPTI_ERROR_INVALID_KIND

    if the activity kind is not supported

Description

Enable collection of a specific kind of activity record for a context. The setting made by this API supersedes the global settings for activity records enabled by cuptiActivityEnable. Multiple kinds can be enabled by calling this function multiple times.

CUptiResult cuptiActivityFlush ( CUcontext context, uint32_t streamId, uint32_t flag )
Wait until all activity records are delivered via the completion callback.
Parameters
context
A valid CUcontext or NULL.
streamId
The stream ID.
flag
The flag can be set to indicate a forced flush. See CUpti_ActivityFlag
Returns

  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_OPERATION

    if not preceded by a successful call to cuptiActivityRegisterCallbacks

  • CUPTI_ERROR_UNKNOWN

    an internal error occurred

Description

This function does not return until all activity records associated with the specified context/stream are returned to the CUPTI client using the callback registered in cuptiActivityRegisterCallbacks. To ensure that all activity records are complete, the requested stream(s), if any, are synchronized.

If context is NULL, the global activity records (i.e. those not associated with a particular stream) are flushed (in this case no streams are synchronized). If context is a valid CUcontext and streamId is 0, the buffers of all streams of this context are flushed. Otherwise, the buffers of the specified stream in this context are flushed.

Before calling this function, the buffer handling callback api must be activated by calling cuptiActivityRegisterCallbacks.

**DEPRECATED** This method is deprecated; context and streamId are ignored. Use cuptiActivityFlushAll to flush all data.

CUptiResult cuptiActivityFlushAll ( uint32_t flag )
Wait until all activity records are delivered via the completion callback.
Parameters
flag
The flag can be set to indicate a forced flush. See CUpti_ActivityFlag
Returns

  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_OPERATION

    if not preceded by a successful call to cuptiActivityRegisterCallbacks

  • CUPTI_ERROR_UNKNOWN

    an internal error occurred

Description

This function does not return until all activity records associated with all contexts/streams (and the global buffers not associated with any stream) are returned to the CUPTI client using the callback registered in cuptiActivityRegisterCallbacks. To ensure that all activity records are complete, the requested stream(s), if any, are synchronized.

Before calling this function, the buffer handling callback api must be activated by calling cuptiActivityRegisterCallbacks.

CUptiResult cuptiActivityGetAttribute ( CUpti_ActivityAttribute attr, size_t* valueSize, void* value )
Read an activity API attribute.
Parameters
attr
The attribute to read
valueSize
On input, the size in bytes of the buffer pointed to by value; on return, the number of bytes written to value
value
Returns the value of the attribute
Returns

  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_PARAMETER

    if valueSize or value is NULL, or if attr is not an activity attribute

  • CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT

    Indicates that the value buffer is too small to hold the attribute value.

Description

Read an activity API attribute and return it in *value.
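
The valueSize parameter is both an input and an output, which is easy to get wrong on the caller side. The sketch below illustrates only that calling convention. It deliberately does not call CUPTI: demo_get_attribute, DemoResult, and the stored value are local stand-ins invented for this example, and the real function and attribute types should be taken from the CUPTI header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins so the sketch builds without CUPTI. NOT CUPTI names. */
typedef enum {
    DEMO_SUCCESS = 0,
    DEMO_ERROR_INVALID_PARAMETER,
    DEMO_ERROR_SIZE_NOT_SUFFICIENT
} DemoResult;

/* Hypothetical getter that follows the valueSize contract described above. */
static DemoResult demo_get_attribute(size_t *valueSize, void *value)
{
    const uint64_t stored = 4096u;   /* made-up attribute value */

    if (valueSize == NULL || value == NULL) {
        return DEMO_ERROR_INVALID_PARAMETER;
    }
    if (*valueSize < sizeof stored) {
        return DEMO_ERROR_SIZE_NOT_SUFFICIENT;   /* caller's buffer too small */
    }
    memcpy(value, &stored, sizeof stored);
    *valueSize = sizeof stored;                  /* report bytes written */
    return DEMO_SUCCESS;
}

/* Caller side: set the size before the call, read it back afterwards. */
static uint64_t demo_read_u64(void)
{
    uint64_t value = 0;
    size_t valueSize = sizeof value;
    DemoResult status = demo_get_attribute(&valueSize, &value);

    assert(status == DEMO_SUCCESS && valueSize == sizeof value);
    return value;
}
```

The point of the pattern is that valueSize must be initialized to the capacity of the buffer before every call, since the previous call may have overwritten it with a smaller number.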

CUptiResult cuptiActivityGetNextRecord ( uint8_t* buffer, size_t validBufferSizeBytes, CUpti_Activity** record )
Iterate over the activity records in a buffer.
Parameters
buffer
The buffer containing activity records
validBufferSizeBytes
The number of valid bytes in the buffer.
record
Inputs the previous record returned by cuptiActivityGetNextRecord and returns the next activity record from the buffer. If input value is NULL, returns the first activity record in the buffer. Records of kind CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL may contain invalid (0) timestamps, indicating that no timing information could be collected for lack of device memory.
Returns

  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_MAX_LIMIT_REACHED

    if no more records in the buffer

  • CUPTI_ERROR_INVALID_PARAMETER

    if buffer is NULL.

Description

This is a helper function to iterate over the activity records in a buffer. A buffer of activity records is typically obtained by using the cuptiActivityDequeueBuffer() function or by receiving a CUpti_BuffersCallbackCompleteFunc callback.

An example of typical usage:

CUpti_Activity *record = NULL;
CUptiResult status = CUPTI_SUCCESS;
do {
    status = cuptiActivityGetNextRecord(buffer, validSize, &record);
    if (status == CUPTI_SUCCESS) {
        // Use record here...
    }
    else if (status == CUPTI_ERROR_MAX_LIMIT_REACHED) {
        break;  // no more records in the buffer
    }
    else {
        goto Error;
    }
} while (1);

CUptiResult cuptiActivityGetNumDroppedRecords ( CUcontext context, uint32_t streamId, size_t* dropped )
Get the number of activity records that were dropped because of insufficient buffer space.
Parameters
context
The context, or NULL to get dropped count from global queue
streamId
The stream ID
dropped
The number of records that were dropped since the last call to this function.
Returns

  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_PARAMETER

    if dropped is NULL

Description

Get the number of records that were dropped because of insufficient buffer space. The dropped count includes records that could not be recorded because CUPTI did not have activity buffer space available for the record (because the CUpti_BuffersCallbackRequestFunc callback did not return an empty buffer of sufficient size) and also CDP records that could not be recorded because the device-side buffer was full (size is controlled by the CUPTI_ACTIVITY_ATTR_DEVICE_BUFFER_SIZE_CDP attribute). The dropped count maintained for the queue is reset to zero when this function is called.

CUptiResult cuptiActivityRegisterCallbacks ( CUpti_BuffersCallbackRequestFunc funcBufferRequested, CUpti_BuffersCallbackCompleteFunc funcBufferCompleted )
Registers callback functions with CUPTI for activity buffer handling.
Parameters
funcBufferRequested
callback which is invoked when an empty buffer is requested by CUPTI
funcBufferCompleted
callback which is invoked when a buffer containing activity records is available from CUPTI
Returns

  • CUPTI_SUCCESS

  • CUPTI_ERROR_INVALID_PARAMETER

    if either funcBufferRequested or funcBufferCompleted is NULL

Description

This function registers two callback functions to be used in asynchronous buffer handling. If registered, activity record buffers are handled using asynchronous requested/completed callbacks from CUPTI.

Registering these callbacks prevents the client from using CUPTI's blocking enqueue/dequeue functions.
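
A buffer-request callback typically just allocates memory and hands it to CUPTI. The following is a minimal sketch of such a callback. Its parameter list follows the CUpti_BuffersCallbackRequestFunc typedef as documented elsewhere in the CUPTI reference and should be checked against the installed header. BUF_SIZE and ALIGN_SIZE are assumed values chosen for illustration, and the example uses only the C standard library so that it builds without CUPTI.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define BUF_SIZE   (32 * 1024)  /* assumed buffer size; tune for the workload */
#define ALIGN_SIZE 8            /* assumed alignment requirement */

/* Sketch of a buffer-request callback. It would be passed as
   funcBufferRequested to cuptiActivityRegisterCallbacks. */
static void bufferRequested(uint8_t **buffer, size_t *size, size_t *maxNumRecords)
{
    uint8_t *raw;

    assert(buffer != NULL && size != NULL && maxNumRecords != NULL);

    /* malloc returns storage aligned for any fundamental type, which is
       assumed here to satisfy the 8-byte alignment. */
    raw = (uint8_t *)malloc(BUF_SIZE);
    if (raw == NULL) {
        /* No buffer available: records may be dropped in this case. */
        *buffer = NULL;
        *size = 0;
        *maxNumRecords = 0;
        return;
    }

    *buffer = raw;
    *size = BUF_SIZE;
    *maxNumRecords = 0;   /* 0: fill the buffer with as many records as fit */
}
```

With this design the matching completion callback owns the buffer once CUPTI returns it, and is expected to release it with free after the records have been processed.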

CUptiResult cuptiActivitySetAttribute ( CUpti_ActivityAttribute attr, size_t* valueSize, void* value )
Write an activity API attribute.
Parameters
attr
The attribute to write
valueSize
The size, in bytes, of the value
value
The attribute value to write
Returns

  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_PARAMETER

    if valueSize or value is NULL, or if attr is not an activity attribute

  • CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT

    Indicates that the value buffer is too small to hold the attribute value.

Description

Write an activity API attribute.

CUptiResult cuptiGetAutoBoostState ( CUcontext context, CUpti_ActivityAutoBoostState* state )
Get auto boost state.
Parameters
context
A valid CUcontext.
state
A pointer to CUpti_ActivityAutoBoostState structure which contains the current state and the id of the process that has requested the current state
Returns

  • CUPTI_SUCCESS

  • CUPTI_ERROR_INVALID_PARAMETER

    if context or state is NULL

  • CUPTI_ERROR_NOT_SUPPORTED

    Indicates that the device does not support auto boost

  • CUPTI_ERROR_UNKNOWN

    an internal error occurred

Description

Profiling results can be inconsistent when auto boost is enabled. CUPTI tries to disable auto boost while profiling, but this can fail if the user lacks the required permissions or if the CUDA_AUTO_BOOST environment variable is set. Use this function to query whether auto boost is enabled.

CUptiResult cuptiGetContextId ( CUcontext context, uint32_t* contextId )
Get the ID of a context.
Parameters
context
The context
contextId
Returns a process-unique ID for the context
Returns

  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_CONTEXT

    The context is NULL or not valid.

  • CUPTI_ERROR_INVALID_PARAMETER

    if contextId is NULL

Description

Get the ID of a context.

CUptiResult cuptiGetDeviceId ( CUcontext context, uint32_t* deviceId )
Get the ID of a device.
Parameters
context
The context, or NULL to indicate the current context.
deviceId
Returns the ID of the device that is current for the calling thread.
Returns

  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_DEVICE

    if unable to get device ID

  • CUPTI_ERROR_INVALID_PARAMETER

    if deviceId is NULL

Description

If context is NULL, returns the ID of the device that contains the currently active context. If context is non-NULL, returns the ID of the device which contains that context. Operates in a similar manner to cudaGetDevice() or cuCtxGetDevice() but may be called from within callback functions.

CUptiResult cuptiGetStreamId ( CUcontext context, CUstream stream, uint32_t* streamId )
Get the ID of a stream.
Parameters
context
If non-NULL then the stream is checked to ensure that it belongs to this context. Typically this parameter should be NULL.
stream
The stream
streamId
Returns a context-unique ID for the stream
Returns

  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_STREAM

    if unable to get stream ID, or if context is non-NULL and stream does not belong to the context

  • CUPTI_ERROR_INVALID_PARAMETER

    if streamId is NULL

Description

Get the ID of a stream. The stream ID is unique within a context (i.e. all streams within a context will have unique stream IDs).

See also:

cuptiActivityEnqueueBuffer

cuptiActivityDequeueBuffer
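
Since a stream ID is unique only within its context, while a context ID is unique within the process, a tool that indexes records by stream generally needs both values. The sketch below shows one possible way to combine them into a single 64-bit key. The function names are hypothetical helpers for this example and are not part of CUPTI.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a process-unique context ID and a context-unique stream ID into one
   process-unique 64-bit key (context ID in the high 32 bits). */
static uint64_t make_stream_key(uint32_t contextId, uint32_t streamId)
{
    return ((uint64_t)contextId << 32) | (uint64_t)streamId;
}

/* Recover the two halves of a key built by make_stream_key. */
static uint32_t key_context_id(uint64_t key)
{
    return (uint32_t)(key >> 32);
}

static uint32_t key_stream_id(uint64_t key)
{
    return (uint32_t)(key & 0xffffffffu);
}

/* Round-trip check: the same stream ID in two contexts gives distinct keys. */
void demo_stream_key(void)
{
    uint64_t a = make_stream_key(1u, 7u);
    uint64_t b = make_stream_key(2u, 7u);

    assert(a != b);
    assert(key_context_id(a) == 1u && key_stream_id(a) == 7u);
    assert(key_context_id(b) == 2u && key_stream_id(b) == 7u);
}
```

The IDs passed in would come from cuptiGetContextId and cuptiGetStreamId, or from the corresponding fields of an activity record.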

CUptiResult cuptiGetTimestamp ( uint64_t* timestamp )
Get the CUPTI timestamp.
Parameters
timestamp
Returns the CUPTI timestamp
Returns

  • CUPTI_SUCCESS

  • CUPTI_ERROR_INVALID_PARAMETER

    if timestamp is NULL

Description

Returns a timestamp normalized to correspond with the start and end timestamps reported in the CUPTI activity records. The timestamp is reported in nanoseconds.