Description
Profiles ROCm GPU workloads using performance counters and derived metrics. It is useful for developers and researchers who need to understand kernel timing, bottlenecks, occupancy, and hardware behavior in AMD GPU applications.
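To make "derived metrics" concrete, here is a minimal sketch of how raw per-kernel measurements might be turned into derived values. The record fields (start_ns, end_ns, busy_cycles, total_cycles) and the function name derive_metrics are hypothetical and chosen for illustration only. They are not the counter names or report format of any ROCm tool.

```python
from dataclasses import dataclass


@dataclass
class KernelSample:
    """Hypothetical raw measurements for one kernel dispatch."""
    name: str
    start_ns: int      # dispatch start timestamp (assumed field)
    end_ns: int        # dispatch end timestamp (assumed field)
    busy_cycles: int   # cycles the hardware was busy (assumed counter)
    total_cycles: int  # cycles elapsed during the dispatch (assumed counter)


def derive_metrics(samples):
    """Compute derived metrics from raw samples, longest kernel first."""
    total_ns = sum(s.end_ns - s.start_ns for s in samples)
    rows = []
    for s in samples:
        duration_ns = s.end_ns - s.start_ns
        rows.append({
            "name": s.name,
            "duration_us": duration_ns / 1000,
            # Share of the summed kernel time spent in this kernel.
            "time_share_pct": 100 * duration_ns / total_ns if total_ns else 0.0,
            # Ratio of two raw counters, expressed as a percentage.
            "busy_pct": 100 * s.busy_cycles / s.total_cycles if s.total_cycles else 0.0,
        })
    return sorted(rows, key=lambda r: r["duration_us"], reverse=True)


if __name__ == "__main__":
    samples = [
        KernelSample("copy_kernel", 0, 1000, 50, 100),
        KernelSample("gemm_kernel", 1000, 4000, 300, 400),
    ]
    for row in derive_metrics(samples):
        print(row)
```

Sorting by duration puts the kernels that dominate runtime at the top, which is usually where a search for bottlenecks starts.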
Profiling can expose workload structure, model behavior, data sizes, and performance-sensitive details. Use it only on authorized workloads and review reports before sharing them.