Unlocking Performance Insights with NVIDIA Nsight Systems 2021.5

Summary: NVIDIA Nsight Systems 2021.5 is a system-wide performance analysis tool that helps developers optimize applications across CPUs and GPUs. This update adds statistics in the user interface, multi-report views, and expert system analysis of GPU utilization. This article walks through the new features and capabilities of Nsight Systems 2021.5 and how they help developers find performance limiters and scale their applications.

Enhanced Profiling Experience

Nsight Systems 2021.5 is part of the NVIDIA Nsight Tools Suite, a set of debugging and profiling tools. The additions in this release make profiling data easier to summarize and compare, and extend trace coverage for graphics and networking workloads.

Statistics and Multi-Report Views

Nsight Systems 2021.5 adds statistics to the user interface, giving developers a quick summary view for spotting performance bottlenecks. The multi-report view lets users investigate performance issues across server nodes, VMs, containers, ranks, and processes, which makes performance limiters that span more than one report easier to identify and resolve.
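To make the idea of summary statistics concrete, the sketch below aggregates a list of timed events into per-name totals, call counts, averages, and share of total time, sorted so the most expensive entry comes first. It is an illustration only, not how Nsight Systems computes or displays its statistics; the event names and durations are hypothetical.

```python
from collections import defaultdict


def summarize(events):
    """Aggregate (name, duration_ns) events into rows sorted by total time."""
    totals = defaultdict(lambda: [0, 0])  # name -> [total_ns, call_count]
    for name, duration_ns in events:
        totals[name][0] += duration_ns
        totals[name][1] += 1

    grand_total = sum(total for total, _ in totals.values())
    rows = []
    for name, (total, count) in totals.items():
        rows.append({
            "name": name,
            "calls": count,
            "total_ns": total,
            "avg_ns": total / count,
            "time_pct": 100.0 * total / grand_total if grand_total else 0.0,
        })
    # Largest total time first, so the dominant cost is at the top.
    return sorted(rows, key=lambda row: row["total_ns"], reverse=True)


# Hypothetical kernel timings, for illustration only.
events = [
    ("gemm_kernel", 4000),
    ("copy_kernel", 500),
    ("gemm_kernel", 5000),
    ("reduce_kernel", 500),
]
summary = summarize(events)
for row in summary:
    print(row)
```

Sorting by total time rather than average time surfaces entries that are cheap per call but invoked often, which is usually the first thing to check in a summary table.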

Expert System Analysis

Nsight Systems 2021.5 includes an expert system that provides GPU utilization analysis for OpenGL and DirectX 12 applications. This analysis helps developers spot where the GPU is underutilized and tune their applications accordingly.

NVIDIA NIC InfiniBand Metrics Sampling

Nsight Systems 2021.5 introduces NVIDIA NIC InfiniBand metrics sampling, an experimental feature that helps developers understand server communications by reporting throughput, packet counts, and congestion notifications.

DirectX 12 Memory Operations and Warnings

The latest update adds a memory operations row to the DirectX 12 trace. The row highlights memory usage warnings and flags cases where expensive functions are called while resources are non-persistently mapped.

WDDM Trace Correlation

Nsight Systems 2021.5 correlates graphics API calls to WDDM queue packets, showing developers how a workload is created and how it progresses through the Windows Display Driver Model (WDDM).

Scaling Performance Across Platforms

Nsight Systems 2021.5 is designed to help developers scale their applications across a wide range of NVIDIA platforms, from NVIDIA DGX to NVIDIA RTX workstations, including NVIDIA DRIVE for automotive and NVIDIA Jetson for edge AI and robotics.

Visualizing CPU-GPU Interactions

Nsight Systems provides a unified timeline view of CPU and GPU activity, allowing developers to investigate correlations, dependencies, and performance bottlenecks.

Tracking GPU Activity

The tool offers GPU metrics sampling, which plots low-level input/output activity such as PCIe throughput, NVIDIA NVLink traffic, and dynamic random-access memory (DRAM) activity.

Tracing GPU Workloads

Nsight Systems traces CUDA API calls and CUDA libraries, including cuBLAS, cuDNN, and NVIDIA TensorRT.

Optimizing Performance

Beyond tracing, Nsight Systems 2021.5 targets three common optimization scenarios: uneven frame times in graphics applications, Python workloads, and jobs that span multiple nodes.

Detecting Frame Stutter and Bottlenecks

The tool automatically detects two kinds of problem frames: slow frames, whose frame time exceeds a target, and local stutter frames, whose frame time is higher than that of neighboring frames.
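The two categories can be sketched in a few lines. The code below is a minimal illustration, not the detection algorithm Nsight Systems uses: the 16.7 ms target, the two-frame neighborhood, and the 1.5x ratio against the neighbors' median are all assumptions chosen for the example, as are the sample frame times.

```python
from statistics import median


def classify_frames(frame_times_ms, target_ms=16.7, window=2, stutter_ratio=1.5):
    """Return (slow, stutter) lists of frame indices.

    slow:    frame time exceeds target_ms.
    stutter: frame time exceeds stutter_ratio times the median of up to
             `window` neighbors on each side (the frame itself is excluded).
    """
    times = list(frame_times_ms)
    slow, stutter = [], []
    for i, t in enumerate(times):
        if t > target_ms:
            slow.append(i)
        neighbors = times[max(0, i - window):i] + times[i + 1:i + 1 + window]
        if neighbors and t > stutter_ratio * median(neighbors):
            stutter.append(i)
    return slow, stutter


# Hypothetical frame times in milliseconds: one spike, then a sustained slowdown.
frames = [10.0, 10.5, 10.2, 25.0, 10.1, 10.3, 18.0, 18.2, 18.1]
slow, stutter = classify_frames(frames)
print("slow frames:", slow)
print("local stutter frames:", stutter)
```

With this data the spike at index 3 is both slow and a local stutter, while the last three frames are slow but not stutters, because their neighbors are equally slow. That difference is why the two categories are reported separately.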

Python Support

Nsight Systems 2021.5 includes Python support, so developers can profile Python applications and see where GPU utilization falls short.
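One common way to make Python code visible on a profiler timeline is to mark regions with NVTX ranges. The sketch below assumes the optional `nvtx` package (NVIDIA's NVTX bindings for Python) and falls back to a no-op when it is not installed, so the script runs either way. Whether this is what "Python support" covers in 2021.5 is not stated in this article; the function names and data are hypothetical.

```python
import contextlib

try:
    import nvtx  # optional third-party package: pip install nvtx

    def annotate(name):
        """Open a named NVTX range that a profiler can show on its timeline."""
        return nvtx.annotate(name)
except ImportError:
    def annotate(name):
        """Fallback when nvtx is not installed: a context manager that does nothing."""
        return contextlib.nullcontext()


def preprocess(values):
    # Hypothetical CPU-side stage: scale byte values to the range [0, 1].
    with annotate("preprocess"):
        return [v / 255.0 for v in values]


def train_step(batch):
    # Hypothetical stand-in for work that would launch GPU kernels.
    with annotate("train_step"):
        return sum(batch) / len(batch)


loss = train_step(preprocess([0, 51, 102, 255]))
print(loss)
```

Named ranges like these let gaps in GPU activity be matched to the Python stage that was running at the time.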

Multi-Node Profiling

The tool supports multi-node profiling, enabling developers to resolve performance limiters on the scale of data centers and clusters.
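A typical multi-rank setup wraps each MPI rank in its own profiler instance and writes one report per rank. The sketch below only builds such a command line; it does not run anything. It assumes an Open MPI launcher (`mpirun`, with the rank exposed as `OMPI_COMM_WORLD_RANK`) and the `%q{VAR}` environment-variable expansion in the `nsys` output name. The application name and arguments are hypothetical, and flags should be checked against the installed version.

```python
import shlex


def mpi_profile_command(app, ranks, rank_env="OMPI_COMM_WORLD_RANK"):
    """Build an mpirun command that profiles every rank into its own report.

    rank_env is the environment variable holding the rank; the default
    assumes Open MPI and differs for other launchers.
    """
    # nsys expands %q{VAR} to the value of VAR, giving one report per rank.
    output = "report_rank%q{" + rank_env + "}"
    nsys = ["nsys", "profile", "--trace=cuda,mpi,nvtx", "-o", output]
    return ["mpirun", "-np", str(ranks)] + nsys + list(app)


# Hypothetical application and arguments, for illustration only.
cmd = mpi_profile_command(["./my_app", "--steps", "100"], ranks=4)
print(shlex.join(cmd))
```

The resulting per-rank reports are the kind of input the multi-report view is meant to compare side by side.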

Key Features:

  • Statistics and Multi-Report Views: Statistics in the user interface, plus multi-report views across nodes, VMs, containers, ranks, and processes.
  • Expert System Analysis: GPU utilization analysis for OpenGL and DirectX 12 applications.
  • NVIDIA NIC InfiniBand Metrics Sampling: Experimental feature for understanding server communications.
  • DirectX 12 Memory Operations and Warnings: New memory operations row in the DirectX 12 trace.
  • WDDM Trace Correlation: Correlation of graphics API calls to WDDM queue packets.
  • Visualizing CPU-GPU Interactions: Unified timeline view of CPU and GPU activity.
  • Tracking GPU Activity: GPU metrics sampling for low-level input/output activity.
  • Tracing GPU Workloads: Tracing of CUDA API calls and CUDA libraries such as cuBLAS, cuDNN, and NVIDIA TensorRT.
  • Detecting Frame Stutter and Bottlenecks: Automatic detection of slow frames and local stutter frames.
  • Python Support: Profiling of Python applications to find where GPU utilization falls short.
  • Multi-Node Profiling: Support for multi-node profiling to resolve performance limiters.

System Requirements:

  • Operating System: Windows 11, Linux
  • NVIDIA Platforms: NVIDIA DGX, NVIDIA RTX workstations, NVIDIA DRIVE, NVIDIA Jetson
  • CUDA Version: CUDA 11.4 or later
  • Nsight Systems Version: 2021.5 or later

Getting Started:

  1. Download Nsight Systems 2021.5: Visit the NVIDIA Developer website to download the latest version of Nsight Systems.
  2. Install Nsight Systems: Follow the installation instructions to install Nsight Systems on your system.
  3. Launch Nsight Systems: Launch Nsight Systems and start profiling your application.
  4. Analyze Performance: Use the statistics and multi-report views to analyze your application’s performance.
  5. Optimize Performance: Use the expert system analysis and other features to optimize your application’s performance.

By following these steps and using the features of Nsight Systems 2021.5, developers can unlock performance insights and scale their applications efficiently across CPUs and GPUs.
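Steps 3 and 4 above can also be driven from the `nsys` command-line tool that ships with Nsight Systems. The sketch below builds the two commands without running them: `nsys profile` to collect a report and `nsys stats` to print summary statistics from it. The application path, report name, and report file extension are placeholders, and the trace list is only one reasonable starting point.

```python
import shlex


def profile_command(app, report="my_report", traces=("cuda", "nvtx", "osrt")):
    """Build an `nsys profile` command that traces the given APIs."""
    return ["nsys", "profile", "--trace=" + ",".join(traces), "-o", report] + list(app)


def stats_command(report_file):
    """Build an `nsys stats` command that summarizes an existing report."""
    return ["nsys", "stats", report_file]


# Hypothetical application and report names, for illustration only.
profile_cmd = profile_command(["./my_app", "--iterations", "10"])
# The report extension depends on the Nsight Systems version;
# ".nsys-rep" here is an assumption.
stats_cmd = stats_command("my_report.nsys-rep")

print(shlex.join(profile_cmd))
print(shlex.join(stats_cmd))
```

Either list can be passed to `subprocess.run` on a machine where `nsys` is installed. Library traces such as `cublas` and `cudnn` can be added through the `traces` argument when those libraries are in use.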

Conclusion

NVIDIA Nsight Systems 2021.5 gives developers a system-wide view of how their applications use CPUs and GPUs. Statistics in the user interface, multi-report views, expert system analysis, and multi-node profiling make it easier to find performance limiters and to scale applications from a single workstation to a cluster.