Accelerating Leaderboard-Topping ASR Models 10x with NVIDIA NeMo

Summary: NVIDIA NeMo has made significant strides in accelerating automatic speech recognition (ASR) models, achieving speed improvements of up to 10x. This article delves into the key enhancements that enabled these advancements, including autocasting tensors to bfloat16, the innovative label-looping algorithm, and the introduction of CUDA Graphs available with NeMo 2.0.0. Overcoming Speed Performance Bottlenecks: NVIDIA NeMo ASR models previously faced several performance bottlenecks, including casting overheads, low compute intensity, and performance issues caused by divergence....
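To make the first and last of these concrete, here is a minimal PyTorch sketch (not NeMo's actual implementation) of the pattern described: running inference under bfloat16 autocast and capturing a repeated decoding step in a CUDA graph so it can be replayed without per-iteration launch overhead. The toy model and tensor shapes are hypothetical.

```python
import torch

# Hypothetical stand-in for one step of an ASR decoder; real NeMo models are far
# larger, but the autocast + CUDA Graphs pattern is the same.
model = torch.nn.Sequential(
    torch.nn.Linear(512, 512), torch.nn.ReLU(), torch.nn.Linear(512, 512)
).cuda().eval()

static_input = torch.randn(16, 512, device="cuda")

# Warm up on a side stream (recommended before graph capture), autocasting to
# bfloat16 so matmul-heavy ops avoid explicit float32<->bfloat16 casting overhead.
side = torch.cuda.Stream()
side.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(side), torch.no_grad():
    with torch.autocast("cuda", dtype=torch.bfloat16, cache_enabled=False):
        for _ in range(3):
            model(static_input)
torch.cuda.current_stream().wait_stream(side)

# Capture the step once; replaying the graph skips Python and kernel-launch
# overhead on every subsequent decoding iteration.
graph = torch.cuda.CUDAGraph()
with torch.cuda.graph(graph), torch.no_grad():
    with torch.autocast("cuda", dtype=torch.bfloat16, cache_enabled=False):
        static_output = model(static_input)

# Replay with new data copied into the captured input buffer.
static_input.copy_(torch.randn(16, 512, device="cuda"))
graph.replay()
print(static_output.dtype)  # torch.bfloat16
```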

September 24, 2024 · Pablo Escobar

Petrobras Accelerates Linear Solvers with NVIDIA Grace CPU

Unlocking the Power of Reservoir Simulation: How Petrobras and NVIDIA Are Revolutionizing the Energy Sector. Summary: Petrobras, a leading Brazilian energy company, has achieved significant advancements in reservoir simulation by leveraging the NVIDIA Grace CPU. This collaboration has resulted in faster time-to-solution, greater energy efficiency, and higher scalability compared to traditional x86-based CPUs. This article explores the key aspects of this partnership and how it is transforming the energy sector....

September 24, 2024 · Pablo Escobar

Using Generative AI to Enable Robots to Reason and Act with ReMEmbR

Summary: This article explores how robots can be enabled to reason and act using generative AI, specifically through a project called ReMEmbR. ReMEmbR combines large language models (LLMs), vision-language models (VLMs), and retrieval-augmented generation (RAG) to allow robots to reason over long-horizon spatial and temporal memory, enabling them to answer complex queries and perform meaningful actions. Enabling Robots to Reason and Act with ReMEmbR: ReMEmbR leverages generative AI to enable robots to reason and act in complex environments....
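As a rough illustration of the retrieval-augmented generation pattern mentioned above (not ReMEmbR's actual code), the sketch below embeds timestamped memory entries, retrieves the ones closest to a query by cosine similarity, and packs them into an LLM prompt. The embed function and the memory contents are hypothetical placeholders.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical placeholder for a real text/vision-language embedding model."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    vec = rng.standard_normal(256)
    return vec / np.linalg.norm(vec)

# Timestamped memory entries a robot might accumulate while navigating.
memory = [
    ("t=12s", "Passed the elevator near the lobby entrance."),
    ("t=85s", "Saw a charging station in the east corridor."),
    ("t=140s", "Kitchen area observed next to conference room B."),
]
memory_vecs = np.stack([embed(text) for _, text in memory])

def retrieve(query: str, k: int = 2):
    """Return the k memory entries most similar to the query (cosine similarity)."""
    scores = memory_vecs @ embed(query)
    top = np.argsort(scores)[::-1][:k]
    return [memory[i] for i in top]

query = "Where can I charge?"
context = "\n".join(f"[{t}] {text}" for t, text in retrieve(query))
prompt = f"Robot memory:\n{context}\n\nQuestion: {query}\nAnswer with a location."
# The assembled prompt would then be sent to an LLM (call omitted here).
print(prompt)
```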

September 23, 2024 · Tony Redgrave

Advancing the Accuracy-Efficiency Frontier with Llama-3.1-Nemotron-51B

Breaking Down Barriers: The Future of AI Models with Llama-3.1-Nemotron-51B. Summary: NVIDIA’s latest breakthrough, Llama-3.1-Nemotron-51B, revolutionizes the field of large language models (LLMs) by achieving an unprecedented balance between accuracy and efficiency. This model, derived from Meta’s Llama-3.1-70B, employs a novel neural architecture search (NAS) approach to significantly reduce memory footprint and computational requirements while maintaining exceptional accuracy. This article delves into the details of this groundbreaking model and its implications for the future of AI....

September 23, 2024 · Tony Redgrave

AI-Powered 3D Printing Enhances Surgical Preparation

Revolutionizing Surgical Preparation: How AI-Powered 3D Printing Is Changing the Game. Summary: A groundbreaking AI-driven 3D printing technique developed by researchers at Washington State University (WSU) is revolutionizing surgical preparation. This technology allows for the rapid creation of precise replicas of human organs, enabling surgeons to practice complex procedures before performing the actual surgery. This article explores how this innovation is enhancing surgical outcomes and what it means for the future of medical practice....

September 20, 2024 · Pablo Escobar

SLB and NVIDIA Collaborate on Gen AI Solutions for Energy

Unlocking the Power of Generative AI in the Energy Industry. Summary: The energy industry is on the cusp of a significant transformation, driven by the adoption of generative AI solutions. Global energy technology company SLB is collaborating with NVIDIA to develop industry-specific generative AI foundation models. This partnership aims to accelerate the development and deployment of AI-powered solutions across SLB’s global platforms, including its Delfi digital platform and Lumi data and AI platform....

September 20, 2024 · Pablo Escobar

Quickly Voice Your Apps with NVIDIA NIM Microservices for Speech and Translation

Summary: NVIDIA’s NIM microservices are designed to enhance speech and translation capabilities in applications. These microservices leverage NVIDIA Riva to provide automatic speech recognition (ASR), neural machine translation (NMT), and text-to-speech (TTS) functionalities. This article explores how developers can use these microservices to build customer service bots, interactive voice assistants, and multilingual content platforms with minimal development effort. Voice Your Apps with NVIDIA NIM Microservices: NVIDIA NIM microservices are part of the NVIDIA AI Enterprise suite and offer advanced speech and translation features....
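As a taste of what calling such a speech endpoint can look like, here is a minimal sketch using the nvidia-riva-client Python package against a hypothetical locally deployed ASR NIM exposing the Riva gRPC API at localhost:50051; the file name, sample rate, and endpoint address are assumptions, and the exact configuration fields may differ for a given NIM deployment.

```python
import riva.client

# Connect to a hypothetical speech NIM exposing the Riva gRPC API locally.
auth = riva.client.Auth(uri="localhost:50051")
asr = riva.client.ASRService(auth)

config = riva.client.RecognitionConfig(
    encoding=riva.client.AudioEncoding.LINEAR_PCM,
    sample_rate_hertz=16000,          # must match the WAV file (assumption)
    language_code="en-US",
    max_alternatives=1,
    enable_automatic_punctuation=True,
)

# Offline (batch) transcription of a short mono WAV file (hypothetical path).
with open("query.wav", "rb") as f:
    audio_bytes = f.read()
response = asr.offline_recognize(audio_bytes, config)
print(response.results[0].alternatives[0].transcript)
```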

September 18, 2024 · Tony Redgrave

Accelerating Oracle Database Gen AI Workloads with NVIDIA NIM and NVIDIA cuVS

Accelerating Oracle Database Generative AI Workloads with NVIDIA NIM and NVIDIA cuVS. Summary: NVIDIA and Oracle have collaborated to demonstrate how NVIDIA’s accelerated computing platform can enhance the performance of generative AI workloads in Oracle Database. This partnership focuses on accelerating bulk vector embeddings, vector index creation, and large language model (LLM) inference using NVIDIA GPUs and software. The integration of NVIDIA NIM and NVIDIA cuVS with Oracle Database 23ai and Oracle Cloud Infrastructure (OCI) enables enterprises to leverage their structured and unstructured data more effectively, improving the quality and reliability of generative AI outputs....

September 17, 2024 · Tony Redgrave

Orchestrating Innovation at Scale with NVIDIA Maxine and Texel

Summary: NVIDIA Maxine and Texel are revolutionizing virtual interactions with advanced AI capabilities. Maxine’s AI developer platform offers real-time video and audio enhancements, while Texel provides scalable integration solutions. This partnership enables developers to create engaging video applications with features like Eye Contact, which redirects a speaker’s gaze toward the camera to strengthen the sense of human connection. With flexible integration options and seamless scalability, developers can focus on building unique user experiences while leaving the complexities of AI deployment to the experts....

September 16, 2024 · Tony Redgrave

Improved Data Loading with Threads

Summary: This article explores the benefits of using threads instead of processes for data loading in deep learning applications, focusing on PyTorch’s torch.utils.data.DataLoader. The optional removal of the Global Interpreter Lock (GIL) in upcoming free-threaded Python builds opens new possibilities for parallelism, motivating experiments with thread-based parallelism in data loading. The article discusses the advantages and limitations of this approach, highlighting its potential for better performance in certain scenarios....
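For context, the sketch below shows the gist of thread-based loading with a plain ThreadPoolExecutor instead of worker processes. It is an illustration of the idea using a hypothetical dataset, not the article's benchmark code, and it only pays off fully on a free-threaded (no-GIL) Python build or when the per-sample work releases the GIL.

```python
import concurrent.futures

import torch
from torch.utils.data import Dataset


class RandomCropDataset(Dataset):
    """Hypothetical dataset whose __getitem__ does CPU-side work (decode/augment)."""

    def __init__(self, n_items: int = 1024):
        self.n_items = n_items

    def __len__(self):
        return self.n_items

    def __getitem__(self, idx):
        image = torch.randn(3, 256, 256)          # stand-in for image decoding
        return image[:, :224, :224], idx % 10     # stand-in for augmentation + label


def threaded_batches(dataset, batch_size=32, num_threads=8):
    """Minimal thread-based loader: workers share memory, so samples need no
    pickling or inter-process copies, unlike multiprocessing DataLoader workers."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=num_threads) as pool:
        for start in range(0, len(dataset), batch_size):
            batch_idx = range(start, min(start + batch_size, len(dataset)))
            samples = list(pool.map(dataset.__getitem__, batch_idx))
            images, labels = zip(*samples)
            yield torch.stack(images), torch.tensor(labels)


for images, labels in threaded_batches(RandomCropDataset()):
    pass  # a training step would consume each batch here
```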

September 13, 2024 · Carl Corey