Generative AI – NVIDIA Technical Blog News and tutorials for developers, data scientists, and IT admins 2025-03-03T23:45:44Z https://developer.nvidia.com/blog/feed/

Michelle Horton <![CDATA[Top Generative AI Sessions at NVIDIA GTC 2025]]> https://developer.nvidia.com/blog/?p=96689 2025-03-03T23:45:44Z 2025-03-03T23:45:42Z Discover cutting-edge AI and data science innovations from top generative AI teams at NVIDIA GTC 2025.]]>

Discover cutting-edge AI and data science innovations from top generative AI teams at NVIDIA GTC 2025.

Aditi Bodhankar <![CDATA[Measuring the Effectiveness and Performance of AI Guardrails in Generative AI Applications]]> https://developer.nvidia.com/blog/?p=96562 2025-03-03T17:22:13Z 2025-03-03T17:22:09Z Safeguarding AI agents and other conversational AI applications to ensure safe, on-brand and reliable behavior is essential for enterprises. NVIDIA NeMo...]]>

Safeguarding AI agents and other conversational AI applications to ensure safe, on-brand and reliable behavior is essential for enterprises. NVIDIA NeMo Guardrails offers robust protection with AI guardrails for content safety, topic control, jailbreak detection, and more to evaluate and optimize guardrail performance. In this post, we explore techniques for measuring and optimizing your AI…

Mehran Maghoumi <![CDATA[Build an AI Agent with Expert Reasoning Capabilities Using the DeepSeek-R1 NIM]]> https://developer.nvidia.com/blog/?p=96030 2025-02-28T20:23:54Z 2025-02-28T20:23:51Z AI agents are transforming business operations by automating processes, optimizing decision-making, and streamlining actions. Their effectiveness hinges on...]]>

AI agents are transforming business operations by automating processes, optimizing decision-making, and streamlining actions. Their effectiveness hinges on expert reasoning, enabling smarter planning and efficient execution. Agentic AI applications could benefit from the capabilities of models such as DeepSeek-R1. Built for solving problems that require advanced AI reasoning…

Sangjune Park <![CDATA[Spotlight: NAVER Place Optimizes SLM-Based Vertical Services with NVIDIA TensorRT-LLM]]> https://developer.nvidia.com/blog/?p=96279 2025-02-28T19:40:53Z 2025-02-28T17:57:49Z NAVER is a popular South Korean search engine company that offers Naver Place, a geo-based service that provides detailed information about millions of...]]>

NAVER is a popular South Korean search engine company that offers Naver Place, a geo-based service that provides detailed information about millions of businesses and points of interest across Korea. Users can search for different places, leave reviews, and place bookings or orders in real time. NAVER Place vertical services are based on small language models (SLMs) to improve usability…

Anu Srivastava <![CDATA[Latest Multimodal Addition to Microsoft Phi SLMs Trained on NVIDIA GPUs]]> https://developer.nvidia.com/blog/?p=96519 2025-02-28T17:13:38Z 2025-02-26T22:05:00Z Large language models (LLMs) have permeated every industry and changed the potential of technology. However, due to their massive size they are not practical...]]>

Large language models (LLMs) have permeated every industry and changed the potential of technology. However, due to their massive size they are not practical for the current resource constraints that many companies have. The rise of small language models (SLMs) bridges quality and cost by creating models with a smaller resource footprint. SLMs are a subset of language models that tend to…

Francesco Ciannella <![CDATA[Building a Simple VLM-Based Multimodal Information Retrieval System with NVIDIA NIM]]> https://developer.nvidia.com/blog/?p=96151 2025-02-25T22:21:17Z 2025-02-26T17:00:00Z In today’s data-driven world, the ability to retrieve accurate information from even modest amounts of data is vital for developers seeking streamlined,...]]>

In today’s data-driven world, the ability to retrieve accurate information from even modest amounts of data is vital for developers seeking streamlined, effective solutions for quick deployments, prototyping, or experimentation. One of the key challenges in information retrieval is managing the diverse modalities in unstructured datasets, including text, PDFs, images, tables, audio, video…

Yifan Wu <![CDATA[Accelerating Scientific Literature Reviews with NVIDIA NIM Microservices for LLMs]]> https://developer.nvidia.com/blog/?p=96324 2025-02-25T22:13:42Z 2025-02-26T17:00:00Z A well-crafted systematic review is often the initial step for researchers exploring a scientific field. For scientists new to this field, it provides a...]]>

A well-crafted systematic review is often the initial step for researchers exploring a scientific field. For scientists new to this field, it provides a structured overview of the domain. For experts, it refines their understanding and sparks new ideas. In 2024 alone, 218,650 review articles were indexed in the Web of Science database, highlighting the importance of these resources in research.

Shubham Agrawal <![CDATA[Vision Language Model Prompt Engineering Guide for Image and Video Understanding]]> https://developer.nvidia.com/blog/?p=96229 2025-02-26T16:25:37Z 2025-02-26T16:25:34Z Vision language models (VLMs) are evolving at a breakneck speed. In 2020, the first VLMs revolutionized the generative AI landscape by bringing visual...]]>

Vision language models (VLMs) are evolving at a breakneck speed. In 2020, the first VLMs revolutionized the generative AI landscape by bringing visual understanding to large language models (LLMs) through the use of a vision encoder. These initial VLMs were limited in their abilities, only able to understand text and single image inputs. Fast-forward a few years and VLMs are now capable of…

Mark Ren <![CDATA[Configurable Graph-Based Task Solving with the Marco Multi-AI Agent Framework for Chip Design]]> https://developer.nvidia.com/blog/?p=96209 2025-02-25T22:17:31Z 2025-02-25T22:17:28Z Chip and hardware design presents numerous challenges stemming from its complexity and advancing technologies. These challenges result in longer turn-around...]]>

Chip and hardware design presents numerous challenges stemming from its complexity and advancing technologies. These challenges result in longer turn-around time (TAT) for optimizing performance, power, area, and cost (PPAC) during synthesis, verification, physical design, and reliability loops. Large language models (LLMs) have shown a remarkable capacity to comprehend and generate natural…

Leon Derczynski <![CDATA[Defining LLM Red Teaming]]> https://developer.nvidia.com/blog/?p=96239 2025-02-28T17:15:40Z 2025-02-25T18:49:26Z There is an activity where people provide inputs to generative AI technologies, such as large language models (LLMs), to see if the outputs can be made to...]]>

There is an activity where people provide inputs to generative AI technologies, such as large language models (LLMs), to see if the outputs can be made to deviate from acceptable standards. This use of LLMs began in 2023 and has rapidly evolved to become a common industry practice and a cornerstone of trustworthy AI. How can we standardize and define LLM red teaming?

Rich Harang <![CDATA[Agentic Autonomy Levels and Security]]> https://developer.nvidia.com/blog/?p=96341 2025-02-26T19:20:12Z 2025-02-25T18:45:05Z Agentic workflows are the next evolution in AI-powered tools. They enable developers to chain multiple AI models together to perform complex activities, enable...]]>

Agentic workflows are the next evolution in AI-powered tools. They enable developers to chain multiple AI models together to perform complex activities, enable AI models to use tools to access additional data or automate user actions, and enable AI models to operate autonomously, analyzing and performing complex tasks with a minimum of human involvement or interaction. Because of their power…

Joe Bungo <![CDATA[NVIDIA Deep Learning Institute Releases New Generative AI Teaching Kit]]> https://developer.nvidia.com/blog/?p=88388 2025-02-25T17:48:08Z 2025-02-25T17:47:49Z Generative AI, powered by advanced machine learning models and deep neural networks, is revolutionizing industries by generating novel content and driving...]]>

Generative AI, powered by advanced machine learning models and deep neural networks, is revolutionizing industries by generating novel content and driving innovation in fields like healthcare, finance, and entertainment. NVIDIA is leading this transformation with its cutting-edge GPU architectures and software ecosystems, such as the H100 Tensor Core GPU and CUDA platform…

Charu Chaubal <![CDATA[NVIDIA AI Enterprise Adds Support for NVIDIA H200 NVL]]> https://developer.nvidia.com/blog/?p=96424 2025-02-24T22:37:49Z 2025-02-24T22:37:47Z NVIDIA AI Enterprise is the cloud-native software platform for the development and deployment of production-grade AI solutions. The latest release of the NVIDIA...]]>

NVIDIA AI Enterprise is the cloud-native software platform for the development and deployment of production-grade AI solutions. The latest release of the NVIDIA AI Enterprise infrastructure software collection adds support for the latest NVIDIA data center GPU, NVIDIA H200 NVL, giving your enterprise new options for powering cutting-edge use cases such as agentic and generative AI with some of the…

Sama Bali <![CDATA[Transforming Product Design Workflows in Manufacturing with Generative AI]]> https://developer.nvidia.com/blog/?p=96242 2025-02-21T17:42:04Z 2025-02-20T19:32:11Z Traditional design and engineering workflows in the manufacturing industry have long been characterized by a sequential, iterative approach that is often...]]>

Traditional design and engineering workflows in the manufacturing industry have long been characterized by a sequential, iterative approach that is often time-consuming and resource intensive. These conventional methods typically involve stages such as requirement gathering, conceptual design, detailed design, analysis, prototyping, and testing, with each phase dependent on the results of previous…

Sven Chilton <![CDATA[Deploying NVIDIA Riva Multilingual ASR with Whisper and Canary Architectures While Selectively Deactivating NMT]]> https://developer.nvidia.com/blog/?p=95339 2025-02-20T18:54:51Z 2025-02-20T18:54:48Z NVIDIA has consistently developed automatic speech recognition (ASR) models that set the benchmark in the industry. Earlier versions of NVIDIA Riva, a...]]>

NVIDIA has consistently developed automatic speech recognition (ASR) models that set the benchmark in the industry. Earlier versions of NVIDIA Riva, a collection of GPU-accelerated speech and translation AI microservices for ASR, TTS, and NMT, support English-Spanish and English-Japanese code-switching ASR models based on the Conformer architecture, along with a model supporting multiple…

Tanya Lenz <![CDATA[Upcoming Livestream: Using the NVIDIA AI Blueprint for PDF to Podcast ]]> https://developer.nvidia.com/blog/?p=96307 2025-02-20T18:11:39Z 2025-02-20T18:11:37Z Join us on February 27 to learn how to transform PDFs into AI podcasts using the NVIDIA AI Blueprint.]]>

Join us on February 27 to learn how to transform PDFs into AI podcasts using the NVIDIA AI Blueprint.

Allyson Vasquez <![CDATA[Bring NVIDIA ACE AI Characters to Games with the New In-Game Inferencing SDK]]> https://developer.nvidia.com/blog/?p=96051 2025-02-20T21:43:57Z 2025-02-20T17:00:00Z NVIDIA ACE is a suite of digital human technologies that bring game characters and digital assistants to life with generative AI. ACE on-device models enable...]]>

Nitzan Simchi <![CDATA[Spotlight: Drug Discovery Startup Protai Advances Complex Structure Prediction with AlphaFold, Proteomics, and NVIDIA NIM]]> https://developer.nvidia.com/blog/?p=96107 2025-02-20T15:51:52Z 2025-02-19T17:30:00Z Generative AI, especially with breakthroughs like AlphaFold and RosettaFold, is transforming drug discovery and how biotech companies and research laboratories...]]>

Generative AI, especially with breakthroughs like AlphaFold and RosettaFold, is transforming drug discovery and how biotech companies and research laboratories study protein structures, unlocking groundbreaking insights into protein interactions. Proteins are dynamic entities. It has been postulated that a protein’s native state is known by its sequence of amino acids alone…

Kyle Tretina <![CDATA[Understanding the Language of Life’s Biomolecules Across Evolution at a New Scale with Evo 2]]> https://developer.nvidia.com/blog/?p=95589 2025-02-20T15:52:05Z 2025-02-19T17:14:51Z AI has evolved from an experimental curiosity to a driving force within biological research. The convergence of deep learning algorithms, massive omics...]]>

AI has evolved from an experimental curiosity to a driving force within biological research. The convergence of deep learning algorithms, massive omics datasets, and automated laboratory workflows has allowed scientists to tackle problems once thought intractable—from rapid protein structure prediction to generative drug design, increasing the need for AI literacy among scientists.

Brad Nemire <![CDATA[Featured Sessions for Students at NVIDIA GTC 2025]]> https://developer.nvidia.com/blog/?p=96181 2025-02-20T15:52:32Z 2025-02-15T02:00:58Z Learn from researchers, scientists, and industry leaders across a variety of topics including AI, robotics, and data science.]]>

Learn from researchers, scientists, and industry leaders across a variety of topics including AI, robotics, and data science.

Anjali Shah <![CDATA[Optimizing Qwen2.5-Coder Throughput with NVIDIA TensorRT-LLM Lookahead Decoding]]> https://developer.nvidia.com/blog/?p=96010 2025-02-20T15:52:43Z 2025-02-14T18:19:37Z Large language models (LLMs) that specialize in coding have been steadily adopted into developer workflows. From pair programming to self-improving AI agents,...]]>

Large language models (LLMs) that specialize in coding have been steadily adopted into developer workflows. From pair programming to self-improving AI agents, these models assist developers with various tasks, including enhancing code, fixing bugs, generating tests, and writing documentation. To promote the development of open-source LLMs, the Qwen team recently released Qwen2.5-Coder…

Joanne Chang <![CDATA[Upcoming Webinar: Unlocking Video Analytics With AI Agents]]> https://developer.nvidia.com/blog/?p=96135 2025-02-20T15:52:55Z 2025-02-13T22:05:57Z Master prompt engineering, fine-tuning, and customization to build video analytics AI agents.]]>

Master prompt engineering, fine-tuning, and customization to build video analytics AI agents.

Terry Chen <![CDATA[Automating GPU Kernel Generation with DeepSeek-R1 and Inference Time Scaling]]> https://developer.nvidia.com/blog/?p=95998 2025-02-20T15:56:57Z 2025-02-12T18:00:00Z As AI models extend their capabilities to solve more sophisticated challenges, a new scaling law known as test-time scaling or inference-time scaling is...]]>

As AI models extend their capabilities to solve more sophisticated challenges, a new scaling law known as test-time scaling or inference-time scaling is emerging. Also known as AI reasoning or long-thinking, this technique improves model performance by allocating additional computational resources during inference to evaluate multiple possible outcomes and then selecting the best one…
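
The "evaluate multiple possible outcomes and then select the best one" idea can be illustrated with a minimal best-of-N sketch. Everything here is an assumption for illustration: `toy_generate` and `toy_score` are hypothetical stand-ins for a model and a verifier, and this is not the workflow from the post itself.

```python
import random

def best_of_n(generate, score, n, seed=0):
    """Sample n candidates and return the highest-scoring one with its score."""
    rng = random.Random(seed)
    candidates = [generate(rng) for _ in range(n)]
    best = max(candidates, key=score)
    return best, score(best)

# Hypothetical stand-ins: a "model" that guesses a number, and a verifier
# that prefers guesses close to a target value.
def toy_generate(rng):
    return rng.uniform(0.0, 10.0)

def toy_score(candidate, target=7.0):
    return -abs(candidate - target)

# More samples means more inference-time compute. With a shared seed the
# n=16 run includes the n=1 candidate, so its best score cannot be worse.
_, score_1 = best_of_n(toy_generate, toy_score, n=1)
_, score_16 = best_of_n(toy_generate, toy_score, n=16)
```

The cost grows linearly with `n`, which is the trade this kind of scaling makes: extra compute at inference in exchange for a better selected answer.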

Gomathy Venkata Krishnan <![CDATA[LLM Model Pruning and Knowledge Distillation with NVIDIA NeMo Framework]]> https://developer.nvidia.com/blog/?p=93451 2025-02-20T15:54:00Z 2025-02-12T17:54:52Z Model pruning and knowledge distillation are powerful cost-effective strategies for obtaining smaller language models from an initial larger sibling. ...]]>

Model pruning and knowledge distillation are powerful cost-effective strategies for obtaining smaller language models from an initial larger sibling. The How to Prune and Distill Llama-3.1 8B to an NVIDIA Llama-3.1-Minitron 4B Model post discussed the best practices of using large language models (LLMs) that combine depth, width, attention, and MLP pruning with knowledge distillation…

Emily Potyraj <![CDATA[NVIDIA DGX Cloud Introduces Ready-To-Use Templates to Benchmark AI Platform Performance]]> https://developer.nvidia.com/blog/?p=95558 2025-02-20T15:54:23Z 2025-02-11T17:00:00Z In the rapidly evolving landscape of AI systems and workloads, achieving optimal model training performance extends far beyond chip speed. It requires a...]]>

In the rapidly evolving landscape of AI systems and workloads, achieving optimal model training performance extends far beyond chip speed. It requires a comprehensive evaluation of the entire stack, from compute to networking to model framework. Navigating the complexities of AI system performance can be difficult. There are many application changes that you can make…

Brad Nemire <![CDATA[Featured Researcher and Educator Sessions at NVIDIA GTC 2025]]> https://developer.nvidia.com/blog/?p=95817 2025-02-06T19:33:45Z 2025-02-05T23:03:06Z Explore the latest advancements in academia, including advanced research, innovative teaching methods, and the future of learning and technology.]]>

Explore the latest advancements in academia, including advanced research, innovative teaching methods, and the future of learning and technology.

Cheng-Han (Hank) Du <![CDATA[Improving Translation Quality with Domain-Specific Fine-Tuning and NVIDIA NIM]]> https://developer.nvidia.com/blog/?p=95756 2025-02-06T19:33:46Z 2025-02-05T21:30:00Z Translation plays an essential role in enabling companies to expand across borders, with requirements varying significantly in terms of tone, accuracy, and...]]>

Translation plays an essential role in enabling companies to expand across borders, with requirements varying significantly in terms of tone, accuracy, and technical terminology handling. The emergence of sovereign AI has highlighted critical challenges in large language models (LLMs), particularly their struggle to capture nuanced cultural and linguistic contexts beyond English-dominant…

Pradeep Ramani <![CDATA[OpenAI Triton on NVIDIA Blackwell Boosts AI Performance and Programmability]]> https://developer.nvidia.com/blog/?p=95388 2025-02-06T19:33:47Z 2025-02-05T18:00:00Z Matrix multiplication and attention mechanisms are the computational backbone of modern AI workloads. While libraries like NVIDIA cuDNN provide highly optimized...]]>

Matrix multiplication and attention mechanisms are the computational backbone of modern AI workloads. While libraries like NVIDIA cuDNN provide highly optimized implementations, and frameworks such as CUTLASS offer deep customization, many developers and researchers need a middle ground that combines performance with programmability. The open-source Triton compiler on the NVIDIA Blackwell…

Shruthii Sathyanarayanan <![CDATA[Streamline Collaboration Across Local and Cloud Systems with NVIDIA AI Workbench]]> https://developer.nvidia.com/blog/?p=95720 2025-02-06T19:35:45Z 2025-02-05T18:00:00Z NVIDIA AI Workbench is a free development environment manager to develop, customize, and prototype AI applications on your GPUs. AI Workbench provides a...]]>

NVIDIA AI Workbench is a free development environment manager to develop, customize, and prototype AI applications on your GPUs. AI Workbench provides a frictionless experience across PCs, workstations, servers, and cloud for AI, data science, and machine learning (ML) projects. The user experience includes: This post provides details about the January 2025 release of NVIDIA AI Workbench…

Isabel Hulseman <![CDATA[New NVIDIA AI Blueprint: Build a Customizable RAG Pipeline]]> https://developer.nvidia.com/blog/?p=95614 2025-02-13T20:44:16Z 2025-01-30T22:26:12Z Connect AI applications to enterprise data using embedding and reranking models for information retrieval.]]>

Connect AI applications to enterprise data using embedding and reranking models for information retrieval.

Eric Phan <![CDATA[How to Integrate NVIDIA DLSS 4 into Your Game with NVIDIA Streamline]]> https://developer.nvidia.com/blog/?p=95492 2025-02-06T19:33:58Z 2025-01-30T14:00:00Z NVIDIA DLSS 4 is the latest iteration of DLSS introduced with the NVIDIA GeForce RTX 50 Series GPUs. It includes several new features: DLSS Multi Frame...]]>

NVIDIA DLSS 4 is the latest iteration of DLSS introduced with the NVIDIA GeForce RTX 50 Series GPUs. It includes several new features: Here’s how you can get started with DLSS 4 in your integrations. This post focuses on the Streamline SDK, which provides a plug-and-play framework for simplified plugin integration. The NVIDIA Streamline SDK is an open-source framework that…

Annamalai Chockalingam <![CDATA[New AI SDKs and Tools Released for NVIDIA Blackwell GeForce RTX 50 Series GPUs]]> https://developer.nvidia.com/blog/?p=95526 2025-02-06T19:33:57Z 2025-01-30T14:00:00Z NVIDIA recently announced a new generation of PC GPUs—the GeForce RTX 50 Series—alongside new AI-powered SDKs and tools for developers. Powered by the...]]>

NVIDIA recently announced a new generation of PC GPUs—the GeForce RTX 50 Series—alongside new AI-powered SDKs and tools for developers. Powered by the NVIDIA Blackwell architecture, fifth-generation Tensor Cores and fourth-generation RT Cores, the GeForce RTX 50 Series delivers breakthroughs in AI-driven rendering, including neural shaders, digital human technologies, geometry and lighting.

Amit Bleiweiss <![CDATA[Mastering LLM Techniques: Evaluation]]> https://developer.nvidia.com/blog/?p=95447 2025-02-17T05:21:53Z 2025-01-29T20:44:06Z Evaluating large language models (LLMs) and retrieval-augmented generation (RAG) systems is a complex and nuanced process, reflecting the sophisticated and...]]>

Evaluating large language models (LLMs) and retrieval-augmented generation (RAG) systems is a complex and nuanced process, reflecting the sophisticated and multifaceted nature of these systems. Unlike traditional machine learning (ML) models, LLMs generate a wide range of diverse and often unpredictable outputs, making standard evaluation metrics insufficient. Key challenges include the…

Edoardo Maria Ponti <![CDATA[Dynamic Memory Compression]]> https://developer.nvidia.com/blog/?p=93500 2025-02-06T19:34:01Z 2025-01-24T17:43:42Z Despite the success of large language models (LLMs) as general-purpose AI tools, their high demand for computational resources makes their deployment challenging...]]>

Despite the success of large language models (LLMs) as general-purpose AI tools, their high demand for computational resources makes their deployment challenging in many real-world scenarios. The sizes of the model and conversation state are limited by the available high-bandwidth memory, limiting the number of users that can be served and the maximum conversation length. At present…

Nick Comly <![CDATA[Optimize AI Inference Performance with NVIDIA Full-Stack Solutions]]> https://developer.nvidia.com/blog/?p=95310 2025-02-06T19:34:02Z 2025-01-24T16:00:00Z The explosion of AI-driven applications has placed unprecedented demands on both developers, who must balance delivering cutting-edge performance with managing...]]>

The explosion of AI-driven applications has placed unprecedented demands on both developers, who must balance delivering cutting-edge performance with managing operational complexity and cost, and AI infrastructure. NVIDIA is empowering developers with full-stack innovations—spanning chips, systems, and software—that redefine what’s possible in AI inference, making it faster, more efficient…

Juana Nakfour <![CDATA[Horizontal Autoscaling of NVIDIA NIM Microservices on Kubernetes]]> https://developer.nvidia.com/blog/?p=94972 2025-02-06T19:34:03Z 2025-01-22T17:34:51Z NVIDIA NIM microservices are model inference containers that can be deployed on Kubernetes. In a production environment, it’s important to understand the...]]>

NVIDIA NIM microservices are model inference containers that can be deployed on Kubernetes. In a production environment, it’s important to understand the compute and memory profile of these microservices to set up a successful autoscaling plan. In this post, we describe how to set up and use Kubernetes Horizontal Pod Autoscaling (HPA) with an NVIDIA NIM for LLMs model to automatically scale…
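
As a rough illustration of what the Horizontal Pod Autoscaler decides, the sketch below applies the scaling rule from the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric), clamped to the configured bounds. The metric values and bounds are made-up numbers, not settings from the post, and the real controller also applies a tolerance band and stabilization windows that are omitted here.

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Core HPA scaling rule, clamped to the configured replica bounds."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))

# Hypothetical readings of a per-pod metric against a target of 100.
scale_up = desired_replicas(2, 200, 100)     # load is double the target
scale_down = desired_replicas(4, 50, 100)    # load is half the target
capped = desired_replicas(4, 1000, 100)      # limited by max_replicas
```

Which metric to scale on (for example a queue depth or a cache utilization figure) is a deployment choice; the arithmetic stays the same.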

Chris Krapu <![CDATA[Lessons Learned from Building an AI Sales Assistant]]> https://developer.nvidia.com/blog/?p=95231 2025-02-06T19:34:04Z 2025-01-21T20:34:41Z At NVIDIA, the Sales Operations team equips the Sales team with the tools and resources needed to bring cutting-edge hardware and software to market. Managing...]]>

At NVIDIA, the Sales Operations team equips the Sales team with the tools and resources needed to bring cutting-edge hardware and software to market. Managing this across NVIDIA’s diverse technology is a complex challenge shared by many enterprises. Through collaboration with our Sales team, we found that they rely on internal and external documentation…

John Thomson <![CDATA[Introducing New KV Cache Reuse Optimizations in NVIDIA TensorRT-LLM]]> https://developer.nvidia.com/blog/?p=95040 2025-02-06T19:34:05Z 2025-01-16T22:57:30Z Language models generate text by predicting the next token, given all the previous tokens including the input text tokens. Key and value elements of the...]]>

Language models generate text by predicting the next token, given all the previous tokens including the input text tokens. Key and value elements of the previous tokens are used as historical context in LLM serving for generation of the next set of tokens. Caching these key and value elements from previous tokens avoids expensive recomputation and effectively leads to higher throughput. However…
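
To make the recomputation saving concrete, here is a toy sketch of a key/value cache. It is an illustration only: `ToyKVCache` and `toy_project` are hypothetical names, the "projection" is placeholder arithmetic rather than a real attention layer, and none of this reflects the TensorRT-LLM API.

```python
class ToyKVCache:
    """Stores one key and one value per already-processed token."""

    def __init__(self, project):
        self.project = project  # hypothetical stand-in for the K/V projection
        self.keys = []
        self.values = []
        self.projections = 0    # counts how many tokens were projected

    def extend(self, tokens):
        """Project only the tokens that are not in the cache yet.

        Assumes the already-cached prefix of `tokens` is unchanged.
        """
        for token in tokens[len(self.keys):]:
            key, value = self.project(token)
            self.keys.append(key)
            self.values.append(value)
            self.projections += 1
        return self.keys, self.values

def toy_project(token):
    # Placeholder arithmetic standing in for the key/value matrix multiplies.
    return token * 2, token * 3

cache = ToyKVCache(toy_project)
sequence = [5, 7, 11]
cache.extend(sequence)   # prefill: projects all 3 prompt tokens
sequence.append(13)
cache.extend(sequence)   # decode step: projects only the new token
uncached_cost = 3 + 4    # projecting every token again at both steps
```

With the cache, the two steps cost 4 projections instead of 7, and the gap widens with every additional generated token.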

Shashank Maheshwari <![CDATA[NVIDIA JetPack 6.2 Brings Super Mode to NVIDIA Jetson Orin Nano and Jetson Orin NX Modules]]> https://developer.nvidia.com/blog/?p=95089 2025-02-06T19:38:26Z 2025-01-16T22:10:29Z The introduction of the NVIDIA Jetson Orin Nano Super Developer Kit sparked a new age of generative AI for small edge devices. The new Super Mode delivered an...]]>

The introduction of the NVIDIA Jetson Orin Nano Super Developer Kit sparked a new age of generative AI for small edge devices. The new Super Mode delivered an unprecedented generative AI performance boost of up to 1.7x on the developer kit, making it the most affordable generative AI supercomputer. JetPack 6.2 is now available to support Super Mode for Jetson Orin Nano and Jetson Orin NX…

Aditi Bodhankar <![CDATA[How to Safeguard AI Agents for Customer Service with NVIDIA NeMo Guardrails]]> https://developer.nvidia.com/blog/?p=94928 2025-02-04T19:53:15Z 2025-01-16T14:00:00Z AI agents present a significant opportunity for businesses to scale and elevate customer service and support interactions. By automating routine inquiries and...]]>

AI agents present a significant opportunity for businesses to scale and elevate customer service and support interactions. By automating routine inquiries and enhancing response times, these agents improve efficiency and customer satisfaction, helping organizations stay competitive. However, alongside these benefits, AI agents come with risks. Large language models (LLMs) are vulnerable to…

Martin Cimmino <![CDATA[Continued Pretraining of State-of-the-Art LLMs for Sovereign AI and Regulated Industries with iGenius and NVIDIA DGX Cloud]]> https://developer.nvidia.com/blog/?p=95012 2025-01-23T19:54:22Z 2025-01-16T12:00:00Z In recent years, large language models (LLMs) have achieved extraordinary progress in areas such as reasoning, code generation, machine translation, and...]]>

In recent years, large language models (LLMs) have achieved extraordinary progress in areas such as reasoning, code generation, machine translation, and summarization. However, despite their advanced capabilities, foundation models have limitations when it comes to domain-specific expertise such as finance or healthcare or capturing cultural and language nuances beyond English.

Sama Bali <![CDATA[GPU Memory Essentials for AI Performance]]> https://developer.nvidia.com/blog/?p=94979 2025-01-23T19:54:24Z 2025-01-15T16:00:00Z Generative AI has revolutionized how people bring ideas to life, and agentic AI represents the next leap forward in this technological evolution. By leveraging...]]>

Generative AI has revolutionized how people bring ideas to life, and agentic AI represents the next leap forward in this technological evolution. By leveraging sophisticated, autonomous reasoning and iterative planning, AI agents can tackle complex, multistep problems with remarkable efficiency. As AI continues to revolutionize industries, the demand for running AI models locally has surged.

Harry Petty <![CDATA[Transforming Data Centers into AI Factories for the 5th Industrial Revolution]]> https://developer.nvidia.com/blog/?p=94879 2025-01-23T19:54:25Z 2025-01-14T19:58:01Z In a recent DC Anti-Conference Live presentation, Wade Vinson, chief data center distinguished engineer at NVIDIA, shared insights based upon work by NVIDIA...]]>

In a recent DC Anti-Conference Live presentation, Wade Vinson, chief data center distinguished engineer at NVIDIA, shared insights based upon work by NVIDIA designing, building, and operating NVIDIA DGX SuperPOD multi-megawatt data centers since 2016. NVIDIA is helping make data centers more accessible, resource-efficient, energy-efficient, and business-efficient, as well as scalable to any…

Nirmal Kumar Juluru <![CDATA[Enhancing Generative AI Model Accuracy with NVIDIA NeMo Curator]]> https://developer.nvidia.com/blog/?p=94263 2025-01-23T19:54:27Z 2025-01-13T17:00:00Z In the rapidly evolving landscape of artificial intelligence, the quality of the data used for training models is paramount. High-quality data ensures that...]]>

In the rapidly evolving landscape of artificial intelligence, the quality of the data used for training models is paramount. High-quality data ensures that models are accurate, reliable, and capable of generalizing well across various applications. The recent NVIDIA webinar, Enhance Generative AI Model Accuracy with High-Quality Multimodal Data Processing, dove into the intricacies of data…

Kyle Tretina <![CDATA[Evaluating GenMol as a Generalist Foundation Model for Molecular Generation]]> https://developer.nvidia.com/blog/?p=94836 2025-01-23T19:54:29Z 2025-01-13T14:00:00Z Traditional computational drug discovery relies almost exclusively on highly task-specific computational models for hit identification and lead optimization....]]>

Traditional computational drug discovery relies almost exclusively on highly task-specific computational models for hit identification and lead optimization. Adapting these specialized models to new tasks requires substantial time, computational power, and expertise—challenges that grow when researchers simultaneously work across multiple targets or properties.

Kyle Tretina <![CDATA[Accelerate Protein Engineering with the NVIDIA BioNeMo Blueprint for Generative Protein Binder Design]]> https://developer.nvidia.com/blog/?p=94851 2025-01-23T19:54:28Z 2025-01-13T14:00:00Z Designing a therapeutic protein that specifically binds its target in drug discovery is a staggering challenge. Traditional workflows are often a painstaking...]]>

Designing a therapeutic protein that specifically binds its target in drug discovery is a staggering challenge. Traditional workflows are often a painstaking trial-and-error process—iterating through thousands of candidates, each synthesis and validation round taking months if not years. Considering the average human protein is 430 amino acids long, the number of possible designs translates to…

Source

]]>
Dan Su <![CDATA[Announcing Nemotron-CC: A Trillion-Token English Language Dataset for LLM Pretraining]]> https://developer.nvidia.com/blog/?p=94818 2025-01-23T19:54:30Z 2025-01-09T19:20:16Z NVIDIA is excited to announce the release of Nemotron-CC, a 6.3-trillion-token English language Common Crawl dataset for pretraining highly accurate large...]]>

NVIDIA is excited to announce the release of Nemotron-CC, a 6.3-trillion-token English language Common Crawl dataset for pretraining highly accurate large language models (LLMs), including 1.9 trillion tokens of synthetically generated data. One of the keys to training state-of-the-art LLMs is a high-quality pretraining dataset, and recent top LLMs, such as the Meta Llama series…

Source

]]>
Brad Nemire <![CDATA[NVIDIA Project DIGITS, A Grace Blackwell AI Supercomputer On Your Desk]]> https://developer.nvidia.com/blog/?p=94765 2025-01-23T19:54:30Z 2025-01-09T18:19:00Z Powered by the new GB10 Grace Blackwell Superchip, Project DIGITS can tackle large generative AI models of up to 200B parameters.]]>

Powered by the new GB10 Grace Blackwell Superchip, Project DIGITS can tackle large generative AI models of up to 200B parameters.

Source

]]>
Pranjali Joshi <![CDATA[Advancing Physical AI with NVIDIA Cosmos World Foundation Model Platform]]> https://developer.nvidia.com/blog/?p=94577 2025-01-23T19:54:31Z 2025-01-09T17:42:06Z As robotics and autonomous vehicles advance, accelerating development of physical AI—which enables autonomous machines to perceive, understand, and perform...]]>

As robotics and autonomous vehicles advance, accelerating development of physical AI—which enables autonomous machines to perceive, understand, and perform complex actions in the physical world—has become essential. At the center of these systems are world foundation models (WFMs)—AI models that simulate physical states through physics-aware videos, enabling machines to make accurate decisions and…

Source

]]>
Brad Nemire <![CDATA[Upcoming Livestream: NVIDIA Developer Highlights from CES 2025]]> https://developer.nvidia.com/blog/?p=94843 2025-01-23T19:54:32Z 2025-01-09T10:00:00Z Tune in January 16th at 9:00 AM PT for a live recap, followed by a Q&A of the latest developer announcements at CES 2025.]]>

Tune in January 16th at 9:00 AM PT for a live recap, followed by a Q&A of the latest developer announcements at CES 2025.

Source

]]>
Zeeshan Patel <![CDATA[Accelerate Custom Video Foundation Model Pipelines with New NVIDIA NeMo Framework Capabilities]]> https://developer.nvidia.com/blog/?p=94541 2025-02-04T19:34:45Z 2025-01-07T16:00:00Z Generative AI has evolved from text-based models to multimodal models, with a recent expansion into video, opening up new potential uses across various...]]>

Generative AI has evolved from text-based models to multimodal models, with a recent expansion into video, opening up new potential uses across various industries. Video models can create new experiences for users or simulate scenarios for training autonomous agents at scale. They are helping revolutionize various industries including robotics, autonomous vehicles, and entertainment.

Source

]]>
Anish Maddipoti <![CDATA[One-Click Deployments for the Best of NVIDIA AI with NVIDIA Launchables]]> https://developer.nvidia.com/blog/?p=94569 2025-01-23T19:54:34Z 2025-01-07T04:30:00Z AI development has become a core part of modern software engineering, and NVIDIA is committed to finding ways to bring optimized accelerated computing to every...]]>

AI development has become a core part of modern software engineering, and NVIDIA is committed to finding ways to bring optimized accelerated computing to every developer that wants to start experimenting with AI. To address this, we’ve been working on making the accelerated computing stack more accessible with NVIDIA Launchables: preconfigured GPU computing environments that enable you to…

Source

]]>
Samuel Ochoa <![CDATA[Build a Video Search and Summarization Agent with NVIDIA AI Blueprint]]> https://developer.nvidia.com/blog/?p=86011 2025-02-13T20:44:57Z 2025-01-07T04:20:00Z This post was originally published July 29, 2024 but has been extensively revised with NVIDIA AI Blueprint information. Traditional video analytics applications...]]>

This post was originally published July 29, 2024 but has been extensively revised with NVIDIA AI Blueprint information. Traditional video analytics applications and their development workflow are typically built on fixed-function, limited models that are designed to detect and identify only a select set of predefined objects. With generative AI, NVIDIA NIM microservices…

Source

]]>
Akhil Docca <![CDATA[How to Build a Generative AI-Enabled Synthetic Data Pipeline for Perception-Based Physical AI]]> https://developer.nvidia.com/blog/?p=86105 2025-01-09T19:23:08Z 2025-01-07T03:57:00Z Training physical AI models used to power autonomous machines, such as robots and autonomous vehicles, requires huge amounts of data. Acquiring large sets of...]]>

Training physical AI models used to power autonomous machines, such as robots and autonomous vehicles, requires huge amounts of data. Acquiring large sets of diverse training data can be difficult, time-consuming, and expensive. Data is often limited due to privacy restrictions or concerns, or simply may not exist for novel use cases. In addition, the available data may not apply to the full range…

Source

]]>
Chintan Patel <![CDATA[Llama Nemotron Models Accelerate Agentic AI Workflows with Accuracy and Efficiency]]> https://developer.nvidia.com/blog/?p=94595 2025-01-09T19:23:09Z 2025-01-07T03:40:00Z Agentic AI, the next wave of generative AI, is a paradigm shift with the potential to revolutionize industries by enabling AI systems to act autonomously and...]]>

Agentic AI, the next wave of generative AI, is a paradigm shift with the potential to revolutionize industries by enabling AI systems to act autonomously and achieve complex goals. Agentic AI combines the power of large language models (LLMs) with advanced reasoning and planning capabilities, opening a world of possibilities across industries, from healthcare and finance to manufacturing and…

Source

]]>
Ike Nnoli <![CDATA[NVIDIA RTX Neural Rendering Introduces Next Era of AI-Powered Graphics Innovation]]> https://developer.nvidia.com/blog/?p=94662 2025-02-03T21:14:21Z 2025-01-07T03:22:00Z NVIDIA today unveiled next-generation hardware for gamers, creators, and developers—the GeForce RTX 50 Series desktop and laptop GPUs. Alongside these GPUs,...]]>

NVIDIA today unveiled next-generation hardware for gamers, creators, and developers—the GeForce RTX 50 Series desktop and laptop GPUs. Alongside these GPUs, NVIDIA introduced NVIDIA RTX Kit, a suite of neural rendering technologies to ray trace games with AI, render scenes with immense geometry, and create game characters with lifelike visuals. RTX Kit enhances geometry, textures, materials…

Source

]]>
Katie Link <![CDATA[Build a Generative AI Medical Device Training Assistant with NVIDIA NIM Microservices]]> https://developer.nvidia.com/blog/?p=94379 2024-12-20T19:55:30Z 2024-12-20T18:00:00Z Innovation in medical devices continues to accelerate, with a record number authorized by the FDA every year. When these new or updated devices are introduced...]]>

Innovation in medical devices continues to accelerate, with a record number authorized by the FDA every year. When these new or updated devices are introduced to clinicians and patients, they require training to use them properly and safely. Once in use, clinicians or patients may need help troubleshooting issues. Medical devices are often accompanied by lengthy and technically complex…

Source

]]>
Tom Balough <![CDATA[Enhance Your Training Data with New NVIDIA NeMo Curator Classifier Models]]> https://developer.nvidia.com/blog/?p=94447 2024-12-19T23:08:12Z 2024-12-19T23:08:08Z Classifier models are specialized in categorizing data into predefined groups or classes, playing a crucial role in optimizing data processing pipelines for...]]>

Classifier models are specialized in categorizing data into predefined groups or classes, playing a crucial role in optimizing data processing pipelines for fine-tuning and pretraining generative AI models. Their value lies in enhancing data quality by filtering out low-quality or toxic data, ensuring only clean and relevant information feeds downstream processes. Beyond filtering…

Source

]]>
Sama Bali <![CDATA[Accelerating Film Production with Dell AI Factory and NVIDIA]]> https://developer.nvidia.com/blog/?p=94350 2025-01-11T17:49:14Z 2024-12-19T18:26:03Z Filmmaking is an intricate and complex process that involves a diverse team of artists, writers, visual effects professionals, technicians, and countless other...]]>

Filmmaking is an intricate and complex process that involves a diverse team of artists, writers, visual effects professionals, technicians, and countless other specialists. Each member brings their unique expertise to the table, collaborating to transform a simple idea into a captivating cinematic experience. From the initial spark of a story to the final cut, every step requires creativity…

Source

]]>
Sama Bali <![CDATA[A Guide to Retrieval-Augmented Generation for AEC]]> https://developer.nvidia.com/blog/?p=94305 2024-12-18T17:58:35Z 2024-12-18T21:00:00Z Large language models (LLMs) are rapidly changing the business landscape, offering new capabilities in natural language processing (NLP), content generation,...]]>

Large language models (LLMs) are rapidly changing the business landscape, offering new capabilities in natural language processing (NLP), content generation, and data analysis. These AI-powered tools have improved how companies operate, from streamlining customer service to enhancing decision-making processes. However, despite their impressive general knowledge, LLMs often struggle with…

Source

]]>
Rakib Hasan <![CDATA[NVIDIA TensorRT-LLM Now Supports Recurrent Drafting for Optimizing LLM Inference]]> https://developer.nvidia.com/blog/?p=92963 2024-12-20T18:40:15Z 2024-12-18T17:31:01Z Recurrent drafting (referred to as ReDrafter) is a novel speculative decoding technique developed and open-sourced by Apple for large language model (LLM)...]]>

Recurrent drafting (referred to as ReDrafter) is a novel speculative decoding technique developed and open-sourced by Apple for large language model (LLM) inference, now available with NVIDIA TensorRT-LLM. ReDrafter helps developers significantly boost LLM workload performance on NVIDIA GPUs. NVIDIA TensorRT-LLM is a library for optimizing LLM inference. It provides an easy-to-use Python API to define…

Source

]]>
Anna Shors <![CDATA[Data-Efficient Knowledge Distillation for Supervised Fine-Tuning with NVIDIA NeMo-Aligner]]> https://developer.nvidia.com/blog/?p=94082 2024-12-18T01:43:12Z 2024-12-18T01:43:09Z Knowledge distillation is an approach for transferring the knowledge of a much larger teacher model to a smaller student model, ideally yielding a compact,...]]>

Knowledge distillation is an approach for transferring the knowledge of a much larger teacher model to a smaller student model, ideally yielding a compact, easily deployable student with comparable accuracy to the teacher. Knowledge distillation has gained popularity in pretraining settings, but there are fewer resources available for performing knowledge distillation during supervised fine-tuning…
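
As a rough illustration of the teacher-to-student transfer described above, the following sketch computes a common distillation objective: the KL divergence between temperature-softened teacher and student output distributions. The function names, the temperature value, and the example logits are assumptions made for illustration; this is not taken from NeMo-Aligner and does not reflect its API.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by the given temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions.

    Hypothetical helper: the loss shrinks as the student's predicted
    distribution approaches the teacher's.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Made-up logits for a single token position over a 3-word vocabulary.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # prefers a different token

loss_close = distillation_loss(close_student, teacher)
loss_far = distillation_loss(far_student, teacher)
print(loss_close < loss_far)
```

In practice a term like this is usually combined with the ordinary supervised loss on the ground-truth labels, with the weighting treated as a tunable choice.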

Source

]]>
Ike Nnoli <![CDATA[Deploy Agents, Assistants, and Avatars on NVIDIA RTX AI PCs with New Small Language Models]]> https://developer.nvidia.com/blog/?p=92896 2024-12-17T03:32:15Z 2024-12-17T18:00:00Z NVIDIA just announced a series of small language models (SLMs) that increase the amount and type of information digital humans can use to augment their...]]>

NVIDIA just announced a series of small language models (SLMs) that increase the amount and type of information digital humans can use to augment their responses. This includes new large-context models that provide more relevant answers and new multi-modal models that allow images as inputs. These models are available now as part of NVIDIA ACE, a suite of digital human technologies that brings…

Source

]]>
Japinder Singh <![CDATA[Fine-Tuning Small Language Models to Optimize Code Review Accuracy]]> https://developer.nvidia.com/blog/?p=94078 2025-02-17T05:13:45Z 2024-12-17T17:58:31Z Generative AI is transforming enterprises by driving innovation and boosting efficiency across numerous applications. However, adopting large foundational...]]>

Source

]]>
Anjali Shah <![CDATA[Boost Llama 3.3 70B Inference Throughput 3x with NVIDIA TensorRT-LLM Speculative Decoding]]> https://developer.nvidia.com/blog/?p=94146 2024-12-19T23:03:40Z 2024-12-17T17:00:00Z Meta's Llama collection of open large language models (LLMs) continues to grow with the recent addition of Llama 3.3 70B, a text-only...]]>

Meta’s Llama collection of open large language models (LLMs) continues to grow with the recent addition of Llama 3.3 70B, a text-only instruction-tuned model. Llama 3.3 provides enhanced performance relative to the older Llama 3.1 70B model and can even match the capabilities of the larger, more computationally expensive Llama 3.1 405B model on several tasks including math, reasoning, coding…

Source

]]>
Ronay AK <![CDATA[Develop Multilingual and Cross-Lingual Information Retrieval Systems with Efficient Data Storage]]> https://developer.nvidia.com/blog/?p=93638 2024-12-17T20:42:28Z 2024-12-17T16:00:00Z Efficient text retrieval is critical for a broad range of information retrieval applications such as search, question answering, semantic textual similarity,...]]>

Efficient text retrieval is critical for a broad range of information retrieval applications such as search, question answering, semantic textual similarity, summarization, and item recommendation. It also plays a pivotal role in retrieval-augmented generation (RAG), a technique that enables large language models (LLMs) to access external context without modifying underlying parameters.

Source

]]>
Suhas Hariharapura Sheshadri https://www.linkedin.com/in/suhassheshadri/ <![CDATA[NVIDIA Jetson Orin Nano Developer Kit Gets a “Super” Boost]]> https://developer.nvidia.com/blog/?p=93942 2024-12-20T02:17:32Z 2024-12-17T14:00:00Z The generative AI landscape is rapidly evolving, with new large language models (LLMs), visual language models (VLMs), and vision language action (VLA) models...]]>

The generative AI landscape is rapidly evolving, with new large language models (LLMs), visual language models (VLMs), and vision language action (VLA) models emerging daily. To stay at the forefront of this transformative era, developers need a platform powerful enough to seamlessly deploy the latest models from the cloud to the edge with optimized inferencing and open ML frameworks using CUDA.

Source

]]>
Joseph Lucas <![CDATA[Sandboxing Agentic AI Workflows with WebAssembly]]> https://developer.nvidia.com/blog/?p=93975 2024-12-16T21:06:56Z 2024-12-16T20:33:46Z Agentic AI workflows often involve the execution of large language model (LLM)-generated code to perform tasks like creating data visualizations. However, this...]]>

Agentic AI workflows often involve the execution of large language model (LLM)-generated code to perform tasks like creating data visualizations. However, this code should be sanitized and executed in a safe environment to mitigate risks from prompt injection and errors in the returned code. Sanitizing Python with regular expressions and restricted runtimes is insufficient…

Source

]]>
Michelle Horton <![CDATA[Top Posts of 2024 Highlight NVIDIA NIM, LLM Breakthroughs, and Data Science Optimization]]> https://developer.nvidia.com/blog/?p=93566 2024-12-16T18:34:16Z 2024-12-16T18:34:14Z 2024 was another landmark year for developers, researchers, and innovators working with NVIDIA technologies. From groundbreaking developments in AI inference to...]]>

2024 was another landmark year for developers, researchers, and innovators working with NVIDIA technologies. From groundbreaking developments in AI inference to empowering open-source contributions, these blog posts highlight the breakthroughs that resonated most with our readers. NVIDIA NIM Offers Optimized Inference Microservices for Deploying AI Models at Scale Introduced in…

Source

]]>
Rohan Rao <![CDATA[Insights, Techniques, and Evaluation for LLM-Driven Knowledge Graphs]]> https://developer.nvidia.com/blog/?p=93677 2024-12-18T00:22:12Z 2024-12-16T17:00:00Z Data is the lifeblood of modern enterprises, fueling everything from innovation to strategic decision making. However, as organizations amass ever-growing...]]>

Data is the lifeblood of modern enterprises, fueling everything from innovation to strategic decision making. However, as organizations amass ever-growing volumes of information—from technical documentation to internal communications—they face a daunting challenge: how to extract meaningful insights and actionable structure from an overwhelming sea of unstructured data.

Source

]]>
Tanay Varshney <![CDATA[An Easy Introduction to Multimodal Retrieval-Augmented Generation for Video and Audio]]> https://developer.nvidia.com/blog/?p=93893 2024-12-16T21:53:48Z 2024-12-16T17:00:00Z Building a multimodal retrieval-augmented generation (RAG) system is challenging. The difficulty comes from capturing and indexing information from across...]]>

Building a multimodal retrieval-augmented generation (RAG) system is challenging. The difficulty comes from capturing and indexing information from across multiple modalities, including text, images, tables, audio, video, and more. In our previous post, An Easy Introduction to Multimodal Retrieval-Augmented Generation, we discussed how to tackle text and images. This post extends this conversation…

Source

]]>
Tianna Nguy <![CDATA[Upcoming Webinar: Gain Insights, and Tips from NVIDIA Certification Experts]]> https://developer.nvidia.com/blog/?p=93920 2024-12-13T17:14:22Z 2024-12-13T17:14:19Z Join the live webinar to learn practical exam preparation tips, and get your questions answered by NVIDIA recruiters on taking advantage of certifications for...]]>

Join the live webinar to learn practical exam preparation tips and get your questions answered by NVIDIA recruiters on taking advantage of certifications for your career.

Source

]]>
Zekun Hao <![CDATA[High-Fidelity 3D Mesh Generation at Scale with Meshtron]]> https://developer.nvidia.com/blog/?p=93665 2024-12-16T19:00:43Z 2024-12-13T16:57:33Z Meshes are one of the most important and widely used representations of 3D assets. They are the default standard in the film, design, and gaming industries and...]]>

Meshes are one of the most important and widely used representations of 3D assets. They are the default standard in the film, design, and gaming industries, and they are natively supported by virtually all 3D software and graphics hardware. A 3D mesh can be considered as a collection of polygon faces, most commonly consisting of triangles or quadrilaterals.
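
To make the "collection of polygon faces" idea concrete, here is a minimal sketch of an indexed mesh: a shared list of vertex positions plus faces that refer to vertices by index. The variable names and the `triangulate` helper are hypothetical and chosen for illustration only; they are not part of Meshtron.

```python
# Shared vertex positions (x, y, z) for a unit square in the z = 0 plane.
vertices = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (1.0, 1.0, 0.0),
    (0.0, 1.0, 0.0),
]

# One quadrilateral face, stored as indices into the vertex list.
quad_faces = [(0, 1, 2, 3)]


def triangulate(faces):
    """Fan-triangulate each polygon face (assumes convex faces)."""
    triangles = []
    for face in faces:
        for i in range(1, len(face) - 1):
            triangles.append((face[0], face[i], face[i + 1]))
    return triangles


triangles = triangulate(quad_faces)
print(triangles)  # [(0, 1, 2), (0, 2, 3)]
```

Storing faces as indices rather than repeated coordinates lets neighboring faces share vertices, which is why the same quad becomes two triangles without adding any new points.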

Source

]]>
Isabel Hulseman <![CDATA[Three Building Blocks for Creating AI Virtual Assistants for Customer Service with an NVIDIA AI Blueprint]]> https://developer.nvidia.com/blog/?p=90672 2024-12-12T19:35:14Z 2024-12-11T23:49:16Z In today's fast-paced business environment, providing exceptional customer service is no longer just a nice-to-have—it's a necessity. Whether addressing...]]>

In today’s fast-paced business environment, providing exceptional customer service is no longer just a nice-to-have—it’s a necessity. Whether addressing technical issues, resolving billing questions, or providing service updates, customers expect quick, accurate, and personalized responses at their convenience. However, achieving this level of service comes with significant challenges.

Source

]]>
Anjali Shah <![CDATA[NVIDIA TensorRT-LLM Now Accelerates Encoder-Decoder Models with In-Flight Batching]]> https://developer.nvidia.com/blog/?p=93516 2024-12-12T19:35:15Z 2024-12-11T22:10:51Z NVIDIA recently announced that NVIDIA TensorRT-LLM now accelerates encoder-decoder model architectures. TensorRT-LLM is an open-source library that optimizes...]]>

NVIDIA recently announced that NVIDIA TensorRT-LLM now accelerates encoder-decoder model architectures. TensorRT-LLM is an open-source library that optimizes inference for diverse model architectures, including the following: The addition of encoder-decoder model support further expands TensorRT-LLM capabilities, providing highly optimized inference for an even broader range of…

Source

]]>
Elias Wolfberg <![CDATA[New AI Research Foreshadows Autonomous Robotic Surgery]]> https://developer.nvidia.com/blog/?p=93547 2025-01-07T20:19:40Z 2024-12-10T17:26:44Z A robot commonly used and manually manipulated by surgeons for routine operations can now autonomously perform key surgical tasks as precisely as humans....]]>

A robot commonly used and manually manipulated by surgeons for routine operations can now autonomously perform key surgical tasks as precisely as humans. Researchers at Johns Hopkins and Stanford Universities revealed they have integrated a vision-language model (VLM)—trained on hours of surgical videos—with the widely used da Vinci robotic surgical system. Once connected with the VLM…

Source

]]>
Joanne Chang <![CDATA[Just Released: NVIDIA VILA VLM]]> https://developer.nvidia.com/blog/?p=93512 2024-12-12T19:35:17Z 2024-12-09T17:09:10Z Now available in preview, NVIDIA VILA is an advanced multimodal VLM that provides visual understanding of multi-images and video.]]>

Now available in preview, NVIDIA VILA is an advanced multimodal VLM that provides visual understanding of multi-images and video.

Source

]]>
Aditi Bodhankar <![CDATA[Content Moderation and Safety Checks with NVIDIA NeMo Guardrails]]> https://developer.nvidia.com/blog/?p=92908 2024-12-13T01:12:41Z 2024-12-06T17:23:51Z Content moderation has become essential in retrieval-augmented generation (RAG) applications powered by generative AI, given the extensive volume of...]]>

Source

]]>
Michael Zephyr <![CDATA[Celebrating Open Science and Enterprise AI Innovation on MONAI’s 5th Anniversary]]> https://developer.nvidia.com/blog/?p=92886 2024-12-20T18:35:40Z 2024-12-05T22:13:17Z As MONAI celebrates its fifth anniversary, we're witnessing the convergence of our vision for open medical AI with production-ready enterprise solutions. ...]]>

As MONAI celebrates its fifth anniversary, we’re witnessing the convergence of our vision for open medical AI with production-ready enterprise solutions. This announcement brings two exciting developments: the release of MONAI Core v1.4, expanding open-source capabilities, and the general availability of VISTA-3D and MAISI as NVIDIA NIM microservices. This dual release reflects our…

Source

]]>
Shubham Agrawal <![CDATA[Build an Agentic Video Workflow with Video Search and Summarization]]> https://developer.nvidia.com/blog/?p=92834 2025-01-07T05:45:50Z 2024-12-03T18:30:00Z Building a question-answering chatbot with large language models (LLMs) is now a common workflow for text-based interactions. What about creating an AI system...]]>

Building a question-answering chatbot with large language models (LLMs) is now a common workflow for text-based interactions. What about creating an AI system that can answer questions about video and image content? This presents a far more complex task. Traditional video analytics tools struggle due to their limited functionality and a narrow focus on predefined objects.

Source

]]>
Carl (Izzy) Putterman <![CDATA[TensorRT-LLM Speculative Decoding Boosts Inference Throughput by up to 3.6x]]> https://developer.nvidia.com/blog/?p=92847 2025-01-11T17:32:51Z 2024-12-02T23:09:43Z NVIDIA TensorRT-LLM support for speculative decoding now provides a speedup of over 3x in total token throughput. TensorRT-LLM is an open-source library that...]]>

NVIDIA TensorRT-LLM support for speculative decoding now provides a speedup of over 3x in total token throughput. TensorRT-LLM is an open-source library that provides blazing-fast inference support for numerous popular large language models (LLMs) on NVIDIA GPUs. By adding support for speculative decoding on single GPU and single-node multi-GPU, the library further expands its supported…

Source

]]>
Chen Tessler <![CDATA[Unified Whole-Body Control for Physically Simulated Humanoids]]> https://developer.nvidia.com/blog/?p=92803 2024-12-12T19:38:33Z 2024-12-02T22:18:29Z Creating interactive simulated humanoids that move naturally and respond intelligently to diverse control inputs remains one of the most challenging problems in...]]>

Creating interactive simulated humanoids that move naturally and respond intelligently to diverse control inputs remains one of the most challenging problems in computer animation and robotics. High-performance GPU-accelerated simulators such as NVIDIA Isaac Sim and robot policy training using NVIDIA Isaac Lab enable significant progress in training interactive humanoids.

Source

]]>
Manoj C R <![CDATA[Spotlight: TCS Increases Automotive Software Testing Speeds by 2x Using NVIDIA Generative AI]]> https://developer.nvidia.com/blog/?p=92444 2024-12-12T19:38:36Z 2024-11-22T20:07:53Z Generative AI is transforming every aspect of the automotive industry, including software development, testing, user experience, personalization, and safety....]]>

Generative AI is transforming every aspect of the automotive industry, including software development, testing, user experience, personalization, and safety. With the automotive industry shifting from a mechanically driven approach to a software-driven one, generative AI is unlocking a world of possibilities. Tata Consultancy Services (TCS) focuses on two major segments for leveraging…

Source

]]>
Xin Dong <![CDATA[Hymba Hybrid-Head Architecture Boosts Small Language Model Performance]]> https://developer.nvidia.com/blog/?p=92595 2024-12-12T19:38:36Z 2024-11-22T17:31:14Z Transformers, with their attention-based architecture, have become the dominant choice for language models (LMs) due to their strong performance,...]]>

Transformers, with their attention-based architecture, have become the dominant choice for language models (LMs) due to their strong performance, parallelization capabilities, and long-term recall through key-value (KV) caches. However, their quadratic computational cost and high memory demands pose efficiency challenges. In contrast, state space models (SSMs) like Mamba and Mamba-2 offer constant…

Source

]]>
Amr Elmeleegy <![CDATA[NVIDIA TensorRT-LLM Multiblock Attention Boosts Throughput by More Than 3x for Long Sequence Lengths on NVIDIA HGX H200]]> https://developer.nvidia.com/blog/?p=92591 2024-12-12T19:47:20Z 2024-11-22T00:53:18Z Generative AI models are advancing rapidly. Every generation of models comes with a larger number of parameters and longer context windows. The Llama 2 series...]]>

Generative AI models are advancing rapidly. Every generation of models comes with a larger number of parameters and longer context windows. The Llama 2 series of models introduced in July 2023 had a context length of 4K tokens, and the Llama 3.1 models, introduced only a year later, dramatically expanded that to 128K tokens. While long context lengths allow models to perform cognitive tasks…

Source

]]>
Zenodia Charpy <![CDATA[Build Your First Human-in-the-Loop AI Agent with NVIDIA NIM]]> https://developer.nvidia.com/blog/?p=91339 2024-12-12T19:38:38Z 2024-11-21T22:45:13Z AI agents powered by large language models (LLMs) help organizations streamline and reduce manual workloads. These agents use multilevel, iterative reasoning to...]]>

AI agents powered by large language models (LLMs) help organizations streamline and reduce manual workloads. These agents use multilevel, iterative reasoning to analyze problems, devise solutions, and execute tasks with various tools. Unlike traditional chatbots, LLM-powered agents automate complex tasks by effectively understanding and processing information. To avoid potential risks in specific…

Source

]]>
Bethann Noble <![CDATA[Deploying Fine-Tuned AI Models with NVIDIA NIM]]> https://developer.nvidia.com/blog/?p=91696 2024-12-17T00:07:21Z 2024-11-21T22:04:57Z For organizations adapting AI foundation models with domain-specific data, the ability to rapidly create and deploy fine-tuned models is key to efficiently...]]>

For organizations adapting AI foundation models with domain-specific data, the ability to rapidly create and deploy fine-tuned models is key to efficiently delivering value with enterprise generative AI applications. NVIDIA NIM offers prebuilt, performance-optimized inference microservices for the latest AI foundation models, including seamless deployment of models customized using parameter…

Source

]]>
Shashank Maheshwari <![CDATA[NVIDIA JetPack 6.1 Boosts Performance and Security through Camera Stack Optimizations and Introduction of Firmware TPM]]> https://developer.nvidia.com/blog/?p=91283 2024-12-12T19:47:55Z 2024-11-21T22:01:16Z NVIDIA JetPack has continuously evolved to offer cutting-edge software tailored to the growing needs of edge AI and robotic developers. With each release,...]]>

NVIDIA JetPack has continuously evolved to offer cutting-edge software tailored to the growing needs of edge AI and robotic developers. With each release, JetPack has enhanced its performance, introduced new features, and optimized existing tools to deliver increased value to its users. This means that your existing Jetson Orin-based products experience performance optimizations by upgrading to…

Source

]]>
Phoebe Lee <![CDATA[Powering AI-Augmented Workloads with NVIDIA and Windows 365]]> https://developer.nvidia.com/blog/?p=91709 2024-12-12T19:38:43Z 2024-11-21T17:26:44Z We are entering a new era of AI-powered digital workflow, where Windows 365 Cloud PCs are dynamic platforms that host AI technologies and reshape traditional...]]>

We are entering a new era of AI-powered digital workflow, where Windows 365 Cloud PCs are dynamic platforms that host AI technologies and reshape traditional processes. GPU acceleration unlocks the potential for AI-augmented workloads running on Windows 365 Cloud PCs, enabling advanced computing capabilities for everyone. The integration of NVIDIA GPUs with NVIDIA RTX Virtual Workstation…

Source

]]>
Pralaypati Ta <![CDATA[Advancing Neuroscience Research with Visual Question Answering and Multimodal Retrieval]]> https://developer.nvidia.com/blog/?p=90772 2024-11-20T00:26:26Z 2024-11-20T21:30:00Z Leading healthcare organizations are turning to generative AI to help build applications that can deliver life-saving impacts. These organizations include the...]]>

Leading healthcare organizations are turning to generative AI to help build applications that can deliver life-saving impacts. These organizations include the Indian Institute of Technology Madras – IIT Madras Brain Centre. Advancing neuroscience research, the IIT Madras Brain Centre is using AI to generate analyses of whole human brains at a cellular level across various demographics.

Source

]]>
Hoang Nguyen <![CDATA[Processing High-Quality Vietnamese Language Data with NVIDIA NeMo Curator]]> https://developer.nvidia.com/blog/?p=92268 2024-12-20T18:38:19Z 2024-11-19T21:04:13Z Open-source large language models (LLMs) excel in English but struggle with other languages, especially the languages of Southeast Asia. This is primarily due...]]>

Open-source large language models (LLMs) excel in English but struggle with other languages, especially the languages of Southeast Asia. This is primarily due to a lack of training data in these languages, limited understanding of local cultures, and insufficient tokens to capture unique linguistic structures and expressions. To fully meet customer needs, enterprises in non-English-speaking…

Source

]]>
Xhoni Shollaj <![CDATA[Create a Custom Slackbot LLM Agent with NVIDIA NIM and LangChain]]> https://developer.nvidia.com/blog/?p=89825 2025-02-17T05:12:38Z 2024-11-19T17:00:00Z In the dynamic world of modern business, where communication and efficient workflows are crucial for success, AI-powered solutions have become a competitive...]]>

In the dynamic world of modern business, where communication and efficient workflows are crucial for success, AI-powered solutions have become a competitive advantage. AI agents, built on cutting-edge large language models (LLMs) and powered by NVIDIA NIM, provide a seamless way to enhance productivity and information flow. NIM, part of NVIDIA AI Enterprise, is a suite of easy-to-use…

Source

]]>
Ashraf Eassa <![CDATA[Llama 3.2 Full-Stack Optimizations Unlock High Performance on NVIDIA GPUs]]> https://developer.nvidia.com/blog/?p=90142 2024-11-22T23:11:53Z 2024-11-19T16:00:00Z Meta recently released its Llama 3.2 series of vision language models (VLMs), which come in 11B parameter and 90B parameter variants. These models are...]]>

Meta recently released its Llama 3.2 series of vision language models (VLMs), which come in 11B parameter and 90B parameter variants. These models are multimodal, supporting both text and image inputs. In addition, Meta has launched text-only small language model (SLM) variants of Llama 3.2 with 1B and 3B parameters. NVIDIA has optimized the Llama 3.2 collection of models for great performance and…

Source

]]>
James Mills <![CDATA[Building a Generative AI OpenUSD App for Brand-Accurate Marketing Visuals]]> https://developer.nvidia.com/blog/?p=92177 2024-12-09T22:38:59Z 2024-11-19T13:30:00Z Today, brands and their creative agencies are under huge strain to create and deliver high-quality, accurate product images at scale, from campaign key visuals...]]>

Today, brands and their creative agencies are under huge strain to create and deliver high-quality, accurate product images at scale, from campaign key visuals to packshots for e-commerce. Audience-targeted content, such as personalized and localized visual variations, adds further layers of complexity to production. Production costs, short timelines, resources…

Source

]]>
Mario Geiger <![CDATA[Accelerate Drug and Material Discovery with New Math Library NVIDIA cuEquivariance]]> https://developer.nvidia.com/blog/?p=91896 2024-11-18T22:58:58Z 2024-11-18T18:30:00Z AI models for science are often trained to make predictions about the workings of nature, such as predicting the structure of a biomolecule or the properties of...]]>

AI models for science are often trained to make predictions about the workings of nature, such as predicting the structure of a biomolecule or the properties of a new solid that can become the next battery material. These tasks require high precision and accuracy. What makes AI for science even more challenging is that highly accurate and precise scientific data is often scarce…

Source

]]>
Szymon Karpiński <![CDATA[Fusing Epilog Operations with Matrix Multiplication Using nvmath-python]]> https://developer.nvidia.com/blog/?p=92098 2024-11-21T21:07:24Z 2024-11-18T18:30:00Z nvmath-python (Beta) is an open-source Python library, providing Python programmers with access to high-performance mathematical operations from NVIDIA CUDA-X...]]>

nvmath-python (Beta) is an open-source Python library, providing Python programmers with access to high-performance mathematical operations from NVIDIA CUDA-X math libraries. nvmath-python provides both low-level bindings to the underlying libraries and higher-level Pythonic abstractions. It is interoperable with existing Python packages, such as PyTorch and CuPy. In this post, I show how to…

Source

]]>
Rob Nertney <![CDATA[Exploring the Case of Super Protocol with Self-Sovereign AI and NVIDIA Confidential Computing]]> https://developer.nvidia.com/blog/?p=91216 2025-02-04T19:53:37Z 2024-11-14T22:01:38Z Confidential and self-sovereign AI is a new approach to AI development, training, and inference where the user’s data is decentralized, private, and...]]>

Confidential and self-sovereign AI is a new approach to AI development, training, and inference where the user’s data is decentralized, private, and controlled by the users themselves. This post explores how the capabilities of Confidential Computing (CC) are expanded through decentralization using blockchain technology. The problem being solved is most clearly shown through the use of…

Source

]]>
Trisha Tripathi <![CDATA[Expanding AI Agent Interface Options with 2D and 3D Digital Human Avatars]]> https://developer.nvidia.com/blog/?p=91882 2024-11-14T17:10:33Z 2024-11-14T00:53:23Z When interfacing with generative AI applications, users have multiple communication options: text, voice, or digital avatars. Traditional chatbot...]]>

When interfacing with generative AI applications, users have multiple communication options: text, voice, or digital avatars. Traditional chatbot or copilot applications have text interfaces where users type in queries and receive text-based responses. For hands-free communication, speech AI technologies like automatic speech recognition (ASR) and text-to-speech (TTS) facilitate…

Source

]]>
Amit Bleiweiss <![CDATA[Mastering LLM Techniques: Data Preprocessing]]> https://developer.nvidia.com/blog/?p=91738 2025-02-04T19:54:19Z 2024-11-13T18:05:06Z The advent of large language models (LLMs) marks a significant shift in how industries leverage AI to enhance operations and services. By automating routine...]]>

The advent of large language models (LLMs) marks a significant shift in how industries leverage AI to enhance operations and services. By automating routine tasks and streamlining processes, LLMs free up human resources for more strategic endeavors, thus improving overall efficiency and productivity. Training and customizing LLMs for high accuracy is fraught with challenges…

Source

]]>
Kyle Tretina <![CDATA[Boost Alphafold2 Protein Structure Prediction with GPU-Accelerated MMseqs2]]> https://developer.nvidia.com/blog/?p=91623 2024-11-14T17:10:35Z 2024-11-13T17:00:00Z The ability to compare the sequences of multiple related proteins is a foundational task for many life science researchers. This is often done in the form of a...]]>

The ability to compare the sequences of multiple related proteins is a foundational task for many life science researchers. This is often done in the form of a multiple sequence alignment (MSA), and the evolutionary information retrieved from these alignments can yield insights into protein structure, function, and evolutionary history. Now, with MMseqs2-GPU, an updated GPU-accelerated…

Source

]]>