Mehran Maghoumi – NVIDIA Technical Blog
News and tutorials for developers, data scientists, and IT admins
2025-02-28T20:23:54Z
https://developer.nvidia.com/blog/feed/

Mehran Maghoumi
Build an AI Agent with Expert Reasoning Capabilities Using the DeepSeek-R1 NIM
https://developer.nvidia.com/blog/?p=96030
2025-02-28T20:23:54Z 2025-02-28T20:23:51Z

AI agents are transforming business operations by automating processes, optimizing decision-making, and streamlining actions. Their effectiveness hinges on expert reasoning, enabling smarter planning and efficient execution. Agentic AI applications could benefit from the capabilities of models such as DeepSeek-R1. Built for solving problems that require advanced AI reasoning…

Mehran Maghoumi
Streamlining Data Processing for Domain Adaptive Pretraining with NVIDIA NeMo Curator
https://developer.nvidia.com/blog/?p=87876
2024-10-18T20:11:21Z 2024-09-10T16:30:00Z

Domain-adaptive pretraining (DAPT) of large language models (LLMs) is an important step towards building domain-specific models. These models demonstrate greater capabilities in domain-specific tasks compared to their off-the-shelf open or commercial counterparts. Recently, NVIDIA published a paper about ChipNeMo, a family of foundation models that are geared toward industrial chip design…

Mehran Maghoumi
Curating Custom Datasets for LLM Parameter-Efficient Fine-Tuning with NVIDIA NeMo Curator
https://developer.nvidia.com/blog/?p=85771
2024-10-18T20:13:02Z 2024-07-31T16:00:00Z

In a recent post, we discussed how to use NVIDIA NeMo Curator to curate custom datasets for pretraining or continuous training use cases of large language...

Mehran Maghoumi
Curating Custom Datasets for LLM Training with NVIDIA NeMo Curator
https://developer.nvidia.com/blog/?p=82737
2024-10-18T20:14:38Z 2024-05-21T17:00:00Z

Data curation is the first, and arguably the most important, step in the pretraining and continuous training of large language models (LLMs) and small language models (SLMs). NVIDIA recently announced the open-source release of NVIDIA NeMo Curator, a data curation framework that prepares large-scale, high-quality datasets for pretraining generative AI models. NeMo Curator, which is part of…

Mehran Maghoumi
Scale and Curate High-Quality Datasets for LLM Training with NVIDIA NeMo Curator
https://developer.nvidia.com/blog/?p=80168
2025-02-17T05:28:15Z 2024-03-27T18:00:00Z

Enterprises are using large language models (LLMs) as powerful tools to improve operational efficiency and drive innovation. NVIDIA NeMo microservices aim to make building and deploying models more accessible to enterprises. An important step for building any LLM system is to curate the dataset of tokens to be used for training or customizing the model. However, curating a suitable dataset…
