Nvidia unveiled new open models, data and tools at its GTC conference in Washington, D.C. on Tuesday.
The new open models in the Nemotron family are Nemotron Nano 3, Nemotron Nano 2 VL, Nemotron Parse and Nemotron Safety Guard.
Nano 3 uses a hybrid mixture-of-experts architecture for enhanced reasoning throughput in areas such as software development, customer service and IT. Nano 2 VL helps with image reasoning and video analysis. Parse extracts text and tables from documents, while Safety Guard finds harmful content in nine languages. And Nemotron retrieval-augmented generation (RAG) features advanced document extraction and unified retrieval across text, image, audio and video data sources.
Nvidia also released new NeMo datasets, including multimodal training data and NeMo Data Designer, designed for generating synthetic data.
In addition, Nvidia released an open source Nemotron data set for physical AI, which includes 1,700 hours of multimodal driving sensor data collected across the U.S. and Europe.
In this Q&A, Kari Briski, vice president, generative AI software for enterprise at Nvidia, discusses the vendor’s strategy for Nemotron and why Nvidia is aiming to be a fully open source vendor.
Nemotron is Nvidia’s open source model, but what makes this model stand out compared to all the other foundation models in the market?
Kari Briski: A lot of people try to put us in a box of just a foundation model. But when we personally and internally talk about Nemotron, it’s so much more than that.
The whole community has adopted the phrase ‘open weights’ because they say, ‘well, it’s not truly open source.’
We are striving for true open source. We have released our data sets, so we now have a number of Nemotron data sets. We put out our pre-training data, our post-training data and additional data sets that we've continued to work on even after releasing a model, things like personas: data sets that are statistically representative of the census but completely private, so a model can learn about the world without being able to regurgitate personal information.
What strategy do you take with Nemotron that is different from other open source models?
Briski: When we talk about Nemotron, we talk about the whole platform.
Open models, up until now, have not had a reliable roadmap. There are a few companies that have been putting models out, but you don’t necessarily know when, why, or how often. Will they fix the bugs in them? Will they release that data set? Will they release the algorithm they used to train it?
Our mission is to infuse that ecosystem’s flywheel with complete openness in innovation through transparency, trust, and reliability.
What I also notice about some of the model builders is that they might even release some software, but it’s not the software that they used internally to build that model. And so, what we put out there is literally what we use; we know that the model converges. We know that it scales for training and reinforcement learning. We know that it scales for inference.
There are a couple of reasons why we've done it. We've seen some challenges in engaging with customers. One is the shift to systems of models. It's like a new type of object-oriented programming, where the objects have some autonomy and generation.
The other is customization. Enterprises that need to go the last mile really need to customize.
My theory for why these enterprise applications, which we all engage with every day, haven't truly integrated generative AI in the way they will in the future is that the necessary systems aren't yet in place for them to scale and, most importantly, specialize.
Open models fuel that customization, and having the data allows us to ask, 'OK, what was it trained on? Which data should I remove?' I can trust the model now, because the data has been put out there and I've interrogated it.
How does Nvidia train Nemotron?
Briski: We focus on making Nemotron super-efficient. We have identified three scaling factors that we've observed and for which we create solutions, and all three are about accuracy; think of accuracy as the y-axis.

The first is pre-training: the bigger the model, and the more data and computation it consumes, the more accurate it will be. The second is post-training: the more data, the more reinforcement learning. The third is what we call test-time compute, or inference-time scaling, and this has really been pushed to its limits this year with the introduction of reasoning models. The more tokens a model can spend thinking, the more of the solution space it can explore, the more self-reflection it can do and the more accurate the answer.

The demand for tokens has increased, while the cost per token has come down. Being efficient in research, finding better ways of reasoning and achieving accuracy is part of our Nemotron strategy.
Think of it as libraries, or as the commitment we've made with CUDA [compute unified device architecture] and all of the CUDA libraries, along with the great developer ecosystem we have for that. This is what we want to build with Nemotron, because generative AI is going to be that new development platform. And so, you can rely on a roadmap; you can also rely on this data set or this technique.
How do you keep moving ahead when you’ve already revealed your roadmap?
Briski: We build models for ourselves rather than just for the ecosystem, not only to have a good model internally, but to inform the way we design our chip architecture and data centers for both training and inference. It's not enough to have a really accurate model; anyone can run dummy data through an architecture at scale and get some speeds and feeds. You have to understand how you scale across thousands of GPU nodes to train the model, update it and use an algorithm that converges, converges in time and converges efficiently. And once you've trained it, the same goes for inference.
Why is it important for vendors to constantly innovate on models, especially open models?
Briski: You have to keep it fresh. You have to update it. It's not just about the new model; it's also about helping customers. We've been taking them on this evolution so they don't feel like they need to be one and done, because that's absolutely not how software is built.
Look at software companies: they're constantly updating their code bases, recompiling, releasing fixes every night, doing hot patches. This is a new approach to software development. To do that, enterprises need to set up what we call the data flywheel, and we've released all of that with Nemotron: the way you curate your data, the way you might do synthetic data generation to augment and improve it, the way you train and the way you evaluate. Is the new model better than the one I've already released? If not, how do I guardrail it and then release it back out? Enterprises must adopt this, and they are starting to do so now.
How does Nemotron fit into the overall Nvidia strategy and ecosystem?
Briski: This is a new way of developing software, running software and interacting with systems in the future. We believe it's foundational to the way we build and deliver systems, the way people interact with them and the way developers work on our platform.
And if you look at what we're doing with this end-to-end development, we take it from the data sets to the scaling of the data center to the model and architecture itself, even the inference, and then we release it all. You can take the whole thing or just the pieces you want: just a data set, just the model checkpoint, just a curator. That is the same as we've done with any other stack here at Nvidia. It's extreme co-design.
The way we build data centers is based on a data center reference architecture, which encompasses everything from compute to storage, networking, and cooling. The way we do reference architecture is that people building data centers either use that reference architecture, tweak it, or buy the pieces they need. That’s the way we’ve built systems and software in the past as well.
Nvidia works with so many vendors. With the release of Nemotron's data set, why aren't you afraid of competition?
Briski: We were afraid a few years ago, but the more we’ve learned from the engagement of the ecosystem, the more we realize that we do so much better when there’s a vibrant, thriving and diverse ecosystem, when there’s not one model to rule them all.
Diversity aids innovation, but it also provides a fantastic development platform for us and our partners. Putting out the data and models is better for our business as a development platform. The secret sauce for us is not in these base models we put out; the importance comes from the specialization across the ecosystem.
Editor’s note: This interview has been edited for clarity and conciseness.