Update shared on 21 Oct 2024
The Latest on Nvidia's NVLM 1.0
Nvidia has recently unveiled NVLM 1.0, a family of open-source multimodal large language models (MLLMs) designed to compete with leading proprietary systems like OpenAI's GPT-4 and open-source models such as Meta's Llama 3.1. The flagship model, NVLM-D-72B, has 72 billion parameters and delivers performance competitive with both proprietary and open-source rivals across vision and language tasks.
For those of you who are a bit more technologically inclined, there are a few key features that make NVLM quite interesting:
- The family comes in three architectural variants (decoder-only, cross-attention, and a hybrid of the two), each optimized for different tasks and trade-offs between training efficiency and multimodal reasoning.
- Nvidia introduced a novel 1-D "tile-tagging" system for efficient handling of high-resolution images, which should deliver far better performance on tasks requiring detailed image analysis (see the sketch after this list).
- Interestingly, unlike many multimodal models whose text performance declines after multimodal training, NVLM-D-72B actually improves its text-only accuracy, by an average of 4.3 points across key benchmarks.
- Nvidia is committed to an open approach: it is releasing the model weights and plans to provide the training code in Megatron-Core, which should encourage community collaboration and improvement. It isn't fully open source, since there isn't full transparency on the training data, but it's a great step toward generating engagement within the community.
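To make the tile-tagging idea a little more concrete, here is a minimal sketch of how a 1-D tag might be interleaved with each image tile's vision tokens before the flattened sequence is fed to the language model. This is my own illustration based on the general idea; the tag format, the helper name tag_tiles, and the placeholder tokens are assumptions, not Nvidia's actual implementation.

```python
# Minimal, hypothetical sketch of 1-D tile tagging (illustrative only).
# A high-resolution image is split into tiles; each tile is encoded into
# vision tokens, and a text tag is inserted before each tile's tokens so
# the LLM can tell where one tile ends and the next begins in the
# flattened 1-D sequence.

def tag_tiles(tile_token_seqs):
    """Flatten per-tile token sequences into one 1-D sequence, inserting
    a <tile_k> tag before each tile's tokens (tag format is an assumption)."""
    flattened = []
    for i, tokens in enumerate(tile_token_seqs, start=1):
        flattened.append(f"<tile_{i}>")
        flattened.extend(tokens)
    return flattened

# Toy example: 3 tiles, each already encoded into 2 placeholder vision tokens.
tiles = [[f"v{i}_{j}" for j in range(2)] for i in range(3)]
print(tag_tiles(tiles))
# ['<tile_1>', 'v0_0', 'v0_1', '<tile_2>', 'v1_0', 'v1_1', '<tile_3>', ...]
```

The appeal of the design is its simplicity: the tags give the language model explicit structure for where each tile sits, without bolting on any extra 2-D positional machinery.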
My Latest Thoughts
The release of NVLM 1.0 is a significant development that I hadn't accounted for in my narrative, but it comes as a welcome surprise. Although it's come a bit out of left field, I think it aligns well with my original narrative and could be an important new catalyst for Nvidia's future growth.
Although I'm not very savvy when it comes to LLMs, the fact that one of the NVLM 1.0 models shows improved text-only performance over its LLM backbone after multimodal training is an impressive achievement. While I don't think this will be disruptive on its own, it could position Nvidia's models favourably for multimodal applications and make them appealing to a broader range of use cases than competitors' LLMs.
NVLM 1.0's open-source release, even without full disclosure of the training data, is a strategic move. By releasing the model weights and planning to make the training code available, Nvidia is encouraging widespread adoption and collaboration. This openness allows researchers and developers to build upon the model, leading to rapid improvements and optimizations. The more the community engages with NVLM 1.0, the more its efficacy is likely to increase.
From a business perspective, this strategy could create a positive feedback loop for Nvidia. The company doesn't need NVLM 1.0 to be the absolute best model on the market; it needs sufficient adoption to drive more GPU sales. I think of it like Nvidia attempting to create a software and hardware ecosystem in the way Apple has. As more people use the model, the demand for Nvidia GPUs could increase, especially if users aim to run these models locally or develop their own variations. Essentially, Nvidia is positioning itself to benefit from both the hardware and software sides of AI development—much like "owning the mine and the shop that sells the pickaxes."
This development could be Nvidia's first step toward a business plan built around consumer AI-optimized GPUs, empowering individuals and smaller organizations to create and run their own models. While Nvidia's current consumer GPU lineup serves both gaming and workstation workloads, the bulk of consumer sales are for gaming. Nvidia has the opportunity to augment its current lineup, or launch a new line of GPUs, focussed on machine learning and running AI models locally. If successful, this market could potentially rival the gaming GPU market in size. While this is a possibility rather than a probability, it aligns well with some of my existing catalysts and almost blends two of them together.
Ultimately, I think these latest developments on NVLM 1.0 not only support my existing narrative but enhance it by introducing a new avenue of growth for Nvidia. In some respects, it mitigates some of the risk from increased competition by building a larger ecosystem centered around Nvidia's technology. While I believe this move further solidifies Nvidia's position at the top of the game, I won't yet include it in my valuation. I strongly believe this has the potential to change the game for how and why consumers use GPUs, but I want to see which paths Nvidia looks to head down before assigning value to this catalyst.
Disclaimer
Simply Wall St analyst Bailey holds no position in NasdaqGS:NVDA. Simply Wall St has no position in the company(s) mentioned. Simply Wall St may provide the securities issuer or related entities with website advertising services for a fee, on an arm's length basis. These relationships have no impact on the way we conduct our business, the content we host, or how our content is served to users. This narrative is general in nature and explores scenarios and estimates created by the author. The narrative does not reflect the opinions of Simply Wall St, and the views expressed are the opinion of the author alone, acting on their own behalf. These scenarios are not indicative of the company's future performance and are exploratory in the ideas they cover. The fair value estimates are estimations only, and do not constitute a recommendation to buy or sell any stock, and they do not take account of your objectives or your financial situation. Note that the author's analysis may not factor in the latest price-sensitive company announcements or qualitative material.