Scale Enterprise AI with Canonical and NVIDIA

Canonical

on 21 March 2023

Tags: AI, AI/ML, DGX, Kubeflow, MLOps, nvidia


Charmed Kubeflow is now certified in the NVIDIA DGX-Ready Software Program for MLOps!

Canonical is proud to announce that Charmed Kubeflow is now certified as part of the NVIDIA DGX-Ready Software program. This collaboration accelerates at-scale deployments of AI and data science projects on the highest-performing AI infrastructure, providing a unified and optimised institution-wide solution for data science workflows from experiment to training and serving.

Canonical, in collaboration with NVIDIA, enables and optimises Ubuntu on NVIDIA DGX AI supercomputers or mainstream servers built on the NVIDIA EGX Enterprise platform. Today we are taking the next step, launching an end-to-end certified MLOps solution that brings together the best of open-source MLOps with the best hardware for training and inference to give organisations a robust and unified stack from kernel and drivers through to the data science application layer.

Canonical and NVIDIA teams test both single-node and multi-node DGX systems to validate the functionality of Charmed Kubeflow on top of both MicroK8s and Charmed Kubernetes. With the entire machine learning workflow on this stack, enterprises unlock quicker experimentation and faster delivery for AI initiatives.

Take AI models from concept to production

Recent breakthroughs in generative AI have motivated organisations to spend 5% of their digital budget on AI (McKinsey – The State of AI in 2022) and to challenge multiple teams to integrate AI into their innovation plans and processes. Key skills are in high demand, as are infrastructure capabilities such as compute power, workflow automation, and continuous monitoring. Charmed Kubeflow on NVIDIA hardware directly addresses these infrastructure needs.

Charmed Kubeflow is an open-source, end-to-end MLOps platform on Kubernetes. It automates machine learning workflows to create a reliable application layer for model development, iteration and production deployment with KServe, Knative, or NVIDIA Triton Inference Server, regardless of the ML framework used.

NVIDIA DGX systems are purpose-built for enterprise AI. These platforms feature NVIDIA Tensor Core GPUs, which vastly outperform traditional CPUs for machine learning workloads, alongside advanced networking and storage capabilities. With record-breaking performance, NVIDIA DGX systems run optimised Ubuntu Pro with a 10-year security commitment. Additionally, DGX systems include NVIDIA AI Enterprise, the software layer of the NVIDIA AI platform, which includes over 50 frameworks and pretrained models to accelerate development.

“Canonical works closely with NVIDIA to enable companies to run AI at scale easily. Together, we facilitate the development of optimised machine learning models, using AI-specialised infrastructure, with MLOps open source,” said Andreea Munteanu, MLOps Product Manager at Canonical. “Extending this collaboration to AI tools and frameworks such as NVIDIA Triton Inference Server offers developers a fully integrated development pipeline.”

“NVIDIA DGX systems make it possible for every enterprise to access turnkey AI supercomputing to address their most complex AI challenges,” said Tony Paikeday, Senior Director of AI Systems at NVIDIA. “DGX-Ready MLOps software like Charmed Kubeflow maximises both the utilisation and efficiency of DGX systems to accelerate the development of AI models.”

Maximise efficiency of AI infrastructure with MLOps solutions

In 2018, OpenAI reported that the compute capacity used in large-scale AI training runs had doubled every 3.4 months since 2012. Around the same time, the volume of data generated also increased dramatically.
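To put the 3.4-month doubling period in perspective, the short sketch below converts it into a growth factor over a longer window. It assumes, purely for illustration, that the doubling rate stays constant over that window. The function name `growth_factor` is hypothetical and not part of any library.

```python
def growth_factor(months: float, doubling_period_months: float = 3.4) -> float:
    """Return the multiplicative growth after `months`.

    Assumes a constant doubling period (an illustrative simplification,
    not a claim about how the trend actually behaves over time).
    """
    return 2 ** (months / doubling_period_months)


if __name__ == "__main__":
    # Growth over one and two years at a 3.4-month doubling period.
    for window in (12, 24):
        print(f"{window} months: about {growth_factor(window):.1f}x")
```

Under that assumption, a 3.4-month doubling period works out to more than a tenfold increase in compute per year, which helps explain why general-purpose infrastructure struggles to keep pace.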

Traditional, general-purpose enterprise infrastructure cannot deliver the required computing power, nor can it support the petabytes of data required to train accurate AI models at this scale. Instead, enterprises need dedicated hardware designed for AI workloads.

AI infrastructure solutions such as NVIDIA DGX systems accelerate the ROI of AI initiatives with a proven platform optimised for the unique demands of enterprise. Businesses can pair their DGX environment with MLOps solutions to operationalise AI development at scale. Leading MLOps platforms, such as Canonical’s Charmed Kubeflow, are tested and optimised to work on DGX systems, so users can get the most out of their AI infrastructure without manually integrating and configuring their MLOps software.

For more information about the latest features on Charmed Kubeflow, please check out the solution brief.

Learn more about the NVIDIA DGX-Ready Software program.

Join the NVIDIA and Canonical joint webinar on 28 March 2023 to learn more about AI at scale.

Run Kubeflow anywhere, easily

With Charmed Kubeflow, deployment and operations of Kubeflow are easy for any scenario.

Charmed Kubeflow is a collection of Python operators that define integration of the apps inside Kubeflow, like katib or pipelines-ui.

Use Kubeflow on-premises, on the desktop, at the edge, in the public cloud, or across multiple clouds.

Learn more about Charmed Kubeflow ›

What is Kubeflow?

Kubeflow makes deployments of Machine Learning workflows on Kubernetes simple, portable and scalable.

Kubeflow is the machine learning toolkit for Kubernetes. It extends Kubernetes’ ability to run independent and configurable steps with machine-learning-specific frameworks and libraries.

Learn more about Kubeflow ›

Install Kubeflow

The Kubeflow project is dedicated to making deployments of machine learning workflows on Kubernetes simple, portable and scalable.

You can install Kubeflow on your workstation, local server or public cloud VM. It is easy to install with MicroK8s in any of these environments and can be scaled to high availability.

Install Kubeflow ›
