OpenShift simplifies GPU Computing for NVIDIA

05 Mar

By: Sigin Jacob George

Known for popularizing the graphics processing unit (GPU), NVIDIA is now helping enterprise customers adopt GPU-accelerated computing for artificial intelligence (AI) and high-performance computing (HPC) applications. Through its partnership with Red Hat, NVIDIA helps customers run GPU-accelerated computing on Red Hat OpenShift. A GPU operator, developed initially by Red Hat and now owned by NVIDIA, simplifies GPU-accelerated computing on an enterprise-level container platform.

Benefits

Offered crucial insight for ongoing development of the operator
Ensured optimal compute efficiency for customers’ AI and HPC workloads
Saved customers time and avoided manual errors because of automation
Provided customers with support and expertise from the right partner

Facilitating processing-intensive computing in the enterprise

NVIDIA popularized the GPU, a specialized processor that can process many pieces of data simultaneously, and helped enterprise customers adopt the processor for running processing-intensive HPC, AI, and cloud operations. GPU-accelerated computing – where the compute-intensive portion of a workload runs on GPUs – is reshaping transportation, healthcare, manufacturing, and many other industries.

NVIDIA developed Compute Unified Device Architecture (CUDA) – a parallel computing platform and programming model for general computing on GPUs – to simplify the development of GPU-accelerated applications. The framework includes libraries, a toolkit, runtime, and plugins that communicate with the GPU.

Customers who initially wanted to take advantage of running Kubernetes on top of GPUs had to manually write containers for CUDA and all the software needed to run GPU-accelerated applications on Kubernetes. Developers also had to write additional code to tell Kubernetes which nodes contained GPUs. The process was time-consuming and prone to errors, but is now greatly simplified by using Red Hat OpenShift.

Partnering for an optimal solution

While NVIDIA caters to all Kubernetes distributions, Red Hat OpenShift is seen as a priority. “Red Hat OpenShift is very important to NVIDIA as it allows our customers to develop, deploy, and deliver new apps faster and easier,” said Akins. “When we adapted CUDA for Kubernetes, Red Hat OpenShift was top of mind.”

With guidance from Red Hat, NVIDIA produced a series of Red Hat OpenShift techniques for CUDA and for the other software that GPU-accelerated applications need.

Making AI and HPC accessible with Red Hat OpenShift

When a customer deploys Red Hat OpenShift on top of a server with GPUs, the GPU operator automatically containerizes CUDA and all the software needed before deploying to Red Hat OpenShift.

More than 100 customers are currently using the GPU operator to help them implement and run GPU-accelerated workloads across a wide range of application types, including AI, machine learning, model training, and inferencing.

Benefits of this partnership

Offered crucial insight for ongoing development of the operator

NVIDIA uses Red Hat’s expertise and influence regarding Kubernetes, helping NVIDIA understand the container platform’s future direction so they can build critical evolutionary advancements into the GPU operator.

Ensured optimal compute efficiency for customers’ AI and HPC workloads

The GPU operator allows NVIDIA to optimize compute efficiency for its customers. A process orchestrated by Red Hat OpenShift uses node-labeling techniques, so workloads can automatically find the specific type of GPU they need.
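The node-labeling idea can be illustrated with a minimal Python sketch. It is not the operator's or the scheduler's actual code: it only mimics the equality matching that a Kubernetes nodeSelector performs against node labels. The node names are hypothetical, and the label keys `nvidia.com/gpu.present` and `nvidia.com/gpu.product` are assumptions about what a labeled GPU node might carry.

```python
from typing import Dict, List

# Hypothetical cluster inventory: node name -> labels attached to that node.
# The label keys and values are assumptions chosen for illustration.
NODES: Dict[str, Dict[str, str]] = {
    "worker-a": {"nvidia.com/gpu.present": "true", "nvidia.com/gpu.product": "A100"},
    "worker-b": {"nvidia.com/gpu.present": "true", "nvidia.com/gpu.product": "T4"},
    "worker-c": {},  # a node without GPUs carries no GPU labels
}


def matches_selector(node_labels: Dict[str, str], selector: Dict[str, str]) -> bool:
    """Return True when every key/value pair in the selector is present on the node."""
    return all(node_labels.get(key) == value for key, value in selector.items())


def eligible_nodes(nodes: Dict[str, Dict[str, str]], selector: Dict[str, str]) -> List[str]:
    """List, in sorted order, the nodes whose labels satisfy the selector."""
    return sorted(
        name for name, labels in nodes.items() if matches_selector(labels, selector)
    )


if __name__ == "__main__":
    # A workload that needs one specific GPU type.
    print(eligible_nodes(NODES, {"nvidia.com/gpu.product": "T4"}))
    # A workload that accepts any GPU.
    print(eligible_nodes(NODES, {"nvidia.com/gpu.present": "true"}))
```

Because the labels are applied automatically, a workload only has to state which GPU type it wants; nobody has to maintain a list of GPU nodes by hand.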

Saved customers time and avoided manual errors because of automation

The GPU operator makes it easier for customers to use CUDA to take advantage of GPU technology for running HPC and AI workloads on Red Hat OpenShift. Automation saves customers time and helps them avoid errors.
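From the workload side, the result might look like the following sketch, which builds a Pod manifest that asks for one GPU. It assumes the cluster advertises GPUs under the extended resource name `nvidia.com/gpu`, the name used by NVIDIA's Kubernetes device plugin. The pod name, the image reference, and the `nvidia.com/gpu.product` label key are hypothetical placeholders, not values taken from this article.

```python
import json
from typing import Optional


def gpu_pod_manifest(name: str, image: str, gpus: int = 1,
                     gpu_product: Optional[str] = None) -> dict:
    """Build a Pod manifest (as a plain dict) that requests `gpus` GPUs.

    Assumption: GPUs are exposed as the extended resource "nvidia.com/gpu".
    """
    manifest = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Extended resources such as GPUs are requested under "limits".
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }
    if gpu_product is not None:
        # Optional: pin the pod to nodes labeled with a given GPU type.
        # The label key is an assumption made for this example.
        manifest["spec"]["nodeSelector"] = {"nvidia.com/gpu.product": gpu_product}
    return manifest


if __name__ == "__main__":
    # "registry.example.com/cuda-sample:latest" is a placeholder image name.
    pod = gpu_pod_manifest("cuda-job", "registry.example.com/cuda-sample:latest",
                           gpu_product="T4")
    print(json.dumps(pod, indent=2))
```

The printed JSON could be saved to a file and submitted with `oc apply -f` or `kubectl apply -f`. The manifest never mentions drivers or the CUDA runtime, because the operator automates that setup.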

Provided customers with support and expertise from the right partner

With the NVIDIA and Red Hat teams aligned, any customer facing an issue with the GPU operator can submit a ticket to Red Hat. The partners then triage the ticket together and follow an agreed escalation path to route it to either NVIDIA or Red Hat experts, ensuring customers have access to the best support.

Building the next generation of computing

NVIDIA has a GPU network operator development team that regularly collaborates with Red Hat. “We work with Red Hat on the development of the GPU networking operator,” said Akins. “It’s very much a Red Hat-influenced roadmap.”

That team is currently creating an operator for the NVIDIA DOCA software framework. An analog to the NVIDIA GPU operator, the network operator simplifies scale-out network design for Kubernetes by automating aspects of network deployment and configuration that would otherwise require manual work. It loads the required drivers, libraries, device plugins, and Container Network Interface (CNI) plugins on any cluster node with an NVIDIA network interface. Data processing units (DPUs) are a new class of programmable processors for cloud-native, cloud-scale computing platforms. The DPU operator provides security enhancements, future SDKs and frameworks, and other features critical to DOCA.

About NVIDIA

NVIDIA’s popularization of the GPU sparked the PC gaming market. The company’s pioneering work in accelerated computing—a supercharged form of computing at the intersection of computer graphics, high performance computing and AI—is reshaping trillion-dollar industries, such as transportation, healthcare, manufacturing, and fueling the growth of many

List of Authors

Sigin Jacob George

Global Relations Manager at ipsr solutions limited and Technological Specialist with 20+ years of experience in IT training, sales, and management. Technical areas of interest: open source, cloud, DevOps.
