Generative AI is fueling transformative change, unlocking a new frontier of opportunities for enterprises across every industry. To transform with AI, enterprises need more compute resources, greater scale, and a broad set of capabilities to meet the demands of an ever-increasing set of diverse and complex workloads.

The NVIDIA L40S GPU is the most powerful universal GPU for the data center, delivering end-to-end acceleration for the next generation of AI-enabled applications—from generative AI and model training and inference to 3D graphics, rendering, and video applications.

Enterprises want to use mainstream infrastructure to meet their compute needs, but training state-of-the-art models requires massive compute capability. For LLMs, eight L40S GPUs in mainstream servers deliver up to 1.7X the training performance of an NVIDIA HGX™ A100 8-GPU system, giving enterprises fast time to solution with traditional infrastructure. Compared with the A100 80GB SXM for inference, the L40S delivers up to 1.2X more generative AI inference performance using Stable Diffusion and up to 1.5X the inference performance on popular networks, such as those included in the MLPerf benchmark.

Key use cases of the NVIDIA L40S GPU:

Generative AI
The AI, graphics, and media acceleration capabilities of the L40S GPU make it the premier platform for multi-modal generative AI pipelines. With powerful inferencing capabilities, combined with NVIDIA RTX™-accelerated ray tracing and dedicated encode and decode engines, the L40S accelerates AI-enabled audio, speech, 2D, video, and 3D generative AI applications.
For image generative AI inference, the L40S GPU delivers more than 5X higher performance than the previous-generation NVIDIA A40 GPU and 1.2X more performance than the HGX A100. This breakthrough performance, combined with 48GB of memory capacity, makes the L40S GPU the ideal generative AI platform for high-quality images and immersive visual content.
LLM Inference and Training
Accelerate training, fine-tuning, and inference workloads with powerful throughput and floating-point performance to build and deploy state-of-the-art AI models. Powerful NVIDIA-Certified Systems™ with eight L40 GPUs can train foundational models with up to 175 billion parameters to convergence and accelerate fine-tuning and retraining of existing large-scale models to adapt them for new tasks.
Combining NVIDIA’s full stack of inference serving software with the compute capabilities of the L40S provides a powerful platform for trained models ready for inference. With support for structural sparsity and a broad range of precisions, including TF32, INT8, and FP8, the L40S delivers over 1 petaFLOPS of inference operation performance, delivering actionable insights with speed and precision.
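As a rough illustration of how precision affects whether a model of the size mentioned above fits in GPU memory, the sketch below estimates the storage needed for model weights alone. It assumes decimal gigabytes (48 GB = 48e9 bytes) and ignores activations, optimizer state, and KV cache, all of which add to real-world requirements. The function names are hypothetical and are not part of any NVIDIA tool.

```python
# Bytes needed to store one parameter at a given precision.
BYTES_PER_PARAM = {"FP32": 4, "FP16": 2, "BF16": 2, "FP8": 1, "INT8": 1}

GPU_MEMORY_GB = 48    # per-GPU memory from the specification table
GPUS_PER_SERVER = 8   # the eight-GPU configuration described above


def weight_footprint_gb(num_params, precision):
    """Return the memory (decimal GB) needed for model weights only."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


def fits_in_server(num_params, precision, gpus=GPUS_PER_SERVER):
    """Check whether the weights alone fit in the combined GPU memory.

    Assumption: weights can be sharded evenly across all GPUs.
    """
    return weight_footprint_gb(num_params, precision) <= gpus * GPU_MEMORY_GB


if __name__ == "__main__":
    params = 175e9  # the 175-billion-parameter model size cited in the text
    for precision in ("FP32", "FP16", "FP8"):
        size = weight_footprint_gb(params, precision)
        print(f"{precision}: {size:.0f} GB, fits in 8 GPUs: "
              f"{fits_in_server(params, precision)}")
```

Under these assumptions, halving the bytes per parameter halves the weight footprint, which is one reason reduced-precision formats such as FP8 matter for large models on 48 GB cards.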
AI-Ready Development Platform with NVIDIA AI Enterprise
Enterprise adoption of AI is now mainstream and leading to an increased demand for skilled AI developers and data scientists. Organizations require a flexible, high-performance platform consisting of optimized hardware and software to maximize productivity and accelerate AI development.
NVIDIA AI Enterprise is an end-to-end, enterprise-grade AI software platform that offers 100+ frameworks, pretrained models, and libraries to streamline development and deployment of production AI, including generative AI, computer vision, and speech AI. Optimized and certified for reliable performance, NVIDIA AI Enterprise, together with the L40S, provides a unified platform to develop applications once and deploy anywhere, reducing the risks involved with moving from pilot to production.
Rendering and 3D Graphics
Running professional 3D visualization applications with NVIDIA L40S enables creative professionals to iterate more, render faster, and unlock tremendous performance advantages that increase productivity and speed up project completion. The NVIDIA L40S’s third-generation RT Cores and industry-leading 48GB of GDDR6 memory deliver up to 2X the real-time ray-tracing performance of the previous generation.
With these capabilities, artists and designers can work with complex geometry and high-resolution textures in real time to generate photorealistic designs and power full-fidelity creative workflows, from interactive rendering to virtual production.
NVIDIA Omniverse
NVIDIA Omniverse is a multi-GPU-enabled open platform for Universal Scene Description (USD)-based collaboration and real-time photorealistic simulation. The full-stack platform based on USD and NVIDIA RTX is the powerful culmination of NVIDIA’s core graphics, compute, and AI technologies. NVIDIA L40S GPUs bring powerful AI and RTX capabilities to accelerate 3D content creation and industrial digitalization.
For the most complex Omniverse workloads like extended reality (XR), multi-user design collaboration, and digital twins, the NVIDIA L40S enables ray-traced and path-traced rendering of materials, physically accurate simulations, and generation of photorealistic 3D synthetic data.
Streaming and Video Content
The NVIDIA L40S takes streaming and video content workloads to the next level, delivering breakthrough media acceleration capabilities with three video encode and three video decode engines. With the addition of AV1 encoding, the L40S delivers up to 2X the performance and improved TCO for broadcast streaming, video production, and transcription workloads.
Virtual Workstations
When combined with NVIDIA RTX Virtual Workstation (vWS) software, the NVIDIA L40S can be virtualized to deliver high-performance workstation instances to remote users for high-end design, AI, and compute workloads. With 48GB of GPU memory, the NVIDIA L40S with vWS enables flexible, work-from-anywhere solutions for GPU memory-intensive workloads.

GPU Architecture	NVIDIA Ada Lovelace
NVIDIA CUDA Parallel Processing Cores	18,176
NVIDIA Tensor Cores (4th gen)	568
NVIDIA RT Cores (3rd Gen)	142
Peak FP32 performance (non-Tensor)	91.6 TFLOPS
Peak FP16 Tensor performance	362.05 TFLOPS, 733 TFLOPS*
Peak Tensor Float 32 (TF32) performance	183 TFLOPS, 366 TFLOPS*
Peak Bfloat16 (BF16) Tensor performance	362.05 TFLOPS, 733 TFLOPS*
Peak FP8 Tensor performance	733 TFLOPS, 1466 TFLOPS*
Peak INT8 Integer Performance	733 TOPS, 1466 TOPS*
Peak INT4 Integer Performance	733 TOPS, 1466 TOPS*
RT Core performance	209 TFLOPS
GPU Memory	48 GB GDDR6
Memory Bandwidth	864 GB/s
ECC	Yes
NVIDIA NVLink	No support
System Interface	PCIe Gen 4, x16 lanes
Form Factor	PCIe full height/length, double width (10.5" x 4.4")
Multi-Instance GPU (MIG)	No support
Max Power Consumption	350 W
Thermal Solution	Passive
vGPU Software Support	NVIDIA vPC/vApps, NVIDIA RTX Virtual Workstation (vWS)
Display connectors	4x DisplayPort 1.4a (disabled by default**)
Max Simultaneous Displays	Up to four 5K monitors at 60 Hz per card, or dual 8K displays at 60 Hz (requires DisplayPort 1.4 DSC); each DisplayPort output can support 4K at 120 Hz with 30-bit color
Graphics APIs	DirectX 12 Ultimate, Shader Model 6.6, OpenGL 4.6, Vulkan 1.3
Compute APIs	CUDA 12.0, DirectCompute, OpenCL 3.0

* With structural sparsity.
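To relate the peak throughput and memory bandwidth figures in the table, the sketch below applies a simple roofline estimate: a workload is limited either by peak compute or by how fast data can be moved from memory. It assumes the first value in each Tensor row is the dense (non-sparse) peak, and it treats the table's peak numbers as upper bounds rather than achievable rates. The function names are hypothetical.

```python
# Peak throughput in TFLOPS, taken from the specification table.
# Assumption: the first value in each row is the dense peak.
PEAK_TFLOPS = {"FP32": 91.6, "TF32": 183.0, "FP16": 362.05, "FP8": 733.0}

MEMORY_BANDWIDTH_GBPS = 864  # GB/s, from the specification table


def ridge_point(precision):
    """FLOPs per byte at which the compute and memory limits meet."""
    return PEAK_TFLOPS[precision] * 1e12 / (MEMORY_BANDWIDTH_GBPS * 1e9)


def attainable_tflops(precision, flops_per_byte):
    """Roofline upper bound for a kernel with the given arithmetic intensity."""
    memory_bound_tflops = flops_per_byte * MEMORY_BANDWIDTH_GBPS / 1e3
    return min(PEAK_TFLOPS[precision], memory_bound_tflops)


if __name__ == "__main__":
    for precision in PEAK_TFLOPS:
        print(f"{precision}: ridge point ~{ridge_point(precision):.0f} FLOPs/byte")
    # A kernel performing 100 FLOPs per byte moved is memory-bound at FP16.
    print(f"FP16 at 100 FLOPs/byte: {attainable_tflops('FP16', 100):.1f} TFLOPS")
```

In this simplified model, kernels whose arithmetic intensity falls below the ridge point are limited by the 864 GB/s memory bandwidth rather than by Tensor Core throughput.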

NVIDIA L40S 48GB

NVIDIA L40S Graphics Card, 48GB GDDR6 ECC, 18,176 CUDA Cores, 4x DisplayPort 1.4a

Category: GPU CARDS
