Installing NVIDIA Drivers on Ubuntu: A Practical Engineering Workflow
NVIDIA GPUs demand correct, up-to-date drivers for full hardware utilization on Ubuntu. Deploying the wrong driver, or leaving traces of the default Nouveau module behind, commonly leads to degraded performance or system instability. This workflow outlines the standard, reliable process for a modern Ubuntu workstation (tested on 22.04 LTS, kernel 5.15+) with PCIe or laptop-integrated NVIDIA cards.
Recurring Problem: Erratic CUDA Kernel Failures
A common real-world scenario: deploying TensorRT workloads on a fresh Ubuntu install, only to encounter cryptic CUDA_ERROR_UNKNOWN failures. In 7 of 10 cases, this traces back to mismatched or conflicting NVIDIA driver installations, typically due to automatic Nouveau loading or partial legacy drivers left by an upgrade.
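A quick audit of installed NVIDIA packages and loaded kernel modules usually reveals such leftovers; a minimal sketch using standard tooling:
dpkg -l | grep -i nvidia
lsmod | grep -E 'nouveau|nvidia'
Any mix of nouveau and partial nvidia entries here is a sign to clean up before proceeding.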
Step 1: Confirm GPU Model & Existing Driver State
lspci | grep -i nvidia
Typical output:
01:00.0 VGA compatible controller: NVIDIA Corporation GP106GL [Quadro P2000] (rev a1)
Check driver status (returns error if not present):
nvidia-smi
Expected for working driver:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03 Driver Version: 535.54.03 CUDA Version: 12.2 |
|-------------------------------...
If you see:
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver...
proceed with installation.
Step 2: Update System, Avoiding Old Kernel Issues
sudo apt update
sudo apt full-upgrade -y
full-upgrade ensures dependency shifts (common after PPA addition) are handled cleanly.
Step 3: Explicitly Blacklist Nouveau (Hard Failures Without This)
Edit/create the modprobe config:
sudo nano /etc/modprobe.d/blacklist-nouveau.conf
Insert:
blacklist nouveau
options nouveau modeset=0
Then:
sudo update-initramfs -u
sudo reboot
Known issue: Failing to reboot here often leaves Nouveau in memory, breaking DKMS module insertion later.
Post-reboot, validate:
lsmod | grep nouveau
No output confirms success.
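For scripted or repeated provisioning, the same blacklist file can be written non-interactively; a minimal sketch equivalent to the manual edit above:
printf 'blacklist nouveau\noptions nouveau modeset=0\n' | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
sudo update-initramfs -u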
Step 4: Enable Latest Driver Repository
Sometimes, apt sources lag behind NVIDIA's own releases (critical for recent cards, e.g., RTX 40xx):
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
For hardened environments, skip this and use the stock repos. Otherwise, the PPA offers bleeding-edge driver builds.
Step 5: Identify and Install the Recommended Driver
Use:
ubuntu-drivers devices
Look for recommended in the output, such as:
driver : nvidia-driver-535 - third-party free recommended
Then install:
sudo apt install nvidia-driver-535 -y
(Replace with whatever is current/recommended for your GPU. Quadro and legacy users may need other versions—check compatibility on NVIDIA’s official matrix.)
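Alternatively, the stock tooling can select and install the recommended driver in one step; this is a shortcut for the two commands above:
sudo ubuntu-drivers autoinstall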
Step 6: Reboot, Then Hard Verification
sudo reboot
Post-reboot checks:
nvidia-smi
lsmod | grep nvidia
glxinfo | grep "OpenGL renderer"
(glxinfo is provided by the mesa-utils package.)
If nvidia-smi still fails, review /var/log/syslog for kernel module insertion errors:
grep NVRM /var/log/syslog
Practical example:
If Secure Boot is enabled in firmware and the NVIDIA module is not signed with an enrolled key, the DKMS build succeeds but the kernel silently refuses to load the module. Either enroll a MOK key when the installer prompts for one, or disable Secure Boot in firmware, then reinstall the driver and retest.
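Before rebooting into firmware setup, you can confirm whether Secure Boot is actually enabled with mokutil (from the mokutil package); it simply reports whether SecureBoot is enabled or disabled:
mokutil --sb-state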
Step 7: Systems with Hybrid GPUs (Optimus/PRIME)
For laptops (e.g., ThinkPad P1, Dell XPS with both NVIDIA and Intel GPUs):
sudo apt install nvidia-prime
sudo prime-select nvidia
sudo reboot
Switch back to Intel as needed:
sudo prime-select intel
Performance and power trade-off: selecting NVIDIA keeps the discrete GPU active for all rendering, increasing power draw; selecting Intel powers it down, which also makes CUDA unavailable.
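To see which GPU is currently selected before or after switching:
prime-select query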
Step 8: (Optional) Install CUDA Toolkit—Version Matching Required
Non-obvious tip: Always install CUDA after verifying the kernel module loads cleanly.
Visit the NVIDIA CUDA Toolkit Archive, download the local repository installer .deb for the desired version (e.g., CUDA 12.2, which requires driver >= 535), then:
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo dpkg -i cuda-repo-ubuntu2204-12-2-local_12.2.0-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-12-2-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt update
sudo apt install cuda
Add to ~/.bashrc:
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
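After reloading the shell, a quick sanity check that the toolkit is on PATH:
source ~/.bashrc
nvcc --version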
Gotcha: Installing mismatched major CUDA and driver versions results in this error:
CUDA driver version is insufficient for CUDA runtime version
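A quick way to spot a mismatch before it bites: the nvidia-smi header reports the highest CUDA version the installed driver supports, which must be greater than or equal to the toolkit release reported by nvcc:
nvidia-smi | grep "CUDA Version"
nvcc --version | grep release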
Troubleshooting
| Symptom | Likely Cause | Solution |
|---|---|---|
| Black screen after driver install | Secure Boot, Wayland, old Xorg config | Boot with nomodeset, disable Secure Boot, restore /etc/X11/xorg.conf.backup |
| modprobe nvidia fails | Nouveau still loaded | Repeat the blacklisting step, check for typos |
| nvidia-smi hangs, dmesg shows RM: version... | PCIe ASPM bug or old BIOS | Upgrade motherboard firmware, check PCIe settings |
Side note: For cloud images (AWS, GCP), use distro-specific CUDA/NVIDIA guides; kernel flavors may break PPA modules.
Summary
Flawless NVIDIA GPU operation on Ubuntu hinges on three factors: eradicating Nouveau, syncing driver and CUDA versions, and rebooting at appropriate points. For advanced installations (multiple cards, GRID/virtualization), vendor scripts or custom DKMS builds may be justified, but for most workstations and development rigs, the outlined method is robust. Alternative approaches—like .run file installs—exist, but are harder to maintain and best avoided unless integration with nvidia-docker or experimental hardware is required.
If unusual failures arise (kernel panic, missing /dev/nvidia*), always check kernel logs and DKMS status before reinstalling drivers.
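A minimal check sequence for that situation (sudo may be needed to read the kernel ring buffer):
dkms status
sudo dmesg | grep -i -e nvrm -e nvidia
ls -l /dev/nvidia*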
No approach is truly “one and done” as upstream changes can break dependencies. Document your working config for future reference.
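One lightweight way to do that is to snapshot kernel, driver, and package state into a dated file kept with your notes (the filename here is arbitrary):
{ uname -r; nvidia-smi; dpkg -l | grep -i nvidia; } > nvidia-config-$(date +%F).txt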