
As artificial intelligence technology advances, large language models (LLMs) are increasingly used in natural language processing, content generation, and conversational systems. DeepSeek, a large language model developed in China, has drawn attention for its generative capabilities and broad range of applications.
We validated large-model inference on a system configured with Intel Core processors and an Intel Arc GPU.
This article describes how to deploy the DeepSeek model on Intel platforms, covering hardware, BIOS, OS, drivers, and tooling required for inference.
1 Hardware configuration
GPU: Intel Arc B580 12GB
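Before choosing a model, it helps to estimate whether its weights fit in the B580's 12 GB of VRAM. The sketch below computes the weight footprint of a 7B-parameter distilled model at common precisions; the parameter count is illustrative, and the KV cache and activations add further overhead on top of these figures.

```python
# Rough VRAM estimate for a 7B-parameter distilled model at different
# weight precisions. KV cache and activations add overhead on top, so
# treat these as lower bounds (illustrative figures, not measurements).
def weight_size_gb(n_params: float, bits_per_weight: int) -> float:
    """Size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

n = 7e9  # e.g. a 7B distilled variant
for bits in (16, 8, 4):
    label = f"INT{bits}" if bits < 16 else "FP16"
    print(label, f"{weight_size_gb(n, bits):.1f} GB")
```

At FP16 a 7B model needs about 14 GB for weights alone and does not fit in 12 GB, which is why the INT4-quantized export is used later in this guide.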
2 BIOS configuration
After installing an Intel Arc GPU, enable PCIe Resizable BAR (Base Address Register) in the BIOS. This lets the CPU map the GPU's full 12 GB of VRAM rather than a small fixed window, which improves data-transfer performance during inference.
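Once the OS is installed, you can check from Linux that Resizable BAR took effect. The commands below are a hedged sketch: device addresses and card indices vary by system, so adjust the PCI slot and card number to match your machine.

```shell
# Find the Intel GPU's PCI address, then inspect its BAR sizes.
# With Resizable BAR enabled, the prefetchable memory region should
# cover the full VRAM (e.g. several gigabytes), not just 256M.
GPU_SLOT=$(lspci | grep -i 'VGA.*Intel' | cut -d' ' -f1)
sudo lspci -v -s "$GPU_SLOT" | grep -i prefetchable
```

If the reported size is only 256M, revisit the BIOS and confirm both Resizable BAR and Above 4G Decoding are enabled.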
3 Operating system and driver installation
3.1 Ubuntu 24.10 installation
The following shows steps to download and prepare Ubuntu 24.10:
wget https://releases.ubuntu.com/24.10/ubuntu-24.10-desktop-amd64.iso
Disable Ubuntu's unattended upgrades to avoid unintended kernel updates, which could break the installed GPU driver.
sudo systemctl disable --now unattended-upgrades
Then edit /etc/apt/apt.conf.d/20auto-upgrades and set Unattended-Upgrade to 0.
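With unattended upgrades disabled, /etc/apt/apt.conf.d/20auto-upgrades should contain:

```
APT::Periodic::Update-Package-Lists "0";
APT::Periodic::Unattended-Upgrade "0";
```

Setting both values to "0" stops APT from periodically refreshing package lists and applying upgrades in the background.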
3.2 Intel client GPU driver installation
For Arc B-series GPUs, install the client GPU driver stack by following the Intel dGPU documentation for your Ubuntu release.
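The outline below sketches the typical install flow from Intel's client GPU guide. The repository codename and exact package names are assumptions that change between Ubuntu releases and driver versions, so verify each line against the current Intel dGPU documentation before running it.

```shell
# Add Intel's graphics package repository (codename and repo path are
# release-specific -- confirm them in the Intel dGPU docs for Ubuntu 24.10).
wget -qO - https://repositories.intel.com/gpu/intel-graphics.key | \
    sudo gpg --dearmor -o /usr/share/keyrings/intel-graphics.gpg

# Install compute (Level Zero / OpenCL) and media runtime packages.
sudo apt update
sudo apt install -y intel-opencl-icd libze1 clinfo

# Allow the current user to access the GPU, then re-login and verify.
sudo gpasswd -a "$USER" render
clinfo | grep -i "device name"
```

If clinfo lists the Arc B580, the compute runtime is working and OpenVINO will be able to target the GPU device.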
4 OpenVINO and benchmark tooling
OpenVINO is an open-source toolkit for optimizing and deploying deep learning models from cloud to edge. It accelerates inference across use cases such as generative AI, video, audio, and language, and supports models from popular frameworks like PyTorch, TensorFlow, and ONNX. It enables model conversion and optimization for deployment on heterogeneous Intel hardware and environments, whether locally, on devices, in browsers, or in the cloud.
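A typical way to set up OpenVINO and its generative-AI tooling is via pip in a virtual environment. The package names below follow the OpenVINO documentation at the time of writing; check the current docs for version-specific instructions.

```shell
# Create an isolated environment for the OpenVINO toolchain.
python3 -m venv ~/openvino-env
source ~/openvino-env/bin/activate
pip install --upgrade pip

# Core runtime, GenAI pipeline API, and the Optimum exporter for
# converting Hugging Face models to OpenVINO IR.
pip install openvino openvino-genai optimum[openvino]
```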
5 Running benchmarks with the DeepSeek distilled model
Obtain the DeepSeek distilled model from Hugging Face or ModelScope and save the downloaded model in the ~/models folder.
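With the model files in ~/models, the usual next step is to export them to OpenVINO IR and run a quick generation on the GPU. The model id and output path below are assumptions for illustration: substitute the distilled variant you actually downloaded.

```shell
# Export a DeepSeek distilled model to OpenVINO IR with INT4 weights
# (model id is an example -- use the variant you downloaded).
optimum-cli export openvino \
    --model deepseek-ai/DeepSeek-R1-Distill-Qwen-7B \
    --weight-format int4 \
    ~/models/DeepSeek-R1-Distill-Qwen-7B-int4

# Smoke-test generation on the Arc GPU (requires the driver stack above).
python3 -c "import openvino_genai as og; \
pipe = og.LLMPipeline('$HOME/models/DeepSeek-R1-Distill-Qwen-7B-int4', 'GPU'); \
print(pipe.generate('What is OpenVINO?', max_new_tokens=64))"
```

INT4 weight compression brings the 7B model well under the B580's 12 GB of VRAM; for throughput and latency numbers, the benchmark tooling shipped with OpenVINO GenAI can be run against the same exported model directory.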
By following these steps and configuration guidelines, we validated inference of the DeepSeek distilled model on an Intel platform: the hardware configuration, BIOS settings, OS installation, GPU drivers, and OpenVINO deployment together support end-to-end model inference.
Future work will include additional validation in more complex scenarios, particularly for generative AI and large language model applications, and testing with Intel Arc GPUs for edge deployments.