Before starting
Before starting, install Docker and the NVIDIA Container Toolkit.
Install Docker using the following commands:
```bash
curl -fsSL get.docker.com -o get-docker.sh
CHANNEL=stable sh get-docker.sh
rm get-docker.sh
```
Add the Nvidia Container Toolkit repository and install it:
Reference: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
```bash
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
```
Then, install nvidia-docker2:
Reference: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/1.10.0/install-guide.html
```bash
sudo apt-get install -y nvidia-docker2
```
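Installing nvidia-docker2 registers an `nvidia` runtime with the Docker daemon. After the install, `/etc/docker/daemon.json` should contain an entry similar to the following (shown for reference; your file may carry additional settings):

```json
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
```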
Restart the Docker service:
```bash
sudo systemctl restart docker
```
Verify the GPU setup in Docker:
```bash
sudo docker run --rm --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
```
Problem
As the thread below mentions, we cannot directly request GPUs in Docker Swarm mode:
https://forums.docker.com/t/using-nvidia-gpu-with-docker-swarm-started-by-docker-compose-file/106688
```yaml
version: '3.7'
services:
  test:
    image: nvidia/cuda:10.2-base
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu, utility]
```
If you try to deploy it, Docker complains that devices is not allowed in Swarm mode:

```
> docker stack deploy -c docker-compose.yml gputest
services.test.deploy.resources.reservations Additional property devices is not allowed
```
Solution
However, I recently found a trick that allows you to run a container with GPU access alongside a Swarm-managed stack.

First, I created an attachable overlay network, so that my other containers, managed by Docker Swarm, can talk to the Ollama container:
```bash
function create_network() {
    network_name=$1
    subnet=$2
    known_networks=$(sudo docker network ls --format '{{.Name}}')
    if [[ $known_networks != *"$network_name"* ]]; then
        networkId=$(sudo docker network create --driver overlay --attachable --subnet $subnet --scope swarm $network_name)
        echo "Network $network_name created with id $networkId"
    fi
}

create_network proxy_app 10.234.0.0/16
```
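One caveat worth knowing: the substring check above can mis-fire, because a name like proxy_app is also a substring of proxy_app_v2, so the function would skip creating the network even though only the differently named one exists. A pure-shell sketch of the pitfall and an exact-match alternative (network_exists is a hypothetical helper, not from the original script; no Docker needed to run this):

```shell
# Hypothetical helper: exact, whole-line match against the network list.
network_exists() {
  # $1 = newline-separated network names, $2 = name to look for
  echo "$1" | grep -qx "$2"
}

# Simulated output of `docker network ls --format '{{.Name}}'`
known_networks="bridge
host
proxy_app_v2"

# Substring match (as in create_network above): false positive,
# because "proxy_app_v2" contains "proxy_app".
case "$known_networks" in
  *"proxy_app"*) echo "substring match: found" ;;
  *)             echo "substring match: not found" ;;
esac

# Exact match: correct
if network_exists "$known_networks" "proxy_app"; then
  echo "exact match: found"
else
  echo "exact match: not found"
fi
```

With the list above, the substring check reports "found" while the exact match correctly reports "not found".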
Then I deployed the following docker-compose file with Docker Swarm. (I used ollama_warmup to demonstrate how other containers interact with this Ollama instance; you can obviously replace it with your own containers.)
```yaml
version: "3.6"
services:
  ollama_starter:
    image: hub.aiursoft.cn/aiursoft/internalimages/ubuntu-with-docker:latest
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    # Kill the existing ollama, then start a new ollama
    entrypoint:
      - "/bin/sh"
      - "-c"
      - |
        echo 'Starter is starting ollama...' && \
        (docker kill ollama_server || true) && \
        docker run \
          --tty \
          --rm \
          --gpus=all \
          --network proxy_app \
          --name ollama_server \
          -v /swarm-vol/ollama/data:/root/.ollama \
          -e OLLAMA_HOST=0.0.0.0 \
          -e OLLAMA_KEEP_ALIVE=200m \
          -e OLLAMA_FLASH_ATTENTION=1 \
          -e OLLAMA_KV_CACHE_TYPE=q8_0 \
          -e GIN_MODE=release \
          hub.aiursoft.cn/ollama/ollama:latest
  ollama_warmup:
    depends_on:
      - ollama_starter
    image: hub.aiursoft.cn/alpine
    networks:
      - proxy_app
    entrypoint:
      - "/bin/sh"
      - "-c"
      - |
        apk add curl && \
        sleep 40 && \
        while true; do \
          curl -v http://ollama_server:11434/api/generate -d '{"model": "deepseek-r1:32b"}'; \
          sleep 900; \
        done
    deploy:
      resources:
        limits:
          memory: 128M
      labels:
        swarmpit.service.deployment.autoredeploy: 'true'
networks:
  proxy_app:
    external: true
volumes:
  ollama-data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /swarm-vol/ollama/data
```
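The deploy and verification commands themselves might look like the sketch below (the stack name ollama and the DRY_RUN helper are my additions, not from the original post; set DRY_RUN=0 to actually execute):

```shell
# Hypothetical helper: print commands instead of running them, for review.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

DRY_RUN=1   # change to 0 to execute for real

# Deploy the stack from the compose file above
run sudo docker stack deploy -c docker-compose.yml ollama

# The starter launches ollama_server outside of Swarm's control, so check
# it with plain `docker ps`, not `docker service ls`:
run sudo docker ps --filter name=ollama_server

# Confirm the GPU is visible from inside the container
run sudo docker exec ollama_server nvidia-smi
```

Note that because ollama_server is started by a plain docker run, Swarm will not restart it on failure; the ollama_starter service is what brings it back up on redeploy.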
And it worked! I am now running Ollama with DeepSeek in Docker, with full GPU support.