hans

Managing Users on Large Servers with Docker

Background: our laboratory purchased many high-end servers, but I have no operations experience and don't want to deal with complex professional tooling. So I decided to create one Docker container per user and manage the containers with Portainer. All users share the disk space of a RAID 0 array, each user gets one GPU, and CPU time is shared equally between containers. The servers run CentOS 7.
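
As a rough sketch of this allocation scheme, the script below prints (but does not run) one `docker run` command per user. The user names, the rule "user N gets GPU N", and the host port base 10241 (taken from the firewall range opened further down) are assumptions for illustration, not part of the original setup.

```shell
#!/bin/sh
# Print (not run) one `docker run` command per user.
# Assumptions: GPU index = position in the argument list,
# host SSH port = BASE_PORT + index, sshd inside the container listens on 1024,
# and each user's data lives under /data/<user> on the RAID volume.
BASE_PORT=10241

plan_containers() {
    i=0
    for user in "$@"; do
        port=$((BASE_PORT + i))
        printf '%s\n' "docker run -itd --name ubuntu-$user --gpus '\"device=$i\"' --restart=always -v /data/$user:/data -p $port:1024 --cpu-shares 1024 --net=bridge nvidia/cuda:12.0.0-devel-ubuntu22.04"
        i=$((i + 1))
    done
}

# Hypothetical user names
plan_containers hans alice bob
```

Giving every container the same `--cpu-shares` value means they get equal CPU weight when the host is under contention; an idle machine still lets a single container use all cores.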

Here are all the commands used:

## set raid0
sudo yum install mdadm
# create raid volume /dev/md0 based on 3 devices
sudo mdadm --create --verbose /dev/md0 --level=0 --raid-devices=3 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1
# format the volume as ext4
sudo mkfs.ext4 /dev/md0
# create mount point
sudo mkdir -p /data
sudo mount /dev/md0 /data
# mount automatically at boot
echo '/dev/md0 /data ext4 defaults 0 0' | sudo tee -a /etc/fstab
# Save RAID Configuration
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm.conf
# Verify the RAID Array
cat /proc/mdstat
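# improve raid0 performance: read-ahead of 65536 sectors of 512 bytes (32 MiB)
sudo blockdev --setra 65536 /dev/md0
# does not work here: stripe_cache_size only exists for RAID 4/5/6 arrays
echo 32768 | sudo tee /sys/block/md0/md/stripe_cache_size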
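
The verification step can also be scripted. This is a minimal sketch, assuming the usual `/proc/mdstat` layout where an array line looks like `md0 : active raid0 nvme3n1[2] ...`; the helper name `mdstat_level` is made up for this example.

```shell
#!/bin/sh
# Print the RAID level of an md device from /proc/mdstat-style text on stdin.
# Prints nothing if the device is missing or not active.
mdstat_level() {
    awk -v dev="$1" '$1 == dev && $3 == "active" { print $4 }'
}

# On the server this should print "raid0" once the array is up:
#   mdstat_level md0 < /proc/mdstat
```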

# ==============================================================================

## install docker and portainer
sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo yum install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo groupadd docker
sudo gpasswd -a $USER docker
newgrp docker
sudo systemctl start docker
sudo systemctl enable docker
# keep Portainer's data on the RAID volume (bind mount, so no named volume is needed)
sudo mkdir -p /data/portainer_data
docker run -d -p 8000:8000 -p 9443:9443 --name portainer --restart=always -v /var/run/docker.sock:/var/run/docker.sock -v /data/portainer_data:/data portainer/portainer-ce:latest

## install portainer-agent (lets one Portainer instance manage all servers)
docker run -d -p 9001:9001 --name portainer_agent --restart=always -v /var/run/docker.sock:/var/run/docker.sock -v /var/lib/docker/volumes:/var/lib/docker/volumes portainer/agent

# ==============================================================================

## set nvidia docker
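# set Nvidia Container Toolkit
sudo rpm --import https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/7fa2af80.pub
# $distribution must be set first (centos7 on this system), otherwise the repo URL is invalid
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | sudo tee /etc/yum.repos.d/nvidia-docker.repo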
sudo yum install -y nvidia-docker2
sudo systemctl daemon-reload
sudo systemctl restart docker
# test installation
docker run --rm --gpus all nvidia/cuda:12.0.0-devel-ubi8 nvidia-smi
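
Since each user gets one GPU, the number of GPUs caps the number of user containers on a server. This is a minimal sketch for counting them, assuming `nvidia-smi -L` prints one `GPU <index>: ...` line per device; the helper name `count_gpus` is made up for this example.

```shell
#!/bin/sh
# Count GPUs from `nvidia-smi -L` output on stdin.
# grep -c exits non-zero when the count is 0, so `|| true` keeps the status clean.
count_gpus() {
    grep -c '^GPU [0-9][0-9]*:' || true
}

# On the server:
#   nvidia-smi -L | count_gpus
```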

# ==============================================================================

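## enable ip forward
# takes effect immediately but is lost on reboot
sudo sysctl -w net.ipv4.ip_forward=1
# open the host ports for the containers
sudo iptables -I INPUT -p tcp --dport 10241:10299 -j ACCEPT
# iptables-save only prints the current rules; redirect the output to a file to keep them
sudo iptables-save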
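
Enabling forwarding through `/proc/sys` (or `sysctl -w`) only lasts until the next reboot. This is a minimal sketch of making it persistent with a sysctl drop-in file; the file name `99-ip-forward.conf` and the helper name are assumptions chosen for this example.

```shell
#!/bin/sh
# Write a sysctl drop-in that enables IPv4 forwarding.
# The default path is an assumption; pass another path to try it out safely.
write_ip_forward_conf() {
    conf=${1:-/etc/sysctl.d/99-ip-forward.conf}
    printf 'net.ipv4.ip_forward = 1\n' > "$conf"
}

# On the server, as root:
#   write_ip_forward_conf && sysctl --system
```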

# ==============================================================================
## set up an ubuntu container from the cuda image
cd /data
sudo mkdir hans
# change 'all' to '"device=0"' for specific gpu
docker run -itd --name ubuntu-hans --gpus 'all' --restart=always -v /data/hans:/data -p 1024:1024 --cpu-shares 1024 --net=bridge nvidia/cuda:12.0.0-devel-ubuntu22.04
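# set ssh for ubuntu (the following commands run inside the container)
docker exec -it ubuntu-hans /bin/bash
apt update && apt install vim ssh
# set the root password
passwd
# edit sshd_config so that it contains the three lines below
vim /etc/ssh/sshd_config

PubkeyAuthentication yes
PermitRootLogin yes
Port 1024

service ssh restart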
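
When many containers need the same settings, the `sshd_config` edit can be done without opening an editor. This is a minimal sketch; the helper `set_sshd_option` is made up for this example and assumes the value contains no `|` character.

```shell
#!/bin/sh
# Set "key value" in an sshd_config-style file: rewrite an existing
# (possibly commented-out) line for that key, or append one if none exists.
set_sshd_option() {
    file=$1; key=$2; value=$3
    if grep -Eq "^[#[:space:]]*$key[[:space:]]" "$file"; then
        sed -E "s|^[#[:space:]]*$key[[:space:]].*|$key $value|" "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    else
        printf '%s %s\n' "$key" "$value" >> "$file"
    fi
}

# Inside the container:
#   set_sshd_option /etc/ssh/sshd_config PubkeyAuthentication yes
#   set_sshd_option /etc/ssh/sshd_config PermitRootLogin yes
#   set_sshd_option /etc/ssh/sshd_config Port 1024
```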
# ==============================================================================

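## amend container configuration
# save the current container (with ssh set up) as an image
docker commit ubuntu-hans ubuntu-hans-image
# to change ports, GPUs, or volumes: remove the container and run it again from the new image with the new options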
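
The re-run step could look roughly like the sketch below. It reuses the options from the original `docker run` command and defaults to a dry run that only prints the commands; the helper names `run` and `recreate_container` and the `DRY_RUN` switch are assumptions for this example.

```shell
#!/bin/sh
# Recreate a user's container from the committed image.
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

recreate_container() {
    name=$1; image=$2; data_dir=$3
    run docker stop "$name"
    run docker rm "$name"
    run docker run -itd --name "$name" --gpus all --restart=always \
        -v "$data_dir":/data -p 1024:1024 --cpu-shares 1024 --net=bridge "$image"
}

recreate_container ubuntu-hans ubuntu-hans-image /data/hans
```

Note that a service started by hand inside the old container is not started automatically in the new one, so sshd likely needs to be started again, for example with `docker exec ubuntu-hans service ssh start`.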

# ==============================================================================

## install nodejs (optional)
sudo yum install https://rpm.nodesource.com/pub_16.x/nodistro/repo/nodesource-release-nodistro-1.noarch.rpm -y
sudo yum install nodejs -y --setopt=nodesource-nodejs.module_hotfixes=1
# install localtunnel
sudo npm install -g localtunnel