Background

Recently I was building a VMware ESXi cluster, aka vSphere.

My rack structure is like this. I am using my own server as shared NFS storage.

This post is about how to configure this server.

file

Why am I doing this?

We all know that vSphere clusters require shared storage so that ESXi hosts can migrate virtual machines between multiple hosts.

There are professional solutions like Dell PowerStore. However, those machines are really, really expensive: about 20,000 USD.

So how can we build a similar ESXi storage server for only around 800 USD?

Step 1 - Buy hardware

1.1 Buy a server (400USD)

My personal suggested hardware is:

  • SKU: Dell R730XD (12-disk chassis)
  • CPU: Intel E5-2620 v3 x 2, 2.4GHz, 12 cores, 24 threads
  • RAM: 16GB x 2 DDR4
  • SATA SSD: 256GB x 1
  • NVMe SSD: 2TB
  • HDD: 2TB x 12
anduin@NFS:~$ neofetch
       _,met$$$$$gg.          anduin@NFS
    ,g$$$$$$$$$$$$$$$P.       ----------
  ,g$$P"     """Y$$.".        OS: Debian GNU/Linux 11 (bullseye) x86_64
 ,$$P'              `$$$.     Host: PowerEdge R730XD
',$$P       ,ggs.     `$$b:   Kernel: 5.10.0-18-amd64
`d$$'     ,$P"'   .    $$$    Uptime: 1 day, 4 hours, 34 mins
 $$P      d$'     ,    $$P    Packages: 571 (dpkg)
 $$:      $$.   -    ,d$$'    Shell: bash 5.1.4
 $$;      Y$b._   _,d$P'      Resolution: 1024x768
 Y$$.    `.`"Y$$$$P"'         Terminal: /dev/pts/0
 `$$b      "-.__              CPU: Intel Xeon E5-2620 v3 (24) @ 3.200GHz
  `Y$$                        GPU: 0b:00.0 Matrox Electronics Systems Ltd. G200eR2
   `Y$$.                      Memory: 650MiB / 31816MiB
     `$$b.
       `Y$$b.
          `"Y$b._
              `"""

The server costs about 400USD.

1.2 Buy a SATA SSD (30USD)

You need to buy a SATA SSD to install the operating system. This is because the Dell R730XD only has 3.5-inch HDD slots, and I really don't like installing my OS on an HDD.

Also, the Dell R730XD is an old server whose firmware doesn't recognize NVMe SSDs at boot time, so you can't install your OS on an NVMe SSD either.

That's why, painful as it is, you have to buy another SATA SSD just for the OS.

file

1.3 Buy NVMe SSD (300USD, Optional)

You will need some NVMe storage.

The NVMe disk is for virtual machines that need high random-access speed, because the HDD RAID array doesn't improve random access performance.

I am using a SAMSUNG 980 PRO as NVMe storage. It costs about 300USD (2100CNY).

file

1.4 Buy some disks (400USD)

You will need some HDDs. For example, I'm using TOSHIBA P300 drives.

12 disks cost about 400USD.

file

1.5 Buy an NVMe to PCIe adapter (3USD)

Yes, you need an NVMe to PCIe adapter to connect your NVMe SSD to your server!

Usually this is super cheap, about 3 USD.

file

1.6 Buy a SATA to PCIe adapter (3USD)

You also need to buy a SATA to PCIe adapter, so you can connect your SATA SSD to a PCIe slot.

file

1.7 Buy fiber network adapters (80USD)

You need to buy some fiber network adapters. It is suggested to buy at least 3.

I am using the Intel X520-DA2. Make sure it supports 10Gbps.

file

1.8 Assemble your server like this!

file

Step 2 - Configure iDRAC

Step 2.1 Upgrade your firmware

First, you need to upgrade the firmware.

Use a network cable to connect the iDRAC port to your LAN (your router, for example). Then find the IP address it got from DHCP and sign in.
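If the router's DHCP lease list doesn't show it, a quick scan of the LAN also finds it (a sketch, assuming your LAN is 192.168.50.0/24; adjust the subnet, and install nmap first if it is missing):

sudo apt install nmap -y
sudo nmap -p 443 --open 192.168.50.0/24   # the iDRAC web portal answers on HTTPS, so an open port 443 is a good hint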

file

file

Default username: root. Default password: calvin.

To upgrade the firmware, you need to download it from Dell.

Download R730 latest Firmware

Download the latest BIOS and iDRAC firmware. Download the Windows x64 installer version, and unzip it.

file

file

Extract it with 7-Zip.

file

And upload the firmimg.d7 file to the iDRAC portal.

file

file

Step 2.2 Fix fan issue

Enable IPMI over LAN

Make sure this is enabled:

file

Download IPMI tools

For Windows, download it here.

After downloading, install it and add the extracted path to the PATH system environment variable.

For Linux, try:

sudo apt install ipmitool

You can try to run it after installing:

file
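For example, to check that ipmitool can actually reach the iDRAC, list the fan sensors (a sketch using the example address and default credentials from this section):

ipmitool -I lanplus -H 192.168.50.20 -U root -P calvin sdr type Fan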

Fix your server

In the command below,

  • -H is followed by the host name, 192.168.50.20 in this example.
  • -U is followed by the user name, root in this example.
  • -P is followed by the password, calvin in this example.

Change those values to your own. And run:

ipmitool -I lanplus -H 192.168.50.20 -U root -P calvin raw 0x30 0xce 0x00 0x16 0x05 0x00 0x00 0x00 0x05 0x00 0x01 0x00 0x00

This will disable the third-party PCIe card default cooling response.

After that, your server is fixed and the fans return to normal speed.

file

Why that command?

# Query `Dell's Third-Party PCIe card based default system fan response` status
ipmitool raw 0x30 0xce 0x01 0x16 0x05 0x00 0x00 0x00

A response like the one below means Disabled:

16 05 00 00 00 05 00 01 00 00

A response like the one below means Enabled:

16 05 00 00 00 05 00 00 00 00

To enable it (fans spin up like jets):

0x30 0xce 0x00 0x16 0x05 0x00 0x00 0x00 0x05 0x00 0x00 0x00 0x00

To disable it (jets off):

0x30 0xce 0x00 0x16 0x05 0x00 0x00 0x00 0x05 0x00 0x01 0x00 0x00

Manual fan control?

Yes, you can do manual fan control.

To enable it (using the same ipmitool options as above):

ipmitool -I lanplus -H 192.168.18.7 -U root -P calvin raw 0x30 0x30 0x01 0x00

The speed is a hex number, from 0x00 to 0x64 (0 to 100 in decimal). You may need a hex converter.

https://www.rapidtables.com/convert/number/hex-to-decimal.html

  • 0x64 is 100%
  • 0x50 is 80%
  • 0x00 is 0%.

For example, to set the fan speed to 80%:

0x30 0x30 0x02 0xff 0x50
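Putting it together, a full manual-control session could look like this (a sketch, assuming the same iDRAC address and credentials as before; the last line hands control back to iDRAC's automatic curve):

ipmitool -I lanplus -H 192.168.50.20 -U root -P calvin raw 0x30 0x30 0x01 0x00        # take manual control of the fans
ipmitool -I lanplus -H 192.168.50.20 -U root -P calvin raw 0x30 0x30 0x02 0xff 0x50   # set all fans (0xff) to 80% (0x50)
ipmitool -I lanplus -H 192.168.50.20 -U root -P calvin raw 0x30 0x30 0x01 0x01        # return to automatic fan control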

Step 3 - Configure RAID 6

After setting up your fan and iDRAC, you can setup your RAID now.

It is suggested to configure a RAID 6 array because it can survive two disks failing at the same time.

You need to open the console of your server. Reboot it, and press F2 to enter the RAID settings.

file

file

You need to create a new virtual disk. Select the related physical disks and the RAID level, then initialize the disk.

file

Step 4 - Install Debian

Step 4.1 Download and burn Debian installer

Now download Debian OS ISO file.

You can download it here.

file

I'm using Rufus on Windows to burn that ISO to a USB stick.

Make sure you select DD mode when burning the USB stick!
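If you are burning the stick from a Linux machine instead, plain dd does the same job (a sketch; the ISO file name is just a placeholder, and make very sure /dev/sdX really is your USB stick, because dd will overwrite it):

sudo dd if=debian-11.x.x-amd64-netinst.iso of=/dev/sdX bs=4M status=progress conv=fsync
sync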

Step 4.2 Install the Debian OS.

Now you can install the Debian OS.

Just plug the USB stick into your server and boot from it.

file

Follow the instructions to finish installing Debian as the OS.

file

And note this: do NOT install your OS on the NVMe SSD! It won't boot!

Install the OS on the SATA SSD instead!

file

After installing the OS, you need to configure authentication following best practices.

You can follow the instructions here if you are not familiar with Linux.

What need to do after you get a new Linux?

Step 5 - Mount disks

Run the command sudo fdisk -l to locate the disk:

sudo fdisk -l

file

Now we can create a new partition. I'm using sda as an example here. (A non-interactive alternative is sketched after the tips below.)

sudo fdisk /dev/sda

Tips:

  • n – create a new partition
  • p – print the partition table
  • g – create a new empty GPT partition table
  • d – delete a partition
  • q – quit without saving the changes
  • w – write the changes and exit.
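If you prefer a non-interactive route, parted can do the whole thing in one line (a sketch, assuming /dev/sda is empty and you want a single partition spanning the entire disk; install parted with apt if it is missing):

sudo parted -s /dev/sda mklabel gpt mkpart primary ext4 0% 100%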

After creating the partition, we can format it.

sudo mkfs.ext4 /dev/sda1

After formatting, you need to get its UUID.

sudo blkid | grep UUID=

Output will be like:

file

Copy the partition's UUID (the value after UUID=), not the PARTUUID!

For example: 3238a745-7026-47c2-82a3-e48a4d31556b.

Now we need to mount it. For example, let's mount the partition with UUID 3238a745-7026-47c2-82a3-e48a4d31556b to /mnt/random_store.

First, edit the /etc/fstab file:

sudo vim /etc/fstab

Add this line:

UUID=3238a745-7026-47c2-82a3-e48a4d31556b /mnt/random_store     ext4    defaults,noatime,nofail 0       0

So it will be like:

file

Try to mount it:

sudo mkdir /mnt/random_store
sudo mount /mnt/random_store

You can verify the mappings:

sudo lsblk

file

Step 6 - Enable NFS Server

Step 6.1 Install NFS Server

sudo apt update
sudo apt install nfs-kernel-server -y

Step 6.2 Expose an NFS path.

For example, we want to expose /mnt/nfs_share as an NFS path.

sudo mkdir -p /mnt/nfs_share
sudo chown -R nobody:nogroup /mnt/nfs_share/
sudo chmod 777 /mnt/nfs_share/

And edit the configuration:

sudo vim /etc/exports

Add this:

/mnt/nfs_share 172.16.1.0/24(rw,sync,no_subtree_check)

172.16.1.0/24 is the subnet that is allowed to connect to this server. This is important: set it to the network your ESXi hosts will use to mount the share!

file

Finally:

sudo exportfs -a
sudo systemctl restart nfs-kernel-server

If you have ufw enabled, don't forget to open port 2049.
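For example, a minimal sketch that opens NFS only for the storage subnet and then double-checks the export list (the subnet matches the one used in /etc/exports above):

sudo ufw allow from 172.16.1.0/24 to any port 2049
sudo exportfs -v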

Step 7 - Configure network

Step 7.1 - Connect your NFS server and your ESXi hosts

file

Make the connection between your NFS server and your ESXi hosts, as shown by the Storage network in the picture.

file

Why do I need to install an additional DHCP server?

Because I don't want my VMs' traffic to mix with the storage traffic. While I could share one network, it would slow down performance.

I use fiber to connect each ESXi host directly to the storage server. This gives the best performance.

However, I have multiple ESXi hosts, so I need to connect each one.

For ease of management, I bridge the multiple fiber ports into one subnet and enable a DHCP server on that subnet.

Step 7.2 Bridge all your fiber ports

sudo apt install bridge-utils

To view all connected network adapters (ifconfig comes from the net-tools package; ip link works as well):

ifconfig

Now bridge all the fiber ports (replace the interface names with your own):

sudo ip link add name bridge0 type bridge
sudo ip link set dev bridge0 up
sudo ip link set dev enp130s0f0 master bridge0
sudo ip link set dev enp130s0f1 master bridge0
sudo ip link set dev enp5s0f0 master bridge0
sudo ip link set dev enp5s0f1 master bridge0

Don't forget to assign an IP to your bridge:

sudo ip addr add 172.16.1.1/24 brd + dev bridge0

You can enable jumbo frames here:

sudo ip link set enp5s0f0 mtu 9000
sudo ip link set enp5s0f1 mtu 9000
sudo ip link set enp130s0f0 mtu 9000
sudo ip link set enp130s0f1 mtu 9000   # repeat for all interfaces in the bridge
sudo ip link set bridge0 mtu 9000
sudo ip link show | grep mtu
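Note that the ip commands above do not survive a reboot. A minimal sketch to make the bridge persistent with ifupdown (assuming the same interface names as above; bridge_ports comes from the bridge-utils package installed earlier) can go into /etc/network/interfaces:

auto bridge0
iface bridge0 inet static
    address 172.16.1.1
    netmask 255.255.255.0
    bridge_ports enp130s0f0 enp130s0f1 enp5s0f0 enp5s0f1
    bridge_stp off
    mtu 9000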

Step 7.3 Start the DHCP server

First, install your DHCP server.

sudo apt install isc-dhcp-server

Now edit the configuration:

sudo vim /etc/dhcp/dhcpd.conf

My configuration is:

option domain-name "storage.network";
option domain-name-servers ns1.example.org, ns2.example.org;

default-lease-time 600;
max-lease-time 7200;

subnet 172.16.1.0 netmask 255.255.255.0 {
  range 172.16.1.100 172.16.1.200;            # My ESXi hosts.
  option routers 172.16.1.0;                  # A fake router.
  option domain-name-servers 172.16.1.0;      # A fake DNS.
  option domain-name "storage.network";
}

file
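Before going further, it is worth letting dhcpd validate the configuration file (a quick check; any syntax error will be printed):

sudo dhcpd -t -cf /etc/dhcp/dhcpd.conf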

Finally you need to make sure your DHCP server doesn't ruin your management network.

Make the DHCP server only listen to the bridge:

sudo vim /etc/default/isc-dhcp-server

Edit the content to match your bridge:

INTERFACESv4="bridge0"
INTERFACESv6=""

file

Don't forget to restart the DHCP service.

sudo systemctl restart isc-dhcp-server.service
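After the restart, a quick way to confirm the service is healthy and that the ESXi hosts actually picked up leases (dhcp-lease-list should be available once isc-dhcp-server is installed):

sudo systemctl status isc-dhcp-server.service
sudo dhcp-lease-list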

Step 8 - Install cockpit for monitoring

Cockpit is a web portal that helps you monitor your server status. Installing it is simple.

file

Step 8.1 - Install cockpit

Run:

sudo apt install cockpit -y

Step 8.2 - Configure Authentication

Ensure you have a non-root user in the sudo group.

Please follow the instructions from: Best practice for authentication
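If you haven't done that yet, a one-liner takes care of it (a sketch; anduin is the username used elsewhere in this post, so replace it with your own and log out and back in afterwards):

sudo usermod -aG sudo anduin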

Step 8.3 - Install navigator plugin

sudo apt install -y jq
latestUrl=$(curl https://api.github.com/repos/45Drives/cockpit-navigator/releases/latest | jq -r '(.assets[] | select(.browser_download_url | contains(".deb"))).browser_download_url')
echo "Latest download url is $latestUrl"
wget -O /home/anduin/install.deb $latestUrl
sudo dpkg -i /home/anduin/install.deb
rm /home/anduin/install.deb

Step 8.4 - Try cockpit on your browser!

Open your browser and go to https://host:9090. Ignore the certificate warning.

file

Don't forget, we are using cockpit to monitor our network and disk usage.

file

Step 9 - Mount the server to vSphere

Step 9.1 Prepare your host to mount NFS storage

First, log in to your ESXi host.

Add a new virtual switch. Select the fiber interface as the uplink.

file

Name it, select the uplink, and save.

file

If you want to enable Jumbo frames, set MTU to 9000:

file

Then add a new port group. Select the storage virtual switch (Storage switch) you created above.

file

Finally, add a new VMkernel NIC and select the Storage network port group.

file

Step 9.2 Mount the NFS storage

Go to your vCenter server and select adding a new datastore.

file

Select type.

file

Select version.

file

Enter server details:

file

And it finally works!

file
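If you ever prefer the ESXi shell over the GUI, the equivalent NFS 3 mount looks roughly like this (a sketch, assuming the bridge IP 172.16.1.1 and the export from Step 6; the datastore name nfs_share is just an example, and NFS 4.1 uses esxcli storage nfs41 instead):

esxcli storage nfs add -H 172.16.1.1 -s /mnt/nfs_share -v nfs_share
esxcli storage nfs list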

Step 10 - Benchmark your performance!

Of course, CrystalDiskMark!

file

Don't forget to keep an eye on your network usage and disk usage!

A 10Gbps network should top out at around 1200MB/s.

file
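If you also want a quick benchmark from the Linux side, fio works well (a sketch that tests the array locally under /mnt/nfs_share; install fio with apt first and remove the test file afterwards):

sudo apt install fio -y
fio --name=seqread --filename=/mnt/nfs_share/fio-test --rw=read --bs=1M --size=4G --direct=1 --ioengine=libaio
rm /mnt/nfs_share/fio-test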