This page describes a method for configuring and managing a group of Indigo switches with central policy and updates.
After following these instructions, the switch boots as follows: it obtains an IP address via DHCP, fetches a kernel, filesystem image, and device tree file over TFTP, and boots into Linux. Using a config file stored in the SFS flash, the switch then mounts a shared NFS directory, from which it pulls its custom config (such as ports enabled, switch name, DPID, etc.) and to which it writes logs.
Most Indigo setup instructions focus on configuring a single switch, with methods that become a burden at scale. Sure, for every firmware update you could go into the bootloader on every switch to copy the kernel, initrd, and device tree, but that's error-prone and inefficient.
After reading through this page, you should know how to manage a cluster of Indigo switches with central configuration and out-of-band control, using a combination of the following:
The goal is to be able to update the firmware on every switch in the cluster within a minute, and with a minimum of disruption - that is, without having to take all switches down at the same time. This is feasible, but requires a small patch against the default Indigo distribution as of 3/31/2011.
Required: These instructions assume that you have set up a DHCP server, TFTP server, and NFS server.
Required: The DHCP server must be able to allocate IP addresses for each switch based on MAC address.
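As an illustration of per-MAC allocation, here is a minimal sketch that turns a small inventory into host stanzas in ISC dhcpd syntax. It assumes your DHCP server is ISC dhcpd; the function name, switch names, and MAC addresses are made up for the example.

```shell
#!/bin/sh
# Emit one ISC dhcpd "host" stanza per switch.
# Input lines on stdin: <name> <mac> <ip>; blank lines and # comments are skipped.
gen_dhcp_hosts() {
    while read -r name mac ip; do
        case "$name" in ''|'#'*) continue ;; esac
        printf 'host %s {\n' "$name"
        printf '    hardware ethernet %s;\n' "$mac"
        printf '    fixed-address %s;\n' "$ip"
        printf '}\n'
    done
}

# Example inventory (hypothetical names and MAC addresses).
gen_dhcp_hosts <<'EOF'
# name  mac                ip
sw15    00:11:22:33:44:0f  192.168.3.15
sw16    00:11:22:33:44:10  192.168.3.16
EOF
```

The output can be pasted into (or included from) dhcpd.conf. Keeping the inventory as a flat text file makes it easy to put under revision control.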
Optional, but recommended: A machine with a cross-compiler set up to build u-boot images for PowerPC.
This step is optional, but it saves you from copying and pasting a bunch of environment variables into the bootloader console, which has proven error-prone. There's no flow control between the terminal you paste into and the bootloader console, which can lead to dropped characters and a broken bootloader config.
First ask Dan Talayco for the Indigo u-boot code, which has the needed #defines for each switch, as well as instructions for setting up a cross-compiler.
Then modify the code to add custom environment variables. The general procedure is to add a CONFIG_X line for each env var added, and to modify a few existing ones.
In include/configs/quanta_lb4g.h, modify the presets for your environment and add code to support the new boot process. Don't copy/paste this blindly; some lines modify existing definitions and some are additions.
#define CONFIG_BOOTDELAY 1 /* -1 disables auto-boot */
#define CONFIG_IPADDR
#define CONFIG_SERVERIP
#define CONFIG_GATEWAYIP
#define CONFIG_NETMASK
#define CONFIG_BOOTCOMMAND "run memboot"
#define CONFIG_MEMLOAD \
	"tftp 1000000 uImage;" \
	"tftp 2000000 uInitrd2m;" \
	"tftp 3000000 LB4G.dtb"
#define CONFIG_MEMBOOTARGS \
	"setenv bootargs root=/dev/ram " \
	"console=ttyS0,$baudrate rw " \
	"ip=$ipaddr:$serverip:$gatewayip:$netmask:$hostname:$netdev DEV_ADDR=$ipaddr"
#define CONFIG_MEMRUN \
	"bootm 1000000 2000000 3000000"
#define CONFIG_MEMBOOT \
	"dhcp;" \
	"run memload;" \
	"run membootargs;" \
	"run memrun"
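The load addresses in memload are hexadecimal, so the kernel, initrd, and device tree land 0x1000000 bytes (16 MiB) apart. An image larger than that would run into the next load address. A minimal sketch of a check you could run on the TFTP server before pointing switches at new files; the function names are hypothetical and the directory layout is assumed to match the file names used in memload.

```shell
#!/bin/sh
# Distance between consecutive load addresses (0x1000000 bytes = 16 MiB).
WINDOW=16777216

fits_window() {
    # $1: path to an image file. Returns 0 if it fits in one window,
    # 1 if it is too large, 2 if the file does not exist.
    [ -f "$1" ] || return 2
    size=$(wc -c < "$1" | tr -d ' ')
    [ "$size" -le "$WINDOW" ]
}

check_images() {
    # $1: directory holding uImage and uInitrd2m (assumed layout).
    # LB4G.dtb is loaded last, so no later image follows it.
    status=0
    for f in uImage uInitrd2m; do
        if fits_window "$1/$f"; then
            echo "ok: $f"
        else
            echo "too large or missing: $f" >&2
            status=1
        fi
    done
    return $status
}
```

For example, `check_images /var/lib/tftpboot` would report on the files the switches fetch by default. This only checks the load windows; it says nothing about how much memory the kernel needs once it starts unpacking.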
In common/env_common.c and common/environment.c, add this code:
#endif
#if defined(CONFIG_MEMBOOTARGS)
	"membootargs=" CONFIG_MEMBOOTARGS "\0"
#endif
#if defined(CONFIG_MEMRUN)
	"memrun=" CONFIG_MEMRUN "\0"
#endif
#if defined(CONFIG_MEMBOOT)
	"memboot=" CONFIG_MEMBOOT "\0"
#endif
#if defined(CONFIG_AUTOLOAD)
	"autoload=" CONFIG_AUTOLOAD "\0"
#endif
#if defined(CONFIG_AUTOSTART)
	"autostart=" CONFIG_AUTOSTART "\0"
#endif
#if defined(CONFIG_NETDEV)
	"netdev=" CONFIG_NETDEV "\0"
#endif
Add DHCP support in include/config_cmd_default.h:
#define CONFIG_CMD_DHCP
Build updated firmware:
make
Transfer the updated bootloader to your TFTP server in a separate directory from the main Indigo image.
scp u-boot.bin [remote-server]:/var/lib/tftpboot/custom
You'll want to generate an SSH keypair to automate login for running scripts, copying files, etc. Embedding the public key in the firmware image makes automation much easier. In addition, for now, you'll need to apply a patch (though this should get folded into the main distribution soon). The script below takes a stock Indigo firmware image, unpacks it, applies the patch, adds the public key, and repackages it. Save it as customize_indigo.sh, adjust the variables at the top, and run it once per firmware update.
#!/bin/sh
IMAGE=indigo-2011.03.31-pronto-3240
UINITRD=uInitrd2m-indigo-2011.03.31-pronto-3240
ELDK_PATH=/home/dtalayco/tools/ELDK_4.2
MKIMAGE=${ELDK_PATH}/usr/bin/mkimage
UINITRD_OUT=uInitrd2m-custom
DNRC_DIR=~/dnrc

# Clean up
rm -rf $IMAGE

# Unpack
tar xzf ${IMAGE}.tgz
cd ${IMAGE}

# Create mount dir
mkdir mnt

# Unpack image into fs:
dd if=${UINITRD} of=initrd2m.gz bs=64 skip=1
gunzip -v9 initrd2m.gz
sudo mount -o loop initrd2m mnt

# Modify here...

# Apply our patch
sudo chown -R `whoami` .
cp ${DNRC_DIR}/add_nfs_logs.patch mnt
cd mnt
git apply add_nfs_logs.patch
rm add_nfs_logs.patch

# Copy key over
# SSH instructions: https://mailman.stanford.edu/pipermail/openflow-indigo/2011-March/000085.html
mkdir .ssh
chmod 700 .ssh
touch .ssh/authorized_keys
cat ${DNRC_DIR}/keys/id_sw.pub >> .ssh/authorized_keys
cd ..
sudo chown -R root mnt

# Re-create image from fs:
sudo umount mnt
gzip -fv9 initrd2m
$MKIMAGE -A ppc -O linux -T ramdisk -C gzip -a 00000000 -e 00000000 -n $IMAGE-mod -d initrd2m.gz ${UINITRD_OUT}
cd ..
echo created ${IMAGE}/${UINITRD_OUT}
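The dd step in the script skips the first 64 bytes of the uInitrd file, which is the legacy U-Boot image header that mkimage prepends; that header begins with the magic number 0x27051956. A small sanity check along these lines, run before unpacking or after repackaging, can catch a file that isn't a U-Boot image at all. This is only a sketch, and the function name is made up.

```shell
#!/bin/sh
# Return 0 if the file starts with the legacy U-Boot image magic 0x27051956.
has_uimage_magic() {
    # Read the first four bytes as hex and strip od's spacing.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    [ "$magic" = "27051956" ]
}
```

For example, `has_uimage_magic uInitrd2m-custom || echo "not a U-Boot image"` could be added right after the mkimage call.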
Copy this updated firmware image to the TFTP server:
scp [built-initrd] [remote-server]:/var/lib/tftpboot/custom
Copy zipped distribution to remote (example):
scp indigo-2011.03.31-pronto-3240.tgz remote-server:/var/lib/tftpboot
Modify the symbolic links in the tftpboot directory to point to the latest files. Make sure the files are in subdirectories of the tftpboot directory; symlinks to directories outside it don't seem to work.
Unpack, modify symlinks:
cd /var/lib/tftpboot
tar xzf <firmware-name>
ln -fs <firmware-name>/u-boot-<firmware-name> u-boot.bin
ln -fs <firmware-name>/uImage-<firmware-name> uImage
ln -fs <firmware-name>/uInitrd2m-<firmware-name> uInitrd2m
ln -fs <firmware-name>/LB4G.dtb-<firmware-name> LB4G.dtb
Add custom ones too (which were presumably copied in prior steps):
ln -fs custom/u-boot.bin u-boot.bin
ln -fs custom/uInitrd2m uInitrd2m
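Since the symlinks get repointed on every firmware update, it may be worth scripting that step. The following is a rough sketch, not part of the Indigo distribution: the function name is made up, and it assumes the unpacked directory follows the `<prefix>-<firmware-name>` file naming used for the stock links.

```shell
#!/bin/sh
# "<file prefix>:<link name>" pairs, following the stock ln commands.
PAIRS="u-boot:u-boot.bin uImage:uImage uInitrd2m:uInitrd2m LB4G.dtb:LB4G.dtb"

link_firmware() {
    # $1: TFTP root, $2: firmware name (the directory created by tar).
    root=$1
    fw=$2
    # First pass: refuse to touch any link unless every target exists.
    for pair in $PAIRS; do
        target="$root/$fw/${pair%%:*}-$fw"
        if [ ! -e "$target" ]; then
            echo "missing: $target" >&2
            return 1
        fi
    done
    # Second pass: create relative links so they resolve inside the TFTP root.
    for pair in $PAIRS; do
        (cd "$root" && ln -fs "$fw/${pair%%:*}-$fw" "${pair#*:}") || return 1
    done
}
```

Usage would look like `link_firmware /var/lib/tftpboot indigo-2011.03.31-pronto-3240`. The links are switched one by one rather than atomically, so a switch that boots mid-update could fetch a mix of old and new files; re-apply the custom links afterwards if you use them.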
Copy bootloader. On the switch:
copy -b tftp://192.168.1.11/u-boot.bin
The flash will be rewritten, and the next attempt to boot will likely fail, esp. if there was no previously stored device tree file and the bootloader is now looking for one; that's why you'll want to update the switch image (see next section).
In the case that you set bootdelay to 0, or the TFTP or NFS servers go down, updating the image file stored on flash is more likely to yield a switch that actually boots. If you're careful to set bootdelay to something larger than 0, always, AND you have remote reboot for each switch, then this isn't a big deal.
Copy other stuff, so that if you miss the space bar to escape from the boot sequence, you don't lose the console and need a remote reboot. On the switch:
copy -k tftp://192.168.1.11/uImage
copy -r tftp://192.168.1.11/uInitrd2m
copy -d tftp://192.168.1.11/LB4G.dtb
Check the current bootloader version:
version
To confirm the bootloader update, reset the switch, and hold down space to drop into the console:
reset
Once it drops you into the console:
version
It should say something like this:
U-Boot 1.3.0 (Mar 11 2011 - 13:29:12)
It's likely that you'll want a custom sysenv. Examples of custom settings:
export log_dir="$nfs_dir/logs/$DEV_ADDR"
export CTRL_PORT=6633
export ofp_options="--max-backoff=1 --listen=ptcp:6634 --fail=closed"
You can even get fancy and generate the sysenv file based on a higher-level description of connectivity.
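One way to do that, sketched here under a few assumptions: the sysenv file is sourced by a shell on the switch that supports `case`, and `DEV_ADDR` is already set when it runs (it is passed on the kernel command line in this setup). The generator reads a table of switch address and controller port and emits a shared sysenv that branches per switch. The function name and the port 6653 in the example are made up.

```shell
#!/bin/sh
# Generate a shared sysenv from lines of "<switch-ip> <controller-port>" on stdin.
gen_sysenv() {
    # $1: default controller port for switches not listed on stdin.
    default_port=$1
    printf '%s\n' 'export log_dir="$nfs_dir/logs/$DEV_ADDR"'
    printf '%s\n' 'export ofp_options="--max-backoff=1 --listen=ptcp:6634 --fail=closed"'
    printf '%s\n' 'case "$DEV_ADDR" in'
    while read -r addr port; do
        case "$addr" in ''|'#'*) continue ;; esac
        printf '    %s) export CTRL_PORT=%s ;;\n' "$addr" "$port"
    done
    printf '    *) export CTRL_PORT=%s ;;\n' "$default_port"
    printf '%s\n' 'esac'
}

# Example: one switch talks to a controller on a different port (hypothetical).
gen_sysenv 6633 <<'EOF'
192.168.3.15 6653
EOF
```

Redirect the output to the sysenv file and commit both the table and the generated file, so a bad change can be traced and reverted.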
Place this file in /var/exports/shared.
It is highly recommended to place this file under revision control.
Save this script as setup_nfs.sh:
#!/bin/sh
mkdir -p /sfs
echo '192.168.1.11:/var/exports/shared' > /sfs/nfs_path
cd /sfs
sfsctl create *
Replace the IP/dir pair with yours.
Save this script as sw_setup_nfs.sh (just an example):
#!/bin/sh
# Setup NFS remotely on a DNRC switch.
KEY=keys/id_sw
HOST=root@192.168.3.15
scp -i $KEY setup_nfs.sh $HOST:/
ssh -i $KEY $HOST /setup_nfs.sh
Apply the changes to a switch, remotely:
./sw_setup_nfs.sh
Note that this will cause the switch to automatically reboot. At that point, any custom config should be applied. To test that the NFS dir was mounted, log into the switch and check that there's stuff in /mnt.
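For more than a handful of switches, the same two commands can be wrapped in a loop. This is a minimal sketch, assuming the key and setup_nfs.sh paths from the example script; the function names and the `DRY_RUN` switch are inventions for illustration.

```shell
#!/bin/sh
# Run the per-switch NFS setup across a list of switches, one at a time.
KEY=${KEY:-keys/id_sw}

setup_switch() {
    # $1: switch address. With DRY_RUN=1, print the commands instead of running them.
    host="root@$1"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "scp -i $KEY setup_nfs.sh $host:/"
        echo "ssh -i $KEY $host /setup_nfs.sh"
        return 0
    fi
    scp -i "$KEY" setup_nfs.sh "$host:/" && ssh -i "$KEY" "$host" /setup_nfs.sh
}

setup_all() {
    # Reads one switch address per line on stdin.
    failed=0
    while read -r addr; do
        [ -n "$addr" ] || continue
        # </dev/null keeps ssh from swallowing the rest of the address list.
        setup_switch "$addr" </dev/null || { echo "failed: $addr" >&2; failed=1; }
    done
    return $failed
}
```

Try `DRY_RUN=1` first, e.g. `DRY_RUN=1 setup_all < switches.txt` with a hypothetical switches.txt. Because the switch reboots during setup, the ssh session may end with a non-zero exit status even on success, so treat a "failed" line as a prompt to check /mnt on that switch rather than as proof of failure. Going one switch at a time also avoids taking the whole cluster down at once.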
Congrats, you're done!
The additional effort to add a switch is now:
dhcp; copy -b tftp://192.168.1.11/u-boot.bin; reset
./sw_setup_nfs.sh # sets NFS logs & cluster config, then reboots into new setup.
If you followed the main instructions, then the manual bootloader configuration below should be unnecessary. It might be useful if you're only doing a part of the above, like DHCP.
In the bootloader, set env vars for DHCP:
setenv ipaddr
setenv netmask
setenv gatewayip
setenv autoload no
saveenv
Verify DHCP address:
dhcp
The switch should get a DHCP offer.
Update env vars, which were cleared when you updated the bootloader:
setenv bootargs
setenv bootdelay 1
setenv nfsip 192.168.1.11
Set env vars to run DHCP, then copy all three pieces into memory via TFTP, then boot from memory.
setenv netdev eth0
setenv bootdelay 1
setenv memload 'tftp 1000000 uImage; tftp 2000000 uInitrd2m; tftp 3000000 LB4G.dtb'
setenv membootargs 'setenv bootargs root=/dev/ram console=ttyS0,$baudrate rw ip=$ipaddr:$serverip:$gatewayip:$netmask:$hostname:$netdev DEV_ADDR=$ipaddr'
setenv memrun 'bootm 1000000 2000000 3000000'
setenv memboot 'dhcp; run memload; run membootargs; run memrun'
setenv bootcmd 'run memboot'
saveenv
Verify:
run memload
run membootargs
reset
These steps should suffice to boot into the kernel, but no parameters outside of the IP will be set for this switch.