System deployment refers to a very common task for IT departments: setting up a computer system with all the software it needs, meaning the operating system and a base set of applications. Because this is a routine task, the goal is to fully automate it in a reliable, consistent way. This document discusses how to do this for server hardware; another strategy document, System Imaging, discusses how to accomplish this for desktop and laptop systems.
System deployment encompasses the initial installation of an operating system and base applications on new or repurposed hardware or in a newly created virtual machine, the rebuild of a system for major operating system upgrades, and support for tasks such as hardware diagnosis, wiping disks before decommissioning, and a "rescue" boot when the operating system is corrupted or nonfunctional. The goal for system deployment is a fully automated deployment of a supported operating system on either physical or virtual systems. Physical access to the system must not be required, so that staff can support remote lights-out data centers or remote client locations. Build services are a critical component in fast provisioning of new systems and provide a consistent foundation for deployment of applications and configuration.
Automated OS (operating system) installations over freshly formatted disks are preferred over OS upgrades or copies cloned from templates or captured images, even in a virtualized world with better support for system cloning. Installing operating systems this way ensures that a newly deployed system starts fully patched, without unnecessary or confusing configuration left over from previous patches and upgrades. Doing fresh builds rather than upgrades or cloning of existing systems also tests, on a day-to-day basis, the ability to automatically regenerate the base OS install from first principles, ensuring that the automation stays up-to-date. Separating the modifications from the base OS image, using a script or set of scripts to perform all the actions necessary to take the computer to final build, also allows an easier upgrade to new releases of an OS since the underlying OS image is interchangeable (within bounds).
The long-term goal is to provide OS builds with necessary customizations (such as a preferred authentication service) as an automated service, along with supporting documentation for how to customize supported operating systems for the Stanford environment. IT Services will use, and improve where necessary, the standard automated build facilities provided by supported operating systems, contributing improvements back to the OS provider or broader community when possible.
The standard industry way to automate system builds is to use a PXE (Preboot Execution Environment)-based network boot (configured via DHCP, or Dynamic Host Configuration Protocol) alongside the automated deployment system for the operating system being built: WDS (Windows Deployment Service) for Windows, FAI (Fully Automated Installation) for Debian and Ubuntu, and Kickstart for Red Hat. IT Services has been doing this for some time, as have Stanford's peer institutions and most large organizations. On UNIX, IT Services supplements the standard operating system package repositories with local repositories for Stanford-specific packages, which is also common among peer institutions and large organizations. IT Services is somewhat behind the trend towards digital signatures of all package repositories on Debian and Ubuntu; that is already being done for Red Hat.
Servers are provisioned using an automated server deployment. Server builds require a small amount of input to initiate and then complete without human interaction using the following technologies:
- PXE to start system builds, both physical and virtual, for UNIX and Windows. PXE relies on DHCP for configuration. The use of PXE in virtual environments is problematic, especially for Windows deployment, since the more efficient hypervisor-integrated virtual network interfaces (NICs) do not support PXE.
- Kickstart for building Red Hat systems.
- Yum for Red Hat package repositories.
- Red Hat deployment is available to campus.
- FAI for building Debian and Ubuntu systems in Computing Services.
- WDS (Windows Deployment Service) and Microsoft Deployment Toolkit (Lite Touch) for building Windows Server systems in Computing Services.
- Yum server to provide Red Hat build services to campus and ESX3 package updates for legacy BCDR ESX servers at Duke University.
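To make the PXE and Kickstart pieces above concrete, here is a minimal sketch of how a per-host network boot entry might be generated. It assumes a PXELINUX-style boot loader, which looks under `pxelinux.cfg/` for a file named `01-` followed by the client's MAC address in lowercase, dash-separated form. The function names, kernel and initrd paths, and the Kickstart URL are illustrative assumptions, not a description of IT Services' actual build system. The `inst.ks=` boot option is the one used by current Red Hat installers; older releases used `ks=` instead.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def pxelinux_config_name(mac: str) -> str:
    """Return the per-host file name PXELINUX looks for under pxelinux.cfg/.

    The name is the ARP hardware type (01 for Ethernet) followed by the
    MAC address in lowercase hex, with dashes as separators.
    """
    octets = mac.replace("-", ":").lower().split(":")
    well_formed = len(octets) == 6 and all(
        len(o) == 2 and set(o) <= HEX_DIGITS for o in octets
    )
    if not well_formed:
        raise ValueError(f"not a MAC address: {mac!r}")
    return "01-" + "-".join(octets)


def kickstart_boot_entry(label: str, kernel: str, initrd: str, ks_url: str) -> str:
    """Render a PXELINUX entry that boots an installer with a Kickstart file.

    All four arguments are hypothetical values supplied by the caller.
    """
    lines = [
        f"default {label}",
        f"label {label}",
        f"  kernel {kernel}",
        f"  append initrd={initrd} inst.ks={ks_url}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical host and build server, for illustration only.
    print(pxelinux_config_name("88:99:AA:BB:CC:DD"))
    print(kickstart_boot_entry(
        "rhel-build",
        "rhel/vmlinuz",
        "rhel/initrd.img",
        "http://build.example.edu/ks/web1.cfg",
    ))
```

Writing one small file per host keeps the "small amount of input" needed to start a build down to the MAC address and the choice of Kickstart profile.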
For virtual servers, managed virtual server host environments are available on both VMware ESXi, offered as a service by IT Services, and Microsoft Hyper-V, which is used for internal Windows infrastructure. These environments are managed by the following:
- VMware ESXi and Virtual Center for management of VMware virtual machines
- System Center Virtual Machine Manager for management of Hyper-V virtual machines
Client Support has a service called RaDIS (Rapid Deployment and Imaging Service) for deployment of operating systems on new or repurposed hardware at client sites. System deployments can be done from anywhere on campus with a sufficient network connection. A single, hardware-independent universal image for each OS is developed that can be deployed to all business-level desktops and laptops.
- Macintosh OS images are applied manually to client computers through a custom NetBoot installer and NetRestore. NetRestore is no longer in development.
- For Windows systems, the question of licensing comes up often. Stanford acquires licenses for Microsoft from a large number of avenues, including retail purchase, bundled with new machines, Select, MSDNAA, and Campus Agreement.
The current build infrastructure is stable and sufficient for most needs within IT Services. No major investment is required for IT Services' internal needs. However, IT Services has received multiple requests for better support for other campus system administrators, and better documentation and dissemination of expertise around system build and configuration for Stanford's environment. Therefore, the primary effort is to turn build services into a more general service and significantly improve build and configuration documentation.
Some ongoing operational effort is required to adapt build services for new releases of supported operating systems and, for the Linux systems, some work remains to bring the level of support for Ubuntu up to that provided for Debian and Red Hat. Since system builds are a common system administration task, IT Services should also make ongoing efforts to improve the automation of the build and bootstrap process. Time saved per build has a significant impact on reducing operational overhead for systems administration.
Organizations that deploy large server farms have been investing in automation of initial system keying and bootstrapping into a configuration management system, a step that is currently manual for IT Services' systems. The current rate of system builds probably doesn't warrant a lot of investment here, but the rate is expected to increase with research computing, the growth of virtualization, and the desire for automated deployment of new virtual machines.
Virtual desktops and application virtualization are becoming more and more prevalent. With the growth of virtualization, there is a growing trend towards deployment of systems from virtual machine templates or by cloning existing machines. There are advantages and disadvantages to that method compared to an automated build of a fresh version of the operating system. Due to investment in automated build systems, IT Services will be able to choose the best approach between the two for any given situation, but expects to continue to lean towards fresh builds.
- Further develop the existing automated build for Microsoft Windows Server.
- Use multicast for Windows Deployment Services and Lite Touch Builds to reduce network bandwidth requirements for building multiple machines at once.
- Provide WDS+Lite Touch build services to campus. This could be available everywhere now, but access is limited because of licensing concerns: Stanford does not subscribe to a campus agreement for server operating systems, and only some departments subscribe for client operating systems.
- Provide FAI-based Debian and Ubuntu build services to campus, with appropriate separation of configuration and defaults from internal Computing Services builds.
- Improve management of the supporting package repositories for Debian and Ubuntu systems, containing Stanford-local software and configuration, to add better consistency checks, policy enforcement, and digital signatures of packages.
- Retire the Red Hat Network (RHN) Satellite Service. The yum server has been promoted to the primary source of Red Hat Enterprise Linux (RHEL) updates for campus RHEL systems, but there are many systems that still need to be migrated off the RHN Satellite server. Help Desk Level 2 has been brought in to help track down system owners and have them migrate to the yum server by using the available documentation and tools.
- Create automated builds for Microsoft Windows client OS.
- Determine the need for alternative solutions for Macintosh OS deployment.
- Document Stanford-specific configuration and integration issues for supported operating systems for the benefit of the campus community.
- Improve automation of the initial bootstrap process to bring a newly built system into the systems automation infrastructure, add Kerberos keys or join Windows Infrastructure, set the local root password, and register it with the new CMDB.
- Provide automated facilities for unattended virtual machine deployment. Integrate with ordering process for rapid deployment.
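As an illustration of the repository consistency checks mentioned in the list above, the following sketch compares package files against the `Size` and `SHA256` fields of a Debian-style `Packages` index. It is a sketch under stated assumptions rather than the checks IT Services runs: the function names and the `read_file` callback are hypothetical, and verifying the digital signature on the repository itself is left to the standard tools (for example, apt checking the signed `Release` file).

```python
import hashlib
from typing import Callable, Dict, List


def parse_packages_index(text: str) -> List[Dict[str, str]]:
    """Parse a Debian-style Packages index into one dict per stanza.

    Stanzas are separated by blank lines. Continuation lines (which start
    with whitespace) are skipped, since only single-line fields are needed.
    """
    stanzas = []
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            if line[:1] in (" ", "\t") or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        if fields:
            stanzas.append(fields)
    return stanzas


def find_mismatches(index_text: str, read_file: Callable[[str], bytes]) -> List[str]:
    """Return the Filename of every package whose size or SHA256 is wrong.

    read_file is a hypothetical callback that returns the bytes stored at a
    repository-relative path.
    """
    bad = []
    for stanza in parse_packages_index(index_text):
        data = read_file(stanza["Filename"])
        size_ok = len(data) == int(stanza["Size"])
        hash_ok = hashlib.sha256(data).hexdigest() == stanza["SHA256"]
        if not (size_ok and hash_ok):
            bad.append(stanza["Filename"])
    return bad


if __name__ == "__main__":
    # Self-contained demonstration with a made-up package.
    blob = b"fake package contents"
    index = (
        "Package: demo\n"
        "Filename: pool/demo_1.0_all.deb\n"
        f"Size: {len(blob)}\n"
        f"SHA256: {hashlib.sha256(blob).hexdigest()}\n"
    )
    print(find_mismatches(index, lambda path: blob))
```

A check like this could run before an index is published, so that a corrupted or partially uploaded package is caught before any build consumes it.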
For all of Systems:
- Add to build services the necessary automation and support for integrating with the CMDB.
- Work with Networking to improve workflow around provisioning of DHCP (a prerequisite for automated builds) for new networks and firewall zones.
- Determine how best to incorporate provisioning of VMware ESXi servers into the build infrastructure.
- Integrate automated provisioning of virtual machines, system build of those machines, keying, and bootstrapping into a configuration management system.
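The last item above chains several stages: provisioning, system build, keying, and bootstrapping into configuration management. One way to think about that integration is as an ordered pipeline that stops at the first failing stage and records how far it got, so an operator knows where to resume. The sketch below is illustrative only; the stage names and the `BuildResult` type are assumptions, and each stage is a placeholder for whatever tooling actually performs that step.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# A stage is a name plus a callable that takes the hostname and raises on failure.
Stage = Tuple[str, Callable[[str], None]]


@dataclass
class BuildResult:
    hostname: str
    completed: List[str] = field(default_factory=list)
    failed_stage: Optional[str] = None
    error: Optional[str] = None

    @property
    def ok(self) -> bool:
        return self.failed_stage is None


def run_build(hostname: str, stages: List[Stage]) -> BuildResult:
    """Run each stage in order, stopping at the first one that raises."""
    result = BuildResult(hostname)
    for name, action in stages:
        try:
            action(hostname)
        except Exception as exc:
            result.failed_stage = name
            result.error = str(exc)
            return result
        result.completed.append(name)
    return result


if __name__ == "__main__":
    # Placeholder stages; real ones would call the hypervisor, the build
    # system, the keying service, and the configuration management system.
    stages = [
        ("provision-vm", lambda host: None),
        ("os-build", lambda host: None),
        ("keying", lambda host: None),
        ("config-management-bootstrap", lambda host: None),
    ]
    print(run_build("newhost.example.edu", stages))
```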
For the UNIX Systems group:
- Retire the Red Hat Network Satellite Service.
- Add the necessary interfaces and configuration options to the FAI build system to provide Debian and Ubuntu build services to campus.
- Convert Debian and Ubuntu repositories to reprepro and deploy repository signing and improved package checking.
- Overhaul and significantly expand existing documentation for building and deploying Linux systems at Stanford.
- Develop an initial bootstrap keying mode in wallet that allows a one-time-only download of a system keytab for a newly built system.
- Add automation of setting the root password to wallet.
- Retire the Solaris 8 and 9 build system as support is dropped for Solaris.
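The "one-time-only download" idea in the keying item above can be sketched as a store that hands out a secret exactly once in exchange for a single-use token. This is a generic in-memory illustration and does not use or describe wallet's actual interface; the class name, method names, and token scheme are all assumptions.

```python
import hmac
import secrets
from typing import Dict, Tuple


class OneTimeKeyStore:
    """Hypothetical store that releases each enrolled secret at most once."""

    def __init__(self) -> None:
        # hostname -> (token, secret bytes); an entry is removed on first fetch.
        self._pending: Dict[str, Tuple[str, bytes]] = {}

    def enroll(self, hostname: str, secret: bytes) -> str:
        """Register a secret for a new host and return its single-use token."""
        token = secrets.token_urlsafe(32)
        self._pending[hostname] = (token, secret)
        return token

    def fetch(self, hostname: str, token: str) -> bytes:
        """Return the secret if the token matches, then forget it."""
        entry = self._pending.get(hostname)
        if entry is None:
            raise PermissionError("nothing pending for this host")
        expected, secret = entry
        # Constant-time comparison avoids leaking how much of the token matched.
        if not hmac.compare_digest(expected.encode(), token.encode()):
            raise PermissionError("bad token")
        del self._pending[hostname]
        return secret


if __name__ == "__main__":
    store = OneTimeKeyStore()
    token = store.enroll("newhost.example.edu", b"placeholder keytab bytes")
    print(len(store.fetch("newhost.example.edu", token)))
    try:
        store.fetch("newhost.example.edu", token)
    except PermissionError as exc:
        print("second fetch refused:", exc)
```

In a build, the token would be handed to the installer at build time, so a freshly built system can retrieve its key without an administrator present, and a replayed request is refused.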
For the Windows Systems group:
- Complete deployment of network-based IPMI control for all servers.
- Evaluate integrating build with System Center configuration management. In other words, consider switching from Microsoft's "Lite Touch" build to "Zero Touch" automated build. Leverage SCCM for ongoing configuration management of servers.
- Evaluate switching to virtual DVD-ROM drive to initiate Lite-Touch or Zero-Touch deployments on virtual machines to avoid PXE limitations.
For Client Support:
- Develop ways of improving system deployments for clients by automating processes.
- On Macs, perform testing on other products to determine if aging NetRestore needs to be replaced.
Measures of success
- A fully automated deployment of a virtual machine can be supplied upon request.
- Anyone on campus can build a Debian, Ubuntu or RHEL system, taking advantage of the build infrastructure and local Stanford packages, and get a system without configuration (users, authorization, root passwords) specific to Computing Services.
- Systems can be built in Livermore without relying on campus services.
- Success of documentation improvements can be measured by increased campus interest and use of these local package repositories and IT Services' ability to answer more HelpSU tickets with references to existing documentation.
- System deployments are automated and do not represent a large investment of operator time.
- Good client feedback from periodic surveys.