Web Application Infrastructure

Overview

The main communication tool of many research, professorial, student, and administrative groups at Stanford is their web presence. IT Services provides Stanford with a web infrastructure that offers both web content and application hosting environments. These environments meet the varied needs of the community, allowing both novice and experienced users to create and deliver content over the web as well as create web applications.

Security is a top priority for this central, shared infrastructure. IT Services implements safeguards that may limit some flexibility and speed but protect the user community from accidental or intentional misuse of resources. When possible, commonly used tools are integrated with the shared infrastructure, providing simplicity and consistency of use in a manner that meets security objectives.

IT Services recognizes that some users require more flexibility or greater performance than the central, shared infrastructure provides, and would prefer to use dedicated servers for their applications. IT Services provides additional mid-tier and top-tier web infrastructure services and support to help address those users, and collaborates with the user community to determine how to improve the web infrastructure for existing and future needs.

Current State

The foundation of the web infrastructure is the pool of web servers. Recently updated to improve performance, this server pool provides:

  • Apache 2.x HTTP (Hypertext Transfer Protocol) server for Unix-based web applications and general web hosting needs.
  • AFS (Andrew File System) as a shared file system for storing content that should be served out of the central campus web infrastructure and for managing ACL (access control list) controls on modifying that content.
  • WebAuth v3 for web authentication and LDAP (Lightweight Directory Access Protocol) directory integration.
  • W3C standards for web content, including HTML (Hypertext Markup Language) 4, XHTML, CSS (Cascading Style Sheets), and DOM (Document Object Model), as well as JavaScript for Dynamic HTML.
  • Server-parsed HTML for some basic web page programmability needs.
  • CGI (Common Gateway Interface) service to run programs that provide dynamic content, collect user input, and offer services otherwise unavailable through the central shared service.
  • PHP (Hypertext Preprocessor) 5.2, Perl 5.10, Ruby 1.8, and Python 2.5 for CGI scripting and web applications.
  • MySQL as a database engine for web applications created by groups, departments, and classes; it is available as its own server and service, but also accessible via the web servers.
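
As an illustration of the CGI service described above, the following is a minimal sketch of a CGI program written in Python, one of the scripting languages listed. It is not taken from IT Services documentation: the function name build_response and the "name" query parameter are hypothetical, and the sketch uses current Python 3 syntax rather than the Python 2.5 installed on the servers. It assumes only the standard CGI convention that the web server passes the query string in the QUERY_STRING environment variable.

```python
#!/usr/bin/env python3
"""Minimal CGI sketch: greet the visitor named in the query string."""
import html
import os
from urllib.parse import parse_qs


def build_response(query_string):
    """Return a complete CGI response: headers, a blank line, then the body."""
    params = parse_qs(query_string)
    # Use the first "name" value, or a default when none was supplied.
    name = params.get("name", ["world"])[0]
    # Escape user input before placing it in HTML to avoid script injection.
    body = "<html><body><p>Hello, {}!</p></body></html>".format(html.escape(name))
    return "Content-Type: text/html; charset=utf-8\r\n\r\n" + body


if __name__ == "__main__":
    # Under CGI, the web server supplies the query string via the environment.
    print(build_response(os.environ.get("QUERY_STRING", "")), end="")
```

Keeping the response logic in a function separate from the environment lookup makes the script easy to exercise outside the web server before it is deployed to a cgi-bin directory.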

Many web applications and services are enabled as well:

  • Web collaboration tools: IT Services has developed a tool that automates the installation and maintenance of several collaborative applications: Drupal, WordPress, and MediaWiki. These tools are also discussed in the Web Publishing strategy.
  • Forms: IT Services also developed the Stanford Web Forms Service (aka FormBuilder), which provides a simple user interface for building online forms and collecting data.
  • Web programming: For programmers, IT Services developed two resources: the Web Development Cookbook and Best Practices, a recently reviewed, up-to-date collection of PHP scripts, modules, sample code, and documentation; and the Stanford Web Application Toolkit (SWAT), a set of tools designed to assist Stanford web developers in creating secure and robust PHP-based web applications.
  • Web site development: With its campus partners, IT Services developed the Stanford Self-Help Web Design Resources, a collection of templates, tools, a CSS style guide, and related materials. Clients who need more assistance with building or maintaining web sites in the Stanford environment can contract with IT Services for development and integration assistance.
  • Web hosting: Clients who want their own hostnames can use the virtual hosting service to provide redirects or proxies to sites, allowing their own hostname to serve as an alias for www.stanford.edu content. This service also provides WebAuth to servers that do not support it, such as Microsoft IIS or externally hosted servers. Proxying in this way can complicate the setting of authorization limits between the proxy and the web server content.

Clients who want more control or expanded services can contract with IT Services to purchase and run dedicated hardware or virtual machines (VMs). These systems can provide:

  • Unix-based web applications (including many of the services listed above).
  • Microsoft IIS for Windows-based web applications. (There is no native Windows WebAuth installation, and the proxy service is used to provide WebAuth for a Windows server when needed.)
  • Tomcat for Java servlets (currently a mix of Tomcat 4 and 5).
  • PAC (Proxy Auto-Config)-based proxying through a WebAuth-enabled server to provide authenticated access to purchased content such as academic journals.

Vision

The web is heavily used for both internal and external communication, and the central web infrastructure is the most openly available service that the general campus can use for this communication. Usage has increased, and while the service has gained new tools over the past year, the infrastructure still needs improvements, including:

  • A better Collaboration Tools installer that provides web access to other tools, which will cut down on HelpSU tickets.
  • Support for more experienced users who would like more tools for building their own sites.
  • Promotion of new tools that allow campus groups to add surveys and other functionality to their web sites, so that users and administrators adopt the newest, most robust tools available and transition away from outdated utilities (e.g., formage).
  • Simplification of site setup for such services as the proxy service and sessions, eliminating the wrinkles from the process.

IT Services is tracking the following technology trends as it develops its web application infrastructure strategy:

  • Hosting in the cloud (e.g., Google App Engine, Rackspace Cloud, Amazon's web services).
  • PHP.
  • Java Servlet API.
  • JSON used in conjunction with JavaScript for active web application content and serialized data exchange.
  • Ruby on Rails.
  • ASP.NET.
  • OpenSocial API.
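
To make the JSON item above concrete, here is a minimal sketch of serialized data exchange: a server-side script encodes a record as JSON text, and a consumer decodes it back into native data. On a web page the consumer would typically be JavaScript running in the browser; Python is used for both sides here only to keep the example self-contained. The function names and the event fields are hypothetical and chosen purely for illustration.

```python
import json


def encode_event(title, start, tags):
    """Serialize an event record to JSON text for delivery to a client."""
    record = {"title": title, "start": start, "tags": list(tags)}
    # sort_keys gives a stable key order, which makes output easy to compare.
    return json.dumps(record, sort_keys=True)


def decode_event(payload):
    """Parse JSON text back into a dictionary."""
    return json.loads(payload)


if __name__ == "__main__":
    payload = encode_event("Office hours", "2010-01-15T14:00", ["cs", "help"])
    print(payload)
    print(decode_event(payload)["title"])
```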

Goals

  • Allow individuals (faculty, staff, and students) access to the MySQL database service, which would allow them to run web applications that require a database.
  • Integrate Shibboleth more completely into the infrastructure.
  • Expand frameworks in multiple programming languages for those who do their own programming.
  • Develop and automate a collection of standard metrics that demonstrate the health of the service and its components (e.g., the web servers, CGI, and MySQL services).
  • Provide a full-featured tool for conducting web surveys and collecting associated data.
  • Pilot an OpenSocial gadget gallery, offering the range of Stanford services and external services of interest to the Stanford community; evaluate OpenSocial containers.
  • Provide web-based maintenance for basic web authorization tasks.
  • Move all www.stanford.edu proxies and redirects off of the proxy servers and onto the www.stanford.edu servers, in order to simplify WebAuth and other local setup for clients.
  • Define lifecycle management for groups and classes, archiving class information and disabling web access for unused spaces. There is currently no real lifecycle management for the groups and classes that use web services, so while there is some occasional manual cleanup, abandoned classes or groups usually remain indefinitely.
  • Retire formage, the old web forms utility, moving all clients to the Web Forms Service or another alternative.
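
The goal of standard health metrics could be approached in many ways. The following is one possible sketch, not a description of any existing IT Services tooling. It assumes that response-time samples for a component (for example, timed requests to a cgi-bin application) have already been collected as (elapsed seconds, success flag) pairs; the function name summarize and the choice of mean, 95th percentile, and error rate as the summary are assumptions made for illustration.

```python
import math
import statistics


def summarize(samples):
    """Summarize (elapsed_seconds, ok) samples into basic health metrics."""
    if not samples:
        raise ValueError("no samples to summarize")
    times = sorted(elapsed for elapsed, _ in samples)
    # Nearest-rank 95th percentile: the smallest value that is greater than
    # or equal to at least 95% of the samples.
    rank = max(1, math.ceil(0.95 * len(times)))
    failures = sum(1 for _, ok in samples if not ok)
    return {
        "count": len(times),
        "mean_s": statistics.mean(times),
        "p95_s": times[rank - 1],
        "error_rate": failures / len(times),
    }


if __name__ == "__main__":
    probe = [(0.1, True), (0.2, True), (0.3, True), (0.4, False)]
    print(summarize(probe))
```

Reporting a high percentile alongside the mean helps surface slow outliers that an average alone would hide.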

Roadmap

  • Apply AFS fixes as the AFS software is updated upstream, to help address known problems.
  • Continue to tune the MySQL service as required to improve web application performance.
  • Hardware load balance the web servers and implement session affinity.
  • Move proxy setup on to the web servers for any www.stanford.edu addresses.
  • Implement web collaboration and web publishing environments as described in the Web Collaboration and Web Publishing documents.
  • Improve web-based service controls (i.e., Control Panel).
  • Define additional monitoring and metrics for measuring the responsiveness and usage of the central web services, in order to prioritize resources. Primary examples are general cgi-bin application responsiveness, Stanford Web Application Toolkit usage, and Web Forms Service usage.
  • Develop skills around other web-based tools to understand and meet client needs (e.g., Shibboleth, ASP.NET backed by MySQL).
  • Contact clients using formage, encouraging them to move to the Web Forms Service and providing help as needed; apply improvements to Web Forms Service as needed to answer any formage migration problems.
  • Expand the Application Toolkit to include Perl and other languages commonly used at Stanford, as requested.
  • Add Stanford skins and WebAuth or Shibboleth integration to Qualtrics, along with any other requirements, to smoothly integrate it into the infrastructure as a survey solution.
  • Create a web interface to enable and disable WebAuth and Shibboleth, along with basic editing of the settings.
  • Create a web interface to control the contents of PTS-admins groups, used by several tools to determine who has administrative rights to group, class, and department cgi-bin tasks.
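
The roadmap item on load balancing with session affinity means that requests belonging to one user session keep reaching the same web server. Hardware load balancers usually implement this internally, often with cookies or source-address tracking, so the sketch below is only an illustration of the idea under stated assumptions: it hashes a session identifier to choose a server from a fixed pool. The function name pick_backend and the host names are hypothetical.

```python
import hashlib


def pick_backend(session_id, backends):
    """Map a session ID to one backend so repeat requests land together."""
    if not backends:
        raise ValueError("backend pool is empty")
    # A cryptographic hash spreads session IDs evenly and, unlike Python's
    # built-in hash(), gives the same result in every process.
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


if __name__ == "__main__":
    pool = ["web1.example.edu", "web2.example.edu", "web3.example.edu"]
    print(pick_backend("session-abc123", pool))
    print(pick_backend("session-abc123", pool))  # same server as above
```

One limitation of this simple hash-and-modulo scheme is that adding or removing a server remaps most sessions; consistent hashing or balancer-issued cookies are common ways to reduce that disruption.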

Measures of Success

  • Broader adoption of the web application infrastructure.
  • Fewer web sites running on departmentally hosted servers.
  • Increased ease of configuration and deployment of web applications into the infrastructure.
  • Performance improvements on the shared web environment.