Tuesday, July 14, 2009

Best practices for multi-tenant applications

On Designing and Deploying Internet-Scale Services is an interesting article in which James Hamilton, who at the time of writing was doing research on the Windows Live Platform, describes best practices for designing and deploying internet-scale services, such as Hotmail. The recommendations aim to optimize the cost of operations, but I believe they can improve the quality of your software in general, including multi-tenant applications.

Hamilton points out three design principles:

  1. Expect failures. Hardware and software will fail, so you had better be prepared; otherwise your complete system will crash.
  2. Keep it simple. Simple installation and maintenance procedures will result in fewer mistakes. Also, keep dependencies as simple as possible, so that replacing a server, for example, is easy.
  3. Automate everything. Staff is expensive and will make mistakes. An automated process can be tested and is repeatable.

The rest of the paper is mostly a list of best practices with a short description. I will give a short overview of what I believe to be the most important ones in a multi-tenancy environment. I encourage you all to read the full paper, as it contains excellent recommendations for developing web applications in general.

Application design
  • Develop in a complete environment. Although unit testing is essential, make sure that your component works in the complete system.
  • Zero trust of underlying components. Always validate input, as the component it came from may not have done so. Never trust that another component does the validation for you: better safe than sorry.
  • Do not duplicate functionality in different components. Code duplication makes maintenance more difficult.
  • Understand access patterns. Improve your application by understanding how it is being used, for example by optimizing the most frequently used paths for lower latency.
  • Version everything. Without versioning it is impossible to keep track of which features have been added or removed.
  • Avoid single points of failure. When a single point of failure stops working, your whole system may fail. Therefore, always use redundancy and replication.
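As a minimal sketch of the zero-trust recommendation above: the component below validates a request payload itself instead of assuming an upstream component already did. The names `TenantRequest`, `parse_tenant_request`, and the `tenant_id` and `page_size` fields are hypothetical and chosen only for illustration; they do not come from Hamilton's paper.

```python
from dataclasses import dataclass


class ValidationError(ValueError):
    """Raised when input received from another component fails validation."""


@dataclass(frozen=True)
class TenantRequest:
    # Hypothetical request shape, used only for this illustration.
    tenant_id: str
    page_size: int


def parse_tenant_request(raw: dict) -> TenantRequest:
    """Validate a raw payload, even if the sender claims it was already checked."""
    tenant_id = raw.get("tenant_id")
    # "".isalnum() is False, so an empty tenant_id is rejected as well.
    if not isinstance(tenant_id, str) or not tenant_id.isalnum():
        raise ValidationError("tenant_id must be a non-empty alphanumeric string")

    page_size = raw.get("page_size", 20)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(page_size, bool) or not isinstance(page_size, int):
        raise ValidationError("page_size must be an integer")
    if not 1 <= page_size <= 100:
        raise ValidationError("page_size must be between 1 and 100")

    return TenantRequest(tenant_id=tenant_id, page_size=page_size)
```

Calling `parse_tenant_request({"tenant_id": "acme42"})` yields a request with the default page size, while a payload such as `{"tenant_id": "a b"}` raises `ValidationError` before it can reach tenant data.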
Automatic management
  • Be restartable. If a service cannot be restarted when it is in a faulted state, the whole system needs to be restarted.
  • Keep deployment simple. The simpler the deployment process, the simpler it is to automate.
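The "be restartable" point can be illustrated with a small supervisor loop. This is only a sketch under simplifying assumptions: the service is modelled as a plain Python callable that raises an exception when it faults, and `run_with_restarts` with its parameters is a hypothetical helper, not something described in the paper.

```python
import time
from typing import Callable


def run_with_restarts(
    service: Callable[[], None],
    max_restarts: int = 3,
    delay_seconds: float = 0.0,
) -> int:
    """Run `service`; if it raises, restart it up to `max_restarts` times.

    Returns the number of restarts that were needed. If the service still
    fails after the last allowed restart, the exception is re-raised.
    """
    restarts = 0
    while True:
        try:
            service()
            return restarts
        except Exception:
            if restarts >= max_restarts:
                raise
            restarts += 1
            # Optional pause so a persistently faulted service is not hammered.
            time.sleep(delay_seconds)
```

Only the faulted service is restarted here; the rest of the system keeps running, which is the property the recommendation asks for.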
Dependency management
  • Use stable, proven components. Make sure you use components which are reliable. Alpha and beta components may contain many errors, which may cause your system to crash.
Release management
  • Allow rollbacks to previous versions. Necessary in case of an error in an update.
  • Monitor and instrument everything, and give enough fault information for diagnosis. Save more information than just 'A query has failed': save the query, the time, the error message and, if possible, the state of the application.
  • Make everything configurable. Make diagnosis options configurable, rather than adding them when a system is failing. Adding monitoring to a failing system is asking for problems.
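A rough sketch of the last two points, using Python's standard `logging` module: the fault record stores the query, the time, the error and the application state, and the diagnostic level is read from configuration rather than patched in while the system is failing. The function names, the `log_level` configuration key and the `tenant_app` logger name are assumptions made for this example.

```python
import json
import logging
import time


def configure_diagnostics(config: dict) -> logging.Logger:
    """Set the diagnostic level from configuration (hypothetical `log_level` key)."""
    logger = logging.getLogger("tenant_app")
    logger.setLevel(config.get("log_level", "WARNING"))
    return logger


def build_fault_record(query: str, error: Exception, app_state: dict) -> dict:
    """Collect more than just 'A query has failed'."""
    return {
        "timestamp": time.time(),
        "query": query,
        "error_type": type(error).__name__,
        "error_message": str(error),
        "app_state": app_state,
    }


def run_query(execute, query: str, app_state: dict, logger: logging.Logger):
    """Run `execute(query)`; on failure, log a full fault record and re-raise."""
    try:
        return execute(query)
    except Exception as error:
        record = build_fault_record(query, error, app_state)
        # default=str keeps logging from failing on non-JSON-serializable state.
        logger.error(json.dumps(record, default=str))
        raise
```

Raising the level to `DEBUG` then only requires a configuration change such as `configure_diagnostics({"log_level": "DEBUG"})`, not a code change on a system that is already in trouble.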
On Designing and Deploying Internet-Scale Services contains many more of these recommendations. What are your recommendations for building high quality software?
