Salesforce.com is one of the largest multi-tenant platforms out there. Lately, they have been giving some insight into how their database architecture works. Check out the paper "The design of the force.com multitenant internet application development platform" (for subscribed ACM members only) and the (excellent!) presentation by Craig Weissman, Chief Software Architect at Salesforce.com.
Heap database
In a nutshell, what Salesforce.com does is place all user data in a heap database. Using tenant-specific metadata, virtual database tables can be defined. Imagine we are implementing a stamp collection application for Salesforce.com. The heap could look like:
In this case, the metadata would describe that the columns val0 and val500 for tenant 123 contain the country and price of a stamp.
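To make the idea concrete, here is a minimal sketch in Python of how tenant-specific metadata could turn generic heap columns into a virtual table. Only the column names val0 and val500, tenant 123, and the country/price meaning come from the description above; the row values, the second tenant, the `object_id` field, and the names `HEAP`, `METADATA` and `virtual_table` are made up for illustration and say nothing about how Salesforce.com actually implements this.

```python
# Hypothetical heap: every tenant stores its data in the same generic columns.
# All values are stored as strings because the heap itself knows nothing about types.
HEAP = [
    {"tenant_id": 123, "object_id": "stamp", "val0": "Netherlands", "val500": "0.44"},
    {"tenant_id": 123, "object_id": "stamp", "val0": "Belgium", "val500": "0.59"},
    {"tenant_id": 456, "object_id": "invoice", "val0": "2009-05-01", "val500": "ACME"},
]

# Hypothetical metadata: per tenant and object, map each generic column
# to a logical column name and a converter for its type.
METADATA = {
    (123, "stamp"): {"val0": ("country", str), "val500": ("price", float)},
    (456, "invoice"): {"val0": ("date", str), "val500": ("customer", str)},
}


def virtual_table(tenant_id, object_id):
    """Return the tenant's rows with logical column names and typed values."""
    columns = METADATA[(tenant_id, object_id)]
    rows = []
    for row in HEAP:
        if row["tenant_id"] == tenant_id and row["object_id"] == object_id:
            rows.append(
                {name: convert(row[col]) for col, (name, convert) in columns.items()}
            )
    return rows


stamps = virtual_table(123, "stamp")
```

The point of the sketch is that the heap never changes shape: adding a new "table" or column for a tenant only means adding an entry to the metadata.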
Indexing
Since all columns may contain different types of data for each tenant, creating indexes on the heap makes no sense. Therefore, Salesforce.com creates tenant-specific indexes by copying the (small) parts of data which require indexing per tenant. Although this sounds like a good way of allowing indexing in a multi-tenant application, I wonder about the performance penalty of having so many small indexes.
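A rough sketch of what "copying the parts of data which require indexing" could mean is shown below. It assumes a heap keyed by a row id and builds a small, typed, sorted index for one column of one tenant. The helper names `build_index` and `range_lookup`, the row ids, and the sample values are hypothetical; the real mechanism at Salesforce.com may look quite different.

```python
import bisect

# Hypothetical heap keyed by row id; values are untyped strings.
HEAP = {
    1: {"tenant_id": 123, "val0": "Netherlands", "val500": "0.44"},
    2: {"tenant_id": 123, "val0": "Belgium", "val500": "1.20"},
    3: {"tenant_id": 123, "val0": "France", "val500": "0.59"},
    4: {"tenant_id": 456, "val0": "2009-05-01", "val500": "ACME"},
}


def build_index(tenant_id, column, convert):
    """Copy one column of one tenant into a sorted list of (typed value, row id)."""
    entries = [
        (convert(row[column]), row_id)
        for row_id, row in HEAP.items()
        if row["tenant_id"] == tenant_id
    ]
    entries.sort()
    return entries


def range_lookup(index, low, high):
    """Return the row ids whose indexed value lies in [low, high]."""
    start = bisect.bisect_left(index, (low,))
    end = bisect.bisect_right(index, (high, float("inf")))
    return [row_id for _, row_id in index[start:end]]


# Index the price column (val500) for tenant 123 only.
price_index = build_index(123, "val500", float)
cheap = range_lookup(price_index, 0.0, 0.60)
```

Because the copied values are converted to the tenant's own type, the index can be sorted meaningfully, which is exactly what an index on the raw heap column cannot do. The cost is that every write to an indexed column also has to update the copy.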
It is very nice to see that larger companies like Salesforce.com are beginning to open up and publish more details about their architecture. It is very cool and useful to learn from industrial cases like this one!
Paul,
I expect that Salesforce will make use of filtered indexes. In that case you can have customer-specific indexes which do not impact other customers. See http://www.keepitsimpleandfast.com/2009/05/performance-improvements-with-filtered.html for more information about how SQL Server 2008 implemented filtered indexes.
Thanks for sharing your knowledge; this blog about multi-tenancy is very useful.