A quick status on Composite C1 and Windows Azure

03 November 2010

Three weeks ago we held a 3-day workshop titled "Bringing Composite C1 to Azure" and I would like to tell you how far we are at the moment, what we learned and what we are planning to do. I’m not going into a technical deep dive about Windows Azure – I will just give a status report and account for some of the things we learned. Before I begin I'd like to thank @runeibsen @ploeh @danielovich @dotHenrik @henrikwh @reneloehde @martinesmann, and @napernik and @IngvarKofoed from core, for sharing knowledge and experience at a delightfully high level in a truly inquisitive and inspiring atmosphere.

The executive summary

Here is the super short status report for the lazy reader: We have Composite C1 up and running on Azure and are able to install feature packages, work with templates, dynamic data types, content etc. All the ‘fancy stuff’ in Composite C1 runs just fine on Azure, but we need to do some work before we are production-ready and yet more work to scale out across multiple instances. Those tasks are pretty well defined by now and we are currently solving them. We expect to have a release-grade build in December.

The longer story

At the workshop we broke up into two tracks after the first day – one track focusing on getting Composite C1 "as is" to run on Azure and another track focusing on decoupling Composite C1 from the file system.

The part of getting Composite C1 to run on Azure "as is" turned out to be amazingly simple – we just ended up spending a lot of time chasing ghosts which boiled down to some bad diagnostics code we copied from MSDN and some gotchas about file permissions on Azure.

CRASH!

The bad diagnostics code issue isn’t really the kind of story you get respect by sharing, but you will get it anyway and then I’ll wash my hands and try to blame Microsoft.

We had an issue where the whole web process crashed (the hard way) with an AppDomainUnloaded exception. This happened once Composite C1 had finished initializing: it does runtime compilations and "caches" the resulting work by writing an assembly to the website's ~/bin folder, which forces the ASP.NET AppPool to restart. This operation presents no problems on your vanilla Windows box, so we immediately jumped to the conclusion that messing with assemblies this way was a no-go on Windows Azure. Keep in mind that Azure was "a newcomer" in our world and, as humans, we were quick to accuse "the stranger". The crash completely killed off the host process, and we wasted a lot of time investigating this (non) problem, had IntelliTrace dumps shipped to @brianhprince and all.

It wasn’t until a week later, when I asked @XteProfilerNet – the kind of developer who writes a (kick ass) memory and performance profiler for .NET 4 in his spare time – to take a look, that we got to the bottom of it. To make a long story short, he ended up doing a code review of the RoleEntryPoint class we had added on day one, and he sent me an updated version of the class along with subtle comments about our developer skills. He had made _diagnosticMonitor a static and added the OnStop() method shown below (highly simplified).

[Code listing missing from the source]

The diagnostics monitoring process really dislikes it if you shut down an AppDomain without stopping monitoring, and this tore down the whole host process. This is the kind of code one ends up with when blindly reading articles like http://msdn.microsoft.com/en-us/library/ee843890.aspx. Personally I also learned the following two lessons from this whole ordeal: always clean up your mess, and stop letting your mistrust of strangers cloud your judgment.
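As a rough illustration of the lesson (not the actual C# class from the workshop), the pattern can be sketched in Python: keep the monitor in a class-level ("static") field so the stop hook can reach it, and shut it down before the host tears the application down. `FakeDiagnosticMonitor`, `WebRole`, `on_start` and `on_stop` are hypothetical names invented for this sketch; none of them are Azure SDK APIs.

```python
class FakeDiagnosticMonitor:
    """Hypothetical stand-in for a diagnostics monitor (not an Azure SDK class)."""

    def __init__(self) -> None:
        self.running = False

    def start(self) -> None:
        self.running = True

    def shutdown(self) -> None:
        self.running = False


class WebRole:
    """Hypothetical role entry point with explicit start/stop hooks."""

    # Class-level reference: on_stop must be able to find the monitor
    # that on_start created, even though they are separate calls.
    _diagnostic_monitor = None

    @classmethod
    def on_start(cls) -> bool:
        cls._diagnostic_monitor = FakeDiagnosticMonitor()
        cls._diagnostic_monitor.start()
        return True

    @classmethod
    def on_stop(cls) -> None:
        # Stop monitoring before the application is unloaded.
        if cls._diagnostic_monitor is not None:
            cls._diagnostic_monitor.shutdown()
            cls._diagnostic_monitor = None
```

The point is only the pairing: whatever is started in the start hook gets stopped in the stop hook.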

The client and server see different port numbers

We had a few places in Composite C1 where we used the port number (like ‘80’) to do client redirection and to generate unique cookie names; this enables developers to work on multiple sites using different port numbers (like localhost:81 and localhost:82) without having the cookies from different sites mix themselves up. In Windows Azure there is a proxy sitting between the client and server that changes the incoming port number, probably as part of its load-balancing features. This gave some unexpected results but was easily fixed.
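The post does not say how this was fixed. One plausible approach, sketched here in Python for brevity (Composite C1 itself is .NET), is to derive the port from the HTTP Host header, which carries the address the client actually typed, instead of from the port the server process is listening on. This assumes the proxy passes the Host header through unchanged; `client_port` and `cookie_name` are hypothetical helper names.

```python
def client_port(host_header: str, scheme: str = "http") -> int:
    """Return the port the client addressed, read from the Host header.

    Falls back to the scheme's default port when the header has no port.
    """
    default_port = 443 if scheme == "https" else 80
    _host, separator, port = host_header.rpartition(":")
    if separator and port.isdigit():
        return int(port)
    return default_port


def cookie_name(base: str, host_header: str, scheme: str = "http") -> str:
    """Build a cookie name that is unique per client-visible port."""
    return f"{base}_{client_port(host_header, scheme)}"
```

With this, `localhost:81` and `localhost:82` still get separate cookies, while a site reached on the default port gets the same cookie name regardless of which internal port the request was forwarded to.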

Files are not what they used to be!

Another issue we struggled with in the "deploy Composite C1 as is" track was file permissions. When you run your application in the Windows Azure Simulation Environment (i.e. press F5 in Visual Studio) the file system behaves exactly like you are used to; you are able to update files that are part of the application you deploy, like an XML file. This is slightly different when you then deploy in "real Azure". All files that are part of your deployment to Azure have file permissions that prevent you from changing them. But you are free to create and change whatever new files you like.

This was – for the purpose of getting the default Composite C1 installation up and running on Azure – a minor constraint, and once the diagnostics code failure and the port number issue were fixed, I did a deployment, ran the setup wizard, created templates and CSS files, uploaded images, added a bunch of feature packages (dynamically introducing App_Code files, DLLs, Workflow Foundation tasks etc.) and then wrote and published some "Hey! Composite C1 is running in the cloud" content pages without hitting a single issue. I gave up trying to find any issues after a few hours of kicking around – even the server performance was excellent!

The second track – decoupling the file system

The second track we started on the first workshop day focused on decoupling Composite C1 from its file system dependencies and introducing support for Windows Azure Blob storage. Here is a fine blog post describing why we would do this: http://geekswithblogs.net/shaunxu/archive/2010/05/05/azure-ndash-part-6-ndash-blob-storage-service.aspx

So what’s the problem?

There are a few things to be said about the file system and Composite C1. We have a provider-based model which handles reading and writing of content, data and state, enabling us to write new providers which will store these elements elsewhere, but there are a few areas where decoupling from the file system is either counterintuitive or simply impossible.

Let me quickly list some of the traits of our CMS, Composite C1:

it will …

  • … allow you to add and edit any file on the website, including things like web.config and .cs files.
  • … allow you to upload files to any directory, including ~/App_Code and ~/bin
  • … allow you to upgrade Composite C1 by installing an upgrade package which will update any aspect of the application required, including its own assemblies (like Composite.dll), web.config, global.asax etc.
  • … allow you to deploy feature packages which can create or update code files, assemblies or configuration files
  • … give front-end developers a 1:1 relationship between files you manage (like /styles/my.css) and URLs (like http://mysite/styles/my.css)
  • … do dynamic compilations involving the file system
  • … run on IIS7, but also IIS6 where managed code cannot redirect all incoming URLs to a virtual store

In most web applications the scope of "mutable data" is very limited and something you can fairly easily store in a SQL Server or blob storage. In Composite C1 everything – including the entire application, all its files, assemblies and config – is considered mutable. No file or piece of data is immutable. It’s a consequence of a feature we are proud of and it makes Composite C1 really easy to upgrade and extend through feature packages. We want to keep this feature on Azure as we believe this deployment model to be excellent for such an environment.

Some files cannot be virtualized and need to be on the file system, like web.config, App_Code files and assemblies in ~/bin. When you browse the file structure from the C1 Console and manage files you should not feel any "virtualization" whatsoever – if you are a front-end developer it is important that paths and URLs work the way you are used to.

In short, the distinction between files and data is a bit murky in our world and it does present us with some challenges.

What is the solution?

During the workshop different ideas were thrown around and we even got some great inspiration from @burningice, who pointed us in the direction of a highly relevant article, http://msdn.microsoft.com/en-us/library/aa479502.aspx, describing how you can virtualize ASP.NET artifacts like aspx pages.

The strategy we ended up with was this:

  1. Replace any file IO related code within the application with calls to façades, mimicking the functionality but hiding the implementation. This includes elements like File.ReadAllText(path), XDocument.Load(path), Directory.GetCreationTime(path), StreamReader and FileSystemWatcher.
  2. Ensure that no part of the application depends on direct IO. We did this by using IntelliTrace files, writing tools to analyze them, and adding FxCop rules to our builds.
  3. Build a provider model behind the façade so it can run with the classic file system or (obviously) the Azure Blob storage, without the application caring about it.
  4. Identify areas where we want files to be replicated from the Blob storage to the file system – this would be files like web.config, App_Code and assemblies, but could also include files addressable through URLs like aspx, css and images; then write the plumbing to ensure this replication is done.
  5. Write the required IHttpHandler and VirtualPathProvider logic to handle requests for files not ‘synced back’ in the step above. There are potential security issues to be addressed here.
  6. Make the façade encapsulating file IO available via a nice API for developers using Composite C1.
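To make the façade-plus-provider idea concrete, here is a minimal sketch in Python (used for brevity; it is not Composite C1's actual .NET API). Application code only talks to the façade, and the provider behind it can be swapped. Every name here (`FileProvider`, `LocalFileProvider`, `InMemoryBlobProvider`, `C1File`) is hypothetical, and the "blob" provider is just a dictionary standing in for blob storage, with no Azure SDK calls.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class FileProvider(ABC):
    """What the facade needs from any backing store."""

    @abstractmethod
    def read_all_text(self, path: str) -> str: ...

    @abstractmethod
    def write_all_text(self, path: str, text: str) -> None: ...

    @abstractmethod
    def exists(self, path: str) -> bool: ...


class LocalFileProvider(FileProvider):
    """Classic file system, rooted at a website directory."""

    def __init__(self, root: Path) -> None:
        self._root = Path(root)

    def _full(self, path: str) -> Path:
        return self._root / path.lstrip("/")

    def read_all_text(self, path: str) -> str:
        return self._full(path).read_text(encoding="utf-8")

    def write_all_text(self, path: str, text: str) -> None:
        target = self._full(path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text, encoding="utf-8")

    def exists(self, path: str) -> bool:
        return self._full(path).is_file()


class InMemoryBlobProvider(FileProvider):
    """Dictionary-backed stand-in for blob storage (illustration only)."""

    def __init__(self) -> None:
        self._blobs = {}

    def read_all_text(self, path: str) -> str:
        return self._blobs[path].decode("utf-8")

    def write_all_text(self, path: str, text: str) -> None:
        self._blobs[path] = text.encode("utf-8")

    def exists(self, path: str) -> bool:
        return path in self._blobs


class C1File:
    """Facade: callers use this and never see which provider is active."""

    _provider: FileProvider = InMemoryBlobProvider()

    @classmethod
    def use_provider(cls, provider: FileProvider) -> None:
        cls._provider = provider

    @classmethod
    def read_all_text(cls, path: str) -> str:
        return cls._provider.read_all_text(path)

    @classmethod
    def write_all_text(cls, path: str, text: str) -> None:
        cls._provider.write_all_text(path, text)

    @classmethod
    def exists(cls, path: str) -> bool:
        return cls._provider.exists(path)
```

Calling code such as `C1File.write_all_text("/styles/my.css", "body {}")` stays the same whether the active provider is the local file system or the blob stand-in, which is the property steps 1 and 3 are after.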

Currently we are halfway through step 3 and, judging from the feedback from our test team, we haven’t broken anything serious so far. Steps 4, 5 and 6 are fairly trivial. Once we are through those steps you should be able to run Composite C1 on Azure, fully featured. In the first generation we will be running on a single instance, and if enough demand arises we will look into supporting multiple instances as well.

The world’s smallest bootstrapper

A major feature of Composite C1 is its ability to "upgrade itself" on demand via the C1 Package system. Combined with Windows Azure’s ability to easily hot-swap between staging and production environments, we should be able to offer somewhat of a dream when it comes to easy and safe upgrades.

One thing that could prevent us from reaching this Upgrade Nirvana is the fact that our upgrade packages would fail miserably if they tried to update any file that was part of the initial Azure deployment. To address this issue we are looking into a minimal ‘bootstrapper’ which will automatically download the latest Composite C1 release and configure it for Azure on brand new installations, or mirror the file structure from the blob storage for new instances.
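The "mirror the file structure from the blob storage" part can be sketched as follows, again in Python and under loud assumptions: the blob store is represented by a plain dictionary of path to bytes, and `mirror_to_disk` is a hypothetical helper, not part of any planned Composite C1 API. Because the mirrored files are created by the running instance rather than shipped in the deployment, they would remain writable for later upgrade packages.

```python
from pathlib import Path, PurePosixPath


def mirror_to_disk(blobs, root):
    """Copy every blob to a file under root; return the relative paths written.

    blobs maps site-relative paths (e.g. "/web.config") to bytes.
    Files whose content already matches are left untouched.
    """
    root = Path(root)
    written = []
    for name, content in sorted(blobs.items()):
        relative = PurePosixPath(name.lstrip("/"))
        # Refuse empty names and anything that could escape the site root.
        if not relative.parts or ".." in relative.parts:
            raise ValueError(f"unsafe blob name: {name!r}")
        target = root.joinpath(*relative.parts)
        if target.is_file() and target.read_bytes() == content:
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)
        written.append(str(relative))
    return written
```

Running it a second time with unchanged blobs writes nothing, so a new instance can call it on every start without rewriting files such as web.config unnecessarily.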
