Software Development


.Net Development | 19 May 2012 09:19 pm

I am using Microsoft's Entity Framework 4.1 Code First approach to rapidly develop a basic MVC 3 application for a friend of mine. The app also leverages the standard MVC Internet Website template and its built-in membership management functionality. Hence it uses the SQL Membership Provider, which adds a bunch of tables to the database by means of the aspnet_regsql.exe utility. Thus not all database access is via EF.

To get it all working together, my initial, naive approach was to use the utility to generate the SQL script needed to build out the database and then execute that script as part of the Seed() method of my database initializer. (I was subclassing both DropCreateDatabaseIfModelChanges for my website and DropCreateDatabaseAlways for my automated tests.)
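
Roughly, the website initializer looked like this (the context class name and script path here are illustrative placeholders, not the real ones):

public class WebsiteInitializer
    : DropCreateDatabaseIfModelChanges<MyContext>
{
    protected override void Seed(MyContext context)
    {
        // Run the aspnet_regsql-generated script to add the
        // membership tables to the freshly created database.
        ExecuteSqlScript(context.Database,
            @"_Resources\aspnet_regsql.sql");
    }
}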

The method to actually execute the script looked like this:

public static void ExecuteSqlScript(Database database, string scriptPath)
{
    var conn = database.Connection;
    if (conn.State == ConnectionState.Closed)
        conn.Open();

    var fullScript = File.ReadAllText(scriptPath);

    // The script contains GO batch separators, which ADO.NET does not
    // understand, so split it into individual batches and run each one.
    foreach (var command in Regex.Split(fullScript, @"\bGO\b"))
    {
        if (string.IsNullOrWhiteSpace(command))
            continue;

        using (var cmd = conn.CreateCommand())
        {
            cmd.CommandText = command;
            try
            {
                cmd.ExecuteNonQuery();
            }
            catch (Exception e)
            {
                // Include the failing batch in the exception to ease debugging.
                throw new ApplicationException(e.Message +
                    "\r\n\r\nCommand: " + command, e);
            }
        }
    }
}

This worked great for my first few iterations, until I needed to add a model to the framework to represent some of the data on the aspnet_Users table.

Mapping my new User entity to the existing table name was easy. It just required adding the following to my DbContext implementation:

protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    base.OnModelCreating(modelBuilder);

    // Point the User entity at the membership table that
    // aspnet_regsql creates, and line up the one renamed column.
    modelBuilder.Entity<User>()
        .ToTable("aspnet_Users");
    modelBuilder.Entity<User>()
        .Property(u => u.Name).HasColumnName("UserName");
}
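
(For reference, the User entity being mapped is just a simple POCO; a hypothetical version matching those two columns would be:)

public class User
{
    // Picked up as the primary key by convention; matches the
    // uniqueidentifier UserId column on aspnet_Users.
    public Guid UserId { get; set; }

    // Stored in the UserName column per the mapping above.
    public string Name { get; set; }
}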

This led to the aspnet_regsql-generated script failing, because EF was now creating the aspnet_Users table before the script ran. So my next step was to hack the script by adding the following just before the code to create the table:

IF EXISTS (SELECT name
           FROM sysobjects
           WHERE (name = N'aspnet_Users')
             AND (type = 'U'))
   AND NOT EXISTS (SELECT * FROM sys.columns
                   WHERE name = N'ApplicationId'
                     AND object_id = OBJECT_ID(N'aspnet_Users'))
BEGIN
    -- This indicates the presence of the table as created by
    -- Entity Framework Code First. My EF schema does not
    -- use/create the full table, so just drop it so that the
    -- script below will do its job.
    DROP TABLE [dbo].aspnet_Users
END

IF (NOT EXISTS (SELECT name
                FROM sysobjects
                WHERE (name = N'aspnet_Users')
                  AND (type = 'U')))
BEGIN
  PRINT 'Creating the aspnet_Users table...'
/* Actual CREATE TABLE statement omitted for brevity... */

This was a hack, but it was rapid development of the “just get it working” sort. And it worked great, until I added a new model that referenced a user, like so:

public class Record
{
    public int RecordId { get; set; }
 
    public Guid UserId { get; set; }
 
    [Required]
    [ForeignKey("UserId")]
    public virtual User Recorder { get; set; }
 
    [MaxLength(4000)]
    public string Notes { get; set; }
}

This busted my hack because I could no longer just drop the table. There was now a foreign key constraint that had to be dropped first. While I could certainly patch my hack to drop the key too, with more foreign keys on the horizon, it was obvious that this approach would quickly become unmanageable.

The problem was that, especially for my automated tests, I wanted to drop and recreate the database regularly. The right solution, therefore, was to somehow execute the aspnet_regsql script after EF created the database itself, but before EF built out any tables. If I could do that, EF was smart enough to just use the existing aspnet_Users table.

In my search for a solution, I found this article, which described how to create a database initializer that would just drop all the tables and recreate them without dropping and recreating the database itself. This seemed promising, but its approach to dropping all the tables didn't account for foreign keys or any other objects that might need to be dropped too. Modifying it to do so might be possible but could also become a real maintenance headache. My concern was how complex the code would become to find and drop all dependencies that my code might create against the aspnet tables without dropping those that the aspnet_regsql script itself created. It might be easy; it might not. I still felt that simply dropping the whole database and rebuilding it was the best approach. So I set out to find a way to inject some functionality between EF's database creation and schema build-out actions. I posted on the EF forum and got a suggestion to try migrations. That wasn't a bad idea, but it still seemed more complicated than I wanted.

Going back to the blog on custom initialization strategies, I looked harder at the provided code, googled around for some additional examples of custom strategies, and in the end was able to come up with a custom initializer that did what I wanted. It:

  1. drops the database (conditionally or always, based on a constructor arg).
  2. creates the database itself.
  3. runs the aspnet_regsql script.
  4. uses EF to build out the remaining schema.

Without further ado, here is the code:

using System;
using System.Data;
using System.Data.Entity;
using System.Data.Entity.Infrastructure;
using System.Data.SqlClient;
using System.IO;
using System.Text.RegularExpressions;
using System.Transactions;
 
public class CreateDatabaseWithAspNetRegSql<TContext> 
    : IDatabaseInitializer<TContext>
    where TContext : DbContext
{
    public enum CreationStrategy { AlwaysCreate, CreateIfModelChanged }
 
    private readonly CreationStrategy _creationStrategy;
 
    public CreateDatabaseWithAspNetRegSql(
        CreationStrategy creationStrategy)
    {
        _creationStrategy = creationStrategy;
    }
 
    #region IDatabaseInitializer<TContext> Members
 
    public void InitializeDatabase(TContext context)
    {
        bool dbExists;
        // Suppress any ambient transaction (e.g., one opened by a test
        // harness) so the existence check does not try to enlist in it.
        using (new TransactionScope(TransactionScopeOption.Suppress))
        {
            dbExists = context.Database.Exists();
        }
        if (dbExists)
        {
            if (_creationStrategy == CreationStrategy.CreateIfModelChanged
                && context.Database.CompatibleWithModel(false)) 
                return;
 
            context.Database.Delete();
        }
 
        CreateDatabase(context);
 
        DoAspNetRegSql(context.Database);
 
        CreateTablesForModels(context);
 
        Seed(context);
        context.SaveChanges();
    }
 
    #endregion
 
    #region Private/Protected Methods
 
    private static void CreateDatabase(TContext context)
    {
        var masterDbConnString = context.Database
            .Connection.ConnectionString
            .Replace(context.Database.Connection.Database, "master");
 
        //TODO: Find way to create db in an agnostic way.
        using (var conn = new SqlConnection(masterDbConnString))
        {
            conn.Open();
 
            using (var cmd = conn.CreateCommand())
            {
                cmd.CommandText = string.Format("CREATE DATABASE [{0}]", 
                    context.Database.Connection.Database);
                cmd.ExecuteNonQuery();
            }
        }
    }
 
    private static void DoAspNetRegSql(Database database)
    {
        //TODO: This file name reference is a hack, 
        //need a better way of handling this!
        ExecuteSqlScript(database, @"C:\Users\Ken\Documents\Visual Studio 2010\Projects\MVCSandbox\_Resources\aspnet_regsql.sql");
    }
 
    protected static void ExecuteSqlScript
        (Database database, string scriptPath)
    {
        var conn = database.Connection;
        if (conn.State == ConnectionState.Closed)
            conn.Open();

        var fullScript = File.ReadAllText(scriptPath);

        // Split on the GO batch separators, which ADO.NET does not
        // understand, and run each batch individually.
        foreach (var command in Regex.Split(fullScript, @"\bGO\b"))
        {
            if (string.IsNullOrWhiteSpace(command))
                continue;

            using (var cmd = conn.CreateCommand())
            {
                cmd.CommandText = command;
                try
                {
                    cmd.ExecuteNonQuery();
                }
                catch (Exception e)
                {
                    // Include the failing batch to ease debugging.
                    throw new ApplicationException(e.Message +
                        "\r\n\r\nCommand: " + command, e);
                }
            }
        }
    }
 
    private static void CreateTablesForModels(TContext context)
    {
        var modelBuildoutScript = ((IObjectContextAdapter)context)
            .ObjectContext.CreateDatabaseScript();
 
        RemoveTableCreationCommandsForTablesCreatedByAspNetRegSql(ref modelBuildoutScript);
 
        context.Database.ExecuteSqlCommand(modelBuildoutScript);
    }
 
    private static readonly Regex __aspNetCreateTableCommandFinder 
        = new Regex(@"create table \[dbo\]\.\[aspnet_\w+\][^;]*;");
 
    private static void RemoveTableCreationCommandsForTablesCreatedByAspNetRegSql
        (ref string script)
    {
        script = __aspNetCreateTableCommandFinder.Replace(script, string.Empty);
    }
 
    #endregion
 
    #region Protected Methods
 
    protected virtual void Seed(TContext context)
    {
 
    }
 
    #endregion
}
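
Using it is then just a matter of registering the initializer at startup, e.g. in Application_Start (MyContext here stands in for your DbContext type):

Database.SetInitializer(
    new CreateDatabaseWithAspNetRegSql<MyContext>(
        CreateDatabaseWithAspNetRegSql<MyContext>
            .CreationStrategy.CreateIfModelChanged));
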
Software Development | 07 Dec 2011 12:43 am

When using version control and working on a branch other than the master, one should periodically pull any changes made to master into the working branch. If you are using Git, the preferred way to do this is with the rebase command rather than merge. (Here is a good explanation of the difference between the two and why rebase is usually preferred.)

However, rebase sometimes does not work as you would like. I hit a good example of this just this evening and wanted to document it. The scenario is that of working on my own branch of a Rails project where we had just started using RubyMine. I had created my working branch just prior to switching to RubyMine. To my chagrin, I discovered that RubyMine had created a bunch of files to manage its view of the project in a folder named .idea, and I had committed them on my branch.

In the meantime, another developer discovered the same issue and coincidentally added the .idea directory to .gitignore on master. The natural thing to do, I thought, was to rebase my branch on master. I was promptly told that I could not do this. I failed to capture the exact error message, but basically the problem was that rebase was attempting to replay the addition of files to the .idea directory on top of a repository that now called for the .idea directory to be ignored.

Because I only had a few commits on my branch, I decided the quick solution would be to create a new working branch based on the current master and then cherry-pick my commits from the old working branch onto the new one.

My first cherry-pick command resulted in a “commit not found” error. I thought this was very odd, but on examining the contents of the commit I was attempting to cherry-pick, it turned out that it only contained files in the .idea directory. So it seems a better error message would have been “nothing found in the commit.”

Seeing as I could safely omit that commit, I went on to cherry-pick the remaining ones. Each initially failed due to merge conflicts, because each attempted to modify files under .idea. To solve this, I had to execute a git rm -f <file> command for each such file to resolve the conflict.

After resolving the conflicts, I followed the onscreen instructions given at the point the cherry-pick failed: namely, to run the git commit -c <hash> command, specifying the hash of the commit I had cherry-picked.

The -c option creates a new commit with the same message and author as the one referenced by the hash. I had not seen this option before, and as it turns out, I really wanted to use the capitalized -C version. Both do the same thing, but the lowercase version throws you into vi to edit the commit message, whereas the uppercase one uses the commit message as-is.
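
For reference, the whole recovery sequence looked roughly like this (branch names, file names, and hashes are illustrative):

git checkout master
git checkout -b new-working-branch
git cherry-pick abc1234          # conflicts on files under .idea
git rm -f .idea/workspace.xml    # repeat for each conflicting file
git commit -C abc1234            # reuse the original message as-is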

Software Development | 20 Nov 2011 12:40 pm

In a recent design session, we were discussing the API for an event bus that supports asynchronous RPC. A colleague of mine proposed a fluent interface. A simplified version of it looked roughly like this:

bus.for(event).withReplyTo(handler).withATimeOutOf(500).send();

The proposal led to some debate because some members of the team liked the idea and others preferred the more traditional approach of using overloaded methods, as follows.

bus.Send(event, replyHandler);
bus.Send(event, replyHandler, timeout);

In the end it was agreed that the traditional API was essential, and if some of the team wanted a fluent API, one could be written as a wrapper around it.

I think this was the right call, and the point of this post is to explain why. One of the reasons given against the fluent interface was that such interfaces are very hard to extend if you do not have the ability to modify the original source. In short, they violate the Open/Closed Principle (OCP), one of the core principles of SOLID object-oriented design.

Let's take a look at why.

Here is a pseudo implementation of the fluent API.

class Builder
{
    Builder for(IEvent event) { ... ; return this; }
    Builder withReplyTo(IHandler handler) { ... ; return this; }
    Builder withATimeOutOf(int milliseconds) { ... ; return this; }
    void send() { ... }
}

Omitted is the operative code that would collect the various arguments and ultimately invoke the base API in send(). What remains is just what is needed to do the fluent call chaining, namely returning an instance of the builder itself. It is this declared return type that violates OCP.

Suppose a third party wants to add a second event handler to be invoked in case of a timeout. We will presume the base Bus class is otherwise well factored and adheres to OCP. Because of this, adding a new Send overload to the traditional API can be achieved by subclassing the base implementation. Invoking the new method will look like this.

bus.Send(event, replyHandler, timeout, timeoutHandler);

But what happens if we try to extend our fluent API by subclassing the Builder?

class NewBuilder : Builder 
{
    NewBuilder andHandleTimeoutsWith(ITimeoutHandler handler)
    { 
        ...
        return this;
    }
}

Now we see the problem with the base class methods returning Builder. From our subclass, even if the return value is actually an instance of NewBuilder, it is returned by the inherited members as Builder. In a statically typed language, the new method is not accessible without a very un-fluent explicit cast.
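
Concretely, a caller who wants the new method has to break the chain with a cast, something like this (in the same pseudo-code style as above):

((NewBuilder)builder.for(event).withReplyTo(handler))
    .andHandleTimeoutsWith(timeoutHandler)
    .send();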

A Generic Solution?

In a simple API, if the original designers are prescient enough, one might get around this by using generics. Suppose the original builder had been written generically.

class Builder<T> where T : Builder<T>
{
    T for(IEvent event) { ... ; return (T)this; }
    T withReplyTo(IHandler handler) { ... ; return (T)this; }
    T withATimeOutOf(int milliseconds) { ... ; return (T)this; }
    void send() { ... }
}

As can be seen, we still need the non-generic Builder so that the generic one can be instantiated in terms of it. If we would normally access the fluent API like so: bus.GetFluentBus<Builder>(), we could then access our extended API like this: bus.GetFluentBus<NewBuilder>().
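
Under this scheme, a sketch of the non-generic default and the third-party extension would look like this:

class Builder : Builder<Builder> { }

class NewBuilder : Builder<NewBuilder>
{
    NewBuilder andHandleTimeoutsWith(ITimeoutHandler handler)
    {
        ...
        return this;
    }
}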

This approach gives us a builder that complies with OCP. The downside is that we have muddied up our GetFluentBus method with a generic argument just to allow for what is likely to be an edge case. But supporting edge cases is what extensibility is all about.

A bigger problem is that the generic approach only works well for a simple, single-builder API. Most fluent interfaces use a set of builder classes and/or interfaces in order to limit the methods available at any point in the call chain to those that make sense. In this scenario, genericizing the API gets ugly. Every builder will need to be generic not only with respect to itself but with respect to every builder that it returns directly or indirectly, and our GetFluentBus method is going to need generic arguments for all of them. Consequently, generics are not a good general-purpose solution.

Conclusion

Fluent APIs do have an appeal and a legitimate place in software design. However, they are not extensible and therefore should never be the only means of doing something. Rather, they need to be seen as an augmentation and approached with the understanding that their purpose is to make the common use cases fluent for those who want the option.
