ReBuildAll Blog
Thoughts (mostly) on .NET development

Unit testing database code   (Tips & Tricks)   
You probably use some kind of ORM product when creating solutions. The ORM manages the communication with the database. On top of the ORM code you may layer other code - repositories, domain services or something else. Or maybe you use the ORM directly from your controllers (if you are using an MVC pattern).

In this article I will demonstrate how to test this database code by using "unit tests".

Thanks go to Offbeat Solutions for the idea (NHibernate + SQLite)!

Scenario
My demo consists of the following scenario and technologies. From the bottom up: the data layer is SQL Server. I will use Entity Framework Code First to access the database. The Entity Framework code is called from a repository layer that groups together logical operations. The controllers of the application use the repositories to access the data.

I will reference my example code throughout the article. Be sure to take a look on GitHub!

The example uses Entity Framework and Visual Studio Unit Tests (MSTest), but it should not be difficult to apply the same approach to other ORMs and test suites as well.

Why should unit tests not hit the database?
Unit tests should run independently from one another and from other resources, especially external dependencies. If we allow the unit tests to go to a real database (even a development database) we hit some big problems. How can we guarantee the data is intact and always the same? If a test modifies the database, does it affect another test? How can I run the tests on a build server where there is no database available?

You could of course install a database on the build server and set up some SQL scripts to delete all data prior to the tests, and so on. But I would argue you are still setting yourself up for trouble, not to mention that those SQL scripts now need to be maintained for the tests to run, something that we developers are pretty bad at :)

If you do not want to go to the database ...
If your unit tests cannot go to the database, then the repository layer is the one you will have to cut off. In my example code, you will find the IProductRepository interface, which describes the repository's contract. The ProductRepository class implements this interface and uses Entity Framework to retrieve the needed information.

If I wanted to test my code without accessing the database, I could write another implementation of the interface called ProductMemoryRepository that operates purely in memory. I would put this implementation in the unit test DLL and inject it into the code under test (into my ProductController, for example).
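
For illustration, a minimal sketch of what such an in-memory implementation could look like. The members shown here (Add, GetCheapProducts) are hypothetical; the real IProductRepository in the sample has its own set of methods:

    // Hypothetical interface for illustration only - the sample's IProductRepository has different members.
    // Requires System.Collections.Generic and System.Linq.
    public interface IProductRepository
    {
        void Add(Product product);
        IEnumerable<Product> GetCheapProducts(decimal maxPrice);
    }

    // In-memory test double that never touches a database.
    public class ProductMemoryRepository : IProductRepository
    {
        private readonly List<Product> products = new List<Product>();

        public void Add(Product product)
        {
            products.Add(product);
        }

        public IEnumerable<Product> GetCheapProducts(decimal maxPrice)
        {
            // The same kind of filtering the real repository would push to the database.
            return products.Where(p => p.Price < maxPrice).ToList();
        }
    }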

The problem with this approach is that the code in my actual ProductRepository is not tested. There might be LINQ expressions and filters there that are now outside the scope of my test.

Enter SQL Server Compact Edition
How can we test our repository code without going to the actual database? SQL Server Compact Edition is an in-process database engine that you can include in your application. You include a DLL, and voila, you have SQL Server inside your application (well, more or less). There is even an Entity Framework binding available; both Entity Framework and SQL CE are available through NuGet.

The only catch is that SQL CE requires a file to store the data in. But it can be any file (even a temp file), and it can be destroyed as soon as the last connection to the database is closed.

Can we use this in unit tests? Absolutely.

Because Entity Framework works with SQL CE, we do not need to modify our Entity Framework code at all.

So before our test runs, let's create a new database:

        [TestInitialize]
        public void Initialize ()
        {
            currentFile = Path.GetTempFileName();
            string connectionString = string.Format(
                "Data Source={0};Persist Security Info=False", currentFile);

            var conn = DbProviderFactories.GetFactory("System.Data.SqlServerCe.4.0").CreateConnection();
            conn.ConnectionString = connectionString;
            conn.Open();

            entities = new Entities(conn);
            entities.Database.CreateIfNotExists();

            Seed();
        }


What happens here?

First we generate a temporary filename and construct a connection string using it. We open a new connection using this connection string and pass the connection to our entities class. We also make sure the database and its tables exist. This last step guarantees that the test will always have a fresh copy of the database, so we do not need to maintain scripts to provide a clean database. This method runs before every test, so every single test gets its own database to work with.

Our Code First entities class, which I called Entities, needs a new constructor for this to work:

    public class Entities : DbContext
    {
        public Entities ()
        {
        }

        public Entities(DbConnection connection)
            : base(connection, true)
        {
        }

        // DbSet<...> properties and the rest of the class omitted
    }


The default constructor is used normally in the code; it finds the connection string in the config file. The second constructor receives the connection as a parameter, and this is how the test connection is given to the entities class.

The test initialization method also calls a Seed() method at the end. This is how any initial data required by the tests can be added to the database.

        protected virtual void Seed()
        { }
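
A derived test class can then override Seed() to put whatever baseline data its tests need into the fresh database. A small illustrative example (the product data here is made up, and I assume the base class exposes the context to derived classes as the Entities property used in the test further below):

        protected override void Seed()
        {
            // Baseline data every test in this class can rely on.
            Entities.Products.Add(new Product() { Name = "Vanilla icecream", Price = 2.49m });
            Entities.SaveChanges();
        }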


At the end of the test we need to destroy the database.

        [TestCleanup]
        public void Teardown ()
        {
            entities.Dispose();
            entities = null;

            File.Delete(currentFile);
        }


All of this code can go into a base class that the actual test classes inherit from when they require database access. In my sample code this class is called SqlCompactBasedTest; any test class that needs its own database for testing should inherit from it.
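
Put together, the base class is not much more than the pieces shown above. A rough outline (a sketch, not necessarily line for line what the sample contains):

    public abstract class SqlCompactBasedTest
    {
        private string currentFile;
        private Entities entities;

        // Gives derived test classes access to the context created in Initialize().
        protected Entities Entities
        {
            get { return entities; }
        }

        [TestInitialize]
        public void Initialize ()
        {
            // ... create temp file, open the SQL CE connection, create the schema, call Seed() ...
        }

        [TestCleanup]
        public void Teardown ()
        {
            // ... dispose the context and delete the temp file ...
        }

        protected virtual void Seed()
        { }
    }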

Now we can create a simple test:

        [TestMethod]
        public void Test_AddProduct()
        {
            Product p = new Product() { Name = "Raspberry icecream", Price = 1.99m };
            Entities.Products.Add(p);
            Entities.SaveChanges();

            var cheapThings = Entities.Products.Where(product => product.Price < 10.0m);
            Assert.AreEqual(1, cheapThings.Count());
        }


Because we know exactly what is in the database BEFORE the test, we can make assertions on its contents after we perform some operations. Please check the example code for more sample tests.

Here I used the Entities class directly, but of course I can also write tests that exercise other parts of the code base which, somewhere deep down the call chain, end up using the database. In the sample code you can see such a test in the RepositoryTests test class, in the test method ProductController_ShowProducts().
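
As a rough illustration of the shape of such a test (the constructor and method names below are hypothetical; the actual tests in the sample look different):

        [TestMethod]
        public void ProductRepository_FindsCheapProducts()
        {
            // Arrange: known state in the throwaway database.
            Entities.Products.Add(new Product() { Name = "Plain yoghurt", Price = 0.99m });
            Entities.SaveChanges();

            // Act: go through the real repository, so its LINQ queries are exercised too.
            IProductRepository repository = new ProductRepository(Entities);
            var cheapThings = repository.GetCheapProducts(10.0m);

            // Assert against the data we seeded above.
            Assert.AreEqual(1, cheapThings.Count());
        }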

Example code
I implemented the concepts demonstrated here in an actual solution. The application is very small and not a real application in the sense that the main program does not do anything. The unit tests, however, do work and use the technique described above. The sample uses NuGet to pull in the Entity Framework and SQL CE dependencies.

The application has a very basic data model with some persons, products and orders. It also has an Entity Framework database migration, so you could create the data in an actual SQL Server database, but the sample program does not really access it. The unit tests use the technique described above and contain the helper class that sets up the database.

You can find the code on GitHub.

Merging WSDL and XSD files   (Tips & Tricks)   
Updated! This tool has been updated since this article was written. Read more about the update here.

Having played with BizTalk a while ago (version 2006 R2), I ran into an interesting problem: in certain situations BizTalk would not accept web service description files (WSDL) where the XML schema was stored in separate files (XSD). No matter how hard I tried, there was no success. I even tried the good old trick of putting the files on a web server and adding them over HTTP. It would not work. (This is the way you trick the Visual Studio proxy generator when WSDL and XSD files are spread over the hard drive in different directories, but that is another story :) )

Finally, I ended up creating a tool I called WSDLMerge. This can take a WSDL file, local or remote, and merge it with all the XSD files referenced. The merging is recursive, so any XSD files referenced by other XSD files are also included. It can follow local path locations and remote path locations. The result is a single WSDL file, that contains everything.

If you want to jump right into the code, you can find the tool in source code form on Google Code. You will need Visual Studio 2010 to compile it, but you can safely run it on .NET 3.5 SP1 (maybe even earlier).

The rest of this post will talk about how this tool works.

Just XML

WSDL files are just XML files after all. So we can go ahead and load it from disk or from a URL.

XmlDocument wsdl = new XmlDocument ();
wsdl.Load ( filename );

And voila, we have the entire WSDL loaded up. We need to create an XML namespace manager because we are going to work with namespaces; XPath searches in particular require it. If you have the source code by now (see link above), you can find the following code in a method named PrepareNamespaceManager().

            XmlNamespaceManager manager = new XmlNamespaceManager ( wsdl.NameTable );
            manager.AddNamespace ( "wsdl", WSDLNamespace );
            manager.AddNamespace ( "xsd", XSDNamespace );
            return manager;

The tool verifies that the loaded file is an actual WSDL file. It does this by checking the root element, which should be wsdl:definitions. There are probably better ways to do this, but it is good enough for our purposes.
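
A simplified version of that check (not necessarily the tool's exact code) is just a look at the document element:

            if ( wsdl.DocumentElement == null
                || wsdl.DocumentElement.LocalName != "definitions"
                || wsdl.DocumentElement.NamespaceURI != WSDLNamespace )
            {
                throw new ArgumentException ( "The file does not look like a WSDL document." );
            }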

Schemas, where are thee?

Next step: find the schemas. These can be found under the XPath /wsdl:definitions/wsdl:types. We read the import definitions one by one and load the schema location and namespace parts of the imports. We also keep track of all namespaces we have already loaded.

Ok, so first we locate the element where we should find the schema import statements:

XmlElement typesElement = wsdl.SelectSingleNode ( "/wsdl:definitions/wsdl:types", manager ) as XmlElement;

If such an element exists, we can start finding any schemas:

XmlElement schemaElement = typesElement.SelectSingleNode ( "xsd:schema", manager ) as XmlElement;

Here the process turns recursive. This is done in a method called ProcessSchema() that processes a single schema definition.

Inside this method we need to know whether the schema is inline or imported, so we look for import elements:

imports = rootElement.SelectNodes ( "xsd:import", manager );

If we find anything, we check the namespace for this schema as well as the schema location. If the namespace is not yet loaded, we load the .XSD file (from either disk or a URL), attach it to the main document, and remove the import statement.

XmlDocument schemaDocument = new XmlDocument ();
schemaDocument.Load ( importLocation );

XmlElement newSchema = wsdl.ImportNode ( schemaDocument.DocumentElement, true ) as XmlElement;

XmlNodeList newImports = newSchema.SelectNodes ( "/xsd:import", manager );
foreach ( XmlNode importNode in newImports )
{
    if ( level == 0 )
    {
        newSchema.RemoveChild ( importNode );
    }
    else
    {
        if ( importNode.Attributes["schemaLocation"] != null )
        {
            importNode.Attributes.RemoveNamedItem ( "schemaLocation" );
        }
    }
}
schemas.Add ( importNamespace, newSchema );

The ImportNode() method handles duplicating the element from the schema into our WSDL document. We also remove any import elements from the duplicated element (or, at deeper levels, just their schemaLocation attributes); we do not want any XSD to import anything.

Of course we also do not want any schemas to go missing. So while we remove schema references from the document we are creating, we still want to follow them in the original documents. After this processing is done we go through the original XSD (from which we created the duplicate) for these import statements, and call ProcessSchema() recursively to import any further XML namespaces.
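
Stripped of the details shown above, the recursive shape of ProcessSchema() is roughly the following (a sketch of the idea, not the tool's exact code):

private void ProcessSchema ( XmlElement rootElement, int level )
{
    XmlNodeList imports = rootElement.SelectNodes ( "xsd:import", manager );
    foreach ( XmlNode import in imports )
    {
        string importNamespace = import.Attributes["namespace"].Value;
        string importLocation = import.Attributes["schemaLocation"].Value;

        // Skip namespaces that have already been merged in.
        if ( schemas.ContainsKey ( importNamespace ) )
            continue;

        // Load the referenced XSD, duplicate it into the WSDL document
        // and strip its import statements (see the code above) ...
        XmlDocument schemaDocument = new XmlDocument ();
        schemaDocument.Load ( importLocation );

        // ... then follow the imports of the original schema recursively.
        ProcessSchema ( schemaDocument.DocumentElement, level + 1 );
    }
}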

When this process is complete, all namespaces (i.e. XSD files) referenced by any of the directly referenced schemas, or anywhere further down recursively, will be included one by one in the body of the WSDL document. This flattens the entire XSD structure (previously it formed a tree of files). The schema references are still in place, and because all schemas are now in the body, the WSDL no longer has any external dependencies.

In the end, the whole process is just navigating and modifying XML documents: looking up references, loading them, and attaching duplicated elements and nodes to the master document. This master document becomes our merged WSDL document, which we simply write to disk at the end.
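
Writing out the result is then a single call (assuming an outputFile variable holding whatever target path the user asked for):

wsdl.Save ( outputFile );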

SQL Server User Instances and the Database 'abc' already exists mystery   (Tips & Tricks)   
I was trying to find a solution to a problem with LINQ and its CreateDatabase() call (see details in another blog post). Then I bumped into the following problem.

When I tried attaching a database to a SQL Server Express User Instance that had existed before (the same database under the same path), I received an exception. I had actually deleted the .MDF and .LDF files from disk manually, and then tried to recreate them programmatically. The exception informed me that the database file already existed, even though I had deleted it myself.

The exception said: Database 'path_to_database' already exists. Choose a different database name. Which was very strange.

Reading more about SQL Server Express User Instances, I found out that they are more complicated than I thought. What actually happens is that SQL Server Express copies the master and msdb databases into the directory of the user who runs the user instance. It then starts the user instance (sqlservr.exe) under that user account. When you want to use a database in a user instance, the database file is attached to the user instance - just as you would attach a database to a regular SQL Server instance. It just happens behind the scenes.
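
For reference, the kind of connection string that triggers this behaviour looks something like the following (the data source and file name here are only placeholders):

Data Source=.\SQLEXPRESS;AttachDbFilename=|DataDirectory|\MyDatabase.mdf;Integrated Security=True;User Instance=True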

This of course creates trouble, because the database is registered somewhere and you shouldn't just delete the files. Which is what I did. And that is why I got the error message. :)

There is some information about a tool that you can use to connect to a user instance and execute commands (detach maybe?). I found it too complicated. (But if you are interested, visit this MSDN page for more information!)

Instead, I killed my user instance process (Task Manager, kill sqlservr.exe). Then I proceeded to delete my user instance directory. On Windows 7, this would be:

c:\Users\username\AppData\Local\Microsoft\Microsoft SQL Server Data\SQLEXPRESS\

I then restarted my application (which called CreateDatabase()), and it worked. It recreated the above directory, and everything was fine.

I have to admit that this seems like a bit of a drastic solution, but it sure is faster than connecting to the instance and issuing SQL commands by hand :)

And it should not have any side effects, because all user instance databases will just get attached automatically the next time you use them.

Top 25 Most Dangerous Programming Errors   (Tips & Tricks)   
Many have probably heard about cross-site scripting, SQL injection and other common pitfalls when developing applications. There is a nice list of the top 25 most dangerous errors one can "create" while coding.

The list was compiled by many organizations and companies, and contains not just the errors themselves but also prevention and mitigation guidelines. An interesting read for anyone concerned about the security of their code.

2010 CWE/SANS Top 25 Most Dangerous Programming Errors

Splitting CSV data with Regex   (Tips & Tricks)   
The other day I needed to process CSV data from a .NET program. The basic processing is quite simple: you can just call string.Split() to get the job done. But I was faced with a CSV file that contained items in quotes, and of course a comma is allowed inside the quotes, rendering string.Split() quite useless.

I asked around and found that people were using a library that can be obtained from CodeProject. You can find the actual project here. It provides quite robust CSV processing capabilities. Unfortunately, I was in a hurry, and did not want to play around with a library (I know: bad me). I needed a regular expression that could split my string.

As it turned out, there were some articles to be found around the internet, but none would offer a perfect solution. I finally stitched together a regex from various sources and Google cache entries.

Regex rex = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");

This worked for my case. It might not be general enough, but if you have CSVs with quotes and need a fast solution, you can just take the above regex and use it to split:

string[] result = rex.Split(csvLine);
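
For example, a quoted field with an embedded comma stays in one piece (note that the surrounding quotes are kept, so you may still want to trim them afterwards):

string line = "1,\"Smith, John\",Helsinki";
string[] fields = rex.Split(line);
// fields now contains: 1 | "Smith, John" | Helsinki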


Shortcut keys in Windows 7   (Tips & Tricks)   
Probably everyone knows a few shortcut keys in Windows and various applications, Ctrl-C and Ctrl-V being among the most popular, I guess. But there are really a lot of very useful shortcut keys introduced in Windows 7.

I am a keyboard freak myself: I like doing as many things as possible from the keyboard. In fact, I try to avoid the mouse as much as possible. Because I code and write, my hands are usually on the keyboard, so moving to the mouse takes away precious typing time :)

The following applies to Windows 7 only.

Did you know you can easily maximize (Win+Up) and restore or minimize (Win+Down; it restores first if the window is maximized) windows, or even move a maximized window from one monitor to the other (Win+Shift+Left or Win+Shift+Right, depending on the direction you are moving)? I also like the fill-left and fill-right shortcuts (Win+Left, Win+Right). Arranging windows has never been quite this easy.

It takes a moment to learn a few shortcuts, but I find that it speeds up everyday operations so much.

You will find a list of shortcuts here:

New Keyboard shortcut keys (hotkeys) in Windows 7

Or if you really feel like learning, here is a more complete list:

Windows 7 keyboard shortcuts

Update 27.10.2009 - Another page with useful information on shortcuts for Windows 7: The Master List of new Windows 7 shortcuts

Debugger breakpoints from code   (Tips & Tricks)   
Did you ever need to break into the debugger from code? That is, without adding a breakpoint in the Visual Studio IDE?

Many times this is a problem when a third-party application loads my .DLL assembly and starts executing it. I want to place a breakpoint, but before the .DLL is loaded into the third-party host the breakpoint would not be active, and after the component is activated it might be too late to place the breakpoint in the IDE.

Solution? Use the following code:

System.Diagnostics.Debugger.Break ();

And like magic, the debugger will stop here just as if there were a breakpoint added.

If you are running this in a Windows application, the application error popup will also appear, allowing you to attach a debugger to the process. You can then start stepping right from this code line forward.
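
One small addition: if you want to be sure the call never ends up in a release build, you can wrap it in a conditional compilation block, for example:

#if DEBUG
            System.Diagnostics.Debugger.Break ();
#endif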