Ruthlessly Helpful

Stephen Ritchie's offerings of ruthlessly helpful .NET practices.

When To Use Database-First

Code-centric development using an object-relational mapping (ORM) tool has a workflow that many developers find comfortable. They feel productive using the ORM in this way, as opposed to starting with the database model. There are a number of good posts out there on the Entity Framework 4.1 code-first capabilities: MSDN, MSDN Magazine, Scott Guthrie, Channel 9, and Scott Hanselman’s Magic Unicorn Feature. It makes sense to the object-oriented developer and writing code-first comes very naturally.

This prompts the question: When would it be better to take a database-first approach?

For many legacy and Brownfield projects the answer is obvious. You have a database that’s already designed (you may even be stuck with it), so you choose a database-first approach. This is the defining need for database-first because the database is a fixed point. And so, use database-first when the database design comes from an external requirement or is controlled outside the scope or influence of the project. Similarly, modelling the persistence layer using a model-first approach fits the bill because what you learn about the requirements is expressed in data-centric terms.

Let’s say the project is Greenfield and you have 100% control over the database design. Would a database-first approach ever make sense in that situation?

On-line Transaction Processing (OLTP) and On-line Analytical Processing (OLAP) systems are considered two ends of the data persistence spectrum. With databases that support OLTP systems the objective is to effectively and properly support the CRUD+Q operations in support of business operations. In the databases that support OLAP systems the objective is to effectively and properly support business intelligence, such as data mining, high-speed data analytics, decision support systems, and other data warehousing goals. These are two extremely different database designs. Many systems’ databases live on a continuum between these two extremes.

I once worked on a student loan originations system. It was a start-with-a-clean-slate, object-oriented development project. Initially, the system was all about entering, reviewing and approving loan applications. We talked about borrowers, students and parents, and their multiple addresses. There was a lot about loan limits and interest rates, check disbursements, and a myriad of complicated and subtle rules and regulations related to creating a loan and making a check disbursement. The system was recording the key records and financial transactions and the database was the master repository of this data. In fulfilling these requirements, the system was a success. However, once the system was readied for parallel Beta-testing at the bank things started to go sideways.

Here is some of what we missed by taking a code-first approach:

  1. Every day the bank manager must run a reconciliation report, which joins in a lot of financial data from a lot of the day’s records; no one can go home until the report shows that everything is balanced. The bank manager screamed when the report took over two hours.
  2. At the end of every quarter, there is an even bigger report that looks at more tables and financial transactions and reconciles everything to the penny. More screaming when this report ran for days and never properly reconciled — the query could never quite duplicate the financial engine’s logic to apply transactions.
  3. Every loan disbursement that goes out requires that a form letter, defined by the Dept. of Education, be sent to every student who has a loan originated by the bank. Imagine the tens of thousands of form letters going out on the day they send the money to UCLA. The project almost died when just one form letter to one student took 30 minutes!
  4. And lastly, the data migration from the legacy system to the new system was taking nearly a week to completely finish. The bank wasn’t going to stop operations for a week.

What we failed to realize was that the really significant, make-or-break requirements of the system were all reporting or data conversion related. None of it had been seriously brought up or laid out during system development; however, not meeting those requirements took the project very close to the edge of extinction.

A major lesson learned: look very closely at the question of data persistence and retrieval. Work hard to uncover and define the reporting, conversion and other data requirements. Make any hidden or implicit database requirements explicit. Find out if the system is really all about the care and feeding of a relational database.

Adding it all up: if the database-specific requirements significantly overshadow the application-specific requirements then a database-first approach is probably your best bet.


Agile Requires Agility

For a long time there has been a widely held belief, early in the collective unconscious and later described in various methodologies: Effective software development requires key elements, like clear deliverable-objectives, a shared understanding of the requirements, realistic estimates of effort, a proper high-level design, and, the most important element of all, effective people.

What happens when the project context doesn’t meet the minimum, essential conditions for a process like Agile development? The project has many missing, ambiguous, or conflicting objectives, and those objectives are nearly always described as activities, not deliverables. There are major differences between what the project leaders, analysts, developers and testers each think the system is required to do. Every estimate is either prohibitively pessimistic or ultimately turns out to be overly optimistic. The software architecture collapses because it’s not able to carry the system from one sprint to the next, or it’s overly complicated. The project’s people are not sure what to do, how to do it, or aren’t motivated to do it.

In the field of agricultural development, there is the concept of appropriate technology; they say Gandhi fathered this idea. Agricultural development is more successful when straightforward and familiar technologies are used to move from the current status quo to a new, improved status quo. For example, before the tractor can be used effectively, farmers should first get comfortable using a team of oxen to plow the fields.

Some ideas to move the team’s status quo from the current state of readiness to the prerequisite level:

  • Rephrase project objectives from activities to deliverables. For example, “write the requirements document for feature Xyz” becomes “Requirement Xyz; verified complete, clear and consistent by QA.”
  • Refocus the team away from providing initial estimates, which are often just guesses anyway, toward a timebox and working the prioritized list of deliverables. Use each timebox’s results as the future predictor.
  • Listen carefully and ask probing questions to ensure everyone’s on the same page with respect to what the system’s supposed to do; keep coming back to a topic if there are significant differences.
  • Find ways to continuously validate and verify that the architecture is up to the task; not under-engineered or over-engineered.
  • Look for the tell-tale signs of knowledge-, skill-, or attitude-gaps: team members who are tentative about what they’re supposed to do, who want more training or time to experiment, who feel underprepared, or who share a general concern that the project is not on the right track and won’t get better.

A catch-phrase for software development: Agile requires agility. Keep an eye on the appropriateness of the process by monitoring the team’s level of agility, and positively influence a transition to the next plateau.

HTML5 Shims, Fallbacks and Polyfills

There is a lot to know about HTML5 shims, fallbacks and polyfills. Let’s start by trying to define the terms and point to some places on the web to get more information.

The whole idea is to provide a way to develop pages in HTML5 and have everything work properly in a browser that doesn’t natively support HTML5 functionality. For example, this approach can enable the new HTML5 features within IE7.

shim /SHim/
Noun: A relatively small library (js, plugin, etc.) that gets in between the HTML5 and the incompatible browser and patches things — transparently — so the page works properly or as close as practicable. Sometimes referred to as a “shiv”. More info: The Story of the HTML5 Shiv

fallback /ˈfôlˌbak/
Noun: A backup plan when your page detects that it’s being displayed in an incompatible browser. More info: Yet another HTML5 fallback strategy for IE

polyfill /ˈpälē fil/
Noun: A patch or shim that is suitable as a fallback for *a whole lot of* missing functionality.
More info: Modernizr and What is a Polyfill? and HTML5 Cross Browser Polyfills

I’m going to roll up my sleeves now and start with this ScottGu post on HTML5 and ASP.NET MVC 3. It looks like it has some good bits on the modernizr.js JavaScript library.

Don’t Comment Out Failing Unit Tests

While working with a rather large Brownfield codebase I came upon a set of commented out unit tests. I uncommented one of these unit tests and ran it.

Sanitized illustrative code sample:

[Test]
public void ProductionSettings_BaseBusinessServicesUrl_ReturnsExpectedString()
{
    // Arrange
    var settings = new ProductionSettings();

    // Act
    var actual = settings.BaseBusinessServicesUrl;

    // Assert
    Assert.AreEqual("http://localhost:4321/Business.Services", actual);
}

As expected, the test failed. Here’s some pseudo-output from that test:

SettingsTests.ProductionSettings_BaseBusinessServicesUrl_ReturnsExpectedString : Failed

NUnit.Sdk.EqualException: Assert.AreEqual() Failure
Position: First difference is at position 17
Expected: http://localhost:4321/Business.Services
Actual:   http://localhost:1234/Business.Services

at Tests.Unit.Acme.Web.Mvc.Settings.SettingsTests
in TestSettings.cs: line 19

So now what should I do? That output prompts the question: what’s the correct port? Is it 1234, 4321 or is it some other port number? To sort this all out I’ll need to take on the responsibility of researching the right answer.

Almost certainly, Mr. or Ms. Comment-out-er gave me this chore because he/she did not have the time to sort it all out themselves. Also, likely, the person who changed the port number didn’t know there was a unit test failing or even that there was a unit test at all. I don’t need to know who did or didn’t deal with this; I’ll leave that to the archeologists.

The larger point is this: if you’re commenting out a failing unit test then you’re missing the point of a unit test. A unit test verifies that the code-under-test is working as intended. A failing test means you need to do something — other than commenting out the test.

If a unit test fails then there are four basic options:

  1. The code under test is working as intended; fix the unit test.
  2. The code under test is NOT working as intended; fix the code.
  3. The code under test has changed in some fundamental way that means the unit test is no longer valid; remove the unit test code.
  4. Set the unit test to be ignored (it shouldn’t fall through the cracks now), report it by writing up a “fix failing unit test” defect in the bug tracking system, and assign it to the proper person.

Commenting out a unit test means that you’re allowing something important to fall through the cracks. The big no-no here is the commenting out. At the very least, pick option 4.
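As a minimal sketch of option 4, assuming the tests are written with NUnit 2.x or 3.x (the pseudo-output above hints at NUnit, but that is an assumption), the failing test can be marked with the framework’s Ignore attribute instead of being commented out. The ProductionSettings stub and the DEFECT-ID placeholder below are hypothetical stand-ins; substitute the real settings class and the identifier from your own bug tracking system.

```csharp
using NUnit.Framework;

// Hypothetical stub standing in for the real (sanitized) settings class.
public class ProductionSettings
{
    public string BaseBusinessServicesUrl
    {
        get { return "http://localhost:1234/Business.Services"; }
    }
}

[TestFixture]
public class SettingsTests
{
    // The test runner reports this test as ignored, along with the reason
    // text, so the open question stays visible on every test run.
    // DEFECT-ID is a placeholder for the real defect identifier.
    [Test]
    [Ignore("Expected port unconfirmed (1234 or 4321?); see defect DEFECT-ID.")]
    public void ProductionSettings_BaseBusinessServicesUrl_ReturnsExpectedString()
    {
        // Arrange
        var settings = new ProductionSettings();

        // Act
        var actual = settings.BaseBusinessServicesUrl;

        // Assert
        Assert.AreEqual("http://localhost:4321/Business.Services", actual);
    }
}
```

Unlike a commented-out test, an ignored test still compiles against the code under test, so a renamed or removed property breaks the build rather than going unnoticed.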

The Virtues of Blogging

I recently attended an INETA Community Leadership Summit meeting where Scott Hanselman made a point of highlighting the relative power and importance of blogging.

The main point he made was that each of us frequently communicates to and within a limited group. Those same email threads or conversations would benefit the Microsoft community, both IT pro and development, more if that same dialog made it into the blog-o-sphere. A blog post is one of the most effective ways any individual software professional can help the community. Especially, if the post provides *constructive* criticism — tact is important — then the post can positively influence change; many people read and take notice of blog postings.

Also, not all communication forms are equal. Many are ephemeral with limited distribution, but a blog is more permanent and far-reaching.

Consider the time it takes to write an email. Let’s say your email raises an issue, points out an inconsistency, explains how to overcome a technical obstacle, or describes an effective way to perform common tasks. Instead of putting that info in an email that reaches dozens of people, try writing it up in a blog post, potentially reaching hundreds or thousands of people.

The post may never be read. The blog site may never be visited; however, if your email has a link to your post then the content still reaches the same dozen people with the same number of keystrokes. In the end, it’s not about followers; it’s about adding your voice to the community, without regard to how many people will read it but in a fervent belief that at least one reader, beyond your current circle of influence, will read, understand and appreciate what you have to say. Write it globally; socialize it locally.

Scott delivered a good sermon that I thought I’d share. Let’s see if it has any impact on me. Perchance this is the type of rational argument that motivates me to blog.

I guess you’ll know that I’ve taken the prescription when I create a blog and title my first post: “The Virtues of Blogging”.