Ruthlessly Helpful

Stephen Ritchie's offerings of ruthlessly helpful software engineering practices.

Category Archives: Development

When To Use Database-First

Code-centric development using an object-relational mapping (ORM) tool has a workflow that many developers find comfortable. They feel productive using the ORM in this way, as opposed to starting with the database model. There are a number of good posts out there on the Entity Framework 4.1 code-first capabilities: MSDN, MSDN Magazine, Scott Guthrie, Channel 9, and Scott Hanselman’s Magic Unicorn Feature. It makes sense to the object-oriented developer and writing code-first comes very naturally.

This prompts the question: When would it be better to take a database-first approach?

For many legacy and Brownfield projects the answer is obvious. You have a database that’s already designed — you may even be stuck with it — therefore you choose database-first approach. This is the defining need for database-first because the database is a fixed point. And so, use database-first when the database design comes from an external requirement or is controlled outside the scope or influence of the project. Similarly, modelling the persistence layer using a model-first approach fits the bill because what you learn about the requirements is expressed in data-centric terms.

Let’s say the project is Greenfield and you have 100% control over the database design. Would a database-first approach ever make sense in that situation?

On-line Transaction Processing (OLTP) and On-line Analytical Processing (OLAP) systems are considered two ends of the the data persistence spectrum. With databases that support OLTP systems the objective is to effectively and properly support the CRUD+Q operations in support of business operations. In the databases that support OLAP systems the objective is to effectively and properly support business intelligence, such as data mining, high-speed data analytics, decision support systems, and other data warehousing goals. These are two extremely different database designs. Many systems’ databases live on a continuum between these two extremes.

I once worked on a student loan originations system. It was a start-with-a-clean-slate, object-oriented development project. Initially, the system was all about entering, reviewing and approving loan applications. We talked about borrowers, students and parents, and their multiple addresses. There was a lot about loan limits and interest rates, check disbursements, and a myriad of complicated and subtle rules and regulations related to creating a loan and making a check disbursement. The system was recording the key records and financial transactions and the database was the master repository of this data. In fulfilling these requirements, the system was a success. However, once the system was readied for parallel Beta-testing at the bank things started to go sideways.

Here is some of what we missed by taking a code-first approach:

  1. Every day the bank manager must run a reconciliation report, which joins in a lot of financial data from a lot of the day’s records, no one can go home until the report shows that everything is balanced. The bank manager screamed when the report took over two hours.
  2. At the end of every quarter, there is an even bigger report that looks at more tables and financial transactions and reconciles everything to the penny. More screaming when this report ran for days and never properly reconciled — the query could never quite duplicate the financial engine’s logic to apply transactions.
  3. And lastly, every loan disbursement that goes out requires a form letter, defined by the Dept. of Education, be sent to every student that has a loan originated by the bank. Imagine the tens of thousands of form letters going out on the day they send the money to UCLA. The project almost died when just one form letter to one student took 30 minutes!
  4. The data migration from the legacy system to the new system was taking nearly a week to completely finish. The bank wasn’t going to stop operations for a week.

What we failed to realize was that the really significant, make-or-break requirements of the system were all reporting or data conversion related. None of it had been seriously brought up or laid out during system development, however, not meeting those requirements took the project very close to the edge of extinction.

A major lesson learned, look very closely at the question of data persistence and retrieval. Work hard to uncover and define the reporting, conversion and other data requirements. Make any hidden or implicit database requirements explicit. Find out if the system is really all about the care and feeding of a relational database.

Adding it all up: if the database-specific requirements significantly overshadow the application-specific requirements then a database-first approach is probably your best bet.

Agile Requires Agility

For a long time there has been a widely held belief, early in the collective unconscious and later described in various methodologies: Effective software development requires key elements, like clear deliverable-objectives, a shared understanding of the requirements, realistic estimates of effort, a proper high-level design, and, the most important element of all, effective people.

What happens when the project context doesn’t meet the minimum, essential conditions for a process like Agile development? The project has many missing, ambiguous, or conflicting objectives, and those objectives are nearly always described as activities, not deliverables. There are major differences between what the project leaders, analysts, developers and testers, each think the system is required to do. Every estimate is either prohibitively pessimistic or ultimately turns out to be overly optimistic. The software architecture collapses because it’s not able to carry the system from one sprint to the next, or it’s overly complicated. The project’s people are not sure what to do, how to do it, or aren’t motivated to do it.

In the field of agricultural development, there is the concept of appropriate technology; they say Gandhi fathered this idea. Agricultural development is more successful when straightforward and familiar technologies are used to move from the current status quo to a new, improved status quo. For example, before the tractor would be used effectively farmers should first get comfortable using a team of oxen to plow the fields.

Some ideas to move the team’s status quo from the current state of readiness to the perquisite level:

  • Rephrase project objectives from activities to deliverables. For example, “write the requirements document for feature Xyz” becomes “Requirement Xyz; verified complete, clear and consistent by QA.”
  • Refocus the team away from providing initial estimates, which are often just guesses anyway, toward a timebox and working the prioritized list of deliverables. Use each timebox’s results as the future predictor.
  • Listen carefully and ask probing questions to ensure everyone’s on the same page with respect to what the system’s supposed to do; keep coming back to a topic if there are significant differences.
  • Find ways to continuously validate and verify that the architecture is up to the task; not under-engineered or over-engineered.
  • Look for the tell-tale signs of knowledge-, skill-, or attitude-gaps. Team members tentative about what they’re supposed to be do. wanting more training, time  to experiment or feeling under prepared, or a general concern that project is not on the right track and won’t go better.

A catch-phrase for software development; Agile requires agility. Keep an eye on the appropriateness by monitoring the team’s level of agility and positively influence a transition to the next plateau.