Critique #2 "The database is a detail"
Uncle Bob says: "The database is a detail".
So that’s the title I used for this article.
I learned a lot from Uncle Bob through his Clean Architecture.
But I must admit, the “Database” part always bugged me.
Clean Architecture solved several problems I saw in practice in predominantly ORM-centric applications. I remember working on an application many years ago; it had ORM “entities” everywhere. There was a breaking change in the ORM at that time regarding how it handled entity inheritance and perhaps some other breaking changes. Upgrading would have caused cascading changes across the entire codebase, so the whole upgrade had to be postponed for months.
Wouldn’t it be nice to isolate the ORM code somewhere? Wouldn’t it be nice to keep the database at arm’s length, so that our domain model is separate from it? Wouldn’t it be nice to be able to unit test our whole business logic without caring about the database? Wouldn’t it be cool if we could later swap out the database?
So I came across Uncle Bob’s Clean Architecture, which answered many problems I saw. At that time, it seemed like a really “pure” approach. The selling points were that you could get Testability, Maintainability, and Portability, an all-in-one win-win.
This article is part of the A Critique of Clean Architecture and TDD series. If you’re a technical leader or software architect who has already tried Clean Architecture and seen the problems that came along with it, or if you’re looking to implement Clean Architecture, I’m writing this for you. You’ve already spent countless hours, days, and months searching for solutions to questions, and you still don’t have the “answers”; you just see polarized answers and people arguing with each other. It’s a jungle out there - scrolling through Stack Overflow and Reddit. Well, that’s what I did - for several months - followed by implementing all this in practice over the years and facing stumbling blocks. I wish I could have read a book back then that reconciled the “contradictions” and provided me with guidance. It didn’t exist and still doesn’t exist. I learned a lot from Uncle Bob in theory, but then I went beyond that in practice. And this is why we’re here, welcome. Let’s go on this journey, one step at a time.
The goal of software architecture?
In the Clean Architecture book, Uncle Bob states that Maintainability is the key goal of a good architecture, that we want to minimize the development effort to build and maintain the system:
The goal of software architecture is to minimize the human resources required to build and maintain the required system.
The Database is a Detail
Uncle Bob has a strong stance when it comes to databases. Indeed, in the Clean Architecture book (which I quote below) there’s a whole chapter called:
The Database is a Detail
So why is the database just a detail?
From an architectural point of view, the database is a non-entity - it is a detail that does not rise to the level of an architectural element.
It’s just a mechanism we use to move the data back and forth between the surface of the disk and RAM. The database is really nothing more than a big bucket of bits where we store our data on a long-term basis.
He then goes on to discuss the different systems for data storage & access:
Disk storage: File systems and relational DBMSs each have their own scheme for indexing and arranging data, which we later bring into RAM
RAM storage: We can organize data in data structures
Separation of Business Rules and the Database
Clean Architecture is explicit about where the business rules are implemented: only within Use Case Interactors and Entities, NOT inside the database.
Use Case Interactors encapsulate application-specific business rules
Entities encapsulate enterprise-wide business rules
Database Gateway Implementations use SQL, ORMs, or any other mechanism to access data to implement the methods specified by gateway interfaces
Uncle Bob describes Database Gateways:
Between the use case interactors and the database are the database gateways. These gateways are polymorphic interfaces that contain methods for every create, read, update, or delete operation that can be performed by the application on the database.
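As a rough illustration of the three roles described above, here is a minimal sketch. The `Account` entity, the `AccountGateway` interface, and the `WithdrawFunds` interactor are hypothetical names invented for this example; they are not taken from the Clean Architecture book, and the "no overdraft" rule is simply an assumed business rule.

```java
import java.util.Optional;

// Entity: holds an enterprise-wide business rule (hypothetical example domain).
final class Account {
    private final String id;
    private long balanceInCents;

    Account(String id, long balanceInCents) {
        this.id = id;
        this.balanceInCents = balanceInCents;
    }

    String id() { return id; }
    long balanceInCents() { return balanceInCents; }

    // Assumed business rule: an account can never be overdrawn.
    void withdraw(long amountInCents) {
        if (amountInCents <= 0) {
            throw new IllegalArgumentException("Amount must be positive");
        }
        if (amountInCents > balanceInCents) {
            throw new IllegalStateException("Insufficient funds");
        }
        balanceInCents -= amountInCents;
    }
}

// Database gateway: one method per create, read, update, and delete operation.
// Nothing here mentions SQL, an ORM, or a file format.
interface AccountGateway {
    void create(Account account);
    Optional<Account> read(String id);
    void update(Account account);
    void delete(String id);
}

// Use case interactor: application-specific logic that coordinates
// the entity and the gateway.
final class WithdrawFunds {
    private final AccountGateway gateway;

    WithdrawFunds(AccountGateway gateway) {
        this.gateway = gateway;
    }

    // Returns the remaining balance after the withdrawal.
    long execute(String accountId, long amountInCents) {
        Account account = gateway.read(accountId)
                .orElseThrow(() -> new IllegalArgumentException("Unknown account: " + accountId));
        account.withdraw(amountInCents);
        gateway.update(account);
        return account.balanceInCents();
    }
}
```

Note that the interactor only depends on the `AccountGateway` interface, so any class implementing those four methods can be passed to its constructor.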
But how about database performance? Well, there is one paragraph under the section “But What about Performance”, and it still tells us that we can completely separate data storage concerns from the business rules:
Isn’t performance an architectural concern? Of course it is - but when it comes to data storage, it’s a concern that can be entirely encapsulated and separated from the business rules. Yes, we need to get the data in and out of the data store quickly, but that’s a low-level concern.
Perhaps more concretely, Uncle Bob’s article at https://blog.cleancoder.com/uncle-bob/2012/08/13/the-clean-architecture.html lists these properties:
Independent of Database. You can swap out Oracle or SQL Server, for Mongo, BigTable, CouchDB, or something else. Your business rules are not bound to the database.
Testable. The business rules can be tested without the UI, Database, Web Server, or any other external element.
Referring to the Use Cases layer, Uncle Bob writes:
We also do not expect this layer to be affected by changes to externalities such as the database
The takeaway is that the database is external: we’re independent of it, we should be able to swap it out, the business rules are not bound to it, and the use case layer should not be affected by changes to it.
Let’s recap the perspective on databases
Clean Architecture produces systems that are independent of databases. More specifically, the business rules are not bound to the database.
Testability: We can unit test business rules because the business rules are implemented in use cases and entities, not in the database
Maintainability: We can reduce our maintenance costs because the business rules are testable and our application is modular (separation of use cases from I/O concerns)
Portability: We can swap out one database for another, e.g., switching from MySQL to SQL Server to MongoDB, etc.
Note: The diagram above is based on Uncle Bob’s article https://blog.cleancoder.com/uncle-bob/2012/08/13/the-clean-architecture.html except here, I’m zooming into just Use Cases, Entities, and Gateways.
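To make the testability point from the recap more concrete, the sketch below checks a business rule with no database involved. All names (`Order`, `OrderGateway`, `PlaceOrder`, `InMemoryOrderGateway`) and the credit-limit rule are assumptions made up for this illustration. Plain Java checks are used instead of a test framework only to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical entity: a customer order.
final class Order {
    private final long totalInCents;

    Order(long totalInCents) { this.totalInCents = totalInCents; }

    long totalInCents() { return totalInCents; }
}

// Gateway interface: the only thing the use case knows about storage.
interface OrderGateway {
    void save(Order order);
    List<Order> findAll();
}

// Use case with an assumed application rule:
// the sum of all orders may not exceed a fixed credit limit.
final class PlaceOrder {
    static final long CREDIT_LIMIT_IN_CENTS = 100_000;
    private final OrderGateway gateway;

    PlaceOrder(OrderGateway gateway) { this.gateway = gateway; }

    boolean execute(Order order) {
        long alreadyOrdered = 0;
        for (Order existing : gateway.findAll()) {
            alreadyOrdered += existing.totalInCents();
        }
        if (alreadyOrdered + order.totalInCents() > CREDIT_LIMIT_IN_CENTS) {
            return false; // rule violated, nothing is stored
        }
        gateway.save(order);
        return true;
    }
}

// Test double: a list in RAM stands in for the database.
final class InMemoryOrderGateway implements OrderGateway {
    private final List<Order> orders = new ArrayList<>();

    @Override public void save(Order order) { orders.add(order); }
    @Override public List<Order> findAll() { return new ArrayList<>(orders); }
}

final class PlaceOrderCheck {
    public static void main(String[] args) {
        OrderGateway gateway = new InMemoryOrderGateway();
        PlaceOrder placeOrder = new PlaceOrder(gateway);

        check(placeOrder.execute(new Order(60_000)), "first order fits the limit");
        check(!placeOrder.execute(new Order(50_000)), "second order exceeds the limit");
        check(gateway.findAll().size() == 1, "rejected order was not stored");
        System.out.println("All checks passed");
    }

    private static void check(boolean condition, String description) {
        if (!condition) throw new AssertionError(description);
    }
}
```

The rule is exercised entirely in memory, which is what allows this kind of test to run without any database setup.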
Implementing Use Cases and Database Gateways
What does this look like in practice? Let’s now jump into Java code samples:
How do we implement business rules?
Implementing application logic in Use Cases
Implementing domain logic in Entities
How do we implement Database Gateways?
Fake Repository is an in-memory implementation of the repository which simulates the real database
File Repository is a file-based implementation of the repository, whereby we’ll be storing our database inside a file
ORM Repository is an implementation of the repository whereby we’ll be using the Hibernate ORM (though similar principles apply to Entity Framework in .NET, and any other ORM)
Aside from the three options shown above, we also could have implemented the repository using SQL, MongoDB, some other NoSQL database or anything else. If you’d find those additional implementations helpful, please let me know in the comments below.
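As a preview of the first two options, here is a minimal sketch of a fake repository and a file repository behind one interface. The `Customer` entity, the `CustomerRepository` interface, and the choice of a properties file as the storage format are all assumptions made for this example. An ORM-backed version would follow the same shape but is left out here because it needs Hibernate on the classpath.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Properties;

// Hypothetical entity used only for this sketch.
final class Customer {
    private final String id;
    private final String name;

    Customer(String id, String name) { this.id = id; this.name = name; }

    String id() { return id; }
    String name() { return name; }
}

// The gateway interface that both implementations satisfy.
interface CustomerRepository {
    void save(Customer customer);
    Optional<Customer> findById(String id);
}

// Fake repository: a map in RAM simulates the database.
final class FakeCustomerRepository implements CustomerRepository {
    private final Map<String, Customer> customers = new HashMap<>();

    @Override public void save(Customer customer) { customers.put(customer.id(), customer); }

    @Override public Optional<Customer> findById(String id) {
        return Optional.ofNullable(customers.get(id));
    }
}

// File repository: the whole "database" is one properties file (id=name entries).
final class FileCustomerRepository implements CustomerRepository {
    private final Path file;

    FileCustomerRepository(Path file) { this.file = file; }

    @Override public void save(Customer customer) {
        Properties all = load();
        all.setProperty(customer.id(), customer.name());
        try (OutputStream out = Files.newOutputStream(file)) {
            all.store(out, "customers");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override public Optional<Customer> findById(String id) {
        String name = load().getProperty(id);
        return name == null ? Optional.empty() : Optional.of(new Customer(id, name));
    }

    // Reads the entire file on every call; simple, but not meant to be fast.
    private Properties load() {
        Properties all = new Properties();
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                all.load(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return all;
    }
}
```

Code that depends only on `CustomerRepository` can be handed either implementation, which is the kind of swap the approach aims for.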
This article will illustrate the database as “just” an I/O device. We’ll show in practice how to achieve 100% DB independence:
Keeping business rules completely separate from I/O concerns
All business logic is on the application side, zero on the database side
We’re free to swap between Database Gateway implementations
But is this approach universally optimal? It looks great, but are there any dangers behind it? Initially, I planned to cover that topic in this article, but since I’ve already reached 3,000+ words, I’ll move that analysis and criticism to next time, where I’ll showcase the pitfalls of the approach.
Let’s jump into the Java code samples!