Let us revise the C4-model for software architecture diagrams
The C4-model is gaining traction as an industry standard for software architecture diagrams. Although the C4-model is a great starting point, it can and should be improved if it is to gain wider adoption. In this article I propose the following:
- The four key abstractions should be changed from:
context, container, component and code
to:
system, component, sub-component and deployment.
- We need cleaner and less cluttered diagrams with consistent and fixed notation used to depict the four abstractions to achieve instant comprehension.
It is my hope that refinements of the model such as those presented here can help the industry reach consensus and standardisation in the area of software architecture diagrams.
The C4-model for software architecture, published by its author Simon Brown more than 10 years ago, has done a great job in the field of software architecture diagramming. It has helped me on many recent projects to communicate software architecture and design more clearly.
However, I have found room for improvement and so I have evolved it further to fit my needs on various large-scale projects. In this article I will propose an improvement of the C4-model.
The structure of the article is simple: (i) state what really works well for the C4-model, then (ii) analyse the pain points and explain how a revised C4-model can fix the pain points.
What really works well in C4
C4 is simple and logical
The C4-model is extremely simple to grasp for most people. In my experience C4-diagrams are easy to read and suited for all kinds of audiences.
The whole approach is appealing: define 4 key abstractions (context, container, component, code) each being a decomposition of the previous and establish 4 types of diagrams matching the 4 key abstractions. The first time I saw it, I thought: this is exactly how software architecture diagramming should be done. It is simple and elegant, yet powerful and expressive.
C4 focuses on static views — not dynamic
C4 makes a deliberate choice in focusing solely on so-called static views. And this is a good thing.
A static view can be defined as a diagram that doesn’t refer to time. A static view describes the structure of the system. In contrast, dynamic views are used to describe how elements of the system interact over the course of time to realise use cases, which is also called behaviour.
Both static views (structure) and dynamic views (behaviour) are crucial design perspectives, and both are needed to understand the system. However, static views are an order of magnitude more important than dynamic views for the following reasons:
- Static views are a mental frame of reference within which behaviour can be understood.
- The static views used by C4 can also be used to depict important enterprise aspects such as team structure and relationship between development and operations (deployment diagrams).
- Dynamic views are easily built on top of the static views by adding timing aspects via arrows and sequence numbers.
For this reason it is a great choice to focus on getting static views standardised. Once this is done, dynamic views are easily added on top.
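As a rough illustration of this layering, the sketch below models a static view as elements plus relationships, and a dynamic view as numbered steps that may only reuse relationships already present in the static view. All class names and the example elements are hypothetical and chosen for illustration; they are not part of C4 or of any tool.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relationship:
    """A directed arrow between two named elements of the static view."""
    source: str
    target: str


@dataclass
class StaticView:
    """Structure only: which elements exist and how they are connected."""
    elements: set[str]
    relationships: set[Relationship]


@dataclass
class DynamicView:
    """Behaviour layered on a static view as an ordered list of steps."""
    static: StaticView
    steps: list[tuple[Relationship, str]] = field(default_factory=list)

    def add_step(self, source: str, target: str, label: str) -> int:
        """Append a step and return its sequence number (starting at 1)."""
        rel = Relationship(source, target)
        # A step must follow an arrow that the static view already defines.
        if rel not in self.static.relationships:
            raise ValueError(f"{source} -> {target} is not in the static view")
        self.steps.append((rel, label))
        return len(self.steps)

    def render(self) -> list[str]:
        """Return one text line per step, prefixed with its sequence number."""
        return [
            f"{n}. {rel.source} -> {rel.target}: {label}"
            for n, (rel, label) in enumerate(self.steps, start=1)
        ]


# Hypothetical example system
static = StaticView(
    elements={"Frontend", "Backend", "Database"},
    relationships={
        Relationship("Frontend", "Backend"),
        Relationship("Backend", "Database"),
    },
)
dynamic = DynamicView(static)
dynamic.add_step("Frontend", "Backend", "request account balance")
dynamic.add_step("Backend", "Database", "read balance")
rendered = dynamic.render()
```

The design choice here is that the dynamic view holds no structure of its own: it only adds ordering and labels to arrows the static view already contains.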
C4 works well with business stakeholders
In my experience diagrams made using the C4-diagram conventions work much better with business-oriented stakeholders than UML-based diagrams. For technically oriented people it is hard to accept that look and feel matters, but it does. In earlier days, when I only had UML in my tool belt as a software architect, I always ended up producing nicer-looking PowerPoint versions of the architecture diagrams because, after all, the job was to bridge the gap between IT and business.
With C4 I have a consistent approach in making diagrams that can work just as well in IT as in business.
C4 has traction
At the two companies I have worked for most recently, over the last 4 years, both leading global players in their industries, I saw C4 adopted by a noticeable number of technical people as a shared frame for software architecture work. Although I can’t find concrete market statistics, it definitely seems to have momentum and traction in the market. This makes me push for it to become a new standard to the benefit of the industry.
Pain points in C4
Having seen the positive sides of C4, let us now take a critical look at its pain points.
Before moving on we need to see the proposed model as a whole. The following legend summarises it.
This should be compared with the current default C4 notation found in draw.io:
Now let us walk through the pain points in the current model and see how the new proposed model will make things better.
The level 2 notion of container creates confusion
Using the term “container” for the level 2 abstraction simply creates too much confusion. The risk of misunderstanding is too big, and I always end up using a different word in my legends, typically component. Let us consider a system consisting of a backend and a database. Referring to the backend and database as components resonates so much better than referring to them as containers. I would argue that if you ask any software developer off the cuff what components the system consists of, they will say backend and database. If you ask what containers the system consists of, they may reply that it depends on the choice of deployment.
Hence I propose to revise the C4-model and use the word component as the level 2 notion instead of container. For level 3 we could simply use the term sub-component. This has the further advantage that it hints at the recursive structure of sub-components: they may contain other sub-components, which again may contain sub-components, and so forth.
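To make the recursive structure concrete, here is a minimal sketch of the proposed levels 1 to 3 as a data model, in which only sub-components nest inside themselves. The class names, the `outline` helper and the example element names are assumptions made for illustration, not part of any C4 tooling.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SubComponent:
    """Level 3: may contain further sub-components to any depth."""
    name: str
    children: list[SubComponent] = field(default_factory=list)

    def depth(self) -> int:
        """Number of nesting levels at and below this sub-component."""
        return 1 + max((child.depth() for child in self.children), default=0)


@dataclass
class Component:
    """Level 2: what C4 currently calls a container."""
    name: str
    sub_components: list[SubComponent] = field(default_factory=list)


@dataclass
class System:
    """Level 1: the software system."""
    name: str
    components: list[Component] = field(default_factory=list)


def outline(system: System) -> list[str]:
    """Return the containment hierarchy as indented text lines."""
    lines = [system.name]

    def walk(sub: SubComponent, indent: int) -> None:
        lines.append("  " * indent + sub.name)
        for child in sub.children:
            walk(child, indent + 1)

    for component in system.components:
        lines.append("  " + component.name)
        for sub in component.sub_components:
            walk(sub, 2)
    return lines


# Hypothetical example: a backend whose sub-components nest three deep
payments = SubComponent(
    "Payments",
    [SubComponent("Validation", [SubComponent("IBAN check")])],
)
example_system = System(
    "Example system",
    [Component("Backend", [payments]), Component("Database")],
)
example_outline = outline(example_system)
```

Note that only `SubComponent` refers to itself. Systems and components stay fixed levels, which matches the idea that the recursion lives at level 3.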
We lack deployment as a key abstraction
The C4-model does a nice job in defining the 4 core abstractions: context, container, component and code. It also explains how deployment can be depicted using ad hoc boxes such as those shown here:
Depicting deployment nodes using ad hoc boxes
However, we need something more. We need deployment to be part of the core abstractions and have its own special notation in diagrams. Furthermore, the concept of code should be dropped as one of the core abstractions. Hence we end up with 4 proposed key abstractions related to each other like this:
The 4 levels of abstraction should be System, Component, Sub-component and Deployment node.
I am not saying that code-level views are never relevant when dealing with software architecture. I am just saying that they are far less relevant than deployment views. In fact, deployment views are even more important than sub-component views and often we only care about systems, components and deployment views.
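One way to read "deployment as a key abstraction" is that deployment nodes become first-class elements that explicitly host components, so the mapping can be queried rather than only drawn. The sketch below assumes deployment nodes can nest (for example a cluster inside a cloud region). `DeploymentNode`, `placements` and all node and component names other than Corebank BE are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DeploymentNode:
    """A place where components run; nodes may contain other nodes."""
    name: str
    hosts: list[str] = field(default_factory=list)  # names of hosted components
    children: list[DeploymentNode] = field(default_factory=list)


def placements(node: DeploymentNode, path: tuple[str, ...] = ()) -> dict[str, str]:
    """Map each hosted component name to the full path of its deployment node."""
    here = path + (node.name,)
    result = {component: " / ".join(here) for component in node.hosts}
    for child in node.children:
        result.update(placements(child, here))
    return result


# Hypothetical deployment of a Corebank system
cloud = DeploymentNode(
    "Cloud region",
    children=[
        DeploymentNode("Kubernetes cluster", hosts=["Corebank BE"]),
        DeploymentNode("Managed database service", hosts=["Corebank DB"]),
    ],
)
placed = placements(cloud)

# Components that appear in the component view but have no deployment node yet
all_components = ["Corebank BE", "Corebank DB", "Corebank FE"]
undeployed = [name for name in all_components if name not in placed]
```

Keeping components and deployment nodes as separate concepts joined by a mapping mirrors the point made earlier: what the components are does not depend on the choice of deployment.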
The default notation is not good for showing containment
One of the primary purposes of the C4-model is to depict the structure of containment: systems hold containers, which hold components, and so on. Although the C4-documentation is very clear on the idea of containment, the default C4-notation is not optimal in depicting it.
Consider these two ways of depicting the composition of a Corebank system:
The main difference is that in the diagram on the right the system shape stays the same when zooming in. Keeping the same shape regardless of the level of perspective allows the reader to decode the intention more easily.
Here is another example, this time going from level 2 (container) to level 3 (component):
In the new proposed notation we may even show level 1, level 2 and level 3 in the same diagram if needed. It would look like this, this time including a legend.
Once the reader gets used to the notation and it is used consistently, it will be effortless to read and understand these kinds of diagrams. In particular the reader should notice these advantages:
- The new proposed notation uses a simpler colour scheme. Whereas the current C4-model uses 3 variations of blue, the proposed convention is in a sense monochromatic. This has the advantage that colours can be used to show other dimensions in the diagram. For instance, red could be used to show components to be phased out and green could be used to show new components.
- In the proposed model, whenever you move from level 1 to level 2 to level 3 in abstractions, the shape of the elements stays the same. There is no difference in how a system is depicted whether you zoom in to show internal components of it or you show it in a context view.
The following shows some examples of diagrams that would not work as well with the current C4 standard notation.
On the left it is demonstrated how colour codes can be used to add further dimensions to the diagram. On the right you see how sub-components can be depicted in a recursive structure to contain other sub-components.
The notation is a bit too cluttered
The idea of adding the type of a shape in square brackets, like [Software System], although consistent and accurate, tends to make diagrams cluttered. Once we have established a clear interpretation of a shape using a legend, it should not be necessary to repeat this metadata on every instance of that shape.
Compare these two variants:
Comparing diagrams with different amounts of metadata
In the proposed notation, metadata is only used where it adds value. It is true that the C4 model is not about notation but about the ideas behind the notation. Still, the default notation is built into tools like draw.io stencils, and because it appears in the public C4-material, most people tend to copy that style of diagramming.
What I propose is this:
- Metadata can and should be omitted when the information it conveys is already clear from the diagram. It should be included only when it adds value to the diagram.
- The level of abstraction (such as “Container”) should never be included as metadata because the elements of the diagram should be specific enough. In the proposed notation we still indicate SpringBoot under Corebank BE because it is an important piece of information. We don’t write “Container” as it is known from the shape of the element.
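The two rules above can be expressed as a small labelling function. This is only a sketch: `Element`, `verbose_label` and `proposed_label` are hypothetical names, and `verbose_label` is an approximation of the bracketed style discussed above, not a faithful copy of any tool's output.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """A diagram element with its abstraction level and optional technology."""
    name: str
    level: str                        # e.g. "Software System" or "Container"
    technology: Optional[str] = None  # e.g. "SpringBoot"


def verbose_label(element: Element) -> str:
    """Approximate the bracketed style: the level is repeated on every shape."""
    if element.technology:
        return f"{element.name}\n[{element.level}: {element.technology}]"
    return f"{element.name}\n[{element.level}]"


def proposed_label(element: Element) -> str:
    """Proposed style: never print the level; print technology only if known."""
    if element.technology:
        return f"{element.name}\n{element.technology}"
    return element.name


backend = Element("Corebank BE", "Container", "SpringBoot")
system = Element("Corebank", "Software System")

backend_verbose = verbose_label(backend)
backend_proposed = proposed_label(backend)
system_proposed = proposed_label(system)
```

In the proposed style the abstraction level is carried by the shape and the legend, so the label is left free for information specific to that element.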
We can now summarise the proposed revision of the C4-model as follows:
- The 1st level concept should be renamed from context to system.
- The 2nd level concept should be renamed from container to component.
- The 3rd level concept should be renamed from component to sub-component.
- The 4th level concept, code, should be dropped as a key concept, and deployment should be added instead.
- To support bullets 1–4, we should re-interpret the C4 acronym. Instead of referring to “context, container, component, code” it should be a reference to the 4 key Concepts: system, component, sub-component and deployment.
- We should agree on a more effective diagram notation allowing us to depict the 4 key concepts in a fixed and consistent manner irrespective of level of abstraction.
- We should agree to minimise the overhead of metadata to reduce clutter and increase the appeal to non-technical readers in particular.
It is my hope that the C4 model in one form or another will become an industry standard one day. It should be taught at universities and accepted in academia in the same way that ER-diagrams are a standard for data modelling. It is also my hope that the improvements proposed here can make the C4-model even more effective and contribute to its traction in the industry.