System Architecture: Quality Attributes
When designing the architecture for an application or system of interrelated applications, it is essential to identify which quality attributes of the system are most important to the users, developers, and owners. Often this is done implicitly based on the experience and preferences of the various people participating in the project. When quality attributes are selected with intention and purpose, they help guide the design of the system. At Pluralsight, the quality attributes we focus on have evolved as the company has evolved.
When we were a single team
Initially, our system values were tied up in our team values. These included:
- Favor the customer
- Quality
- Availability
- Do what’s right for the business
- Modifiability
- Maintainability
- Testability
- Capacity for growth
How did these values affect our design?
When making changes to the system at this time, we often asked ourselves whether the change would be a good experience for the customer and whether it was what we would expect of a product we had purchased. We valued a highly available system with rock-solid features over a less consistent experience with a larger number of features.
As a growing company with a growing and evolving customer base, we also needed to update and improve our product rapidly. Doing what’s right for the business meant ensuring we could make improvements quickly and safely while continuing to add users at a rapid pace. Automated testing was an essential component of our strategy. Additionally, we set up a continuous integration server, worked in the master branch with feature toggles, and built a continuous delivery pipeline that enabled rapid, automated deployments and rollbacks.
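To make the feature-toggle idea concrete, here is a minimal sketch in Python. It is an illustration only, not the code we ran: the `FeatureToggles` class, the `render_course_page` function, and the toggle name `new-course-player` are all hypothetical, and a real system would likely load toggle state from configuration rather than hold it in memory.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FeatureToggles:
    """Hypothetical in-memory toggle registry (assumption, not a real library)."""

    enabled: set[str] = field(default_factory=set)

    def is_enabled(self, name: str) -> bool:
        return name in self.enabled

    def enable(self, name: str) -> None:
        self.enabled.add(name)

    def disable(self, name: str) -> None:
        # discard() is a no-op when the toggle was never enabled.
        self.enabled.discard(name)


def render_course_page(toggles: FeatureToggles) -> str:
    # Unfinished work can be merged to master and deployed,
    # but it stays dark until the toggle is switched on.
    if toggles.is_enabled("new-course-player"):
        return "new player"
    return "classic player"
```

Because the new code path is guarded at runtime, turning the toggle off again restores the previous behavior without a redeploy, which complements automated rollbacks.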
When we first split into multiple teams
As the needs of the business grew, we grew the dev team to the point where it became challenging to coordinate work and releases. During one team offsite, we discussed the growing problem and identified additional quality attributes of the system that we had come to value, including:
- Horizontal Scalability
  - of the System
  - of the Engineering Organization
- Team Autonomy (especially around releasing features)
- Product Innovation & Modifiability
- Integration of acquired products
How did these values affect our design?
Smaller components are easier to reason about and therefore easier to change and deploy. Smaller groups of people tend to have an easier time making decisions and executing on them. These truisms led us to split up our team and our monolithic codebase into a collection of bounded contexts.
We wanted a simple and consistent way to integrate these components and those that came in through acquisitions. We determined that creating strict rules for the interstitial space between teams and components could allow us greater autonomy within the bounded contexts. By focusing on AMQP+JSON messages and HTTP+JSON APIs, we eliminated many of the limits on team decisions, including restrictions on the choice of development language. This in turn enabled greater innovation within the product.
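As a rough illustration of why a JSON contract keeps teams language-independent, the sketch below builds and reads a message body using only the Python standard library. The envelope fields (`id`, `type`, `occurredAt`, `payload`) and the event name `course.completed` are assumptions made for this example, not our actual message schema, and the AMQP transport itself is left out.

```python
import json
import uuid
from datetime import datetime, timezone


def build_event(event_type: str, payload: dict) -> bytes:
    """Serialize a business event into a JSON message body.

    The envelope shape here is hypothetical; the point is only that the
    contract between bounded contexts is plain JSON bytes.
    """
    envelope = {
        "id": str(uuid.uuid4()),
        "type": event_type,
        "occurredAt": datetime.now(timezone.utc).isoformat(),
        "payload": payload,
    }
    return json.dumps(envelope).encode("utf-8")


def read_event(body: bytes) -> dict:
    """Parse a message body produced by any publisher, in any language."""
    return json.loads(body.decode("utf-8"))
```

A consumer written in another language needs only a JSON parser and agreement on the field names; nothing about the publisher's runtime leaks across the boundary.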
During our current period of continuous growth
As our number of teams and bounded contexts has grown, we continue to learn and evolve our understanding of the requirements of our system within our business. Currently, we emphasize the following quality attributes:
- Data driven decision making
- Team Autonomy (especially around releasing features)
- Horizontal Scalability
  - of the System
  - of the Engineering Organization
- Product innovation
- Quality & Testability
- Maintainability
- Availability
- Recovery over Reliability
How do these values affect our design?
Separating our system into bounded contexts continues to pay off as we have nearly three dozen teams working on the platform now. These teams deploy hundreds of times per month with fewer than 3% of releases rolled back. Releases requiring coordination of multiple teams occur infrequently. Our emphasis on automated testing and testability, in addition to including developers in the on-call rotation, helps us maintain high quality throughout the system.
We run our product in the cloud and take advantage of automation tooling to ensure that we run multiple, redundant instances of each service. This same tooling can be used to quickly rebuild instances if we need to recover or scale to maintain our internal and external SLAs. Our regular disaster recovery exercises keep us confident that we can recover in the event of failures.
Team autonomy is one of our most important values. We hire professional software developers and we believe in giving them the freedom and the responsibility to deliver customer value in the best way they know. This means that we have more variation than you might expect in our development practices and our technology choices from team to team.
In order to ensure that any team’s choices do not adversely affect other teams’ ability to deliver customer value, we have maintained our strict rules around communication between bounded contexts. We continue to prefer asynchronous communication through messaging business events. When synchronous communication is the appropriate choice, we still require HTTP+JSON.
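The decoupling that asynchronous business events provide can be sketched with an in-memory stand-in for a message broker. This is a simplification under stated assumptions: `EventBus` is a hypothetical class, the event names are invented, and a real deployment would use a broker rather than in-process queues.

```python
import json
import queue


class EventBus:
    """Hypothetical in-memory stand-in for a message broker."""

    def __init__(self) -> None:
        # Maps an event type to the inboxes of every subscriber.
        self._subscribers: dict = {}

    def subscribe(self, event_type: str) -> queue.Queue:
        # Each subscriber gets its own inbox and reads at its own pace.
        inbox: queue.Queue = queue.Queue()
        self._subscribers.setdefault(event_type, []).append(inbox)
        return inbox

    def publish(self, event_type: str, payload: dict) -> int:
        # The publisher does not know or wait for its consumers;
        # it only hands a JSON body to each registered inbox.
        body = json.dumps({"type": event_type, "payload": payload})
        inboxes = self._subscribers.get(event_type, [])
        for inbox in inboxes:
            inbox.put(body)
        return len(inboxes)
```

The publisher returns as soon as the message is queued, so a slow or unavailable consumer in one bounded context does not block the team that owns the publishing context.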
As we have grown and learned, it has become obvious that some bounded contexts need a reliable, up-to-date, locally cached copy of another bounded context’s data but otherwise are not interested in reacting to the published business events from that bounded context. To address this need, we are systematizing data replication between sources of truth using a distributed commit log. This has simplified some of the interactions between bounded contexts and enabled more accurate, effective, data-driven decision making.
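The replication pattern can be sketched as an append-only log plus a consumer that tracks its own offset. This is a toy, single-process model for illustration: `CommitLog` and `LocalReplica` are hypothetical names, the `None` tombstone convention for deletes is an assumption, and a real distributed commit log adds partitioning, durability, and retention that are omitted here.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class CommitLog:
    """Toy append-only log owned by the source-of-truth context."""

    records: list = field(default_factory=list)

    def append(self, key: str, value: Optional[Any]) -> int:
        # Records are never modified in place; returns the new record's offset.
        self.records.append((key, value))
        return len(self.records) - 1

    def read_from(self, offset: int) -> list:
        return self.records[offset:]


@dataclass
class LocalReplica:
    """Locally cached copy maintained by a consuming context."""

    cache: dict = field(default_factory=dict)
    next_offset: int = 0

    def catch_up(self, log: CommitLog) -> int:
        """Apply every record not yet seen; return how many were applied."""
        applied = 0
        for key, value in log.read_from(self.next_offset):
            if value is None:
                self.cache.pop(key, None)  # tombstone: the key was deleted
            else:
                self.cache[key] = value  # latest value for a key wins
            self.next_offset += 1
            applied += 1
        return applied
```

Because the replica remembers its offset, it can fall behind or restart and then resume from where it stopped, and the consuming context reads its own cache instead of calling the owning context synchronously.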
Continuous Architecture
Context always changes, and architectural choices need to evolve based on new information. Selecting, updating, and communicating the key quality attributes of a system is one of the responsibilities of the architect.
- Which quality attributes are most important to your system today?
- How do you know you are emphasizing the correct quality attributes?
- What would cause you to change them?
Be diligent in evaluating the efficacy and applicability of the quality attributes for your systems.