What is scalability in cloud computing?
Jun 08, 2023 • 6 Minute Read
Maybe you heard it in a punchline on the television show Silicon Valley, or maybe you heard it from a serious VC looking to invest in a company. Either way, you've probably heard the question: does it scale? All jokes aside, scalability is a key concept not only for Solutions Architects, but for anyone in the DevOps world, and it is central to cloud enablement initiatives. So let's talk a little bit about the concept of scalability in cloud computing.
What is scalability?
Scalability refers to the idea of a system in which every application or piece of infrastructure can be expanded to handle increased load. For example, suppose your web application gets featured on a popular website like ProductHunt. Suddenly, thousands of visitors are using your app - can your infrastructure handle the traffic? Having a scalable web application ensures that it can scale up to handle the load and not crash. Crashing (or even just slow) pages leave your users unhappy and your app with a bad reputation.
Systems have four general areas that scalability can apply to (a quick way to check all four is sketched after this list):
- Disk I/O
- Memory
- Network I/O
- CPU
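As a rough illustration, a snapshot of these four resource areas can tell you which one is under pressure before you decide how to scale. This is a minimal sketch assuming the third-party psutil package is installed; any monitoring agent would give you the same information:

```python
# Sketch: snapshot the four resource areas that scaling decisions usually hinge on.
# Assumes the third-party "psutil" package is installed (pip install psutil).
import psutil

def resource_snapshot():
    cpu = psutil.cpu_percent(interval=1)      # CPU utilisation over 1 second
    mem = psutil.virtual_memory().percent     # memory in use
    disk = psutil.disk_io_counters()          # cumulative disk reads/writes
    net = psutil.net_io_counters()            # cumulative bytes sent/received
    return {
        "cpu_percent": cpu,
        "memory_percent": mem,
        "disk_read_bytes": disk.read_bytes,
        "disk_write_bytes": disk.write_bytes,
        "net_bytes_sent": net.bytes_sent,
        "net_bytes_recv": net.bytes_recv,
    }

if __name__ == "__main__":
    print(resource_snapshot())
```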
Vertical scaling
Vertical scaling is often thought of as the "easier" of the two methods. When scaling a system vertically, you add more power to an existing instance. This can mean more memory (RAM), faster storage such as Solid State Drives (SSDs), or more powerful processors (CPUs).

The reason this is thought to be the easier option is that hardware is often trivial to upgrade on cloud platforms like AWS, where servers are already virtualized. There is also very little (if any) additional configuration required at the software level.
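On AWS, for example, scaling vertically often amounts to stopping an instance, changing its instance type, and starting it again. A minimal sketch with boto3 might look like the following; the instance ID, region, and target instance type are placeholders, not values from this article:

```python
# Sketch: vertical scaling on AWS by resizing an EC2 instance with boto3.
# The instance ID, region, and target instance type are hypothetical placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
instance_id = "i-0123456789abcdef0"  # placeholder instance ID

# An instance must be stopped before its type can be changed.
ec2.stop_instances(InstanceIds=[instance_id])
ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

# Move to a larger instance type (more vCPUs and RAM).
ec2.modify_instance_attribute(
    InstanceId=instance_id,
    InstanceType={"Value": "m5.2xlarge"},
)

ec2.start_instances(InstanceIds=[instance_id])
```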
Horizontal scaling
Horizontal scaling is slightly more complex. When scaling your systems horizontally, you generally add more servers to spread the load across multiple machines.

With this, however, comes added complexity. You now have multiple servers that require the usual administration tasks such as updates, security and monitoring, but you must also keep your application, data and backups in sync across many instances.
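In practice a managed load balancer (such as AWS Elastic Load Balancing) spreads the traffic for you, but the core idea is simple to sketch. The hostnames below are hypothetical; this just shows round-robin distribution across a pool of identical servers:

```python
# Sketch: the idea behind horizontal scaling - spread incoming requests
# across a pool of identical servers. The hostnames are hypothetical.
from itertools import cycle

servers = [
    "http://app-server-1.internal",
    "http://app-server-2.internal",
    "http://app-server-3.internal",
]

# Round-robin rotation, the simplest load-balancing strategy.
rotation = cycle(servers)

def pick_server():
    """Return the next server in the rotation to handle a request."""
    return next(rotation)

for request_id in range(6):
    print(f"request {request_id} -> {pick_server()}")
```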
So which is better?
Horizontal scaling is often considered a long-term advantage, whereas vertical scaling is usually a short-term advantage. The reason is that you can typically keep adding servers to your infrastructure, but at some point hardware upgrades hit a ceiling - there is only so much CPU and memory you can put into a single machine.
Performance
One of the primary reasons for scaling your system is to increase performance. Performance is only one piece of the picture, though - scaling also ties in with concepts such as elasticity and fault tolerance.
Response time
The performance of a system is measured by many different metrics - one of the main ones is response time. Interestingly, scaling your system may actually increase response times. If you move from an architecture that has all of the components (database, application code, caching) on one server to an architecture that separates these components onto their own servers, response times will naturally increase because you now have network latency and other considerations to account for.
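If you want to see where the time goes, a simple timing sketch like the following measures response time from the client's point of view; you could use the same technique to compare a single-server setup against a split-out one. The URL is a placeholder:

```python
# Sketch: measure end-to-end response time for an HTTP request.
# The URL below is a hypothetical placeholder for your own endpoint.
import time
import urllib.request

def measure_response_time(url: str) -> float:
    """Return the time (in seconds) taken to fetch the given URL."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as response:
        response.read()  # make sure the full body is downloaded
    return time.perf_counter() - start

if __name__ == "__main__":
    elapsed = measure_response_time("https://example.com/")
    print(f"response time: {elapsed * 1000:.1f} ms")
```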
Let's look at two popular system architecture types below.
Monolith
A monolith system architecture is the idea of having many of your components in one place. In terms of an application, it may mean that all of your services are coupled together, such as your data layer, caching layer, file layer and business logic. In terms of hardware and servers, it can mean that you run all of your processes in one place, such as your database, web server and file system.
Microservices
A microservices system architecture splits core services out into their own ecosystems. For example, a key part of your application may be an image processing service that can save, delete, cache and manipulate images. This service could run on its own infrastructure, separated from the other application services. You'll often hear the term "separation of concerns" when referring to microservices. Although giving each core service its own infrastructure can make scaling easier, it also adds complexity: you now have to manage multiple servers and change your application code to work with them.
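To make that concrete, the image service mentioned above could be exposed as a tiny standalone HTTP service. This is purely an illustrative sketch - it assumes Flask is installed, and the endpoint and behaviour are invented for the example, not taken from the article:

```python
# Sketch: a minimal standalone "image service" microservice.
# Assumes the third-party Flask package is installed; the endpoint and
# behaviour are hypothetical, just to show a service living on its own.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/images/<image_id>/resize", methods=["POST"])
def resize_image(image_id):
    """Pretend to resize an image; a real service would do the work here."""
    width = int(request.args.get("width", 128))
    height = int(request.args.get("height", 128))
    return jsonify({"image_id": image_id, "resized_to": [width, height]})

if __name__ == "__main__":
    # Runs as its own process, on its own port - independent of the rest of the app.
    app.run(port=5001)
```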
Scalability and databases
Every application is different, but the key is to identify the services that are most likely to become a bottleneck - the first ones to buckle under increased load. One of the most common bottlenecks is the database.

The database is used to store an application's data. You may use a traditional relational database such as MySQL or a NoSQL database such as MongoDB. In simple terms, the database is used to write data (save it) and read it (view it). It is often one of the first components to fall down under high load in an application environment.
Sharding
Sharding a database for scalability means splitting your data across separate database servers. Instead of keeping all of your data on one database server, you split it into "shards". This can help with performance in a few ways (a simple shard-routing sketch follows this list):
- Data requests are spread across multiple servers instead of hitting the same database server every time
- Less data on each shard reduces index sizes, which can improve data seek time
- Less data on each shard means fewer rows, which allows queries to run faster since there is less data to traverse or calculate
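A common way to decide which shard a record belongs to is to hash its key. The following is a minimal sketch of that idea rather than a production sharding scheme; the shard connection strings are placeholders:

```python
# Sketch: hash-based shard routing - pick a database server based on a record's key.
# The shard connection strings are hypothetical placeholders.
import hashlib

SHARDS = [
    "mysql://db-shard-0.internal/app",
    "mysql://db-shard-1.internal/app",
    "mysql://db-shard-2.internal/app",
]

def shard_for(key: str) -> str:
    """Map a record key (e.g. a user ID) to one of the shard servers."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    index = int(digest, 16) % len(SHARDS)
    return SHARDS[index]

if __name__ == "__main__":
    for user_id in ("user-1001", "user-1002", "user-1003"):
        print(user_id, "->", shard_for(user_id))
```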
Partitioning
Database partitioning is similar to database sharding, but not exactly the same: partitioning separates the data into distinct parts, typically within a single database rather than across servers. Common partitioning methods include (a range-partitioning sketch follows this list):
- Splitting data by range (alphabetically or numerically)
- Row wise (horizontal partitioning)
- Column wise (vertical partitioning)
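As an illustration of partitioning by range, a sketch like this routes rows to a partition based on the first letter of a key. The partition names and boundaries are made up for the example:

```python
# Sketch: range partitioning by first letter - route a row to a partition
# based on where its key falls in an alphabetical range. Partition names
# and boundaries are hypothetical.
PARTITIONS = [
    ("p_a_to_h", "a", "h"),
    ("p_i_to_q", "i", "q"),
    ("p_r_to_z", "r", "z"),
]

def partition_for(customer_name: str) -> str:
    """Return the partition that should hold a row keyed by customer name."""
    first = customer_name.strip().lower()[0]
    for name, low, high in PARTITIONS:
        if low <= first <= high:
            return name
    return PARTITIONS[-1][0]  # fall back to the last partition (digits, symbols, etc.)

if __name__ == "__main__":
    for customer in ("Alice", "Omar", "Zoe"):
        print(customer, "->", partition_for(customer))
```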
Application code database optimizations
You can also perform application-level database optimizations, such as the following (query caching is sketched after this list):
- Using database indexes
- Table partitioning
- Caching database queries
- De-normalization
- Running large queries/batch queries offline
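As one example from that list, caching database queries can be as simple as memoizing read queries in the application layer. This is a minimal, self-contained sketch using only Python's standard library; a real setup would more likely cache in something like Redis or memcached and add cache expiry:

```python
# Sketch: caching database queries in the application layer.
# Uses an in-memory sqlite database so the example is self-contained;
# a real app would cache against its actual database and add cache expiry.
import sqlite3
from functools import lru_cache

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("Ada",), ("Linus",)])
conn.execute("CREATE INDEX idx_users_name ON users (name)")  # "using database indexes"

@lru_cache(maxsize=256)
def get_user_id(name: str):
    """Cached read query: repeated lookups for the same name skip the database."""
    row = conn.execute("SELECT id FROM users WHERE name = ?", (name,)).fetchone()
    return row[0] if row else None

print(get_user_id("Ada"))   # hits the database
print(get_user_id("Ada"))   # served from the in-memory cache
print(get_user_id.cache_info())
```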