Database Basics
Choosing the right database model up front is essential to the success of your app. Learn how to decide whether your project needs SQL or NoSQL.
Jul 23, 2018 • 7 Minute Read
Database Basics
It would be hard to imagine any public-facing application of even minimal complexity that didn't rely on structured data of one sort or another. Working on an application? Learn to love databases.
A database is software that's good at reading digital information and then reliably storing it in a structured format so that the information can later be efficiently retrieved in useful formats and combinations.
Choosing the Right Database Model
The data world can get really complex, really fast. In practical terms, though, it's fair to say that most projects can be successfully served by one of two database models: relational (SQL) and NoSQL.
Relational Databases
If you need your data organized in ways that carefully define how various categories of information relate to each other, then a relational database may be what you’re after. Think of it in terms of a business that, say, must manage its employees in the context of the jobs they do, the way they’re paid, and their health insurance status. Data related to each employee appears in each of those categories but, at the same time, may not be accessible to other users beyond what’s individually necessary.
Relational databases are often managed by one flavor or another of the SQL standard. SQL stands for Structured Query Language, and the "structured" part of that tells most of the story. A SQL-type database (leading examples of which include Oracle, MySQL, PostresSQL, Microsoft's SQL Server, and, more recently, Amazon's Aurora) is made up of tables, which, in turn, contain records (or, as some call them, rows). Records are made up of individual values known as fields. Thus, the contents of a database of customer information could be represented this way:
ID | Name | Address | City | # of purchases |
---|---|---|---|---|
1 | John Doe | 123 Any St. | Yourtown | 5 |
2 | Jane Smith | 321 Yna Ave. | Hertown | 2 |
Here, the database has records identified by the numbers 1 and 2, and each record contains fields made up of a name, an address, and a number of purchases.
Perhaps the key benefit of this kind of strong structure is that it allows high levels of predictability and reliability, because carefully defined rules can be applied to all transactions affecting your data. You can, for example, apply constraints to the way users of an application can access the database, to ensure that two users aren't trying to write changes to a single record at one time (which could lead to data corruption).
Here's an example of how it can sometimes work. Take a quick look at the US government's Bureau of Labor Statistics (BLS) Occupational Outlook Handbook page for Network and Computer Systems Administrators. Between the content displayed on each of the page's nine tabs, there's quite a lot of text. But I suspect that very little of it was manually added to this page by a human being.
What's more likely is that the database on the BLS server contains terabytes of raw data within which can be found structured information related to each of the many thousands of included occupations. That data is probably organized by information category ("Summary", "Work Environment", etc). When I requested this page from the BLS menu (or through an internet search engine), the BLS web server might have requested the relevant raw data from the database and dynamically organized it on the page the way you see.
Of course, there are many more ways that a website can make use of dynamic access to a database engine installed in the back end, but that was a good illustration.
NoSQL Databases
Times change. Across all segments of the IT world, data is being produced, consumed, and analyzed in volumes and at speeds that weren’t anticipated when the first relational database was designed more than a generation ago. Imagine, for instance, that you’re building a wholesale business that needs to handle constantly changing product descriptions and inventory information for tens of thousands of items; that data, in turn, must be integrated with sales, shipping, and customer service operations.
In such a case, you’ll likely want to use a NoSQL database (such as AWS's DynamoDB). Highly flexible relationships between NoSQL data elements allow much simpler integration of data stored across multiple clients and in different formats. This makes it possible to easily accommodate fast-growing data sources.
Despite what you may think, some people argue that NoSQL stands not for No SQL or Not SQL, but rather for Not Only SQL. That's because these databases can sometimes support SQL-like operations. In other words, you can sometimes trick a NoSQL database into providing functionality similar to what you might expect from a relational database.
If you’d like more complete insights into NoSQL and how it fits into the larger spectrum of database models, the AWS document "What Is NoSQL?" should be helpful: https://aws.amazon.com/nosql.
How to Choose
The database architecture you choose for a project will often depend on the specific needs of your application. For instance, if you're running financial transactions, and, because of the overriding need for absolute accuracy and consistency, it's critical that a single record can never have more than one value, you'll probably opt for a relational platform. Just imagine the chaos that would result if all the money in a particular account was withdrawn by two concurrent client sessions.
On the other hand, suppose you host a popular online multiplayer game. If being able to quickly update data points can make all the difference for player experience, and the occasional write failure won't cause a zombie apocalypse, you'll definitely want to consider NoSQL
Looking to dive in deeper? Check out a couple of outstanding Pluralsight courses: Understanding NoSQL by Andrew Brust and Introduction to SQL by Jon Flanders..
This guide includes content adapted from my two Manning books: Learn Amazon Web Services in a Month of Lunches and Linux in Action.