In late March there were reports of unemployment web sites being overloaded by demand and becoming unresponsive. This sort of scaling challenge will become less common in the age of serverless, but serverless is still new enough that there is value in a post about how serverless changes how we think about delivering applications. Note: this is a somewhat technical post geared for a technical audience that is familiar with CDNs, APIs, databases, etc.
Most people on the technical side of the web space understand that serving HTML pages and other static assets directly from your own servers is a bad idea. For most use cases a web server or file store should hand static content off to a Content Delivery Network (CDN).
A CDN is a global network of web servers that provide web server infrastructure as a service. This way, any given user loading your web page is likely to load it quickly from a server that is relatively “close” to wherever they happen to be. “Close” ideally means that the connection your end user has to the CDN is low latency and high throughput. When this happens, your website will usually feel much faster than if the user has to navigate a high latency or low throughput network before interacting with your web page.
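A CDN can only serve content quickly from the edge if the origin tells it what is safe to cache and for how long. As a rough sketch (the helper function and file-extension rules here are hypothetical, not a specific CDN's API), the origin communicates this through `Cache-Control` response headers:

```python
# Sketch: a CDN caches what the origin marks as cacheable.
# The helper and its rules are illustrative, not a real CDN API.

def cache_headers(path: str) -> dict:
    """Return response headers telling a CDN how long to cache an asset."""
    # Fingerprinted static assets (scripts, styles, images, fonts) rarely
    # change between deploys, so they can be cached for a long time.
    if path.endswith((".js", ".css", ".png", ".woff2")):
        return {"Cache-Control": "public, max-age=31536000, immutable"}
    # HTML should be revalidated frequently so users see fresh pages.
    return {"Cache-Control": "public, max-age=60"}
```

With headers like these, a user's second page load is served from a nearby edge node instead of traveling back to the origin.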
A CDN is serverless: your web site can use a CDN without having to manage servers, software, etc. Web sites and apps that use CDNs pay a monthly fee that is generally proportional to utilization. If we use the CDN more next month, our bill will go up; if our usage drops, we should see a lower bill. Of course, we may need to pay attention to the myriad ways we can find ourselves paying for subscriptions we don't use, in the same way that we used to pay for cell phone minutes that we didn't necessarily use.
The more serverless you use, the more likely your application will be able to scale relatively seamlessly.
The old model was to have lots of servers. Servers to manage API requests, servers to provide different kinds of middleware and load balancing, and even more servers to host databases. Maybe a few servers to manage all the other servers. Not only does the old model entail maintaining fleets of virtual machines or containers, you have to estimate what your load will be. Overestimate and you incur the costs of idle capacity as surely as if you pay rent on an empty office building. Underestimate and you risk your application becoming unresponsive.
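The trade-off above comes down to simple arithmetic. A back-of-the-envelope sketch (all prices here are made-up numbers for illustration) contrasts paying for provisioned servers with paying per request:

```python
# Illustrative cost model for the provisioning trade-off; all numbers
# are invented for the sake of the example.

def provisioned_cost(servers: int, cost_per_server: float) -> float:
    """With fixed servers, you pay for every one, idle or not."""
    return servers * cost_per_server

def serverless_cost(requests: int, cost_per_million: float) -> float:
    """With serverless, you pay only for requests actually served."""
    return requests / 1_000_000 * cost_per_million

# Overestimate: 20 servers provisioned when traffic needed only 5.
idle_waste = provisioned_cost(20, 200.0) - provisioned_cost(5, 200.0)
print(idle_waste)  # 3000.0 spent on idle capacity
```

Underestimating is the mirror image: there is no line item for it on the bill, but the cost shows up as an unresponsive application.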
At Ondema, the basic pattern we use on the server side is to create GraphQL APIs that are powered by AWS Lambda functions and DynamoDB. As we add customers, our AWS bill increases in a linear fashion, and our administration cost decreases in a linear fashion (supporting APIs for 10,000 users is much less effort, per user, than supporting APIs for 10 users).
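The shape of this pattern is a small handler function that the cloud provider invokes on demand. The sketch below shows roughly what such a handler looks like; it is a simplified stand-in, not Ondema's actual code, and a real implementation would parse the query with a GraphQL library and read from DynamoDB rather than returning a stub:

```python
import json

# Simplified sketch of an AWS Lambda handler behind a GraphQL endpoint.
# Resolver logic and data access are stubbed out; a real handler would
# dispatch the parsed query to resolvers backed by DynamoDB.

def handler(event, context):
    """Entry point invoked by Lambda for each API request."""
    body = json.loads(event.get("body") or "{}")
    query = body.get("query", "")
    if not query:
        # GraphQL requests without a query are malformed.
        return {"statusCode": 400,
                "body": json.dumps({"error": "missing query"})}
    # Stub: a real handler would execute the query here.
    return {"statusCode": 200, "body": json.dumps({"data": {}})}
```

The point is what is absent: no listening socket, no process supervision, no capacity planning. The provider runs as many copies of this function as incoming traffic requires.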
Instead of managing fleets of servers, we deploy APIs, and AWS does the work of managing servers, keeping the underlying software up to date, and scaling to support additional users when they show up. When we get two orders of magnitude more users, we will see a higher AWS bill, and our users will not notice any difference in the performance of our software. Down the road we anticipate some overhead associated with automation to support 3, 4, and 5 orders of magnitude more users, but the general pattern of linear decrease in per-user costs will continue, and the complexity of infrastructure scaling will continue to be outsourced to the provider of underlying compute and database services.
Part of this is bog-standard economies of scale. But the change in how we manage complexity means that we are seeing the emergence of new value chains: cloud services that are disruptive and innovative. We can sometimes only hypothesize about what this means, but few people think that technology value and technology delivery ten years from now will look like what we are used to today. When it comes to the scalability of the kinds of applications that power unemployment insurance claims, though, we can confidently predict that apps like these will become much more scalable and much more resilient. Just as CDNs enhanced the scalability and reliability of static web sites, serverless will drive the scalability and reliability of applications. Wow!
The result is that even as our offerings become more complex and are delivered at greater scale, the odds of the underlying services becoming unresponsive decrease, because the cloud services that enable our application are becoming more reliable over time. We're humans writing software, so quality and usability will continue to require sustained effort, but challenges around responsiveness will be driven by application-level constraints rather than infrastructure-level constraints. We're starting to be able to forget how to manage virtual machines and containers, and to trust others to be the experts in load balancing, caching strategies, and the like. Just as we are consumers of CDN services, we are consumers of API, compute, storage, and database services.
With each passing month we become more confident that yesterday’s scaling challenges really are behind us.
Plug: is your team trying to solve large-scale scheduling, collaboration, and workflow problems? Wondering if a vendor might help? We’d be grateful for the opportunity to discuss it with you!