Somewhere in mid-February 2016, the first AWS Lambda chunk of MindMup went live, making us one of the early adopters of a trend that would later get buzzworded to serverless. At the start of 2017, we finally turned off all the old services, so MindMup now runs completely ‘serverless’, and it was quite a journey. During that period, the number of active users increased by roughly 50%, but our hosting costs dropped by slightly less than 50%. Plus, we replaced what was probably our biggest bottleneck with something that scales without any effort on our side. Given how new the whole service is, and how quickly it’s changing, this was a year of intense learning and experimentation. By far, the biggest lesson for me was to really embrace the platform, not just the service.
Buried deep in a flood of tweets about US executive orders and Brexit legal challenges, one of Simon Wardley’s recent comments helped me finally put this into a nice perspective. Wardley argued that ‘serverless’ is effectively platform-as-a-service, or more precisely what platform-as-a-service was supposed to have been before marketers took over the buzzword. Sure enough, anything marketers want to promote today gets labelled serverless, so history has already started repeating itself, but let’s leave that aside for now. The right way to look at serverless environments is, in effect, as a successful implementation of the platform-as-a-service idea.
And that’s why it’s so critical to embrace the platform, not just the Lambda service. By embracing the platform, I mean using Lambda as a kind of universal glue between services across the entire AWS platform, and letting users connect directly to specific services. Crucially, this means not using Lambda just to host traditional apps more cheaply. Sure, it is perfectly possible to run traditional web applications inside Lambda containers with minor modifications; you just need to move state somewhere else. There are even libraries out there that will let you run Node.js Express apps or Java Spring apps inside Lambda easily (a sketch follows below). On one hand, that is quite a compelling way to start using the new architectures. On the other hand, the benefits of that approach are quite questionable. A big part of Patrick Debois’ keynote at ServerlessConf 2016 in London was that just pushing things to Lambda won’t magically make things cheaper and faster; quite the opposite. Hosting traditional web apps in Lambda might cause you to pay for the same thing several times over.
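To make the lift-and-shift concrete: one such library is aws-serverless-express. A minimal sketch of what it enables, assuming an API Gateway proxy integration (this is an illustration, not how MindMup runs):

```typescript
// Lift-and-shift sketch: the Express app is unchanged, Lambda just hosts it.
import express from 'express';
import awsServerlessExpress from 'aws-serverless-express';

const app = express();
app.get('/ping', (req, res) => {
  res.json({ pong: true });
});

// aws-serverless-express translates API Gateway proxy events into HTTP
// requests against this in-process server, and translates responses back.
const server = awsServerlessExpress.createServer(app);

export const handler = (event: any, context: any) =>
  awsServerlessExpress.proxy(server, event, context);
```

Note that every request still flows through API Gateway and a full Lambda invocation, which is exactly the cost structure questioned below.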
Use the platform, Luke
A good example of that would be file conversions, something we have to do in MindMup quite a lot. The usual way to handle file conversions with web applications is to let users upload a source file to a server using REST. The server can then spin up some asynchronous process in the background, and expose an endpoint where users can poll for task statuses. When the file is finally converted, the status endpoint tells the user where to look for the result. The web server is the bridge: it translates between the web requests from end-users and some kind of background file storage. The server is the key entry point and the gatekeeper; it provides authentication and authorisation, enforces capacity limits, and orchestrates the workflow by coordinating background services.
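Here is a minimal sketch of that gatekeeper shape, with hypothetical endpoint names and an in-memory job table standing in for a real queue and status store:

```typescript
// Illustrative gatekeeper: receives uploads, starts a background
// conversion, and exposes a status endpoint that clients poll.
import express from 'express';
import { randomUUID } from 'crypto';

type Job = { status: 'pending' | 'done' | 'failed'; resultUrl?: string };

const app = express();
const jobs = new Map<string, Job>(); // stand-in for a queue plus status store

app.post('/conversions', express.raw({ type: '*/*', limit: '10mb' }), (req, res) => {
  const id = randomUUID();
  jobs.set(id, { status: 'pending' });
  convert(req.body) // placeholder for the real background conversion
    .then((url) => jobs.set(id, { status: 'done', resultUrl: url }))
    .catch(() => jobs.set(id, { status: 'failed' }));
  res.status(202).json({ id, statusUrl: `/conversions/${id}` });
});

// Clients poll here until the status flips to 'done' or 'failed'.
app.get('/conversions/:id', (req, res) => {
  const job = jobs.get(req.params.id);
  if (!job) {
    res.sendStatus(404);
    return;
  }
  res.json(job);
});

async function convert(source: Buffer): Promise<string> {
  // A real server would hand this to a queue and a worker fleet.
  return `https://files.example.com/${source.length}.pdf`;
}

app.listen(3000);
```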
With applications designed to run on the AWS cloud before Lambda, it’s very likely that the file storage would end up in AWS S3, and that the web server and the background conversion process would communicate using SQS, a message queuing service. Moving the whole thing to Lambda blindly would normally mean using API Gateway to connect web requests to Lambda functions. It would require paying for all of the following for each conversion:
- S3 storage to keep the original and converted files
- Lambda processing to convert files
- Data transfer in and out of AWS
- Lambda processing and API Gateway to receive incoming conversion requests
- Lambda processing and API Gateway to send back results
- Lambda processing and API Gateway requests to poll for results
To deal with differences in connection speed, latency and file sizes, it’s safe to assume three to four poll requests on average before a converted file is ready. So a single conversion would require us to pay for 8 Lambda and 8 API Gateway requests. The Lambda functions that copy files in and out of S3 are likely to require a lot of memory: because API Gateway sits between users and the Lambda function, it is not possible to just stream to an HTTP socket, so the entire file contents have to be held in memory.
By embracing the platform, we can reduce this to just one or two Lambda requests, which will happily run on the smallest Lambda container. We’ll have to pay for storage, processing and data transfer anyway, but by letting the platform take over, we can cut the overall costs by between 50% and 70%.
There are three typical aspects of letting the platform take over the responsibilities of a server:
- Use distributed authorization
- Let clients orchestrate workflows
- Allow clients to directly connect to AWS resources
When a web server is playing the role of a gatekeeper, that process is effectively mediating between a client and the file storage. For traditional web architectures, this is important because it makes it much easier to secure background services and enforce access policies. The server deals with untrustworthy user data, authorises an operation, and everything after that can be trusted. Web servers are in a DMZ, exposed and vulnerable, but the other services can sit in a secure private network. That kind of setup is not just unnecessary with AWS Lambda, it’s harmful. First of all, the storage is already available online. S3 and most other AWS resources are directly accessible over HTTPS. Putting a Lambda function in front of them would just increase latency for end-users, and create one more potential point of failure. There is no secure private network there. AWS doesn’t really differentiate between requests coming to S3 from a Lambda function and requests coming directly from browser clients; none of them can be blindly trusted anyway. Each such request needs to be authenticated against IAM policies, regardless of where it is coming from. So we might as well let the clients go to the storage directly.
Letting client browsers connect directly to AWS resources was quite a scary thought for me, after almost twenty years of conditioning to distrust client apps and prevent direct access to back-end systems. Then I realised that we don’t actually have a back-end system any more. There is no benefit in creating a layer of trusted requests that can be managed more easily, because each API call needs to be authenticated separately anyway. And sure enough, AWS being AWS, there is more than one way of letting platform services authenticate and authorise client requests. In fact, there are three:
- Use IAM users
- Use Cognito
- Pre-sign URLs
The first approach, with IAM users, is best for controlled situations such as call centres and back-office apps. Amazon IAM enables a single service account to open many user accounts, each with its own access keys. Each user can then get specific IAM policies to access resources, and client applications can use the AWS APIs directly with the access keys representing the active user. There is a limit of 5000 IAM users per AWS account though, so this approach doesn’t work that well for a public website. Someone also needs to create IAM users manually, or you’ll need to deploy an app that has full access to IAM so users can register themselves. Twenty years of conditioning makes me think that creating an app with the full keys to the kingdom is a big security risk, so this is a no-go for anything exposed to public access. Lastly, this approach does not allow anonymous access. We could simulate anonymous access by including generic access keys with our app, again creating a huge security risk. However, for situations where a small number of well-known users need to use AWS services, this approach could work quite well.
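For completeness, here is roughly what that controlled, back-office variant looks like; the bucket name is made up, and the access keys are assumed to have been issued to this specific user out of band:

```typescript
// Back-office client calling S3 directly with one user's own IAM keys.
// Whether the call succeeds is decided purely by the IAM policies
// attached to that user, not by any server of ours.
import S3 from 'aws-sdk/clients/s3';

const s3 = new S3({
  region: 'us-east-1',
  accessKeyId: process.env.USER_ACCESS_KEY_ID,
  secretAccessKey: process.env.USER_SECRET_ACCESS_KEY,
});

s3.getObject({ Bucket: 'example-backoffice-files', Key: 'reports/q1.pdf' })
  .promise()
  .then((data) => console.log('fetched', data.ContentLength, 'bytes'));
```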
The second approach is to use Cognito identity pools instead of plain IAM users, which fixes most of the security problems and limitations of the first approach. Cognito assigns lightweight identities to users directly from the browser, and you can assign IAM policies to those identities. This means we can allow or deny access to AWS resources without the need to include any AWS access keys with the client application. Users can authenticate against your own data (for example a username/password database), against federated identity providers such as Google and Facebook, or even against a managed username/password user pool inside Cognito. Crucially, you can also assign IAM policies to unauthenticated clients, effectively controlling what anonymous users can do with AWS resources. This approach apparently scales to millions of users (I’ve not tested that myself), and there is no administrative overhead of creating accounts manually, nor any need to expose full IAM access to any application. With Cognito identities, client apps can access AWS platform resources directly using the AWS SDK, including from a browser or a mobile application.
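A browser-side sketch of that flow, using the version 2 JavaScript SDK and a placeholder identity pool id; the IAM policy attached to the pool’s unauthenticated role decides what an anonymous visitor may do:

```typescript
// Browser client obtaining temporary credentials from a Cognito identity
// pool and talking to S3 directly; no AWS access keys ship with the app.
import AWS from 'aws-sdk';

AWS.config.region = 'us-east-1';
AWS.config.credentials = new AWS.CognitoIdentityCredentials({
  IdentityPoolId: 'us-east-1:11111111-2222-3333-4444-555555555555', // placeholder
});

const s3 = new AWS.S3();

// The SDK transparently exchanges the Cognito identity for temporary
// AWS credentials before signing this request.
s3.putObject({
  Bucket: 'example-user-uploads', // hypothetical bucket
  Key: 'drafts/map.mup',
  Body: JSON.stringify({ title: 'draft map' }),
})
  .promise()
  .then(() => console.log('uploaded straight from the browser'));
```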
Some AWS services, including S3, will also accept pre-signed access URLs. You confirm to AWS that a client is allowed to make a very specific request, including all the important headers and parameters, limited to a very specific expiry time. It’s like giving your users power of attorney to make a very specific claim. This approach is easiest if there is only one service people need to access, and it doesn’t even require you to bundle the AWS SDK with your app. It is also useful if you want to implement an additional level of authorisation that IAM doesn’t give you out of the box. For example, with a pre-signed URL you can restrict the size of a file users can upload to S3, which isn’t possible with IAM or Cognito.
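The size restriction comes from S3’s pre-signed POST policies; a server-side sketch, with placeholder bucket and key names:

```typescript
// Pre-signing an S3 POST policy that caps the upload at 1 MB and
// expires after five minutes.
import S3 from 'aws-sdk/clients/s3';

const s3 = new S3({ region: 'us-east-1' });

s3.createPresignedPost(
  {
    Bucket: 'example-incoming',
    Fields: { key: 'incoming/request-123' },
    Conditions: [['content-length-range', 0, 1024 * 1024]], // at most 1 MB
    Expires: 300, // seconds until the signature stops working
  },
  (err, presigned) => {
    if (err) throw err;
    // Hand presigned.url and presigned.fields to the client, which then
    // POSTs the file directly to S3; S3 itself enforces the limits.
    console.log(presigned.url, presigned.fields);
  }
);
```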
For the MindMup file conversion, we chose the third approach, mainly because of the option to restrict uploaded file sizes. A browser makes a quick signature request over HTTPS to an authorisation Lambda function, which creates a unique request ID and pre-signs the URLs that allow the user to upload a file, poll for status and download the results. One Lambda call per export. Almost no memory needed, and the whole process completes very quickly. The client browser can then talk to S3 directly and send a file there. We’re not paying for API Gateway for the file transfer, or paying for a huge Lambda function to receive the whole file in memory on the other end. When the file is finally uploaded to S3, another Lambda kicks off directly to convert the file and save the result back to S3, as sketched below. This allows us to use streaming APIs, reduce memory constraints, and deal with errors effectively through dead-letter Lambda functions. While the conversion is going on, the clients can poll S3 directly for the results, so we’re not paying for any API Gateway requests, which means the clients can poll more often if they need to. The solution costs an order of magnitude less than if we were to just run a Lambda entry point in front of a traditional server workflow. Plus, users do not have to suffer the latency introduced by a pointless middleman.
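A hedged sketch of that middle step, assuming an S3 event notification wired to the function and placeholder key prefixes; the real converter is elided:

```typescript
// Conversion Lambda fired by the S3 upload event: stream the source in,
// convert it, and write the result to the key the client is polling.
import S3 from 'aws-sdk/clients/s3';
import { Readable } from 'stream';

const s3 = new S3();

function convert(source: Readable): Readable {
  return source; // placeholder: the real file conversion happens here
}

export const handler = async (event: any) => {
  for (const record of event.Records) {
    const bucket = record.s3.bucket.name;
    // S3 event keys are URL-encoded, with spaces turned into '+'.
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '));

    const source = s3.getObject({ Bucket: bucket, Key: key }).createReadStream();

    await s3
      .upload({
        Bucket: bucket,
        Key: key.replace('incoming/', 'results/'), // the URL clients poll
        Body: convert(source),
      })
      .promise();
  }
};
```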
The hybrid cloud of the future
By using distributed authorisation and letting clients connect directly to platform services, we open up another interesting option for reducing operational costs. Instead of the application running on dozens or hundreds of processors in Lambda, it can run on hundreds of thousands of client processors, which we do not have to pay for. In a traditional web architecture, client applications cannot be trusted with orchestration, because the server is typically the gatekeeper. But with distributed authorisation, the need for a single gatekeeper goes away. When client applications can connect directly to back-end resources, there’s very little benefit in orchestrating work from a central server. Coordination, workflows and many other aspects of an application can move directly to the client. Only the parts that really need to be locked down, for security reasons or to use specialist resources, need to go to AWS. The hybrid cloud of the future that everyone is busy writing about isn’t going to be a mix of AWS and Google, or AWS and on-premise systems. It will be a mix of AWS and client machines.