Startup performance on .NET Core AWS Lambda

While working on a microservice architecture built on AWS Lambda, my team and I found that web requests were taking over 30 seconds to complete after a service had been idle for a while (AWS kills Lambda instances that have been idle for more than 15 minutes). This was a problem for two reasons: the Lambda function sits behind API Gateway, which has a timeout of 30 seconds, and it's unacceptable for a request to take that long anyway. We needed to improve the situation and had two obvious options: stop the Lambda from going idle, and improve startup performance.

A microservice architecture makes this worse, because every service a request touches has to go through its own startup procedure.

Stopping the idle lambda from dying

There's a common technique among people who have used AWS Lambda to serve sites: warming. A scheduled job calls the Lambda at intervals shorter than 15 minutes so that AWS never gets the chance to kill it. The caller can be a separate service, or a cron trigger on the function so that it calls itself.

We chose to have the function call itself. The advantage of this method is that you can force a number of concurrent instances, because each instance recursively calls the next until your concurrency counter reaches your limit. This helps to deal with parallel requests.

This method works well if you rarely deploy, but after a fresh deployment the first hit is still super slow. That is less of a problem if your warmer fires before a user hits the service, but that is rarely the case, especially in test environments.
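
The recursive warm-up chain described above can be sketched roughly as follows. This is a minimal illustration and not our actual warmer: the WarmUpRequest and WarmerSketch names are hypothetical, and the invokeSelf delegate is an assumption standing in for whatever re-invokes the same Lambda function (for example a call made through the AWS SDK). The idea is that each instance stays busy awaiting the next call, so the next call has to be served by a different instance.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical warm-up payload: which link of the chain this call is,
// and how many concurrent instances we want to keep warm.
public sealed class WarmUpRequest
{
    public int Counter { get; set; }
    public int Limit { get; set; }
}

public sealed class WarmerSketch
{
    // Assumption: this delegate re-invokes the same Lambda function and
    // completes only when that nested invocation has returned.
    private readonly Func<WarmUpRequest, Task> _invokeSelf;

    public WarmerSketch(Func<WarmUpRequest, Task> invokeSelf) => _invokeSelf = invokeSelf;

    public async Task WarmAsync(WarmUpRequest request)
    {
        if (request.Counter >= request.Limit)
        {
            return; // last link in the chain, nothing further to warm
        }

        var next = new WarmUpRequest { Counter = request.Counter + 1, Limit = request.Limit };

        // Awaiting keeps this instance occupied while the nested call runs,
        // so that call should land on another instance.
        await _invokeSelf(next);
    }
}
```

The scheduled trigger would start the chain with Counter = 1 and Limit set to the number of instances to keep warm. The handler would need to recognise a warm-up payload and hand it to the warmer instead of the router.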

Improving startup performance

This is where some real gains were made.

Take this block of code:

using System.Threading.Tasks;
using Amazon;
using Amazon.DynamoDBv2;
using Amazon.Lambda.APIGatewayEvents;
using Amazon.Lambda.Core;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
// Router, IDataStore, the repositories, Enquiries, ILogger and Logger are this project's own types

public class Function
{
	private readonly IConfigurationRoot _config;

	private static RegionEndpoint ParseRegionEndpoint(IConfiguration config) => RegionEndpoint.GetBySystemName(config["AWS_REGION"]);

	private static IAmazonDynamoDB CreateDynamoClient(IConfiguration config)
	{
		var amazonConfig = new AmazonDynamoDBConfig
		{
			RegionEndpoint = ParseRegionEndpoint(config)
		};
		return new AmazonDynamoDBClient(amazonConfig);
	}

	// Configuration is read from the Lambda's environment variables
	public Function() => _config = new ConfigurationBuilder().AddEnvironmentVariables().Build();

	public async Task<APIGatewayProxyResponse> HandleApiRequest(APIGatewayProxyRequest request, ILambdaContext context)
	{
		// The whole container, including the DynamoDB client, is rebuilt on every request
		var serviceProvider = new ServiceCollection()
			.AddSingleton(CreateDynamoClient(_config))
			.AddSingleton(new TableNameForEnvironment(_config["environment"]))
			.AddSingleton<IDataStore, DynamoDbDataStore>()
			.AddSingleton<IDealershipRepository>(new DealershipRepository(_config["dealershipServiceEndpoint"]))
			.AddSingleton<IEnquiryRepository, EnquiryRepository>()
			.AddSingleton<IEnquiries, Enquiries>()
			.AddSingleton<ILogger>(new Logger(context.Logger.LogLine))
			.BuildServiceProvider();

		var router = new Router(serviceProvider);
		var path = request.Path.Substring(Router.ServiceName.Length + 2); // Service name plus a slash each side
		var result = await router.Run(path, request.Body);
		return new APIGatewayProxyResponse
		{
			Body = result,
			StatusCode = 200
		};
	}
}

The entry point here is HandleApiRequest, which is what gets called each time your function is invoked. AWS creates an instance of Function and calls that method on it. Here you can see that we set up all our dependencies and then call router.Run, which calls into the main business logic.

The problem is that we're setting up all our dependencies every single time a request is made, which is actually quite slow. That's totally fine if you're running off a queue, where you don't need a quick response, but for a web API it's too slow.

Remember how AWS keeps a function alive for 15 minutes? Well, this includes the instantiated Function class and its static members. We can use this to our advantage by employing a static constructor:

using System;
using System.Threading.Tasks;
using Amazon;
using Amazon.DynamoDBv2;
using Amazon.Lambda.APIGatewayEvents;
using Amazon.Lambda.Core;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
// Router, IDataStore, the repositories, Enquiries, Warmer, ILogger and Logger are this project's own types

public class Function
{
	// Static members are created once and shared by every request this instance serves
	private static readonly IConfigurationRoot Config;
	private static readonly Router Router;
	private static readonly ServiceProvider ServiceProvider;

	private static RegionEndpoint ParseRegionEndpoint(IConfiguration config) => RegionEndpoint.GetBySystemName(config["AWS_REGION"]);

	private static IAmazonDynamoDB CreateDynamoClient(IConfiguration config)
	{
		var amazonConfig = new AmazonDynamoDBConfig
		{
			RegionEndpoint = ParseRegionEndpoint(config)
		};
		return new AmazonDynamoDBClient(amazonConfig);
	}

	// Runs once, the first time the Function type is used
	static Function()
	{
		Config = new ConfigurationBuilder().AddEnvironmentVariables().Build();
		ServiceProvider = new ServiceCollection()
			.AddSingleton(CreateDynamoClient(Config))
			.AddSingleton(new TableNameForEnvironment(Config["environment"]))
			.AddSingleton<IDataStore, DynamoDbDataStore>()
			.AddSingleton<IDealershipRepository>(new DealershipRepository(Config["dealershipServiceEndpoint"]))
			.AddSingleton<IEnquiryRepository, EnquiryRepository>()
			.AddSingleton<IEnquiries, Enquiries>()
			// No ILambdaContext exists at this point, so the logger writes to the console instead
			.AddSingleton<ILogger>(new Logger(Console.WriteLine))
			.AddSingleton<Warmer>()
			.BuildServiceProvider();
		Router = new Router(ServiceProvider);
	}

	public async Task<APIGatewayProxyResponse> HandleApiRequest(APIGatewayProxyRequest request, ILambdaContext context)
	{
		// Only routing happens per request now
		var path = request.Path.Substring(Router.ServiceName.Length + 2); // Service name plus a slash each side
		var result = await Router.Run(path, request.Body);
		return new APIGatewayProxyResponse
		{
			Body = result,
			StatusCode = 200
		};
	}
}

This yielded a 2x improvement in initial startup time and a 10x improvement in request time.
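
To see where the saving comes from without any AWS dependencies, here is a small self-contained sketch. The class names and the Builds counters are hypothetical, and a trivial upper-casing delegate stands in for the real router. It only counts how often the "dependencies" get built under each approach.

```csharp
using System;

// Hypothetical stand-in for the first version: dependencies are rebuilt inside every call.
public class PerRequestFunction
{
    public static int Builds; // how many times the dependencies were built

    private static Func<string, string> BuildDependencies()
    {
        Builds++;
        return body => body.ToUpperInvariant(); // placeholder for the real router
    }

    public string Handle(string body)
    {
        var run = BuildDependencies(); // paid on every request
        return run(body);
    }
}

// Hypothetical stand-in for the second version: dependencies are built in a static constructor.
public class StaticFunction
{
    public static int Builds;
    private static readonly Func<string, string> Run;

    static StaticFunction()
    {
        Builds++; // paid once, when the type is first used
        Run = body => body.ToUpperInvariant();
    }

    public string Handle(string body) => Run(body);
}
```

Calling Handle three times on each class leaves PerRequestFunction.Builds at 3 and StaticFunction.Builds at 1, even if a new StaticFunction object is created for every call.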

You may need to be careful with this solution, though. We don't have any state to manage, nor do we use multi-threading. If you use either of those, keep an eye out, but for a simple API this has worked really well.
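
As a rough illustration of the state caveat, the sketch below uses two hypothetical handlers. The first keeps request data in a static list, which survives between invocations on a warm instance. The second keeps it in a local variable. Neither class comes from our service.

```csharp
using System.Collections.Generic;

public class LeakyFunction
{
    // Static, so it lives for as long as the warm instance does
    private static readonly List<string> Seen = new List<string>();

    public int Handle(string body)
    {
        Seen.Add(body);    // data from earlier requests is still in the list
        return Seen.Count;
    }
}

public class IsolatedFunction
{
    public int Handle(string body)
    {
        var seen = new List<string> { body }; // fresh for every request
        return seen.Count;
    }
}
```

Two consecutive calls return 1 and then 2 from LeakyFunction, but 1 and 1 from IsolatedFunction. List<T> is also not thread-safe, so a handler that fans work out across threads would need a lock or a concurrent collection around any shared static state.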