Sharing State Between Azure Functions
The great thing about “serverless” code is that you don’t need to worry about servers at all. If my function gets invoked 10 times, all 10 invocations might run on the same server, or they might run on 10 different servers. I don’t need to know or care.
But suppose every time my function runs I need to look something up in a database. I might decide that it would be nice to temporarily cache the response in memory so that subsequent runs of my function can run a bit faster (assuming they run on the same server as the previous invocation).
Is that possible in Azure Functions? I did a bit of experimenting to see how it could be done.
To keep things simple, I decided to make a C# webhook function that counted how many times it had been called, and I counted in four ways. First, using a static int variable. Second, using the default MemoryCache. Third, using a text file in the home directory. Fourth, using a per-machine text file in the home directory. Let’s see what happens with each of these methods.
1. Static Integer
If you declare a static variable in your run.csx file, then the contents of that variable are available to all invocations of your function running on the same server. So if our function looks like this:
static int invocationCount = 0;

public static async Task<object> Run(HttpRequestMessage req, TraceWriter log)
{
    log.Info($"Webhook triggered {++invocationCount}");
    return ...
}
And we call it a few times, then we’ll see the invocation count steadily rising. Obviously this code is not thread-safe, but it shows that the memory persists between invocations on the same server.
Unsurprisingly, every time you edit your function, the count will reset. But you’ll notice it resets at other times too, for example when the function app restarts or the invocation lands on a different server. There’s no guarantee that what you store in a static variable will be present on the next invocation. But it’s absolutely fine for temporarily caching something to speed up function execution.
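The static counter above can be made safe for concurrent invocations with very little extra code. This is a minimal sketch, not the code used in the experiment: the class name InvocationCounter and its members are hypothetical, and it still only counts invocations that land in the same process.

```csharp
using System.Threading;

// Hypothetical helper: a process-wide counter that is safe to call
// from several invocations running at the same time.
public static class InvocationCounter
{
    // Shared by every invocation that runs in the same process.
    private static int invocationCount = 0;

    // Atomically adds one and returns the new value.
    public static int Next()
    {
        return Interlocked.Increment(ref invocationCount);
    }

    // Reads the latest value without changing it.
    public static int Current
    {
        get { return Volatile.Read(ref invocationCount); }
    }
}
```

Inside Run you would then log InvocationCounter.Next() instead of ++invocationCount. The value still resets whenever the host recycles, so it remains a cache-grade counter rather than a durable one.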
2. MemoryCache
The next thing I wanted to try was sharing memory between two different functions in the same function app. This would allow you to share a cache between functions. To try this out I decided to use MemoryCache.Default.
#r "System.Runtime.Caching"
using System.Runtime.Caching;

static MemoryCache memoryCache = MemoryCache.Default;

public static async Task<object> Run(HttpRequestMessage req, TraceWriter log)
{
    // Read the current count (if any), increment it, and store it for five minutes
    var cacheObject = memoryCache["cachedCount"];
    var cachedCount = (cacheObject == null) ? 0 : (int)cacheObject;
    memoryCache.Set("cachedCount", ++cachedCount, DateTimeOffset.Now.AddMinutes(5));
    log.Info($"Webhook triggered memory count {cachedCount}");
    return ...
}
Here we try to find the count in the cache, increment it, and save it with a five minute expiry. If we copy this same code to two functions within the same Azure Function App, then sure enough they each can see the count set by the other one.
Again, this cache will lose its contents every time you edit your code, but it’s nice to know you can share in-memory data between two functions running on the same server.
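The get-then-set sequence above can lose counts when two calls overlap. One way around that, sketched here under my own assumptions rather than taken from the experiment, is to cache a small int array once with AddOrGetExisting and increment its single element atomically. The class name CachedCounter is hypothetical. An array is used instead of a custom holder class because each function compiles to its own assembly, so a framework type is the simplest thing both functions can cast to. In a run.csx script this would also need the #r "System.Runtime.Caching" reference.

```csharp
using System;
using System.Runtime.Caching;
using System.Threading;

// Hypothetical helper: an in-memory counter stored in MemoryCache.Default.
public static class CachedCounter
{
    public static int Increment(string key, TimeSpan lifetime)
    {
        var fresh = new int[1];

        // AddOrGetExisting returns the entry already in the cache,
        // or null if 'fresh' has just been added under this key.
        var existing = MemoryCache.Default.AddOrGetExisting(
            key, fresh, DateTimeOffset.Now.Add(lifetime)) as int[];

        var box = existing ?? fresh;

        // Atomic increment on the shared array element.
        return Interlocked.Increment(ref box[0]);
    }
}
```

Note one behavioral difference: here the expiry is fixed when the entry is first created, whereas the original code pushes the expiry out by five minutes on every call.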
3. On Disk Shared Across All Servers
Azure function apps have a %HOME% directory on disk which is actually a network share. If we write something into that folder, then all instances of our function, whatever server they are running on, can access it. Let’s put a text file in there containing the invocation count. Here’s a simple helper method I made to do that:
private static int IncrementInvocationCountFile(string fileName)
{
    var folder = Environment.ExpandEnvironmentVariables(@"%HOME%\data\MyFunctionAppData");
    var fullPath = Path.Combine(folder, fileName);
    Directory.CreateDirectory(folder); // noop if it already exists
    var persistedCount = 0;
    if (File.Exists(fullPath))
    {
        persistedCount = int.Parse(File.ReadAllText(fullPath));
    }
    File.WriteAllText(fullPath, (++persistedCount).ToString());
    return persistedCount;
}
We can call it like this:
public static async Task<object> Run(HttpRequestMessage req, TraceWriter log)
{
    var persistedCount = IncrementInvocationCountFile("invocations.txt");
    log.Info($"Webhook triggered {persistedCount}");
    return ...;
}
Obviously this isn’t thread-safe either, since multiple instances of our function can’t safely read and write the same file at once. The key point is that anything in this folder is visible to all instances of our function, even across different servers (although it was several days before I saw my test function actually run on a different server). And unlike the in-memory counter, this won’t be lost if your function restarts for any reason.
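If you did want concurrent invocations to share the file, one option is to open it exclusively and retry while another caller holds it. The following is a rough sketch, not what the experiment used: the FileCounter class, the retry count, and the 50 ms delay are all my own assumptions, and it relies on the share honoring exclusive opens (FileShare.None) across servers.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;

// Hypothetical helper: increments a count stored in a text file,
// holding an exclusive handle for the whole read-modify-write.
public static class FileCounter
{
    public static int Increment(string fullPath, int maxAttempts = 20)
    {
        var folder = Path.GetDirectoryName(fullPath);
        if (!string.IsNullOrEmpty(folder)) Directory.CreateDirectory(folder);

        for (var attempt = 1; ; attempt++)
        {
            try
            {
                using (var stream = new FileStream(fullPath, FileMode.OpenOrCreate,
                                                   FileAccess.ReadWrite, FileShare.None))
                {
                    string text;
                    using (var reader = new StreamReader(stream, Encoding.UTF8, false, 1024, true))
                    {
                        text = reader.ReadToEnd();
                    }

                    int count;
                    int.TryParse(text, out count); // stays 0 for a new or empty file
                    count++;

                    // Overwrite the old contents with the new count.
                    var bytes = Encoding.ASCII.GetBytes(count.ToString());
                    stream.Position = 0;
                    stream.SetLength(0);
                    stream.Write(bytes, 0, bytes.Length);
                    return count;
                }
            }
            catch (IOException) when (attempt < maxAttempts)
            {
                Thread.Sleep(50); // another caller has the file open; wait and retry
            }
        }
    }
}
```

Every call now pays for a round trip to the network share plus possible retries, which is part of why a database or a cache service is usually the better home for shared counters.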
4. Per Machine File
What if you want to use disk storage for temporary caching, but only per machine? Well, each server does have a local disk, and you can write data there by writing to the %TEMP% folder. This would give you temporary storage that persisted on the same server between invocations of functions in the same function app. But unlike things you put in %HOME%, which the Azure Functions framework won’t delete, things you put in %TEMP% should be thought of as transient. The temp folder would probably best be used for storing data needed during a single function invocation.
For my experiment I decided to use System.Environment.MachineName as part of the filename, so each server would maintain its own invocation count file in the %HOME% folder.
public static async Task<object> Run(HttpRequestMessage req, TraceWriter log)
{
    var machineCount = IncrementInvocationCountFile(System.Environment.MachineName + ".txt");
    log.Info($"Webhook triggered {machineCount}");
    return ...;
}
And so now I can use Kudu to look in my data folder and see how many different machines my function has run on.
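As an alternative to browsing the folder by hand, the per-machine files could be summarized in code. This is only an illustrative sketch: the MachineCounts class is hypothetical, and it assumes the folder contains one .txt file per machine plus the shared invocations.txt file from the previous section, which it skips.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: reads every per-machine count file in a folder.
public static class MachineCounts
{
    public static Dictionary<string, int> Read(string folder,
                                               string sharedFileName = "invocations.txt")
    {
        var counts = new Dictionary<string, int>();
        if (!Directory.Exists(folder)) return counts;

        foreach (var path in Directory.GetFiles(folder, "*.txt"))
        {
            // Skip the counter that is shared across all servers.
            if (string.Equals(Path.GetFileName(path), sharedFileName,
                              StringComparison.OrdinalIgnoreCase))
            {
                continue;
            }

            int count;
            if (int.TryParse(File.ReadAllText(path), out count))
            {
                // The file name without its extension is the machine name.
                counts[Path.GetFileNameWithoutExtension(path)] = count;
            }
        }
        return counts;
    }
}
```

The number of entries in the returned dictionary is the number of distinct machines that have run the function so far.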
Should I do this?
So you can use disk or memory to share state between Azure Functions. But does that mean you should?
Well, first of all you must consider thread safety. Multiple instances of your function could be running at the same time, so if you used the techniques above you’d get file access exceptions, and you’d need to protect the static int variable from concurrent access. MemoryCache itself is thread-safe, but the get-then-set sequence in the example is not atomic, so overlapping calls can still lose counts.
And secondly, be aware of the limitations. Anything stored in memory can be lost at any time. So only use it for temporarily caching things to improve performance. By contrast anything stored in the %HOME% folder will persist across invocations, restarts and different servers. But it’s a network share. You’re not really storing it “locally”, so it’s not all that different from just putting the data you want to share in blob storage or a database.
Comments
Kelly Leahy: One small nitpick. The MemoryCache example is NOT thread safe from a semantic standpoint. It won't crash / run into issues, but it does suffer from being incorrect if two calls hit it at the same time (the get/put pattern isn't thread safe without some extra work). Of course, if you're using it for caching static data this doesn't matter.

Mark Heath: Yeah, this post was really just me writing up my findings about how the Azure Functions runtime can reuse the same instance between function calls. I wouldn't recommend using any of this stuff in production functions, but it's maybe useful for rapid prototyping / proof of concept work.

Jean-Philippe Encausse: Hi, 2 months after your experiments, what would you do to handle a little cache between two function calls? I have something in mind: I'd love to run a Node-RED flow on an Azure Function. It's only a JSON object moving from one piece of code to another. BUT sometimes one piece of code needs to cache or initialize an object with credentials (you don't want to do it for each function call).

Mark Heath: Using something like Redis is still probably the best way to cache data between functions - that way the cached value is shared between all instances, and doesn't rely on the Azure Functions runtime reusing the same machine for all function calls. The techniques I discussed in this article are only useful in a few very limited situations.

Nuri Yilmaz: Hi, I'm wondering about "your function running on the same server". How can you be sure it's the same server when we use Functions as an Azure service? Could you please explain what "same server" means? Thanks.

Mark Heath: On the consumption plan you have no control over which server runs your function. But once an instance has been created, it will be re-used for a while to avoid cold starts. I was using System.Environment.MachineName in C# to find out what server I was running on.

Adrian Campos: For anyone running into issues with Azure Functions V2 on .NET Core, you need to use System.Runtime.Caching and add the following to your .csproj file. If you do not, you will get an error mentioning the dll is not valid on the platform.

<Target Name="PostBuild" AfterTargets="PostBuildEvent">
    <Exec Command="copy $(OutDir)$(ProjectName).deps.json $(OutDir)bin\function.deps.json" />
</Target>
<Target Name="PostPublish" AfterTargets="AfterPublish">
    <Exec Command="copy $(PublishDir)$(ProjectName).deps.json $(PublishDir)bin\function.deps.json" />
</Target>

Venkatesh Ramasamy: Which option is being used on enterprise-level Azure Functions to manage state between multiple instances?

Venkatesh Ramasamy: // Not recommended for production functions? // Then what is the recommended solution for production?

Mark Heath: Not sure exactly what you mean, but this post was really just about exploring a bit about what's happening behind the scenes. In production, if you need to share state between Azure Functions, you should use techniques like a Redis memory cache, or a database.

Venkatesh Ramasamy: Thanks Mark. Hope Redis Cache is thread-safe.

Ian Chivers: I've read that the speed of access to the %HOME% share is quite slow. I was considering using this to store and access some large files. Do you have any experience of using large files with it?