Awaiting Multiple External Events with Durable Functions

Waiting for external events

Azure Durable Functions makes it really easy to wait for an event from an external system with the DurableOrchestrationContext.WaitForExternalEvent method. A common use case is when you are waiting for manual approval, but it is also very useful for calling any external system that has its own bespoke way of reporting completion (e.g. a webhook). That message can then be passed onto the Durable Functions orchestration with DurableOrchestrationClient.RaiseEventAsync.

It's also possible to time out waiting for external events, which is especially important when waiting for human interaction where you might never get a response, but it's also very useful for integrating with slow or misconfigured third party systems, where a response may not come back quickly enough.

I've blogged before about how you can wait for external events with a timeout, and in fact the technique I show in that article has now been baked into the framework so the WaitForExternalEvent method now offers additional overloads that take a timeout which greatly simplifies your code.

Awaiting multiple external events

In this post I want to consider a slightly more complex scenario. Let's suppose that we want to wait for approval from at least three people before we can proceed with a workflow, but there are five people who are able to provide approval. And we'd also like to time out if we don't get the required number of approvals within a certain timeframe so we can take a mitigating action.

The basic approach we are going to use is to create a single timeout task with DurableOrchestrationContext.CreateTimer, and then use WaitForExternalEvent to receive the approval events. Now, it would be possible to create a bunch of WaitForExternalEvent tasks at the same time, one for each required approval, and so when they all complete, we've got the required number of approvals. However, I decided to take a slightly different approach which would allow for the scenario were an single approver accidentally provided more than one approval response.

So I have a loop in which I use Task.WhenAny to see what finishes first - the timeout task, or the WaitForExternalEvent task. If we receive an event, we update a HashSet of all the people who have approved so far, and if the number of approvers reaches the threshold then we can proceed. But if the timeout task wins, or one of the approvers rejects the message, then we exit the loop. If we receive an approval but haven't yet reached the threshold, then we simply loop back round and start another WaitForExternalEvent task.

Here's the code for my orchestrator function.

public static async Task<string> GetApprovalOrchestrator([OrchestrationTrigger]
            DurableOrchestrationContextBase ctx, ILogger log)
{
    var approvalConfig = ctx.GetInput<ApprovalConfig>();
    string result;
    var expireAt = ctx.CurrentUtcDateTime.AddMinutes(approvalConfig.TimeoutMinutes);
    for(var n = 0; n < approvalConfig.ApproverCount; n++)
    {
        // todo: send a message to each approver
        if (!ctx.IsReplaying) log.LogInformation($"Requesting approval from Approver {n + 1}");
    }

    var cts = new CancellationTokenSource();
    var timerTask = ctx.CreateTimer(expireAt, cts.Token);

    var approvers = new HashSet<string>();
    while(true) // slightly dangerous - we could count iterations and abort if we go round a very high number of times
    {
        var externalEventTask = ctx.WaitForExternalEvent<ApprovalResult>(ApprovalResultEventName);
        var completed = await Task.WhenAny(timerTask,externalEventTask);
        if (completed == timerTask)
        {
            result = $"Timed out with {approvers.Count} approvals so far";
            if (!ctx.IsReplaying) log.LogWarning(result);
            break; // end orchestration - we timed out
        }
        else if (completed == externalEventTask)
        {
            var approver = externalEventTask.Result.Approver;
            if (externalEventTask.Result.Approved)
            {
                approvers.Add(approver);
                if (!ctx.IsReplaying) log.LogInformation($"Approval received from {approver}");
                if (approvers.Count >= approvalConfig.RequiredApprovals)
                {
                    result = $"Approved ({approvers.Count} approvals received)";
                    if (!ctx.IsReplaying) log.LogInformation(result);
                    break;
                }
            }
            else
            {
                result = $"Rejected by {approver}";
                if (!ctx.IsReplaying) log.LogWarning(result);
                break;
            }
        }
        else
        {
            throw new InvalidOperationException("Unexpected result from Task.WhenAny");
        }
    }
    cts.Cancel();
    return result;
}

Is it safe?

There are two potential issues with the orchestrator I showed above.

First, you'll notice that I have a while(true) in my orchestrator, which is potentially dangerous, as it could allow the event sourcing history Durable Functions uses to grow very large. But that's highly unlikely to happen in this particular scenario as it's only possible if the same approver kept submitting endless approvals - which we could easily protect against in other ways. In my demo app, my approvers use a HTTP triggered function to send their approval response to the workflow, so I could block repeat approvals at that level if I wanted to before they reach the orchestrator.

Here's the function I use to pass on the approval to the workflow:

[FunctionName("SubmitApproval")]
public static async Task<IActionResult> SubmitApproval(
    [HttpTrigger(AuthorizationLevel.Function, "post", Route = "SubmitApproval/{id}")] HttpRequest req,
    [OrchestrationClient] DurableOrchestrationClientBase client, string id, ILogger log)
{
    log.LogInformation("Passing on an approval result.");

    
    string requestBody = await new StreamReader(req.Body).ReadToEndAsync();
    var approvalResult = JsonConvert.DeserializeObject<ApprovalResult>(requestBody);
    if (string.IsNullOrEmpty(approvalResult.Approver))
        return new BadRequestObjectResult("Invalid Approval Result");
    if (string.IsNullOrEmpty(id))
        return new BadRequestObjectResult("Invalid Orchestration id");

    await client.RaiseEventAsync(id, ApprovalResultEventName, approvalResult);

    var status = await client.GetStatusAsync(id, false, false);
    return new OkObjectResult(status);
}

The second issue is that Durable Functions used to have some race condition issues where external events could be dropped in some scenarios, making code like this risky. But the recent v1.8.0 release of Durable Functions has resolved these outstanding issues, giving us confidence that the external events sent to our orchestration will all be received safely by our orchestrator function.

Try it out

I've uploaded my sample application to GitHub so feel free to check that out. You can easily configure how many approvers are asked for approval, and how many actual approvals are required before proceeding, as well as being able to configure the timeout. The readme provides PowerShell instructions for testing the workflow.

Summary

Durable Functions not only makes implementing a "wait for external event" pattern with timeout really straightforward to achieve, but is flexible enough to allow us to wait in parallel for multiple events to be received before proceeding. The demo app I created shows one way of achieving this.

Comments

April 1. 2019 07:51

Nice article :). I guess the while loop could be avoided somehow using eternal orchestration.

Alexandre

April 2. 2019 10:44

Yes, possibly - although I'd guess you;d also want to add in a sub-orchestrator in that case. Mixing waitforexternalevent with continueasnew was a scenario you couldn't safely use due to the previous race condition bugs (https://github.com/Azure/az... but that should be safe now

Mark Heath

August 17. 2019 23:16

Thanks for your course and this article.
{EDIT:}
Never mind, I believe I see that I would handle this using the {eventName} parameter.
------------------------------------------------------------
In a slightly more complicated case, I have a multi-step invitation/approval/confirmation process that has to occur in sequence. Each step is slightly different, so a single sendEventPostUri call wouldn't work (or would be pretty complicated).
Am I better off handling this with completely separate orchestrations for each step, or would I use sub-orchestrations? Since each step will be triggered by a different endpoint, perhaps separate orchestrations would be cleaner.
Thanks, again.

Bill Noel

August 19. 2019 10:57

yes, event names can be useful in this scenario