
What is a Webhook?

Over recent years, more and more services have started offering the ability to configure “webhooks” that notify you when something interesting has happened. If you want to use a webhook, you need to provide a “callback” URI, and the service offering the webhook will make an HTTP request to that URI whenever the event of interest occurs. This allows you to write your own code that responds to that event when it happens. All you need is the ability to listen for HTTP requests to the callback URI.

image
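
To make “listening for HTTP requests to the callback URI” concrete, here is a minimal sketch of a receiver using only Python’s standard library. The path /webhook, the port and the in-memory list are my own assumptions for illustration; a real receiver would sit behind HTTPS and verify the sender, as discussed later.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

CALLBACK_PATH = "/webhook"  # hypothetical path; use whatever URI you registered
received_events = []        # in-memory store, for illustration only


def handle_webhook(path, body):
    """Return the HTTP status for one incoming request, recording valid events."""
    if path != CALLBACK_PATH:
        return 404
    try:
        event = json.loads(body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return 400
    received_events.append(event)
    return 200


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the sender declared.
        length = int(self.headers.get("Content-Length", 0))
        status = handle_webhook(self.path, self.rfile.read(length))
        self.send_response(status)
        self.end_headers()


def make_server(port=8000):
    """Build the listener; call serve_forever() on the result to start it."""
    return HTTPServer(("", port), WebhookHandler)
```

Calling make_server().serve_forever() starts listening. Keeping the decision logic in handle_webhook, separate from the HTTP plumbing, makes it easy to exercise without a network.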

Why Do We Need Webhooks?

The main reason we need webhooks is that we don’t want to have to continuously poll a service to discover what has happened recently. Polling is a waste of resources both for the client and the server. Giving the server the ability to push notifications via webhook callbacks solves this problem.

Webhooks also address the problem of an API that initiates a long-running action (say a media transcoding job). We don’t want to hold open the connection for ages waiting for the transcode to complete. Instead it would be better to simply return saying that the job has been accepted, and allow the caller to specify a webhook callback to be notified of completion.
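
Here is a rough sketch of that pattern from the service’s side: accept the job, return immediately, and call the webhook when the work finishes. The function names, the payload fields (job_id, status) and the use of a background thread are all assumptions of mine; a production service would more likely use a durable job queue.

```python
import json
import threading
import urllib.request
import uuid


def post_json(url, payload):
    """POST a JSON payload to the caller's callback URI."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def submit_job(callback_url, work, notify=post_json):
    """Accept a job, return at once, and report completion via the callback."""
    job_id = str(uuid.uuid4())

    def run():
        try:
            work()  # the long-running part, e.g. a transcode
            status = "completed"
        except Exception:
            status = "failed"
        notify(callback_url, {"job_id": job_id, "status": status})

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    # An HTTP API would send this body with a "202 Accepted" status.
    # The thread is returned only so that callers (and tests) can wait on it.
    return {"job_id": job_id, "status": "accepted"}, thread
```

The notify parameter is there so the callback can be swapped for a fake when testing; by default it makes a real HTTP POST.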

An Example: GitHub

A good example of webhooks in action is GitHub, who let you set up webhooks to subscribe to various events that you might be interested in. So if I want to be notified when a pull request is created, I can tell GitHub where my webhook is hosted, and then whenever a pull request is created on my project, GitHub will make an HTTP request to my callback URI, posting a JSON object that contains information about the pull request.

I can then write my own custom code that does whatever I want with that information, whether it’s sending myself a text, automatically accepting the pull request (don’t do that), or triggering an automated CI build that will validate the quality of the pull request. The point is, I can do whatever I like – because GitHub have provided me with an extensibility point.

Another Example: Online payment providers

Another example is if you are selling things online through a payment provider like Stripe. They typically offer a webhook callback that tells you when someone has made a purchase. You can then use that information to dispatch the product to the buyer. This example highlights the importance of a security mechanism for webhooks – we don’t want anyone to be able to call our webhook and get free stuff shipped to them – we need to be able to verify who the callback is from.

Whose contract is it?

At first glance, webhooks might look like two systems calling each other’s API. So for example if service A wants to ask service B to perform a long-running operation, service A could call service B’s web API, and then some time in the future, service B calls service A’s web API to tell it that it has finished. However, what we have in this system is two-way coupling of the services. Service A needs to know about Service B’s API and how to call it, and Service B needs to know about Service A’s API and how to call it. Here’s a diagram of this situation:

image

Now wait on a minute! Isn’t that diagram identical to the webhook one? Well, the difference with webhooks is very subtle but important. With webhooks, it’s the issuer of a webhook who “owns” the contract. They not only define the incoming API, they also define the payload of the webhook. The point is, service B in this diagram should not need to know anything about who service A is.

For example, in the case of GitHub webhooks, it’s GitHub, not you, who gets to define the shape of the JSON for a pull request received event. They also get to define how the webhook is secured. And they get to choose what HTTP verb is used to call the webhook. All you get to specify is which URI should be called.

Now this might seem very obvious when it’s GitHub that’s providing the webhooks. But if you’re in a situation where you are creating both ends yourself then it’s easy to accidentally create a situation of two-way tight coupling between your services. Avoid doing this. Choose which end is the client and which end is the service, and let the service define both the API and the callbacks.

How are webhooks secured?

Hopefully the payment processor example convinced you that it is important that webhooks are secured. But how should we secure them?

First of all, your webhooks should always use HTTPS. If someone is able to intercept a webhook payload and examine it, not only may it contain sensitive information, but it also gives them what they need to generate their own spoof callbacks.

One simple way to secure a webhook is to include a pre-shared secret somewhere in the HTTP request (say in an authorization header). You can then check that the secret passed matches what you were expecting. Obviously if someone can intercept a message and learn this secret, then they can generate spoof requests to your webhook.
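
As a small sketch of the pre-shared secret check, assuming the issuer sends the raw secret as the value of the Authorization header (the header name and format are whatever the issuer defines, so treat this as illustrative):

```python
import hmac


def is_authorized(headers, expected_secret):
    """Check the secret sent with the request against the one we hold."""
    supplied = headers.get("Authorization", "")
    # compare_digest takes time independent of where the values differ,
    # so the check does not leak how much of the secret was guessed.
    return hmac.compare_digest(
        supplied.encode("utf-8"), expected_secret.encode("utf-8")
    )
```

Using hmac.compare_digest rather than == is a cheap precaution against timing attacks on the comparison.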

A better way is for the webhook HTTP request to include a hash that is calculated using a pre-shared secret and the JSON payload (basically an HMAC). This way the secret is never transmitted, meaning that even if someone did intercept the contents of a webhook, the worst they could do is replay that one webhook, as they would not be able to generate a correct hash if they changed the payload. The downside of this approach is that it introduces more complexity for the recipient of the webhook – they need to be able to calculate what the hash value should be (here’s how you do that for GitHub webhooks).
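
The recipient’s side of that hash check can be sketched in a few lines. I’m assuming HMAC-SHA256 with a hex digest prefixed by sha256=, which mirrors the format GitHub documents for its X-Hub-Signature-256 header; other issuers choose their own algorithm and header, so check their documentation before relying on this.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute the signature the issuer would attach to this exact body."""
    return "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, signature_header) -> bool:
    """Recompute the signature over the raw bytes and compare in constant time."""
    expected = sign(secret, body)
    supplied = signature_header or ""  # a missing header must fail the check
    return hmac.compare_digest(expected.encode("utf-8"), supplied.encode("utf-8"))
```

Note that the hash must be computed over the raw request body exactly as received. Parsing the JSON and re-serialising it first can change whitespace or key order and make a genuine signature fail.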

Another important security step is to sanity check the webhook payload. In the case of a payment provider, have we already seen this order number come in? If so, don’t ship it again. And if the order comes in claiming to be for 10,000 licenses of your software, do you really want to blindly generate 10,000 license keys and email them off? Maybe you should hold fire and await manual verification that this is indeed a genuine order.
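
Those two checks might look something like the sketch below. The field names (order_id, quantity), the threshold of 100 and the in-memory set are assumptions for illustration; in practice the set of seen order numbers would live in a database.

```python
MAX_AUTO_QUANTITY = 100  # hypothetical threshold for automatic fulfilment


def decide(order, seen_order_ids):
    """Decide what to do with an order notification before acting on it."""
    order_id = order["order_id"]
    if order_id in seen_order_ids:
        return "ignore_duplicate"  # already handled: don't ship it again
    quantity = order.get("quantity", 0)
    if not isinstance(quantity, int) or quantity <= 0:
        return "reject"  # malformed or nonsensical quantity
    seen_order_ids.add(order_id)
    if quantity > MAX_AUTO_QUANTITY:
        return "hold_for_review"  # suspiciously large: wait for a human
    return "fulfil"
```

Remembering order numbers also takes the sting out of replayed webhooks, since a replay of a genuine callback is just a duplicate.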

If you are defining the contents of a webhook payload yourself, there are ways to reduce risk by limiting the scope of information shared in the callback. For example, while it is convenient to include lots of relevant information in a webhook payload, you could instead opt to just send a simple notification containing an id and an event type. So for a payment webhook, what if all we got back was { "order id":"123154", "status": "submitted" }? This notification doesn’t contain information about what the order was for or where it should be shipped. It means the recipient needs to respond by making another API call to find out more about that particular order. While this is less convenient for the webhook recipient, it does mean that even if the secret key is compromised, an attacker can’t use it to get goods shipped to an arbitrary address.
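
A sketch of how the recipient might handle such a minimal notification follows. The fetch_order and ship callables, and the address and items fields, are hypothetical stand-ins for an authenticated call to the provider’s API and your own fulfilment code.

```python
def handle_notification(notification, fetch_order, ship):
    """React to a thin notification by looking the order up at the source."""
    order_id = notification["order id"]
    # Hypothetical authenticated call back to the provider's own API.
    order = fetch_order(order_id)
    if order is None or order["status"] != "submitted":
        return False  # unknown order or not ready: nothing gets shipped
    # The address comes from the provider's record, never from the webhook.
    ship(order["address"], order["items"])
    return True
```

The webhook only triggers the lookup. Everything that matters is read from the provider directly, so a forged notification can at worst cause a wasted API call.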

How should a webhook respond?

If you are the recipient of a webhook, you should validate its contents, and then return as quickly as possible. The caller of a webhook should not need to wait for you to perform any actions you need to do on your end. You just need to acknowledge that you received it with a 200 OK. If you have some potentially long-running tasks of your own to initiate in response to a webhook, then it would be best to post a message onto a queue and handle it asynchronously.
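
Here is a small sketch of the “acknowledge, then process later” idea, using an in-process queue and a worker thread. The event shape (a dict with an id) is an assumption, and in a real system you would probably want a durable queue so that messages survive a restart.

```python
import queue
import threading

work_queue = queue.Queue()


def receive_webhook(event):
    """Validate cheaply, enqueue, and return the HTTP status straight away."""
    if not isinstance(event, dict) or "id" not in event:
        return 400
    work_queue.put(event)
    return 200  # acknowledged; the slow work happens elsewhere


def worker(process):
    """Drain the queue in the background; None is a stop signal."""
    while True:
        event = work_queue.get()
        try:
            if event is None:
                return
            process(event)
        finally:
            work_queue.task_done()
```

You would start the worker once with threading.Thread(target=worker, args=(my_handler,), daemon=True).start(), where my_handler is your own slow processing function.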

Also, you should not rely on the issuer of the webhook to retry if something goes wrong. I would expect most services offering webhooks to make a reasonable effort to resend the webhook if they don’t get a response first time, but they can’t be expected to retry indefinitely.

Note that occasionally, webhooks are used as extensibility points, and actually do expect a response body from the webhook recipient. For example Slack custom slash commands work this way. When a Slack user enters your custom slash command, Slack will call your webhook, but then waits for you to respond with whatever data should be displayed to the user of Slack. Slack give you up to 3 seconds to generate your response.

What if I can’t host a webhook?

Probably the main reason that webhooks have only recently become widespread is that they require you to have a server to hand that you can use to listen for them. And running a server costs money – even a cheap VM costs around $20 a month, which might feel like a lot to pay if your webhook may only be called a few times in that month or if this is just for a quick experiment that might not go anywhere. And trying to share a server that’s being used for another purpose is also not ideal.

But this is where Function as a Service (FaaS) offerings like AWS Lambda, Auth0 WebTasks or Azure Functions shine. They make it super easy for you to expose your webhook endpoint on the internet, and pay only when it’s called. That means if your webhook isn’t called in a month, you’ll pay nothing. You’ll also get HTTPS out of the box, and they may also do some of the hard work of validating the hash for webhooks from well-known services like GitHub. So the effort of setting up a server and the cost of running it are completely taken out of the equation.

These FaaS offerings also solve another reason people avoid webhooks – what if I’m behind a firewall? If your software is running on-premises in a business and can’t receive inbound traffic, then it may seem like accepting webhooks is out of the question. But if you use, say, an Azure Function running in the cloud to receive the webhook, then it could use another technique to pass on the callback, such as posting a message into a queue that the system inside the firewall is able to access.

Should my service offer webhooks?

Does your service let users initiate long-running operations? Are there events in your system that the users of your service might like to be notified about in real time? If so, offering webhooks will make your system much easier for consumers to work with.

However, my recommendation is to make webhooks optional in your design. They should not provide any information that could not also be retrieved through polling. That way, users who don’t want to (or can’t) host a webhook will still be able to use your service.

Also, if you’re going to offer webhooks, think carefully about security. Make sure you follow the best practices, and consider what the implications would be if a fake webhook was received by the system.

The good news is that the rise of FaaS offerings means it’s going to be much easier going forwards for consumers of your service to set up endpoints to listen for your webhooks.

Want to learn more about how easy it is to get up and running with Azure Functions? Be sure to check out my Pluralsight course Azure Functions Fundamentals.