How to use the Microsoft Graph SDK Chaos Handler to Simulate Graph API Errors

So you’ve developed your application using the Microsoft Graph SDK, you are making calls to the Graph API and things are looking great. Being an awesome developer, you want to make sure that your application can handle any errors that the Graph API might throw at you.

The first challenge is trying to figure out what errors could be returned.

The Microsoft Graph API documentation recommends a minimum set of errors that your code should be able to handle.

I’d also recommend paying special attention to 504 (Gateway timeout) errors and consider retrying these calls similar to 503.
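As a rough sketch of that advice, the helper below treats 503 and 504 (plus 429 throttling responses) as worth retrying and everything else as a failure to surface to the caller. The class and method names are made up for illustration, and the exact set of codes is my assumption, so adjust it for your application.

```csharp
using System.Net;

// Hypothetical helper (not part of the Graph SDK): decides whether a failed call is worth retrying.
static class GraphErrorPolicy
{
    public static bool IsTransient(HttpStatusCode statusCode)
    {
        switch ((int)statusCode)
        {
            case 429: // Too Many Requests (throttling)
            case 503: // Service Unavailable
            case 504: // Gateway Timeout
                return true;
            default:
                return false; // Anything else is treated as a real failure
        }
    }
}
```

Keeping the decision in one place means the rest of your code only has to ask "should I try again?" rather than repeating status code checks everywhere.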

The second challenge is to figure out what the shape (or structure) of different error responses looks like so that you can write code in reaction to the different errors happening.

So what exactly do the error responses look like? Well that’s a tough one, because you can’t be 100% sure unless you see the Graph API actually respond to you with an error. Some errors you can force; for example, a 401 Unauthorized error can be forced by making a call with an invalid or missing Authorization header. 5xx errors are really hard to reproduce because they are internal issues within the Graph service, but believe me, they do happen. The most common one I see is a 504 (Gateway Timeout), returned from the Graph API when it’s having trouble getting a response from one of its backend servers.

There is a very simple way to cause a 429 error (which is one of the more complex errors to handle, as you need to interrogate the error response for retry-after values). The mechanism to create a 429 error is simply to append a parameter to the URL of the call being made to the Graph API as documented in this article. This will result in the Graph API itself returning an error response giving you a valid 429 error that you can develop and test against.
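To give a feel for what "interrogating the response for retry-after values" can look like, here is a minimal sketch that reads the standard Retry-After header from an HttpResponseMessage. The RetryAfterReader name and the fallback parameter are my own inventions for this example; only the header-parsing types from System.Net.Http are real.

```csharp
using System;
using System.Net.Http;

// Hypothetical helper: works out how long to wait before retrying a throttled call.
static class RetryAfterReader
{
    public static TimeSpan GetDelay(HttpResponseMessage response, TimeSpan fallback)
    {
        var retryAfter = response.Headers.RetryAfter;

        // Retry-After given as a number of seconds
        if (retryAfter?.Delta is TimeSpan delta)
        {
            return delta;
        }

        // Retry-After given as an HTTP date
        if (retryAfter?.Date is DateTimeOffset date)
        {
            var wait = date - DateTimeOffset.UtcNow;
            return wait > TimeSpan.Zero ? wait : TimeSpan.Zero;
        }

        // No usable header, so use the caller's default
        return fallback;
    }
}
```

You would typically pass the result to Task.Delay before making the call again.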

Unfortunately there are still lots of error cases that are difficult to force or simulate (it would be great if we could use that 429 URL parameter approach for all errors). That brings us to the Chaos Handler that the Microsoft Graph team has provided in the Graph SDKs (the Chaos Handler is in both the JavaScript SDK and the .NET Core SDK).

The concept behind the Chaos Handler is to intercept outbound requests being made from the SDK and rather than actually make those calls to the Graph, the SDK creates an error object and returns it to your code instead. The Chaos Handler isn’t designed to block all calls, rather it is configurable to randomly block a percentage of calls. The idea is that if you’ve got the Chaos Handler causing random failures (hence the name Chaos) it allows you to write code that caters for those errors and you are more likely to pick up those error scenarios during development.
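To make the idea concrete, here is a stripped-down sketch of the same concept. This is not the SDK’s ChaosHandler, just my own illustration: the SimpleChaosHandler name is hypothetical, and it only ever fakes a 504, which is a simplification I’ve assumed for brevity.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Illustration only: NOT the SDK's ChaosHandler, just the same idea in miniature.
class SimpleChaosHandler : DelegatingHandler
{
    private readonly int _failurePercent;
    private readonly Random _random = new Random();

    public SimpleChaosHandler(int failurePercent)
    {
        _failurePercent = failurePercent;
    }

    protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        int roll;
        lock (_random) // Random is not thread-safe, so guard it
        {
            roll = _random.Next(100); // 0 to 99
        }

        if (roll < _failurePercent)
        {
            // Short-circuit: build a fake error response without calling the network
            var fake = new HttpResponseMessage(HttpStatusCode.GatewayTimeout)
            {
                RequestMessage = request
            };
            return Task.FromResult(fake);
        }

        // Otherwise pass the request on to the next handler in the pipeline
        return base.SendAsync(request, cancellationToken);
    }
}
```

The key point is that the failing branch never calls base.SendAsync, so no request leaves your machine for those calls.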

So let’s go through how you get the Chaos Handler into your project and take a look at the mechanics involved. For this walkthrough I’m going to use C#/.NET Core (note that this is possible using JavaScript/TypeScript as well).

Creating the GraphServiceClient

Your project will need to use the Microsoft.Graph NuGet package (which is the .NET Core Graph SDK) and you may also need an additional package for the Authentication Provider (depending on your project you may authenticate in different ways e.g. SPA vs Web API vs Desktop Application). In my example I’m using a ClientSecretCredential from the Azure.Identity package.

Irrespective of the authentication provider you go with, your code should end up creating a GraphServiceClient. This is the central Graph SDK Client which you make all calls to the Graph through.

using Azure.Identity;
using Microsoft.Graph;

const string tenantId = "<YOUR_TENANT_ID>";
const string clientId = "<YOUR_CLIENT_ID>";
const string clientSecret = "<YOUR_CLIENT_SECRET>"; // Don't put a secret in plain text, load it in dynamically

// App-only (client credentials) calls request the .default scope
var scopes = new[] { "https://graph.microsoft.com/.default" };

var tokenCredential = new ClientSecretCredential(tenantId, clientId, clientSecret);
var graphClient = new GraphServiceClient(tokenCredential, scopes);

Adding the Chaos Handler Middleware

The Graph SDK has the concept of ‘Middleware’. Think of this as an extensible pipeline that every call made by the SDK goes through. On the way out, each piece of middleware has the opportunity to inspect and modify the request before passing it on to the next piece of middleware. On the way back the reverse happens, and each piece of middleware can inspect and modify the response. While a request is passing through the pipeline, a piece of middleware may choose not to pass it on to the next piece of middleware. This is what the Chaos Handler does to prevent the call actually being made.
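If the out-and-back flow is hard to picture, this small sketch chains two handlers by hand and records the order in which they see the request and the response. Everything here (TraceHandler, StubTransport, PipelineDemo) is hypothetical scaffolding I’ve made up for the example, and the stub stands in for the network so nothing is actually sent.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical middleware that records when it sees the request and the response.
class TraceHandler : DelegatingHandler
{
    private readonly string _name;
    private readonly List<string> _log;

    public TraceHandler(string name, List<string> log)
    {
        _name = name;
        _log = log;
    }

    protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        _log.Add($"{_name} out");
        var response = await base.SendAsync(request, cancellationToken);
        _log.Add($"{_name} back");
        return response;
    }
}

// Stand-in for the network so the sketch runs offline: always answers 200 OK.
class StubTransport : HttpMessageHandler
{
    protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        return Task.FromResult(new HttpResponseMessage(HttpStatusCode.OK));
    }
}

static class PipelineDemo
{
    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();

        // Chain by hand: first -> second -> transport
        var second = new TraceHandler("second", log) { InnerHandler = new StubTransport() };
        var first = new TraceHandler("first", log) { InnerHandler = second };

        using (var client = new HttpClient(first))
        {
            await client.GetAsync("https://example.invalid/");
        }

        return log; // first out, second out, second back, first back
    }
}
```

The handler added last sits closest to the network, which is why the order you add middleware in matters.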

Almost invisible to you as a developer, when you create a new GraphServiceClient there is a default middleware pipeline that gets provisioned and provides some pretty valuable functionality. This is the default middleware pipeline:

  • Authentication Handler
  • Retry Handler
  • Redirect Handler
  • Telemetry Handler
  • HTTP Message Handler

When adding a middleware (whether it be the Chaos Handler or your own custom middleware handler) you need to be aware that a default middleware pipeline exists and that the order of middleware in the pipeline can be important since any modification made by one becomes the input to the next.

To customize the middleware pipeline we have to change the way we create the GraphServiceClient. We build an HttpClient from the customized pipeline and pass that into the GraphServiceClient constructor.

var tokenCredential = new ClientSecretCredential(tenantId, clientId, clientSecret);

// Use the static GraphClientFactory to get the default pipeline
var handlers = GraphClientFactory.CreateDefaultHandlers(new TokenCredentialAuthProvider(tokenCredential));

// Add a custom middleware handler (the Chaos Handler) to the pipeline
handlers.Add(new ChaosHandler(new ChaosHandlerOption()
{
    ChaosPercentLevel = 50
}));

// Now we have an extra step of creating an HttpClient, passing in the customized pipeline
var httpClient = GraphClientFactory.Create(handlers);

// Then we construct the GraphServiceClient using the HttpClient
var graphClient = new GraphServiceClient(httpClient);

Testing out the Chaos Handler

In the code snippet above we added the Chaos Handler and set the ChaosPercentLevel to 50. This means that 50% of the calls should now result in an error. Let’s add a little bit more code to call the Graph and bring back some mail messages so we have a call for the Chaos Handler to fail.

// Get some mail messages
var messages = await graphClient.Users["cdwyer@opsdev.work"].Messages.Request()
   .Top(100)
   .Select(m => m.Subject)
   .GetAsync();

It doesn’t matter how many times I run the above code, the call always seems to succeed. Did we do something wrong in setting up the Chaos Handler? No, there’s a fight going on in the middleware pipeline and unfortunately we don’t have a ticket to see the show. Take another look at the default middleware pipeline and you’ll notice a handler called the Retry Handler. The Retry Handler is pretty handy: it’s on the lookout for failed calls and automatically retries them a number of times. Since the Retry Handler is in the middleware pipeline, we don’t see the error causing a problem in the code we wrote. Instead the failed call is retried and the successful response is passed back to our code.
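Here is a tiny sketch of why the caller never sees the failure. Neither class is from the SDK: FailOnceTransport is a made-up stand-in for "Chaos Handler plus network" that fails the first call only, and SimpleRetryHandler is a deliberately simplified retry middleware that I’ve assumed retries 504 responses and nothing else.

```csharp
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for "Chaos Handler + network": fails the first call, then succeeds.
class FailOnceTransport : HttpMessageHandler
{
    public int Calls { get; private set; }

    protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        Calls++;
        var status = Calls == 1 ? HttpStatusCode.GatewayTimeout : HttpStatusCode.OK;
        return Task.FromResult(new HttpResponseMessage(status));
    }
}

// Simplified retry middleware (not the SDK's RetryHandler): retries 504 responses up to maxRetries times.
class SimpleRetryHandler : DelegatingHandler
{
    private readonly int _maxRetries;

    public SimpleRetryHandler(int maxRetries, HttpMessageHandler inner) : base(inner)
    {
        _maxRetries = maxRetries;
    }

    protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var response = await base.SendAsync(request, cancellationToken);

        for (var attempt = 0; attempt < _maxRetries && response.StatusCode == HttpStatusCode.GatewayTimeout; attempt++)
        {
            // The failed response is thrown away here, so the caller never sees it.
            // (A production handler would clone the request before resending it.)
            response.Dispose();
            response = await base.SendAsync(request, cancellationToken);
        }

        return response;
    }
}
```

Send one request through SimpleRetryHandler wrapped around FailOnceTransport and the caller gets a 200, even though the transport was hit twice and the first attempt failed.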

Adding a Logging Handler

OK, this article just went from adding the Chaos Handler to creating our own custom Logging Handler, because I want to see what is happening in the pipeline between the Retry Handler and the Chaos Handler.

For this we are going to need a new class that inherits from System.Net.Http.DelegatingHandler (that’s what makes it a piece of middleware). In the code for the handler I’m simply going to write to the console the URI of each request and the response code.

using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

class ConsoleLoggingHandler : DelegatingHandler
{
    protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage httpRequest, CancellationToken cancellationToken)
    {
        try
        {
            // Log the outgoing request, pass it down the pipeline, then log the result
            Console.WriteLine($"Graph Request > {httpRequest.RequestUri.PathAndQuery}");
            var response = await base.SendAsync(httpRequest, cancellationToken);
            Console.WriteLine($"Graph Response > {response.StatusCode}");
            return response;
        }
        catch (Exception ex)
        {
            // If SendAsync threw there is no response to inspect, so log and rethrow
            Console.WriteLine($"Something went wrong: {ex.Message}");
            throw;
        }
    }
}

Next we need to add our custom Logging Handler to the pipeline (just as we added the Chaos Handler). Order is important here: we need to make sure our Logging Handler sits between the Retry Handler and the Chaos Handler, so we simply add the Logging Handler before adding the Chaos Handler.

var handlers = GraphClientFactory.CreateDefaultHandlers(new TokenCredentialAuthProvider(tokenCredential));

// Add logging handler in between the Retry Handler and the Chaos Handler
handlers.Add(new ConsoleLoggingHandler());

// Add the Chaos Handler
handlers.Add(new ChaosHandler(new ChaosHandlerOption()
{
    ChaosPercentLevel = 50
}));

var httpClient = GraphClientFactory.Create(handlers);
var graphClient = new GraphServiceClient(httpClient);

Running the code again, we now have visibility of the fight that is going on. In the console output we see the initial request encounter a GatewayTimeout error (caused by the Chaos Handler), then a second call is made (by the Retry Handler) that the Chaos Handler lets through, and thus the call succeeds.

Removing the Retry Handler

This fight in the pipeline is pretty pointless and annoying. The Chaos Handler is just testing the Retry Handler, and I’m getting no benefit as my code doesn’t receive any of the errors that the Chaos Handler is causing. We can fix that by temporarily removing the Retry Handler.

var handlers = GraphClientFactory.CreateDefaultHandlers(new TokenCredentialAuthProvider(tokenCredential));

// Remove the default Retry Handler (FirstOrDefault needs "using System.Linq;")
var retryHandler = handlers.FirstOrDefault(h => h is RetryHandler);
handlers.Remove(retryHandler);

// Add the Chaos Handler
handlers.Add(new ChaosHandler(new ChaosHandlerOption()
{
    ChaosPercentLevel = 50
}));

var httpClient = GraphClientFactory.Create(handlers);
var graphClient = new GraphServiceClient(httpClient);

Now we have Chaos!

With the Retry Handler out of the way we now see the errors happening back in our code and we can begin to ensure we handle these error scenarios appropriately.
