Category Archives: Web Development

Azure App Service: My Experience with Auto-Scale In Failing

This is the continuation of my experience with testing the auto-scaling capabilities of the Azure App Service. The first post dealt with scaling out as load increases, and this post deals with scaling back in when load decreases.

On the Scale Out blade of the App Service Plan you can see the current number of instances it has scaled to, along with the Run History of scaling events (such as when each new instance was triggered and when it came online). Here we can see that under load my App Service Plan had scaled out to 6 instances, which was the maximum number of instances I’d configured for it to scale out to.

At this stage I removed the heavy load testing to watch it scale back down. In fact I cut all calls to the service so it was experiencing zero traffic. I was starting to get worried after 20 minutes, as I still had 6 instances and no sign of it attempting to scale back in. Then I realised I had to also set up scale-in rules – it won’t scale in by itself!

On the same page where I configured the rules for when to scale out, I also needed to configure rules to trigger scaling back in. I configured a simple rule: when CPU usage dropped below 70%, it was time to scale in by one instance. I set the cool-down period to 2 minutes so I didn’t have to wait around forever to see it scale in.
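Under the covers the portal is just building an autoscale setting resource. Here’s a minimal sketch of that scale-in rule in the shape of the ARM autoscaleSettings schema (the resource URI is a placeholder, and treat the exact metric name as an assumption to verify against your own plan):

```ts
// Sketch of the scale-in rule (scale in by 1 when average CPU < 70%,
// 2 minute cool-down). The shape mirrors the ARM autoscaleSettings schema.
const scaleInRule = {
  metricTrigger: {
    metricName: "CpuPercentage", // CPU metric for an App Service Plan (assumed name)
    metricResourceUri:
      "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Web/serverfarms/<plan>",
    timeGrain: "PT1M",          // granularity of the metric samples
    statistic: "Average",       // how samples are combined across instances
    timeWindow: "PT5M",         // look-back window evaluated by the trigger
    timeAggregation: "Average", // how the window is aggregated
    operator: "LessThan",
    threshold: 70,              // fire when average CPU drops below 70%
  },
  scaleAction: {
    direction: "Decrease",      // scale in
    type: "ChangeCount",
    value: "1",                 // remove one instance at a time
    cooldown: "PT2M",           // wait 2 minutes between scale-in actions
  },
};
```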

After saving those configuration changes, I expected to see the number of instances scale in and decrease by 1 every 2 minutes until it was back to a single instance.

Suddenly an email notification popped up – it had started happening!

I received the notifications and watched the Azure Portal over the next few minutes as it scaled back from 6 to 5 to 4 to 3 instances. Then it stopped scaling in. I waited over half an hour, scratching my head as to why it was failing to scale in. The CPU usage graph showed it was well under the scale-in threshold of 70%; in fact it had peaked at 16% during that half hour of waiting. Why were these instances stuck running? I was paying for those instances and I didn’t need them.

On re-reading the Microsoft documentation on scaling best practices, it became clear what was happening. You have to consider what Azure is trying to do when it’s scaling in. When Azure looks to scale in, it tries to predict the position it’s going to be in after the scale-in operation, to ensure it’s not placing itself in a position where it would trigger an immediate scale-out operation again. So let’s look at how Azure was handling this scenario.

It had 3 instances running. From my first blog post on scaling out, we had already established that each instance sat at 55% memory usage and nearly 0% CPU usage when idle. The trigger to scale in was when CPU usage was lower than 70%. The average CPU usage was under 15%, so Azure had passed the trigger to scale in. But let’s look at what Azure thinks will happen to memory utilisation if it were to scale in. Each of the instances had 1.75GB memory allocated to it (based on the size of an S1 plan). So in Azure’s eyes my 3 instances, each running at 55% memory usage, required a total of 2.89GB (1.75GB * 0.55 * 3). If we scaled in and were left with 2 instances, then those 2 instances would have to be able to handle the total memory usage of 2.89GB (1.44GB each). Let’s do the maths on what the resulting memory usage on each of our instances would be: (1.44 / 1.75) * 100 = 82%. Remember the scale-out rules I had set? They were set to scale out when memory usage was over 80%. Azure was refusing to scale in because it thought that would result in an immediate scale-out operation. What Azure was failing to take into account was the baseline memory usage: every instance uses 55% memory (or 0.96GB) just doing nothing.
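To make the maths concrete, here’s the prediction Azure appears to be making, using the numbers from this test:

```ts
// Reproducing Azure's apparent scale-in prediction with this test's numbers.
const instanceMemoryGb = 1.75; // memory on an S1 instance
const idleMemoryUsage = 0.55;  // observed 55% memory usage per idle instance
const currentInstances = 3;

// Azure treats the current memory consumption as one fixed total workload...
const totalMemoryGb = instanceMemoryGb * idleMemoryUsage * currentInstances; // 2.89GB

// ...and projects that workload onto the instances left after scaling in.
const remainingInstances = currentInstances - 1;
const projectedPerInstanceGb = totalMemoryGb / remainingInstances;        // 1.44GB
const projectedUsage = (projectedPerInstanceGb / instanceMemoryGb) * 100; // ~82%

console.log(projectedUsage.toFixed(1)); // 82.5 - above the 80% scale-out threshold
// Azure concludes that scaling in would immediately trigger a scale-out, so it
// refuses, even though in reality each instance would just sit at its 55% baseline.
```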

In reality, if Azure did scale in by one instance, the remaining 2 instances would both continue to run at 55% memory usage, and scaling in to 1 instance would again leave that last instance running at 55% memory usage.

Azure auto-scale isn’t a perfect predictor of what’s going to happen, and you will need to pay careful attention to the metrics you scale on. My advice would be to test your scaling configuration by load testing, as I’ve done here, so that you can have confidence that you’ve actually seen what happens under load. As this little test has proven, the behaviour isn’t always obvious or what you’d expect, and a mistake here could lead to some nasty bill shock.

If you are stuck with scenarios where you can’t auto-scale in, or you are concerned about scale-in not working, here are a few options to consider:

  • Configure a scheduled scale-in rule to forcibly bring the instance count back to 1 at a time of day when you expect the least traffic (see the sketch after this list)
  • Configure a periodic alert to notify you if the number of instances is over a certain amount; you could then manually reduce the count back to 1.
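For the first option, here’s a minimal sketch of a recurring autoscale profile that pins the plan back to a single instance overnight (same ARM autoscaleSettings shape as above; the schedule and time zone are illustrative):

```ts
// Recurring profile that forces the plan back to a single instance at 2am.
const overnightResetProfile = {
  name: "Force scale-in overnight",
  capacity: { minimum: "1", maximum: "1", default: "1" }, // pin to 1 instance
  rules: [], // no metric rules while this profile is active
  recurrence: {
    frequency: "Week",
    schedule: {
      timeZone: "AUS Eastern Standard Time",
      days: ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"],
      hours: [2],   // 2am local time
      minutes: [0],
    },
  },
};
```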

Azure App Service: My Experience with Auto-Scale Out

I was recently testing the automatic scaling capabilities of Azure App Service plans. I had a static website and a Web API running off the same Azure App Service plan. It was a Production S1 Plan.

The static website was small (less than 10MB) and the Web API exposed a single method which did some file manipulation on files up to 25MB in size. This had the potential to drive memory usage up, as the files would be held in memory while this took place. I wanted to be sure that under load my service wouldn’t run out of memory and die on me. Could Azure’s ability to auto-scale handle this scenario and spin up new instances of the App Service dynamically when the load got heavy? The upside if this worked is obvious: I wouldn’t have to pay the fixed price of a more expensive plan that provided more memory 100% of the time; instead I’d just pay for the additional instance(s) that get dynamically spun up when I need them, and therefore the extra cost during those periods would be warranted.

As an aside before I get started, it’s worth pointing out that memory usage reporting works totally differently on the Dev/Test Free plan than on Production plans, and I’d guess this has to do with the fact that the Free plan is a shared plan where it really doesn’t have its own dedicated resources.

What I noticed here is that if I ran my static website and Web API on the Dev/Test Free plan then my memory usage sat at 0% when idle. As soon as I changed the plan to a Production S1, memory sat at around 55% when idle.

Enabling scale-out is really simple: it’s just a matter of setting the trigger(s) for when you want to scale out (create additional instances). I was impressed with a few of the other options that gave fine-grained control over the sample period and cool-down periods to ensure scaling would happen in a sensible and measured way.

Before configuring my scale-out rules, I first wanted to check my test rig and measure the load it would put on a single instance, so I knew at what level to set my scaling thresholds. This is how the service behaved with just the one instance (no scaling):

You can see that the test was going to consistently get both the CPU and Memory usage above 80%.

Next I went about configuring the scale-out rules. Here I’ve set it to scale out if the average CPU usage > 80% or the memory usage > 80%. I also set the maximum instances to 6.
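For reference, here’s roughly what those rules look like expressed against the ARM autoscaleSettings schema (a sketch only; the metric names and resource URI are assumptions to check against your own plan):

```ts
// Two scale-out rules (average CPU > 80% OR memory > 80%), with the instance
// count capped at 6 via the profile capacity.
const planUri =
  "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Web/serverfarms/<plan>";

const scaleOutRules = ["CpuPercentage", "MemoryPercentage"].map((metricName) => ({
  metricTrigger: {
    metricName,
    metricResourceUri: planUri,
    timeGrain: "PT1M",
    statistic: "Average",
    timeWindow: "PT5M",
    timeAggregation: "Average",
    operator: "GreaterThan",
    threshold: 80, // either metric exceeding 80% triggers a scale-out
  },
  scaleAction: {
    direction: "Increase",
    type: "ChangeCount",
    value: "1",      // add one instance at a time
    cooldown: "PT5M",
  },
}));

const autoscaleProfile = {
  name: "Scale on load",
  capacity: { minimum: "1", maximum: "6", default: "1" }, // never exceed 6 instances
  rules: scaleOutRules,
};
```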

I also liked the option to get notified and receive an email when scaling creates or removes an instance.

So did it work? Let’s see what happened when I started applying some load to both the static website and the Web API.

Before long I started getting emails notifying me that it was scaling out; each new instance resulted in an email like this:

These graphs show what happened as more and more load was gradually applied. The red box is before scaling was enabled, and the green box shows how the system behaved as more load was applied and the number of instances grew from 1 to 6. While the CPU and Memory usage dropped, notice how the amount of Data Out and Data In during the green period was significantly higher. Not only was the CPU and Memory usage on average across the instances lower, but the service was able to process a much higher volume of requests.

I have to say, I was pretty impressed when I first watched all this happen automatically in front of my eyes. During testing I was also recording the response codes from every call I was making to the static website and Web API. Not a single request failed during the entire test.

But what happened when I stopped applying such a heavy load on the website and web API? Would it scale down just as gracefully? Read on for part two of this test to find out.

What are Microsoft Teams Messaging Extensions and why you should consider building one

It’s no secret that I think Microsoft Teams is an awesome product. I’ve written in the past about how I believe Teams is a great enabler for making people more productive and brings the right tools together in a place that makes a lot of sense.

What are messaging extensions?

Messaging extensions are available when you are creating a chat message (either a new conversation, or when replying to an existing message). They assist you by inserting content into the chat message you are composing.

Microsoft Teams Messaging Extensions bar

Below is the Giphy messaging extension that allows you to search for an animated image and insert it into your message.

Giphy – Search for animated image
Giphy – Insert content into message compose box

Why develop a messaging extension?

One of the resounding benefits of the Microsoft Teams client is being able to extend the out-of-the-box capabilities and integrate with your existing line of business (LOB) applications. Usually the benefit gained by integrating LOB applications into Teams is that it reduces context switching for users: they spend more time actually getting work done, and less time moving between applications, copying/pasting data or links.

Take a CRM application as an example: you are discussing a customer in a conversation in Teams and need to explain where that customer is located, or what their contact details are. The steps you would normally go through would be to open your CRM application, search for the customer, find the relevant details, then cut and paste multiple fields of text, switching back and forth between the customer details in the CRM app and the conversation in Teams.

By extending Teams with messaging extensions, it is possible to provide an integration into your CRM system that would allow a user to search for a customer in Teams while composing a chat message, select the customer, and have the details formatted nicely and inserted into the chat message, all without having to leave Teams or even open the CRM application.
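To give a feel for the moving parts, here’s a minimal sketch of a search-based messaging extension handler built with the Bot Framework SDK for Node (botbuilder). The searchCustomers function and the customer fields are hypothetical stand-ins for your real CRM API:

```ts
import {
  CardFactory,
  MessagingExtensionQuery,
  MessagingExtensionResponse,
  TeamsActivityHandler,
  TurnContext,
} from "botbuilder";

// Hypothetical CRM lookup - stands in for a call to your real LOB system.
async function searchCustomers(
  term: string
): Promise<Array<{ name: string; address: string; phone: string }>> {
  return [
    { name: `Contoso (${term})`, address: "1 Main St, Sydney", phone: "+61 2 5550 1234" },
  ];
}

class CrmMessagingExtension extends TeamsActivityHandler {
  // Called each time the user types in the messaging extension's search box.
  protected async handleTeamsMessagingExtensionQuery(
    context: TurnContext,
    query: MessagingExtensionQuery
  ): Promise<MessagingExtensionResponse> {
    const term = (query.parameters?.[0]?.value as string) ?? "";
    const customers = await searchCustomers(term);

    // Each result becomes a card the user can insert into the compose box.
    const attachments = customers.map((c) =>
      CardFactory.heroCard(c.name, `${c.address} | ${c.phone}`)
    );

    return {
      composeExtension: { type: "result", attachmentLayout: "list", attachments },
    };
  }
}
```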

When you look at the core LOB applications within your organisation there are some key integrations such as this that can bring about real productivity gains.

Where to start with messaging extension development?

Developing messaging extensions for Microsoft Teams (overview)

Training Content / Bots, Messaging Extensions and Cards (GitHub)

Office Add-in Manifest Updates – Deployment timing and potential URL issues

One of the critical components of an Office Add-in is the add-in manifest. This is the XML file that describes how your add-in should be activated when an end user installs and uses it with Office documents and applications. This manifest contains references to URLs where your web application resides, and also to other resources for your add-in, such as images to show on ribbon buttons and task panes.

Your manifest.xml file is uploaded to a central location where it is made available to users of your add-in (this may be a public location such as AppSource or the Office Store).

When users acquire your add-in a copy of the manifest.xml file is cached onto the host device and is only periodically resynced if the manifest.xml file is subsequently updated in AppSource/Office Store.  I have seen this take weeks on some combinations of host products/devices.

You need to be careful if you are modifying your application and the changes involve modifications to the URLs in the manifest.xml file. Imagine your add-in has been deployed and is in use by many users. The manifest.xml file might look similar to the fragment below (notice the SourceLocation URL).

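The fragment below is a trimmed sketch (element names follow the Office Add-in manifest schema; the Id, names and host are illustrative placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<OfficeApp xmlns="http://schemas.microsoft.com/office/appforoffice/1.1"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:type="TaskPaneApp">
  <Id>01234567-89ab-cdef-0123-456789abcdef</Id>
  <Version>1.0.0.0</Version>
  <ProviderName>Contoso</ProviderName>
  <DefaultLocale>en-US</DefaultLocale>
  <DisplayName DefaultValue="Contoso Search" />
  <Description DefaultValue="Search add-in for Office" />
  <Hosts>
    <Host Name="Document" />
  </Hosts>
  <DefaultSettings>
    <!-- The page loaded into the Task Pane when the add-in launches -->
    <SourceLocation DefaultValue="https://www.contoso.com/search_app/Default.aspx" />
  </DefaultSettings>
  <Permissions>ReadWriteDocument</Permissions>
</OfficeApp>
```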

That SourceLocation URL is the main page of the add-in that gets loaded into the Task Pane when the add-in is launched. Let’s imagine you change the structure of this add-in so that the starting page of the app is now https://www.contoso.com/apps/search/index.html

For this we would change the SourceLocation DefaultValue from “https://www.contoso.com/search_app/Default.aspx” to “https://www.contoso.com/apps/search/index.html”.

How do you go about deploying the updated manifest file that points to this new URL?

If you update the manifest.xml file with the new URL, you then have to go through the AppSource/Office Store submission process to get the new manifest approved. You are far from 100% in control of the approval process, so you don’t know when the new manifest will get approved and thus when the new URL will come into effect. So do you need to keep your application running at both the old and the new URL, at least until the new manifest is approved? It gets a little worse than that: it seems the deployment of a manifest change can take weeks to reach “saturation”, whereby it has propagated down to all client host product caches and the old manifest is no longer being used. I can’t tell you exactly how long this process takes, or if it ever fully reaches “saturation”, as I’m still seeing some users hitting my service on a URL from a manifest that was superseded almost 3 months ago!

Rather than keeping your old and new application running side by side, the approach I have taken that works quite nicely is to put in place temporary redirects from any URLs that were present in the old manifest and have changed in the new manifest. I suggest creating these as temporary redirects, as you will then be able to monitor your traffic over time and know when you have reached saturation and stopped getting users coming in on URLs from old manifests. If you use permanent redirects, the redirect may get cached by the client; next time the client will have done the redirect itself, and you will no longer be able to tell whether the new manifest has been deployed or an old manifest is being redirected via a client-side cached redirect.

How do you go about implementing temporary redirects? Well, this depends on what you’ve developed your website with and how you are hosting it. In my case I am hosting the app as a Web App in Azure, where redirects can be configured in the web.config file located in the root folder of the Web App.

Below is an example web.config file that would achieve that temporary redirection.

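This is a minimal sketch assuming the IIS URL Rewrite module (which is available on Azure Web Apps); the rule name and URL pattern are illustrative:

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <system.webServer>
    <rewrite>
      <rules>
        <!-- Temporarily (HTTP 307) redirect the old add-in start page to the new one -->
        <rule name="RedirectOldAddinStartPage" stopProcessing="true">
          <match url="^search_app/Default\.aspx$" />
          <action type="Redirect"
                  url="https://www.contoso.com/apps/search/index.html"
                  redirectType="Temporary" />
        </rule>
      </rules>
    </rewrite>
  </system.webServer>
</configuration>
```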

Allowing different Azure AD app registration permission sets for a single app (user and elevated admin consent) using the v1 auth model

With Azure Active Directory Application Registrations there are two versions of the authentication model available.

v1 – all the permission scopes that your app may require must be consented to by the user up front.

v2 – permission scopes can be asked for dynamically as your app is running; if the user hasn’t already consented to the required permission scope, they will be challenged for consent at that time.

v2 is a far more flexible model as it allows users to try out and/or start using your app without having to consent to everything your app could ever want access to. After getting comfortable with the app, as users explore and use more specific and advanced features, you can ask for further permissions. Even more advantageous, certain API calls require a tenant administrator to consent to the permission on behalf of all users. With the v2 auth model, end users can use the features of the app that they have authority to consent to, and then if an admin consents to some of the admin-only permissions, even more features can be lit up in your app.

Here’s a good rundown on the state of app registration and auth in a recent episode of the Microsoft Cloud Show and you can read about the difference between v1 and v2 auth in the official Microsoft docs.

The Problem

Not all backend service APIs support the v2 auth model, and you can’t mix and match the v1 and v2 auth models. If one or more of the backend service APIs you require only supports v1, then your entire app (and access to all service APIs) will be done using v1. At the time of writing, the Microsoft Graph supports v2 auth, but SharePoint only supports v1 auth. There is a technique for taking a refresh token acquired using v2 auth and exchanging it for a SharePoint access token, but this technique can only be used from a custom Web API, and not from a Single Page Application (SPA), as it’s not safe to expose a long-lived refresh token in client-side code (i.e. JavaScript running in the browser).

This means there are situations where you are stuck with v1 auth for now. Under v1 auth, if your app has at least one permission that requires admin consent, then ordinary users are not going to be able to simply start using your app on their own; we are back to the days of having to “go through IT” to have an admin approve the app before it can be used.

The Solution

Well, it’s not a silver bullet that is going to fix every scenario, but the technique I’ll discuss here allows you to define two sets of permissions for your app. One set contains just the minimal permissions to get users started with your app (you wouldn’t want any permissions that require admin consent here) – we’ll call this the User Permission Set. The second set contains those tougher-to-get-approval-for permissions that require admin consent – we’ll call this the Elevated Permission Set.

What we are aiming for is the app running with just the restricted User Permission Set (so that anyone can quickly start using your app), but maybe not with all features enabled, and then allowing an administrator to optionally provide consent on behalf of all users, which then allows the app to use the Elevated Permission Set (for all users).

I’ll assume you have already been able to create an app that successfully authenticates and consents a user against a single permission set (here’s a good starting point with Azure authentication concepts if you aren’t at this stage yet).

Step 1 – Create an Application Registration per permission set

Create an application registration for both the User Permission Set and the Elevated Permission Set (which will be a superset of the User Permission Set). These registrations should be almost identical (e.g. same Reply URLs), but they will have different Application IDs and, obviously, different permissions to represent the different permission sets. We will call these the User App Reg and the Elevated App Reg.

Step 2 – Change the normal auth flow to try to acquire tokens using the Elevated App Reg first

Your normal auth flow would be to try to acquire an access token for the service endpoint, specifying the App Reg Id. Now we have two possible App Reg Ids, so we try to acquire the access token first using the Elevated App Reg Id. If you are able to get the token then you are away, just like the normal app flow (in this case consent must have been granted by an admin previously). But here’s the trick: if you fail to get the token (and the reason returned is that you need to prompt for consent), then proceed with your standard flow to acquire the token, this time using the User App Reg Id and prompting for user consent if required. This way the user is able to start using your app, as they have the authority to consent to the User App Reg.
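Here’s a minimal sketch of that flow in TypeScript. The acquireTokenSilently and promptForConsent functions are hypothetical wrappers over your auth library (e.g. ADAL.js in a v1 world), and the App Reg Ids are placeholders:

```ts
const ELEVATED_APP_REG_ID = "<elevated-app-reg-guid>"; // placeholder
const USER_APP_REG_ID = "<user-app-reg-guid>";         // placeholder

type TokenResult = { token: string } | { error: "consent_required" | "other" };

// Hypothetical wrappers over your auth library (e.g. ADAL.js).
declare function acquireTokenSilently(appRegId: string, resource: string): Promise<TokenResult>;
declare function promptForConsent(appRegId: string, resource: string): Promise<TokenResult>;

// Returns the token plus the App Reg Id it was acquired against, so the rest
// of the app knows which permission set it is running with (see Step 3).
async function acquireToken(resource: string): Promise<{ token: string; appRegId: string }> {
  // 1. Try the Elevated App Reg first - succeeds if an admin has already consented.
  const elevated = await acquireTokenSilently(ELEVATED_APP_REG_ID, resource);
  if ("token" in elevated) {
    return { token: elevated.token, appRegId: ELEVATED_APP_REG_ID };
  }

  // 2. Fall back to the User App Reg, prompting for user consent if required.
  const user = await acquireTokenSilently(USER_APP_REG_ID, resource);
  if ("token" in user) {
    return { token: user.token, appRegId: USER_APP_REG_ID };
  }
  const consented = await promptForConsent(USER_APP_REG_ID, resource);
  if ("token" in consented) {
    return { token: consented.token, appRegId: USER_APP_REG_ID };
  }
  throw new Error("Unable to acquire a token against either app registration");
}
```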

Step 3 – Track which App Reg Id is in use

Once this auth flow is complete, track in the state of your app which App Reg Id you successfully acquired the token for, as that token will only work with the App Reg Id used to acquire it. For example: if the call to acquire the token using the Elevated App Reg Id worked, then all future calls should specify the Elevated App Reg Id.

Step 4 – Conditionally protect features that require the Elevated App Reg

Now that you are tracking which App Reg is in use, you will know when your app only has the restricted User Permission Set. You can use this to hide features or prevent them from being used.

Step 5 – Expose a way for administrators to provide admin consent

Somewhere in your app you can provide the ability (e.g. a button) for an administrator to provide admin consent. This will just launch the admin consent login URL (always use the Elevated App Reg Id for this). Now when a user tries to use the app (see Step 2), the attempt to acquire the token using the Elevated App Reg Id should work, since an administrator has provided the consent.
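For a v1 app, launching the admin consent prompt is a matter of hitting the authorize endpoint with prompt=admin_consent. A sketch (the tenant, redirect URI and response type are placeholders to adapt to your app):

```ts
// Build the v1 admin consent URL for the Elevated App Reg.
function buildAdminConsentUrl(tenant: string, elevatedAppRegId: string, redirectUri: string): string {
  const params = new URLSearchParams({
    client_id: elevatedAppRegId,   // always the Elevated App Reg Id here
    response_type: "id_token",
    redirect_uri: redirectUri,
    prompt: "admin_consent",       // asks the admin to consent on behalf of all users
    nonce: crypto.randomUUID(),
  });
  return `https://login.microsoftonline.com/${tenant}/oauth2/authorize?${params}`;
}

// e.g. wired to the "Grant admin consent" button:
// window.location.href =
//   buildAdminConsentUrl("contoso.onmicrosoft.com", ELEVATED_APP_REG_ID, window.location.origin);
```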

If you are feeling really awesome you could, in the same session of your app, go through your auth logic again without restarting the app: discard the tokens you acquired against the User App Reg, get new tokens against the Elevated App Reg, and light up those new features of your app immediately.

Video example of an Outlook Add-in utilizing this technique to provide user and elevated permission sets within a single add-in, allowing an admin to dynamically provide consent to enable additional features.

How to Inspect Dynamic HTML Elements (that keep disappearing!) in Chrome

I still find styling HTML elements difficult at times, trying to figure out where the styling is being inherited from and exactly which elements I need to apply styles to. The Developer Tools in Chrome go a long way to assisting with this. For this tip I’ll assume you are familiar with Chrome Developer Tools for inspecting HTML elements and CSS styles.

What I wanted to focus on was those frustrating elements that only exist on the page (in the Document Object Model) while a certain element has the “focus”. This often happens with navigation menu options or dropdown controls, where you have the menu options or dropdown options visible on the screen, but as soon as you click something in Developer Tools (to go exploring), the menu options or dropdown options disappear and don’t exist on the page anymore! This is usually because an event such as blur is fired when you click outside the element, and this removes the elements you are trying to inspect from the page.

This tip might not work in all scenarios but it has gotten me out of trouble on a few occasions.

Here’s an example scenario. On the left side of the screenshots you can see OnePlaceMail (an Outlook Add-in) displayed in Chrome; on the right hand side is the Developer Tools inspector window. I’m using a 3rd party control for my “Content Type” dropdown (it’s from the Kendo UI for Angular library).

When collapsed, it’s easy to inspect the kendo-dropdownlist element (that holds the selected value of ‘Document’). At this stage the menu options that will appear when I click on the dropdown don’t even exist in the DOM.

Screenshot: Kendo dropdown (collapsed)

When I do click to expand the dropdown, the image below shows that a new kendo-popup element appears in the DOM (and it contains sub-elements representing each of the options). The problem is that if we now try to use the Developer Tools to expand that kendo-popup element and see those sub-elements, the dropdown collapses (because I’ve clicked off it), the kendo-popup element is removed from the DOM, and we’re left with nothing to inspect!

Screenshot: Kendo dropdown expanded

So to work around this, in the Developer Tools inspector, right-click on the element that is driving the elements to appear/disappear (kendo-dropdownlist) and select Break on | subtree modifications.

Screenshot: Break on | subtree modifications

Now go to the web page and click on the dropdown to show the dropdown options. They are shown (the elements are added to the DOM), but Developer Tools now goes into a paused state. The web page is effectively frozen.

Screenshot: Developer Tools paused on subtree modification

While in this paused state, you can return to the Elements tab and expand and explore that pesky kendo-popup element that was dynamically created. This time the dropdown won’t collapse as we click around in the inspector.

Screenshot: the dropdown stays expanded while paused

I hope you find this tip useful!

 
