HTTP Queue Length and Request counter of Azure App Services (Web Apps)

Auto scaling of App Services and Web Applications is a feature that has been available in Microsoft Azure for some time. Besides standard metrics like CPU, Memory and Data In/Out, there is a specific web metric that can be used for scaling – HTTP Queue Length.

Counter definition
It is important to know from the beginning what this metric represents. Its name can create confusion, especially if you have worked with IIS or similar services in the past. The counter that can be accessed from the Azure Portal represents the total number of active requests inside the W3WP process. The technical path of the counter would be “W3SVC_W3WP – Active_Requests_ - _Total”.
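If you want to look at the raw values of this counter before wiring it into any rule, you can pull them through Azure Monitor. The snippet below is only a minimal sketch: it assumes the azure-identity and azure-monitor-query Python packages, a placeholder resource ID for your App Service, and that the counter is requested under the metric name "Requests" (depending on the resource type, the same counter may be exposed as "HttpQueueLength").

# Minimal sketch: read the counter for a Web App through Azure Monitor.
# Assumes azure-identity and azure-monitor-query are installed; the resource
# ID is a placeholder and the metric name "Requests" is an assumption.
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import MetricAggregationType, MetricsQueryClient

RESOURCE_ID = (
    "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
    "/providers/Microsoft.Web/sites/<app-service-name>"
)

client = MetricsQueryClient(DefaultAzureCredential())

response = client.query_resource(
    RESOURCE_ID,
    metric_names=["Requests"],
    timespan=timedelta(hours=1),           # last hour
    granularity=timedelta(minutes=5),      # 5 minute buckets
    aggregations=[MetricAggregationType.TOTAL],
)

for metric in response.metrics:
    for series in metric.timeseries:
        for point in series.data:
            print(point.timestamp, point.total)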

Naming confusion
This metric created confusion in the past, so it was renamed from HTTP Queue Length to Requests and shows the total number of requests at a specific moment in time. This change was made only in the metrics monitoring part of the Azure Portal.
Inside the "Scale out" auto scale section you will still find this metric called "HttpQueueLength", but remember that both names refer to the same counter.
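As an illustration, a scale-out rule created through the Azure Monitor SDK still references the counter under the old name. The sketch below shows only the rule object, not a complete autoscale setting; it assumes the azure-mgmt-monitor Python package, and the resource URI, threshold and timings are placeholder values.

# Sketch of a scale-out rule that references the counter as "HttpQueueLength",
# the name used by the "Scale out" blade. Assumes azure-mgmt-monitor;
# resource URI, threshold and timings are placeholders.
from datetime import timedelta

from azure.mgmt.monitor.models import MetricTrigger, ScaleAction, ScaleRule

PLAN_ID = (
    "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
    "/providers/Microsoft.Web/serverfarms/<app-service-plan>"
)

queue_rule = ScaleRule(
    metric_trigger=MetricTrigger(
        metric_name="HttpQueueLength",      # same counter as "Requests"
        metric_resource_uri=PLAN_ID,
        time_grain=timedelta(minutes=1),
        statistic="Average",
        time_window=timedelta(minutes=10),
        time_aggregation="Average",
        operator="GreaterThan",
        threshold=100,                      # placeholder threshold
    ),
    scale_action=ScaleAction(
        direction="Increase",
        type="ChangeCount",
        value="1",
        cooldown=timedelta(minutes=10),
    ),
)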


What does this metric represent
This counter represents the total number of active requests to our App Service. For example, if 5 clients have active requests against our App Service at a given moment, the value of HttpQueueLength will be 5.
This value is especially relevant for two kinds of applications. The first are applications that are hit by a high number of requests where the processing time of each request is short. The second case is when we have requests that run for a long period of time, which could put pressure on our backend systems.

Should I use it for Auto-Scaling
Yes and no. This is a tricky question. This is the kind of counter that, analyzed on its own, does not tell you much about the current load of the system or whether you need to scale up/down.
Taking into account the duration of the requests, the impact on the quality of service is very different if you have 100 parallel requests that each take 10 minutes and consume 80% of the CPU, or 1,000 parallel requests that each take 0.2 seconds.
Do not start to use this metric from day 0. Try to gather historical data and see how you can combine this metric with other counters provided by App Services. I recommend starting with simple counters like CPU or Memory. After a while, based on historical data and how the system behaves, you can decide to use the HTTP Queue Length counter, as sketched below.
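As a sketch of what that offline analysis could look like, the snippet below takes two exported series of samples, CPU percentage and the request counter collected over the same intervals, and flags the intervals where the request counter is high while CPU stays low; those are the situations where a rule based only on HttpQueueLength would scale out without a real need. The data shape, timestamps and threshold values are illustrative assumptions, not part of the original post.

# Illustrative sketch: compare historical CPU and request-count samples
# (exported from Azure Monitor) to see whether a request-count rule would
# trigger scale-outs that CPU alone would not. Thresholds are assumptions.
from dataclasses import dataclass

@dataclass
class Sample:
    timestamp: str
    cpu_percent: float       # average CPU over the interval
    active_requests: float   # average Requests / HttpQueueLength value

def suspicious_intervals(samples, request_threshold=100, cpu_threshold=50):
    """Intervals where a request-count rule would fire but CPU is still low."""
    return [
        s for s in samples
        if s.active_requests > request_threshold and s.cpu_percent < cpu_threshold
    ]

# Placeholder history; in practice this would come from exported metric data.
history = [
    Sample("T10:00Z", cpu_percent=35.0, active_requests=140.0),
    Sample("T10:05Z", cpu_percent=82.0, active_requests=120.0),
    Sample("T10:10Z", cpu_percent=20.0, active_requests=15.0),
]

for s in suspicious_intervals(history):
    print(f"{s.timestamp}: {s.active_requests} requests at {s.cpu_percent}% CPU")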

Final thoughts 
First, remember that HTTP Queue Length and Requests are the same metric. They represent the total number of requests at a specific moment in time. Be careful how you use this metric in combination with auto scale, because false-positive scaling actions might occur.
