Team Services, like many Software-as-a-Service solutions, uses multi-tenancy to reduce costs and to enhance scalability and performance. This leaves users vulnerable to performance issues and even outages when other users of their shared resources have spikes in their consumption. To combat these problems, Team Services limits the resources individuals can consume and the number of requests they can make to certain commands. When these limits are exceeded, subsequent requests may be either delayed or blocked.
When an individual user's requests are delayed by a significant amount, an email is sent to that user and a warning banner appears in the Web UI. If the user does not have an email address – for example, if the “user” is actually a build service account – the notification email is sent to the members of the project collection administrators group. See User Experience for more detail.
When an individual user's requests are blocked, the response has HTTP status code 429 (Too Many Requests) and a message similar to the following:
TF400733: The request has been canceled: Request was blocked due to exceeding usage of resource <resource name> in namespace <namespace ID>.
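Clients that hit this limit should back off before retrying rather than immediately resending the request. The following sketch shows one way to decide a retry delay: honor a Retry-After header when the response includes one, otherwise fall back to capped exponential backoff with jitter. This is a hypothetical helper, not part of any Team Services client library, and the exact headers a blocked response carries may vary.

```python
import random

def retry_delay_seconds(status_code, headers, attempt, base=1.0, cap=60.0):
    """Decide how long to wait before retrying a rate-limited request.

    Returns None when the request should not be retried. A Retry-After
    header, when present, takes precedence; otherwise use exponential
    backoff (base * 2^attempt, capped) with a small random jitter.
    Hypothetical helper for illustration only.
    """
    if status_code != 429:
        return None  # only retry requests blocked by rate limiting here
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return float(retry_after)
    return min(cap, base * (2 ** attempt)) + random.uniform(0, 0.1)
```

A caller would sleep for the returned number of seconds and reissue the request, incrementing `attempt` on each failure.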
Current rate limits
Team Services currently has a consumption limit, which delays requests from individual users when they exceed a resource consumption threshold; and a concurrent request limit, which blocks requests from individual users when they exceed a concurrent request threshold.
Individual users will typically only have their requests delayed when their personal usage within an account exceeds 200 times the consumption of a typical user within a (sliding) five-minute window.
The amount of the delay will depend on the user's sustained level of consumption – it may be as little as a few milliseconds per request or as much as thirty seconds. If their consumption goes to zero, the delays will stop after a period of not more than five minutes. If their consumption remains high, the delays may continue indefinitely.
Team Services Throughput Units (TSTUs)
Team Services users consume many shared resources, and consumption depends on many factors. For example:
- Uploading a large number of files to Team Foundation version control or Git typically creates a large amount of load on both an Azure SQL Database and an Azure Storage account.
- Running a complex work item tracking query will create load on an Azure SQL Database, with the amount of load depending on the number of work items in the Team Services account.
- Running a build on a private agent will create load on an Azure SQL Database and on one or more Azure Storage accounts, with the amount of load depending on the amount of version control content downloaded, the amount of data logged by the build, and so forth.
- All operations consume CPU and memory on one or more Team Services application tiers or job agents.
To accommodate all of this, Team Services resource consumption is expressed in abstract units called Team Services Throughput Units, or TSTUs.
TSTUs will eventually incorporate a blend of:
- Azure SQL Database DTUs as a measure of database consumption
- Application tier and job agent CPU, memory, and I/O as a measure of compute consumption
- Azure Storage bandwidth as a measure of storage consumption
For now, TSTUs are primarily focused on Azure SQL Database DTUs, since Azure SQL Databases are the shared resources most commonly overwhelmed by excessive consumption.
A single TSTU per five minutes is the average load we expect a single normal user of Team Services to generate. Normal users will also generate spikes in load. These will typically be 10 or fewer TSTUs per five minutes, but will less frequently go as high as 100 TSTUs. The resource consumption limit is 200 TSTUs within a sliding five-minute window.
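The sliding-window arithmetic described above can be illustrated with a small sketch. This is client-side bookkeeping for illustration only; the real accounting happens inside the service, and the class and sample format here are invented for the example.

```python
from collections import deque

WINDOW_SECONDS = 300   # sliding five-minute window
LIMIT_TSTUS = 200      # consumption threshold described above

class TstuWindow:
    """Track TSTU consumption over a sliding five-minute window.

    Each sample is a (timestamp, tstus) pair. Illustrative sketch only;
    the service performs the actual accounting.
    """
    def __init__(self):
        self.samples = deque()
        self.total = 0.0

    def record(self, timestamp, tstus):
        self.samples.append((timestamp, tstus))
        self.total += tstus
        # Evict samples that have aged out of the five-minute window.
        while self.samples and self.samples[0][0] <= timestamp - WINDOW_SECONDS:
            _, old = self.samples.popleft()
            self.total -= old

    def over_limit(self):
        return self.total > LIMIT_TSTUS
```

Because old samples fall out of the window, a user whose consumption drops to zero stops exceeding the limit within five minutes, matching the delay behavior described earlier.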
Concurrent request limit
The resource consumption limit can be slow to react to spikes in activity, since the consumption of those resources has to be measured before limits can be applied. To prevent sudden spikes of requests from overwhelming shared resources, Team Services also has a concurrent request limit in place. This limit blocks subsequent requests when an individual user exceeds a threshold of concurrent requests. This should not impact normal usage, since the concurrent request threshold will typically only be exceeded by:
- Custom tooling which generates a large volume of asynchronous requests without waiting for previous requests to return.
- Concurrent use of the same user credentials from a large number of machines.
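Custom tooling that fans out many requests can cap its own in-flight request count so it stays under the threshold. A minimal sketch using a semaphore follows; the cap value is an arbitrary example, not the service's actual threshold, and `call_service` is a placeholder for a real HTTP call.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT = 4  # arbitrary example cap, not the service's published threshold
_slots = threading.Semaphore(MAX_CONCURRENT)

def call_service(request_id):
    # Placeholder for a real HTTP call to Team Services.
    return request_id * 2

def limited_call(request_id):
    """Run call_service with at most MAX_CONCURRENT calls in flight."""
    with _slots:
        return call_service(request_id)

def run_all(request_ids):
    # The pool may have more workers than MAX_CONCURRENT; the semaphore
    # still bounds how many requests are outstanding at once.
    with ThreadPoolExecutor(max_workers=16) as pool:
        return list(pool.map(limited_call, request_ids))
```

The same idea applies to asynchronous tooling: acquire a slot before issuing a request instead of firing all requests at once without waiting for earlier ones to return.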
User experience
When an individual user's requests are either blocked or delayed by a significant amount, an email will be sent to that user and a warning banner will appear in the Web UI.
If the user does not have an email address – for example, if the “user” is actually a build service account – the notification email will be sent to the members of the project collection administrators group. The warning banner and the notification email both include links to the Usage page, which can be used to investigate the usage that exceeded our thresholds, as well as the requests which were delayed.
Request history on the Usage page is ordered by usage (TSTUs) in descending order by default, but can be sorted by the other columns as well. Usage is grouped by command into five-minute time windows; the number of commands in each window is given by the Count column, and the total TSTU usage and delay time are given in the corresponding columns.
For members of the project collection administrators group, this same page can be used to investigate the usage of other users.
By default, visiting the Usage page will display requests for the last hour. Clicking the link from the email will open the Usage page scoped to the request history from 30 minutes before and after the first delayed request. After arriving on the page, review the request history leading up to delayed requests.
Commands consuming a high number of TSTUs will be the ones responsible for the user exceeding the threshold. The User Agent and IP address columns can help identify where these commands are coming from. Common problems to look for are custom tools or build service accounts making a large number of calls in a short time window. To avoid these issues, tools may need to be rewritten or build processes updated to reduce the type and number of calls made. For example, a tool might be pulling a large version control repository from scratch on a regular basis when it could pull incrementally instead.
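For the version control example, a tool can clone once and fetch incrementally thereafter, transferring only new objects on each run. A sketch of that decision follows; the helper name, repository URL, and working directory are hypothetical, and the git commands are shown for illustration.

```python
import os

def git_sync_command(repo_url, work_dir):
    """Return the git command to run: a one-time clone if the working
    copy does not exist yet, otherwise an incremental fetch.

    Illustrative sketch; repo_url and work_dir are hypothetical inputs.
    """
    if os.path.isdir(os.path.join(work_dir, ".git")):
        # Incremental: transfers only new objects, far less load per run.
        return ["git", "-C", work_dir, "fetch", "--prune", "origin"]
    # First run: a full clone is unavoidable, but should happen only once.
    return ["git", "clone", repo_url, work_dir]
```

A build script would pass the returned argument list to its process runner, paying the full-download cost once instead of on every run.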