It helps to first determine whether a build failure is caused by a TFS/VSTS product issue (in the agent or the tasks) or by an external command that the build runs.
Check the build log for the exact command line executed by the failing step. Running that command locally from the command line may reproduce the issue. It can be helpful to run the command on your own machine, to log in to the build machine and run it as the service account, or both.
For example, is the problem happening during the MSBuild part of your build process (for example, are you using either the MSBuild or Visual Studio Build step)? If so, then try running the same MSBuild command on a local machine using the same arguments. If you can reproduce the problem on a local machine, then your next steps are to investigate the MSBuild problem.
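As a rough illustration of reproducing a step outside the agent, the sketch below runs a command line and reports its exit code and output. The helper name `rerun_logged_command` is hypothetical, and the command shown is only a stand-in; you would substitute the exact command line copied from your build log.

```python
import subprocess
import sys


def rerun_logged_command(argv, cwd=None):
    """Run argv as given and return (exit_code, combined_output)."""
    completed = subprocess.run(
        argv,
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # capture stderr together with stdout
        text=True,
    )
    return completed.returncode, completed.stdout


if __name__ == "__main__":
    # Stand-in command (assumption): replace with the command from the build log.
    code, output = rerun_logged_command(
        [sys.executable, "-c", "print('hello from the step')"]
    )
    print("exit code:", code)
    print(output, end="")
```

If the command fails the same way here as it does in the build, the problem most likely lies with the command itself rather than with the agent or task.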
Differences between local command prompt and agent
Keep in mind that running a command from a local command prompt differs from running it in a build on an agent. If the agent is configured to run as a service on Windows or Linux, it does not run within an interactive logged-on session. Without an interactive logged-on session, UI interaction is not possible and other limitations apply.
Get logs to diagnose problems
Start by looking at the logs in your completed build. If they don't provide enough detail, you can make them more verbose:
On the Variables tab, add system.debug and set it to true. Select to allow at queue time.
Queue the build.
In the explorer tab, view your completed build and click the build step to view its output.
If you need a copy of all the logs, click Download all logs as zip.
Log on to the agent machine.
Go to the _diag subfolder in the directory where the build agent is installed. For example: c:\agent\_diag
You can get the raw log of the completed build that was generated by the worker process on the build agent. Look for the worker log file that has the date and time stamp of your completed build. For example, worker_20160623-192022-utc_6172.log.
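Picking the right worker log out of a busy _diag folder can be sketched as follows. This assumes the file names follow the pattern of the example above (worker_, a UTC date and time stamp, then a numeric suffix); the helper names are hypothetical, and choosing the log whose time stamp is closest to the build time is a simplifying assumption.

```python
import re
from datetime import datetime, timezone
from pathlib import Path

# Pattern inferred from the example name worker_20160623-192022-utc_6172.log.
WORKER_LOG = re.compile(r"^worker_(\d{8}-\d{6})-utc(?:_\d+)?\.log$")


def worker_log_time(name):
    """Return the UTC time stamp encoded in a worker log name, or None."""
    match = WORKER_LOG.match(name)
    if match is None:
        return None
    stamp = datetime.strptime(match.group(1), "%Y%m%d-%H%M%S")
    return stamp.replace(tzinfo=timezone.utc)


def closest_worker_log(names, build_time):
    """Return the worker log name whose time stamp is nearest to build_time."""
    best_name, best_delta = None, None
    for name in names:
        stamp = worker_log_time(name)
        if stamp is None:
            continue  # not a worker log
        delta = abs((stamp - build_time).total_seconds())
        if best_delta is None or delta < best_delta:
            best_name, best_delta = name, delta
    return best_name


if __name__ == "__main__":
    diag = Path(r"c:\agent\_diag")  # example install path from the text
    names = [p.name for p in diag.glob("worker_*.log")] if diag.is_dir() else []
    when = datetime(2016, 6, 23, 19, 21, tzinfo=timezone.utc)
    print(closest_worker_log(names, when))
```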
Agent trace logs
Agent trace logs provide a record of how the agent was configured and what happened when it ran. Look for the agent log files. For example, agent_20160624-144630-utc.log. There are two kinds of agent log files:
The log file generated when you ran config.cmd. This log:
Includes this line near the top: Adding Command: configure
Shows the configuration choices made.
The log file generated when you ran run.cmd. This log:
Cannot be opened until the process is terminated.
Attempts to connect to your Team Foundation Server or Team Services account.
Shows when each job was run, and how it completed.
Both logs show how the agent capabilities were detected and set.
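To tell the two kinds of agent log apart, one option is to look for the configure marker near the top of the file. The sketch below does only that. Treating every log without the marker as a run log, and checking just the first 50 lines, are assumptions of this example; the function name is hypothetical.

```python
CONFIGURE_MARKER = "Adding Command: configure"


def classify_agent_log(lines, head=50):
    """Return "config" if the marker appears in the first `head` lines, else "run"."""
    for index, line in enumerate(lines):
        if index >= head:
            break
        if CONFIGURE_MARKER in line:
            return "config"
    # Assumption: any agent log without the marker came from run.cmd.
    return "run"


if __name__ == "__main__":
    sample = ["[INFO] starting", "[INFO] Adding Command: configure"]
    print(classify_agent_log(sample))
```

Because the function accepts any iterable of lines, an open file object can be passed directly.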
HTTP trace logs
Important: HTTP traces and trace files can contain passwords and other secrets. Do not post them on public sites.
In Charles, go to Proxy > Mac OSX Proxy. Disabling it is recommended so that you see only agent traffic.
Run the agent interactively. If it's running as a service, you can set the proxy configuration in the .env file. See nix service.
Restart the agent.
File- and folder-in-use errors
File- or folder-in-use errors are often indicated by error messages such as:
Access to the path [...] is denied.
The process cannot access the file [...] because it is being used by another process.
Access is denied.
Can't move [...] to [...]
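A quick way to scan a downloaded log for these messages is sketched below. The patterns are derived from the four messages listed above, with each [...] treated as arbitrary text; the function name is hypothetical, and real logs may word these errors differently.

```python
import re

# One pattern per message listed above; ".*" stands in for each "[...]".
IN_USE_PATTERNS = [
    re.compile(r"Access to the path .* is denied\."),
    re.compile(
        r"The process cannot access the file .* "
        r"because it is being used by another process\."
    ),
    re.compile(r"Access is denied\."),
    re.compile(r"Can't move .* to .*"),
]


def find_in_use_errors(log_lines):
    """Return (line_number, text) for each line matching an in-use pattern."""
    hits = []
    for number, line in enumerate(log_lines, start=1):
        if any(pattern.search(line) for pattern in IN_USE_PATTERNS):
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    sample = ["Copying files", "Access is denied.", "Done"]
    for number, text in find_in_use_errors(sample):
        print(number, text)
```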
Detect files and folders in use
On Windows, tools like Process Monitor can be used to capture a trace of file events under a specific directory. Or, for a snapshot in time, tools like Process Explorer or Handle can be used.
Anti-virus software scanning your files can cause file- or folder-in-use errors during a build. Adding an anti-virus exclusion for your agent directory and configured "work folder" may help to identify anti-virus software as the interfering process.
MSBuild and /nodeReuse:false
If you invoke MSBuild during your build, make sure to pass the argument /nodeReuse:false (short form /nr:false). Otherwise MSBuild process(es) will remain running after the build completes. The process(es) remain for some time in anticipation of a potential subsequent build.
This feature of MSBuild can interfere with attempts to delete or move a directory - due to a conflict with the working directory of the MSBuild process(es).
The MSBuild and Visual Studio Build tasks already add /nr:false to the arguments passed to MSBuild. However, if you invoke MSBuild from your own script, then you would need to specify the argument.
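If your own script assembles the MSBuild argument list, a small helper can make sure node reuse is always switched off. The sketch below is one way to do that; the function name and the solution file name are hypothetical, and it assumes arguments are passed as a list, one switch per element.

```python
NODE_REUSE_PREFIXES = ("/nodereuse:", "/nr:", "-nodereuse:", "-nr:")


def with_node_reuse_disabled(msbuild_args):
    """Return a copy of the arguments that ends with /nodeReuse:false.

    Any existing node-reuse switch (long or short form) is dropped first,
    so the result never carries conflicting values.
    """
    kept = [
        arg for arg in msbuild_args
        if not arg.lower().startswith(NODE_REUSE_PREFIXES)
    ]
    return kept + ["/nodeReuse:false"]


if __name__ == "__main__":
    # "MySolution.sln" is a placeholder name.
    print(with_node_reuse_disabled(["MySolution.sln", "/t:Build"]))
```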
MSBuild and /maxcpucount:[n]
By default, build steps such as MSBuild and Visual Studio Build run MSBuild with the /m switch. In some cases this can cause problems, such as file access conflicts between multiple processes.
Try adding the /m:1 argument to your build steps to force MSBuild to run only one process at a time.
File-in-use issues may result when leveraging the concurrent-process feature of MSBuild. Not specifying the argument /maxcpucount:[n] (short form /m:[n]) instructs MSBuild to use a single process only. If you are using the MSBuild or Visual Studio Build tasks, you may need to specify "/m:1" to override the "/m" argument that is added by default.
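For a script that builds its own MSBuild command line, forcing a single process can be sketched as below. The helper name is hypothetical. It removes any existing /m or /maxcpucount switch and appends /m:1, which is an approach for your own scripts only; with the built-in tasks you would simply add /m:1 to the task arguments as described above.

```python
import re

# Matches /m, /m:4, /maxcpucount, /maxcpucount:4 and the "-" prefixed forms.
MAX_CPU_SWITCH = re.compile(r"^[/-](?:m|maxcpucount)(?::\d+)?$", re.IGNORECASE)


def force_single_process(msbuild_args):
    """Return a copy of the arguments with one /m:1 and no other /m switch."""
    kept = [arg for arg in msbuild_args if not MAX_CPU_SWITCH.match(arg)]
    return kept + ["/m:1"]


if __name__ == "__main__":
    # "MySolution.sln" is a placeholder name.
    print(force_single_process(["MySolution.sln", "/m", "/t:Build"]))
```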
Intermittent or inconsistent MSBuild failures
If you are experiencing intermittent or inconsistent MSBuild failures, try instructing MSBuild to use a single process only. Intermittent or inconsistent errors may indicate that your target configuration is incompatible with the concurrent-process feature of MSBuild. See MSBuild and /maxcpucount:[n].
A process hang may indicate that a process is waiting for input.
Running the agent from the command line of an interactive logged on session may help to identify whether a process is prompting with a dialog for input.
Running the agent as a service may help to eliminate programs from prompting for input. For example, in .NET, programs may rely on the System.Environment.UserInteractive Boolean to determine whether to prompt. When running as a Windows service, the value is false.
Builds not starting (on-premises TFS only)
TFS Job Agent not started
This may be characterized by a message in the web console "Waiting for an agent to be requested". Verify the TFSJobAgent (display name: Visual Studio Team Foundation Background Job Agent) Windows service is started.
Misconfigured Notification URL (1.x agent version)
This may be characterized by a message in the web console "Waiting for console output from an agent", and the build eventually times out.
A mismatched notification URL may cause the worker process to fail to connect to the server. See Team Foundation Administration Console, Application Tier. The 1.x agent listens to the message queue using the URL that it was configured with. However, when a job message is pulled from the queue, the worker process uses the notification URL to communicate back to the server.
I need more help. I found a bug. I've got a suggestion. Where do I go?