Detect a hung process in .NET

We often start a child thread from the main .NET application to run a long-running task independently. The main program's job is not done until that thread finishes.

If the task running on a separate thread is complex and depends on other components, there is a high probability that it will hang at some point. A task can be considered hung if its execution takes much longer than expected. When that happens, we should kill the task and release all the resources allocated to it instead of waiting indefinitely. So the question is: how do we detect a hung task running on a separate thread and kill it automatically?

To address this situation we can write a custom routine that monitors the running thread at regular intervals and collects a snapshot of memory usage each time. Based on the task's typical memory usage we choose a threshold and compare it against the difference between memory snapshots. The threshold is specified in bytes and must be a sizable number, because minor fluctuations in memory usage always occur.
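
The comparison rule can be sketched in isolation as follows. This is only an illustration: MemoryActivity and HasChanged are hypothetical names, and the byte values in the usage note are made up.

```csharp
using System;

public static class MemoryActivity
{
    // Hypothetical helper: returns true when memory usage moved by more than
    // thresholdBytes in either direction between two readings.
    public static bool HasChanged(long previousBytes, long currentBytes, long thresholdBytes)
    {
        return Math.Abs(currentBytes - previousBytes) > thresholdBytes;
    }
}
```

With a threshold of 1000000 bytes, a move from 500000000 to 500200000 bytes counts as a minor fluctuation (no change), while a move to 503000000 bytes counts as activity.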

Various parameters like virtual memory, working set and private memory can be used to measure memory utilized by any process. We can collect such data into a custom Snapshot object to compare variations at different intervals.


// Memory figures, in bytes, captured at one point in time
private class Snapshot
{
   public long VirtualMemory { get; set; }
   public long PeakVirtualMemory { get; set; }
   public long WorkingSet { get; set; }
   public long PeakWorkingSet { get; set; }
   public long PrivateMemory { get; set; }
}

.NET has a built-in Process.GetCurrentProcess() method (in System.Diagnostics) that can be used to collect a snapshot of memory usage statistics. The figures it reports cover the whole process, not a single thread, so it can be called from any thread:


// Requires: using System.Diagnostics;
private Snapshot ReadMemoryStatistics()
{
   var stats = new Snapshot();
   using (var p = Process.GetCurrentProcess())
   {
      stats.VirtualMemory = p.VirtualMemorySize64;
      stats.PeakVirtualMemory = p.PeakVirtualMemorySize64;
      stats.WorkingSet = p.WorkingSet64;
      stats.PeakWorkingSet = p.PeakWorkingSet64;
      stats.PrivateMemory = p.PrivateMemorySize64;
   }

   return stats;
}

Snapshots collected at different intervals can be used to determine whether memory usage is varying as expected. If the change in memory usage stays below our threshold for the specified duration, we can consider the task hung and kill the child thread to release all the resources. The snippet below checks the running thread every two minutes and kills it after 20 minutes if memory usage has not changed by more than the threshold.

// Requires: using System; using System.Threading;
// Run any complex task on a separate thread
var process = new Thread(() => returnValue = DoProcessing());
process.Start();

// Collect the initial (baseline) snapshot of memory
var initialSnapshot = ReadMemoryStatistics();
var lastCheck = DateTime.Now;

// Compare memory usage with the baseline every 2 minutes;
// give up after 10 consecutive checks without change (20 minutes)
int interval = 2;
int totalIntervals = 10;
int intervalCounter = 0;

// Specify memory threshold values, in bytes
long vmThreshold = 1000000;
long wsThreshold = 1000000;

// Ping the thread every minute; Join returns true once the thread has finished
while (!process.Join(60000))
{
   var currTime = DateTime.Now;
   if ((currTime - lastCheck).TotalMinutes >= interval)
   {
      lastCheck = currTime;
      var currentSnapshot = ReadMemoryStatistics();
      long vmDifference = currentSnapshot.VirtualMemory - initialSnapshot.VirtualMemory;
      long wsDifference = currentSnapshot.WorkingSet - initialSnapshot.WorkingSet;

      if (Math.Abs(vmDifference) > vmThreshold || Math.Abs(wsDifference) > wsThreshold)
      {
         // Memory usage changed, so the thread is still alive: reset the baseline and the counter
         initialSnapshot = currentSnapshot;
         intervalCounter = 0;
      }
      else if (++intervalCounter >= totalIntervals)
      {
         // Memory usage stayed within the threshold for the whole duration: treat the task as hung
         // Kill the thread, then stop monitoring
         break;
      }
   }
}
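
The snippet leaves the kill step as a comment. On .NET Core and .NET 5 or later, Thread.Abort throws PlatformNotSupportedException, so one common alternative is cooperative cancellation. Below is a minimal sketch, assuming the task can be changed to check a CancellationToken periodically. CooperativeStop, RunAndStop and the token-aware DoProcessing are hypothetical names used for illustration, not the code from this article.

```csharp
using System;
using System.Threading;

public static class CooperativeStop
{
    // Hypothetical stand-in for the long-running task: works until asked to stop.
    public static int DoProcessing(CancellationToken token)
    {
        int steps = 0;
        while (!token.IsCancellationRequested)
        {
            Thread.Sleep(10);   // simulate one unit of work
            steps++;
        }
        return steps;
    }

    // Starts the worker, requests cancellation, and reports whether it exited in time.
    public static bool RunAndStop()
    {
        using (var cts = new CancellationTokenSource())
        {
            // Copy the token so the worker never touches the disposed source
            CancellationToken token = cts.Token;
            var worker = new Thread(() => DoProcessing(token)) { IsBackground = true };
            worker.Start();

            // This is the point where the monitoring loop would decide the task is hung
            cts.Cancel();

            // Wait a bounded time for the worker to notice the request and exit
            return worker.Join(TimeSpan.FromSeconds(5));
        }
    }
}
```

This only works when the task reaches a point where it checks the token. A thread blocked inside a call that never returns will not see the request; in that case, running the task in a separate process and ending it with Process.Kill is the usual fallback.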