Friday, September 13, 2019

C# Asynchronous programming

In this article, we'll focus on C# async/await keywords and explain what they're intended for.

When?

Asynchronous programming is mainly intended to avoid blocking a thread while it waits for an I/O operation (serial port read, HTTP request, database access...). It can also be used to handle CPU-bound operations such as expensive calculations.

How?

C# has built-in syntactic sugar keywords (async/await) that make it easy to write asynchronous code without dealing with callbacks. They also make it possible to wrap existing synchronous interfaces/APIs in asynchronous calls (although this is really not the recommended approach). This approach is known as the Task-based Asynchronous Pattern (TAP).

The entire mechanism relies on the Task and Task<T> objects.

Task vs Thread

A Thread is a worker. It is an OS object which executes a job (e.g. some code) in parallel with other threads.
A Task is a job that needs to be scheduled and executed on an available worker, possibly one from a ThreadPool. It is a promise of execution.
A ThreadPool is a group of Threads that .NET manages for you. They are shared workers that your application can rely on to execute jobs asynchronously.

With an embedded software engineering background, I find it very tempting to instantiate my own Thread objects in my application. It gives me confidence in how my application's jobs are scheduled over time. But generally speaking, it's a mistake.

Threads are expensive objects in terms of memory (about 1 MB of stack per thread by default in .NET) but also in terms of performance. On a resource-limited system, having several threads per application will exhaust the CPU and slow your system down. .NET manages the ThreadPool for you, keeps threads alive for reuse and takes your system's limitations into consideration when doing so. In short, with .NET, prefer using Task.Run for multithreading.
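As a rough illustration of the difference, here is a minimal sketch that runs the same job once on a dedicated Thread and once through Task.Run. The SumTo method and the WorkerDemo class are hypothetical names made up for this example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class WorkerDemo
{
    // Hypothetical CPU-bound job, used only for illustration.
    public static int SumTo(int n)
    {
        int sum = 0;
        for (int i = 1; i <= n; i++) sum += i;
        return sum;
    }

    public static async Task Main()
    {
        // Dedicated thread: we create, start and join the worker ourselves.
        int threadResult = 0;
        var thread = new Thread(() => threadResult = SumTo(100));
        thread.Start();
        thread.Join();

        // Task.Run: the job is queued to the ThreadPool and a pooled thread picks it up.
        int taskResult = await Task.Run(() => SumTo(100));

        Console.WriteLine($"{threadResult} {taskResult}"); // prints "5050 5050"
    }
}
```

With Task.Run there is no thread lifetime to manage, and the result comes back through the Task instead of a shared variable.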

Note: The only situation I came across that required creating my own Thread was when developing a Windows service. When the operating system shuts down and your background service still has pending Task objects, the service won't be stopped. This is because the ThreadPool cannot be released.

Definitions

Before showing some examples, we need to understand the meaning of the keywords:

async 

Method modifier which allows the use of await within the method's body.

public static async Task RequestDataOverHttp()
{
    await RequestDataAsync();
}

Keep in mind that declaring a method as async does not make it asynchronous. If await were removed from the code above, the method would execute synchronously despite the async modifier.
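A small sketch of that last point: the hypothetical method below is marked async but contains no await, so it runs entirely on the caller's thread and returns an already completed Task. The class and method names are invented for the example.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncWithoutAwaitDemo
{
    // Marked async but contains no await: the compiler reports warning CS1998
    // and the whole body runs synchronously on the caller's thread.
#pragma warning disable CS1998
    public static async Task<int> GetThreadIdAsync()
    {
        return Environment.CurrentManagedThreadId;
    }
#pragma warning restore CS1998

    public static void Main()
    {
        int callerId = Environment.CurrentManagedThreadId;
        Task<int> task = GetThreadIdAsync();

        // The task is already completed when the call returns,
        // so reading Result here does not block.
        Console.WriteLine(task.IsCompleted);        // True
        Console.WriteLine(task.Result == callerId); // True
    }
}
```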

await

Suspends the enclosing method until the awaited operation completes and, in the meantime, yields control to the caller.

public static async Task RequestDataOverHttp()
{
    await RequestDataAsync();

    // The code below won't be executed until RequestDataAsync completes
    Console.WriteLine("Data received");
}

For this keyword, the compiler generates a state machine in the background to handle job completion. This is what's going to happen:

  1. Module A calls RequestDataOverHttp
  2. RequestDataOverHttp calls RequestDataAsync, which starts executing on the same thread. Before suspending, await captures the current SynchronizationContext
  3. await yields and A continues its processing
  4. RequestDataAsync completes and advances the internal state machine. If a SynchronizationContext was captured, the rest of RequestDataOverHttp is posted to it; otherwise .NET picks an available Thread in the ThreadPool to resume it
  5. Console finally shows "Data received"
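The steps above can be sketched with a small console program. This is only an illustration: the AwaitFlowDemo class is hypothetical, and a TaskCompletionSource stands in for the HTTP response so that the order of events is deterministic.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AwaitFlowDemo
{
    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();
        // Assumption: completing this source simulates the arrival of the response.
        var response = new TaskCompletionSource<bool>();

        async Task RequestDataAsync()
        {
            log.Add("request sent");
            await response.Task; // not completed yet: control returns to the caller
        }

        async Task RequestDataOverHttp()
        {
            await RequestDataAsync();
            log.Add("Data received");
        }

        // Steps 1-3: the call runs until the first incomplete await, then returns.
        Task pending = RequestDataOverHttp();
        log.Add("caller continues");

        // Step 4: the response arrives and the suspended methods resume.
        response.SetResult(true);
        await pending;
        return log;
    }

    public static async Task Main()
    {
        foreach (string line in await RunAsync())
            Console.WriteLine(line);
        // request sent
        // caller continues
        // Data received
    }
}
```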

The most complicated aspect of this mechanism is understanding how the processing can continue on the same thread after hitting an await statement. That is possible thanks to the SynchronizationContext and TaskScheduler objects.

SynchronizationContext & TaskScheduler

SynchronizationContext

It is a representation of the environment in which a job is executed. Concretely, this object wraps a worker, which is usually a thread but can also be a group of threads (ThreadPool), a network instance, a CPU core...

This is what allows code to be executed on another Thread. For instance, in WPF and Windows Forms, controls can only be modified from the UI Thread. By calling control.BeginInvoke from a regular thread, we queue a delegate to be executed on the UI Thread.

Under the hood, delegates are queued into the context with Post() (asynchronously) or Send() (synchronously). That's basically what a context is: a sort of work queue for a Thread.
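To make the "queue of work" idea concrete, here is a minimal sketch of a custom context. QueueSynchronizationContext and its RunAll method are hypothetical and much simpler than what a real UI framework does. Post only enqueues the delegate, and whoever owns the thread drains the queue.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical single-threaded context: Post() enqueues, RunAll() drains the queue.
public sealed class QueueSynchronizationContext : SynchronizationContext
{
    private readonly Queue<(SendOrPostCallback Callback, object State)> _queue =
        new Queue<(SendOrPostCallback Callback, object State)>();

    // Asynchronous: the delegate is stored and executed later by RunAll().
    public override void Post(SendOrPostCallback d, object state)
    {
        lock (_queue) _queue.Enqueue((d, state));
    }

    // Synchronous: in this simplified sketch the delegate runs immediately.
    public override void Send(SendOrPostCallback d, object state)
    {
        d(state);
    }

    // Executes every queued delegate on the calling thread and returns how many ran.
    public int RunAll()
    {
        int executed = 0;
        while (true)
        {
            (SendOrPostCallback Callback, object State) item;
            lock (_queue)
            {
                if (_queue.Count == 0) return executed;
                item = _queue.Dequeue();
            }
            item.Callback(item.State);
            executed++;
        }
    }
}

public static class ContextDemo
{
    public static void Main()
    {
        var context = new QueueSynchronizationContext();
        var output = new List<string>();

        // Any thread could post here; nothing runs yet.
        context.Post(_ => output.Add("first"), null);
        context.Post(_ => output.Add("second"), null);
        Console.WriteLine(output.Count);              // 0

        // The owning thread processes its queue, in order.
        context.RunAll();
        Console.WriteLine(string.Join(", ", output)); // first, second
    }
}
```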

TaskScheduler

We've seen that calling control.BeginInvoke queues a delegate for the UI Thread, which means that it schedules work. This method is declared by ISynchronizeInvoke, which the Control object implements.

When creating a Task, the scheduling behavior depends on the situation we're in:

On Task creation, .NET first tries to schedule the work on the SynchronizationContext of the current thread.
As not all threads have a SynchronizationContext, the TaskScheduler falls back to the ThreadPool as the default choice.
If the Task has been created inside another Task, the context of the parent Task is reused (this is configurable).

Here is a more detailed summary of situations:

Calling thread | Has SynchronizationContext? | Behavior
Console application | No | Default TaskScheduler used (ThreadPool)
Custom thread | No | Default TaskScheduler used (ThreadPool)
ThreadPool | No | All Tasks executed on ThreadPool
UI Thread | Yes | Tasks queued on UI Thread
.NET Core web application | No | All Tasks executed on ThreadPool
ASP.NET web application | Yes | Each request has its own thread. Tasks are scheduled on these threads.
Library code | Unknown | Unexpected behavior, potential deadlock

Task.ConfigureAwait(bool continueOnCapturedContext)

The default behavior of await can be overridden by calling ConfigureAwait(false):

public async Task ReadStringAsync()
{
    await httpResponse.Content.ReadAsStringAsync().ConfigureAwait(false);
}

With this call, we indicate that the code following the await does not have to be executed in the caller's context, which means that it will be scheduled on the ThreadPool.
When to do that? If the caller is the UI Thread and the method does not update UI elements, doing so is actually better in terms of performance, as the rest of the method will be executed in parallel with the UI Thread. It also prevents deadlocks if the caller blocks on the call with something like ReadStringAsync().Wait() or .Result (see good practices below), which is why it is a good practice to call ConfigureAwait(false) in library code.
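The effect can be observed with a small experiment. The sketch below installs a hypothetical CountingContext that counts how many continuations are posted to it, then awaits a delay with and without capturing the context. All names are invented for the example, and the blocking Wait() is only acceptable here because this context never needs the blocked thread.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical context: counts posted continuations, then runs them on the ThreadPool.
public sealed class CountingContext : SynchronizationContext
{
    public int Posted;

    public override void Post(SendOrPostCallback d, object state)
    {
        Interlocked.Increment(ref Posted);
        ThreadPool.QueueUserWorkItem(_ => d(state));
    }
}

public static class ConfigureAwaitDemo
{
    // Returns how many continuations were posted to the installed context.
    public static int CountPosts(bool continueOnCapturedContext)
    {
        var context = new CountingContext();
        SynchronizationContext previous = SynchronizationContext.Current;
        SynchronizationContext.SetSynchronizationContext(context);
        try
        {
            Task task = WorkAsync(continueOnCapturedContext);
            task.Wait(); // safe here: the continuation never needs this thread
            return context.Posted;
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(previous);
        }
    }

    private static async Task WorkAsync(bool continueOnCapturedContext)
    {
        // The delay is still pending when await checks it, so a continuation is scheduled.
        await Task.Delay(50).ConfigureAwait(continueOnCapturedContext);
    }

    public static void Main()
    {
        Console.WriteLine(CountPosts(true));  // 1: continuation posted to the context
        Console.WriteLine(CountPosts(false)); // 0: continuation bypasses the context
    }
}
```

If the context were a UI thread blocked in Wait(), the posted continuation in the first case could never run, which is the deadlock scenario described above.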

Usage

Case 1 : I/O bound code

The application awaits an operation which returns a Task<T> inside of an async method.

Synchronous version of an I/O bound method

public string RequestVersion()
{
    // Send request
    client.Send(new GetVersionFrame());
    // Wait for the response
    return client.WaitResponse();
}

Asynchronous version of it

public async Task<string> RequestVersionAsync()
{
    // Send request
    await client.SendAsync(new GetVersionFrame()).ConfigureAwait(false);
    // Wait for the response
    return await client.WaitResponseAsync().ConfigureAwait(false);
}
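Since the client object above is not defined in this article, here is a self-contained sketch of the same pattern using a file read as the I/O operation. FileIoDemo and ReadAllAsync are hypothetical names, and the temporary file is only there to make the example runnable.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class FileIoDemo
{
    // I/O-bound method: awaits the framework's asynchronous read instead of blocking.
    public static async Task<string> ReadAllAsync(string path)
    {
        using (var reader = new StreamReader(path))
        {
            return await reader.ReadToEndAsync().ConfigureAwait(false);
        }
    }

    public static async Task Main()
    {
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "v1.2.3");
        try
        {
            Console.WriteLine(await ReadAllAsync(path)); // v1.2.3
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```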

Case 2 : CPU bound code

The application awaits an operation which is started on a background thread with the Task.Run method inside an async method.

Synchronous version of a CPU bound method

public List<double> ComputeCoefficients()
{
    List<double> coefficients = new List<double>();

    coefficients.Add(ComputeA());
    coefficients.Add(ComputeB());
    coefficients.Add(ComputeC());
    return coefficients;
}

Asynchronous version of it

public async Task<List<double>> ComputeCoefficientsAsync()
{
    List<double> coefficients = new List<double>();

    // Each computation runs on a ThreadPool thread, one after the other
    coefficients.Add(await Task.Run(() => ComputeA()));
    coefficients.Add(await Task.Run(() => ComputeB()));
    coefficients.Add(await Task.Run(() => ComputeC()));
    return coefficients;
}
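If the three computations are independent, one possible variation is to start them all before awaiting, so that they can run in parallel on the ThreadPool. This is a sketch under that assumption. The ComputeA/B/C bodies below are placeholders, not the article's real computations.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class CoefficientsDemo
{
    // Placeholder implementations, assumed independent of each other.
    private static double ComputeA() => 1.0;
    private static double ComputeB() => 2.0;
    private static double ComputeC() => 3.0;

    public static async Task<List<double>> ComputeCoefficientsInParallelAsync()
    {
        // Start the three computations before awaiting any of them.
        Task<double> a = Task.Run(() => ComputeA());
        Task<double> b = Task.Run(() => ComputeB());
        Task<double> c = Task.Run(() => ComputeC());

        // WhenAll returns the results in the order the tasks were passed in.
        double[] results = await Task.WhenAll(a, b, c);
        return results.ToList();
    }

    public static async Task Main()
    {
        List<double> coefficients = await ComputeCoefficientsInParallelAsync();
        Console.WriteLine(string.Join(", ", coefficients)); // 1, 2, 3
    }
}
```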

Good practices

Naming

Name asynchronous methods with the Async suffix to indicate that the call won't block the caller's thread.

public async Task FooAsync()
{
    await client.DownloadAsync();
}

The Async suffix indicates that the method will offload part of the work to an underlying API (e.g. the OS networking API).

CPU-bound work

For CPU-bound work, consider using background threads via Parallel.ForEach or Task.Run rather than await alone, unless you're working in a library where you can't do that (see below).

Don't block in async code

1. Bad code

public void Foo()
{
    // Blocks the calling thread until the download completes
    var data = client.DownloadAsync().Result;
}

or 2. Very bad code

public void Foo()
{
    // Blocks the calling thread and a ThreadPool thread
    var data = Task.Run(() => client.DownloadAsync().Result).Result;
}

At some point, the async method will be executed/resumed on the ThreadPool, but if no thread is available, you'll end up with a deadlock. If example 1 is called from the UI Thread, the continuation is queued for the UI Thread, which is itself blocked on the Result call: this is a deadlock.

As asynchronous code relies on the execution context, don't block on an asynchronous method unless you own the calling thread or it's the application's main thread. As a general rule: call sync code from sync code and async code from async code, and try not to mix them. The application's top layer has control over the context, so it can choose whether to use sync or async code.

Note: using Task.Run to delegate some work to the ThreadPool while keeping the UI responsive is generally okay.
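In contrast with the blocking examples, here is a minimal sketch of the "async all the way" approach, where every caller awaits and only the application's top layer decides how to run the chain. DownloadAsync below is a hypothetical stand-in for client.DownloadAsync().

```csharp
using System;
using System.Threading.Tasks;

public static class DownloadDemo
{
    // Hypothetical stand-in for client.DownloadAsync().
    private static async Task<string> DownloadAsync()
    {
        await Task.Delay(10).ConfigureAwait(false);
        return "payload";
    }

    // Async caller: awaits instead of blocking on Result.
    public static async Task<int> FooAsync()
    {
        string data = await DownloadAsync().ConfigureAwait(false);
        return data.Length;
    }

    // Top layer: a console Main can itself be async (C# 7.1 and later).
    public static async Task Main()
    {
        Console.WriteLine(await FooAsync()); // 7
    }
}
```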

No Task.Run in a library

This rule is related to the previous one. Callers should be the ones to call Task.Run because they have control over the execution context. Functionally, Task.Run will work, but it also introduces a performance issue because of an additional thread switch.
Additionally, if a library needs to support both sync and async methods, there should be no relation between them. We can't use async calls in sync code, or we might run into deadlock issues.

Do not use async void

public async void FooAsync()
{
    await client.DownloadAsync();
}

As there is no Task object to be returned, exceptions cannot be caught by the caller and will be posted to the SynchronizationContext (the UI Thread for example).
Also, the caller is unable to know when the execution has finished: it's a "fire and forget" mechanism.

Instead, use

public async Task FooAsync()
{
    await client.DownloadAsync();
}
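To illustrate why returning a Task matters, the sketch below shows an exception thrown inside an async Task method being caught by the caller through await. FailAsync and TryAsync are hypothetical names. The async void case is deliberately not demonstrated, since the exception could not be caught this way.

```csharp
using System;
using System.Threading.Tasks;

public static class ExceptionDemo
{
    // Hypothetical failing operation: the exception is stored in the returned Task.
    private static async Task FailAsync()
    {
        await Task.Yield();
        throw new InvalidOperationException("download failed");
    }

    // Because FailAsync returns a Task, await rethrows the exception here.
    public static async Task<string> TryAsync()
    {
        try
        {
            await FailAsync();
            return "no error";
        }
        catch (InvalidOperationException ex)
        {
            return ex.Message;
        }
    }

    public static async Task Main()
    {
        Console.WriteLine(await TryAsync()); // download failed
    }
}
```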

Cem SOYDING

Author & Editor

Senior software engineer with 12 years of experience in both embedded systems and C# .NET
