Manjusaka

The Road to async/await

After reading Mr. Peng's article "This Crappy Python", I had a lot of thoughts, so I want to share my own opinions.

First of all, our company is relatively willing to try new things by the standards of companies in China. The most direct sign is that we keep up with new releases of the community's basic infrastructure as they come out. From June of last year until now, we have been running async/await in our online services, and we introduced a new framework, Sanic. But for now, we have to say goodbye to async/await.

Why did we choose async/await?#

It comes down to our team's specific scenarios. A considerable part of our work is requesting data from different sub-services based on the URL, then combining the results and processing them in one place. As the data sources grew more diverse, the traditional synchronous approach could no longer cope.
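To make the scenario concrete, here is a minimal sketch of that fan-out-and-combine pattern using only the standard library. The service names, the simulated latencies, and the `fetch`/`aggregate` helpers are all hypothetical stand-ins, not our production code; a real service would make HTTP calls where this sketch sleeps.

```python
import asyncio
from typing import Dict, List

# Hypothetical sub-services: each name maps to a simulated latency in seconds.
FAKE_SERVICES = {"user": 0.05, "orders": 0.05, "recommendations": 0.05}


async def fetch(service: str) -> Dict[str, str]:
    """Stand-in for an HTTP request to one sub-service."""
    await asyncio.sleep(FAKE_SERVICES[service])  # simulated network wait
    return {"service": service, "data": f"payload from {service}"}


async def aggregate(services: List[str]) -> Dict[str, str]:
    """Request every sub-service concurrently, then combine the results."""
    responses = await asyncio.gather(*(fetch(name) for name in services))
    return {resp["service"]: resp["data"] for resp in responses}


def run_aggregate() -> Dict[str, str]:
    """Synchronous entry point that drives the event loop."""
    return asyncio.run(aggregate(list(FAKE_SERVICES)))
```

Because the waits overlap, the total time is roughly that of the slowest sub-service rather than the sum of all of them, which is what made this style attractive for us in the first place.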

At that time, we had several options:

  1. Maintain a process/thread pool and use ordinary processes/threads to handle requests.
  2. Use a third-party coroutine+EventLoop solution like Gevent.
  3. Use the async/await + asyncio combination.

We ruled out option 1 first because it is too heavyweight. Option 2 was set aside early on because we were wary of monkey-patching, which seemed unreliable.

So, we happily chose option 3, the async/await + asyncio combination.

In fact, the initial results were wonderful. Later on, however, we found that we had basically been eating shit QwQ.

Saying Goodbye to async/await#

Why did we give up async/await?#

Actually, it's the same few old problems.

1. Contagiousness at the code level#

Python's official coroutine implementation is built on top of the yield/yield from generator machinery. This means you have to start at the entry point and use async/await at every layer on the way down. Mixing synchronous and asynchronous code throughout the same project is a maintenance disaster, believe me.
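A small sketch of what this contagiousness looks like in practice. The function names here are hypothetical and chosen only for illustration: calling a coroutine function from synchronous code does not run it, so every caller up the stack has to become `async` too, until some outermost layer hands control to the event loop.

```python
import asyncio
import inspect


async def load_profile(user_id):
    await asyncio.sleep(0)  # stand-in for real asynchronous I/O
    return {"id": user_id}


def sync_caller(user_id):
    # Calling a coroutine function from synchronous code does not execute it;
    # it only creates a coroutine object.
    coro = load_profile(user_id)
    is_coroutine = inspect.iscoroutine(coro)
    coro.close()  # avoid the "coroutine was never awaited" warning
    return is_coroutine


async def async_caller(user_id):
    # To get the actual value, the caller itself has to be a coroutine.
    return await load_profile(user_id)


def entry_point(user_id):
    # Only the outermost layer can bridge back to synchronous code.
    return asyncio.run(async_caller(user_id))
```

Here `sync_caller` never sees the profile at all, while `entry_point` gets it only because everything between it and `load_profile` is written with async/await.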

2. Ecosystem and compatibility#

Compatibility around async/await is a real headache. At present, async/await only reaches pure Python code. Here's the problem: many C extensions we use in the project, such as mysqlclient, are not covered by async/await.
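The sketch below illustrates why a blocking call is a problem inside an event loop. It is a simulation under stated assumptions: `time.sleep` stands in for a blocking C-extension call (it is not mysqlclient), and all function names are hypothetical. It also shows one common workaround from the standard library, pushing the call onto a thread with `run_in_executor`, which is not something the rest of this post relies on.

```python
import asyncio
import time


def blocking_query():
    # Stand-in for a blocking C-extension call: it never yields to the loop.
    time.sleep(0.1)
    return "rows"


async def ticker(counter):
    # Background work that should keep progressing while the query runs.
    while True:
        counter["ticks"] += 1
        await asyncio.sleep(0.01)


async def query_inline():
    blocking_query()  # runs on the event loop thread and stalls everything


async def query_in_thread():
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, blocking_query)  # loop stays responsive


async def ticks_during(run_query):
    """Count how many times the ticker ran while the query was in flight."""
    counter = {"ticks": 0}
    task = asyncio.create_task(ticker(counter))
    await asyncio.sleep(0)  # give the ticker a chance to start
    before = counter["ticks"]
    await run_query()
    progressed = counter["ticks"] - before
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return progressed
```

With `query_inline`, the ticker makes no progress at all while the query runs. With `query_in_thread`, it keeps ticking, at the cost of bringing a thread pool back into the picture.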

At the same time, the async/await ecosystem is a big problem in itself, with bugs everywhere and no one fixing them. One example is the connection leak in aiohttp when handling HTTPS connections. Another is Sanic's messy design.

When we evaluate a new technology for a production project, we mainly ask whether it can cover our services and whether its ecosystem can meet our daily needs. Currently, the async/await ecosystem cannot.

3. Performance issues#

asyncio, proposed in PEP 3156, is the officially recommended event loop for async/await. However, the official asyncio implementation still falls short in many areas. For example, the HTTPS connection leak in aiohttp can be traced back to asyncio's SSL implementation. So in practice we often swap in a third-party loop. The mainstream third-party loops today are built on libuv or libev. As a result, their performance is comparable to Gevent's or even lower (after all, Greenlet avoids the overhead of maintaining PyFrameObject).

Therefore, for the sake of our hair, we will gradually retire async/await from our online code. We expect to finish removing it by the end of this year at the latest.

What is our alternative?#

Currently, we plan to use Gevent as an alternative (yes, it's really good).

The reasons are simple:

  1. It is mature and has no significant bugs.
  2. The surrounding ecosystem is mature. For pure Python code, we can easily migrate existing code with monkey-patching. For C extensions, we have Greenify, which Douban has validated internally.
  3. The underlying Greenlet exposes APIs that make it easy to trace coroutine context switches when necessary.

Other things I want to say about async/await#

First of all, async/await is a good thing, but it is not practical yet. That depends on the community exploring its usage further.

Speaking of this, many people may ask me, what do you think of ASGI and Django Channels?

First of all, ASGI was not designed for async/await. Its original goal was to address the shortcomings of the WSGI protocol (PEP 333/PEP 3333) in the face of increasingly complex network protocol models. Django Channels targets the same problem and implements ASGI (initially to handle WebSockets?). This approach has indeed solved many problems; for example, WebSocket broadcast is easy to implement in Django Channels 2.0. However, neither is closely tied to async/await.

At PyCon 2018, the Django core team announced that Channels 2.0 added support for async/await, and Django itself may add similar support in the future. The problem remains that once you adopt async/await, the overall ecosystem is still the most worrying and weakest point.

So, hello async/await, goodbye async/await!
