You may have encountered explanations that appear overly simplistic, merely emphasizing the importance of async programming without going into the mechanics or exploring high-level APIs such as the asyncio
module in the standard library. Clearly, you already understand the significance and use cases, so let's cut to the chase and see how it actually works.
Pausing
So, what's the deal with async functions in Python? We all know they pause execution until some operation resolves, right? but how does this mechanism truly operate? Let's take a step back and check out a foundational concept:
Iterators
An iterator serves as a programming object that facilitates the systematic traversal of a container, granting access to its elements one at a time. It maintains the state of the traversal, allowing the iterator to pause execution after processing a specific item. To continue, a call to the next()
method, which is typically named by convention, is made. This call prompts the iterator to resume execution and retrieve the next element in the sequence. This iterative process persists until the iterator reaches the end of the container, and this approach is commonly referred to as lazy evaluation.
The key concepts here are the actions of pausing, resuming, and awaiting the next element. These notions become particularly relevant and meaningful when correlated with the concept of asynchronous functions, the idea of pausing and resuming execution aligns with the asynchronous nature of handling tasks, which allows the program to efficiently manage and switch between various operations without blocking the entire execution flow. The term "await" further emphasizes this waiting aspect, where the program can await the completion of a specific asynchronous task before proceeding, we call this task a Future
, more on this later.
How Do I Create An Iterator
To define your own iterator you have to adhere to the iterator protocol meaning you have to define __iter__
and __next__
methods
__iter__
The purpose of the __iter__
method is to return the iterator object itself. When this method is implemented in a class, it allows instances of that class to be used in a for
loop or any other iteration context. In other words, it enables the iteration over the elements of the associated object.
__next__
The __next__
method is here to provide the next element in the sequence when iterating over the associated object. For example, in a loop like for x in my_object
, Python internally calls my_object.__next__()
in each iteration to fetch the subsequent element. If the iteration is complete, the method raises a StopIteration
error, signaling that there are no more elements to be retrieved.
Example
This object enables you to read a file, line by line without loading the entire file into memory.
1class EfficientFile:
2 def __init__(self, file_path: str):
3 self.file_path = file_path
4
5 def __iter__(self) -> 'EfficientFile':
6 self.file = open(self.file_path, 'r')
7 return self
8
9 def __next__(self) -> str:
10 line = self.file.readline()
11 if not line:
12 self.file.close()
13 raise StopIteration
14 return line.strip()
15
16file = EfficientFile(file_path='pyproject.toml')
17
18for line in file:
19 print(line)
20
An alternative way to read all the lines using the next()
method:
try:
while True:
line = next(file) # or file.__next__()
print(line)
except StopIteration:
pass
In this example, the __iter__
method opens the file, and the __next__
method reads one line at a time until the end of the file is reached. This approach ensures that only one line is loaded into memory at a time, by only loading parts of the file upon request, the request in this case is the act of calling the next()
method which makes it memory-efficient, even for large files.
Async Iterators
The extend the concept of iterators to asynchronous operations. Async iterators implement two key methods:
__aiter__
Similar to its synchronous counterpart, this method returns the async iterator object itself, making it compatible with async for
loops instead of traditional for
loops.
__anext__
Is the async version of next
, __anext__
provides the next asynchronous value only upon request. This is contingent on the completion of another Future
object. When the async iterator exhausts its items, it triggers a StopAsyncIteration
exception. The implementation of this method should return a Future
of some sort.
The invocation of __anext__
involves the use of the async
and await
keywords. Speaking of which, the await
keyword can only be used with an awaitable
object. So...
What Exactly Is An awaitable
An awaitable
is an object that can be used with the await
keyword within an async function. It typically represents an operation that may not be immediately completed, such as an asynchronous task or a Future
object.
How Do I Define An awaitable
Define an object that implements the __await__()
dunder method which should return a generator object
What Are Generators ?
Generators are functions that leverage the yield
keyword. Their primary purpose is to provide a concise and efficient means of creating iterators. The inclusion of yield
within generator functions eliminates the need for manual implementation of the iterator protocol.
When encountering the 'yield
' statement, an event occurs: The current state of the generator function is preserved, and control is temporarily relinquished back to the calling function or context. At this point, the generator is in a suspended state, storing all local variables and the execution point.
Upon calling the generator again, whether through a for
loop or the next()
function, the generator resumes execution from where it was paused by the last 'yield
'. This resumption allows the generator to produce the next value in the sequence. This process repeats until the generator function completes or encounters a 'return
' statement.
Now, this behavior is also associated with coroutines, a Coroutine
is essentially an awaitable
object designed to handle asynchronous operations. A Coroutine
accepts three types: the yield type, the send type, and the return type, just like the generator below.
def my_generator() -> Generator[YieldType, None, None]:
yield some_value
yield another_value
gen = my_generator()
for value in gen:
print(value)
- - The first
None
represents the type of the value that can be sent into the generator using thesend()
method (though this is optional and not commonly used). - - The second
None
represents the type of the value that the generator returns. In most cases, generators don't return a value explicitly, so it'sNone
. - - If the generator doesn't yield any values, you can replace
YieldType
withNone
Related Objects
I've referred to Future
, so what's a future ? it's a combination of both iterable
and awaitable
, you might have also encountered an asynciterable
, this is an object that defines and aiter
method and returns an asynciterator
. Lastly there's AsyncGenerator
, just like Generator
but it actually it inherits from an AsyncIterator
instead of Ìterator
Read Again
Seems confusing doesn't it ?