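1 Some concepts
1.1 Coroutines
An executor runs one flow of execution (task) at a time. When the current task blocks, the executor switches to another one. Switching among multiple tasks in this way gives a form of concurrency, and each such task is a coroutine.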
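1.2 The event loop
Calling a coroutine function directly returns a coroutine object; it does not run the coroutine body. A coroutine has to be scheduled on an event loop to run. The event loop manages the coroutines' flows of execution: when one coroutine blocks, the loop schedules another that is ready to run.
asyncio.run(coro) starts an event loop and runs the main coroutine coro. await suspends the current coroutine until the awaited object completes, which lets the event loop resume another coroutine that is ready.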
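As a minimal sketch of the point above, the snippet below shows that calling a coroutine function only creates a coroutine object, and that the body runs once an event loop drives it. The coroutine greet and the list calls are hypothetical names chosen for this illustration.

```python
import asyncio
import inspect

calls = []  # records when the coroutine body actually runs


async def greet(name: str) -> str:
    # Hypothetical coroutine used only for this illustration.
    calls.append(name)
    await asyncio.sleep(0)  # yield control to the event loop once
    return f"hello, {name}"


coro = greet("asyncio")              # creates a coroutine object; the body has not run
is_coro = inspect.iscoroutine(coro)
ran_before = list(calls)             # snapshot taken before the loop starts

result = asyncio.run(coro)           # starts an event loop and runs the coroutine
print(is_coro, ran_before, result)
```

The snapshot ran_before is taken before asyncio.run() is called, so it shows whether the body had already executed at that point.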

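2 Coroutines in Python
2.1 How coroutines execute
await coro1() suspends the current coroutine and switches to coro1; main_coro() resumes only after coro1() has finished. Inside coro1(), execution stops at await asyncio.sleep(3) and switches to asyncio.sleep(3). After asyncio.sleep(3) has taken its 3 s, coro1() resumes, and once coro1() finishes, control returns to main_coro(). Awaited coroutines therefore run strictly in sequence.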
2.1.1 Example 1
import asyncio
import time


async def coro1(x):
    print('coro1 starts!')
    await asyncio.sleep(3)  # coro1 is suspended here for 3 s
    print('coro1 ends!')
    return x


async def main_coro():
    print('main_coro starts!')
    ret = await coro1(1)  # main_coro waits until coro1 returns
    print('main_coro ends! coro1 return: %d' % ret)


if __name__ == '__main__':
    start_time = time.time()
    asyncio.run(main_coro())
    end_time = time.time()
    print('total cost: %fs' % (end_time - start_time))

'''Output
main_coro starts!
coro1 starts!
coro1 ends!
main_coro ends! coro1 return: 1
total cost: 3.002261s
'''
2.1.2 Example 2
The example below shows that awaiting coroutines one after another is serial and does not exploit the concurrency that coroutines can offer. Tasks address this.
import asyncio
import time


async def fetch_data(data_id: int) -> None:
    print(f'Fetching data for ID {data_id}')
    await asyncio.sleep(3)  # Simulates waiting for a response from a server
    print(f'Finished fetching data for ID {data_id}')


async def main() -> None:
    # Each await finishes before the next call starts: 3 s + 3 s + 3 s.
    await fetch_data(1)
    await fetch_data(2)
    await fetch_data(3)


if __name__ == '__main__':
    start_time = time.time()
    asyncio.run(main())
    end_time = time.time()
    print('total cost: %f' % (end_time - start_time))

'''Output
Fetching data for ID 1
Finished fetching data for ID 1
Fetching data for ID 2
Finished fetching data for ID 2
Fetching data for ID 3
Finished fetching data for ID 3
total cost: 9.010200
'''
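2.2 Running coroutines concurrently: tasks
A task wraps a coroutine so that it can run concurrently with others. asyncio.create_task() returns a task object and at the same time schedules the wrapped coroutine on the event loop; the coroutine does not wait for the task object to be awaited. For example, once task1 = asyncio.create_task(fetch_data(1)) has executed, fetch_data(1) is scheduled on the event loop and starts running as soon as the current coroutine yields control, not when execution reaches await task1.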
import asyncio
import time


async def fetch_data(data_id: int) -> None:
    print(f'Fetching data for ID {data_id}')
    await asyncio.sleep(3)  # Simulates waiting for a response from a server
    print(f'Finished fetching data for ID {data_id}')


async def main() -> None:
    task1 = asyncio.create_task(fetch_data(1))
    task2 = asyncio.create_task(fetch_data(2))
    task3 = asyncio.create_task(fetch_data(3))
    await task1
    await task2
    await task3


if __name__ == '__main__':
    start_time = time.time()
    asyncio.run(main())
    end_time = time.time()
    print('total cost: %f' % (end_time - start_time))
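To make the scheduling claim in this section observable, here is a small sketch that records the order of events. It assumes the default task factory; worker and events are hypothetical names used only for this illustration. The main coroutine yields once with asyncio.sleep(0) before it awaits the task.

```python
import asyncio

events = []  # ordered log of what happened


async def worker() -> str:
    events.append("worker started")
    await asyncio.sleep(0.01)
    events.append("worker finished")
    return "done"


async def main() -> str:
    task = asyncio.create_task(worker())  # scheduled on the loop, not yet running
    events.append("task created")
    await asyncio.sleep(0)                # yield once so the loop can run worker()
    events.append("main resumed")
    return await task                     # only waits for the result


result = asyncio.run(main())
print(events, result)
```

In the recorded order, "worker started" appears before "main resumed", that is, before main() reaches await task.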
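2.3 The main coroutine
- The main coroutine matters: it must stay alive long enough for the coroutines doing the real work to finish. If it exits early, the event loop stops and the remaining worker coroutines stop with it.
- In the example below, the last statement of main() is await task1. main() blocks until task1 finishes and then returns; the event loop stops, and the unfinished task2 and task3 are terminated. Replacing await task1 with await asyncio.sleep(4) would also let all three tasks finish, because main() staying alive for 4 s is enough for the three concurrent tasks to complete (the longest, task3, takes only 3 s).
- asyncio.run(main()) blocks, and it blocks for exactly the lifetime of main(), which is also the lifetime of the event loop.
time.sleep(3) runs after asyncio.run(main()), so the main thread lingers long enough that one might expect the remaining two tasks to finish. They do not: asyncio.run() cancels any unfinished tasks and closes the event loop before it returns, so both tasks are gone before time.sleep(3) even starts.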
import asyncio
import time


async def fetch_data(data_id: int) -> None:
    print(f'Fetching data for ID {data_id}')
    # Simulates a server response that takes data_id seconds
    # (1 s, 2 s and 3 s), matching the timings in the text and the output.
    await asyncio.sleep(data_id)
    print(f'Finished fetching data for ID {data_id}')


async def main() -> None:
    task1 = asyncio.create_task(fetch_data(1))
    task2 = asyncio.create_task(fetch_data(2))
    task3 = asyncio.create_task(fetch_data(3))
    # task1 finishes after 1 s and main() returns with it. The event loop
    # then stops, and the unfinished task2 and task3 are cancelled.
    await task1
    # await task2
    # await task3


if __name__ == '__main__':
    start_time = time.time()
    asyncio.run(main())
    time.sleep(3)  # runs after the event loop has already been closed
    end_time = time.time()
    print('total cost: %f' % (end_time - start_time))

'''Output
Fetching data for ID 1
Fetching data for ID 2
Fetching data for ID 3
Finished fetching data for ID 1
total cost: 4.006973
'''
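The fate of the unfinished tasks can be made visible by catching asyncio.CancelledError inside them. The sketch below is a shortened variant of the example above, with hypothetical names (job, log) and much shorter delays so that it finishes quickly.

```python
import asyncio

log = []  # what each job observed


async def job(name: str, delay: float) -> None:
    try:
        await asyncio.sleep(delay)
        log.append(f"{name} finished")
    except asyncio.CancelledError:
        log.append(f"{name} cancelled")
        raise  # re-raise so the task is still marked as cancelled


async def main() -> None:
    fast = asyncio.create_task(job("fast", 0.01))
    slow = asyncio.create_task(job("slow", 1.0))
    await fast  # main() returns here while `slow` is still pending


asyncio.run(main())  # cancels `slow` before closing the loop
print(log)
```

If main() also awaited slow, the second entry in log would read "slow finished" instead.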
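3 Other topics
3.1 Event loop interfaces
- asyncio.get_running_loop(): returns the event loop running in the current OS thread; raises RuntimeError if no event loop is running.
- asyncio.get_event_loop(): when called from a coroutine or a callback, returns the running event loop. If no event loop is running, it returns the result of get_event_loop_policy().get_event_loop(). Inside coroutines and callbacks prefer asyncio.get_running_loop(), because the behavior of asyncio.get_event_loop() is fairly complex, especially when a custom event loop policy is in use.
- asyncio.set_event_loop(loop): sets loop as the current event loop for the current OS thread.
- asyncio.new_event_loop(): creates and returns a new event loop object.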
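The interfaces listed above can be exercised together in a short sketch. It manages a loop by hand instead of using asyncio.run(), purely to show the calls; current_loop is a hypothetical helper, and the snippet assumes it is run from ordinary synchronous code with no loop already running.

```python
import asyncio


async def current_loop() -> asyncio.AbstractEventLoop:
    # Inside a coroutine a loop is running, so this call succeeds.
    return asyncio.get_running_loop()


# Outside any coroutine there is no running loop in this thread.
try:
    asyncio.get_running_loop()
    raised = False
except RuntimeError:
    raised = True

# Manage a loop by hand: create it, install it, use it, then close it.
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
    seen = loop.run_until_complete(current_loop())
finally:
    asyncio.set_event_loop(None)
    loop.close()

same_loop = seen is loop
print(raised, same_loop, loop.is_closed())
```

In everyday code asyncio.run() handles this setup and teardown itself; the manual form is mostly useful for seeing what each call does.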
3.2 Running work in an executor (thread or process pool) from the event loop
import asyncio
import concurrent.futures


def blocking_io():
    # File operations (such as logging) can block the
    # event loop: run them in a thread pool.
    with open('/dev/urandom', 'rb') as f:
        return f.read(100)


def cpu_bound():
    # CPU-bound operations will block the event loop:
    # in general it is preferable to run them in a
    # process pool.
    return sum(i * i for i in range(10 ** 7))


async def main():
    loop = asyncio.get_running_loop()

    ## Options:

    # 1. Run in the default loop's executor:
    result = await loop.run_in_executor(
        None, blocking_io)
    print('default thread pool', result)

    # 2. Run in a custom thread pool:
    with concurrent.futures.ThreadPoolExecutor() as pool:
        result = await loop.run_in_executor(
            pool, blocking_io)
        print('custom thread pool', result)

    # 3. Run in a custom process pool:
    with concurrent.futures.ProcessPoolExecutor() as pool:
        result = await loop.run_in_executor(
            pool, cpu_bound)
        print('custom process pool', result)


if __name__ == '__main__':
    asyncio.run(main())
4 References
https://medium.com/python-features/understanding-coroutines-tasks-in-depth-in-python-af2a4c0e1073
