Lesson 33: Python | Concurrent Programming Basics [Thread Creation, Lifecycle, and Thread Safety Explained]

Original post, published 2026-06-24 21:48:48 · 193 views

This content is licensed under CC 4.0 BY-SA.

Tags: #python #AI

Part of the column "Python from Beginner to Mastery in 50 Lessons"

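Table of Contents

📖 Introduction
🎯 Learning Objectives
📚 Theory in Depth
💻 Hands-On Code Cases
⚠️ Common Pitfalls
📝 Practice Exercises
🧠 Mind Map Summary
🔜 Next Lesson Preview
- Lesson 34: Multiprocessing, Process Pools, and Thread Pools: Principles and Production Usage
🔗 "Python from Beginner to Mastery in 50 Lessons" Course Navigation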

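📖 Introduction

In earlier lessons, all of our programs ran sequentially on a single thread: from the first line to the last, one statement at a time. This is simple and easy to follow, but it limits throughput. When you need to download several files at once, serve multiple user requests, or play an animation while accepting user input, a single thread is not enough.

Concurrent programming is the set of techniques that lets a program make progress on several tasks "at the same time". In Python, the main options are multithreading (threading), multiprocessing (multiprocessing), and asynchronous I/O (asyncio). This lesson starts with multithreading: creating threads, managing their lifecycle, synchronizing between threads, and how Python's **Global Interpreter Lock (GIL)** affects multithreaded code.

💡 Where this shows up at work:

Crawlers: download many pages at once to speed up scraping considerably.
GUI programs: the main thread refreshes the interface while worker threads run slow tasks, so the UI does not freeze.
Web servers: each user request is handled by its own thread (or coroutine).
Background jobs: periodic cache cleanup, sending email, and similar tasks that run alongside the main business logic.

This lesson covers:

What a thread is and how it differs from a process
Several ways to create threads with the threading module
The thread lifecycle (start, blocked, finished, daemon threads)
Thread-safety problems and synchronization primitives (Lock, RLock, Condition, Semaphore)
How the Python GIL affects multithreading, and where threads are the right tool

After this lesson you will be able to write multithreaded programs, handle common concurrency problems, and understand where Python multithreading stops being useful.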

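🎯 Learning Objectives

No.	What you will learn	Interview / on-the-job value
1️⃣	Understand the difference between processes and threads, and when multithreading fits	Interview fundamentals
2️⃣	Create and start threads with `threading.Thread`	Writing simple multithreaded programs
3️⃣	Understand the thread lifecycle (`start`, `join`, daemon threads via `daemon`)	Controlling how threads execute
4️⃣	Use the synchronization primitives `Lock`, `RLock`, `Condition`, `Semaphore`	Avoiding resource contention and corrupted data
5️⃣	Understand what thread safety means and recognize common unsafe patterns	Writing reliable concurrent code
6️⃣	Know how the GIL affects CPU-bound versus I/O-bound tasks	Choosing a suitable concurrency model

🔥 Interview topics: "What is the difference between a process and a thread?" "Why can't Python threads make full use of multiple cores?" "What is the difference between Lock and RLock?" "What does join do?" "What is a daemon thread?"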

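📚 Theory in Depth

I. Processes and Threads

1.1 Process

A process is the smallest unit of resource allocation. Each process has its own memory space, file handles, and other resources. Processes are isolated from one another, so communication between them needs dedicated mechanisms such as pipes or queues.

1.2 Thread

A thread is the smallest unit of execution. One process can contain many threads. Threads share the memory of their process (global variables, the heap), but each thread has its own stack and register state.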
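
To make the "shared heap, private stack" point concrete, here is a minimal sketch. The names `shared_log` and `record` are made up for this illustration. Each thread computes into its own local variable and then appends the result to one list that all threads can see.

```python
import threading

shared_log = []              # one list object on the heap, visible to every thread

def record(tag):
    local_total = 0          # local variable: each thread has its own copy on its own stack
    for i in range(3):
        local_total += i
    shared_log.append((tag, local_total))   # list.append is thread-safe in CPython

threads = [threading.Thread(target=record, args=(f"t{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(shared_log))    # [('t0', 3), ('t1', 3), ('t2', 3)]
```

The order in which the three entries land in `shared_log` depends on scheduling, which is why the sketch sorts the list before printing it.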

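1.3 Benefits and Challenges of Multithreading

Benefits:

Shared memory makes communication between threads easy.
Creating and switching threads costs less than doing the same with processes.
A good fit for I/O-bound work (network requests, file reads and writes).

Challenges:

Thread safety: several threads modifying shared data at the same time can corrupt it.
Race conditions, deadlocks, and related problems.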

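II. Python Multithreading Basics: the `threading` Module

2.1 Two Ways to Create a Thread

Option 1: use the `Thread` class directly and pass in a target function

import threading
import time

def worker(name, delay):
    print(f"Thread {name} started")
    time.sleep(delay)
    print(f"Thread {name} finished")

t = threading.Thread(target=worker, args=("A", 2))
t.start()

Option 2: subclass `Thread` and override `run()`

import threading
import time

class MyThread(threading.Thread):
    def __init__(self, name, delay):
        super().__init__()
        self.name = name      # Thread.name is a settable property
        self.delay = delay
    
    def run(self):
        # run() holds the thread's work; call start(), never run() directly
        print(f"Thread {self.name} started")
        time.sleep(self.delay)
        print(f"Thread {self.name} finished")

t = MyThread("B", 2)
t.start()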

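2.2 The Thread Lifecycle

Created: `t = threading.Thread(target=func)`. The thread object exists but has not started running.
Ready / running: after `t.start()` the thread becomes eligible for scheduling and runs.
Blocked: the thread is waiting on a lock, on I/O, on `time.sleep`, and so on.
Finished: `run()` has returned and the thread has terminated. A finished thread cannot be started again.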
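
The stages above can be observed with `Thread.is_alive()`. This is a small sketch in which `task` and `states` are names chosen for the illustration, and the 0.2 second sleep is only there to keep the thread alive long enough to be observed.

```python
import threading
import time

def task():
    time.sleep(0.2)               # the thread spends this time blocked

t = threading.Thread(target=task)
states = []
states.append(t.is_alive())       # created but not started
t.start()
states.append(t.is_alive())       # started and still sleeping
t.join()
states.append(t.is_alive())       # run() has returned
print(states)                     # [False, True, False]
```

Note that `is_alive()` only distinguishes "running or blocked" from "not started or finished"; the threading module does not expose a finer-grained state.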

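2.3 The `join()` Method

`join()` makes the calling thread (here, the main thread) wait until the target thread has finished.

t = threading.Thread(target=worker, args=("A", 2))
t.start()
t.join()  # the main thread waits here until t finishes
print("Main thread continues")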
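
`join()` also accepts an optional `timeout` in seconds. It always returns `None`, so the usual way to find out whether the wait timed out is to check `is_alive()` afterwards. The sketch below uses a hypothetical `slow_task` to show this.

```python
import threading
import time

def slow_task():
    time.sleep(0.5)

t = threading.Thread(target=slow_task)
t.start()

t.join(timeout=0.1)           # stop waiting after 0.1 s; the thread itself keeps running
timed_out = t.is_alive()      # True means the timeout expired before the thread finished

t.join()                      # now wait without a limit
finished = not t.is_alive()
print(timed_out, finished)    # True True
```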

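2.4 Daemon Threads

A daemon thread is terminated automatically when the program exits, that is, once the main thread and every other non-daemon thread have finished, so you do not need to wait for it explicitly. Daemon threads are commonly used for background work such as heartbeats and monitoring.

t = threading.Thread(target=worker, args=("Daemon", 10))
t.daemon = True   # must be set before start()
t.start()
# When the main thread exits, the daemon thread is stopped abruptly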

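III. Thread Safety and Synchronization

When several threads access a shared resource (a global variable, a file) at the same time, a race condition can occur and leave the data in an inconsistent state.

3.1 Lock

A lock is the most basic synchronization primitive. It guarantees that only one thread at a time can execute the code inside the locked region.

import threading

counter = 0
lock = threading.Lock()

def increment():
    global counter
    for _ in range(100000):
        lock.acquire()
        counter += 1
        lock.release()

A context manager does the same thing and releases the lock even if an exception is raised:

with lock:
    counter += 1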
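
One reason to prefer the `with lock:` form is that the lock is released even when the protected code raises. The sketch below checks this with `Lock.locked()`; the function `risky_update` is invented for the demonstration.

```python
import threading

lock = threading.Lock()

def risky_update():
    with lock:
        # Simulate a failure in the middle of the critical section
        raise ValueError("something went wrong inside the critical section")

try:
    risky_update()
except ValueError:
    pass

released = not lock.locked()   # the with-statement released the lock despite the exception
print(released)                # True
```

With a bare `acquire()` / `release()` pair, the same exception would skip `release()` unless the code wraps it in `try` / `finally`.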

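3.2 Reentrant Lock (RLock)

An `RLock` can be acquired several times by the same thread, and must be released the same number of times. It suits recursive functions and other code that needs to take the same lock more than once.

rlock = threading.RLock()

def recursive_func(n):
    with rlock:   # the same thread re-acquires the lock on every recursive call
        if n > 0:
            recursive_func(n-1)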
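
The difference between `Lock` and `RLock` can be checked safely with a non-blocking acquire, which returns `False` instead of hanging. This is a minimal sketch; the variable names are chosen for the illustration.

```python
import threading

lock = threading.Lock()
rlock = threading.RLock()

with lock:
    # A blocking second acquire from the same thread would hang forever,
    # so probe without blocking instead.
    lock_reentered = lock.acquire(blocking=False)

with rlock:
    rlock_reentered = rlock.acquire(blocking=False)
    if rlock_reentered:
        rlock.release()       # every successful acquire needs a matching release

print(lock_reentered, rlock_reentered)   # False True
```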

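3.3 Condition Variable (Condition)

A `Condition` handles more complex coordination between threads, such as the producer-consumer pattern: one thread waits until some condition holds, and another thread makes the condition true and then notifies it.

import threading

items = []
cv = threading.Condition()

def consume():
    # Consumer: wait until at least one item is available
    with cv:
        while not items:
            cv.wait()   # releases the lock while waiting, re-acquires it on wake-up
        return items.pop()

def produce(item):
    # Producer: add an item and wake one waiting thread
    with cv:
        items.append(item)
        cv.notify()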

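3.4 Semaphore

A `Semaphore` lets at most n threads use a resource at the same time. It is often used to cap concurrency, for example in a database connection pool.

sem = threading.Semaphore(3)

def limited_access():
    with sem:
        # at most 3 threads run this section at the same time
        pass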
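
To see the limit in action, the sketch below counts how many threads are inside the guarded section at once. The counters `active` and `peak` and the helper lock `state_lock` are additions for this illustration, not part of the original snippet.

```python
import threading
import time

sem = threading.Semaphore(3)
state_lock = threading.Lock()   # protects the two counters below
active = 0                      # threads currently inside the guarded section
peak = 0                        # highest value of `active` seen so far

def limited_access():
    global active, peak
    with sem:
        with state_lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.05)        # pretend to use the limited resource
        with state_lock:
            active -= 1

threads = [threading.Thread(target=limited_access) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"peak concurrency: {peak}")   # never more than 3
```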

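3.5 Event

An `Event` is a simple notification flag shared between threads. One thread waits for the event, and another thread sets it.

event = threading.Event()

def waiter():
    print("Waiting for the event")
    event.wait()          # blocks until another thread calls event.set()
    print("Event received")

def setter():
    time.sleep(2)
    event.set()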
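
A runnable version of the two functions above might look like the following sketch. It shortens the delay, records messages in a list named `log` (an assumption made so the result is easy to inspect), and passes a `timeout` to `wait()`, which returns `True` only if the event was set in time.

```python
import threading
import time

event = threading.Event()
log = []

def waiter():
    log.append("waiting")
    signalled = event.wait(timeout=2)   # True if the event was set before the timeout
    log.append("event received" if signalled else "timed out")

def setter():
    time.sleep(0.1)
    event.set()

threads = [threading.Thread(target=waiter), threading.Thread(target=setter)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(log)   # ['waiting', 'event received']
```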

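IV. Python's GIL (Global Interpreter Lock)

4.1 What Is the GIL?

The CPython interpreter has a global lock, and a thread must hold the GIL before it can execute Python bytecode. As a result, only one thread executes Python code at any given moment, even on a multi-core CPU.

4.2 What the GIL Means in Practice

CPU-bound tasks: threads cannot use multiple cores, and the overhead of switching between threads can make them slower than a single thread. Use multiprocessing (multiprocessing) instead. Asynchronous I/O does not help here either, because an asyncio event loop also runs on a single thread.
I/O-bound tasks: a thread releases the GIL while it waits for I/O, so multithreading can improve performance significantly. This suits web crawlers, file reads and writes, database operations, and similar work.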
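
The I/O-bound case can be sketched without any real network access by letting `time.sleep` stand in for the wait, since sleeping releases the GIL just as blocking I/O does. The function `fake_io` and the 0.2 second delay are assumptions made for this demonstration.

```python
import threading
import time

def fake_io(seconds):
    time.sleep(seconds)       # stands in for a network or disk wait; releases the GIL

# Four waits, one after another
start = time.perf_counter()
for _ in range(4):
    fake_io(0.2)
sequential = time.perf_counter() - start

# The same four waits, overlapped in threads
start = time.perf_counter()
threads = [threading.Thread(target=fake_io, args=(0.2,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f"sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```

The sequential run takes roughly the sum of the waits, while the threaded run takes roughly as long as the longest single wait.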

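4.3 Measuring the Effect of the GIL

import threading, time

def count(n):
    while n > 0:
        n -= 1

# Single thread: time the whole countdown
start = time.time()
count(100000000)
print(f"Single thread: {time.time()-start:.2f}s")

# Two threads, each doing half of the work
t1 = threading.Thread(target=count, args=(50000000,))
t2 = threading.Thread(target=count, args=(50000000,))
start = time.time()
t1.start(); t2.start()
t1.join(); t2.join()
print(f"Two threads: {time.time()-start:.2f}s")

In most runs the two-thread version is no faster than the single-thread version, and it can be slower because the threads contend for the GIL.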

💻 Hands-On Code Cases

Case 1: Basic Multithreading (Simulated Concurrent Downloads)

"""
thread_basic.py
Create threads to simulate concurrent downloads
"""

import threading
import time

def download_file(file_name, duration):
    print(f"Starting download of {file_name}, estimated {duration} s")
    time.sleep(duration)
    print(f"{file_name} downloaded")

files = [("a.pdf", 2), ("b.mp4", 3), ("c.jpg", 1)]

threads = []
for name, dur in files:
    t = threading.Thread(target=download_file, args=(name, dur))
    threads.append(t)
    t.start()

# Wait for every thread to finish
for t in threads:
    t.join()

print("All downloads finished")

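Case 2: Daemon Thread (Background Monitoring)

"""
daemon_thread.py
Daemon thread demo: the background thread stops automatically when the main thread ends
"""

import threading
import time

def background_task():
    while True:
        print("Background monitor running...")
        time.sleep(1)

t = threading.Thread(target=background_task)
t.daemon = True   # mark as a daemon thread
t.start()

print("Main thread running, exiting in 5 seconds")
time.sleep(5)
print("Main thread done, the daemon thread will stop automatically")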

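Case 3: Race Conditions and Locks

"""
race_condition.py
Shows the race condition caused by unlocked multithreaded updates, and the fix using a lock
"""

import threading

counter = 0
lock = threading.Lock()

def increment_unsafe():
    global counter
    for _ in range(100000):
        counter += 1      # read-modify-write, not atomic

def increment_safe():
    global counter
    for _ in range(100000):
        with lock:
            counter += 1

# Version without a lock
# The result may be lower than 500000. On some CPython versions the race
# rarely shows up in this loop, but the code is still unsafe.
counter = 0
threads = [threading.Thread(target=increment_unsafe) for _ in range(5)]
for t in threads: t.start()
for t in threads: t.join()
print(f"Without lock: {counter} (expected 500000)")

# Version with a lock
counter = 0
threads = [threading.Thread(target=increment_safe) for _ in range(5)]
for t in threads: t.start()
for t in threads: t.join()
print(f"With lock: {counter} (expected 500000)")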

Case 4: Producer-Consumer Model (Using Condition)

"""
producer_consumer.py
Producer-consumer pattern implemented with Condition
"""

import threading
import time
import random

class Queue:
    def __init__(self, maxsize=5):
        self.items = []
        self.maxsize = maxsize
        self.cond = threading.Condition()
    
    def put(self, item):
        with self.cond:
            while len(self.items) >= self.maxsize:
                print("Queue full, producer waiting")
                self.cond.wait()
            self.items.append(item)
            print(f"Produced: {item}, queue length: {len(self.items)}")
            # Producers and consumers share one condition, so wake all waiters
            self.cond.notify_all()
    
    def get(self):
        with self.cond:
            while not self.items:
                print("Queue empty, consumer waiting")
                self.cond.wait()
            item = self.items.pop(0)
            print(f"Consumed: {item}, remaining: {len(self.items)}")
            self.cond.notify_all()
            return item

def producer(q, worker_id):
    for i in range(5):
        item = f"Producer{worker_id}-{i}"
        q.put(item)
        time.sleep(random.uniform(0.1, 0.5))

def consumer(q, worker_id):
    # 2 producers x 5 items = 10 items, so each of the 2 consumers takes 5;
    # asking for more would block forever once production stops
    for _ in range(5):
        item = q.get()
        time.sleep(random.uniform(0.2, 0.6))

q = Queue(3)
producers = [threading.Thread(target=producer, args=(q, i)) for i in range(2)]
consumers = [threading.Thread(target=consumer, args=(q, i)) for i in range(2)]

for t in producers + consumers:
    t.start()
for t in producers + consumers:
    t.join()
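
For comparison, the standard library's `queue.Queue` already provides a bounded, thread-safe queue with blocking `put()` and `get()`. The sketch below is one possible way to write the same pattern with it, not a rewrite of the case above. The `None` sentinel used to stop consumers and the `consumed` list are conventions assumed for this example.

```python
import queue
import threading

q = queue.Queue(maxsize=3)
consumed = []
consumed_lock = threading.Lock()

def producer(worker_id):
    for i in range(5):
        q.put(f"P{worker_id}-{i}")   # blocks while the queue is full

def consumer():
    while True:
        item = q.get()               # blocks while the queue is empty
        if item is None:             # sentinel: no more work for this consumer
            break
        with consumed_lock:
            consumed.append(item)

producers = [threading.Thread(target=producer, args=(i,)) for i in range(2)]
consumers = [threading.Thread(target=consumer) for _ in range(2)]
for t in producers + consumers:
    t.start()
for t in producers:
    t.join()
for _ in consumers:
    q.put(None)                      # one sentinel per consumer, after all real items
for t in consumers:
    t.join()

print(len(consumed))                 # 10
```

Because consumers stop on a sentinel rather than after a fixed number of items, they do not need to know in advance how many items will be produced.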

Case 5: Limiting Concurrency with a Semaphore

"""
semaphore_demo.py
Limit how many threads can use a resource at the same time
"""

import threading
import time

sem = threading.Semaphore(3)

def access_resource(thread_id):
    with sem:
        print(f"Thread {thread_id} acquired the resource")
        time.sleep(2)
        print(f"Thread {thread_id} released the resource")

threads = [threading.Thread(target=access_resource, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

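Case 6: Timer Thread (Timer)

"""
timer_demo.py
Use Timer to run a function after a given delay
"""

import threading
import time

def delayed_greeting(name):
    print(f"Hello, {name}")

print("Main thread: starting timer, it fires in 3 seconds")
timer = threading.Timer(3, delayed_greeting, args=("Zhang San",))
timer.start()

print("Main thread: waiting for the timer")
timer.join()  # optional: wait for the timer thread to finish
print("Main thread done")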
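
A `Timer` that has not fired yet can be stopped with `cancel()`. The sketch below uses a hypothetical callback `on_timeout` and a list `fired` to confirm that the callback never runs.

```python
import threading

fired = []

def on_timeout():
    fired.append("fired")

timer = threading.Timer(5, on_timeout)
timer.start()
timer.cancel()     # only has an effect while the timer is still waiting
timer.join()       # returns promptly because the wait was cancelled
print(fired)       # []
```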

Case 7: Thread-Local Data (threading.local)

"""
thread_local.py
Use threading.local to give each thread its own independent data
"""

import threading
import time

local_data = threading.local()

def worker(name):
    local_data.value = name      # each thread sets its own copy of .value
    time.sleep(0.1)
    print(f"Value in thread {name}: {local_data.value}")

t1 = threading.Thread(target=worker, args=("A",))
t2 = threading.Thread(target=worker, args=("B",))
t1.start(); t2.start()
t1.join(); t2.join()
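
The isolation also holds between the main thread and a worker: an attribute set in one thread does not exist in the other. A small sketch, with `seen_in_worker` introduced only to carry the observation back to the main thread:

```python
import threading

local_data = threading.local()
local_data.value = "main"        # set in the main thread only
seen_in_worker = []

def worker():
    # The main thread's attribute is not visible in this thread
    seen_in_worker.append(hasattr(local_data, "value"))
    local_data.value = "worker"  # does not affect the main thread's copy

t = threading.Thread(target=worker)
t.start()
t.join()

print(seen_in_worker, local_data.value)   # [False] main
```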

Case 8: Measuring the GIL's Effect on CPU-Bound Work

"""
gil_impact.py
Shows how little Python threads help with CPU-bound work
"""

import threading
import time

def countdown(n):
    while n > 0:
        n -= 1

def run_single():
    start = time.time()
    countdown(100000000)
    print(f"Single thread: {time.time() - start:.2f}s")

def run_multi():
    start = time.time()
    t1 = threading.Thread(target=countdown, args=(50000000,))
    t2 = threading.Thread(target=countdown, args=(50000000,))
    t1.start(); t2.start()
    t1.join(); t2.join()
    print(f"Two threads: {time.time() - start:.2f}s")

if __name__ == "__main__":
    run_single()
    run_multi()
    # The threaded run is usually no faster, and often slower, because the threads contend for the GIL

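⚠️ Common Pitfalls

No.	Pitfall	Consequence	Fix
1	Forgetting `join()`, so the main thread moves on before child threads finish (and daemon threads are killed when the program exits)	Results are read before the work is done, or daemon work is cut short	Call `join()` wherever you need to wait
2	Doing I/O or other slow work while holding a lock	Lower concurrency	Keep the locked region as small as possible
3	Calling `acquire()` and forgetting `release()`	Other threads block forever (deadlock)	Use the `with lock:` statement
4	Acquiring several locks in different orders in different threads	Deadlock, the program hangs permanently	Make every thread acquire locks in the same order
5	Assuming `Lock` is reentrant	A second `acquire` from the same thread deadlocks	Use `RLock` when reentrancy is needed
6	Using `time.sleep` in place of a synchronization mechanism	Inefficient and unreliable	Use a proper synchronization primitive
7	Ignoring the GIL and expecting threads to speed up CPU-bound work on multiple cores	Performance actually drops	Use multiprocessing for CPU-bound work
8	Sharing mutable objects (lists, dicts) without a lock	Corrupted data	Lock access to the object or use a thread-safe data structure
9	Accessing main-thread resources from a daemon thread	The daemon thread may touch resources already released when the main thread shuts down	Avoid making daemon threads depend on main-thread resources
10	Polling for inter-thread communication instead of using notifications	High CPU usage	Use `Condition`, `Event`, and similar primitives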
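
Pitfall 4 (inconsistent lock ordering) is easiest to understand with an example. The following sketch assumes a hypothetical `Account` class with one lock per account. Two threads transfer money in opposite directions, and a deadlock is avoided because `transfer` always locks the account with the smaller id first.

```python
import threading

class Account:
    def __init__(self, account_id, balance):
        self.account_id = account_id
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src, dst, amount):
    # Always lock the account with the smaller id first, whatever the direction
    first, second = sorted((src, dst), key=lambda acc: acc.account_id)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount

def many_transfers(src, dst):
    for _ in range(1000):
        transfer(src, dst, 1)

a = Account(1, 1000)
b = Account(2, 1000)
t1 = threading.Thread(target=many_transfers, args=(a, b))
t2 = threading.Thread(target=many_transfers, args=(b, a))   # opposite direction
t1.start(); t2.start()
t1.join(); t2.join()

print(a.balance, b.balance)   # 1000 1000
```

If `transfer` instead locked `src` first and `dst` second, the two threads could each hold one lock while waiting for the other, and the program could hang.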