20 Tips for Python Performance Optimization

This content is licensed under CC 4.0 BY-SA.

The latter is actually slightly faster, but if the loop contains a break, the benefit of a generator is obvious. yield is also used to create a generator:

def yield_func(ls):
    for i in ls:
        yield i+1
def not_yield_func(ls):
    return [i+1 for i in ls]
ls = range(1000000)
%timeit -n 10 for i in yield_func(ls):pass
%timeit -n 10 for i in not_yield_func(ls):pass
10 loops, best of 3: 63.8 ms per loop
10 loops, best of 3: 62.9 ms per loop

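For a list that does not take up much memory, you can simply return a list, although yield is more readable (personal preference).

In Python 2.x, built-in facilities that behave like generators include the xrange function and the itertools package.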
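
The benefit of a generator when the loop contains a break can be made visible by counting how many items are actually processed. The following is a minimal sketch written for Python 3; the helper names `plus_one_gen` and `plus_one_list` and the `log` parameter are made up for illustration and are not part of the original benchmark.

```python
def plus_one_gen(items, log):
    """Yield item + 1 lazily, recording each item that is actually processed."""
    for item in items:
        log.append(item)
        yield item + 1


def plus_one_list(items, log):
    """Build the whole result list up front, recording each processed item."""
    result = []
    for item in items:
        log.append(item)
        result.append(item + 1)
    return result


gen_log, list_log = [], []

for value in plus_one_gen(range(1000), gen_log):
    if value > 3:  # stop as soon as a result exceeds 3
        break

for value in plus_one_list(range(1000), list_log):
    if value > 3:  # same condition, but the list was already fully built
        break

print(len(gen_log), len(list_log))  # prints: 4 1000
```

The generator version touches only the four items needed to reach the break, while the list version processes all 1000 before the loop even starts.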

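Optimize loops

Do not do inside a loop what can be done outside it. For example, the optimization below makes the loop roughly twice as fast: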

a = range(10000)
size_a = len(a)
%timeit -n 1000 for i in a: k = len(a)
%timeit -n 1000 for i in a: k = size_a
1000 loops, best of 3: 569 µs per loop
1000 loops, best of 3: 256 µs per loop
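
The same idea applies to any value that does not change between iterations. Here is a small sketch for Python 3, in which the function names `norms_slow` and `norms_fast` and the sample data are hypothetical: the only difference between the two functions is where the loop-invariant factor is computed.

```python
import math


def norms_slow(points, scale):
    """Scale each vector length; the factor is recomputed on every iteration."""
    out = []
    for x, y in points:
        factor = math.sqrt(scale)  # loop-invariant work done inside the loop
        out.append(factor * math.hypot(x, y))
    return out


def norms_fast(points, scale):
    """Same result; the loop-invariant factor is computed once, before the loop."""
    factor = math.sqrt(scale)
    out = []
    for x, y in points:
        out.append(factor * math.hypot(x, y))
    return out


points = [(3, 4), (6, 8)]
print(norms_slow(points, 4))  # prints: [10.0, 20.0]
print(norms_fast(points, 4))  # prints: [10.0, 20.0]
```

Both functions return identical results, so the hoisted version can replace the original without changing behavior.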

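Optimize the order of multiple conditional expressions

For and, put the condition that is satisfied least often first; for or, put the condition that is satisfied most often first. For example: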

a = range(2000)
%timeit -n 100 [i for i in a if 10 < i < 20 or 1000 < i < 2000]
%timeit -n 100 [i for i in a if 1000 < i < 2000 or 10 < i < 20]
%timeit -n 100 [i for i in a if i % 2 == 0 and i > 1900]
%timeit -n 100 [i for i in a if i > 1900 and i % 2 == 0]
100 loops, best of 3: 287 µs per loop
100 loops, best of 3: 214 µs per loop
100 loops, best of 3: 128 µs per loop
100 loops, best of 3: 56.1 µs per loop
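
The reason the ordering matters is short-circuit evaluation: with and, the second condition is only checked when the first one is true. The sketch below (Python 3) counts predicate calls instead of measuring time; `count_checks` and the two lambdas are illustrative names, not part of the original benchmark.

```python
def count_checks(values, first, second):
    """Return (total predicate calls, matches) for `first(v) and second(v)`."""
    calls = 0

    def counted(pred):
        # Wrap a predicate so that every call increments the shared counter.
        def wrapper(v):
            nonlocal calls
            calls += 1
            return pred(v)
        return wrapper

    f, s = counted(first), counted(second)
    matches = [v for v in values if f(v) and s(v)]
    return calls, matches


is_even = lambda i: i % 2 == 0   # true for half of the values
is_large = lambda i: i > 1900    # true for only 99 of the values

calls_even_first, m1 = count_checks(range(2000), is_even, is_large)
calls_large_first, m2 = count_checks(range(2000), is_large, is_even)

print(calls_even_first, calls_large_first)  # prints: 3000 2099
```

Putting the rarely satisfied condition first cuts the number of checks from 3000 to 2099 while selecting the same 49 values.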

Use join to concatenate strings from an iterator

In [1]: %%timeit
   ...: s = ''
   ...: for i in a:
   ...:     s += i
   ...:
10000 loops, best of 3: 59.8 µs per loop

In [2]: %%timeit
   ...: s = ''.join(a)
   ...:
100000 loops, best of 3: 11.8 µs per loop

Compared with repeated concatenation, join is about 5 times faster.
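
For reference, the two approaches can be written as plain functions. This is only a sketch for Python 3; the names `concat_loop` and `concat_join` and the sample list `parts` are assumptions made for the example.

```python
def concat_loop(parts):
    """Build the string by repeated +=, creating intermediate strings."""
    s = ''
    for p in parts:
        s += p
    return s


def concat_join(parts):
    """Build the string in one pass with str.join."""
    return ''.join(parts)


parts = [str(i) for i in range(5)]
print(concat_loop(parts))  # prints: 01234
print(concat_join(parts))  # prints: 01234

# join only accepts strings, so convert other items first.
print(''.join(map(str, range(5))))  # prints: 01234
```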

Choose the right string formatting method

s1, s2 = 'ax', 'bx'
%timeit -n 100000 'abc%s%s' % (s1, s2)
%timeit -n 100000 'abc{0}{1}'.format(s1, s2)
%timeit -n 100000 'abc' + s1 + s2
100000 loops, best of 3: 183 ns per loop
100000 loops, best of 3: 169 ns per loop
100000 loops, best of 3: 103 ns per loop

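Of the three, % is the slowest, but the differences are small (all three are very fast). (Personally, I find % the most readable.)

Swap two variables without a temporary variable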

In [3]: %%timeit -n 10000
   ...: a,b=1,2
   ...: c=a;a=b;b=c;
   ...:
10000 loops, best of 3: 172 ns per loop

In [4]: %%timeit -n 10000
   ...: a,b=1,2
   ...: a,b=b,a
   ...:
10000 loops, best of 3: 86 ns per loop

Swapping a and b with a,b=b,a instead of c=a;a=b;b=c; is about twice as fast.

Use if is

a = range(10000)
%timeit -n 100 [i for i in a if i == True]
%timeit -n 100 [i for i in a if i is True]
100 loops, best of 3: 531 µs per loop
100 loops, best of 3: 362 µs per loop

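In this measurement, if is True is about 1.5 times as fast as if == True (362 µs versus 531 µs). Note that the two tests are not equivalent: 1 == True is true, whereas 1 is True is false.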
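
Because is compares identity and == compares value, the two filters can select different items. A short sketch for Python 3, using a made-up sample list `values`:

```python
values = [0, 1, 2, True, 1.0, False]

# == compares by value, so 1 and 1.0 also match True.
eq_true = [v for v in values if v == True]

# is compares identity, so only the True singleton itself matches.
is_true = [v for v in values if v is True]

# Plain truthiness is usually what an if statement needs.
truthy = [v for v in values if v]

print(len(eq_true), len(is_true), len(truthy))  # prints: 3 1 4
```

Which of the three is appropriate depends on whether the code means "equal to True", "exactly the True object", or simply "truthy".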

Use chained comparisons x < y < z

x, y, z = 1,2,3
%timeit -n 1000000 if x < y < z:pass
%timeit -n 1000000 if x < y and y < z:pass
1000000 loops, best of 3: 101 ns per loop
1000000 loops, best of 3: 121 ns per loop

x < y < z is slightly more efficient and also more readable.
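
One difference between the two spellings is that a chained comparison evaluates the middle operand only once. The sketch below (Python 3) makes this visible with a hypothetical function `middle` that records each call.

```python
calls = []


def middle():
    """Return the middle operand and record that it was evaluated."""
    calls.append(1)
    return 2


x, z = 1, 3

chained = x < middle() < z            # middle() is evaluated once
n_chained = len(calls)

calls.clear()
spelled = x < middle() and middle() < z  # middle() is evaluated twice
n_spelled = len(calls)

print(chained, n_chained)  # prints: True 1
print(spelled, n_spelled)  # prints: True 2
```

With plain variables as operands this saving is small, but it grows when the middle operand is an expensive expression.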

while 1 is faster than while True

def while_1():
    n = 100000
    while 1:
        n -= 1
        if n <= 0: break
def while_true():
    n = 100000
    while True:
        n -= 1
        if n <= 0: break
m, n = 1000000, 1000000
%timeit -n 100 while_1()
%timeit -n 100 while_true()
100 loops, best of 3: 3.69 ms per loop
100 loops, best of 3: 5.61 ms per loop

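while 1 is much faster than while True because in Python 2.x True is a global variable rather than a keyword, so it has to be looked up on every iteration. In Python 3, True is a keyword, so this gap should not appear there.

Use ** instead of pow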

%timeit -n 10000 c = pow(2,20)
%timeit -n 10000 c = 2**20
10000 loops, best of 3: 284 ns per loop
10000 loops, best of 3: 16.9 ns per loop

** is more than 10 times faster!
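
Part of this gap may come from the compiler rather than from the operator itself: CPython can evaluate a constant expression such as 2**20 at compile time, whereas pow(2,20) is an ordinary function call made at run time. The sketch below (Python 3) inspects the compiled code object to check this. Constant folding is an implementation detail of CPython, so the `folded` flag is an observation about the interpreter in use, not a guarantee.

```python
# Compile the expression without running it, then look at its constants.
code = compile("2 ** 20", "<sketch>", "eval")

# If the compiler folded the expression, the result is already stored
# among the code object's constants.
folded = 1048576 in code.co_consts

value = eval(code)
print(value)   # prints: 1048576
print(folded)  # whether this interpreter folded the constant expression

# Both spellings compute the same number.
assert value == pow(2, 20)
```

When the operands are variables rather than literals, no folding is possible, so it may be worth re-measuring with variables before drawing conclusions.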

Use packages implemented in C, such as cProfile, cStringIO and cPickle (the counterparts of profile, StringIO and pickle respectively)

import cPickle
import pickle
a = range(10000)
%timeit -n 100 x = cPickle.dumps(a)
%timeit -n 100 x = pickle.dumps(a)
100 loops, best of 3: 1.58 ms per loop
100 loops, best of 3: 17 ms per loop

Packages implemented in C are more than 10 times faster!
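
The cPickle module exists only in Python 2. In Python 3 the standard pickle module uses its C accelerator automatically when one is available. A common way to write code that runs on both versions is the import fallback sketched here; the variable names `data`, `blob` and `restored` are just for illustration.

```python
try:
    import cPickle as pickle  # Python 2: explicit C implementation
except ImportError:
    import pickle  # Python 3: uses the C accelerator when available

data = list(range(10000))

# Protocol 2 is understood by both Python 2 and Python 3.
blob = pickle.dumps(data, protocol=2)
restored = pickle.loads(blob)

print(restored == data)  # prints: True
```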

Use the best deserialization method

The following compares how efficiently eval, cPickle and json deserialize their corresponding strings:

import json
import cPickle
a = range(10000)
s1 = str(a)
s2 = cPickle.dumps(a)
s3 = json.dumps(a)
%timeit -n 100 x = eval(s1)
%timeit -n 100 x = cPickle.loads(s2)
%timeit -n 100 x = json.loads(s3)
100 loops, best of 3: 16.8 ms per loop
100 loops, best of 3: 2.02 ms per loop
100 loops, best of 3: 798 µs per loop

As shown, json is about 2.5 times faster than cPickle and more than 20 times faster than eval.
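
The comparison can be repeated outside IPython with the standard timeit module. The following is a sketch for Python 3, using pickle in place of cPickle. The timings it prints depend on the machine and the Python version, so they may not match the numbers above.

```python
import json
import pickle
import timeit

data = list(range(10000))
s_repr = str(data)
s_pickle = pickle.dumps(data)
s_json = json.dumps(data)

# All three round trips recover the same list.
from_eval = eval(s_repr)              # runs arbitrary code: trusted input only
from_pickle = pickle.loads(s_pickle)  # also unsafe on untrusted input
from_json = json.loads(s_json)        # parses data only

candidates = [
    ("eval", lambda: eval(s_repr)),
    ("pickle", lambda: pickle.loads(s_pickle)),
    ("json", lambda: json.loads(s_json)),
]
for name, func in candidates:
    seconds = timeit.timeit(func, number=20)
    print(name, seconds)
```

Speed aside, eval and pickle can execute code embedded in their input, so json is also the safer choice for data from untrusted sources.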

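Use C extensions

There are currently four main approaches: the native API of CPython (the most common Python implementation), ctypes, Cython, and cffi. All of them let a Python program call dynamic link libraries compiled from C. Their characteristics are as follows:

CPython native API: by including the Python.h header file, the C program can use Python data structures directly. The implementation process is relatively tedious, but it is applicable in a wide range of situations.

ctypes: typically used to wrap C programs so that pure Python code can call functions in dynamic link libraries (dll files on Windows or so files on Unix). If you want to use an existing C library from Python, ctypes is a good choice. In some benchmarks, python2+ctypes was the best-performing approach.

Cython: Cython is a superset of the Python language that simplifies writing C extensions. Its advantages are concise syntax and good compatibility with libraries such as numpy that contain many C extensions. Cython is typically used to optimize a specific algorithm or procedure within a project. In some tests it gives speedups of several hundred times.

cffi: cffi is essentially the counterpart of ctypes for pypy (see below), and it is also compatible with CPython. cffi provides a way to use C libraries from Python: you can write C code directly inside Python code, and it also supports linking to existing C libraries.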
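
As a small illustration of the ctypes approach, the sketch below calls the C standard library function strlen from pure Python. It assumes a POSIX system such as Linux or macOS; on Windows the library would have to be loaded differently. The choice of strlen is only for demonstration.

```python
import ctypes
import ctypes.util

# find_library may return None; on POSIX, CDLL(None) then exposes the
# symbols already loaded into the current process, which include libc.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"hello")
print(length)  # prints: 5
```

Declaring argtypes and restype is optional for simple cases, but it lets ctypes check and convert arguments instead of guessing.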