Use case
Goal: compute the RTF (real-time factor).
Definition:
For speech processing, RTF is the ratio of processing time to audio duration:
$$\mathrm{RTF} = \frac{\text{audio processing time}}{\text{audio duration}}$$
First, get the total audio duration by summing the last column of utt2dur:
awk '{sum+=$NF}END{print sum}' path/to/utt2dur
# e.g. the output is 3000.00 (unit: seconds)
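For readers who prefer to stay in Python, the same sum can be sketched as below. This is a minimal sketch, assuming each utt2dur line has the form "<utterance-id> <duration-in-seconds>"; the function name total_duration and the sample data are made up for illustration.

```python
import io


def total_duration(lines):
    """Sum the last whitespace-separated field of each non-empty line."""
    total = 0.0
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines instead of failing on them
            total += float(fields[-1])
    return total


# Hypothetical utt2dur content, one utterance per line.
sample = io.StringIO("utt001 2.50\nutt002 3.25\nutt003 1.25\n")
print(total_duration(sample))  # 7.0
```

Like the awk one-liner, it reads the last field ($NF) rather than a fixed column; for a real file, pass an open file object instead of the StringIO sample.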
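Next, get the total processing time. The script was run under time (time ./run.sh), so the elapsed time is printed in H:M:S format (the hour field is omitted when the run takes less than an hour), for example: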
$ time bash -x ./lg_run_gmm.sh --dnn false --nj 4 test/polly_punct_gmm test/result/polly_punct_gmm 2>&1 | tee ./logs/lg_run_gmm.sh.polly_punct.log
Done. elapse=713
bash -x ./lg_run_gmm.sh --dnn false --nj 4 test/polly_punct_gmm 2>&1 2370.61s user 11.69s system 334% cpu 11:52.82 total
tee ./logs/lg_run_gmm.sh.polly_punct.log 0.00s user 0.01s system 0% cpu 11:52.81 total
11:52.81 is the value I need to convert into seconds.
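Before bringing in any library, note that the conversion itself is simple arithmetic: each colon-separated field is worth 60 times the next one. A minimal sketch using only built-ins follows; the helper name elapsed_to_seconds is an assumption for illustration and is not part of the script shown later.

```python
def elapsed_to_seconds(text):
    """Convert 'S', 'M:S' or 'H:M:S' (seconds may be fractional) to seconds."""
    parts = text.strip().split(":")
    if not 1 <= len(parts) <= 3:
        raise ValueError("expected S, M:S or H:M:S, got {!r}".format(text))
    seconds = 0.0
    for part in parts:
        # Each field to the left is worth 60 times the one to its right.
        seconds = seconds * 60 + float(part)
    return seconds


print(round(elapsed_to_seconds("11:52.81"), 2))  # 712.81
```

The result agrees with the "elapse=713" line printed by the run above, to within rounding.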
Computing the elapsed time in Python
Convert the string into a time data type:
How to construct a timedelta object from a simple string
I use dateutil:
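>>> import dateutil.parser
>>> dateutil.parser.parse('10:11.903')
datetime.datetime(2018, 10, 25, 10, 11, 54)
Note that dateutil read '10:11.903' as 10 hours and 11.903 minutes (10:11:54 on the current date), not as 10 minutes 11.903 seconds, so an M:S string has to be padded to H:M:S before parsing.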
Convert datetime.time to seconds
>>> from datetime import datetime, date, time, timedelta
>>> timeobj = time(12, 45)
>>> t = datetime.combine(date.min, timeobj) - datetime.min
>>> isinstance(t, timedelta)
True
>>> t.total_seconds()
45900.0
# You can also calculate it yourself:
from datetime import datetime
t = datetime.now().time()
seconds = (t.hour * 60 + t.minute) * 60 + t.second

# Or via timedelta (a separate snippet: note the different import):
import datetime
t = datetime.time(10, 0, 5)
seconds = int(datetime.timedelta(hours=t.hour, minutes=t.minute, seconds=t.second).total_seconds())

# Or by subtracting a zero reference parsed with the same format:
from datetime import datetime as dtt
time_only = dtt.strptime('15:30', "%H:%M") - dtt.strptime("00:00", "%H:%M")
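The last approach above (subtracting a zero reference) can be applied directly to the M:S.ss value from the timing output. The sketch below uses only the standard library; the function name mmss_to_timedelta is hypothetical, and the format string assumes the run took less than an hour, because %M only accepts 0 to 59.

```python
from datetime import datetime


def mmss_to_timedelta(text):
    """Parse 'M:S.ff' by subtracting a zero reference parsed with the same format."""
    fmt = "%M:%S.%f"
    return datetime.strptime(text, fmt) - datetime.strptime("00:00.0", fmt)


td = mmss_to_timedelta("11:52.81")
print(td.total_seconds())  # 712.81
```

Because the format is spelled out explicitly, there is no ambiguity about whether the first field means hours or minutes.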
Source code
get_rtf.py
#!/usr/bin/env python3
"""Compute the real-time factor (RTF) from a utt2dur file and an elapsed time."""
import argparse
import datetime
import sys

# "import dateutil" alone does not load the parser submodule.
import dateutil.parser

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('utt2dur', help='utt2dur file')
    parser.add_argument('elapse', help='total elapsed time, e.g. 11:52.81')
    parser.add_argument('--nj', help='number of jobs', default=1, type=int)
    args = parser.parse_args()
    utt2dur = args.utt2dur
    total_time = args.elapse
    num_jobs = args.nj

    # Sum the duration column (second field) over all utterances.
    with open(utt2dur) as fp:
        audio_seconds = sum(float(utt_dur.split()[1]) for utt_dur in fp)
    print('audio_seconds:', audio_seconds)
    if audio_seconds <= 0:
        print('audio_seconds must be positive, '
              'but it has value {}'.format(audio_seconds))
        sys.exit(1)

    # Pad to H:M:S so that dateutil does not read M:S as H:M.
    if total_time.count(':') == 1:
        total_time = '00:' + total_time
    elif total_time.count(':') == 0:
        total_time = '00:00:' + total_time
    dt = dateutil.parser.parse(total_time)
    t = dt.time()
    print('process time:', t)

    # Convert the parsed time of day into a duration in seconds.
    td = datetime.timedelta(hours=t.hour, minutes=t.minute,
                            seconds=t.second, microseconds=t.microsecond)
    process_seconds = td.total_seconds()
    print('process_seconds:', process_seconds)

    # Scale the wall-clock time by the number of parallel jobs.
    rtf = num_jobs * process_seconds / audio_seconds
    print('rtf: {:.3f}'.format(rtf))
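As a quick sanity check of the final formula, the sketch below plugs in the numbers quoted earlier: 11:52.81 (712.81 s) of wall-clock time with --nj 4, and the 3000.00 s figure that was given only as an example of the awk output. The function name real_time_factor is hypothetical, and the result is illustrative rather than a measured RTF.

```python
def real_time_factor(process_seconds, audio_seconds, num_jobs=1):
    """RTF = num_jobs * wall-clock processing time / total audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive, got {}".format(audio_seconds))
    return num_jobs * process_seconds / audio_seconds


# 712.81 s of wall-clock time on 4 parallel jobs, 3000 s of audio (example figure).
rtf = real_time_factor(712.81, 3000.0, num_jobs=4)
print("rtf: {:.3f}".format(rtf))  # rtf: 0.950
```

Multiplying by the job count approximates the total compute time spent across the parallel jobs, which is how the script above treats --nj.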

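This article shows how to handle the time format printed by a shell command in Python: the time string is converted into a datetime.timedelta object in order to compute the RTF (real-time factor). It explains how RTF is obtained in speech processing as the ratio of processing time to audio duration, and gives the complete Python source.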




