Day 25 - Memory issues from repeatedly calling shioaji.Shioaji(): corrections

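Yesterday's post, Day 24 - Shioaji.Login pitfalls and fixes, described how my login step did not wait for the contract fetch to finish, so the data I retrieved was incomplete. That mistake made me revisit Day 23 - Memory issues from repeatedly calling shioaji.Shioaji() and ask whether the same problem had skewed those test results as well.
So I changed the test code from that post to the following: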

import os, psutil, gc, time
import shioaji as sj
from dotenv import load_dotenv

load_dotenv('D:\\python\\shioaji\\.env')  # load environment variables from .env

process = psutil.Process(os.getpid())
print(os.getppid())
print(f'mem usage as start: {process.memory_info()[0]/float(2 ** 20)}MB')
for i in range(5):
    api = sj.Shioaji(simulation=True)
    print(f'({i})mem usage at api initial: {process.memory_info()[0]/float(2 ** 20)}MB')
    start_time = time.time()
    # api.login(
    #     person_id=os.getenv('YOUR_PERSON_ID'), 
    #     passwd=os.getenv('YOUR_PASSWORD'),
    #     contracts_timeout=10000
    # )
    api.login(
        person_id='PAPIUSER01',
        passwd='2222'
    )
    print(api.Contracts)  # added: wait until all Contract data has been fetched before continuing
    print(f'contract exe time:{time.time()-start_time}')
    print(f'({i})mem usage at api.login: {process.memory_info()[0]/float(2 ** 20)}MB')
    api.logout()
    del api

gc.collect()
print(f'mem usage after del api: {process.memory_info()[0]/float(2 ** 20)}MB')

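In this version, a print(api.Contracts) line is added after login to make sure all Contract data has been fetched before the script moves on. For the memory figures, the raw value is divided by 2 to the 20th power, so the unit is MB, not KB. This run also measures how long the contract fetch takes, so it can be compared with the optimized versions later.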
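The unit conversion is plain arithmetic: psutil reports resident memory in bytes, and dividing by 2 ** 20 gives the value the script prints as MB. A minimal sketch, where bytes_to_mb is a hypothetical helper name and the byte count is worked backwards from one of the figures in the results, not a new measurement:

```python
def bytes_to_mb(n_bytes: int) -> float:
    """Convert a byte count to MB (2 ** 20 bytes), as the test script does."""
    return n_bytes / float(2 ** 20)


rss_bytes = 237_469_696  # illustrative value only
print(f"mem usage: {bytes_to_mb(rss_bytes)}MB")  # prints "mem usage: 226.46875MB"
```

Dividing by 2 ** 10 instead would give KB, which is the mix-up the paragraph above corrects.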

Corrected test results

This round covers the same four scenarios as last time. Memory is in MB and contract fetch time is in seconds:

| Checkpoint | Windows / production | Windows / test | Ubuntu / production | Ubuntu / test |
| --- | --- | --- | --- | --- |
| start | 33.984375 | 33.9296875 | 34.55859375 | 34.4375 |
| 0-initial | 37.87890625 | 38.02734375 | 38.0234375 | 37.8046875 |
| 0-login | 226.46875 | 231.0390625 | 220.9765625 | 39.03515625 |
| 0-contract fetch time | 7.203447580337524 | 8.030523300170898 | 5.9245991706848145 | 7.924567461013794 |
| 1-initial | 226.4453125 | 231.01171875 | 220.9765625 | 229.625 |
| 1-login | 406.69921875 | 409.90625 | 219.4492188 | 400.9609375 |
| 1-contract fetch time | 7.261681318283081 | 8.347200632095337 | 6.103384017944336 | 8.131371021270752 |
| 2-initial | 406.640625 | 413.8671875 | 398.80078125 | 415.3203125 |
| 2-login | 587.5625 | 597.703125 | 586.6640625 | 597.3125 |
| 2-contract fetch time | 6.988409757614136 | 8.512909173965454 | 6.160722017288208 | 7.1584250926971436 |
| 3-initial | 587.51171875 | 597.69140625 | 586.6640625 | 597.3125 |
| 3-login | 767.63671875 | 780.59765625 | 774.5546875 | 781.94921875 |
| 3-contract fetch time | 7.1759161949157715 | 8.046003103256226 | 6.2177183628082275 | 7.363017797470093 |
| 4-initial | 767.58203125 | 780.578125 | 774.5546875 | 781.94921875 |
| 4-login | 950.8515625 | 963.15234375 | 954.59375 | 970.33203125 |
| 4-contract fetch time | 7.381143569946289 | 8.393444299697876 | 6.169715166091919 | 7.520856142044067 |
| del & gc | 950.81640625 | 963.05078125 | 954.59375 | 970.33203125 |
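To make the trend in these figures easier to see, the growth between consecutive logins can be computed directly. A small sketch using the Windows / production login values from the table above (the variable names are mine):

```python
# Memory (MB) right after each login, Windows / production column.
login_mb = [226.46875, 406.69921875, 587.5625, 767.63671875, 950.8515625]

# Growth between consecutive logins.
growth = [later - earlier for earlier, later in zip(login_mb, login_mb[1:])]

for i, delta in enumerate(growth, start=1):
    print(f"login {i - 1} -> {i}: +{delta:.2f} MB")
```

Each new sj.Shioaji() instance plus login adds roughly 180 MB in this column, and none of it is returned by the end of the run.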

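In this run, memory usage and timing are roughly the same across all test scenarios. In the previous test, the test environment and the Linux runs came out lower than Windows or production only because the Contract data had not finished downloading, which produced misleading results.
Although the test calls del api and gc.collect(), memory usage does not drop. Python's memory is managed by the Python memory manager, and whether memory is actually released depends on that manager's own rules; unlike C/C++, the program cannot manage memory directly.
For more on the Python memory manager, see: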
Python 3.10.0 Documentation » Python/C API Reference Manual » Memory Management
Memory Management in Python
Why does Python still run out of memory after deleting a variable with del? (Python memory management and garbage collection)
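One general reason del may not free anything is that del removes a name, not the object; the object survives as long as any other reference to it exists. The sketch below illustrates only that Python behavior. Session and registry are hypothetical names, and I have not verified whether anything like this happens inside shioaji:

```python
import gc
import weakref


class Session:
    """Hypothetical large object; not a shioaji class."""

    def __init__(self):
        self.payload = bytearray(1024 * 1024)


registry = []                  # some other place that still holds a reference
session = Session()
registry.append(session)
alive = weakref.ref(session)   # lets us check whether the object still exists

del session                    # removes the name only
gc.collect()
still_alive_after_del = alive() is not None

registry.clear()               # drop the last remaining reference
gc.collect()
freed_after_clear = alive() is None

print(still_alive_after_del, freed_after_clear)
```

Even once an object really is freed, the allocator may keep the underlying pages, so the resident size reported by psutil does not necessarily fall.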

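Optimizations and how they differ

The previous post described ways to reduce memory usage. This section shows the code changes and the test results after each optimization.
The code for each optimization is as follows:

Move sj.Shioaji() to the outer scope so it is not called repeatedly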

import os, psutil, time
import shioaji as sj
from dotenv import load_dotenv

load_dotenv('D:\\python\\shioaji\\.env')

process = psutil.Process(os.getpid())
print(os.getppid())
print(f'mem usage as start: {process.memory_info()[0]/float(2 ** 20)}MB')
process_start_time = time.time()
api = sj.Shioaji()  # api is created once, outside the loop
for i in range(5):
    print(f'({i})mem usage at api initial: {process.memory_info()[0]/float(2 ** 20)}MB')
    start_time = time.time()
    api.login(
        person_id=os.getenv('YOUR_PERSON_ID'), 
        passwd=os.getenv('YOUR_PASSWORD')
    )
    print(api.Contracts)
    print(f'({i})contract fetch time:{time.time()-start_time}')
    print(f'({i})mem usage at api.login: {process.memory_info()[0]/float(2 ** 20)}MB')
    api.logout()

print(f'process total exe time:{time.time()-process_start_time}')

Run the contract fetch only on the first pass and store the Contracts data

import os, psutil, time
import shioaji as sj
from dotenv import load_dotenv

load_dotenv('D:\\python\\shioaji\\.env')

process = psutil.Process(os.getpid())
print(os.getppid())
print(f'mem usage as start: {process.memory_info()[0]/float(2 ** 20)}MB')
process_start_time = time.time()
my_contract = None  # holds the Contracts data once it has been fetched

for i in range(5):
    api = sj.Shioaji()
    print(f'({i})mem usage at api initial: {process.memory_info()[0]/float(2 ** 20)}MB')
    start_time = time.time()
    # When my_contract is None, run the contract fetch
    if my_contract is None:
        api.login(
            person_id=os.getenv('YOUR_PERSON_ID'), 
            passwd=os.getenv('YOUR_PASSWORD')
        )
        print(api.Contracts)  # wait for the contract fetch to finish
        my_contract = api.Contracts  # point my_contract at api.Contracts
    else:
        print('fetch_contract=False...')
        api.login(
            person_id=os.getenv('YOUR_PERSON_ID'), 
            passwd=os.getenv('YOUR_PASSWORD'),
            fetch_contract=False  # False: do not fetch contracts after login
        )
        print(my_contract)
    print(f'({i})contract fetch time:{time.time()-start_time}')
    print(f'({i})mem usage at api.login: {process.memory_info()[0]/float(2 ** 20)}MB')
    api.logout()

print(f'process total exe time:{time.time()-process_start_time}')

Move sj.Shioaji() to the outer scope and run the fetch only on the first pass

import os, psutil, time
import shioaji as sj
from dotenv import load_dotenv

load_dotenv('D:\\python\\shioaji\\.env')  # load environment variables from .env

process = psutil.Process(os.getpid())
print(os.getppid())
print(f'mem usage as start: {process.memory_info()[0]/float(2 ** 20)}MB')
process_start_time = time.time()
api = sj.Shioaji()  # api is created once, outside the loop

for i in range(5):
    print(f'({i})mem usage at api initial: {process.memory_info()[0]/float(2 ** 20)}MB')
    start_time = time.time()
    # When api.Contracts is None, run the contract fetch
    if api.Contracts is None:
        api.login(
            person_id=os.getenv('YOUR_PERSON_ID'), 
            passwd=os.getenv('YOUR_PASSWORD')
        )
    else:
        print('fetch_contract=False...')
        api.login(
            person_id=os.getenv('YOUR_PERSON_ID'), 
            passwd=os.getenv('YOUR_PASSWORD'),
            fetch_contract=False  # False: do not fetch contracts after login
        )
    print(api.Contracts)
    print(f'({i})contract fetch time:{time.time()-start_time}')
    print(f'({i})mem usage at api.login: {process.memory_info()[0]/float(2 ** 20)}MB')
    api.logout()

print(f'process total exe time:{time.time()-process_start_time}')
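The fetch-once logic shared by the last two variants can be exercised without a broker account. A minimal sketch, in which FakeClient and run_sessions are hypothetical stand-ins and not part of shioaji; only the fetch_contract flag mirrors the login argument used above:

```python
class FakeClient:
    """Hypothetical stand-in for an API client; not shioaji."""

    def __init__(self):
        self.contracts = None   # nothing fetched yet
        self.fetch_count = 0    # how many times the expensive fetch ran

    def login(self, fetch_contract=True):
        if fetch_contract:
            self.fetch_count += 1
            self.contracts = {"2330": "TSMC"}  # pretend download

    def logout(self):
        pass  # contracts stay cached on the instance


def run_sessions(client, n):
    """Log in and out n times, fetching contracts only when none are cached."""
    for _ in range(n):
        client.login(fetch_contract=client.contracts is None)
        client.logout()
    return client.fetch_count


client = FakeClient()
fetches = run_sessions(client, 5)
print(fetches)
```

The expensive step runs once because the client object, and the data cached on it, outlives each login/logout cycle.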

Test results

| Checkpoint | sj.Shioaji() moved to outer scope | Fetch on first pass only, Contracts stored | sj.Shioaji() moved to outer scope + fetch on first pass only |
| --- | --- | --- | --- |
| start | 34.0390625 | 33.9765625 | 33.9765625 |
| 0-initial | 37.97265625 | 37.84765625 | 38.01171875 |
| 0-login | 226.421875 | 226.765625 | 226.22265625 |
| 0-contract fetch time | 6.864650726318359 | 7.08452582359314 | 6.8395609855651855 |
| 1-initial | 226.4453125 | 226.91015625 | 226.16015625 |
| 1-login | 247.984375 | 227.140625 | 226.31640625 |
| 1-contract fetch time | 1.8755991458892822 | 1.5779919624328613 | 1.5284152030944824 |
| 2-initial | 249.35546875 | 227.18359375 | 226.265625 |
| 2-login | 250.640625 | 227.34765625 | 226.4375 |
| 2-contract fetch time | 2.2606329917907715 | 1.7178151607513428 | 1.3352644443511963 |
| 3-initial | 259.97265625 | 227.33203125 | 226.39453125 |
| 3-login | 273.60546875 | 227.5 | 226.5625 |
| 3-contract fetch time | 2.4294867515563965 | 1.4267780780792236 | 1.6401028633117676 |
| 4-initial | 274.13671875 | 227.48828125 | 226.5078125 |
| 4-login | 276.0234375 | 227.6484375 | 226.671875 |
| 4-contract fetch time | 2.1341898441314697 | 1.4912939071655273 | 1.2037618160247803 |
| process total exe time | 16.376935720443726 | 13.961941003799438 | 12.832036018371582 |

As the results show, from the second login onward all three optimizations clearly improve on the unoptimized version in both memory usage and contract fetch time.
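The size of the improvement can be put into numbers. The sketch below uses values rounded from the two result tables and assumes the Windows / production column is the right baseline for the optimized runs, which the post does not state explicitly:

```python
# Baseline: unoptimized run, Windows / production column (assumed comparable).
baseline_final_mb = 950.85
baseline_fetch_s = [7.20, 7.26, 6.99, 7.18, 7.38]

# (memory at the last login in MB, per-iteration fetch times in seconds)
optimized = {
    "shared instance": (276.02, [6.86, 1.88, 2.26, 2.43, 2.13]),
    "fetch once": (227.65, [7.08, 1.58, 1.72, 1.43, 1.49]),
    "shared instance + fetch once": (226.67, [6.84, 1.53, 1.34, 1.64, 1.20]),
}

summary = {}
for name, (final_mb, fetch_s) in optimized.items():
    saved_mb = baseline_final_mb - final_mb
    saved_s = sum(baseline_fetch_s) - sum(fetch_s)
    summary[name] = (saved_mb, saved_s)
    print(f"{name}: {saved_mb:.0f} MB less at the last login, "
          f"{saved_s:.1f} s less total fetch time")
```

Under that assumption, every variant ends the five iterations using several hundred MB less memory and spending more than 20 seconds less on contract fetches.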

