Day 24 - Shioaji.login pitfalls and fixes

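Today I originally planned to start fetching kbar data for individual stocks and processing it. While cleaning the Contract data, though, I noticed that the TSE+OTC stock Contract data returned by the same program sometimes had 592 rows and sometimes only 294. The gap was far too large to ignore, so I went back to the official documentation: https://sinotrade.github.io/tutor/contract/
Contract data currently falls into four categories: Index, Stock, Future, and Option. When login() is called and the login succeeds, shioaji starts downloading and initializing the Contract data. The contracts_cb parameter takes a function that is called when the download and initialization of these categories completes; as the outputs in this post show, it is called once per category, with the SecurityType that has just finished.
Here we modify the program from the earlier post "Day 18 - Getting all Contracts" to see what is actually going wrong. The modified program is as follows: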

import os
from dotenv import load_dotenv
import shioaji as sj
import pandas as pd

load_dotenv('D:\\python\\shioaji\\.env') # load the environment variables from .env
api = sj.Shioaji()
api.login(
    person_id=os.getenv('YOUR_PERSON_ID'),
    passwd=os.getenv('YOUR_PASSWORD'),
    contracts_cb=print # use print as the callback, so each finished initialization is written to the console
)
print('api.login is done...') # log the current step to the console
stock_list = []

for exchange in api.Contracts.Stocks:
    for stock in exchange:
        stock_list.append({**stock})

print('for loop is done...') # log the current step to the console
print(f'len(stock_list) is :{len(stock_list)}')
df = pd.DataFrame(stock_list)
df.to_csv('stock_list.csv', index=False, encoding="utf_8_sig")
print('df.to_csv is done...') # log the current step to the console

api.logout()

Output:

Response Code: 0 | Event Code: 0 | Info: host '203.66.91.161:80', hostname '203.66.91.161:80' IP 203.66.91.161:80 (host 1 of 1) (host connection attempt 1 of 1) (total connection attempt 1 of 1) | Event: Session up
SecurityType.Index
SecurityType.Future
api.login is done...
for loop is done...
len(stock_list) is :2773
df.to_csv is done...
SecurityType.Stock
SecurityType.Option

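The length of stock_list is 2773. However, the "SecurityType.Stock" message, which is printed when the Contracts.Stocks download completes, only appears after the for loop has finished and the CSV file has been written. In other words, login() returned while the contracts were still downloading in the background, so the Contracts.Stocks data we read was incomplete.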
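The same race can be reproduced without a brokerage account. The following is a minimal simulation, not shioaji code: `fake_login`, `fake_download`, and the `contracts` list are hypothetical stand-ins that I am assuming behave roughly like the background download seen in the output above.

```python
import threading
import time

# Hypothetical stand-in for the contract store; not part of shioaji.
contracts = []

def fake_download(total, on_done):
    """Append `total` fake contracts slowly, then report completion."""
    for i in range(total):
        contracts.append({"code": f"{i:04d}"})
        time.sleep(0.001)  # pretend each record takes time to arrive
    on_done("Stock")

def fake_login(total, on_done):
    """Start the download in a background thread and return immediately."""
    worker = threading.Thread(target=fake_download, args=(total, on_done))
    worker.start()
    return worker

finished = []  # categories reported as complete by the callback
worker = fake_login(total=200, on_done=finished.append)

# Reading right after "login" returns races with the background download.
early_snapshot = list(contracts)
print(f"read immediately: {len(early_snapshot)} of 200")

worker.join()  # once the download has finished, the data is complete
print(f"read after completion: {len(contracts)} of 200")
```

The first count varies from run to run, which matches the symptom of getting a different number of rows from the same program.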
Next, let's modify that first program slightly. The revised version is as follows:

import os
from dotenv import load_dotenv
import shioaji as sj
import pandas as pd
from shioaji.constant import SecurityType # import the SecurityType constants
import threading # import the threading module

load_dotenv('D:\\python\\shioaji\\.env') # load the environment variables from .env
event = threading.Event() # create the event
api = sj.Shioaji()
def my_cb(security_type):
    print(repr(security_type))
    # when Contracts.Stocks has finished downloading, print a message and call event.set()
    if security_type == SecurityType.Stock:
        print('Contracts.Stock is fetch..')
        event.set() # let the waiting main program continue

api.login(
    person_id=os.getenv('YOUR_PERSON_ID'),
    passwd=os.getenv('YOUR_PASSWORD'),
    contracts_cb=my_cb # use my_cb as the callback
)
print('api.login is done...')
event.wait() # after api.login returns, block here until the callback calls event.set()
stock_list = []
print('start for loop...')
for exchange in api.Contracts.Stocks:
    for stock in exchange:
        stock_list.append({**stock})

print('for loop is done...')
print(f'len(stock_list) is :{len(stock_list)}')
df = pd.DataFrame(stock_list)
print(len(df))
print(len(stock_list))
df.to_csv('stock_list.csv', index=False, encoding="utf_8_sig")
print('df.to_csv is done...')

api.logout()

Output of the revised program:

Response Code: 0 | Event Code: 0 | Info: host '203.66.91.161:80', hostname '203.66.91.161:80' IP 203.66.91.161:80 (host 1 of 1) (host connection attempt 1 of 1) (total connection attempt 1 of 1) | Event: Session up
<SecurityType.Index: 'IND'>
<SecurityType.Future: 'FUT'>
api.login is done...
<SecurityType.Stock: 'STK'>
Contracts.Stock is fetch..
start for loop...
<SecurityType.Option: 'OPT'>
for loop is done...
len(stock_list) is :32773
32773
32773
df.to_csv is done...

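After api.login returns, we call event.wait() so that the program blocks. When the Contracts.Stocks download completes, the callback calls event.set() and the waiting program resumes. The resulting stock_list grows to 32773 entries. The output also shows that the for loop starts only after Contracts.Stock has finished downloading, which is what makes the fetched data complete.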
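One thing to watch with a bare event.wait() is that it blocks forever if the Stock callback never fires. Below is a small sketch of the same pattern with a timeout. It is a simulation rather than shioaji code: `FakeSecurityType`, `fake_download`, and the 10-second limit are assumptions made for illustration.

```python
import threading
import time
from enum import Enum

class FakeSecurityType(Enum):
    """Hypothetical stand-in for shioaji.constant.SecurityType."""
    Index = "IND"
    Stock = "STK"

stocks = []                      # filled by the background thread
stock_ready = threading.Event()  # set once the stock download is complete

def on_contracts(security_type):
    """Callback in the style of contracts_cb: called once per finished category."""
    if security_type == FakeSecurityType.Stock:
        stock_ready.set()

def fake_download(callback):
    """Simulated background download that reports each finished category."""
    callback(FakeSecurityType.Index)
    for i in range(100):
        stocks.append(i)
        time.sleep(0.001)
    callback(FakeSecurityType.Stock)

threading.Thread(target=fake_download, args=(on_contracts,), daemon=True).start()

# wait() returns True if the event was set, False if the timeout expired first.
if not stock_ready.wait(timeout=10):
    raise TimeoutError("stock contracts did not finish downloading within 10 s")

stock_count = len(stocks)
print(f"stocks available after wait: {stock_count}")
```

Raising on timeout is one possible design choice; logging and retrying the login would be another.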

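Besides using event.wait() and event.set() to confirm that the Contracts we need have been fully downloaded, I found in testing that print works as a check too. The following program uses print for this: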

import os
from dotenv import load_dotenv
import shioaji as sj
import pandas as pd
from shioaji.constant import SecurityType

load_dotenv('D:\\python\\shioaji\\.env') # load the environment variables from .env
api = sj.Shioaji()
def my_cb(security_type):
    print(repr(security_type))
    if security_type == SecurityType.Stock:
        print('Contracts.Stock is fetch..')

api.login(
    person_id=os.getenv('YOUR_PERSON_ID'), 
    passwd=os.getenv('YOUR_PASSWORD'),
    contracts_cb=my_cb
)
print('api.login is done...')
print(api.Contracts.Stocks) # print api.Contracts.Stocks to the console
stock_list = []
print('start for loop...')
for exchange in api.Contracts.Stocks:
    for stock in exchange:
        stock_list.append({**stock})

print('for loop is done...')
print(f'len(stock_list) is :{len(stock_list)}')
df = pd.DataFrame(stock_list)
df.to_csv('stock_list.csv', index=False, encoding="utf_8_sig")
print('df.to_csv is done...')

# df = pd.DataFrame(future_list)
# df.to_csv('future_list.csv', index=False, encoding="utf_8_sig")

api.logout()

Output:

Response Code: 0 | Event Code: 0 | Info: host '203.66.91.161:80', hostname '203.66.91.161:80' IP 203.66.91.161:80 (host 1 of 1) (host connection attempt 1 of 1) (total connection attempt 1 of 1) | Event: Session up
<SecurityType.Index: 'IND'>
<SecurityType.Future: 'FUT'>
api.login is done...
<SecurityType.Stock: 'STK'>
Contracts.Stock is fetch..
start for loop...
<SecurityType.Option: 'OPT'>
for loop is done...
len(stock_list) is :32773
32773
32773
df.to_csv is done...

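After running it, we can see that although the program never calls event.wait(), executing print(api.Contracts.Stocks) actually blocks until the Contracts.Stocks data has finished downloading, and only then continues. The resulting stock_list has the same length as in the previous program.
Because this program only needs the data in Contracts.Stocks, it only calls print(api.Contracts.Stocks) as a safeguard. If you want to make sure all four categories have been downloaded, use print(api.Contracts) instead; the program will then continue only after the Option data has finished downloading.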
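If you would rather not rely on print blocking, the callback approach can be extended to wait for all four categories. This is only a sketch under assumptions: `FakeSecurityType` and `ContractGate` are hypothetical names, and the download is simulated in a thread instead of going through shioaji.

```python
import threading
from enum import Enum

class FakeSecurityType(Enum):
    """Hypothetical stand-in for shioaji.constant.SecurityType."""
    Index = "IND"
    Stock = "STK"
    Future = "FUT"
    Option = "OPT"

class ContractGate:
    """Callable that tracks finished categories and signals when all are done."""

    def __init__(self, expected):
        self._pending = set(expected)
        self._lock = threading.Lock()
        self.all_ready = threading.Event()

    def __call__(self, security_type):
        # The callback may run on another thread, so guard the shared set.
        with self._lock:
            self._pending.discard(security_type)
            if not self._pending:
                self.all_ready.set()

gate = ContractGate(FakeSecurityType)

def fake_download(callback):
    """Simulated download that reports categories in the order seen in the logs."""
    for security_type in (
        FakeSecurityType.Index,
        FakeSecurityType.Future,
        FakeSecurityType.Stock,
        FakeSecurityType.Option,
    ):
        callback(security_type)

worker = threading.Thread(target=fake_download, args=(gate,))
worker.start()

all_done = gate.all_ready.wait(timeout=10)  # True once every category reported
worker.join()
print(f"all categories ready: {all_done}")
```

With the real library, an instance like `gate` would be passed as contracts_cb, assuming the callback receives one SecurityType per finished category as in the outputs above.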

