[Day25] Tableau Made Easy - Using TabPy, Part 2

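Preface

We have already learned the first way to use TabPy, which writes all the Python code directly inside the workbook. Its biggest drawback is that the code is hard to manage and cannot be shared across multiple workbooks. This post covers the second way to use TabPy: deploying functions, which lets us manage the code centrally.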

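Deploying functions

Overview

The second approach registers Python functions with the TabPy Server ahead of time. Once deployed, a function is treated by TabPy as a Model. Tableau Desktop only needs to name the Model that should process the data and then wait for the result to be returned.

Workflow

Create a Python file named TabPyTest.py with the following content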

from tabpy.tabpy_tools.client import Client

client = Client('http://localhost:9004/')

def foo(data1, data2):
    return True

client.deploy('foo', foo, 'This is the test function.')

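Client(url): creates a TabPy Client object connected to url. According to the author, TabPy currently does not accept remote deployment; functions can only be deployed to a TabPy Server on localhost, so the host in url must be localhost.
foo(data1, data2): a user-defined function that accepts two input parameters. Unlike the first TabPy approach, the parameter names can be chosen freely; here they are named data1 and data2.
client.deploy(model_name, function, model_description): model_name is the name of the Model after deployment; it may differ from the function name, but a meaningful, easy-to-understand name is recommended. function is the function to deploy. model_description is a supplementary description of the Model.

Run the deployment (this must be done in the virtual environment where TabPy is installed)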

(Tableau-Python-Server) C:\Users\wrxue>python TabPyTest.py

If the deployment succeeds, the Deployed Models section at http://localhost:9004/ should show a new Model named foo, which is the foo function from TabPyTest.py

"foo": {
    "description": "This is the test function.",
    "type": "model",
    "version": 1,
    "dependencies": [],
    "target": null,
    "creation_time": 1626685276,
    "last_modified_time": 1626685276,
    "schema": null,
    "docstring": "-- no docstring found in query function --"
}
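The metadata above can be read programmatically as well. The following is a minimal sketch, not part of the original workflow: it parses a copy of the listing shown above and prints one summary line per Model. The helper name summarize_models is hypothetical, and the code assumes that creation_time is a Unix timestamp in seconds.

```python
import json
from datetime import datetime, timezone

# Sample metadata copied from the Deployed Models listing shown above.
SAMPLE = """
{
  "foo": {
    "description": "This is the test function.",
    "type": "model",
    "version": 1,
    "dependencies": [],
    "target": null,
    "creation_time": 1626685276,
    "last_modified_time": 1626685276,
    "schema": null,
    "docstring": "-- no docstring found in query function --"
  }
}
"""


def summarize_models(metadata_json):
    """Return one readable summary line per Model (hypothetical helper)."""
    models = json.loads(metadata_json)
    lines = []
    for name, info in sorted(models.items()):
        # Assumption: creation_time is a Unix timestamp in seconds.
        created = datetime.fromtimestamp(info["creation_time"], tz=timezone.utc)
        lines.append(
            f"{name} v{info['version']} created "
            f"{created:%Y-%m-%d %H:%M:%S} UTC: {info['description']}"
        )
    return lines


summary = summarize_models(SAMPLE)
print("\n".join(summary))
```

A summary like this makes it easier to spot which Models exist and when they were deployed once the list grows.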

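Usage

TabPy deployment

With the deployment concept and workflow understood, we can wrap the Python code used in method one into four separate functions and deploy them. Change TabPyTest.py to the following content, then run the deployment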

from tabpy.tabpy_tools.client import Client

client = Client('http://localhost:9004/')

def testBool(data):
    return [x > 10000 for x in data]

def testInt(data):
    return [int(x * 2) for x in data]

def testReal(data):
    import math
    return [math.sqrt(x) for x in data]

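def testStr(data1, data2):
    # data1 holds the sales values and data2 the state names; the output text
    # "<state> 的销售额为 <sales>" means "sales for <state> are <sales>".
    return [f'{x[1]} 的销售额为 {int(x[0])}' for x in zip(data1, data2)]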

client.deploy('test_SCRIPT_BOOL', testBool, 'Test SCRIPT_BOOL by deployment')
client.deploy('test_SCRIPT_INT', testInt, 'Test SCRIPT_INT by deployment')
client.deploy('test_SCRIPT_REAL', testReal, 'Test SCRIPT_REAL by deployment')
client.deploy('test_SCRIPT_STR', testStr, 'Test SCRIPT_STR by deployment')

The TabPy Server's Deployed Models section then gains 4 new Models: test_SCRIPT_BOOL, test_SCRIPT_INT, test_SCRIPT_REAL, and test_SCRIPT_STR

"test_SCRIPT_BOOL": {
    "description": "Test SCRIPT_BOOL by deploy",
    "type": "model",
    "version": 1,
    "dependencies": [],
    "target": null,
    "creation_time": 1626686885,
    "last_modified_time": 1626686885,
    "schema": null,
    "docstring": "-- no docstring found in query function --"
},
"test_SCRIPT_INT": {
    "description": "",
    "type": "model",
    "version": 1,
    "dependencies": [],
    "target": null,
    "creation_time": 1626686886,
    "last_modified_time": 1626686886,
    "schema": null,
    "docstring": "-- no docstring found in query function --"
},
"test_SCRIPT_REAL": {
    "description": "",
    "type": "model",
    "version": 1,
    "dependencies": [],
    "target": null,
    "creation_time": 1626686886,
    "last_modified_time": 1626686886,
    "schema": null,
    "docstring": "-- no docstring found in query function --"
},
"test_SCRIPT_STR": {
    "description": "",
    "type": "model",
    "version": 1,
    "dependencies": [],
    "target": null,
    "creation_time": 1626686886,
    "last_modified_time": 1626686886,
    "schema": null,
    "docstring": "-- no docstring found in query function --"
}

Calling from Tableau Desktop

Modify the 4 Calculated Fields in the workbook that use SCRIPT functions

Sales greater than 10000 (销售额大於10000)

SCRIPT_BOOL("return tabpy.query('test_SCRIPT_BOOL', _arg1)['response']", SUM([Sales]))

Double sales (2倍销售额)

SCRIPT_INT("return tabpy.query('test_SCRIPT_INT', _arg1)['response']", SUM([Sales]))

Square root of sales (销售额平方根)

SCRIPT_REAL("return tabpy.query('test_SCRIPT_REAL', _arg1)['response']", SUM([Sales]))

Sales description (销售额说明)

SCRIPT_STR("return tabpy.query('test_SCRIPT_STR', _arg1, _arg2)['response']", SUM([Sales]), ATTR([State]))

The result is now the same as in [Day24] Tableau Made Easy - Using TabPy, Part 1; only the way Python is used differs.

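Overriding and removing a Model

If we try to redeploy a Model that already exists, the following error appears, which says that a Model with the same name already exists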

RuntimeError: An endpoint with that name (test_SCRIPT_BOOL) already exists. Use "override = True" to force update an existing endpoint.

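There are two ways to make the deployment succeed: override the existing Model directly, or remove the existing Model first and then deploy.

Overriding a Model

Just add the override parameter to client.deploy to allow it to override the existing Model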

client.deploy('test_SCRIPT_BOOL', testBool, 'Test SCRIPT_BOOL by deployment', override=True)

After overriding, a close look at test_SCRIPT_BOOL under Deployed Models shows that its version has become 2, because each override automatically increments the version by 1

"test_SCRIPT_BOOL": {
    "description": "Test SCRIPT_BOOL by deployment",
    "type": "model",
    "version": 2,
    "dependencies": [],
    "target": null,
    "creation_time": 1626686885,
    "last_modified_time": 1626696002,
    "schema": null,
    "docstring": "-- no docstring found in query function --"
},

Removing a Model

Calling the remove function before client.deploy frees up the Model name, so a name clash no longer blocks the deployment

client.remove('test_SCRIPT_BOOL')

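Conclusion

The TabPy approach introduced here lets us manage Python code centrally so that workbooks can share the same functions. However, it is hard to tell which workbooks use a given Model, so there is no quick way to know which workbooks need changes when a Model is updated. Personally, I think the two TabPy approaches can be used side by side: code that will not be reused can be written directly in the workbook, while reusable code is better deployed, which makes maintenance easier.

Workbook source file

Completed workbook

Running into difficulties during hands-on work is inevitable, so the source file is provided here for reference. If the problem persists, feel free to leave a comment in the discussion area below.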
