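An Experiment runs as follows:
• The Tuner receives the search space and generates configurations.
• These configurations are submitted to the training platforms, of which there may be many.
• The training results from each platform are reported back to the Tuner.
• If needed, new configurations are generated and the next round of training begins.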
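The loop above can be sketched in a few lines of Python. This is only a toy illustration, not NNI's implementation: RandomTuner, run_trial, and run_experiment are hypothetical names, the "training platform" is a plain function with a made-up score, and the tuner samples at random instead of learning from the reported results as an algorithm like TPE would.

```python
import random


class RandomTuner:
    """Hypothetical stand-in for a tuner: proposes configs, records results."""

    def __init__(self, search_space, seed=0):
        self.search_space = search_space
        self.rng = random.Random(seed)
        self.history = []  # (configuration, metric) pairs reported back

    def generate(self):
        # Pick one candidate value per hyperparameter.
        return {name: self.rng.choice(values)
                for name, values in self.search_space.items()}

    def receive_result(self, config, metric):
        # A smarter tuner would use this history to guide generate().
        self.history.append((config, metric))

    def best(self):
        return max(self.history, key=lambda item: item[1])


def run_trial(config):
    """Toy 'training platform': the made-up score peaks at lr=0.01, batch_size=32."""
    return 1.0 - abs(config["lr"] - 0.01) - abs(config["batch_size"] - 32) / 1000


def run_experiment(tuner, max_trials, concurrency):
    done = 0
    while done < max_trials:
        # Generate one round of configurations, never exceeding the trial budget.
        batch = [tuner.generate()
                 for _ in range(min(concurrency, max_trials - done))]
        for config in batch:
            metric = run_trial(config)            # training runs on the platform
            tuner.receive_result(config, metric)  # result is reported back
        done += len(batch)
    return tuner.best()


space = {"lr": [0.0001, 0.001, 0.01, 0.1], "batch_size": [16, 32, 64, 128]}
tuner = RandomTuner(space)
best_config, best_metric = run_experiment(tuner, max_trials=10, concurrency=4)
print(best_config, best_metric)
```

With max_trials=10 and concurrency=4, the loop runs three rounds of 4, 4, and 2 trials.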
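From the user's side, the workflow is:
• Define the search space in a YML or JSON file, following the required format. This example uses YML.
• Modify the existing model code to add the NNI API calls. (Only three lines of code starting with nni need to be inserted.)
• Define the experiment configuration by setting the required parameters in config.yml.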
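For readers who prefer the JSON form of the search space, here is a minimal sketch that builds it as a Python dict and serializes it. The values are the ones used in this example. Writing the text to a file named search_space.json and pointing the config at it with a searchSpaceFile entry is an assumption about how you would wire it up, so check the NNI documentation for your version.

```python
import json

# Each entry has a sampling type (_type) and its candidate values or range (_value).
search_space = {
    "batch_size": {"_type": "choice", "_value": [16, 32, 64, 128]},
    "hidden_size": {"_type": "choice", "_value": [128, 256, 512, 1024]},
    "lr": {"_type": "choice", "_value": [0.0001, 0.001, 0.01, 0.1]},
    "momentum": {"_type": "uniform", "_value": [0, 1]},
}

# JSON text that could be saved as a separate search space file.
text = json.dumps(search_space, indent=2)
print(text)
```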
Skim the YML below for now. It will make more sense in the later chapters, when you install NNI and verify it yourself.
# config_detailed.yml example. All hyperparameters are kept in a single YML file, which is easier to explain and understand.
# This example shows more configurable fields compared to the minimal "config.yml"
# You can use "nnictl create --config config_detailed.yml" to launch this experiment.
# If you see an error message saying "port 8080 is used",
# use "nnictl stop --all" to stop previous experiments.

experimentName: MNIST # An optional name to help you distinguish experiments.

# Hyper-parameter search space can either be configured here or in a separate file.
# "config.yml" shows how to specify a separate search space file.
# The common schema of search space is documented here:
# https://nni.readthedocs.io/en/stable/Tutorial/SearchSpaceSpec.html
searchSpace:
  batch_size:
    _type: choice
    _value: [16, 32, 64, 128]
  hidden_size:
    _type: choice
    _value: [128, 256, 512, 1024]
  lr:
    _type: choice
    _value: [0.0001, 0.001, 0.01, 0.1]
  momentum:
    _type: uniform
    _value: [0, 1]

trialCommand: python3 mnist.py
# The command to launch a trial. NOTE: change "python3" to "python" if you are using Windows.
trialCodeDirectory: .
# The path of trial code.
# By default it's ".", which means the same directory of this config file.
trialGpuNumber: 1
# How many GPUs should each trial use. CUDA is required when it's greater than zero.

trialConcurrency: 4 # Run 4 trials concurrently.
maxTrialNumber: 10 # Generate at most 10 trials.
maxExperimentDuration: 1h # Stop generating trials after 1 hour.

# Configure the tuning algorithm.
tuner:
  name: TPE
  # Supported algorithms: TPE, Random, Anneal, Evolution, GridSearch, GPTuner, PBTTuner, etc.
  # Full list: https://nni.readthedocs.io/en/latest/Tuner/BuiltinTuner.html
  classArgs: # Algorithm specific arguments. See the tuner's doc for details.
    optimize_mode: maximize # "minimize" or "maximize"

# Configure the training platform.
# Supported platforms: local, remote, openpai, aml, kubeflow, kubernetes, adl.
trainingService:
  platform: local
  useActiveGpu: false
  # NOTE: Use "true" if you are using an OS with graphical interface
  # (e.g. Windows 10, Ubuntu desktop)
  # Reason and details:
  # https://nni.readthedocs.io/en/latest/reference/experiment_config.html#useactivegpu
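To get a feel for what the searchSpace entries mean, the sketch below draws concrete configurations from them: a choice entry picks one of the listed values, and a uniform entry draws a float from the given range. The sample function is hypothetical and uses plain random sampling. It is not how TPE chooses values, and it handles only the two types that appear in this config.

```python
import random

search_space = {
    "batch_size": {"_type": "choice", "_value": [16, 32, 64, 128]},
    "hidden_size": {"_type": "choice", "_value": [128, 256, 512, 1024]},
    "lr": {"_type": "choice", "_value": [0.0001, 0.001, 0.01, 0.1]},
    "momentum": {"_type": "uniform", "_value": [0, 1]},
}


def sample(space, rng):
    """Draw one configuration from the search space (illustrative only)."""
    config = {}
    for name, spec in space.items():
        kind, value = spec["_type"], spec["_value"]
        if kind == "choice":
            config[name] = rng.choice(value)       # one of the listed values
        elif kind == "uniform":
            low, high = value
            config[name] = rng.uniform(low, high)  # float within [low, high]
        else:
            raise ValueError(f"unsupported _type: {kind}")
    return config


rng = random.Random(42)
# maxTrialNumber: 10 means at most ten configurations like these are tried.
configs = [sample(search_space, rng) for _ in range(10)]
for config in configs:
    print(config)
```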
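For the model program (mnist.py), see the notes below. You only need to notice:
• In the main program, line 159, the code starting with nni.
• In the def main(args) function, two lines starting with nni (lines 118 and 123). These are the calls that communicate with NNI.
So the integration is short and simple.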
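Here is a rough sketch of where those three calls sit in a trial script. It is not the actual mnist.py: LocalStub, train_one_epoch, and the default values are made up so the example runs without NNI installed, and the plain dict merge is a simplification. In a real trial you would import nni and pass the module itself, main(nni, defaults), since nni.get_next_parameter, nni.report_intermediate_result, and nni.report_final_result are module-level functions.

```python
class LocalStub:
    """Stand-in exposing the same three function names as the nni module."""

    def __init__(self, params):
        self._params = params
        self.intermediate = []
        self.final = None

    def get_next_parameter(self):
        return dict(self._params)

    def report_intermediate_result(self, metric):
        self.intermediate.append(metric)

    def report_final_result(self, metric):
        self.final = metric


def train_one_epoch(params, epoch):
    """Toy replacement for real training; returns a fake accuracy."""
    return min(0.99, 0.5 + 0.1 * epoch + params["lr"])


def main(api, defaults, epochs=3):
    # Call 1: fetch the tuner's hyperparameters and merge them over the defaults.
    params = {**defaults, **api.get_next_parameter()}
    acc = 0.0
    for epoch in range(1, epochs + 1):
        acc = train_one_epoch(params, epoch)
        api.report_intermediate_result(acc)  # Call 2: per-epoch metric
    api.report_final_result(acc)             # Call 3: final metric of the trial
    return params


defaults = {"batch_size": 64, "hidden_size": 512, "lr": 0.001, "momentum": 0.9}
stub = LocalStub({"lr": 0.01, "batch_size": 32})
params = main(stub, defaults)
print(params, stub.intermediate, stub.final)
```

Passing the API object as an argument keeps the training code identical whether it runs under NNI or on its own.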
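Also note how the model's own parameters are defined, how they are merged with the externally supplied parameters, and how the parameters are then used. (The program at the link below will come up again later. Feel free to skip anything you don't follow yet.)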
nni/mnist.py at master · microsoft/nni · GitHub
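After several days of concepts, it's time to get hands-on before anyone falls asleep. The next chapter installs NNI on a local machine.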