NNI Execution Flow

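An Experiment runs as follows:

• The Tuner receives the search space and generates configurations.
• The generated configurations are submitted to one or more training platforms.
• The training results from each platform are reported back to the Tuner.
• New configurations are generated if needed, and the next round of training begins.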
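The loop above can be sketched in a few lines of plain Python. This is only a toy model of the idea, not NNI code: `ToyTuner`, `run_trial`, and `run_experiment` are hypothetical names, the tuner is a simple random search, and the "training" score is made up.

```python
import random


# Hypothetical stand-ins for illustration only; none of these are NNI classes.
class ToyTuner:
    """Random-search tuner: proposes configurations and records their results."""

    def __init__(self, search_space, seed=0):
        self.search_space = search_space
        self.rng = random.Random(seed)
        self.history = []  # (configuration, metric) pairs reported back

    def generate(self):
        # Pick one value per hyperparameter from its list of candidates.
        return {name: self.rng.choice(values)
                for name, values in self.search_space.items()}

    def receive_result(self, config, metric):
        self.history.append((config, metric))

    def best(self):
        return max(self.history, key=lambda item: item[1])


def run_trial(config):
    """Stand-in for a training job; returns a made-up score, not a real accuracy."""
    return 1.0 - abs(config["lr"] - 0.01) - abs(config["batch_size"] - 32) / 1000


def run_experiment(tuner, max_trials):
    for _ in range(max_trials):
        config = tuner.generate()             # the tuner produces a configuration
        metric = run_trial(config)            # the configuration runs on a "platform"
        tuner.receive_result(config, metric)  # the result goes back to the tuner
    return tuner.best()                       # stop once the trial budget is spent


space = {"lr": [0.0001, 0.001, 0.01, 0.1], "batch_size": [16, 32, 64, 128]}
tuner = ToyTuner(space)
best_config, best_metric = run_experiment(tuner, max_trials=10)
print(best_config, best_metric)
```

A random-search tuner ignores the reported history when proposing the next configuration. Model-based tuners such as TPE use those results to decide where to search next, which is why the results have to flow back to the tuner at all.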

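From the user's side, the workflow is:

• Define the search space in a YAML or JSON file that follows the required format. This example uses YAML.
• Modify the existing model code to add the nni API calls. (Only three lines that start with nni need to be inserted.)
• Define the experiment configuration: in config.yml, set the required parameters.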

Skim the YAML below for now. It will make more sense in later chapters, when you install NNI and verify the setup.

# config_detailed.yml example. All hyperparameters are placed in a single YAML file, which makes it easier to explain and understand.

# This example shows more configurable fields compared to the minimal "config.yml"
# You can use "nnictl create --config config_detailed.yml" to launch this experiment.
# If you see an error message saying "port 8080 is used", 
# use "nnictl stop --all" to stop previous experiments.

experimentName: MNIST           # An optional name to help you distinguish experiments.

# Hyper-parameter search space can either be configured here or in a separate file.
# "config.yml" shows how to specify a separate search space file.
# The common schema of search space is documented here:
#   https://nni.readthedocs.io/en/stable/Tutorial/SearchSpaceSpec.html

searchSpace:
  batch_size:
    _type: choice
    _value: [16, 32, 64, 128]
  hidden_size:
    _type: choice
    _value: [128, 256, 512, 1024]
  lr:
    _type: choice
    _value: [0.0001, 0.001, 0.01, 0.1]
  momentum:
    _type: uniform
    _value: [0, 1]

trialCommand: python3 mnist.py  
# The command to launch a trial. NOTE: change "python3" to "python" if you are using Windows.

trialCodeDirectory: .             
# The path of trial code. 
# By default it's ".", which means the same directory of this config file.

trialGpuNumber: 1               
# How many GPUs should each trial use. CUDA is required when it's greater than zero.

trialConcurrency: 4            # Run 4 trials concurrently.
maxTrialNumber: 10             # Generate at most 10 trials.
maxExperimentDuration: 1h      # Stop generating trials after 1 hour.

# Configure the tuning algorithm.
tuner:                      
  name: TPE            
  # Supported algorithms: TPE, Random, Anneal, Evolution, GridSearch, GPTuner, PBTTuner, etc.
  # Full list:  https://nni.readthedocs.io/en/latest/Tuner/BuiltinTuner.html
  classArgs:                # Algorithm specific arguments. See the tuner's doc for details.
    optimize_mode: maximize #   "minimize" or "maximize"

# Configure the training platform.
# Supported platforms: local, remote, openpai, aml, kubeflow, kubernetes, adl.
trainingService:
  platform: local
  useActiveGpu: false           
  # NOTE: Use "true" if you are using an OS with graphical interface 
  # (e.g. Windows 10, Ubuntu desktop)
  # Reason and details:
  # https://nni.readthedocs.io/en/latest/reference/experiment_config.html#useactivegpu
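
To make the searchSpace section above more concrete, here is a minimal sketch of what drawing one configuration from it could look like. It assumes the usual meaning of the two types used in this file: `choice` picks one of the listed values, and `uniform` draws a float between the two bounds. The `sample` helper is hypothetical and handles only these two types; it is not how NNI samples internally.

```python
import random

# The searchSpace section of the YAML above, written as a Python dict.
search_space = {
    "batch_size": {"_type": "choice", "_value": [16, 32, 64, 128]},
    "hidden_size": {"_type": "choice", "_value": [128, 256, 512, 1024]},
    "lr": {"_type": "choice", "_value": [0.0001, 0.001, 0.01, 0.1]},
    "momentum": {"_type": "uniform", "_value": [0, 1]},
}


def sample(space, rng):
    """Draw one configuration. Only 'choice' and 'uniform' are handled here."""
    config = {}
    for name, spec in space.items():
        if spec["_type"] == "choice":
            config[name] = rng.choice(spec["_value"])
        elif spec["_type"] == "uniform":
            low, high = spec["_value"]
            config[name] = rng.uniform(low, high)
        else:
            raise ValueError(f"unsupported _type: {spec['_type']}")
    return config


rng = random.Random(42)
config = sample(search_space, rng)
print(config)  # one dict with a value for each of the four hyperparameters
```

The three `choice` parameters alone give 4 x 4 x 4 = 64 discrete combinations, and `momentum` is continuous, so with maxTrialNumber set to 10 the tuner can only explore a small part of the space.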

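The model code (mnist.py) is linked below. Only the following needs attention:
• In the main program, line 159 is a call that starts with nni.
• In the def main(args) function, two lines start with nni (lines 118 and 123); these are the calls that communicate with NNI.
That is all it takes, so the integration is short and simple.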

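Also note how the model defines its own parameters, how they are merged with the externally supplied parameters, and how the parameters are then used. (The program in the link below will come up again later; skip it if it does not make sense yet.)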

nni/mnist.py at master · microsoft/nni · GitHub
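
The linked file is the authoritative version. As a condensed sketch of the same pattern, the code below shows where the three nni calls typically sit in a trial script: `nni.get_next_parameter()` in the main program, and `nni.report_intermediate_result()` and `nni.report_final_result()` inside `main`. The helper names (`default_params`, `merge_params`, `train_one_epoch`), the default values, and the accuracy formula are made up for illustration, and the `ImportError` fallback is an assumption added so the sketch also runs where NNI is not installed.

```python
try:
    import nni  # available after `pip install nni`
except ImportError:  # assumption: fall back so the sketch still runs without NNI
    nni = None


def default_params():
    """Hypothetical defaults, used when a hyperparameter is not supplied by NNI."""
    return {"batch_size": 32, "hidden_size": 128, "lr": 0.001, "momentum": 0.5}


def merge_params(defaults, tuned):
    """Values proposed by the tuner override the model's own defaults."""
    merged = dict(defaults)
    merged.update(tuned)
    return merged


def train_one_epoch(params, epoch):
    """Placeholder for real training; returns a made-up accuracy."""
    return min(0.99, 0.5 + 0.1 * epoch + params["lr"])


def main(params, epochs=3):
    accuracy = 0.0
    for epoch in range(epochs):
        accuracy = train_one_epoch(params, epoch)
        if nni is not None:
            nni.report_intermediate_result(accuracy)  # per-epoch metric
    if nni is not None:
        nni.report_final_result(accuracy)  # the metric the tuner optimizes
    return accuracy


if __name__ == "__main__":
    # Ask NNI for the next set of hyperparameters, then merge with the defaults.
    tuned = nni.get_next_parameter() if nni is not None else {}
    print(main(merge_params(default_params(), tuned)))
```

Keeping the defaults in the script means the same file can be run directly for debugging and under an experiment, where the tuner's values take precedence.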

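After several days of concepts, anyone would doze off without some hands-on work. In the next chapter we install NNI on the local machine.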
