Day 7. HashiCorp Nomad: Inspect a job

When a tool ships with a good Web UI, it is easy to forget how to do the same things from the CLI.

Job status

Running nomad job status lists the status of every job:

$ nomad job status
ID                                  Type     Priority  Status   Submit Date
java-batch                          batch    50        dead     2021-09-04T00:59:17+08:00
erp                                 service  50        running  2021-09-07T22:51:48+08:00
XXXXXXX                             service  50        dead     2021-09-03T11:02:47+08:00
test                                service  50        running  2021-09-03T11:22:34+08:00
webserv                             service  50        running  2021-08-27T08:38:14+08:00
web-standby                         service  50        running  2021-08-27T08:38:56+08:00
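
When the list gets long, it can help to filter it. The following is a minimal sketch, assuming the default column layout shown above (ID, Type, Priority, Status, Submit Date); running_jobs is a hypothetical helper name, not part of Nomad. The sample text is copied from the output above so the script runs without a cluster.

```shell
#!/bin/sh
# running_jobs (hypothetical helper): read `nomad job status` output on
# stdin and print the IDs of jobs whose Status column is "running".
running_jobs() {
  # Skip the header row; the 4th whitespace-separated column is Status.
  awk 'NR > 1 && $4 == "running" { print $1 }'
}

# Sample copied from the output above.
sample='ID                                  Type     Priority  Status   Submit Date
java-batch                          batch    50        dead     2021-09-04T00:59:17+08:00
erp                                 service  50        running  2021-09-07T22:51:48+08:00
XXXXXXX                             service  50        dead     2021-09-03T11:02:47+08:00
test                                service  50        running  2021-09-03T11:22:34+08:00
webserv                             service  50        running  2021-08-27T08:38:14+08:00
web-standby                         service  50        running  2021-08-27T08:38:56+08:00'

printf '%s\n' "$sample" | running_jobs
```

Against a live cluster the same filter would be used as nomad job status | running_jobs.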

Add a job ID to see the details of that job, for example nomad job status erp:

$ nomad job status erp
ID            = erp
Name          = erp
Submit Date   = 2021-09-07T22:51:48+08:00
Type          = service
Priority      = 50
Datacenters   = Nomad
Namespace     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
webfront    0       0         2        2       0         0

Latest Deployment
ID          = 263de268
Status      = successful
Description = Deployment completed successfully

Deployed
Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
webfront    2        4       2        2          2021-09-07T23:02:05+08:00

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created     Modified
59e6a6bb  970f4da9  webfront    0        run      running  12m22s ago  11m53s ago
9862d9de  970f4da9  webfront    0        run      running  12m22s ago  11m53s ago
07152288  970f4da9  webfront    0        stop     failed   13m41s ago  12m20s ago
4fa59294  970f4da9  webfront    0        stop     failed   13m41s ago  12m20s ago
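
The detailed output is split into sections, so a script has to pick the right one. Here is a rough sketch that pulls the IDs of failed allocations out of the Allocations section. It assumes the text layout shown above and task group names without spaces; failed_allocs is a hypothetical helper name.

```shell
#!/bin/sh
# failed_allocs (hypothetical helper): read `nomad job status <job>` output
# on stdin and print the IDs of allocations whose Status is "failed".
failed_allocs() {
  awk '
    /^Allocations$/ { in_allocs = 1; next }   # section title line
    in_allocs && NF == 0 { in_allocs = 0 }    # a blank line ends the section
    # Columns: ID, Node ID, Task Group, Version, Desired, Status, ...
    in_allocs && $1 != "ID" && $6 == "failed" { print $1 }
  '
}

# Sample sections copied from the output above.
sample='Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
webfront    0       0         2        2       0         0

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created     Modified
59e6a6bb  970f4da9  webfront    0        run      running  12m22s ago  11m53s ago
9862d9de  970f4da9  webfront    0        run      running  12m22s ago  11m53s ago
07152288  970f4da9  webfront    0        stop     failed   13m41s ago  12m20s ago
4fa59294  970f4da9  webfront    0        stop     failed   13m41s ago  12m20s ago'

printf '%s\n' "$sample" | failed_allocs
```

For the sample this prints 07152288 and 4fa59294, the two allocations marked failed above.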

Job evaluation status

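A job evaluation describes the scheduling state of a job. Evaluations can be listed with the -evals flag.
For example, the job below started with job-register, went through alloc-failure along the way, and then reached deployment-watcher.
If an evaluation shows Placement Failures = true, inspect it with nomad eval status EvaluationID.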

$ nomad job status -evals erp
ID            = erp
Name          = erp
Submit Date   = 2021-09-07T22:51:48+08:00
Type          = service
Priority      = 50
Datacenters   = Nomad
Namespace     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
webfront    0       0         2        2       0         0

Evaluations
ID        Priority  Triggered By        Status    Placement Failures
1ba4c1dc  50        deployment-watcher  complete  false
00d90ff5  50        alloc-failure       complete  false
68424b01  50        alloc-failure       complete  false
c260cdb5  50        deployment-watcher  complete  false
c87b22e4  50        alloc-failure       complete  false
e9e513e1  50        job-register        complete  false

Latest Deployment
ID          = 263de268
Status      = successful
Description = Deployment completed successfully

Deployed
Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
webfront    2        4       2        2          2021-09-07T23:02:05+08:00

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created     Modified
59e6a6bb  970f4da9  webfront    0        run      running  15m14s ago  14m45s ago
9862d9de  970f4da9  webfront    0        run      running  15m14s ago  14m45s ago
07152288  970f4da9  webfront    0        stop     failed   16m33s ago  15m12s ago
4fa59294  970f4da9  webfront    0        stop     failed   16m33s ago  15m12s ago
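
Checking the Placement Failures column by eye gets tedious on busy jobs. The sketch below looks for evaluations flagged true and prints the follow-up command for each one. It assumes the column layout shown above; placement_failures is a hypothetical helper name. It only prints the nomad eval status commands instead of running them.

```shell
#!/bin/sh
# placement_failures (hypothetical helper): read `nomad job status -evals <job>`
# output on stdin and print the IDs of evaluations with Placement Failures = true.
placement_failures() {
  awk '
    /^Evaluations$/ { in_evals = 1; next }   # section title line
    in_evals && NF == 0 { in_evals = 0 }     # a blank line ends the section
    # Columns: ID, Priority, Triggered By, Status, Placement Failures
    in_evals && $1 != "ID" && $5 == "true" { print $1 }
  '
}

# Sample sections copied from the output above.
sample='Evaluations
ID        Priority  Triggered By        Status    Placement Failures
1ba4c1dc  50        deployment-watcher  complete  false
00d90ff5  50        alloc-failure       complete  false
68424b01  50        alloc-failure       complete  false
c260cdb5  50        deployment-watcher  complete  false
c87b22e4  50        alloc-failure       complete  false
e9e513e1  50        job-register        complete  false

Latest Deployment
ID          = 263de268
Status      = successful'

ids=$(printf '%s\n' "$sample" | placement_failures)
if [ -z "$ids" ]; then
  echo "no placement failures"
else
  # Print the command to run for each flagged evaluation.
  for id in $ids; do
    echo "nomad eval status $id"
  done
fi
```

Every evaluation in the sample has Placement Failures = false, so the script reports no placement failures.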

Job allocation status

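A job allocation is the state of a job after it has been placed on a node, including its CPU, memory, and disk resources.
When an allocation fails, the related log information is recorded as well.
Use nomad alloc status AllocationID to view it: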

$ nomad alloc status 07152288
ID                   = 07152288-c0f0-dc4c-3133-110c38ea2c1f
Eval ID              = e9e513e1
Name                 = erp.webfront[0]
Node ID              = 970f4da9
Node Name            = nomad-worker
Job ID               = erp
Job Version          = 0
Client Status        = failed
Client Description   = Failed tasks
Desired Status       = stop
Desired Description  = alloc was rescheduled because it failed
Created              = 26m27s ago
Modified             = 25m6s ago
Deployment ID        = 263de268
Deployment Health    = unhealthy
Replacement Alloc ID = 9862d9de

Task "nginx" is "dead"
Task Resources
CPU      Memory   Disk     Addresses
200 MHz  128 MiB  300 MiB  web: 10.x.x.x:12345

Host Volumes:
ID   Read Only
test  false

Task Events:
Started At     = N/A
Finished At    = 2021-09-07T14:51:34Z
Total Restarts = 2
Last Restart   = 2021-09-07T22:51:01+08:00

Recent Events:
Time                       Type             Description
2021-09-07T22:51:36+08:00  Killing          Sent interrupt. Waiting 5s before force killing
2021-09-07T22:51:34+08:00  Alloc Unhealthy  Unhealthy because of failed task
2021-09-07T22:51:34+08:00  Not Restarting   Exceeded allowed attempts 2 in interval 30m0s and mode is "fail"
2021-09-07T22:51:34+08:00  Driver Failure   Failed to pull `nginx:1.21`: API error (500): Head https://registry-1.docker.io/v2/library/nginx/manifests/1.21: Get https://auth.docker.io/token?scope=repository%3Alibrary%2Fnginx%3Apull&service=registry.docker.io: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2021-09-07T22:51:18+08:00  Driver           Downloading image
2021-09-07T22:51:01+08:00  Restarting       Task restarting in 17.735293963s
2021-09-07T22:51:01+08:00  Driver Failure   Failed to pull `nginx:1.21`: API error (500): Head https://registry-1.docker.io/v2/library/nginx/manifests/1.21: net/http: TLS handshake timeout
2021-09-07T22:50:45+08:00  Driver           Downloading image
2021-09-07T22:50:29+08:00  Restarting       Task restarting in 15.739277426s
2021-09-07T22:50:29+08:00  Driver Failure   Failed to pull `nginx:1.21`: API error (500): Head https://registry-1.docker.io/v2/library/nginx/manifests/1.21: net/http: TLS handshake timeout
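
The Recent Events table is usually where the root cause shows up. This is a small sketch that picks out only the Driver Failure events and counts them, assuming the event layout shown above (timestamp first, then the event type); driver_failures is a hypothetical helper name.

```shell
#!/bin/sh
# driver_failures (hypothetical helper): read `nomad alloc status <alloc>`
# output on stdin, print the timestamp of every "Driver Failure" event,
# then print how many were found.
driver_failures() {
  awk '
    $2 == "Driver" && $3 == "Failure" { print $1; n++ }
    END { print n + 0, "driver failure(s)" }
  '
}

# Recent Events rows copied from the output above.
sample='Recent Events:
Time                       Type             Description
2021-09-07T22:51:36+08:00  Killing          Sent interrupt. Waiting 5s before force killing
2021-09-07T22:51:34+08:00  Alloc Unhealthy  Unhealthy because of failed task
2021-09-07T22:51:34+08:00  Not Restarting   Exceeded allowed attempts 2 in interval 30m0s and mode is "fail"
2021-09-07T22:51:34+08:00  Driver Failure   Failed to pull `nginx:1.21`: API error (500): Head https://registry-1.docker.io/v2/library/nginx/manifests/1.21: Get https://auth.docker.io/token?scope=repository%3Alibrary%2Fnginx%3Apull&service=registry.docker.io: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2021-09-07T22:51:18+08:00  Driver           Downloading image
2021-09-07T22:51:01+08:00  Restarting       Task restarting in 17.735293963s
2021-09-07T22:51:01+08:00  Driver Failure   Failed to pull `nginx:1.21`: API error (500): Head https://registry-1.docker.io/v2/library/nginx/manifests/1.21: net/http: TLS handshake timeout
2021-09-07T22:50:45+08:00  Driver           Downloading image
2021-09-07T22:50:29+08:00  Restarting       Task restarting in 15.739277426s
2021-09-07T22:50:29+08:00  Driver Failure   Failed to pull `nginx:1.21`: API error (500): Head https://registry-1.docker.io/v2/library/nginx/manifests/1.21: net/http: TLS handshake timeout'

printf '%s\n' "$sample" | driver_failures
```

For this allocation the filter finds three Driver Failure events, all of them failed pulls of the nginx:1.21 image. That matches the two restarts followed by the final Not Restarting event.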
