Day 22 - State manipulation, part 4: state rm, making Terraform forget the past

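Having worked through state mv, you should now have a more thorough understanding of Terraform state manipulation. Next, we use state rm to force Terraform to remove entries from its state.

Course content and code are on GitHub: https://github.com/chechiachang/terraform-30-days

After the competition, the articles will be collected on my personal blog: http://chechia.net/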



Terraform state rm

https://www.terraform.io/docs/cli/commands/state/rm.html

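As the official Terraform documentation describes, let's again compare what happens to the three parts:

  • .tf stays unchanged
  • state rm removes the existing entry from the Terraform state
  • the remote resource still exists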

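What effect does removing the state entry have on Terraform? Recall what state is for:

  • it holds metadata about the remote resource (for example, its id)
  • it holds Terraform's intermediate data

Removing both means Terraform "forgets" that this resource ever existed:

  • if the remote resource still exists, Terraform no longer has any link to it
  • the remote resource becomes an orphan resource and can no longer be managed through Terraform (unless it is imported again)
  • if the .tf still contains the resource, plan and apply will only try to create a new resource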
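
To make the "forgetting" concrete, here is a minimal Python sketch of the comparison between .tf and state. It is a toy model, not Terraform's real implementation: the plan and state_rm functions, the shortened addresses, and the hypothetical-id values are all made up for illustration.

```python
# Toy model only: Terraform's real planner is far more involved.

def plan(config_addresses, state):
    """Return {address: action} by comparing .tf addresses with state."""
    actions = {}
    for address in config_addresses:
        if address not in state:
            # No state entry: Terraform believes nothing exists yet.
            actions[address] = "create"
    for address in state:
        if address not in config_addresses:
            # State entry without matching config.
            actions[address] = "destroy"
    return actions


def state_rm(state, address):
    """Forget one address. The remote resource is not touched."""
    remaining = dict(state)
    remaining.pop(address)
    return remaining


config = ["azurerm_subnet.subnet[0]", "azurerm_subnet.subnet[1]"]
state = {
    "azurerm_subnet.subnet[0]": {"id": "hypothetical-id-0"},
    "azurerm_subnet.subnet[1]": {"id": "hypothetical-id-1"},
}
remote_ids = {"hypothetical-id-0", "hypothetical-id-1"}  # what really exists

after_rm = state_rm(state, "azurerm_subnet.subnet[0]")
managed_ids = {entry["id"] for entry in after_rm.values()}

print(plan(config, state))       # {}
print(plan(config, after_rm))    # {'azurerm_subnet.subnet[0]': 'create'}
print(remote_ids - managed_ids)  # {'hypothetical-id-0'}
```

The last line is the orphan: the id still exists remotely, but no state entry points to it any more.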

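For these reasons we use state rm only in very rare situations. It is a last resort: usually something has gone seriously wrong between the state and the provider, so that a state entry cannot be removed correctly, and only then do we operate by hand.

  • and even then, we first confirm that the remote resource has actually been deleted
  • otherwise the orphan resources still cost money, and they can affect other healthy resources in places you cannot see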

Let's do state rm

Once again, back to the azure/foundation/compute_network example

cd azure/foundation/compute_network
terragrunt state list

module.network.data.azurerm_resource_group.network
module.network.azurerm_subnet.subnet[0]
module.network.azurerm_subnet.subnet[1]
module.network.azurerm_subnet.subnet[2]
module.network.azurerm_virtual_network.vnet

We can run state rm against a single resource address,
or against the whole module.

Removing the whole module means more addresses to restore afterwards.
Here we run rm purely to demonstrate it, with no real need, so we only remove one subnet.
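
The difference between targeting one instance and targeting the whole module can be sketched like this. The matches and selected helpers are hypothetical and use a simplified prefix rule, which is an assumption and not Terraform's actual address parser; the address list is the state list output above.

```python
# Simplified address matching, for illustration only.

def matches(target, address):
    """A target selects an address if it is that address, an enclosing
    module, or the resource whose instances are indexed with [n]."""
    return (
        address == target
        or address.startswith(target + ".")
        or address.startswith(target + "[")
    )


state_list = [
    "module.network.data.azurerm_resource_group.network",
    "module.network.azurerm_subnet.subnet[0]",
    "module.network.azurerm_subnet.subnet[1]",
    "module.network.azurerm_subnet.subnet[2]",
    "module.network.azurerm_virtual_network.vnet",
]


def selected(target):
    """Addresses that a state rm on this target would drop (in this model)."""
    return [a for a in state_list if matches(target, a)]


print(len(selected("module.network")))                        # 5
print(len(selected("module.network.azurerm_subnet.subnet")))  # 3
print(selected("module.network.azurerm_subnet.subnet[0]"))
# ['module.network.azurerm_subnet.subnet[0]']
```

The wider the target, the more addresses have to be restored afterwards, which is why the demo sticks to a single instance.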

As always, when a dry run is available, dry-run first

terragrunt state rm --dry-run "module.network.azurerm_subnet.subnet[0]"

Would remove "module.network.azurerm_subnet.subnet[0]"

Then the actual state rm, this time without --dry-run

terragrunt state rm "module.network.azurerm_subnet.subnet[0]"

Removed module.network.azurerm_subnet.subnet[0]
Successfully removed 1 resource instance(s).

state list shows that the state address is gone

terragrunt state list
module.network.data.azurerm_resource_group.network
module.network.azurerm_subnet.subnet[1]
module.network.azurerm_subnet.subnet[2]
module.network.azurerm_virtual_network.vnet

Try plan

terragrunt plan

Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # module.network.azurerm_subnet.subnet[0] will be created
  + resource "azurerm_subnet" "subnet" {
      + address_prefix                                 = (known after apply)
      + address_prefixes                               = [
          + "10.2.1.0/24",
        ]
      + enforce_private_link_endpoint_network_policies = false
      + enforce_private_link_service_network_policies  = false
      + id                                             = (known after apply)
      + name                                           = "dev-1"
      + resource_group_name                            = "terraform-30-days"
      + service_endpoints                              = []
      + virtual_network_name                           = "acctvnet"
    }

Plan: 1 to add, 0 to change, 0 to destroy.

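Terraform thinks it should create, adding one resource:

  • because .tf still contains module.network.azurerm_subnet.subnet[0]
  • while the state does not
  • yet the subnet is still visible in the Azure web console: we only removed the state entry, and the remote resource still exists
  • module.network.azurerm_subnet.subnet[0] was created by Terraform, but once the state entry was removed Terraform lost every link to the remote resource. It has forgotten that this subnet exists, so the subnet is now an orphan resource that cannot be managed through Terraform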

az-cli list

Query the actual vnet / subnet status with az-cli

  • subnet/dev-1 does indeed exist

az network vnet list
{}

az network vnet list | jq '.[].id'

"/subscriptions/.../resourceGroups/MC_base_general_southeastasia/providers/Microsoft.Network/virtualNetworks/aks-vnet-39532258"
"/subscriptions/.../resourceGroups/terraform-30-days/providers/Microsoft.Network/virtualNetworks/acctvnet"

az network vnet subnet list -g terraform-30-days --vnet-name acctvnet
{}

az network vnet subnet list -g terraform-30-days --vnet-name acctvnet | jq '.[].id'

"/subscriptions/.../resourceGroups/terraform-30-days/providers/Microsoft.Network/virtualNetworks/acctvnet/subnets/dev-1"
"/subscriptions/.../resourceGroups/terraform-30-days/providers/Microsoft.Network/virtualNetworks/acctvnet/subnets/dev-3"
"/subscriptions/.../resourceGroups/terraform-30-days/providers/Microsoft.Network/virtualNetworks/acctvnet/subnets/dev-2"

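The same check can be scripted: anything that exists remotely but has no state entry is an orphan. This is a minimal sketch, assuming subnet[1] and subnet[2] correspond to dev-2 and dev-3 (the output above does not confirm that mapping). The "..." in the IDs is kept exactly as elided in the az output.

```python
# Subnet IDs as returned by: az network vnet subnet list ... | jq '.[].id'
PREFIX = (
    "/subscriptions/.../resourceGroups/terraform-30-days"
    "/providers/Microsoft.Network/virtualNetworks/acctvnet/subnets/"
)
remote_ids = [PREFIX + "dev-1", PREFIX + "dev-3", PREFIX + "dev-2"]

# ASSUMPTION: the two subnets still in state are dev-2 and dev-3.
managed_ids = {PREFIX + "dev-2", PREFIX + "dev-3"}


def find_orphans(remote, managed):
    """Return names of remote resources that no state entry points to."""
    # Azure resource IDs are case-insensitive, so compare in lowercase.
    known = {resource_id.lower() for resource_id in managed}
    return [
        resource_id.rsplit("/", 1)[-1]
        for resource_id in remote
        if resource_id.lower() not in known
    ]


print(find_orphans(remote_ids, managed_ids))  # ['dev-1']
```
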
How to recover? How to undo state rm

So how do we add back the subnet we removed with state rm? We can try terraform apply

terragrunt apply

Terraform will perform the following actions:

  # module.network.azurerm_subnet.subnet[0] will be created
  + resource "azurerm_subnet" "subnet" {
      + address_prefix                                 = (known after apply)
      + address_prefixes                               = [
          + "10.2.1.0/24",
        ]
      + enforce_private_link_endpoint_network_policies = false
      + enforce_private_link_service_network_policies  = false
      + id                                             = (known after apply)
      + name                                           = "dev-1"
      + resource_group_name                            = "terraform-30-days"
      + service_endpoints                              = []
      + virtual_network_name                           = "acctvnet"
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

module.network.azurerm_subnet.subnet[0]: Creating...
╷
│ Error: A resource with the ID "/subscriptions/6fce7237-7e8e-4053-8e7d-ecf8a7c392ce/resourceGroups/terraform-30-days/providers/Microsoft.Network/virtualNetworks/acctvnet/subnets/dev-1" already exists - to be managed via Terraform this resource needs to be imported into the State. Please see the resource documentation for "azurerm_subnet" for more information.
│
│   with module.network.azurerm_subnet.subnet[0],
│   on .terraform/modules/network/main.tf line 15, in resource "azurerm_subnet" "subnet":
│   15: resource "azurerm_subnet" "subnet" {
│
╵
ERRO[0129] 1 error occurred:
	* exit status 1

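terragrunt apply fails: the error reports that the resource already exists

  • virtualNetworks/acctvnet/subnets/dev-1 is still there, and Terraform is trying to apply a resource with the same name (id), so it naturally fails

What we want is to add the removed state back, but Terraform has no concept of "adding removed state back"

  • remember from the state basics: state is the link between a .tf resource and a remote resource
  • state rm forces Terraform to forget that the state entry exists
  • with a different module, even if apply succeeded, it would create a new resource while the original stays where it is. You would end up with two remote resources: one managed through Terraform, the other an orphan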

So really, how to fix this?

Now the whole azure/foundation/compute_network is unusable, and our colleagues are about to get grumpy XD. How do we repair it?

The error ends with this message:

 to be managed via Terraform this resource needs to be imported into the State. Please see the resource documentation for "azurerm_subnet" for more information.

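which means:

  • Terraform tried to apply, but found a resource with the same name
  • Terraform suggests importing this resource, which it does not recognize, so that Terraform can manage it
  • that is, building the link from the existing remote resource to the .tf resource

How exactly do we do that? See the next lesson, which covers terraform import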

