Server Admin Log/Archive 80

From Wikitech

2024-05-31

  • 23:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:54 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:52 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:52 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:45 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:45 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:43 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:43 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:41 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:41 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:39 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:39 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:36 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:36 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:34 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:34 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:32 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:32 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:30 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:30 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:28 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:28 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:26 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:26 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:26 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:26 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:24 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:24 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:23 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:23 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:22 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:22 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:21 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:21 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:20 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:20 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:19 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:19 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:18 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:18 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:17 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:17 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:16 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:16 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:14 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:14 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:13 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:13 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:12 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:12 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:10 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:10 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:05 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:05 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:01 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:01 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:54 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:52 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:52 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:50 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:50 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:48 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:44 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:44 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:39 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:39 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:37 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:37 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:35 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:35 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:30 logmsgbot: nshahquinn-wmf@deploy1002 Finished deploy [airflow-dags/analytics_product@f0284c6]: (no justification provided) (duration: 00m 03s)
  • 22:30 logmsgbot: nshahquinn-wmf@deploy1002 Started deploy [airflow-dags/analytics_product@f0284c6]: (no justification provided)
  • 22:29 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:29 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:27 logmsgbot: nshahquinn-wmf@deploy1002 Finished deploy [airflow-dags/analytics_product@f0284c6]: (no justification provided) (duration: 00m 07s)
  • 22:27 logmsgbot: nshahquinn-wmf@deploy1002 Started deploy [airflow-dags/analytics_product@f0284c6]: (no justification provided)
  • 22:19 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:18 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:14 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:12 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:11 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:10 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2199.codfw.wmnet with reason: Maintenance
  • 22:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2199.codfw.wmnet with reason: Maintenance
  • 22:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T364069)', diff saved to https://phabricator.wikimedia.org/P63803 and previous config saved to /var/cache/conftool/dbconfig/20240531-220920-marostegui.json
  • 22:07 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:07 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:05 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:05 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:05 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:05 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:03 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:03 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:03 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:03 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:01 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:01 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:01 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:01 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:59 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:59 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:57 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:57 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:57 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:57 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:55 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:54 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P63802 and previous config saved to /var/cache/conftool/dbconfig/20240531-215412-marostegui.json
  • 21:54 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:54 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:52 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:52 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:52 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:52 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:50 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:50 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:48 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:44 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:44 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:41 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:41 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P63801 and previous config saved to /var/cache/conftool/dbconfig/20240531-213904-marostegui.json
  • 21:38 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:38 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:36 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:36 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:34 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:34 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:32 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:32 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:30 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:30 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:28 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:28 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:26 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:24 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T364069)', diff saved to https://phabricator.wikimedia.org/P63800 and previous config saved to /var/cache/conftool/dbconfig/20240531-212356-marostegui.json
  • 21:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:21 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:21 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1228 (T364299)', diff saved to https://phabricator.wikimedia.org/P63799 and previous config saved to /var/cache/conftool/dbconfig/20240531-212101-marostegui.json
  • 21:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1228.eqiad.wmnet with reason: Maintenance
  • 21:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1228.eqiad.wmnet with reason: Maintenance
  • 21:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T364299)', diff saved to https://phabricator.wikimedia.org/P63798 and previous config saved to /var/cache/conftool/dbconfig/20240531-212038-marostegui.json
  • 21:19 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:19 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:17 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:07 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:07 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P63797 and previous config saved to /var/cache/conftool/dbconfig/20240531-210530-marostegui.json
  • 21:05 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:05 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:05 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:05 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:03 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:03 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:03 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:03 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:58 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:57 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:54 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:52 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:52 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P63796 and previous config saved to /var/cache/conftool/dbconfig/20240531-205022-marostegui.json
  • 20:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:46 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:46 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:40 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:40 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:38 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:38 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:37 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:36 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T364299)', diff saved to https://phabricator.wikimedia.org/P63795 and previous config saved to /var/cache/conftool/dbconfig/20240531-203514-marostegui.json
  • 20:34 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:34 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:32 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:32 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:30 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:30 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:28 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:28 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:26 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:26 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:24 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:24 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:22 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:21 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:20 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:19 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:16 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:16 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:14 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:14 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:12 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:12 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:11 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:11 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:10 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:10 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:08 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:08 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:06 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:06 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:01 rzl: sudo -i reprepro -C main include bullseye-wikimedia /home/rzl/httpbb/buster/httpbb_0.0.5-1+deb11u1_amd64.changes
  • 20:00 rzl: sudo -i reprepro -C main include buster-wikimedia /home/rzl/httpbb/buster/httpbb_0.0.5-1_amd64.changes
  • 19:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:53 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:53 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:51 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:51 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2205 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P63794 and previous config saved to /var/cache/conftool/dbconfig/20240531-194131-ladsgroup.json
  • 19:40 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:40 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2204 (T352010)', diff saved to https://phabricator.wikimedia.org/P63793 and previous config saved to /var/cache/conftool/dbconfig/20240531-194037-ladsgroup.json
  • 19:40 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2204.codfw.wmnet with reason: Maintenance
  • 19:40 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2204.codfw.wmnet with reason: Maintenance
  • 19:30 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:30 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:28 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:28 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:26 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2205 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P63792 and previous config saved to /var/cache/conftool/dbconfig/20240531-192625-ladsgroup.json
  • 19:26 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:24 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:22 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:21 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:19 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:19 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:17 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:12 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:12 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2205 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P63791 and previous config saved to /var/cache/conftool/dbconfig/20240531-191119-ladsgroup.json
  • 19:10 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:10 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:06 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:06 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:03 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:03 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:02 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:02 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P63790 and previous config saved to /var/cache/conftool/dbconfig/20240531-190138-root.json
  • 19:00 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:00 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:58 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:58 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:57 mutante: Phabricator - added 'JoelyRooke-WMDE (Jo)' to group WMF-NDA (https://phabricator.wikimedia.org/project/profile/61/) (T366145)
  • 18:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2205 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P63789 and previous config saved to /var/cache/conftool/dbconfig/20240531-185613-ladsgroup.json
  • 18:55 mutante: LDAP - added uid joelyrookewmde to groups wmde and nda (T366145)
  • 18:55 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:55 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:54 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:53 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:53 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:50 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:50 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:50 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P63788 and previous config saved to /var/cache/conftool/dbconfig/20240531-184632-root.json
  • 18:40 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:40 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:35 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:35 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:33 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:33 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:31 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:31 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P63787 and previous config saved to /var/cache/conftool/dbconfig/20240531-183125-root.json
  • 18:30 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:29 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:26 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:26 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:22 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:22 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:18 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:18 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P63785 and previous config saved to /var/cache/conftool/dbconfig/20240531-181619-root.json
  • 18:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:11 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:11 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:07 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:07 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:03 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:03 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:01 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:01 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P63782 and previous config saved to /var/cache/conftool/dbconfig/20240531-180113-root.json
  • 17:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:50 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:50 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:48 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P63781 and previous config saved to /var/cache/conftool/dbconfig/20240531-174607-root.json
  • 17:44 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:44 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:42 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:42 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:39 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:39 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:37 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:37 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:35 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:35 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:33 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:33 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P63780 and previous config saved to /var/cache/conftool/dbconfig/20240531-173101-root.json
  • 17:25 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:23 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:21 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:21 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:19 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:19 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:19 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:19 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:17 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:11 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:11 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:06 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:06 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:51 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:51 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:45 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:45 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:41 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:41 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:32 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:32 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:29 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:29 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:26 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:26 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:23 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:18 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1219 (T364299)', diff saved to https://phabricator.wikimedia.org/P63778 and previous config saved to /var/cache/conftool/dbconfig/20240531-161807-marostegui.json
  • 16:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 16:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 16:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T364299)', diff saved to https://phabricator.wikimedia.org/P63777 and previous config saved to /var/cache/conftool/dbconfig/20240531-161744-marostegui.json
  • 16:17 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:11 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:11 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:06 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:06 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P63775 and previous config saved to /var/cache/conftool/dbconfig/20240531-160236-marostegui.json
  • 16:00 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:00 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:57 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:57 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P63774 and previous config saved to /var/cache/conftool/dbconfig/20240531-154728-marostegui.json
  • 15:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:44 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:44 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:44 cgoubert@cumin1002: conftool action : set/pooled=yes; selector: name=parse1002.eqiad.wmnet,cluster=kubernetes,service=kubesvc
  • 15:43 claime: pooling and uncordoning parse1002 - T363086
  • 15:39 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 15:39 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 15:39 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 15:39 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 15:36 claime: homer 'cr*eqiad*' commit 'T363086'
  • 15:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T364299)', diff saved to https://phabricator.wikimedia.org/P63773 and previous config saved to /var/cache/conftool/dbconfig/20240531-153220-marostegui.json
  • 15:31 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:30 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:17 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:10 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:10 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:08 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:08 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:07 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:05 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:05 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:05 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 14:56 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:56 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:47 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:45 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:45 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:43 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:43 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:40 vriley@cumin1002: START - Cookbook sre.hosts.provision for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:39 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:39 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:38 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:37 klausman@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
  • 14:37 vriley@cumin1002: START - Cookbook sre.hosts.provision for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:35 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:35 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:33 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:33 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:32 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
  • 14:29 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:29 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:25 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:24 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 14:21 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:21 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:19 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:19 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:10 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:10 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:08 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:08 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:56 marostegui@cumin1002: dbctl commit (dc=all): 'db1209 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P63772 and previous config saved to /var/cache/conftool/dbconfig/20240531-135629-root.json
  • 13:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:54 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:53 dcausse@deploy1002: Finished deploy [airflow-dags/search@b2f7795]: search: fix NTripleGenerator arguments (duration: 00m 21s)
  • 13:53 dcausse@deploy1002: Started deploy [airflow-dags/search@b2f7795]: search: fix NTripleGenerator arguments
  • 13:52 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:52 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:50 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:50 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:49 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 13:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:46 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 13:44 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:44 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:42 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:42 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:41 marostegui@cumin1002: dbctl commit (dc=all): 'db1209 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P63771 and previous config saved to /var/cache/conftool/dbconfig/20240531-134122-root.json
  • 13:40 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:40 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:38 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:38 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:37 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 13:36 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:36 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:34 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:34 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:31 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:31 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:29 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:29 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:28 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 13:27 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:27 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:26 marostegui@cumin1002: dbctl commit (dc=all): 'db1209 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P63770 and previous config saved to /var/cache/conftool/dbconfig/20240531-132616-root.json
  • 13:25 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:23 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:21 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:21 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:19 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:19 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:17 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 13:14 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:14 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:11 marostegui@cumin1002: dbctl commit (dc=all): 'db1209 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P63769 and previous config saved to /var/cache/conftool/dbconfig/20240531-131110-root.json
  • 13:07 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:07 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:04 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:04 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:02 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:02 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:00 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:00 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:58 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:58 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:56 marostegui@cumin1002: dbctl commit (dc=all): 'db1209 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P63768 and previous config saved to /var/cache/conftool/dbconfig/20240531-125604-root.json
  • 12:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:54 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:42 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:42 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:40 marostegui@cumin1002: dbctl commit (dc=all): 'db1209 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P63767 and previous config saved to /var/cache/conftool/dbconfig/20240531-124058-root.json
  • 12:40 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:40 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2173', diff saved to https://phabricator.wikimedia.org/P63766 and previous config saved to /var/cache/conftool/dbconfig/20240531-123903-root.json
  • 12:33 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:33 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:31 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:31 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1209.eqiad.wmnet with OS bookworm
  • 12:26 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:25 marostegui@cumin1002: dbctl commit (dc=all): 'db1209 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P63765 and previous config saved to /var/cache/conftool/dbconfig/20240531-122552-root.json
  • 12:24 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:21 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2038.codfw.wmnet with OS bookworm
  • 12:19 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:19 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:17 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:15 dcausse@deploy1002: Finished deploy [airflow-dags/search@45de44b]: search: bump rdf-spark-tools to 0.3.141 (duration: 00m 21s)
  • 12:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:15 dcausse@deploy1002: Started deploy [airflow-dags/search@45de44b]: search: bump rdf-spark-tools to 0.3.141
  • 12:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1209.eqiad.wmnet with reason: host reimage
  • 12:08 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:08 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:06 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:06 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1209.eqiad.wmnet with reason: host reimage
  • 12:03 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2038.codfw.wmnet with reason: host reimage
  • 12:00 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2038.codfw.wmnet with reason: host reimage
  • 12:00 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:00 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:58 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:58 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:53 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:53 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 11:52 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 11:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T352010)', diff saved to https://phabricator.wikimedia.org/P63764 and previous config saved to /var/cache/conftool/dbconfig/20240531-115244-ladsgroup.json
  • 11:52 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:51 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1209.eqiad.wmnet with OS bookworm
  • 11:51 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:46 jiji@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1039.eqiad.wmnet with OS bookworm
  • 11:45 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:45 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:43 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:43 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:42 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2038.codfw.wmnet with OS bookworm
  • 11:41 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:41 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:39 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:39 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:37 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:37 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P63763 and previous config saved to /var/cache/conftool/dbconfig/20240531-113735-ladsgroup.json
  • 11:28 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:27 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:26 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2039.codfw.wmnet with OS bookworm
  • 11:25 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:24 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P63762 and previous config saved to /var/cache/conftool/dbconfig/20240531-112227-ladsgroup.json
  • 11:22 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:21 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:20 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:19 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:18 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1218 (T364299)', diff saved to https://phabricator.wikimedia.org/P63761 and previous config saved to /var/cache/conftool/dbconfig/20240531-111833-marostegui.json
  • 11:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 11:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 11:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T364299)', diff saved to https://phabricator.wikimedia.org/P63760 and previous config saved to /var/cache/conftool/dbconfig/20240531-111809-marostegui.json
  • 11:17 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:16 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:09 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2039.codfw.wmnet with reason: host reimage
  • 11:08 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:08 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T352010)', diff saved to https://phabricator.wikimedia.org/P63759 and previous config saved to /var/cache/conftool/dbconfig/20240531-110719-ladsgroup.json
  • 11:06 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2039.codfw.wmnet with reason: host reimage
  • 11:04 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:04 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2179 (T364069)', diff saved to https://phabricator.wikimedia.org/P63758 and previous config saved to /var/cache/conftool/dbconfig/20240531-110347-marostegui.json
  • 11:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 11:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 11:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T364069)', diff saved to https://phabricator.wikimedia.org/P63757 and previous config saved to /var/cache/conftool/dbconfig/20240531-110324-marostegui.json
  • 11:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P63756 and previous config saved to /var/cache/conftool/dbconfig/20240531-110301-marostegui.json
  • 10:55 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:55 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:54 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2205.codfw.wmnet with reason: Maintenance
  • 10:53 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2205.codfw.wmnet with reason: Maintenance
  • 10:53 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:53 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P63755 and previous config saved to /var/cache/conftool/dbconfig/20240531-104816-marostegui.json
  • 10:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P63754 and previous config saved to /var/cache/conftool/dbconfig/20240531-104753-marostegui.json
  • 10:47 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2039.codfw.wmnet with OS bookworm
  • 10:47 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1039.eqiad.wmnet with OS bookworm
  • 10:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:44 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:44 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:35 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:35 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P63753 and previous config saved to /var/cache/conftool/dbconfig/20240531-103308-marostegui.json
  • 10:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T364299)', diff saved to https://phabricator.wikimedia.org/P63752 and previous config saved to /var/cache/conftool/dbconfig/20240531-103245-marostegui.json
  • 10:23 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:21 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:21 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:21 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2040.codfw.wmnet with OS bookworm
  • 10:18 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T364069)', diff saved to https://phabricator.wikimedia.org/P63751 and previous config saved to /var/cache/conftool/dbconfig/20240531-101800-marostegui.json
  • 10:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:14 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1040.eqiad.wmnet with OS bookworm
  • 10:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:08 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:08 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:06 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:06 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:04 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:04 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:03 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2040.codfw.wmnet with reason: host reimage
  • 10:02 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:02 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:00 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2040.codfw.wmnet with reason: host reimage
  • 09:59 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:59 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:58 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1040.eqiad.wmnet with reason: host reimage
  • 09:57 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:57 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:55 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1040.eqiad.wmnet with reason: host reimage
  • 09:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:42 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2040.codfw.wmnet with OS bookworm
  • 09:41 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1040.eqiad.wmnet with OS bookworm
  • 09:33 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:33 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:31 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:31 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:29 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:29 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:27 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:27 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:25 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2041.codfw.wmnet with OS bookworm
  • 09:24 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:24 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:22 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:22 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:21 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:20 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:19 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1041.eqiad.wmnet with OS bookworm
  • 09:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:11 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:11 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:08 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2041.codfw.wmnet with reason: host reimage
  • 09:06 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:06 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:06 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2041.codfw.wmnet with reason: host reimage
  • 09:04 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:04 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:03 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1041.eqiad.wmnet with reason: host reimage
  • 09:02 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:02 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:00 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1041.eqiad.wmnet with reason: host reimage
  • 08:52 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:52 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:50 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:50 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:48 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:47 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2041.codfw.wmnet with OS bookworm
  • 08:47 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1041.eqiad.wmnet with OS bookworm
  • 08:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:41 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:41 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:36 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:36 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:16 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:16 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:14 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:14 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:12 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1042.eqiad.wmnet with OS bookworm
  • 08:12 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:12 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:05 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:05 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:03 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:03 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:02 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2042.codfw.wmnet with reason: host reimage
  • 07:59 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:58 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:58 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2042.codfw.wmnet with reason: host reimage
  • 07:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:56 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1042.eqiad.wmnet with reason: host reimage
  • 07:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:54 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:54 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1042.eqiad.wmnet with reason: host reimage
  • 07:53 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:52 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:48 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:40 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2042.codfw.wmnet with OS bookworm
  • 07:40 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1042.eqiad.wmnet with OS bookworm
  • 07:40 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:40 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:38 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:38 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:32 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 13335
  • 07:31 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:31 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:30 mvernon@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on moss-fe1002.eqiad.wmnet with reason: in development
  • 07:30 mvernon@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on moss-fe1002.eqiad.wmnet with reason: in development
  • 07:28 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:28 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:20 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:20 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:18 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:18 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:16 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:16 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:14 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:14 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:12 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:12 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:05 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:05 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:01 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:01 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:59 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:59 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:58 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2043.codfw.wmnet with OS bookworm
  • 06:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:54 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:52 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1043.eqiad.wmnet with OS bookworm
  • 06:52 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:52 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:44 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:44 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:42 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:42 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:41 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2043.codfw.wmnet with reason: host reimage
  • 06:40 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:40 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:38 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2043.codfw.wmnet with reason: host reimage
  • 06:36 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1043.eqiad.wmnet with reason: host reimage
  • 06:33 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:33 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:33 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 13335
  • 06:33 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1043.eqiad.wmnet with reason: host reimage
  • 06:25 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:23 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:20 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1043.eqiad.wmnet with OS bookworm
  • 06:20 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2043.codfw.wmnet with OS bookworm
  • 06:12 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1207 (T364299)', diff saved to https://phabricator.wikimedia.org/P63750 and previous config saved to /var/cache/conftool/dbconfig/20240531-061219-marostegui.json
  • 06:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 06:12 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 06:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T364299)', diff saved to https://phabricator.wikimedia.org/P63749 and previous config saved to /var/cache/conftool/dbconfig/20240531-061156-marostegui.json
  • 06:00 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:00 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:58 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:58 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P63748 and previous config saved to /var/cache/conftool/dbconfig/20240531-055647-marostegui.json
  • 05:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:45 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:45 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:43 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:43 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P63747 and previous config saved to /var/cache/conftool/dbconfig/20240531-054139-marostegui.json
  • 05:41 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:41 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:39 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:39 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:36 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:36 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:34 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:34 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:32 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:32 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:30 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:30 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:28 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:28 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T364299)', diff saved to https://phabricator.wikimedia.org/P63746 and previous config saved to /var/cache/conftool/dbconfig/20240531-052631-marostegui.json
  • 05:20 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:20 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:18 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:18 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:16 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:16 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:14 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:14 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:12 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:11 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:07 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:07 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:55 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:55 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:53 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:53 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:52 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:51 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:50 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1206 (T364299)', diff saved to https://phabricator.wikimedia.org/P63745 and previous config saved to /var/cache/conftool/dbconfig/20240531-042604-marostegui.json
  • 04:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 04:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 04:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T364299)', diff saved to https://phabricator.wikimedia.org/P63744 and previous config saved to /var/cache/conftool/dbconfig/20240531-042540-marostegui.json
  • 04:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2189 (T352010)', diff saved to https://phabricator.wikimedia.org/P63743 and previous config saved to /var/cache/conftool/dbconfig/20240531-042414-ladsgroup.json
  • 04:24 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
  • 04:23 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
  • 04:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T352010)', diff saved to https://phabricator.wikimedia.org/P63742 and previous config saved to /var/cache/conftool/dbconfig/20240531-042350-ladsgroup.json
  • 04:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:11 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:11 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P63741 and previous config saved to /var/cache/conftool/dbconfig/20240531-041032-marostegui.json
  • 04:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P63740 and previous config saved to /var/cache/conftool/dbconfig/20240531-040842-ladsgroup.json
  • 04:07 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:07 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:05 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:05 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:02 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:01 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:59 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:59 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P63739 and previous config saved to /var/cache/conftool/dbconfig/20240531-035524-marostegui.json
  • 03:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:54 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P63738 and previous config saved to /var/cache/conftool/dbconfig/20240531-035334-ladsgroup.json
  • 03:45 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:45 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:43 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:43 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:41 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:41 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T364299)', diff saved to https://phabricator.wikimedia.org/P63737 and previous config saved to /var/cache/conftool/dbconfig/20240531-034016-marostegui.json
  • 03:39 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:39 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T352010)', diff saved to https://phabricator.wikimedia.org/P63736 and previous config saved to /var/cache/conftool/dbconfig/20240531-033826-ladsgroup.json
  • 03:37 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:37 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:28 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:28 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:50 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:50 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:45 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:44 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:43 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:42 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:25 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:31 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:31 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:28 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:28 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:26 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:26 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:24 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:24 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:20 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:20 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:18 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:18 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:06 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:06 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:04 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:04 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:51 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:51 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:39 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:39 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:37 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:37 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:27 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:27 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:25 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:23 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:21 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:21 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:19 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:19 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:14 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:14 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:12 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:12 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:11 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:10 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:05 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:05 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply

2024-05-30

  • 23:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2172 (T364069)', diff saved to https://phabricator.wikimedia.org/P63735 and previous config saved to /var/cache/conftool/dbconfig/20240530-235640-marostegui.json
  • 23:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 23:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 23:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T364069)', diff saved to https://phabricator.wikimedia.org/P63734 and previous config saved to /var/cache/conftool/dbconfig/20240530-235617-marostegui.json
  • 23:50 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:50 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:48 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:44 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:43 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:42 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:41 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P63733 and previous config saved to /var/cache/conftool/dbconfig/20240530-234109-marostegui.json
  • 23:36 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:36 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:34 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:34 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:32 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:32 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:30 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:30 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:28 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:28 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:26 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P63732 and previous config saved to /var/cache/conftool/dbconfig/20240530-232600-marostegui.json
  • 23:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:20 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:20 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:18 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:18 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:16 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:16 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:14 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:14 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:12 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:12 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T364069)', diff saved to https://phabricator.wikimedia.org/P63731 and previous config saved to /var/cache/conftool/dbconfig/20240530-231052-marostegui.json
  • 23:10 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:10 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:08 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:08 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:06 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:06 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1196 (T364299)', diff saved to https://phabricator.wikimedia.org/P63730 and previous config saved to /var/cache/conftool/dbconfig/20240530-230212-marostegui.json
  • 23:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 23:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 23:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 23:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 23:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T364299)', diff saved to https://phabricator.wikimedia.org/P63729 and previous config saved to /var/cache/conftool/dbconfig/20240530-230129-marostegui.json
  • 22:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:53 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:51 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:51 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P63728 and previous config saved to /var/cache/conftool/dbconfig/20240530-224621-marostegui.json
  • 22:45 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:45 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:38 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:38 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:31 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:31 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P63727 and previous config saved to /var/cache/conftool/dbconfig/20240530-223112-marostegui.json
  • 22:29 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:29 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:26 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:26 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:17 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T364299)', diff saved to https://phabricator.wikimedia.org/P63726 and previous config saved to /var/cache/conftool/dbconfig/20240530-221604-marostegui.json
  • 22:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:21 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:21 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:17 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:11 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:11 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:08 cjming: end of UTC late backport window
  • 21:07 cjming@deploy1002: Finished scap: Backport for Revert "IPInfo: Remove $wgIPInfoGeoIP2EnterprisePath" (T361884) (duration: 11m 43s)
  • 21:06 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:06 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:04 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 21:04 Amir1: dropping old replication user from backup sources
  • 21:04 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:04 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:02 jclark@cumin1002: START - Cookbook sre.hosts.provision for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:59 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:59 cjming@deploy1002: tchanders and cjming: Continuing with sync
  • 20:58 cjming@deploy1002: tchanders and cjming: Backport for Revert "IPInfo: Remove $wgIPInfoGeoIP2EnterprisePath" (T361884) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:56 jclark@cumin1002: START - Cookbook sre.hosts.provision for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:55 cjming@deploy1002: Started scap: Backport for Revert "IPInfo: Remove $wgIPInfoGeoIP2EnterprisePath" (T361884)
  • 20:55 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:53 cjming@deploy1002: Finished scap: Backport for Add a stream for tracking the API of WikiLambda (T356228 T360369) (duration: 28m 08s)
  • 20:51 jclark@cumin1002: START - Cookbook sre.hosts.provision for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:44 cjming@deploy1002: cjming and dmartin: Continuing with sync
  • 20:38 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:38 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:35 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:35 robh@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:35 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:33 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:33 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:32 robh@cumin1002: START - Cookbook sre.hosts.provision for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:31 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:31 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:29 cjming@deploy1002: cjming and dmartin: Backport for Add a stream for tracking the API of WikiLambda (T356228 T360369) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:29 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:29 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:27 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:27 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:25 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:25 cjming@deploy1002: Started scap: Backport for Add a stream for tracking the API of WikiLambda (T356228 T360369)
  • 20:23 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:19 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:19 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:18 cjming@deploy1002: Finished scap: Backport for Popups setting should be string not integer (T364347) (duration: 13m 39s)
  • 20:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:14 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:12 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:10 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:10 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:10 cjming@deploy1002: cjming and jdlrobson: Continuing with sync
  • 20:09 cjming@deploy1002: cjming and jdlrobson: Backport for Popups setting should be string not integer (T364347) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:05 cjming@deploy1002: Started scap: Backport for Popups setting should be string not integer (T364347)
  • 20:04 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:04 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:02 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:02 cdanis@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:02 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:59 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device ssw1-d8-codfw.mgmt.codfw.wmnet
  • 19:57 cdanis@cumin1002: START - Cookbook sre.hosts.provision for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:44 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:44 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:40 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:40 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:38 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:38 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2175 (T352010)', diff saved to https://phabricator.wikimedia.org/P63725 and previous config saved to /var/cache/conftool/dbconfig/20240530-193717-ladsgroup.json
  • 19:37 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 19:37 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 19:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T352010)', diff saved to https://phabricator.wikimedia.org/P63724 and previous config saved to /var/cache/conftool/dbconfig/20240530-193653-ladsgroup.json
  • 19:35 dancy@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.43.0-wmf.7 refs T361401
  • 19:34 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:34 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:32 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:32 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:28 jhathaway: bounce exim on mx2001
  • 19:27 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:27 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-d8-codfw - pt1979@cumin2002"
  • 19:26 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-d8-codfw - pt1979@cumin2002"
  • 19:24 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 19:24 pt1979@cumin2002: START - Cookbook sre.network.provision for device ssw1-d8-codfw.mgmt.codfw.wmnet
  • 19:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P63723 and previous config saved to /var/cache/conftool/dbconfig/20240530-192145-ladsgroup.json
  • 19:20 dancy@deploy1002: Finished scap: Backport for Temporarily silence noisy new warnings (T366268) (duration: 15m 39s)
  • 19:19 jhathaway: bouncing exim on mx1001
  • 19:11 dancy@deploy1002: jforrester and dancy: Continuing with sync
  • 19:11 dancy@deploy1002: jforrester and dancy: Backport for Temporarily silence noisy new warnings (T366268) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:08 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device ssw1-d1-codfw.mgmt.codfw.wmnet
  • 19:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P63722 and previous config saved to /var/cache/conftool/dbconfig/20240530-190633-ladsgroup.json
  • 19:05 dancy@deploy1002: Started scap: Backport for Temporarily silence noisy new warnings (T366268)
  • 18:59 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-d8-codfw.mgmt.codfw.wmnet
  • 18:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T352010)', diff saved to https://phabricator.wikimedia.org/P63720 and previous config saved to /var/cache/conftool/dbconfig/20240530-185125-ladsgroup.json
  • 18:41 cdanis: T365571 💙root@deploy1002.eqiad.wmnet ~ 🕝⁉ kubectl delete node kubernetes2032.codfw.wmnet
  • 18:36 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:35 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-d1-codfw - pt1979@cumin2002"
  • 18:35 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-d1-codfw - pt1979@cumin2002"
  • 18:31 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 18:31 pt1979@cumin2002: START - Cookbook sre.network.provision for device ssw1-d1-codfw.mgmt.codfw.wmnet
  • 18:29 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-d7-codfw.mgmt.codfw.wmnet
  • 18:27 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:27 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d8-codfw - pt1979@cumin2002"
  • 18:26 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d8-codfw - pt1979@cumin2002"
  • 18:18 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 18:18 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-d8-codfw.mgmt.codfw.wmnet
  • 18:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:08 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-d6-codfw.mgmt.codfw.wmnet
  • 17:58 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:58 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d7-codfw - pt1979@cumin2002"
  • 17:57 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d7-codfw - pt1979@cumin2002"
  • 17:57 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:57 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:50 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 17:50 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-d7-codfw.mgmt.codfw.wmnet
  • 17:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:44 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:44 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:42 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:42 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:40 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-d5-codfw.mgmt.codfw.wmnet
  • 17:40 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:40 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:37 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:37 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d6-codfw - pt1979@cumin2002"
  • 17:36 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d6-codfw - pt1979@cumin2002"
  • 17:36 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:36 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:35 joal@deploy1002: Finished deploy [airflow-dags/analytics@e74e164]: Regular analytics weekly train HOTFIX [airflow-dags/analytics@e74e164f] (duration: 00m 27s)
  • 17:34 joal@deploy1002: Started deploy [airflow-dags/analytics@e74e164]: Regular analytics weekly train HOTFIX [airflow-dags/analytics@e74e164f]
  • 17:34 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:34 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:33 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 17:33 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-d6-codfw.mgmt.codfw.wmnet
  • 17:32 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:32 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:30 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-d4-codfw.mgmt.codfw.wmnet
  • 17:20 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:20 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:16 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:16 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:14 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:14 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:12 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:12 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:09 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:09 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d5-codfw - pt1979@cumin2002"
  • 17:08 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d5-codfw - pt1979@cumin2002"
  • 17:06 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 17:06 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-d5-codfw.mgmt.codfw.wmnet
  • 17:04 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-d3-codfw.mgmt.codfw.wmnet
  • 17:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 17:03 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 17:01 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:01 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:59 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:59 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d4-codfw - pt1979@cumin2002"
  • 16:58 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d4-codfw - pt1979@cumin2002"
  • 16:57 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:57 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1209', diff saved to https://phabricator.wikimedia.org/P63719 and previous config saved to /var/cache/conftool/dbconfig/20240530-165615-root.json
  • 16:55 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 16:55 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-d4-codfw.mgmt.codfw.wmnet
  • 16:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1186 (T364299)', diff saved to https://phabricator.wikimedia.org/P63718 and previous config saved to /var/cache/conftool/dbconfig/20240530-165120-marostegui.json
  • 16:51 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-c7-codfw.mgmt.codfw.wmnet
  • 16:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 16:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 16:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T364299)', diff saved to https://phabricator.wikimedia.org/P63717 and previous config saved to /var/cache/conftool/dbconfig/20240530-165057-marostegui.json
  • 16:50 marostegui@cumin1002: dbctl commit (dc=all): 'db1209 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P63716 and previous config saved to /var/cache/conftool/dbconfig/20240530-165034-root.json
  • 16:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P63715 and previous config saved to /var/cache/conftool/dbconfig/20240530-163549-marostegui.json
  • 16:34 kamila@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
  • 16:33 kamila@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
  • 16:33 kamila@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:33 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:33 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d3-codfw - pt1979@cumin2002"
  • 16:32 kamila@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:32 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d3-codfw - pt1979@cumin2002"
  • 16:32 kamila@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
  • 16:31 kamila@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
  • 16:31 kamila@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
  • 16:31 kamila@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
  • 16:31 kamila@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
  • 16:31 kamila@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
  • 16:25 kamila@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
  • 16:25 kamila@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
  • 16:25 kamila@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:24 kamila@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:24 kamila@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
  • 16:23 kamila@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
  • 16:23 kamila@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
  • 16:23 kamila@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
  • 16:23 kamila@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
  • 16:23 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 16:23 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-d3-codfw.mgmt.codfw.wmnet
  • 16:22 kamila@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox: apply
  • 16:22 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-d2-codfw.mgmt.codfw.wmnet
  • 16:22 kamila@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
  • 16:21 kamila@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
  • 16:21 kamila@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P63714 and previous config saved to /var/cache/conftool/dbconfig/20240530-162040-marostegui.json
  • 16:20 kamila@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:20 kamila@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
  • 16:20 sukhe: [correction] sudo homer cr*magru* commit "add 198.35.27.0/24 for magru to announce ns2.wikimedia.org": T346722
  • 16:20 kamila@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
  • 16:20 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:20 kamila@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
  • 16:20 kamila@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
  • 16:19 kamila@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
  • 16:19 kamila@deploy1002: helmfile [staging] START helmfile.d/services/shellbox: apply
  • 16:18 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 16:17 sukhe: sudo homer asw*magru* commit "add 198.35.27.0/24 for magru to announce ns2.wikimedia.org": T346722
  • 16:15 kamila@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
  • 16:14 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1009.eqiad.wmnet with OS bullseye
  • 16:14 kamila@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
  • 16:13 pt1979@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 16:12 dancy@deploy1002: Finished scap: Backport for Migrate `wmfstatic` metrics to Prometheus store (T359255) (duration: 17m 19s)
  • 16:09 kamila@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
  • 16:08 kamila@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
  • 16:08 kamila@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
  • 16:07 kamila@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
  • 16:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T364299)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240530-160528-marostegui.json
  • 16:04 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:04 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:03 dancy@deploy1002: denisse and dancy: Continuing with sync
  • 15:59 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns700[1-2].wikimedia.org,service=authdns-ns2
  • 15:58 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
  • 15:57 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:57 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:57 dancy@deploy1002: denisse and dancy: Backport for Migrate `wmfstatic` metrics to Prometheus store (T359255) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:55 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
  • 15:54 dancy@deploy1002: Started scap: Backport for Migrate `wmfstatic` metrics to Prometheus store (T359255)
  • 15:51 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:51 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d2-codfw - pt1979@cumin2002"
  • 15:50 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d2-codfw - pt1979@cumin2002"
  • 15:48 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 15:48 dancy@deploy1002: Finished scap: Testing (duration: 10m 43s)
  • 15:46 ejegg: payments-wiki upgraded from 8ff002ef to 0174d89c
  • 15:45 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:45 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:45 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-d2-codfw.mgmt.codfw.wmnet
  • 15:43 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 15:43 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-c7-codfw.mgmt.codfw.wmnet
  • 15:42 arnaudb@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 100%: post upgrade repool', diff saved to https://phabricator.wikimedia.org/P63713 and previous config saved to /var/cache/conftool/dbconfig/20240530-154208-arnaudb.json
  • 15:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 100%: post upgrade repool', diff saved to https://phabricator.wikimedia.org/P63712 and previous config saved to /var/cache/conftool/dbconfig/20240530-154127-arnaudb.json
  • 15:38 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1009.eqiad.wmnet with OS bullseye
  • 15:38 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-c5-codfw.mgmt.codfw.wmnet
  • 15:37 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:37 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:37 joal@deploy1002: Finished deploy [airflow-dags/analytics@3659547]: Regular analytics weekly train [airflow-dags/analytics@3659547f] (duration: 00m 29s)
  • 15:37 dancy@deploy1002: Started scap: Testing
  • 15:37 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-c6-codfw.mgmt.codfw.wmnet
  • 15:36 joal@deploy1002: Started deploy [airflow-dags/analytics@3659547]: Regular analytics weekly train [airflow-dags/analytics@3659547f]
  • 15:34 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:34 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:34 dancy@deploy1002: Finished scap: Backport for rdbms: Pass array values to makeList on insert/upsert (T366268) (duration: 11m 55s)
  • 15:27 arnaudb@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 75%: post upgrade repool', diff saved to https://phabricator.wikimedia.org/P63710 and previous config saved to /var/cache/conftool/dbconfig/20240530-152703-arnaudb.json
  • 15:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 75%: post upgrade repool', diff saved to https://phabricator.wikimedia.org/P63709 and previous config saved to /var/cache/conftool/dbconfig/20240530-152619-arnaudb.json
  • 15:25 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:24 dancy@deploy1002: umherirrender and dancy: Continuing with sync
  • 15:24 dancy@deploy1002: umherirrender and dancy: Backport for rdbms: Pass array values to makeList on insert/upsert (T366268) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:23 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:22 dancy@deploy1002: Started scap: Backport for rdbms: Pass array values to makeList on insert/upsert (T366268)
  • 15:19 aborrero@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1041
  • 15:19 aborrero@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1041
  • 15:12 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:12 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 50%: post upgrade repool', diff saved to https://phabricator.wikimedia.org/P63708 and previous config saved to /var/cache/conftool/dbconfig/20240530-151155-arnaudb.json
  • 15:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 50%: post upgrade repool', diff saved to https://phabricator.wikimedia.org/P63707 and previous config saved to /var/cache/conftool/dbconfig/20240530-151113-arnaudb.json
  • 15:06 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:06 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:06 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-c6-codfw - pt1979@cumin2002"
  • 15:05 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-c6-codfw - pt1979@cumin2002"
  • 15:05 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 15:01 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:01 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:59 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 14:59 pt1979@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 14:59 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:59 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:59 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-c6-codfw.mgmt.codfw.wmnet
  • 14:59 pt1979@cumin2002: END (FAIL) - Cookbook sre.network.provision (exit_code=99) for device lsw1-c2-codfw.mgmt.codfw.wmnet
  • 14:58 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:58 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove management record for lsw1-c2-codfw - pt1979@cumin2002"
  • 14:58 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove management record for lsw1-c2-codfw - pt1979@cumin2002"
  • 14:58 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 14:58 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-c5-codfw.mgmt.codfw.wmnet
  • 14:57 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:57 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 25%: post upgrade repool', diff saved to https://phabricator.wikimedia.org/P63706 and previous config saved to /var/cache/conftool/dbconfig/20240530-145648-arnaudb.json
  • 14:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 25%: post upgrade repool', diff saved to https://phabricator.wikimedia.org/P63705 and previous config saved to /var/cache/conftool/dbconfig/20240530-145607-arnaudb.json
  • 14:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:54 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:52 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 14:51 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:51 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:42 hnowlan: Running `decommission` on 5 eqiad api appservers
  • 14:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 10%: post upgrade repool', diff saved to https://phabricator.wikimedia.org/P63704 and previous config saved to /var/cache/conftool/dbconfig/20240530-144142-arnaudb.json
  • 14:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 10%: post upgrade repool', diff saved to https://phabricator.wikimedia.org/P63703 and previous config saved to /var/cache/conftool/dbconfig/20240530-144101-arnaudb.json
  • 14:31 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1165.eqiad.wmnet
  • 14:26 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1165.eqiad.wmnet
  • 14:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 5%: post upgrade repool', diff saved to https://phabricator.wikimedia.org/P63702 and previous config saved to /var/cache/conftool/dbconfig/20240530-142555-arnaudb.json
  • 14:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db1165 depool for T356240', diff saved to https://phabricator.wikimedia.org/P63701 and previous config saved to /var/cache/conftool/dbconfig/20240530-142519-arnaudb.json
  • 14:24 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db[1155,1165].eqiad.wmnet with reason: upgrade db1165
  • 14:24 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db[1155,1165].eqiad.wmnet with reason: upgrade db1165
  • 14:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T366123)', diff saved to https://phabricator.wikimedia.org/P63700 and previous config saved to /var/cache/conftool/dbconfig/20240530-141914-marostegui.json
  • 14:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:11 dcausse: backport window done
  • 14:10 dcausse@deploy1002: Finished scap: Backport for Add UpdateGroup for weighted tags (duration: 11m 51s)
  • 14:05 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 14:05 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P63699 and previous config saved to /var/cache/conftool/dbconfig/20240530-140404-marostegui.json
  • 14:02 dcausse@deploy1002: dcausse: Continuing with sync
  • 14:01 dcausse@deploy1002: dcausse: Backport for Add UpdateGroup for weighted tags synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:58 dcausse@deploy1002: Started scap: Backport for Add UpdateGroup for weighted tags
  • 13:55 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:52 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:52 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:50 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:50 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P63698 and previous config saved to /var/cache/conftool/dbconfig/20240530-134856-marostegui.json
  • 13:48 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:46 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 13:45 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 13:43 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:43 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:43 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2044.codfw.wmnet with OS bookworm
  • 13:40 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:40 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:39 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:38 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:38 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1044.eqiad.wmnet with OS bookworm
  • 13:37 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:36 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T366123)', diff saved to https://phabricator.wikimedia.org/P63697 and previous config saved to /var/cache/conftool/dbconfig/20240530-133348-marostegui.json
  • 13:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1173.eqiad.wmnet with reason: upgrade
  • 13:27 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1173.eqiad.wmnet with reason: upgrade
  • 13:27 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:27 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:25 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:25 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2044.codfw.wmnet with reason: host reimage
  • 13:23 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:22 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2044.codfw.wmnet with reason: host reimage
  • 13:20 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1044.eqiad.wmnet with reason: host reimage
  • 13:20 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1173.eqiad.wmnet
  • 13:19 dcausse@deploy1002: Finished scap: Backport for cirrus: Send weighted tags to known clusters (duration: 12m 43s)
  • 13:17 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:17 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1044.eqiad.wmnet with reason: host reimage
  • 13:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:15 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1173.eqiad.wmnet
  • 13:14 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1173.eqiad.wmnet with reason: upgrade
  • 13:14 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1173.eqiad.wmnet with reason: upgrade
  • 13:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1173 T356240', diff saved to https://phabricator.wikimedia.org/P63695 and previous config saved to /var/cache/conftool/dbconfig/20240530-131349-arnaudb.json
  • 13:12 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:12 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:10 dcausse@deploy1002: dcausse and ebernhardson: Continuing with sync
  • 13:10 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2220 (T366123)', diff saved to https://phabricator.wikimedia.org/P63694 and previous config saved to /var/cache/conftool/dbconfig/20240530-131012-marostegui.json
  • 13:10 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2220.codfw.wmnet with reason: Maintenance
  • 13:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2220.codfw.wmnet with reason: Maintenance
  • 13:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T366123)', diff saved to https://phabricator.wikimedia.org/P63693 and previous config saved to /var/cache/conftool/dbconfig/20240530-130946-marostegui.json
  • 13:09 dcausse@deploy1002: dcausse and ebernhardson: Backport for cirrus: Send weighted tags to known clusters synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:08 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:08 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:06 dcausse@deploy1002: Started scap: Backport for cirrus: Send weighted tags to known clusters
  • 13:06 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:06 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:04 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1044.eqiad.wmnet with OS bookworm
  • 13:04 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2044.codfw.wmnet with OS bookworm
  • 13:01 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:01 joal@deploy1002: Finished deploy [analytics/refinery@ac0b789] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@ac0b789b] (duration: 02m 54s)
  • 13:01 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:58 joal@deploy1002: Started deploy [analytics/refinery@ac0b789] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@ac0b789b]
  • 12:58 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:58 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:55 joal@deploy1002: Finished deploy [analytics/refinery@ac0b789] (thin): Regular analytics weekly train THIN [analytics/refinery@ac0b789b] (duration: 04m 27s)
  • 12:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P63692 and previous config saved to /var/cache/conftool/dbconfig/20240530-125438-marostegui.json
  • 12:53 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 12:53 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 12:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2155 (T364069)', diff saved to https://phabricator.wikimedia.org/P63691 and previous config saved to /var/cache/conftool/dbconfig/20240530-125232-marostegui.json
  • 12:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 12:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 12:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 12:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 12:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T364069)', diff saved to https://phabricator.wikimedia.org/P63690 and previous config saved to /var/cache/conftool/dbconfig/20240530-125204-marostegui.json
  • 12:50 joal@deploy1002: Started deploy [analytics/refinery@ac0b789] (thin): Regular analytics weekly train THIN [analytics/refinery@ac0b789b]
  • 12:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:44 dcausse@deploy1002: Finished deploy [airflow-dags/search@0faf248]: search: use discolytics 0.23 (duration: 00m 26s)
  • 12:43 dcausse@deploy1002: Started deploy [airflow-dags/search@0faf248]: search: use discolytics 0.23
  • 12:43 joal@deploy1002: Finished deploy [analytics/refinery@ac0b789]: Regular analytics weekly train [analytics/refinery@ac0b789b] (duration: 12m 58s)
  • 12:42 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:42 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:41 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:41 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P63689 and previous config saved to /var/cache/conftool/dbconfig/20240530-123930-marostegui.json
  • 12:39 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:39 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P63688 and previous config saved to /var/cache/conftool/dbconfig/20240530-123655-marostegui.json
  • 12:30 joal@deploy1002: Started deploy [analytics/refinery@ac0b789]: Regular analytics weekly train [analytics/refinery@ac0b789b]
  • 12:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T366123)', diff saved to https://phabricator.wikimedia.org/P63687 and previous config saved to /var/cache/conftool/dbconfig/20240530-122422-marostegui.json
  • 12:22 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2208 (T366123)', diff saved to https://phabricator.wikimedia.org/P63686 and previous config saved to /var/cache/conftool/dbconfig/20240530-122206-marostegui.json
  • 12:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2208.codfw.wmnet with reason: Maintenance
  • 12:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P63685 and previous config saved to /var/cache/conftool/dbconfig/20240530-122146-marostegui.json
  • 12:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2208.codfw.wmnet with reason: Maintenance
  • 12:08 aikochou@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main'.
  • 12:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T364069)', diff saved to https://phabricator.wikimedia.org/P63684 and previous config saved to /var/cache/conftool/dbconfig/20240530-120638-marostegui.json
  • 12:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2205', diff saved to https://phabricator.wikimedia.org/P63683 and previous config saved to /var/cache/conftool/dbconfig/20240530-120455-root.json
  • 12:01 marostegui: Deploy schema changes on old s3 codfw master (db2205) dbmaint T364069
  • 11:59 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:59 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 11:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 11:48 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/push-notifications: apply
  • 11:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:47 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/push-notifications: apply
  • 11:47 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/push-notifications: apply
  • 11:47 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/push-notifications: apply
  • 11:46 cgoubert@deploy1002: helmfile [staging] DONE helmfile.d/services/push-notifications: apply
  • 11:46 cgoubert@deploy1002: helmfile [staging] START helmfile.d/services/push-notifications: apply
  • 11:44 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:44 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:35 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
  • 11:34 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
  • 11:34 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:34 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:33 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 11:32 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 11:32 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:32 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:32 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 11:31 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 11:30 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:30 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:26 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 11:26 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 11:25 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 11:24 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 11:23 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 11:23 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 11:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 11:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 11:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T366123)', diff saved to https://phabricator.wikimedia.org/P63682 and previous config saved to /var/cache/conftool/dbconfig/20240530-112047-marostegui.json
  • 11:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P63681 and previous config saved to /var/cache/conftool/dbconfig/20240530-110539-marostegui.json
  • 11:05 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:05 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:52 hnowlan: switched mw2300 to be an api canary + scap_proxy, removed mw228[34] as canaries
  • 10:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P63680 and previous config saved to /var/cache/conftool/dbconfig/20240530-105031-marostegui.json
  • 10:50 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:50 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:48 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:43 marostegui: Deploy schema changes on old s3 codfw master (db2205) dbmaint T364299
  • 10:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1169 (T364299)', diff saved to https://phabricator.wikimedia.org/P63679 and previous config saved to /var/cache/conftool/dbconfig/20240530-104034-marostegui.json
  • 10:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 10:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 10:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T364299)', diff saved to https://phabricator.wikimedia.org/P63678 and previous config saved to /var/cache/conftool/dbconfig/20240530-104011-marostegui.json
  • 10:38 dcausse@deploy1002: Finished deploy [airflow-dags/search@ded0f17]: search: fix alter table command (duration: 00m 20s)
  • 10:38 dcausse@deploy1002: Started deploy [airflow-dags/search@ded0f17]: search: fix alter table command
  • 10:36 jiji@cumin1002: conftool action : set/pooled=yes:weight=10; selector: name=wikikube-ctrl1003.eqiad.wmnet
  • 10:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T366123)', diff saved to https://phabricator.wikimedia.org/P63677 and previous config saved to /var/cache/conftool/dbconfig/20240530-103523-marostegui.json
  • 10:28 effie: homer "cr*eqiad*" commit 'Add wikikube-ctrl1003'
  • 10:26 claime: Restarted rsyslog on mw1479
  • 10:25 effie: label wikikube-ctrl1003 as master
  • 10:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P63676 and previous config saved to /var/cache/conftool/dbconfig/20240530-102503-marostegui.json
  • 10:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2148 (T352010)', diff saved to https://phabricator.wikimedia.org/P63675 and previous config saved to /var/cache/conftool/dbconfig/20240530-102439-ladsgroup.json
  • 10:24 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 10:24 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 10:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138 (T352010)', diff saved to https://phabricator.wikimedia.org/P63674 and previous config saved to /var/cache/conftool/dbconfig/20240530-102414-ladsgroup.json
  • 10:23 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P63673 and previous config saved to /var/cache/conftool/dbconfig/20240530-100955-marostegui.json
  • 10:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138', diff saved to https://phabricator.wikimedia.org/P63672 and previous config saved to /var/cache/conftool/dbconfig/20240530-100906-ladsgroup.json
  • 10:07 effie: add wikikube-ctrl1003 to etcd and run puppet - T353464
  • 10:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2182 (T366123)', diff saved to https://phabricator.wikimedia.org/P63671 and previous config saved to /var/cache/conftool/dbconfig/20240530-100554-marostegui.json
  • 10:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 10:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 10:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T366123)', diff saved to https://phabricator.wikimedia.org/P63670 and previous config saved to /var/cache/conftool/dbconfig/20240530-100531-marostegui.json
  • 09:59 dcausse@deploy1002: Finished deploy [airflow-dags/search@66de0db]: search: add missing lexeme fields (duration: 00m 19s)
  • 09:59 dcausse@deploy1002: Started deploy [airflow-dags/search@66de0db]: search: add missing lexeme fields
  • 09:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T364299)', diff saved to https://phabricator.wikimedia.org/P63669 and previous config saved to /var/cache/conftool/dbconfig/20240530-095447-marostegui.json
  • 09:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138', diff saved to https://phabricator.wikimedia.org/P63668 and previous config saved to /var/cache/conftool/dbconfig/20240530-095358-ladsgroup.json
  • 09:53 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:53 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P63667 and previous config saved to /var/cache/conftool/dbconfig/20240530-095021-marostegui.json
  • 09:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2205 mirror former candidate master weight T366242', diff saved to https://phabricator.wikimedia.org/P63666 and previous config saved to /var/cache/conftool/dbconfig/20240530-094936-root.json
  • 09:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Promote db2127 to s3 primary T366242', diff saved to https://phabricator.wikimedia.org/P63665 and previous config saved to /var/cache/conftool/dbconfig/20240530-094632-arnaudb.json
  • 09:45 arnaudb: Starting s3 codfw failover from db2205 to db2127 - T366242
  • 09:42 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:42 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:40 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:40 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138 (T352010)', diff saved to https://phabricator.wikimedia.org/P63664 and previous config saved to /var/cache/conftool/dbconfig/20240530-093850-ladsgroup.json
  • 09:38 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:38 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P63663 and previous config saved to /var/cache/conftool/dbconfig/20240530-093514-marostegui.json
  • 09:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2127 with weight 0 T366242', diff saved to https://phabricator.wikimedia.org/P63662 and previous config saved to /var/cache/conftool/dbconfig/20240530-093007-arnaudb.json
  • 09:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 24 hosts with reason: Primary switchover s3 T366242
  • 09:29 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 24 hosts with reason: Primary switchover s3 T366242
  • 09:25 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:22 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:22 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1215.eqiad.wmnet with OS bookworm
  • 09:20 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T366123)', diff saved to https://phabricator.wikimedia.org/P63661 and previous config saved to /var/cache/conftool/dbconfig/20240530-092004-marostegui.json
  • 09:20 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:18 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2168 (T366123)', diff saved to https://phabricator.wikimedia.org/P63660 and previous config saved to /var/cache/conftool/dbconfig/20240530-091751-marostegui.json
  • 09:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 09:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 09:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T366123)', diff saved to https://phabricator.wikimedia.org/P63659 and previous config saved to /var/cache/conftool/dbconfig/20240530-091728-marostegui.json
  • 09:16 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:14 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2204 with weight 500 revert T366241', diff saved to https://phabricator.wikimedia.org/P63658 and previous config saved to /var/cache/conftool/dbconfig/20240530-091323-arnaudb.json
  • 09:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2204 with weight 0 T366241', diff saved to https://phabricator.wikimedia.org/P63656 and previous config saved to /var/cache/conftool/dbconfig/20240530-090840-arnaudb.json
  • 09:08 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s2 T366241
  • 09:07 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 27 hosts with reason: Primary switchover s2 T366241
  • 09:07 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:07 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1215.eqiad.wmnet with reason: host reimage
  • 09:02 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P63655 and previous config saved to /var/cache/conftool/dbconfig/20240530-090220-marostegui.json
  • 09:02 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1215.eqiad.wmnet with reason: host reimage
  • 08:48 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:47 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1215.eqiad.wmnet with OS bookworm
  • 08:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P63654 and previous config saved to /var/cache/conftool/dbconfig/20240530-084712-marostegui.json
  • 08:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:39 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:39 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:37 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:37 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:35 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:35 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:33 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:33 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T366123)', diff saved to https://phabricator.wikimedia.org/P63653 and previous config saved to /var/cache/conftool/dbconfig/20240530-083204-marostegui.json
  • 08:31 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:31 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2159 (T366123)', diff saved to https://phabricator.wikimedia.org/P63652 and previous config saved to /var/cache/conftool/dbconfig/20240530-083054-marostegui.json
  • 08:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 08:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 08:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 08:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 08:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T366123)', diff saved to https://phabricator.wikimedia.org/P63651 and previous config saved to /var/cache/conftool/dbconfig/20240530-083025-marostegui.json
  • 08:29 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:29 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:23 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P63650 and previous config saved to /var/cache/conftool/dbconfig/20240530-081517-marostegui.json
  • 08:13 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:13 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:04 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:03 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:02 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:02 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P63649 and previous config saved to /var/cache/conftool/dbconfig/20240530-080009-marostegui.json
  • 07:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T366123)', diff saved to https://phabricator.wikimedia.org/P63648 and previous config saved to /var/cache/conftool/dbconfig/20240530-074501-marostegui.json
  • 07:34 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:34 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:32 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:32 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2150 (T366123)', diff saved to https://phabricator.wikimedia.org/P63645 and previous config saved to /var/cache/conftool/dbconfig/20240530-071559-marostegui.json
  • 07:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 07:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 07:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T366123)', diff saved to https://phabricator.wikimedia.org/P63644 and previous config saved to /var/cache/conftool/dbconfig/20240530-071535-marostegui.json
  • 07:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P63643 and previous config saved to /var/cache/conftool/dbconfig/20240530-070027-marostegui.json
  • 06:57 marostegui: Deploy schema changes on old s8 eqiad master (db1209) dbmaint T364299
  • 06:56 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 49666
  • 06:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:56 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 49666
  • 06:55 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 8674
  • 06:53 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 8674
  • 06:50 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 8674
  • 06:49 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 8674
  • 06:48 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:48 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'configure' for AS: 8674
  • 06:47 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 8674
  • 06:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P63642 and previous config saved to /var/cache/conftool/dbconfig/20240530-064519-marostegui.json
  • 06:36 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:36 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:33 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:33 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:31 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:31 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T366123)', diff saved to https://phabricator.wikimedia.org/P63641 and previous config saved to /var/cache/conftool/dbconfig/20240530-063011-marostegui.json
  • 06:19 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:19 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:00 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2122 (T366123)', diff saved to https://phabricator.wikimedia.org/P63640 and previous config saved to /var/cache/conftool/dbconfig/20240530-060023-marostegui.json
  • 06:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 06:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 06:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T366123)', diff saved to https://phabricator.wikimedia.org/P63639 and previous config saved to /var/cache/conftool/dbconfig/20240530-055959-marostegui.json
  • 05:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P63638 and previous config saved to /var/cache/conftool/dbconfig/20240530-054451-marostegui.json
  • 05:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P63636 and previous config saved to /var/cache/conftool/dbconfig/20240530-052941-marostegui.json
  • 05:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1163 (T364299)', diff saved to https://phabricator.wikimedia.org/P63635 and previous config saved to /var/cache/conftool/dbconfig/20240530-052006-marostegui.json
  • 05:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 05:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 05:17 marostegui: Deploy schema changes on old s8 eqiad master (db1209) dbmaint T355609 T356166
  • 05:16 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:16 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T366123)', diff saved to https://phabricator.wikimedia.org/P63634 and previous config saved to /var/cache/conftool/dbconfig/20240530-051433-marostegui.json
  • 05:14 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:12 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2121 (T366123)', diff saved to https://phabricator.wikimedia.org/P63633 and previous config saved to /var/cache/conftool/dbconfig/20240530-051220-marostegui.json
  • 05:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 05:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 05:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1209 T364541', diff saved to https://phabricator.wikimedia.org/P63632 and previous config saved to /var/cache/conftool/dbconfig/20240530-051132-root.json
  • 05:10 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db1192 to s8 primary and set section read-write T364541', diff saved to https://phabricator.wikimedia.org/P63631 and previous config saved to /var/cache/conftool/dbconfig/20240530-051031-marostegui.json
  • 05:10 marostegui@cumin1002: dbctl commit (dc=all): 'Set s8 eqiad as read-only for maintenance - T364541', diff saved to https://phabricator.wikimedia.org/P63630 and previous config saved to /var/cache/conftool/dbconfig/20240530-051012-marostegui.json
  • 05:09 marostegui: Starting s8 eqiad failover from db1209 to db1192 - T364541
  • 05:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 04:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:43 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db1192 from API/vslow/dump T364541', diff saved to https://phabricator.wikimedia.org/P63629 and previous config saved to /var/cache/conftool/dbconfig/20240530-044328-root.json
  • 04:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 33 hosts with reason: Primary switchover s8 T364541
  • 04:42 marostegui@cumin1002: dbctl commit (dc=all): 'Set db1192 with weight 0 T364541', diff saved to https://phabricator.wikimedia.org/P63628 and previous config saved to /var/cache/conftool/dbconfig/20240530-044249-root.json
  • 04:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 33 hosts with reason: Primary switchover s8 T364541
  • 04:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 04:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 04:20 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:20 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:11 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:11 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:08 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:06 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:06 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T364299)', diff saved to https://phabricator.wikimedia.org/P63627 and previous config saved to /var/cache/conftool/dbconfig/20240530-025955-marostegui.json
  • 02:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P63626 and previous config saved to /var/cache/conftool/dbconfig/20240530-024447-marostegui.json
  • 02:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P63625 and previous config saved to /var/cache/conftool/dbconfig/20240530-022938-marostegui.json
  • 02:27 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-c4-codfw.mgmt.codfw.wmnet
  • 02:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T364299)', diff saved to https://phabricator.wikimedia.org/P63624 and previous config saved to /var/cache/conftool/dbconfig/20240530-021430-marostegui.json
  • 01:59 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:59 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:56 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 01:56 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-c4-codfw - pt1979@cumin2002"
  • 01:55 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-c4-codfw - pt1979@cumin2002"
  • 01:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2138 (T352010)', diff saved to https://phabricator.wikimedia.org/P63623 and previous config saved to /var/cache/conftool/dbconfig/20240530-014850-ladsgroup.json
  • 01:48 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 01:48 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 01:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T352010)', diff saved to https://phabricator.wikimedia.org/P63622 and previous config saved to /var/cache/conftool/dbconfig/20240530-014827-ladsgroup.json
  • 01:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2147 (T364069)', diff saved to https://phabricator.wikimedia.org/P63621 and previous config saved to /var/cache/conftool/dbconfig/20240530-014725-marostegui.json
  • 01:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 01:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 01:39 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 01:39 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-c4-codfw.mgmt.codfw.wmnet
  • 01:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P63620 and previous config saved to /var/cache/conftool/dbconfig/20240530-013319-ladsgroup.json
  • 01:28 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-c3-codfw.mgmt.codfw.wmnet
  • 01:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P63619 and previous config saved to /var/cache/conftool/dbconfig/20240530-011810-ladsgroup.json
  • 01:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2220 (T364299)', diff saved to https://phabricator.wikimedia.org/P63618 and previous config saved to /var/cache/conftool/dbconfig/20240530-011518-marostegui.json
  • 01:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2220.codfw.wmnet with reason: Maintenance
  • 01:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2220.codfw.wmnet with reason: Maintenance
  • 01:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T364299)', diff saved to https://phabricator.wikimedia.org/P63617 and previous config saved to /var/cache/conftool/dbconfig/20240530-011454-marostegui.json
  • 01:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T352010)', diff saved to https://phabricator.wikimedia.org/P63616 and previous config saved to /var/cache/conftool/dbconfig/20240530-010302-ladsgroup.json
  • 00:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P63615 and previous config saved to /var/cache/conftool/dbconfig/20240530-005946-marostegui.json
  • 00:57 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 00:57 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-c3-codfw - pt1979@cumin2002"
  • 00:56 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-c3-codfw - pt1979@cumin2002"
  • 00:54 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 00:54 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-c3-codfw.mgmt.codfw.wmnet
  • 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-c2-codfw - pt1979@cumin2002"
  • 00:52 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-c2-codfw - pt1979@cumin2002"
  • 00:50 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 00:50 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-c2-codfw.mgmt.codfw.wmnet
  • 00:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P63614 and previous config saved to /var/cache/conftool/dbconfig/20240530-004438-marostegui.json
  • 00:41 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-c1-codfw.mgmt.codfw.wmnet
  • 00:37 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:37 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T364299)', diff saved to https://phabricator.wikimedia.org/P63613 and previous config saved to /var/cache/conftool/dbconfig/20240530-002930-marostegui.json
  • 00:09 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 00:09 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-c1-codfw - pt1979@cumin2002"
  • 00:08 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-c1-codfw - pt1979@cumin2002"
  • 00:06 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 00:06 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-c1-codfw.mgmt.codfw.wmnet

2024-05-29

  • 23:43 eileen: civicrm upgraded from 0e3c277e to 44900b8c
  • 23:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2208 (T364299)', diff saved to https://phabricator.wikimedia.org/P63612 and previous config saved to /var/cache/conftool/dbconfig/20240529-232924-marostegui.json
  • 23:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2208.codfw.wmnet with reason: Maintenance
  • 23:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2208.codfw.wmnet with reason: Maintenance
  • 22:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 22:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 22:16 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1009.eqiad.wmnet with OS bullseye
  • 22:06 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
  • 22:05 jclark@cumin1002: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 2:00:00 on kafka-main1010.eqiad.wmnet with reason: host reimage
  • 22:05 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1009.eqiad.wmnet with OS bullseye
  • 22:05 jclark@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 22:03 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1010.eqiad.wmnet with reason: host reimage
  • 22:02 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
  • 22:01 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 22:00 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1009.eqiad.wmnet with OS bullseye
  • 21:47 cdanis@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:47 cdanis@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 21:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 21:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T364299)', diff saved to https://phabricator.wikimedia.org/P63611 and previous config saved to /var/cache/conftool/dbconfig/20240529-214338-marostegui.json
  • 21:42 cdanis@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:41 cdanis@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:21 jsn@deploy1002: Sync cancelled.
  • 21:21 jsn@deploy1002: jsn: Backport for Revert "feature(Popups): Conditional User Defaults Implementation" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:19 jsn@deploy1002: Started scap: Backport for Revert "feature(Popups): Conditional User Defaults Implementation"
  • 21:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P63609 and previous config saved to /var/cache/conftool/dbconfig/20240529-211321-marostegui.json
  • 21:05 eileen: config revision changed from 38360c6d to 9bbbf8d6
  • 20:59 eileen: civicrm upgraded from 755c7e7f to 0e3c277e
  • 20:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T364299)', diff saved to https://phabricator.wikimedia.org/P63608 and previous config saved to /var/cache/conftool/dbconfig/20240529-205813-marostegui.json
  • 20:56 eileen: civicrm upgraded from 8f236b05 to 755c7e7f
  • 20:53 eileen: civicrm upgraded from 5d536940 to 8f236b05
  • 20:49 jsn@deploy1002: Sync cancelled.
  • 20:45 eileen: config revision changed from 5b0b4d22 to d686119a
  • 20:44 jsn@deploy1002: jsn and jdlrobson: Backport for feature(Popups): Conditional User Defaults Implementation (T364347) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:41 jsn@deploy1002: Started scap: Backport for feature(Popups): Conditional User Defaults Implementation (T364347)
  • 20:40 jsn@deploy1002: Finished scap: Backport for Enable wmgUseSandboxLink for Swahili Wikipedia (T365970) (duration: 17m 08s)
  • 20:32 jsn@deploy1002: jsn and nmw03: Continuing with sync
  • 20:25 jsn@deploy1002: jsn and nmw03: Backport for Enable wmgUseSandboxLink for Swahili Wikipedia (T365970) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:23 jsn@deploy1002: Started scap: Backport for Enable wmgUseSandboxLink for Swahili Wikipedia (T365970)
  • 20:21 jsn@deploy1002: Finished scap: Backport for CommonSettings: correct AutoModerator load order (T366203) (duration: 11m 22s)
  • 20:12 jsn@deploy1002: jsn: Continuing with sync
  • 20:12 jsn@deploy1002: jsn: Backport for CommonSettings: correct AutoModerator load order (T366203) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:10 eileen: civicrm upgraded from cc402cd1 to 5d536940
  • 20:10 jsn@deploy1002: Started scap: Backport for CommonSettings: correct AutoModerator load order (T366203)
  • 19:58 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 19:51 cdanis@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 19:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2182 (T364299)', diff saved to https://phabricator.wikimedia.org/P63607 and previous config saved to /var/cache/conftool/dbconfig/20240529-194309-marostegui.json
  • 19:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 19:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 19:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T364299)', diff saved to https://phabricator.wikimedia.org/P63606 and previous config saved to /var/cache/conftool/dbconfig/20240529-194245-marostegui.json
  • 19:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 19:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 19:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T366123)', diff saved to https://phabricator.wikimedia.org/P63605 and previous config saved to /var/cache/conftool/dbconfig/20240529-194107-marostegui.json
  • 19:37 ebernhardson@deploy1002: Finished deploy [airflow-dags/search@3287de9]: bump discolytics to 0.22.0 (duration: 00m 27s)
  • 19:36 ebernhardson@deploy1002: Started deploy [airflow-dags/search@3287de9]: bump discolytics to 0.22.0
  • 19:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P63604 and previous config saved to /var/cache/conftool/dbconfig/20240529-192735-marostegui.json
  • 19:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P63603 and previous config saved to /var/cache/conftool/dbconfig/20240529-192559-marostegui.json
  • 19:17 dancy@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.43.0-wmf.7 refs T361401
  • 19:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P63602 and previous config saved to /var/cache/conftool/dbconfig/20240529-191227-marostegui.json
  • 19:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P63601 and previous config saved to /var/cache/conftool/dbconfig/20240529-191049-marostegui.json
  • 18:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T364299)', diff saved to https://phabricator.wikimedia.org/P63600 and previous config saved to /var/cache/conftool/dbconfig/20240529-185719-marostegui.json
  • 18:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T366123)', diff saved to https://phabricator.wikimedia.org/P63599 and previous config saved to /var/cache/conftool/dbconfig/20240529-185541-marostegui.json
  • 18:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2126 (T352010)', diff saved to https://phabricator.wikimedia.org/P63598 and previous config saved to /var/cache/conftool/dbconfig/20240529-185035-ladsgroup.json
  • 18:50 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 18:50 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 18:50 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 18:50 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 18:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T352010)', diff saved to https://phabricator.wikimedia.org/P63597 and previous config saved to /var/cache/conftool/dbconfig/20240529-185006-ladsgroup.json
  • 18:41 cdanis: 💙cdanis@lvs1020.eqiad.wmnet ~ 🕝☕ sudo systemctl restart pybal.service
  • 18:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P63595 and previous config saved to /var/cache/conftool/dbconfig/20240529-183458-ladsgroup.json
  • 18:33 dancy@deploy1002: Finished scap: Backport for Revert "Wrap tables with JS" (T330527) (duration: 25m 10s)
  • 18:32 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 18:32 andrew@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - andrew@cumin1002"
  • 18:31 andrew@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - andrew@cumin1002"
  • 18:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T366123)', diff saved to https://phabricator.wikimedia.org/P63594 and previous config saved to /var/cache/conftool/dbconfig/20240529-182719-marostegui.json
  • 18:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 18:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 18:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T366123)', diff saved to https://phabricator.wikimedia.org/P63593 and previous config saved to /var/cache/conftool/dbconfig/20240529-182656-marostegui.json
  • 18:24 dancy@deploy1002: dancy and jdlrobson: Continuing with sync
  • 18:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P63592 and previous config saved to /var/cache/conftool/dbconfig/20240529-181950-ladsgroup.json
  • 18:16 rzl: evacuate cordoned node parse1002: kubectl -n linkrecommendation delete pod linkrecommendation-internal-load-datasets-28616700-7gsqs; kubectl -n linkrecommendation delete pod linkrecommendation-internal-load-datasets-28616700-xl7t4; kubectl -n toolhub delete pod toolhub-main-crawler-28616760-jrhbb # T363086
  • 18:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P63590 and previous config saved to /var/cache/conftool/dbconfig/20240529-181148-marostegui.json
  • 18:11 dancy@deploy1002: dancy and jdlrobson: Backport for Revert "Wrap tables with JS" (T330527) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 18:08 dancy@deploy1002: Started scap: Backport for Revert "Wrap tables with JS" (T330527)
  • 18:04 akosiaris: kubectl -n mw-debug delete pods mw-debug.eqiad.pinkunicorn-6d4d68cd79-nq695
  • 18:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T352010)', diff saved to https://phabricator.wikimedia.org/P63589 and previous config saved to /var/cache/conftool/dbconfig/20240529-180442-ladsgroup.json
  • 17:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P63588 and previous config saved to /var/cache/conftool/dbconfig/20240529-175640-marostegui.json
  • 17:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2168 (T364299)', diff saved to https://phabricator.wikimedia.org/P63587 and previous config saved to /var/cache/conftool/dbconfig/20240529-174829-marostegui.json
  • 17:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 17:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 17:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T364299)', diff saved to https://phabricator.wikimedia.org/P63586 and previous config saved to /var/cache/conftool/dbconfig/20240529-174806-marostegui.json
  • 17:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T366123)', diff saved to https://phabricator.wikimedia.org/P63585 and previous config saved to /var/cache/conftool/dbconfig/20240529-174132-marostegui.json
  • 17:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1202 (T366123)', diff saved to https://phabricator.wikimedia.org/P63584 and previous config saved to /var/cache/conftool/dbconfig/20240529-173921-marostegui.json
  • 17:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 17:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 17:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T366123)', diff saved to https://phabricator.wikimedia.org/P63583 and previous config saved to /var/cache/conftool/dbconfig/20240529-173857-marostegui.json
  • 17:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P63582 and previous config saved to /var/cache/conftool/dbconfig/20240529-173258-marostegui.json
  • 17:26 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1041.eqiad.wmnet with reason: host reimage
  • 17:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P63581 and previous config saved to /var/cache/conftool/dbconfig/20240529-172349-marostegui.json
  • 17:23 andrew@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1041.eqiad.wmnet with reason: host reimage
  • 17:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P63580 and previous config saved to /var/cache/conftool/dbconfig/20240529-171750-marostegui.json
  • 17:14 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 17:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P63579 and previous config saved to /var/cache/conftool/dbconfig/20240529-170841-marostegui.json
  • 17:08 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 17:08 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 17:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T364299)', diff saved to https://phabricator.wikimedia.org/P63578 and previous config saved to /var/cache/conftool/dbconfig/20240529-170242-marostegui.json
  • 16:59 stevemunene@deploy1002: Finished deploy [airflow-dags/analytics@229b278]: (no justification provided) (duration: 00m 26s)
  • 16:59 stevemunene@deploy1002: Started deploy [airflow-dags/analytics@229b278]: (no justification provided)
  • 16:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T366123)', diff saved to https://phabricator.wikimedia.org/P63577 and previous config saved to /var/cache/conftool/dbconfig/20240529-165333-marostegui.json
  • 16:52 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 16:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1194 (T366123)', diff saved to https://phabricator.wikimedia.org/P63576 and previous config saved to /var/cache/conftool/dbconfig/20240529-165121-marostegui.json
  • 16:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 16:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 16:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T366123)', diff saved to https://phabricator.wikimedia.org/P63575 and previous config saved to /var/cache/conftool/dbconfig/20240529-165057-marostegui.json
  • 16:50 dancy@deploy1002: Started scap: Backport for Revert "Wrap tables with JS" (T330527)
  • 16:40 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2045.codfw.wmnet with OS bookworm
  • 16:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P63574 and previous config saved to /var/cache/conftool/dbconfig/20240529-163549-marostegui.json
  • 16:35 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
  • 16:34 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1045.eqiad.wmnet with OS bookworm
  • 16:32 sukhe: restart pybal on lvs1019
  • 16:29 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1009.eqiad.wmnet with OS bullseye
  • 16:28 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 16:27 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:22 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2045.codfw.wmnet with reason: host reimage
  • 16:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P63573 and previous config saved to /var/cache/conftool/dbconfig/20240529-162040-marostegui.json
  • 16:19 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2045.codfw.wmnet with reason: host reimage
  • 16:18 ChrisDobbins901_: sudo cumin -b1 -s60 'A:cp and A:drmrs' 'run-puppet-agent --enable "merging CR 1037089"'
  • 16:17 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1045.eqiad.wmnet with reason: host reimage
  • 16:15 arnaudb@cumin1002: dbctl commit (dc=all): 'db1163 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63572 and previous config saved to /var/cache/conftool/dbconfig/20240529-161522-arnaudb.json
  • 16:15 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:14 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1045.eqiad.wmnet with reason: host reimage
  • 16:09 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T366123)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240529-160528-marostegui.json
  • 16:04 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:04 ChrisDobbins901_: sudo cumin 'A:cp and A:drmrs' 'disable-puppet "merging CR 1037089"'
  • 16:01 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1045.eqiad.wmnet with OS bookworm
  • 16:01 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2045.codfw.wmnet with OS bookworm
  • 16:00 arnaudb@cumin1002: dbctl commit (dc=all): 'db1163 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63570 and previous config saved to /var/cache/conftool/dbconfig/20240529-160016-arnaudb.json
  • 16:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 16:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 15:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T366123)', diff saved to https://phabricator.wikimedia.org/P63569 and previous config saved to /var/cache/conftool/dbconfig/20240529-155954-marostegui.json
  • 15:59 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:56 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:55 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:55 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 15:55 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1009.eqiad.wmnet with OS bullseye
  • 15:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2159 (T364299)', diff saved to https://phabricator.wikimedia.org/P63568 and previous config saved to /var/cache/conftool/dbconfig/20240529-155349-marostegui.json
  • 15:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 15:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 15:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 15:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 15:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T364299)', diff saved to https://phabricator.wikimedia.org/P63567 and previous config saved to /var/cache/conftool/dbconfig/20240529-155321-marostegui.json
  • 15:52 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['cloudvirt1041']
  • 15:49 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbprov2003.codfw.wmnet with reason: upgrade to 10.6
  • 15:49 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 5:00:00 on dbprov2003.codfw.wmnet with reason: upgrade to 10.6
  • 15:49 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbprov1003.eqiad.wmnet with reason: upgrade to 10.6
  • 15:49 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 5:00:00 on dbprov1003.eqiad.wmnet with reason: upgrade to 10.6
  • 15:48 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:48 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2141.codfw.wmnet with reason: upgrade to 10.6
  • 15:48 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 5:00:00 on db2141.codfw.wmnet with reason: upgrade to 10.6
  • 15:48 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:45 arnaudb@cumin1002: dbctl commit (dc=all): 'db1163 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63566 and previous config saved to /var/cache/conftool/dbconfig/20240529-154510-arnaudb.json
  • 15:45 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1041']
  • 15:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P63565 and previous config saved to /var/cache/conftool/dbconfig/20240529-154446-marostegui.json
  • 15:39 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudvirt1041']
  • 15:38 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1041']
  • 15:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P63564 and previous config saved to /var/cache/conftool/dbconfig/20240529-153813-marostegui.json
  • 15:32 dancy@deploy1002: Finished scap: Backport for Remove the php symlink (v2) (T359643) (duration: 13m 03s)
  • 15:31 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudvirt1041']
  • 15:31 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1041']
  • 15:30 arnaudb@cumin1002: dbctl commit (dc=all): 'db1163 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63563 and previous config saved to /var/cache/conftool/dbconfig/20240529-153001-arnaudb.json
  • 15:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P63562 and previous config saved to /var/cache/conftool/dbconfig/20240529-152937-marostegui.json
  • 15:29 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['cloudvirt1041']
  • 15:27 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:26 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:26 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: correct IPs for apus - mvernon@cumin2002"
  • 15:25 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: correct IPs for apus - mvernon@cumin2002"
  • 15:25 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:25 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:23 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1041']
  • 15:23 dancy@deploy1002: dancy: Continuing with sync
  • 15:23 mvernon@cumin2002: START - Cookbook sre.dns.netbox
  • 15:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P63561 and previous config saved to /var/cache/conftool/dbconfig/20240529-152305-marostegui.json
  • 15:22 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudvirt1041']
  • 15:22 dancy@deploy1002: dancy: Backport for Remove the php symlink (v2) (T359643) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:21 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1041']
  • 15:20 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-web: sync
  • 15:19 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: sync
  • 15:19 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: sync
  • 15:19 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: sync
  • 15:19 dancy@deploy1002: Started scap: Backport for Remove the php symlink (v2) (T359643)
  • 15:18 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-wikifunctions: sync
  • 15:18 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
  • 15:18 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-ext: sync
  • 15:18 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/mw-web: sync
  • 15:18 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/mw-parsoid: sync
  • 15:18 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: sync
  • 15:18 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/mw-wikifunctions: sync
  • 15:18 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
  • 15:17 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudvirt1041']
  • 15:17 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1041']
  • 15:14 arnaudb@cumin1002: dbctl commit (dc=all): 'db1163 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63560 and previous config saved to /var/cache/conftool/dbconfig/20240529-151455-arnaudb.json
  • 15:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T366123)', diff saved to https://phabricator.wikimedia.org/P63559 and previous config saved to /var/cache/conftool/dbconfig/20240529-151430-marostegui.json
  • 15:12 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1181 (T366123)', diff saved to https://phabricator.wikimedia.org/P63558 and previous config saved to /var/cache/conftool/dbconfig/20240529-151219-marostegui.json
  • 15:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 15:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63557 and previous config saved to /var/cache/conftool/dbconfig/20240529-151152-arnaudb.json
  • 15:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 15:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T366123)', diff saved to https://phabricator.wikimedia.org/P63556 and previous config saved to /var/cache/conftool/dbconfig/20240529-151145-marostegui.json
  • 15:09 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1009.eqiad.wmnet with OS bullseye
  • 15:08 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1163.eqiad.wmnet with OS bookworm
  • 15:08 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T364299)', diff saved to https://phabricator.wikimedia.org/P63555 and previous config saved to /var/cache/conftool/dbconfig/20240529-150757-marostegui.json
  • 15:07 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:07 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2046.codfw.wmnet with OS bookworm
  • 15:07 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:06 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:06 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['cloudvirt1041']
  • 15:05 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1041']
  • 15:05 andrew@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudvirt1041']
  • 15:05 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['cloudvirt1041']
  • 15:04 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1041']
  • 14:58 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1046.eqiad.wmnet with OS bookworm
  • 14:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63554 and previous config saved to /var/cache/conftool/dbconfig/20240529-145646-arnaudb.json
  • 14:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P63553 and previous config saved to /var/cache/conftool/dbconfig/20240529-145637-marostegui.json
  • 14:56 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: sync
  • 14:54 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: sync
  • 14:54 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:53 andrew@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1041']
  • 14:52 klausman@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:52 klausman@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 14:50 klausman@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 14:50 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:49 klausman@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 14:49 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2046.codfw.wmnet with reason: host reimage
  • 14:47 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1163.eqiad.wmnet with reason: host reimage
  • 14:47 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:45 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:45 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:44 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2046.codfw.wmnet with reason: host reimage
  • 14:44 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:44 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Discovery IPs for apus service - mvernon@cumin2002"
  • 14:43 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:43 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1163.eqiad.wmnet with reason: host reimage
  • 14:43 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Discovery IPs for apus service - mvernon@cumin2002"
  • 14:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 14:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 14:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137 (T364069)', diff saved to https://phabricator.wikimedia.org/P63552 and previous config saved to /var/cache/conftool/dbconfig/20240529-144229-marostegui.json
  • 14:42 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1046.eqiad.wmnet with reason: host reimage
  • 14:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63551 and previous config saved to /var/cache/conftool/dbconfig/20240529-144140-arnaudb.json
  • 14:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P63550 and previous config saved to /var/cache/conftool/dbconfig/20240529-144129-marostegui.json
  • 14:41 mvernon@cumin2002: START - Cookbook sre.dns.netbox
  • 14:39 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1046.eqiad.wmnet with reason: host reimage
  • 14:37 fabfur: enabled puppet on A:cp as https://gerrit.wikimedia.org/r/c/operations/puppet/+/1036711 has been reverted (not applied anywhere but cp4037) (T365718)
  • 14:33 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 14:30 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1163.eqiad.wmnet with OS bookworm
  • 14:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1163.eqiad.wmnet with reason: reimage
  • 14:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on db1163.eqiad.wmnet with reason: reimage
  • 14:28 jforrester@deploy1002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 14:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1163 T364290', diff saved to https://phabricator.wikimedia.org/P63549 and previous config saved to /var/cache/conftool/dbconfig/20240529-142830-arnaudb.json
  • 14:27 arnaudb@cumin1002: dbctl commit (dc=all): 'db1196 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63548 and previous config saved to /var/cache/conftool/dbconfig/20240529-142750-arnaudb.json
  • 14:27 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 14:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137', diff saved to https://phabricator.wikimedia.org/P63547 and previous config saved to /var/cache/conftool/dbconfig/20240529-142721-marostegui.json
  • 14:27 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:26 jforrester@deploy1002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 14:26 jforrester@deploy1002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 14:26 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2046.codfw.wmnet with OS bookworm
  • 14:26 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:26 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1046.eqiad.wmnet with OS bookworm
  • 14:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63546 and previous config saved to /var/cache/conftool/dbconfig/20240529-142627-arnaudb.json
  • 14:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T366123)', diff saved to https://phabricator.wikimedia.org/P63545 and previous config saved to /var/cache/conftool/dbconfig/20240529-142619-marostegui.json
  • 14:25 cdanis@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 14:24 brouberol@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:24 brouberol@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:24 jforrester@deploy1002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 14:24 jforrester@deploy1002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 14:22 brouberol@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:22 jforrester@deploy1002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 14:22 brouberol@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 14:21 brouberol@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 14:20 brouberol@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 14:19 brouberol@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 14:19 brouberol@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 14:18 jforrester@deploy1002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 14:18 brouberol@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:17 brouberol@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 14:17 brouberol@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:17 jforrester@deploy1002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 14:16 jforrester@deploy1002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 14:16 brouberol@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 14:15 jforrester@deploy1002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 14:15 jforrester@deploy1002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 14:14 jforrester@deploy1002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 14:12 arnaudb@cumin1002: dbctl commit (dc=all): 'db1196 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63544 and previous config saved to /var/cache/conftool/dbconfig/20240529-141244-arnaudb.json
  • 14:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137', diff saved to https://phabricator.wikimedia.org/P63543 and previous config saved to /var/cache/conftool/dbconfig/20240529-141213-marostegui.json
  • 14:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63542 and previous config saved to /var/cache/conftool/dbconfig/20240529-141114-arnaudb.json
  • 14:08 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1169.eqiad.wmnet with OS bookworm
  • 13:57 arnaudb@cumin1002: dbctl commit (dc=all): 'db1196 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63541 and previous config saved to /var/cache/conftool/dbconfig/20240529-135738-arnaudb.json
  • 13:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137 (T364069)', diff saved to https://phabricator.wikimedia.org/P63540 and previous config saved to /var/cache/conftool/dbconfig/20240529-135706-marostegui.json
  • 13:55 jiji@cumin1002: conftool action : set/pooled=yes:weight=10; selector: name=wikikube-ctrl1002.eqiad.wmnet
  • 13:55 effie: label wikikube-ctrl1002 as master
  • 13:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2150 (T364299)', diff saved to https://phabricator.wikimedia.org/P63539 and previous config saved to /var/cache/conftool/dbconfig/20240529-135300-marostegui.json
  • 13:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 13:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 13:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T364299)', diff saved to https://phabricator.wikimedia.org/P63538 and previous config saved to /var/cache/conftool/dbconfig/20240529-135237-marostegui.json
  • 13:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1169.eqiad.wmnet with reason: host reimage
  • 13:43 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1169.eqiad.wmnet with reason: host reimage
  • 13:42 arnaudb@cumin1002: dbctl commit (dc=all): 'db1196 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63537 and previous config saved to /var/cache/conftool/dbconfig/20240529-134232-arnaudb.json
  • 13:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P63536 and previous config saved to /var/cache/conftool/dbconfig/20240529-133729-marostegui.json
  • 13:36 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2047.codfw.wmnet with OS bookworm
  • 13:30 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1169.eqiad.wmnet with OS bookworm
  • 13:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1169.eqiad.wmnet with reason: reimage
  • 13:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on db1169.eqiad.wmnet with reason: reimage
  • 13:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1169 T364290', diff saved to https://phabricator.wikimedia.org/P63535 and previous config saved to /var/cache/conftool/dbconfig/20240529-132818-arnaudb.json
  • 13:27 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1047.eqiad.wmnet with OS bookworm
  • 13:27 arnaudb@cumin1002: dbctl commit (dc=all): 'db1196 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63534 and previous config saved to /var/cache/conftool/dbconfig/20240529-132726-arnaudb.json
  • 13:26 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1196.eqiad.wmnet with OS bookworm
  • 13:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1174 (T366123)', diff saved to https://phabricator.wikimedia.org/P63533 and previous config saved to /var/cache/conftool/dbconfig/20240529-132553-marostegui.json
  • 13:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 13:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 13:24 otto@deploy1002: Finished scap: Backport for Create eventlogging-processor legacy converter to proxy to eventgate for mediawiki.org (T353817 T323828) (duration: 18m 25s)
  • 13:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P63532 and previous config saved to /var/cache/conftool/dbconfig/20240529-132221-marostegui.json
  • 13:16 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2047.codfw.wmnet with reason: host reimage
  • 13:16 fabfur: temporarily disabling puppet on A:cp to roll out https://gerrit.wikimedia.org/r/c/operations/puppet/+/1036711 (T365718)
  • 13:14 otto@deploy1002: otto: Continuing with sync
  • 13:13 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2047.codfw.wmnet with reason: host reimage
  • 13:11 moritzm: installing apache2 security updates
  • 13:10 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1047.eqiad.wmnet with reason: host reimage
  • 13:08 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1047.eqiad.wmnet with reason: host reimage
  • 13:08 otto@deploy1002: otto: Backport for Create eventlogging-processor legacy converter to proxy to eventgate for mediawiki.org (T353817 T323828) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T364299)', diff saved to https://phabricator.wikimedia.org/P63531 and previous config saved to /var/cache/conftool/dbconfig/20240529-130713-marostegui.json
  • 13:05 otto@deploy1002: Started scap: Backport for Create eventlogging-processor legacy converter to proxy to eventgate for mediawiki.org (T353817 T323828)
  • 13:04 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1196.eqiad.wmnet with reason: host reimage
  • 13:02 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: mariadb::core
  • 13:01 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1196.eqiad.wmnet with reason: host reimage
  • 12:55 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2047.codfw.wmnet with OS bookworm
  • 12:54 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1047.eqiad.wmnet with OS bookworm
  • 12:53 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: mariadb::core
  • 12:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 12:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 12:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T366123)', diff saved to https://phabricator.wikimedia.org/P63530 and previous config saved to /var/cache/conftool/dbconfig/20240529-125255-marostegui.json
  • 12:46 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1196.eqiad.wmnet with OS bookworm
  • 12:45 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2048.codfw.wmnet with OS bookworm
  • 12:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db[1154,1196].eqiad.wmnet with reason: reimage db1196
  • 12:45 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db[1154,1196].eqiad.wmnet with reason: reimage db1196
  • 12:43 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1196 T364290', diff saved to https://phabricator.wikimedia.org/P63529 and previous config saved to /var/cache/conftool/dbconfig/20240529-124352-arnaudb.json
  • 12:43 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1196.eqiad.wmnet with reason: reimage
  • 12:43 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on db1196.eqiad.wmnet with reason: reimage
  • 12:42 marostegui: recreate triggers on s7 codfw db maint db2187:3317 T366167
  • 12:42 marostegui: recreate triggers on s7 codfw db maint db1155:3317 T366167
  • 12:40 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1048.eqiad.wmnet with OS bookworm
  • 12:39 elukey: move thanos-fe100[3,4] and thanos-fe2* to PKI TLS certs (envoy, backends for thanos-swift.discovery.wmnet) - T344324
  • 12:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P63528 and previous config saved to /var/cache/conftool/dbconfig/20240529-123746-marostegui.json
  • 12:37 sfaci@deploy1002: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
  • 12:35 sfaci@deploy1002: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
  • 12:34 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-staging2002.codfw.wmnet with OS bookworm
  • 12:30 sfaci@deploy1002: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
  • 12:29 sfaci@deploy1002: helmfile [codfw] START helmfile.d/services/media-analytics: apply
  • 12:28 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2048.codfw.wmnet with reason: host reimage
  • 12:26 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
  • 12:25 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/media-analytics: apply
  • 12:24 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1048.eqiad.wmnet with reason: host reimage
  • 12:22 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2048.codfw.wmnet with reason: host reimage
  • 12:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P63527 and previous config saved to /var/cache/conftool/dbconfig/20240529-122239-marostegui.json
  • 12:21 sfaci@deploy1002: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
  • 12:19 sfaci@deploy1002: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
  • 12:19 slyngs: Failover idp.wikimedia.org for CAS upgrade to 6.6.15
  • 12:19 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1048.eqiad.wmnet with reason: host reimage
  • 12:18 sfaci@deploy1002: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
  • 12:17 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-staging2002.codfw.wmnet with reason: host reimage
  • 12:17 sfaci@deploy1002: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
  • 12:16 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
  • 12:15 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/geo-analytics: apply
  • 12:14 klausman@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-staging2002.codfw.wmnet with reason: host reimage
  • 12:12 sfaci@deploy1002: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
  • 12:11 sfaci@deploy1002: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
  • 12:10 sfaci@deploy1002: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
  • 12:08 sfaci@deploy1002: helmfile [codfw] START helmfile.d/services/page-analytics: apply
  • 12:07 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
  • 12:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T366123)', diff saved to https://phabricator.wikimedia.org/P63526 and previous config saved to /var/cache/conftool/dbconfig/20240529-120730-marostegui.json
  • 12:06 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/page-analytics: apply
  • 12:05 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1048.eqiad.wmnet with OS bookworm
  • 12:04 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2048.codfw.wmnet with OS bookworm
  • 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1235.eqiad.wmnet
  • 11:53 klausman@cumin2002: START - Cookbook sre.hosts.reimage for host ml-staging2002.codfw.wmnet with OS bookworm
  • 11:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2122 (T364299)', diff saved to https://phabricator.wikimedia.org/P63525 and previous config saved to /var/cache/conftool/dbconfig/20240529-115051-marostegui.json
  • 11:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 11:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 11:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T364299)', diff saved to https://phabricator.wikimedia.org/P63524 and previous config saved to /var/cache/conftool/dbconfig/20240529-115025-marostegui.json
  • 11:46 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Mabualruz out of all services on: 2198 hosts
  • 11:46 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Mabualruz out of all services on: 2198 hosts
  • 11:44 sfaci@deploy1002: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
  • 11:42 sfaci@deploy1002: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
  • 11:42 marostegui: recreate triggers on s7 eqiad db maint db1155:3317 T366167
  • 11:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1170 (T366123)', diff saved to https://phabricator.wikimedia.org/P63523 and previous config saved to /var/cache/conftool/dbconfig/20240529-114153-marostegui.json
  • 11:41 hnowlan: homer "cr*eqiad*" commit 'adding bgp state for wikikube-ctrl1002'
  • 11:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 11:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 11:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T366123)', diff saved to https://phabricator.wikimedia.org/P63522 and previous config saved to /var/cache/conftool/dbconfig/20240529-114129-marostegui.json
  • 11:40 sfaci@deploy1002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
  • 11:38 sfaci@deploy1002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
  • 11:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P63521 and previous config saved to /var/cache/conftool/dbconfig/20240529-113517-marostegui.json
  • 11:26 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 11:26 akosiaris@cumin1002: conftool action : set/pooled=yes; selector: service=kubemaster,dc=eqiad,cluster=kubernetes,name=wikikube-ctrl1001.eqiad.wmnet
  • 11:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P63520 and previous config saved to /var/cache/conftool/dbconfig/20240529-112621-marostegui.json
  • 11:26 akosiaris@cumin1002: conftool action : set/pooled=yes; selector: service=kubemaster,dc=eqiad,cluster=kubernetes,name=kubemaster1002.eqiad.wmnet
  • 11:25 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 11:23 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
  • 11:23 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/admin 'sync'.
  • 11:23 akosiaris: T366094 re-undeploy otel-collector; its presence increased traffic to the API by >50%
  • 11:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P63519 and previous config saved to /var/cache/conftool/dbconfig/20240529-112009-marostegui.json
  • 11:19 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: sync
  • 11:16 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-web: sync
  • 11:15 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 11:15 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: sync
  • 11:12 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: sync
  • 11:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P63518 and previous config saved to /var/cache/conftool/dbconfig/20240529-111112-marostegui.json
  • 11:10 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-wikifunctions: sync
  • 11:10 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: sync
  • 11:10 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: sync
  • 11:10 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: sync
  • 11:10 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 11:10 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: sync
  • 11:10 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-wikifunctions: sync
  • 11:06 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
  • 11:06 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/admin 'sync'.
  • 11:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T364299)', diff saved to https://phabricator.wikimedia.org/P63517 and previous config saved to /var/cache/conftool/dbconfig/20240529-110501-marostegui.json
  • 11:04 akosiaris: redeploy opentelemetry collector T366094
  • 11:03 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
  • 11:03 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/admin 'sync'.
  • 11:03 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
  • 11:03 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/admin 'sync'.
  • 11:03 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1049.eqiad.wmnet with OS bookworm
  • 11:03 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: sync
  • 10:57 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 10:57 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: sync
  • 10:56 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-web: sync
  • 10:56 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-web: sync
  • 10:56 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-wikifunctions: sync
  • 10:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T366123)', diff saved to https://phabricator.wikimedia.org/P63516 and previous config saved to /var/cache/conftool/dbconfig/20240529-105604-marostegui.json
  • 10:56 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: sync
  • 10:56 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: sync
  • 10:55 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: sync
  • 10:55 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: sync
  • 10:55 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 10:55 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: sync
  • 10:55 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: sync
  • 10:55 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-wikifunctions: sync
  • 10:55 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 10:55 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: sync
  • 10:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1158 (T366123)', diff saved to https://phabricator.wikimedia.org/P63515 and previous config saved to /var/cache/conftool/dbconfig/20240529-105454-marostegui.json
  • 10:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 10:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 10:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 10:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 10:52 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: sync
  • 10:51 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-wikifunctions: sync
  • 10:51 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: sync
  • 10:51 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: sync
  • 10:51 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: sync
  • 10:51 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 10:51 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: sync
  • 10:51 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-wikifunctions: sync
  • 10:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 10:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 10:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 10:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 10:46 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1049.eqiad.wmnet with reason: host reimage
  • 10:45 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 10:45 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: sync
  • 10:43 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: sync
  • 10:43 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-wikifunctions: sync
  • 10:43 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1049.eqiad.wmnet with reason: host reimage
  • 10:43 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 10:43 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: sync
  • 10:43 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-wikifunctions: sync
  • 10:43 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: sync
  • 10:38 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 10:38 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: sync
  • 10:36 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: sync
  • 10:36 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 10:35 akosiaris@cumin1002: conftool action : set/pooled=inactive; selector: name=parse1002.eqiad.wmnet
  • 10:29 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1049.eqiad.wmnet with OS bookworm
  • 10:26 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 10:26 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: sync
  • 10:26 moritzm: installing intel-microcode security updates
  • 10:24 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: sync
  • 10:24 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 10:24 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 10:24 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: sync
  • 10:24 akosiaris@cumin1002: conftool action : set/pooled=inactive; selector: service=kubemaster,dc=eqiad,cluster=kubernetes,name=kubemaster1002.eqiad.wmnet
  • 10:19 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: sync
  • 10:19 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 10:17 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubemaster1002.eqiad.wmnet with reason: disable puppet and k8s controlplane
  • 10:16 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubemaster1002.eqiad.wmnet with reason: disable puppet and k8s controlplane
  • 10:16 moritzm: installing python-idna security updates
  • 10:16 sfaci@deploy1002: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
  • 10:15 sfaci@deploy1002: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
  • 10:14 sfaci@deploy1002: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
  • 10:12 sfaci@deploy1002: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
  • 10:10 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1235.eqiad.wmnet
  • 10:10 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
  • 10:09 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 10:07 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 10:07 moritzm: installing systemd security updates
  • 10:06 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: sync
  • 10:06 dcausse@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:06 dcausse@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:05 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:05 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:05 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 10:05 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: sync
  • 10:05 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl1002.eqiad.wmnet with reason: disable puppet and k8s controlplane
  • 10:05 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl1002.eqiad.wmnet with reason: disable puppet and k8s controlplane
  • 10:04 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl1001.eqiad.wmnet with reason: disable puppet and k8s controlplane
  • 10:04 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl1001.eqiad.wmnet with reason: disable puppet and k8s controlplane
  • 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1234.eqiad.wmnet
  • 10:02 sfaci@deploy1002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
  • 10:01 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:01 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:00 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:00 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:00 sfaci@deploy1002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
  • 09:59 sfaci@deploy1002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
  • 09:57 sfaci@deploy1002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
  • 09:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2121 (T364299)', diff saved to https://phabricator.wikimedia.org/P63514 and previous config saved to /var/cache/conftool/dbconfig/20240529-095437-marostegui.json
  • 09:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 09:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 09:51 dcausse@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:50 dcausse@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:47 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1234.eqiad.wmnet
  • 09:39 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 09:38 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 09:38 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 09:37 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: sync
  • 09:36 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 09:36 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: sync
  • 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1232.eqiad.wmnet
  • 09:29 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 09:27 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 09:16 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1232.eqiad.wmnet
  • 09:12 marostegui: Deploy schema change on s7 eqiad dbmaint T307501
  • 09:05 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 8674
  • 08:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 08:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 08:40 brouberol@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 08:39 brouberol@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 08:35 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 8674
  • 08:33 brouberol@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: sync
  • 08:31 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 49666
  • 08:29 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 49666
  • 08:22 brouberol@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: sync
  • 08:10 brouberol@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: sync
  • 08:00 brouberol@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: sync
  • 07:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1228.eqiad.wmnet
  • 07:54 dcausse@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:54 dcausse@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:47 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1228.eqiad.wmnet
  • 07:47 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1219.eqiad.wmnet
  • 07:35 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1219.eqiad.wmnet
  • 07:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 07:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 07:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T364299)', diff saved to https://phabricator.wikimedia.org/P63513 and previous config saved to /var/cache/conftool/dbconfig/20240529-073017-marostegui.json
  • 07:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P63512 and previous config saved to /var/cache/conftool/dbconfig/20240529-071509-marostegui.json
  • 07:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P63511 and previous config saved to /var/cache/conftool/dbconfig/20240529-070001-marostegui.json
  • 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1218.eqiad.wmnet
  • 06:49 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1218.eqiad.wmnet
  • 06:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T364299)', diff saved to https://phabricator.wikimedia.org/P63510 and previous config saved to /var/cache/conftool/dbconfig/20240529-064453-marostegui.json
  • 04:24 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T364299)', diff saved to https://phabricator.wikimedia.org/P63509 and previous config saved to /var/cache/conftool/dbconfig/20240529-042402-marostegui.json
  • 04:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 04:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 04:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T364299)', diff saved to https://phabricator.wikimedia.org/P63508 and previous config saved to /var/cache/conftool/dbconfig/20240529-042339-marostegui.json
  • 04:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P63507 and previous config saved to /var/cache/conftool/dbconfig/20240529-040831-marostegui.json
  • 04:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2137 (T364069)', diff saved to https://phabricator.wikimedia.org/P63506 and previous config saved to /var/cache/conftool/dbconfig/20240529-040259-marostegui.json
  • 04:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 04:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 04:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T364069)', diff saved to https://phabricator.wikimedia.org/P63505 and previous config saved to /var/cache/conftool/dbconfig/20240529-040236-marostegui.json
  • 03:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2125 (T352010)', diff saved to https://phabricator.wikimedia.org/P63504 and previous config saved to /var/cache/conftool/dbconfig/20240529-035538-ladsgroup.json
  • 03:55 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 03:55 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 03:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P63503 and previous config saved to /var/cache/conftool/dbconfig/20240529-035323-marostegui.json
  • 03:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P63502 and previous config saved to /var/cache/conftool/dbconfig/20240529-034728-marostegui.json
  • 03:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T364299)', diff saved to https://phabricator.wikimedia.org/P63501 and previous config saved to /var/cache/conftool/dbconfig/20240529-033814-marostegui.json
  • 03:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P63500 and previous config saved to /var/cache/conftool/dbconfig/20240529-033221-marostegui.json
  • 03:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T364069)', diff saved to https://phabricator.wikimedia.org/P63499 and previous config saved to /var/cache/conftool/dbconfig/20240529-031710-marostegui.json
  • 02:34 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1202 (T364299)', diff saved to https://phabricator.wikimedia.org/P63498 and previous config saved to /var/cache/conftool/dbconfig/20240529-023432-marostegui.json
  • 02:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 02:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 02:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T364299)', diff saved to https://phabricator.wikimedia.org/P63497 and previous config saved to /var/cache/conftool/dbconfig/20240529-023409-marostegui.json
  • 02:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P63496 and previous config saved to /var/cache/conftool/dbconfig/20240529-021901-marostegui.json
  • 02:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P63495 and previous config saved to /var/cache/conftool/dbconfig/20240529-020353-marostegui.json
  • 01:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T364299)', diff saved to https://phabricator.wikimedia.org/P63494 and previous config saved to /var/cache/conftool/dbconfig/20240529-014845-marostegui.json
  • 00:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1194 (T364299)', diff saved to https://phabricator.wikimedia.org/P63493 and previous config saved to /var/cache/conftool/dbconfig/20240529-004343-marostegui.json
  • 00:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 00:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 00:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T364299)', diff saved to https://phabricator.wikimedia.org/P63492 and previous config saved to /var/cache/conftool/dbconfig/20240529-004319-marostegui.json
  • 00:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P63491 and previous config saved to /var/cache/conftool/dbconfig/20240529-002811-marostegui.json
  • 00:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P63490 and previous config saved to /var/cache/conftool/dbconfig/20240529-001303-marostegui.json

2024-05-28

  • 23:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T364299)', diff saved to https://phabricator.wikimedia.org/P63489 and previous config saved to /var/cache/conftool/dbconfig/20240528-235755-marostegui.json
  • 22:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1191 (T364299)', diff saved to https://phabricator.wikimedia.org/P63488 and previous config saved to /var/cache/conftool/dbconfig/20240528-225541-marostegui.json
  • 22:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 22:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 22:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T364299)', diff saved to https://phabricator.wikimedia.org/P63487 and previous config saved to /var/cache/conftool/dbconfig/20240528-225516-marostegui.json
  • 22:50 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:50 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:48 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:48 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:45 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:45 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:43 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:43 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:41 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:41 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P63486 and previous config saved to /var/cache/conftool/dbconfig/20240528-224008-marostegui.json
  • 22:39 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:39 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:37 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:37 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:34 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:34 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:32 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:32 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:30 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:30 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P63485 and previous config saved to /var/cache/conftool/dbconfig/20240528-222500-marostegui.json
  • 22:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T364299)', diff saved to https://phabricator.wikimedia.org/P63484 and previous config saved to /var/cache/conftool/dbconfig/20240528-220950-marostegui.json
  • 21:10 ejegg: payments-wiki upgraded from 0bed1814 to 8ff002ef
  • 21:06 ejegg: donorwiki upgraded from fa7de70f to 8ff002ef
  • 21:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1181 (T364299)', diff saved to https://phabricator.wikimedia.org/P63483 and previous config saved to /var/cache/conftool/dbconfig/20240528-210533-marostegui.json
  • 21:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 21:05 ejegg: payments-wiki upgraded from 2bfd247a to 0bed1814
  • 21:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 21:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T364299)', diff saved to https://phabricator.wikimedia.org/P63482 and previous config saved to /var/cache/conftool/dbconfig/20240528-210510-marostegui.json
  • 21:01 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 10m 21s)
  • 20:51 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 11m 23s)
  • 20:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P63481 and previous config saved to /var/cache/conftool/dbconfig/20240528-205001-marostegui.json
  • 20:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P63479 and previous config saved to /var/cache/conftool/dbconfig/20240528-203453-marostegui.json
  • 20:34 cjming@deploy1002: Finished scap: Backport for cirrus: Move remaining public writes to SUP (T363475) (duration: 12m 11s)
  • 20:31 eileen: config revision changed from 6c4cd6c2 to 5b0b4d22 revert schedule
  • 20:25 cjming@deploy1002: cjming and ebernhardson: Continuing with sync
  • 20:25 cjming@deploy1002: cjming and ebernhardson: Backport for cirrus: Move remaining public writes to SUP (T363475) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:22 cjming@deploy1002: Started scap: Backport for cirrus: Move remaining public writes to SUP (T363475)
  • 20:21 eileen: revert to Smarty 2 revision changed from de15d068 to cdc89b59
  • 20:20 cjming@deploy1002: Finished scap: Backport for deploy(Popups): Make use of conditional user defaults (T364347) (duration: 15m 52s)
  • 20:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T364299)', diff saved to https://phabricator.wikimedia.org/P63478 and previous config saved to /var/cache/conftool/dbconfig/20240528-201945-marostegui.json
  • 20:11 cjming@deploy1002: mabualruz and cjming: Continuing with sync
  • 20:08 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 20:07 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 20:07 cjming@deploy1002: mabualruz and cjming: Backport for deploy(Popups): Make use of conditional user defaults (T364347) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:05 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 20:05 cjming@deploy1002: Started scap: Backport for deploy(Popups): Make use of conditional user defaults (T364347)
  • 19:45 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_eqiad
  • 19:45 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_eqiad
  • 19:36 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1009.eqiad.wmnet with OS bullseye
  • 19:30 herron: disable swap on grafana1002
  • 19:28 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:28 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:26 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:26 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:24 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:24 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:24 herron: ganeti1027:~$ sudo gnt-instance reboot grafana1002
  • 19:22 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:22 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:20 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:20 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:19 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 19:19 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:18 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:01 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host elastic1056.eqiad.wmnet
  • 19:00 marostegui@cumin1002: dbctl commit (dc=all): 'db1211 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P63476 and previous config saved to /var/cache/conftool/dbconfig/20240528-190021-root.json
  • 18:54 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host elastic1056.eqiad.wmnet
  • 18:54 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:54 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:53 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on elastic1056.eqiad.wmnet with reason: rebooting after abnormally high load
  • 18:53 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on elastic1056.eqiad.wmnet with reason: rebooting after abnormally high load
  • 18:51 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:51 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:50 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1009.eqiad.wmnet with OS bullseye
  • 18:47 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:45 dancy@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.43.0-wmf.7 refs T361401
  • 18:45 marostegui@cumin1002: dbctl commit (dc=all): 'db1211 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P63475 and previous config saved to /var/cache/conftool/dbconfig/20240528-184515-root.json
  • 18:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1174 (T364299)', diff saved to https://phabricator.wikimedia.org/P63474 and previous config saved to /var/cache/conftool/dbconfig/20240528-184110-marostegui.json
  • 18:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 18:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 18:37 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:36 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:35 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:30 marostegui@cumin1002: dbctl commit (dc=all): 'db1211 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P63473 and previous config saved to /var/cache/conftool/dbconfig/20240528-183009-root.json
  • 18:21 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:20 dancy@deploy1002: sync-world aborted: Backport for Remove the php symlink (T359643) (duration: 00m 30s)
  • 18:19 dancy@deploy1002: Started scap: Backport for Remove the php symlink (T359643)
  • 18:19 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:17 dancy@deploy1002: sync-world aborted: Backport for Remove the php symlink (T359643) (duration: 01m 00s)
  • 18:16 dancy@deploy1002: Started scap: Backport for Remove the php symlink (T359643)
  • 18:16 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 18:15 marostegui@cumin1002: dbctl commit (dc=all): 'db1211 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P63472 and previous config saved to /var/cache/conftool/dbconfig/20240528-181503-root.json
  • 17:59 marostegui@cumin1002: dbctl commit (dc=all): 'db1211 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P63471 and previous config saved to /var/cache/conftool/dbconfig/20240528-175954-root.json
  • 17:58 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 17:56 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 17:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63470 and previous config saved to /var/cache/conftool/dbconfig/20240528-175638-arnaudb.json
  • 17:44 marostegui@cumin1002: dbctl commit (dc=all): 'db1211 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P63469 and previous config saved to /var/cache/conftool/dbconfig/20240528-174448-root.json
  • 17:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63468 and previous config saved to /var/cache/conftool/dbconfig/20240528-174131-arnaudb.json
  • 17:41 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 17:41 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 17:39 ladsgroup@deploy1002: Finished scap: Backport for Set zhwiki to read new for pagelinks migration (T351237) (duration: 11m 48s)
  • 17:30 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 17:30 ladsgroup@deploy1002: ladsgroup: Backport for Set zhwiki to read new for pagelinks migration (T351237) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:29 marostegui@cumin1002: dbctl commit (dc=all): 'db1211 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P63467 and previous config saved to /var/cache/conftool/dbconfig/20240528-172942-root.json
  • 17:27 ladsgroup@deploy1002: Started scap: Backport for Set zhwiki to read new for pagelinks migration (T351237)
  • 17:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63466 and previous config saved to /var/cache/conftool/dbconfig/20240528-172625-arnaudb.json
  • 17:25 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: elastic1056.eqiad.wmnet for ban highly-loaded node - bking@cumin2002
  • 17:25 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: elastic1056.eqiad.wmnet for ban highly-loaded node - bking@cumin2002
  • 17:25 dduvall: removing blubberoid from eqiad, `helmfile -e eqiad destroy` (T365742)
  • 17:25 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: elastic1056 for ban highly-loaded node - bking@cumin2002
  • 17:25 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: elastic1056 for ban highly-loaded node - bking@cumin2002
  • 17:24 sukhe: sudo -i puppet cert clean blubberoid.discovery.wmnet: T365742
  • 17:24 dduvall: removing blubberoid from codfw, `helmfile -e codfw destroy` (T365742)
  • 17:21 dduvall: removing blubberoid from staging, `helmfile -e staging destroy` (T365742)
  • 17:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1211.eqiad.wmnet with OS bookworm
  • 17:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63464 and previous config saved to /var/cache/conftool/dbconfig/20240528-171119-arnaudb.json
  • 17:09 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2014.codfw.wmnet
  • 17:09 sukhe@cumin1002: START - Cookbook sre.hosts.remove-downtime for lvs2014.codfw.wmnet
  • 17:09 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1020.eqiad.wmnet
  • 17:09 sukhe@cumin1002: START - Cookbook sre.hosts.remove-downtime for lvs1020.eqiad.wmnet
  • 17:08 sukhe: sudo cumin 'A:lvs-secondary-eqiad or A:lvs-low-traffic-eqiad' 'ipvsadm --delete-service --tcp-service 10.2.2.31:4666': T365742
  • 17:03 sukhe: removing blubberoid's IP from ipvsadm: T365742
  • 16:59 sukhe@cumin1002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-secondary-eqiad or A:lvs-secondary-codfw and A:lvs (T365742)
  • 16:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1211.eqiad.wmnet with reason: host reimage
  • 16:57 sukhe: sudo cumin 'A:lvs-low-traffic-eqiad or A:lvs-low-traffic-codfw' 'systemctl restart pybal.service'
  • 16:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63463 and previous config saved to /var/cache/conftool/dbconfig/20240528-165612-arnaudb.json
  • 16:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1211.eqiad.wmnet with reason: host reimage
  • 16:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2136 (T364069)', diff saved to https://phabricator.wikimedia.org/P63462 and previous config saved to /var/cache/conftool/dbconfig/20240528-165002-marostegui.json
  • 16:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 16:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 16:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 16:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 16:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T364299)', diff saved to https://phabricator.wikimedia.org/P63461 and previous config saved to /var/cache/conftool/dbconfig/20240528-164902-marostegui.json
  • 16:47 sukhe@cumin1002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-secondary-eqiad or A:lvs-secondary-codfw and A:lvs (T365742)
  • 16:43 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 16:41 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1211.eqiad.wmnet with OS bookworm
  • 16:41 sukhe: sudo cumin 'O:lvs::balancer' 'run-puppet-agent': T365742
  • 16:41 sukhe: cumin 'O:lvs::balancer' 'run-puppet-agent'
  • 16:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 5%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63459 and previous config saved to /var/cache/conftool/dbconfig/20240528-164106-arnaudb.json
  • 16:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1211', diff saved to https://phabricator.wikimedia.org/P63458 and previous config saved to /var/cache/conftool/dbconfig/20240528-163810-marostegui.json
  • 16:36 arnaudb@cumin1002: dbctl commit (dc=all): 'db1207 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63457 and previous config saved to /var/cache/conftool/dbconfig/20240528-163647-arnaudb.json
  • 16:34 sukhe: running run-puppet-agent on A:dnsbox
  • 16:33 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1206.eqiad.wmnet with OS bookworm
  • 16:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P63455 and previous config saved to /var/cache/conftool/dbconfig/20240528-163353-marostegui.json
  • 16:33 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 16:21 arnaudb@cumin1002: dbctl commit (dc=all): 'db1207 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63454 and previous config saved to /var/cache/conftool/dbconfig/20240528-162141-arnaudb.json
  • 16:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P63453 and previous config saved to /var/cache/conftool/dbconfig/20240528-161845-marostegui.json
  • 16:17 ladsgroup@deploy1002: Finished scap: Backport for x-wikimedia-debug: add datacenter options for k8s (T365478) (duration: 12m 00s)
  • 16:14 Lucas_WMDE: lucaswerkmeister-wmde@stat1011:~$ sudo -u analytics-wmde rm -rf /srv/analytics-wmde/wdcm/ # T364965; contained src/ as a clean git clone as of c2b0a324e9 / I024691a148, and nothing else
  • 16:12 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1206.eqiad.wmnet with reason: host reimage
  • 16:12 hnowlan: kubectl node uncordon wikikube-worker2002.codfw.wmnet
  • 16:11 hnowlan@cumin1002: conftool action : set/pooled=yes:weight=10; selector: name=wikikube-worker2002.codfw.wmnet
  • 16:10 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1206.eqiad.wmnet with reason: host reimage
  • 16:10 dani@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 16:09 dani@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 16:09 dani@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 16:08 dani@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 16:08 dani@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 16:08 dani@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 16:08 ladsgroup@deploy1002: ladsgroup and jiji: Continuing with sync
  • 16:08 ladsgroup@deploy1002: ladsgroup and jiji: Backport for x-wikimedia-debug: add datacenter options for k8s (T365478) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:06 arnaudb@cumin1002: dbctl commit (dc=all): 'db1207 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63452 and previous config saved to /var/cache/conftool/dbconfig/20240528-160635-arnaudb.json
  • 16:05 ladsgroup@deploy1002: Started scap: Backport for x-wikimedia-debug: add datacenter options for k8s (T365478)
  • 16:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T364299)', diff saved to https://phabricator.wikimedia.org/P63451 and previous config saved to /var/cache/conftool/dbconfig/20240528-160337-marostegui.json
  • 16:01 ladsgroup@deploy1002: Finished scap: Backport for Create electionadmin group on testwiki (T209892) (duration: 17m 48s)
  • 16:00 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 16:00 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 15:55 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1206.eqiad.wmnet with OS bookworm
  • 15:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1206.eqiad.wmnet with reason: reimage
  • 15:53 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on db1206.eqiad.wmnet with reason: reimage
  • 15:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1206 T364290', diff saved to https://phabricator.wikimedia.org/P63449 and previous config saved to /var/cache/conftool/dbconfig/20240528-155309-arnaudb.json
  • 15:52 ejegg: fundraising civicrm upgraded from 4dd78bcc to 3fee95bc
  • 15:51 arnaudb@cumin1002: dbctl commit (dc=all): 'db1207 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63448 and previous config saved to /var/cache/conftool/dbconfig/20240528-155129-arnaudb.json
  • 15:50 hnowlan: ran `sudo puppet node deactivate kubernetes2032.codfw.wmnet` to fix renamed host erroring in scap
  • 15:48 ladsgroup@deploy1002: tstarling and ladsgroup: Continuing with sync
  • 15:48 ladsgroup@deploy1002: tstarling and ladsgroup: Backport for Create electionadmin group on testwiki (T209892) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:45 sukhe: sudo cumin -b1 -s120 'A:dnsbox and not P{dns6001*}' 'run-puppet-agent --enable "merging CR 1034476"'
  • 15:45 brennen@deploy1002: Finished deploy [phabricator/deployment@e7093e2]: deploy phab1004 for T366075 (duration: 00m 32s)
  • 15:44 brennen@deploy1002: Started deploy [phabricator/deployment@e7093e2]: deploy phab1004 for T366075
  • 15:44 brennen@deploy1002: Finished deploy [phabricator/deployment@e7093e2]: deploy phab2002 for T366075 (duration: 00m 33s)
  • 15:44 ladsgroup@deploy1002: Started scap: Backport for Create electionadmin group on testwiki (T209892)
  • 15:43 brennen@deploy1002: Started deploy [phabricator/deployment@e7093e2]: deploy phab2002 for T366075
  • 15:41 dzahn@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:30:00 on phabricator.wikimedia.org with reason: phabricator deploy
  • 15:41 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phabricator.wikimedia.org with reason: phabricator deploy
  • 15:40 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab.wmfusercontent.org with reason: phabricator deploy
  • 15:40 jiji@deploy1002: Unlocked for deployment [ALL REPOSITORIES]: Kubernetes masters trouble - no deployments - serviceops (duration: 114m 39s)
  • 15:40 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab.wmfusercontent.org with reason: phabricator deploy
  • 15:39 dzahn@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:30:00 on phabricator.wikimedia.org with reason: phabricator deploy
  • 15:39 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phabricator.wikimedia.org with reason: phabricator deploy
  • 15:38 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: phabricator deploy
  • 15:38 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab2002.codfw.wmnet with reason: phabricator deploy
  • 15:38 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: phabricator deploy
  • 15:38 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab1004.eqiad.wmnet with reason: phabricator deploy
  • 15:36 arnaudb@cumin1002: dbctl commit (dc=all): 'db1207 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63447 and previous config saved to /var/cache/conftool/dbconfig/20240528-153622-arnaudb.json
  • 15:35 sukhe: sudo cumin 'A:dnsbox' 'disable-puppet "merging CR 1034476"'
  • 15:32 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2002.codfw.wmnet with OS bullseye
  • 15:31 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 15:30 ejegg: fundraising civicrm upgraded from 7e998894 to 4dd78bcc
  • 15:29 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 15:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1206.eqiad.wmnet
  • 15:18 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1206.eqiad.wmnet
  • 15:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1186.eqiad.wmnet
  • 15:13 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 15:12 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2002.codfw.wmnet with reason: host reimage
  • 15:12 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 15:09 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2002.codfw.wmnet with reason: host reimage
  • 15:07 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1207.eqiad.wmnet with OS bookworm
  • 15:06 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 15:05 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1186.eqiad.wmnet
  • 15:05 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 14:56 akosiaris: migrate kubemaster1002 to ganeti1037
  • 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1184.eqiad.wmnet
  • 14:50 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host <spicerack.netbox.NetboxServer object at 0x7f5406426910>
  • 14:50 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2002
  • 14:49 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2002
  • 14:49 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2002.codfw.wmnet 223.16.192.10.in-addr.arpa 3.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:49 hnowlan@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2002.codfw.wmnet 223.16.192.10.in-addr.arpa 3.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:49 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:49 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2002 - hnowlan@cumin1002"
  • 14:48 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2002 - hnowlan@cumin1002"
  • 14:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1207.eqiad.wmnet with reason: host reimage
  • 14:44 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 14:44 akosiaris: gnt-instance replace-disks for kubemaster1002, set ganeti1037 as a secondary
  • 14:43 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1207.eqiad.wmnet with reason: host reimage
  • 14:37 akosiaris: reboot kubemaster1001 with 8 vcpus for consistency with kubemaster1002.
  • 14:37 akosiaris: repool kubemaster1001 with 8 vcpus for consistency with kubemaster1002.
  • 14:31 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 14:30 akosiaris: repool kubemaster1001, testing something
  • 14:29 akosiaris@cumin1002: conftool action : set/pooled=yes; selector: service=kubemaster,dc=eqiad,cluster=kubernetes,name=kubemaster1001.eqiad.wmnet
  • 14:29 akosiaris: depool kubemaster1001, its CPU is saturated after a test roll restart
  • 14:29 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1207.eqiad.wmnet with OS bookworm
  • 14:28 arnaudb@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host db1207.eqiad.wmnet with OS bookworm
  • 14:28 akosiaris@cumin1002: conftool action : set/pooled=no; selector: service=kubemaster,dc=eqiad,cluster=kubernetes,name=kubemaster1001.eqiad.wmnet
  • 14:27 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 14:26 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1184.eqiad.wmnet
  • 14:25 effie: enabling puppet on wikikube-ctrl100[1-2]*
  • 14:24 ejegg: fundraising civicrm upgraded from e2dc8f4e to 7e998894
  • 14:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db1218 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63444 and previous config saved to /var/cache/conftool/dbconfig/20240528-142431-arnaudb.json
  • 14:21 akosiaris@cumin1002: conftool action : set/weight=10; selector: service=kubemaster,dc=eqiad,cluster=kubernetes,name=kubemaster1002.eqiad.wmnet
  • 14:19 akosiaris: add another 4 vcpus to kubemaster1002
  • 14:11 akosiaris: restart kube-apiserver on kubemaster1002
  • 14:09 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 14:09 arnaudb@cumin1002: dbctl commit (dc=all): 'db1218 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63442 and previous config saved to /var/cache/conftool/dbconfig/20240528-140925-arnaudb.json
  • 14:08 akosiaris@cumin1002: conftool action : set/weight=1; selector: service=kubemaster,dc=eqiad,cluster=kubernetes,name=kubemaster1002.eqiad.wmnet
  • 14:07 akosiaris@cumin1002: conftool action : set/weight=5; selector: service=kubemaster,dc=eqiad,cluster=kubernetes,name=kubemaster1002.eqiad.wmnet
  • 14:04 akosiaris: roll restart mw-api-int pods
  • 14:03 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 14:03 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 14:03 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 14:01 akosiaris: remove wikikube-ctrl1002 from the rotation to test a theory
  • 14:01 akosiaris@cumin1002: conftool action : set/pooled=no; selector: service=kubemaster,dc=eqiad,cluster=kubernetes,name=wikikube-ctrl1001.eqiad.wmnet
  • 13:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1170 (T364299)', diff saved to https://phabricator.wikimedia.org/P63440 and previous config saved to /var/cache/conftool/dbconfig/20240528-135912-marostegui.json
  • 13:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 13:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 13:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T364299)', diff saved to https://phabricator.wikimedia.org/P63439 and previous config saved to /var/cache/conftool/dbconfig/20240528-135848-marostegui.json
  • 13:57 ejegg: fundraising civicrm upgraded from 6c1fdd4f to e2dc8f4e
  • 13:55 moritzm: installing pillow security updates
  • 13:54 arnaudb@cumin1002: dbctl commit (dc=all): 'db1218 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63438 and previous config saved to /var/cache/conftool/dbconfig/20240528-135419-arnaudb.json
  • 13:54 akosiaris: add manually ferm client rule on wikikube-ctrl1002 and disable puppet
  • 13:51 akosiaris: run puppet and restart ferm on wikikube-ctrl1001
  • 13:51 akosiaris: run puppet and restart ferm
  • 13:46 jiji@deploy1002: Locking from deployment [ALL REPOSITORIES]: Kubernetes masters trouble - no deployments - serviceops
  • 13:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P63437 and previous config saved to /var/cache/conftool/dbconfig/20240528-134341-marostegui.json
  • 13:43 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for Create electionadmin group on testwiki (T209892) (duration: 34m 29s)
  • 13:42 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1207.eqiad.wmnet with OS bookworm
  • 13:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1207.eqiad.wmnet with reason: reimage
  • 13:42 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on db1207.eqiad.wmnet with reason: reimage
  • 13:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1207 T364290', diff saved to https://phabricator.wikimedia.org/P63436 and previous config saved to /var/cache/conftool/dbconfig/20240528-134150-arnaudb.json
  • 13:39 arnaudb@cumin1002: dbctl commit (dc=all): 'db1218 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63435 and previous config saved to /var/cache/conftool/dbconfig/20240528-133913-arnaudb.json
  • 13:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1169.eqiad.wmnet
  • 13:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P63434 and previous config saved to /var/cache/conftool/dbconfig/20240528-132833-marostegui.json
  • 13:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db1218 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63433 and previous config saved to /var/cache/conftool/dbconfig/20240528-132407-arnaudb.json
  • 13:20 sukhe: sudo cumin -b1 -s120 'A:dnsbox and not P{dns6001*}' 'run-puppet-agent --enable "merging CR 1036644"'
  • 13:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T364299)', diff saved to https://phabricator.wikimedia.org/P63432 and previous config saved to /var/cache/conftool/dbconfig/20240528-131325-marostegui.json
  • 13:12 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde and tstarling: Continuing with sync
  • 13:11 moritzm: installing bzip2 bugfix updates
  • 13:11 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde and tstarling: Backport for Create electionadmin group on testwiki (T209892) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:09 sukhe: sudo cumin 'A:dnsbox' 'disable-puppet "merging CR 1036644"'
  • 13:09 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for Create electionadmin group on testwiki (T209892)
  • 13:06 moritzm: installing man-db bugfix updates
  • 13:04 hnowlan@cumin1002: START - Cookbook sre.hosts.move-vlan for host <spicerack.netbox.NetboxServer object at 0x7f5406426910>
  • 13:04 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2002.codfw.wmnet with OS bullseye
  • 13:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1218.eqiad.wmnet with OS bookworm
  • 12:58 vgutierrez: testing fifo-log-demux 0.7.5 on cp3081 and cp3073
  • 12:52 moritzm: installing python-urllib3 security updates
  • 12:51 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub: sync on production
  • 12:47 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub: apply on production
  • 12:47 tstarling@deploy1002: Synchronized wmf-config/core-Permissions.php: create electionadmin group on testwiki T209892 (attempt 2 after k8s-related rollback) (duration: 16m 02s)
  • 12:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1218.eqiad.wmnet with reason: host reimage
  • 12:40 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=thanos-fe1002.eqiad.wmnet
  • 12:38 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1218.eqiad.wmnet with reason: host reimage
  • 12:38 elukey: move thanos-fe1002's envoy TLS cert to CFSSL/PKI - T344324
  • 12:37 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=thanos-fe1002.eqiad.wmnet
  • 12:30 tstarling@deploy1002: Synchronized wmf-config/core-Permissions.php: create electionadmin group on testwiki T209892 (duration: 31m 52s)
  • 12:27 moritzm: installing jetty9 security updates
  • 12:25 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1218.eqiad.wmnet with OS bookworm
  • 12:25 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1218.eqiad.wmnet with reason: reimage
  • 12:24 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on db1218.eqiad.wmnet with reason: reimage
  • 12:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1218 T364290', diff saved to https://phabricator.wikimedia.org/P63431 and previous config saved to /var/cache/conftool/dbconfig/20240528-122442-arnaudb.json
  • 12:18 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2002.codfw.wmnet with OS bullseye
  • 12:18 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2002.codfw.wmnet with OS bullseye
  • 12:13 jiji@cumin1002: conftool action : set/pooled=yes:weight=10; selector: name=wikikube-ctrl1001.eqiad.wmnet
  • 12:12 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2032 to wikikube-worker2002
  • 12:12 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2002
  • 12:11 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2002
  • 12:11 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:11 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2032 to wikikube-worker2002 - hnowlan@cumin1002"
  • 12:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1158 (T364299)', diff saved to https://phabricator.wikimedia.org/P63429 and previous config saved to /var/cache/conftool/dbconfig/20240528-121037-marostegui.json
  • 12:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 12:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 12:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 12:10 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2032 to wikikube-worker2002 - hnowlan@cumin1002"
  • 12:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 12:09 moritzm: installing glib2.0 security updates
  • 12:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 12:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 12:07 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 12:07 hnowlan@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2032 to wikikube-worker2002
  • 12:07 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1169.eqiad.wmnet
  • 12:05 marostegui@cumin1002: dbctl commit (dc=all): 'db1243 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P63428 and previous config saved to /var/cache/conftool/dbconfig/20240528-120503-root.json
  • 11:51 hnowlan@cumin1002: conftool action : set/pooled=yes:weight=10; selector: name=wikikube-worker2001.codfw.wmnet
  • 11:49 marostegui@cumin1002: dbctl commit (dc=all): 'db1243 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P63426 and previous config saved to /var/cache/conftool/dbconfig/20240528-114957-root.json
  • 11:34 marostegui@cumin1002: dbctl commit (dc=all): 'db1243 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P63425 and previous config saved to /var/cache/conftool/dbconfig/20240528-113451-root.json
  • 11:32 Dreamy_Jazz: Restarted MediaModeration scanning script - https://wikitech.wikimedia.org/wiki/MediaModeration
  • 11:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1163.eqiad.wmnet
  • 11:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1243 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P63424 and previous config saved to /var/cache/conftool/dbconfig/20240528-111946-root.json
  • 11:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db1219 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63423 and previous config saved to /var/cache/conftool/dbconfig/20240528-110817-arnaudb.json
  • 11:04 marostegui@cumin1002: dbctl commit (dc=all): 'db1243 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P63422 and previous config saved to /var/cache/conftool/dbconfig/20240528-110440-root.json
  • 11:00 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1163.eqiad.wmnet
  • 10:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db1219 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63421 and previous config saved to /var/cache/conftool/dbconfig/20240528-105311-arnaudb.json
  • 10:49 marostegui@cumin1002: dbctl commit (dc=all): 'db1243 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P63420 and previous config saved to /var/cache/conftool/dbconfig/20240528-104934-root.json
  • 10:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db1219 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63419 and previous config saved to /var/cache/conftool/dbconfig/20240528-103805-arnaudb.json
  • 10:34 marostegui@cumin1002: dbctl commit (dc=all): 'db1243 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P63418 and previous config saved to /var/cache/conftool/dbconfig/20240528-103428-root.json
  • 10:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 10:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 10:23 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2216.codfw.wmnet
  • 10:22 arnaudb@cumin1002: dbctl commit (dc=all): 'db1219 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63417 and previous config saved to /var/cache/conftool/dbconfig/20240528-102259-arnaudb.json
  • 10:21 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2049.codfw.wmnet with OS bookworm
  • 10:18 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 10:08 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 10:07 arnaudb@cumin1002: dbctl commit (dc=all): 'db1219 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63416 and previous config saved to /var/cache/conftool/dbconfig/20240528-100752-arnaudb.json
  • 10:05 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2216.codfw.wmnet
  • 10:02 moritzm: installing jinja2 security updates
  • 09:57 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2049.codfw.wmnet with reason: host reimage
  • 09:54 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2049.codfw.wmnet with reason: host reimage
  • 09:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db1228 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63415 and previous config saved to /var/cache/conftool/dbconfig/20240528-095058-arnaudb.json
  • 09:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2212.codfw.wmnet
  • 09:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1219.eqiad.wmnet with OS bookworm
  • 09:45 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:45 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:43 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:43 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:39 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1243.eqiad.wmnet with reason: unknown lag
  • 09:39 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on db1243.eqiad.wmnet with reason: unknown lag
  • 09:38 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2212.codfw.wmnet
  • 09:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db1228 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63414 and previous config saved to /var/cache/conftool/dbconfig/20240528-093552-arnaudb.json
  • 09:35 zabe@deploy1002: Finished scap: Backport for Stop writing to af_user(_text)/afh_user(_text) everywhere (T337920), Update interwiki cache (duration: 17m 49s)
  • 09:35 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2049.codfw.wmnet with OS bookworm
  • 09:34 jiji@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts mc2049.codfw.wmnet
  • 09:33 jiji@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mc2049.codfw.wmnet
  • 09:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T364299)', diff saved to https://phabricator.wikimedia.org/P63413 and previous config saved to /var/cache/conftool/dbconfig/20240528-093344-marostegui.json
  • 09:24 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1219.eqiad.wmnet with reason: host reimage
  • 09:21 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1219.eqiad.wmnet with reason: host reimage
  • 09:21 zabe@deploy1002: zabe: Continuing with sync
  • 09:20 arnaudb@cumin1002: dbctl commit (dc=all): 'db1228 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63412 and previous config saved to /var/cache/conftool/dbconfig/20240528-092046-arnaudb.json
  • 09:20 zabe@deploy1002: zabe: Backport for Stop writing to af_user(_text)/afh_user(_text) everywhere (T337920), Update interwiki cache synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P63411 and previous config saved to /var/cache/conftool/dbconfig/20240528-091836-marostegui.json
  • 09:17 zabe@deploy1002: Started scap: Backport for Stop writing to af_user(_text)/afh_user(_text) everywhere (T337920), Update interwiki cache
  • 09:14 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for gerrit1003.wikimedia.org
  • 09:14 jelto@cumin1002: START - Cookbook sre.hosts.remove-downtime for gerrit1003.wikimedia.org
  • 09:14 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for gerrit2002.wikimedia.org
  • 09:14 jelto@cumin1002: START - Cookbook sre.hosts.remove-downtime for gerrit2002.wikimedia.org
  • 09:13 zabe: zabe@mwmaint1002:~$ mwscript extensions/CirrusSearch/maintenance/UpdateSearchIndexConfig.php --wiki=dtpwiki --cluster=all 2>&1 | tee /tmp/dtpwiki.UpdateSearchIndexConfig.log # T365220
  • 09:09 zabe@deploy1002: Finished scap: T365220 (duration: 19m 22s)
  • 09:08 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1219.eqiad.wmnet with OS bookworm
  • 09:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1219 T364290', diff saved to https://phabricator.wikimedia.org/P63410 and previous config saved to /var/cache/conftool/dbconfig/20240528-090724-arnaudb.json
  • 09:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db1228 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63409 and previous config saved to /var/cache/conftool/dbconfig/20240528-090538-arnaudb.json
  • 09:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P63408 and previous config saved to /var/cache/conftool/dbconfig/20240528-090328-marostegui.json
  • 09:03 jiji@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host mc2049.codfw.wmnet
  • 08:58 jiji@cumin2002: START - Cookbook sre.hosts.reboot-single for host mc2049.codfw.wmnet
  • 08:55 zabe@deploy1002: zabe: Continuing with sync
  • 08:54 zabe@deploy1002: zabe: T365220 synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:53 jiji@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts mc2049.codfw.wmnet
  • 08:52 jiji@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host mc2049.codfw.wmnet
  • 08:52 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1219.eqiad.wmnet with reason: reimage
  • 08:52 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on db1219.eqiad.wmnet with reason: reimage
  • 08:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db1228 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63407 and previous config saved to /var/cache/conftool/dbconfig/20240528-085032-arnaudb.json
  • 08:50 zabe@deploy1002: Started scap: T365220
  • 08:49 hashar: Upgraded gerrit.wikimedia.org from Gerrit 3.8.5 to 3.8.6 # T365328
  • 08:48 zabe: create Wikipedia Central Dusun # T365220
  • 08:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T364299)', diff saved to https://phabricator.wikimedia.org/P63406 and previous config saved to /var/cache/conftool/dbconfig/20240528-084820-marostegui.json
  • 08:47 hashar@deploy1002: Finished deploy [gerrit/gerrit@c93e47d]: Gerrit to v3.8.6 on gerrit1003 - T365328 (duration: 00m 05s)
  • 08:47 hashar@deploy1002: Started deploy [gerrit/gerrit@c93e47d]: Gerrit to v3.8.6 on gerrit1003 - T365328
  • 08:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1228.eqiad.wmnet with OS bookworm
  • 08:45 hashar: Upgraded gerrit-replica.wikimedia.org from Gerrit 3.8.5 to 3.8.6 # T365328
  • 08:37 jiji@cumin2002: START - Cookbook sre.hosts.reboot-single for host mc2049.codfw.wmnet
  • 08:33 hashar@deploy1002: Finished deploy [gerrit/gerrit@c93e47d]: Gerrit to v3.8.6 on gerrit2002 - T365328 (duration: 00m 08s)
  • 08:33 hashar@deploy1002: Started deploy [gerrit/gerrit@c93e47d]: Gerrit to v3.8.6 on gerrit2002 - T365328
  • 08:27 moritzm: imported jenkins to 2.452.1 in component thirdparty/ci T366008
  • 08:25 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1228.eqiad.wmnet with reason: host reimage
  • 08:23 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gerrit1003.wikimedia.org with reason: Gerrit patchset upgrade
  • 08:23 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on gerrit1003.wikimedia.org with reason: Gerrit patchset upgrade
  • 08:23 jiji@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mc2049.codfw.wmnet
  • 08:23 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gerrit2002.wikimedia.org with reason: Gerrit patchset upgrade
  • 08:23 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on gerrit2002.wikimedia.org with reason: Gerrit patchset upgrade
  • 08:22 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1228.eqiad.wmnet with reason: host reimage
  • 08:22 jiji@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mc2049.codfw.wmnet with OS bookworm
  • 08:12 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2049.codfw.wmnet with reason: host reimage
  • 08:10 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1228.eqiad.wmnet with OS bookworm
  • 08:09 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2001.codfw.wmnet with OS bullseye
  • 08:09 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2049.codfw.wmnet with reason: host reimage
  • 08:09 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1228.eqiad.wmnet with reason: reimage
  • 08:08 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on db1228.eqiad.wmnet with reason: reimage
  • 08:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1228 T364290', diff saved to https://phabricator.wikimedia.org/P63404 and previous config saved to /var/cache/conftool/dbconfig/20240528-080835-arnaudb.json
  • 08:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2203.codfw.wmnet
  • 07:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1243.eqiad.wmnet with OS bookworm
  • 07:51 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2049.codfw.wmnet with OS bookworm
  • 07:51 jiji@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc2049.codfw.wmnet with OS bookworm
  • 07:49 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2001.codfw.wmnet with reason: host reimage
  • 07:46 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2001.codfw.wmnet with reason: host reimage
  • 07:42 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2203.codfw.wmnet
  • 07:41 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 07:38 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 07:35 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2049.codfw.wmnet with OS bookworm
  • 07:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1243.eqiad.wmnet with reason: host reimage
  • 07:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1243.eqiad.wmnet with reason: host reimage
  • 07:30 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2001.codfw.wmnet with OS bullseye
  • 07:23 kartik@deploy1002: Finished scap: Backport for Section Translation: Enable in newly created Wikipedias (T366003) (duration: 19m 51s)
  • 07:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2209 (T364299)', diff saved to https://phabricator.wikimedia.org/P63403 and previous config saved to /var/cache/conftool/dbconfig/20240528-072006-marostegui.json
  • 07:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2209.codfw.wmnet with reason: Maintenance
  • 07:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2209.codfw.wmnet with reason: Maintenance
  • 07:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T364299)', diff saved to https://phabricator.wikimedia.org/P63402 and previous config saved to /var/cache/conftool/dbconfig/20240528-071942-marostegui.json
  • 07:17 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1243.eqiad.wmnet with OS bookworm
  • 07:07 kartik@deploy1002: kartik: Continuing with sync
  • 07:06 kartik@deploy1002: kartik: Backport for Section Translation: Enable in newly created Wikipedias (T366003) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P63401 and previous config saved to /var/cache/conftool/dbconfig/20240528-070434-marostegui.json
  • 07:03 kartik@deploy1002: Started scap: Backport for Section Translation: Enable in newly created Wikipedias (T366003)
  • 06:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P63400 and previous config saved to /var/cache/conftool/dbconfig/20240528-064926-marostegui.json
  • 06:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T364299)', diff saved to https://phabricator.wikimedia.org/P63399 and previous config saved to /var/cache/conftool/dbconfig/20240528-063417-marostegui.json
  • 05:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 05:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 05:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2213 (T364069)', diff saved to https://phabricator.wikimedia.org/P63398 and previous config saved to /var/cache/conftool/dbconfig/20240528-052952-marostegui.json
  • 05:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P63397 and previous config saved to /var/cache/conftool/dbconfig/20240528-051444-marostegui.json
  • 05:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2194 (T364299)', diff saved to https://phabricator.wikimedia.org/P63396 and previous config saved to /var/cache/conftool/dbconfig/20240528-050527-marostegui.json
  • 05:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2194.codfw.wmnet with reason: Maintenance
  • 05:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2194.codfw.wmnet with reason: Maintenance
  • 05:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T364299)', diff saved to https://phabricator.wikimedia.org/P63395 and previous config saved to /var/cache/conftool/dbconfig/20240528-050504-marostegui.json
  • 04:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P63394 and previous config saved to /var/cache/conftool/dbconfig/20240528-045936-marostegui.json
  • 04:50 dani@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 04:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P63393 and previous config saved to /var/cache/conftool/dbconfig/20240528-044955-marostegui.json
  • 04:49 dani@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 04:49 dani@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 04:48 dani@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 04:48 dani@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 04:48 dani@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 04:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2213 (T364069)', diff saved to https://phabricator.wikimedia.org/P63392 and previous config saved to /var/cache/conftool/dbconfig/20240528-044428-marostegui.json
  • 04:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P63391 and previous config saved to /var/cache/conftool/dbconfig/20240528-043446-marostegui.json
  • 04:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T364299)', diff saved to https://phabricator.wikimedia.org/P63390 and previous config saved to /var/cache/conftool/dbconfig/20240528-041937-marostegui.json
  • 04:03 mwpresync@deploy1002: Pruned MediaWiki: 1.43.0-wmf.4 (duration: 03m 44s)
  • 04:02 mwpresync@deploy1002: Finished scap: testwikis wikis to 1.43.0-wmf.7 refs T361401 (duration: 59m 56s)
  • 03:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2213 (T364069)', diff saved to https://phabricator.wikimedia.org/P63389 and previous config saved to /var/cache/conftool/dbconfig/20240528-035915-marostegui.json
  • 03:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
  • 03:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
  • 03:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T364069)', diff saved to https://phabricator.wikimedia.org/P63388 and previous config saved to /var/cache/conftool/dbconfig/20240528-035852-marostegui.json
  • 03:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P63387 and previous config saved to /var/cache/conftool/dbconfig/20240528-034344-marostegui.json
  • 03:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P63386 and previous config saved to /var/cache/conftool/dbconfig/20240528-032835-marostegui.json
  • 03:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T364069)', diff saved to https://phabricator.wikimedia.org/P63385 and previous config saved to /var/cache/conftool/dbconfig/20240528-031327-marostegui.json
  • 03:02 mwpresync@deploy1002: Started scap: testwikis wikis to 1.43.0-wmf.7 refs T361401
  • 02:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2190 (T364299)', diff saved to https://phabricator.wikimedia.org/P63384 and previous config saved to /var/cache/conftool/dbconfig/20240528-025234-marostegui.json
  • 02:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 02:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 02:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T364299)', diff saved to https://phabricator.wikimedia.org/P63383 and previous config saved to /var/cache/conftool/dbconfig/20240528-025211-marostegui.json
  • 02:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P63382 and previous config saved to /var/cache/conftool/dbconfig/20240528-023703-marostegui.json
  • 02:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2211 (T364069)', diff saved to https://phabricator.wikimedia.org/P63381 and previous config saved to /var/cache/conftool/dbconfig/20240528-022627-marostegui.json
  • 02:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2211.codfw.wmnet with reason: Maintenance
  • 02:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2211.codfw.wmnet with reason: Maintenance
  • 02:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P63380 and previous config saved to /var/cache/conftool/dbconfig/20240528-022155-marostegui.json
  • 02:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T364299)', diff saved to https://phabricator.wikimedia.org/P63379 and previous config saved to /var/cache/conftool/dbconfig/20240528-020647-marostegui.json
  • 01:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2201.codfw.wmnet with reason: Maintenance
  • 01:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2201.codfw.wmnet with reason: Maintenance
  • 01:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T364069)', diff saved to https://phabricator.wikimedia.org/P63378 and previous config saved to /var/cache/conftool/dbconfig/20240528-013123-marostegui.json
  • 01:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P63377 and previous config saved to /var/cache/conftool/dbconfig/20240528-011615-marostegui.json
  • 01:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P63376 and previous config saved to /var/cache/conftool/dbconfig/20240528-010107-marostegui.json
  • 00:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T364069)', diff saved to https://phabricator.wikimedia.org/P63375 and previous config saved to /var/cache/conftool/dbconfig/20240528-004559-marostegui.json
  • 00:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2177 (T364299)', diff saved to https://phabricator.wikimedia.org/P63374 and previous config saved to /var/cache/conftool/dbconfig/20240528-003255-marostegui.json
  • 00:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 00:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 00:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T364299)', diff saved to https://phabricator.wikimedia.org/P63373 and previous config saved to /var/cache/conftool/dbconfig/20240528-003230-marostegui.json
  • 00:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P63372 and previous config saved to /var/cache/conftool/dbconfig/20240528-001721-marostegui.json
  • 00:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2192 (T364069)', diff saved to https://phabricator.wikimedia.org/P63371 and previous config saved to /var/cache/conftool/dbconfig/20240528-000602-marostegui.json
  • 00:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2192.codfw.wmnet with reason: Maintenance
  • 00:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2192.codfw.wmnet with reason: Maintenance
  • 00:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T364069)', diff saved to https://phabricator.wikimedia.org/P63370 and previous config saved to /var/cache/conftool/dbconfig/20240528-000549-marostegui.json
  • 00:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P63369 and previous config saved to /var/cache/conftool/dbconfig/20240528-000213-marostegui.json

2024-05-27

  • 23:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P63368 and previous config saved to /var/cache/conftool/dbconfig/20240527-235041-marostegui.json
  • 23:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T364299)', diff saved to https://phabricator.wikimedia.org/P63367 and previous config saved to /var/cache/conftool/dbconfig/20240527-234705-marostegui.json
  • 23:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P63366 and previous config saved to /var/cache/conftool/dbconfig/20240527-233533-marostegui.json
  • 23:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T364069)', diff saved to https://phabricator.wikimedia.org/P63365 and previous config saved to /var/cache/conftool/dbconfig/20240527-232025-marostegui.json
  • 22:28 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2178 (T364069)', diff saved to https://phabricator.wikimedia.org/P63364 and previous config saved to /var/cache/conftool/dbconfig/20240527-222759-marostegui.json
  • 22:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 22:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 22:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T364069)', diff saved to https://phabricator.wikimedia.org/P63363 and previous config saved to /var/cache/conftool/dbconfig/20240527-222735-marostegui.json
  • 22:13 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2156 (T364299)', diff saved to https://phabricator.wikimedia.org/P63362 and previous config saved to /var/cache/conftool/dbconfig/20240527-221330-marostegui.json
  • 22:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 22:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 22:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 22:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 22:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T364299)', diff saved to https://phabricator.wikimedia.org/P63361 and previous config saved to /var/cache/conftool/dbconfig/20240527-221302-marostegui.json
  • 22:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P63360 and previous config saved to /var/cache/conftool/dbconfig/20240527-221227-marostegui.json
  • 21:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P63359 and previous config saved to /var/cache/conftool/dbconfig/20240527-215754-marostegui.json
  • 21:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P63358 and previous config saved to /var/cache/conftool/dbconfig/20240527-215719-marostegui.json
  • 21:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P63357 and previous config saved to /var/cache/conftool/dbconfig/20240527-214246-marostegui.json
  • 21:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T364069)', diff saved to https://phabricator.wikimedia.org/P63356 and previous config saved to /var/cache/conftool/dbconfig/20240527-214210-marostegui.json
  • 21:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T364299)', diff saved to https://phabricator.wikimedia.org/P63355 and previous config saved to /var/cache/conftool/dbconfig/20240527-212738-marostegui.json
  • 20:46 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2171 (T364069)', diff saved to https://phabricator.wikimedia.org/P63353 and previous config saved to /var/cache/conftool/dbconfig/20240527-204653-marostegui.json
  • 20:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 20:46 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 20:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T364069)', diff saved to https://phabricator.wikimedia.org/P63352 and previous config saved to /var/cache/conftool/dbconfig/20240527-204630-marostegui.json
  • 20:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P63351 and previous config saved to /var/cache/conftool/dbconfig/20240527-203922-ladsgroup.json
  • 20:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P63350 and previous config saved to /var/cache/conftool/dbconfig/20240527-203122-marostegui.json
  • 20:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P63349 and previous config saved to /var/cache/conftool/dbconfig/20240527-202416-ladsgroup.json
  • 20:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P63348 and previous config saved to /var/cache/conftool/dbconfig/20240527-201614-marostegui.json
  • 20:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P63347 and previous config saved to /var/cache/conftool/dbconfig/20240527-200910-ladsgroup.json
  • 20:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T364069)', diff saved to https://phabricator.wikimedia.org/P63346 and previous config saved to /var/cache/conftool/dbconfig/20240527-200106-marostegui.json
  • 19:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P63345 and previous config saved to /var/cache/conftool/dbconfig/20240527-195404-ladsgroup.json
  • 19:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2149 (T364299)', diff saved to https://phabricator.wikimedia.org/P63344 and previous config saved to /var/cache/conftool/dbconfig/20240527-195232-marostegui.json
  • 19:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 19:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 19:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T364299)', diff saved to https://phabricator.wikimedia.org/P63343 and previous config saved to /var/cache/conftool/dbconfig/20240527-195158-marostegui.json
  • 19:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P63342 and previous config saved to /var/cache/conftool/dbconfig/20240527-193650-marostegui.json
  • 19:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P63341 and previous config saved to /var/cache/conftool/dbconfig/20240527-192142-marostegui.json
  • 19:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T364299)', diff saved to https://phabricator.wikimedia.org/P63340 and previous config saved to /var/cache/conftool/dbconfig/20240527-190634-marostegui.json
  • 19:01 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2157 (T364069)', diff saved to https://phabricator.wikimedia.org/P63339 and previous config saved to /var/cache/conftool/dbconfig/20240527-190155-marostegui.json
  • 19:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 19:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 19:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T364069)', diff saved to https://phabricator.wikimedia.org/P63338 and previous config saved to /var/cache/conftool/dbconfig/20240527-190132-marostegui.json
  • 18:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P63337 and previous config saved to /var/cache/conftool/dbconfig/20240527-184624-marostegui.json
  • 18:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P63336 and previous config saved to /var/cache/conftool/dbconfig/20240527-183115-marostegui.json
  • 18:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T364069)', diff saved to https://phabricator.wikimedia.org/P63335 and previous config saved to /var/cache/conftool/dbconfig/20240527-181607-marostegui.json
  • 17:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2127 (T364299)', diff saved to https://phabricator.wikimedia.org/P63334 and previous config saved to /var/cache/conftool/dbconfig/20240527-173035-marostegui.json
  • 17:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 17:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 17:23 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2128 (T364069)', diff saved to https://phabricator.wikimedia.org/P63333 and previous config saved to /var/cache/conftool/dbconfig/20240527-172258-marostegui.json
  • 17:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 17:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 17:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 17:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 16:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 16:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 16:36 jiji@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mc2049.codfw.wmnet with OS bookworm
  • 16:29 brouberol@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 16:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 16:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 16:18 brouberol@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 16:09 brouberol@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 16:03 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 16:03 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 16:02 brouberol@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 15:57 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2049.codfw.wmnet with OS bookworm
  • 15:56 elukey: run `apt-get clean` on dse-k8s-worker1001 to free space on the root partition
  • 15:56 jiji@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mc2049.codfw.wmnet with OS bookworm
  • 15:54 brouberol@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 15:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 15:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 15:44 brouberol@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 15:30 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 15:28 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2049.codfw.wmnet with OS bookworm
  • 15:22 effie: disable puppet on mc1049 pending OS upgrade
  • 15:20 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 15:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T360332)', diff saved to https://phabricator.wikimedia.org/P63332 and previous config saved to /var/cache/conftool/dbconfig/20240527-150735-arnaudb.json
  • 15:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 15:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 15:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T364299)', diff saved to https://phabricator.wikimedia.org/P63331 and previous config saved to /var/cache/conftool/dbconfig/20240527-150514-marostegui.json
  • 15:01 fabfur: enable puppet on A:cp (T365718)
  • 14:52 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P63330 and previous config saved to /var/cache/conftool/dbconfig/20240527-145226-arnaudb.json
  • 14:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P63329 and previous config saved to /var/cache/conftool/dbconfig/20240527-145004-marostegui.json
  • 14:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 14:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 14:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213 (T364069)', diff saved to https://phabricator.wikimedia.org/P63328 and previous config saved to /var/cache/conftool/dbconfig/20240527-144538-marostegui.json
  • 14:37 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P63327 and previous config saved to /var/cache/conftool/dbconfig/20240527-143718-arnaudb.json
  • 14:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P63326 and previous config saved to /var/cache/conftool/dbconfig/20240527-143457-marostegui.json
  • 14:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213', diff not saved (unable to send diff to phaste) and previous config saved to /var/cache/conftool/dbconfig/20240527-143025-marostegui.json
  • 14:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T360332)', diff saved to https://phabricator.wikimedia.org/P63324 and previous config saved to /var/cache/conftool/dbconfig/20240527-142210-arnaudb.json
  • 14:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T364299)', diff saved to https://phabricator.wikimedia.org/P63323 and previous config saved to /var/cache/conftool/dbconfig/20240527-141949-marostegui.json
  • 14:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1238 (T360332)', diff saved to https://phabricator.wikimedia.org/P63322 and previous config saved to /var/cache/conftool/dbconfig/20240527-141948-arnaudb.json
  • 14:19 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 14:19 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 14:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213', diff saved to https://phabricator.wikimedia.org/P63321 and previous config saved to /var/cache/conftool/dbconfig/20240527-141515-marostegui.json
  • 14:14 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint1002:~$ time mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki enwiki --current --all --touched-after=20240524120000 2>&1 | tee -a ~/T315510-enwiki-7; date # cc T365974
  • 14:11 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript updateCollation.php bswikiquote --previous-collation=uppercase # T365133
  • 14:06 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 14:05 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for Set $wgCategoryCollation to uca-bs-u-kn on Bosnian Wikiquote (T365133) (duration: 17m 25s)
  • 14:00 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2001.codfw.wmnet with OS bullseye
  • 14:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213 (T364069)', diff saved to https://phabricator.wikimedia.org/P63320 and previous config saved to /var/cache/conftool/dbconfig/20240527-140007-marostegui.json
  • 13:54 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 13:54 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 13:51 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde and nmw03: Continuing with sync
  • 13:50 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde and nmw03: Backport for Set $wgCategoryCollation to uca-bs-u-kn on Bosnian Wikiquote (T365133) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:47 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for Set $wgCategoryCollation to uca-bs-u-kn on Bosnian Wikiquote (T365133)
  • 13:46 elukey@deploy1002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 13:46 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for Enable wgDiscussionToolsEnablePermalinksBackend on enwiki (T315353), Pre-emptively disable DiscussionToolsEnableThanks (no-op) (duration: 18m 15s)
  • 13:46 elukey@deploy1002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: sync
  • 13:44 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 13:42 elukey@deploy1002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 13:42 elukey@deploy1002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: sync
  • 13:36 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1212 (T364299)', diff saved to https://phabricator.wikimedia.org/P63319 and previous config saved to /var/cache/conftool/dbconfig/20240527-133636-marostegui.json
  • 13:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 13:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 13:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 13:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 13:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T364299)', diff saved to https://phabricator.wikimedia.org/P63318 and previous config saved to /var/cache/conftool/dbconfig/20240527-133605-marostegui.json
  • 13:32 logmsgbot: lucaswerkmeister-wmde@deploy1002 esanders and matmarex and lucaswerkmeister-wmde: Continuing with sync
  • 13:32 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 13:32 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 13:32 elukey@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 13:30 logmsgbot: lucaswerkmeister-wmde@deploy1002 esanders and matmarex and lucaswerkmeister-wmde: Backport for Enable wgDiscussionToolsEnablePermalinksBackend on enwiki (T315353), Pre-emptively disable DiscussionToolsEnableThanks (no-op) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:29 fabfur: enabled puppet on cp4037 to apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/1035440 (T365718)
  • 13:28 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for Enable wgDiscussionToolsEnablePermalinksBackend on enwiki (T315353), Pre-emptively disable DiscussionToolsEnableThanks (no-op)
  • 13:26 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for Revert "arwiki: Disable Extension:ContentTranslation for non-autoreview users" (T255022) (duration: 20m 46s)
  • 13:21 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 13:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P63317 and previous config saved to /var/cache/conftool/dbconfig/20240527-132057-marostegui.json
  • 13:18 fabfur: disabling puppet on A:cp to safely apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/1035440 (T365718)
  • 13:16 vgutierrez: test fifo-log-demux 0.7.5 on cp4052
  • 13:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1213 (T364069)', diff saved to https://phabricator.wikimedia.org/P63316 and previous config saved to /var/cache/conftool/dbconfig/20240527-131539-marostegui.json
  • 13:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 13:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 13:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T364069)', diff saved to https://phabricator.wikimedia.org/P63315 and previous config saved to /var/cache/conftool/dbconfig/20240527-131516-marostegui.json
  • 13:15 hnowlan@cumin1002: conftool action : set/pooled=no; selector: name=parse1002.eqiad.wmnet
  • 13:12 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde and gergesshamon: Continuing with sync
  • 13:08 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde and gergesshamon: Backport for Revert "arwiki: Disable Extension:ContentTranslation for non-autoreview users" (T255022) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:08 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 13:06 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for Revert "arwiki: Disable Extension:ContentTranslation for non-autoreview users" (T255022)
  • 13:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P63314 and previous config saved to /var/cache/conftool/dbconfig/20240527-130549-marostegui.json
  • 13:05 elukey@deploy1002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 13:04 elukey@deploy1002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: sync
  • 13:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P63313 and previous config saved to /var/cache/conftool/dbconfig/20240527-130008-marostegui.json
  • 12:56 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 12:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T364299)', diff saved to https://phabricator.wikimedia.org/P63312 and previous config saved to /var/cache/conftool/dbconfig/20240527-125041-marostegui.json
  • 12:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P63311 and previous config saved to /var/cache/conftool/dbconfig/20240527-124500-marostegui.json
  • 12:42 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2001.codfw.wmnet with OS bullseye
  • 12:40 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2001.codfw.wmnet with OS bullseye
  • 12:19 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
  • 12:18 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
  • 12:18 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 12:18 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 12:17 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 12:17 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 12:11 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host <spicerack.netbox.NetboxServer object at 0x7f9776417550>
  • 12:11 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2001
  • 12:11 ayounsi@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2001
  • 12:11 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2001.codfw.wmnet 39.16.192.10.in-addr.arpa 9.3.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:11 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2001.codfw.wmnet 39.16.192.10.in-addr.arpa 9.3.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:11 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:11 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2001 - ayounsi@cumin1002"
  • 12:10 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2001 - ayounsi@cumin1002"
  • 12:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1198 (T364299)', diff saved to https://phabricator.wikimedia.org/P63309 and previous config saved to /var/cache/conftool/dbconfig/20240527-120732-marostegui.json
  • 12:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 12:07 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 12:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 12:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T364299)', diff saved to https://phabricator.wikimedia.org/P63308 and previous config saved to /var/cache/conftool/dbconfig/20240527-120709-marostegui.json
  • 12:06 ayounsi@cumin1002: START - Cookbook sre.hosts.move-vlan for host <spicerack.netbox.NetboxServer object at 0x7f9776417550>
  • 12:05 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2001.codfw.wmnet with OS bullseye
  • 11:52 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 11:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P63307 and previous config saved to /var/cache/conftool/dbconfig/20240527-115200-marostegui.json
  • 11:51 ayounsi@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) netbox to netbox2002.codfw.wmnet,netbox1002.eqiad.wmnet with reason: add CasApereo auth and update wheels - ayounsi@cumin1002 - T308002
  • 11:49 ayounsi@cumin1002: START - Cookbook sre.deploy.python-code netbox to netbox2002.codfw.wmnet,netbox1002.eqiad.wmnet with reason: add CasApereo auth and update wheels - ayounsi@cumin1002 - T308002
  • 11:45 ayounsi@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) netbox to netbox-dev2002.codfw.wmnet with reason: add python-jose and update wheels - ayounsi@cumin1002 - T308002
  • 11:44 ayounsi@cumin1002: START - Cookbook sre.deploy.python-code netbox to netbox-dev2002.codfw.wmnet with reason: add python-jose and update wheels - ayounsi@cumin1002 - T308002
  • 11:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1210 (T364069)', diff saved to https://phabricator.wikimedia.org/P63306 and previous config saved to /var/cache/conftool/dbconfig/20240527-114316-marostegui.json
  • 11:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 11:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 11:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T364069)', diff saved to https://phabricator.wikimedia.org/P63305 and previous config saved to /var/cache/conftool/dbconfig/20240527-114252-marostegui.json
  • 11:41 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 11:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P63304 and previous config saved to /var/cache/conftool/dbconfig/20240527-113651-marostegui.json
  • 11:33 moritzm: installing jinja2 security updates
  • 11:29 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 11:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P63303 and previous config saved to /var/cache/conftool/dbconfig/20240527-112744-marostegui.json
  • 11:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T364299)', diff saved to https://phabricator.wikimedia.org/P63302 and previous config saved to /var/cache/conftool/dbconfig/20240527-112143-marostegui.json
  • 11:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P63301 and previous config saved to /var/cache/conftool/dbconfig/20240527-111236-marostegui.json
  • 10:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T364069)', diff saved to https://phabricator.wikimedia.org/P63299 and previous config saved to /var/cache/conftool/dbconfig/20240527-105728-marostegui.json
  • 10:55 Amir1: dbmaint s2@codfw (T364985)
  • 10:55 Amir1: main s2@codfw (T364985)
  • 10:52 slyngs: Upgrade IDM to Bitu 0.0.8
  • 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1189 (T364299)', diff saved to https://phabricator.wikimedia.org/P63298 and previous config saved to /var/cache/conftool/dbconfig/20240527-103759-marostegui.json
  • 10:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 10:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 10:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T364299)', diff saved to https://phabricator.wikimedia.org/P63297 and previous config saved to /var/cache/conftool/dbconfig/20240527-103734-marostegui.json
  • 10:26 marostegui@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 100%: Repooling T365797', diff saved to https://phabricator.wikimedia.org/P63296 and previous config saved to /var/cache/conftool/dbconfig/20240527-102639-root.json
  • 10:26 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 10:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P63295 and previous config saved to /var/cache/conftool/dbconfig/20240527-102226-marostegui.json
  • 10:14 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 10:12 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:12 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:11 marostegui@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 75%: Repooling T365797', diff saved to https://phabricator.wikimedia.org/P63294 and previous config saved to /var/cache/conftool/dbconfig/20240527-101133-root.json
  • 10:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P63293 and previous config saved to /var/cache/conftool/dbconfig/20240527-100717-marostegui.json
  • 10:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1200 (T364069)', diff saved to https://phabricator.wikimedia.org/P63292 and previous config saved to /var/cache/conftool/dbconfig/20240527-100523-marostegui.json
  • 10:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 10:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 10:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T364069)', diff saved to https://phabricator.wikimedia.org/P63291 and previous config saved to /var/cache/conftool/dbconfig/20240527-100459-marostegui.json
  • 10:04 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:04 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:58 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 09:56 marostegui@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 50%: Repooling T365797', diff saved to https://phabricator.wikimedia.org/P63290 and previous config saved to /var/cache/conftool/dbconfig/20240527-095626-root.json
  • 09:56 ladsgroup@deploy1002: Finished scap: Backport for Update tagline and wordmark of Persian Wikibooks (T365913) (duration: 16m 59s)
  • 09:54 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:54 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:52 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:52 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T364299)', diff saved to https://phabricator.wikimedia.org/P63289 and previous config saved to /var/cache/conftool/dbconfig/20240527-095208-marostegui.json
  • 09:49 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P63288 and previous config saved to /var/cache/conftool/dbconfig/20240527-094951-marostegui.json
  • 09:49 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:47 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:46 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:45 ayounsi@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) netbox to netbox-dev2002.codfw.wmnet with reason: add python-social-auth and update wheels - ayounsi@cumin1002 - T308002
  • 09:45 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 09:44 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:44 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:42 ladsgroup@deploy1002: ebrahim and ladsgroup: Continuing with sync
  • 09:41 ladsgroup@deploy1002: ebrahim and ladsgroup: Backport for Update tagline and wordmark of Persian Wikibooks (T365913) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:41 ayounsi@cumin1002: START - Cookbook sre.deploy.python-code netbox to netbox-dev2002.codfw.wmnet with reason: add python-social-auth and update wheels - ayounsi@cumin1002 - T308002
  • 09:41 marostegui@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 25%: Repooling T365797', diff saved to https://phabricator.wikimedia.org/P63287 and previous config saved to /var/cache/conftool/dbconfig/20240527-094120-root.json
  • 09:39 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:39 ladsgroup@deploy1002: Started scap: Backport for Update tagline and wordmark of Persian Wikibooks (T365913)
  • 09:39 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:39 zabe@deploy1002: Finished scap: Backport for Stop writing to af_user(_text)/afh_user(_text) in group1 wikis (T337920) (duration: 29m 43s)
  • 09:37 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 64096
  • 09:36 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 64096
  • 09:34 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 09:34 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 09:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P63286 and previous config saved to /var/cache/conftool/dbconfig/20240527-093443-marostegui.json
  • 09:33 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2129', diff saved to https://phabricator.wikimedia.org/P63285 and previous config saved to /var/cache/conftool/dbconfig/20240527-093306-root.json
  • 09:31 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:31 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:26 marostegui@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 10%: Repooling T365797', diff saved to https://phabricator.wikimedia.org/P63284 and previous config saved to /var/cache/conftool/dbconfig/20240527-092614-root.json
  • 09:25 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:24 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P63283 and previous config saved to /var/cache/conftool/dbconfig/20240527-092459-root.json
  • 09:24 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:23 zabe@deploy1002: zabe: Continuing with sync
  • 09:23 zabe@deploy1002: zabe: Backport for Stop writing to af_user(_text)/afh_user(_text) in group1 wikis (T337920) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T364069)', diff saved to https://phabricator.wikimedia.org/P63282 and previous config saved to /var/cache/conftool/dbconfig/20240527-091935-marostegui.json
  • 09:14 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:14 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:11 marostegui@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 5%: Repooling T365797', diff saved to https://phabricator.wikimedia.org/P63280 and previous config saved to /var/cache/conftool/dbconfig/20240527-091108-root.json
  • 09:09 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P63279 and previous config saved to /var/cache/conftool/dbconfig/20240527-090953-root.json
  • 09:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1175 (T364299)', diff saved to https://phabricator.wikimedia.org/P63278 and previous config saved to /var/cache/conftool/dbconfig/20240527-090938-marostegui.json
  • 09:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 09:09 zabe@deploy1002: Started scap: Backport for Stop writing to af_user(_text)/afh_user(_text) in group1 wikis (T337920)
  • 09:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 09:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T364299)', diff saved to https://phabricator.wikimedia.org/P63277 and previous config saved to /var/cache/conftool/dbconfig/20240527-090915-marostegui.json
  • 09:08 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:08 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:04 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:04 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:59 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:59 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:56 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:56 marostegui@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 1%: Repooling T365797', diff saved to https://phabricator.wikimedia.org/P63276 and previous config saved to /var/cache/conftool/dbconfig/20240527-085602-root.json
  • 08:55 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P63275 and previous config saved to /var/cache/conftool/dbconfig/20240527-085447-root.json
  • 08:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P63274 and previous config saved to /var/cache/conftool/dbconfig/20240527-085407-marostegui.json
  • 08:53 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:53 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:42 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:42 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:40 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:40 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P63273 and previous config saved to /var/cache/conftool/dbconfig/20240527-083859-marostegui.json
  • 08:38 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:38 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:32 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:32 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1185 (T364069)', diff saved to https://phabricator.wikimedia.org/P63272 and previous config saved to /var/cache/conftool/dbconfig/20240527-082603-marostegui.json
  • 08:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 08:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 08:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183 (T364069)', diff saved to https://phabricator.wikimedia.org/P63271 and previous config saved to /var/cache/conftool/dbconfig/20240527-082539-marostegui.json
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T364299)', diff saved to https://phabricator.wikimedia.org/P63270 and previous config saved to /var/cache/conftool/dbconfig/20240527-082351-marostegui.json
  • 08:22 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:22 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2188.codfw.wmnet
  • 08:20 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:20 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:18 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:18 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:16 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:16 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:14 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:14 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183', diff saved to https://phabricator.wikimedia.org/P63269 and previous config saved to /var/cache/conftool/dbconfig/20240527-081031-marostegui.json
  • 08:01 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2188.codfw.wmnet
  • 08:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2129.codfw.wmnet with reason: Long schema change
  • 08:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2129.codfw.wmnet with reason: Long schema change
  • 08:00 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:00 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:58 marostegui: Deploy schema change on s6 codfw (old master) dbmaint T364299
  • 07:57 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:57 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2129 T365783', diff saved to https://phabricator.wikimedia.org/P63268 and previous config saved to /var/cache/conftool/dbconfig/20240527-075602-root.json
  • 07:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183', diff saved to https://phabricator.wikimedia.org/P63267 and previous config saved to /var/cache/conftool/dbconfig/20240527-075524-marostegui.json
  • 07:55 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2214 to s6 primary T365783', diff saved to https://phabricator.wikimedia.org/P63266 and previous config saved to /var/cache/conftool/dbconfig/20240527-075512-marostegui.json
  • 07:54 marostegui: Starting s6 codfw failover from db2129 to db2214 - T365783
  • 07:52 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2176.codfw.wmnet
  • 07:49 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:49 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1166 (T364299)', diff saved to https://phabricator.wikimedia.org/P63265 and previous config saved to /var/cache/conftool/dbconfig/20240527-074105-marostegui.json
  • 07:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 07:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 07:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T364299)', diff saved to https://phabricator.wikimedia.org/P63264 and previous config saved to /var/cache/conftool/dbconfig/20240527-074042-marostegui.json
  • 07:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183 (T364069)', diff saved to https://phabricator.wikimedia.org/P63263 and previous config saved to /var/cache/conftool/dbconfig/20240527-074009-marostegui.json
  • 07:36 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:36 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:35 root@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s6 T365783
  • 07:35 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2214 with weight 0 T365783', diff saved to https://phabricator.wikimedia.org/P63262 and previous config saved to /var/cache/conftool/dbconfig/20240527-073545-root.json
  • 07:35 root@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 27 hosts with reason: Primary switchover s6 T365783
  • 07:35 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:35 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:33 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:33 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:29 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2176.codfw.wmnet
  • 07:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2174.codfw.wmnet
  • 07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P63261 and previous config saved to /var/cache/conftool/dbconfig/20240527-072534-marostegui.json
  • 07:25 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:24 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:21 marostegui: Deploy schema change on s7 codfw dbmaint T307501
  • 07:19 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:19 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:18 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2174.codfw.wmnet
  • 07:12 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:12 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P63260 and previous config saved to /var/cache/conftool/dbconfig/20240527-071026-marostegui.json
  • 07:08 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:08 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:06 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:06 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:02 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:02 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T364299)', diff saved to https://phabricator.wikimedia.org/P63259 and previous config saved to /var/cache/conftool/dbconfig/20240527-065518-marostegui.json
  • 06:53 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:53 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:50 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:50 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:47 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:47 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:44 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1243.eqiad.wmnet with OS bookworm
  • 06:40 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:40 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1183 (T364069)', diff saved to https://phabricator.wikimedia.org/P63258 and previous config saved to /var/cache/conftool/dbconfig/20240527-063832-marostegui.json
  • 06:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1183.eqiad.wmnet with reason: Maintenance
  • 06:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1183.eqiad.wmnet with reason: Maintenance
  • 06:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T364069)', diff saved to https://phabricator.wikimedia.org/P63257 and previous config saved to /var/cache/conftool/dbconfig/20240527-063809-marostegui.json
  • 06:34 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:34 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:27 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:27 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:25 kart_: Updated cxserver to 2024-05-20-182409-production (T354666, T365230)
  • 06:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P63256 and previous config saved to /var/cache/conftool/dbconfig/20240527-062301-marostegui.json
  • 06:17 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 06:17 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:17 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:17 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 06:15 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 06:15 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 06:12 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1157 (T364299)', diff saved to https://phabricator.wikimedia.org/P63255 and previous config saved to /var/cache/conftool/dbconfig/20240527-061252-marostegui.json
  • 06:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 06:12 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 06:08 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:08 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P63253 and previous config saved to /var/cache/conftool/dbconfig/20240527-060752-marostegui.json
  • 06:06 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:06 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:02 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:01 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:00 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:59 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:58 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:58 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:53 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 05:53 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 05:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T364069)', diff saved to https://phabricator.wikimedia.org/P63252 and previous config saved to /var/cache/conftool/dbconfig/20240527-055244-marostegui.json
  • 05:50 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:49 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:43 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:43 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:34 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:34 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:24 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1243.eqiad.wmnet with OS bookworm
  • 05:24 marostegui@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db1243.eqiad.wmnet with OS bookworm
  • 05:24 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:24 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:22 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:21 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:15 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:15 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:08 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:08 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:07 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1243.eqiad.wmnet with OS bookworm
  • 05:06 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:06 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1243', diff saved to https://phabricator.wikimedia.org/P63251 and previous config saved to /var/cache/conftool/dbconfig/20240527-050551-marostegui.json
  • 05:03 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:03 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:01 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:01 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:54 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:54 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1161 (T364069)', diff saved to https://phabricator.wikimedia.org/P63250 and previous config saved to /var/cache/conftool/dbconfig/20240527-045301-marostegui.json
  • 04:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 04:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 04:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 04:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 04:47 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:47 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:44 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:44 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 04:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 04:40 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:40 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:38 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:38 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:27 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:27 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:21 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:21 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:19 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:19 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:16 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:16 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:14 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:14 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:11 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:11 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:04 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:04 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:50 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:49 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:42 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:42 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:40 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:40 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:36 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:36 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:29 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:29 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:19 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:19 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:11 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:11 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:08 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:08 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:05 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:05 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:56 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:56 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:54 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:54 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:52 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:52 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:49 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:49 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:34 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:34 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:33 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:33 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:18 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:18 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:11 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:11 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:09 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:08 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:02 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:02 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:51 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:51 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:47 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:47 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:36 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:36 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:27 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:27 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:19 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:19 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:15 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:15 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:11 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:11 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:02 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:02 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:59 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:59 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:54 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:54 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:39 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:39 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:31 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:31 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:26 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:26 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:17 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:17 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:12 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:12 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:04 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:04 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply

2024-05-26

  • 23:57 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:57 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:51 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:51 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:48 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:48 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:46 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:46 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:42 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:42 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:33 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:33 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:27 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:27 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:22 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:22 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:16 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:16 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:05 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:05 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:02 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:02 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:55 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:55 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:44 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:44 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:42 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:42 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:33 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:33 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:27 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:27 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:23 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:23 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:10 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:10 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:00 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:00 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:58 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:58 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:55 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:55 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:53 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:53 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:49 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:49 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:43 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:43 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:36 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:36 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:34 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:34 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:32 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:32 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:27 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:27 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:16 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:16 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:06 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:06 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:04 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:04 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:02 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:02 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:53 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:53 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:43 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:43 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:40 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:40 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:38 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:38 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:36 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:36 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:28 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:28 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:22 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:22 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:20 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:20 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:16 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:16 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:11 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:11 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:02 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:02 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:57 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:57 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:50 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:50 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:44 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:44 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:35 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:35 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:27 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:27 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:25 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:25 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:20 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:20 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:18 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:18 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:10 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:10 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:08 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:08 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:01 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:01 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:59 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:58 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:57 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:57 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:55 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:55 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:45 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:44 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:41 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:41 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:38 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:38 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:25 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:25 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:23 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:22 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:20 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:20 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:17 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:17 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:06 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:06 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:03 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:02 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:59 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:58 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:50 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:50 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:40 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:40 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:38 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:38 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:34 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:34 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:29 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:29 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:23 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:23 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:16 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:16 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:06 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:06 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:58 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:58 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:51 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:51 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:46 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:46 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:31 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:31 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:27 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:27 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:25 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:25 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:02 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:02 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:56 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:56 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:45 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:45 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:42 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:41 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:38 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:38 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:26 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:26 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:42 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:42 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:37 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:37 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:35 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:34 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:21 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:21 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:13 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:13 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:04 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:04 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T364069)', diff saved to https://phabricator.wikimedia.org/P63249 and previous config saved to /var/cache/conftool/dbconfig/20240526-140250-marostegui.json
  • 14:01 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:01 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:59 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:59 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:50 jelto: restart apache2 on gerrit1003
  • 13:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P63248 and previous config saved to /var/cache/conftool/dbconfig/20240526-134742-marostegui.json
  • 13:44 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:44 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P63247 and previous config saved to /var/cache/conftool/dbconfig/20240526-133234-marostegui.json
  • 13:29 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:29 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T364069)', diff saved to https://phabricator.wikimedia.org/P63246 and previous config saved to /var/cache/conftool/dbconfig/20240526-131726-marostegui.json
  • 13:14 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:14 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:05 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:05 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:57 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:57 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:50 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:50 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:49 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:48 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:47 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:46 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:16 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:16 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:01 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:01 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:24 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:24 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:22 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:22 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:20 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:20 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2209 (T364069)', diff saved to https://phabricator.wikimedia.org/P63245 and previous config saved to /var/cache/conftool/dbconfig/20240526-111558-marostegui.json
  • 11:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2209.codfw.wmnet with reason: Maintenance
  • 11:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2209.codfw.wmnet with reason: Maintenance
  • 11:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T364069)', diff saved to https://phabricator.wikimedia.org/P63244 and previous config saved to /var/cache/conftool/dbconfig/20240526-111534-marostegui.json
  • 11:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P63243 and previous config saved to /var/cache/conftool/dbconfig/20240526-110026-marostegui.json
  • 10:53 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:53 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:51 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:51 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P63242 and previous config saved to /var/cache/conftool/dbconfig/20240526-104518-marostegui.json
  • 10:34 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:34 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:31 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:31 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T364069)', diff saved to https://phabricator.wikimedia.org/P63241 and previous config saved to /var/cache/conftool/dbconfig/20240526-103010-marostegui.json
  • 10:16 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:16 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:11 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:11 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:01 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:01 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:56 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:56 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:48 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:48 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:35 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:35 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:24 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:24 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:20 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:20 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:15 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:15 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:12 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:12 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:01 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:01 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:46 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:46 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:39 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:39 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:24 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:24 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2194 (T364069)', diff saved to https://phabricator.wikimedia.org/P63240 and previous config saved to /var/cache/conftool/dbconfig/20240526-082333-marostegui.json
  • 08:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
  • 08:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T364069)', diff saved to https://phabricator.wikimedia.org/P63239 and previous config saved to /var/cache/conftool/dbconfig/20240526-082310-marostegui.json
  • 08:09 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:09 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P63238 and previous config saved to /var/cache/conftool/dbconfig/20240526-080802-marostegui.json
  • 08:04 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:04 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:58 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:58 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:54 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:54 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P63237 and previous config saved to /var/cache/conftool/dbconfig/20240526-075253-marostegui.json
  • 07:44 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:44 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T364069)', diff saved to https://phabricator.wikimedia.org/P63236 and previous config saved to /var/cache/conftool/dbconfig/20240526-073745-marostegui.json
  • 07:29 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:29 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2204 (T364299)', diff saved to https://phabricator.wikimedia.org/P63235 and previous config saved to /var/cache/conftool/dbconfig/20240526-072316-marostegui.json
  • 07:09 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:09 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2204', diff saved to https://phabricator.wikimedia.org/P63234 and previous config saved to /var/cache/conftool/dbconfig/20240526-070808-marostegui.json
  • 06:58 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:58 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:56 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:56 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2204', diff saved to https://phabricator.wikimedia.org/P63233 and previous config saved to /var/cache/conftool/dbconfig/20240526-065259-marostegui.json
  • 06:41 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:41 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2204 (T364299)', diff saved to https://phabricator.wikimedia.org/P63232 and previous config saved to /var/cache/conftool/dbconfig/20240526-063752-marostegui.json
  • 06:30 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:30 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:15 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:15 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:49 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:49 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2204 (T364299)', diff saved to https://phabricator.wikimedia.org/P63231 and previous config saved to /var/cache/conftool/dbconfig/20240526-054305-marostegui.json
  • 05:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2204.codfw.wmnet with reason: Maintenance
  • 05:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2204.codfw.wmnet with reason: Maintenance
  • 05:38 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:38 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2190 (T364069)', diff saved to https://phabricator.wikimedia.org/P63230 and previous config saved to /var/cache/conftool/dbconfig/20240526-053127-marostegui.json
  • 05:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 05:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 05:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T364069)', diff saved to https://phabricator.wikimedia.org/P63229 and previous config saved to /var/cache/conftool/dbconfig/20240526-053103-marostegui.json
  • 05:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P63228 and previous config saved to /var/cache/conftool/dbconfig/20240526-051555-marostegui.json
  • 05:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P63227 and previous config saved to /var/cache/conftool/dbconfig/20240526-050047-marostegui.json
  • 04:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 04:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 04:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T364299)', diff saved to https://phabricator.wikimedia.org/P63226 and previous config saved to /var/cache/conftool/dbconfig/20240526-045357-marostegui.json
  • 04:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T364069)', diff saved to https://phabricator.wikimedia.org/P63225 and previous config saved to /var/cache/conftool/dbconfig/20240526-044539-marostegui.json
  • 04:41 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:41 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P63224 and previous config saved to /var/cache/conftool/dbconfig/20240526-043849-marostegui.json
  • 04:24 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:24 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P63223 and previous config saved to /var/cache/conftool/dbconfig/20240526-042341-marostegui.json
  • 04:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T364299)', diff saved to https://phabricator.wikimedia.org/P63222 and previous config saved to /var/cache/conftool/dbconfig/20240526-040833-marostegui.json
  • 03:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2189 (T364299)', diff saved to https://phabricator.wikimedia.org/P63221 and previous config saved to /var/cache/conftool/dbconfig/20240526-030259-marostegui.json
  • 03:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2189.codfw.wmnet with reason: Maintenance
  • 03:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2189.codfw.wmnet with reason: Maintenance
  • 03:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T364299)', diff saved to https://phabricator.wikimedia.org/P63220 and previous config saved to /var/cache/conftool/dbconfig/20240526-030236-marostegui.json
  • 02:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P63219 and previous config saved to /var/cache/conftool/dbconfig/20240526-024728-marostegui.json
  • 02:43 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:43 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P63218 and previous config saved to /var/cache/conftool/dbconfig/20240526-023220-marostegui.json
  • 02:29 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:29 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T364299)', diff saved to https://phabricator.wikimedia.org/P63217 and previous config saved to /var/cache/conftool/dbconfig/20240526-021711-marostegui.json
  • 02:12 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2177 (T364069)', diff saved to https://phabricator.wikimedia.org/P63216 and previous config saved to /var/cache/conftool/dbconfig/20240526-021238-marostegui.json
  • 02:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 02:12 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 02:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T364069)', diff saved to https://phabricator.wikimedia.org/P63215 and previous config saved to /var/cache/conftool/dbconfig/20240526-021213-marostegui.json
  • 01:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P63214 and previous config saved to /var/cache/conftool/dbconfig/20240526-015704-marostegui.json
  • 01:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P63213 and previous config saved to /var/cache/conftool/dbconfig/20240526-014156-marostegui.json
  • 01:37 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:37 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T364069)', diff saved to https://phabricator.wikimedia.org/P63212 and previous config saved to /var/cache/conftool/dbconfig/20240526-012648-marostegui.json
  • 01:08 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:08 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2175 (T364299)', diff saved to https://phabricator.wikimedia.org/P63211 and previous config saved to /var/cache/conftool/dbconfig/20240526-010523-marostegui.json
  • 01:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 01:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 01:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T364299)', diff saved to https://phabricator.wikimedia.org/P63210 and previous config saved to /var/cache/conftool/dbconfig/20240526-010500-marostegui.json
  • 00:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P63209 and previous config saved to /var/cache/conftool/dbconfig/20240526-004952-marostegui.json
  • 00:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P63208 and previous config saved to /var/cache/conftool/dbconfig/20240526-003444-marostegui.json
  • 00:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T364299)', diff saved to https://phabricator.wikimedia.org/P63207 and previous config saved to /var/cache/conftool/dbconfig/20240526-001936-marostegui.json

2024-05-25

  • 23:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2148 (T364299)', diff saved to https://phabricator.wikimedia.org/P63206 and previous config saved to /var/cache/conftool/dbconfig/20240525-230523-marostegui.json
  • 23:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 23:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 23:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138 (T364299)', diff saved to https://phabricator.wikimedia.org/P63205 and previous config saved to /var/cache/conftool/dbconfig/20240525-230500-marostegui.json
  • 22:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2156 (T364069)', diff saved to https://phabricator.wikimedia.org/P63204 and previous config saved to /var/cache/conftool/dbconfig/20240525-225331-marostegui.json
  • 22:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 22:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 22:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 22:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 22:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T364069)', diff saved to https://phabricator.wikimedia.org/P63203 and previous config saved to /var/cache/conftool/dbconfig/20240525-225251-marostegui.json
  • 22:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138', diff saved to https://phabricator.wikimedia.org/P63202 and previous config saved to /var/cache/conftool/dbconfig/20240525-224952-marostegui.json
  • 22:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P63201 and previous config saved to /var/cache/conftool/dbconfig/20240525-223743-marostegui.json
  • 22:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138', diff saved to https://phabricator.wikimedia.org/P63200 and previous config saved to /var/cache/conftool/dbconfig/20240525-223444-marostegui.json
  • 22:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P63199 and previous config saved to /var/cache/conftool/dbconfig/20240525-222235-marostegui.json
  • 22:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138 (T364299)', diff saved to https://phabricator.wikimedia.org/P63198 and previous config saved to /var/cache/conftool/dbconfig/20240525-221936-marostegui.json
  • 22:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T364069)', diff saved to https://phabricator.wikimedia.org/P63197 and previous config saved to /var/cache/conftool/dbconfig/20240525-220727-marostegui.json
  • 21:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2138 (T364299)', diff saved to https://phabricator.wikimedia.org/P63196 and previous config saved to /var/cache/conftool/dbconfig/20240525-210754-marostegui.json
  • 21:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 21:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 21:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T364299)', diff saved to https://phabricator.wikimedia.org/P63195 and previous config saved to /var/cache/conftool/dbconfig/20240525-210731-marostegui.json
  • 20:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P63194 and previous config saved to /var/cache/conftool/dbconfig/20240525-205223-marostegui.json
  • 20:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P63193 and previous config saved to /var/cache/conftool/dbconfig/20240525-203715-marostegui.json
  • 20:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T364299)', diff saved to https://phabricator.wikimedia.org/P63192 and previous config saved to /var/cache/conftool/dbconfig/20240525-202207-marostegui.json
  • 19:51 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:51 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2149 (T364069)', diff saved to https://phabricator.wikimedia.org/P63191 and previous config saved to /var/cache/conftool/dbconfig/20240525-193047-marostegui.json
  • 19:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 19:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 19:12 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2126 (T364299)', diff saved to https://phabricator.wikimedia.org/P63190 and previous config saved to /var/cache/conftool/dbconfig/20240525-191242-marostegui.json
  • 19:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 19:12 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 19:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 19:12 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 19:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T364299)', diff saved to https://phabricator.wikimedia.org/P63189 and previous config saved to /var/cache/conftool/dbconfig/20240525-191201-marostegui.json
  • 18:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P63188 and previous config saved to /var/cache/conftool/dbconfig/20240525-185653-marostegui.json
  • 18:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P63187 and previous config saved to /var/cache/conftool/dbconfig/20240525-184145-marostegui.json
  • 18:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T364299)', diff saved to https://phabricator.wikimedia.org/P63186 and previous config saved to /var/cache/conftool/dbconfig/20240525-182637-marostegui.json
  • 16:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2125 (T364299)', diff saved to https://phabricator.wikimedia.org/P63185 and previous config saved to /var/cache/conftool/dbconfig/20240525-164506-marostegui.json
  • 16:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 16:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 16:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 16:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 16:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T364069)', diff saved to https://phabricator.wikimedia.org/P63184 and previous config saved to /var/cache/conftool/dbconfig/20240525-164135-marostegui.json
  • 16:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P63183 and previous config saved to /var/cache/conftool/dbconfig/20240525-162627-marostegui.json
  • 16:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P63182 and previous config saved to /var/cache/conftool/dbconfig/20240525-161118-marostegui.json
  • 15:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T364069)', diff saved to https://phabricator.wikimedia.org/P63181 and previous config saved to /var/cache/conftool/dbconfig/20240525-155610-marostegui.json
  • 15:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 15:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 13:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 13:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 13:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246 (T364299)', diff saved to https://phabricator.wikimedia.org/P63180 and previous config saved to /var/cache/conftool/dbconfig/20240525-135800-marostegui.json
  • 13:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246', diff saved to https://phabricator.wikimedia.org/P63179 and previous config saved to /var/cache/conftool/dbconfig/20240525-134252-marostegui.json
  • 13:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246', diff saved to https://phabricator.wikimedia.org/P63178 and previous config saved to /var/cache/conftool/dbconfig/20240525-132744-marostegui.json
  • 13:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2127 (T364069)', diff saved to https://phabricator.wikimedia.org/P63177 and previous config saved to /var/cache/conftool/dbconfig/20240525-131619-marostegui.json
  • 13:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 13:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 13:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246 (T364299)', diff saved to https://phabricator.wikimedia.org/P63176 and previous config saved to /var/cache/conftool/dbconfig/20240525-131236-marostegui.json
  • 11:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1246 (T364299)', diff saved to https://phabricator.wikimedia.org/P63175 and previous config saved to /var/cache/conftool/dbconfig/20240525-110931-marostegui.json
  • 11:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1246.eqiad.wmnet with reason: Maintenance
  • 11:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1246.eqiad.wmnet with reason: Maintenance
  • 10:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 10:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 09:41 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:41 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 09:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 09:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T364299)', diff saved to https://phabricator.wikimedia.org/P63174 and previous config saved to /var/cache/conftool/dbconfig/20240525-093814-marostegui.json
  • 09:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P63173 and previous config saved to /var/cache/conftool/dbconfig/20240525-092306-marostegui.json
  • 09:10 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:10 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P63172 and previous config saved to /var/cache/conftool/dbconfig/20240525-090758-marostegui.json
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T364299)', diff saved to https://phabricator.wikimedia.org/P63171 and previous config saved to /var/cache/conftool/dbconfig/20240525-085250-marostegui.json
  • 08:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 08:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 08:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T364069)', diff saved to https://phabricator.wikimedia.org/P63170 and previous config saved to /var/cache/conftool/dbconfig/20240525-082057-marostegui.json
  • 08:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P63169 and previous config saved to /var/cache/conftool/dbconfig/20240525-080549-marostegui.json
  • 07:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P63168 and previous config saved to /var/cache/conftool/dbconfig/20240525-075041-marostegui.json
  • 07:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T364069)', diff saved to https://phabricator.wikimedia.org/P63167 and previous config saved to /var/cache/conftool/dbconfig/20240525-073533-marostegui.json
  • 06:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1233 (T364299)', diff saved to https://phabricator.wikimedia.org/P63166 and previous config saved to /var/cache/conftool/dbconfig/20240525-063712-marostegui.json
  • 06:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1233.eqiad.wmnet with reason: Maintenance
  • 06:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1233.eqiad.wmnet with reason: Maintenance
  • 06:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T364299)', diff saved to https://phabricator.wikimedia.org/P63165 and previous config saved to /var/cache/conftool/dbconfig/20240525-063649-marostegui.json
  • 06:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P63164 and previous config saved to /var/cache/conftool/dbconfig/20240525-062141-marostegui.json
  • 06:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1212 (T364069)', diff saved to https://phabricator.wikimedia.org/P63163 and previous config saved to /var/cache/conftool/dbconfig/20240525-061028-marostegui.json
  • 06:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 06:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 06:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 06:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 06:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T364069)', diff saved to https://phabricator.wikimedia.org/P63162 and previous config saved to /var/cache/conftool/dbconfig/20240525-060947-marostegui.json
  • 06:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P63161 and previous config saved to /var/cache/conftool/dbconfig/20240525-060633-marostegui.json
  • 05:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P63160 and previous config saved to /var/cache/conftool/dbconfig/20240525-055439-marostegui.json
  • 05:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T364299)', diff saved to https://phabricator.wikimedia.org/P63159 and previous config saved to /var/cache/conftool/dbconfig/20240525-055125-marostegui.json
  • 05:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P63158 and previous config saved to /var/cache/conftool/dbconfig/20240525-053931-marostegui.json
  • 05:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T364069)', diff saved to https://phabricator.wikimedia.org/P63157 and previous config saved to /var/cache/conftool/dbconfig/20240525-052423-marostegui.json
  • 04:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1229 (T364299)', diff saved to https://phabricator.wikimedia.org/P63156 and previous config saved to /var/cache/conftool/dbconfig/20240525-044304-marostegui.json
  • 04:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1229.eqiad.wmnet with reason: Maintenance
  • 04:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1229.eqiad.wmnet with reason: Maintenance
  • 03:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 03:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 03:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T364299)', diff saved to https://phabricator.wikimedia.org/P63155 and previous config saved to /var/cache/conftool/dbconfig/20240525-030316-marostegui.json
  • 02:57 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1198 (T364069)', diff saved to https://phabricator.wikimedia.org/P63154 and previous config saved to /var/cache/conftool/dbconfig/20240525-025742-marostegui.json
  • 02:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 02:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 02:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T364069)', diff saved to https://phabricator.wikimedia.org/P63153 and previous config saved to /var/cache/conftool/dbconfig/20240525-025719-marostegui.json
  • 02:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P63152 and previous config saved to /var/cache/conftool/dbconfig/20240525-024808-marostegui.json
  • 02:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P63151 and previous config saved to /var/cache/conftool/dbconfig/20240525-024211-marostegui.json
  • 02:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P63150 and previous config saved to /var/cache/conftool/dbconfig/20240525-023300-marostegui.json
  • 02:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P63149 and previous config saved to /var/cache/conftool/dbconfig/20240525-022703-marostegui.json
  • 02:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T364299)', diff saved to https://phabricator.wikimedia.org/P63148 and previous config saved to /var/cache/conftool/dbconfig/20240525-021752-marostegui.json
  • 02:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T364069)', diff saved to https://phabricator.wikimedia.org/P63147 and previous config saved to /var/cache/conftool/dbconfig/20240525-021154-marostegui.json
  • 01:38 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:38 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1197 (T364299)', diff saved to https://phabricator.wikimedia.org/P63146 and previous config saved to /var/cache/conftool/dbconfig/20240525-011423-marostegui.json
  • 01:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 01:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 01:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T364299)', diff saved to https://phabricator.wikimedia.org/P63145 and previous config saved to /var/cache/conftool/dbconfig/20240525-011359-marostegui.json
  • 00:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P63144 and previous config saved to /var/cache/conftool/dbconfig/20240525-005851-marostegui.json
  • 00:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P63143 and previous config saved to /var/cache/conftool/dbconfig/20240525-004343-marostegui.json
  • 00:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T364299)', diff saved to https://phabricator.wikimedia.org/P63142 and previous config saved to /var/cache/conftool/dbconfig/20240525-002835-marostegui.json

2024-05-24

  • 23:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1189 (T364069)', diff saved to https://phabricator.wikimedia.org/P63141 and previous config saved to /var/cache/conftool/dbconfig/20240524-234433-marostegui.json
  • 23:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 23:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 23:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T364069)', diff saved to https://phabricator.wikimedia.org/P63140 and previous config saved to /var/cache/conftool/dbconfig/20240524-234410-marostegui.json
  • 23:31 ebernhardson@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:31 ebernhardson@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P63139 and previous config saved to /var/cache/conftool/dbconfig/20240524-232902-marostegui.json
  • 23:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1188 (T364299)', diff saved to https://phabricator.wikimedia.org/P63138 and previous config saved to /var/cache/conftool/dbconfig/20240524-232508-marostegui.json
  • 23:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 23:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 23:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T364299)', diff saved to https://phabricator.wikimedia.org/P63137 and previous config saved to /var/cache/conftool/dbconfig/20240524-232445-marostegui.json
  • 23:16 ebernhardson@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:16 ebernhardson@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P63136 and previous config saved to /var/cache/conftool/dbconfig/20240524-231354-marostegui.json
  • 23:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P63135 and previous config saved to /var/cache/conftool/dbconfig/20240524-230937-marostegui.json
  • 22:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T364069)', diff saved to https://phabricator.wikimedia.org/P63134 and previous config saved to /var/cache/conftool/dbconfig/20240524-225846-marostegui.json
  • 22:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P63133 and previous config saved to /var/cache/conftool/dbconfig/20240524-225428-marostegui.json
  • 22:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T364299)', diff saved to https://phabricator.wikimedia.org/P63132 and previous config saved to /var/cache/conftool/dbconfig/20240524-223921-marostegui.json
  • 22:24 ebernhardson@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:24 ebernhardson@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:37 eileen: tools upgraded from 36840b71 to 8c98b674
  • 21:08 cdanis@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 21:07 cdanis@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 21:04 cdanis@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 21:03 cdanis@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:54 cdanis@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:53 cdanis@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:53 cdanis@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:52 cdanis@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:47 cdanis@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:47 cdanis@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1182 (T364299)', diff saved to https://phabricator.wikimedia.org/P63131 and previous config saved to /var/cache/conftool/dbconfig/20240524-203243-marostegui.json
  • 20:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 20:32 cdanis@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 20:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T364299)', diff saved to https://phabricator.wikimedia.org/P63130 and previous config saved to /var/cache/conftool/dbconfig/20240524-203219-marostegui.json
  • 20:31 cdanis@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1175 (T364069)', diff saved to https://phabricator.wikimedia.org/P63129 and previous config saved to /var/cache/conftool/dbconfig/20240524-203037-marostegui.json
  • 20:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 20:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 20:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T364069)', diff saved to https://phabricator.wikimedia.org/P63128 and previous config saved to /var/cache/conftool/dbconfig/20240524-203014-marostegui.json
  • 20:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P63127 and previous config saved to /var/cache/conftool/dbconfig/20240524-201711-marostegui.json
  • 20:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P63126 and previous config saved to /var/cache/conftool/dbconfig/20240524-201506-marostegui.json
  • 20:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P63125 and previous config saved to /var/cache/conftool/dbconfig/20240524-200203-marostegui.json
  • 19:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P63124 and previous config saved to /var/cache/conftool/dbconfig/20240524-195958-marostegui.json
  • 19:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T364299)', diff saved to https://phabricator.wikimedia.org/P63123 and previous config saved to /var/cache/conftool/dbconfig/20240524-194655-marostegui.json
  • 19:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T364069)', diff saved to https://phabricator.wikimedia.org/P63122 and previous config saved to /var/cache/conftool/dbconfig/20240524-194450-marostegui.json
  • 19:32 tchin@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/datasets-config: apply
  • 19:32 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config: apply
  • 19:26 tchin@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/datasets-config-next: apply
  • 19:26 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config-next: apply
  • 19:24 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 19:19 cdanis@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 18:45 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 18:45 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 18:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1162 (T364299)', diff saved to https://phabricator.wikimedia.org/P63121 and previous config saved to /var/cache/conftool/dbconfig/20240524-184009-marostegui.json
  • 18:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 18:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 18:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T364299)', diff saved to https://phabricator.wikimedia.org/P63120 and previous config saved to /var/cache/conftool/dbconfig/20240524-183945-marostegui.json
  • 18:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P63118 and previous config saved to /var/cache/conftool/dbconfig/20240524-182437-marostegui.json
  • 18:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P63117 and previous config saved to /var/cache/conftool/dbconfig/20240524-180929-marostegui.json
  • 17:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T364299)', diff saved to https://phabricator.wikimedia.org/P63116 and previous config saved to /var/cache/conftool/dbconfig/20240524-175421-marostegui.json
  • 17:50 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 17:45 cdanis@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 17:45 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 17:44 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 17:41 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 17:34 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 17:30 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 17:30 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 17:18 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1166 (T364069)', diff saved to https://phabricator.wikimedia.org/P63115 and previous config saved to /var/cache/conftool/dbconfig/20240524-171833-marostegui.json
  • 17:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 17:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 17:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T364069)', diff saved to https://phabricator.wikimedia.org/P63114 and previous config saved to /var/cache/conftool/dbconfig/20240524-171809-marostegui.json
  • 17:10 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 17:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P63113 and previous config saved to /var/cache/conftool/dbconfig/20240524-170301-marostegui.json
  • 16:58 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 16:57 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 16:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P63112 and previous config saved to /var/cache/conftool/dbconfig/20240524-164753-marostegui.json
  • 16:36 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 16:35 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 16:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T364069)', diff saved to https://phabricator.wikimedia.org/P63111 and previous config saved to /var/cache/conftool/dbconfig/20240524-163245-marostegui.json
  • 15:51 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 15:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1156 (T364299)', diff saved to https://phabricator.wikimedia.org/P63109 and previous config saved to /var/cache/conftool/dbconfig/20240524-154108-marostegui.json
  • 15:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 15:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 15:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 15:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 15:24 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:24 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:18 Lucas_WMDE: FINISHED lucaswerkmeister-wmde@mwmaint1002:~$ time mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki enwiki --current --all --start '["76882704"]' 2>&1 | tee -a ~/T315510-enwiki-6; date # a few minutes ago
  • 14:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1157 (T364069)', diff saved to https://phabricator.wikimedia.org/P63107 and previous config saved to /var/cache/conftool/dbconfig/20240524-145912-marostegui.json
  • 14:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 14:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 14:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 14:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 14:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 14:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 14:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T364299)', diff saved to https://phabricator.wikimedia.org/P63106 and previous config saved to /var/cache/conftool/dbconfig/20240524-145139-marostegui.json
  • 14:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P63105 and previous config saved to /var/cache/conftool/dbconfig/20240524-143630-marostegui.json
  • 14:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P63104 and previous config saved to /var/cache/conftool/dbconfig/20240524-142122-marostegui.json
  • 14:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T364299)', diff saved to https://phabricator.wikimedia.org/P63103 and previous config saved to /var/cache/conftool/dbconfig/20240524-140614-marostegui.json
  • 14:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 100%: post clone (dst) repool', diff saved to https://phabricator.wikimedia.org/P63102 and previous config saved to /var/cache/conftool/dbconfig/20240524-140258-arnaudb.json
  • 13:59 hashar@deploy1002: Finished deploy [gerrit/gerrit@af1257f]: wm-pcc: add a run action - T363918 (duration: 00m 07s)
  • 13:59 hashar@deploy1002: Started deploy [gerrit/gerrit@af1257f]: wm-pcc: add a run action - T363918
  • 13:57 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint1002:~$ time mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki enwiki --current --all --start '["76882704"]' 2>&1 | tee -a ~/T315510-enwiki-6; date
  • 13:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 75%: post clone (dst) repool', diff saved to https://phabricator.wikimedia.org/P63101 and previous config saved to /var/cache/conftool/dbconfig/20240524-134752-arnaudb.json
  • 13:36 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:36 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:36 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts snapshot1008.eqiad.wmnet
  • 13:36 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:36 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: snapshot1008.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
  • 13:34 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: snapshot1008.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
  • 13:32 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 50%: post clone (dst) repool', diff saved to https://phabricator.wikimedia.org/P63100 and previous config saved to /var/cache/conftool/dbconfig/20240524-133245-arnaudb.json
  • 13:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1224 (T364299)', diff saved to https://phabricator.wikimedia.org/P63099 and previous config saved to /var/cache/conftool/dbconfig/20240524-132514-marostegui.json
  • 13:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1224.eqiad.wmnet with reason: Maintenance
  • 13:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1224.eqiad.wmnet with reason: Maintenance
  • 13:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T364299)', diff saved to https://phabricator.wikimedia.org/P63098 and previous config saved to /var/cache/conftool/dbconfig/20240524-132450-marostegui.json
  • 13:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 25%: post clone (dst) repool', diff saved to https://phabricator.wikimedia.org/P63097 and previous config saved to /var/cache/conftool/dbconfig/20240524-131739-arnaudb.json
  • 13:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P63096 and previous config saved to /var/cache/conftool/dbconfig/20240524-130942-marostegui.json
  • 13:05 btullis@cumin1002: START - Cookbook sre.dns.netbox
  • 13:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 10%: post clone (dst) repool', diff saved to https://phabricator.wikimedia.org/P63095 and previous config saved to /var/cache/conftool/dbconfig/20240524-130233-arnaudb.json
  • 13:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 100%: post clone (src) repool', diff saved to https://phabricator.wikimedia.org/P63094 and previous config saved to /var/cache/conftool/dbconfig/20240524-130217-arnaudb.json
  • 12:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P63093 and previous config saved to /var/cache/conftool/dbconfig/20240524-125433-marostegui.json
  • 12:53 btullis@cumin1002: START - Cookbook sre.hosts.decommission for hosts snapshot1008.eqiad.wmnet
  • 12:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 5%: post clone (dst) repool', diff saved to https://phabricator.wikimedia.org/P63092 and previous config saved to /var/cache/conftool/dbconfig/20240524-124727-arnaudb.json
  • 12:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 75%: post clone (src) repool', diff saved to https://phabricator.wikimedia.org/P63091 and previous config saved to /var/cache/conftool/dbconfig/20240524-124711-arnaudb.json
  • 12:37 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
  • 12:25 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
  • 12:24 btullis@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 12:23 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
  • 12:23 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
  • 12:23 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
  • 12:22 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
  • 12:20 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 12:20 btullis@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 12:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 1%: post clone (dst) repool', diff saved to https://phabricator.wikimedia.org/P63087 and previous config saved to /var/cache/conftool/dbconfig/20240524-121715-arnaudb.json
  • 12:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 25%: post clone (src) repool', diff saved to https://phabricator.wikimedia.org/P63086 and previous config saved to /var/cache/conftool/dbconfig/20240524-121659-arnaudb.json
  • 12:16 arnaudb@cumin1002: dbctl commit (dc=all): 'fix wrong weight', diff saved to https://phabricator.wikimedia.org/P63085 and previous config saved to /var/cache/conftool/dbconfig/20240524-121641-arnaudb.json
  • 12:16 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
  • 12:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 12:16 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 12:15 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
  • 12:15 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
  • 12:15 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 25%: post clone (src) repool', diff saved to https://phabricator.wikimedia.org/P63084 and previous config saved to /var/cache/conftool/dbconfig/20240524-121523-arnaudb.json
  • 12:14 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
  • 12:08 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2116.codfw.wmnet onto db2176.codfw.wmnet
  • 11:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1201 (T364299)', diff saved to https://phabricator.wikimedia.org/P63083 and previous config saved to /var/cache/conftool/dbconfig/20240524-115351-marostegui.json
  • 11:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 11:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 11:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T364299)', diff saved to https://phabricator.wikimedia.org/P63082 and previous config saved to /var/cache/conftool/dbconfig/20240524-115328-marostegui.json
  • 11:44 akosiaris: manually delete the 1 sessionstore pod running on parse1004
  • 11:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P63081 and previous config saved to /var/cache/conftool/dbconfig/20240524-113820-marostegui.json
  • 11:24 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 11:24 btullis@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 11:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P63080 and previous config saved to /var/cache/conftool/dbconfig/20240524-112310-marostegui.json
  • 11:22 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
  • 11:22 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
  • 11:21 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
  • 11:21 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/media-analytics: apply
  • 11:21 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
  • 11:20 btullis@deploy1002: helmfile [staging] START helmfile.d/services/media-analytics: apply
  • 11:19 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
  • 11:18 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
  • 11:18 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
  • 11:17 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/page-analytics: apply
  • 11:15 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
  • 11:15 btullis@deploy1002: helmfile [staging] START helmfile.d/services/page-analytics: apply
  • 11:10 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
  • 11:10 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
  • 11:10 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
  • 11:09 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
  • 11:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T364299)', diff saved to https://phabricator.wikimedia.org/P63079 and previous config saved to /var/cache/conftool/dbconfig/20240524-110802-marostegui.json
  • 11:07 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
  • 11:07 btullis@deploy1002: helmfile [staging] START helmfile.d/services/geo-analytics: apply
  • 11:06 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 10:56 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2150.codfw.wmnet with reason: reimage
  • 10:55 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db2150.codfw.wmnet with reason: reimage
  • 10:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db2150, hardware issues ', diff saved to https://phabricator.wikimedia.org/P63078 and previous config saved to /var/cache/conftool/dbconfig/20240524-104953-arnaudb.json
  • 10:27 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2116.codfw.wmnet onto db2176.codfw.wmnet
  • 10:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db2116 to clone on db2176 T365793', diff saved to https://phabricator.wikimedia.org/P63077 and previous config saved to /var/cache/conftool/dbconfig/20240524-102424-arnaudb.json
  • 10:23 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1187 (T364299)', diff saved to https://phabricator.wikimedia.org/P63076 and previous config saved to /var/cache/conftool/dbconfig/20240524-102340-marostegui.json
  • 10:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 10:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 10:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T364299)', diff saved to https://phabricator.wikimedia.org/P63075 and previous config saved to /var/cache/conftool/dbconfig/20240524-102315-marostegui.json
  • 10:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P63074 and previous config saved to /var/cache/conftool/dbconfig/20240524-100807-marostegui.json
  • 09:54 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2176.codfw.wmnet with reason: Host has issues
  • 09:54 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db2176.codfw.wmnet with reason: Host has issues
  • 09:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P63073 and previous config saved to /var/cache/conftool/dbconfig/20240524-095259-marostegui.json
  • 09:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db2176', diff saved to https://phabricator.wikimedia.org/P63072 and previous config saved to /var/cache/conftool/dbconfig/20240524-094703-arnaudb.json
  • 09:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T364299)', diff saved to https://phabricator.wikimedia.org/P63071 and previous config saved to /var/cache/conftool/dbconfig/20240524-093751-marostegui.json
  • 09:25 hashar@deploy1002: Finished deploy [gerrit/gerrit@159288a]: Allow users to recheck tests in checkers - T363918 (duration: 00m 07s)
  • 09:25 hashar@deploy1002: Started deploy [gerrit/gerrit@159288a]: Allow users to recheck tests in checkers - T363918
  • 08:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1180 (T364299)', diff saved to https://phabricator.wikimedia.org/P63070 and previous config saved to /var/cache/conftool/dbconfig/20240524-085423-marostegui.json
  • 08:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 08:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 08:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T364299)', diff saved to https://phabricator.wikimedia.org/P63069 and previous config saved to /var/cache/conftool/dbconfig/20240524-085400-marostegui.json
  • 08:41 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 08:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P63068 and previous config saved to /var/cache/conftool/dbconfig/20240524-083851-marostegui.json
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P63067 and previous config saved to /var/cache/conftool/dbconfig/20240524-082343-marostegui.json
  • 08:10 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:10 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T364299)', diff saved to https://phabricator.wikimedia.org/P63066 and previous config saved to /var/cache/conftool/dbconfig/20240524-080835-marostegui.json
  • 07:40 dcausse@deploy1002: Finished deploy [airflow-dags/search@8f0b4a1]: search: fix import_ttl dag (duration: 00m 19s)
  • 07:40 dcausse@deploy1002: Started deploy [airflow-dags/search@8f0b4a1]: search: fix import_ttl dag
  • 07:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1173 (T364299)', diff saved to https://phabricator.wikimedia.org/P63065 and previous config saved to /var/cache/conftool/dbconfig/20240524-072639-marostegui.json
  • 07:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 07:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 07:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T364299)', diff saved to https://phabricator.wikimedia.org/P63064 and previous config saved to /var/cache/conftool/dbconfig/20240524-072616-marostegui.json
  • 07:13 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:13 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P63063 and previous config saved to /var/cache/conftool/dbconfig/20240524-071108-marostegui.json
  • 06:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P63062 and previous config saved to /var/cache/conftool/dbconfig/20240524-065600-marostegui.json
  • 06:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T364299)', diff saved to https://phabricator.wikimedia.org/P63061 and previous config saved to /var/cache/conftool/dbconfig/20240524-064053-marostegui.json
  • 06:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 100%: After reimage', diff saved to https://phabricator.wikimedia.org/P63060 and previous config saved to /var/cache/conftool/dbconfig/20240524-061812-root.json
  • 06:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 75%: After reimage', diff saved to https://phabricator.wikimedia.org/P63059 and previous config saved to /var/cache/conftool/dbconfig/20240524-060305-root.json
  • 05:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1168 (T364299)', diff saved to https://phabricator.wikimedia.org/P63058 and previous config saved to /var/cache/conftool/dbconfig/20240524-055616-marostegui.json
  • 05:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 05:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 05:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T364299)', diff saved to https://phabricator.wikimedia.org/P63057 and previous config saved to /var/cache/conftool/dbconfig/20240524-055553-marostegui.json
  • 05:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 50%: After reimage', diff saved to https://phabricator.wikimedia.org/P63056 and previous config saved to /var/cache/conftool/dbconfig/20240524-054759-root.json
  • 05:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P63055 and previous config saved to /var/cache/conftool/dbconfig/20240524-054045-marostegui.json
  • 05:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 25%: After reimage', diff saved to https://phabricator.wikimedia.org/P63054 and previous config saved to /var/cache/conftool/dbconfig/20240524-053250-root.json
  • 05:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P63053 and previous config saved to /var/cache/conftool/dbconfig/20240524-052537-marostegui.json
  • 05:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2122.codfw.wmnet with OS bookworm
  • 05:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 10%: After reimage', diff saved to https://phabricator.wikimedia.org/P63052 and previous config saved to /var/cache/conftool/dbconfig/20240524-051744-root.json
  • 05:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T364299)', diff saved to https://phabricator.wikimedia.org/P63051 and previous config saved to /var/cache/conftool/dbconfig/20240524-051028-marostegui.json
  • 04:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2122.codfw.wmnet with reason: host reimage
  • 04:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2122.codfw.wmnet with reason: host reimage
  • 04:36 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2122.codfw.wmnet with OS bookworm
  • 04:34 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2122', diff saved to https://phabricator.wikimedia.org/P63050 and previous config saved to /var/cache/conftool/dbconfig/20240524-043441-root.json
  • 04:24 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1165 (T364299)', diff saved to https://phabricator.wikimedia.org/P63049 and previous config saved to /var/cache/conftool/dbconfig/20240524-042358-marostegui.json
  • 04:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 04:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 04:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 04:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 02:58 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:58 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T364299)', diff saved to https://phabricator.wikimedia.org/P63048 and previous config saved to /var/cache/conftool/dbconfig/20240524-004342-marostegui.json
  • 00:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P63047 and previous config saved to /var/cache/conftool/dbconfig/20240524-002834-marostegui.json
  • 00:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P63046 and previous config saved to /var/cache/conftool/dbconfig/20240524-001326-marostegui.json

2024-05-23

  • 23:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T364299)', diff saved to https://phabricator.wikimedia.org/P63045 and previous config saved to /var/cache/conftool/dbconfig/20240523-235817-marostegui.json
  • 23:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1238 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P63044 and previous config saved to /var/cache/conftool/dbconfig/20240523-233017-ladsgroup.json
  • 23:24 zabe@deploy1002: Finished scap: Backport for Deploy configuration for wrapping B type passwords with encrypted Argon2 (T112359) (duration: 16m 00s)
  • 23:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1238 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P63043 and previous config saved to /var/cache/conftool/dbconfig/20240523-231511-ladsgroup.json
  • 23:13 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2217 (T364299)', diff saved to https://phabricator.wikimedia.org/P63042 and previous config saved to /var/cache/conftool/dbconfig/20240523-231302-marostegui.json
  • 23:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2217.codfw.wmnet with reason: Maintenance
  • 23:12 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2217.codfw.wmnet with reason: Maintenance
  • 23:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214 (T364299)', diff saved to https://phabricator.wikimedia.org/P63041 and previous config saved to /var/cache/conftool/dbconfig/20240523-231238-marostegui.json
  • 23:11 zabe@deploy1002: zabe: Continuing with sync
  • 23:10 zabe@deploy1002: zabe: Backport for Deploy configuration for wrapping B type passwords with encrypted Argon2 (T112359) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 23:08 zabe@deploy1002: Started scap: Backport for Deploy configuration for wrapping B type passwords with encrypted Argon2 (T112359)
  • 23:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1238 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P63040 and previous config saved to /var/cache/conftool/dbconfig/20240523-230005-ladsgroup.json
  • 22:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214', diff saved to https://phabricator.wikimedia.org/P63039 and previous config saved to /var/cache/conftool/dbconfig/20240523-225730-marostegui.json
  • 22:54 eileen: tools upgraded from bce5f52b to 91893e29
  • 22:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1238 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P63038 and previous config saved to /var/cache/conftool/dbconfig/20240523-224459-ladsgroup.json
  • 22:43 zabe@deploy1002: Finished scap: Backport for Stop writing to af_user(_text)/afh_user(_text) in group0 wikis (T337920) (duration: 18m 39s)
  • 22:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214', diff saved to https://phabricator.wikimedia.org/P63037 and previous config saved to /var/cache/conftool/dbconfig/20240523-224222-marostegui.json
  • 22:30 zabe@deploy1002: zabe: Continuing with sync
  • 22:27 zabe@deploy1002: zabe: Backport for Stop writing to af_user(_text)/afh_user(_text) in group0 wikis (T337920) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214 (T364299)', diff saved to https://phabricator.wikimedia.org/P63036 and previous config saved to /var/cache/conftool/dbconfig/20240523-222714-marostegui.json
  • 22:26 eileen: civicrm upgraded from 72aa5118 to 6c1fdd4f
  • 22:24 zabe@deploy1002: Started scap: Backport for Stop writing to af_user(_text)/afh_user(_text) in group0 wikis (T337920)
  • 22:11 eileen: civicrm upgraded from 22a38356 to 72aa5118
  • 21:52 thcipriani@deploy1002: Finished scap: Backport for wikitech: (Un)block GitLab accounts when (un)blocked on wikitech (T316418) (duration: 18m 01s)
  • 21:46 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2214 (T364299)', diff saved to https://phabricator.wikimedia.org/P63035 and previous config saved to /var/cache/conftool/dbconfig/20240523-214614-marostegui.json
  • 21:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2214.codfw.wmnet with reason: Maintenance
  • 21:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2214.codfw.wmnet with reason: Maintenance
  • 21:40 thcipriani@deploy1002: thcipriani and bd808: Continuing with sync
  • 21:37 thcipriani@deploy1002: thcipriani and bd808: Backport for wikitech: (Un)block GitLab accounts when (un)blocked on wikitech (T316418) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:34 thcipriani@deploy1002: Started scap: Backport for wikitech: (Un)block GitLab accounts when (un)blocked on wikitech (T316418)
  • 21:29 eileen: civicrm upgraded from 55cb3cf7 to 22a38356
  • 21:26 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 21:18 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 21:14 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 21:14 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 21:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 21:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 21:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T364299)', diff saved to https://phabricator.wikimedia.org/P63034 and previous config saved to /var/cache/conftool/dbconfig/20240523-211044-marostegui.json
  • 20:59 eileen: civicrm upgraded from de92d6bc to 55cb3cf7
  • 20:55 jsn@deploy1002: Finished scap: Backport for Always use desktop watchlist HTML on mobile (T109277) (duration: 16m 23s)
  • 20:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P63033 and previous config saved to /var/cache/conftool/dbconfig/20240523-205536-marostegui.json
  • 20:44 jsn@deploy1002: jdlrobson and jsn: Continuing with sync
  • 20:42 jsn@deploy1002: jdlrobson and jsn: Backport for Always use desktop watchlist HTML on mobile (T109277) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P63032 and previous config saved to /var/cache/conftool/dbconfig/20240523-204028-marostegui.json
  • 20:39 jsn@deploy1002: Started scap: Backport for Always use desktop watchlist HTML on mobile (T109277)
  • 20:36 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 20:36 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 20:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T364299)', diff saved to https://phabricator.wikimedia.org/P63031 and previous config saved to /var/cache/conftool/dbconfig/20240523-202520-marostegui.json
  • 20:24 jsn@deploy1002: Finished scap: Backport for CommonSettings: Load AutoModerator extension (T361643), InitialiseSettings: testwiki enable AutoModerator (T361643) (duration: 17m 30s)
  • 20:24 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 20:23 cdanis@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 20:21 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 20:20 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 20:12 jsn@deploy1002: jsn: Continuing with sync
  • 20:09 jsn@deploy1002: jsn: Backport for CommonSettings: Load AutoModerator extension (T361643), InitialiseSettings: testwiki enable AutoModerator (T361643) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:08 bd808@deploy1002: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
  • 20:07 bd808@deploy1002: helmfile [eqiad] START helmfile.d/services/toolhub: apply
  • 20:06 jsn@deploy1002: Started scap: Backport for CommonSettings: Load AutoModerator extension (T361643), InitialiseSettings: testwiki enable AutoModerator (T361643)
  • 20:06 bd808@deploy1002: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
  • 20:05 bd808@deploy1002: helmfile [codfw] START helmfile.d/services/toolhub: apply
  • 20:05 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 20:04 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
  • 20:04 bd808@deploy1002: helmfile [staging] START helmfile.d/services/toolhub: apply
  • 19:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2193 (T364299)', diff saved to https://phabricator.wikimedia.org/P63030 and previous config saved to /var/cache/conftool/dbconfig/20240523-194723-marostegui.json
  • 19:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 19:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 19:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T364299)', diff saved to https://phabricator.wikimedia.org/P63029 and previous config saved to /var/cache/conftool/dbconfig/20240523-194659-marostegui.json
  • 19:38 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 19:38 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 19:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P63028 and previous config saved to /var/cache/conftool/dbconfig/20240523-193152-marostegui.json
  • 19:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P63027 and previous config saved to /var/cache/conftool/dbconfig/20240523-191644-marostegui.json
  • 19:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T364299)', diff saved to https://phabricator.wikimedia.org/P63026 and previous config saved to /var/cache/conftool/dbconfig/20240523-190136-marostegui.json
  • 18:55 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 18:55 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 18:55 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 18:54 cdanis@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 18:48 cdanis: T365626 helmfile destroy'd all opentelemetry-collector releases
  • 18:32 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2050.codfw.wmnet with OS bookworm
  • 18:26 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1050.eqiad.wmnet with OS bookworm
  • 18:22 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 18:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 18:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2180 (T364299)', diff saved to https://phabricator.wikimedia.org/P63025 and previous config saved to /var/cache/conftool/dbconfig/20240523-181643-marostegui.json
  • 18:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 18:16 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 18:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T364299)', diff saved to https://phabricator.wikimedia.org/P63024 and previous config saved to /var/cache/conftool/dbconfig/20240523-181630-marostegui.json
  • 18:14 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2050.codfw.wmnet with reason: host reimage
  • 18:11 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2050.codfw.wmnet with reason: host reimage
  • 18:09 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1050.eqiad.wmnet with reason: host reimage
  • 18:06 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1050.eqiad.wmnet with reason: host reimage
  • 18:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P63023 and previous config saved to /var/cache/conftool/dbconfig/20240523-180122-marostegui.json
  • 17:53 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2050.codfw.wmnet with OS bookworm
  • 17:53 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1050.eqiad.wmnet with OS bookworm
  • 17:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P63022 and previous config saved to /var/cache/conftool/dbconfig/20240523-174614-marostegui.json
  • 17:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T364299)', diff saved to https://phabricator.wikimedia.org/P63021 and previous config saved to /var/cache/conftool/dbconfig/20240523-173106-marostegui.json
  • 17:16 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
  • 17:10 arnaudb@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63020 and previous config saved to /var/cache/conftool/dbconfig/20240523-171022-arnaudb.json
  • 17:06 bd808@deploy1002: helmfile [staging] START helmfile.d/services/toolhub: apply
  • 17:05 bd808@deploy1002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
  • 17:04 bd808@deploy1002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
  • 17:04 bd808@deploy1002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
  • 17:03 bd808@deploy1002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
  • 17:03 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
  • 17:03 bd808@deploy1002: helmfile [staging] START helmfile.d/services/developer-portal: apply
  • 16:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63019 and previous config saved to /var/cache/conftool/dbconfig/20240523-165516-arnaudb.json
  • 16:43 dduvall@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply
  • 16:43 dduvall@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply
  • 16:43 dduvall@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: apply
  • 16:42 dduvall@deploy1002: helmfile [codfw] START helmfile.d/services/blubberoid: apply
  • 16:42 dduvall@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: apply
  • 16:42 dduvall@deploy1002: helmfile [eqiad] START helmfile.d/services/blubberoid: apply
  • 16:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63018 and previous config saved to /var/cache/conftool/dbconfig/20240523-164010-arnaudb.json
  • 16:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2169 (T364299)', diff saved to https://phabricator.wikimedia.org/P63017 and previous config saved to /var/cache/conftool/dbconfig/20240523-164002-marostegui.json
  • 16:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 16:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 16:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T364299)', diff saved to https://phabricator.wikimedia.org/P63016 and previous config saved to /var/cache/conftool/dbconfig/20240523-163938-marostegui.json
  • 16:37 dduvall: destroying all blubberoid deployments as part of its decommissioning (T318289)
  • 16:27 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/image-suggestion: apply
  • 16:26 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/image-suggestion: apply
  • 16:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63015 and previous config saved to /var/cache/conftool/dbconfig/20240523-162457-arnaudb.json
  • 16:24 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
  • 16:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P63014 and previous config saved to /var/cache/conftool/dbconfig/20240523-162430-marostegui.json
  • 16:23 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
  • 16:21 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
  • 16:20 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
  • 16:19 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
  • 16:18 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
  • 16:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63013 and previous config saved to /var/cache/conftool/dbconfig/20240523-161755-arnaudb.json
  • 16:17 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
  • 16:16 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
  • 16:15 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
  • 16:15 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
  • 16:14 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
  • 16:13 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
  • 16:13 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
  • 16:12 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
  • 16:09 arnaudb@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63012 and previous config saved to /var/cache/conftool/dbconfig/20240523-160951-arnaudb.json
  • 16:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P63011 and previous config saved to /var/cache/conftool/dbconfig/20240523-160921-marostegui.json
  • 16:08 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 16:08 elukey@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 16:05 topranks: enabling BFD on transit circuit to telxius in magru
  • 16:04 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/image-suggestion: apply
  • 16:04 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 16:04 elukey@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 16:03 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/image-suggestion: apply
  • 16:02 rzl@deploy1002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
  • 16:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63010 and previous config saved to /var/cache/conftool/dbconfig/20240523-160249-arnaudb.json
  • 16:02 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
  • 16:02 rzl@deploy1002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
  • 16:01 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/page-analytics: apply
  • 16:00 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
  • 15:59 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/media-analytics: apply
  • 15:58 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
  • 15:57 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
  • 15:56 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
  • 15:55 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 15:55 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
  • 15:54 arnaudb@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 5%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63009 and previous config saved to /var/cache/conftool/dbconfig/20240523-155444-arnaudb.json
  • 15:54 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
  • 15:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T364299)', diff saved to https://phabricator.wikimedia.org/P63008 and previous config saved to /var/cache/conftool/dbconfig/20240523-155413-marostegui.json
  • 15:53 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
  • 15:51 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
  • 15:50 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
  • 15:47 rzl@deploy1002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
  • 15:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63006 and previous config saved to /var/cache/conftool/dbconfig/20240523-154743-arnaudb.json
  • 15:47 rzl@deploy1002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
  • 15:41 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/image-suggestion: apply
  • 15:41 rzl@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
  • 15:40 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/image-suggestion: apply
  • 15:40 rzl@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
  • 15:39 arnaudb@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 2%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63004 and previous config saved to /var/cache/conftool/dbconfig/20240523-153937-arnaudb.json
  • 15:37 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
  • 15:36 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/page-analytics: apply
  • 15:34 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
  • 15:34 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/media-analytics: apply
  • 15:32 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
  • 15:32 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63003 and previous config saved to /var/cache/conftool/dbconfig/20240523-153237-arnaudb.json
  • 15:32 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/geo-analytics: apply
  • 15:30 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
  • 15:30 jhathaway: moving phabricator outbound email to postfix based mx-out{1001,2001}
  • 15:29 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 15:28 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 15:28 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 15:26 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 15:26 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
  • 15:25 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 15:25 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
  • 15:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63002 and previous config saved to /var/cache/conftool/dbconfig/20240523-152431-arnaudb.json
  • 15:22 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 15:22 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 15:18 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
  • 15:18 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/sessionstore: apply
  • 15:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63001 and previous config saved to /var/cache/conftool/dbconfig/20240523-151731-arnaudb.json
  • 15:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1235.eqiad.wmnet with OS bookworm
  • 15:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 5%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63000 and previous config saved to /var/cache/conftool/dbconfig/20240523-150225-arnaudb.json
  • 14:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2158 (T364299)', diff saved to https://phabricator.wikimedia.org/P62999 and previous config saved to /var/cache/conftool/dbconfig/20240523-145938-marostegui.json
  • 14:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 14:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 14:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 14:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 14:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T364299)', diff saved to https://phabricator.wikimedia.org/P62998 and previous config saved to /var/cache/conftool/dbconfig/20240523-145858-marostegui.json
  • 14:55 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1235.eqiad.wmnet with reason: host reimage
  • 14:52 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1235.eqiad.wmnet with reason: host reimage
  • 14:51 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 14:51 elukey@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 14:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 2%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62997 and previous config saved to /var/cache/conftool/dbconfig/20240523-144719-arnaudb.json
  • 14:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host stat1008.eqiad.wmnet
  • 14:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P62996 and previous config saved to /var/cache/conftool/dbconfig/20240523-144351-marostegui.json
  • 14:39 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1235.eqiad.wmnet with OS bookworm
  • 14:38 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1235.eqiad.wmnet with reason: reimage
  • 14:38 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1235.eqiad.wmnet with reason: reimage
  • 14:37 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1235 T364290', diff saved to https://phabricator.wikimedia.org/P62995 and previous config saved to /var/cache/conftool/dbconfig/20240523-143742-arnaudb.json
  • 14:35 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host stat1008.eqiad.wmnet
  • 14:34 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 14:32 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62994 and previous config saved to /var/cache/conftool/dbconfig/20240523-143213-arnaudb.json
  • 14:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2116.codfw.wmnet with OS bookworm
  • 14:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P62993 and previous config saved to /var/cache/conftool/dbconfig/20240523-142843-marostegui.json
  • 14:26 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 5769
  • 14:25 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 5769
  • 14:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T364299)', diff saved to https://phabricator.wikimedia.org/P62992 and previous config saved to /var/cache/conftool/dbconfig/20240523-141334-marostegui.json
  • 14:06 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2116.codfw.wmnet with reason: host reimage
  • 14:03 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2116.codfw.wmnet with reason: host reimage
  • 13:56 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:49 reedy@deploy1002: Synchronized wmf-config/interwiki-labs.php: (no justification provided) (duration: 16m 30s)
  • 13:46 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2116.codfw.wmnet with OS bookworm
  • 13:41 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2051.codfw.wmnet with OS bookworm
  • 13:36 arnaudb@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db2116.codfw.wmnet with OS bookworm
  • 13:29 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1051.eqiad.wmnet with OS bookworm
  • 13:22 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2051.codfw.wmnet with reason: host reimage
  • 13:19 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2116.codfw.wmnet with reason: host reimage
  • 13:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2151 (T364299)', diff saved to https://phabricator.wikimedia.org/P62991 and previous config saved to /var/cache/conftool/dbconfig/20240523-131734-marostegui.json
  • 13:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 13:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 13:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T364299)', diff saved to https://phabricator.wikimedia.org/P62990 and previous config saved to /var/cache/conftool/dbconfig/20240523-131710-marostegui.json
  • 13:16 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2051.codfw.wmnet with reason: host reimage
  • 13:16 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2116.codfw.wmnet with reason: host reimage
  • 13:13 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1051.eqiad.wmnet with reason: host reimage
  • 13:10 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1051.eqiad.wmnet with reason: host reimage
  • 13:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P62989 and previous config saved to /var/cache/conftool/dbconfig/20240523-130202-marostegui.json
  • 12:59 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2116.codfw.wmnet with OS bookworm
  • 12:58 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2051.codfw.wmnet with OS bookworm
  • 12:57 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1051.eqiad.wmnet with OS bookworm
  • 12:57 vgutierrez: repool upload@esams with IPIP encapsulation enabled - T357257
  • 12:57 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2116.codfw.wmnet with reason: reimage
  • 12:56 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2116.codfw.wmnet with reason: reimage
  • 12:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db2116 T364290', diff saved to https://phabricator.wikimedia.org/P62988 and previous config saved to /var/cache/conftool/dbconfig/20240523-125641-arnaudb.json
  • 12:50 vgutierrez: rolling restart of pybal on lvs3010 and lvs3009 - T357257
  • 12:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62987 and previous config saved to /var/cache/conftool/dbconfig/20240523-124832-arnaudb.json
  • 12:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P62986 and previous config saved to /var/cache/conftool/dbconfig/20240523-124654-marostegui.json
  • 12:21 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2052.codfw.wmnet with OS bookworm
  • 12:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62983 and previous config saved to /var/cache/conftool/dbconfig/20240523-121819-arnaudb.json
  • 12:17 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1052.eqiad.wmnet with OS bookworm
  • 12:04 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2052.codfw.wmnet with reason: host reimage
  • 12:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62982 and previous config saved to /var/cache/conftool/dbconfig/20240523-120313-arnaudb.json
  • 12:01 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1052.eqiad.wmnet with reason: host reimage
  • 12:01 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2052.codfw.wmnet with reason: host reimage
  • 11:56 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1052.eqiad.wmnet with reason: host reimage
  • 11:52 vgutierrez: depool upload@esams before enabling IPIP encapsulation - T357257
  • 11:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62981 and previous config saved to /var/cache/conftool/dbconfig/20240523-114807-arnaudb.json
  • 11:43 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2052.codfw.wmnet with OS bookworm
  • 11:43 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1052.eqiad.wmnet with OS bookworm
  • 11:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2124 (T364299)', diff saved to https://phabricator.wikimedia.org/P62980 and previous config saved to /var/cache/conftool/dbconfig/20240523-114259-marostegui.json
  • 11:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 11:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 11:40 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host stat1008.eqiad.wmnet with OS bullseye
  • 11:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 5%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62979 and previous config saved to /var/cache/conftool/dbconfig/20240523-113301-arnaudb.json
  • 11:27 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62978 and previous config saved to /var/cache/conftool/dbconfig/20240523-112704-root.json
  • 11:24 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2053.codfw.wmnet with OS bookworm
  • 11:18 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1053.eqiad.wmnet with OS bookworm
  • 11:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 2%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62977 and previous config saved to /var/cache/conftool/dbconfig/20240523-111755-arnaudb.json
  • 11:11 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62976 and previous config saved to /var/cache/conftool/dbconfig/20240523-111157-root.json
  • 11:11 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on stat1008.eqiad.wmnet with reason: host reimage
  • 11:08 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on stat1008.eqiad.wmnet with reason: host reimage
  • 11:06 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2053.codfw.wmnet with reason: host reimage
  • 11:02 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1053.eqiad.wmnet with reason: host reimage
  • 11:02 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2053.codfw.wmnet with reason: host reimage
  • 11:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62975 and previous config saved to /var/cache/conftool/dbconfig/20240523-110249-arnaudb.json
  • 10:57 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1053.eqiad.wmnet with reason: host reimage
  • 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2170.codfw.wmnet
  • 10:56 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62974 and previous config saved to /var/cache/conftool/dbconfig/20240523-105651-root.json
  • 10:52 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2130.codfw.wmnet with OS bookworm
  • 10:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 10:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 10:45 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host stat1008.eqiad.wmnet with OS bullseye
  • 10:44 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2053.codfw.wmnet with OS bookworm
  • 10:44 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1053.eqiad.wmnet with OS bookworm
  • 10:42 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2170.codfw.wmnet
  • 10:42 hnowlan@cumin1002: conftool action : set/pooled=no; selector: name=wikikube-worker2001.codfw.wmnet
  • 10:41 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62973 and previous config saved to /var/cache/conftool/dbconfig/20240523-104145-root.json
  • 10:40 btullis@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host stat1008.eqiad.wmnet with OS bullseye
  • 10:39 hnowlan@cumin1002: conftool action : set/pooled=yes:weight=10; selector: name=wikikube-worker2001.codfw.wmnet
  • 10:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2130.codfw.wmnet with reason: host reimage
  • 10:26 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62972 and previous config saved to /var/cache/conftool/dbconfig/20240523-102639-root.json
  • 10:25 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2130.codfw.wmnet with reason: host reimage
  • 10:25 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host stat1008.eqiad.wmnet with OS bullseye
  • 10:11 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62971 and previous config saved to /var/cache/conftool/dbconfig/20240523-101133-root.json
  • 10:08 btullis@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host stat1008.eqiad.wmnet with OS bullseye
  • 10:06 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2130.codfw.wmnet with OS bookworm
  • 10:06 btullis@cumin1002: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) pool device-analytics in eqiad: maintenance
  • 10:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db2130 T364290', diff saved to https://phabricator.wikimedia.org/P62970 and previous config saved to /var/cache/conftool/dbconfig/20240523-100452-arnaudb.json
  • 10:04 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2130.codfw.wmnet with reason: reimage
  • 10:04 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2130.codfw.wmnet with reason: reimage
  • 10:01 btullis@cumin1002: START - Cookbook sre.discovery.service-route pool device-analytics in eqiad: maintenance
  • 09:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 09:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 09:56 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62969 and previous config saved to /var/cache/conftool/dbconfig/20240523-095627-root.json
  • 09:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2146.codfw.wmnet
  • 09:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P62968 and previous config saved to /var/cache/conftool/dbconfig/20240523-095338-marostegui.json
  • 09:50 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
  • 09:49 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
  • 09:47 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host stat1008.eqiad.wmnet with OS bullseye
  • 09:42 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2146.codfw.wmnet
  • 09:42 btullis@cumin1002: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) depool device-analytics in eqiad: maintenance
  • 09:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T364299)', diff saved to https://phabricator.wikimedia.org/P62967 and previous config saved to /var/cache/conftool/dbconfig/20240523-093830-marostegui.json
  • 09:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1226 (T364299)', diff saved to https://phabricator.wikimedia.org/P62966 and previous config saved to /var/cache/conftool/dbconfig/20240523-093720-marostegui.json
  • 09:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1226.eqiad.wmnet with reason: Maintenance
  • 09:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1226.eqiad.wmnet with reason: Maintenance
  • 09:37 btullis@cumin1002: START - Cookbook sre.discovery.service-route depool device-analytics in eqiad: maintenance
  • 09:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 09:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 09:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T364299)', diff saved to https://phabricator.wikimedia.org/P62965 and previous config saved to /var/cache/conftool/dbconfig/20240523-093703-marostegui.json
  • 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2145.codfw.wmnet
  • 09:30 moritzm: installing zeromq3 bugfix updates from Bullseye point release
  • 09:30 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
  • 09:29 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/sessionstore: apply
  • 09:24 btullis@cumin1002: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) pool device-analytics in codfw: maintenance
  • 09:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P62963 and previous config saved to /var/cache/conftool/dbconfig/20240523-092153-marostegui.json
  • 09:19 btullis@cumin1002: START - Cookbook sre.discovery.service-route pool device-analytics in codfw: maintenance
  • 09:18 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2145.codfw.wmnet
  • 09:18 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
  • 09:17 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
  • 09:12 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2054.codfw.wmnet with OS bookworm
  • 09:12 btullis@cumin1002: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) depool device-analytics in codfw: maintenance
  • 09:08 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1054.eqiad.wmnet with OS bookworm
  • 09:07 btullis@cumin1002: START - Cookbook sre.discovery.service-route depool device-analytics in codfw: maintenance
  • 09:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P62962 and previous config saved to /var/cache/conftool/dbconfig/20240523-090645-marostegui.json
  • 09:04 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 09:04 btullis@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 08:54 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2054.codfw.wmnet with reason: host reimage
  • 08:51 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2054.codfw.wmnet with reason: host reimage
  • 08:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T364299)', diff saved to https://phabricator.wikimedia.org/P62961 and previous config saved to /var/cache/conftool/dbconfig/20240523-085137-marostegui.json
  • 08:51 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
  • 08:51 marostegui: Deploy schema change on s4 eqiad old master db1238 dbmaint T356166
  • 08:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1238', diff saved to https://phabricator.wikimedia.org/P62960 and previous config saved to /var/cache/conftool/dbconfig/20240523-085023-root.json
  • 08:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2130.codfw.wmnet
  • 08:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db1238 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62959 and previous config saved to /var/cache/conftool/dbconfig/20240523-084834-arnaudb.json
  • 08:48 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
  • 08:44 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2001.codfw.wmnet with OS bullseye
  • 08:38 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2130.codfw.wmnet
  • 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2116.codfw.wmnet
  • 08:35 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2001.codfw.wmnet
  • 08:34 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1238.eqiad.wmnet with OS bookworm
  • 08:33 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc2054.codfw.wmnet with OS bookworm
  • 08:33 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc1054.eqiad.wmnet with OS bookworm
  • 08:31 aklapper@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.43.0-wmf.6 refs T361400
  • 08:25 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2001.codfw.wmnet with reason: host reimage
  • 08:23 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2001.codfw.wmnet with reason: host reimage
  • 08:20 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2001.codfw.wmnet
  • 08:18 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2116.codfw.wmnet
  • 08:15 klausman@cumin2002: START - Cookbook sre.hosts.reboot-single for host ml-staging2001.codfw.wmnet
  • 08:13 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1238.eqiad.wmnet with reason: host reimage
  • 08:11 klausman@cumin2002: START - Cookbook sre.hosts.reboot-single for host ml-serve2001.codfw.wmnet
  • 08:10 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1238.eqiad.wmnet with reason: host reimage
  • 08:07 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2001.codfw.wmnet with OS bullseye
  • 08:02 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2023 to wikikube-worker2001
  • 08:01 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2001
  • 08:01 ayounsi@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2001
  • 08:01 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:01 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2023 to wikikube-worker2001 - ayounsi@cumin1002"
  • 07:59 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2023 to wikikube-worker2001 - ayounsi@cumin1002"
  • 07:57 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 07:57 ayounsi@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2023 to wikikube-worker2001
  • 07:56 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1238.eqiad.wmnet with OS bookworm
  • 07:55 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1238.eqiad.wmnet with reason: reimage
  • 07:54 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1238.eqiad.wmnet with reason: reimage
  • 07:49 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.rename (exit_code=99) from kubernetes2023 to wikikube-worker2001
  • 07:48 ayounsi@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2023 to wikikube-worker2001
  • 07:42 dcausse@deploy1002: Finished scap: Backport for extension registration: Fix handling of null default values (T365190) (duration: 16m 56s)
  • 07:30 dcausse@deploy1002: dcausse: Continuing with sync
  • 07:28 dcausse@deploy1002: dcausse: Backport for extension registration: Fix handling of null default values (T365190) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:25 dcausse@deploy1002: Started scap: Backport for extension registration: Fix handling of null default values (T365190)
  • 07:20 dcausse@deploy1002: Finished scap: Backport for cirrus: Keep archive writes running through cirrus (duration: 17m 19s)
  • 07:08 dcausse@deploy1002: ebernhardson and dcausse: Continuing with sync
  • 07:06 dcausse@deploy1002: ebernhardson and dcausse: Backport for cirrus: Keep archive writes running through cirrus synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:03 dcausse@deploy1002: Started scap: Backport for cirrus: Keep archive writes running through cirrus
  • 06:45 dcausse@deploy1002: Finished deploy [airflow-dags/search@49369da]: search: automate graph split and n3 dump generation (duration: 00m 19s)
  • 06:45 dcausse@deploy1002: Started deploy [airflow-dags/search@49369da]: search: automate graph split and n3 dump generation
  • 06:40 marostegui@cumin1002: dbctl commit (dc=all): 'db1174 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62954 and previous config saved to /var/cache/conftool/dbconfig/20240523-064027-root.json
  • 06:31 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 06:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1238 T363689', diff saved to https://phabricator.wikimedia.org/P62953 and previous config saved to /var/cache/conftool/dbconfig/20240523-063025-arnaudb.json
  • 06:25 marostegui@cumin1002: dbctl commit (dc=all): 'db1174 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62952 and previous config saved to /var/cache/conftool/dbconfig/20240523-062521-root.json
  • 06:24 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 06:24 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 06:16 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 06:15 arnaudb@cumin1002: dbctl commit (dc=all): 'Promote db1160 to s4 primary and set section read-write T363689', diff saved to https://phabricator.wikimedia.org/P62951 and previous config saved to /var/cache/conftool/dbconfig/20240523-061524-arnaudb.json
  • 06:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Set s4 eqiad as read-only for maintenance - T363689', diff saved to https://phabricator.wikimedia.org/P62950 and previous config saved to /var/cache/conftool/dbconfig/20240523-061408-arnaudb.json
  • 06:13 arnaudb: Starting s4 eqiad failover from db1238 to db1160 - T363689
  • 06:10 marostegui@cumin1002: dbctl commit (dc=all): 'db1174 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62949 and previous config saved to /var/cache/conftool/dbconfig/20240523-061014-root.json
  • 05:57 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62948 and previous config saved to /var/cache/conftool/dbconfig/20240523-055747-root.json
  • 05:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1155.eqiad.wmnet with OS bookworm
  • 05:55 marostegui@cumin1002: dbctl commit (dc=all): 'db1174 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62947 and previous config saved to /var/cache/conftool/dbconfig/20240523-055508-root.json
  • 05:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db1160 with weight 0 T363689', diff saved to https://phabricator.wikimedia.org/P62946 and previous config saved to /var/cache/conftool/dbconfig/20240523-054816-arnaudb.json
  • 05:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 33 hosts with reason: Primary switchover s4 T363689
  • 05:47 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 33 hosts with reason: Primary switchover s4 T363689
  • 05:42 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62945 and previous config saved to /var/cache/conftool/dbconfig/20240523-054240-root.json
  • 05:40 marostegui@cumin1002: dbctl commit (dc=all): 'db1174 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62944 and previous config saved to /var/cache/conftool/dbconfig/20240523-054002-root.json
  • 05:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1155.eqiad.wmnet with reason: host reimage
  • 05:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1155.eqiad.wmnet with reason: host reimage
  • 05:27 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62942 and previous config saved to /var/cache/conftool/dbconfig/20240523-052734-root.json
  • 05:24 marostegui@cumin1002: dbctl commit (dc=all): 'db1174 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62941 and previous config saved to /var/cache/conftool/dbconfig/20240523-052456-root.json
  • 05:17 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1155.eqiad.wmnet with OS bookworm
  • 05:12 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62940 and previous config saved to /var/cache/conftool/dbconfig/20240523-051228-root.json
  • 05:09 marostegui@cumin1002: dbctl commit (dc=all): 'db1174 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62939 and previous config saved to /var/cache/conftool/dbconfig/20240523-050950-root.json
  • 05:08 marostegui: Install 10.6.18 on db1174 T365338
  • 05:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1174', diff saved to https://phabricator.wikimedia.org/P62938 and previous config saved to /var/cache/conftool/dbconfig/20240523-050626-root.json
  • 04:57 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62937 and previous config saved to /var/cache/conftool/dbconfig/20240523-045722-root.json
  • 03:16 eileen: civicrm upgraded from 50211434 to 252eed3c
  • 02:56 eileen: config revision changed from f8af8188 to d8905b73
  • 02:54 eileen: config revision changed from b9fbe283 to f8af8188
  • 02:53 eileen: tools upgraded from ad48f63e to bce5f52b
  • 01:50 eileen: civicrm upgraded from 5cb7c467 to 50211434
  • 01:26 eileen: civicrm upgraded from 172feea2 to 5cb7c467
  • 00:29 ejegg: fundraising civicrm upgraded from 84c36324 to 172feea2

2024-05-22

  • 23:49 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1214 (T364299)', diff saved to https://phabricator.wikimedia.org/P62936 and previous config saved to /var/cache/conftool/dbconfig/20240522-234937-marostegui.json
  • 23:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1214.eqiad.wmnet with reason: Maintenance
  • 23:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1214.eqiad.wmnet with reason: Maintenance
  • 23:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T364299)', diff saved to https://phabricator.wikimedia.org/P62935 and previous config saved to /var/cache/conftool/dbconfig/20240522-234914-marostegui.json
  • 23:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P62934 and previous config saved to /var/cache/conftool/dbconfig/20240522-233406-marostegui.json
  • 23:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P62933 and previous config saved to /var/cache/conftool/dbconfig/20240522-231858-marostegui.json
  • 23:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T364299)', diff saved to https://phabricator.wikimedia.org/P62932 and previous config saved to /var/cache/conftool/dbconfig/20240522-230350-marostegui.json
  • 22:24 ryankemper@cumin2002: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop test cluster: Restart of jvm daemons.
  • 21:57 ryankemper@cumin2002: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop test cluster: Restart of jvm daemons.
  • 21:56 eileen: civicrm upgraded from b0a3965a to 84c36324
  • 21:54 ryankemper: T363973 Finished manual rolling restart of hadoop masters `an-master100[3,4].eqiad.wmnet`
  • 21:47 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:47 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:50 jdrewniak@deploy1002: Finished scap: Backport for Revert "Add exclusion behaviour for "width" option in Appearance menu" (T364015), Small font size is not applying to excluded pages (T364887 T365408) (duration: 16m 46s)
  • 20:44 ejegg: payments-wiki upgraded from 5b86bd09 to d871e439
  • 20:39 Lucas_WMDE: STOPPED lucaswerkmeister-wmde@mwmaint1002:~$ time mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki enwiki --current --all --start '["76318767"]' 2>&1 | tee -a ~/T315510-enwiki-5; date # ca. 1 hour and 20 minutes ago, after running for a bit over 6 days; some errors
  • 20:37 jdrewniak@deploy1002: jdrewniak and jdlrobson: Continuing with sync
  • 20:36 jdrewniak@deploy1002: jdrewniak and jdlrobson: Backport for Revert "Add exclusion behaviour for "width" option in Appearance menu" (T364015), Small font size is not applying to excluded pages (T364887 T365408) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:33 jdrewniak@deploy1002: Started scap: Backport for Revert "Add exclusion behaviour for "width" option in Appearance menu" (T364015), Small font size is not applying to excluded pages (T364887 T365408)
  • 18:33 cmooney@cumin1002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts sretest2002.wikimedia.org
  • 18:33 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:33 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sretest2002.wikimedia.org decommissioned, removing all IPs except the asset tag one - cmooney@cumin1002"
  • 18:32 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sretest2002.wikimedia.org decommissioned, removing all IPs except the asset tag one - cmooney@cumin1002"
  • 18:29 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 18:18 cmooney@cumin1002: START - Cookbook sre.hosts.decommission for hosts sretest2002.wikimedia.org
  • 18:16 cmooney@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host sretest2002.wikimedia.org with OS bookworm
  • 18:05 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2002.wikimedia.org with OS bookworm
  • 17:58 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 17:58 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 17:55 cmooney@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2002.wikimedia.org with OS bookworm
  • 17:39 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62930 and previous config saved to /var/cache/conftool/dbconfig/20240522-173900-arnaudb.json
  • 17:24 topranks: Setting DHCP in codfw row A to 'forward-only' mode to troubleshoot DHCP bug T365204
  • 17:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62929 and previous config saved to /var/cache/conftool/dbconfig/20240522-172354-arnaudb.json
  • 17:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62928 and previous config saved to /var/cache/conftool/dbconfig/20240522-170848-arnaudb.json
  • 17:07 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2002.wikimedia.org with OS bookworm
  • 16:58 ejegg: standalone SmashPig upgraded from a9c5ee43 to edf573bb
  • 16:55 marostegui@cumin1002: dbctl commit (dc=all): 'db1192 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62926 and previous config saved to /var/cache/conftool/dbconfig/20240522-165558-root.json
  • 16:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62925 and previous config saved to /var/cache/conftool/dbconfig/20240522-165340-arnaudb.json
  • 16:40 marostegui@cumin1002: dbctl commit (dc=all): 'db1192 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62924 and previous config saved to /var/cache/conftool/dbconfig/20240522-164052-root.json
  • 16:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62923 and previous config saved to /var/cache/conftool/dbconfig/20240522-163834-arnaudb.json
  • 16:25 marostegui@cumin1002: dbctl commit (dc=all): 'db1192 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62922 and previous config saved to /var/cache/conftool/dbconfig/20240522-162546-root.json
  • 16:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 5%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62921 and previous config saved to /var/cache/conftool/dbconfig/20240522-162327-arnaudb.json
  • 16:19 kamila@deploy1002: helmfile [codfw] DONE helmfile.d/services/recommendation-api: apply
  • 16:19 kamila@deploy1002: helmfile [codfw] START helmfile.d/services/recommendation-api: apply
  • 16:13 James_F: Running `mwscript extensions/WikiLambda/maintenance/migrateZ16K1StringsToZ61s.php --wiki=wikifunctionswiki --implement` on mwmaint1002 for T287153
  • 16:10 marostegui@cumin1002: dbctl commit (dc=all): 'db1192 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62920 and previous config saved to /var/cache/conftool/dbconfig/20240522-161039-root.json
  • 16:08 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
  • 16:08 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/sessionstore: apply
  • 16:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 2%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62919 and previous config saved to /var/cache/conftool/dbconfig/20240522-160821-arnaudb.json
  • 15:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1226.eqiad.wmnet with reason: Long schema change
  • 15:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1226.eqiad.wmnet with reason: Long schema change
  • 15:56 kamila@deploy1002: helmfile [eqiad] DONE helmfile.d/services/recommendation-api: apply
  • 15:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1226', diff saved to https://phabricator.wikimedia.org/P62918 and previous config saved to /var/cache/conftool/dbconfig/20240522-155621-root.json
  • 15:55 kamila@deploy1002: helmfile [eqiad] START helmfile.d/services/recommendation-api: apply
  • 15:55 marostegui@cumin1002: dbctl commit (dc=all): 'db1192 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62917 and previous config saved to /var/cache/conftool/dbconfig/20240522-155533-root.json
  • 15:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62916 and previous config saved to /var/cache/conftool/dbconfig/20240522-155315-arnaudb.json
  • 15:50 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2130.codfw.wmnet with OS bookworm
  • 15:44 elukey: upload to bookworm-wikimedia dragonfly-{dfdaemon,dfget}, calicoctl, calico-cni - T365253
  • 15:42 kamila@deploy1002: helmfile [staging] DONE helmfile.d/services/recommendation-api: apply
  • 15:42 kamila@deploy1002: helmfile [staging] START helmfile.d/services/recommendation-api: apply
  • 15:42 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-staging2001.codfw.wmnet with OS bookworm
  • 15:40 kamila@deploy1002: helmfile [staging] DONE helmfile.d/services/recommendation-api: apply
  • 15:40 kamila@deploy1002: helmfile [staging] START helmfile.d/services/recommendation-api: apply
  • 15:39 kamila@deploy1002: helmfile [staging] DONE helmfile.d/services/recommendation-api: apply
  • 15:39 kamila@deploy1002: helmfile [staging] START helmfile.d/services/recommendation-api: apply
  • 15:34 damilare: civicrm upgraded from 8c5fee40 to b0a3965a
  • 15:32 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
  • 15:32 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/sessionstore: apply
  • 15:27 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2130.codfw.wmnet with reason: host reimage
  • 15:24 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2130.codfw.wmnet with reason: host reimage
  • 15:22 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-staging2001.codfw.wmnet with reason: host reimage
  • 15:19 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-staging2001.codfw.wmnet with reason: host reimage
  • 15:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T352010)', diff saved to https://phabricator.wikimedia.org/P62915 and previous config saved to /var/cache/conftool/dbconfig/20240522-151923-ladsgroup.json
  • 15:16 vgutierrez: repool upload@drmrs with IPIP encapsulation enabled - T357257
  • 15:16 fabfur: enabling puppet on all cp-ulsfo (T365566)
  • 15:16 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts contint1003.eqiad.wmnet
  • 15:16 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:16 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: contint1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - dzahn@cumin1002"
  • 15:14 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: contint1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - dzahn@cumin1002"
  • 15:10 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 15:10 vgutierrez: rolling restart of pybal on lvs6003 and lvs6002 - T357257
  • 15:06 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2130.codfw.wmnet with reason: reimage
  • 15:06 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2130.codfw.wmnet with reason: reimage
  • 15:06 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2130.codfw.wmnet with OS bookworm
  • 15:05 dzahn@cumin1002: START - Cookbook sre.hosts.decommission for hosts contint1003.eqiad.wmnet
  • 15:05 arnaudb@cumin1002: dbctl commit (dc=all): 'T364290 db2130', diff saved to https://phabricator.wikimedia.org/P62914 and previous config saved to /var/cache/conftool/dbconfig/20240522-150516-arnaudb.json
  • 15:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P62913 and previous config saved to /var/cache/conftool/dbconfig/20240522-150415-ladsgroup.json
  • 15:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2213 (T352010)', diff saved to https://phabricator.wikimedia.org/P62912 and previous config saved to /var/cache/conftool/dbconfig/20240522-150333-ladsgroup.json
  • 15:01 jynus: stopping eqiad mediabackups for cleaning up missing files T365607
  • 14:58 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host ml-staging2001.codfw.wmnet with OS bookworm
  • 14:57 hnowlan: running `puppet cert revoke sessionstore.discovery.wmnet` T363996
  • 14:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P62911 and previous config saved to /var/cache/conftool/dbconfig/20240522-144907-ladsgroup.json
  • 14:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P62910 and previous config saved to /var/cache/conftool/dbconfig/20240522-144826-ladsgroup.json
  • 14:43 vgutierrez: depool upload@drmrs before enabling IPIP encapsulation - T357257
  • 14:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T352010)', diff saved to https://phabricator.wikimedia.org/P62909 and previous config saved to /var/cache/conftool/dbconfig/20240522-143359-ladsgroup.json
  • 14:33 jayme: drained, cordoned and pooled=inactive kubernetes2023 and kubernetes2032 for cookbook testing - T350152 T365571
  • 14:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P62908 and previous config saved to /var/cache/conftool/dbconfig/20240522-143318-ladsgroup.json
  • 14:32 arnaudb@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62907 and previous config saved to /var/cache/conftool/dbconfig/20240522-143238-arnaudb.json
  • 14:32 jayme@cumin1002: conftool action : set/pooled=inactive; selector: name=kubernetes20(23|32).codfw.wmnet
  • 14:28 elukey: copy calico, istio-cni, kubernetes-node packages from bullseye-wikimedia to bookworm-wikimedia - T365253
  • 14:28 fabfur: disabling puppet on all cp-ulsfo to apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/1034852 selectively (T365566)
  • 14:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2213 (T352010)', diff saved to https://phabricator.wikimedia.org/P62906 and previous config saved to /var/cache/conftool/dbconfig/20240522-141809-ladsgroup.json
  • 14:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62905 and previous config saved to /var/cache/conftool/dbconfig/20240522-141732-arnaudb.json
  • 14:14 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for PrefixSearch: Make sure $prefix is a string (T365565) (duration: 14m 58s)
  • 14:02 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde: Continuing with sync
  • 14:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62904 and previous config saved to /var/cache/conftool/dbconfig/20240522-140225-arnaudb.json
  • 14:02 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde: Backport for PrefixSearch: Make sure $prefix is a string (T365565) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:59 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for PrefixSearch: Make sure $prefix is a string (T365565)
  • 13:55 moritzm: installing libcaca security updates
  • 13:53 vgutierrez: repool upload@eqiad with IPIP encapsulation enabled - T357257
  • 13:48 moritzm: installing bind9 security updates (client-side tools/libs)
  • 13:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62902 and previous config saved to /var/cache/conftool/dbconfig/20240522-134717-arnaudb.json
  • 13:46 arnaudb@cumin1002: dbctl commit (dc=all): 'db2153 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62901 and previous config saved to /var/cache/conftool/dbconfig/20240522-134646-arnaudb.json
  • 13:39 vgutierrez: rolling restart of pybal on lvs1020 and lvs1018 - T357257
  • 13:36 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for Change $wgUploadNavigationUrl for azwiki (T364674) (duration: 16m 27s)
  • 13:32 arnaudb@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62900 and previous config saved to /var/cache/conftool/dbconfig/20240522-133209-arnaudb.json
  • 13:31 arnaudb@cumin1002: dbctl commit (dc=all): 'db2153 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62899 and previous config saved to /var/cache/conftool/dbconfig/20240522-133140-arnaudb.json
  • 13:27 kormat@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 100%: Repool db1246 T364552', diff saved to https://phabricator.wikimedia.org/P62898 and previous config saved to /var/cache/conftool/dbconfig/20240522-132712-kormat.json
  • 13:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1211 (T364299)', diff saved to https://phabricator.wikimedia.org/P62897 and previous config saved to /var/cache/conftool/dbconfig/20240522-132526-marostegui.json
  • 13:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance
  • 13:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance
  • 13:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T364299)', diff saved to https://phabricator.wikimedia.org/P62896 and previous config saved to /var/cache/conftool/dbconfig/20240522-132501-marostegui.json
  • 13:24 logmsgbot: lucaswerkmeister-wmde@deploy1002 nmw03 and lucaswerkmeister-wmde: Continuing with sync
  • 13:23 logmsgbot: lucaswerkmeister-wmde@deploy1002 nmw03 and lucaswerkmeister-wmde: Backport for Change $wgUploadNavigationUrl for azwiki (T364674) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:20 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for Change $wgUploadNavigationUrl for azwiki (T364674)
  • 13:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 5%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62895 and previous config saved to /var/cache/conftool/dbconfig/20240522-131700-arnaudb.json
  • 13:16 arnaudb@cumin1002: dbctl commit (dc=all): 'db2153 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62894 and previous config saved to /var/cache/conftool/dbconfig/20240522-131634-arnaudb.json
  • 13:12 kormat@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 90%: Repool db1246 T364552', diff saved to https://phabricator.wikimedia.org/P62893 and previous config saved to /var/cache/conftool/dbconfig/20240522-131206-kormat.json
  • 13:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P62892 and previous config saved to /var/cache/conftool/dbconfig/20240522-130954-marostegui.json
  • 13:08 urbanecm@deploy1002: Finished scap: Backport for foundationwiki: Grant autopatrol to the editor group (T365584), Remove forward slashes (T332580 T363815) (duration: 25m 09s)
  • 13:05 fabfur: restarting all benthos instances in A:cp-ulsfo
  • 13:01 arnaudb@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 2%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62891 and previous config saved to /var/cache/conftool/dbconfig/20240522-130154-arnaudb.json
  • 13:01 arnaudb@cumin1002: dbctl commit (dc=all): 'db2153 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62890 and previous config saved to /var/cache/conftool/dbconfig/20240522-130128-arnaudb.json
  • 13:00 vgutierrez: depool upload@eqiad before enabling IPIP encapsulation - T357257
  • 13:00 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 12:59 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 12:57 kormat@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 75%: Repool db1246 T364552', diff saved to https://phabricator.wikimedia.org/P62889 and previous config saved to /var/cache/conftool/dbconfig/20240522-125659-kormat.json
  • 12:55 urbanecm@deploy1002: urbanecm and cyndywikime: Continuing with sync
  • 12:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P62888 and previous config saved to /var/cache/conftool/dbconfig/20240522-125446-marostegui.json
  • 12:50 vgutierrez: repool upload@magru with IPIP encapsulation enabled - T357257
  • 12:46 arnaudb@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62887 and previous config saved to /var/cache/conftool/dbconfig/20240522-124648-arnaudb.json
  • 12:46 arnaudb@cumin1002: dbctl commit (dc=all): 'db2153 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62886 and previous config saved to /var/cache/conftool/dbconfig/20240522-124622-arnaudb.json
  • 12:45 urbanecm@deploy1002: urbanecm and cyndywikime: Backport for foundationwiki: Grant autopatrol to the editor group (T365584), Remove forward slashes (T332580 T363815) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:44 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2145.codfw.wmnet with OS bookworm
  • 12:26 kormat@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 45%: Repool db1246 T364552', diff saved to https://phabricator.wikimedia.org/P62881 and previous config saved to /var/cache/conftool/dbconfig/20240522-122647-kormat.json
  • 12:23 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2145.codfw.wmnet with reason: host reimage
  • 12:20 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2145.codfw.wmnet with reason: host reimage
  • 12:18 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 12:18 vgutierrez: depool upload@magru before enabling IPIP encapsulation - T357257
  • 12:18 daniel@deploy1002: Finished scap: Backport for REST: fix metrics keys (T365111) (duration: 16m 53s)
  • 12:16 arnaudb@cumin1002: dbctl commit (dc=all): 'db2153 (re)pooling @ 2%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62880 and previous config saved to /var/cache/conftool/dbconfig/20240522-121611-arnaudb.json
  • 12:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2213 (T352010)', diff saved to https://phabricator.wikimedia.org/P62879 and previous config saved to /var/cache/conftool/dbconfig/20240522-121245-ladsgroup.json
  • 12:12 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2213.codfw.wmnet with reason: Maintenance
  • 12:12 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2213.codfw.wmnet with reason: Maintenance
  • 12:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T352010)', diff saved to https://phabricator.wikimedia.org/P62878 and previous config saved to /var/cache/conftool/dbconfig/20240522-121222-ladsgroup.json
  • 12:11 kormat@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 30%: Repool db1246 T364552', diff saved to https://phabricator.wikimedia.org/P62877 and previous config saved to /var/cache/conftool/dbconfig/20240522-121139-kormat.json
  • 12:07 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 12:06 daniel@deploy1002: daniel: Continuing with sync
  • 12:04 daniel@deploy1002: daniel: Backport for REST: fix metrics keys (T365111) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:04 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2145.codfw.wmnet with OS bookworm
  • 12:02 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2145.codfw.wmnet with reason: reimage
  • 12:02 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2145.codfw.wmnet with reason: reimage
  • 12:02 arnaudb@cumin1002: dbctl commit (dc=all): 'T364290 db2145', diff saved to https://phabricator.wikimedia.org/P62876 and previous config saved to /var/cache/conftool/dbconfig/20240522-120223-arnaudb.json
  • 12:01 daniel@deploy1002: Started scap: Backport for REST: fix metrics keys (T365111)
  • 12:01 arnaudb@cumin1002: dbctl commit (dc=all): 'db2153 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62875 and previous config saved to /var/cache/conftool/dbconfig/20240522-120105-arnaudb.json
  • 12:00 daniel@deploy1002: Finished scap: Backport for REST: fix metrics keys (T365111) (duration: 17m 25s)
  • 11:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P62874 and previous config saved to /var/cache/conftool/dbconfig/20240522-115714-ladsgroup.json
  • 11:56 kormat@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 15%: Repool db1246 T364552', diff saved to https://phabricator.wikimedia.org/P62873 and previous config saved to /var/cache/conftool/dbconfig/20240522-115633-kormat.json
  • 11:55 kormat@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 100%: repool clone source T364552', diff saved to https://phabricator.wikimedia.org/P62872 and previous config saved to /var/cache/conftool/dbconfig/20240522-115458-kormat.json
  • 11:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62871 and previous config saved to /var/cache/conftool/dbconfig/20240522-115313-arnaudb.json
  • 11:47 daniel@deploy1002: daniel: Continuing with sync
  • 11:45 daniel@deploy1002: daniel: Backport for REST: fix metrics keys (T365111) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:42 daniel@deploy1002: Started scap: Backport for REST: fix metrics keys (T365111)
  • 11:42 mvolz@deploy1002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
  • 11:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P62870 and previous config saved to /var/cache/conftool/dbconfig/20240522-114206-ladsgroup.json
  • 11:41 mvolz@deploy1002: helmfile [eqiad] START helmfile.d/services/citoid: apply
  • 11:39 kormat@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 90%: repool clone source T364552', diff saved to https://phabricator.wikimedia.org/P62869 and previous config saved to /var/cache/conftool/dbconfig/20240522-113952-kormat.json
  • 11:39 mvolz@deploy1002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
  • 11:38 mvolz@deploy1002: helmfile [codfw] START helmfile.d/services/citoid: apply
  • 11:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62868 and previous config saved to /var/cache/conftool/dbconfig/20240522-113807-arnaudb.json
  • 11:29 mvolz@deploy1002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 11:29 mvolz@deploy1002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 11:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T352010)', diff saved to https://phabricator.wikimedia.org/P62867 and previous config saved to /var/cache/conftool/dbconfig/20240522-112658-ladsgroup.json
  • 11:24 kormat@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 75%: repool clone source T364552', diff saved to https://phabricator.wikimedia.org/P62866 and previous config saved to /var/cache/conftool/dbconfig/20240522-112444-kormat.json
  • 11:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62865 and previous config saved to /var/cache/conftool/dbconfig/20240522-112301-arnaudb.json
  • 11:09 kormat@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 60%: repool clone source T364552', diff saved to https://phabricator.wikimedia.org/P62864 and previous config saved to /var/cache/conftool/dbconfig/20240522-110938-kormat.json
  • 11:07 arnaudb@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62863 and previous config saved to /var/cache/conftool/dbconfig/20240522-110754-arnaudb.json
  • 11:02 hnowlan@cumin1002: conftool action : set/pooled=yes:weight=10; selector: name=wikikube-ctrl2003.codfw.wmnet
  • 10:54 kormat@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 45%: repool clone source T364552', diff saved to https://phabricator.wikimedia.org/P62862 and previous config saved to /var/cache/conftool/dbconfig/20240522-105432-kormat.json
  • 10:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2153.codfw.wmnet with OS bookworm
  • 10:52 arnaudb@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62861 and previous config saved to /var/cache/conftool/dbconfig/20240522-105248-arnaudb.json
  • 10:40 hnowlan@cumin1002: conftool action : set/pooled=yes:weight=10; selector: name=wikikube-ctrl2002.codfw.wmnet
  • 10:39 kormat@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 30%: repool clone source T364552', diff saved to https://phabricator.wikimedia.org/P62860 and previous config saved to /var/cache/conftool/dbconfig/20240522-103924-kormat.json
  • 10:37 arnaudb@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 5%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62859 and previous config saved to /var/cache/conftool/dbconfig/20240522-103742-arnaudb.json
  • 10:32 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2153.codfw.wmnet with reason: host reimage
  • 10:27 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2153.codfw.wmnet with reason: host reimage
  • 10:24 kormat@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 15%: repool clone source T364552', diff saved to https://phabricator.wikimedia.org/P62858 and previous config saved to /var/cache/conftool/dbconfig/20240522-102418-kormat.json
  • 10:22 arnaudb@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 2%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62857 and previous config saved to /var/cache/conftool/dbconfig/20240522-102236-arnaudb.json
  • 10:10 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2153.codfw.wmnet with OS bookworm
  • 10:09 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2153.codfw.wmnet with reason: reimage
  • 10:09 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2153.codfw.wmnet with reason: reimage
  • 10:09 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: name=wikikube-ctrl2001.codfw.wmnet
  • 10:08 arnaudb@cumin1002: dbctl commit (dc=all): 'T364290 db2153', diff saved to https://phabricator.wikimedia.org/P62856 and previous config saved to /var/cache/conftool/dbconfig/20240522-100834-arnaudb.json
  • 10:07 arnaudb@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62855 and previous config saved to /var/cache/conftool/dbconfig/20240522-100730-arnaudb.json
  • 10:06 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2173.codfw.wmnet with OS bookworm
  • 10:02 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: Release v0.6.5 update to hostname to bgp group mappings - cmooney@cumin1002 - T353464
  • 10:00 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: Release v0.6.5 update to hostname to bgp group mappings - cmooney@cumin1002 - T353464
  • 09:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62854 and previous config saved to /var/cache/conftool/dbconfig/20240522-095507-arnaudb.json
  • 09:44 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2173.codfw.wmnet with reason: host reimage
  • 09:40 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2173.codfw.wmnet with reason: host reimage
  • 09:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62853 and previous config saved to /var/cache/conftool/dbconfig/20240522-094001-arnaudb.json
  • 09:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62852 and previous config saved to /var/cache/conftool/dbconfig/20240522-092455-arnaudb.json
  • 09:22 hnowlan: running homer to add bgp status for wikikube-ctrl2001
  • 09:21 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2173.codfw.wmnet with OS bookworm
  • 09:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2211 (T352010)', diff saved to https://phabricator.wikimedia.org/P62851 and previous config saved to /var/cache/conftool/dbconfig/20240522-091942-ladsgroup.json
  • 09:19 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
  • 09:19 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
  • 09:14 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
  • 09:14 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
  • 09:14 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
  • 09:14 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
  • 09:14 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
  • 09:14 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
  • 09:14 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
  • 09:14 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
  • 09:14 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
  • 09:13 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
  • 09:13 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
  • 09:13 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
  • 09:12 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 09:12 btullis@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 09:09 arnaudb@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62850 and previous config saved to /var/cache/conftool/dbconfig/20240522-090949-arnaudb.json
  • 09:06 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 09:06 btullis@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 08:54 arnaudb@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62849 and previous config saved to /var/cache/conftool/dbconfig/20240522-085443-arnaudb.json
  • 08:51 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 26162
  • 08:50 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 26162
  • 08:49 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 20121
  • 08:49 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 20121
  • 08:49 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 08:49 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 20121
  • 08:49 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 20121
  • 08:49 btullis@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 08:48 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 20121
  • 08:48 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 20121
  • 08:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.ipmi-password-reset (exit_code=0)
  • 08:46 arnaudb@cumin1002: Updating IPMI password on 1 hosts - arnaudb@cumin1002
  • 08:46 arnaudb@cumin1002: START - Cookbook sre.hosts.ipmi-password-reset
  • 08:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.ipmi-password-reset (exit_code=0)
  • 08:46 arnaudb@cumin1002: Updating IPMI password on 1 hosts - arnaudb@cumin1002
  • 08:45 arnaudb@cumin1002: START - Cookbook sre.hosts.ipmi-password-reset
  • 08:45 arnaudb@cumin1002: END (FAIL) - Cookbook sre.hosts.ipmi-password-reset (exit_code=99)
  • 08:45 arnaudb@cumin1002: START - Cookbook sre.hosts.ipmi-password-reset
  • 08:41 aklapper@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.43.0-wmf.6 refs T361400
  • 08:39 arnaudb@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 5%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62848 and previous config saved to /var/cache/conftool/dbconfig/20240522-083937-arnaudb.json
  • 08:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 2%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62846 and previous config saved to /var/cache/conftool/dbconfig/20240522-082431-arnaudb.json
  • 08:16 hashar@deploy1002: Finished scap: Backport for Fix fatal error due to missing signature on very old comments (T365495) (duration: 16m 27s)
  • 08:13 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2173.codfw.wmnet with reason: reimage
  • 08:13 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2173.codfw.wmnet with reason: reimage
  • 08:11 arnaudb@cumin1002: dbctl commit (dc=all): 'T364290 db2173', diff saved to https://phabricator.wikimedia.org/P62845 and previous config saved to /var/cache/conftool/dbconfig/20240522-081059-arnaudb.json
  • 08:09 arnaudb@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62844 and previous config saved to /var/cache/conftool/dbconfig/20240522-080924-arnaudb.json
  • 08:08 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1249.eqiad.wmnet
  • 08:02 hashar@deploy1002: jforrester and hashar: Continuing with sync
  • 08:02 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1232.eqiad.wmnet with OS bookworm
  • 08:02 hashar@deploy1002: jforrester and hashar: Backport for Fix fatal error due to missing signature on very old comments (T365495) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:00 hashar@deploy1002: Started scap: Backport for Fix fatal error due to missing signature on very old comments (T365495)
  • 07:56 kartik@deploy1002: Finished scap: Backport for SpecialNotifyTranslators: Fix group id in dropdown (T253984) (duration: 22m 42s)
  • 07:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1249 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62843 and previous config saved to /var/cache/conftool/dbconfig/20240522-075142-root.json
  • 07:43 kartik@deploy1002: abi and kartik: Continuing with sync
  • 07:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1154.eqiad.wmnet with OS bookworm
  • 07:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1232.eqiad.wmnet with reason: host reimage
  • 07:39 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1232.eqiad.wmnet with reason: host reimage
  • 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1249 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62842 and previous config saved to /var/cache/conftool/dbconfig/20240522-073636-root.json
  • 07:36 kartik@deploy1002: abi and kartik: Backport for SpecialNotifyTranslators: Fix group id in dropdown (T253984) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:33 kartik@deploy1002: Started scap: Backport for SpecialNotifyTranslators: Fix group id in dropdown (T253984)
  • 07:33 ayounsi@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
  • 07:33 ayounsi@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
  • 07:32 moritzm: installing postgresql-11 security updates
  • 07:30 kartik@deploy1002: Finished scap: Backport for Disable Section Translation on simplewiki (T361597) (duration: 19m 47s)
  • 07:26 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1232.eqiad.wmnet with OS bookworm
  • 07:25 ayounsi@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
  • 07:25 ayounsi@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
  • 07:24 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1232.eqiad.wmnet with reason: reimage
  • 07:24 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1232.eqiad.wmnet with reason: reimage
  • 07:23 moritzm: installing nodejs security updates
  • 07:23 arnaudb@cumin1002: dbctl commit (dc=all): 'T364290 db1232', diff saved to https://phabricator.wikimedia.org/P62841 and previous config saved to /var/cache/conftool/dbconfig/20240522-072307-arnaudb.json
  • 07:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1249 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62840 and previous config saved to /var/cache/conftool/dbconfig/20240522-072130-root.json
  • 07:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1154.eqiad.wmnet with reason: host reimage
  • 07:17 kartik@deploy1002: kartik: Continuing with sync
  • 07:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1154.eqiad.wmnet with reason: host reimage
  • 07:13 kartik@deploy1002: kartik: Backport for Disable Section Translation on simplewiki (T361597) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:10 kartik@deploy1002: Started scap: Backport for Disable Section Translation on simplewiki (T361597)
  • 07:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1249 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62839 and previous config saved to /var/cache/conftool/dbconfig/20240522-070624-root.json
  • 07:03 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1154.eqiad.wmnet with OS bookworm
  • 07:02 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1249.eqiad.wmnet
  • 07:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1248.eqiad.wmnet
  • 06:58 marostegui: Reimage db1154 (sanitarium) there will be lag in s1, s3, s5 and s8 in wiki replicas
  • 06:53 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2201.codfw.wmnet with reason: Maintenance
  • 06:53 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2201.codfw.wmnet with reason: Maintenance
  • 06:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T352010)', diff saved to https://phabricator.wikimedia.org/P62838 and previous config saved to /var/cache/conftool/dbconfig/20240522-065340-ladsgroup.json
  • 06:53 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1248.eqiad.wmnet
  • 06:52 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1247.eqiad.wmnet
  • 06:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1249 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62837 and previous config saved to /var/cache/conftool/dbconfig/20240522-065117-root.json
  • 06:41 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1247.eqiad.wmnet
  • 06:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P62836 and previous config saved to /var/cache/conftool/dbconfig/20240522-063832-ladsgroup.json
  • 06:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1249 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62835 and previous config saved to /var/cache/conftool/dbconfig/20240522-063610-root.json
  • 06:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P62834 and previous config saved to /var/cache/conftool/dbconfig/20240522-062324-ladsgroup.json
  • 06:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1249 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62833 and previous config saved to /var/cache/conftool/dbconfig/20240522-062103-root.json
  • 06:19 marostegui: Install 10.6.18 on db1249 T365338
  • 06:18 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1249', diff saved to https://phabricator.wikimedia.org/P62832 and previous config saved to /var/cache/conftool/dbconfig/20240522-061806-root.json
  • 06:09 marostegui@cumin1002: dbctl commit (dc=all): 'db1249 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62831 and previous config saved to /var/cache/conftool/dbconfig/20240522-060901-root.json
  • 06:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T352010)', diff saved to https://phabricator.wikimedia.org/P62830 and previous config saved to /var/cache/conftool/dbconfig/20240522-060814-ladsgroup.json
  • 05:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1249.eqiad.wmnet with OS bookworm
  • 05:53 marostegui@cumin1002: dbctl commit (dc=all): 'db1249 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62829 and previous config saved to /var/cache/conftool/dbconfig/20240522-055355-root.json
  • 05:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2220 (T352010)', diff saved to https://phabricator.wikimedia.org/P62828 and previous config saved to /var/cache/conftool/dbconfig/20240522-054857-ladsgroup.json
  • 05:48 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2220.codfw.wmnet with reason: Maintenance
  • 05:48 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2220.codfw.wmnet with reason: Maintenance
  • 05:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T352010)', diff saved to https://phabricator.wikimedia.org/P62827 and previous config saved to /var/cache/conftool/dbconfig/20240522-054834-ladsgroup.json
  • 05:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1249.eqiad.wmnet with reason: host reimage
  • 05:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1249.eqiad.wmnet with reason: host reimage
  • 05:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P62826 and previous config saved to /var/cache/conftool/dbconfig/20240522-053326-ladsgroup.json
  • 05:22 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1249.eqiad.wmnet with OS bookworm
  • 05:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1249', diff saved to https://phabricator.wikimedia.org/P62825 and previous config saved to /var/cache/conftool/dbconfig/20240522-052108-root.json
  • 05:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P62824 and previous config saved to /var/cache/conftool/dbconfig/20240522-051818-ladsgroup.json
  • 05:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1192 for a schema change', diff saved to https://phabricator.wikimedia.org/P62823 and previous config saved to /var/cache/conftool/dbconfig/20240522-050727-root.json
  • 05:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1192.eqiad.wmnet with reason: Long schema change
  • 05:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1192.eqiad.wmnet with reason: Long schema change
  • 05:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T352010)', diff saved to https://phabricator.wikimedia.org/P62822 and previous config saved to /var/cache/conftool/dbconfig/20240522-050310-ladsgroup.json
  • 04:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2192 (T352010)', diff saved to https://phabricator.wikimedia.org/P62821 and previous config saved to /var/cache/conftool/dbconfig/20240522-041922-ladsgroup.json
  • 04:19 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
  • 04:19 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
  • 04:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T352010)', diff saved to https://phabricator.wikimedia.org/P62820 and previous config saved to /var/cache/conftool/dbconfig/20240522-041858-ladsgroup.json
  • 04:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P62819 and previous config saved to /var/cache/conftool/dbconfig/20240522-040349-ladsgroup.json
  • 03:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P62818 and previous config saved to /var/cache/conftool/dbconfig/20240522-034840-ladsgroup.json
  • 03:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T352010)', diff saved to https://phabricator.wikimedia.org/P62817 and previous config saved to /var/cache/conftool/dbconfig/20240522-033332-ladsgroup.json
  • 02:21 eileen: civicrm upgraded from f1c24cb7 to c9d64b68
  • 02:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1203 (T364299)', diff saved to https://phabricator.wikimedia.org/P62816 and previous config saved to /var/cache/conftool/dbconfig/20240522-021116-marostegui.json
  • 02:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 02:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 02:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T364299)', diff saved to https://phabricator.wikimedia.org/P62815 and previous config saved to /var/cache/conftool/dbconfig/20240522-021053-marostegui.json
  • 01:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P62814 and previous config saved to /var/cache/conftool/dbconfig/20240522-015545-marostegui.json
  • 01:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P62813 and previous config saved to /var/cache/conftool/dbconfig/20240522-014037-marostegui.json
  • 01:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T364299)', diff saved to https://phabricator.wikimedia.org/P62812 and previous config saved to /var/cache/conftool/dbconfig/20240522-012529-marostegui.json
  • 01:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2178 (T352010)', diff saved to https://phabricator.wikimedia.org/P62811 and previous config saved to /var/cache/conftool/dbconfig/20240522-011536-ladsgroup.json
  • 01:15 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 01:15 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 01:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T352010)', diff saved to https://phabricator.wikimedia.org/P62810 and previous config saved to /var/cache/conftool/dbconfig/20240522-011512-ladsgroup.json
  • 01:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P62809 and previous config saved to /var/cache/conftool/dbconfig/20240522-010004-ladsgroup.json
  • 00:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P62808 and previous config saved to /var/cache/conftool/dbconfig/20240522-004456-ladsgroup.json
  • 00:33 eileen: civicrm upgraded from c77df721 to f1c24cb7
  • 00:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T352010)', diff saved to https://phabricator.wikimedia.org/P62807 and previous config saved to /var/cache/conftool/dbconfig/20240522-002948-ladsgroup.json

2024-05-21

  • 23:46 eileen: civicrm upgraded from c77df721 to 9f65d36a
  • 23:40 eileen: config revision changed from 22106526 to b9fbe283
  • 23:40 eileen: civicrm upgraded from c77df721 to 9f65d36a
  • 23:39 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:cassandra-dev: Apply change that makes encryption optional - eevans@cumin1002
  • 23:23 zabe@deploy1002: Finished scap: Backport for Use encrypted Argon2 Hashes to store user passwords (T150647 T216682) (duration: 26m 51s)
  • 23:19 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:cassandra-dev: Apply change that makes encryption optional - eevans@cumin1002
  • 23:10 zabe@deploy1002: zabe: Continuing with sync
  • 22:59 zabe@deploy1002: zabe: Backport for Use encrypted Argon2 Hashes to store user passwords (T150647 T216682) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:56 zabe@deploy1002: Started scap: Backport for Use encrypted Argon2 Hashes to store user passwords (T150647 T216682)
  • 22:40 zabe@deploy1002: Finished scap: Backport for Stop writing to af_user(_text)/afh_user(_text) on test wikis (T337920) (duration: 16m 23s)
  • 22:27 zabe@deploy1002: zabe: Continuing with sync
  • 22:27 zabe@deploy1002: zabe: Backport for Stop writing to af_user(_text)/afh_user(_text) on test wikis (T337920) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:24 zabe@deploy1002: Started scap: Backport for Stop writing to af_user(_text)/afh_user(_text) on test wikis (T337920)
  • 22:17 zabe: zabe@mwmaint1002:~$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=ukwiki --logwiki=metawiki 'QFTP2024' 'Organic2024' # T365533
  • 22:16 zabe: zabe@mwmaint1002:~$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=ptwiki --logwiki=metawiki 'Aurelio de Sandoval' 'Aurelio Sandoval' # T365533
  • 22:15 zabe: zabe@mwmaint1002:~$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=eswiki --logwiki=metawiki '17420g' 'Ras I' # T365533
  • 21:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2171 (T352010)', diff saved to https://phabricator.wikimedia.org/P62806 and previous config saved to /var/cache/conftool/dbconfig/20240521-215924-ladsgroup.json
  • 21:59 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 21:59 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 21:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T352010)', diff saved to https://phabricator.wikimedia.org/P62805 and previous config saved to /var/cache/conftool/dbconfig/20240521-215900-ladsgroup.json
  • 21:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P62804 and previous config saved to /var/cache/conftool/dbconfig/20240521-214352-ladsgroup.json
  • 21:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P62803 and previous config saved to /var/cache/conftool/dbconfig/20240521-212842-ladsgroup.json
  • 21:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T352010)', diff saved to https://phabricator.wikimedia.org/P62802 and previous config saved to /var/cache/conftool/dbconfig/20240521-211335-ladsgroup.json
  • 21:09 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['wikikube-ctrl1001.mgmt.eqiad.wmnet']
  • 21:08 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1001.mgmt.eqiad.wmnet']
  • 21:07 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1001.mgmt.eqiad.wmnet']
  • 20:56 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1001.mgmt.eqiad.wmnet']
  • 20:51 jforrester@deploy1002: Finished scap: Backport for Drop responsive behaviour (T109277), Decouple MFUseDesktopSpecialWatchlistPage from EditWatchlist page, Enable desktop watchlist HTML on mobile (T109277), Don't define wmgUseListings, no longer read (duration: 18m 17s)
  • 20:38 jforrester@deploy1002: jforrester and jdlrobson: Continuing with sync
  • 20:36 jforrester@deploy1002: jforrester and jdlrobson: Backport for Drop responsive behaviour (T109277), Decouple MFUseDesktopSpecialWatchlistPage from EditWatchlist page, Enable desktop watchlist HTML on mobile (T109277), Don't define wmgUseListings, no longer read synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:34 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1001.mgmt.eqiad.wmnet']
  • 20:34 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1001.mgmt.eqiad.wmnet']
  • 20:33 jforrester@deploy1002: Started scap: Backport for Drop responsive behaviour (T109277), Decouple MFUseDesktopSpecialWatchlistPage from EditWatchlist page, Enable desktop watchlist HTML on mobile (T109277), Don't define wmgUseListings, no longer read
  • 20:33 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1001']
  • 20:32 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1001']
  • 20:31 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1001']
  • 20:31 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1001']
  • 20:27 jforrester@deploy1002: Finished scap: Backport for Cleanup night mode exclude list (T365084) (duration: 19m 32s)
  • 20:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2208 (T352010)', diff saved to https://phabricator.wikimedia.org/P62801 and previous config saved to /var/cache/conftool/dbconfig/20240521-202218-ladsgroup.json
  • 20:22 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
  • 20:21 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
  • 20:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62800 and previous config saved to /var/cache/conftool/dbconfig/20240521-201910-root.json
  • 20:13 jforrester@deploy1002: jforrester and jdlrobson: Continuing with sync
  • 20:10 jforrester@deploy1002: jforrester and jdlrobson: Backport for Cleanup night mode exclude list (T365084) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:08 jforrester@deploy1002: Started scap: Backport for Cleanup night mode exclude list (T365084)
  • 20:04 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62799 and previous config saved to /var/cache/conftool/dbconfig/20240521-200402-root.json
  • 20:03 reedy@deploy1002: Synchronized wmf-config/InitialiseSettings.php: T365467 (duration: 14m 56s)
  • 20:02 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1001']
  • 20:02 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1001']
  • 20:00 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1001']
  • 20:00 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1001']
  • 20:00 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1001']
  • 20:00 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1001']
  • 19:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62798 and previous config saved to /var/cache/conftool/dbconfig/20240521-194856-root.json
  • 19:47 reedy@deploy1002: Synchronized dblists-index.php: T365467 (duration: 15m 00s)
  • 19:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62797 and previous config saved to /var/cache/conftool/dbconfig/20240521-193349-root.json
  • 19:28 reedy@deploy1002: Synchronized multiversion/MWMultiVersion.php: T365467 (duration: 14m 59s)
  • 19:21 eileen: civicrm upgraded from f41f3432 to c77df721
  • 19:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62796 and previous config saved to /var/cache/conftool/dbconfig/20240521-191841-root.json
  • 19:11 reedy@deploy1002: Synchronized dblists/translate.dblist: T365467 (duration: 16m 59s)
  • 19:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62795 and previous config saved to /var/cache/conftool/dbconfig/20240521-190334-root.json
  • 18:50 mutante: gitlab-runners*.wmnet: ran puppet via cumin to deploy update of docker.gc service to use image 1.3.0 (from 1.2.0) - T350478
  • 18:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62794 and previous config saved to /var/cache/conftool/dbconfig/20240521-184828-root.json
  • 18:44 ebernhardson: T363734: start reindex of cloudelastic
  • 18:39 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye
  • 18:39 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 18:38 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 18:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2157 (T352010)', diff saved to https://phabricator.wikimedia.org/P62793 and previous config saved to /var/cache/conftool/dbconfig/20240521-183735-ladsgroup.json
  • 18:37 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 18:37 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 18:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T352010)', diff saved to https://phabricator.wikimedia.org/P62792 and previous config saved to /var/cache/conftool/dbconfig/20240521-183711-ladsgroup.json
  • 18:35 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:35 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:34 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 18:25 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:25 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:22 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl1001.eqiad.wmnet with reason: host reimage
  • 18:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P62791 and previous config saved to /var/cache/conftool/dbconfig/20240521-182203-ladsgroup.json
  • 18:18 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl1001.eqiad.wmnet with reason: host reimage
  • 18:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl1003.eqiad.wmnet with OS bullseye
  • 18:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 18:13 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 18:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P62790 and previous config saved to /var/cache/conftool/dbconfig/20240521-180655-ladsgroup.json
  • 18:04 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye
  • 18:03 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 18:02 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 18:01 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 18:01 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl1002.eqiad.wmnet with OS bullseye
  • 18:01 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 17:59 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:59 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:57 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl1003.eqiad.wmnet with reason: host reimage
  • 17:56 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 17:55 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl1003.eqiad.wmnet with reason: host reimage
  • 17:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T352010)', diff saved to https://phabricator.wikimedia.org/P62789 and previous config saved to /var/cache/conftool/dbconfig/20240521-175146-ladsgroup.json
  • 17:43 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2002.codfw.wmnet with reason: host reimage
  • 17:41 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl1002.eqiad.wmnet with reason: host reimage
  • 17:40 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2002.codfw.wmnet with reason: host reimage
  • 17:38 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl1002.eqiad.wmnet with reason: host reimage
  • 17:27 ebernhardson@deploy1002: Finished deploy [airflow-dags/search@49bc8eb]: discolytics to 0.21, update search metrics group ownership (duration: 00m 26s)
  • 17:27 ebernhardson@deploy1002: Started deploy [airflow-dags/search@49bc8eb]: discolytics to 0.21, update search metrics group ownership
  • 17:25 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 17:24 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 17:23 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye
  • 17:21 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-ctrl2002.codfw.wmnet on all recursors
  • 17:21 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-ctrl2002.codfw.wmnet on all recursors
  • 17:09 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 17:09 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 16:58 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) restart masters for Hadoop analytics cluster: Restart of jvm daemons.
  • 16:42 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:42 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new entries for wikikube-ctrl2002.codfw.wmnet - cmooney@cumin1002"
  • 16:41 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new entries for wikikube-ctrl2002.codfw.wmnet - cmooney@cumin1002"
  • 16:40 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 16:40 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 16:38 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 16:38 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-ctrl2002.codfw.wmnet on all recursors
  • 16:38 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-ctrl2002.codfw.wmnet on all recursors
  • 16:35 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2003.codfw.wmnet with OS bullseye
  • 16:35 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 16:35 kormat@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1182.eqiad.wmnet onto db1246.eqiad.wmnet
  • 16:34 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 16:34 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2001.codfw.wmnet with OS bullseye
  • 16:34 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 16:30 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1003.eqiad.wmnet with OS bullseye
  • 16:30 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye
  • 16:30 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 16:25 ryankemper@cumin2002: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop analytics cluster: Restart of jvm daemons.
  • 16:23 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1002.eqiad.wmnet with OS bullseye
  • 16:18 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 16:17 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 16:16 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2003.codfw.wmnet with reason: host reimage
  • 16:13 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2001.codfw.wmnet with reason: host reimage
  • 16:12 ladsgroup@deploy1002: Finished scap: Backport for x-wikimedia-debug: Update k8s-mwdebug label, move to front (T362662) (duration: 17m 16s)
  • 16:11 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2003.codfw.wmnet with reason: host reimage
  • 16:10 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2001.codfw.wmnet with reason: host reimage
  • 16:09 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 16:03 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1003']
  • 16:03 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: analytics_cluster::launcher
  • 16:00 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1003']
  • 16:00 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1003']
  • 16:00 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1002']
  • 15:59 ladsgroup@deploy1002: ladsgroup and jforrester: Continuing with sync
  • 15:59 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1003']
  • 15:59 elukey@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 15:59 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1002']
  • 15:58 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['wikikube-ctrl1002']
  • 15:58 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['wikikube-ctrl1003']
  • 15:58 ladsgroup@deploy1002: ladsgroup and jforrester: Backport for x-wikimedia-debug: Update k8s-mwdebug label, move to front (T362662) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:56 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2003.codfw.wmnet with OS bullseye
  • 15:55 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 15:55 ladsgroup@deploy1002: Started scap: Backport for x-wikimedia-debug: Update k8s-mwdebug label, move to front (T362662)
  • 15:54 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2001.codfw.wmnet with OS bullseye
  • 15:54 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 15:53 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-ctrl2003.codfw.wmnet with OS bullseye
  • 15:53 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: analytics_cluster::launcher
  • 15:50 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1003']
  • 15:49 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['wikikube-ctrl1003']
  • 15:49 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1002']
  • 15:49 ladsgroup@deploy1002: Finished scap: Backport for configure parsercache servers via dbconfig in etcd (T362786) (duration: 23m 46s)
  • 15:48 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['wikikube-ctrl1002']
  • 15:45 ryankemper@cumin2002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-flink-codfw cluster: Roll restart of jvm daemons.
  • 15:43 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1001']
  • 15:43 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1001']
  • 15:41 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:41 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1003']
  • 15:41 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:39 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1002']
  • 15:39 ryankemper@cumin2002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-flink-codfw cluster: Roll restart of jvm daemons.
  • 15:38 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1001']
  • 15:38 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1001']
  • 15:36 ladsgroup@deploy1002: ladsgroup and swfrench: Continuing with sync
  • 15:35 brouberol@cumin2002: END (FAIL) - Cookbook sre.wdqs.restart (exit_code=99)
  • 15:35 brouberol@cumin2002: START - Cookbook sre.wdqs.restart
  • 15:34 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2003.codfw.wmnet with OS bullseye
  • 15:33 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 15:33 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-ctrl2003.codfw.wmnet with OS bullseye
  • 15:33 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-ctrl2001.codfw.wmnet with OS bullseye
  • 15:32 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 15:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2128 (T352010)', diff saved to https://phabricator.wikimedia.org/P62787 and previous config saved to /var/cache/conftool/dbconfig/20240521-153010-ladsgroup.json
  • 15:30 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 15:29 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 15:29 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 15:29 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:29 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:29 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 15:28 ladsgroup@deploy1002: ladsgroup and swfrench: Backport for configure parsercache servers via dbconfig in etcd (T362786) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:25 ladsgroup@deploy1002: Started scap: Backport for configure parsercache servers via dbconfig in etcd (T362786)
  • 15:22 elukey@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 15:21 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:19 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 15:17 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 15:17 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:15 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 15:15 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2001.codfw.wmnet with OS bullseye
  • 15:10 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl1002.eqiad.wmnet with OS bullseye
  • 15:10 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2003.codfw.wmnet with OS bullseye
  • 15:10 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl1003.eqiad.wmnet with OS bullseye
  • 15:10 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye
  • 15:07 jgiannelos@deploy1002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 15:07 elukey@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 15:07 jgiannelos@deploy1002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 15:06 jgiannelos@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 15:06 kormat@cumin1002: START - Cookbook sre.mysql.clone of db1182.eqiad.wmnet onto db1246.eqiad.wmnet
  • 15:05 jgiannelos@deploy1002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 15:04 ejegg: fundraising civicrm upgraded from 8901b5b3 to f41f3432
  • 15:04 aklapper@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.43.0-wmf.6 refs T361400
  • 15:03 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye
  • 15:02 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye
  • 15:02 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 15:02 ebernhardson@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:02 ebernhardson@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:59 kormat@cumin1002: dbctl commit (dc=all): 'Depooling db1182 as cloning source T364552', diff saved to https://phabricator.wikimedia.org/P62785 and previous config saved to /var/cache/conftool/dbconfig/20240521-145924-kormat.json
  • 14:58 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye
  • 14:58 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye
  • 14:54 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2055.codfw.wmnet with OS bookworm
  • 14:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1193 (T364299)', diff saved to https://phabricator.wikimedia.org/P62784 and previous config saved to /var/cache/conftool/dbconfig/20240521-145451-marostegui.json
  • 14:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 14:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 14:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T364299)', diff saved to https://phabricator.wikimedia.org/P62783 and previous config saved to /var/cache/conftool/dbconfig/20240521-145428-marostegui.json
  • 14:52 elukey@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 14:51 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1003.eqiad.wmnet with OS bullseye
  • 14:51 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1002.eqiad.wmnet with OS bullseye
  • 14:48 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye
  • 14:46 aklapper@deploy1002: Finished scap: Backport for DocumentationAid: Fix fatal error (T365451) (duration: 16m 30s)
  • 14:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P62782 and previous config saved to /var/cache/conftool/dbconfig/20240521-143920-marostegui.json
  • 14:37 klausman@cumin1002: conftool action : set/pooled=yes; selector: name=ml-serve2002.codfw.wmnet
  • 14:37 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2055.codfw.wmnet with reason: host reimage
  • 14:36 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudnet2006-dev.codfw.wmnet with OS bookworm
  • 14:36 vgutierrez: testing fifo-log-demux 0.7.4 on cp4052
  • 14:34 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2055.codfw.wmnet with reason: host reimage
  • 14:33 aklapper@deploy1002: aklapper: Continuing with sync
  • 14:32 aklapper@deploy1002: aklapper: Backport for DocumentationAid: Fix fatal error (T365451) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:30 aklapper@deploy1002: Started scap: Backport for DocumentationAid: Fix fatal error (T365451)
  • 14:29 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:29 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:26 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P62781 and previous config saved to /var/cache/conftool/dbconfig/20240521-142412-marostegui.json
  • 14:19 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:16 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc2055.codfw.wmnet with OS bookworm
  • 14:09 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet2006-dev.codfw.wmnet with reason: host reimage
  • 14:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T364299)', diff saved to https://phabricator.wikimedia.org/P62780 and previous config saved to /var/cache/conftool/dbconfig/20240521-140904-marostegui.json
  • 14:06 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet2006-dev.codfw.wmnet with reason: host reimage
  • 14:04 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:04 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:03 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:55 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp2003.codfw.wmnet with OS bookworm
  • 13:47 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudnet2006-dev.codfw.wmnet with OS bookworm
  • 13:39 kormat@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1246.eqiad.wmnet with OS bookworm
  • 13:38 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp2003.codfw.wmnet with reason: host reimage
  • 13:34 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp2003.codfw.wmnet with reason: host reimage
  • 13:34 zabe@deploy1002: Finished scap: Backport for arwiki: Disable Extension:ContentTranslation for non-autoreview users (T255022) (duration: 23m 20s)
  • 13:30 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1009.eqiad.wmnet with OS bullseye
  • 13:29 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:27 tchin@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/datasets-config-next: apply
  • 13:26 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config-next: apply
  • 13:23 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 13:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 13:20 zabe@deploy1002: zabe and gergesshamon: Continuing with sync
  • 13:18 kormat@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1246.eqiad.wmnet with reason: host reimage
  • 13:17 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc-gp2003.codfw.wmnet with OS bookworm
  • 13:15 kormat@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1246.eqiad.wmnet with reason: host reimage
  • 13:13 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:13 zabe@deploy1002: zabe and gergesshamon: Backport for arwiki: Disable Extension:ContentTranslation for non-autoreview users (T255022) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:13 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:13 marostegui: Deploy schema change on s3 eqiad with replication dbmaint T365465
  • 13:12 vgutierrez: re-enable puppet on acme-chief clients - T364589
  • 13:11 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:11 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 13:11 jclark@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 13:10 zabe@deploy1002: Started scap: Backport for arwiki: Disable Extension:ContentTranslation for non-autoreview users (T255022)
  • 13:09 marostegui: Deploy schema change on s5 (azwikimedia wikifunctionswiki vewikimedia) eqiad with replication dbmaint T365465
  • 13:08 marostegui: Deploy schema change on s7 (metawiki and frwiktionary) eqiad with replication dbmaint T365465
  • 13:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1192 (T364299)', diff saved to https://phabricator.wikimedia.org/P62778 and previous config saved to /var/cache/conftool/dbconfig/20240521-130838-marostegui.json
  • 13:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 13:08 vgutierrez: upgrading to acme-chief 0.37 on acmechief instances - T364589
  • 13:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 13:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T364299)', diff saved to https://phabricator.wikimedia.org/P62777 and previous config saved to /var/cache/conftool/dbconfig/20240521-130814-marostegui.json
  • 13:07 marostegui: Deploy schema change on s4 eqiad with replication dbmaint T365465
  • 13:06 marostegui: Deploy schema change on s8 eqiad with replication dbmaint T365465
  • 13:04 vgutierrez: disable puppet on acme-chief clients - T364589
  • 13:01 kormat@cumin1002: START - Cookbook sre.hosts.reimage for host db1246.eqiad.wmnet with OS bookworm
  • 12:59 vgutierrez: upgrading to acme-chief 0.37 on acmechief-test instances - T364589
  • 12:55 vgutierrez: upload acme-chief 0.37 to apt.wm.org (bookworm-wikimedia) - T364589
  • 12:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P62775 and previous config saved to /var/cache/conftool/dbconfig/20240521-125306-marostegui.json
  • 12:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T364299)', diff saved to https://phabricator.wikimedia.org/P62771 and previous config saved to /var/cache/conftool/dbconfig/20240521-122250-marostegui.json
  • 12:15 hnowlan@cumin1002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 12:13 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 12:07 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl1001.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:06 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudnet1006.eqiad.wmnet with OS bookworm
  • 12:04 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2001.mgmt.codfw.wmnet with reboot policy FORCED
  • 12:02 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2003.mgmt.codfw.wmnet with reboot policy FORCED
  • 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1244.eqiad.wmnet
  • 12:01 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:01 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:01 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new entries for wikikube-ctrl2002.codfw.wmnet - cmooney@cumin1002"
  • 12:00 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new entries for wikikube-ctrl2002.codfw.wmnet - cmooney@cumin1002"
  • 12:00 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl1003.mgmt.eqiad.wmnet with reboot policy FORCED
  • 11:59 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-ctrl2002
  • 11:58 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 11:57 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-ctrl2002.codfw.wmnet on all recursors
  • 11:57 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-ctrl2002.codfw.wmnet on all recursors
  • 11:57 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl2002
  • 11:55 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1244.eqiad.wmnet
  • 11:55 hnowlan@cumin1002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2003.mgmt.codfw.wmnet with reboot policy FORCED
  • 11:52 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1243.eqiad.wmnet
  • 11:52 hnowlan@cumin1002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wikikube-ctrl2002
  • 11:51 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl2002
  • 11:51 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-ctrl2003
  • 11:51 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl2003
  • 11:49 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet1006.eqiad.wmnet with reason: host reimage
  • 11:49 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 11:47 hnowlan@cumin1002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 11:46 hnowlan@cumin1002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wikikube-ctrl2002
  • 11:46 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet1006.eqiad.wmnet with reason: host reimage
  • 11:46 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl2002
  • 11:46 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-ctrl2003
  • 11:43 hnowlan@cumin1002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wikikube-ctrl2002
  • 11:43 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl2003
  • 11:43 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl2002
  • 11:42 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2003.mgmt.codfw.wmnet with reboot policy FORCED
  • 11:42 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 11:41 hnowlan@cumin1002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2003.mgmt.codfw.wmnet with reboot policy FORCED
  • 11:41 hnowlan@cumin1002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 11:41 hnowlan@cumin1002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2001.mgmt.codfw.wmnet with reboot policy FORCED
  • 11:40 hnowlan@cumin1002: START - Cookbook sre.hosts.provision for host wikikube-ctrl1003.mgmt.eqiad.wmnet with reboot policy FORCED
  • 11:39 hnowlan@cumin1002: START - Cookbook sre.hosts.provision for host wikikube-ctrl1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 11:36 hnowlan@cumin1002: START - Cookbook sre.hosts.provision for host wikikube-ctrl1001.mgmt.eqiad.wmnet with reboot policy FORCED
  • 11:36 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1243.eqiad.wmnet
  • 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1242.eqiad.wmnet
  • 11:32 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudnet1006.eqiad.wmnet with OS bookworm
  • 11:10 taavi@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudnet1005
  • 11:10 taavi@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudnet1005
  • 11:02 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-ctrl1003
  • 11:00 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl1003
  • 11:00 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-ctrl1002
  • 10:59 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl1002
  • 10:58 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-ctrl1001
  • 10:57 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl1001
  • 10:57 hnowlan@cumin1002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wikikube-ctrl2002
  • 10:57 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl2002
  • 10:55 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-eqiad
  • 10:51 hnowlan@cumin1002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wikikube-ctrl2001
  • 10:51 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl2001
  • 10:50 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:50 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add renamed k8s ctrl nodes - hnowlan@cumin1002"
  • 10:50 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-eqiad
  • 10:49 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add renamed k8s ctrl nodes - hnowlan@cumin1002"
  • 10:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1238.eqiad.wmnet
  • 10:46 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 10:43 hnowlan@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 10:41 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 10:41 hnowlan@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 10:38 joal@deploy1002: Finished deploy [analytics/refinery@4d42877]: Deploy of Refinery after reimage of an-launcher1002 [analytics/refinery@4d42877e] (duration: 01m 01s)
  • 10:37 joal@deploy1002: Started deploy [analytics/refinery@4d42877]: Deploy of Refinery after reimage of an-launcher1002 [analytics/refinery@4d42877e]
  • 10:36 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-codfw
  • 10:34 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 10:33 hnowlan@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 10:31 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-codfw
  • 10:31 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 10:24 aklapper@deploy1002: rebuilt and synchronized wikiversions files: Revert "group0 wikis to 1.43.0-wmf.5"
  • 10:21 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 10:21 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 10:20 effie: restart memcached on mc2055
  • 10:18 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1238.eqiad.wmnet
  • 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1199.eqiad.wmnet
  • 09:58 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw[2331,2361,2391].codfw.wmnet,mw[1372,1429,1436].eqiad.wmnet
  • 09:58 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:58 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mw[2331,2361,2391].codfw.wmnet,mw[1372,1429,1436].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - hnowlan@cumin1002"
  • 09:57 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mw[2331,2361,2391].codfw.wmnet,mw[1372,1429,1436].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - hnowlan@cumin1002"
  • 09:57 moritzm: installing mariadb-10.3 security updates (libs/tools as packaged in Debian, unrelated to wmf-db)
  • 09:56 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1199.eqiad.wmnet
  • 09:55 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 09:53 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1190.eqiad.wmnet
  • 09:47 marostegui@cumin1002: dbctl commit (dc=all): 'db1221 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62769 and previous config saved to /var/cache/conftool/dbconfig/20240521-094744-root.json
  • 09:41 aklapper@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.43.0-wmf.6 refs T361400
  • 09:36 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1190.eqiad.wmnet
  • 09:34 hnowlan@cumin1002: START - Cookbook sre.hosts.decommission for hosts mw[2331,2361,2391].codfw.wmnet,mw[1372,1429,1436].eqiad.wmnet
  • 09:33 hnowlan: decommissioning 6 appservers in advance of reimaging to k8s control nodes
  • 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1160.eqiad.wmnet
  • 09:32 marostegui@cumin1002: dbctl commit (dc=all): 'db1221 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62768 and previous config saved to /var/cache/conftool/dbconfig/20240521-093238-root.json
  • 09:31 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-launcher1002.eqiad.wmnet with reason: host reimage
  • 09:29 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudnet1005.eqiad.wmnet with OS bookworm
  • 09:28 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-launcher1002.eqiad.wmnet with reason: host reimage
  • 09:17 marostegui@cumin1002: dbctl commit (dc=all): 'db1221 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62767 and previous config saved to /var/cache/conftool/dbconfig/20240521-091732-root.json
  • 09:16 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-launcher1002.eqiad.wmnet with OS bullseye
  • 09:13 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet1005.eqiad.wmnet with reason: host reimage
  • 09:10 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet1005.eqiad.wmnet with reason: host reimage
  • away: UTC morning deploys done
  • 09:06 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1160.eqiad.wmnet
  • 09:05 tgr@deploy1002: Finished scap: Backport for Temporarily restore $wgCentralAuthDatabase (T348486) (duration: 17m 45s)
  • 09:02 marostegui@cumin1002: dbctl commit (dc=all): 'db1221 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62766 and previous config saved to /var/cache/conftool/dbconfig/20240521-090224-root.json
  • 09:02 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2219.codfw.wmnet
  • 08:55 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudnet1005.eqiad.wmnet with OS bookworm
  • 08:51 tgr@deploy1002: tgr: Continuing with sync
  • 08:51 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2219.codfw.wmnet
  • 08:50 tgr@deploy1002: tgr: Backport for Temporarily restore $wgCentralAuthDatabase (T348486) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2210.codfw.wmnet
  • 08:48 moritzm: installing edk2 security updates
  • 08:47 tgr@deploy1002: Started scap: Backport for Temporarily restore $wgCentralAuthDatabase (T348486)
  • 08:47 marostegui@cumin1002: dbctl commit (dc=all): 'db1221 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62765 and previous config saved to /var/cache/conftool/dbconfig/20240521-084718-root.json
  • 08:43 moritzm: installing ghostscript security updates
  • 08:41 matthiasmullie: UTC morning backports done
  • 08:41 mlitn@deploy1002: Finished scap: Backport for Allow async (job queue based) chunked upload on all wikis (T364644) (duration: 17m 32s)
  • 08:40 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) sretest2002.wikimedia.org on all recursors
  • 08:40 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache sretest2002.wikimedia.org on all recursors
  • 08:38 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:38 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add dns for sretest2002 - cmooney@cumin1002"
  • 08:38 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2210.codfw.wmnet
  • 08:37 effie: enable puppet on all mw* baremetal hosts
  • 08:37 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add dns for sretest2002 - cmooney@cumin1002"
  • 08:35 marostegui: Deploy schema change on s8 eqiad, this will cause a few hours of replication lag in s8 clouddb replicas T364299
  • 08:34 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 08:34 cmooney@cumin1002: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
  • 08:34 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 08:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Long schema change
  • 08:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Long schema change
  • 08:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Long schema change
  • 08:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Long schema change
  • 08:32 marostegui@cumin1002: dbctl commit (dc=all): 'db1221 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62764 and previous config saved to /var/cache/conftool/dbconfig/20240521-083212-root.json
  • 08:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1167 for a schema change', diff saved to https://phabricator.wikimedia.org/P62763 and previous config saved to /var/cache/conftool/dbconfig/20240521-083053-root.json
  • 08:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1237 (re)pooling @ 100%: After reimage', diff saved to https://phabricator.wikimedia.org/P62762 and previous config saved to /var/cache/conftool/dbconfig/20240521-082842-root.json
  • 08:27 mlitn@deploy1002: mlitn and bawolff: Continuing with sync
  • 08:26 mlitn@deploy1002: mlitn and bawolff: Backport for Allow async (job queue based) chunked upload on all wikis (T364644) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:23 mlitn@deploy1002: Started scap: Backport for Allow async (job queue based) chunked upload on all wikis (T364644)
  • 08:22 mlitn@deploy1002: Finished scap: Backport for Remove complicated synchronization of caption/description inputs (T365119) (duration: 17m 40s)
  • 08:19 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 08:19 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 08:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213 (T352010)', diff saved to https://phabricator.wikimedia.org/P62761 and previous config saved to /var/cache/conftool/dbconfig/20240521-081930-ladsgroup.json
  • 08:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1221.eqiad.wmnet with OS bookworm
  • 08:17 marostegui@cumin1002: dbctl commit (dc=all): 'db1221 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62760 and previous config saved to /var/cache/conftool/dbconfig/20240521-081706-root.json
  • 08:14 effie: enable puppet on mediawiki codfw servers
  • 08:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1237 (re)pooling @ 75%: After reimage', diff saved to https://phabricator.wikimedia.org/P62759 and previous config saved to /var/cache/conftool/dbconfig/20240521-081336-root.json
  • 08:09 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2206.codfw.wmnet
  • 08:09 mlitn@deploy1002: mlitn: Continuing with sync
  • 08:07 mlitn@deploy1002: mlitn: Backport for Remove complicated synchronization of caption/description inputs (T365119) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:04 mlitn@deploy1002: Started scap: Backport for Remove complicated synchronization of caption/description inputs (T365119)
  • 08:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213', diff saved to https://phabricator.wikimedia.org/P62758 and previous config saved to /var/cache/conftool/dbconfig/20240521-080422-ladsgroup.json
  • 08:04 mlitn@deploy1002: Finished scap: Backport for Fix automatic numbering of copied titles (T365107) (duration: 17m 02s)
  • 08:01 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62757 and previous config saved to /var/cache/conftool/dbconfig/20240521-080145-root.json
  • 07:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1237 (re)pooling @ 50%: After reimage', diff saved to https://phabricator.wikimedia.org/P62756 and previous config saved to /var/cache/conftool/dbconfig/20240521-075830-root.json
  • 07:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1221.eqiad.wmnet with reason: host reimage
  • 07:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1221.eqiad.wmnet with reason: host reimage
  • 07:52 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2206.codfw.wmnet
  • 07:51 moritzm: installing nginx security updates
  • 07:50 mlitn@deploy1002: mlitn: Continuing with sync
  • 07:49 mlitn@deploy1002: mlitn: Backport for Fix automatic numbering of copied titles (T365107) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:49 effie: disable puppet on all mediawiki hardware hosts - T345740
  • 07:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2179.codfw.wmnet
  • 07:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213', diff saved to https://phabricator.wikimedia.org/P62755 and previous config saved to /var/cache/conftool/dbconfig/20240521-074914-ladsgroup.json
  • 07:47 mlitn@deploy1002: Started scap: Backport for Fix automatic numbering of copied titles (T365107)
  • 07:46 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62754 and previous config saved to /var/cache/conftool/dbconfig/20240521-074639-root.json
  • 07:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1237 (re)pooling @ 25%: After reimage', diff saved to https://phabricator.wikimedia.org/P62753 and previous config saved to /var/cache/conftool/dbconfig/20240521-074323-root.json
  • 07:40 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2179.codfw.wmnet
  • 07:40 moritzm: installing python 3.7 security updates
  • 07:39 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1221.eqiad.wmnet with OS bookworm
  • 07:38 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2172.codfw.wmnet
  • 07:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1221', diff saved to https://phabricator.wikimedia.org/P62752 and previous config saved to /var/cache/conftool/dbconfig/20240521-073727-marostegui.json
  • 07:35 kartik@deploy1002: Finished scap: Backport for Fix the mobile experience for a second group of Wikipedias where CX is in beta (T361597) (duration: 20m 18s)
  • 07:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213 (T352010)', diff saved to https://phabricator.wikimedia.org/P62751 and previous config saved to /var/cache/conftool/dbconfig/20240521-073407-ladsgroup.json
  • 07:31 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62750 and previous config saved to /var/cache/conftool/dbconfig/20240521-073133-root.json
  • 07:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1237.eqiad.wmnet with OS bookworm
  • 07:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1237 (re)pooling @ 10%: After reimage', diff saved to https://phabricator.wikimedia.org/P62749 and previous config saved to /var/cache/conftool/dbconfig/20240521-072817-root.json
  • 07:21 kartik@deploy1002: kartik: Continuing with sync
  • 07:17 kartik@deploy1002: kartik: Backport for Fix the mobile experience for a second group of Wikipedias where CX is in beta (T361597) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:16 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62748 and previous config saved to /var/cache/conftool/dbconfig/20240521-071627-root.json
  • 07:15 kartik@deploy1002: Started scap: Backport for Fix the mobile experience for a second group of Wikipedias where CX is in beta (T361597)
  • 07:14 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2172.codfw.wmnet
  • 07:14 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 11170
  • 07:13 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 11170
  • 07:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1237.eqiad.wmnet with reason: host reimage
  • 07:09 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2147.codfw.wmnet
  • 07:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1237.eqiad.wmnet with reason: host reimage
  • 07:08 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'email' for AS: 8075
  • 07:03 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2147.codfw.wmnet
  • 07:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2140.codfw.wmnet
  • 07:01 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62747 and previous config saved to /var/cache/conftool/dbconfig/20240521-070121-root.json
  • 07:00 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 8075
  • 06:54 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS bookworm
  • 06:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1237 T358642', diff saved to https://phabricator.wikimedia.org/P62746 and previous config saved to /var/cache/conftool/dbconfig/20240521-065318-marostegui.json
  • 06:52 moritzm: installing postgresql-11 security updates
  • 06:47 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2140.codfw.wmnet
  • 06:46 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62745 and previous config saved to /var/cache/conftool/dbconfig/20240521-064615-root.json
  • 06:44 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 398203
  • 06:44 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 398203
  • 06:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2137.codfw.wmnet
  • 06:36 moritzm: installing nghttp2 security updates
  • 06:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1182.eqiad.wmnet with OS bookworm
  • 06:33 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2137.codfw.wmnet
  • 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2136.codfw.wmnet
  • 06:31 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62744 and previous config saved to /var/cache/conftool/dbconfig/20240521-063109-root.json
  • 06:15 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2136.codfw.wmnet
  • 06:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1182.eqiad.wmnet with reason: host reimage
  • 06:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1182.eqiad.wmnet with reason: host reimage
  • 05:56 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1182.eqiad.wmnet with OS bookworm
  • 05:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1182 T361543', diff saved to https://phabricator.wikimedia.org/P62743 and previous config saved to /var/cache/conftool/dbconfig/20240521-055501-root.json
  • 05:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2102.codfw.wmnet with OS bookworm
  • 05:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1213 (T352010)', diff saved to https://phabricator.wikimedia.org/P62742 and previous config saved to /var/cache/conftool/dbconfig/20240521-053627-ladsgroup.json
  • 05:36 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 05:36 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 05:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T352010)', diff saved to https://phabricator.wikimedia.org/P62741 and previous config saved to /var/cache/conftool/dbconfig/20240521-053602-ladsgroup.json
  • 05:35 marostegui: Deploy schema change on s7 (metawiki) eqiad dbmaint T365352
  • 05:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2102.codfw.wmnet with reason: host reimage
  • 05:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P62740 and previous config saved to /var/cache/conftool/dbconfig/20240521-052054-ladsgroup.json
  • 05:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2102.codfw.wmnet with reason: host reimage
  • 05:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P62739 and previous config saved to /var/cache/conftool/dbconfig/20240521-050546-ladsgroup.json
  • 05:05 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2102.codfw.wmnet with OS bookworm
  • 05:00 marostegui: Deploy schema change on s7 (metawiki) codfw dbmaint T365352
  • 04:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Schema change
  • 04:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Schema change
  • 04:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T352010)', diff saved to https://phabricator.wikimedia.org/P62738 and previous config saved to /var/cache/conftool/dbconfig/20240521-045037-ladsgroup.json
  • 04:05 mwpresync@deploy1002: Pruned MediaWiki: 1.43.0-wmf.3 (duration: 05m 28s)
  • 04:01 mwpresync@deploy1002: Finished scap: testwikis wikis to 1.43.0-wmf.6 refs T361400 (duration: 58m 51s)
  • 03:02 mwpresync@deploy1002: Started scap: testwikis wikis to 1.43.0-wmf.6 refs T361400
  • 02:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1210 (T352010)', diff saved to https://phabricator.wikimedia.org/P62737 and previous config saved to /var/cache/conftool/dbconfig/20240521-024715-ladsgroup.json
  • 02:47 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 02:46 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 02:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T352010)', diff saved to https://phabricator.wikimedia.org/P62736 and previous config saved to /var/cache/conftool/dbconfig/20240521-024652-ladsgroup.json
  • 02:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P62735 and previous config saved to /var/cache/conftool/dbconfig/20240521-023144-ladsgroup.json
  • 02:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P62734 and previous config saved to /var/cache/conftool/dbconfig/20240521-021634-ladsgroup.json
  • 02:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T352010)', diff saved to https://phabricator.wikimedia.org/P62733 and previous config saved to /var/cache/conftool/dbconfig/20240521-020126-ladsgroup.json
  • 01:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1178 (T364299)', diff saved to https://phabricator.wikimedia.org/P62732 and previous config saved to /var/cache/conftool/dbconfig/20240521-015014-marostegui.json
  • 01:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 01:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 01:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T364299)', diff saved to https://phabricator.wikimedia.org/P62731 and previous config saved to /var/cache/conftool/dbconfig/20240521-014949-marostegui.json
  • 01:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P62730 and previous config saved to /var/cache/conftool/dbconfig/20240521-013441-marostegui.json
  • 01:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P62729 and previous config saved to /var/cache/conftool/dbconfig/20240521-011931-marostegui.json
  • 01:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T364299)', diff saved to https://phabricator.wikimedia.org/P62728 and previous config saved to /var/cache/conftool/dbconfig/20240521-010423-marostegui.json
  • 00:18 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2002.codfw.wmnet with OS bullseye
  • 00:17 eileen: civicrm upgraded from 19b6a9a0 to 8901b5b3
  • 00:17 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['sretest2002']
  • 00:16 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest2002']
  • 00:16 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 00:02 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 00:01 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 00:00 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED

2024-05-20

  • 23:53 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:52 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1200 (T352010)', diff saved to https://phabricator.wikimedia.org/P62727 and previous config saved to /var/cache/conftool/dbconfig/20240520-234431-ladsgroup.json
  • 23:44 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 23:44 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 23:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T352010)', diff saved to https://phabricator.wikimedia.org/P62726 and previous config saved to /var/cache/conftool/dbconfig/20240520-234406-ladsgroup.json
  • 23:33 eileen: civicrm upgraded from f838d84d to 19b6a9a0
  • 23:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P62725 and previous config saved to /var/cache/conftool/dbconfig/20240520-232858-ladsgroup.json
  • 23:26 mutante: LDAP - added jaycano to wmf group (T365349)
  • 23:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P62724 and previous config saved to /var/cache/conftool/dbconfig/20240520-231350-ladsgroup.json
  • 23:13 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:12 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:05 ryankemper@cumin2002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-flink-eqiad cluster: Roll restart of jvm daemons.
  • 22:59 ryankemper@cumin2002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-flink-eqiad cluster: Roll restart of jvm daemons.
  • 22:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T352010)', diff saved to https://phabricator.wikimedia.org/P62723 and previous config saved to /var/cache/conftool/dbconfig/20240520-225842-ladsgroup.json
  • 22:17 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:17 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:16 urbanecm@deploy1002: Finished scap: Backport for Add account_conversion event streams. (T363815) (duration: 16m 18s)
  • 22:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 22:02 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 22:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T352010)', diff saved to https://phabricator.wikimedia.org/P62722 and previous config saved to /var/cache/conftool/dbconfig/20240520-220247-ladsgroup.json
  • 22:00 urbanecm@deploy1002: Started scap: Backport for Add account_conversion event streams. (T363815)
  • 21:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P62721 and previous config saved to /var/cache/conftool/dbconfig/20240520-214739-ladsgroup.json
  • 21:38 bking@cumin2002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-flink-eqiad cluster: Roll restart of jvm daemons.
  • 21:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P62720 and previous config saved to /var/cache/conftool/dbconfig/20240520-213230-ladsgroup.json
  • 21:32 bking@cumin2002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-flink-eqiad cluster: Roll restart of jvm daemons.
  • 21:29 bking@cumin2002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-flink-codfw cluster: Roll restart of jvm daemons.
  • 21:22 bking@cumin2002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-flink-codfw cluster: Roll restart of jvm daemons.
  • 21:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T352010)', diff saved to https://phabricator.wikimedia.org/P62719 and previous config saved to /var/cache/conftool/dbconfig/20240520-211721-ladsgroup.json
  • 20:57 ebernhardson@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:57 ebernhardson@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:51 ebernhardson@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:51 ebernhardson@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:44 urbanecm@deploy1002: Finished scap: Backport for Remove readability survey tool (T349337), wgVectorShareUserScripts should be false now (T301212) (duration: 18m 34s)
  • 20:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1185 (T352010)', diff saved to https://phabricator.wikimedia.org/P62718 and previous config saved to /var/cache/conftool/dbconfig/20240520-203811-ladsgroup.json
  • 20:38 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 20:37 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 20:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183 (T352010)', diff saved to https://phabricator.wikimedia.org/P62717 and previous config saved to /var/cache/conftool/dbconfig/20240520-203748-ladsgroup.json
  • 20:30 urbanecm@deploy1002: ksarabia and jdlrobson and urbanecm: Continuing with sync
  • 20:28 urbanecm@deploy1002: ksarabia and jdlrobson and urbanecm: Backport for Remove readability survey tool (T349337), wgVectorShareUserScripts should be false now (T301212) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:25 urbanecm@deploy1002: Started scap: Backport for Remove readability survey tool (T349337), wgVectorShareUserScripts should be false now (T301212)
  • 20:25 urbanecm@deploy1002: Finished scap: Backport for Introduce sample overrides to web_ui_actions (T361962), Disable wgParserEnableLegacyMediaDOM (T363597), Disable last remaining projects using share user scripts (T301212) (duration: 18m 18s)
  • 20:24 eileen: config revision changed from 21dba21a to 22106526
  • 20:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183', diff saved to https://phabricator.wikimedia.org/P62716 and previous config saved to /var/cache/conftool/dbconfig/20240520-202240-ladsgroup.json
  • 20:11 urbanecm@deploy1002: urbanecm and jdlrobson and ksarabia: Continuing with sync
  • 20:09 urbanecm@deploy1002: urbanecm and jdlrobson and ksarabia: Backport for Introduce sample overrides to web_ui_actions (T361962), Disable wgParserEnableLegacyMediaDOM (T363597), Disable last remaining projects using share user scripts (T301212) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183', diff saved to https://phabricator.wikimedia.org/P62715 and previous config saved to /var/cache/conftool/dbconfig/20240520-200732-ladsgroup.json
  • 20:06 urbanecm@deploy1002: Started scap: Backport for Introduce sample overrides to web_ui_actions (T361962), Disable wgParserEnableLegacyMediaDOM (T363597), Disable last remaining projects using share user scripts (T301212)
  • 19:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183 (T352010)', diff saved to https://phabricator.wikimedia.org/P62714 and previous config saved to /var/cache/conftool/dbconfig/20240520-195224-ladsgroup.json
  • 19:46 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 19:45 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 19:33 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 19:32 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 19:31 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 19:31 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 19:23 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:23 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:20 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 19:09 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62713 and previous config saved to /var/cache/conftool/dbconfig/20240520-190908-root.json
  • 19:02 ebernhardson@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:02 ebernhardson@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62712 and previous config saved to /var/cache/conftool/dbconfig/20240520-185402-root.json
  • 18:43 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:43 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:38 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62711 and previous config saved to /var/cache/conftool/dbconfig/20240520-183856-root.json
  • 18:29 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
  • 18:28 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/citoid: apply
  • 18:23 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62710 and previous config saved to /var/cache/conftool/dbconfig/20240520-182350-root.json
  • 18:16 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
  • 18:15 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/citoid: apply
  • 18:11 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 18:09 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 18:08 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62709 and previous config saved to /var/cache/conftool/dbconfig/20240520-180844-root.json
  • 18:00 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 17:59 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1008.eqiad.wmnet with OS bullseye
  • 17:59 akosiaris@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - akosiaris@cumin1002"
  • 17:59 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1007.eqiad.wmnet with OS bullseye
  • 17:59 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - akosiaris@cumin1002"
  • 17:59 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 17:53 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62708 and previous config saved to /var/cache/conftool/dbconfig/20240520-175337-root.json
  • 17:42 mforns@deploy1002: Finished deploy [airflow-dags/analytics@b977332]: (no justification provided) (duration: 00m 27s)
  • 17:42 mforns@deploy1002: Started deploy [airflow-dags/analytics@b977332]: (no justification provided)
  • 17:38 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62707 and previous config saved to /var/cache/conftool/dbconfig/20240520-173831-root.json
  • 17:33 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2006.codfw.wmnet with OS bullseye
  • 17:30 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2008.codfw.wmnet with OS bullseye
  • 17:25 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2010.codfw.wmnet with OS bullseye
  • 17:23 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - akosiaris@cumin1002"
  • 17:21 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2007.codfw.wmnet with OS bullseye
  • 17:16 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - akosiaris@cumin1002"
  • 17:16 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
  • 17:14 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 17:13 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2008.codfw.wmnet with reason: host reimage
  • 17:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1183 (T352010)', diff saved to https://phabricator.wikimedia.org/P62706 and previous config saved to /var/cache/conftool/dbconfig/20240520-171228-ladsgroup.json
  • 17:12 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1183.eqiad.wmnet with reason: Maintenance
  • 17:12 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1183.eqiad.wmnet with reason: Maintenance
  • 17:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T352010)', diff saved to https://phabricator.wikimedia.org/P62705 and previous config saved to /var/cache/conftool/dbconfig/20240520-171204-ladsgroup.json
  • 17:09 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2010.codfw.wmnet with reason: host reimage
  • 17:06 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1007.eqiad.wmnet with reason: host reimage
  • 17:04 jforrester@deploy1002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 17:03 taavi@cumin1002: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
  • 17:03 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
  • 17:02 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2010.codfw.wmnet with reason: host reimage
  • 17:01 jforrester@deploy1002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 17:01 jforrester@deploy1002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 17:01 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2008.codfw.wmnet with reason: host reimage
  • 17:00 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1008.eqiad.wmnet with reason: host reimage
  • 17:00 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
  • 17:00 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
  • 16:59 jforrester@deploy1002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 16:58 jforrester@deploy1002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 16:58 taavi@cumin1002: START - Cookbook sre.wikireplicas.update-views
  • 16:57 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
  • 16:57 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 16:57 robh@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:57 jforrester@deploy1002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 16:56 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 16:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P62704 and previous config saved to /var/cache/conftool/dbconfig/20240520-165656-ladsgroup.json
  • 16:55 robh@cumin2002: START - Cookbook sre.dns.netbox
  • 16:55 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1008.eqiad.wmnet with reason: host reimage
  • 16:54 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1007.eqiad.wmnet with reason: host reimage
  • 16:54 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
  • 16:54 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:54 jforrester@deploy1002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 16:53 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:53 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:53 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 16:53 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 16:52 jforrester@deploy1002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 16:52 jforrester@deploy1002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 16:52 akosiaris@deploy1002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 16:52 akosiaris@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 16:51 jforrester@deploy1002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 16:50 jforrester@deploy1002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 16:50 jforrester@deploy1002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 16:48 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:46 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 16:42 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS bullseye
  • 16:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P62703 and previous config saved to /var/cache/conftool/dbconfig/20240520-164148-ladsgroup.json
  • 16:41 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main2008.codfw.wmnet with OS bullseye
  • 16:40 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main2007.codfw.wmnet with OS bullseye
  • 16:40 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS bullseye
  • 16:39 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:39 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:38 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 16:38 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1008.eqiad.wmnet with OS bullseye
  • 16:37 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1007.eqiad.wmnet with OS bullseye
  • 16:37 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 16:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:36 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:35 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:32 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:32 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2002 to codfw - jhancock@cumin2002"
  • 16:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2002 to codfw - jhancock@cumin2002"
  • 16:29 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 16:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T352010)', diff saved to https://phabricator.wikimedia.org/P62702 and previous config saved to /var/cache/conftool/dbconfig/20240520-162640-ladsgroup.json
  • 16:21 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 16:04 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
  • 16:00 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
  • 15:58 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 15:56 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 15:55 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 15:52 swfrench@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 15:52 swfrench@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 15:51 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2010.codfw.wmnet with OS bullseye
  • 15:50 swfrench@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 15:50 swfrench@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 15:48 swfrench@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 15:41 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 15:38 vgutierrez: repool upload@codfw with IPIP encapsulation enabled - T357257
  • 15:33 hnowlan: move 100% of commons traffic to run on k8s
  • 15:30 vgutierrez: rolling restart of pybal on lvs2014 and lvs2012 - T357257
  • 15:27 ejegg: payments-wiki upgraded from 3c23d3d8 to bc25f115
  • 15:21 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['sretest2002']
  • 15:21 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest2002']
  • 15:05 vgutierrez: depool upload@codfw before enabling IPIP encapsulation - T357257
  • 15:04 mforns@deploy1002: Finished deploy [analytics/refinery@4d42877] (hadoop-test): Deploy Commons Impact Metrics query improvements TEST [analytics/refinery@4d42877e] (duration: 03m 50s)
  • 15:02 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS bullseye
  • 15:01 mforns@deploy1002: Started deploy [analytics/refinery@4d42877] (hadoop-test): Deploy Commons Impact Metrics query improvements TEST [analytics/refinery@4d42877e]
  • 14:59 mforns@deploy1002: Finished deploy [analytics/refinery@4d42877] (thin): Deploy Commons Impact Metrics query improvements THIN [analytics/refinery@4d42877e] (duration: 04m 00s)
  • 14:55 mforns@deploy1002: Started deploy [analytics/refinery@4d42877] (thin): Deploy Commons Impact Metrics query improvements THIN [analytics/refinery@4d42877e]
  • 14:53 ejegg: re-enabled fundraising scheduled jobs
  • 14:53 mforns@deploy1002: Finished deploy [analytics/refinery@4d42877]: Deploy Commons Impact Metrics query improvements [analytics/refinery@4d42877e] (duration: 14m 08s)
  • 14:49 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['sretest2002']
  • 14:48 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest2002']
  • 14:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['sretest2002']
  • 14:48 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest2002']
  • 14:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:47 ejegg: fundraising civicrm upgraded from 4f6f2dc3 to 7839feb6
  • 14:46 ejegg: disabled fundraising scheduled jobs for Civi upgrade
  • 14:42 filippo@deploy1002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
  • 14:41 filippo@deploy1002: helmfile [eqiad] START helmfile.d/services/citoid: apply
  • 14:41 filippo@deploy1002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
  • 14:41 filippo@deploy1002: helmfile [codfw] START helmfile.d/services/citoid: apply
  • 14:41 filippo@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 14:40 filippo@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 14:40 filippo@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 14:39 filippo@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 14:39 filippo@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 14:39 filippo@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 14:38 mforns@deploy1002: Started deploy [analytics/refinery@4d42877]: Deploy Commons Impact Metrics query improvements [analytics/refinery@4d42877e]
  • 14:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1177 (T364299)', diff saved to https://phabricator.wikimedia.org/P62700 and previous config saved to /var/cache/conftool/dbconfig/20240520-142621-marostegui.json
  • 14:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 14:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 14:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T364299)', diff saved to https://phabricator.wikimedia.org/P62699 and previous config saved to /var/cache/conftool/dbconfig/20240520-142558-marostegui.json
  • 14:19 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2010.codfw.wmnet with OS bullseye
  • 14:19 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 14:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2181 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62698 and previous config saved to /var/cache/conftool/dbconfig/20240520-141828-root.json
  • 14:18 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2008.codfw.wmnet with OS bullseye
  • 14:17 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-main2007.codfw.wmnet with OS bullseye
  • 14:17 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-main2006.codfw.wmnet with OS bullseye
  • 14:14 reedy@deploy1002: Synchronized wmf-config/core-Permissions.php: T360977 (duration: 15m 54s)
  • 14:12 vgutierrez: repool upload@eqsin with IPIP encapsulation enabled - T357257
  • 14:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P62697 and previous config saved to /var/cache/conftool/dbconfig/20240520-141050-marostegui.json
  • 14:06 tchin@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/datasets-config: apply
  • 14:06 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config: apply
  • 14:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2181 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62696 and previous config saved to /var/cache/conftool/dbconfig/20240520-140321-root.json
  • 14:03 tchin@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/datasets-config: apply
  • 14:02 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config: apply
  • 14:01 cmooney@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2002.wikimedia.org with OS bookworm
  • 14:00 vgutierrez: rolling restart of pybal on lvs5005 and lvs5006 - T357257
  • 13:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P62695 and previous config saved to /var/cache/conftool/dbconfig/20240520-135542-marostegui.json
  • 13:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2181 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62694 and previous config saved to /var/cache/conftool/dbconfig/20240520-134815-root.json
  • 13:47 reedy@deploy1002: Synchronized wmf-config/throttle.php: T365221 (duration: 15m 20s)
  • 13:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1161 (T352010)', diff saved to https://phabricator.wikimedia.org/P62693 and previous config saved to /var/cache/conftool/dbconfig/20240520-134613-ladsgroup.json
  • 13:46 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 13:46 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 13:45 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 13:45 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 13:45 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 13:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T364299)', diff saved to https://phabricator.wikimedia.org/P62692 and previous config saved to /var/cache/conftool/dbconfig/20240520-134034-marostegui.json
  • 13:39 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 13:38 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 13:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2181 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62691 and previous config saved to /var/cache/conftool/dbconfig/20240520-133309-root.json
  • 13:29 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS bullseye
  • 13:28 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 13:27 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main2008.codfw.wmnet with OS bullseye
  • 13:27 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 13:27 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main2007.codfw.wmnet with OS bullseye
  • 13:26 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS bullseye
  • 13:25 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:24 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 13:24 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 13:23 reedy@deploy1002: Synchronized wmf-config/: T360989 T365323 (duration: 15m 35s)
  • 13:22 hnowlan: migrating 80% of commons traffic to k8s
  • 13:19 topranks: adding outbound ACL on irb.2002 on lsw1 switches in codfw to test DHCP function T365204
  • 13:18 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 13:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2181 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62689 and previous config saved to /var/cache/conftool/dbconfig/20240520-131803-root.json
  • 13:17 vgutierrez: depool upload@eqsin before enabling IPIP encapsulation - T357257
  • 13:11 vgutierrez: Re-enable puppet on A:ncredir && A:cp-upload_ulsfo - T365354
  • 13:04 Emperor: depool, restart swift-proxy, repool moss-fe1001 as ~12% connection failures reported by envoy since late 14th May T360913
  • 13:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2181 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62688 and previous config saved to /var/cache/conftool/dbconfig/20240520-130257-root.json
  • 12:59 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 12:54 vgutierrez: disable puppet on A:ncredir && A:cp-upload_ulsfo before merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/1034074 - T365354
  • 12:52 marostegui: Deploy schema change on s7 (only frwiktionary) eqiad with replication dbmaint T365352
  • 12:48 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2002.wikimedia.org with OS bookworm
  • 12:47 marostegui@cumin1002: dbctl commit (dc=all): 'db2181 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62687 and previous config saved to /var/cache/conftool/dbconfig/20240520-124749-root.json
  • 12:46 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:46 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change mgmt dns for sretest2002 - cmooney@cumin1002"
  • 12:45 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change mgmt dns for sretest2002 - cmooney@cumin1002"
  • 12:44 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) sretest2002.mgmt.codfw.wmnet on all recursors
  • 12:44 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache sretest2002.mgmt.codfw.wmnet on all recursors
  • 12:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2181.codfw.wmnet with OS bookworm
  • 12:01 marostegui: Deploy schema change on s4 eqiad with replication dbmaint T365352
  • 11:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
  • 11:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
  • 11:56 marostegui: Deploy schema change on s5 eqiad with replication dbmaint T365352
  • 11:47 marostegui: Deploy urgent schema change on s8 eqiad with replication dbmaint T365352
  • 11:40 hnowlan: migrating 30% of commons traffic to k8s
  • 11:38 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2181.codfw.wmnet with OS bookworm
  • 11:30 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 100%: After reimage', diff saved to https://phabricator.wikimedia.org/P62685 and previous config saved to /var/cache/conftool/dbconfig/20240520-113038-root.json
  • 11:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2181.codfw.wmnet with reason: Migration to bookworm
  • 11:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2181.codfw.wmnet with reason: Migration to bookworm
  • 11:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.ipmi-password-reset (exit_code=0)
  • 11:15 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 75%: After reimage', diff saved to https://phabricator.wikimedia.org/P62684 and previous config saved to /var/cache/conftool/dbconfig/20240520-111530-root.json
  • 11:15 marostegui@cumin1002: Updating IPMI password on 1 hosts - marostegui@cumin1002
  • 11:14 marostegui@cumin1002: START - Cookbook sre.hosts.ipmi-password-reset
  • 11:14 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.ipmi-password-reset (exit_code=99)
  • 11:14 marostegui@cumin1002: START - Cookbook sre.hosts.ipmi-password-reset
  • 11:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2181.codfw.wmnet with reason: Migration to bookworm
  • 11:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2181.codfw.wmnet with reason: Migration to bookworm
  • 11:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2181 T363792', diff saved to https://phabricator.wikimedia.org/P62682 and previous config saved to /var/cache/conftool/dbconfig/20240520-110217-root.json
  • 11:00 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 50%: After reimage', diff saved to https://phabricator.wikimedia.org/P62681 and previous config saved to /var/cache/conftool/dbconfig/20240520-110023-root.json
  • 10:46 Dreamy_Jazz: Restarting MediaModeration scanning script - https://wikitech.wikimedia.org/wiki/MediaModeration
  • 10:45 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 25%: After reimage', diff saved to https://phabricator.wikimedia.org/P62680 and previous config saved to /var/cache/conftool/dbconfig/20240520-104517-root.json
  • 10:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2175.codfw.wmnet with OS bookworm
  • 10:30 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 10%: After reimage', diff saved to https://phabricator.wikimedia.org/P62679 and previous config saved to /var/cache/conftool/dbconfig/20240520-103011-root.json
  • 10:18 godog: bounce prometheus@k8s in eqiad - T343529
  • 10:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2175.codfw.wmnet with reason: host reimage
  • 10:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2175.codfw.wmnet with reason: host reimage
  • 09:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2182 (T352010)', diff saved to https://phabricator.wikimedia.org/P62678 and previous config saved to /var/cache/conftool/dbconfig/20240520-095729-ladsgroup.json
  • 09:57 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 09:57 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 09:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T352010)', diff saved to https://phabricator.wikimedia.org/P62677 and previous config saved to /var/cache/conftool/dbconfig/20240520-095706-ladsgroup.json
  • 09:45 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2175.codfw.wmnet with OS bookworm
  • 09:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2175.codfw.wmnet with reason: Migration to bookworm
  • 09:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2175.codfw.wmnet with reason: Migration to bookworm
  • 09:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2175 T361543', diff saved to https://phabricator.wikimedia.org/P62676 and previous config saved to /var/cache/conftool/dbconfig/20240520-094352-marostegui.json
  • 09:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P62675 and previous config saved to /var/cache/conftool/dbconfig/20240520-094159-ladsgroup.json
  • 09:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P62674 and previous config saved to /var/cache/conftool/dbconfig/20240520-092651-ladsgroup.json
  • 09:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1014.eqiad.wmnet with reason: Testing new mariadb version
  • 09:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1014.eqiad.wmnet with reason: Testing new mariadb version
  • 09:18 marostegui: Install 10.6.18 on db1125 and pc1014 T365338
  • 09:17 hnowlan: Increasing commons on k8s traffic to 15%
  • 09:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T352010)', diff saved to https://phabricator.wikimedia.org/P62673 and previous config saved to /var/cache/conftool/dbconfig/20240520-091143-ladsgroup.json
  • 09:02 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
  • 09:02 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
  • 08:57 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
  • 08:56 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
  • 08:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1125.eqiad.wmnet with OS bookworm
  • 08:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1125.eqiad.wmnet with reason: host reimage
  • 08:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1125.eqiad.wmnet with reason: host reimage
  • 08:19 urbanecm@deploy1002: Finished scap: Backport for [Growth] enwiki: Enable AddLink backend (T308144) (duration: 17m 07s)
  • 08:16 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1125.eqiad.wmnet with OS bookworm
  • 08:06 urbanecm@deploy1002: urbanecm: Continuing with sync
  • 08:05 urbanecm@deploy1002: urbanecm: Backport for [Growth] enwiki: Enable AddLink backend (T308144) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:02 urbanecm@deploy1002: Started scap: Backport for [Growth] enwiki: Enable AddLink backend (T308144)
  • 07:41 urbanecm@deploy1002: Finished scap: Backport for Update interwiki.php cache (T363658) (duration: 27m 04s)
  • 07:14 urbanecm@deploy1002: Started scap: Backport for Update interwiki.php cache (T363658)
  • 06:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on db2161.codfw.wmnet with reason: Schema change T364299
  • 06:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 20:00:00 on db2161.codfw.wmnet with reason: Schema change T364299
  • 05:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2161 T365339', diff saved to https://phabricator.wikimedia.org/P62672 and previous config saved to /var/cache/conftool/dbconfig/20240520-055908-root.json
  • 05:58 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2165 to s8 primary T365339', diff saved to https://phabricator.wikimedia.org/P62671 and previous config saved to /var/cache/conftool/dbconfig/20240520-055812-root.json
  • 05:57 marostegui: Starting s8 codfw failover from db2161 to db2165 - T365339
  • 05:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 33 hosts with reason: Primary switchover s8 T365339
  • 05:35 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2165 with weight 0 T365339', diff saved to https://phabricator.wikimedia.org/P62670 and previous config saved to /var/cache/conftool/dbconfig/20240520-053523-root.json
  • 05:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 33 hosts with reason: Primary switchover s8 T365339
  • 03:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1172 (T364299)', diff saved to https://phabricator.wikimedia.org/P62669 and previous config saved to /var/cache/conftool/dbconfig/20240520-034057-marostegui.json
  • 03:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 03:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance
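
The entries above follow a regular "time, author, message" shape, and cookbook runs end with an `END (PASS)` or `END (FAIL)` line carrying an exit code. As a minimal sketch, assuming that shape holds for every line of interest, the snippet below parses entries and tallies cookbook outcomes. The regular expressions and the names `parse_entry` and `cookbook_outcomes` are illustrative assumptions, not part of any Wikimedia tooling; the sample lines are copied from this log.

```python
import re
from collections import Counter

# Assumed shape of one bullet line: "• HH:MM author: message".
ENTRY_RE = re.compile(
    r"^\s*•\s+(?P<time>\d{2}:\d{2})\s+(?P<author>[^:\s]+):\s+(?P<message>.*)$"
)
# Assumed shape of a cookbook completion message, e.g.
# "END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ...".
END_RE = re.compile(
    r"^END \((?P<status>PASS|FAIL)\) - Cookbook (?P<cookbook>\S+) \(exit_code=(?P<code>\d+)\)"
)


def parse_entry(line):
    """Return a dict with time, author and message, or None if the line is not an entry."""
    match = ENTRY_RE.match(line)
    return match.groupdict() if match else None


def cookbook_outcomes(lines):
    """Count (cookbook, status) pairs over all cookbook END lines."""
    counts = Counter()
    for line in lines:
        entry = parse_entry(line)
        if entry is None:
            continue  # date headings, blank lines, anything not in entry shape
        end = END_RE.match(entry["message"])
        if end:
            counts[(end["cookbook"], end["status"])] += 1
    return counts


SAMPLE = [
    "  • 16:39 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED",
    "  • 16:32 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)",
    "  • 15:33 hnowlan: move 100% of commons traffic to run on k8s",
]

if __name__ == "__main__":
    for (cookbook, status), n in sorted(cookbook_outcomes(SAMPLE).items()):
        print(f"{cookbook}\t{status}\t{n}")
```

Lines that do not match the entry pattern (date headings, blank lines) are skipped rather than treated as errors, so the same function can be run over a whole day's section.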

2024-05-19

  • 22:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2168 (T352010)', diff saved to https://phabricator.wikimedia.org/P62668 and previous config saved to /var/cache/conftool/dbconfig/20240519-223525-ladsgroup.json
  • 22:35 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 22:35 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 22:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T352010)', diff saved to https://phabricator.wikimedia.org/P62667 and previous config saved to /var/cache/conftool/dbconfig/20240519-223502-ladsgroup.json
  • 22:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P62666 and previous config saved to /var/cache/conftool/dbconfig/20240519-221954-ladsgroup.json
  • 22:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P62665 and previous config saved to /var/cache/conftool/dbconfig/20240519-220445-ladsgroup.json
  • 21:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T352010)', diff saved to https://phabricator.wikimedia.org/P62664 and previous config saved to /var/cache/conftool/dbconfig/20240519-214936-ladsgroup.json
  • 18:56 vgutierrez: vgutierrez@cp4049:~$ sudo rm /var/lib/prometheus/node.d/realserver-mss.prom
  • 18:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 18:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 18:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T364299)', diff saved to https://phabricator.wikimedia.org/P62663 and previous config saved to /var/cache/conftool/dbconfig/20240519-182447-marostegui.json
  • 18:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P62662 and previous config saved to /var/cache/conftool/dbconfig/20240519-180939-marostegui.json
  • 17:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P62661 and previous config saved to /var/cache/conftool/dbconfig/20240519-175431-marostegui.json
  • 17:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T364299)', diff saved to https://phabricator.wikimedia.org/P62660 and previous config saved to /var/cache/conftool/dbconfig/20240519-173923-marostegui.json
  • 16:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Schema change
  • 16:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Schema change
  • 16:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1167 (T364299)', diff saved to https://phabricator.wikimedia.org/P62658 and previous config saved to /var/cache/conftool/dbconfig/20240519-163855-marostegui.json
  • 16:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 16:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 16:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 16:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 11:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T352010)', diff saved to https://phabricator.wikimedia.org/P62657 and previous config saved to /var/cache/conftool/dbconfig/20240519-112730-ladsgroup.json
  • 11:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P62656 and previous config saved to /var/cache/conftool/dbconfig/20240519-111222-ladsgroup.json
  • 10:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P62655 and previous config saved to /var/cache/conftool/dbconfig/20240519-105714-ladsgroup.json
  • 10:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T352010)', diff saved to https://phabricator.wikimedia.org/P62654 and previous config saved to /var/cache/conftool/dbconfig/20240519-104206-ladsgroup.json
  • 10:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2159 (T352010)', diff saved to https://phabricator.wikimedia.org/P62653 and previous config saved to /var/cache/conftool/dbconfig/20240519-102315-ladsgroup.json
  • 10:23 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 10:23 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 10:23 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 10:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 10:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T352010)', diff saved to https://phabricator.wikimedia.org/P62652 and previous config saved to /var/cache/conftool/dbconfig/20240519-102247-ladsgroup.json
  • 10:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P62651 and previous config saved to /var/cache/conftool/dbconfig/20240519-100739-ladsgroup.json
  • 09:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P62650 and previous config saved to /var/cache/conftool/dbconfig/20240519-095231-ladsgroup.json
  • 09:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T352010)', diff saved to https://phabricator.wikimedia.org/P62649 and previous config saved to /var/cache/conftool/dbconfig/20240519-093723-ladsgroup.json
  • 07:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2217 (T352010)', diff saved to https://phabricator.wikimedia.org/P62648 and previous config saved to /var/cache/conftool/dbconfig/20240519-074556-ladsgroup.json
  • 07:45 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
  • 07:45 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
  • 07:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214 (T352010)', diff saved to https://phabricator.wikimedia.org/P62647 and previous config saved to /var/cache/conftool/dbconfig/20240519-074532-ladsgroup.json
  • 07:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214', diff saved to https://phabricator.wikimedia.org/P62646 and previous config saved to /var/cache/conftool/dbconfig/20240519-073025-ladsgroup.json
  • 07:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214', diff saved to https://phabricator.wikimedia.org/P62645 and previous config saved to /var/cache/conftool/dbconfig/20240519-071517-ladsgroup.json
  • 07:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214 (T352010)', diff saved to https://phabricator.wikimedia.org/P62644 and previous config saved to /var/cache/conftool/dbconfig/20240519-070008-ladsgroup.json
  • 05:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 05:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 05:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2214 (T352010)', diff saved to https://phabricator.wikimedia.org/P62643 and previous config saved to /var/cache/conftool/dbconfig/20240519-051029-ladsgroup.json
  • 05:10 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2214.codfw.wmnet with reason: Maintenance
  • 05:10 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2214.codfw.wmnet with reason: Maintenance
  • 01:43 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 01:43 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 01:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T352010)', diff saved to https://phabricator.wikimedia.org/P62642 and previous config saved to /var/cache/conftool/dbconfig/20240519-014335-ladsgroup.json
  • 01:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P62641 and previous config saved to /var/cache/conftool/dbconfig/20240519-012827-ladsgroup.json
  • 01:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P62640 and previous config saved to /var/cache/conftool/dbconfig/20240519-011320-ladsgroup.json
  • 00:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T352010)', diff saved to https://phabricator.wikimedia.org/P62639 and previous config saved to /var/cache/conftool/dbconfig/20240519-005811-ladsgroup.json

2024-05-18

  • 23:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2193 (T352010)', diff saved to https://phabricator.wikimedia.org/P62638 and previous config saved to /var/cache/conftool/dbconfig/20240518-230800-ladsgroup.json
  • 23:07 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 23:07 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 23:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T352010)', diff saved to https://phabricator.wikimedia.org/P62637 and previous config saved to /var/cache/conftool/dbconfig/20240518-230736-ladsgroup.json
  • 22:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P62636 and previous config saved to /var/cache/conftool/dbconfig/20240518-225228-ladsgroup.json
  • 22:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P62635 and previous config saved to /var/cache/conftool/dbconfig/20240518-223720-ladsgroup.json
  • 22:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2150 (T352010)', diff saved to https://phabricator.wikimedia.org/P62634 and previous config saved to /var/cache/conftool/dbconfig/20240518-222748-ladsgroup.json
  • 22:27 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 22:27 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 22:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T352010)', diff saved to https://phabricator.wikimedia.org/P62633 and previous config saved to /var/cache/conftool/dbconfig/20240518-222725-ladsgroup.json
  • 22:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T352010)', diff saved to https://phabricator.wikimedia.org/P62632 and previous config saved to /var/cache/conftool/dbconfig/20240518-222212-ladsgroup.json
  • 22:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P62631 and previous config saved to /var/cache/conftool/dbconfig/20240518-221216-ladsgroup.json
  • 21:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P62630 and previous config saved to /var/cache/conftool/dbconfig/20240518-215708-ladsgroup.json
  • 21:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T352010)', diff saved to https://phabricator.wikimedia.org/P62629 and previous config saved to /var/cache/conftool/dbconfig/20240518-214200-ladsgroup.json
  • 20:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2180 (T352010)', diff saved to https://phabricator.wikimedia.org/P62628 and previous config saved to /var/cache/conftool/dbconfig/20240518-200322-ladsgroup.json
  • 20:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 20:03 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 20:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T352010)', diff saved to https://phabricator.wikimedia.org/P62627 and previous config saved to /var/cache/conftool/dbconfig/20240518-200258-ladsgroup.json
  • 19:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P62626 and previous config saved to /var/cache/conftool/dbconfig/20240518-194750-ladsgroup.json
  • 19:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P62625 and previous config saved to /var/cache/conftool/dbconfig/20240518-193240-ladsgroup.json
  • 19:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T352010)', diff saved to https://phabricator.wikimedia.org/P62624 and previous config saved to /var/cache/conftool/dbconfig/20240518-191732-ladsgroup.json
  • 18:59 ryankemper@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_codfw
  • 18:58 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_codfw
  • 18:56 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2090.codfw.wmnet with OS bullseye
  • 18:36 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2090.codfw.wmnet with reason: host reimage
  • 18:33 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2090.codfw.wmnet with reason: host reimage
  • 18:16 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2090.codfw.wmnet with OS bullseye
  • 16:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 16:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 16:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T364299)', diff saved to https://phabricator.wikimedia.org/P62623 and previous config saved to /var/cache/conftool/dbconfig/20240518-162907-marostegui.json
  • 16:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P62622 and previous config saved to /var/cache/conftool/dbconfig/20240518-161400-marostegui.json
  • 15:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P62621 and previous config saved to /var/cache/conftool/dbconfig/20240518-155852-marostegui.json
  • 15:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2169 (T352010)', diff saved to https://phabricator.wikimedia.org/P62620 and previous config saved to /var/cache/conftool/dbconfig/20240518-155136-ladsgroup.json
  • 15:51 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 15:51 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 15:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T352010)', diff saved to https://phabricator.wikimedia.org/P62619 and previous config saved to /var/cache/conftool/dbconfig/20240518-155112-ladsgroup.json
  • 15:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T364299)', diff saved to https://phabricator.wikimedia.org/P62618 and previous config saved to /var/cache/conftool/dbconfig/20240518-154343-marostegui.json
  • 15:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P62617 and previous config saved to /var/cache/conftool/dbconfig/20240518-153604-ladsgroup.json
  • 15:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P62616 and previous config saved to /var/cache/conftool/dbconfig/20240518-152056-ladsgroup.json
  • 15:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T352010)', diff saved to https://phabricator.wikimedia.org/P62615 and previous config saved to /var/cache/conftool/dbconfig/20240518-150548-ladsgroup.json
  • 11:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2158 (T352010)', diff saved to https://phabricator.wikimedia.org/P62614 and previous config saved to /var/cache/conftool/dbconfig/20240518-112824-ladsgroup.json
  • 11:28 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 11:28 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 11:28 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 11:27 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 11:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T352010)', diff saved to https://phabricator.wikimedia.org/P62613 and previous config saved to /var/cache/conftool/dbconfig/20240518-112745-ladsgroup.json
  • 11:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P62612 and previous config saved to /var/cache/conftool/dbconfig/20240518-111237-ladsgroup.json
  • 10:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P62611 and previous config saved to /var/cache/conftool/dbconfig/20240518-105729-ladsgroup.json
  • 10:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T352010)', diff saved to https://phabricator.wikimedia.org/P62610 and previous config saved to /var/cache/conftool/dbconfig/20240518-104222-ladsgroup.json
  • 07:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2151 (T352010)', diff saved to https://phabricator.wikimedia.org/P62609 and previous config saved to /var/cache/conftool/dbconfig/20240518-071726-ladsgroup.json
  • 07:17 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 07:17 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 07:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T352010)', diff saved to https://phabricator.wikimedia.org/P62608 and previous config saved to /var/cache/conftool/dbconfig/20240518-071703-ladsgroup.json
  • 07:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P62607 and previous config saved to /var/cache/conftool/dbconfig/20240518-070155-ladsgroup.json
  • 06:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P62606 and previous config saved to /var/cache/conftool/dbconfig/20240518-064646-ladsgroup.json
  • 06:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2195 (T364299)', diff saved to https://phabricator.wikimedia.org/P62605 and previous config saved to /var/cache/conftool/dbconfig/20240518-063529-marostegui.json
  • 06:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2195.codfw.wmnet with reason: Maintenance
  • 06:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2195.codfw.wmnet with reason: Maintenance
  • 06:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T364299)', diff saved to https://phabricator.wikimedia.org/P62604 and previous config saved to /var/cache/conftool/dbconfig/20240518-063505-marostegui.json
  • 06:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T352010)', diff saved to https://phabricator.wikimedia.org/P62603 and previous config saved to /var/cache/conftool/dbconfig/20240518-063138-ladsgroup.json
  • 06:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P62602 and previous config saved to /var/cache/conftool/dbconfig/20240518-061958-marostegui.json
  • 06:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P62601 and previous config saved to /var/cache/conftool/dbconfig/20240518-060450-marostegui.json
  • 05:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2122 (T352010)', diff saved to https://phabricator.wikimedia.org/P62600 and previous config saved to /var/cache/conftool/dbconfig/20240518-055125-ladsgroup.json
  • 05:51 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 05:51 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 05:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T352010)', diff saved to https://phabricator.wikimedia.org/P62599 and previous config saved to /var/cache/conftool/dbconfig/20240518-055100-ladsgroup.json
  • 05:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T364299)', diff saved to https://phabricator.wikimedia.org/P62598 and previous config saved to /var/cache/conftool/dbconfig/20240518-054942-marostegui.json
  • 05:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P62597 and previous config saved to /var/cache/conftool/dbconfig/20240518-053550-ladsgroup.json
  • 05:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P62596 and previous config saved to /var/cache/conftool/dbconfig/20240518-052043-ladsgroup.json
  • 05:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T352010)', diff saved to https://phabricator.wikimedia.org/P62595 and previous config saved to /var/cache/conftool/dbconfig/20240518-050535-ladsgroup.json
  • 03:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2124 (T352010)', diff saved to https://phabricator.wikimedia.org/P62594 and previous config saved to /var/cache/conftool/dbconfig/20240518-030359-ladsgroup.json
  • 03:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 03:03 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 02:39 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2090.codfw.wmnet with OS bullseye
  • 01:18 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2090.codfw.wmnet with OS bullseye
  • 00:45 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 00:45 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 00:35 ryankemper@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: elastic2090* for ban elastic2090 before reimage - ryankemper@cumin2002 - T353878
  • 00:35 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: elastic2090* for ban elastic2090 before reimage - ryankemper@cumin2002 - T353878
  • 00:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 00:04 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 00:02 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"

2024-05-17

  • 23:46 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
  • 23:43 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
  • 23:41 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 23:41 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 23:08 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 23:06 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1008.eqiad.wmnet with OS bullseye
  • 23:05 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1007.eqiad.wmnet with OS bullseye
  • 22:43 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 22:21 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 22:20 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1008.eqiad.wmnet with OS bullseye
  • 22:20 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 22:19 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1007.eqiad.wmnet with OS bullseye
  • 21:57 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 21:57 akosiaris@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 21:47 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 21:10 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 21:02 ryankemper@puppetmaster1001: conftool action : set/weight=10:pooled=yes; selector: name=elastic2090\.codfw\.wmnet
  • 20:01 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 20:01 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 19:43 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 19:42 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:38 dzahn@cumin1002: conftool action : set/pooled=no; selector: name=ml-serve2002.codfw.wmnet
  • 19:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 18:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2181 (T364299)', diff saved to https://phabricator.wikimedia.org/P62592 and previous config saved to /var/cache/conftool/dbconfig/20240517-184554-marostegui.json
  • 18:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 18:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 18:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T364299)', diff saved to https://phabricator.wikimedia.org/P62591 and previous config saved to /var/cache/conftool/dbconfig/20240517-184530-marostegui.json
  • 18:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P62590 and previous config saved to /var/cache/conftool/dbconfig/20240517-183022-marostegui.json
  • 18:22 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 18:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P62589 and previous config saved to /var/cache/conftool/dbconfig/20240517-181515-marostegui.json
  • 18:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T364299)', diff saved to https://phabricator.wikimedia.org/P62588 and previous config saved to /var/cache/conftool/dbconfig/20240517-180006-marostegui.json
  • 17:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2121 (T352010)', diff saved to https://phabricator.wikimedia.org/P62587 and previous config saved to /var/cache/conftool/dbconfig/20240517-173608-ladsgroup.json
  • 17:36 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 17:35 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 17:18 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 17:18 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 17:06 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:35 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:22 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:22 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2002 to codfw - jhancock@cumin2002"
  • 16:21 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2002 to codfw - jhancock@cumin2002"
  • 16:17 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 16:15 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 15:22 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 15:21 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 15:20 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 14:07 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 14:06 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 14:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T352010)', diff saved to https://phabricator.wikimedia.org/P62585 and previous config saved to /var/cache/conftool/dbconfig/20240517-140648-ladsgroup.json
  • 13:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P62584 and previous config saved to /var/cache/conftool/dbconfig/20240517-135138-ladsgroup.json
  • 13:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P62583 and previous config saved to /var/cache/conftool/dbconfig/20240517-133630-ladsgroup.json
  • 13:26 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:25 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:24 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:23 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:23 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:22 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:22 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T352010)', diff saved to https://phabricator.wikimedia.org/P62582 and previous config saved to /var/cache/conftool/dbconfig/20240517-132122-ladsgroup.json
  • 12:56 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts kubestagetcd[1004-1006].eqiad.wmnet
  • 12:56 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:56 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kubestagetcd[1004-1006].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 12:55 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kubestagetcd[1004-1006].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ldap-replica1006.wikimedia.org
  • 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ldap-replica1006.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:46 kevinbazira@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 12:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ldap-replica1005.wikimedia.org
  • 12:27 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:27 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ldap-replica1005.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:25 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ldap-replica1005.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:24 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on kubestagetcd[1004-1006].eqiad.wmnet with reason: decom
  • 12:24 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on kubestagetcd[1004-1006].eqiad.wmnet with reason: decom
  • 12:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 12:12 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts kubestagemaster[1001-1002].eqiad.wmnet
  • 12:12 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:12 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kubestagemaster[1001-1002].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 12:11 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kubestagemaster[1001-1002].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 12:11 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ldap-replica1005.wikimedia.org
  • 12:11 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config: apply
  • 12:09 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 12:08 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config: apply
  • 12:07 tchin@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/datasets-config: apply
  • 12:07 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config: apply
  • 12:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ldap-replica2008.wikimedia.org
  • 12:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ldap-replica2008.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:05 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ldap-replica2008.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:02 jayme@cumin1002: START - Cookbook sre.hosts.decommission for hosts kubestagemaster[1001-1002].eqiad.wmnet
  • 11:56 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 11:53 jayme@cumin1002: conftool action : set/pooled=inactive; selector: name=kubestagemaster100[12].eqiad.wmnet
  • 11:51 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ldap-replica2008.wikimedia.org
  • 11:51 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on kubestagemaster[1001-1002].eqiad.wmnet with reason: decom
  • 11:51 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on kubestagemaster[1001-1002].eqiad.wmnet with reason: decom
  • 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ldap-replica2007.wikimedia.org
  • 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ldap-replica2007.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 11:47 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ldap-replica2007.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 11:44 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 11:39 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ldap-replica2007.wikimedia.org
  • 11:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1224 (T352010)', diff saved to https://phabricator.wikimedia.org/P62579 and previous config saved to /var/cache/conftool/dbconfig/20240517-113142-ladsgroup.json
  • 11:31 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
  • 11:31 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
  • 11:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T352010)', diff saved to https://phabricator.wikimedia.org/P62578 and previous config saved to /var/cache/conftool/dbconfig/20240517-113119-ladsgroup.json
  • 11:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P62577 and previous config saved to /var/cache/conftool/dbconfig/20240517-111611-ladsgroup.json
  • 11:08 jayme@cumin1002: conftool action : set/pooled=yes:weight=10; selector: name=kubestagemaster100[3-5].eqiad.wmnet
  • 11:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P62576 and previous config saved to /var/cache/conftool/dbconfig/20240517-110101-ladsgroup.json
  • 10:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T352010)', diff saved to https://phabricator.wikimedia.org/P62575 and previous config saved to /var/cache/conftool/dbconfig/20240517-104553-ladsgroup.json
  • 09:44 isaranto@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 09:39 isaranto@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 09:25 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1016.eqiad.wmnet
  • 09:17 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host snapshot1016.eqiad.wmnet
  • 09:06 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1015.eqiad.wmnet
  • 09:01 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host snapshot1015.eqiad.wmnet
  • 08:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1201 (T352010)', diff saved to https://phabricator.wikimedia.org/P62574 and previous config saved to /var/cache/conftool/dbconfig/20240517-082636-ladsgroup.json
  • 08:26 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 08:26 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 08:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T352010)', diff saved to https://phabricator.wikimedia.org/P62573 and previous config saved to /var/cache/conftool/dbconfig/20240517-082613-ladsgroup.json
  • 08:17 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
  • 08:17 jayme@deploy1002: helmfile [eqiad] START helmfile.d/services/zotero: apply
  • 08:16 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
  • 08:16 jayme@deploy1002: helmfile [codfw] START helmfile.d/services/zotero: apply
  • 08:15 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/zotero: apply
  • 08:14 jayme@deploy1002: helmfile [staging] START helmfile.d/services/zotero: apply
  • 08:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P62572 and previous config saved to /var/cache/conftool/dbconfig/20240517-081105-ladsgroup.json
  • 07:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P62571 and previous config saved to /var/cache/conftool/dbconfig/20240517-075558-ladsgroup.json
  • 07:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T352010)', diff saved to https://phabricator.wikimedia.org/P62570 and previous config saved to /var/cache/conftool/dbconfig/20240517-074050-ladsgroup.json
  • 06:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2167 (T364299)', diff saved to https://phabricator.wikimedia.org/P62568 and previous config saved to /var/cache/conftool/dbconfig/20240517-065920-marostegui.json
  • 06:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 06:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 06:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T364299)', diff saved to https://phabricator.wikimedia.org/P62567 and previous config saved to /var/cache/conftool/dbconfig/20240517-065857-marostegui.json
  • 06:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P62566 and previous config saved to /var/cache/conftool/dbconfig/20240517-064350-marostegui.json
  • 06:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P62565 and previous config saved to /var/cache/conftool/dbconfig/20240517-062842-marostegui.json
  • 06:18 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 55 hosts
  • 06:17 ryankemper@cumin2002: START - Cookbook sre.hosts.remove-downtime for 55 hosts
  • 06:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T364299)', diff saved to https://phabricator.wikimedia.org/P62564 and previous config saved to /var/cache/conftool/dbconfig/20240517-061334-marostegui.json
  • 06:10 ryankemper@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: T363975 eqiad cluster restart - ryankemper@cumin2002 - T363975
  • 05:52 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: T363975 eqiad cluster restart - ryankemper@cumin2002 - T363975
  • 05:52 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 55 hosts with reason: T363975
  • 05:50 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on 55 hosts with reason: T363975
  • 05:17 marostegui: Restart wikibugs
  • 05:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1187 (T352010)', diff saved to https://phabricator.wikimedia.org/P62563 and previous config saved to /var/cache/conftool/dbconfig/20240517-051721-ladsgroup.json
  • 05:17 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 05:17 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 05:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T352010)', diff saved to https://phabricator.wikimedia.org/P62562 and previous config saved to /var/cache/conftool/dbconfig/20240517-051658-ladsgroup.json
  • 05:06 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 05:05 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 05:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P62561 and previous config saved to /var/cache/conftool/dbconfig/20240517-050150-ladsgroup.json
  • 04:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P62560 and previous config saved to /var/cache/conftool/dbconfig/20240517-044642-ladsgroup.json
  • 04:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T352010)', diff saved to https://phabricator.wikimedia.org/P62559 and previous config saved to /var/cache/conftool/dbconfig/20240517-043134-ladsgroup.json
  • 02:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1180 (T352010)', diff saved to https://phabricator.wikimedia.org/P62558 and previous config saved to /var/cache/conftool/dbconfig/20240517-021211-ladsgroup.json
  • 02:12 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 02:11 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 02:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T352010)', diff saved to https://phabricator.wikimedia.org/P62557 and previous config saved to /var/cache/conftool/dbconfig/20240517-021148-ladsgroup.json
  • 01:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P62556 and previous config saved to /var/cache/conftool/dbconfig/20240517-015640-ladsgroup.json
  • 01:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P62555 and previous config saved to /var/cache/conftool/dbconfig/20240517-014132-ladsgroup.json
  • 01:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T352010)', diff saved to https://phabricator.wikimedia.org/P62554 and previous config saved to /var/cache/conftool/dbconfig/20240517-012622-ladsgroup.json

2024-05-16

  • 23:43 cwhite: restart apache on gerrit1003
  • 23:17 zabe@deploy1002: Synchronized private/PrivateSettings.php: Add secret for encrypting user password hashes - T150647 (duration: 16m 42s)
  • 23:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1173 (T352010)', diff saved to https://phabricator.wikimedia.org/P62553 and previous config saved to /var/cache/conftool/dbconfig/20240516-230951-ladsgroup.json
  • 23:09 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 23:09 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 23:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T352010)', diff saved to https://phabricator.wikimedia.org/P62552 and previous config saved to /var/cache/conftool/dbconfig/20240516-230939-ladsgroup.json
  • 23:05 ebernhardson@deploy1002: Finished deploy [airflow-dags/search@312e2be]: Correct new range partition sensor granularity (duration: 00m 21s)
  • 23:04 ebernhardson@deploy1002: Started deploy [airflow-dags/search@312e2be]: Correct new range partition sensor granularity
  • 22:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P62551 and previous config saved to /var/cache/conftool/dbconfig/20240516-225430-ladsgroup.json
  • 22:47 jsn@deploy1002: Finished scap: Backport for [JsonCodec, ParserCache] Improve debugging of serializability failures (T365036) (duration: 21m 57s)
  • 22:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P62550 and previous config saved to /var/cache/conftool/dbconfig/20240516-223922-ladsgroup.json
  • 22:27 jsn@deploy1002: jsn and cscott: Continuing with sync
  • 22:27 jsn@deploy1002: jsn and cscott: Backport for [JsonCodec, ParserCache] Improve debugging of serializability failures (T365036) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:25 jsn@deploy1002: Started scap: Backport for [JsonCodec, ParserCache] Improve debugging of serializability failures (T365036)
  • 22:24 jsn@deploy1002: Sync cancelled.
  • 22:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T352010)', diff saved to https://phabricator.wikimedia.org/P62549 and previous config saved to /var/cache/conftool/dbconfig/20240516-222414-ladsgroup.json
  • 22:02 ebernhardson@deploy1002: Finished deploy [airflow-dags/search@cb359e4]: add dags to collect daily webrequest and satisfaction search metrics (duration: 00m 25s)
  • 22:02 ebernhardson@deploy1002: Started deploy [airflow-dags/search@cb359e4]: add dags to collect daily webrequest and satisfaction search metrics
  • 21:52 jsn@deploy1002: cscott and jsn: Backport for [JsonCodec, ParserCache] Improve debugging of serializability failures (T365036) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:49 jsn@deploy1002: Started scap: Backport for [JsonCodec, ParserCache] Improve debugging of serializability failures (T365036)
  • 21:31 jsn@deploy1002: Finished scap: Backport for Update VE core submodule to master (27296e0e3) (T230323 T365052) (duration: 25m 10s)
  • 21:11 jsn@deploy1002: jsn and esanders: Continuing with sync
  • 21:09 mutante: LDAP - added uid rickijay to group nda (T365138)
  • 21:08 jsn@deploy1002: jsn and esanders: Backport for Update VE core submodule to master (27296e0e3) (T230323 T365052) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:06 jsn@deploy1002: Started scap: Backport for Update VE core submodule to master (27296e0e3) (T230323 T365052)
  • 21:05 mutante: LDAP - added uid dmuthuri to group wmf T364320
  • 20:43 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 20:43 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 20:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T352010)', diff saved to https://phabricator.wikimedia.org/P62548 and previous config saved to /var/cache/conftool/dbconfig/20240516-204342-ladsgroup.json
  • 20:33 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs1013.eqiad.wmnet
  • 20:33 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for aqs1013.eqiad.wmnet
  • 20:33 mutante: contint2002 - as usual have to manually "a2dismod mpm_event" on a machine using apache that has just been installed to fix the race condition with apache modules
  • 20:33 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint2002.wikimedia.org with OS bullseye
  • 20:31 jdrewniak@deploy1002: Finished scap: Backport for Fix exclude list for dark mode (T365084) (duration: 22m 36s)
  • 20:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P62547 and previous config saved to /var/cache/conftool/dbconfig/20240516-202834-ladsgroup.json
  • 20:14 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint2002.wikimedia.org with reason: host reimage
  • 20:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P62546 and previous config saved to /var/cache/conftool/dbconfig/20240516-201326-ladsgroup.json
  • 20:12 jdrewniak@deploy1002: jdrewniak and mabualruz: Continuing with sync
  • 20:11 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on contint2002.wikimedia.org with reason: host reimage
  • 20:11 jdrewniak@deploy1002: jdrewniak and mabualruz: Backport for Fix exclude list for dark mode (T365084) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:08 ryankemper: [Hadoop] Restarted `hadoop-hdfs-datanode` on `an-worker1172`
  • 20:08 jdrewniak@deploy1002: Started scap: Backport for Fix exclude list for dark mode (T365084)
  • 20:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1168 (T352010)', diff saved to https://phabricator.wikimedia.org/P62545 and previous config saved to /var/cache/conftool/dbconfig/20240516-200618-ladsgroup.json
  • 20:06 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 20:06 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 20:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T352010)', diff saved to https://phabricator.wikimedia.org/P62544 and previous config saved to /var/cache/conftool/dbconfig/20240516-200552-ladsgroup.json
  • 20:03 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hadoop.roll-restart-workers (exit_code=99) restart workers for Hadoop analytics cluster: Roll restart of jvm daemons for openjdk upgrade.
  • 19:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T352010)', diff saved to https://phabricator.wikimedia.org/P62543 and previous config saved to /var/cache/conftool/dbconfig/20240516-195817-ladsgroup.json
  • 19:55 dzahn@cumin1002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS bullseye
  • 19:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P62542 and previous config saved to /var/cache/conftool/dbconfig/20240516-195044-ladsgroup.json
  • 19:46 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2166 (T364299)', diff saved to https://phabricator.wikimedia.org/P62541 and previous config saved to /var/cache/conftool/dbconfig/20240516-194613-marostegui.json
  • 19:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 19:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 19:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165 (T364299)', diff saved to https://phabricator.wikimedia.org/P62540 and previous config saved to /var/cache/conftool/dbconfig/20240516-194548-marostegui.json
  • 19:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P62539 and previous config saved to /var/cache/conftool/dbconfig/20240516-193535-ladsgroup.json
  • 19:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P62538 and previous config saved to /var/cache/conftool/dbconfig/20240516-193040-marostegui.json
  • 19:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T352010)', diff saved to https://phabricator.wikimedia.org/P62537 and previous config saved to /var/cache/conftool/dbconfig/20240516-192027-ladsgroup.json
  • 19:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P62536 and previous config saved to /var/cache/conftool/dbconfig/20240516-191532-marostegui.json
  • 19:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165 (T364299)', diff saved to https://phabricator.wikimedia.org/P62535 and previous config saved to /var/cache/conftool/dbconfig/20240516-190024-marostegui.json
  • 18:58 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS buster
  • 18:46 dzahn@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host contint2002.wikimedia.org with OS bullseye
  • 18:32 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 18:17 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS bullseye
  • 18:15 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host contint2002.wikimedia.org
  • 18:13 cmooney@cumin1002: START - Cookbook sre.hosts.dhcp for host contint2002.wikimedia.org
  • 18:04 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint2002.wikimedia.org with OS buster
  • 17:53 brennen@deploy1002: Finished deploy [phabricator/deployment@7d858df]: test scap deployment with keyholder key misconfigured for T313624 (duration: 00m 38s)
  • 17:52 brennen@deploy1002: Started deploy [phabricator/deployment@7d858df]: test scap deployment with keyholder key misconfigured for T313624
  • 17:45 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 17:34 dani@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 17:34 dani@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 17:34 dani@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 17:33 dani@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 17:33 dani@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 17:33 dani@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 17:02 bd808@deploy1002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
  • 17:02 bd808@deploy1002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
  • 17:02 bd808@deploy1002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
  • 17:01 bd808@deploy1002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
  • 17:01 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
  • 17:00 bd808@deploy1002: helmfile [staging] START helmfile.d/services/developer-portal: apply
  • 17:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1165 (T352010)', diff saved to https://phabricator.wikimedia.org/P62529 and previous config saved to /var/cache/conftool/dbconfig/20240516-170035-ladsgroup.json
  • 17:00 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 17:00 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 17:00 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 16:59 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 16:58 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS buster
  • 16:57 ryankemper@cumin2002: START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop analytics cluster: Roll restart of jvm daemons for openjdk upgrade.
  • 16:57 dzahn@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host contint2002.wikimedia.org with OS bullseye
  • 16:57 ryankemper@cumin2002: END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
  • 16:41 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS bullseye
  • 16:41 ryankemper@cumin2002: START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
  • 16:40 dzahn@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host contint2002.wikimedia.org with OS bullseye
  • 16:39 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62528 and previous config saved to /var/cache/conftool/dbconfig/20240516-163915-arnaudb.json
  • 16:39 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['contint2002']
  • 16:38 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint2002']
  • 16:37 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['contint2002']
  • 16:37 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint2002']
  • 16:37 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['contint2002']
  • 16:37 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint2002']
  • 16:32 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['contint2002']
  • 16:31 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint2002']
  • 16:31 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['contint2002']
  • 16:30 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint2002']
  • 16:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62526 and previous config saved to /var/cache/conftool/dbconfig/20240516-162408-arnaudb.json
  • 16:12 topranks: announcing wikidough anycast ranges to Internet (transit) in magru T362421
  • 16:09 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62525 and previous config saved to /var/cache/conftool/dbconfig/20240516-160902-arnaudb.json
  • 15:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62523 and previous config saved to /var/cache/conftool/dbconfig/20240516-155356-arnaudb.json
  • 15:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62522 and previous config saved to /var/cache/conftool/dbconfig/20240516-155034-arnaudb.json
  • 15:45 dhinus: systemctl restart mariadb@s4.service on clouddb1015 (using too much RAM) T365164
  • 15:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62521 and previous config saved to /var/cache/conftool/dbconfig/20240516-153850-arnaudb.json
  • 15:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62520 and previous config saved to /var/cache/conftool/dbconfig/20240516-153527-arnaudb.json
  • 15:25 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS bullseye
  • 15:24 dzahn@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host contint2002.wikimedia.org with OS bullseye
  • 15:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 5%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62519 and previous config saved to /var/cache/conftool/dbconfig/20240516-152343-arnaudb.json
  • 15:20 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62518 and previous config saved to /var/cache/conftool/dbconfig/20240516-152021-arnaudb.json
  • 15:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 2%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62517 and previous config saved to /var/cache/conftool/dbconfig/20240516-150837-arnaudb.json
  • 15:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62516 and previous config saved to /var/cache/conftool/dbconfig/20240516-150515-arnaudb.json
  • 15:03 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS bullseye
  • 14:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62515 and previous config saved to /var/cache/conftool/dbconfig/20240516-145330-arnaudb.json
  • 14:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62514 and previous config saved to /var/cache/conftool/dbconfig/20240516-145009-arnaudb.json
  • 14:49 marostegui@cumin1002: dbctl commit (dc=all): 'es1024 (re)pooling @ 100%: After reimage', diff saved to https://phabricator.wikimedia.org/P62513 and previous config saved to /var/cache/conftool/dbconfig/20240516-144945-root.json
  • 14:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2174.codfw.wmnet with OS bookworm
  • 14:43 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on contint2002.wikimedia.org with reason: T334517
  • 14:43 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on contint2002.wikimedia.org with reason: T334517
  • 14:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 5%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62512 and previous config saved to /var/cache/conftool/dbconfig/20240516-143503-arnaudb.json
  • 14:34 marostegui@cumin1002: dbctl commit (dc=all): 'es1024 (re)pooling @ 75%: After reimage', diff saved to https://phabricator.wikimedia.org/P62511 and previous config saved to /var/cache/conftool/dbconfig/20240516-143439-root.json
  • 14:28 ladsgroup@deploy1002: Finished scap: Backport for Stop writing to the old columns of pagelinks in s6 (T352010) (duration: 15m 42s)
  • 14:28 hnowlan: migrated 5% of commons traffic to k8s
  • 14:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2174.codfw.wmnet with reason: host reimage
  • 14:25 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2174.codfw.wmnet with reason: host reimage
  • 14:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 2%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62510 and previous config saved to /var/cache/conftool/dbconfig/20240516-141957-arnaudb.json
  • 14:19 marostegui@cumin1002: dbctl commit (dc=all): 'es1024 (re)pooling @ 50%: After reimage', diff saved to https://phabricator.wikimedia.org/P62509 and previous config saved to /var/cache/conftool/dbconfig/20240516-141932-root.json
  • 14:15 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 14:15 ladsgroup@deploy1002: ladsgroup: Backport for Stop writing to the old columns of pagelinks in s6 (T352010) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:13 ladsgroup@deploy1002: Started scap: Backport for Stop writing to the old columns of pagelinks in s6 (T352010)
  • 14:09 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint1002:~$ time mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki enwiki --current --all --start '["76318767"]' 2>&1 | tee -a ~/T315510-enwiki-5; date
  • 14:08 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2174.codfw.wmnet with OS bookworm
  • 14:07 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2174.codfw.wmnet with reason: reimage
  • 14:07 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2174.codfw.wmnet with reason: reimage
  • 14:06 arnaudb@cumin1002: dbctl commit (dc=all): 'T364290 db2174', diff saved to https://phabricator.wikimedia.org/P62508 and previous config saved to /var/cache/conftool/dbconfig/20240516-140620-arnaudb.json
  • 14:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62507 and previous config saved to /var/cache/conftool/dbconfig/20240516-140451-arnaudb.json
  • 14:04 marostegui@cumin1002: dbctl commit (dc=all): 'es1024 (re)pooling @ 25%: After reimage', diff saved to https://phabricator.wikimedia.org/P62506 and previous config saved to /var/cache/conftool/dbconfig/20240516-140426-root.json
  • 14:04 jsn@deploy1002: Finished scap: Backport for Make EntitySchemaValue::getArrayValue() match EntityIdValue (T362955 T362001), Make EntitySchemaValue::getArrayValue() match EntityIdValue (T362955 T362001) (duration: 16m 11s)
  • 14:03 Emperor: depool, restart swift-proxy, repool ms-fe1010 as ~12% connection failures reported by envoy since late 14th May T360913
  • 13:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2176.codfw.wmnet with OS bookworm
  • 13:51 jsn@deploy1002: jsn and lucaswerkmeister-wmde: Continuing with sync
  • 13:50 jsn@deploy1002: jsn and lucaswerkmeister-wmde: Backport for Make EntitySchemaValue::getArrayValue() match EntityIdValue (T362955 T362001), Make EntitySchemaValue::getArrayValue() match EntityIdValue (T362955 T362001) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:49 marostegui@cumin1002: dbctl commit (dc=all): 'es1024 (re)pooling @ 10%: After reimage', diff saved to https://phabricator.wikimedia.org/P62505 and previous config saved to /var/cache/conftool/dbconfig/20240516-134918-root.json
  • 13:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1024.eqiad.wmnet with OS bookworm
  • 13:47 jsn@deploy1002: Started scap: Backport for Make EntitySchemaValue::getArrayValue() match EntityIdValue (T362955 T362001), Make EntitySchemaValue::getArrayValue() match EntityIdValue (T362955 T362001)
  • 13:37 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2176.codfw.wmnet with reason: host reimage
  • 13:34 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2176.codfw.wmnet with reason: host reimage
  • 13:32 jsn@deploy1002: Finished scap: Backport for Enable async jobqueue-powered URL uploads on commons (T295007) (duration: 18m 18s)
  • 13:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1024.eqiad.wmnet with reason: host reimage
  • 13:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1024.eqiad.wmnet with reason: host reimage
  • 13:19 jsn@deploy1002: jsn and hnowlan: Continuing with sync
  • 13:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T352010)', diff saved to https://phabricator.wikimedia.org/P62503 and previous config saved to /var/cache/conftool/dbconfig/20240516-131800-ladsgroup.json
  • 13:17 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2176.codfw.wmnet with OS bookworm
  • 13:16 jsn@deploy1002: jsn and hnowlan: Backport for Enable async jobqueue-powered URL uploads on commons (T295007) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:15 arnaudb@cumin1002: END (ERROR) - Cookbook sre.mysql.upgrade (exit_code=97) for db2176.codfw.wmnet
  • 13:15 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2176.codfw.wmnet
  • 13:14 arnaudb@cumin1002: dbctl commit (dc=all): 'T364290 db2176', diff saved to https://phabricator.wikimedia.org/P62502 and previous config saved to /var/cache/conftool/dbconfig/20240516-131429-arnaudb.json
  • 13:14 jsn@deploy1002: Started scap: Backport for Enable async jobqueue-powered URL uploads on commons (T295007)
  • 13:12 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1024.eqiad.wmnet with OS bookworm
  • 13:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1024 T364289', diff saved to https://phabricator.wikimedia.org/P62501 and previous config saved to /var/cache/conftool/dbconfig/20240516-131111-root.json
  • 13:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P62500 and previous config saved to /var/cache/conftool/dbconfig/20240516-130252-ladsgroup.json
  • 12:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P62499 and previous config saved to /var/cache/conftool/dbconfig/20240516-124743-ladsgroup.json
  • 10:48 fnegri@cumin1002: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
  • 10:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1202 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P62497 and previous config saved to /var/cache/conftool/dbconfig/20240516-104601-ladsgroup.json
  • 10:43 claime: New redirects for T25216 T204830 T31186 operational
  • 10:37 fnegri@cumin1002: START - Cookbook sre.wikireplicas.update-views
  • 10:32 claime: cumin 'A:all-mw' -b30 "run-puppet-agent -q" - T25216 T204830 T31186
  • 10:31 claime: cumin 'A:all-mw' "enable-puppet 'New redirects T25216 T204830 T31186 - cgoubert'"
  • 10:31 marostegui@cumin1002: dbctl commit (dc=all): 'Test pc4 master switch', diff saved to https://phabricator.wikimedia.org/P62496 and previous config saved to /var/cache/conftool/dbconfig/20240516-103148-marostegui.json
  • 10:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1202 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P62495 and previous config saved to /var/cache/conftool/dbconfig/20240516-103055-ladsgroup.json
  • 10:30 marostegui@cumin1002: dbctl commit (dc=all): 'Test pc4 master switch', diff saved to https://phabricator.wikimedia.org/P62494 and previous config saved to /var/cache/conftool/dbconfig/20240516-103039-marostegui.json
  • 10:30 cgoubert@deploy1002: Finished scap: Deploy new redirects to mw-on-k8s - T25216 T204830 T31186 (duration: 08m 06s)
  • 10:22 cgoubert@deploy1002: Started scap: Deploy new redirects to mw-on-k8s - T25216 T204830 T31186
  • 10:21 claime: New redirects ok on mwdebug - T25216 T204830 T31186
  • 10:19 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 10:19 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 10:18 marostegui@cumin1002: dbctl commit (dc=all): 'Add pc2015 and pc1015 to pc4 as depooled spares T362786', diff saved to https://phabricator.wikimedia.org/P62493 and previous config saved to /var/cache/conftool/dbconfig/20240516-101829-marostegui.json
  • 10:15 marostegui@cumin1002: dbctl commit (dc=all): 'es1021 (re)pooling @ 100%: After reimage', diff saved to https://phabricator.wikimedia.org/P62492 and previous config saved to /var/cache/conftool/dbconfig/20240516-101553-root.json
  • 10:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1202 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P62491 and previous config saved to /var/cache/conftool/dbconfig/20240516-101548-ladsgroup.json
  • 10:15 marostegui@cumin1002: dbctl commit (dc=all): 'Add pc2016 and pc1016 to pc4 T362786', diff saved to https://phabricator.wikimedia.org/P62490 and previous config saved to /var/cache/conftool/dbconfig/20240516-101543-marostegui.json
  • 10:11 marostegui@cumin1002: dbctl commit (dc=all): 'Add pc2014 and pc1014 to pc4 T362786', diff saved to https://phabricator.wikimedia.org/P62489 and previous config saved to /var/cache/conftool/dbconfig/20240516-101122-marostegui.json
  • 10:10 arnaudb@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 100%: post fix repool', diff saved to https://phabricator.wikimedia.org/P62488 and previous config saved to /var/cache/conftool/dbconfig/20240516-101018-arnaudb.json
  • 10:10 marostegui@cumin1002: dbctl commit (dc=all): 'Add pc2013 and pc1013 to pc2 T362786', diff saved to https://phabricator.wikimedia.org/P62487 and previous config saved to /var/cache/conftool/dbconfig/20240516-101009-marostegui.json
  • 10:09 marostegui@cumin1002: dbctl commit (dc=all): 'Add pc2012 and pc1012 to pc2 T362786', diff saved to https://phabricator.wikimedia.org/P62486 and previous config saved to /var/cache/conftool/dbconfig/20240516-100858-marostegui.json
  • 10:07 marostegui@cumin1002: dbctl commit (dc=all): 'Add pc2011 to pc1 T362786', diff saved to https://phabricator.wikimedia.org/P62485 and previous config saved to /var/cache/conftool/dbconfig/20240516-100744-marostegui.json
  • 10:04 marostegui@cumin1002: dbctl commit (dc=all): 'Add pc1011 to pc1 T362786', diff saved to https://phabricator.wikimedia.org/P62484 and previous config saved to /var/cache/conftool/dbconfig/20240516-100418-marostegui.json
  • 10:02 claime: cumin 'A:all-mw' "disable-puppet 'New redirects T25216 T204830 T31186 - cgoubert'"
  • 10:00 marostegui@cumin1002: dbctl commit (dc=all): 'es1021 (re)pooling @ 75%: After reimage', diff saved to https://phabricator.wikimedia.org/P62483 and previous config saved to /var/cache/conftool/dbconfig/20240516-100040-root.json
  • 09:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1202 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P62482 and previous config saved to /var/cache/conftool/dbconfig/20240516-095927-ladsgroup.json
  • 09:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T352010)', diff saved to https://phabricator.wikimedia.org/P62481 and previous config saved to /var/cache/conftool/dbconfig/20240516-095817-ladsgroup.json
  • 09:58 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 09:57 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 09:56 ladsgroup@cumin1002: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 09:56 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 09:54 arnaudb@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 75%: post fix repool', diff saved to https://phabricator.wikimedia.org/P62480 and previous config saved to /var/cache/conftool/dbconfig/20240516-095459-arnaudb.json
  • 09:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1173 (T352010)', diff saved to https://phabricator.wikimedia.org/P62479 and previous config saved to /var/cache/conftool/dbconfig/20240516-094717-ladsgroup.json
  • 09:47 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 09:47 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 09:45 marostegui@cumin1002: dbctl commit (dc=all): 'es1021 (re)pooling @ 50%: After reimage', diff saved to https://phabricator.wikimedia.org/P62478 and previous config saved to /var/cache/conftool/dbconfig/20240516-094534-root.json
  • 09:44 godog: clean up MediaWiki.rest_api_latency and MediaWiki.rest_api_errors - T365111
  • 09:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 50%: post fix repool', diff saved to https://phabricator.wikimedia.org/P62476 and previous config saved to /var/cache/conftool/dbconfig/20240516-093803-arnaudb.json
  • 09:30 marostegui@cumin1002: dbctl commit (dc=all): 'es1021 (re)pooling @ 25%: After reimage', diff saved to https://phabricator.wikimedia.org/P62475 and previous config saved to /var/cache/conftool/dbconfig/20240516-093028-root.json
  • 09:28 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:28 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:22 arnaudb@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 25%: post fix repool', diff saved to https://phabricator.wikimedia.org/P62474 and previous config saved to /var/cache/conftool/dbconfig/20240516-092257-arnaudb.json
  • 09:18 dani@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 09:18 dani@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 09:18 dani@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 09:17 dani@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 09:17 dani@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 09:17 dani@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 09:16 arnaudb@cumin1002: dbctl commit (dc=all): 'vslow/dump T364814 fix', diff saved to https://phabricator.wikimedia.org/P62473 and previous config saved to /var/cache/conftool/dbconfig/20240516-091613-arnaudb.json
  • 09:15 marostegui@cumin1002: dbctl commit (dc=all): 'es1021 (re)pooling @ 10%: After reimage', diff saved to https://phabricator.wikimedia.org/P62472 and previous config saved to /var/cache/conftool/dbconfig/20240516-091522-root.json
  • 09:15 arnaudb@cumin1002: dbctl commit (dc=all): 'vslow/dump T364814 fix', diff saved to https://phabricator.wikimedia.org/P62471 and previous config saved to /var/cache/conftool/dbconfig/20240516-091515-arnaudb.json
  • 09:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Promote db2204 to vslow/dump T364814', diff saved to https://phabricator.wikimedia.org/P62470 and previous config saved to /var/cache/conftool/dbconfig/20240516-091400-arnaudb.json
  • 09:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Group test readd', diff saved to https://phabricator.wikimedia.org/P62469 and previous config saved to /var/cache/conftool/dbconfig/20240516-090753-arnaudb.json
  • 09:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Group test removal', diff saved to https://phabricator.wikimedia.org/P62468 and previous config saved to /var/cache/conftool/dbconfig/20240516-090732-arnaudb.json
  • 09:03 Dreamy_Jazz: Stopping MediaModeration scanning script on `medium.dblist`
  • 09:03 Dreamy_Jazz: Stopping MediaModeration scanning script on `enwiki`
  • 08:59 Dreamy_Jazz: Scanning `enwiki` with MediaModeration script - https://wikitech.wikimedia.org/wiki/MediaModeration
  • 08:58 Dreamy_Jazz: Starting MediaModeration scanning script on `medium.dblist` - https://wikitech.wikimedia.org/wiki/MediaModeration
  • 08:51 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2204 with weight 500 T364814', diff saved to https://phabricator.wikimedia.org/P62466 and previous config saved to /var/cache/conftool/dbconfig/20240516-085123-arnaudb.json
  • 08:44 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2207 to s2 primary T364814', diff saved to https://phabricator.wikimedia.org/P62465 and previous config saved to /var/cache/conftool/dbconfig/20240516-084420-root.json
  • 08:41 arnaudb: Starting s2 codfw failover from db2204 to db2207 - T364814
  • 08:33 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 08:33 jiji@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
  • 08:33 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 08:33 jiji@deploy1002: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
  • 08:33 jiji@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 08:33 jiji@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 08:23 hashar@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.43.0-wmf.5 refs T361399
  • 08:12 arnaudb@cumin1002: dbctl commit (dc=all): 'db2126 depool', diff saved to https://phabricator.wikimedia.org/P62463 and previous config saved to /var/cache/conftool/dbconfig/20240516-081207-arnaudb.json
  • 08:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T360332)', diff saved to https://phabricator.wikimedia.org/P62462 and previous config saved to /var/cache/conftool/dbconfig/20240516-081136-arnaudb.json
  • 08:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2165 (T364299)', diff saved to https://phabricator.wikimedia.org/P62461 and previous config saved to /var/cache/conftool/dbconfig/20240516-081107-marostegui.json
  • 08:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 08:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 08:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T364299)', diff saved to https://phabricator.wikimedia.org/P62460 and previous config saved to /var/cache/conftool/dbconfig/20240516-081044-marostegui.json
  • 08:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1021.eqiad.wmnet with reason: host reimage
  • 08:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1021.eqiad.wmnet with reason: host reimage
  • 07:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P62458 and previous config saved to /var/cache/conftool/dbconfig/20240516-075628-arnaudb.json
  • 07:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P62457 and previous config saved to /var/cache/conftool/dbconfig/20240516-075537-marostegui.json
  • 07:51 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1021.eqiad.wmnet with OS bookworm
  • 07:50 arnaudb@cumin1002: dbctl commit (dc=all): 'Remove db2207 from API/vslow/dump T364814', diff saved to https://phabricator.wikimedia.org/P62456 and previous config saved to /var/cache/conftool/dbconfig/20240516-075024-arnaudb.json
  • 07:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2207 with weight 0 T364814', diff saved to https://phabricator.wikimedia.org/P62455 and previous config saved to /var/cache/conftool/dbconfig/20240516-074927-arnaudb.json
  • 07:49 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s2 T364814
  • 07:48 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 26 hosts with reason: Primary switchover s2 T364814
  • 07:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1021 T364289', diff saved to https://phabricator.wikimedia.org/P62454 and previous config saved to /var/cache/conftool/dbconfig/20240516-074837-root.json
  • 07:46 marostegui@cumin1002: dbctl commit (dc=all): 'Increase es1024 weight', diff saved to https://phabricator.wikimedia.org/P62453 and previous config saved to /var/cache/conftool/dbconfig/20240516-074625-marostegui.json
  • 07:44 mabualruz@deploy1002: Finished scap: Backport for Correct behaviour of ConfigHelper, add tests (T365084) (duration: 17m 31s)
  • 07:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P62452 and previous config saved to /var/cache/conftool/dbconfig/20240516-074121-arnaudb.json
  • 07:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P62451 and previous config saved to /var/cache/conftool/dbconfig/20240516-074030-marostegui.json
  • 07:37 marostegui@cumin1002: dbctl commit (dc=all): 'Increase es1024 weight', diff saved to https://phabricator.wikimedia.org/P62450 and previous config saved to /var/cache/conftool/dbconfig/20240516-073750-marostegui.json
  • 07:37 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es1025 to es5 primary master T365094', diff saved to https://phabricator.wikimedia.org/P62449 and previous config saved to /var/cache/conftool/dbconfig/20240516-073719-marostegui.json
  • 07:30 mabualruz@deploy1002: mabualruz: Continuing with sync
  • 07:30 mabualruz@deploy1002: mabualruz: Backport for Correct behaviour of ConfigHelper, add tests (T365084) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:26 mabualruz@deploy1002: Started scap: Backport for Correct behaviour of ConfigHelper, add tests (T365084)
  • 07:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T360332)', diff saved to https://phabricator.wikimedia.org/P62448 and previous config saved to /var/cache/conftool/dbconfig/20240516-072614-arnaudb.json
  • 07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T364299)', diff saved to https://phabricator.wikimedia.org/P62447 and previous config saved to /var/cache/conftool/dbconfig/20240516-072521-marostegui.json
  • 07:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1173 (T360332)', diff saved to https://phabricator.wikimedia.org/P62446 and previous config saved to /var/cache/conftool/dbconfig/20240516-072355-arnaudb.json
  • 07:23 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 07:23 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 06:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P62445 and previous config saved to /var/cache/conftool/dbconfig/20240516-065823-root.json
  • 06:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P62444 and previous config saved to /var/cache/conftool/dbconfig/20240516-064317-root.json
  • 06:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: Making es4 standalone T364447
  • 06:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: Making es4 standalone T364447
  • 06:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: Making es5 standalone T364447
  • 06:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: Making es5 standalone T364447
  • 06:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P62443 and previous config saved to /var/cache/conftool/dbconfig/20240516-062812-root.json
  • 06:18 marostegui: Make es5 standalone and disconnect replication T364447
  • 06:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: Making es5 standalone T364447
  • 06:16 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: Making es5 standalone T364447
  • 06:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P62442 and previous config saved to /var/cache/conftool/dbconfig/20240516-061306-root.json
  • 06:05 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es1020 to es4 primary master T364816', diff saved to https://phabricator.wikimedia.org/P62441 and previous config saved to /var/cache/conftool/dbconfig/20240516-060532-marostegui.json
  • 05:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 10%: After schema change', diff saved to https://phabricator.wikimedia.org/P62440 and previous config saved to /var/cache/conftool/dbconfig/20240516-055759-root.json
  • 05:43 marostegui: Make es4 standalone and disconnect replication T364447
  • 05:37 marostegui@cumin1002: dbctl commit (dc=all): 'Increase es1021 weight', diff saved to https://phabricator.wikimedia.org/P62439 and previous config saved to /var/cache/conftool/dbconfig/20240516-053746-marostegui.json
  • 05:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Making es4 standalone T364447
  • 05:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Making es4 standalone T364447
  • 05:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 05:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 05:23 marostegui: Deploy schema change dbmaint db1173 eqiad s6 T355609
  • 05:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1173 T364523', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240516-051853-root.json
  • 05:18 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db1231 to s6 primary and set section read-write T364523', diff saved to https://phabricator.wikimedia.org/P62437 and previous config saved to /var/cache/conftool/dbconfig/20240516-051808-marostegui.json
  • 05:17 marostegui: Starting s6 eqiad failover from db1173 to db1231 - T364523
  • 04:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s6 T364523
  • 04:58 marostegui@cumin1002: dbctl commit (dc=all): 'Set db1231 with weight 0 T364523', diff saved to https://phabricator.wikimedia.org/P62435 and previous config saved to /var/cache/conftool/dbconfig/20240516-045831-marostegui.json
  • 04:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 27 hosts with reason: Primary switchover s6 T364523
  • 04:04 eileen: civicrm upgraded from 26e7422a to 4f6f2dc3
  • 02:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1202 (T352010)', diff saved to https://phabricator.wikimedia.org/P62434 and previous config saved to /var/cache/conftool/dbconfig/20240516-020200-ladsgroup.json
  • 02:01 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 02:01 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 02:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T352010)', diff saved to https://phabricator.wikimedia.org/P62433 and previous config saved to /var/cache/conftool/dbconfig/20240516-020137-ladsgroup.json
  • 01:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P62432 and previous config saved to /var/cache/conftool/dbconfig/20240516-014630-ladsgroup.json
  • 01:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P62431 and previous config saved to /var/cache/conftool/dbconfig/20240516-013122-ladsgroup.json
  • 01:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T352010)', diff saved to https://phabricator.wikimedia.org/P62430 and previous config saved to /var/cache/conftool/dbconfig/20240516-011613-ladsgroup.json
  • 01:12 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 00:28 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye

