Server Admin Log/Archive 79

From Wikitech
2024-05-15

  • 22:41 ebernhardson@deploy1002: Finished deploy [airflow-dags/search@12e0cb9]: bump discolytics to 0.19.0 (duration: 00m 27s)
  • 22:40 ebernhardson@deploy1002: Started deploy [airflow-dags/search@12e0cb9]: bump discolytics to 0.19.0
  • 21:55 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:55 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: delete ssw1-d1-codfw mgmt ip - cmooney@cumin1002"
  • 21:54 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: delete ssw1-d1-codfw mgmt ip - cmooney@cumin1002"
  • 21:44 ebernhardson@deploy1002: Finished deploy [airflow-dags/search@718b2dd]: specify analytics-hadoop in hdfs urls (duration: 00m 25s)
  • 21:44 ebernhardson@deploy1002: Started deploy [airflow-dags/search@718b2dd]: specify analytics-hadoop in hdfs urls
  • 21:27 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 21:22 eileen: civicrm upgraded from ddc96594 to 26e7422a
  • 21:17 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:17 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add ssw1-d1-codfw mgmt ip - cmooney@cumin1002"
  • 21:16 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add ssw1-d1-codfw mgmt ip - cmooney@cumin1002"
  • 21:16 TheresNoTime: UTC late backport window complete
  • 21:16 samtar@deploy1002: Finished scap: Backport for AbuseFilterHooks: Provide feature flags for AF custom actions (T20110) (duration: 16m 31s)
  • 21:14 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 21:03 samtar@deploy1002: samtar and kharlan: Continuing with sync
  • 21:02 samtar@deploy1002: samtar and kharlan: Backport for AbuseFilterHooks: Provide feature flags for AF custom actions (T20110) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:59 samtar@deploy1002: Started scap: Backport for AbuseFilterHooks: Provide feature flags for AF custom actions (T20110)
  • 20:48 samtar@deploy1002: Finished scap: Backport for Enable night mode as a desktop beta feature (T363814), [enwiki] Throttle exemption for Editathon (T364708) (duration: 17m 35s)
  • 20:35 samtar@deploy1002: samtar and superpes and jdlrobson: Continuing with sync
  • 20:33 samtar@deploy1002: samtar and superpes and jdlrobson: Backport for Enable night mode as a desktop beta feature (T363814), [enwiki] Throttle exemption for Editathon (T364708) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2164 (T364299)', diff saved to https://phabricator.wikimedia.org/P62427 and previous config saved to /var/cache/conftool/dbconfig/20240515-203116-marostegui.json
  • 20:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 20:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 20:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
  • 20:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
  • 20:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T364299)', diff saved to https://phabricator.wikimedia.org/P62426 and previous config saved to /var/cache/conftool/dbconfig/20240515-203037-marostegui.json
  • 20:30 samtar@deploy1002: Started scap: Backport for Enable night mode as a desktop beta feature (T363814), [enwiki] Throttle exemption for Editathon (T364708)
  • 20:28 samtar@deploy1002: Finished scap: Backport for [ParserCache] Preserve information from the JsonException when logging failures (T365036) (duration: 16m 41s)
  • 20:16 samtar@deploy1002: cscott and samtar: Continuing with sync
  • 20:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P62425 and previous config saved to /var/cache/conftool/dbconfig/20240515-201529-marostegui.json
  • 20:15 samtar@deploy1002: cscott and samtar: Backport for [ParserCache] Preserve information from the JsonException when logging failures (T365036) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:12 samtar@deploy1002: Started scap: Backport for [ParserCache] Preserve information from the JsonException when logging failures (T365036)
  • 20:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P62424 and previous config saved to /var/cache/conftool/dbconfig/20240515-200022-marostegui.json
  • 19:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T364299)', diff saved to https://phabricator.wikimedia.org/P62423 and previous config saved to /var/cache/conftool/dbconfig/20240515-194514-marostegui.json
  • 19:06 cstone: payments-wiki upgraded from 3380990f to 98189883
  • 18:44 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config-next: apply
  • 18:13 ryankemper@cumin2002: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid public cluster: Roll restart of Druid jvm daemons.
  • 18:03 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1007.eqiad.wmnet with OS bullseye
  • 17:46 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.copy (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 17:40 tchin@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/datasets-config-next: apply
  • 17:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1194 (T352010)', diff saved to https://phabricator.wikimedia.org/P62420 and previous config saved to /var/cache/conftool/dbconfig/20240515-173259-ladsgroup.json
  • 17:32 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 17:32 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 17:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T352010)', diff saved to https://phabricator.wikimedia.org/P62419 and previous config saved to /var/cache/conftool/dbconfig/20240515-173236-ladsgroup.json
  • 17:28 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config-next: apply
  • 17:22 ryankemper@cumin2002: START - Cookbook sre.druid.roll-restart-workers for Druid public cluster: Roll restart of Druid jvm daemons.
  • 17:21 ryankemper@cumin2002: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid analytics cluster: Roll restart of Druid jvm daemons.
  • 17:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P62418 and previous config saved to /var/cache/conftool/dbconfig/20240515-171729-ladsgroup.json
  • 17:17 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1007.eqiad.wmnet with OS bullseye
  • 17:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P62417 and previous config saved to /var/cache/conftool/dbconfig/20240515-170221-ladsgroup.json
  • 16:50 tchin@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/datasets-config-next: apply
  • 16:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T352010)', diff saved to https://phabricator.wikimedia.org/P62416 and previous config saved to /var/cache/conftool/dbconfig/20240515-164713-ladsgroup.json
  • 16:40 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config-next: apply
  • 16:37 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:37 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:34 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config-next: apply
  • 16:33 ryankemper@cumin2002: START - Cookbook sre.druid.roll-restart-workers for Druid analytics cluster: Roll restart of Druid jvm daemons.
  • 16:31 mutante: gerrit2002 - mv /run/motd.dynamic.new /run/motd.dynamic
  • 16:24 mutante: gerrit1003 - MOTD wasn't updating anymore: a manual "run-parts /etc/update-motd.d" showed updated data while /run/motd.dynamic was outdated. Fixed by manually renaming /run/motd.dynamic.new to /run/motd.dynamic and logging in, because the update is triggered by PAM. Why this happened is still unclear.
  • 16:06 hashar: Gerrit was briefly unreachable between 15:42 and 15:55 UTC | T365041
  • 15:58 vgutierrez: repool upload@ulsfo with IPIP encapsulation enabled - T357257
  • 15:56 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/opentelemetry-collector: apply
  • 15:56 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/opentelemetry-collector: apply
  • 15:55 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config: apply
  • 15:51 cgoubert@deploy1002: Finished scap: mw-on-k8s: Bump maxUnavailable to 6% - T362323 (duration: 02m 01s)
  • 15:49 cgoubert@deploy1002: Started scap: mw-on-k8s: Bump maxUnavailable to 6% - T362323
  • 15:43 hnowlan@deploy1002: Finished deploy [restbase/deploy@92abb6a]: Deploying new wikis T360304 T360311 T363244 T363250 T363257 T363264 T363271 (duration: 16m 52s)
  • 15:37 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/opentelemetry-collector: apply
  • 15:37 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/services/opentelemetry-collector: apply
  • 15:36 cdanis@deploy1002: helmfile [staging] DONE helmfile.d/services/opentelemetry-collector: apply
  • 15:36 cdanis@deploy1002: helmfile [staging] START helmfile.d/services/opentelemetry-collector: apply
  • 15:35 cdanis@deploy1002: helmfile [staging] DONE helmfile.d/services/opentelemetry-collector: apply
  • 15:35 cdanis@deploy1002: helmfile [staging] START helmfile.d/services/opentelemetry-collector: apply
  • 15:32 cdanis@deploy1002: helmfile [staging] DONE helmfile.d/services/opentelemetry-collector: apply
  • 15:32 cdanis@deploy1002: helmfile [staging] START helmfile.d/services/opentelemetry-collector: apply
  • 15:31 cdanis@deploy1002: helmfile [staging] DONE helmfile.d/services/opentelemetry-collector: apply
  • 15:31 cdanis@deploy1002: helmfile [staging] START helmfile.d/services/opentelemetry-collector: apply
  • 15:28 cdanis@deploy1002: helmfile [staging] DONE helmfile.d/services/opentelemetry-collector: apply
  • 15:28 cdanis@deploy1002: helmfile [staging] START helmfile.d/services/opentelemetry-collector: apply
  • 15:26 hnowlan@deploy1002: Started deploy [restbase/deploy@92abb6a]: Deploying new wikis T360304 T360311 T363244 T363250 T363257 T363264 T363271
  • 15:25 vgutierrez: rolling restart of pybal on lvs4010 and lvs4009 - T357257
  • 15:06 jsn@deploy1002: Finished scap: Backport for [Follow-up] Override VE overlays in night-mode (T363861), Mark night mode as a valid beta feature (T363814), Mark night mode as a valid beta feature (T363814) (duration: 18m 26s)
  • 15:05 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 14:57 vgutierrez: re-enable puppet on A:lvs - T357257
  • 14:53 jsn@deploy1002: jsn and jdlrobson: Continuing with sync
  • 14:51 vgutierrez: disable puppet on A:lvs before merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/1031827 - T357257
  • 14:51 jsn@deploy1002: jsn and jdlrobson: Backport for [Follow-up] Override VE overlays in night-mode (T363861), Mark night mode as a valid beta feature (T363814), Mark night mode as a valid beta feature (T363814) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:48 jsn@deploy1002: Started scap: Backport for [Follow-up] Override VE overlays in night-mode (T363861), Mark night mode as a valid beta feature (T363814), Mark night mode as a valid beta feature (T363814)
  • 14:44 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2010.codfw.wmnet with OS bullseye
  • 14:44 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 14:43 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 14:41 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2008.codfw.wmnet with OS bullseye
  • 14:41 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 14:40 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 14:39 claime: Repooling mw2286.codfw.wmnet - T364863
  • 14:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2007.codfw.wmnet with OS bullseye
  • 14:39 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 14:38 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mw2286.codfw.wmnet
  • 14:38 cgoubert@cumin1002: START - Cookbook sre.hosts.remove-downtime for mw2286.codfw.wmnet
  • 14:38 claime: Removing downtime on mw2286.codfw.wmnet - T364863
  • 14:37 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 14:32 vgutierrez: depool upload@ulsfo before enabling IPIP encapsulation - T357257
  • 14:26 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2010.codfw.wmnet with reason: host reimage
  • 14:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2008.codfw.wmnet with reason: host reimage
  • 14:24 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2010.codfw.wmnet with reason: host reimage
  • 14:22 jsn@deploy1002: Finished scap: Backport for InitialiseSettings.php: Add wmgUseAutoModerator (T364034) (duration: 16m 44s)
  • 14:22 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
  • 14:20 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2008.codfw.wmnet with reason: host reimage
  • 14:20 fab@deploy1002: Finished deploy [airflow-dags/research@ecf603d]: (no justification provided) (duration: 00m 32s)
  • 14:20 fab@deploy1002: Started deploy [airflow-dags/research@ecf603d]: (no justification provided)
  • 14:19 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
  • 14:17 vgutierrez: uploaded tcp-mss-clamper 0.5.1 to bullseye-wikimedia (apt.wm.o) - T357257
  • 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1214.eqiad.wmnet
  • 14:10 jsn@deploy1002: jsn: Continuing with sync
  • 14:10 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 14:10 vgutierrez: re-enable puppet on A:lvs - T357257
  • 14:09 jsn@deploy1002: jsn: Backport for InitialiseSettings.php: Add wmgUseAutoModerator (T364034) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:06 jsn@deploy1002: Started scap: Backport for InitialiseSettings.php: Add wmgUseAutoModerator (T364034)
  • 14:06 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1214.eqiad.wmnet
  • 14:03 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1211.eqiad.wmnet
  • 14:02 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS bullseye
  • 14:02 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 14:02 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2008.codfw.wmnet with OS bullseye
  • 14:02 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2007.codfw.wmnet with OS bullseye
  • 14:01 jsn@deploy1002: Finished scap: Backport for extension-list: Add AutoModerator (T364034) (duration: 51m 44s)
  • 14:01 vgutierrez: disable puppet on A:lvs before merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/1031814 - T357257
  • 14:00 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config: apply
  • 13:54 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1211.eqiad.wmnet
  • 13:54 moritzm: installing nghttp2 security updates
  • 13:53 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1209.eqiad.wmnet
  • 13:52 eevans@deploy1002: helmfile [eqiad] DONE helmfile.d/services/echostore: apply
  • 13:51 eevans@deploy1002: helmfile [eqiad] START helmfile.d/services/echostore: apply
  • 13:49 eevans@deploy1002: helmfile [codfw] DONE helmfile.d/services/echostore: apply
  • 13:49 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2006.codfw.wmnet with OS bullseye
  • 13:49 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 13:48 eevans@deploy1002: helmfile [codfw] START helmfile.d/services/echostore: apply
  • 13:45 eevans@deploy1002: helmfile [staging] DONE helmfile.d/services/echostore: apply
  • 13:44 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 13:44 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 13:44 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 13:44 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 13:44 eevans@deploy1002: helmfile [staging] START helmfile.d/services/echostore: apply
  • 13:43 jsn@deploy1002: jsn: Continuing with sync
  • 13:42 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 13:42 jsn@deploy1002: jsn: Backport for extension-list: Add AutoModerator (T364034) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:41 moritzm: installing libpgjava security updates
  • 13:40 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=thanos-fe1001.eqiad.wmnet
  • 13:40 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1209.eqiad.wmnet
  • 13:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1203.eqiad.wmnet
  • 13:34 elukey: depool thanos-fe1001 and move envoy to PKI TLS cert
  • 13:34 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=thanos-fe1001.eqiad.wmnet
  • 13:32 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:32 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 13:27 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.reboot_sanitaria (exit_code=0) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:27 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:27 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.reboot_sanitaria (exit_code=0) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:27 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:26 arnaudb@cumin1002: END (ERROR) - Cookbook sre.mysql.reboot_sanitaria (exit_code=97) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:26 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:25 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.reboot_sanitaria (exit_code=99) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:25 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
  • 13:22 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
  • 13:18 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.reboot_sanitaria (exit_code=0) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:18 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:17 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.reboot_sanitaria (exit_code=99) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:17 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.reboot_sanitaria (exit_code=0) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:16 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:15 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.reboot_sanitaria (exit_code=0) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:15 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:12 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.reboot_sanitaria (exit_code=0) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:12 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:11 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:11 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.reboot_sanitaria (exit_code=0) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:11 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:11 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 13:10 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.reboot_sanitaria (exit_code=0) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:10 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:09 jsn@deploy1002: Started scap: Backport for extension-list: Add AutoModerator (T364034)
  • 13:09 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.reboot_sanitaria (exit_code=0) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:09 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:07 vgutierrez: uploaded golang-github-florianl-go-tc 0.4.4-0.20240511074908-d584238bf6cb to apt.wm.o (bookworm-wikimedia)
  • 13:04 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS bullseye
  • 13:03 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.copy (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.reboot_sanitaria (exit_code=0) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:03 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:02 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2006.codfw.wmnet with OS bullseye
  • 13:01 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS bullseye
  • 13:01 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 13:00 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.reboot_sanitaria (exit_code=0) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:00 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 12:58 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts kubestagetcd[2001-2003].codfw.wmnet
  • 12:57 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:57 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kubestagetcd[2001-2003].codfw.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 12:57 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 12:56 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kubestagetcd[2001-2003].codfw.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 12:53 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 12:46 jayme@cumin1002: START - Cookbook sre.hosts.decommission for hosts kubestagetcd[2001-2003].codfw.wmnet
  • 12:23 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on kubestagetcd[2001-2003].codfw.wmnet with reason: decom
  • 12:23 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on kubestagetcd[2001-2003].codfw.wmnet with reason: decom
  • 12:19 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1203.eqiad.wmnet
  • 11:52 aborrero@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 11:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: openldap::rw
  • 11:34 mvolz@deploy1002: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
  • 11:33 mvolz@deploy1002: helmfile [eqiad] START helmfile.d/services/zotero: apply
  • 11:33 mvolz@deploy1002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
  • 11:32 mvolz@deploy1002: helmfile [codfw] START helmfile.d/services/zotero: apply
  • 11:31 mvolz@deploy1002: helmfile [staging] DONE helmfile.d/services/zotero: apply
  • 11:31 mvolz@deploy1002: helmfile [staging] START helmfile.d/services/zotero: apply
  • 11:29 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: openldap::rw
  • 11:28 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for backend: Fix Unknown column 'Array' in 'where clause' (T364974), backend: Fix Unknown column 'Array' in 'where clause' (T364974) (duration: 15m 36s)
  • 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1193.eqiad.wmnet
  • 11:16 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde: Continuing with sync
  • 11:15 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde: Backport for backend: Fix Unknown column 'Array' in 'where clause' (T364974), backend: Fix Unknown column 'Array' in 'where clause' (T364974) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:13 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for backend: Fix Unknown column 'Array' in 'where clause' (T364974), backend: Fix Unknown column 'Array' in 'where clause' (T364974)
  • 11:10 aborrero@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 11:09 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1193.eqiad.wmnet
  • 11:05 aborrero@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 11:03 logmsgbot: lucaswerkmeister-wmde@deploy1002 Sync cancelled.
  • 10:54 gmodena@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 10:54 gmodena@deploy1002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 10:53 gmodena@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 10:53 gmodena@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 10:53 logmsgbot: lucaswerkmeister-wmde@deploy1002 zabe and lucaswerkmeister-wmde: Backport for Fix capitalization of Subquery (T364974), Fix capitalization of Subquery (T364974) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 10:52 aborrero@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 10:50 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for Fix capitalization of Subquery (T364974), Fix capitalization of Subquery (T364974)
  • 10:49 gmodena@deploy1002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 10:49 gmodena@deploy1002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 10:40 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 10:32 dcausse@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:32 dcausse@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:31 cmooney@cumin1002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device cloudsw1-e4-eqiad
  • 10:29 cmooney@cumin1002: START - Cookbook sre.network.tls for network device cloudsw1-e4-eqiad
  • 10:28 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 10:28 jiji@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 10:20 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 10:15 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:15 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:09 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 10:09 jiji@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 10:06 btullis@deploy1002: Finished deploy [airflow-dags/analytics@ecf603d]: (no justification provided) (duration: 00m 30s)
  • 10:06 btullis@deploy1002: Started deploy [airflow-dags/analytics@ecf603d]: (no justification provided)
  • 10:06 btullis@deploy1002: Finished deploy [airflow-dags/analytics_test@ecf603d]: (no justification provided) (duration: 00m 11s)
  • 10:06 btullis@deploy1002: Started deploy [airflow-dags/analytics_test@ecf603d]: (no justification provided)
  • 10:02 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 10:02 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 09:59 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 09:59 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 09:57 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 09:57 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 09:54 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 09:54 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 09:53 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts kubestagemaster[2001-2002].codfw.wmnet
  • 09:53 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:53 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kubestagemaster[2001-2002].codfw.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 09:52 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kubestagemaster[2001-2002].codfw.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 09:50 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 09:49 claime: Manually relaunching mediawiki_job_update_special_pages_s5.service
  • 09:47 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:47 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 09:43 btullis@deploy1002: Finished deploy [analytics/refinery@88ed505] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@88ed505e] (duration: 02m 53s)
  • 09:43 jayme@cumin1002: START - Cookbook sre.hosts.decommission for hosts kubestagemaster[2001-2002].codfw.wmnet
  • 09:40 btullis@deploy1002: Started deploy [analytics/refinery@88ed505] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@88ed505e]
  • 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host seaborgium.wikimedia.org
  • 09:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host seaborgium.wikimedia.org
  • 09:25 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 09:22 Dreamy_Jazz: Starting MediaModeration script on group2 wikis for a test
  • 09:20 arnaudb@cumin1002: END (ERROR) - Cookbook sre.mysql.copy (exit_code=97) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 09:19 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 09:14 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.copy (exit_code=0) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 09:13 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 09:11 jayme@cumin1002: conftool action : set/pooled=inactive; selector: name=kubestagemaster200[12].codfw.wmnet
  • 09:10 arnaudb@cumin1002: END (ERROR) - Cookbook sre.mysql.copy (exit_code=97) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 09:09 btullis@deploy1002: Finished deploy [analytics/refinery@88ed505] (thin): Regular analytics weekly train THIN [analytics/refinery@88ed505e] (duration: 04m 17s)
  • 09:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1191 (T352010)', diff saved to https://phabricator.wikimedia.org/P62410 and previous config saved to /var/cache/conftool/dbconfig/20240515-090522-ladsgroup.json
  • 09:05 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 09:05 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 09:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T352010)', diff saved to https://phabricator.wikimedia.org/P62409 and previous config saved to /var/cache/conftool/dbconfig/20240515-090458-ladsgroup.json
  • 09:04 btullis@deploy1002: Started deploy [analytics/refinery@88ed505] (thin): Regular analytics weekly train THIN [analytics/refinery@88ed505e]
  • 09:03 moritzm: upgrade seaborgium to bullseye T364823
  • 09:02 btullis@deploy1002: Finished deploy [analytics/refinery@88ed505]: Regular analytics weekly train [analytics/refinery@88ed505e] (duration: 14m 41s)
  • 09:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on seaborgium.wikimedia.org with reason: OS update
  • 09:00 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on seaborgium.wikimedia.org with reason: OS update
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2163 (T364299)', diff saved to https://phabricator.wikimedia.org/P62408 and previous config saved to /var/cache/conftool/dbconfig/20240515-085247-marostegui.json
  • 08:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
  • 08:52 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1192.eqiad.wmnet
  • 08:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T364299)', diff saved to https://phabricator.wikimedia.org/P62407 and previous config saved to /var/cache/conftool/dbconfig/20240515-085224-marostegui.json
  • 08:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P62406 and previous config saved to /var/cache/conftool/dbconfig/20240515-084950-ladsgroup.json
  • 08:48 btullis@deploy1002: Started deploy [analytics/refinery@88ed505]: Regular analytics weekly train [analytics/refinery@88ed505e]
  • 08:42 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 08:40 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.copy (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 08:40 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 08:38 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.copy (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 08:38 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P62405 and previous config saved to /var/cache/conftool/dbconfig/20240515-083717-marostegui.json
  • 08:35 hashar@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.43.0-wmf.5 refs T361399
  • 08:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P62404 and previous config saved to /var/cache/conftool/dbconfig/20240515-083443-ladsgroup.json
  • 08:31 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1192.eqiad.wmnet
  • 08:30 moritzm: installing openjdk-17/jetty9 security updates on Bookworm
  • 08:30 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 08:29 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 08:29 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1178.eqiad.wmnet
  • 08:26 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-eqiad
  • 08:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P62403 and previous config saved to /var/cache/conftool/dbconfig/20240515-082209-marostegui.json
  • 08:21 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-eqiad
  • 08:20 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-codfw
  • 08:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T352010)', diff saved to https://phabricator.wikimedia.org/P62402 and previous config saved to /var/cache/conftool/dbconfig/20240515-081934-ladsgroup.json
  • 08:17 filippo@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 08:16 filippo@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 08:15 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1178.eqiad.wmnet
  • 08:15 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-codfw
  • 08:13 filippo@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 08:13 kartik@deploy1002: Finished scap: Backport for Section Translation: Fix nds-nl language code (duration: 17m 14s)
  • 08:07 filippo@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 08:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T364299)', diff saved to https://phabricator.wikimedia.org/P62401 and previous config saved to /var/cache/conftool/dbconfig/20240515-080700-marostegui.json
  • 08:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1177.eqiad.wmnet
  • 08:03 moritzm: installing nodejs security updates on buster
  • 08:01 kartik@deploy1002: kartik: Continuing with sync
  • 07:59 kartik@deploy1002: kartik: Backport for Section Translation: Fix nds-nl language code synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:56 kartik@deploy1002: Started scap: Backport for Section Translation: Fix nds-nl language code
  • 07:52 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1177.eqiad.wmnet
  • 07:49 kartik@deploy1002: Finished scap: Backport for Enable Content/Section translation in io, nds, nds-nl and, mwl (T354666) (duration: 18m 06s)
  • 07:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1172.eqiad.wmnet
  • 07:38 moritzm: installing curl security updates
  • 07:37 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.copy (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 07:37 kartik@deploy1002: kartik: Continuing with sync
  • 07:36 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 07:34 kartik@deploy1002: kartik: Backport for Enable Content/Section translation in io, nds, nds-nl and, mwl (T354666) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:34 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1172.eqiad.wmnet
  • 07:31 kartik@deploy1002: Started scap: Backport for Enable Content/Section translation in io, nds, nds-nl and, mwl (T354666)
  • 07:30 kartik@deploy1002: Sync cancelled.
  • 07:21 kartik@deploy1002: kartik: Backport for Enable Content/Section translation in io, nds, nds-nl and, mwl (T354666) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:20 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-d1-codfw
  • 07:20 ayounsi@cumin1002: START - Cookbook sre.network.tls for network device lsw1-d1-codfw
  • 07:19 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-d1-codfw
  • 07:19 ayounsi@cumin1002: START - Cookbook sre.network.tls for network device lsw1-d1-codfw
  • 07:19 kartik@deploy1002: Started scap: Backport for Enable Content/Section translation in io, nds, nds-nl and, mwl (T354666)
  • 07:04 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.copy (exit_code=0) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 07:04 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 07:04 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:04 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1181 (T352010)', diff saved to https://phabricator.wikimedia.org/P62399 and previous config saved to /var/cache/conftool/dbconfig/20240515-002923-ladsgroup.json
  • 00:29 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 00:29 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 00:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T352010)', diff saved to https://phabricator.wikimedia.org/P62398 and previous config saved to /var/cache/conftool/dbconfig/20240515-002900-ladsgroup.json
  • 00:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P62397 and previous config saved to /var/cache/conftool/dbconfig/20240515-001352-ladsgroup.json

2024-05-14

  • 23:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P62396 and previous config saved to /var/cache/conftool/dbconfig/20240514-235844-ladsgroup.json
  • 23:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T352010)', diff saved to https://phabricator.wikimedia.org/P62395 and previous config saved to /var/cache/conftool/dbconfig/20240514-234337-ladsgroup.json
  • 22:48 zabe: start running migrateGuSalt.php in screen session # T364435
  • 22:22 zabe: zabe@mwmaint1002:/tmp/upload$ mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user="Yann" . # T364877
  • 22:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2162 (T364299)', diff saved to https://phabricator.wikimedia.org/P62394 and previous config saved to /var/cache/conftool/dbconfig/20240514-220640-marostegui.json
  • 22:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2162.codfw.wmnet with reason: Maintenance
  • 22:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2162.codfw.wmnet with reason: Maintenance
  • 22:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T364299)', diff saved to https://phabricator.wikimedia.org/P62393 and previous config saved to /var/cache/conftool/dbconfig/20240514-220617-marostegui.json
  • 21:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P62392 and previous config saved to /var/cache/conftool/dbconfig/20240514-215109-marostegui.json
  • 21:39 eileen: civicrm upgraded from c7b0dfbb to 9268acf3
  • 21:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P62391 and previous config saved to /var/cache/conftool/dbconfig/20240514-213601-marostegui.json
  • 21:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T364299)', diff saved to https://phabricator.wikimedia.org/P62390 and previous config saved to /var/cache/conftool/dbconfig/20240514-212052-marostegui.json
  • 21:00 cjming@deploy1002: Finished scap: Backport for Override VE overlays in night-mode (T363861) (duration: 18m 44s)
  • 20:49 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:48 cjming@deploy1002: cjming and jdlrobson: Continuing with sync
  • 20:44 dcausse@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:44 cjming@deploy1002: cjming and jdlrobson: Backport for Override VE overlays in night-mode (T363861) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:44 dcausse@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:42 cjming@deploy1002: Started scap: Backport for Override VE overlays in night-mode (T363861)
  • 20:41 cjming@deploy1002: Finished scap: Backport for cirrus: Shift 25% of public wikis writes in eqiad to replacement updater (T363475) (duration: 15m 02s)
  • 20:29 cjming@deploy1002: cjming and ebernhardson: Continuing with sync
  • 20:29 cjming@deploy1002: cjming and ebernhardson: Backport for cirrus: Shift 25% of public wikis writes in eqiad to replacement updater (T363475) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:28 ebernhardson@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:28 ebernhardson@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:26 cjming@deploy1002: Started scap: Backport for cirrus: Shift 25% of public wikis writes in eqiad to replacement updater (T363475)
  • 20:24 cjming@deploy1002: Finished scap: Backport for Enable night mode on Vector on testwiki, disable on Special:Homepage (T357699 T363814) (duration: 18m 40s)
  • 20:14 ebernhardson@deploy1002: Finished deploy [airflow-dags/search@ecf603d]: update discolytics to 0.18.0 (duration: 00m 27s)
  • 20:14 ebernhardson@deploy1002: Started deploy [airflow-dags/search@ecf603d]: update discolytics to 0.18.0
  • 20:11 cjming@deploy1002: jdlrobson and cjming: Continuing with sync
  • 20:08 cjming@deploy1002: jdlrobson and cjming: Backport for Enable night mode on Vector on testwiki, disable on Special:Homepage (T357699 T363814) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:08 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/opentelemetry-collector: apply
  • 20:07 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/services/opentelemetry-collector: apply
  • 20:06 cdanis@deploy1002: helmfile [staging] DONE helmfile.d/services/opentelemetry-collector: apply
  • 20:06 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:06 cdanis@deploy1002: helmfile [staging] START helmfile.d/services/opentelemetry-collector: apply
  • 20:05 cjming@deploy1002: Started scap: Backport for Enable night mode on Vector on testwiki, disable on Special:Homepage (T357699 T363814)
  • 20:04 cdanis@deploy1002: helmfile [staging] DONE helmfile.d/services/opentelemetry-collector: apply
  • 20:04 cdanis@deploy1002: helmfile [staging] START helmfile.d/services/opentelemetry-collector: apply
  • 20:01 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 19:53 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/opentelemetry-collector: apply
  • 19:53 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/services/opentelemetry-collector: apply
  • 19:47 vriley@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:47 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:47 vriley@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:46 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:45 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
  • 19:41 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
  • 19:39 vriley@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:38 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:38 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt kafka-main1010 - vriley@cumin1002"
  • 19:37 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt kafka-main1010 - vriley@cumin1002"
  • 19:32 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 19:30 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main1008.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:26 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 19:25 vriley@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['kafka-main1006']
  • 19:23 vriley@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-main1006']
  • 19:19 vriley@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:18 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:18 cdanis: T364907 💔cdanis@apt1002.wikimedia.org ~ 🕞🍵 sudo -i reprepro --keepunreferencedfiles includedeb bullseye-wikimedia ~/otelcol-contrib_0.100.0_linux_amd64.deb
  • 19:18 vriley@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1008.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:17 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 19:16 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:16 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt kafka-main1008 - vriley@cumin1002"
  • 19:16 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt kafka-main1008 - vriley@cumin1002"
  • 19:13 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 18:18 sukhe: restart pybal on backup LVSes
  • 18:17 sukhe: [CORRECTION] above pybal restart was NOT run
  • 18:15 amastilovic@deploy1002: Finished deploy [airflow-dags/analytics@6270c72]: (no justification provided) (duration: 00m 34s)
  • 18:14 amastilovic@deploy1002: Started deploy [airflow-dags/analytics@6270c72]: (no justification provided)
  • 18:10 sukhe: sudo cumin -b1 -s120 'A:lvs' 'systemctl restart pybal.service': clearing up alert for reverted pybal.conf CR 1031470
  • 17:47 ejegg: donorwiki upgraded from b005071a to fa7de70f
  • 17:33 ryankemper@cumin2002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons.
  • 17:27 ryankemper@cumin2002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons.
  • 17:25 ryankemper@cumin2002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons.
  • 17:19 ryankemper@cumin2002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons.
  • 17:18 ryankemper@cumin2002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.
  • 17:12 ryankemper@cumin2002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.
  • 17:11 sfaci@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 17:11 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 17:09 ryankemper@cumin2002: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid test cluster: Roll restart of Druid jvm daemons.
  • 17:02 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main1007.mgmt.eqiad.wmnet with reboot policy FORCED
  • 17:00 ryankemper@cumin2002: START - Cookbook sre.druid.roll-restart-workers for Druid test cluster: Roll restart of Druid jvm daemons.
  • 16:51 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main1006.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:50 vriley@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1007.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:49 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:49 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt kafka-main1007 - vriley@cumin1002"
  • 16:48 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt kafka-main1007 - vriley@cumin1002"
  • 16:46 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 16:44 pfischer@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:41 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on mw2286.codfw.wmnet with reason: T364863
  • 16:40 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on mw2286.codfw.wmnet with reason: T364863
  • 16:39 vriley@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1006.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:39 mutante: depooled mw2286.codfw.wmnet because of interface error / needed cable replacement T364863
  • 16:38 dzahn@cumin2002: conftool action : set/pooled=no; selector: name=mw2286.codfw.wmnet
  • 16:38 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:38 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt kafka-main1006 - vriley@cumin1002"
  • 16:37 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt kafka-main1006 - vriley@cumin1002"
  • 16:34 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 16:21 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for Add notheme class to Echo (T363779), Convert function to arrow function to fix context (T364783) (duration: 22m 43s)
  • 16:14 cmooney@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
  • 16:14 cmooney@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
  • 16:14 pfischer@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:12 cmooney@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
  • 16:12 cmooney@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
  • 16:12 pfischer@deploy1002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:11 pfischer@deploy1002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:08 logmsgbot: lucaswerkmeister-wmde@deploy1002 jdlrobson and jforrester and lucaswerkmeister-wmde: Continuing with sync
  • 16:08 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: Release v0.6.5 update to add modified wmf homer plugin - cmooney@cumin1002 - T364480
  • 16:06 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: Release v0.6.5 update to add modified wmf homer plugin - cmooney@cumin1002 - T364480
  • 16:05 logmsgbot: lucaswerkmeister-wmde@deploy1002 jdlrobson and jforrester and lucaswerkmeister-wmde: Backport for Add notheme class to Echo (T363779), Convert function to arrow function to fix context (T364783) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:58 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for Add notheme class to Echo (T363779), Convert function to arrow function to fix context (T364783)
  • 15:47 jayme@cumin1002: conftool action : set/weight=10; selector: name=kubestagemaster2005.codfw.wmnet
  • 15:47 jayme@cumin1002: conftool action : set/pooled=yes; selector: name=kubestagemaster2005.codfw.wmnet
  • 15:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2195.codfw.wmnet
  • 15:34 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2195.codfw.wmnet
  • 15:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2181.codfw.wmnet
  • 15:26 pfischer@deploy1002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:26 pfischer@deploy1002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:26 pfischer@deploy1002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:26 pfischer@deploy1002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:25 pfischer@deploy1002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:25 pfischer@deploy1002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:25 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry (exit_code=0) rolling restart_daemons on A:docker-registry
  • 15:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1174 (T352010)', diff saved to https://phabricator.wikimedia.org/P62387 and previous config saved to /var/cache/conftool/dbconfig/20240514-151838-ladsgroup.json
  • 15:18 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 15:18 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 15:16 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2181.codfw.wmnet
  • 15:16 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry rolling restart_daemons on A:docker-registry
  • 15:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2167.codfw.wmnet
  • 15:13 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.copy (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 15:13 moritzm: installing expat security updates
  • 15:13 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 15:12 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.copy (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 15:12 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 15:11 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.copy (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 15:11 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 15:05 brennen@deploy1002: Finished deploy [phabricator/deployment@7d858df]: test deploy phab2002 for T364850 (duration: 00m 50s)
  • 15:05 brennen@deploy1002: Started deploy [phabricator/deployment@7d858df]: test deploy phab2002 for T364850
  • 15:04 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.copy (exit_code=0) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 15:04 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 15:04 brennen@deploy1002: Finished deploy [phabricator/deployment@7d858df]: test deploy phab2002 for T364850 (duration: 00m 33s)
  • 15:04 brennen@deploy1002: Started deploy [phabricator/deployment@7d858df]: test deploy phab2002 for T364850
  • 15:04 aokoth@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge update
  • 15:03 aokoth@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge update
  • 15:03 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2167.codfw.wmnet
  • 15:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2166.codfw.wmnet
  • 14:49 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2166.codfw.wmnet
  • 14:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2165.codfw.wmnet
  • 14:38 moritzm: installing dav1d security updates
  • 14:35 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2165.codfw.wmnet
  • 14:33 vgutierrez: repool cp4049
  • 14:31 vgutierrez: depool cp4049
  • 14:28 vgutierrez: repool cp4049
  • 14:24 vgutierrez: depool cp4049
  • 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2163.codfw.wmnet
  • 14:14 hashar@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.43.0-wmf.5 refs T361399
  • 14:12 vgutierrez: repool upload@ulsfo IPIP encapsulation NOT enabled - T357257
  • 14:10 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2163.codfw.wmnet
  • 14:07 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/opentelemetry-collector: apply
  • 14:06 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/opentelemetry-collector: apply
  • 14:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2162.codfw.wmnet
  • 13:57 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2162.codfw.wmnet
  • 13:57 Lucas_WMDE: UTC afternoon backport+config window done
  • 13:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2161.codfw.wmnet
  • 13:57 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for Deploy disabled limited width on main page (T357706), Phase 5: Vector-2022.js should no longer load legacy Vector code (T301212) (duration: 16m 32s)
  • 13:46 vgutierrez: re-enable puppet on A:cp-upload - T357257
  • 13:44 logmsgbot: lucaswerkmeister-wmde@deploy1002 jdlrobson and ksarabia and lucaswerkmeister-wmde: Continuing with sync
  • 13:44 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2161.codfw.wmnet
  • 13:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2154.codfw.wmnet
  • 13:43 logmsgbot: lucaswerkmeister-wmde@deploy1002 jdlrobson and ksarabia and lucaswerkmeister-wmde: Backport for Deploy disabled limited width on main page (T357706), Phase 5: Vector-2022.js should no longer load legacy Vector code (T301212) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:42 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/opentelemetry-collector: apply
  • 13:42 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/services/opentelemetry-collector: apply
  • 13:40 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for Deploy disabled limited width on main page (T357706), Phase 5: Vector-2022.js should no longer load legacy Vector code (T301212)
  • 13:36 vgutierrez: re-enable puppet on A:cp-text - T357257
  • 13:32 vgutierrez: disable puppet on A:cp before merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/1030051 - T357257
  • 13:28 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2154.codfw.wmnet
  • 13:25 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/opentelemetry-collector: apply
  • 13:25 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/services/opentelemetry-collector: apply
  • 13:24 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host netmon2002.wikimedia.org
  • 13:24 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/opentelemetry-collector: apply
  • 13:24 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/opentelemetry-collector: apply
  • 13:22 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for Use ConditionalUserOptions for "echo-subscriptions-email-dt-subscription" (T357221), Use ConditionalUserOptions for "discussiontools-autotopicsub" (T357221) (duration: 17m 59s)
  • 13:18 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.copy (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:18 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2152.codfw.wmnet
  • 13:10 klausman@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main'.
  • 13:10 logmsgbot: lucaswerkmeister-wmde@deploy1002 matmarex and lucaswerkmeister-wmde: Continuing with sync
  • 13:08 ayounsi@cumin1002: START - Cookbook sre.hosts.dhcp for host netmon2002.wikimedia.org
  • 13:07 logmsgbot: lucaswerkmeister-wmde@deploy1002 matmarex and lucaswerkmeister-wmde: Backport for Use ConditionalUserOptions for "echo-subscriptions-email-dt-subscription" (T357221), Use ConditionalUserOptions for "discussiontools-autotopicsub" (T357221) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:05 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2152.codfw.wmnet
  • 13:04 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for Use ConditionalUserOptions for "echo-subscriptions-email-dt-subscription" (T357221), Use ConditionalUserOptions for "discussiontools-autotopicsub" (T357221)
  • 13:00 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 12:00:00 on db2114.codfw.wmnet,db1125.eqiad.wmnet with reason: Testing
  • 12:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 12:00:00 on db2114.codfw.wmnet,db1125.eqiad.wmnet with reason: Testing
  • 12:59 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.copy (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 12:58 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 12:26 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host serpens.wikimedia.org
  • 12:24 ladsgroup@deploy1002: Finished scap: Backport for Enable section-wide circuit breaking (T360930) (duration: 21m 12s)
  • 12:13 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P62384 and previous config saved to /var/cache/conftool/dbconfig/20240514-121326-root.json
  • 12:11 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 12:06 ladsgroup@deploy1002: ladsgroup: Backport for Enable section-wide circuit breaking (T360930) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:03 ladsgroup@deploy1002: Started scap: Backport for Enable section-wide circuit breaking (T360930)
  • 11:58 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P62383 and previous config saved to /var/cache/conftool/dbconfig/20240514-115820-root.json
  • 11:47 ladsgroup@deploy1002: Finished scap: Backport for etcd: Ignore parsercache clusters in externalLoads (T362786) (duration: 17m 22s)
  • 11:43 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P62382 and previous config saved to /var/cache/conftool/dbconfig/20240514-114314-root.json
  • 11:35 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 11:33 ladsgroup@deploy1002: ladsgroup: Backport for etcd: Ignore parsercache clusters in externalLoads (T362786) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:30 ladsgroup@deploy1002: Started scap: Backport for etcd: Ignore parsercache clusters in externalLoads (T362786)
  • 11:28 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P62381 and previous config saved to /var/cache/conftool/dbconfig/20240514-112807-root.json
  • 11:18 ladsgroup@deploy1002: Finished scap: Backport for rdbms: Fix picking the database from the LB domain (T364827) (duration: 15m 47s)
  • 11:13 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 10%: After schema change', diff saved to https://phabricator.wikimedia.org/P62379 and previous config saved to /var/cache/conftool/dbconfig/20240514-111302-root.json
  • 11:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2154 (T364299)', diff saved to https://phabricator.wikimedia.org/P62378 and previous config saved to /var/cache/conftool/dbconfig/20240514-110704-marostegui.json
  • 11:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2154.codfw.wmnet with reason: Maintenance
  • 11:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2154.codfw.wmnet with reason: Maintenance
  • 11:05 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 11:05 ladsgroup@deploy1002: ladsgroup: Backport for rdbms: Fix picking the database from the LB domain (T364827) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:02 ladsgroup@deploy1002: Started scap: Backport for rdbms: Fix picking the database from the LB domain (T364827)
  • 10:17 jayme@cumin1002: conftool action : set/weight=10; selector: name=kubestagemaster2004.codfw.wmnet
  • 10:17 jayme@cumin1002: conftool action : set/pooled=yes; selector: name=kubestagemaster2004.codfw.wmnet
  • 10:12 hashar@deploy1002: rebuilt and synchronized wikiversions files: Revert "group0 wikis to 1.43.0-wmf.5" - T361399
  • 09:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host serpens.wikimedia.org
  • 09:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on 6 hosts with reason: Checking RO status
  • 09:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:05:00 on 6 hosts with reason: Checking RO status
  • 09:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on 6 hosts with reason: Primary switchover es4 T364451
  • 09:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:05:00 on 6 hosts with reason: Primary switchover es4 T364451
  • 09:50 marostegui@deploy1002: Finished scap: Backport for db-production.php: Make es4 and es5 RO (T364447) (duration: 15m 28s)
  • 09:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host serpens.wikimedia.org
  • 09:50 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
  • 09:49 jayme@cumin1002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
  • 09:49 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1004.eqiad.wmnet to plain
  • 09:48 jayme@cumin1002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1004.eqiad.wmnet to plain
  • 09:48 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1003.eqiad.wmnet to plain
  • 09:47 jayme@cumin1002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1003.eqiad.wmnet to plain
  • 09:47 jayme@cumin1002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of kubestagemaster1003.eqiad.wmnet to plain
  • 09:47 jayme@cumin1002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1003.eqiad.wmnet to plain
  • 09:45 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host kubestagemaster1005.eqiad.wmnet
  • 09:45 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestagemaster1005.eqiad.wmnet with OS bullseye
  • 09:37 marostegui@deploy1002: marostegui: Continuing with sync
  • 09:37 marostegui@deploy1002: marostegui: Backport for db-production.php: Make es4 and es5 RO (T364447) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:35 marostegui@deploy1002: Started scap: Backport for db-production.php: Make es4 and es5 RO (T364447)
  • 09:31 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster1005.eqiad.wmnet with reason: host reimage
  • 09:27 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster1005.eqiad.wmnet with reason: host reimage
  • 09:24 hashar@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.43.0-wmf.5 refs T361399
  • 09:20 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host kubestagemaster1004.eqiad.wmnet
  • 09:20 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestagemaster1004.eqiad.wmnet with OS bullseye
  • 09:18 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host kubestagemaster1003.eqiad.wmnet
  • 09:18 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestagemaster1003.eqiad.wmnet with OS bullseye
  • 09:14 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestagemaster1005.eqiad.wmnet with OS bullseye
  • 09:06 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster1004.eqiad.wmnet with reason: host reimage
  • 09:04 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster1003.eqiad.wmnet with reason: host reimage
  • 09:02 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster1004.eqiad.wmnet with reason: host reimage
  • 09:02 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster1003.eqiad.wmnet with reason: host reimage
  • 08:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on serpens.wikimedia.org with reason: OS update
  • 08:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on serpens.wikimedia.org with reason: OS update
  • 08:57 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM kubestagemaster1005.eqiad.wmnet - jayme@cumin1002"
  • 08:54 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM kubestagemaster1005.eqiad.wmnet - jayme@cumin1002"
  • 08:54 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kubestagemaster1005.eqiad.wmnet on all recursors
  • 08:54 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache kubestagemaster1005.eqiad.wmnet on all recursors
  • 08:54 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:54 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM kubestagemaster1005.eqiad.wmnet - jayme@cumin1002"
  • 08:52 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM kubestagemaster1005.eqiad.wmnet - jayme@cumin1002"
  • 08:52 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestagemaster1004.eqiad.wmnet with OS bullseye
  • 08:50 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM kubestagemaster1004.eqiad.wmnet - jayme@cumin1002"
  • 08:49 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM kubestagemaster1004.eqiad.wmnet - jayme@cumin1002"
  • 08:49 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestagemaster1003.eqiad.wmnet with OS bullseye
  • 08:49 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 08:49 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kubestagemaster1004.eqiad.wmnet on all recursors
  • 08:49 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache kubestagemaster1004.eqiad.wmnet on all recursors
  • 08:49 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:49 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM kubestagemaster1004.eqiad.wmnet - jayme@cumin1002"
  • 08:48 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM kubestagemaster1004.eqiad.wmnet - jayme@cumin1002"
  • 08:48 jayme@cumin1002: START - Cookbook sre.ganeti.makevm for new host kubestagemaster1005.eqiad.wmnet
  • 08:47 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM kubestagemaster1003.eqiad.wmnet - jayme@cumin1002"
  • 08:46 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM kubestagemaster1003.eqiad.wmnet - jayme@cumin1002"
  • 08:45 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 08:45 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kubestagemaster1003.eqiad.wmnet on all recursors
  • 08:45 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache kubestagemaster1003.eqiad.wmnet on all recursors
  • 08:45 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:45 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM kubestagemaster1003.eqiad.wmnet - jayme@cumin1002"
  • 08:44 jayme@cumin1002: START - Cookbook sre.ganeti.makevm for new host kubestagemaster1004.eqiad.wmnet
  • 08:44 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM kubestagemaster1003.eqiad.wmnet - jayme@cumin1002"
  • 08:43 dcausse@deploy1002: Finished scap: Backport for Fix the loss of ParserOutput pointer in ContentDOMTransformStages (T364597) (duration: 16m 17s)
  • 08:41 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 08:41 jayme@cumin1002: START - Cookbook sre.ganeti.makevm for new host kubestagemaster1003.eqiad.wmnet
  • 08:30 dcausse@deploy1002: dcausse and cscott: Continuing with sync
  • 08:29 dcausse@deploy1002: dcausse and cscott: Backport for Fix the loss of ParserOutput pointer in ContentDOMTransformStages (T364597) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:26 dcausse@deploy1002: Started scap: Backport for Fix the loss of ParserOutput pointer in ContentDOMTransformStages (T364597)
  • 08:22 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Bdgreenlee out of all services on: 2208 hosts
  • 08:21 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Bdgreenlee out of all services on: 2208 hosts
  • 08:15 jayme@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kubestagemaster2005.codfw.wmnet with OS bullseye
  • 07:57 kartik@deploy1002: Finished scap: Backport for CX: Add mw.cx.UserPermissionChecker (T349959) (duration: 17m 52s)
  • 07:55 klausman@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 07:54 klausman@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 07:54 moritzm: installing PHP 7.3 security updates
  • 07:53 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 215887
  • 07:53 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 215887
  • 07:46 moritzm: installing libgd2 security updates
  • 07:44 kartik@deploy1002: kartik: Continuing with sync
  • 07:42 kartik@deploy1002: kartik: Backport for CX: Add mw.cx.UserPermissionChecker (T349959) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:39 kartik@deploy1002: Started scap: Backport for CX: Add mw.cx.UserPermissionChecker (T349959)
  • 07:27 kartik@deploy1002: Finished scap: Backport for Set $wgSignatureValidation to 'disallow' on Polish Wikipedia (T364769) (duration: 18m 28s)
  • 07:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2185.codfw.wmnet with OS bookworm
  • 07:15 kartik@deploy1002: kartik and msz2001: Continuing with sync
  • 07:12 kartik@deploy1002: kartik and msz2001: Backport for Set $wgSignatureValidation to 'disallow' on Polish Wikipedia (T364769) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:09 kartik@deploy1002: Started scap: Backport for Set $wgSignatureValidation to 'disallow' on Polish Wikipedia (T364769)
  • 07:04 moritzm: installing glib2.0 security updates
  • 06:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2185.codfw.wmnet with reason: host reimage
  • 06:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2185.codfw.wmnet with reason: host reimage
  • 06:35 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2185.codfw.wmnet with OS bookworm
  • 06:33 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host db2185.codfw.wmnet with OS bookworm
  • 06:33 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2185.codfw.wmnet with OS bookworm
  • 05:31 kart_: Updated cxserver to 2024-04-23-221507-production (T363263, T333969, T360303, T360310)
  • 05:25 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 05:24 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 05:22 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 05:22 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 05:19 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 05:19 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 05:15 kart_: Updated MinT to 2024-03-28-061726-production (T333969)
  • 05:08 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
  • 04:59 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
  • 04:33 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
  • 04:25 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
  • 04:18 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
  • 04:14 kartik@deploy1002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
  • 04:00 mwpresync@deploy1002: Finished scap: testwikis wikis to 1.43.0-wmf.5 refs T361399 (duration: 57m 45s)
  • 03:03 mwpresync@deploy1002: Started scap: testwikis wikis to 1.43.0-wmf.5 refs T361399
  • 02:34 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 02:34 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 02:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T352010)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240514-023316-ladsgroup.json
  • 02:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P62375 and previous config saved to /var/cache/conftool/dbconfig/20240514-021809-ladsgroup.json
  • 02:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P62374 and previous config saved to /var/cache/conftool/dbconfig/20240514-020301-ladsgroup.json
  • 01:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T352010)', diff saved to https://phabricator.wikimedia.org/P62373 and previous config saved to /var/cache/conftool/dbconfig/20240514-014753-ladsgroup.json
  • 01:18 ejegg: fundraising civicrm upgraded from c854dd3a to c7b0dfbb
  • 00:35 tstarling@deploy1002: Finished scap: Fix SecurePoll exception T209892 and CodeMirror 5 RTL T363752 (duration: 14m 56s)
  • 00:20 tstarling@deploy1002: Started scap: Fix SecurePoll exception T209892 and CodeMirror 5 RTL T363752
  • 00:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2152 (T364299)', diff saved to https://phabricator.wikimedia.org/P62372 and previous config saved to /var/cache/conftool/dbconfig/20240514-001956-marostegui.json
  • 00:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2152.codfw.wmnet with reason: Maintenance
  • 00:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2152.codfw.wmnet with reason: Maintenance

2024-05-13

  • 22:55 bking@cumin2002: conftool action : set/weight=10:pooled=yes; selector: name=elastic110[5|7]\.eqiad\.wmnet
  • 22:43 ryankemper@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: T363975 eqiad cluster restart - ryankemper@cumin2002 - T363975
  • 22:30 zabe: zabe@mwmaint1002:~$ mwscript cleanupTitles.php itwikivoyage # T298315
  • 22:27 bking@cumin2002: conftool action : set/weight=10:pooled=no; selector: name=elastic110[5|7]\.eqiad\.wmnet
  • 21:47 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: T363975 eqiad cluster restart - ryankemper@cumin2002 - T363975
  • 21:46 ryankemper@cumin2002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.
  • 21:39 ryankemper@cumin2002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.
  • 21:39 eileen: civicrm upgraded from 447e1472 to c854dd3a
  • 21:32 ryankemper@cumin2002: END (PASS) - Cookbook sre.opensearch.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:datahubsearch
  • 21:32 ebernhardson@deploy1002: Finished scap: Backport for Unbreak link buttons (T364062) (duration: 22m 00s)
  • 21:22 ryankemper@cumin2002: START - Cookbook sre.opensearch.roll-restart-reboot rolling restart_daemons on A:datahubsearch
  • 21:20 ebernhardson@deploy1002: jdlrobson and ebernhardson: Continuing with sync
  • 21:12 ebernhardson@deploy1002: jdlrobson and ebernhardson: Backport for Unbreak link buttons (T364062) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:10 ebernhardson@deploy1002: Started scap: Backport for Unbreak link buttons (T364062)
  • 20:57 ebernhardson@deploy1002: Finished scap: Backport for IPInfo: Remove $wgIPInfoGeoIP2EnterprisePath (T361884) (duration: 17m 22s)
  • 20:45 ebernhardson@deploy1002: ebernhardson and tchanders: Continuing with sync
  • 20:42 ebernhardson@deploy1002: ebernhardson and tchanders: Backport for IPInfo: Remove $wgIPInfoGeoIP2EnterprisePath (T361884) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:40 ebernhardson@deploy1002: Started scap: Backport for IPInfo: Remove $wgIPInfoGeoIP2EnterprisePath (T361884)
  • 20:38 ebernhardson@deploy1002: Finished scap: Backport for Remove old CampaignEvents DB config (prod) (T348281) (duration: 21m 14s)
  • 20:29 eileen: civicrm upgraded from 4f55a7cf to 447e1472
  • 20:25 ebernhardson@deploy1002: ebernhardson and daimona: Continuing with sync
  • 20:19 ebernhardson@deploy1002: ebernhardson and daimona: Backport for Remove old CampaignEvents DB config (prod) (T348281) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:17 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/opentelemetry-collector: apply
  • 20:17 ebernhardson@deploy1002: Started scap: Backport for Remove old CampaignEvents DB config (prod) (T348281)
  • 19:57 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/opentelemetry-collector: apply
  • 19:47 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/opentelemetry-collector: apply
  • 19:27 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/opentelemetry-collector: apply
  • 19:26 cdanis@deploy1002: helmfile [staging] DONE helmfile.d/services/opentelemetry-collector: apply
  • 19:26 cdanis@deploy1002: helmfile [staging] START helmfile.d/services/opentelemetry-collector: apply
  • 18:49 herron@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-logging-eqiad
  • 18:49 ryankemper@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
  • 18:30 ryankemper@cumin2002: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
  • 18:24 herron@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-logging-eqiad
  • 18:24 ryankemper@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-jumbo-eqiad
  • 18:20 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: apply
  • 18:19 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/blubberoid: apply
  • 18:08 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: apply
  • 18:07 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/blubberoid: apply
  • 18:04 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply
  • 18:03 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply
  • 17:40 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:40 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add records for new linknets on codfw spines - cmooney@cumin1002"
  • 17:39 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add records for new linknets on codfw spines - cmooney@cumin1002"
  • 17:38 ebernhardson@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:38 ebernhardson@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:37 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 17:27 ryankemper: T363973 [Kafka] Restarting `jumbo-eqiad` brokers, followed by mirror maker
  • 17:27 ryankemper@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-jumbo-eqiad
  • 17:05 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster2005.codfw.wmnet with reason: host reimage
  • 17:02 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster2005.codfw.wmnet with reason: host reimage
  • 16:50 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestagemaster2005.codfw.wmnet with OS bullseye
  • 16:49 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2005.codfw.wmnet to plain
  • 16:47 jayme@cumin1002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2005.codfw.wmnet to plain
  • 16:47 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2004.codfw.wmnet to plain
  • 16:46 jayme@cumin1002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2004.codfw.wmnet to plain
  • 16:46 ejegg: fundraising civicrm upgraded from c0d2fa95 to 4f55a7cf
  • 16:46 jayme@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=97) for new host kubestagemaster2005.codfw.wmnet
  • 16:46 jayme@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kubestagemaster2005.codfw.wmnet with OS bullseye
  • 16:34 brouberol@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: JVM restart - brouberol@cumin2002 - T363975
  • 16:16 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 13m 47s)
  • 16:13 ejegg: restarted fundraising scheduled jobs
  • 16:11 ejegg: fundraising civicrm rolled back from 3fef5849 to c0d2fa95
  • 16:02 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 14m 23s)
  • 15:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1170 (T352010)', diff saved to https://phabricator.wikimedia.org/P62370 and previous config saved to /var/cache/conftool/dbconfig/20240513-155911-ladsgroup.json
  • 15:59 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 15:58 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 15:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T352010)', diff saved to https://phabricator.wikimedia.org/P62369 and previous config saved to /var/cache/conftool/dbconfig/20240513-155849-ladsgroup.json
  • 15:55 ejegg: fundraising civicrm upgraded from c0d2fa95 to 3fef5849
  • 15:54 ejegg: disabled fundraising scheduled jobs for CiviCRM deploy
  • 15:49 herron@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-logging-codfw
  • 15:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P62368 and previous config saved to /var/cache/conftool/dbconfig/20240513-154341-ladsgroup.json
  • 15:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P62367 and previous config saved to /var/cache/conftool/dbconfig/20240513-152833-ladsgroup.json
  • 15:27 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 15:25 herron@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-logging-codfw
  • 15:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 15:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 15:18 brouberol@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: JVM restart - brouberol@cumin2002 - T363975
  • 15:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T352010)', diff saved to https://phabricator.wikimedia.org/P62366 and previous config saved to /var/cache/conftool/dbconfig/20240513-151325-ladsgroup.json
  • 14:55 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:50 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for Include mw-jobrunner port in host header check (duration: 16m 04s)
  • 14:49 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host kubestagemaster2004.codfw.wmnet
  • 14:49 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestagemaster2004.codfw.wmnet with OS bullseye
  • 14:42 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster2005.codfw.wmnet with reason: host reimage
  • 14:39 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster2005.codfw.wmnet with reason: host reimage
  • 14:37 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde and hnowlan: Continuing with sync
  • 14:36 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde and hnowlan: Backport for Include mw-jobrunner port in host header check synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:35 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster2004.codfw.wmnet with reason: host reimage
  • 14:34 mutante: CI - switch over to other contint server finished - T334517
  • 14:34 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for Include mw-jobrunner port in host header check
  • 14:32 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync
  • 14:32 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster2004.codfw.wmnet with reason: host reimage
  • 14:32 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: sync
  • 14:25 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestagemaster2005.codfw.wmnet with OS bullseye
  • 14:22 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM kubestagemaster2005.codfw.wmnet - jayme@cumin1002"
  • 14:19 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM kubestagemaster2005.codfw.wmnet - jayme@cumin1002"
  • 14:19 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kubestagemaster2005.codfw.wmnet on all recursors
  • 14:19 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache kubestagemaster2005.codfw.wmnet on all recursors
  • 14:19 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:19 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM kubestagemaster2005.codfw.wmnet - jayme@cumin1002"
  • 14:18 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM kubestagemaster2005.codfw.wmnet - jayme@cumin1002"
  • 14:18 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestagemaster2004.codfw.wmnet with OS bullseye
  • 14:17 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM kubestagemaster2004.codfw.wmnet - jayme@cumin1002"
  • 14:16 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM kubestagemaster2004.codfw.wmnet - jayme@cumin1002"
  • 14:16 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kubestagemaster2004.codfw.wmnet on all recursors
  • 14:15 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 14:15 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache kubestagemaster2004.codfw.wmnet on all recursors
  • 14:15 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:15 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM kubestagemaster2004.codfw.wmnet - jayme@cumin1002"
  • 14:15 mutante: CI - migration in progress - stopping jenkins and zuul (T334517)
  • 14:15 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM kubestagemaster2004.codfw.wmnet - jayme@cumin1002"
  • 14:13 jayme@cumin1002: START - Cookbook sre.ganeti.makevm for new host kubestagemaster2005.codfw.wmnet
  • 14:12 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 14:12 jayme@cumin1002: START - Cookbook sre.ganeti.makevm for new host kubestagemaster2004.codfw.wmnet
  • 14:12 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for Enable async upload-by-URL via jobqueue on testwiki (T295007) (duration: 25m 09s)
  • 14:09 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on contint1002.wikimedia.org with reason: T334517
  • 14:09 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on contint1002.wikimedia.org with reason: T334517
  • 14:09 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on contint2002.wikimedia.org with reason: T334517
  • 14:09 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on contint2002.wikimedia.org with reason: T334517
  • 14:05 brouberol@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: JVM restart - brouberol@cumin2002 - T363975
  • 14:00 logmsgbot: lucaswerkmeister-wmde@deploy1002 hnowlan and lucaswerkmeister-wmde: Continuing with sync
  • 13:59 brouberol@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: JVM restart - brouberol@cumin2002 - T363975
  • 13:56 brouberol@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: JVM restart - brouberol@cumin2002 - T363975
  • 13:56 brouberol@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: JVM restart - brouberol@cumin2002 - T363975
  • 13:49 logmsgbot: lucaswerkmeister-wmde@deploy1002 hnowlan and lucaswerkmeister-wmde: Backport for Enable async upload-by-URL via jobqueue on testwiki (T295007) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T360332)', diff saved to https://phabricator.wikimedia.org/P62363 and previous config saved to /var/cache/conftool/dbconfig/20240513-134852-arnaudb.json
  • 13:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165 (T360332)', diff saved to https://phabricator.wikimedia.org/P62362 and previous config saved to /var/cache/conftool/dbconfig/20240513-134721-arnaudb.json
  • 13:47 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for Enable async upload-by-URL via jobqueue on testwiki (T295007)
  • 13:47 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 13:45 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for specials: Fix "include templates" query builder for Special:Export (T364554), ArticleTarget: Fix return of getVisualDiffGeneratorPromise (T364635) (duration: 16m 04s)
  • 13:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 13:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 13:37 elukey@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 13:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P62361 and previous config saved to /var/cache/conftool/dbconfig/20240513-133345-arnaudb.json
  • 13:33 logmsgbot: lucaswerkmeister-wmde@deploy1002 umherirrender and lucaswerkmeister-wmde and matmarex: Continuing with sync
  • 13:32 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P62360 and previous config saved to /var/cache/conftool/dbconfig/20240513-133214-arnaudb.json
  • 13:32 logmsgbot: lucaswerkmeister-wmde@deploy1002 umherirrender and lucaswerkmeister-wmde and matmarex: Backport for specials: Fix "include templates" query builder for Special:Export (T364554), ArticleTarget: Fix return of getVisualDiffGeneratorPromise (T364635) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:29 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for specials: Fix "include templates" query builder for Special:Export (T364554), ArticleTarget: Fix return of getVisualDiffGeneratorPromise (T364635)
  • 13:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P62359 and previous config saved to /var/cache/conftool/dbconfig/20240513-131837-arnaudb.json
  • 13:17 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 13:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P62358 and previous config saved to /var/cache/conftool/dbconfig/20240513-131706-arnaudb.json
  • 13:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 13:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 13:11 filippo@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 13:11 filippo@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 13:07 elukey@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 13:07 filippo@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 13:07 filippo@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 13:05 filippo@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 13:05 filippo@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 13:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T360332)', diff saved to https://phabricator.wikimedia.org/P62357 and previous config saved to /var/cache/conftool/dbconfig/20240513-130329-arnaudb.json
  • 13:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165 (T360332)', diff saved to https://phabricator.wikimedia.org/P62356 and previous config saved to /var/cache/conftool/dbconfig/20240513-130158-arnaudb.json
  • 13:01 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2202.codfw.wmnet with reason: Maintenance
  • 13:01 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2202.codfw.wmnet with reason: Maintenance
  • 13:00 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1181 (T360332)', diff saved to https://phabricator.wikimedia.org/P62355 and previous config saved to /var/cache/conftool/dbconfig/20240513-130049-arnaudb.json
  • 13:00 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 13:00 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 12:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2165 (T360332)', diff saved to https://phabricator.wikimedia.org/P62354 and previous config saved to /var/cache/conftool/dbconfig/20240513-125940-arnaudb.json
  • 12:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 12:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 12:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 12:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 12:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213 (T364299)', diff saved to https://phabricator.wikimedia.org/P62353 and previous config saved to /var/cache/conftool/dbconfig/20240513-124752-marostegui.json
  • 12:44 brouberol@cumin2002: END (PASS) - Cookbook sre.apifeatureusage.roll-restart-reboot-logstash (exit_code=0) rolling restart_daemons on A:apifeatureusage
  • 12:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213', diff saved to https://phabricator.wikimedia.org/P62351 and previous config saved to /var/cache/conftool/dbconfig/20240513-121737-marostegui.json
  • 12:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213 (T364299)', diff saved to https://phabricator.wikimedia.org/P62350 and previous config saved to /var/cache/conftool/dbconfig/20240513-120229-marostegui.json
  • 11:58 hashar: Restarted CI Jenkins to update the Parameterized build plugin | T336782
  • 11:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1213 (T364299)', diff saved to https://phabricator.wikimedia.org/P62349 and previous config saved to /var/cache/conftool/dbconfig/20240513-113215-marostegui.json
  • 11:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 11:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 11:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T364299)', diff saved to https://phabricator.wikimedia.org/P62348 and previous config saved to /var/cache/conftool/dbconfig/20240513-113152-marostegui.json
  • 11:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P62347 and previous config saved to /var/cache/conftool/dbconfig/20240513-111644-marostegui.json
  • 11:09 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-eqiad
  • 11:04 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-eqiad
  • 11:04 moritzm: installing tomcat9 security updates
  • 11:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P62346 and previous config saved to /var/cache/conftool/dbconfig/20240513-110137-marostegui.json
  • 10:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T364299)', diff saved to https://phabricator.wikimedia.org/P62345 and previous config saved to /var/cache/conftool/dbconfig/20240513-104627-marostegui.json
  • 10:37 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-codfw
  • 10:32 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-codfw
  • 10:19 moritzm: installing expat security updates
  • 10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1210 (T364299)', diff saved to https://phabricator.wikimedia.org/P62343 and previous config saved to /var/cache/conftool/dbconfig/20240513-101748-marostegui.json
  • 10:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 10:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T364299)', diff saved to https://phabricator.wikimedia.org/P62342 and previous config saved to /var/cache/conftool/dbconfig/20240513-101724-marostegui.json
  • 10:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P62341 and previous config saved to /var/cache/conftool/dbconfig/20240513-100216-marostegui.json
  • 09:47 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: sync
  • 09:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P62340 and previous config saved to /var/cache/conftool/dbconfig/20240513-094709-marostegui.json
  • 09:46 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: sync
  • 09:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2184.codfw.wmnet with OS bookworm
  • 09:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T364299)', diff saved to https://phabricator.wikimedia.org/P62338 and previous config saved to /var/cache/conftool/dbconfig/20240513-093200-marostegui.json
  • 09:28 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
  • 09:27 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: sync
  • 09:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
  • 09:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
  • 09:05 jynus: deploy new stat grants at m1:dbbackups T362509
  • 09:03 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2184.codfw.wmnet with OS bookworm
  • 09:02 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2184.codfw.wmnet with OS bookworm
  • 09:00 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1200 (T364299)', diff saved to https://phabricator.wikimedia.org/P62337 and previous config saved to /var/cache/conftool/dbconfig/20240513-090035-marostegui.json
  • 09:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 09:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 09:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T364299)', diff saved to https://phabricator.wikimedia.org/P62336 and previous config saved to /var/cache/conftool/dbconfig/20240513-090011-marostegui.json
  • 09:00 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts snapshot1009.eqiad.wmnet
  • 09:00 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:00 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: snapshot1009.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
  • 08:58 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: snapshot1009.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
  • 08:56 btullis@cumin1002: START - Cookbook sre.dns.netbox
  • 08:53 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2184.codfw.wmnet with OS bookworm
  • 08:51 btullis@cumin1002: START - Cookbook sre.hosts.decommission for hosts snapshot1009.eqiad.wmnet
  • 08:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P62335 and previous config saved to /var/cache/conftool/dbconfig/20240513-084503-marostegui.json
  • 08:45 marostegui@deploy1002: Finished scap: Backport for db-production.php: Enable writes on es6 and es7 (T364446) (duration: 44m 00s)
  • 08:32 marostegui@deploy1002: marostegui: Continuing with sync
  • 08:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P62334 and previous config saved to /var/cache/conftool/dbconfig/20240513-082956-marostegui.json
  • 08:24 moritzm: installing PHP 7.3 security updates
  • 08:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T364299)', diff saved to https://phabricator.wikimedia.org/P62333 and previous config saved to /var/cache/conftool/dbconfig/20240513-081448-marostegui.json
  • 08:03 marostegui@deploy1002: marostegui: Backport for db-production.php: Enable writes on es6 and es7 (T364446) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:01 marostegui@deploy1002: Started scap: Backport for db-production.php: Enable writes on es6 and es7 (T364446)
  • 08:00 moritzm: installing python2.7 security updates
  • 07:58 ladsgroup@deploy1002: Finished scap: Backport for Fix static cache access (T364693) (duration: 16m 54s)
  • 07:54 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 17451
  • 07:53 moritzm: installing libgd2 security updates
  • 07:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62332 and previous config saved to /var/cache/conftool/dbconfig/20240513-075256-root.json
  • 07:46 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 07:44 brouberol@cumin2002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-flink-eqiad cluster: Roll restart of jvm daemons.
  • 07:44 ladsgroup@deploy1002: ladsgroup: Backport for Fix static cache access (T364693) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:41 ladsgroup@deploy1002: Started scap: Backport for Fix static cache access (T364693)
  • 07:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1185 (T364299)', diff saved to https://phabricator.wikimedia.org/P62331 and previous config saved to /var/cache/conftool/dbconfig/20240513-074103-marostegui.json
  • 07:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 07:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 07:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183 (T364299)', diff saved to https://phabricator.wikimedia.org/P62330 and previous config saved to /var/cache/conftool/dbconfig/20240513-074041-marostegui.json
  • 07:38 brouberol@cumin2002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-flink-eqiad cluster: Roll restart of jvm daemons.
  • 07:37 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62329 and previous config saved to /var/cache/conftool/dbconfig/20240513-073750-root.json
  • 07:37 kartik@deploy1002: Finished scap: Backport for ContentTranslation: Update publishing setting for cswiki (T353049) (duration: 32m 03s)
  • 07:35 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 17451
  • 07:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1158 (T352010)', diff saved to https://phabricator.wikimedia.org/P62328 and previous config saved to /var/cache/conftool/dbconfig/20240513-073031-ladsgroup.json
  • 07:30 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 07:30 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 07:30 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 07:30 brouberol@cumin2002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-flink-codfw cluster: Roll restart of jvm daemons.
  • 07:29 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183', diff saved to https://phabricator.wikimedia.org/P62327 and previous config saved to /var/cache/conftool/dbconfig/20240513-072533-marostegui.json
  • 07:23 brouberol@cumin2002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-flink-codfw cluster: Roll restart of jvm daemons.
  • 07:23 kartik@deploy1002: kartik: Continuing with sync
  • 07:22 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62326 and previous config saved to /var/cache/conftool/dbconfig/20240513-072244-root.json
  • 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: wmcs::openstack::eqiad1::instance_backups
  • 07:19 kartik@deploy1002: kartik: Backport for ContentTranslation: Update publishing setting for cswiki (T353049) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:10 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: wmcs::openstack::eqiad1::instance_backups
  • 07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183', diff saved to https://phabricator.wikimedia.org/P62325 and previous config saved to /var/cache/conftool/dbconfig/20240513-071026-marostegui.json
  • 07:08 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host cloudbackup1004.eqiad.wmnet
  • 07:07 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62324 and previous config saved to /var/cache/conftool/dbconfig/20240513-070738-root.json
  • 07:05 kartik@deploy1002: Started scap: Backport for ContentTranslation: Update publishing setting for cswiki (T353049)
  • 06:59 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host cloudbackup1004.eqiad.wmnet
  • 06:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183 (T364299)', diff saved to https://phabricator.wikimedia.org/P62323 and previous config saved to /var/cache/conftool/dbconfig/20240513-065518-marostegui.json
  • 06:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62322 and previous config saved to /var/cache/conftool/dbconfig/20240513-065230-root.json
  • 06:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2183.codfw.wmnet with OS bookworm
  • 06:37 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62321 and previous config saved to /var/cache/conftool/dbconfig/20240513-063724-root.json
  • 06:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
  • 06:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
  • 06:22 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62320 and previous config saved to /var/cache/conftool/dbconfig/20240513-062219-root.json
  • 06:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1183 (T364299)', diff saved to https://phabricator.wikimedia.org/P62319 and previous config saved to /var/cache/conftool/dbconfig/20240513-062129-marostegui.json
  • 06:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1183.eqiad.wmnet with reason: Maintenance
  • 06:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1183.eqiad.wmnet with reason: Maintenance
  • 06:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T364299)', diff saved to https://phabricator.wikimedia.org/P62318 and previous config saved to /var/cache/conftool/dbconfig/20240513-062117-marostegui.json
  • 06:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2184.codfw.wmnet with reason: Reimage of the master
  • 06:12 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2184.codfw.wmnet with reason: Reimage of the master
  • 06:07 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2183.codfw.wmnet with OS bookworm
  • 06:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2183.codfw.wmnet with reason: Reimage
  • 06:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2183.codfw.wmnet with reason: Reimage
  • 06:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P62317 and previous config saved to /var/cache/conftool/dbconfig/20240513-060610-marostegui.json
  • 06:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2213.codfw.wmnet with reason: Schema change
  • 06:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2213.codfw.wmnet with reason: Schema change
  • 05:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2213.codfw.wmnet with reason: Schema change
  • 05:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2213.codfw.wmnet with reason: Schema change
  • 05:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P62316 and previous config saved to /var/cache/conftool/dbconfig/20240513-055102-marostegui.json
  • 05:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2213 T364703', diff saved to https://phabricator.wikimedia.org/P62315 and previous config saved to /var/cache/conftool/dbconfig/20240513-054841-root.json
  • 05:48 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2123 to s5 primary T364703', diff saved to https://phabricator.wikimedia.org/P62314 and previous config saved to /var/cache/conftool/dbconfig/20240513-054802-root.json
  • 05:47 marostegui: Starting s5 codfw failover from db2213 to db2123 - T364703
  • 05:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T364299)', diff saved to https://phabricator.wikimedia.org/P62313 and previous config saved to /var/cache/conftool/dbconfig/20240513-053553-marostegui.json
  • 05:24 marostegui@cumin1002: dbctl commit (dc=all): 'Remove vslow from db2123 T364703', diff saved to https://phabricator.wikimedia.org/P62312 and previous config saved to /var/cache/conftool/dbconfig/20240513-052424-marostegui.json
  • 05:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 24 hosts with reason: Primary switchover s5 T364703
  • 05:23 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2123 with weight 0 T364703', diff saved to https://phabricator.wikimedia.org/P62311 and previous config saved to /var/cache/conftool/dbconfig/20240513-052304-root.json
  • 05:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 24 hosts with reason: Primary switchover s5 T364703
  • 05:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1161 (T364299)', diff saved to https://phabricator.wikimedia.org/P62310 and previous config saved to /var/cache/conftool/dbconfig/20240513-050237-marostegui.json
  • 05:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 05:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 05:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 05:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 03:21 cwhite: restart apache2 on phab1004
  • 01:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T352010)', diff saved to https://phabricator.wikimedia.org/P62309 and previous config saved to /var/cache/conftool/dbconfig/20240513-014623-ladsgroup.json
  • 01:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P62308 and previous config saved to /var/cache/conftool/dbconfig/20240513-013113-ladsgroup.json
  • 01:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P62307 and previous config saved to /var/cache/conftool/dbconfig/20240513-011605-ladsgroup.json
  • 01:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T352010)', diff saved to https://phabricator.wikimedia.org/P62306 and previous config saved to /var/cache/conftool/dbconfig/20240513-010055-ladsgroup.json

2024-05-12

  • 19:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2209 (T352010)', diff saved to https://phabricator.wikimedia.org/P62305 and previous config saved to /var/cache/conftool/dbconfig/20240512-195220-ladsgroup.json
  • 19:52 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2209.codfw.wmnet with reason: Maintenance
  • 19:52 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2209.codfw.wmnet with reason: Maintenance
  • 19:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T352010)', diff saved to https://phabricator.wikimedia.org/P62304 and previous config saved to /var/cache/conftool/dbconfig/20240512-195156-ladsgroup.json
  • 19:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P62303 and previous config saved to /var/cache/conftool/dbconfig/20240512-193645-ladsgroup.json
  • 19:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P62302 and previous config saved to /var/cache/conftool/dbconfig/20240512-192137-ladsgroup.json
  • 19:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T352010)', diff saved to https://phabricator.wikimedia.org/P62301 and previous config saved to /var/cache/conftool/dbconfig/20240512-190629-ladsgroup.json
  • 13:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2194 (T352010)', diff saved to https://phabricator.wikimedia.org/P62300 and previous config saved to /var/cache/conftool/dbconfig/20240512-134125-ladsgroup.json
  • 13:41 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
  • 13:41 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
  • 13:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T352010)', diff saved to https://phabricator.wikimedia.org/P62299 and previous config saved to /var/cache/conftool/dbconfig/20240512-134101-ladsgroup.json
  • 13:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P62298 and previous config saved to /var/cache/conftool/dbconfig/20240512-132554-ladsgroup.json
  • 13:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P62297 and previous config saved to /var/cache/conftool/dbconfig/20240512-131046-ladsgroup.json
  • 12:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T352010)', diff saved to https://phabricator.wikimedia.org/P62296 and previous config saved to /var/cache/conftool/dbconfig/20240512-125539-ladsgroup.json
  • 07:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2190 (T352010)', diff saved to https://phabricator.wikimedia.org/P62295 and previous config saved to /var/cache/conftool/dbconfig/20240512-072559-ladsgroup.json
  • 07:25 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 07:25 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 07:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T352010)', diff saved to https://phabricator.wikimedia.org/P62294 and previous config saved to /var/cache/conftool/dbconfig/20240512-072534-ladsgroup.json
  • 07:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P62293 and previous config saved to /var/cache/conftool/dbconfig/20240512-071026-ladsgroup.json
  • 06:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P62292 and previous config saved to /var/cache/conftool/dbconfig/20240512-065519-ladsgroup.json
  • 06:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T352010)', diff saved to https://phabricator.wikimedia.org/P62291 and previous config saved to /var/cache/conftool/dbconfig/20240512-064011-ladsgroup.json
  • 00:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2177 (T352010)', diff saved to https://phabricator.wikimedia.org/P62290 and previous config saved to /var/cache/conftool/dbconfig/20240512-000104-ladsgroup.json
  • 00:00 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 00:00 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 00:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T352010)', diff saved to https://phabricator.wikimedia.org/P62289 and previous config saved to /var/cache/conftool/dbconfig/20240512-000040-ladsgroup.json

2024-05-11

  • 23:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P62288 and previous config saved to /var/cache/conftool/dbconfig/20240511-234532-ladsgroup.json
  • 23:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P62287 and previous config saved to /var/cache/conftool/dbconfig/20240511-233023-ladsgroup.json
  • 23:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T352010)', diff saved to https://phabricator.wikimedia.org/P62286 and previous config saved to /var/cache/conftool/dbconfig/20240511-231515-ladsgroup.json
  • 16:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2156 (T352010)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240511-163653-ladsgroup.json
  • 16:36 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 16:36 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 16:36 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 16:36 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 16:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T352010)', diff saved to https://phabricator.wikimedia.org/P62284 and previous config saved to /var/cache/conftool/dbconfig/20240511-163614-ladsgroup.json
  • 16:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P62283 and previous config saved to /var/cache/conftool/dbconfig/20240511-162106-ladsgroup.json
  • 16:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P62282 and previous config saved to /var/cache/conftool/dbconfig/20240511-160558-ladsgroup.json
  • 15:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T352010)', diff saved to https://phabricator.wikimedia.org/P62281 and previous config saved to /var/cache/conftool/dbconfig/20240511-155050-ladsgroup.json
  • 13:20 Dreamy_Jazz: Running `foreachwiki userOptions.php --delete betafeatures-geonotahack --nowarn` - T300371
  • 13:17 Dreamy_Jazz: Running `foreachwiki userOptions.php --delete betafeatures-vector-compact-personal-bar --nowarn` - T300371
  • 13:14 Dreamy_Jazz: Running `foreachwiki userOptions.php --delete betafeatures-vector-typography-update --nowarn` - T300371
  • 13:11 Dreamy_Jazz: Running `foreachwiki userOptions.php --delete betafeatures-popup-disable` - T300371
  • 12:07 Dreamy_Jazz: Running `foreachwiki userOptions.php --delete templatewizard-betafeature` - T300371
  • 09:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2149 (T352010)', diff saved to https://phabricator.wikimedia.org/P62280 and previous config saved to /var/cache/conftool/dbconfig/20240511-090631-ladsgroup.json
  • 09:06 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 09:06 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 01:14 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 01:14 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 01:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T352010)', diff saved to https://phabricator.wikimedia.org/P62279 and previous config saved to /var/cache/conftool/dbconfig/20240511-011416-ladsgroup.json
  • 00:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P62278 and previous config saved to /var/cache/conftool/dbconfig/20240511-005908-ladsgroup.json
  • 00:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P62277 and previous config saved to /var/cache/conftool/dbconfig/20240511-004400-ladsgroup.json
  • 00:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T352010)', diff saved to https://phabricator.wikimedia.org/P62276 and previous config saved to /var/cache/conftool/dbconfig/20240511-002853-ladsgroup.json

2024-05-10

  • 21:19 ryankemper@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-test-eqiad cluster: Roll restart of jvm daemons.
  • 21:08 ryankemper@cumin2002: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-test-eqiad cluster: Roll restart of jvm daemons.
  • 20:19 sfaci@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 20:19 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 18:41 fab@deploy1002: Finished deploy [airflow-dags/research@75163c7]: (no justification provided) (duration: 00m 32s)
  • 18:41 fab@deploy1002: Started deploy [airflow-dags/research@75163c7]: (no justification provided)
  • 18:22 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:22 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for spine to spine links codfw - cmooney@cumin1002"
  • 17:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2127 (T352010)', diff saved to https://phabricator.wikimedia.org/P62275 and previous config saved to /var/cache/conftool/dbconfig/20240510-174044-ladsgroup.json
  • 17:40 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 17:40 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 17:13 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for spine to spine links codfw - cmooney@cumin1002"
  • 17:07 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 17:00 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 16:59 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 19165
  • 16:58 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 19165
  • 16:58 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 26073
  • 16:58 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 26073
  • 16:53 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'configure' for AS: 15830
  • 16:52 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 15830
  • 16:48 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 9269
  • 16:47 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 9269
  • 16:46 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 17451
  • 16:45 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 17451
  • 16:16 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:16 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns records for new codfw row c and d networks - cmooney@cumin1002"
  • 16:14 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns records for new codfw row c and d networks - cmooney@cumin1002"
  • 16:12 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 14:15 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 21574
  • 14:15 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 21574
  • 14:14 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 38565
  • 14:14 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 38565
  • 14:13 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 23473
  • 14:12 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 23473
  • 14:12 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 5769
  • 14:12 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 5769
  • 14:09 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 5418
  • 14:09 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 5418
  • 14:08 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 7337
  • 14:08 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 7337
  • 14:07 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 30640
  • 14:06 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 30640
  • 14:06 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
  • 13:59 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 64049
  • 12:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc2002.wikimedia.org
  • 12:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc2002.wikimedia.org
  • 12:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1002.wikimedia.org
  • 12:03 Dreamy_Jazz: Restarting MediaModeration scanning script - https://wikitech.wikimedia.org/wiki/MediaModeration
  • 12:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1002.wikimedia.org
  • 11:36 moritzm: roll out debdeploy 0.0.99.14
  • 11:35 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 11:35 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 11:05 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 10:50 elukey: add amd-k8s-device-plugin_1.25.2.8 to bullseye-wikimedia
  • 10:32 moritzm: installing Linux 5.10.216 on Bullseye systems
  • 08:42 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 08:30 godog: restore SRE business hours oncall for EMEA - T350192
  • 07:55 moritzm: installing Linux 6.1.90 on Bookworm systems
  • 06:10 eoghan@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade for T364481
  • 06:03 eoghan@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade for T364481
  • 05:01 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 05:01 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 05:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T352010)', diff saved to https://phabricator.wikimedia.org/P62272 and previous config saved to /var/cache/conftool/dbconfig/20240510-050102-ladsgroup.json
  • 04:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P62271 and previous config saved to /var/cache/conftool/dbconfig/20240510-044554-ladsgroup.json
  • 04:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P62270 and previous config saved to /var/cache/conftool/dbconfig/20240510-043046-ladsgroup.json
  • 04:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T352010)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240510-041534-ladsgroup.json
  • 00:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1212 (T352010)', diff saved to https://phabricator.wikimedia.org/P62269 and previous config saved to /var/cache/conftool/dbconfig/20240510-004703-ladsgroup.json
  • 00:46 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 00:46 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 00:46 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 00:46 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 00:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T352010)', diff saved to https://phabricator.wikimedia.org/P62268 and previous config saved to /var/cache/conftool/dbconfig/20240510-004621-ladsgroup.json
  • 00:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P62267 and previous config saved to /var/cache/conftool/dbconfig/20240510-003113-ladsgroup.json
  • 00:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P62266 and previous config saved to /var/cache/conftool/dbconfig/20240510-001605-ladsgroup.json
  • 00:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T352010)', diff saved to https://phabricator.wikimedia.org/P62265 and previous config saved to /var/cache/conftool/dbconfig/20240510-000058-ladsgroup.json

2024-05-09

  • 23:06 sfaci@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 23:06 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 22:28 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-main2006']
  • 22:27 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-main2006']
  • 21:47 ryankemper: [wdqs] Re-enabled puppet on `wdqs2023`
  • 21:41 ryankemper@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
  • 21:18 ryankemper@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
  • 21:11 jhuneidi@deploy1002: Finished scap: Backport for Skin: Fix UrlUtils calls (T364539) (duration: 23m 42s)
  • 20:58 jhuneidi@deploy1002: jhuneidi and lucaswerkmeister: Continuing with sync
  • 20:49 jhuneidi@deploy1002: jhuneidi and lucaswerkmeister: Backport for Skin: Fix UrlUtils calls (T364539) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:47 jhuneidi@deploy1002: Started scap: Backport for Skin: Fix UrlUtils calls (T364539)
  • 20:18 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.43.0-wmf.4 refs T361398
  • 19:59 jhuneidi@deploy1002: Finished scap: Backport for Revert "Migrate to IReadableDatabase::newSelectQueryBuilder" (T312418 T364499) (duration: 17m 37s)
  • 19:46 jhuneidi@deploy1002: jhuneidi and zabe: Continuing with sync
  • 19:44 jhuneidi@deploy1002: jhuneidi and zabe: Backport for Revert "Migrate to IReadableDatabase::newSelectQueryBuilder" (T312418 T364499) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:42 jhuneidi@deploy1002: Started scap: Backport for Revert "Migrate to IReadableDatabase::newSelectQueryBuilder" (T312418 T364499)
  • 19:29 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcontrol2001-dev.codfw.wmnet
  • 19:29 andrew@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:29 andrew@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2001-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"
  • 19:29 eileen: civicrm upgraded from 6256c944 to c0d2fa95
  • 19:28 andrew@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2001-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"
  • 19:26 andrew@cumin1002: START - Cookbook sre.dns.netbox
  • 19:19 andrew@cumin1002: START - Cookbook sre.hosts.decommission for hosts cloudcontrol2001-dev.codfw.wmnet
  • 19:09 denisse: Restarting `pyrra-filesystem-notify-thanos.path`, and `reset-failed thanos-rule-reload.service` units on titan1001
  • 19:08 denisse: Reset failed `pyrra-filesystem-notify-thanos.path`, and `reset-failed thanos-rule-reload.service` units on titan1001
  • 17:58 jforrester@deploy1002: Finished scap: Backport for Revert "Action APIs: Set most of our APIs to emit a cache header for 24 hours" (T364567) (duration: 17m 17s)
  • 17:45 jforrester@deploy1002: jforrester: Continuing with sync
  • 17:44 ejegg: SmashPig (standalone IPN listener) upgraded from 67db9d96 to 82392d54
  • 17:43 jforrester@deploy1002: jforrester: Backport for Revert "Action APIs: Set most of our APIs to emit a cache header for 24 hours" (T364567) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:41 jforrester@deploy1002: Started scap: Backport for Revert "Action APIs: Set most of our APIs to emit a cache header for 24 hours" (T364567)
  • 17:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1198 (T352010)', diff saved to https://phabricator.wikimedia.org/P62263 and previous config saved to /var/cache/conftool/dbconfig/20240509-173728-ladsgroup.json
  • 17:37 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 17:37 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 17:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T352010)', diff saved to https://phabricator.wikimedia.org/P62262 and previous config saved to /var/cache/conftool/dbconfig/20240509-173705-ladsgroup.json
  • 17:34 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol2006-dev.codfw.wmnet with OS bookworm
  • 17:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P62261 and previous config saved to /var/cache/conftool/dbconfig/20240509-172157-ladsgroup.json
  • 17:16 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol2006-dev.codfw.wmnet with reason: host reimage
  • 17:13 andrew@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol2006-dev.codfw.wmnet with reason: host reimage
  • 17:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P62260 and previous config saved to /var/cache/conftool/dbconfig/20240509-170649-ladsgroup.json
  • 16:56 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2010.codfw.wmnet with OS bullseye
  • 16:55 sukhe: sudo cumin -b30 'A:cp' 'run-puppet-agent --enable "merging CR 1029614"'
  • 16:53 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcontrol2006-dev.codfw.wmnet with OS bookworm
  • 16:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T352010)', diff saved to https://phabricator.wikimedia.org/P62259 and previous config saved to /var/cache/conftool/dbconfig/20240509-165141-ladsgroup.json
  • 16:49 sukhe: sudo cumin 'A:cp' 'disable-puppet "merging CR 1029614"'
  • 16:47 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 16:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2008.codfw.wmnet with OS bullseye
  • 16:35 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudcontrol2006-dev.private.codfw.wikimedia.cloud on all recursors
  • 16:35 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache cloudcontrol2006-dev.private.codfw.wikimedia.cloud on all recursors
  • 16:34 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 16:32 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-main2006.codfw.wmnet with OS bullseye
  • 16:32 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-main2007.codfw.wmnet with OS bullseye
  • 16:32 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:32 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add entries for new codfw cloudcontrol nodes - cmooney@cumin1002"
  • 16:31 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add entries for new codfw cloudcontrol nodes - cmooney@cumin1002"
  • 16:29 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 16:20 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 15:36 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS bullseye
  • 15:36 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-main2010']
  • 15:35 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-main2010']
  • 15:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-main2010']
  • 15:35 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-main2010']
  • 15:34 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main2010.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:31 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1013.eqiad.wmnet with reason: Bootstrapping — T364422
  • 15:31 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on aqs1013.eqiad.wmnet with reason: Bootstrapping — T364422
  • 15:29 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aqs1013.eqiad.wmnet with OS bullseye
  • 15:27 eevans@deploy1002: Finished deploy [cassandra/logstash-logback-encoder@42653e6] (aqs): (no justification provided) (duration: 00m 33s)
  • 15:27 eevans@deploy1002: Started deploy [cassandra/logstash-logback-encoder@42653e6] (aqs): (no justification provided)
  • 15:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T364299)', diff saved to https://phabricator.wikimedia.org/P62258 and previous config saved to /var/cache/conftool/dbconfig/20240509-152501-marostegui.json
  • 15:23 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2010.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:22 dancy@deploy1002: Installation of scap version "4.83.0" completed for 307 hosts
  • 15:22 dancy@deploy1002: Installing scap version "4.83.0" for 307 hosts
  • 15:21 dancy@deploy1002: Installing scap version "4.83.0" for 308 hosts
  • 15:20 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:20 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding kafka-main2010 to codfw - jhancock@cumin2002"
  • 15:19 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding kafka-main2010 to codfw - jhancock@cumin2002"
  • 15:17 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 15:15 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1010.eqiad.wmnet
  • 15:15 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:15 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding kafka-main2010 to codfw - jhancock@cumin2002"
  • 15:14 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding kafka-main2010 to codfw - jhancock@cumin2002"
  • 15:14 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 15:14 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2008.codfw.wmnet with OS bullseye
  • 15:14 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2007.codfw.wmnet with OS bullseye
  • 15:14 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS bullseye
  • 15:11 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 15:11 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-main2008']
  • 15:11 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-main2009']
  • 15:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P62257 and previous config saved to /var/cache/conftool/dbconfig/20240509-150953-marostegui.json
  • 15:09 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-main2008']
  • 15:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-main2007']
  • 15:08 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-main2009']
  • 15:08 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host snapshot1010.eqiad.wmnet
  • 15:08 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-main2007']
  • 15:08 ladsgroup@deploy1002: Finished scap: Backport for Disable namespaceDupes again (T364546), Disable namespaceDupes again (T364546) (duration: 16m 02s)
  • 15:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main2007.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:01 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:00 sukhe: sudo cumin 'A:cp' 'run-puppet-agent --enable "merging CR 1029570"'
  • 14:59 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 14:57 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main2009.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:57 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aqs1013.eqiad.wmnet with reason: host reimage
  • 14:56 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main2008.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:56 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2007.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:55 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-main2006']
  • 14:55 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-main2006']
  • 14:55 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 14:55 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main2007.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:54 ladsgroup@deploy1002: ladsgroup: Backport for Disable namespaceDupes again (T364546), Disable namespaceDupes again (T364546) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P62256 and previous config saved to /var/cache/conftool/dbconfig/20240509-145445-marostegui.json
  • 14:54 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:54 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:54 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on aqs1013.eqiad.wmnet with reason: host reimage
  • 14:53 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:52 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 14:52 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:52 ladsgroup@deploy1002: Started scap: Backport for Disable namespaceDupes again (T364546), Disable namespaceDupes again (T364546)
  • 14:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main2010.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2010.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2009.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:45 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2008.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:44 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:43 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 14:43 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:41 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 14:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T364299)', diff saved to https://phabricator.wikimedia.org/P62255 and previous config saved to /var/cache/conftool/dbconfig/20240509-143938-marostegui.json
  • 14:34 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main2009.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:34 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main2010.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2010.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2009.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:33 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main2008.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2008.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:32 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main2009.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:32 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:32 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2007.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:32 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main2008.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:32 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2008.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:32 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2009.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:32 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main2010.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:32 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2010.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:29 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:29 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding kafka-main2006 to codfw - jhancock@cumin2002"
  • 14:28 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding kafka-main2006 to codfw - jhancock@cumin2002"
  • 14:26 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 14:18 eevans@cumin1002: START - Cookbook sre.hosts.reimage for host aqs1013.eqiad.wmnet with OS bullseye
  • 14:16 eevans@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host aqs1013.eqiad.wmnet with OS bullseye
  • 14:15 marostegui@cumin1002: dbctl commit (dc=all): 'db1192 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62254 and previous config saved to /var/cache/conftool/dbconfig/20240509-141526-root.json
  • 14:09 denisse: Restarting envoyproxy on titan* hosts as part of the CFSSL migration - T360414
  • 14:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2211 (T364299)', diff saved to https://phabricator.wikimedia.org/P62253 and previous config saved to /var/cache/conftool/dbconfig/20240509-140858-marostegui.json
  • 14:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2211.codfw.wmnet with reason: Maintenance
  • 14:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2211.codfw.wmnet with reason: Maintenance
  • 14:06 TheresNoTime: ftr, did run `[samtar@mwmaint1002 ~]$ mwscript namespaceDupes.php --wiki quwiki --fix` for T355129, cancelled before complete due to outage
  • 14:00 marostegui@cumin1002: dbctl commit (dc=all): 'db1192 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62252 and previous config saved to /var/cache/conftool/dbconfig/20240509-140020-root.json
  • 13:57 eevans@cumin1002: START - Cookbook sre.hosts.reimage for host aqs1013.eqiad.wmnet with OS bullseye
  • 13:47 samtar@deploy1002: Finished scap: Backport for quwiki: Set MetaNamespaceName to Wikipidiya (T355129) (duration: 19m 41s)
  • 13:45 marostegui@cumin1002: dbctl commit (dc=all): 'db1192 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62250 and previous config saved to /var/cache/conftool/dbconfig/20240509-134514-root.json
  • 13:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2201.codfw.wmnet with reason: Maintenance
  • 13:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2201.codfw.wmnet with reason: Maintenance
  • 13:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T364299)', diff saved to https://phabricator.wikimedia.org/P62249 and previous config saved to /var/cache/conftool/dbconfig/20240509-134412-marostegui.json
  • 13:34 samtar@deploy1002: dreamrimmer and samtar: Continuing with sync
  • 13:30 samtar@deploy1002: dreamrimmer and samtar: Backport for quwiki: Set MetaNamespaceName to Wikipidiya (T355129) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:30 marostegui@cumin1002: dbctl commit (dc=all): 'db1192 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62248 and previous config saved to /var/cache/conftool/dbconfig/20240509-133009-root.json
  • 13:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P62247 and previous config saved to /var/cache/conftool/dbconfig/20240509-132905-marostegui.json
  • 13:27 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudnet2005-dev.codfw.wmnet with OS bookworm
  • 13:27 samtar@deploy1002: Started scap: Backport for quwiki: Set MetaNamespaceName to Wikipidiya (T355129)
  • 13:23 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe2014.codfw.wmnet
  • 13:19 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe2014.codfw.wmnet
  • 13:17 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe2013.codfw.wmnet
  • 13:15 marostegui@cumin1002: dbctl commit (dc=all): 'db1192 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62246 and previous config saved to /var/cache/conftool/dbconfig/20240509-131501-root.json
  • 13:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P62245 and previous config saved to /var/cache/conftool/dbconfig/20240509-131355-marostegui.json
  • 13:13 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe2013.codfw.wmnet
  • 13:12 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe2012.codfw.wmnet
  • 13:08 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe2012.codfw.wmnet
  • 13:07 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe2011.codfw.wmnet
  • 13:06 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet2005-dev.codfw.wmnet with reason: host reimage
  • 13:04 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe2011.codfw.wmnet
  • 13:03 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe2010.codfw.wmnet
  • 13:03 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet2005-dev.codfw.wmnet with reason: host reimage
  • 12:59 marostegui@cumin1002: dbctl commit (dc=all): 'db1192 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62244 and previous config saved to /var/cache/conftool/dbconfig/20240509-125955-root.json
  • 12:59 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe2010.codfw.wmnet
  • 12:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T364299)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240509-125843-marostegui.json
  • 12:58 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe2009.codfw.wmnet
  • 12:52 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe2009.codfw.wmnet
  • 12:50 elukey: depool/upgrade/repool ms-fe20[09-14] to upgrade envoy to TLS PKI certs
  • 12:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1192.eqiad.wmnet with OS bookworm
  • 12:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
  • 12:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
  • 12:21 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 12:20 ladsgroup@deploy1002: ladsgroup: Backport for Return array from LocalAuth::getCentralLists (T364538) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:18 ladsgroup@deploy1002: Started scap: Backport for Return array from LocalAuth::getCentralLists (T364538)
  • 12:16 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet2008-dev.codfw.wmnet with reason: host reimage
  • 12:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P62240 and previous config saved to /var/cache/conftool/dbconfig/20240509-121433-marostegui.json
  • 12:12 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet2007-dev.codfw.wmnet with reason: host reimage
  • 12:11 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1192.eqiad.wmnet with OS bookworm
  • 12:10 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet2008-dev.codfw.wmnet with reason: host reimage
  • 12:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1192', diff saved to https://phabricator.wikimedia.org/P62239 and previous config saved to /var/cache/conftool/dbconfig/20240509-120955-root.json
  • 12:09 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet2007-dev.codfw.wmnet with reason: host reimage
  • 11:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P62237 and previous config saved to /var/cache/conftool/dbconfig/20240509-115925-marostegui.json
  • 11:51 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudnet2008-dev.codfw.wmnet with OS bookworm
  • 11:50 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudnet2007-dev.codfw.wmnet with OS bookworm
  • 11:50 sfaci@deploy1002: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
  • 11:50 sfaci@deploy1002: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
  • 11:49 sfaci@deploy1002: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
  • 11:49 sfaci@deploy1002: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
  • 11:48 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
  • 11:47 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 11:46 sfaci@deploy1002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
  • 11:45 sfaci@deploy1002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
  • 11:45 sfaci@deploy1002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
  • 11:45 sfaci@deploy1002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
  • 11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T364299)', diff saved to https://phabricator.wikimedia.org/P62236 and previous config saved to /var/cache/conftool/dbconfig/20240509-114417-marostegui.json
  • 11:44 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 11:43 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 11:34 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62235 and previous config saved to /var/cache/conftool/dbconfig/20240509-113443-root.json
  • 11:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62234 and previous config saved to /var/cache/conftool/dbconfig/20240509-111936-root.json
  • 11:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2178 (T364299)', diff saved to https://phabricator.wikimedia.org/P62233 and previous config saved to /var/cache/conftool/dbconfig/20240509-111100-marostegui.json
  • 11:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 11:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 11:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T364299)', diff saved to https://phabricator.wikimedia.org/P62232 and previous config saved to /var/cache/conftool/dbconfig/20240509-111037-marostegui.json
  • 11:04 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62231 and previous config saved to /var/cache/conftool/dbconfig/20240509-110430-root.json
  • 10:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P62230 and previous config saved to /var/cache/conftool/dbconfig/20240509-105527-marostegui.json
  • 10:49 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62229 and previous config saved to /var/cache/conftool/dbconfig/20240509-104922-root.json
  • 10:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P62228 and previous config saved to /var/cache/conftool/dbconfig/20240509-104019-marostegui.json
  • 10:34 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62227 and previous config saved to /var/cache/conftool/dbconfig/20240509-103417-root.json
  • 10:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T364299)', diff saved to https://phabricator.wikimedia.org/P62226 and previous config saved to /var/cache/conftool/dbconfig/20240509-102512-marostegui.json
  • 10:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62225 and previous config saved to /var/cache/conftool/dbconfig/20240509-101911-root.json
  • 10:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1167.eqiad.wmnet with OS bookworm
  • 10:04 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62224 and previous config saved to /var/cache/conftool/dbconfig/20240509-100405-root.json
  • 10:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1189 (T352010)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240509-100006-ladsgroup.json
  • 10:00 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 09:59 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 09:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T352010)', diff saved to https://phabricator.wikimedia.org/P62222 and previous config saved to /var/cache/conftool/dbconfig/20240509-095943-ladsgroup.json
  • 09:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2171 (T364299)', diff saved to https://phabricator.wikimedia.org/P62221 and previous config saved to /var/cache/conftool/dbconfig/20240509-095313-marostegui.json
  • 09:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 09:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 09:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T364299)', diff saved to https://phabricator.wikimedia.org/P62220 and previous config saved to /var/cache/conftool/dbconfig/20240509-095249-marostegui.json
  • 09:52 jforrester@deploy1002: Finished scap: Backport for Disable ParserMigration on commonswiki (T364228) (duration: 16m 17s)
  • 09:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1167.eqiad.wmnet with reason: host reimage
  • 09:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240509-094431-ladsgroup.json
  • 09:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1167.eqiad.wmnet with reason: host reimage
  • 09:39 jforrester@deploy1002: lucaswerkmeister-wmde and jforrester: Continuing with sync
  • 09:38 jforrester@deploy1002: lucaswerkmeister-wmde and jforrester: Backport for Disable ParserMigration on commonswiki (T364228) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P62219 and previous config saved to /var/cache/conftool/dbconfig/20240509-093742-marostegui.json
  • 09:36 jforrester@deploy1002: Started scap: Backport for Disable ParserMigration on commonswiki (T364228)
  • 09:31 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1171.eqiad.wmnet with reason: upgrade to 10.6
  • 09:31 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1171.eqiad.wmnet with reason: upgrade to 10.6
  • 09:31 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1150.eqiad.wmnet with reason: upgrade to 10.6
  • 09:31 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1150.eqiad.wmnet with reason: upgrade to 10.6
  • 09:29 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1167.eqiad.wmnet with OS bookworm
  • 09:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P62218 and previous config saved to /var/cache/conftool/dbconfig/20240509-092921-ladsgroup.json
  • 09:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1167', diff saved to https://phabricator.wikimedia.org/P62217 and previous config saved to /var/cache/conftool/dbconfig/20240509-092757-root.json
  • 09:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P62216 and previous config saved to /var/cache/conftool/dbconfig/20240509-092234-marostegui.json
  • 09:14 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62215 and previous config saved to /var/cache/conftool/dbconfig/20240509-091445-root.json
  • 09:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T352010)', diff saved to https://phabricator.wikimedia.org/P62214 and previous config saved to /var/cache/conftool/dbconfig/20240509-091413-ladsgroup.json
  • 09:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T364299)', diff saved to https://phabricator.wikimedia.org/P62213 and previous config saved to /var/cache/conftool/dbconfig/20240509-090726-marostegui.json
  • 09:04 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
  • 08:59 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62212 and previous config saved to /var/cache/conftool/dbconfig/20240509-085939-root.json
  • 08:54 btullis@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 08:53 jynus: deploy new grants for es6, es7 backups T363812
  • 08:44 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62211 and previous config saved to /var/cache/conftool/dbconfig/20240509-084433-root.json
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2157 (T364299)', diff saved to https://phabricator.wikimedia.org/P62210 and previous config saved to /var/cache/conftool/dbconfig/20240509-083705-marostegui.json
  • 08:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 08:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 08:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T364299)', diff saved to https://phabricator.wikimedia.org/P62209 and previous config saved to /var/cache/conftool/dbconfig/20240509-083643-marostegui.json
  • 08:30 godog: set batphone oncall for May 9th only for EMEA, not Americas - T350192
  • 08:29 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62208 and previous config saved to /var/cache/conftool/dbconfig/20240509-082927-root.json
  • 08:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P62207 and previous config saved to /var/cache/conftool/dbconfig/20240509-082135-marostegui.json
  • 08:14 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62206 and previous config saved to /var/cache/conftool/dbconfig/20240509-081422-root.json
  • 08:13 godog: set batphone oncall for May 9th - T350192
  • 08:09 marostegui@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62205 and previous config saved to /var/cache/conftool/dbconfig/20240509-080936-root.json
  • 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P62204 and previous config saved to /var/cache/conftool/dbconfig/20240509-080627-marostegui.json
  • 08:05 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62203 and previous config saved to /var/cache/conftool/dbconfig/20240509-080549-root.json
  • 08:02 zabe@deploy1002: Finished scap: Backport for Fix error when marking a new page for translations (T364522) (duration: 19m 28s)
  • 07:59 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62202 and previous config saved to /var/cache/conftool/dbconfig/20240509-075914-root.json
  • 07:54 marostegui@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62201 and previous config saved to /var/cache/conftool/dbconfig/20240509-075429-root.json
  • 07:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T364299)', diff saved to https://phabricator.wikimedia.org/P62200 and previous config saved to /var/cache/conftool/dbconfig/20240509-075118-marostegui.json
  • 07:50 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62199 and previous config saved to /var/cache/conftool/dbconfig/20240509-075043-root.json
  • 07:50 zabe@deploy1002: zabe and abi: Continuing with sync
  • 07:45 zabe@deploy1002: zabe and abi: Backport for Fix error when marking a new page for translations (T364522) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:44 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62198 and previous config saved to /var/cache/conftool/dbconfig/20240509-074408-root.json
  • 07:43 marostegui@cumin1002: dbctl commit (dc=all): 'Fully repool db1172', diff saved to https://phabricator.wikimedia.org/P62197 and previous config saved to /var/cache/conftool/dbconfig/20240509-074355-marostegui.json
  • 07:43 zabe@deploy1002: Started scap: Backport for Fix error when marking a new page for translations (T364522)
  • 07:42 zabe@deploy1002: Finished scap: Backport for Move wgGroupsAddToSelf and wgGroupsRemoveFromSelf to core-Permissions (duration: 17m 37s)
  • 07:39 marostegui@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62196 and previous config saved to /var/cache/conftool/dbconfig/20240509-073922-root.json
  • 07:35 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62195 and previous config saved to /var/cache/conftool/dbconfig/20240509-073537-root.json
  • 07:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1172 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62194 and previous config saved to /var/cache/conftool/dbconfig/20240509-073311-root.json
  • 07:29 zabe@deploy1002: zabe: Continuing with sync
  • 07:28 zabe@deploy1002: zabe: Backport for Move wgGroupsAddToSelf and wgGroupsRemoveFromSelf to core-Permissions synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:24 zabe@deploy1002: Started scap: Backport for Move wgGroupsAddToSelf and wgGroupsRemoveFromSelf to core-Permissions
  • 07:24 marostegui@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 25%: Repooling', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240509-072411-root.json
  • 07:20 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62192 and previous config saved to /var/cache/conftool/dbconfig/20240509-072032-root.json
  • 07:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1172 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62191 and previous config saved to /var/cache/conftool/dbconfig/20240509-071805-root.json
  • 07:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2128 (T364299)', diff saved to https://phabricator.wikimedia.org/P62190 and previous config saved to /var/cache/conftool/dbconfig/20240509-071527-marostegui.json
  • 07:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 07:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 07:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 07:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 07:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T364299)', diff saved to https://phabricator.wikimedia.org/P62189 and previous config saved to /var/cache/conftool/dbconfig/20240509-071449-marostegui.json
  • 07:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2020.codfw.wmnet with OS bookworm
  • 07:09 marostegui@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62188 and previous config saved to /var/cache/conftool/dbconfig/20240509-070905-root.json
  • 07:05 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62187 and previous config saved to /var/cache/conftool/dbconfig/20240509-070526-root.json
  • 07:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1172 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62186 and previous config saved to /var/cache/conftool/dbconfig/20240509-070300-root.json
  • 06:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P62185 and previous config saved to /var/cache/conftool/dbconfig/20240509-065941-marostegui.json
  • 06:54 marostegui@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 5%: Repooling', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240509-065355-root.json
  • 06:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2020.codfw.wmnet with reason: host reimage
  • 06:50 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62183 and previous config saved to /var/cache/conftool/dbconfig/20240509-065020-root.json
  • 06:47 marostegui@cumin1002: dbctl commit (dc=all): 'db1172 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62182 and previous config saved to /var/cache/conftool/dbconfig/20240509-064754-root.json
  • 06:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2020.codfw.wmnet with reason: host reimage
  • 06:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P62181 and previous config saved to /var/cache/conftool/dbconfig/20240509-064434-marostegui.json
  • 06:38 marostegui@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62180 and previous config saved to /var/cache/conftool/dbconfig/20240509-063845-root.json
  • 06:38 marostegui@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62179 and previous config saved to /var/cache/conftool/dbconfig/20240509-063832-root.json
  • 06:35 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62178 and previous config saved to /var/cache/conftool/dbconfig/20240509-063514-root.json
  • 06:35 marostegui@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1180.eqiad.wmnet onto db1231.eqiad.wmnet
  • 06:32 marostegui@cumin1002: dbctl commit (dc=all): 'db1172 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62177 and previous config saved to /var/cache/conftool/dbconfig/20240509-063248-root.json
  • 06:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T364299)', diff saved to https://phabricator.wikimedia.org/P62176 and previous config saved to /var/cache/conftool/dbconfig/20240509-062926-marostegui.json
  • 06:24 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2020.codfw.wmnet with OS bookworm
  • 06:23 marostegui@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62175 and previous config saved to /var/cache/conftool/dbconfig/20240509-062327-root.json
  • 06:20 marostegui@cumin1002: dbctl commit (dc=all): 'Give some weight to es4 codfw master', diff saved to https://phabricator.wikimedia.org/P62174 and previous config saved to /var/cache/conftool/dbconfig/20240509-062027-marostegui.json
  • 06:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2020 T364451', diff saved to https://phabricator.wikimedia.org/P62173 and previous config saved to /var/cache/conftool/dbconfig/20240509-061957-root.json
  • 06:19 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es2021 to es4 primary and set section read-write T364451', diff saved to https://phabricator.wikimedia.org/P62172 and previous config saved to /var/cache/conftool/dbconfig/20240509-061904-marostegui.json
  • 06:18 marostegui: Starting es4 codfw failover from es2020 to es2021 - T364451
  • 06:17 marostegui@cumin1002: dbctl commit (dc=all): 'db1172 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62171 and previous config saved to /var/cache/conftool/dbconfig/20240509-061742-root.json
  • 06:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es4 T364451
  • 06:15 marostegui@cumin1002: dbctl commit (dc=all): 'Set es2021 with weight 0 T364451', diff saved to https://phabricator.wikimedia.org/P62170 and previous config saved to /var/cache/conftool/dbconfig/20240509-061500-root.json
  • 06:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es4 T364451
  • 06:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1172.eqiad.wmnet with OS bookworm
  • 06:08 marostegui@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62169 and previous config saved to /var/cache/conftool/dbconfig/20240509-060821-root.json
  • 05:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2123 (T364299)', diff saved to https://phabricator.wikimedia.org/P62168 and previous config saved to /var/cache/conftool/dbconfig/20240509-055429-marostegui.json
  • 05:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 05:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 05:53 marostegui@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62167 and previous config saved to /var/cache/conftool/dbconfig/20240509-055314-root.json
  • 05:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
  • 05:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
  • 05:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 05:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 05:38 marostegui@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 10%: Repooling', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240509-053804-root.json
  • 05:37 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1172.eqiad.wmnet with OS bookworm
  • 05:34 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1172 T363792', diff saved to https://phabricator.wikimedia.org/P62166 and previous config saved to /var/cache/conftool/dbconfig/20240509-053442-marostegui.json
  • 05:32 marostegui@cumin1002: START - Cookbook sre.mysql.clone of db1180.eqiad.wmnet onto db1231.eqiad.wmnet
  • 05:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1231', diff saved to https://phabricator.wikimedia.org/P62165 and previous config saved to /var/cache/conftool/dbconfig/20240509-052912-root.json
  • 05:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 05:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 05:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62164 and previous config saved to /var/cache/conftool/dbconfig/20240509-052258-root.json
  • 05:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 05:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 05:07 marostegui@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62163 and previous config saved to /var/cache/conftool/dbconfig/20240509-050752-root.json
  • 05:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 05:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 04:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1235.eqiad.wmnet with reason: Maintenance
  • 04:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1235.eqiad.wmnet with reason: Maintenance
  • 04:52 marostegui@cumin1002: dbctl commit (dc=all): 'Set db1231 with weight 0 T364067', diff saved to https://phabricator.wikimedia.org/P62162 and previous config saved to /var/cache/conftool/dbconfig/20240509-045216-marostegui.json
  • 04:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s6 T364067
  • 04:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 27 hosts with reason: Primary switchover s6 T364067
  • 04:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1235 (T361627)', diff saved to https://phabricator.wikimedia.org/P62161 and previous config saved to /var/cache/conftool/dbconfig/20240509-043908-marostegui.json
  • 04:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1235.eqiad.wmnet with reason: Maintenance
  • 04:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1235.eqiad.wmnet with reason: Maintenance
  • 04:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T361627)', diff saved to https://phabricator.wikimedia.org/P62160 and previous config saved to /var/cache/conftool/dbconfig/20240509-043845-marostegui.json
  • 04:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P62159 and previous config saved to /var/cache/conftool/dbconfig/20240509-042337-marostegui.json
  • 04:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P62158 and previous config saved to /var/cache/conftool/dbconfig/20240509-040830-marostegui.json
  • 03:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T361627)', diff saved to https://phabricator.wikimedia.org/P62157 and previous config saved to /var/cache/conftool/dbconfig/20240509-035320-marostegui.json
  • 03:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1234 (T361627)', diff saved to https://phabricator.wikimedia.org/P62156 and previous config saved to /var/cache/conftool/dbconfig/20240509-034128-marostegui.json
  • 03:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1234.eqiad.wmnet with reason: Maintenance
  • 03:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1234.eqiad.wmnet with reason: Maintenance
  • 03:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T361627)', diff saved to https://phabricator.wikimedia.org/P62155 and previous config saved to /var/cache/conftool/dbconfig/20240509-034105-marostegui.json
  • 03:32 eileen: civicrm upgraded from 3c8a3095 to 6256c944
  • 03:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240509-032552-marostegui.json
  • 03:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P62153 and previous config saved to /var/cache/conftool/dbconfig/20240509-031045-marostegui.json
  • 02:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T361627)', diff saved to https://phabricator.wikimedia.org/P62152 and previous config saved to /var/cache/conftool/dbconfig/20240509-025537-marostegui.json
  • 02:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1175 (T352010)', diff saved to https://phabricator.wikimedia.org/P62151 and previous config saved to /var/cache/conftool/dbconfig/20240509-024531-ladsgroup.json
  • 02:45 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 02:45 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 02:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T352010)', diff saved to https://phabricator.wikimedia.org/P62150 and previous config saved to /var/cache/conftool/dbconfig/20240509-024508-ladsgroup.json
  • 02:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1232 (T361627)', diff saved to https://phabricator.wikimedia.org/P62149 and previous config saved to /var/cache/conftool/dbconfig/20240509-024455-marostegui.json
  • 02:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1232.eqiad.wmnet with reason: Maintenance
  • 02:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1232.eqiad.wmnet with reason: Maintenance
  • 02:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228 (T361627)', diff saved to https://phabricator.wikimedia.org/P62148 and previous config saved to /var/cache/conftool/dbconfig/20240509-024432-marostegui.json
  • 02:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P62147 and previous config saved to /var/cache/conftool/dbconfig/20240509-023000-ladsgroup.json
  • 02:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228', diff saved to https://phabricator.wikimedia.org/P62146 and previous config saved to /var/cache/conftool/dbconfig/20240509-022925-marostegui.json
  • 02:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P62145 and previous config saved to /var/cache/conftool/dbconfig/20240509-021452-ladsgroup.json
  • 02:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228', diff saved to https://phabricator.wikimedia.org/P62144 and previous config saved to /var/cache/conftool/dbconfig/20240509-021417-marostegui.json
  • 01:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T352010)', diff saved to https://phabricator.wikimedia.org/P62143 and previous config saved to /var/cache/conftool/dbconfig/20240509-015942-ladsgroup.json
  • 01:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228 (T361627)', diff saved to https://phabricator.wikimedia.org/P62142 and previous config saved to /var/cache/conftool/dbconfig/20240509-015909-marostegui.json
  • 01:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1228 (T361627)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240509-014836-marostegui.json
  • 01:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1228.eqiad.wmnet with reason: Maintenance
  • 01:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1228.eqiad.wmnet with reason: Maintenance
  • 01:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T361627)', diff saved to https://phabricator.wikimedia.org/P62140 and previous config saved to /var/cache/conftool/dbconfig/20240509-014814-marostegui.json
  • 01:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P62139 and previous config saved to /var/cache/conftool/dbconfig/20240509-013305-marostegui.json
  • 01:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P62138 and previous config saved to /var/cache/conftool/dbconfig/20240509-011758-marostegui.json
  • 01:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T361627)', diff saved to https://phabricator.wikimedia.org/P62137 and previous config saved to /var/cache/conftool/dbconfig/20240509-010250-marostegui.json
  • 00:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1219 (T361627)', diff saved to https://phabricator.wikimedia.org/P62136 and previous config saved to /var/cache/conftool/dbconfig/20240509-005146-marostegui.json
  • 00:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 00:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 00:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T361627)', diff saved to https://phabricator.wikimedia.org/P62135 and previous config saved to /var/cache/conftool/dbconfig/20240509-005122-marostegui.json
  • 00:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P62134 and previous config saved to /var/cache/conftool/dbconfig/20240509-003614-marostegui.json
  • 00:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P62133 and previous config saved to /var/cache/conftool/dbconfig/20240509-002105-marostegui.json
  • 00:14 eileen: civicrm upgraded from bf49ecdc to 3c8a3095
  • 00:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T361627)', diff saved to https://phabricator.wikimedia.org/P62132 and previous config saved to /var/cache/conftool/dbconfig/20240509-000554-marostegui.json

2024-05-08

  • 23:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1218 (T361627)', diff saved to https://phabricator.wikimedia.org/P62131 and previous config saved to /var/cache/conftool/dbconfig/20240508-235350-marostegui.json
  • 23:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 23:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 23:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T361627)', diff saved to https://phabricator.wikimedia.org/P62130 and previous config saved to /var/cache/conftool/dbconfig/20240508-235327-marostegui.json
  • 23:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P62129 and previous config saved to /var/cache/conftool/dbconfig/20240508-233820-marostegui.json
  • 23:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240508-232308-marostegui.json
  • 23:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T361627)', diff saved to https://phabricator.wikimedia.org/P62127 and previous config saved to /var/cache/conftool/dbconfig/20240508-230800-marostegui.json
  • 22:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1207 (T361627)', diff saved to https://phabricator.wikimedia.org/P62126 and previous config saved to /var/cache/conftool/dbconfig/20240508-225652-marostegui.json
  • 22:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 22:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 22:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T361627)', diff saved to https://phabricator.wikimedia.org/P62125 and previous config saved to /var/cache/conftool/dbconfig/20240508-225628-marostegui.json
  • 22:53 mutante: contint1003 - systemctl start wmf_auto_restart_envoyproxy T364510 T358237
  • 22:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P62124 and previous config saved to /var/cache/conftool/dbconfig/20240508-224120-marostegui.json
  • 22:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P62123 and previous config saved to /var/cache/conftool/dbconfig/20240508-222613-marostegui.json
  • 22:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T361627)', diff saved to https://phabricator.wikimedia.org/P62122 and previous config saved to /var/cache/conftool/dbconfig/20240508-221105-marostegui.json
  • 21:22 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1206 (T361627)', diff saved to https://phabricator.wikimedia.org/P62121 and previous config saved to /var/cache/conftool/dbconfig/20240508-212242-marostegui.json
  • 21:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 21:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 21:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T361627)', diff saved to https://phabricator.wikimedia.org/P62119 and previous config saved to /var/cache/conftool/dbconfig/20240508-212219-marostegui.json
  • 21:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P62118 and previous config saved to /var/cache/conftool/dbconfig/20240508-210711-marostegui.json
  • 20:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P62117 and previous config saved to /var/cache/conftool/dbconfig/20240508-205203-marostegui.json
  • 20:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T361627)', diff saved to https://phabricator.wikimedia.org/P62116 and previous config saved to /var/cache/conftool/dbconfig/20240508-203655-marostegui.json
  • 20:25 ebernhardson@deploy1002: Finished scap: Backport for cirrus: Shift remaining public wikis in codfw to replacement updater (T363475) (duration: 16m 00s)
  • 20:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1196 (T361627)', diff saved to https://phabricator.wikimedia.org/P62115 and previous config saved to /var/cache/conftool/dbconfig/20240508-202516-marostegui.json
  • 20:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 20:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 20:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 20:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 20:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T361627)', diff saved to https://phabricator.wikimedia.org/P62114 and previous config saved to /var/cache/conftool/dbconfig/20240508-202446-marostegui.json
  • 20:12 ebernhardson@deploy1002: ebernhardson: Continuing with sync
  • 20:12 ebernhardson@deploy1002: ebernhardson: Backport for cirrus: Shift remaining public wikis in codfw to replacement updater (T363475) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:09 ebernhardson@deploy1002: Started scap: Backport for cirrus: Shift remaining public wikis in codfw to replacement updater (T363475)
  • 20:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P62113 and previous config saved to /var/cache/conftool/dbconfig/20240508-200935-marostegui.json
  • 19:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P62112 and previous config saved to /var/cache/conftool/dbconfig/20240508-195428-marostegui.json
  • 19:51 taavi@deploy1002: Finished scap: Backport for cawiki: Restore normal logo (T363057) (duration: 15m 29s)
  • 19:49 ebernhardson@deploy1002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:48 ebernhardson@deploy1002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T361627)', diff saved to https://phabricator.wikimedia.org/P62111 and previous config saved to /var/cache/conftool/dbconfig/20240508-193920-marostegui.json
  • 19:38 taavi@deploy1002: taavi: Continuing with sync
  • 19:38 taavi@deploy1002: taavi: Backport for cawiki: Restore normal logo (T363057) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1166 (T352010)', diff saved to https://phabricator.wikimedia.org/P62110 and previous config saved to /var/cache/conftool/dbconfig/20240508-193624-ladsgroup.json
  • 19:36 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 19:36 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 19:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T352010)', diff saved to https://phabricator.wikimedia.org/P62109 and previous config saved to /var/cache/conftool/dbconfig/20240508-193601-ladsgroup.json
  • 19:36 taavi@deploy1002: Started scap: Backport for cawiki: Restore normal logo (T363057)
  • 19:33 ladsgroup@deploy1002: Finished scap: Backport for FlaggedRevsStats: Fix migration to query builder (duration: 16m 39s)
  • 19:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1186 (T361627)', diff saved to https://phabricator.wikimedia.org/P62108 and previous config saved to /var/cache/conftool/dbconfig/20240508-192743-marostegui.json
  • 19:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 19:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 19:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T361627)', diff saved to https://phabricator.wikimedia.org/P62107 and previous config saved to /var/cache/conftool/dbconfig/20240508-192715-marostegui.json
  • 19:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P62106 and previous config saved to /var/cache/conftool/dbconfig/20240508-192054-ladsgroup.json
  • 19:20 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 19:20 ladsgroup@deploy1002: ladsgroup: Backport for FlaggedRevsStats: Fix migration to query builder synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:16 ladsgroup@deploy1002: Started scap: Backport for FlaggedRevsStats: Fix migration to query builder
  • 19:16 ladsgroup@deploy1002: Finished scap: Backport for Revert "logos: Add fawiki logo for 1,000,000 article" (duration: 16m 18s)
  • 19:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P62105 and previous config saved to /var/cache/conftool/dbconfig/20240508-191207-marostegui.json
  • 19:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P62104 and previous config saved to /var/cache/conftool/dbconfig/20240508-190546-ladsgroup.json
  • 19:03 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 19:02 ladsgroup@deploy1002: ladsgroup: Backport for Revert "logos: Add fawiki logo for 1,000,000 article" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 18:59 ladsgroup@deploy1002: Started scap: Backport for Revert "logos: Add fawiki logo for 1,000,000 article"
  • 18:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P62103 and previous config saved to /var/cache/conftool/dbconfig/20240508-185700-marostegui.json
  • 18:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T352010)', diff saved to https://phabricator.wikimedia.org/P62102 and previous config saved to /var/cache/conftool/dbconfig/20240508-185038-ladsgroup.json
  • 18:49 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.43.0-wmf.4 refs T361398
  • 18:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T361627)', diff saved to https://phabricator.wikimedia.org/P62101 and previous config saved to /var/cache/conftool/dbconfig/20240508-184152-marostegui.json
  • 18:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1169 (T361627)', diff saved to https://phabricator.wikimedia.org/P62100 and previous config saved to /var/cache/conftool/dbconfig/20240508-183014-marostegui.json
  • 18:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 18:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 18:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T361627)', diff saved to https://phabricator.wikimedia.org/P62099 and previous config saved to /var/cache/conftool/dbconfig/20240508-182951-marostegui.json
  • 18:24 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.43.0-wmf.4 refs T361398
  • 18:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P62098 and previous config saved to /var/cache/conftool/dbconfig/20240508-181443-marostegui.json
  • 17:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P62097 and previous config saved to /var/cache/conftool/dbconfig/20240508-175936-marostegui.json
  • 17:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T361627)', diff saved to https://phabricator.wikimedia.org/P62096 and previous config saved to /var/cache/conftool/dbconfig/20240508-174428-marostegui.json
  • 17:33 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1163 (T361627)', diff saved to https://phabricator.wikimedia.org/P62095 and previous config saved to /var/cache/conftool/dbconfig/20240508-173353-marostegui.json
  • 17:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 17:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 17:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 17:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 17:04 sfaci@deploy1002: Finished deploy [airflow-dags/analytics@1f72038]: (no justification provided) (duration: 00m 29s)
  • 17:03 sfaci@deploy1002: Started deploy [airflow-dags/analytics@1f72038]: (no justification provided)
  • 16:45 sfaci@deploy1002: Finished deploy [analytics/refinery@1c45ef4] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@1c45ef4d] (duration: 02m 52s)
  • 16:45 sfaci@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:45 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 16:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 16:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T361627)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240508-164322-marostegui.json
  • 16:43 sfaci@deploy1002: Started deploy [analytics/refinery@1c45ef4] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@1c45ef4d]
  • 16:42 sfaci@deploy1002: Finished deploy [analytics/refinery@1c45ef4] (thin): Regular analytics weekly train THIN [analytics/refinery@1c45ef4d] (duration: 03m 53s)
  • 16:38 sfaci@deploy1002: Started deploy [analytics/refinery@1c45ef4] (thin): Regular analytics weekly train THIN [analytics/refinery@1c45ef4d]
  • 16:38 sfaci@deploy1002: Finished deploy [analytics/refinery@1c45ef4]: Regular analytics weekly train [analytics/refinery@1c45ef4d] (duration: 16m 37s)
  • 16:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P62094 and previous config saved to /var/cache/conftool/dbconfig/20240508-162812-marostegui.json
  • 16:25 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 16:24 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 16:21 sfaci@deploy1002: Started deploy [analytics/refinery@1c45ef4]: Regular analytics weekly train [analytics/refinery@1c45ef4d]
  • 16:21 sfaci: Deploying refinery
  • 16:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P62093 and previous config saved to /var/cache/conftool/dbconfig/20240508-161305-marostegui.json
  • 16:06 klausman@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 16:03 jelto@cumin1002: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 15:58 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 15:58 klausman@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 15:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T361627)', diff saved to https://phabricator.wikimedia.org/P62092 and previous config saved to /var/cache/conftool/dbconfig/20240508-155757-marostegui.json
  • 15:56 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 15:50 vgutierrez: tested fifo-log-demux 0.7.3 on cp4052, downgraded to 0.6.5
  • 15:38 moritzm: imported tomcat9 9.0.43-2~deb11u10+wmf12u1 to component/tomcat9 for bookworm-wikimedia (rebasing our forward port to the latest security update)
  • 15:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1226 (T361627)', diff saved to https://phabricator.wikimedia.org/P62091 and previous config saved to /var/cache/conftool/dbconfig/20240508-153738-marostegui.json
  • 15:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1226.eqiad.wmnet with reason: Maintenance
  • 15:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1226.eqiad.wmnet with reason: Maintenance
  • 15:35 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 15:35 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 15:28 sfaci@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 15:28 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 15:22 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 15:21 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 15:21 jelto: bump apt package gitlab-ce to 16.9.7-ce.0
  • 15:17 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
  • 15:16 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
  • 15:09 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 15:08 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 15:08 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 15:06 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 15:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 15:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 15:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T361627)', diff saved to https://phabricator.wikimedia.org/P62090 and previous config saved to /var/cache/conftool/dbconfig/20240508-150611-marostegui.json
  • 15:05 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 15:05 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 14:58 klausman@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:58 klausman@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 14:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P62089 and previous config saved to /var/cache/conftool/dbconfig/20240508-145100-marostegui.json
  • 14:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1157 (T352010)', diff saved to https://phabricator.wikimedia.org/P62088 and previous config saved to /var/cache/conftool/dbconfig/20240508-144501-ladsgroup.json
  • 14:44 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 14:44 moritzm: installing Java 8 security updates
  • 14:44 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 14:38 moritzm: installing Java 11 security updates
  • 14:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P62087 and previous config saved to /var/cache/conftool/dbconfig/20240508-143552-marostegui.json
  • 14:23 jiji@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:22 jiji@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T361627)', diff saved to https://phabricator.wikimedia.org/P62086 and previous config saved to /var/cache/conftool/dbconfig/20240508-142045-marostegui.json
  • 14:18 jiji@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:17 jiji@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:17 moritzm: installing libgd2 security updates
  • 14:15 jiji@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 14:15 jiji@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 14:14 jiji@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:13 jiji@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 14:09 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 14:08 jiji@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 13:59 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 13:57 jiji@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 13:57 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 13:55 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 13:54 zabe@deploy1002: Finished scap: Backport for Enable 'flood' user group at en.wikiquote (T351250), Remove wmgCollectionArticleNamespaces config for enWS (T361422) (duration: 19m 22s)
  • 13:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1214 (T361627)', diff saved to https://phabricator.wikimedia.org/P62084 and previous config saved to /var/cache/conftool/dbconfig/20240508-135314-marostegui.json
  • 13:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1214.eqiad.wmnet with reason: Maintenance
  • 13:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1214.eqiad.wmnet with reason: Maintenance
  • 13:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T361627)', diff saved to https://phabricator.wikimedia.org/P62083 and previous config saved to /var/cache/conftool/dbconfig/20240508-135250-marostegui.json
  • 13:52 moritzm: installing Java 17 security updates
  • 13:47 jiji@deploy1002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 13:45 jiji@deploy1002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 13:45 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1236.eqiad.wmnet
  • 13:43 vgutierrez: update to tcp-mss-clamper 0.5 on ncredir6001
  • 13:41 zabe@deploy1002: zabe and dreamrimmer: Continuing with sync
  • 13:41 vgutierrez: uploaded tcp-mss-clamper 0.5 (bullseye|bookworm)-wikimedia (apt.wm.o)
  • 13:39 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:38 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 13:38 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1236.eqiad.wmnet
  • 13:37 zabe@deploy1002: zabe and dreamrimmer: Backport for Enable 'flood' user group at en.wikiquote (T351250), Remove wmgCollectionArticleNamespaces config for enWS (T361422) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:37 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1227.eqiad.wmnet
  • 13:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P62082 and previous config saved to /var/cache/conftool/dbconfig/20240508-133742-marostegui.json
  • 13:35 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1014.eqiad.wmnet
  • 13:35 zabe@deploy1002: Started scap: Backport for Enable 'flood' user group at en.wikiquote (T351250), Remove wmgCollectionArticleNamespaces config for enWS (T361422)
  • 13:32 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1014.eqiad.wmnet
  • 13:31 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1013.eqiad.wmnet
  • 13:28 zabe@deploy1002: Finished scap: Backport for Add tm: as alias to template: on English Wikipedia (T363757), [ruwiki] Limit the use of the ContentTranslation tool (T362440) (duration: 21m 36s)
  • 13:28 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 13:27 jiji@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 13:27 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 13:27 jiji@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 13:27 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1013.eqiad.wmnet
  • 13:27 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:26 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1227.eqiad.wmnet
  • 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1202.eqiad.wmnet
  • 13:23 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1012.eqiad.wmnet
  • 13:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P62081 and previous config saved to /var/cache/conftool/dbconfig/20240508-132235-marostegui.json
  • 13:21 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 13:17 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1012.eqiad.wmnet
  • 13:16 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1202.eqiad.wmnet
  • 13:15 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1011.eqiad.wmnet
  • 13:14 zabe@deploy1002: zabe and dreamrimmer: Continuing with sync
  • 13:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1191.eqiad.wmnet
  • 13:11 zabe@deploy1002: zabe and dreamrimmer: Backport for Add tm: as alias to template: on English Wikipedia (T363757), [ruwiki] Limit the use of the ContentTranslation tool (T362440) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:10 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1011.eqiad.wmnet
  • 13:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T361627)', diff saved to https://phabricator.wikimedia.org/P62080 and previous config saved to /var/cache/conftool/dbconfig/20240508-130727-marostegui.json
  • 13:06 zabe@deploy1002: Started scap: Backport for Add tm: as alias to template: on English Wikipedia (T363757), [ruwiki] Limit the use of the ContentTranslation tool (T362440)
  • 13:05 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1010.eqiad.wmnet
  • 12:58 elukey: depool/deploy/repool every node in the range ms-fe10[10-14] to upgrade envoy to PKI TLS certs
  • 12:57 klausman@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 12:57 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1010.eqiad.wmnet
  • 12:56 klausman@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 12:53 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1191.eqiad.wmnet
  • 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1181.eqiad.wmnet
  • 12:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P62076 and previous config saved to /var/cache/conftool/dbconfig/20240508-122631-marostegui.json
  • 12:22 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1174.eqiad.wmnet
  • 12:22 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1170.eqiad.wmnet
  • 12:16 hnowlan@cumin1002: conftool action : set/weight=10:pooled=yes; selector: name=(mw2396.codfw.wmnet|mw2397.codfw.wmnet|mw2398.codfw.wmnet|mw2399.codfw.wmnet|mw2401.codfw.wmnet|mw2402.codfw.wmnet),cluster=kubernetes,service=kubesvc
  • 12:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P62075 and previous config saved to /var/cache/conftool/dbconfig/20240508-121123-marostegui.json
  • 12:08 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1170.eqiad.wmnet
  • 11:57 moritzm: installing tomcat security updates
  • 11:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T361627)', diff saved to https://phabricator.wikimedia.org/P62074 and previous config saved to /var/cache/conftool/dbconfig/20240508-115616-marostegui.json
  • 11:37 hnowlan: running homer commit for new codfw appservers
  • 11:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1203 (T361627)', diff saved to https://phabricator.wikimedia.org/P62073 and previous config saved to /var/cache/conftool/dbconfig/20240508-113048-marostegui.json
  • 11:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 11:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 11:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T361627)', diff saved to https://phabricator.wikimedia.org/P62072 and previous config saved to /var/cache/conftool/dbconfig/20240508-113025-marostegui.json
  • 11:24 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62071 and previous config saved to /var/cache/conftool/dbconfig/20240508-112439-root.json
  • 11:20 marostegui@cumin1002: dbctl commit (dc=all): 'db1177 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62070 and previous config saved to /var/cache/conftool/dbconfig/20240508-112054-root.json
  • 11:17 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host snapshot1015.eqiad.wmnet
  • 11:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P62069 and previous config saved to /var/cache/conftool/dbconfig/20240508-111518-marostegui.json
  • 11:10 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host snapshot1015.eqiad.wmnet
  • 11:09 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2397.codfw.wmnet with OS bullseye
  • 11:09 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62068 and previous config saved to /var/cache/conftool/dbconfig/20240508-110933-root.json
  • 11:08 volans@cumin1002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for sretest1003.eqiad.wmnet: Renew puppet certificate - volans@cumin1002
  • 11:06 volans@cumin1002: START - Cookbook sre.puppet.renew-cert for sretest1003.eqiad.wmnet: Renew puppet certificate - volans@cumin1002
  • 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host snapshot1011.eqiad.wmnet
  • 11:06 volans@cumin1002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for sretest1002.eqiad.wmnet: Renew puppet certificate - volans@cumin1002
  • 11:05 marostegui@cumin1002: dbctl commit (dc=all): 'db1177 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62067 and previous config saved to /var/cache/conftool/dbconfig/20240508-110545-root.json
  • 11:05 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2399.codfw.wmnet with OS bullseye
  • 11:03 volans@cumin1002: START - Cookbook sre.puppet.renew-cert for sretest1002.eqiad.wmnet: Renew puppet certificate - volans@cumin1002
  • 11:02 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2402.codfw.wmnet with OS bullseye
  • 11:00 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2398.codfw.wmnet with OS bullseye
  • 11:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P62066 and previous config saved to /var/cache/conftool/dbconfig/20240508-110010-marostegui.json
  • 10:59 volans@cumin1002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for sretest1001.eqiad.wmnet: Renew puppet certificate - volans@cumin1002
  • 10:57 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2401.codfw.wmnet with OS bullseye
  • 10:57 volans@cumin1002: START - Cookbook sre.puppet.renew-cert for sretest1001.eqiad.wmnet: Renew puppet certificate - volans@cumin1002
  • 10:55 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2396.codfw.wmnet with OS bullseye
  • 10:54 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62065 and previous config saved to /var/cache/conftool/dbconfig/20240508-105428-root.json
  • 10:53 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host snapshot1011.eqiad.wmnet
  • 10:50 marostegui@cumin1002: dbctl commit (dc=all): 'db1177 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62064 and previous config saved to /var/cache/conftool/dbconfig/20240508-105039-root.json
  • 10:50 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2397.codfw.wmnet with reason: host reimage
  • 10:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2220.codfw.wmnet
  • 10:48 ladsgroup@deploy1002: Finished scap: Backport for pager: Use SelectQueryBuilder::rawTables in IndexPager (T364428) (duration: 15m 42s)
  • 10:46 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2399.codfw.wmnet with reason: host reimage
  • 10:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T361627)', diff saved to https://phabricator.wikimedia.org/P62063 and previous config saved to /var/cache/conftool/dbconfig/20240508-104503-marostegui.json
  • 10:44 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2402.codfw.wmnet with reason: host reimage
  • 10:41 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2398.codfw.wmnet with reason: host reimage
  • 10:39 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62062 and previous config saved to /var/cache/conftool/dbconfig/20240508-103922-root.json
  • 10:39 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2220.codfw.wmnet
  • 10:38 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2401.codfw.wmnet with reason: host reimage
  • 10:36 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2396.codfw.wmnet with reason: host reimage
  • 10:36 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 10:35 ladsgroup@deploy1002: ladsgroup: Backport for pager: Use SelectQueryBuilder::rawTables in IndexPager (T364428) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 10:35 marostegui@cumin1002: dbctl commit (dc=all): 'db1177 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62061 and previous config saved to /var/cache/conftool/dbconfig/20240508-103531-root.json
  • 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2218.codfw.wmnet
  • 10:34 marostegui@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62060 and previous config saved to /var/cache/conftool/dbconfig/20240508-103410-root.json
  • 10:33 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2398.codfw.wmnet with reason: host reimage
  • 10:33 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2397.codfw.wmnet with reason: host reimage
  • 10:33 ladsgroup@deploy1002: Started scap: Backport for pager: Use SelectQueryBuilder::rawTables in IndexPager (T364428)
  • 10:33 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2399.codfw.wmnet with reason: host reimage
  • 10:32 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2402.codfw.wmnet with reason: host reimage
  • 10:32 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2401.codfw.wmnet with reason: host reimage
  • 10:32 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2396.codfw.wmnet with reason: host reimage
  • 10:24 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62059 and previous config saved to /var/cache/conftool/dbconfig/20240508-102416-root.json
  • 10:20 marostegui@cumin1002: dbctl commit (dc=all): 'db1177 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62058 and previous config saved to /var/cache/conftool/dbconfig/20240508-102023-root.json
  • 10:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1193 (T361627)', diff saved to https://phabricator.wikimedia.org/P62057 and previous config saved to /var/cache/conftool/dbconfig/20240508-101946-marostegui.json
  • 10:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 10:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 10:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T361627)', diff saved to https://phabricator.wikimedia.org/P62056 and previous config saved to /var/cache/conftool/dbconfig/20240508-101923-marostegui.json
  • 10:19 marostegui@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62055 and previous config saved to /var/cache/conftool/dbconfig/20240508-101905-root.json
  • 10:19 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1011.eqiad.wmnet with OS bullseye
  • 10:17 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2399.codfw.wmnet with OS bullseye
  • 10:17 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2398.codfw.wmnet with OS bullseye
  • 10:17 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2402.codfw.wmnet with OS bullseye
  • 10:17 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2401.codfw.wmnet with OS bullseye
  • 10:17 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2397.codfw.wmnet with OS bullseye
  • 10:17 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2396.codfw.wmnet with OS bullseye
  • 10:11 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2218.codfw.wmnet
  • 10:09 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62054 and previous config saved to /var/cache/conftool/dbconfig/20240508-100910-root.json
  • 10:05 marostegui@cumin1002: dbctl commit (dc=all): 'db1177 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62053 and previous config saved to /var/cache/conftool/dbconfig/20240508-100517-root.json
  • 10:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P62052 and previous config saved to /var/cache/conftool/dbconfig/20240508-100416-marostegui.json
  • 10:03 marostegui@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62051 and previous config saved to /var/cache/conftool/dbconfig/20240508-100359-root.json
  • 09:58 hnowlan: depooling 6 codfw api appservers in advance of reimaging to k8s workers
  • 09:56 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1011.eqiad.wmnet with reason: host reimage
  • 09:54 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62050 and previous config saved to /var/cache/conftool/dbconfig/20240508-095405-root.json
  • 09:53 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1011.eqiad.wmnet with reason: host reimage
  • 09:50 marostegui@cumin1002: dbctl commit (dc=all): 'db1177 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62049 and previous config saved to /var/cache/conftool/dbconfig/20240508-095011-root.json
  • 09:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1022.eqiad.wmnet with OS bookworm
  • 09:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P62048 and previous config saved to /var/cache/conftool/dbconfig/20240508-094905-marostegui.json
  • 09:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1177.eqiad.wmnet with OS bookworm
  • 09:48 marostegui@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62047 and previous config saved to /var/cache/conftool/dbconfig/20240508-094853-root.json
  • 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2208.codfw.wmnet
  • 09:41 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host snapshot1011.eqiad.wmnet with OS bullseye
  • 09:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T361627)', diff saved to https://phabricator.wikimedia.org/P62046 and previous config saved to /var/cache/conftool/dbconfig/20240508-093350-marostegui.json
  • 09:33 marostegui@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62045 and previous config saved to /var/cache/conftool/dbconfig/20240508-093347-root.json
  • 09:29 marostegui@cumin1002: dbctl commit (dc=all): 'db1178 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62044 and previous config saved to /var/cache/conftool/dbconfig/20240508-092944-root.json
  • 09:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage
  • 09:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1022.eqiad.wmnet with reason: host reimage
  • 09:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage
  • 09:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1022.eqiad.wmnet with reason: host reimage
  • 09:18 marostegui@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62043 and previous config saved to /var/cache/conftool/dbconfig/20240508-091841-root.json
  • 09:14 marostegui@cumin1002: dbctl commit (dc=all): 'db1178 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62042 and previous config saved to /var/cache/conftool/dbconfig/20240508-091434-root.json
  • 09:10 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1177.eqiad.wmnet with OS bookworm
  • 09:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1177 T363792', diff saved to https://phabricator.wikimedia.org/P62041 and previous config saved to /var/cache/conftool/dbconfig/20240508-090925-root.json
  • 09:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1192 (T361627)', diff saved to https://phabricator.wikimedia.org/P62040 and previous config saved to /var/cache/conftool/dbconfig/20240508-090817-marostegui.json
  • 09:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 09:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 09:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T361627)', diff saved to https://phabricator.wikimedia.org/P62039 and previous config saved to /var/cache/conftool/dbconfig/20240508-090754-marostegui.json
  • 09:07 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1022.eqiad.wmnet with OS bookworm
  • 09:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1022 T364289', diff saved to https://phabricator.wikimedia.org/P62038 and previous config saved to /var/cache/conftool/dbconfig/20240508-090621-root.json
  • 09:03 marostegui@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62037 and previous config saved to /var/cache/conftool/dbconfig/20240508-090334-root.json
  • 08:59 marostegui@cumin1002: dbctl commit (dc=all): 'db1178 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62036 and previous config saved to /var/cache/conftool/dbconfig/20240508-085929-root.json
  • 08:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2023.codfw.wmnet with OS bookworm
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P62035 and previous config saved to /var/cache/conftool/dbconfig/20240508-085246-marostegui.json
  • 08:44 marostegui@cumin1002: dbctl commit (dc=all): 'db1178 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62034 and previous config saved to /var/cache/conftool/dbconfig/20240508-084422-root.json
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P62033 and previous config saved to /var/cache/conftool/dbconfig/20240508-083739-marostegui.json
  • 08:36 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2208.codfw.wmnet
  • 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2182.codfw.wmnet
  • 08:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2023.codfw.wmnet with reason: host reimage
  • 08:32 klausman@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 08:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2023.codfw.wmnet with reason: host reimage
  • 08:31 klausman@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 08:29 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 08:29 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 08:29 marostegui@cumin1002: dbctl commit (dc=all): 'db1178 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62032 and previous config saved to /var/cache/conftool/dbconfig/20240508-082917-root.json
  • 08:24 klausman@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 08:23 klausman@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 08:22 klausman@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 08:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T361627)', diff saved to https://phabricator.wikimedia.org/P62031 and previous config saved to /var/cache/conftool/dbconfig/20240508-082231-marostegui.json
  • 08:22 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62030 and previous config saved to /var/cache/conftool/dbconfig/20240508-082202-root.json
  • 08:21 klausman@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 08:21 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2182.codfw.wmnet
  • 08:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2168.codfw.wmnet
  • 08:16 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62029 and previous config saved to /var/cache/conftool/dbconfig/20240508-081633-root.json
  • 08:14 marostegui@cumin1002: dbctl commit (dc=all): 'db1178 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62028 and previous config saved to /var/cache/conftool/dbconfig/20240508-081412-root.json
  • 08:12 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2023.codfw.wmnet with OS bookworm
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Give some weight to es5 master', diff saved to https://phabricator.wikimedia.org/P62027 and previous config saved to /var/cache/conftool/dbconfig/20240508-080848-marostegui.json
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2023 T364443', diff saved to https://phabricator.wikimedia.org/P62026 and previous config saved to /var/cache/conftool/dbconfig/20240508-080812-root.json
  • 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62025 and previous config saved to /var/cache/conftool/dbconfig/20240508-080656-root.json
  • 08:06 marostegui: Starting es5 codfw failover from es2023 to es2024 - T364443
  • 08:03 marostegui@cumin1002: dbctl commit (dc=all): 'Set es2024 with weight 0 T364443', diff saved to https://phabricator.wikimedia.org/P62024 and previous config saved to /var/cache/conftool/dbconfig/20240508-080312-root.json
  • 08:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es5 T364443
  • 08:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es5 T364443
  • 08:01 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62023 and previous config saved to /var/cache/conftool/dbconfig/20240508-080128-root.json
  • 07:59 marostegui@cumin1002: dbctl commit (dc=all): 'db1178 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62022 and previous config saved to /var/cache/conftool/dbconfig/20240508-075906-root.json
  • 07:57 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2168.codfw.wmnet
  • 07:57 Emperor: depool/restart/repool ms-fe1012
  • 07:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1177 (T361627)', diff saved to https://phabricator.wikimedia.org/P62021 and previous config saved to /var/cache/conftool/dbconfig/20240508-075635-marostegui.json
  • 07:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 07:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 07:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T361627)', diff saved to https://phabricator.wikimedia.org/P62020 and previous config saved to /var/cache/conftool/dbconfig/20240508-075610-marostegui.json
  • 07:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2150.codfw.wmnet
  • 07:51 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62019 and previous config saved to /var/cache/conftool/dbconfig/20240508-075150-root.json
  • 07:46 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62018 and previous config saved to /var/cache/conftool/dbconfig/20240508-074620-root.json
  • 07:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P62017 and previous config saved to /var/cache/conftool/dbconfig/20240508-074102-marostegui.json
  • 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62016 and previous config saved to /var/cache/conftool/dbconfig/20240508-073644-root.json
  • 07:33 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2150.codfw.wmnet
  • 07:31 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62015 and previous config saved to /var/cache/conftool/dbconfig/20240508-073109-root.json
  • 07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P62014 and previous config saved to /var/cache/conftool/dbconfig/20240508-072554-marostegui.json
  • 07:21 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62012 and previous config saved to /var/cache/conftool/dbconfig/20240508-072138-root.json
  • 07:16 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62011 and previous config saved to /var/cache/conftool/dbconfig/20240508-071604-root.json
  • 07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T361627)', diff saved to https://phabricator.wikimedia.org/P62010 and previous config saved to /var/cache/conftool/dbconfig/20240508-071047-marostegui.json
  • 07:06 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62009 and previous config saved to /var/cache/conftool/dbconfig/20240508-070632-root.json
  • 07:02 moritzm: uninstalling git-fat on buster hosts T364373
  • 07:00 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62008 and previous config saved to /var/cache/conftool/dbconfig/20240508-070058-root.json
  • 06:51 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62007 and previous config saved to /var/cache/conftool/dbconfig/20240508-065127-root.json
  • 06:45 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62006 and previous config saved to /var/cache/conftool/dbconfig/20240508-064552-root.json
  • 06:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1172 (T361627)', diff saved to https://phabricator.wikimedia.org/P62005 and previous config saved to /var/cache/conftool/dbconfig/20240508-064523-marostegui.json
  • 06:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 06:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 06:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es5 T364443
  • 06:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es5 T364443
  • 06:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2022.codfw.wmnet with OS bookworm
  • 06:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 06:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 06:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T361627)', diff saved to https://phabricator.wikimedia.org/P62004 and previous config saved to /var/cache/conftool/dbconfig/20240508-062012-marostegui.json
  • 06:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2022.codfw.wmnet with reason: host reimage
  • 06:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2022.codfw.wmnet with reason: host reimage
  • 06:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P62003 and previous config saved to /var/cache/conftool/dbconfig/20240508-060501-marostegui.json
  • 06:03 marostegui@cumin1002: dbctl commit (dc=all): 'Give more weight to es2021', diff saved to https://phabricator.wikimedia.org/P62002 and previous config saved to /var/cache/conftool/dbconfig/20240508-060312-marostegui.json
  • 05:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2025.codfw.wmnet with OS bookworm
  • 05:50 marostegui@cumin1002: dbctl commit (dc=all): 'Give more weight to es2021', diff saved to https://phabricator.wikimedia.org/P62001 and previous config saved to /var/cache/conftool/dbconfig/20240508-055023-marostegui.json
  • 05:50 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2022.codfw.wmnet with OS bookworm
  • 05:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P62000 and previous config saved to /var/cache/conftool/dbconfig/20240508-054953-marostegui.json
  • 05:48 marostegui@cumin1002: dbctl commit (dc=all): 'Give more weight to es2021', diff saved to https://phabricator.wikimedia.org/P61999 and previous config saved to /var/cache/conftool/dbconfig/20240508-054825-marostegui.json
  • 05:47 marostegui@cumin1002: dbctl commit (dc=all): 'Give more weight to es2021', diff saved to https://phabricator.wikimedia.org/P61998 and previous config saved to /var/cache/conftool/dbconfig/20240508-054742-marostegui.json
  • 05:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2022', diff saved to https://phabricator.wikimedia.org/P61997 and previous config saved to /var/cache/conftool/dbconfig/20240508-054705-root.json
  • 05:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T361627)', diff saved to https://phabricator.wikimedia.org/P61996 and previous config saved to /var/cache/conftool/dbconfig/20240508-053445-marostegui.json
  • 05:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2025.codfw.wmnet with reason: host reimage
  • 05:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1178.eqiad.wmnet with OS bookworm
  • 05:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2025.codfw.wmnet with reason: host reimage
  • 05:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
  • 05:05 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2025.codfw.wmnet with OS bookworm
  • 05:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1167 (T361627)', diff saved to https://phabricator.wikimedia.org/P61995 and previous config saved to /var/cache/conftool/dbconfig/20240508-050419-marostegui.json
  • 05:04 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
  • 05:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2025', diff saved to https://phabricator.wikimedia.org/P61994 and previous config saved to /var/cache/conftool/dbconfig/20240508-050408-root.json
  • 05:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 05:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 05:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 05:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 04:52 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1178.eqiad.wmnet with OS bookworm
  • 04:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2203.codfw.wmnet with reason: Maintenance
  • 04:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2203.codfw.wmnet with reason: Maintenance
  • 02:16 eileen: civicrm upgraded from 867c3a0d to bf49ecdc

2024-05-07

  • 23:21 eileen: civicrm upgraded from aee07c4e to 867c3a0d
  • 22:50 eileen: civicrm upgraded from 80ae4543 to aee07c4e
  • 21:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165 (T352010)', diff saved to https://phabricator.wikimedia.org/P61992 and previous config saved to /var/cache/conftool/dbconfig/20240507-215122-ladsgroup.json
  • 21:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P61991 and previous config saved to /var/cache/conftool/dbconfig/20240507-213614-ladsgroup.json
  • 21:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T352010)', diff saved to https://phabricator.wikimedia.org/P61990 and previous config saved to /var/cache/conftool/dbconfig/20240507-213227-ladsgroup.json
  • 21:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P61989 and previous config saved to /var/cache/conftool/dbconfig/20240507-212103-ladsgroup.json
  • 21:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P61988 and previous config saved to /var/cache/conftool/dbconfig/20240507-211717-ladsgroup.json
  • 21:15 zabe@deploy1002: Finished scap: Backport for Use OpenSSL for PBKDF2 password hashing (T320929) (duration: 17m 14s)
  • 21:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165 (T352010)', diff saved to https://phabricator.wikimedia.org/P61987 and previous config saved to /var/cache/conftool/dbconfig/20240507-210556-ladsgroup.json
  • 21:03 zabe@deploy1002: zabe and ki: Continuing with sync
  • 21:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P61986 and previous config saved to /var/cache/conftool/dbconfig/20240507-210209-ladsgroup.json
  • 21:01 zabe@deploy1002: zabe and ki: Backport for Use OpenSSL for PBKDF2 password hashing (T320929) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:58 zabe@deploy1002: Started scap: Backport for Use OpenSSL for PBKDF2 password hashing (T320929)
  • 20:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T352010)', diff saved to https://phabricator.wikimedia.org/P61985 and previous config saved to /var/cache/conftool/dbconfig/20240507-204701-ladsgroup.json
  • 20:40 zabe@deploy1002: Finished scap: Backport for Avoid empty insert in SqlScoreStorage::storeScores (T364218) (duration: 16m 01s)
  • 20:27 zabe@deploy1002: zabe: Continuing with sync
  • 20:26 zabe@deploy1002: zabe: Backport for Avoid empty insert in SqlScoreStorage::storeScores (T364218) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:24 zabe@deploy1002: Started scap: Backport for Avoid empty insert in SqlScoreStorage::storeScores (T364218)
  • 20:19 jhuneidi@deploy1002: Finished scap: testwikis wikis to 1.43.0-wmf.4 refs T361398 (duration: 15m 03s)
  • 20:17 denisse: Deleting the kibana and kibana-combined certificates from the private repository - T360414
  • 20:09 denisse: Restarting envoyproxy and opensearch-dashboards services on the Logstash hosts that serve OpenSearch dashboards to migrate to CFSSL certificates - T360414
  • 20:06 denisse: Enabling Puppet on the Logstash hosts that serve OpenSearch dashboards to migrate to CFSSL certificates - T360414
  • 20:04 jhuneidi@deploy1002: Started scap: testwikis wikis to 1.43.0-wmf.4 refs T361398
  • 19:59 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.43.0-wmf.4 refs T361398
  • 19:57 denisse@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 12 hosts with reason: Downtiming the Logstash hosts serving OpenSearch Dashboards as part of the cergen to CFSSL migration - T360414
  • 19:57 denisse@cumin2002: START - Cookbook sre.hosts.downtime for 0:30:00 on 12 hosts with reason: Downtiming the Logstash hosts serving OpenSearch Dashboards as part of the cergen to CFSSL migration - T360414
  • 19:46 denisse: disabling Puppet on the Logstash hosts that serve OpenSearch dashboards to test the CFSSL certificates - T360414
  • 19:34 jhuneidi@deploy1002: Finished scap: Backport for Partial cherry-pick of I9d8409fdbd757e (T361398 T362566) (duration: 15m 39s)
  • 19:21 jhuneidi@deploy1002: ladsgroup and jhuneidi: Continuing with sync
  • 19:21 jhuneidi@deploy1002: ladsgroup and jhuneidi: Backport for Partial cherry-pick of I9d8409fdbd757e (T361398 T362566) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:18 jhuneidi@deploy1002: Started scap: Backport for Partial cherry-pick of I9d8409fdbd757e (T361398 T362566)
  • 18:40 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1013.eqiad.wmnet with reason: Decommissioning — T364422
  • 18:40 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on aqs1013.eqiad.wmnet with reason: Decommissioning — T364422
  • 17:33 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/apertium: apply
  • 17:32 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/apertium: apply
  • 17:21 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/apertium: apply
  • 17:20 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/apertium: apply
  • 17:14 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/apertium: apply
  • 17:13 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/apertium: apply
  • 16:48 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2001.codfw.wmnet
  • 16:39 elukey@cumin1002: START - Cookbook sre.hosts.reboot-single for host ml-staging2001.codfw.wmnet
  • 16:34 zabe@deploy1002: Finished scap: T363825 (duration: 07m 42s)
  • 16:26 zabe@deploy1002: Started scap: T363825
  • 16:08 zabe@deploy1002: sync-world aborted: (no justification provided) (duration: 00m 00s)
  • 16:08 zabe@deploy1002: Started scap: (no justification provided)
  • 16:05 ladsgroup@deploy1002: Finished scap: Backport for Stop writing to old columns of pagelinks in most wikis (T352010 T299947) (duration: 32m 29s)
  • 15:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2179 (T352010)', diff saved to https://phabricator.wikimedia.org/P61983 and previous config saved to /var/cache/conftool/dbconfig/20240507-155822-ladsgroup.json
  • 15:58 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 15:58 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 15:52 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 15:38 ladsgroup@deploy1002: ladsgroup: Backport for Stop writing to old columns of pagelinks in most wikis (T352010 T299947) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:34 ejegg: switched Adyen IPN format to JSON in merchant console
  • 15:32 ladsgroup@deploy1002: Started scap: Backport for Stop writing to old columns of pagelinks in most wikis (T352010 T299947)
  • 15:31 ejegg: SmashPig (standalone IPN listener) upgraded from 71b9be53 to 67db9d96
  • 15:29 hnowlan: depooling 5 eqiad api appservers in advance of reimaging to k8s workers
  • 15:19 moritzm: imported nodejs 20.5.1-deb-1nodesource1 to thirdparty/node20 T362681
  • 15:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2122.codfw.wmnet
  • 15:13 godog: remove accidentally set site!=magru silence, add site=magru silence instead - T364016
  • 15:12 elukey: repool ms-fe1009's envoy with PKI TLS cert
  • 15:12 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1009.eqiad.wmnet
  • 14:55 elukey: depool ms-fe1009's nginx (swift proxy) to safely apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/1026927
  • 14:54 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1009.eqiad.wmnet
  • 14:53 sukhe: A:cp and A:magru: running haproxy-restart
  • 14:53 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2122.codfw.wmnet
  • 14:53 hnowlan@cumin1002: conftool action : set/weight=10:pooled=yes; selector: name=(mw2305.codfw.wmnet|mw2325.codfw.wmnet|mw2338.codfw.wmnet|mw2359.codfw.wmnet|mw2390.codfw.wmnet|mw2407.codfw.wmnet),cluster=kubernetes,service=kubesvc
  • 14:52 moritzm: installing mariadb-10.5 security updates (as packaged in Debian, not the wmf-mariadb packages)
  • 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2121.codfw.wmnet
  • 14:50 godog: silence site=magru alerts during prometheus7001 - T364016
  • 14:44 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2121.codfw.wmnet
  • 14:41 hnowlan: running homer 'cr*codfw*' commit to configure BGP for new k8s codfw workers
  • 14:39 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2338.codfw.wmnet with OS bullseye
  • 14:33 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2325.codfw.wmnet with OS bullseye
  • 14:31 filippo@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host prometheus7001.magru.wmnet
  • 14:31 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host prometheus7001.magru.wmnet with OS bullseye
  • 14:30 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2305.codfw.wmnet with OS bullseye
  • 14:28 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2359.codfw.wmnet with OS bullseye
  • 14:23 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2407.codfw.wmnet with OS bullseye
  • 14:22 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 14:20 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2390.codfw.wmnet with OS bullseye
  • 14:19 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2338.codfw.wmnet with reason: host reimage
  • 14:16 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on prometheus7001.magru.wmnet with reason: host reimage
  • 14:13 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2325.codfw.wmnet with reason: host reimage
  • 14:13 filippo@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on prometheus7001.magru.wmnet with reason: host reimage
  • 14:12 mfossati@deploy1002: Finished deploy [airflow-dags/platform_eng@b543b85]: (no justification provided) (duration: 00m 24s)
  • 14:11 mfossati@deploy1002: Started deploy [airflow-dags/platform_eng@b543b85]: (no justification provided)
  • 14:10 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2305.codfw.wmnet with reason: host reimage
  • 14:08 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2359.codfw.wmnet with reason: host reimage
  • 14:04 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2407.codfw.wmnet with reason: host reimage
  • 14:03 btullis@deploy1002: Finished deploy [airflow-dags/analytics@6be7efd]: (no justification provided) (duration: 00m 27s)
  • 14:03 btullis@deploy1002: Started deploy [airflow-dags/analytics@6be7efd]: (no justification provided)
  • 14:01 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2390.codfw.wmnet with reason: host reimage
  • 13:57 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2338.codfw.wmnet with reason: host reimage
  • 13:56 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2305.codfw.wmnet with reason: host reimage
  • 13:56 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2325.codfw.wmnet with reason: host reimage
  • 13:56 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2359.codfw.wmnet with reason: host reimage
  • 13:56 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2407.codfw.wmnet with reason: host reimage
  • 13:56 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2390.codfw.wmnet with reason: host reimage
  • 13:53 filippo@cumin1002: START - Cookbook sre.hosts.reimage for host prometheus7001.magru.wmnet with OS bullseye
  • 13:51 filippo@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 13:50 filippo@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 13:50 filippo@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus7001.magru.wmnet on all recursors
  • 13:50 filippo@cumin1002: START - Cookbook sre.dns.wipe-cache prometheus7001.magru.wmnet on all recursors
  • 13:50 filippo@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:50 filippo@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 13:49 filippo@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 13:48 mfossati@deploy1002: Finished deploy [airflow-dags/platform_eng@ad4934c]: (no justification provided) (duration: 00m 32s)
  • 13:47 mfossati@deploy1002: Started deploy [airflow-dags/platform_eng@ad4934c]: (no justification provided)
  • 13:44 filippo@cumin1002: START - Cookbook sre.dns.netbox
  • 13:44 filippo@cumin1002: START - Cookbook sre.ganeti.makevm for new host prometheus7001.magru.wmnet
  • 13:41 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus7001.magru.wmnet
  • 13:40 filippo@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:40 filippo@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus7001.magru.wmnet decommissioned, removing all IPs except the asset tag one - filippo@cumin1002"
  • 13:40 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2359.codfw.wmnet with OS bullseye
  • 13:40 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2390.codfw.wmnet with OS bullseye
  • 13:40 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2407.codfw.wmnet with OS bullseye
  • 13:40 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2338.codfw.wmnet with OS bullseye
  • 13:40 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2305.codfw.wmnet with OS bullseye
  • 13:40 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2325.codfw.wmnet with OS bullseye
  • 13:40 filippo@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus7001.magru.wmnet decommissioned, removing all IPs except the asset tag one - filippo@cumin1002"
  • 13:36 filippo@cumin1002: START - Cookbook sre.dns.netbox
  • 13:31 filippo@cumin1002: START - Cookbook sre.hosts.decommission for hosts prometheus7001.magru.wmnet
  • 13:31 filippo@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus7001.magru.wmnet
  • 13:31 filippo@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 13:29 filippo@cumin1002: START - Cookbook sre.dns.netbox
  • 13:29 filippo@cumin1002: START - Cookbook sre.ganeti.makevm for new host prometheus7001.magru.wmnet
  • 13:25 filippo@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus7001.magru.wmnet
  • 13:25 filippo@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus7001.magru.wmnet with OS bullseye
  • 13:21 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 13:19 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 13:17 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 13:14 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 13:05 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 13:04 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 12:48 klausman@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 12:46 klausman@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 12:10 filippo@cumin1002: START - Cookbook sre.hosts.reimage for host prometheus7001.magru.wmnet with OS bullseye
  • 12:09 filippo@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 12:08 filippo@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 12:08 filippo@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus7001.magru.wmnet on all recursors
  • 12:08 filippo@cumin1002: START - Cookbook sre.dns.wipe-cache prometheus7001.magru.wmnet on all recursors
  • 12:08 filippo@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:08 filippo@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 12:07 filippo@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 12:05 btullis@deploy1002: Finished deploy [airflow-dags/analytics@e5ba870]: (no justification provided) (duration: 00m 32s)
  • 12:05 btullis@deploy1002: Started deploy [airflow-dags/analytics@e5ba870]: (no justification provided)
  • 12:03 moritzm: installing ruby3.1 security updates
  • 12:02 filippo@cumin1002: START - Cookbook sre.dns.netbox
  • 12:02 filippo@cumin1002: START - Cookbook sre.ganeti.makevm for new host prometheus7001.magru.wmnet
  • 11:22 jforrester@deploy1002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 11:19 jforrester@deploy1002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 11:19 jforrester@deploy1002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 11:17 jforrester@deploy1002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 11:16 jforrester@deploy1002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 11:15 jforrester@deploy1002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 11:05 hnowlan: depooling 6 codfw appservers in advance of reimaging
  • 10:37 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1223.eqiad.wmnet
  • 10:28 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus7001.magru.wmnet
  • 10:28 filippo@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:28 filippo@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus7001.magru.wmnet decommissioned, removing all IPs except the asset tag one - filippo@cumin1002"
  • 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot-master (exit_code=0) rolling restart_daemons on A:maps-master
  • 10:25 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1223.eqiad.wmnet
  • 10:25 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot-master rolling restart_daemons on A:maps-master
  • 10:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1198.eqiad.wmnet
  • 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-eqiad
  • 10:16 jnuche@deploy1002: Finished scap: testwikis wikis to 1.43.0-wmf.4 refs T361398 (duration: 19m 22s)
  • 10:15 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-eqiad
  • 10:14 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-codfw
  • 10:09 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-codfw
  • 10:08 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1198.eqiad.wmnet
  • 10:01 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry (exit_code=0) rolling restart_daemons on A:docker-registry
  • 09:58 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry rolling restart_daemons on A:docker-registry
  • 09:58 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1189.eqiad.wmnet
  • 09:57 filippo@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus7001.magru.wmnet decommissioned, removing all IPs except the asset tag one - filippo@cumin1002"
  • 09:56 jnuche@deploy1002: Started scap: testwikis wikis to 1.43.0-wmf.4 refs T361398
  • 09:55 jnuche@deploy1002: sync-world aborted: testwikis wikis to 1.43.0-wmf.4 refs T361398 (duration: 43m 38s)
  • 09:54 filippo@cumin1002: START - Cookbook sre.dns.netbox
  • 09:49 filippo@cumin1002: START - Cookbook sre.hosts.decommission for hosts prometheus7001.magru.wmnet
  • 09:41 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1189.eqiad.wmnet
  • 09:40 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1166.eqiad.wmnet
  • 09:39 brouberol@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 09:38 brouberol@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 09:38 brouberol@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 09:37 brouberol@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 09:37 brouberol@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 09:36 brouberol@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 09:36 brouberol@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:36 brouberol@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 09:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2165 (T352010)', diff saved to https://phabricator.wikimedia.org/P61981 and previous config saved to /var/cache/conftool/dbconfig/20240507-093302-ladsgroup.json
  • 09:32 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 09:32 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 09:31 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1166.eqiad.wmnet
  • 09:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1157.eqiad.wmnet
  • 09:21 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1157.eqiad.wmnet
  • 09:11 jnuche@deploy1002: Started scap: testwikis wikis to 1.43.0-wmf.4 refs T361398
  • 09:05 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host install7001.wikimedia.org
  • 09:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host install7001.wikimedia.org with OS bullseye
  • 09:03 jayme@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:02 jayme@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 09:02 jayme@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 09:01 jayme@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 09:01 jayme@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 09:01 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:00 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:00 jayme@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 08:59 jayme@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 08:59 jayme@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 08:53 taavi@deploy1002: Finished scap: Backport for wikitech: Also disable password changes when logged-in (duration: 15m 50s)
  • 08:41 taavi@deploy1002: taavi: Continuing with sync
  • 08:40 taavi@deploy1002: taavi: Backport for wikitech: Also disable password changes when logged-in synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:37 taavi@deploy1002: Started scap: Backport for wikitech: Also disable password changes when logged-in
  • 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on install7001.wikimedia.org with reason: host reimage
  • 08:35 moritzm: installing glibc security updates on buster
  • 08:34 zabe@deploy1002: Finished scap: Backport for Use OpenSSL for PBKDF2 password hashing on testwiki (T320929) (duration: 17m 22s)
  • 08:34 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on install7001.wikimedia.org with reason: host reimage
  • 08:22 zabe@deploy1002: zabe: Continuing with sync
  • 08:19 zabe@deploy1002: zabe: Backport for Use OpenSSL for PBKDF2 password hashing on testwiki (T320929) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:17 zabe@deploy1002: Started scap: Backport for Use OpenSSL for PBKDF2 password hashing on testwiki (T320929)
  • 08:15 zabe@deploy1002: Finished scap: Backport for Stop setting wgPasswordDefault (duration: 15m 24s)
  • 08:13 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install7001.wikimedia.org with OS bullseye
  • 08:08 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install7001.wikimedia.org - jmm@cumin2002"
  • 08:07 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install7001.wikimedia.org - jmm@cumin2002"
  • 08:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install7001.wikimedia.org on all recursors
  • 08:07 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install7001.wikimedia.org on all recursors
  • 08:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install7001.wikimedia.org - jmm@cumin2002"
  • 08:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install7001.wikimedia.org - jmm@cumin2002"
  • 08:03 zabe@deploy1002: zabe: Continuing with sync
  • 08:02 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 08:02 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install7001.wikimedia.org
  • 08:02 zabe@deploy1002: zabe: Backport for Stop setting wgPasswordDefault synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts install7001.wikimedia.org
  • 08:00 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:00 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install7001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 08:00 zabe@deploy1002: Started scap: Backport for Stop setting wgPasswordDefault
  • 07:50 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install7001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 07:45 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 07:40 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts install7001.wikimedia.org
  • 07:39 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host install7001.wikimedia.org with OS bullseye
  • 07:31 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install7001.wikimedia.org with OS bullseye
  • 04:04 mwpresync@deploy1002: Pruned MediaWiki: 1.43.0-wmf.1, 1.43.0-wmf.2 (duration: 04m 50s)
  • 00:47 denisse: Reverting debug changes to their previous state - T364354
  • 00:42 denisse: Writing output to `/tmp/benthos_output.txt` shows that the grok processor's output is being parsed correctly - T364354
  • 00:17 denisse: Adding a logger processor to the `parse_ncredir_log_format` on `ncredir2001` to examine the JSON structure - T364354

2024-05-06

  • 22:22 dzahn@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 22:20 dzahn@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 22:20 dzahn@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 22:18 dzahn@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 22:14 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 22:13 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 21:29 dancy@deploy1002: Installation of scap version "4.82.0" completed for 320 hosts
  • 21:28 dancy@deploy1002: Installing scap version "4.82.0" for 320 hosts
  • 20:47 jdrewniak@deploy1002: Finished scap: Backport for [Vector 2022] Deploy larger font-size and appearance menu to pilot wikis (T362147) (duration: 15m 10s)
  • 20:34 jdrewniak@deploy1002: jdrewniak: Continuing with sync
  • 20:34 jdrewniak@deploy1002: jdrewniak: Backport for [Vector 2022] Deploy larger font-size and appearance menu to pilot wikis (T362147) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:32 jdrewniak@deploy1002: Started scap: Backport for [Vector 2022] Deploy larger font-size and appearance menu to pilot wikis (T362147)
  • 20:27 jdrewniak@deploy1002: Finished scap: Backport for Revert "Revert "Release DT visual enhancements to all except Wikipedia/Commons/Wikidata"" (T352087) (duration: 21m 33s)
  • 20:14 jdrewniak@deploy1002: esanders and jdrewniak: Continuing with sync
  • 20:09 jdrewniak@deploy1002: esanders and jdrewniak: Backport for Revert "Revert "Release DT visual enhancements to all except Wikipedia/Commons/Wikidata"" (T352087) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:05 jdrewniak@deploy1002: Started scap: Backport for Revert "Revert "Release DT visual enhancements to all except Wikipedia/Commons/Wikidata"" (T352087)
  • 19:46 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s7 userOptions.php --delete wlenhancedfilters-seen-tour # T364269
  • 19:25 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s5 userOptions.php --delete rcenhancedfilters-seen-highlight-button-counter # T364269
  • 19:23 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript userOptions.php --wiki=enwiki --delete wlenhancedfilters-seen-tour # T364269
  • 19:21 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s6 userOptions.php --delete rcenhancedfilters-seen-highlight-button-counter # T364269
  • 19:17 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s6 userOptions.php --delete rcenhancedfilters-tried-highlight # T364269
  • 19:17 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s5 userOptions.php --delete rcenhancedfilters-tried-highlight # T364269
  • 19:15 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s3 userOptions.php --delete wlenhancedfilters-seen-tour # T364269
  • 18:20 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s6 userOptions.php --delete wlenhancedfilters-seen-tour # T364269
  • 18:20 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s5 userOptions.php --delete wlenhancedfilters-seen-tour # T364269
  • 18:20 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s4 userOptions.php --delete wlenhancedfilters-seen-tour # T364269
  • 18:20 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s2 userOptions.php --delete wlenhancedfilters-seen-tour # T364269
  • 18:18 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript userOptions.php --wiki=loginwiki --delete wlenhancedfilters-seen-tour # T364269
  • 18:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T361627)', diff saved to https://phabricator.wikimedia.org/P61979 and previous config saved to /var/cache/conftool/dbconfig/20240506-181706-marostegui.json
  • 18:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P61978 and previous config saved to /var/cache/conftool/dbconfig/20240506-180158-marostegui.json
  • 17:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P61977 and previous config saved to /var/cache/conftool/dbconfig/20240506-174651-marostegui.json
  • 17:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T361627)', diff saved to https://phabricator.wikimedia.org/P61976 and previous config saved to /var/cache/conftool/dbconfig/20240506-173143-marostegui.json
  • 17:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2216 (T361627)', diff saved to https://phabricator.wikimedia.org/P61975 and previous config saved to /var/cache/conftool/dbconfig/20240506-172126-marostegui.json
  • 17:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2216.codfw.wmnet with reason: Maintenance
  • 17:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2216.codfw.wmnet with reason: Maintenance
  • 17:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2212 (T361627)', diff saved to https://phabricator.wikimedia.org/P61974 and previous config saved to /var/cache/conftool/dbconfig/20240506-172103-marostegui.json
  • 17:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2212', diff saved to https://phabricator.wikimedia.org/P61973 and previous config saved to /var/cache/conftool/dbconfig/20240506-170556-marostegui.json
  • 16:52 sukhe: sudo cumin 'A:ncredir' 'run-puppet-agent --enable "merging CR 1028514"'
  • 16:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2212', diff saved to https://phabricator.wikimedia.org/P61972 and previous config saved to /var/cache/conftool/dbconfig/20240506-165048-marostegui.json
  • 16:45 sukhe: disable puppet on A:ncredir to merge CR 1028514
  • 16:38 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mw2382.codfw.wmnet
  • 16:38 jayme@cumin1002: START - Cookbook sre.hosts.remove-downtime for mw2382.codfw.wmnet
  • 16:37 jayme@cumin1002: conftool action : set/pooled=yes; selector: name=mw2382.codfw.wmnet
  • 16:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2212 (T361627)', diff saved to https://phabricator.wikimedia.org/P61971 and previous config saved to /var/cache/conftool/dbconfig/20240506-163540-marostegui.json
  • 16:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2212 (T361627)', diff saved to https://phabricator.wikimedia.org/P61970 and previous config saved to /var/cache/conftool/dbconfig/20240506-162528-marostegui.json
  • 16:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2212.codfw.wmnet with reason: Maintenance
  • 16:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2212.codfw.wmnet with reason: Maintenance
  • 16:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2202.codfw.wmnet with reason: Maintenance
  • 16:16 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2202.codfw.wmnet with reason: Maintenance
  • 16:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T361627)', diff saved to https://phabricator.wikimedia.org/P61969 and previous config saved to /var/cache/conftool/dbconfig/20240506-161624-marostegui.json
  • 16:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P61968 and previous config saved to /var/cache/conftool/dbconfig/20240506-160116-marostegui.json
  • 15:54 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61967 and previous config saved to /var/cache/conftool/dbconfig/20240506-155420-root.json
  • 15:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P61966 and previous config saved to /var/cache/conftool/dbconfig/20240506-154608-marostegui.json
  • 15:39 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61965 and previous config saved to /var/cache/conftool/dbconfig/20240506-153914-root.json
  • 15:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T361627)', diff saved to https://phabricator.wikimedia.org/P61964 and previous config saved to /var/cache/conftool/dbconfig/20240506-153101-marostegui.json
  • 15:25 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply
  • 15:24 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/mathoid: apply
  • 15:24 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61963 and previous config saved to /var/cache/conftool/dbconfig/20240506-152408-root.json
  • 15:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2188 (T361627)', diff saved to https://phabricator.wikimedia.org/P61962 and previous config saved to /var/cache/conftool/dbconfig/20240506-152040-marostegui.json
  • 15:20 urbanecm@deploy1002: Finished scap: Backport for userOptions.php: Actually batch deletion (T364311) (duration: 16m 51s)
  • 15:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2188.codfw.wmnet with reason: Maintenance
  • 15:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2188.codfw.wmnet with reason: Maintenance
  • 15:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T361627)', diff saved to https://phabricator.wikimedia.org/P61961 and previous config saved to /var/cache/conftool/dbconfig/20240506-152016-marostegui.json
  • 15:17 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s{3-8} userOptions.php --delete rcenhancedfilters-seen-tour # T364269
  • 15:16 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s2 userOptions.php --delete rcenhancedfilters-seen-tour # T364269
  • 15:16 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript userOptions.php --wiki=enwiki --delete rcenhancedfilters-seen-tour # T364269
  • 15:12 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/mathoid: apply
  • 15:11 brouberol@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster cloudelastic: restart to pick up new JDK - brouberol@cumin2002 - T363975
  • 15:09 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/mathoid: apply
  • 15:09 urbanecm: mwmaint1002: mwscript userOptions.php --wiki=loginwiki --delete rcenhancedfilters-seen-tour # T364269
  • 15:09 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61960 and previous config saved to /var/cache/conftool/dbconfig/20240506-150902-root.json
  • 15:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P61959 and previous config saved to /var/cache/conftool/dbconfig/20240506-150508-marostegui.json
  • 15:03 urbanecm@deploy1002: Started scap: Backport for userOptions.php: Actually batch deletion (T364311)
  • 15:03 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/mathoid: apply
  • 15:01 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/mathoid: apply
  • 14:55 moritzm: installing less security updates
  • 14:54 filippo@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=97) for new host prometheus7001.magru.wmnet
  • 14:54 filippo@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host prometheus7001.magru.wmnet with OS bullseye
  • 14:53 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61958 and previous config saved to /var/cache/conftool/dbconfig/20240506-145356-root.json
  • 14:51 brouberol@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster cloudelastic: restart to pick up new JDK - brouberol@cumin2002 - T363975
  • 14:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P61957 and previous config saved to /var/cache/conftool/dbconfig/20240506-145001-marostegui.json
  • 14:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2209.codfw.wmnet
  • 14:45 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:45 urbanecm@deploy1002: Finished scap: Backport for Revert "Release DT visual enhancements to all except Wikipedia/Commons/Wikidata" (duration: 18m 11s)
  • 14:45 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:38 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61956 and previous config saved to /var/cache/conftool/dbconfig/20240506-143850-root.json
  • 14:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T361627)', diff saved to https://phabricator.wikimedia.org/P61955 and previous config saved to /var/cache/conftool/dbconfig/20240506-143453-marostegui.json
  • 14:32 urbanecm@deploy1002: urbanecm: Continuing with sync
  • 14:32 urbanecm@deploy1002: urbanecm: Backport for Revert "Release DT visual enhancements to all except Wikipedia/Commons/Wikidata" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:28 dcausse@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:28 dcausse@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:27 urbanecm@deploy1002: Started scap: Backport for Revert "Release DT visual enhancements to all except Wikipedia/Commons/Wikidata"
  • 14:25 urbanecm@deploy1002: Sync cancelled.
  • 14:23 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:23 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61954 and previous config saved to /var/cache/conftool/dbconfig/20240506-142344-root.json
  • 14:23 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:23 filippo@cumin1002: START - Cookbook sre.hosts.reimage for host prometheus7001.magru.wmnet with OS bullseye
  • 14:23 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2176 (T361627)', diff saved to https://phabricator.wikimedia.org/P61953 and previous config saved to /var/cache/conftool/dbconfig/20240506-142316-marostegui.json
  • 14:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2176.codfw.wmnet with reason: Maintenance
  • 14:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2176.codfw.wmnet with reason: Maintenance
  • 14:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T361627)', diff saved to https://phabricator.wikimedia.org/P61952 and previous config saved to /var/cache/conftool/dbconfig/20240506-142253-marostegui.json
  • 14:21 filippo@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 14:20 filippo@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 14:20 filippo@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus7001.magru.wmnet on all recursors
  • 14:20 filippo@cumin1002: START - Cookbook sre.dns.wipe-cache prometheus7001.magru.wmnet on all recursors
  • 14:20 filippo@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:20 filippo@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 14:19 filippo@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 14:17 filippo@cumin1002: START - Cookbook sre.dns.netbox
  • 14:17 filippo@cumin1002: START - Cookbook sre.ganeti.makevm for new host prometheus7001.magru.wmnet
  • 14:16 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudbackup1004.eqiad.wmnet with OS bookworm
  • 14:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2021.codfw.wmnet with OS bookworm
  • 14:11 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus7001.magru.wmnet
  • 14:11 filippo@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:11 filippo@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus7001.magru.wmnet decommissioned, removing all IPs except the asset tag one - filippo@cumin1002"
  • 14:11 filippo@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus7001.magru.wmnet decommissioned, removing all IPs except the asset tag one - filippo@cumin1002"
  • 14:08 filippo@cumin1002: START - Cookbook sre.dns.netbox
  • 14:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P61951 and previous config saved to /var/cache/conftool/dbconfig/20240506-140745-marostegui.json
  • 14:04 filippo@cumin1002: START - Cookbook sre.hosts.decommission for hosts prometheus7001.magru.wmnet
  • 13:54 filippo@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus7001.magru.wmnet
  • 13:54 filippo@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus7001.magru.wmnet with OS bullseye
  • 13:53 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudbackup1004.eqiad.wmnet with reason: host reimage
  • 13:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P61950 and previous config saved to /var/cache/conftool/dbconfig/20240506-135238-marostegui.json
  • 13:51 andrew@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudbackup1004.eqiad.wmnet with reason: host reimage
  • 13:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2021.codfw.wmnet with reason: host reimage
  • 13:49 urbanecm@deploy1002: esanders and urbanecm: Backport for Release DT visual enhancements to all except Wikipedia/Commons/Wikidata (T352087) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2021.codfw.wmnet with reason: host reimage
  • 13:45 urbanecm@deploy1002: Started scap: Backport for Release DT visual enhancements to all except Wikipedia/Commons/Wikidata (T352087)
  • 13:44 urbanecm@deploy1002: Finished scap: Backport for eswiki, commonswiki wikidatawiki: lift IP cap for edit-a-thon (T364039) (duration: 17m 14s)
  • 13:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T361627)', diff saved to https://phabricator.wikimedia.org/P61949 and previous config saved to /var/cache/conftool/dbconfig/20240506-133728-marostegui.json
  • 13:35 urbanecm: Run `mwscript userOptions.php --wiki=testwiki --delete` for "rcenhancedfilters-seen-tour", "wlenhancedfilters-seen-tour", "rcenhancedfilters-tried-highlight", "rcenhancedfilters-seen-highlight-button-counter" (T364269)
  • 13:33 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudbackup1004.eqiad.wmnet with OS bookworm
  • 13:27 elukey@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=inference,name=eqiad
  • 13:27 urbanecm@deploy1002: Started scap: Backport for eswiki, commonswiki wikidatawiki: lift IP cap for edit-a-thon (T364039)
  • 13:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2174 (T361627)', diff saved to https://phabricator.wikimedia.org/P61948 and previous config saved to /var/cache/conftool/dbconfig/20240506-132635-marostegui.json
  • 13:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2174.codfw.wmnet with reason: Maintenance
  • 13:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2174.codfw.wmnet with reason: Maintenance
  • 13:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T361627)', diff saved to https://phabricator.wikimedia.org/P61947 and previous config saved to /var/cache/conftool/dbconfig/20240506-132612-marostegui.json
  • 13:25 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2021.codfw.wmnet with OS bookworm
  • 13:24 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2021', diff saved to https://phabricator.wikimedia.org/P61946 and previous config saved to /var/cache/conftool/dbconfig/20240506-132424-root.json
  • 13:16 urbanecm@deploy1002: Finished scap: Backport for iglwiki: Enable GrowthExperiments (T364130), Backport several WikimediaMessages patches (T217451 T362538 T364213 T315774 T364269) (duration: 24m 01s)
  • 13:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P61945 and previous config saved to /var/cache/conftool/dbconfig/20240506-131104-marostegui.json
  • 13:09 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 13:07 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 13:07 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61944 and previous config saved to /var/cache/conftool/dbconfig/20240506-130712-root.json
  • 13:05 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 13:05 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
  • 13:04 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 13:03 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
  • 13:03 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 13:02 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 13:01 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'readability' for release 'main' .
  • 13:00 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 12:59 sukhe: running authdns-update for removing depooling magru geoip/*
  • 12:57 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2209.codfw.wmnet
  • 12:57 urbanecm@deploy1002: urbanecm: Continuing with sync
  • 12:57 urbanecm@deploy1002: urbanecm: Backport for iglwiki: Enable GrowthExperiments (T364130), Backport several WikimediaMessages patches (T217451 T362538 T364213 T315774 T364269) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:56 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 12:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P61943 and previous config saved to /var/cache/conftool/dbconfig/20240506-125556-marostegui.json
  • 12:54 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 12:53 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 12:52 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2205.codfw.wmnet
  • 12:52 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61942 and previous config saved to /var/cache/conftool/dbconfig/20240506-125206-root.json
  • 12:52 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 12:52 urbanecm@deploy1002: Started scap: Backport for iglwiki: Enable GrowthExperiments (T364130), Backport several WikimediaMessages patches (T217451 T362538 T364213 T315774 T364269)
  • 12:51 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 12:51 urbanecm@deploy1002: Sync cancelled.
  • 12:45 urbanecm@deploy1002: urbanecm: Backport for iglwiki: Enable GrowthExperiments (T364130), Backport several WikimediaMessages patches (T217451 T362538 T364213 T315774 T364269) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:27 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=iglwiki growthexperiments # T364130
  • 12:27 elukey@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=inference,name=eqiad
  • 12:26 filippo@cumin1002: START - Cookbook sre.dns.netbox
  • 12:26 filippo@cumin1002: START - Cookbook sre.ganeti.makevm for new host prometheus7001.magru.wmnet
  • 12:25 urbanecm@deploy1002: Started scap: Backport for iglwiki: Enable GrowthExperiments (T364130), Backport several WikimediaMessages patches (T217451 T362538 T364213 T315774 T364269)
  • 12:21 urbanecm: [urbanecm@deploy1002 ~]$ sudo /usr/local/sbin/fix-staging-perms # fixing permissions
  • 12:21 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61937 and previous config saved to /var/cache/conftool/dbconfig/20240506-122154-root.json
  • 12:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P61936 and previous config saved to /var/cache/conftool/dbconfig/20240506-121515-marostegui.json
  • 12:06 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61935 and previous config saved to /var/cache/conftool/dbconfig/20240506-120648-root.json
  • 12:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P61934 and previous config saved to /var/cache/conftool/dbconfig/20240506-120007-marostegui.json
  • 11:51 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61933 and previous config saved to /var/cache/conftool/dbconfig/20240506-115142-root.json
  • 11:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
  • 11:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T361627)', diff saved to https://phabricator.wikimedia.org/P61932 and previous config saved to /var/cache/conftool/dbconfig/20240506-114459-marostegui.json
  • 11:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
  • 11:36 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61931 and previous config saved to /var/cache/conftool/dbconfig/20240506-113636-root.json
  • 11:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2170 (T361627)', diff saved to https://phabricator.wikimedia.org/P61930 and previous config saved to /var/cache/conftool/dbconfig/20240506-113511-marostegui.json
  • 11:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 11:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 11:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T361627)', diff saved to https://phabricator.wikimedia.org/P61929 and previous config saved to /var/cache/conftool/dbconfig/20240506-113448-marostegui.json
  • 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test2002.wikimedia.org
  • 11:27 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2194.codfw.wmnet
  • 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test2002.wikimedia.org
  • 11:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P61928 and previous config saved to /var/cache/conftool/dbconfig/20240506-111940-marostegui.json
  • 11:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2024.codfw.wmnet with OS bookworm
  • 11:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P61927 and previous config saved to /var/cache/conftool/dbconfig/20240506-110433-marostegui.json
  • 11:03 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2194.codfw.wmnet
  • 10:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T361627)', diff saved to https://phabricator.wikimedia.org/P61926 and previous config saved to /var/cache/conftool/dbconfig/20240506-104925-marostegui.json
  • 10:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2024.codfw.wmnet with reason: host reimage
  • 10:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2024.codfw.wmnet with reason: host reimage
  • 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2153 (T361627)', diff saved to https://phabricator.wikimedia.org/P61925 and previous config saved to /var/cache/conftool/dbconfig/20240506-103848-marostegui.json
  • 10:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 10:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T361627)', diff saved to https://phabricator.wikimedia.org/P61924 and previous config saved to /var/cache/conftool/dbconfig/20240506-103825-marostegui.json
  • 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61923 and previous config saved to /var/cache/conftool/dbconfig/20240506-103814-root.json
  • 10:36 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2190.codfw.wmnet
  • 10:31 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1178.eqiad.wmnet with OS bookworm
  • 10:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P61922 and previous config saved to /var/cache/conftool/dbconfig/20240506-102317-marostegui.json
  • 10:23 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61921 and previous config saved to /var/cache/conftool/dbconfig/20240506-102307-root.json
  • 10:21 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2024.codfw.wmnet with OS bookworm
  • 10:19 marostegui@cumin1002: dbctl commit (dc=all): 'Give some weight to es2023', diff saved to https://phabricator.wikimedia.org/P61920 and previous config saved to /var/cache/conftool/dbconfig/20240506-101934-marostegui.json
  • 10:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2024', diff saved to https://phabricator.wikimedia.org/P61919 and previous config saved to /var/cache/conftool/dbconfig/20240506-101911-root.json
  • 10:11 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2190.codfw.wmnet
  • 10:10 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2177.codfw.wmnet
  • 10:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P61918 and previous config saved to /var/cache/conftool/dbconfig/20240506-100809-marostegui.json
  • 10:08 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61917 and previous config saved to /var/cache/conftool/dbconfig/20240506-100801-root.json
  • 10:02 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2177.codfw.wmnet
  • 09:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T361627)', diff saved to https://phabricator.wikimedia.org/P61916 and previous config saved to /var/cache/conftool/dbconfig/20240506-095302-marostegui.json
  • 09:52 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61915 and previous config saved to /var/cache/conftool/dbconfig/20240506-095255-root.json
  • 09:43 dcausse@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:43 dcausse@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2146 (T361627)', diff saved to https://phabricator.wikimedia.org/P61914 and previous config saved to /var/cache/conftool/dbconfig/20240506-094158-marostegui.json
  • 09:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2146.codfw.wmnet with reason: Maintenance
  • 09:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2146.codfw.wmnet with reason: Maintenance
  • 09:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T361627)', diff saved to https://phabricator.wikimedia.org/P61913 and previous config saved to /var/cache/conftool/dbconfig/20240506-094135-marostegui.json
  • 09:40 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:40 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:39 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:37 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61912 and previous config saved to /var/cache/conftool/dbconfig/20240506-093749-root.json
  • 09:30 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61911 and previous config saved to /var/cache/conftool/dbconfig/20240506-093047-root.json
  • 09:29 moritzm: uploaded openjdk-8 8u412-ga-1~deb10u1 to buster-wikimedia (forward port of latest Java 8 security updates)
  • 09:28 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 09:27 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
  • 09:26 jayme@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
  • 09:26 jayme@deploy1002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 09:26 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
  • 09:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P61910 and previous config saved to /var/cache/conftool/dbconfig/20240506-092627-marostegui.json
  • 09:26 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 09:26 jayme@deploy1002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 09:26 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 09:25 jayme@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
  • 09:25 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
  • 09:25 jayme@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
  • 09:24 jayme@deploy1002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 09:22 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61909 and previous config saved to /var/cache/conftool/dbconfig/20240506-092244-root.json
  • 09:22 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 09:21 jayme@deploy1002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 09:17 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2149.codfw.wmnet
  • 09:15 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61908 and previous config saved to /var/cache/conftool/dbconfig/20240506-091541-root.json
  • 09:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P61907 and previous config saved to /var/cache/conftool/dbconfig/20240506-091120-marostegui.json
  • 09:11 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1178.eqiad.wmnet with OS bookworm
  • 09:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1178', diff saved to https://phabricator.wikimedia.org/P61906 and previous config saved to /var/cache/conftool/dbconfig/20240506-090759-root.json
  • 09:07 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61905 and previous config saved to /var/cache/conftool/dbconfig/20240506-090736-root.json
  • 09:00 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61904 and previous config saved to /var/cache/conftool/dbconfig/20240506-090035-root.json
  • 08:57 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2149.codfw.wmnet
  • 08:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T361627)', diff saved to https://phabricator.wikimedia.org/P61903 and previous config saved to /var/cache/conftool/dbconfig/20240506-085612-marostegui.json
  • 08:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1025.eqiad.wmnet with OS bookworm
  • 08:45 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61902 and previous config saved to /var/cache/conftool/dbconfig/20240506-084530-root.json
  • 08:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2145 (T361627)', diff saved to https://phabricator.wikimedia.org/P61901 and previous config saved to /var/cache/conftool/dbconfig/20240506-084422-marostegui.json
  • 08:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2145.codfw.wmnet with reason: Maintenance
  • 08:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2145.codfw.wmnet with reason: Maintenance
  • 08:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1193 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61900 and previous config saved to /var/cache/conftool/dbconfig/20240506-083657-root.json
  • 08:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 08:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 08:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T361627)', diff saved to https://phabricator.wikimedia.org/P61899 and previous config saved to /var/cache/conftool/dbconfig/20240506-083507-marostegui.json
  • 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2127.codfw.wmnet
  • 08:30 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61898 and previous config saved to /var/cache/conftool/dbconfig/20240506-083024-root.json
  • 08:24 marostegui@cumin1002: dbctl commit (dc=all): 'db2165 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61897 and previous config saved to /var/cache/conftool/dbconfig/20240506-082426-root.json
  • 08:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1025.eqiad.wmnet with reason: host reimage
  • 08:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1193 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61896 and previous config saved to /var/cache/conftool/dbconfig/20240506-082151-root.json
  • 08:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1025.eqiad.wmnet with reason: host reimage
  • 08:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P61895 and previous config saved to /var/cache/conftool/dbconfig/20240506-082000-marostegui.json
  • 08:15 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61894 and previous config saved to /var/cache/conftool/dbconfig/20240506-081518-root.json
  • 08:15 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2127.codfw.wmnet
  • 08:09 marostegui@cumin1002: dbctl commit (dc=all): 'db2165 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61893 and previous config saved to /var/cache/conftool/dbconfig/20240506-080920-root.json
  • 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1193 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61892 and previous config saved to /var/cache/conftool/dbconfig/20240506-080645-root.json
  • 08:05 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1025.eqiad.wmnet with OS bookworm
  • 08:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P61891 and previous config saved to /var/cache/conftool/dbconfig/20240506-080452-marostegui.json
  • 08:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1025 T364289', diff saved to https://phabricator.wikimedia.org/P61890 and previous config saved to /var/cache/conftool/dbconfig/20240506-080423-root.json
  • 08:00 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61889 and previous config saved to /var/cache/conftool/dbconfig/20240506-080012-root.json
  • 07:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1020.eqiad.wmnet with OS bookworm
  • 07:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2165 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61888 and previous config saved to /var/cache/conftool/dbconfig/20240506-075414-root.json
  • 07:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1193 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61887 and previous config saved to /var/cache/conftool/dbconfig/20240506-075139-root.json
  • 07:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T361627)', diff saved to https://phabricator.wikimedia.org/P61886 and previous config saved to /var/cache/conftool/dbconfig/20240506-074945-marostegui.json
  • 07:39 marostegui@cumin1002: dbctl commit (dc=all): 'db2165 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61885 and previous config saved to /var/cache/conftool/dbconfig/20240506-073909-root.json
  • 07:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2130 (T361627)', diff saved to https://phabricator.wikimedia.org/P61884 and previous config saved to /var/cache/conftool/dbconfig/20240506-073826-marostegui.json
  • 07:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2130.codfw.wmnet with reason: Maintenance
  • 07:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2130.codfw.wmnet with reason: Maintenance
  • 07:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T361627)', diff saved to https://phabricator.wikimedia.org/P61883 and previous config saved to /var/cache/conftool/dbconfig/20240506-073803-marostegui.json
  • 07:37 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) webproxy on magru recursors
  • 07:37 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache webproxy on magru recursors
  • 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1193 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61882 and previous config saved to /var/cache/conftool/dbconfig/20240506-073633-root.json
  • 07:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1020.eqiad.wmnet with reason: host reimage
  • 07:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1020.eqiad.wmnet with reason: host reimage
  • 07:24 marostegui@cumin1002: dbctl commit (dc=all): 'db2165 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61881 and previous config saved to /var/cache/conftool/dbconfig/20240506-072403-root.json
  • 07:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P61880 and previous config saved to /var/cache/conftool/dbconfig/20240506-072255-marostegui.json
  • 07:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1193 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61879 and previous config saved to /var/cache/conftool/dbconfig/20240506-072127-root.json
  • 07:13 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1020.eqiad.wmnet with OS bookworm
  • 07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1020', diff saved to https://phabricator.wikimedia.org/P61878 and previous config saved to /var/cache/conftool/dbconfig/20240506-071051-root.json
  • 07:08 marostegui@cumin1002: dbctl commit (dc=all): 'db2165 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61877 and previous config saved to /var/cache/conftool/dbconfig/20240506-070857-root.json
  • 07:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P61876 and previous config saved to /var/cache/conftool/dbconfig/20240506-070748-marostegui.json
  • 07:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1193.eqiad.wmnet with OS bookworm
  • 07:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1193 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61875 and previous config saved to /var/cache/conftool/dbconfig/20240506-070621-root.json
  • 06:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2165.codfw.wmnet with OS bookworm
  • 06:53 marostegui@cumin1002: dbctl commit (dc=all): 'db2165 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61874 and previous config saved to /var/cache/conftool/dbconfig/20240506-065351-root.json
  • 06:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T361627)', diff saved to https://phabricator.wikimedia.org/P61873 and previous config saved to /var/cache/conftool/dbconfig/20240506-065239-marostegui.json
  • 06:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
  • 06:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
  • 06:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2116 (T361627)', diff saved to https://phabricator.wikimedia.org/P61872 and previous config saved to /var/cache/conftool/dbconfig/20240506-064121-marostegui.json
  • 06:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 06:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 06:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
  • 06:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
  • 06:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2102.codfw.wmnet with reason: Maintenance
  • 06:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2102.codfw.wmnet with reason: Maintenance
  • 06:30 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1193.eqiad.wmnet with OS bookworm
  • 06:28 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1193', diff saved to https://phabricator.wikimedia.org/P61871 and previous config saved to /var/cache/conftool/dbconfig/20240506-062814-root.json
  • 06:17 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
  • 06:17 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2165.codfw.wmnet with OS bookworm
  • 06:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2165 T363977', diff saved to https://phabricator.wikimedia.org/P61870 and previous config saved to /var/cache/conftool/dbconfig/20240506-061416-root.json
  • 06:13 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2161 to s8 primary T363977', diff saved to https://phabricator.wikimedia.org/P61869 and previous config saved to /var/cache/conftool/dbconfig/20240506-061311-marostegui.json
  • 06:12 marostegui: Starting s8 codfw failover from db2165 to db2161 - T363977
  • 06:07 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 05:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 33 hosts with reason: Primary switchover s8 T363977
  • 05:50 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2161 with weight 0 T363977', diff saved to https://phabricator.wikimedia.org/P61868 and previous config saved to /var/cache/conftool/dbconfig/20240506-055013-root.json
  • 05:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 33 hosts with reason: Primary switchover s8 T363977
  • 05:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 05:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2165.codfw.wmnet with reason: Maintenance

2024-05-05

  • 11:09 brennen@deploy1002: Finished deploy [phabricator/deployment@dd53761]: test deploy phab1004 for T364271 (duration: 00m 32s)
  • 11:08 brennen@deploy1002: Started deploy [phabricator/deployment@dd53761]: test deploy phab1004 for T364271
  • 11:08 brennen@deploy1002: Finished deploy [phabricator/deployment@dd53761]: test deploy phab2002 for T364271 (duration: 00m 32s)
  • 11:07 brennen@deploy1002: Started deploy [phabricator/deployment@dd53761]: test deploy phab2002 for T364271
  • 11:04 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab.wmfusercontent.org with reason: brennen is deploying things
  • 11:03 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab.wmfusercontent.org with reason: brennen is deploying things
  • 11:03 taavi@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 1:00:00 on phabricator.wikimedia.org with reason: brennen is deploying things
  • 11:03 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phabricator.wikimedia.org with reason: brennen is deploying things
  • 11:03 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab1004.eqiad.wmnet with reason: brennen is deploying things
  • 11:03 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab1004.eqiad.wmnet with reason: brennen is deploying things
  • 08:42 taavi: taavi@gerrit1003 ~ $ sudo systemctl restart apache2

2024-05-04

  • 13:41 jayme: doubled the number of eventgate-main replicas in eqiad to 16
  • 07:39 taavi@cumin1002: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
  • 07:33 taavi@cumin1002: START - Cookbook sre.wikireplicas.update-views
  • 03:07 denisse: Restarting `status curator_actions_cluster_wide.service` to log with DEBUG level on logstash2026 - T364190
  • 03:06 denisse: Enable log level DEBUG for curator on logstash2026 - T364190
  • 01:33 bblack@cumin1002: conftool action : set/weight=100; selector: name=dns7.*
  • 01:24 bblack: lvs7001 - restart pybal
  • 01:23 bblack: lvs7003 - restart pybal

2024-05-03

  • 21:38 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on wdqs2023.codfw.wmnet with reason: T362920
  • 21:38 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 6 days, 0:00:00 on wdqs2023.codfw.wmnet with reason: T362920
  • 21:27 ryankemper: T362920 [wdqs] Depooled `wdqs2023` in preparation to switch it to a graph split host
  • 19:02 sukhe: cleaning up stale confd template files for magru related reimaging
  • 18:44 brett@cumin2002: conftool action : set/pooled=yes; selector: name=ncredir7002.magru.wmnet,service=nginx
  • 18:43 brett@cumin2002: conftool action : set/pooled=yes; selector: name=ncredir7001.magru.wmnet,service=nginx
  • 18:38 brett@cumin2002: conftool action : set/pooled=no; selector: name=ncredir7001.magru.wmnet,service=nginx
  • 18:38 brett@cumin2002: conftool action : set/pooled=no; selector: name=ncredir7002.magru.wmnet,service=nginx
  • 18:29 brett@cumin2002: conftool action : set/pooled=yes; selector: name=ncredir7002.magru.wmnet,service=nginx
  • 18:29 brett@cumin2002: conftool action : set/weight=1; selector: name=ncredir7002.magru.wmnet,service=nginx
  • 18:29 brett@cumin2002: conftool action : set/pooled=yes; selector: name=ncredir7001.magru.wmnet,service=nginx
  • 18:28 brett@cumin2002: conftool action : set/weight=1; selector: name=ncredir7001.magru.wmnet,service=nginx
  • 17:45 dcausse: repooling wdqs1012
  • 17:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 17:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 17:14 brett@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir7002.magru.wmnet
  • 17:14 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir7002.magru.wmnet with OS bookworm
  • 17:13 denisse: Run `sudo mdadm --add /dev/md1 /dev/sdg` on `centrallog1002` - T363660
  • 17:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 17:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 17:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T361627)', diff saved to https://phabricator.wikimedia.org/P61862 and previous config saved to /var/cache/conftool/dbconfig/20240503-170054-marostegui.json
  • 16:47 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir7002.magru.wmnet with reason: host reimage
  • 16:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P61860 and previous config saved to /var/cache/conftool/dbconfig/20240503-164546-marostegui.json
  • 16:44 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir7002.magru.wmnet with reason: host reimage
  • 16:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P61859 and previous config saved to /var/cache/conftool/dbconfig/20240503-163039-marostegui.json
  • 16:18 brett@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir7002.magru.wmnet with OS bookworm
  • 16:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T361627)', diff saved to https://phabricator.wikimedia.org/P61858 and previous config saved to /var/cache/conftool/dbconfig/20240503-161531-marostegui.json
  • 15:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2195 (T361627)', diff saved to https://phabricator.wikimedia.org/P61857 and previous config saved to /var/cache/conftool/dbconfig/20240503-155432-marostegui.json
  • 15:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2195.codfw.wmnet with reason: Maintenance
  • 15:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2195.codfw.wmnet with reason: Maintenance
  • 15:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T361627)', diff saved to https://phabricator.wikimedia.org/P61856 and previous config saved to /var/cache/conftool/dbconfig/20240503-155409-marostegui.json
  • 15:42 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir7002.magru.wmnet - brett@cumin2002"
  • 15:41 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir7002.magru.wmnet - brett@cumin2002"
  • 15:40 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir7002.magru.wmnet on all recursors
  • 15:40 brett@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir7002.magru.wmnet on all recursors
  • 15:40 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:40 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir7002.magru.wmnet - brett@cumin2002"
  • 15:39 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir7002.magru.wmnet - brett@cumin2002"
  • 15:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P61855 and previous config saved to /var/cache/conftool/dbconfig/20240503-153901-marostegui.json
  • 15:34 brett@cumin2002: START - Cookbook sre.dns.netbox
  • 15:34 brett@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir7002.magru.wmnet
  • 15:26 dcausse: depooled wdqs1012 (lagged)
  • 15:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P61854 and previous config saved to /var/cache/conftool/dbconfig/20240503-152354-marostegui.json
  • 15:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T361627)', diff saved to https://phabricator.wikimedia.org/P61853 and previous config saved to /var/cache/conftool/dbconfig/20240503-150846-marostegui.json
  • 14:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add install7001 - jmm@cumin2002"
  • 14:44 jnuche@deploy1002: Finished deploy [releng/jenkins-deploy@5d3a06d] (releasing): update plugins to address vulnerabilities (duration: 00m 39s)
  • 14:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2181 (T361627)', diff saved to https://phabricator.wikimedia.org/P61852 and previous config saved to /var/cache/conftool/dbconfig/20240503-144419-marostegui.json
  • 14:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 14:44 jnuche@deploy1002: Started deploy [releng/jenkins-deploy@5d3a06d] (releasing): update plugins to address vulnerabilities
  • 14:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 14:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T361627)', diff saved to https://phabricator.wikimedia.org/P61851 and previous config saved to /var/cache/conftool/dbconfig/20240503-144356-marostegui.json
  • 14:39 jnuche@deploy1002: Finished deploy [releng/jenkins-deploy@5d3a06d] (releasing): test plugin update in secondary host (duration: 00m 22s)
  • 14:39 jnuche@deploy1002: Started deploy [releng/jenkins-deploy@5d3a06d] (releasing): test plugin update in secondary host
  • 14:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P61850 and previous config saved to /var/cache/conftool/dbconfig/20240503-142848-marostegui.json
  • 14:26 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add install7001 - jmm@cumin2002"
  • 14:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host install7001.wikimedia.org
  • 14:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host install7001.wikimedia.org with OS bookworm
  • 14:16 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:15 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 14:14 sukhe: sudo homer asw*magru* commit "add durum and doh hosts in magru"
  • 14:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P61849 and previous config saved to /var/cache/conftool/dbconfig/20240503-141341-marostegui.json
  • 14:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on install7001.wikimedia.org with reason: host reimage
  • 14:08 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on install7001.wikimedia.org with reason: host reimage
  • 14:07 herron: alert1001:~# systemctl restart prometheus-alertmanager.service
  • 13:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T361627)', diff saved to https://phabricator.wikimedia.org/P61848 and previous config saved to /var/cache/conftool/dbconfig/20240503-135834-marostegui.json
  • 13:43 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install7001.wikimedia.org with OS bookworm
  • 13:36 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2167 (T361627)', diff saved to https://phabricator.wikimedia.org/P61847 and previous config saved to /var/cache/conftool/dbconfig/20240503-133601-marostegui.json
  • 13:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 13:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 13:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T361627)', diff saved to https://phabricator.wikimedia.org/P61846 and previous config saved to /var/cache/conftool/dbconfig/20240503-133538-marostegui.json
  • 13:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install7001.wikimedia.org - jmm@cumin2002"
  • 13:29 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install7001.wikimedia.org - jmm@cumin2002"
  • 13:28 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install7001.wikimedia.org on all recursors
  • 13:28 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install7001.wikimedia.org on all recursors
  • 13:28 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install7001.wikimedia.org - jmm@cumin2002"
  • 13:26 elukey: restart karma on alert1001 to verify if probe down alerts shown are stale
  • 13:26 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install7001.wikimedia.org - jmm@cumin2002"
  • 13:23 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:22 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 13:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P61845 and previous config saved to /var/cache/conftool/dbconfig/20240503-132030-marostegui.json
  • 13:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P61844 and previous config saved to /var/cache/conftool/dbconfig/20240503-130523-marostegui.json
  • 13:04 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:03 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 12:51 cmooney@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 12:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T361627)', diff saved to https://phabricator.wikimedia.org/P61843 and previous config saved to /var/cache/conftool/dbconfig/20240503-125015-marostegui.json
  • 12:47 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 12:26 marostegui@cumin1002: dbctl commit (dc=all): 'db1203 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61841 and previous config saved to /var/cache/conftool/dbconfig/20240503-122659-root.json
  • 12:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2166 (T361627)', diff saved to https://phabricator.wikimedia.org/P61840 and previous config saved to /var/cache/conftool/dbconfig/20240503-122510-marostegui.json
  • 12:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 12:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 12:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T361627)', diff saved to https://phabricator.wikimedia.org/P61839 and previous config saved to /var/cache/conftool/dbconfig/20240503-122446-marostegui.json
  • 12:11 marostegui@cumin1002: dbctl commit (dc=all): 'db1203 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61838 and previous config saved to /var/cache/conftool/dbconfig/20240503-121153-root.json
  • 12:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P61837 and previous config saved to /var/cache/conftool/dbconfig/20240503-120938-marostegui.json
  • 12:06 topranks: removing entries for lsw1-a1-codfw switch and private1-a1-codfw vlan from puppet T364097
  • 12:02 sukhe@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh7002.wikimedia.org
  • 12:02 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh7002.wikimedia.org with OS bookworm
  • 12:01 moritzm: uploaded wmf-sre-laptop 0.5.10 to apt.wikimedia.org
  • 11:57 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:57 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove lsw1-a1-codfw physical link dns - cmooney@cumin1002"
  • 11:56 marostegui@cumin1002: dbctl commit (dc=all): 'db1203 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61835 and previous config saved to /var/cache/conftool/dbconfig/20240503-115647-root.json
  • 11:55 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove lsw1-a1-codfw physical link dns - cmooney@cumin1002"
  • 11:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P61834 and previous config saved to /var/cache/conftool/dbconfig/20240503-115431-marostegui.json
  • 11:53 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 11:45 sukhe@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum7002.magru.wmnet
  • 11:45 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum7002.magru.wmnet with OS bookworm
  • 11:44 topranks: Removing connections from ssw1-a1-codfw and ssw1-a8-codfw to lsw1-a1-codfw T364097
  • 11:41 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh7002.wikimedia.org with reason: host reimage
  • 11:41 marostegui@cumin1002: dbctl commit (dc=all): 'db1203 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61833 and previous config saved to /var/cache/conftool/dbconfig/20240503-114141-root.json
  • 11:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T361627)', diff saved to https://phabricator.wikimedia.org/P61832 and previous config saved to /var/cache/conftool/dbconfig/20240503-113924-marostegui.json
  • 11:38 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh7002.wikimedia.org with reason: host reimage
  • 11:27 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum7002.magru.wmnet with reason: host reimage
  • 11:26 marostegui@cumin1002: dbctl commit (dc=all): 'db1203 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61831 and previous config saved to /var/cache/conftool/dbconfig/20240503-112635-root.json
  • 11:23 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum7002.magru.wmnet with reason: host reimage
  • 11:19 taavi@cumin1002: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
  • 11:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM durum7001.magru.wmnet
  • 11:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1203.eqiad.wmnet with OS bookworm
  • 11:16 taavi@cumin1002: START - Cookbook sre.wikireplicas.update-views
  • 11:16 taavi@cumin1002: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=93)
  • 11:15 taavi@cumin1002: START - Cookbook sre.wikireplicas.update-views
  • 11:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2164 (T361627)', diff saved to https://phabricator.wikimedia.org/P61830 and previous config saved to /var/cache/conftool/dbconfig/20240503-111415-marostegui.json
  • 11:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 11:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 11:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2164.codfw.wmnet with reason: Maintenance
  • 11:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2164.codfw.wmnet with reason: Maintenance
  • 11:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T361627)', diff saved to https://phabricator.wikimedia.org/P61829 and previous config saved to /var/cache/conftool/dbconfig/20240503-111337-marostegui.json
  • 11:12 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM durum7001.magru.wmnet
  • 11:11 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host doh7002.wikimedia.org with OS bookworm
  • 11:11 marostegui@cumin1002: dbctl commit (dc=all): 'db1203 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61828 and previous config saved to /var/cache/conftool/dbconfig/20240503-111129-root.json
  • 11:11 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh7002.wikimedia.org - sukhe@cumin1002"
  • 11:10 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh7002.wikimedia.org - sukhe@cumin1002"
  • 11:09 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh7002.wikimedia.org on all recursors
  • 11:09 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache doh7002.wikimedia.org on all recursors
  • 11:09 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:09 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh7002.wikimedia.org - sukhe@cumin1002"
  • 11:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM doh7001.wikimedia.org
  • 11:08 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh7002.wikimedia.org - sukhe@cumin1002"
  • 11:06 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 11:06 sukhe@cumin1002: START - Cookbook sre.ganeti.makevm for new host doh7002.wikimedia.org
  • 11:05 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM doh7001.wikimedia.org
  • 11:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ncredir7001.magru.wmnet
  • 11:00 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ncredir7001.magru.wmnet
  • 10:58 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host durum7002.magru.wmnet with OS bookworm
  • 10:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240503-105824-marostegui.json
  • 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM netflow7001.magru.wmnet
  • 10:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
  • 10:54 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum7002.magru.wmnet - sukhe@cumin1002"
  • 10:53 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum7002.magru.wmnet - sukhe@cumin1002"
  • 10:53 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum7002.magru.wmnet on all recursors
  • 10:53 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache durum7002.magru.wmnet on all recursors
  • 10:53 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:53 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum7002.magru.wmnet - sukhe@cumin1002"
  • 10:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
  • 10:52 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum7002.magru.wmnet - sukhe@cumin1002"
  • 10:51 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM netflow7001.magru.wmnet
  • 10:50 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 10:50 sukhe@cumin1002: START - Cookbook sre.ganeti.makevm for new host durum7002.magru.wmnet
  • 10:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P61827 and previous config saved to /var/cache/conftool/dbconfig/20240503-104317-marostegui.json
  • 10:39 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1203.eqiad.wmnet with OS bookworm
  • 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1203', diff saved to https://phabricator.wikimedia.org/P61826 and previous config saved to /var/cache/conftool/dbconfig/20240503-103814-root.json
  • 10:33 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add bast7001 - jmm@cumin2002 - T364016"
  • 10:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add bast7001 - jmm@cumin2002 - T364016"
  • 10:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T361627)', diff saved to https://phabricator.wikimedia.org/P61825 and previous config saved to /var/cache/conftool/dbconfig/20240503-102809-marostegui.json
  • 10:27 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on lsw1-a1-codfw,lsw1-a1-codfw IPv6,lsw1-a1-codfw.mgmt with reason: device being decommed and renamed, downtiming as a precaution first
  • 10:27 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on lsw1-a1-codfw,lsw1-a1-codfw IPv6,lsw1-a1-codfw.mgmt with reason: device being decommed and renamed, downtiming as a precaution first
  • 10:15 moritzm: installing Java 17 security updates on idp-test
  • 10:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2163 (T361627)', diff saved to https://phabricator.wikimedia.org/P61823 and previous config saved to /var/cache/conftool/dbconfig/20240503-100335-marostegui.json
  • 10:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2163.codfw.wmnet with reason: Maintenance
  • 10:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2163.codfw.wmnet with reason: Maintenance
  • 10:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T361627)', diff saved to https://phabricator.wikimedia.org/P61822 and previous config saved to /var/cache/conftool/dbconfig/20240503-100313-marostegui.json
  • 09:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P61821 and previous config saved to /var/cache/conftool/dbconfig/20240503-094805-marostegui.json
  • 09:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P61820 and previous config saved to /var/cache/conftool/dbconfig/20240503-093257-marostegui.json
  • 09:26 pfischer@deploy1002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T361627)', diff saved to https://phabricator.wikimedia.org/P61818 and previous config saved to /var/cache/conftool/dbconfig/20240503-091750-marostegui.json
  • 09:11 pfischer@deploy1002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2162 (T361627)', diff saved to https://phabricator.wikimedia.org/P61817 and previous config saved to /var/cache/conftool/dbconfig/20240503-085234-marostegui.json
  • 08:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2162.codfw.wmnet with reason: Maintenance
  • 08:52 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host bast7001.wikimedia.org
  • 08:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast7001.wikimedia.org with OS bookworm
  • 08:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2162.codfw.wmnet with reason: Maintenance
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T361627)', diff saved to https://phabricator.wikimedia.org/P61816 and previous config saved to /var/cache/conftool/dbconfig/20240503-085211-marostegui.json
  • 08:48 XioNoX: restart turnilo
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P61815 and previous config saved to /var/cache/conftool/dbconfig/20240503-083703-marostegui.json
  • 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast7001.wikimedia.org with reason: host reimage
  • 08:33 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on bast7001.wikimedia.org with reason: host reimage
  • 08:30 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
  • 08:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P61814 and previous config saved to /var/cache/conftool/dbconfig/20240503-082156-marostegui.json
  • 08:20 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 08:11 moritzm: installing emacs security updates
  • 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T361627)', diff saved to https://phabricator.wikimedia.org/P61813 and previous config saved to /var/cache/conftool/dbconfig/20240503-080649-marostegui.json
  • 08:05 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host bast7001.wikimedia.org with OS bookworm
  • 08:00 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM bast7001.wikimedia.org - jmm@cumin2002"
  • 08:00 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM bast7001.wikimedia.org - jmm@cumin2002"
  • 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) bast7001.wikimedia.org on all recursors
  • 07:59 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache bast7001.wikimedia.org on all recursors
  • 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM bast7001.wikimedia.org - jmm@cumin2002"
  • 07:57 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM bast7001.wikimedia.org - jmm@cumin2002"
  • 07:53 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 07:53 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host bast7001.wikimedia.org
  • 07:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2161 (T361627)', diff saved to https://phabricator.wikimedia.org/P61812 and previous config saved to /var/cache/conftool/dbconfig/20240503-074135-marostegui.json
  • 07:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2161.codfw.wmnet with reason: Maintenance
  • 07:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2161.codfw.wmnet with reason: Maintenance
  • 07:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T361627)', diff saved to https://phabricator.wikimedia.org/P61811 and previous config saved to /var/cache/conftool/dbconfig/20240503-074112-marostegui.json
  • 07:33 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti7004.magru.wmnet to cluster magru02 and group B4
  • 07:32 zabe: zabe@mwmaint1002:~$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=metawiki --logwiki=metawiki 'Arnadh2011' 'User435211' # T363654
  • 07:32 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti7004.magru.wmnet to cluster magru02 and group B4
  • 07:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P61810 and previous config saved to /var/cache/conftool/dbconfig/20240503-072604-marostegui.json
  • 07:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1214 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61809 and previous config saved to /var/cache/conftool/dbconfig/20240503-071853-root.json
  • 07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P61808 and previous config saved to /var/cache/conftool/dbconfig/20240503-071057-marostegui.json
  • 07:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1214 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61807 and previous config saved to /var/cache/conftool/dbconfig/20240503-070347-root.json
  • 06:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T361627)', diff saved to https://phabricator.wikimedia.org/P61806 and previous config saved to /var/cache/conftool/dbconfig/20240503-065547-marostegui.json
  • 06:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1214 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61805 and previous config saved to /var/cache/conftool/dbconfig/20240503-064842-root.json
  • 06:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1214 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61804 and previous config saved to /var/cache/conftool/dbconfig/20240503-063336-root.json
  • 06:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2154 (T361627)', diff saved to https://phabricator.wikimedia.org/P61803 and previous config saved to /var/cache/conftool/dbconfig/20240503-063048-marostegui.json
  • 06:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2154.codfw.wmnet with reason: Maintenance
  • 06:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2154.codfw.wmnet with reason: Maintenance
  • 06:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T361627)', diff saved to https://phabricator.wikimedia.org/P61802 and previous config saved to /var/cache/conftool/dbconfig/20240503-063025-marostegui.json
  • 06:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1214 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61801 and previous config saved to /var/cache/conftool/dbconfig/20240503-061830-root.json
  • 06:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P61800 and previous config saved to /var/cache/conftool/dbconfig/20240503-061517-marostegui.json
  • 06:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1214 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61799 and previous config saved to /var/cache/conftool/dbconfig/20240503-060324-root.json
  • 06:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P61798 and previous config saved to /var/cache/conftool/dbconfig/20240503-060010-marostegui.json
  • 05:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1214 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61797 and previous config saved to /var/cache/conftool/dbconfig/20240503-054818-root.json
  • 05:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1214.eqiad.wmnet with OS bookworm
  • 05:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T361627)', diff saved to https://phabricator.wikimedia.org/P61796 and previous config saved to /var/cache/conftool/dbconfig/20240503-054502-marostegui.json
  • 05:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
  • 05:24 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2152 (T361627)', diff saved to https://phabricator.wikimedia.org/P61795 and previous config saved to /var/cache/conftool/dbconfig/20240503-052430-marostegui.json
  • 05:24 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2152.codfw.wmnet with reason: Maintenance
  • 05:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
  • 05:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2152.codfw.wmnet with reason: Maintenance
  • 05:11 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1214.eqiad.wmnet with OS bookworm
  • 05:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1214', diff saved to https://phabricator.wikimedia.org/P61794 and previous config saved to /var/cache/conftool/dbconfig/20240503-050947-root.json
  • 04:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 04:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 04:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 04:46 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 01:04 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-eqiad: Apply updated JDK 8 - eevans@cumin1002
  • 01:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 01:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 01:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T361627)', diff saved to https://phabricator.wikimedia.org/P61793 and previous config saved to /var/cache/conftool/dbconfig/20240503-010330-marostegui.json
  • 00:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P61792 and previous config saved to /var/cache/conftool/dbconfig/20240503-004821-marostegui.json
  • 00:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P61791 and previous config saved to /var/cache/conftool/dbconfig/20240503-003313-marostegui.json
  • 00:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T361627)', diff saved to https://phabricator.wikimedia.org/P61790 and previous config saved to /var/cache/conftool/dbconfig/20240503-001805-marostegui.json
  • 00:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1249 (T361627)', diff saved to https://phabricator.wikimedia.org/P61789 and previous config saved to /var/cache/conftool/dbconfig/20240503-000614-marostegui.json
  • 00:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1249.eqiad.wmnet with reason: Maintenance
  • 00:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1249.eqiad.wmnet with reason: Maintenance
  • 00:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T361627)', diff saved to https://phabricator.wikimedia.org/P61788 and previous config saved to /var/cache/conftool/dbconfig/20240503-000602-marostegui.json

2024-05-02

  • 23:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P61787 and previous config saved to /var/cache/conftool/dbconfig/20240502-235053-marostegui.json
  • 23:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P61786 and previous config saved to /var/cache/conftool/dbconfig/20240502-233545-marostegui.json
  • 23:33 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-eqiad: Apply updated JDK 8 - eevans@cumin1002
  • 23:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T361627)', diff saved to https://phabricator.wikimedia.org/P61785 and previous config saved to /var/cache/conftool/dbconfig/20240502-232037-marostegui.json
  • 22:44 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-codfw: Apply updated JDK 8 - eevans@cumin1002
  • 22:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1248 (T361627)', diff saved to https://phabricator.wikimedia.org/P61784 and previous config saved to /var/cache/conftool/dbconfig/20240502-224227-marostegui.json
  • 22:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1248.eqiad.wmnet with reason: Maintenance
  • 22:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1248.eqiad.wmnet with reason: Maintenance
  • 22:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T361627)', diff saved to https://phabricator.wikimedia.org/P61783 and previous config saved to /var/cache/conftool/dbconfig/20240502-224204-marostegui.json
  • 22:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P61782 and previous config saved to /var/cache/conftool/dbconfig/20240502-222656-marostegui.json
  • 22:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P61781 and previous config saved to /var/cache/conftool/dbconfig/20240502-221149-marostegui.json
  • 21:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T361627)', diff saved to https://phabricator.wikimedia.org/P61780 and previous config saved to /var/cache/conftool/dbconfig/20240502-215641-marostegui.json
  • 21:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir7001.magru.wmnet with OS bookworm
  • 21:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1247 (T361627)', diff saved to https://phabricator.wikimedia.org/P61779 and previous config saved to /var/cache/conftool/dbconfig/20240502-214435-marostegui.json
  • 21:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1247.eqiad.wmnet with reason: Maintenance
  • 21:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1247.eqiad.wmnet with reason: Maintenance
  • 21:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 21:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 21:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T361627)', diff saved to https://phabricator.wikimedia.org/P61778 and previous config saved to /var/cache/conftool/dbconfig/20240502-213631-marostegui.json
  • 21:22 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir7001.magru.wmnet with reason: host reimage
  • 21:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P61777 and previous config saved to /var/cache/conftool/dbconfig/20240502-212123-marostegui.json
  • 21:19 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir7001.magru.wmnet with reason: host reimage
  • 21:12 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-codfw: Apply updated JDK 8 - eevans@cumin1002
  • 21:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P61776 and previous config saved to /var/cache/conftool/dbconfig/20240502-210613-marostegui.json
  • 20:53 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-eqiad: Apply updated JDK 8 - eevans@cumin1002
  • 20:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T361627)', diff saved to https://phabricator.wikimedia.org/P61775 and previous config saved to /var/cache/conftool/dbconfig/20240502-205105-marostegui.json
  • 20:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1244 (T361627)', diff saved to https://phabricator.wikimedia.org/P61774 and previous config saved to /var/cache/conftool/dbconfig/20240502-204208-marostegui.json
  • 20:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1244.eqiad.wmnet with reason: Maintenance
  • 20:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1244.eqiad.wmnet with reason: Maintenance
  • 20:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T361627)', diff saved to https://phabricator.wikimedia.org/P61773 and previous config saved to /var/cache/conftool/dbconfig/20240502-204146-marostegui.json
  • 20:41 brett@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir7001.magru.wmnet with OS bookworm
  • 20:32 jdrewniak@deploy1002: Sync cancelled.
  • 20:30 jdrewniak@deploy1002: jdrewniak: Backport for Revert "Deploy Vector appearance menu and increased font-size to plwiki" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P61772 and previous config saved to /var/cache/conftool/dbconfig/20240502-202638-marostegui.json
  • 20:25 jdrewniak@deploy1002: Started scap: Backport for Revert "Deploy Vector appearance menu and increased font-size to plwiki"
  • 20:21 jdrewniak@deploy1002: Sync cancelled.
  • 20:14 jdrewniak@deploy1002: bwang and jdrewniak: Backport for Update wgVectorClientPrefs to wgVectorAppearance (T362808), Deploy Vector appearance menu and increased font-size to plwiki (T362147) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P61771 and previous config saved to /var/cache/conftool/dbconfig/20240502-201131-marostegui.json
  • 20:09 jdrewniak@deploy1002: Started scap: Backport for Update wgVectorClientPrefs to wgVectorAppearance (T362808), Deploy Vector appearance menu and increased font-size to plwiki (T362147)
  • 20:04 cdanis@deploy1002: Finished scap: Backport for probenet: add magru measurement endpoint (T362902) (duration: 18m 19s)
  • 19:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T361627)', diff saved to https://phabricator.wikimedia.org/P61770 and previous config saved to /var/cache/conftool/dbconfig/20240502-195623-marostegui.json
  • 19:50 cdanis@deploy1002: cdanis: Continuing with sync
  • 19:50 cdanis@deploy1002: cdanis: Backport for probenet: add magru measurement endpoint (T362902) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:49 brett@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host ncredir7001.magru.wmnet
  • 19:49 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ncredir7001.magru.wmnet with OS bookworm
  • 19:45 cdanis@deploy1002: Started scap: Backport for probenet: add magru measurement endpoint (T362902)
  • 19:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1243 (T361627)', diff saved to https://phabricator.wikimedia.org/P61769 and previous config saved to /var/cache/conftool/dbconfig/20240502-194513-marostegui.json
  • 19:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1243.eqiad.wmnet with reason: Maintenance
  • 19:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1243.eqiad.wmnet with reason: Maintenance
  • 19:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T361627)', diff saved to https://phabricator.wikimedia.org/P61768 and previous config saved to /var/cache/conftool/dbconfig/20240502-194450-marostegui.json
  • 19:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1181 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P61767 and previous config saved to /var/cache/conftool/dbconfig/20240502-194127-ladsgroup.json
  • 19:36 sukhe@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh7001.wikimedia.org
  • 19:36 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh7001.wikimedia.org with OS bookworm
  • 19:33 amastilovic@deploy1002: Finished deploy [airflow-dags/analytics@4edc35c]: (no justification provided) (duration: 00m 38s)
  • 19:32 amastilovic@deploy1002: Started deploy [airflow-dags/analytics@4edc35c]: (no justification provided)
  • 19:31 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
  • 19:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P61766 and previous config saved to /var/cache/conftool/dbconfig/20240502-192942-marostegui.json
  • 19:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1181 (re)pooling @ 75%: Maint over', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20240502-192621-ladsgroup.json
  • 19:21 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 19:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P61765 and previous config saved to /var/cache/conftool/dbconfig/20240502-191434-marostegui.json
  • 19:11 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh7001.wikimedia.org with reason: host reimage
  • 19:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1181 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P61764 and previous config saved to /var/cache/conftool/dbconfig/20240502-191115-ladsgroup.json
  • 19:08 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh7001.wikimedia.org with reason: host reimage
  • 18:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T361627)', diff saved to https://phabricator.wikimedia.org/P61763 and previous config saved to /var/cache/conftool/dbconfig/20240502-185926-marostegui.json
  • 18:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1181 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P61762 and previous config saved to /var/cache/conftool/dbconfig/20240502-185609-ladsgroup.json
  • 18:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1242 (T361627)', diff saved to https://phabricator.wikimedia.org/P61761 and previous config saved to /var/cache/conftool/dbconfig/20240502-184710-marostegui.json
  • 18:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1242.eqiad.wmnet with reason: Maintenance
  • 18:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1242.eqiad.wmnet with reason: Maintenance
  • 18:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T361627)', diff saved to https://phabricator.wikimedia.org/P61760 and previous config saved to /var/cache/conftool/dbconfig/20240502-184658-marostegui.json
  • 18:41 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host doh7001.wikimedia.org with OS bookworm
  • 18:40 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh7001.wikimedia.org - sukhe@cumin1002"
  • 18:35 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Apply updated JDK 8 - eevans@cumin1002
  • 18:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P61759 and previous config saved to /var/cache/conftool/dbconfig/20240502-183151-marostegui.json
  • 18:24 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh7001.wikimedia.org - sukhe@cumin1002"
  • 18:23 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh7001.wikimedia.org on all recursors
  • 18:23 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache doh7001.wikimedia.org on all recursors
  • 18:23 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:23 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh7001.wikimedia.org - sukhe@cumin1002"
  • 18:22 sukhe: sudo cumin -b1 -s900 "A:dnsbox" "systemctl restart ntp.service"
  • 18:22 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh7001.wikimedia.org - sukhe@cumin1002"
  • 18:20 brett@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir7001.magru.wmnet with OS bookworm
  • 18:19 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir7001.magru.wmnet - brett@cumin2002"
  • 18:18 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir7001.magru.wmnet - brett@cumin2002"
  • 18:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir7001.magru.wmnet on all recursors
  • 18:18 brett@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir7001.magru.wmnet on all recursors
  • 18:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:18 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir7001.magru.wmnet - brett@cumin2002"
  • 18:17 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir7001.magru.wmnet - brett@cumin2002"
  • 18:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P61758 and previous config saved to /var/cache/conftool/dbconfig/20240502-181643-marostegui.json
  • 18:11 sukhe: magru: setting weights on cp servers and pooling
  • 18:10 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 18:10 sukhe@cumin1002: START - Cookbook sre.ganeti.makevm for new host doh7001.wikimedia.org
  • 18:09 sukhe@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host doh7001.wikimedia.org
  • 18:09 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh7001.wikimedia.org on all recursors
  • 18:09 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache doh7001.wikimedia.org on all recursors
  • 18:09 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:09 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh7001.wikimedia.org - sukhe@cumin1002"
  • 18:08 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh7001.wikimedia.org - sukhe@cumin1002"
  • 18:05 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: Apply updated JDK 8 - eevans@cumin1002
  • 18:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T361627)', diff saved to https://phabricator.wikimedia.org/P61756 and previous config saved to /var/cache/conftool/dbconfig/20240502-180136-marostegui.json
  • 17:58 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 17:55 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 17:55 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh7001.wikimedia.org on all recursors
  • 17:55 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache doh7001.wikimedia.org on all recursors
  • 17:55 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:53 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 17:53 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 17:52 sukhe@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 17:50 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 17:49 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1241 (T361627)', diff saved to https://phabricator.wikimedia.org/P61755 and previous config saved to /var/cache/conftool/dbconfig/20240502-174920-marostegui.json
  • 17:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1241.eqiad.wmnet with reason: Maintenance
  • 17:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1241.eqiad.wmnet with reason: Maintenance
  • 17:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T361627)', diff saved to https://phabricator.wikimedia.org/P61754 and previous config saved to /var/cache/conftool/dbconfig/20240502-174856-marostegui.json
  • 17:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P61753 and previous config saved to /var/cache/conftool/dbconfig/20240502-173349-marostegui.json
  • 17:24 brett@cumin2002: START - Cookbook sre.dns.netbox
  • 17:24 brett@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir7001.magru.wmnet
  • 17:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P61752 and previous config saved to /var/cache/conftool/dbconfig/20240502-171840-marostegui.json
  • 17:15 sfaci@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 17:15 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 17:05 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 17:05 sukhe@cumin1002: START - Cookbook sre.ganeti.makevm for new host doh7001.wikimedia.org
  • 17:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T361627)', diff saved to https://phabricator.wikimedia.org/P61751 and previous config saved to /var/cache/conftool/dbconfig/20240502-170332-marostegui.json
  • 16:53 sfaci@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:52 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1221 (T361627)', diff saved to https://phabricator.wikimedia.org/P61750 and previous config saved to /var/cache/conftool/dbconfig/20240502-165211-marostegui.json
  • 16:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 16:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 16:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1221.eqiad.wmnet with reason: Maintenance
  • 16:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1221.eqiad.wmnet with reason: Maintenance
  • 16:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T361627)', diff saved to https://phabricator.wikimedia.org/P61749 and previous config saved to /var/cache/conftool/dbconfig/20240502-165129-marostegui.json
  • 16:40 sukhe@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum7001.magru.wmnet
  • 16:40 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum7001.magru.wmnet with OS bookworm
  • 16:39 sfaci@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:38 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P61748 and previous config saved to /var/cache/conftool/dbconfig/20240502-163622-marostegui.json
  • 16:21 amastilovic@deploy1002: Finished deploy [airflow-dags/analytics@7513bfa]: (no justification provided) (duration: 00m 44s)
  • 16:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P61747 and previous config saved to /var/cache/conftool/dbconfig/20240502-162114-marostegui.json
  • 16:20 amastilovic@deploy1002: Started deploy [airflow-dags/analytics@7513bfa]: (no justification provided)
  • 16:16 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum7001.magru.wmnet with reason: host reimage
  • 16:15 sukhe: running authdns-update once again to confirm state of dns700[12]
  • 16:14 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:14 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: force update dns7x - sukhe@cumin1002"
  • 16:13 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum7001.magru.wmnet with reason: host reimage
  • 16:12 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: force update dns7x - sukhe@cumin1002"
  • 16:12 sfaci@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:12 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:11 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 16:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T361627)', diff saved to https://phabricator.wikimedia.org/P61746 and previous config saved to /var/cache/conftool/dbconfig/20240502-160606-marostegui.json
  • 16:05 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:03 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 15:56 sukhe: running authdns-update
  • 15:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1199 (T361627)', diff saved to https://phabricator.wikimedia.org/P61744 and previous config saved to /var/cache/conftool/dbconfig/20240502-155359-marostegui.json
  • 15:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1199.eqiad.wmnet with reason: Maintenance
  • 15:53 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Apply updated JDK 8 - eevans@cumin1002
  • 15:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1199.eqiad.wmnet with reason: Maintenance
  • 15:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T361627)', diff saved to https://phabricator.wikimedia.org/P61743 and previous config saved to /var/cache/conftool/dbconfig/20240502-155336-marostegui.json
  • 15:51 moritzm: installing postgresql-15 security updates
  • 15:51 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns7002.wikimedia.org,service=(authdns-update|recdns|ntp)
  • 15:51 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns7001.wikimedia.org,service=(authdns-update|recdns|ntp)
  • 15:44 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host durum7001.magru.wmnet with OS bookworm
  • 15:43 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum7001.magru.wmnet - sukhe@cumin1002"
  • 15:43 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum7001.magru.wmnet - sukhe@cumin1002"
  • 15:42 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum7001.magru.wmnet on all recursors
  • 15:42 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache durum7001.magru.wmnet on all recursors
  • 15:42 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:42 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum7001.magru.wmnet - sukhe@cumin1002"
  • 15:41 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum7001.magru.wmnet - sukhe@cumin1002"
  • 15:39 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 15:39 sukhe@cumin1002: START - Cookbook sre.ganeti.makevm for new host durum7001.magru.wmnet
  • 15:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P61741 and previous config saved to /var/cache/conftool/dbconfig/20240502-153828-marostegui.json
  • 15:34 elukey@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching sessionstore1*: Move to PKI Truststore - elukey@cumin1002
  • 15:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host netflow7001.magru.wmnet
  • 15:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host netflow7001.magru.wmnet with OS bookworm
  • 15:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P61740 and previous config saved to /var/cache/conftool/dbconfig/20240502-152319-marostegui.json
  • 15:15 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: new VIP for ganeti/magru02 - jmm@cumin2002"
  • 15:15 elukey@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching sessionstore1*: Move to PKI Truststore - elukey@cumin1002
  • 15:14 marostegui@cumin1002: dbctl commit (dc=all): 'db1175 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61739 and previous config saved to /var/cache/conftool/dbconfig/20240502-151407-root.json
  • 15:14 marostegui@cumin1002: dbctl commit (dc=all): 'db1189 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61738 and previous config saved to /var/cache/conftool/dbconfig/20240502-151403-root.json
  • 15:13 dani@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 15:12 dani@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 15:12 dani@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 15:12 elukey@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching sessionstore200[5,6]*: Move to PKI Truststore - elukey@cumin1002
  • 15:12 dani@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 15:12 dani@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 15:11 dani@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 15:10 hnowlan: Move mw-on-k8s traffic percentage from 80% to 85%
  • 15:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T361627)', diff saved to https://phabricator.wikimedia.org/P61737 and previous config saved to /var/cache/conftool/dbconfig/20240502-150812-marostegui.json
  • 15:03 elukey@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=inference,name=codfw
  • 15:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7004.magru.wmnet
  • 15:00 elukey@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching sessionstore200[5,6]*: Move to PKI Truststore - elukey@cumin1002
  • 14:59 marostegui@cumin1002: dbctl commit (dc=all): 'db1175 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61736 and previous config saved to /var/cache/conftool/dbconfig/20240502-145901-root.json
  • 14:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1189 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61735 and previous config saved to /var/cache/conftool/dbconfig/20240502-145856-root.json
  • 14:58 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: new VIP for ganeti/magru02 - jmm@cumin2002"
  • 14:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1190 (T361627)', diff saved to https://phabricator.wikimedia.org/P61734 and previous config saved to /var/cache/conftool/dbconfig/20240502-145632-marostegui.json
  • 14:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 14:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 14:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T361627)', diff saved to https://phabricator.wikimedia.org/P61733 and previous config saved to /var/cache/conftool/dbconfig/20240502-145609-marostegui.json
  • 14:56 elukey@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching sessionstore2004*: Move to PKI Truststore - elukey@cumin1002
  • 14:55 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 14:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7004.magru.wmnet
  • 14:50 elukey@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching sessionstore2004*: Move to PKI Truststore - elukey@cumin1002
  • 14:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1175 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61732 and previous config saved to /var/cache/conftool/dbconfig/20240502-144356-root.json
  • 14:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1189 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61731 and previous config saved to /var/cache/conftool/dbconfig/20240502-144350-root.json
  • 14:43 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61730 and previous config saved to /var/cache/conftool/dbconfig/20240502-144300-root.json
  • 14:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netflow7001.magru.wmnet with reason: host reimage
  • 14:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P61729 and previous config saved to /var/cache/conftool/dbconfig/20240502-144101-marostegui.json
  • 14:38 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on netflow7001.magru.wmnet with reason: host reimage
  • 14:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1175 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61728 and previous config saved to /var/cache/conftool/dbconfig/20240502-142850-root.json
  • 14:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1189 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61727 and previous config saved to /var/cache/conftool/dbconfig/20240502-142844-root.json
  • 14:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61726 and previous config saved to /var/cache/conftool/dbconfig/20240502-142754-root.json
  • 14:26 hnowlan@deploy1002: Finished scap: (no justification provided) (duration: 03m 16s)
  • 14:26 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:26 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 14:26 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:26 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 14:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P61725 and previous config saved to /var/cache/conftool/dbconfig/20240502-142554-marostegui.json
  • 14:25 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:25 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 14:23 hnowlan@deploy1002: Started scap: (no justification provided)
  • 14:22 jnuche@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.43.0-wmf.3 refs T361397
  • 14:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1175 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61724 and previous config saved to /var/cache/conftool/dbconfig/20240502-141344-root.json
  • 14:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1189 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61723 and previous config saved to /var/cache/conftool/dbconfig/20240502-141339-root.json
  • 14:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61722 and previous config saved to /var/cache/conftool/dbconfig/20240502-141248-root.json
  • 14:12 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host netflow7001.magru.wmnet with OS bookworm
  • 14:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T361627)', diff saved to https://phabricator.wikimedia.org/P61721 and previous config saved to /var/cache/conftool/dbconfig/20240502-141046-marostegui.json
  • 14:08 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow7001.magru.wmnet - jmm@cumin2002"
  • 14:07 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow7001.magru.wmnet - jmm@cumin2002"
  • 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netflow7001.magru.wmnet on all recursors
  • 14:07 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netflow7001.magru.wmnet on all recursors
  • 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow7001.magru.wmnet - jmm@cumin2002"
  • 14:04 hnowlan@cumin1002: conftool action : set/weight=10:pooled=yes; selector: name=(mw1371.eqiad.wmnet|mw1399.eqiad.wmnet|mw1405.eqiad.wmnet|mw1409.eqiad.wmnet|mw1435.eqiad.wmnet),cluster=kubernetes,service=kubesvc
  • 14:04 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow7001.magru.wmnet - jmm@cumin2002"
  • 13:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1160 (T361627)', diff saved to https://phabricator.wikimedia.org/P61720 and previous config saved to /var/cache/conftool/dbconfig/20240502-135947-marostegui.json
  • 13:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 13:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 13:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1175 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61719 and previous config saved to /var/cache/conftool/dbconfig/20240502-135839-root.json
  • 13:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1189 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61718 and previous config saved to /var/cache/conftool/dbconfig/20240502-135833-root.json
  • 13:58 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:58 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 13:57 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61717 and previous config saved to /var/cache/conftool/dbconfig/20240502-135743-root.json
  • 13:57 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
  • 13:57 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
  • 13:56 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 13:56 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host netflow7001.magru.wmnet
  • 13:54 hnowlan: running homer 'cr*eqiad*' commit for new kubernetes workers
  • 13:53 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti7003.magru.wmnet to cluster magru01 and group B3
  • 13:53 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
  • 13:52 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
  • 13:52 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti7003.magru.wmnet to cluster magru01 and group B3
  • 13:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 13:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 13:50 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
  • 13:50 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
  • 13:43 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti7003.magru.wmnet to cluster magru01 and group B3
  • 13:43 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti7003.magru.wmnet to cluster magru01 and group B3
  • 13:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1175 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61716 and previous config saved to /var/cache/conftool/dbconfig/20240502-134333-root.json
  • 13:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1189 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61715 and previous config saved to /var/cache/conftool/dbconfig/20240502-134328-root.json
  • 13:42 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 13:42 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61714 and previous config saved to /var/cache/conftool/dbconfig/20240502-134237-root.json
  • 13:42 jiji@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 13:41 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 13:40 jiji@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 13:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1175 db1189', diff saved to https://phabricator.wikimedia.org/P61713 and previous config saved to /var/cache/conftool/dbconfig/20240502-134050-root.json
  • 13:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 13:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 13:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T361627)', diff saved to https://phabricator.wikimedia.org/P61712 and previous config saved to /var/cache/conftool/dbconfig/20240502-133420-marostegui.json
  • 13:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7002.magru.wmnet
  • 13:32 sukhe: running authdns-update to revert magru text geomap
  • 13:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61711 and previous config saved to /var/cache/conftool/dbconfig/20240502-132731-root.json
  • 13:24 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:24 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 13:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7002.magru.wmnet
  • 13:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P61710 and previous config saved to /var/cache/conftool/dbconfig/20240502-131912-marostegui.json
  • 13:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61709 and previous config saved to /var/cache/conftool/dbconfig/20240502-131225-root.json
  • 13:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2161.codfw.wmnet with OS bookworm
  • 13:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P61708 and previous config saved to /var/cache/conftool/dbconfig/20240502-130404-marostegui.json
  • 13:02 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 12:57 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
  • 12:49 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 12:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T361627)', diff saved to https://phabricator.wikimedia.org/P61707 and previous config saved to /var/cache/conftool/dbconfig/20240502-124857-marostegui.json
  • 12:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2161.codfw.wmnet with reason: host reimage
  • 12:26 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2161.codfw.wmnet with OS bookworm
  • 12:25 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2161.codfw.wmnet with OS bookworm
  • 12:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P61704 and previous config saved to /var/cache/conftool/dbconfig/20240502-122409-marostegui.json
  • 12:22 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
  • 12:20 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 12:19 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2161.codfw.wmnet with OS bookworm
  • 12:18 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2161', diff saved to https://phabricator.wikimedia.org/P61703 and previous config saved to /var/cache/conftool/dbconfig/20240502-121759-root.json
  • 12:17 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1230.eqiad.wmnet
  • 12:15 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 12:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P61702 and previous config saved to /var/cache/conftool/dbconfig/20240502-120901-marostegui.json
  • 12:02 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 12:00 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1399.eqiad.wmnet with OS bullseye
  • 11:57 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
  • 11:57 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1435.eqiad.wmnet with OS bullseye
  • 11:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7003.magru.wmnet
  • 11:56 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1230.eqiad.wmnet
  • 11:55 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1405.eqiad.wmnet with OS bullseye
  • 11:55 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
  • 11:54 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
  • 11:53 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1213.eqiad.wmnet
  • 11:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T361627)', diff saved to https://phabricator.wikimedia.org/P61701 and previous config saved to /var/cache/conftool/dbconfig/20240502-115353-marostegui.json
  • 11:53 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1409.eqiad.wmnet with OS bullseye
  • 11:53 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
  • 11:51 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1371.eqiad.wmnet with OS bullseye
  • 11:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7003.magru.wmnet
  • 11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2210 (T361627)', diff saved to https://phabricator.wikimedia.org/P61700 and previous config saved to /var/cache/conftool/dbconfig/20240502-114448-marostegui.json
  • 11:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2210.codfw.wmnet with reason: Maintenance
  • 11:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2210.codfw.wmnet with reason: Maintenance
  • 11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T361627)', diff saved to https://phabricator.wikimedia.org/P61699 and previous config saved to /var/cache/conftool/dbconfig/20240502-114425-marostegui.json
  • 11:43 elukey: depool LiftWing's codfw services from traffic to move all MW API calls to mw-api-int-ro
  • 11:43 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1399.eqiad.wmnet with reason: host reimage
  • 11:42 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1213.eqiad.wmnet
  • 11:42 elukey@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=inference,name=codfw
  • 11:41 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 11:41 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti01.svc.magru.wmnet on all recursors
  • 11:41 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache ganeti01.svc.magru.wmnet on all recursors
  • 11:40 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 11:39 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1435.eqiad.wmnet with reason: host reimage
  • 11:37 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1405.eqiad.wmnet with reason: host reimage
  • 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1210.eqiad.wmnet
  • 11:35 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1409.eqiad.wmnet with reason: host reimage
  • 11:35 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1405.eqiad.wmnet with reason: host reimage
  • 11:34 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1399.eqiad.wmnet with reason: host reimage
  • 11:34 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1435.eqiad.wmnet with reason: host reimage
  • 11:32 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1371.eqiad.wmnet with reason: host reimage
  • 11:30 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1409.eqiad.wmnet with reason: host reimage
  • 11:29 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1371.eqiad.wmnet with reason: host reimage
  • 11:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P61698 and previous config saved to /var/cache/conftool/dbconfig/20240502-112918-marostegui.json
  • 11:25 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1210.eqiad.wmnet
  • 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1185.eqiad.wmnet
  • 11:21 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1405.eqiad.wmnet with OS bullseye
  • 11:21 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1399.eqiad.wmnet with OS bullseye
  • 11:21 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1435.eqiad.wmnet with OS bullseye
  • 11:17 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1409.eqiad.wmnet with OS bullseye
  • 11:15 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1371.eqiad.wmnet with OS bullseye
  • 11:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P61697 and previous config saved to /var/cache/conftool/dbconfig/20240502-111410-marostegui.json
  • 11:13 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1185.eqiad.wmnet
  • 11:08 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti01.svc.magru.wmnet on all recursors
  • 11:08 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache ganeti01.svc.magru.wmnet on all recursors
  • 11:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti01.svc.magru.wmnet. on all recursors
  • 11:07 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ganeti01.svc.magru.wmnet. on all recursors
  • 11:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti01.svc.magru.wmnet on all recursors
  • 11:07 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ganeti01.svc.magru.wmnet on all recursors
  • 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:05 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 11:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1183.eqiad.wmnet
  • 10:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T361627)', diff saved to https://phabricator.wikimedia.org/P61696 and previous config saved to /var/cache/conftool/dbconfig/20240502-105903-marostegui.json
  • 10:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61695 and previous config saved to /var/cache/conftool/dbconfig/20240502-105530-root.json
  • 10:47 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1183.eqiad.wmnet
  • 10:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2206 (T361627)', diff saved to https://phabricator.wikimedia.org/P61694 and previous config saved to /var/cache/conftool/dbconfig/20240502-104658-marostegui.json
  • 10:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2206.codfw.wmnet with reason: Maintenance
  • 10:46 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2206.codfw.wmnet with reason: Maintenance
  • 10:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61693 and previous config saved to /var/cache/conftool/dbconfig/20240502-104024-root.json
  • 10:38 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:38 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: new VIP for ganeti01/magru - jmm@cumin2002"
  • 10:37 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: new VIP for ganeti01/magru - jmm@cumin2002"
  • 10:36 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2213.codfw.wmnet
  • 10:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2199.codfw.wmnet with reason: Maintenance
  • 10:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2199.codfw.wmnet with reason: Maintenance
  • 10:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T361627)', diff saved to https://phabricator.wikimedia.org/P61692 and previous config saved to /var/cache/conftool/dbconfig/20240502-103601-marostegui.json
  • 10:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 10:25 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61691 and previous config saved to /var/cache/conftool/dbconfig/20240502-102518-root.json
  • 10:22 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2213.codfw.wmnet
  • 10:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P61690 and previous config saved to /var/cache/conftool/dbconfig/20240502-102053-marostegui.json
  • 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7001.magru.wmnet
  • 10:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2211.codfw.wmnet
  • 10:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7001.magru.wmnet
  • 10:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61689 and previous config saved to /var/cache/conftool/dbconfig/20240502-101012-root.json
  • 10:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P61688 and previous config saved to /var/cache/conftool/dbconfig/20240502-100546-marostegui.json
  • 10:00 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cephosd1005.eqiad.wmnet with OS bookworm
  • 09:58 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2211.codfw.wmnet
  • 09:58 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2192.codfw.wmnet
  • 09:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61687 and previous config saved to /var/cache/conftool/dbconfig/20240502-095506-root.json
  • 09:54 moritzm: installing util-linux security updates
  • 09:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T361627)', diff saved to https://phabricator.wikimedia.org/P61686 and previous config saved to /var/cache/conftool/dbconfig/20240502-095038-marostegui.json
  • 09:50 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2192.codfw.wmnet
  • 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2178.codfw.wmnet
  • 09:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61685 and previous config saved to /var/cache/conftool/dbconfig/20240502-094000-root.json
  • 09:38 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on mw2382.codfw.wmnet with reason: Degraded RAID/storage controller issues
  • 09:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2179 (T361627)', diff saved to https://phabricator.wikimedia.org/P61684 and previous config saved to /var/cache/conftool/dbconfig/20240502-093827-marostegui.json
  • 09:38 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on mw2382.codfw.wmnet with reason: Degraded RAID/storage controller issues
  • 09:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 09:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 09:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T361627)', diff saved to https://phabricator.wikimedia.org/P61683 and previous config saved to /var/cache/conftool/dbconfig/20240502-093803-marostegui.json
  • 09:35 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cephosd1005.eqiad.wmnet with reason: host reimage
  • 09:32 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cephosd1005.eqiad.wmnet with reason: host reimage
  • 09:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2152.codfw.wmnet with OS bookworm
  • 09:26 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2178.codfw.wmnet
  • 09:24 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61682 and previous config saved to /var/cache/conftool/dbconfig/20240502-092454-root.json
  • 09:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P61681 and previous config saved to /var/cache/conftool/dbconfig/20240502-092256-marostegui.json
  • 09:18 hnowlan: depooling 5 appservers in advance of migrating them to k8s workers
  • 09:18 stevemunene@deploy1002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
  • 09:13 stevemunene@deploy1002: helmfile [eqiad] START helmfile.d/services/datahub: sync on main
  • 09:13 stevemunene@deploy1002: helmfile [codfw] DONE helmfile.d/services/datahub: sync on main
  • 09:12 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host cephosd1005.eqiad.wmnet with OS bookworm
  • 09:10 btullis@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cephosd1005.eqiad.wmnet with OS bookworm
  • 09:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2152.codfw.wmnet with reason: host reimage
  • 09:08 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2171.codfw.wmnet
  • 09:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P61680 and previous config saved to /var/cache/conftool/dbconfig/20240502-090748-marostegui.json
  • 09:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2152.codfw.wmnet with reason: host reimage
  • 09:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 09:02 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 09:02 stevemunene@deploy1002: helmfile [codfw] START helmfile.d/services/datahub: sync on main
  • 08:59 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2171.codfw.wmnet
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T361627)', diff saved to https://phabricator.wikimedia.org/P61679 and previous config saved to /var/cache/conftool/dbconfig/20240502-085241-marostegui.json
  • 08:50 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2157.codfw.wmnet
  • 08:49 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2152.codfw.wmnet with OS bookworm
  • 08:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2172 (T361627)', diff saved to https://phabricator.wikimedia.org/P61677 and previous config saved to /var/cache/conftool/dbconfig/20240502-084041-marostegui.json
  • 08:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 08:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 08:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T361627)', diff saved to https://phabricator.wikimedia.org/P61676 and previous config saved to /var/cache/conftool/dbconfig/20240502-084018-marostegui.json
  • 08:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P61675 and previous config saved to /var/cache/conftool/dbconfig/20240502-082510-marostegui.json
  • 08:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P61674 and previous config saved to /var/cache/conftool/dbconfig/20240502-081002-marostegui.json
  • 08:08 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2157.codfw.wmnet
  • 08:02 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2123.codfw.wmnet
  • 07:57 jmm@cumin2002: END (PASS) - Cookbook sre.wdqs.restart-nginx-envoy (exit_code=0) rolling restart_daemons on A:wdqs-public
  • 07:56 brouberol@cumin1002: END (PASS) - Cookbook sre.opensearch.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:datahubsearch
  • 07:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T361627)', diff saved to https://phabricator.wikimedia.org/P61673 and previous config saved to /var/cache/conftool/dbconfig/20240502-075455-marostegui.json
  • 07:48 brouberol@cumin1002: START - Cookbook sre.opensearch.roll-restart-reboot rolling restart_daemons on A:datahubsearch
  • 07:47 moritzm: installing Java 8 security updates
  • 07:47 jmm@cumin2002: START - Cookbook sre.wdqs.restart-nginx-envoy rolling restart_daemons on A:wdqs-public
  • 07:44 volans@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) netbox to netbox2002.codfw.wmnet,netbox1002.eqiad.wmnet with reason: Update Netbox dependencies for netbox - volans@cumin1002
  • 07:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2155 (T361627)', diff saved to https://phabricator.wikimedia.org/P61672 and previous config saved to /var/cache/conftool/dbconfig/20240502-074400-marostegui.json
  • 07:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 07:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 07:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 07:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 07:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T361627)', diff saved to https://phabricator.wikimedia.org/P61671 and previous config saved to /var/cache/conftool/dbconfig/20240502-074320-marostegui.json
  • 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.wdqs.restart-nginx-envoy (exit_code=0) rolling restart_daemons on A:wdqs-internal
  • 07:40 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2123.codfw.wmnet
  • 07:38 jmm@cumin2002: START - Cookbook sre.wdqs.restart-nginx-envoy rolling restart_daemons on A:wdqs-internal
  • 07:38 volans@cumin1002: START - Cookbook sre.deploy.python-code netbox to netbox2002.codfw.wmnet,netbox1002.eqiad.wmnet with reason: Update Netbox dependencies for netbox - volans@cumin1002
  • 07:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P61670 and previous config saved to /var/cache/conftool/dbconfig/20240502-072813-marostegui.json
  • 07:13 jmm@cumin2002: END (PASS) - Cookbook sre.wdqs.restart-nginx-envoy (exit_code=0) rolling restart_daemons on A:wdqs-test
  • 07:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P61669 and previous config saved to /var/cache/conftool/dbconfig/20240502-071305-marostegui.json
  • 07:13 jmm@cumin2002: START - Cookbook sre.wdqs.restart-nginx-envoy rolling restart_daemons on A:wdqs-test
  • 06:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T361627)', diff saved to https://phabricator.wikimedia.org/P61668 and previous config saved to /var/cache/conftool/dbconfig/20240502-065758-marostegui.json
  • 06:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2147 (T361627)', diff saved to https://phabricator.wikimedia.org/P61667 and previous config saved to /var/cache/conftool/dbconfig/20240502-064533-marostegui.json
  • 06:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 06:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 06:42 marostegui@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61666 and previous config saved to /var/cache/conftool/dbconfig/20240502-064230-root.json
  • 06:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 06:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 06:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137 (T361627)', diff saved to https://phabricator.wikimedia.org/P61665 and previous config saved to /var/cache/conftool/dbconfig/20240502-063343-marostegui.json
  • 06:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61664 and previous config saved to /var/cache/conftool/dbconfig/20240502-062725-root.json
  • 06:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137', diff saved to https://phabricator.wikimedia.org/P61663 and previous config saved to /var/cache/conftool/dbconfig/20240502-061836-marostegui.json
  • 06:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61662 and previous config saved to /var/cache/conftool/dbconfig/20240502-061218-root.json
  • 06:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137', diff saved to https://phabricator.wikimedia.org/P61661 and previous config saved to /var/cache/conftool/dbconfig/20240502-060328-marostegui.json
  • 05:57 marostegui@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61660 and previous config saved to /var/cache/conftool/dbconfig/20240502-055712-root.json
  • 05:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137 (T361627)', diff saved to https://phabricator.wikimedia.org/P61659 and previous config saved to /var/cache/conftool/dbconfig/20240502-054821-marostegui.json
  • 05:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 05:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 05:42 marostegui@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61658 and previous config saved to /var/cache/conftool/dbconfig/20240502-054206-root.json
  • 05:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2137 (T361627)', diff saved to https://phabricator.wikimedia.org/P61657 and previous config saved to /var/cache/conftool/dbconfig/20240502-053717-marostegui.json
  • 05:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 05:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 05:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T361627)', diff saved to https://phabricator.wikimedia.org/P61656 and previous config saved to /var/cache/conftool/dbconfig/20240502-053654-marostegui.json
  • 05:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1181.eqiad.wmnet with OS bookworm
  • 05:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61655 and previous config saved to /var/cache/conftool/dbconfig/20240502-052700-root.json
  • 05:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P61654 and previous config saved to /var/cache/conftool/dbconfig/20240502-052146-marostegui.json
  • 05:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 05:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 05:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2162.codfw.wmnet with OS bookworm
  • 05:11 marostegui@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61653 and previous config saved to /var/cache/conftool/dbconfig/20240502-051155-root.json
  • 05:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
  • 05:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P61652 and previous config saved to /var/cache/conftool/dbconfig/20240502-050639-marostegui.json
  • 05:04 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
  • 04:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2162.codfw.wmnet with reason: host reimage
  • 04:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2162.codfw.wmnet with reason: host reimage
  • 04:52 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1181.eqiad.wmnet with OS bookworm
  • 04:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T361627)', diff saved to https://phabricator.wikimedia.org/P61651 and previous config saved to /var/cache/conftool/dbconfig/20240502-045131-marostegui.json
  • 04:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1181 T363892', diff saved to https://phabricator.wikimedia.org/P61650 and previous config saved to /var/cache/conftool/dbconfig/20240502-045017-root.json
  • 04:48 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db1236 to s7 primary and set section read-write T363892', diff saved to https://phabricator.wikimedia.org/P61649 and previous config saved to /var/cache/conftool/dbconfig/20240502-044848-marostegui.json
  • 04:48 marostegui@cumin1002: dbctl commit (dc=all): 'Set s7 eqiad as read-only for maintenance - T363892', diff saved to https://phabricator.wikimedia.org/P61648 and previous config saved to /var/cache/conftool/dbconfig/20240502-044819-marostegui.json
  • 04:48 marostegui: Starting s7 eqiad failover from db1181 to db1236 - T363892
  • 04:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2136 (T361627)', diff saved to https://phabricator.wikimedia.org/P61647 and previous config saved to /var/cache/conftool/dbconfig/20240502-044020-marostegui.json
  • 04:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 04:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 04:35 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2162.codfw.wmnet with OS bookworm
  • 04:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2162.codfw.wmnet with reason: Reimage
  • 04:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2162.codfw.wmnet with reason: Reimage
  • 04:34 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2162', diff saved to https://phabricator.wikimedia.org/P61646 and previous config saved to /var/cache/conftool/dbconfig/20240502-043403-root.json
  • 04:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s7 T363892
  • 04:30 marostegui@cumin1002: dbctl commit (dc=all): 'Set db1236 with weight 0 T363892', diff saved to https://phabricator.wikimedia.org/P61645 and previous config saved to /var/cache/conftool/dbconfig/20240502-043019-marostegui.json
  • 04:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Primary switchover s7 T363892
  • 04:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 04:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 04:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 04:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1181.eqiad.wmnet with reason: Maintenance

2024-05-01

  • 23:57 eileen: civicrm upgraded from 3ac4043c to 80ae4543
  • 21:37 eileen: config revision changed from 36b287b6 to b772c8bc
  • 20:22 jdrewniak@deploy1002: Finished scap: Backport for [Vector] Enable appearance menu and increased font-size on testwiki (T362147) (duration: 19m 29s)
  • 20:10 jdrewniak@deploy1002: jdlrobson and jdrewniak: Continuing with sync
  • 20:08 jdrewniak@deploy1002: jdlrobson and jdrewniak: Backport for [Vector] Enable appearance menu and increased font-size on testwiki (T362147) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:03 jdrewniak@deploy1002: Started scap: Backport for [Vector] Enable appearance menu and increased font-size on testwiki (T362147)
  • 19:40 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns7002.wikimedia.org with OS bookworm
  • 19:40 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 19:39 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 19:12 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
  • 19:09 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
  • 18:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 18:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 18:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T361627)', diff saved to https://phabricator.wikimedia.org/P61644 and previous config saved to /var/cache/conftool/dbconfig/20240501-185521-marostegui.json
  • 18:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P61643 and previous config saved to /var/cache/conftool/dbconfig/20240501-184013-marostegui.json
  • 18:36 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host dns7002.wikimedia.org with OS bookworm
  • 18:36 sukhe@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns7002.wikimedia.org with OS bookworm
  • 18:36 sukhe@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['dns7002.magru.wmnet']
  • 18:35 sukhe@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dns7002.magru.wmnet']
  • 18:35 sukhe@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['dns7002.magru.wmnet']
  • 18:35 sukhe@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dns7002.magru.wmnet']
  • 18:28 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host cephosd1005.eqiad.wmnet with OS bookworm
  • 18:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P61642 and previous config saved to /var/cache/conftool/dbconfig/20240501-182505-marostegui.json
  • 18:16 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host dns7002.wikimedia.org with OS bookworm
  • 18:15 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns7001.wikimedia.org with OS bookworm
  • 18:15 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 18:14 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 18:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T361627)', diff saved to https://phabricator.wikimedia.org/P61641 and previous config saved to /var/cache/conftool/dbconfig/20240501-180958-marostegui.json
  • 18:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1236 (T361627)', diff saved to https://phabricator.wikimedia.org/P61640 and previous config saved to /var/cache/conftool/dbconfig/20240501-180645-marostegui.json
  • 18:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1236.eqiad.wmnet with reason: Maintenance
  • 18:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1236.eqiad.wmnet with reason: Maintenance
  • 18:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T361627)', diff saved to https://phabricator.wikimedia.org/P61639 and previous config saved to /var/cache/conftool/dbconfig/20240501-180622-marostegui.json
  • 18:03 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cephosd1004.eqiad.wmnet with OS bookworm
  • 17:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P61638 and previous config saved to /var/cache/conftool/dbconfig/20240501-175114-marostegui.json
  • 17:49 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns7001.wikimedia.org with reason: host reimage
  • 17:46 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns7001.wikimedia.org with reason: host reimage
  • 17:38 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cephosd1004.eqiad.wmnet with reason: host reimage
  • 17:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P61637 and previous config saved to /var/cache/conftool/dbconfig/20240501-173607-marostegui.json
  • 17:35 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cephosd1004.eqiad.wmnet with reason: host reimage
  • 17:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T361627)', diff saved to https://phabricator.wikimedia.org/P61636 and previous config saved to /var/cache/conftool/dbconfig/20240501-172059-marostegui.json
  • 17:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T361627)', diff saved to https://phabricator.wikimedia.org/P61635 and previous config saved to /var/cache/conftool/dbconfig/20240501-171527-marostegui.json
  • 17:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 17:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 17:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T361627)', diff saved to https://phabricator.wikimedia.org/P61634 and previous config saved to /var/cache/conftool/dbconfig/20240501-171504-marostegui.json
  • 17:14 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host cephosd1004.eqiad.wmnet with OS bookworm
  • 17:12 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host dns7001.wikimedia.org with OS bookworm
  • 17:02 sukhe: sudo cumin -b1 -s10 "A:dnsbox" "run-puppet-agent --enable 'merging CR 1026166'"
  • 16:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P61633 and previous config saved to /var/cache/conftool/dbconfig/20240501-165957-marostegui.json
  • 16:59 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cephosd1003.eqiad.wmnet with OS bookworm
  • 16:59 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns6001.wikimedia.org
  • 16:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P61632 and previous config saved to /var/cache/conftool/dbconfig/20240501-164450-marostegui.json
  • 16:44 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=dns6001.wikimedia.org
  • 16:43 sukhe: sudo cumin "A:dnsbox" "disable-puppet 'merging CR 1026166'"
  • 16:34 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cephosd1003.eqiad.wmnet with reason: host reimage
  • 16:31 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cephosd1003.eqiad.wmnet with reason: host reimage
  • 16:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T361627)', diff saved to https://phabricator.wikimedia.org/P61630 and previous config saved to /var/cache/conftool/dbconfig/20240501-162942-marostegui.json
  • 16:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1202 (T361627)', diff saved to https://phabricator.wikimedia.org/P61629 and previous config saved to /var/cache/conftool/dbconfig/20240501-162629-marostegui.json
  • 16:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 16:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 16:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T361627)', diff saved to https://phabricator.wikimedia.org/P61628 and previous config saved to /var/cache/conftool/dbconfig/20240501-162607-marostegui.json
  • 16:11 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host cephosd1003.eqiad.wmnet with OS bookworm
  • 16:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P61627 and previous config saved to /var/cache/conftool/dbconfig/20240501-161059-marostegui.json
  • 16:10 btullis@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cephosd1003.eqiad.wmnet with OS bookworm
  • 16:01 milimetric@deploy1002: Finished deploy [airflow-dags/analytics@09b4f5f]: Testing different settings for mediawiki_history_shapshot_config (duration: 00m 28s)
  • 16:00 milimetric@deploy1002: Started deploy [airflow-dags/analytics@09b4f5f]: Testing different settings for mediawiki_history_shapshot_config
  • 15:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P61626 and previous config saved to /var/cache/conftool/dbconfig/20240501-155552-marostegui.json
  • 15:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T361627)', diff saved to https://phabricator.wikimedia.org/P61625 and previous config saved to /var/cache/conftool/dbconfig/20240501-154042-marostegui.json
  • 15:39 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host cephosd1003.eqiad.wmnet with OS bookworm
  • 15:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1194 (T361627)', diff saved to https://phabricator.wikimedia.org/P61624 and previous config saved to /var/cache/conftool/dbconfig/20240501-153829-marostegui.json
  • 15:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 15:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 15:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T361627)', diff saved to https://phabricator.wikimedia.org/P61623 and previous config saved to /var/cache/conftool/dbconfig/20240501-153806-marostegui.json
  • 15:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P61622 and previous config saved to /var/cache/conftool/dbconfig/20240501-152259-marostegui.json
  • 15:22 jnuche@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.43.0-wmf.3 refs T361397
  • 15:15 sukhe@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns7001.wikimedia.org with OS bookworm
  • 15:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P61621 and previous config saved to /var/cache/conftool/dbconfig/20240501-150751-marostegui.json
  • 14:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T361627)', diff saved to https://phabricator.wikimedia.org/P61620 and previous config saved to /var/cache/conftool/dbconfig/20240501-145243-marostegui.json
  • 14:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1191 (T361627)', diff saved to https://phabricator.wikimedia.org/P61619 and previous config saved to /var/cache/conftool/dbconfig/20240501-145131-marostegui.json
  • 14:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 14:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 14:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T361627)', diff saved to https://phabricator.wikimedia.org/P61618 and previous config saved to /var/cache/conftool/dbconfig/20240501-145108-marostegui.json
  • 14:43 dancy@deploy1002: Installation of scap version "4.81.0" completed for 325 hosts
  • 14:42 dancy@deploy1002: Installing scap version "4.81.0" for 325 hosts
  • 14:36 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cephosd1002.eqiad.wmnet with OS bookworm
  • 14:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P61617 and previous config saved to /var/cache/conftool/dbconfig/20240501-143601-marostegui.json
  • 14:22 marostegui@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61616 and previous config saved to /var/cache/conftool/dbconfig/20240501-142233-root.json
  • 14:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P61615 and previous config saved to /var/cache/conftool/dbconfig/20240501-142053-marostegui.json
  • 14:12 bking@deploy1002: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:11 bking@deploy1002: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:10 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cephosd1002.eqiad.wmnet with reason: host reimage
  • 14:08 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cephosd1002.eqiad.wmnet with reason: host reimage
  • 14:07 marostegui@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61614 and previous config saved to /var/cache/conftool/dbconfig/20240501-140728-root.json
  • 14:05 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns7001.wikimedia.org with reason: host reimage
  • 14:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T361627)', diff saved to https://phabricator.wikimedia.org/P61613 and previous config saved to /var/cache/conftool/dbconfig/20240501-140545-marostegui.json
  • 14:03 bking@deploy1002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1174 (T361627)', diff saved to https://phabricator.wikimedia.org/P61612 and previous config saved to /var/cache/conftool/dbconfig/20240501-140333-marostegui.json
  • 14:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 14:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 14:03 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns7001.wikimedia.org with reason: host reimage
  • 14:03 bking@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
  • 13:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 13:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 13:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T361627)', diff saved to https://phabricator.wikimedia.org/P61611 and previous config saved to /var/cache/conftool/dbconfig/20240501-135915-marostegui.json
  • 13:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61610 and previous config saved to /var/cache/conftool/dbconfig/20240501-135222-root.json
  • 13:47 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host cephosd1002.eqiad.wmnet with OS bookworm
  • 13:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P61609 and previous config saved to /var/cache/conftool/dbconfig/20240501-134407-marostegui.json
  • 13:37 marostegui@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61608 and previous config saved to /var/cache/conftool/dbconfig/20240501-133717-root.json
  • 13:33 Amir1: promoting HNowlan (WMF) to admin in testwiki
  • 13:29 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host dns7001.wikimedia.org with OS bookworm
  • 13:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P61607 and previous config saved to /var/cache/conftool/dbconfig/20240501-132900-marostegui.json
  • 13:25 sukhe: running authdns-update for CR 1026119: depool magru text*
  • 13:22 marostegui@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61606 and previous config saved to /var/cache/conftool/dbconfig/20240501-132211-root.json
  • 13:15 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cephosd1001.eqiad.wmnet with OS bookworm
  • 13:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T361627)', diff saved to https://phabricator.wikimedia.org/P61605 and previous config saved to /var/cache/conftool/dbconfig/20240501-131351-marostegui.json
  • 13:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1170 (T361627)', diff saved to https://phabricator.wikimedia.org/P61604 and previous config saved to /var/cache/conftool/dbconfig/20240501-130822-marostegui.json
  • 13:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 13:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 13:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T361627)', diff saved to https://phabricator.wikimedia.org/P61603 and previous config saved to /var/cache/conftool/dbconfig/20240501-130747-marostegui.json
  • 13:07 marostegui@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61602 and previous config saved to /var/cache/conftool/dbconfig/20240501-130704-root.json
  • 12:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2154.codfw.wmnet with OS bookworm
  • 12:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P61601 and previous config saved to /var/cache/conftool/dbconfig/20240501-125239-marostegui.json
  • 12:51 marostegui@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61600 and previous config saved to /var/cache/conftool/dbconfig/20240501-125158-root.json
  • 12:48 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cephosd1001.eqiad.wmnet with reason: host reimage
  • 12:45 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cephosd1001.eqiad.wmnet with reason: host reimage
  • 12:24 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host cephosd1001.eqiad.wmnet with OS bookworm
  • 12:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T361627)', diff saved to https://phabricator.wikimedia.org/P61598 and previous config saved to /var/cache/conftool/dbconfig/20240501-122224-marostegui.json
  • 12:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1158 (T361627)', diff saved to https://phabricator.wikimedia.org/P61597 and previous config saved to /var/cache/conftool/dbconfig/20240501-122012-marostegui.json
  • 12:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 12:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 12:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 12:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 12:15 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2154.codfw.wmnet with OS bookworm
  • 12:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2218.codfw.wmnet with reason: Maintenance
  • 12:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2218.codfw.wmnet with reason: Maintenance
  • 12:13 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2154', diff saved to https://phabricator.wikimedia.org/P61596 and previous config saved to /var/cache/conftool/dbconfig/20240501-121347-root.json
  • 12:08 marostegui@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61595 and previous config saved to /var/cache/conftool/dbconfig/20240501-120833-root.json
  • 11:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T361627)', diff saved to https://phabricator.wikimedia.org/P61594 and previous config saved to /var/cache/conftool/dbconfig/20240501-115915-marostegui.json
  • 11:53 marostegui@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61593 and previous config saved to /var/cache/conftool/dbconfig/20240501-115327-root.json
  • 11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P61592 and previous config saved to /var/cache/conftool/dbconfig/20240501-114408-marostegui.json
  • 11:38 marostegui@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61591 and previous config saved to /var/cache/conftool/dbconfig/20240501-113821-root.json
  • 11:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P61590 and previous config saved to /var/cache/conftool/dbconfig/20240501-112900-marostegui.json
  • 11:24 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs7003.magru.wmnet with OS bullseye
  • 11:24 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 11:23 marostegui@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61589 and previous config saved to /var/cache/conftool/dbconfig/20240501-112315-root.json
  • 11:22 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 11:17 sukhe@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host lvs7002.magru.wmnet
  • 11:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T361627)', diff saved to https://phabricator.wikimedia.org/P61588 and previous config saved to /var/cache/conftool/dbconfig/20240501-111353-marostegui.json
  • 11:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2220 (T361627)', diff saved to https://phabricator.wikimedia.org/P61587 and previous config saved to /var/cache/conftool/dbconfig/20240501-110834-marostegui.json
  • 11:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2220.codfw.wmnet with reason: Maintenance
  • 11:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2220.codfw.wmnet with reason: Maintenance
  • 11:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T361627)', diff saved to https://phabricator.wikimedia.org/P61586 and previous config saved to /var/cache/conftool/dbconfig/20240501-110822-marostegui.json
  • 11:08 marostegui@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61585 and previous config saved to /var/cache/conftool/dbconfig/20240501-110809-root.json
  • 11:07 sukhe@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host lvs7001.magru.wmnet
  • 11:05 sukhe@cumin1002: START - Cookbook sre.hosts.reboot-single for host lvs7002.magru.wmnet
  • 10:58 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs7003.magru.wmnet with reason: host reimage
  • 10:55 sukhe@cumin1002: START - Cookbook sre.hosts.reboot-single for host lvs7001.magru.wmnet
  • 10:55 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs7003.magru.wmnet with reason: host reimage
  • 10:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P61584 and previous config saved to /var/cache/conftool/dbconfig/20240501-105315-marostegui.json
  • 10:53 marostegui@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61583 and previous config saved to /var/cache/conftool/dbconfig/20240501-105304-root.json
  • 10:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2163.codfw.wmnet with OS bookworm
  • 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P61582 and previous config saved to /var/cache/conftool/dbconfig/20240501-103801-marostegui.json
  • 10:37 marostegui@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61581 and previous config saved to /var/cache/conftool/dbconfig/20240501-103758-root.json
  • 10:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db1157 (re)pooling @ 100%: post schema change repool', diff saved to https://phabricator.wikimedia.org/P61580 and previous config saved to /var/cache/conftool/dbconfig/20240501-103338-arnaudb.json
  • 10:30 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host lvs7003.magru.wmnet with OS bullseye
  • 10:30 sukhe@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs7003.magru.wmnet with OS bullseye
  • 10:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1246.eqiad.wmnet with reason: Down with HW issues
  • 10:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1246.eqiad.wmnet with reason: Down with HW issues
  • 10:28 sukhe@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['lvs7003.magru.wmnet']
  • 10:27 sukhe@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs7003.magru.wmnet']
  • 10:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T361627)', diff saved to https://phabricator.wikimedia.org/P61579 and previous config saved to /var/cache/conftool/dbconfig/20240501-102253-marostegui.json
  • 10:22 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host lvs7003.magru.wmnet with OS bullseye
  • 10:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
  • 10:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db1157 (re)pooling @ 75%: post schema change repool', diff saved to https://phabricator.wikimedia.org/P61578 and previous config saved to /var/cache/conftool/dbconfig/20240501-101832-arnaudb.json
  • 10:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
  • 10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2208 (T361627)', diff saved to https://phabricator.wikimedia.org/P61577 and previous config saved to /var/cache/conftool/dbconfig/20240501-101728-marostegui.json
  • 10:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2208.codfw.wmnet with reason: Maintenance
  • 10:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2208.codfw.wmnet with reason: Maintenance
  • 10:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 10:12 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 10:11 marostegui@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61576 and previous config saved to /var/cache/conftool/dbconfig/20240501-101151-root.json
  • 10:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 10:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 10:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T361627)', diff saved to https://phabricator.wikimedia.org/P61575 and previous config saved to /var/cache/conftool/dbconfig/20240501-100650-marostegui.json
  • 10:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db1157 (re)pooling @ 50%: post schema change repool', diff saved to https://phabricator.wikimedia.org/P61574 and previous config saved to /var/cache/conftool/dbconfig/20240501-100326-arnaudb.json
  • 10:00 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2163.codfw.wmnet with OS bookworm
  • 09:58 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2163', diff saved to https://phabricator.wikimedia.org/P61573 and previous config saved to /var/cache/conftool/dbconfig/20240501-095845-root.json
  • 09:56 marostegui@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61572 and previous config saved to /var/cache/conftool/dbconfig/20240501-095646-root.json
  • 09:52 topranks: restarting routinator service on rpki1001
  • 09:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P61571 and previous config saved to /var/cache/conftool/dbconfig/20240501-095142-marostegui.json
  • 09:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db1157 (re)pooling @ 25%: post schema change repool', diff saved to https://phabricator.wikimedia.org/P61570 and previous config saved to /var/cache/conftool/dbconfig/20240501-094821-arnaudb.json
  • 09:42 marostegui@deploy1002: Finished scap: Backport for etcd.php: Add es7 (T355285 T355424) (duration: 14m 53s)
  • 09:41 marostegui@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61569 and previous config saved to /var/cache/conftool/dbconfig/20240501-094140-root.json
  • 09:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P61568 and previous config saved to /var/cache/conftool/dbconfig/20240501-093635-marostegui.json
  • 09:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db1157 (re)pooling @ 15%: post schema change repool', diff saved to https://phabricator.wikimedia.org/P61567 and previous config saved to /var/cache/conftool/dbconfig/20240501-093315-arnaudb.json
  • 09:30 marostegui@deploy1002: marostegui: Continuing with sync
  • 09:30 marostegui@deploy1002: marostegui: Backport for etcd.php: Add es7 (T355285 T355424) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:27 marostegui@deploy1002: Started scap: Backport for etcd.php: Add es7 (T355285 T355424)
  • 09:26 marostegui@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61566 and previous config saved to /var/cache/conftool/dbconfig/20240501-092634-root.json
  • 09:22 topranks: withdrawing public prefix announcement to AS7195 to test backup in magru (T362421)
  • 09:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T361627)', diff saved to https://phabricator.wikimedia.org/P61565 and previous config saved to /var/cache/conftool/dbconfig/20240501-092125-marostegui.json
  • 09:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db1157 (re)pooling @ 10%: post schema change repool', diff saved to https://phabricator.wikimedia.org/P61564 and previous config saved to /var/cache/conftool/dbconfig/20240501-091809-arnaudb.json
  • 09:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2182 (T361627)', diff saved to https://phabricator.wikimedia.org/P61563 and previous config saved to /var/cache/conftool/dbconfig/20240501-091513-marostegui.json
  • 09:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 09:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 09:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T361627)', diff saved to https://phabricator.wikimedia.org/P61562 and previous config saved to /var/cache/conftool/dbconfig/20240501-091451-marostegui.json
  • 09:13 marostegui@cumin1002: dbctl commit (dc=all): 'Push es7 codfw config T355424', diff saved to https://phabricator.wikimedia.org/P61561 and previous config saved to /var/cache/conftool/dbconfig/20240501-091352-marostegui.json
  • 09:11 marostegui@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61560 and previous config saved to /var/cache/conftool/dbconfig/20240501-091128-root.json
  • 09:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db1157 (re)pooling @ 5%: post schema change repool', diff saved to https://phabricator.wikimedia.org/P61559 and previous config saved to /var/cache/conftool/dbconfig/20240501-090303-arnaudb.json
  • 08:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P61558 and previous config saved to /var/cache/conftool/dbconfig/20240501-085943-marostegui.json
  • 08:56 marostegui@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61557 and previous config saved to /var/cache/conftool/dbconfig/20240501-085622-root.json
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61556 and previous config saved to /var/cache/conftool/dbconfig/20240501-085223-root.json
  • 08:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P61555 and previous config saved to /var/cache/conftool/dbconfig/20240501-084436-marostegui.json
  • 08:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2164.codfw.wmnet with OS bookworm
  • 08:41 marostegui@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61554 and previous config saved to /var/cache/conftool/dbconfig/20240501-084116-root.json
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61553 and previous config saved to /var/cache/conftool/dbconfig/20240501-083717-root.json
  • 08:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61552 and previous config saved to /var/cache/conftool/dbconfig/20240501-083641-root.json
  • 08:31 marostegui@cumin1002: dbctl commit (dc=all): 'Push es7 eqiad config T355285', diff saved to https://phabricator.wikimedia.org/P61551 and previous config saved to /var/cache/conftool/dbconfig/20240501-083120-marostegui.json
  • 08:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T361627)', diff saved to https://phabricator.wikimedia.org/P61550 and previous config saved to /var/cache/conftool/dbconfig/20240501-082928-marostegui.json
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2168 (T361627)', diff saved to https://phabricator.wikimedia.org/P61549 and previous config saved to /var/cache/conftool/dbconfig/20240501-082357-marostegui.json
  • 08:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 08:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T361627)', diff saved to https://phabricator.wikimedia.org/P61548 and previous config saved to /var/cache/conftool/dbconfig/20240501-082334-marostegui.json
  • 08:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61547 and previous config saved to /var/cache/conftool/dbconfig/20240501-082211-root.json
  • 08:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61546 and previous config saved to /var/cache/conftool/dbconfig/20240501-082135-root.json
  • 08:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
  • 08:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P61545 and previous config saved to /var/cache/conftool/dbconfig/20240501-080827-marostegui.json
  • 08:07 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61544 and previous config saved to /var/cache/conftool/dbconfig/20240501-080706-root.json
  • 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61543 and previous config saved to /var/cache/conftool/dbconfig/20240501-080630-root.json
  • 08:05 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 08:05 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 08:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61542 and previous config saved to /var/cache/conftool/dbconfig/20240501-080354-root.json
  • 07:59 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2164.codfw.wmnet with OS bookworm
  • 07:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2164', diff saved to https://phabricator.wikimedia.org/P61541 and previous config saved to /var/cache/conftool/dbconfig/20240501-075614-root.json
  • 07:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P61540 and previous config saved to /var/cache/conftool/dbconfig/20240501-075320-marostegui.json
  • 07:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61539 and previous config saved to /var/cache/conftool/dbconfig/20240501-075200-root.json
  • 07:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61538 and previous config saved to /var/cache/conftool/dbconfig/20240501-075124-root.json
  • 07:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61537 and previous config saved to /var/cache/conftool/dbconfig/20240501-074848-root.json
  • 07:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T361627)', diff saved to https://phabricator.wikimedia.org/P61536 and previous config saved to /var/cache/conftool/dbconfig/20240501-073812-marostegui.json
  • 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61535 and previous config saved to /var/cache/conftool/dbconfig/20240501-073655-root.json
  • 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61534 and previous config saved to /var/cache/conftool/dbconfig/20240501-073615-root.json
  • 07:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61533 and previous config saved to /var/cache/conftool/dbconfig/20240501-073342-root.json
  • 07:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2159 (T361627)', diff saved to https://phabricator.wikimedia.org/P61532 and previous config saved to /var/cache/conftool/dbconfig/20240501-073201-marostegui.json
  • 07:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 07:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 07:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 07:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 07:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T361627)', diff saved to https://phabricator.wikimedia.org/P61531 and previous config saved to /var/cache/conftool/dbconfig/20240501-073123-marostegui.json
  • 07:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61530 and previous config saved to /var/cache/conftool/dbconfig/20240501-072149-root.json
  • 07:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61529 and previous config saved to /var/cache/conftool/dbconfig/20240501-072110-root.json
  • 07:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61528 and previous config saved to /var/cache/conftool/dbconfig/20240501-071836-root.json
  • 07:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P61527 and previous config saved to /var/cache/conftool/dbconfig/20240501-071615-marostegui.json
  • 07:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61526 and previous config saved to /var/cache/conftool/dbconfig/20240501-070603-root.json
  • 07:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61525 and previous config saved to /var/cache/conftool/dbconfig/20240501-070330-root.json
  • 07:02 marostegui@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1186.eqiad.wmnet onto db1234.eqiad.wmnet
  • 07:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P61524 and previous config saved to /var/cache/conftool/dbconfig/20240501-070108-marostegui.json
  • 06:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61523 and previous config saved to /var/cache/conftool/dbconfig/20240501-065845-root.json
  • 06:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61522 and previous config saved to /var/cache/conftool/dbconfig/20240501-064824-root.json
  • 06:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T361627)', diff saved to https://phabricator.wikimedia.org/P61521 and previous config saved to /var/cache/conftool/dbconfig/20240501-064600-marostegui.json
  • 06:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61520 and previous config saved to /var/cache/conftool/dbconfig/20240501-064339-root.json
  • 06:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2150 (T361627)', diff saved to https://phabricator.wikimedia.org/P61519 and previous config saved to /var/cache/conftool/dbconfig/20240501-063942-marostegui.json
  • 06:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 06:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 06:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T361627)', diff saved to https://phabricator.wikimedia.org/P61518 and previous config saved to /var/cache/conftool/dbconfig/20240501-063919-marostegui.json
  • 06:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2166.codfw.wmnet with OS bookworm
  • 06:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61517 and previous config saved to /var/cache/conftool/dbconfig/20240501-063318-root.json
  • 06:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61516 and previous config saved to /var/cache/conftool/dbconfig/20240501-062833-root.json
  • 06:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P61515 and previous config saved to /var/cache/conftool/dbconfig/20240501-062407-marostegui.json
  • 06:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
  • 06:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
  • 06:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61514 and previous config saved to /var/cache/conftool/dbconfig/20240501-061327-root.json
  • 06:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P61513 and previous config saved to /var/cache/conftool/dbconfig/20240501-060900-marostegui.json
  • 05:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61512 and previous config saved to /var/cache/conftool/dbconfig/20240501-055822-root.json
  • 05:58 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2166.codfw.wmnet with OS bookworm
  • 05:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2166', diff saved to https://phabricator.wikimedia.org/P61511 and previous config saved to /var/cache/conftool/dbconfig/20240501-055657-root.json
  • 05:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T361627)', diff saved to https://phabricator.wikimedia.org/P61510 and previous config saved to /var/cache/conftool/dbconfig/20240501-055353-marostegui.json
  • 05:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2122 (T361627)', diff saved to https://phabricator.wikimedia.org/P61509 and previous config saved to /var/cache/conftool/dbconfig/20240501-054720-marostegui.json
  • 05:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 05:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 05:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T361627)', diff saved to https://phabricator.wikimedia.org/P61508 and previous config saved to /var/cache/conftool/dbconfig/20240501-054657-marostegui.json
  • 05:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61507 and previous config saved to /var/cache/conftool/dbconfig/20240501-054316-root.json
  • 05:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on es[1035,1039-1040].eqiad.wmnet with reason: Setting up T355285 T355424
  • 05:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on es[1035,1039-1040].eqiad.wmnet with reason: Setting up T355285 T355424
  • 05:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 6 hosts with reason: Setting up T355285 T355424
  • 05:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 6 hosts with reason: Setting up T355285 T355424
  • 05:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P61506 and previous config saved to /var/cache/conftool/dbconfig/20240501-053149-marostegui.json
  • 05:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1234.eqiad.wmnet with OS bookworm
  • 05:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1236.eqiad.wmnet with OS bookworm
  • 05:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61505 and previous config saved to /var/cache/conftool/dbconfig/20240501-052810-root.json
  • 05:23 marostegui@cumin1002: START - Cookbook sre.mysql.clone of db1186.eqiad.wmnet onto db1234.eqiad.wmnet
  • 05:18 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1186 to clone db1234 T363890', diff saved to https://phabricator.wikimedia.org/P61504 and previous config saved to /var/cache/conftool/dbconfig/20240501-051848-marostegui.json
  • 05:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P61503 and previous config saved to /var/cache/conftool/dbconfig/20240501-051642-marostegui.json
  • 05:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1236.eqiad.wmnet with reason: host reimage
  • 05:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1236.eqiad.wmnet with reason: host reimage
  • 05:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1234.eqiad.wmnet with reason: host reimage
  • 05:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1246.eqiad.wmnet with reason: Down with HW issues
  • 05:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1246.eqiad.wmnet with reason: Down with HW issues
  • 05:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1234.eqiad.wmnet with reason: host reimage
  • 05:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T361627)', diff saved to https://phabricator.wikimedia.org/P61502 and previous config saved to /var/cache/conftool/dbconfig/20240501-050135-marostegui.json
  • 04:57 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1236.eqiad.wmnet with OS bookworm
  • 04:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1236', diff saved to https://phabricator.wikimedia.org/P61501 and previous config saved to /var/cache/conftool/dbconfig/20240501-045624-marostegui.json
  • 04:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2121 (T361627)', diff saved to https://phabricator.wikimedia.org/P61500 and previous config saved to /var/cache/conftool/dbconfig/20240501-045517-marostegui.json
  • 04:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 04:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 04:54 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1234.eqiad.wmnet with OS bookworm
  • 04:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 04:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 02:31 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs7002.magru.wmnet with OS bullseye
  • 02:31 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 02:29 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 02:07 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs7002.magru.wmnet with reason: host reimage
  • 02:04 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs7002.magru.wmnet with reason: host reimage
  • 01:37 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host lvs7002.magru.wmnet with OS bullseye
  • 01:26 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs7001.magru.wmnet with OS bullseye
  • 01:26 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 01:25 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 01:02 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs7001.magru.wmnet with reason: host reimage
  • 00:58 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs7001.magru.wmnet with reason: host reimage
  • 00:33 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host lvs7001.magru.wmnet with OS bullseye
  • 00:23 xcollazo@deploy1002: Finished deploy [airflow-dags/analytics@b10376a]: (no justification provided) (duration: 00m 31s)
  • 00:22 xcollazo@deploy1002: Started deploy [airflow-dags/analytics@b10376a]: (no justification provided)
  • 00:05 eileen: civicrm upgraded from 393e1deb to 3ac4043

