Server Admin Log

From Wikitech

2024-12-24

  • 14:38 Ammar: T382741 Ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=bnwiki --logwiki=metawiki 'Esteban16' 'Renamed user f26394dcb19bd7bdad78f0d752896653'
  • 10:53 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10 days, 0:00:00 on db2123.codfw.wmnet with reason: Broken T382743 T382743
  • 10:53 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 10 days, 0:00:00 on db2123.codfw.wmnet with reason: Broken T382743 T382743
  • 10:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depool db2123 T382743', diff saved to https://phabricator.wikimedia.org/P71746 and previous config saved to /var/cache/conftool/dbconfig/20241224-105203-ladsgroup.json
  • 10:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Promote db2213 to s5 primary and set section read-write T382743', diff saved to https://phabricator.wikimedia.org/P71745 and previous config saved to /var/cache/conftool/dbconfig/20241224-103304-ladsgroup.json
  • 10:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Set db2213 with weight 0 T382743', diff saved to https://phabricator.wikimedia.org/P71744 and previous config saved to /var/cache/conftool/dbconfig/20241224-102200-ladsgroup.json
  • 10:21 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s5 T382743
  • 10:21 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 25 hosts with reason: Primary switchover s5 T382743
  • 10:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Set s5 codfw as read-only for maintenance - T382743', diff saved to https://phabricator.wikimedia.org/P71743 and previous config saved to /var/cache/conftool/dbconfig/20241224-102102-ladsgroup.json
  • 09:39 akosiaris@cumin1002: conftool action : set/weight=10; selector: dc=eqiad,cluster=kubernetes,service=kubemaster,name=wikikube-ctrl1004.eqiad.wmnet
  • 09:39 akosiaris@cumin1002: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=kubernetes,service=kubemaster,name=wikikube-ctrl1004.eqiad.wmnet
  • 05:01 mwpresync@deploy2002: Pruned MediaWiki: 1.44.0-wmf.5 (duration: 01m 21s)

2024-12-23

  • 23:54 wfan: payments-wiki upgraded from 65775042 to 8294c9ec
  • 22:18 zabe@deploy2002: Finished scap sync-world: T382717 (duration: 15m 07s)
  • 22:06 zabe@deploy2002: zabe: Continuing with sync
  • 22:05 zabe@deploy2002: zabe: T382717 synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:03 zabe@deploy2002: Started scap sync-world: T382717
  • 22:03 zabe@deploy2002: Sync cancelled.
  • 22:03 zabe@deploy2002: zabe: T382717 synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:55 zabe@deploy2002: Started scap sync-world: T382717
  • 21:54 zabe@deploy2002: scap failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.44.0-wmf.8 --multiversion-image-name docker-registry.discovery.wmnet/restricted/mediawiki-multiversion --multiversion-debug-image-name docker-registry.discovery.wmnet/restricted/media
  • 21:53 zabe@deploy2002: Started scap sync-world: T382717
  • 21:24 zabe@deploy2002: Started scap sync-world: Backport for Fix Azeri alias lang code (T382717 T381048)
  • 21:04 Emperor: depool/restart/repool ms-fe1013
  • 20:59 mvernon@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=swift,name=codfw
  • 20:37 Emperor: cumin run on swift nodes
  • 20:16 Emperor: weighted ms-be2075 to zero T382705 T382707
  • 19:33 Emperor: restart swift-container-reconciler on ms-be1075
  • 17:22 Emperor: swift delete wikipedia-commons-local-public.88 8/88/Model_4000-First_of_Odakyu_Electric_Railway_2.JPG T382694
  • 16:20 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1031-1033].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 16:20 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1033.eqiad.wmnet with OS bookworm
  • 16:01 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1033.eqiad.wmnet with reason: host reimage
  • 15:56 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1033.eqiad.wmnet with reason: host reimage
  • 15:39 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1033.eqiad.wmnet with OS bookworm
  • 15:37 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1032.eqiad.wmnet with OS bookworm
  • 15:14 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1032.eqiad.wmnet with reason: host reimage
  • 15:11 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1032.eqiad.wmnet with reason: host reimage
  • 14:58 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1028-1030].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 14:58 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1030.eqiad.wmnet with OS bookworm
  • 14:52 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1032.eqiad.wmnet with OS bookworm
  • 14:51 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl1004.eqiad.wmnet with OS bookworm
  • 14:50 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1031.eqiad.wmnet with OS bookworm
  • 14:39 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1030.eqiad.wmnet with reason: host reimage
  • 14:35 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1030.eqiad.wmnet with reason: host reimage
  • 14:32 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1031.eqiad.wmnet with reason: host reimage
  • 14:27 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1031.eqiad.wmnet with reason: host reimage
  • 14:19 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1030.eqiad.wmnet with OS bookworm
  • 14:17 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1029.eqiad.wmnet with OS bookworm
  • 14:11 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1031.eqiad.wmnet with OS bookworm
  • 14:10 mvernon@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=swift,name=codfw
  • 14:10 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1031-1033].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 14:10 Emperor: depool codfw swift T382705
  • 13:58 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1029.eqiad.wmnet with reason: host reimage
  • 13:55 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1029.eqiad.wmnet with reason: host reimage
  • 13:38 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1029.eqiad.wmnet with OS bookworm
  • 13:36 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1028.eqiad.wmnet with OS bookworm
  • 13:34 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1012-1014].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 13:34 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1014.eqiad.wmnet with OS bookworm
  • 13:18 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1028.eqiad.wmnet with reason: host reimage
  • 13:15 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1014.eqiad.wmnet with reason: host reimage
  • 13:14 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1028.eqiad.wmnet with reason: host reimage
  • 13:11 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1014.eqiad.wmnet with reason: host reimage
  • 12:54 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1014.eqiad.wmnet with OS bookworm
  • 12:53 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1013.eqiad.wmnet with OS bookworm
  • 12:34 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1013.eqiad.wmnet with reason: host reimage
  • 12:31 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1028.eqiad.wmnet with OS bookworm
  • 12:31 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1013.eqiad.wmnet with reason: host reimage
  • 12:29 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1028-1030].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 12:27 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1025-1027].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 12:27 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1027.eqiad.wmnet with OS bookworm
  • 12:08 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1027.eqiad.wmnet with reason: host reimage
  • 12:04 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1027.eqiad.wmnet with reason: host reimage
  • 12:03 moritzm: 5558
  • 12:02 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl1004.eqiad.wmnet with reason: host reimage
  • 11:59 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl1004.eqiad.wmnet with reason: host reimage
  • 11:47 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1027.eqiad.wmnet with OS bookworm
  • 11:46 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1026.eqiad.wmnet with OS bookworm
  • 11:40 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1004.eqiad.wmnet with OS bookworm
  • 11:39 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl1004.eqiad.wmnet with OS bookworm
  • 11:37 akosiaris: roll restart of all swift fes in codfw. This seems to have fixed some higher than usual cache_upload error rates. Monitoring.
  • 11:27 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1026.eqiad.wmnet with reason: host reimage
  • 11:25 akosiaris@cumin1002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-codfw
  • 11:24 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1026.eqiad.wmnet with reason: host reimage
  • 11:23 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1004.eqiad.wmnet with OS bookworm
  • 11:23 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl1004.eqiad.wmnet with OS bookworm
  • 11:22 akosiaris@cumin1002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-codfw
  • 11:14 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1004.eqiad.wmnet with OS bookworm
  • 11:07 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl1004.eqiad.wmnet with OS bookworm
  • 11:07 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1026.eqiad.wmnet with OS bookworm
  • 11:05 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1025.eqiad.wmnet with OS bookworm
  • 11:02 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1013.eqiad.wmnet with OS bookworm
  • 11:00 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1012.eqiad.wmnet with OS bookworm
  • 10:45 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1025.eqiad.wmnet with reason: host reimage
  • 10:41 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1012.eqiad.wmnet with reason: host reimage
  • 10:41 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1025.eqiad.wmnet with reason: host reimage
  • 10:38 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1012.eqiad.wmnet with reason: host reimage
  • 10:22 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1025.eqiad.wmnet with OS bookworm
  • 10:21 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1012.eqiad.wmnet with OS bookworm
  • 10:20 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1025-1027].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 10:20 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1012-1014].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 10:04 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp3066.esams.wmnet
  • 10:03 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp3066.esams.wmnet
  • 09:50 Emperor: repool ms-fe2010
  • 09:45 Emperor: depool ms-fe2010 to attempt swap clearance
  • 09:44 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling reboot on P{ms-fe2009*} and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
  • 09:38 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling reboot on P{ms-fe2009*} and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
  • 09:38 moritzm: installing gtk+3.0 security updates
  • 08:39 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1004.eqiad.wmnet with OS bookworm
  • 08:38 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl1004.eqiad.wmnet with OS bookworm
  • 08:15 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1004.eqiad.wmnet with OS bookworm
  • 08:02 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from wikikube-worker1290 to wikikube-ctrl1004
  • 08:02 moritzm: installing libxstream-java security updates
  • 08:02 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-ctrl1004
  • 08:02 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl1004
  • 08:02 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:02 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wikikube-worker1290 to wikikube-ctrl1004 - akosiaris@cumin1002"
  • 07:57 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wikikube-worker1290 to wikikube-ctrl1004 - akosiaris@cumin1002"
  • 07:53 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 07:53 akosiaris@cumin1002: START - Cookbook sre.hosts.rename from wikikube-worker1290 to wikikube-ctrl1004
  • 07:46 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1290.eqiad.wmnet
  • 07:46 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1290.eqiad.wmnet
  • 07:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10 days, 0:00:00 on dbproxy1029.eqiad.wmnet with reason: maintenance
  • 07:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 10 days, 0:00:00 on dbproxy1029.eqiad.wmnet with reason: maintenance
  • 07:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10 days, 0:00:00 on dbproxy1028.eqiad.wmnet with reason: maintenance
  • 07:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 10 days, 0:00:00 on dbproxy1028.eqiad.wmnet with reason: maintenance

2024-12-22

  • 11:43 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-codfw
  • 11:40 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-codfw
  • 10:52 Emperor: restart swift-object on ms-be2082
  • 10:49 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-codfw
  • 10:46 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-codfw

2024-12-21

  • 00:00 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd2004-dev.codfw.wmnet with OS bookworm
  • 00:00 andrew@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - andrew@cumin1002"

2024-12-20

  • 22:12 andrew@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - andrew@cumin1002"
  • 21:52 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd2004-dev.codfw.wmnet with reason: host reimage
  • 21:48 andrew@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd2004-dev.codfw.wmnet with reason: host reimage
  • 21:29 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd2004-dev.codfw.wmnet with OS bookworm
  • 21:24 andrew@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd2004-dev.codfw.wmnet with OS bookworm
  • 21:13 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd2004-dev.codfw.wmnet with OS bookworm
  • 18:43 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71741 and previous config saved to /var/cache/conftool/dbconfig/20241220-184307-root.json
  • 18:35 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1023-1024].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 18:35 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1024.eqiad.wmnet with OS bookworm
  • 18:28 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71739 and previous config saved to /var/cache/conftool/dbconfig/20241220-182801-root.json
  • 18:15 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1024.eqiad.wmnet with reason: host reimage
  • 18:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71738 and previous config saved to /var/cache/conftool/dbconfig/20241220-181256-root.json
  • 18:12 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1024.eqiad.wmnet with reason: host reimage
  • 18:02 xcollazo@deploy2002: Finished deploy [airflow-dags/analytics@7fecc64]: Pickup hotfix for T377852. (duration: 02m 03s)
  • 18:00 xcollazo@deploy2002: Started deploy [airflow-dags/analytics@7fecc64]: Pickup hotfix for T377852.
  • 17:57 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71737 and previous config saved to /var/cache/conftool/dbconfig/20241220-175751-root.json
  • 17:53 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1024.eqiad.wmnet with OS bookworm
  • 17:52 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1023.eqiad.wmnet with OS bookworm
  • 17:42 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71736 and previous config saved to /var/cache/conftool/dbconfig/20241220-174246-root.json
  • 17:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2168', diff saved to https://phabricator.wikimedia.org/P71735 and previous config saved to /var/cache/conftool/dbconfig/20241220-173922-marostegui.json
  • 17:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db2168.codfw.wmnet with reason: maintenance
  • 17:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on db2168.codfw.wmnet with reason: maintenance
  • 17:31 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1023.eqiad.wmnet with reason: host reimage
  • 17:26 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1023.eqiad.wmnet with reason: host reimage
  • 17:08 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1023.eqiad.wmnet with OS bookworm
  • 17:06 kamila@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1023-1024].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 16:48 xcollazo@deploy2002: Finished deploy [airflow-dags/analytics@8c5744d]: Deploying latest analytics Airflow instance DAGs. T377852. (duration: 00m 58s)
  • 16:47 xcollazo@deploy2002: Started deploy [airflow-dags/analytics@8c5744d]: Deploying latest analytics Airflow instance DAGs. T377852.
  • 14:38 btullis@cumin1002: END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) restart masters for Hadoop analytics cluster: Restart of jvm daemons.
  • 14:06 btullis@cumin1002: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop analytics cluster: Restart of jvm daemons.
  • 12:10 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1001-1004].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 12:10 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1004.eqiad.wmnet with OS bookworm
  • 12:08 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1008-1011].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 12:08 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1011.eqiad.wmnet with OS bookworm
  • 11:52 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1004.eqiad.wmnet with reason: host reimage
  • 11:48 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1011.eqiad.wmnet with reason: host reimage
  • 11:47 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1004.eqiad.wmnet with reason: host reimage
  • 11:45 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1011.eqiad.wmnet with reason: host reimage
  • 11:31 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1004.eqiad.wmnet with OS bookworm
  • 11:29 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1003.eqiad.wmnet with OS bookworm
  • 11:28 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1011.eqiad.wmnet with OS bookworm
  • 11:27 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1010.eqiad.wmnet with OS bookworm
  • 11:10 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1003.eqiad.wmnet with reason: host reimage
  • 11:08 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1003.eqiad.wmnet with reason: host reimage
  • 11:07 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1010.eqiad.wmnet with reason: host reimage
  • 11:04 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1010.eqiad.wmnet with reason: host reimage
  • 10:51 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1003.eqiad.wmnet with OS bookworm
  • 10:49 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1002.eqiad.wmnet with OS bookworm
  • 10:46 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1010.eqiad.wmnet with OS bookworm
  • 10:45 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1009.eqiad.wmnet with OS bookworm
  • 10:30 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1002.eqiad.wmnet with reason: host reimage
  • 10:27 fnegri@cumin1002: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database idwikivoyage (T381079)
  • 10:27 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1002.eqiad.wmnet with reason: host reimage
  • 10:27 fnegri@cumin1002: START - Cookbook sre.wikireplicas.add-wiki for database idwikivoyage (T381079)
  • 10:25 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1009.eqiad.wmnet with reason: host reimage
  • 10:22 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1009.eqiad.wmnet with reason: host reimage
  • 10:11 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1002.eqiad.wmnet with OS bookworm
  • 10:09 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1001.eqiad.wmnet with OS bookworm
  • 10:05 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1009.eqiad.wmnet with OS bookworm
  • 10:04 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1008.eqiad.wmnet with OS bookworm
  • 09:48 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1001.eqiad.wmnet with reason: host reimage
  • 09:45 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1008.eqiad.wmnet with reason: host reimage
  • 09:43 moritzm: imported imposm3 0.11.1-1+deb12u1 to apt.wikimedia.org T381565
  • 09:43 moritzm: imported imposm3 0.11.1-1+deb12u1 to apt.wikimedia.org
  • 09:42 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1001.eqiad.wmnet with reason: host reimage
  • 09:41 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1008.eqiad.wmnet with reason: host reimage
  • 09:23 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1008.eqiad.wmnet with OS bookworm
  • 09:23 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1008-1011].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 09:23 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1001.eqiad.wmnet with OS bookworm
  • 09:22 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1001-1004].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 09:21 jayme@cumin1002: END (FAIL) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=1) rolling reimage on P{wikikube-worker[1001-1004].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 09:20 jayme@cumin1002: END (FAIL) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=1) rolling reimage on P{wikikube-worker[1008-1011].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 09:06 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1008-1011].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 09:06 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1001-1004].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 05:37 tchin@deploy2002: Finished deploy [airflow-dags/analytics@bcf6276]: (no justification provided) (duration: 04m 11s)

2024-12-19

  • 23:51 eileen: config revision changed from 404bbbd5 to c447228e
  • away: UTC late deploys done
  • 21:53 tgr@deploy2002: Finished scap sync-world: Backport for Make AuthManagerAutoConfig configuration key more distinctive (T369180), SUL3: Disable more auth providers in the local leg of SUL3 login (T369180), [noop] Update private/readme.php (T369180), Enable $wgWMEStatsBeaconUri (T355837) (duration: 21m 34s)
  • 21:48 tgr@deploy2002: krinkle, tgr: Continuing with sync
  • 21:37 tgr@deploy2002: krinkle, tgr: Backport for Make AuthManagerAutoConfig configuration key more distinctive (T369180), SUL3: Disable more auth providers in the local leg of SUL3 login (T369180), [noop] Update private/readme.php (T369180), Enable $wgWMEStatsBeaconUri (T355837) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:31 tgr@deploy2002: Started scap sync-world: Backport for Make AuthManagerAutoConfig configuration key more distinctive (T369180), SUL3: Disable more auth providers in the local leg of SUL3 login (T369180), [noop] Update private/readme.php (T369180), Enable $wgWMEStatsBeaconUri (T355837)
  • away: deploying PrivateSettings change 95517e85
  • 21:14 urbanecm@deploy2002: Finished scap sync-world: Backport for Set Flow to read-only on phase 1 wikis (T378833), Reader Survey: Undeploy (T378660) (duration: 11m 14s)
  • 21:09 urbanecm@deploy2002: urbanecm, kemayo, dani: Continuing with sync
  • 21:07 urbanecm@deploy2002: urbanecm, kemayo, dani: Backport for Set Flow to read-only on phase 1 wikis (T378833), Reader Survey: Undeploy (T378660) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:03 urbanecm@deploy2002: Started scap sync-world: Backport for Set Flow to read-only on phase 1 wikis (T378833), Reader Survey: Undeploy (T378660)
  • 20:38 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1011.eqiad.wmnet with OS bookworm
  • 19:51 jforrester@deploy2002: Finished deploy [integration/docroot@4701376]: I1ea9f3 for T233089 (duration: 00m 10s)
  • 19:51 jforrester@deploy2002: Started deploy [integration/docroot@4701376]: I1ea9f3 for T233089
  • 19:50 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:50 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: reset dns names for cloudcontrol1011 to newly-assigned ones - cmooney@cumin1002"
  • 19:50 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: reset dns names for cloudcontrol1011 to newly-assigned ones - cmooney@cumin1002"
  • 19:41 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 19:39 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:38 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: reset dns names for cloudcontrol1011 to newly-assigned ones - cmooney@cumin1002"
  • 19:38 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: reset dns names for cloudcontrol1011 to newly-assigned ones - cmooney@cumin1002"
  • 19:35 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 19:32 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephosd2004-dev.codfw.wmnet with OS bullseye
  • 19:18 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcontrol1011.eqiad.wmnet with OS bookworm
  • 19:18 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcontrol1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 19:12 dancy@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.44.0-wmf.8 refs T375667
  • 18:52 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudcontrol1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 18:51 jclark@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:51 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added mgmt for cloudcontrol1011 - jclark@cumin1002"
  • 18:51 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added mgmt for cloudcontrol1011 - jclark@cumin1002"
  • 18:49 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudcontrol1011
  • 18:49 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudcontrol1011
  • 18:48 swfrench@deploy2002: Finished scap sync-world: Backport for maintenance: fix typo in job status logging (T382517), maintenance: fix typo in job status logging (T382517) (duration: 17m 11s)
  • 18:47 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 18:47 andrew@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd2004-dev.codfw.wmnet with OS bookworm
  • 18:42 swfrench@deploy2002: swfrench: Continuing with sync
  • 18:41 swfrench@deploy2002: swfrench: Backport for maintenance: fix typo in job status logging (T382517), maintenance: fix typo in job status logging (T382517) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 18:32 bd808@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
  • 18:32 bd808@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
  • 18:32 bd808@deploy2002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
  • 18:31 bd808@deploy2002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
  • 18:31 swfrench@deploy2002: Started scap sync-world: Backport for maintenance: fix typo in job status logging (T382517), maintenance: fix typo in job status logging (T382517)
  • 18:31 bd808@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
  • 18:31 bd808@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
  • 18:07 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd2004-dev.codfw.wmnet with OS bookworm
  • 18:06 andrew@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd2004-dev.codfw.wmnet with OS bookworm
  • 18:06 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1022].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 18:06 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1022.eqiad.wmnet with OS bookworm
  • 17:46 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1022.eqiad.wmnet with reason: host reimage
  • 17:42 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1022.eqiad.wmnet with reason: host reimage
  • 17:39 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd2004-dev.codfw.wmnet with OS bookworm
  • 17:38 andrew@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd2004-dev.codfw.wmnet with OS bookworm
  • 17:23 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1022.eqiad.wmnet with OS bookworm
  • 17:21 kamila@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1022].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 17:19 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 17:17 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 17:10 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd2004-dev.codfw.wmnet with OS bookworm
  • 17:00 kamila@cumin1002: END (FAIL) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=1) rolling reimage on P{wikikube-worker[1022].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 16:58 kamila@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1022].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 16:54 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic: apply
  • 16:53 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic: apply
  • 16:53 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:52 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:35 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on ripe-atlas-eqsin,ripe-atlas-eqsin IPv6 with reason: Atlas device offline, scheduling reboot
  • 16:35 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on ripe-atlas-eqsin,ripe-atlas-eqsin IPv6 with reason: Atlas device offline, scheduling reboot
  • 16:28 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on ripe-atlas-eqiad,ripe-atlas-eqiad IPv6 with reason: Atlas device offline, scheduling reboot
  • 16:28 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on ripe-atlas-eqiad,ripe-atlas-eqiad IPv6 with reason: Atlas device offline, scheduling reboot
  • 16:15 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1296-1300].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 16:15 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1300.eqiad.wmnet with OS bookworm
  • 16:11 kamila@cumin1002: END (FAIL) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=1) rolling reimage on P{wikikube-worker[1022].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 16:00 kamila@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1022].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 15:55 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1300.eqiad.wmnet with reason: host reimage
  • 15:53 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1300.eqiad.wmnet with reason: host reimage
  • 15:50 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1017-1020].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 15:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1020.eqiad.wmnet with OS bookworm
  • 15:49 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2055-2056].codfw.wmnet
  • 15:49 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2055-2056].codfw.wmnet
  • 15:48 mfossati@deploy2002: Finished deploy [airflow-dags/platform_eng@6ed5237]: SEAL conda env hotfix (duration: 01m 28s)
  • 15:47 mfossati@deploy2002: Started deploy [airflow-dags/platform_eng@6ed5237]: SEAL conda env hotfix
  • 15:39 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2056.codfw.wmnet with OS bookworm
  • 15:34 urbanecm@deploy2002: Finished scap sync-world: Backport for [Growth] Make the typage campaign not specific to 2023 (T380405) (duration: 12m 33s)
  • 15:31 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1020.eqiad.wmnet with reason: host reimage
  • 15:31 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1300.eqiad.wmnet with OS bookworm
  • 15:30 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1299.eqiad.wmnet with OS bookworm
  • 15:26 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1020.eqiad.wmnet with reason: host reimage
  • 15:26 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1301-1304].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 15:26 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1304.eqiad.wmnet with OS bookworm
  • 15:21 urbanecm@deploy2002: Started scap sync-world: Backport for [Growth] Make the typage campaign not specific to 2023 (T380405)
  • 15:19 urbanecm@deploy2002: Started scap sync-world: Backport for [Growth] Make the typage campaign not specific to 2023 (T380405)
  • 15:18 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2056.codfw.wmnet with reason: host reimage
  • 15:17 urbanecm@deploy2002: Finished scap sync-world: Backport for [Growth] Disable Surfacing Add Link tasks on all wikis (T382037) (duration: 17m 00s)
  • 15:14 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2056.codfw.wmnet with reason: host reimage
  • 15:11 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 15:10 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1299.eqiad.wmnet with reason: host reimage
  • 15:10 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1020.eqiad.wmnet with OS bookworm
  • 15:08 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1019.eqiad.wmnet with OS bookworm
  • 15:08 urbanecm@deploy2002: urbanecm: Backport for [Growth] Disable Surfacing Add Link tasks on all wikis (T382037) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:05 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1299.eqiad.wmnet with reason: host reimage
  • 15:05 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1304.eqiad.wmnet with reason: host reimage
  • 15:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2055.codfw.wmnet with OS bookworm
  • 15:01 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1304.eqiad.wmnet with reason: host reimage
  • 15:00 urbanecm@deploy2002: Started scap sync-world: Backport for [Growth] Disable Surfacing Add Link tasks on all wikis (T382037)
  • 14:57 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2056
  • 14:57 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2056
  • 14:57 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2056.codfw.wmnet with OS bookworm
  • 14:56 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2056.codfw.wmnet with OS bookworm
  • 14:56 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1302.eqiad.wmnet with OS bookworm
  • 14:51 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1297.eqiad.wmnet with OS bookworm
  • 14:47 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1019.eqiad.wmnet with reason: host reimage
  • 14:44 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1299.eqiad.wmnet with OS bookworm
  • 14:43 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1019.eqiad.wmnet with reason: host reimage
  • 14:42 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1298.eqiad.wmnet with OS bookworm
  • 14:40 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1304.eqiad.wmnet with OS bookworm
  • 14:39 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2055.codfw.wmnet with reason: host reimage
  • 14:38 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1303.eqiad.wmnet with OS bookworm
  • 14:36 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1302.eqiad.wmnet with reason: host reimage
  • 14:34 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2055.codfw.wmnet with reason: host reimage
  • 14:33 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1297.eqiad.wmnet with reason: host reimage
  • 14:30 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1302.eqiad.wmnet with reason: host reimage
  • 14:29 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1297.eqiad.wmnet with reason: host reimage
  • 14:27 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1019.eqiad.wmnet with OS bookworm
  • 14:25 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1018.eqiad.wmnet with OS bookworm
  • 14:23 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1298.eqiad.wmnet with reason: host reimage
  • 14:20 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1303.eqiad.wmnet with reason: host reimage
  • 14:17 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1298.eqiad.wmnet with reason: host reimage
  • 14:17 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2055
  • 14:17 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2055
  • 14:17 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2055.codfw.wmnet with OS bookworm
  • 14:16 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1303.eqiad.wmnet with reason: host reimage
  • 14:16 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2055.codfw.wmnet with OS bookworm
  • 14:10 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1302.eqiad.wmnet with OS bookworm
  • 14:05 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1018.eqiad.wmnet with reason: host reimage
  • 14:05 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1297.eqiad.wmnet with OS bookworm
  • 14:02 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1018.eqiad.wmnet with reason: host reimage
  • 13:57 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1298.eqiad.wmnet with OS bookworm
  • 13:56 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1303.eqiad.wmnet with OS bookworm
  • 13:45 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1018.eqiad.wmnet with OS bookworm
  • 13:43 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1017.eqiad.wmnet with OS bookworm
  • 13:39 moritzm: installing libsepol security updates
  • 13:23 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1017.eqiad.wmnet with reason: host reimage
  • 13:19 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1017.eqiad.wmnet with reason: host reimage
  • 13:16 fnegri@cumin1002: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database tigwiki (T381378)
  • 13:02 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1017.eqiad.wmnet with OS bookworm
  • 13:01 kamila@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1017-1020].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 12:50 fnegri@cumin1002: START - Cookbook sre.wikireplicas.add-wiki for database tigwiki (T381378)
  • 11:55 moritzm: installing gtk+2.0 security updates
  • 11:50 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1069.eqiad.wmnet
  • 11:48 moritzm: installing distro-info-data updates on bullseye
  • 11:44 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host an-worker1069.eqiad.wmnet
  • 11:44 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1068.eqiad.wmnet
  • 11:43 moritzm: installing gsl security updates
  • 11:36 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host an-worker1068.eqiad.wmnet
  • 11:36 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1067.eqiad.wmnet
  • 11:36 jayme@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1297.eqiad.wmnet with OS bookworm
  • 11:29 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host an-worker1067.eqiad.wmnet
  • 11:29 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1066.eqiad.wmnet
  • 11:28 jayme@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1302.eqiad.wmnet with OS bookworm
  • 11:21 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host an-worker1066.eqiad.wmnet
  • 11:21 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1065.eqiad.wmnet
  • 11:13 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host an-worker1065.eqiad.wmnet
  • 11:02 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2055
  • 11:02 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2055
  • 11:02 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2056
  • 11:02 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2056
  • 11:02 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2055.codfw.wmnet with OS bookworm
  • 11:02 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2056.codfw.wmnet with OS bookworm
  • 11:01 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2055-2056].codfw.wmnet
  • 11:00 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2055-2056].codfw.wmnet
  • 10:59 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2057-2058].codfw.wmnet
  • 10:59 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2057-2058].codfw.wmnet
  • 10:58 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2058.codfw.wmnet with OS bookworm
  • 10:55 btullis@cumin1002: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker1067.eqiad.wmnet
  • 10:54 btullis@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1067.eqiad.wmnet
  • 10:52 moritzm: installing e2fsprogs security updates
  • 10:51 btullis@cumin1002: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99) for hosts an-worker1067.eqiad.wmnet
  • 10:51 btullis@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1067.eqiad.wmnet
  • 10:44 moritzm: restarting slapd on r/w servers to pick up openssl security updates
  • 10:42 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1043.eqiad.wmnet with OS bookworm
  • 10:42 elukey@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1002"
  • 10:40 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2057.codfw.wmnet with OS bookworm
  • 10:39 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2058.codfw.wmnet with reason: host reimage
  • 10:37 jmm@cumin2002: END (PASS) - Cookbook sre.elasticsearch.restart-nginx (exit_code=0) rolling restart_daemons on A:relforge
  • 10:36 jmm@cumin2002: START - Cookbook sre.elasticsearch.restart-nginx rolling restart_daemons on A:relforge
  • 10:36 elukey@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1002"
  • 10:35 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2058.codfw.wmnet with reason: host reimage
  • 10:22 btullis@cumin1002: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker1067.eqiad.wmnet
  • 10:21 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2057.codfw.wmnet with reason: host reimage
  • 10:20 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1302.eqiad.wmnet with OS bookworm
  • 10:19 btullis@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1067.eqiad.wmnet
  • 10:18 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1043.eqiad.wmnet with reason: host reimage
  • 10:18 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2058
  • 10:18 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2058
  • 10:18 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2058.codfw.wmnet with OS bookworm
  • 10:18 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1301.eqiad.wmnet with OS bookworm
  • 10:18 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-worker1067.eqiad.wmnet with OS bullseye
  • 10:18 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1002"
  • 10:18 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2057.codfw.wmnet with reason: host reimage
  • 10:16 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2058.codfw.wmnet with OS bookworm
  • 10:16 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1297.eqiad.wmnet with OS bookworm
  • 10:15 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1002"
  • 10:14 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1043.eqiad.wmnet with reason: host reimage
  • 10:14 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1296.eqiad.wmnet with OS bookworm
  • 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-eqiad
  • 10:08 jmm@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-eqiad
  • 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.wdqs.restart-nginx-envoy (exit_code=0) rolling restart_daemons on A:wdqs-all
  • 10:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2058
  • 10:01 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2058
  • 10:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2057
  • 10:01 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2057
  • 10:00 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1067.eqiad.wmnet with reason: host reimage
  • 10:00 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2058.codfw.wmnet with OS bookworm
  • 10:00 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2057.codfw.wmnet with OS bookworm
  • 09:59 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2057-2058].codfw.wmnet
  • 09:58 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2057-2058].codfw.wmnet
  • 09:58 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1301.eqiad.wmnet with reason: host reimage
  • 09:57 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2059-2060].codfw.wmnet
  • 09:57 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2059-2060].codfw.wmnet
  • 09:55 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS bookworm
  • 09:54 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2059.codfw.wmnet with OS bookworm
  • 09:54 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1296.eqiad.wmnet with reason: host reimage
  • 09:53 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 09:51 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1067.eqiad.wmnet with reason: host reimage
  • 09:50 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1301.eqiad.wmnet with reason: host reimage
  • 09:50 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1296.eqiad.wmnet with reason: host reimage
  • 09:50 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2060.codfw.wmnet with OS bookworm
  • 09:50 elukey@cumin1002: START - Cookbook sre.hosts.provision for host es1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 09:49 jmm@cumin2002: START - Cookbook sre.wdqs.restart-nginx-envoy rolling restart_daemons on A:wdqs-all
  • 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.wdqs.restart-nginx-envoy (exit_code=0) rolling restart_daemons on A:wcqs-public
  • 09:46 jmm@cumin2002: START - Cookbook sre.wdqs.restart-nginx-envoy rolling restart_daemons on A:wcqs-public
  • 09:39 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-worker1067.eqiad.wmnet with OS bullseye
  • 09:39 btullis@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1067.eqiad.wmnet with OS bullseye
  • 09:39 jmm@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-codfw
  • 09:35 jmm@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-codfw
  • 09:35 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2059.codfw.wmnet with reason: host reimage
  • 09:34 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-worker1067.eqiad.wmnet with OS bullseye
  • 09:33 btullis@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1067.eqiad.wmnet with OS bullseye
  • 09:31 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2060.codfw.wmnet with reason: host reimage
  • 09:30 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1301.eqiad.wmnet with OS bookworm
  • 09:30 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1296.eqiad.wmnet with OS bookworm
  • 09:28 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1301-1304].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 09:28 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1296-1300].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 09:28 btullis@cumin1002: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker1069.eqiad.wmnet
  • 09:26 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2059.codfw.wmnet with reason: host reimage
  • 09:26 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2060.codfw.wmnet with reason: host reimage
  • 09:26 btullis@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1069.eqiad.wmnet
  • 09:24 jmm@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling restart_daemons on A:thanos-fe
  • 09:23 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-worker1069.eqiad.wmnet with OS bullseye
  • 09:23 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1002"
  • 09:20 jmm@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling restart_daemons on A:thanos-fe
  • 09:17 kartik@deploy2002: Finished scap sync-world: Backport for Event logging: update schemaId (T364460) (duration: 25m 20s)
  • 09:12 moritzm: upgrading mwdebug* to PHP 1:7.4.33-1+0~20221108.73+debian10~1.gbpa00350a+wmf10u2+icu67u4 T382077
  • 09:09 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling restart_daemons on A:ldap-replicas-eqiad
  • 09:09 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2059
  • 09:09 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2059
  • 09:09 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2060
  • 09:09 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2060
  • 09:09 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2059.codfw.wmnet with OS bookworm
  • 09:09 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2060.codfw.wmnet with OS bookworm
  • 09:08 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling restart_daemons on A:ldap-replicas-eqiad
  • 09:07 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2059-2060].codfw.wmnet
  • 09:07 kartik@deploy2002: kartik, wangombe: Continuing with sync
  • 09:06 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2059-2060].codfw.wmnet
  • 09:03 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2061-2062].codfw.wmnet
  • 09:02 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2061-2062].codfw.wmnet
  • 08:57 kartik@deploy2002: kartik, wangombe: Backport for Event logging: update schemaId (T364460) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling restart_daemons on A:ldap-replicas-codfw
  • 08:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2062.codfw.wmnet with OS bookworm
  • 08:55 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling restart_daemons on A:ldap-replicas-codfw
  • 08:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2061.codfw.wmnet with OS bookworm
  • 08:51 kartik@deploy2002: Started scap sync-world: Backport for Event logging: update schemaId (T364460)
  • 08:37 kartik@deploy2002: Finished scap sync-world: Backport for Event logging: pass empty object to translation property (T364460) (duration: 21m 52s)
  • 08:36 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2062.codfw.wmnet with reason: host reimage
  • 08:32 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2061.codfw.wmnet with reason: host reimage
  • 08:31 kartik@deploy2002: wangombe, kartik: Continuing with sync
  • 08:28 kartik@deploy2002: wangombe, kartik: Backport for Event logging: pass empty object to translation property (T364460) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:27 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2062.codfw.wmnet with reason: host reimage
  • 08:27 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2061.codfw.wmnet with reason: host reimage
  • 08:15 kartik@deploy2002: Started scap sync-world: Backport for Event logging: pass empty object to translation property (T364460)
  • 08:09 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2062
  • 08:09 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2062
  • 08:09 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2061
  • 08:09 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2061
  • 08:08 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2062.codfw.wmnet with OS bookworm
  • 08:08 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2061.codfw.wmnet with OS bookworm
  • 08:07 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2061-2062].codfw.wmnet
  • 08:03 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2061-2062].codfw.wmnet
  • 08:01 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2190.codfw.wmnet
  • 08:01 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2190.codfw.wmnet
  • 08:00 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2190.codfw.wmnet with OS bookworm
  • 07:40 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2190.codfw.wmnet with reason: host reimage
  • 07:37 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2190.codfw.wmnet with reason: host reimage
  • 07:18 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2190
  • 07:18 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2190
  • 07:18 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2190.codfw.wmnet with OS bookworm
  • 02:28 krinkle@deploy2002: Finished deploy [statsv/statsv@2ee86ea]: Add dogstatsd support (duration: 00m 18s)
  • 02:28 krinkle@deploy2002: Started deploy [statsv/statsv@2ee86ea]: Add dogstatsd support
  • 01:48 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1043.eqiad.wmnet with OS bookworm
  • 01:05 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS bookworm

2024-12-18

  • 23:41 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1043.eqiad.wmnet with OS bookworm
  • 22:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS bookworm
  • 21:31 mfossati@deploy2002: Finished deploy [airflow-dags/platform_eng@a43cacf]: bump image suggestions, section topics, and SEAL (duration: 01m 43s)
  • 21:30 mfossati@deploy2002: Started deploy [airflow-dags/platform_eng@a43cacf]: bump image suggestions, section topics, and SEAL
  • 20:44 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic: apply
  • 20:44 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic: apply
  • 20:36 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 20:36 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 20:29 otto@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: sync
  • 20:28 otto@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: sync
  • 20:28 otto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: sync
  • 20:27 otto@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: sync
  • 20:27 otto@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: sync
  • 20:27 otto@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: sync
  • 20:26 ottomata: restarting eventgate-analytics-external to clear schema cache - T382113 | https://phabricator.wikimedia.org/T382113#10414005
  • 19:28 dancy@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.44.0-wmf.8 refs T375667
  • 18:55 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1002"
  • 18:40 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1069.eqiad.wmnet with reason: host reimage
  • 18:37 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1069.eqiad.wmnet with reason: host reimage
  • 18:25 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-worker1069.eqiad.wmnet with OS bullseye
  • 18:25 btullis@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1069.eqiad.wmnet with OS bullseye
  • 18:23 btullis@cumin1002: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker1068.eqiad.wmnet
  • 18:21 btullis@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1068.eqiad.wmnet
  • 18:20 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-worker1069.eqiad.wmnet with OS bullseye
  • 18:18 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-worker1068.eqiad.wmnet with OS bullseye
  • 18:18 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1002"
  • 18:18 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching sessionstore1*.eqiad.wmnet: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 18:16 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1002"
  • 18:16 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-worker1067.eqiad.wmnet with OS bullseye
  • 18:15 btullis@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1067.eqiad.wmnet with OS bullseye
  • 18:13 btullis@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1069.eqiad.wmnet with OS bullseye
  • 18:09 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-worker1067.eqiad.wmnet with OS bullseye
  • 18:05 btullis@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1067.eqiad.wmnet with OS bullseye
  • 18:01 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1068.eqiad.wmnet with reason: host reimage
  • 18:01 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-worker1069.eqiad.wmnet with OS bullseye
  • 18:00 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching sessionstore1*.eqiad.wmnet: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 17:59 btullis@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host an-worker1069
  • 17:58 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1068.eqiad.wmnet with reason: host reimage
  • 17:58 btullis@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host an-worker1069
  • 17:57 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:57 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Re-commissioning an-presto1005 as an-worker1069 - btullis@cumin1002"
  • 17:57 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Re-commissioning an-presto1005 as an-worker1069 - btullis@cumin1002"
  • 17:57 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-worker1067.eqiad.wmnet with OS bullseye
  • 17:55 btullis@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1067.eqiad.wmnet with OS bullseye
  • 17:51 btullis@cumin1002: START - Cookbook sre.dns.netbox
  • 17:50 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching sessionstore2*.codfw.wmnet: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 17:49 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2063-2064].codfw.wmnet
  • 17:49 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2063-2064].codfw.wmnet
  • 17:46 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-worker1068.eqiad.wmnet with OS bullseye
  • 17:41 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-worker1067.eqiad.wmnet with OS bullseye
  • 17:34 btullis@cumin1002: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker1066.eqiad.wmnet
  • 17:32 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching sessionstore2*.codfw.wmnet: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 17:32 btullis@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1066.eqiad.wmnet
  • 17:31 btullis@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host an-worker1068
  • 17:30 btullis@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host an-worker1068
  • 17:28 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2064.codfw.wmnet with OS bookworm
  • 17:27 Emperor: depool, restart, repool ms-fe2009
  • 17:25 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:25 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Re-commissioning an-presto1004 as an-worker1068 - btullis@cumin1002"
  • 17:25 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Re-commissioning an-presto1004 as an-worker1068 - btullis@cumin1002"
  • 17:24 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-worker1066.eqiad.wmnet with OS bullseye
  • 17:24 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1002"
  • 17:24 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2063.codfw.wmnet with OS bookworm
  • 17:21 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1002"
  • 17:19 btullis@cumin1002: START - Cookbook sre.dns.netbox
  • 17:12 btullis@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host an-worker1067
  • 17:12 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1291-1295].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 17:12 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1295.eqiad.wmnet with OS bookworm
  • 17:11 btullis@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host an-worker1067
  • 17:10 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:10 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Re-commissioning an-presto1003 as an-worker1067 - btullis@cumin1002"
  • 17:10 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Re-commissioning an-presto1003 as an-worker1067 - btullis@cumin1002"
  • 17:09 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2064.codfw.wmnet with reason: host reimage
  • 17:08 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1280-1284].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 17:08 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1284.eqiad.wmnet with OS bookworm
  • 17:06 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1066.eqiad.wmnet with reason: host reimage
  • 17:06 btullis@cumin1002: START - Cookbook sre.dns.netbox
  • 17:03 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2063.codfw.wmnet with reason: host reimage
  • 17:02 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1066.eqiad.wmnet with reason: host reimage
  • 17:01 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2064.codfw.wmnet with reason: host reimage
  • 17:00 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2063.codfw.wmnet with reason: host reimage
  • 16:58 btullis@cumin1002: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker1065.eqiad.wmnet
  • 16:57 btullis@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1065.eqiad.wmnet
  • 16:52 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1295.eqiad.wmnet with reason: host reimage
  • 16:50 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-worker1066.eqiad.wmnet with OS bullseye
  • 16:50 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-worker1065.eqiad.wmnet with OS bullseye
  • 16:49 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1002"
  • 16:49 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1295.eqiad.wmnet with reason: host reimage
  • 16:49 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1284.eqiad.wmnet with reason: host reimage
  • 16:45 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1284.eqiad.wmnet with reason: host reimage
  • 16:42 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1002"
  • 16:42 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2063
  • 16:42 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2063
  • 16:42 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2064
  • 16:42 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2064
  • 16:41 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2064.codfw.wmnet with OS bookworm
  • 16:41 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2063.codfw.wmnet with OS bookworm
  • 16:40 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2063-2064].codfw.wmnet
  • 16:39 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2063-2064].codfw.wmnet
  • 16:37 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2065,2067].codfw.wmnet
  • 16:37 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2065,2067].codfw.wmnet
  • 16:34 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2065.codfw.wmnet with OS bookworm
  • 16:29 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2067.codfw.wmnet with OS bookworm
  • 16:28 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1295.eqiad.wmnet with OS bookworm
  • 16:27 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1065.eqiad.wmnet with reason: host reimage
  • 16:27 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1294.eqiad.wmnet with OS bookworm
  • 16:25 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1284.eqiad.wmnet with OS bookworm
  • 16:25 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1065.eqiad.wmnet with reason: host reimage
  • 16:24 btullis@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host an-worker1066
  • 16:23 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1283.eqiad.wmnet with OS bookworm
  • 16:22 btullis@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host an-worker1066
  • 16:22 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:22 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Re-provisioning an-presto1002 and an-worker1066 - btullis@cumin1002"
  • 16:22 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Re-provisioning an-presto1002 and an-worker1066 - btullis@cumin1002"
  • 16:17 btullis@cumin1002: START - Cookbook sre.dns.netbox
  • 16:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2065.codfw.wmnet with reason: host reimage
  • 16:11 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2067.codfw.wmnet with reason: host reimage
  • 16:08 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1294.eqiad.wmnet with reason: host reimage
  • 16:06 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2067.codfw.wmnet with reason: host reimage
  • 16:06 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2065.codfw.wmnet with reason: host reimage
  • 16:04 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1283.eqiad.wmnet with reason: host reimage
  • 16:04 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1294.eqiad.wmnet with reason: host reimage
  • 16:00 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1283.eqiad.wmnet with reason: host reimage
  • 16:00 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-worker1065.eqiad.wmnet with OS bullseye
  • 15:58 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 15:57 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 15:57 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 15:55 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 15:55 cgoubert@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 15:54 cgoubert@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 15:54 btullis@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1065.eqiad.wmnet with OS bullseye
  • 15:54 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1043.eqiad.wmnet with OS bookworm
  • 15:54 cgoubert@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 15:53 cgoubert@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 15:53 cgoubert@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 15:53 cgoubert@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 15:52 cgoubert@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 15:52 cgoubert@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 15:51 cgoubert@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 15:51 cgoubert@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 15:50 cgoubert@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 15:49 cgoubert@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 15:47 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2067
  • 15:47 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2067
  • 15:47 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2065
  • 15:47 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2065
  • 15:46 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2067.codfw.wmnet with OS bookworm
  • 15:46 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2065.codfw.wmnet with OS bookworm
  • 15:45 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2065,2067].codfw.wmnet
  • 15:44 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2065,2067].codfw.wmnet
  • 15:44 gengh@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 15:43 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1294.eqiad.wmnet with OS bookworm
  • 15:43 gengh@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 15:43 gengh@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 15:42 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1293.eqiad.wmnet with OS bookworm
  • 15:41 gengh@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 15:40 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1283.eqiad.wmnet with OS bookworm
  • 15:38 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1282.eqiad.wmnet with OS bookworm
  • 15:38 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2188-2189,2191].codfw.wmnet
  • 15:38 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2188-2189,2191].codfw.wmnet
  • 15:38 gengh@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 15:36 gengh@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 15:34 jelto: homer 'cr*codfw*' commit 'T377877'
  • 15:32 jelto: homer 'lsw1-c3-codfw*' commit 'T377877'
  • 15:31 jelto: homer 'lsw1-c1-codfw*' commit 'T377877'
  • 15:30 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2190.codfw.wmnet with OS bookworm
  • 15:30 gengh@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 15:30 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudcephosd2004-dev
  • 15:30 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudcephosd2004-dev
  • 15:29 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:29 gengh@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 15:29 gengh@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 15:10 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS bookworm
  • 15:06 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1256.eqiad.wmnet with OS bookworm
  • 15:06 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephosd2004-dev.codfw.wmnet with OS bullseye
  • 15:06 kartik@deploy2002: Started scap sync-world: Backport for CX3 Build 0.2.0+20241218 (T380702)
  • 15:05 kartik@deploy2002: Finished scap sync-world: Backport for Translate: Enable message group subscription by default (T372386) (duration: 23m 26s)
  • 14:58 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1293.eqiad.wmnet with OS bookworm
  • 14:57 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:56 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1292.eqiad.wmnet with OS bookworm
  • 14:55 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1282.eqiad.wmnet with OS bookworm
  • 14:55 btullis@cumin1002: START - Cookbook sre.dns.netbox
  • 14:54 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1281.eqiad.wmnet with OS bookworm
  • 14:52 kartik@deploy2002: kartik, abi: Continuing with sync
  • 14:51 moritzm: installing gstreamer1.0 security updates
  • 14:47 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1256.eqiad.wmnet with reason: host reimage
  • 14:46 kartik@deploy2002: kartik, abi: Backport for Translate: Enable message group subscription by default (T372386) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:44 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1256.eqiad.wmnet with reason: host reimage
  • 14:41 kartik@deploy2002: Started scap sync-world: Backport for Translate: Enable message group subscription by default (T372386)
  • 14:37 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1292.eqiad.wmnet with reason: host reimage
  • 14:35 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1281.eqiad.wmnet with reason: host reimage
  • 14:34 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1292.eqiad.wmnet with reason: host reimage
  • 14:31 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1281.eqiad.wmnet with reason: host reimage
  • 14:23 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1256.eqiad.wmnet with OS bookworm
  • 14:18 kartik@deploy2002: Started scap sync-world: Backport for Translate: Enable message group subscription by default (T372386)
  • 14:13 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1292.eqiad.wmnet with OS bookworm
  • 14:11 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1291.eqiad.wmnet with OS bookworm
  • 14:11 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1281.eqiad.wmnet with OS bookworm
  • 14:10 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2190
  • 14:10 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2190
  • 14:10 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2190.codfw.wmnet with OS bookworm
  • 14:09 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2190.codfw.wmnet with OS bookworm
  • 14:09 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1280.eqiad.wmnet with OS bookworm
  • 13:52 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1291.eqiad.wmnet with reason: host reimage
  • 13:49 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1280.eqiad.wmnet with reason: host reimage
  • 13:48 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1291.eqiad.wmnet with reason: host reimage
  • 13:46 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1280.eqiad.wmnet with reason: host reimage
  • 13:44 moritzm: installing jinja2 security updates
  • 13:39 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2191.codfw.wmnet with OS bookworm
  • 13:37 moritzm: installing waitress security updates
  • 13:28 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1291.eqiad.wmnet with OS bookworm
  • 13:26 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1291-1295].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 13:26 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1280.eqiad.wmnet with OS bookworm
  • 13:24 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1280-1284].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 13:21 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1007,1021,1080,1287].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 13:21 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1287.eqiad.wmnet with OS bookworm
  • 13:19 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2191.codfw.wmnet with reason: host reimage
  • 13:17 hnowlan@deploy2002: Started scap sync-world: Rebuild and deploy to test mw-videoscaler integration one last time
  • 13:16 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2191.codfw.wmnet with reason: host reimage
  • 13:12 moritzm: installing curl security updates
  • 13:07 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 13:07 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 13:02 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1287.eqiad.wmnet with reason: host reimage
  • 13:01 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: sync
  • 13:01 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: sync
  • 13:01 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 13:01 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 13:01 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 13:00 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 12:58 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1287.eqiad.wmnet with reason: host reimage
  • 12:57 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2191
  • 12:57 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2191
  • 12:57 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2191
  • 12:57 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2191.codfw.wmnet 172.32.192.10.in-addr.arpa 2.7.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:57 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2191.codfw.wmnet 172.32.192.10.in-addr.arpa 2.7.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:57 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:57 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2191 - jelto@cumin1002"
  • 12:57 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2191 - jelto@cumin1002"
  • 12:57 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 12:57 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 12:53 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:49 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2191
  • 12:49 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2190
  • 12:49 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2190
  • 12:49 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2190
  • 12:49 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2190.codfw.wmnet 171.32.192.10.in-addr.arpa 1.7.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:49 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2190.codfw.wmnet 171.32.192.10.in-addr.arpa 1.7.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:48 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:48 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2190 - jelto@cumin1002"
  • 12:48 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2190 - jelto@cumin1002"
  • 12:45 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:45 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2190
  • 12:45 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2191.codfw.wmnet with OS bookworm
  • 12:45 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2190.codfw.wmnet with OS bookworm
  • 12:41 hashar: Restarted Gerrit at 12:37:08 UTC
  • 12:41 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2189.codfw.wmnet with OS bookworm
  • 12:38 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1287.eqiad.wmnet with OS bookworm
  • 12:37 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2188.codfw.wmnet with OS bookworm
  • 12:36 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1080.eqiad.wmnet with OS bookworm
  • 12:31 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 12:31 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 12:25 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
  • 12:23 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: apply
  • 12:22 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
  • 12:22 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
  • 12:22 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
  • 12:21 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
  • 12:20 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2189.codfw.wmnet with reason: host reimage
  • 12:20 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 12:18 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 12:17 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2188.codfw.wmnet with reason: host reimage
  • 12:16 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2189.codfw.wmnet with reason: host reimage
  • 12:14 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1080.eqiad.wmnet with reason: host reimage
  • 12:10 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2188.codfw.wmnet with reason: host reimage
  • 12:09 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1080.eqiad.wmnet with reason: host reimage
  • 11:57 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2189
  • 11:57 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2189
  • 11:56 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2189
  • 11:56 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2189.codfw.wmnet 170.32.192.10.in-addr.arpa 0.7.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:56 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2189.codfw.wmnet 170.32.192.10.in-addr.arpa 0.7.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:56 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:56 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2189 - jelto@cumin1002"
  • 11:56 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2189 - jelto@cumin1002"
  • 11:52 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:52 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2189
  • 11:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2188
  • 11:52 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2188
  • 11:50 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2188
  • 11:50 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2188.codfw.wmnet 169.32.192.10.in-addr.arpa 9.6.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:50 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2188.codfw.wmnet 169.32.192.10.in-addr.arpa 9.6.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:50 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:50 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2188 - jelto@cumin1002"
  • 11:50 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2188 - jelto@cumin1002"
  • 11:49 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1080.eqiad.wmnet with OS bookworm
  • 11:47 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1021.eqiad.wmnet with OS bookworm
  • 11:46 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:46 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2188
  • 11:46 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2188.codfw.wmnet with OS bookworm
  • 11:45 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2189.codfw.wmnet with OS bookworm
  • 11:41 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2188.codfw.wmnet wikikube-worker2189.codfw.wmnet wikikube-worker2190.codfw.wmnet wikikube-worker2191.codfw.wmnet on all recursors
  • 11:41 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2188.codfw.wmnet wikikube-worker2189.codfw.wmnet wikikube-worker2190.codfw.wmnet wikikube-worker2191.codfw.wmnet on all recursors
  • 11:41 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2039 to wikikube-worker2191
  • 11:40 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2191
  • 11:40 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2191
  • 11:40 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:40 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2039 to wikikube-worker2191 - jelto@cumin1002"
  • 11:39 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2039 to wikikube-worker2191 - jelto@cumin1002"
  • 11:35 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:35 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2039 to wikikube-worker2191
  • 11:34 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2038 to wikikube-worker2190
  • 11:33 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2190
  • 11:32 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2190
  • 11:32 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:32 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2038 to wikikube-worker2190 - jelto@cumin1002"
  • 11:31 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2038 to wikikube-worker2190 - jelto@cumin1002"
  • 11:28 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1021.eqiad.wmnet with reason: host reimage
  • 11:27 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:27 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2038 to wikikube-worker2190
  • 11:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2037 to wikikube-worker2189
  • 11:25 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2189
  • 11:25 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1021.eqiad.wmnet with reason: host reimage
  • 11:24 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2189
  • 11:24 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:24 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2037 to wikikube-worker2189 - jelto@cumin1002"
  • 11:23 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2037 to wikikube-worker2189 - jelto@cumin1002"
  • 11:20 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:19 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2037 to wikikube-worker2189
  • 11:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2036 to wikikube-worker2188
  • 11:15 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2188
  • 11:15 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2188
  • 11:15 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:15 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2036 to wikikube-worker2188 - jelto@cumin1002"
  • 11:14 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2036 to wikikube-worker2188 - jelto@cumin1002"
  • 11:10 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:10 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2036 to wikikube-worker2188
  • 11:07 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1021.eqiad.wmnet with OS bookworm
  • 11:03 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1007.eqiad.wmnet with OS bookworm
  • 10:58 hnowlan@deploy2002: Finished scap sync-world: Rebuild and deploy to test mw-videoscaler integration (duration: 36m 40s)
  • 10:44 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1007.eqiad.wmnet with reason: host reimage
  • 10:41 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1007.eqiad.wmnet with reason: host reimage
  • 10:24 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1007.eqiad.wmnet with OS bookworm
  • 10:22 hnowlan@deploy2002: Started scap sync-world: Rebuild and deploy to test mw-videoscaler integration
  • 10:22 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1007,1021,1080,1287].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 10:13 jayme@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker1290.eqiad.wmnet
  • 10:12 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1290.eqiad.wmnet
  • 10:11 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1275.eqiad.wmnet
  • 10:11 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1275.eqiad.wmnet
  • 10:11 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1050.eqiad.wmnet
  • 10:11 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1050.eqiad.wmnet
  • 10:10 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) check for host wikikube-worker1290.eqiad.wmnet
  • 10:10 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node check for host wikikube-worker1290.eqiad.wmnet
  • 10:09 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) check for host wikikube-worker1275.eqiad.wmnet
  • 10:09 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node check for host wikikube-worker1275.eqiad.wmnet
  • 10:09 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) check for host wikikube-worker1050.eqiad.wmnet
  • 10:09 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node check for host wikikube-worker1050.eqiad.wmnet
  • 10:04 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.rename (exit_code=93) from kubernetes2036 to wikikube-worker2047
  • 10:04 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2036 to wikikube-worker2047
  • 09:59 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[2036-2039].codfw.wmnet
  • 09:57 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[2036-2039].codfw.wmnet
  • 09:54 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2071.codfw.wmnet
  • 09:54 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2071.codfw.wmnet
  • 09:54 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2069.codfw.wmnet
  • 09:54 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2069.codfw.wmnet
  • 09:44 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2068.codfw.wmnet
  • 09:44 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2068.codfw.wmnet
  • 09:43 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2068.codfw.wmnet with OS bookworm
  • 09:40 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2069.codfw.wmnet with OS bookworm
  • 09:28 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 09:23 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2068.codfw.wmnet with reason: host reimage
  • 09:20 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2069.codfw.wmnet with reason: host reimage
  • 09:18 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 09:17 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2068.codfw.wmnet with reason: host reimage
  • 09:17 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2069.codfw.wmnet with reason: host reimage
  • 09:15 Emperor: restart wedged swift stats jobs on ms-fe2009
  • 09:00 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2068
  • 09:00 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2068
  • 08:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2069
  • 08:59 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2069
  • 08:59 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2068.codfw.wmnet with OS bookworm
  • 08:59 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2069.codfw.wmnet with OS bookworm
  • 08:58 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2068-2069].codfw.wmnet
  • 08:57 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2068-2069].codfw.wmnet
  • 08:55 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2070.codfw.wmnet
  • 08:55 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2070.codfw.wmnet
  • 08:54 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2071.codfw.wmnet with OS bookworm
  • 08:51 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2070.codfw.wmnet with OS bookworm
  • 08:35 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2071.codfw.wmnet with reason: host reimage
  • 08:31 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2070.codfw.wmnet with reason: host reimage
  • 08:30 oblivian@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 08:29 oblivian@deploy2002: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 08:28 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2071.codfw.wmnet with reason: host reimage
  • 08:28 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2070.codfw.wmnet with reason: host reimage
  • 08:11 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2071
  • 08:11 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2071
  • 08:11 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2070
  • 08:11 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2070
  • 08:10 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2070.codfw.wmnet with OS bookworm
  • 08:10 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2071.codfw.wmnet with OS bookworm
  • 08:04 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2070-2071].codfw.wmnet
  • 08:03 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2070-2071].codfw.wmnet
  • 05:59 eileen: civicrm upgraded from 8163be0d to 47cddc15
  • 05:11 eileen: civicrm upgraded from 0d7f2866 to 8163be0d
  • 05:01 tstarling@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 05:00 tstarling@deploy2002: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 05:00 tstarling@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 04:59 tstarling@deploy2002: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 04:13 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 04:12 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:32 tstarling@deploy2002: Finished scap sync-world: Backport for Revert "Use PHP type declarations" (T382385) (duration: 14m 35s)
  • 01:27 tstarling@deploy2002: tstarling: Continuing with sync
  • 01:26 tstarling@deploy2002: tstarling: Backport for Revert "Use PHP type declarations" (T382385) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 01:18 tstarling@deploy2002: Started scap sync-world: Backport for Revert "Use PHP type declarations" (T382385)
  • 00:06 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:04 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply

2024-12-17

  • 22:53 dancy@deploy2002: Finished scap sync-world: Testing scap 4.134.0 (duration: 03m 18s)
  • 22:49 dancy@deploy2002: Started scap sync-world: Testing scap 4.134.0
  • 22:49 dancy@deploy2002: Installation of scap version "4.134.0" completed for 2 hosts
  • 22:47 dancy@deploy2002: Installing scap version "4.134.0" for 2 host(s)
  • 22:13 cstone: payments-wiki upgraded from 674dd6cd to 65775042
  • 22:13 ebernhardson@deploy2002: Finished scap sync-world: Backport for Revert^3 "cirrus: Enable mlr-2024 for select wikis" (duration: 14m 23s)
  • 22:07 ebernhardson@deploy2002: ebernhardson: Continuing with sync
  • 22:05 ebernhardson@deploy2002: ebernhardson: Backport for Revert^3 "cirrus: Enable mlr-2024 for select wikis" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:58 ebernhardson@deploy2002: Started scap sync-world: Backport for Revert^3 "cirrus: Enable mlr-2024 for select wikis"
  • 21:41 urbanecm@deploy2002: Finished scap sync-world: Backport for Reader Survey: Partially undeploy (T378660), Enable AutoModerator on azwiki (T382286), uzwiki: Update tagline (T370165) (duration: 12m 14s)
  • 21:36 urbanecm@deploy2002: urbanecm, dani, jsn: Continuing with sync
  • 21:35 urbanecm@deploy2002: urbanecm, dani, jsn: Backport for Reader Survey: Partially undeploy (T378660), Enable AutoModerator on azwiki (T382286), uzwiki: Update tagline (T370165) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:29 urbanecm@deploy2002: Started scap sync-world: Backport for Reader Survey: Partially undeploy (T378660), Enable AutoModerator on azwiki (T382286), uzwiki: Update tagline (T370165)
  • 21:27 urbanecm@deploy2002: Sync cancelled.
  • 21:10 urbanecm@deploy2002: urbanecm, ebernhardson, dani, jsn: Backport for Reader Survey: Partially undeploy (T378660), Enable AutoModerator on azwiki (T382286), Revert "cirrus: Enable mlr-2024 for select wikis" (T377128) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:04 urbanecm@deploy2002: Started scap sync-world: Backport for Reader Survey: Partially undeploy (T378660), Enable AutoModerator on azwiki (T382286), Revert "cirrus: Enable mlr-2024 for select wikis" (T377128)
  • 20:50 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 20:50 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 20:23 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 20:20 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 20:17 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 19:29 dancy@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.44.0-wmf.8 refs T375667
  • 19:09 swfrench@deploy2002: Finished scap sync-world: Deployment to populate remaining migration release files - T377040 (duration: 11m 35s)
  • 18:57 swfrench@deploy2002: Started scap sync-world: Deployment to populate remaining migration release files - T377040
  • 18:42 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 18:42 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 18:41 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 18:41 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 18:40 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 18:40 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 18:39 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 18:39 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 18:37 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
  • 18:37 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
  • 18:36 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
  • 18:36 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
  • 18:34 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 18:34 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 18:32 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 18:32 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 18:25 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1100530, 1100531, 1100532, 1100533 (duration: 22m 25s)
  • 18:20 rzl@deploy2002: rzl: Continuing with sync
  • 18:20 rzl@deploy2002: rzl: https://gerrit.wikimedia.org/r/1100530, 1100531, 1100532, 1100533 synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 18:11 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 18:04 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1100530, 1100531, 1100532, 1100533
  • 17:46 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-eqiad: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 17:30 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 17:30 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 17:30 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 17:30 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 17:02 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudcephosd2004-dev']
  • 17:01 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd2004-dev']
  • 16:58 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephosd2004-dev.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 16:32 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host cloudcephosd2004-dev.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 16:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudcephosd2004-dev to codfw - jhancock@cumin2002"
  • 16:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudcephosd2004-dev to codfw - jhancock@cumin2002"
  • 16:22 swfrench-wmf: deployed shellbox 2024-12-17-061932 for T292322
  • 16:21 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
  • 16:20 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
  • 16:20 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
  • 16:19 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
  • 16:19 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:19 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:18 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
  • 16:18 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
  • 16:18 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 16:17 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
  • 16:17 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
  • 16:17 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
  • 16:16 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
  • 16:12 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2072-2073].codfw.wmnet
  • 16:12 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2072-2073].codfw.wmnet
  • 16:11 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2072.codfw.wmnet with OS bookworm
  • 16:09 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2073.codfw.wmnet with OS bookworm
  • 16:09 brennen@deploy2002: Finished deploy [phabricator/deployment@53251a4]: deploy phab1004 for T382346 (duration: 00m 52s)
  • 16:08 aokoth@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge Update
  • 16:08 aokoth@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge Update
  • 16:08 brennen@deploy2002: Started deploy [phabricator/deployment@53251a4]: deploy phab1004 for T382346
  • 16:03 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
  • 16:03 brennen@deploy2002: Finished deploy [phabricator/deployment@53251a4]: deploy phab2002 for T382346 (duration: 00m 27s)
  • 16:03 brennen@deploy2002: Started deploy [phabricator/deployment@53251a4]: deploy phab2002 for T382346
  • 16:02 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
  • 16:02 aokoth@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phorge Update
  • 16:02 aokoth@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab2002.codfw.wmnet with reason: Phorge Update
  • 16:02 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
  • 16:01 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
  • 16:01 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:00 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:00 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
  • 15:59 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
  • 15:59 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
  • 15:59 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
  • 15:58 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
  • 15:57 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
  • 15:53 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
  • 15:53 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
  • 15:52 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
  • 15:52 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
  • 15:51 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 15:51 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2072.codfw.wmnet with reason: host reimage
  • 15:48 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 15:48 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
  • 15:48 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
  • 15:48 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2073.codfw.wmnet with reason: host reimage
  • 15:47 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
  • 15:47 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
  • 15:47 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
  • 15:46 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2072.codfw.wmnet with reason: host reimage
  • 15:45 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
  • 15:44 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2073.codfw.wmnet with reason: host reimage
  • 15:30 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 15:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2073
  • 15:26 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2073
  • 15:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2072
  • 15:26 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2072
  • 15:26 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2073.codfw.wmnet with OS bookworm
  • 15:26 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2072.codfw.wmnet with OS bookworm
  • 15:23 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2072-2073].codfw.wmnet
  • 15:22 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2072-2073].codfw.wmnet
  • 15:21 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2076-2077].codfw.wmnet
  • 15:21 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2076-2077].codfw.wmnet
  • 15:17 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2076.codfw.wmnet with OS bookworm
  • 15:14 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2077.codfw.wmnet with OS bookworm
  • 14:57 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2076.codfw.wmnet with reason: host reimage
  • 14:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2077.codfw.wmnet with reason: host reimage
  • 14:52 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2076.codfw.wmnet with reason: host reimage
  • 14:52 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2077.codfw.wmnet with reason: host reimage
  • 14:48 dcausse: closing the UTC afternoon backport window
  • 14:47 urbanecm: Run extensions/Flow/maintenance/FlowMoveBoardsToSubpages.php for arwiki cawiki frwiki mediawikiwiki orwiki wawiki wawiktionary wikidatawiki zhwiki (T378829)
  • 14:46 dcausse@deploy2002: Finished scap sync-world: Backport for beta: enable updating link-suggestions from read-mode (T378536) (duration: 18m 26s)
  • 14:39 dcausse@deploy2002: migr, dcausse: Continuing with sync
  • 14:35 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2076
  • 14:35 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2076
  • 14:35 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2077
  • 14:35 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2077
  • 14:34 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2077.codfw.wmnet with OS bookworm
  • 14:34 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2076.codfw.wmnet with OS bookworm
  • 14:34 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 14:34 dcausse@deploy2002: migr, dcausse: Backport for beta: enable updating link-suggestions from read-mode (T378536) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:34 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2076-2077].codfw.wmnet
  • 14:34 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 14:33 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2076-2077].codfw.wmnet
  • 14:32 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2078-2079].codfw.wmnet
  • 14:32 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2078-2079].codfw.wmnet
  • 14:31 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2078.codfw.wmnet with OS bookworm
  • 14:27 dcausse@deploy2002: Started scap sync-world: Backport for beta: enable updating link-suggestions from read-mode (T378536)
  • 14:27 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2079.codfw.wmnet with OS bookworm
  • 14:27 dcausse: T375641: reindexing all EntitySchema pages on testwikidatawiki
  • 14:25 dcausse@deploy2002: Finished scap sync-world: Backport for rdf-streaming-updater: add wdqs udpater streams in event stream config (T374919), cirrussearch: increase shard count for cebwiki_content (T379002) (duration: 20m 07s)
  • 14:20 kartik@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 14:17 dcausse@deploy2002: dcausse: Continuing with sync
  • 14:15 kartik@deploy2002: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 14:14 dcausse@deploy2002: dcausse: Backport for rdf-streaming-updater: add wdqs udpater streams in event stream config (T374919), cirrussearch: increase shard count for cebwiki_content (T379002) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:11 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2078.codfw.wmnet with reason: host reimage
  • 14:08 kartik@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 14:07 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2079.codfw.wmnet with reason: host reimage
  • 14:05 dcausse@deploy2002: Started scap sync-world: Backport for rdf-streaming-updater: add wdqs udpater streams in event stream config (T374919), cirrussearch: increase shard count for cebwiki_content (T379002)
  • 14:04 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2078.codfw.wmnet with reason: host reimage
  • 14:04 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2079.codfw.wmnet with reason: host reimage
  • 14:02 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 14:02 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 13:46 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2079
  • 13:46 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2079
  • 13:46 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2078
  • 13:46 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2078
  • 13:46 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2079.codfw.wmnet with OS bookworm
  • 13:46 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2078.codfw.wmnet with OS bookworm
  • 13:45 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2078-2079].codfw.wmnet
  • 13:41 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2078-2079].codfw.wmnet
  • 13:39 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2080.codfw.wmnet
  • 13:39 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2080.codfw.wmnet
  • 13:39 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2081.codfw.wmnet
  • 13:39 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2081.codfw.wmnet
  • 13:39 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2081.codfw.wmnet with OS bookworm
  • 13:32 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2080.codfw.wmnet with OS bookworm
  • 13:19 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2081.codfw.wmnet with reason: host reimage
  • 13:16 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2080.codfw.wmnet with reason: host reimage
  • 13:13 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2081.codfw.wmnet with reason: host reimage
  • 13:12 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2080.codfw.wmnet with reason: host reimage
  • 12:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2081
  • 12:55 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2081
  • 12:54 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2080
  • 12:54 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2080
  • 12:54 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2081.codfw.wmnet with OS bookworm
  • 12:54 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2080.codfw.wmnet with OS bookworm
  • 12:53 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2080.codfw.wmnet
  • 12:53 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2080.codfw.wmnet
  • 12:53 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2081.codfw.wmnet
  • 12:52 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2081.codfw.wmnet
  • 12:51 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2083.codfw.wmnet
  • 12:51 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2083.codfw.wmnet
  • 12:51 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2082.codfw.wmnet
  • 12:51 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2082.codfw.wmnet
  • 12:50 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2082.codfw.wmnet with OS bookworm
  • 12:47 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2083.codfw.wmnet with OS bookworm
  • 12:31 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2082.codfw.wmnet with reason: host reimage
  • 12:27 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2083.codfw.wmnet with reason: host reimage
  • 12:24 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2082.codfw.wmnet with reason: host reimage
  • 12:23 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2083.codfw.wmnet with reason: host reimage
  • 12:05 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2083
  • 12:05 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2083
  • 12:05 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2082
  • 12:05 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2082
  • 12:05 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2083.codfw.wmnet with OS bookworm
  • 12:05 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2082.codfw.wmnet with OS bookworm
  • 12:03 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2083.codfw.wmnet
  • 12:03 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2083.codfw.wmnet
  • 12:03 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2082.codfw.wmnet
  • 12:02 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2082.codfw.wmnet
  • 11:51 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2087.codfw.wmnet
  • 11:51 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2087.codfw.wmnet
  • 11:51 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2084.codfw.wmnet
  • 11:51 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2084.codfw.wmnet
  • 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling restart_daemons on A:thanos-fe-eqiad
  • 11:47 jmm@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling restart_daemons on A:thanos-fe-eqiad
  • 11:46 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2087.codfw.wmnet with OS bookworm
  • 11:43 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2084.codfw.wmnet with OS bookworm
  • 11:27 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2087.codfw.wmnet with reason: host reimage
  • 11:23 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2084.codfw.wmnet with reason: host reimage
  • 11:22 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2087.codfw.wmnet with reason: host reimage
  • 11:20 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2084.codfw.wmnet with reason: host reimage
  • 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling restart_daemons on A:thanos-fe-codfw
  • 11:12 jmm@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling restart_daemons on A:thanos-fe-codfw
  • 11:04 jmm@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling restart_daemons on A:thanos-fe-codfw
  • 11:02 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2087
  • 11:02 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2087
  • 11:02 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2084
  • 11:02 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2084
  • 10:33 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:15:00 on gitlab1004.wikimedia.org with reason: Test downtime to troubleshoot failed cookbook
  • 10:33 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 0:15:00 on gitlab1004.wikimedia.org with reason: Test downtime to troubleshoot failed cookbook
  • 10:14 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:15:00 on wikikube-worker2084.codfw.wmnet with reason: Test downtime to troubleshoot failed cookbook
  • 10:14 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 0:15:00 on wikikube-worker2084.codfw.wmnet with reason: Test downtime to troubleshoot failed cookbook
  • 09:44 hashar@deploy2002: Finished scap sync-world: Kubernetes cluster was unreachable (timeout) - T375667 (duration: 03m 27s)
  • 09:41 hashar@deploy2002: Started scap sync-world: Kubernetes cluster was unreachable (timeout) - T375667
  • 09:34 hashar@deploy2002: Started scap sync-world: Overnight deployment timed out deploying to Kubernetes as usual - T375667
  • 09:27 dcausse: T378097: reindexing all lexemes
  • 09:18 jmm@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling restart_daemons on A:thanos-fe-codfw
  • 09:16 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 09:15 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 09:15 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 09:14 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 09:13 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 09:13 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 09:09 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2084.codfw.wmnet with OS bookworm
  • 09:08 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2084.codfw.wmnet with OS bookworm
  • 09:00 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2087.codfw.wmnet with OS bookworm
  • 09:00 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2084.codfw.wmnet with OS bookworm
  • 08:56 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2084.codfw.wmnet
  • 08:55 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2084.codfw.wmnet
  • 08:55 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2087.codfw.wmnet
  • 08:54 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2087.codfw.wmnet
  • 08:52 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2093.codfw.wmnet
  • 08:52 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2093.codfw.wmnet
  • 08:52 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2092.codfw.wmnet
  • 08:52 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2092.codfw.wmnet
  • 08:46 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2092.codfw.wmnet with OS bookworm
  • 08:43 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2093.codfw.wmnet with OS bookworm
  • 08:37 kartik@deploy2002: Finished scap sync-world: Backport for Enable the Contribute menu in 5th group of wikis (T380928) (duration: 33m 05s)
  • 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.wdqs.restart-nginx-envoy (exit_code=0) rolling restart_daemons on A:wdqs-all
  • 08:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2092.codfw.wmnet with reason: host reimage
  • 08:23 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2093.codfw.wmnet with reason: host reimage
  • 08:23 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2092.codfw.wmnet with reason: host reimage
  • 08:20 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2093.codfw.wmnet with reason: host reimage
  • 08:19 jmm@cumin2002: START - Cookbook sre.wdqs.restart-nginx-envoy rolling restart_daemons on A:wdqs-all
  • 08:19 kartik@deploy2002: kartik: Continuing with sync
  • 08:18 kartik@deploy2002: kartik: Backport for Enable the Contribute menu in 5th group of wikis (T380928) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:17 jmm@cumin2002: END (PASS) - Cookbook sre.wdqs.restart-nginx-envoy (exit_code=0) rolling restart_daemons on A:wcqs-public
  • 08:16 jmm@cumin2002: START - Cookbook sre.wdqs.restart-nginx-envoy rolling restart_daemons on A:wcqs-public
  • 08:04 kartik@deploy2002: Started scap sync-world: Backport for Enable the Contribute menu in 5th group of wikis (T380928)
  • 08:03 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2093
  • 08:03 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2093
  • 08:03 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2092
  • 08:03 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2092
  • 08:03 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2093.codfw.wmnet with OS bookworm
  • 08:02 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2092.codfw.wmnet with OS bookworm
  • 08:01 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2092.codfw.wmnet
  • 08:01 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2092.codfw.wmnet
  • 08:00 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2093.codfw.wmnet
  • 08:00 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2093.codfw.wmnet
  • 07:53 moritzm: installing expat security updates
  • 04:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.44.0-wmf.8 refs T375667
  • 01:33 swfrench-wmf: deployed shellbox-video to pick up config change for T292322
  • 01:30 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
  • 01:30 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
  • 01:27 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
  • 01:27 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
  • 01:23 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
  • 01:23 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply

2024-12-16

  • 23:29 tstarling@deploy2002: Finished scap sync-world: Backport for Enable canShellboxGetTempUrl on testwiki (T292322) (duration: 12m 00s)
  • 23:24 tstarling@deploy2002: tstarling: Continuing with sync
  • 23:23 tstarling@deploy2002: tstarling: Backport for Enable canShellboxGetTempUrl on testwiki (T292322) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 23:17 tstarling@deploy2002: Started scap sync-world: Backport for Enable canShellboxGetTempUrl on testwiki (T292322)
  • 22:22 ryankemper@cumin2002: conftool action : set/pooled=yes:weight=10; selector: cluster=wdqs-internal-main,service=wdqs-main
  • 21:36 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1012.eqiad.wmnet with OS bullseye
  • 21:36 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
  • 21:33 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
  • 21:24 cjming: end of UTC late backport window
  • 21:23 cjming@deploy2002: Finished scap sync-world: Backport for Update VisualEditor config to drop exclusions based on Flow (T224851) (duration: 11m 56s)
  • 21:19 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
  • 21:17 cjming@deploy2002: cjming, pppery: Continuing with sync
  • 21:16 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
  • 21:15 cjming@deploy2002: cjming, pppery: Backport for Update VisualEditor config to drop exclusions based on Flow (T224851) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:11 cjming@deploy2002: Started scap sync-world: Backport for Update VisualEditor config to drop exclusions based on Flow (T224851)
  • 21:05 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS bullseye
  • 19:36 joal@deploy2002: Finished deploy [airflow-dags/analytics@afda9d9]: Airflow analytics backfill deploy [airflow-dags/analytics@afda9d9a] (duration: 02m 58s)
  • 19:33 joal@deploy2002: Started deploy [airflow-dags/analytics@afda9d9]: Airflow analytics backfill deploy [airflow-dags/analytics@afda9d9a]
  • 19:16 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase1028.eqiad.wmnet: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 19:07 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase1028.eqiad.wmnet: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 19:05 swfrench@deploy2002: Finished scap sync-world: T353817 - Additional deployment to clear remaining diffs (duration: 02m 51s)
  • 19:03 swfrench@deploy2002: Started scap sync-world: T353817 - Additional deployment to clear remaining diffs
  • 18:37 otto@deploy2002: otto: Continuing with sync
  • 18:35 otto@deploy2002: otto: T353817 - Apache rewrite mediawiki.org/beacon/event to /beacon/event/index.php synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 18:31 otto@deploy2002: Started scap sync-world: T353817 - Apache rewrite mediawiki.org/beacon/event to /beacon/event/index.php
  • 17:46 dancy@deploy2002: Installation of scap version "4.133.0" completed for 1 hosts
  • 17:45 dancy@deploy2002: Installing scap version "4.133.0" for 1 host(s)
  • 17:41 dancy@deploy2002: Installing scap version "4.133.0" for 213 host(s)
  • 17:22 swfrench@deploy2002: Finished scap sync-world: Deployment to pick up debug image changes - T381473 (duration: 06m 49s)
  • 17:21 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 17:21 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 17:15 swfrench@deploy2002: Started scap sync-world: Deployment to pick up debug image changes - T381473
  • 16:47 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 02m 25s)
  • 16:44 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 09m 25s)
  • 16:37 moritzm: installing ipmitool bugfix updates from Bookworm point release
  • 16:05 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 16:05 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 16:04 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 16:04 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 16:03 hnowlan@deploy2002: Finished scap sync-world: Rebuild and deploy to pick up new php8.1 base (duration: 23m 06s)
  • 16:00 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 16:00 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 15:54 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2095.codfw.wmnet
  • 15:54 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2095.codfw.wmnet
  • 15:54 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2094.codfw.wmnet
  • 15:54 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2094.codfw.wmnet
  • 15:54 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2095.codfw.wmnet with OS bookworm
  • 15:51 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 15:51 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2094.codfw.wmnet with OS bookworm
  • 15:51 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 15:42 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudelastic1012.eqiad.wmnet with OS bullseye
  • 15:41 hnowlan@deploy2002: Started scap sync-world: Rebuild and deploy to pick up new php8.1 base
  • 15:34 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2095.codfw.wmnet with reason: host reimage
  • 15:32 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2094.codfw.wmnet with reason: host reimage
  • 15:26 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2095.codfw.wmnet with reason: host reimage
  • 15:25 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2094.codfw.wmnet with reason: host reimage
  • 15:21 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 15:21 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 15:18 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 15:18 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 15:15 ladsgroup@deploy2002: Finished scap sync-world: Backport for Kick bundlesize out of package.json (T382192 T360590), fix(surfacing): Show highlights in lists as well (T381841), stats(surfacing): track link recommendation api recommendations (T378536) (duration: 11m 30s)
  • 15:10 ladsgroup@deploy2002: migr, ladsgroup: Continuing with sync
  • 15:09 ladsgroup@deploy2002: migr, ladsgroup: Backport for Kick bundlesize out of package.json (T382192 T360590), fix(surfacing): Show highlights in lists as well (T381841), stats(surfacing): track link recommendation api recommendations (T378536) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:07 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2095
  • 15:07 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2095
  • 15:06 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2094
  • 15:06 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2094
  • 15:06 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2095.codfw.wmnet with OS bookworm
  • 15:06 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2094.codfw.wmnet with OS bookworm
  • 15:04 ladsgroup@deploy2002: Started scap sync-world: Backport for Kick bundlesize out of package.json (T382192 T360590), fix(surfacing): Show highlights in lists as well (T381841), stats(surfacing): track link recommendation api recommendations (T378536)
  • 15:02 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 15:02 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2094.codfw.wmnet
  • 14:59 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2095.codfw.wmnet
  • 14:58 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2094.codfw.wmnet
  • 14:58 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2095.codfw.wmnet
  • 14:57 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2097.codfw.wmnet
  • 14:57 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2097.codfw.wmnet
  • 14:57 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2096.codfw.wmnet
  • 14:57 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2096.codfw.wmnet
  • 14:56 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2096.codfw.wmnet with OS bookworm
  • 14:54 kartik@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 14:54 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2097.codfw.wmnet with OS bookworm
  • 14:52 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 14:51 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 14:49 kartik@deploy2002: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 14:41 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 14:38 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:37 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:37 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2096.codfw.wmnet with reason: host reimage
  • 14:33 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2097.codfw.wmnet with reason: host reimage
  • 14:30 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2096.codfw.wmnet with reason: host reimage
  • 14:30 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2097.codfw.wmnet with reason: host reimage
  • 14:22 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS bullseye
  • 14:20 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 14:13 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2096
  • 14:13 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2096
  • 14:12 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2097
  • 14:12 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2097
  • 14:12 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2096.codfw.wmnet with OS bookworm
  • 14:12 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2097.codfw.wmnet with OS bookworm
  • 14:12 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2096.codfw.wmnet
  • 14:11 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2096.codfw.wmnet
  • 14:11 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2097.codfw.wmnet
  • 14:10 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2097.codfw.wmnet
  • 14:10 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 14:09 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Exclude autopromotion of temp IP viewer for users with specific global groups (T377929) (duration: 10m 05s)
  • 14:09 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2099.codfw.wmnet
  • 14:09 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2099.codfw.wmnet
  • 14:09 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2098.codfw.wmnet
  • 14:08 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2098.codfw.wmnet
  • 14:07 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2099.codfw.wmnet with OS bookworm
  • 14:03 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 14:03 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2098.codfw.wmnet with OS bookworm
  • 14:03 dreamyjazz@deploy2002: dreamyjazz: Backport for Exclude autopromotion of temp IP viewer for users with specific global groups (T377929) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:01 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 13:59 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 13:59 dreamyjazz@deploy2002: Started scap sync-world: Backport for Exclude autopromotion of temp IP viewer for users with specific global groups (T377929)
  • 13:58 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 13:56 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 13:46 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2099.codfw.wmnet with reason: host reimage
  • 13:42 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2098.codfw.wmnet with reason: host reimage
  • 13:39 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2099.codfw.wmnet with reason: host reimage
  • 13:39 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2098.codfw.wmnet with reason: host reimage
  • 13:26 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 13:25 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 13:25 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 13:25 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 13:24 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 13:24 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:22 hnowlan: imported packages for mercurius 1.0.3 via reprepro
  • 13:21 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2099
  • 13:21 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2098
  • 13:21 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2098
  • 13:21 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2099
  • 13:20 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2098.codfw.wmnet with OS bookworm
  • 13:20 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2099.codfw.wmnet with OS bookworm
  • 13:20 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2098.codfw.wmnet
  • 13:19 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2098.codfw.wmnet
  • 13:19 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2099.codfw.wmnet
  • 13:19 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2099.codfw.wmnet
  • 13:17 dcausse@deploy2002: Finished deploy [airflow-dags/search@c84bfa9]: search: add graph_name filtering (duration: 00m 30s)
  • 13:16 dcausse@deploy2002: Started deploy [airflow-dags/search@c84bfa9]: search: add graph_name filtering
  • 13:06 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2100.codfw.wmnet
  • 13:06 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2100.codfw.wmnet
  • 13:06 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2101.codfw.wmnet
  • 13:06 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2101.codfw.wmnet
  • 12:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2101.codfw.wmnet with OS bookworm
  • 12:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2100.codfw.wmnet with OS bookworm
  • 12:38 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2101.codfw.wmnet with reason: host reimage
  • 12:35 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2100.codfw.wmnet with reason: host reimage
  • 12:32 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2101.codfw.wmnet with reason: host reimage
  • 12:31 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2100.codfw.wmnet with reason: host reimage
  • 12:14 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2101
  • 12:14 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2101
  • 12:14 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2100
  • 12:14 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2100
  • 12:13 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2101.codfw.wmnet with OS bookworm
  • 12:13 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2100.codfw.wmnet with OS bookworm
  • 12:12 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2100.codfw.wmnet
  • 12:12 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2100.codfw.wmnet
  • 12:12 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2101.codfw.wmnet
  • 12:11 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2101.codfw.wmnet
  • 11:14 moritzm: installing NSS security updates
  • 10:57 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 10:57 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 10:56 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2102.codfw.wmnet
  • 10:56 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2102.codfw.wmnet
  • 10:56 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2107.codfw.wmnet
  • 10:56 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2107.codfw.wmnet
  • 10:50 moritzm: installing postgresql-15 security updates
  • 09:38 hashar: UTC morning backport window has been completed
  • 09:32 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2102.codfw.wmnet with OS bookworm
  • 09:29 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2107.codfw.wmnet with OS bookworm
  • 09:22 hashar@deploy2002: Finished scap sync-world: Backport for [arwikisource] Enable the SandboxLink extension (T382218) (duration: 10m 23s)
  • 09:16 hashar@deploy2002: hashar, hubaishan: Continuing with sync
  • 09:16 hashar@deploy2002: hashar, hubaishan: Backport for [arwikisource] Enable the SandboxLink extension (T382218) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:13 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2102.codfw.wmnet with reason: host reimage
  • 09:12 hashar@deploy2002: Started scap sync-world: Backport for [arwikisource] Enable the SandboxLink extension (T382218)
  • 09:09 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2107.codfw.wmnet with reason: host reimage
  • 09:07 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2102.codfw.wmnet with reason: host reimage
  • 09:06 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2107.codfw.wmnet with reason: host reimage
  • 09:04 hashar@deploy2002: Finished scap sync-world: Backport for Update interwiki cache (T381379) (duration: 10m 36s)
  • 08:58 hashar@deploy2002: hashar: Continuing with sync
  • 08:57 hashar@deploy2002: hashar: Backport for Update interwiki cache (T381379) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:53 hashar@deploy2002: Started scap sync-world: Backport for Update interwiki cache (T381379)
  • 08:48 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2102
  • 08:48 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2102
  • 08:48 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2107
  • 08:48 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2107
  • 08:48 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2107.codfw.wmnet with OS bookworm
  • 08:48 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2102.codfw.wmnet with OS bookworm
  • 08:47 hashar@deploy2002: Finished scap sync-world: Backport for tigwiki: add logos (T381379), tigwiki: add SITENAME, timezone and projectnamespace (T381379) (duration: 11m 37s)
  • 08:45 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2102.codfw.wmnet
  • 08:45 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2102.codfw.wmnet
  • 08:44 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2107.codfw.wmnet
  • 08:44 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2107.codfw.wmnet
  • 08:41 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2109.codfw.wmnet
  • 08:41 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2109.codfw.wmnet
  • 08:40 hashar@deploy2002: anzx, hashar: Continuing with sync
  • 08:40 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2108.codfw.wmnet
  • 08:40 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2108.codfw.wmnet
  • 08:39 hashar@deploy2002: anzx, hashar: Backport for tigwiki: add logos (T381379), tigwiki: add SITENAME, timezone and projectnamespace (T381379) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:35 hashar@deploy2002: Started scap sync-world: Backport for tigwiki: add logos (T381379), tigwiki: add SITENAME, timezone and projectnamespace (T381379)
  • 08:34 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2108.codfw.wmnet with OS bookworm
  • 08:32 hashar@deploy2002: Finished scap sync-world: Backport for [enwikinews] & [hewikinews] & [plwikinews]: Upgrade license to CC BY 4.0 (T381421) (duration: 23m 51s)
  • 08:28 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2109.codfw.wmnet with OS bookworm
  • 08:28 dcausse: restarting blazegraph on wdqs2017 (BlazegraphFreeAllocatorsDecreasingRapidly)
  • 08:22 hashar@deploy2002: hashar, anwon: Continuing with sync
  • 08:21 hashar@deploy2002: hashar, anwon: Backport for [enwikinews] & [hewikinews] & [plwikinews]: Upgrade license to CC BY 4.0 (T381421) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:12 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2108.codfw.wmnet with reason: host reimage
  • 08:09 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2109.codfw.wmnet with reason: host reimage
  • 08:08 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2108.codfw.wmnet with reason: host reimage
  • 08:08 hashar@deploy2002: Started scap sync-world: Backport for [enwikinews] & [hewikinews] & [plwikinews]: Upgrade license to CC BY 4.0 (T381421)
  • 08:06 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2109.codfw.wmnet with reason: host reimage
  • 07:48 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2108
  • 07:48 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2108
  • 07:48 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2109
  • 07:48 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2109
  • 07:48 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2109.codfw.wmnet with OS bookworm
  • 07:48 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2108.codfw.wmnet with OS bookworm
  • 07:45 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2108.codfw.wmnet
  • 07:44 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2108.codfw.wmnet
  • 07:44 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2109.codfw.wmnet
  • 07:43 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2109.codfw.wmnet

2024-12-14

  • 12:46 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 12:41 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync

2024-12-13

  • 20:19 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs1025.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 19:28 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudelastic1011.eqiad.wmnet with OS bullseye
  • 19:27 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs1025.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 19:21 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wdqs1025.eqiad.wmnet with reason: T376150
  • 19:21 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wdqs1025.eqiad.wmnet with reason: T376150
  • 19:10 mstyles@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 19:09 mstyles@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 19:09 mstyles@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 19:09 mstyles@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 19:09 mstyles@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 19:09 mstyles@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 19:08 mstyles@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 19:08 mstyles@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 19:08 mstyles@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 19:07 mstyles@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 18:26 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1011.eqiad.wmnet with OS bullseye
  • 18:25 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudelastic1011.eqiad.wmnet with OS bullseye
  • 17:56 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2113.codfw.wmnet
  • 17:56 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2113.codfw.wmnet
  • 17:56 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2112.codfw.wmnet
  • 17:56 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2112.codfw.wmnet
  • 17:54 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2111.codfw.wmnet
  • 17:54 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2111.codfw.wmnet
  • 17:54 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2110.codfw.wmnet
  • 17:54 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2110.codfw.wmnet
  • 17:43 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2110.codfw.wmnet with OS bookworm
  • 17:41 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2111.codfw.wmnet with OS bookworm
  • 17:26 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 17:24 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2110.codfw.wmnet with reason: host reimage
  • 17:21 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 17:20 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2111.codfw.wmnet with reason: host reimage
  • 17:18 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1011.eqiad.wmnet with OS bullseye
  • 17:17 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2110.codfw.wmnet with reason: host reimage
  • 17:16 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2111.codfw.wmnet with reason: host reimage
  • 17:16 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 17:11 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 17:10 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 17:05 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 17:04 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 17:02 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 17:00 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2110
  • 16:59 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2110
  • 16:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2111
  • 16:59 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2111
  • 16:59 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2110.codfw.wmnet with OS bookworm
  • 16:59 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2111.codfw.wmnet with OS bookworm
  • 16:58 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:58 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2110.codfw.wmnet
  • 16:57 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2110.codfw.wmnet
  • 16:57 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:56 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2111.codfw.wmnet
  • 16:56 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2111.codfw.wmnet
  • 16:54 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:50 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:48 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:47 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:45 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:41 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:39 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:36 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:35 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:35 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:34 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:32 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:31 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:31 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:30 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:29 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:19 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:14 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2112.codfw.wmnet with OS bookworm
  • 16:08 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2113.codfw.wmnet with OS bookworm
  • 15:59 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 15:57 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 15:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2112.codfw.wmnet with reason: host reimage
  • 15:49 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2113.codfw.wmnet with reason: host reimage
  • 15:45 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2112.codfw.wmnet with reason: host reimage
  • 15:45 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2113.codfw.wmnet with reason: host reimage
  • 15:28 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2113
  • 15:28 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2113
  • 15:28 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2112
  • 15:28 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2112
  • 15:28 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2113.codfw.wmnet with OS bookworm
  • 15:28 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2112.codfw.wmnet with OS bookworm
  • 15:27 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2112.codfw.wmnet
  • 15:26 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2112.codfw.wmnet
  • 15:26 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2113.codfw.wmnet
  • 15:26 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2113.codfw.wmnet
  • 15:18 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2114.codfw.wmnet
  • 15:18 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2114.codfw.wmnet
  • 15:17 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2115.codfw.wmnet
  • 15:17 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2115.codfw.wmnet
  • 15:16 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2115.codfw.wmnet with OS bookworm
  • 15:12 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2114.codfw.wmnet with OS bookworm
  • 14:56 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2115.codfw.wmnet with reason: host reimage
  • 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host build2002.codfw.wmnet with OS bookworm
  • 14:54 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[2007-2010].codfw.wmnet} and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
  • 14:54 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2010.codfw.wmnet with OS bookworm
  • 14:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2114.codfw.wmnet with reason: host reimage
  • 14:49 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2115.codfw.wmnet with reason: host reimage
  • 14:48 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2114.codfw.wmnet with reason: host reimage
  • 14:35 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2010.codfw.wmnet with reason: host reimage
  • 14:31 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2010.codfw.wmnet with reason: host reimage
  • 14:31 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 14:30 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2115
  • 14:30 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2115
  • 14:29 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2114
  • 14:29 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2114
  • 14:29 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2115.codfw.wmnet with OS bookworm
  • 14:29 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2114.codfw.wmnet with OS bookworm
  • 14:29 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 14:28 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 14:27 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2114.codfw.wmnet
  • 14:26 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2114.codfw.wmnet
  • 14:26 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 14:26 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2115.codfw.wmnet
  • 14:25 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2115.codfw.wmnet
  • 14:25 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 14:21 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2116.codfw.wmnet
  • 14:21 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2116.codfw.wmnet
  • 14:21 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2117.codfw.wmnet
  • 14:21 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2117.codfw.wmnet
  • 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on build2002.codfw.wmnet with reason: host reimage
  • 14:15 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 14:14 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2010
  • 14:14 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2010
  • 14:13 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2010.codfw.wmnet with OS bookworm
  • 14:13 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 14:13 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on build2002.codfw.wmnet with reason: host reimage
  • 14:12 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2009.codfw.wmnet with OS bookworm
  • 14:03 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 13:58 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 13:56 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host build2002.codfw.wmnet with OS bookworm
  • 13:52 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM build2002.codfw.wmnet
  • 13:49 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2009.codfw.wmnet with reason: host reimage
  • 13:48 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM build2002.codfw.wmnet
  • 13:48 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 13:45 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2009.codfw.wmnet with reason: host reimage
  • 13:38 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 13:28 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 13:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2116.codfw.wmnet with OS bookworm
  • 13:23 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2117.codfw.wmnet with OS bookworm
  • 13:06 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2116.codfw.wmnet with reason: host reimage
  • 13:03 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2117.codfw.wmnet with reason: host reimage
  • 13:01 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2116.codfw.wmnet with reason: host reimage
  • 13:00 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2117.codfw.wmnet with reason: host reimage
  • 12:43 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2116
  • 12:43 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2116
  • 12:43 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2117
  • 12:43 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2117
  • 12:43 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2117.codfw.wmnet with OS bookworm
  • 12:43 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2116.codfw.wmnet with OS bookworm
  • 12:34 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2116.codfw.wmnet
  • 12:33 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2116.codfw.wmnet
  • 12:31 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2117.codfw.wmnet
  • 12:30 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2009
  • 12:30 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2009
  • 12:30 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2009.codfw.wmnet with OS bookworm
  • 12:28 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2008.codfw.wmnet with OS bookworm
  • 12:27 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2117.codfw.wmnet
  • 12:24 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2119.codfw.wmnet
  • 12:24 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2119.codfw.wmnet
  • 12:24 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2118.codfw.wmnet
  • 12:24 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2118.codfw.wmnet
  • 12:22 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2118.codfw.wmnet with OS bookworm
  • 12:17 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2119.codfw.wmnet with OS bookworm
  • 12:09 moritzm: bump build2002 to 400G T379343
  • 12:08 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2008.codfw.wmnet with reason: host reimage
  • 12:06 aokoth@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Update
  • 12:05 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2008.codfw.wmnet with reason: host reimage
  • 12:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2118.codfw.wmnet with reason: host reimage
  • 11:57 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2119.codfw.wmnet with reason: host reimage
  • 11:54 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2118.codfw.wmnet with reason: host reimage
  • 11:54 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2119.codfw.wmnet with reason: host reimage
  • 11:48 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2008
  • 11:48 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2008
  • 11:47 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2008.codfw.wmnet with OS bookworm
  • 11:46 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2007.codfw.wmnet with OS bookworm
  • 11:36 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2118
  • 11:36 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2118
  • 11:36 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2119
  • 11:36 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2119
  • 11:36 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2118.codfw.wmnet with OS bookworm
  • 11:36 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2119.codfw.wmnet with OS bookworm
  • 11:32 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2118.codfw.wmnet
  • 11:32 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2118.codfw.wmnet
  • 11:31 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2121.codfw.wmnet
  • 11:31 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2121.codfw.wmnet
  • 11:31 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2120.codfw.wmnet
  • 11:31 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2120.codfw.wmnet
  • 11:31 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2119.codfw.wmnet
  • 11:30 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2119.codfw.wmnet
  • 11:26 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2007.codfw.wmnet with reason: host reimage
  • 11:23 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2007.codfw.wmnet with reason: host reimage
  • 11:06 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2007
  • 11:06 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2007
  • 11:05 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2007.codfw.wmnet with OS bookworm
  • 11:04 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[2007-2010].codfw.wmnet} and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
  • 11:00 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[2001-2004].codfw.wmnet} and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
  • 11:00 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2004.codfw.wmnet with OS bookworm
  • 10:57 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2120.codfw.wmnet with OS bookworm
  • 10:53 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2121.codfw.wmnet with OS bookworm
  • 10:40 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2004.codfw.wmnet with reason: host reimage
  • 10:37 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2120.codfw.wmnet with reason: host reimage
  • 10:34 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2121.codfw.wmnet with reason: host reimage
  • 10:30 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2004.codfw.wmnet with reason: host reimage
  • 10:30 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2120.codfw.wmnet with reason: host reimage
  • 10:30 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2121.codfw.wmnet with reason: host reimage
  • 10:13 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2004
  • 10:13 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2004
  • 10:13 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2004.codfw.wmnet with OS bookworm
  • 10:12 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2120
  • 10:12 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2120
  • 10:12 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2121
  • 10:12 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2121
  • 10:12 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2120.codfw.wmnet with OS bookworm
  • 10:12 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2121.codfw.wmnet with OS bookworm
  • 10:11 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2003.codfw.wmnet with OS bookworm
  • 10:10 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2120.codfw.wmnet
  • 10:10 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2120.codfw.wmnet
  • 10:08 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2121.codfw.wmnet
  • 10:07 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2121.codfw.wmnet
  • 09:57 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2122.codfw.wmnet
  • 09:57 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2122.codfw.wmnet
  • 09:56 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2123.codfw.wmnet
  • 09:56 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2123.codfw.wmnet
  • 09:50 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2003.codfw.wmnet with reason: host reimage
  • 09:50 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2123.codfw.wmnet with OS bookworm
  • 09:47 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2003.codfw.wmnet with reason: host reimage
  • 09:45 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2122.codfw.wmnet with OS bookworm
  • 09:42 Emperor: depool/restart swift/repool ms-fe1014
  • 09:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti2018.codfw.wmnet
  • 09:36 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:36 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2018.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 09:36 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2018.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 09:32 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 09:29 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2123.codfw.wmnet with reason: host reimage
  • 09:29 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2003
  • 09:29 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2003
  • 09:29 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2003.codfw.wmnet with OS bookworm
  • 09:27 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2002.codfw.wmnet with OS bookworm
  • 09:27 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti2018.codfw.wmnet
  • 09:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2122.codfw.wmnet with reason: host reimage
  • 09:23 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2123.codfw.wmnet with reason: host reimage
  • 09:22 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2122.codfw.wmnet with reason: host reimage
  • 09:09 xSavitar: T382078 Ran mwscript-k8s --comment="T382078" -f -- extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=trwikiquote --logwiki=metawiki 'Roggenwolf' 'ChopinAficionado'
  • 09:08 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2002.codfw.wmnet with reason: host reimage
  • 09:05 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2123
  • 09:05 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2123
  • 09:05 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2122
  • 09:05 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2122
  • 09:05 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2123.codfw.wmnet with OS bookworm
  • 09:05 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2122.codfw.wmnet with OS bookworm
  • 09:05 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2002.codfw.wmnet with reason: host reimage
  • 09:02 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2122.codfw.wmnet
  • 09:02 xSavitar: T382078 Ran mwscript-k8s --comment="T382078" -f -- extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=wikidatawiki --logwiki=metawiki 'Norberto Luis Amoroso Jacquet' 'Renamed user fe0fd27068061604303a2a5ab7390149'
  • 09:01 aokoth@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Update
  • 08:59 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2122.codfw.wmnet
  • 08:59 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2123.codfw.wmnet
  • 08:58 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2123.codfw.wmnet
  • 08:51 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2126.codfw.wmnet
  • 08:51 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2126.codfw.wmnet
  • 08:51 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2125.codfw.wmnet
  • 08:51 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2125.codfw.wmnet
  • 08:50 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2126.codfw.wmnet with OS bookworm
  • 08:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti2017.codfw.wmnet
  • 08:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2017.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 08:46 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2017.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 08:46 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2125.codfw.wmnet with OS bookworm
  • 08:45 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2002
  • 08:45 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2002
  • 08:45 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2002.codfw.wmnet with OS bookworm
  • 08:43 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2001.codfw.wmnet with OS bookworm
  • 08:41 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 08:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps-test2006.codfw.wmnet
  • 08:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps-test2005.codfw.wmnet
  • 08:36 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti2017.codfw.wmnet
  • 08:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps-test2006.codfw.wmnet
  • 08:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps-test2005.codfw.wmnet
  • 08:30 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2126.codfw.wmnet with reason: host reimage
  • 08:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps-test2004.codfw.wmnet
  • 08:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps-test2003.codfw.wmnet
  • 08:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2125.codfw.wmnet with reason: host reimage
  • 08:23 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2001.codfw.wmnet with reason: host reimage
  • 08:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps-test2004.codfw.wmnet
  • 08:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps-test2003.codfw.wmnet
  • 08:22 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host maps-test2002.codfw.wmnet
  • 08:21 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2126.codfw.wmnet with reason: host reimage
  • 08:21 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host maps-test2001.codfw.wmnet
  • 08:21 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2125.codfw.wmnet with reason: host reimage
  • 08:19 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2001.codfw.wmnet with reason: host reimage
  • 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps-test2002.codfw.wmnet
  • 08:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps-test2001.codfw.wmnet
  • 08:02 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2125
  • 08:02 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2125
  • 08:02 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2001
  • 08:02 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2001
  • 08:02 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2126
  • 08:02 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2126
  • 08:02 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2001.codfw.wmnet with OS bookworm
  • 08:02 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2125.codfw.wmnet with OS bookworm
  • 08:01 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2126.codfw.wmnet with OS bookworm
  • 08:00 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[2001-2004].codfw.wmnet} and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
  • 07:59 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2125.codfw.wmnet
  • 07:59 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 07:58 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2125.codfw.wmnet
  • 07:58 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2126.codfw.wmnet
  • 07:57 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2126.codfw.wmnet
  • 07:49 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 07:41 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 07:30 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync

2024-12-12

  • away: UTC late deploys done
  • 22:35 tgr@deploy2002: Finished scap sync-world: Backport for change metric types back to counters (T374050) (duration: 19m 10s)
  • 22:30 tgr@deploy2002: tgr, cwhite: Continuing with sync
  • 22:29 eileen: config revision changed from ca701cba to 404bbbd5
  • 22:20 tgr@deploy2002: tgr, cwhite: Backport for change metric types back to counters (T374050) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:16 tgr@deploy2002: Started scap sync-world: Backport for change metric types back to counters (T374050)
  • 22:14 tgr@deploy2002: Finished scap sync-world: Backport for EditCheck: move checks to a sidebar (T341308 T379443) (duration: 29m 12s)
  • 22:09 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker127[6-7].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 22:09 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1277.eqiad.wmnet with OS bookworm
  • 22:08 inflatador: bking@cumin2002 sudo cumin A:gitlab-runner 'systemctl restart ferm.service' T371994
  • 22:03 tgr@deploy2002: tgr, kemayo: Continuing with sync
  • 22:02 tgr@deploy2002: tgr, kemayo: Backport for EditCheck: move checks to a sidebar (T341308 T379443) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:49 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1277.eqiad.wmnet with reason: host reimage
  • 21:46 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1277.eqiad.wmnet with reason: host reimage
  • 21:45 tgr@deploy2002: Started scap sync-world: Backport for EditCheck: move checks to a sidebar (T341308 T379443)
  • 21:38 tgr@deploy2002: Finished scap sync-world: Backport for Reader Survey: Deploy on eswiki, dewiki and frwiki. (T378660) (duration: 12m 42s)
  • 21:35 inflatador: bking@gitlab-runner2004 restart ferm to troubleshoot missing iptables rules T371994
  • 21:32 tgr@deploy2002: dani, tgr: Continuing with sync
  • 21:32 inflatador: bking@gitlab-runner2004 restart docker to troubleshoot missing iptables rules T371994
  • 21:32 tgr@deploy2002: dani, tgr: Backport for Reader Survey: Deploy on eswiki, dewiki and frwiki. (T378660) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:25 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1277
  • 21:25 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1277
  • 21:25 tgr@deploy2002: Started scap sync-world: Backport for Reader Survey: Deploy on eswiki, dewiki and frwiki. (T378660)
  • 21:25 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1277.eqiad.wmnet with OS bookworm
  • 21:25 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1276.eqiad.wmnet with OS bookworm
  • 21:20 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1275.eqiad.wmnet with OS bookworm
  • 21:05 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1276.eqiad.wmnet with reason: host reimage
  • 21:02 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1275.eqiad.wmnet with reason: host reimage
  • 21:00 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1276.eqiad.wmnet with reason: host reimage
  • 20:56 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1275.eqiad.wmnet with reason: host reimage
  • 20:40 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1276
  • 20:40 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1276
  • 20:40 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1276.eqiad.wmnet with OS bookworm
  • 20:38 kamila@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker127[6-7].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 20:37 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1275
  • 20:37 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1275
  • 20:37 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1275.eqiad.wmnet with OS bookworm
  • 20:32 kamila@cumin1002: END (FAIL) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=1) rolling reimage on P{wikikube-worker[1270-1275].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 20:32 kamila@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker1275.eqiad.wmnet with OS bookworm
  • 19:48 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1275
  • 19:48 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1275
  • 19:48 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1275.eqiad.wmnet with OS bookworm
  • 19:46 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1274.eqiad.wmnet with OS bookworm
  • 19:42 swfrench@deploy2002: Finished scap sync-world: Deployment to populate mw-api-int migration release files - T377040 (duration: 02m 13s)
  • 19:40 swfrench@deploy2002: Started scap sync-world: Deployment to populate mw-api-int migration release files - T377040
  • 19:30 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 19:30 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 19:27 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 19:27 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 19:27 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1274.eqiad.wmnet with reason: host reimage
  • 19:24 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1274.eqiad.wmnet with reason: host reimage
  • 19:04 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1274
  • 19:04 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1274
  • 19:03 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1274.eqiad.wmnet with OS bookworm
  • 19:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 19:02 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1273.eqiad.wmnet with OS bookworm
  • 18:42 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1273.eqiad.wmnet with reason: host reimage
  • 18:39 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1273.eqiad.wmnet with reason: host reimage
  • 18:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 18:22 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-eqiad: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 18:19 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1273
  • 18:19 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1273
  • 18:19 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1273.eqiad.wmnet with OS bookworm
  • 18:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1272.eqiad.wmnet with OS bookworm
  • 18:08 James_F: Running `mwscript-k8s -f -- extensions/WikiLambda/maintenance/updateSecondaryTables.php --wiki=wikifunctionswiki --zType Z4 --report --verbose`
  • 17:58 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1272.eqiad.wmnet with reason: host reimage
  • 17:55 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1272.eqiad.wmnet with reason: host reimage
  • 17:54 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 17:54 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 17:53 ottomata: killing wikidatawiki xml dump process to try to unstick it - T382084
  • 17:41 aqu@deploy2002: Finished deploy [airflow-dags/analytics@c2d7e08]: Backfill pageview actor hourly 2024 12 (duration: 03m 03s)
  • 17:41 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 17:41 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 17:41 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 17:41 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 17:38 aqu@deploy2002: Started deploy [airflow-dags/analytics@c2d7e08]: Backfill pageview actor hourly 2024 12
  • 17:37 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 17:37 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 17:35 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1272
  • 17:35 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1272
  • 17:35 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1272.eqiad.wmnet with OS bookworm
  • 17:33 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1271.eqiad.wmnet with OS bookworm
  • 17:25 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 17:23 bvibber: charts-renderer deployment T382039 complete
  • 17:21 bvibber@deploy2002: helmfile [eqiad] DONE helmfile.d/services/chart-renderer: apply
  • 17:20 bvibber@deploy2002: helmfile [eqiad] START helmfile.d/services/chart-renderer: apply
  • 17:20 bvibber@deploy2002: helmfile [codfw] DONE helmfile.d/services/chart-renderer: apply
  • 17:19 bvibber@deploy2002: helmfile [codfw] START helmfile.d/services/chart-renderer: apply
  • 17:19 bvibber@deploy2002: helmfile [staging] DONE helmfile.d/services/chart-renderer: apply
  • 17:18 bvibber@deploy2002: helmfile [staging] START helmfile.d/services/chart-renderer: apply
  • 17:16 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
  • 17:15 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 17:15 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1271.eqiad.wmnet with reason: host reimage
  • 17:13 bvibber: doing service deploy for chart-renderer (T382039)
  • 17:11 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1271.eqiad.wmnet with reason: host reimage
  • 16:53 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: sync
  • 16:53 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: sync
  • 16:53 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: sync
  • 16:53 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: sync
  • 16:52 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-eqiad: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 16:52 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 16:52 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 16:52 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1271
  • 16:52 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1271
  • 16:51 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1271.eqiad.wmnet with OS bookworm
  • 16:51 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1270.eqiad.wmnet with OS bookworm
  • 16:44 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 16:43 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.restart (exit_code=99)
  • 16:41 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:39 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 16:39 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.restart (exit_code=99)
  • 16:39 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 16:37 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-codfw: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 16:31 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1270.eqiad.wmnet with reason: host reimage
  • 16:28 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1270.eqiad.wmnet with reason: host reimage
  • 16:27 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:17 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:08 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1270
  • 16:08 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1270
  • 16:08 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1270.eqiad.wmnet with OS bookworm
  • 16:06 kamila@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1270-1275].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 16:06 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 15:56 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 15:43 bking@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: 0.3.150 (duration: 00m 13s)
  • 15:43 bking@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: 0.3.150
  • 15:34 ladsgroup@deploy2002: Finished scap sync-world: Backport for Activate tigwiki (T381377) (duration: 09m 25s)
  • 15:34 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 15:29 kartik@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 15:28 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 15:28 ladsgroup@deploy2002: ladsgroup: Backport for Activate tigwiki (T381377) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:24 ladsgroup@deploy2002: Started scap sync-world: Backport for Activate tigwiki (T381377)
  • 15:24 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 15:19 elukey@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
  • 15:19 elukey@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
  • 15:18 elukey@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
  • 15:17 ladsgroup@deploy2002: Finished scap sync-world: Backport for Add tigwiki to pre-install (T381377) (duration: 09m 35s)
  • 15:16 elukey@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
  • 15:16 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wdqs1025.eqiad.wmnet with reason: T376150
  • 15:16 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wdqs1025.eqiad.wmnet with reason: T376150
  • 15:12 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 15:11 ladsgroup@deploy2002: ladsgroup: Backport for Add tigwiki to pre-install (T381377) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:09 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-codfw: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 15:09 bking@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: 0.3.150 (duration: 00m 05s)
  • 15:09 bking@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: 0.3.150
  • 15:08 ladsgroup@deploy2002: Started scap sync-world: Backport for Add tigwiki to pre-install (T381377)
  • 15:03 eevans@cumin1002: END (ERROR) - Cookbook sre.cassandra.roll-restart (exit_code=97) for nodes matching A:aqs-codfw: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 14:58 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 14:58 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 14:58 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 14:58 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 14:57 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 14:57 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 14:55 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-codfw: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 14:54 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 14:54 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 14:54 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 14:54 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 14:54 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 14:53 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 14:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71708 and previous config saved to /var/cache/conftool/dbconfig/20241212-144846-root.json
  • 14:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71707 and previous config saved to /var/cache/conftool/dbconfig/20241212-143340-root.json
  • 14:30 Amir1: ladsgroup@mwmaint2002:~$ foreachwikiindblist all userOptions.php --delete VectorSkinVersion (T54777)
  • 14:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71706 and previous config saved to /var/cache/conftool/dbconfig/20241212-141835-root.json
  • 14:15 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host db1208.eqiad.wmnet
  • 14:04 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
  • 14:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71705 and previous config saved to /var/cache/conftool/dbconfig/20241212-140329-root.json
  • 14:03 elukey@deploy2002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 14:03 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2127.codfw.wmnet
  • 14:03 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2127.codfw.wmnet
  • 14:03 elukey@deploy2002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: sync
  • 14:01 btullis@cumin1002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-flink-codfw cluster: Roll restart of jvm daemons.
  • 14:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2127.codfw.wmnet with OS bookworm
  • 13:55 btullis@cumin1002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-flink-codfw cluster: Roll restart of jvm daemons.
  • 13:53 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 13:52 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: sync
  • 13:52 btullis@cumin1002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-flink-eqiad cluster: Roll restart of jvm daemons.
  • 13:48 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 13:48 elukey@deploy2002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 13:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71704 and previous config saved to /var/cache/conftool/dbconfig/20241212-134824-root.json
  • 13:48 elukey@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
  • 13:47 elukey@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
  • 13:47 elukey@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
  • 13:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1169.eqiad.wmnet with reason: maintenance
  • 13:46 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db1169.eqiad.wmnet with reason: maintenance
  • 13:46 elukey@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
  • 13:46 btullis@cumin1002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-flink-eqiad cluster: Roll restart of jvm daemons.
  • 13:41 moritzm: installing Python 3.11 security updates
  • 13:41 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2127.codfw.wmnet with reason: host reimage
  • 13:39 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 13:38 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2127.codfw.wmnet with reason: host reimage
  • 13:36 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1169 (T381532)', diff saved to https://phabricator.wikimedia.org/P71703 and previous config saved to /var/cache/conftool/dbconfig/20241212-133633-marostegui.json
  • 13:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 13:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 13:32 moritzm: rebalance Ganeti cluster in codfw/D following server refresh T376594
  • 13:29 elukey@deploy2002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 13:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2014,2016].codfw.wmnet with reason: maintenance
  • 13:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on pc[2014,2016].codfw.wmnet with reason: maintenance
  • 13:18 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2127
  • 13:18 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2127
  • 13:18 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2127.codfw.wmnet with OS bookworm
  • 13:16 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2127.codfw.wmnet
  • 13:15 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2127.codfw.wmnet
  • 13:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[1013,1017].eqiad.wmnet with reason: maintenance
  • 13:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on pc[1013,1017].eqiad.wmnet with reason: maintenance
  • 13:12 mszabo@deploy2002: Finished scap sync-world: Backport for Enable IRS in the Project namespace on ptwiki (T382061) (duration: 09m 41s)
  • 13:06 mszabo@deploy2002: mszabo: Continuing with sync
  • 13:05 mszabo@deploy2002: mszabo: Backport for Enable IRS in the Project namespace on ptwiki (T382061) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:02 mszabo@deploy2002: Started scap sync-world: Backport for Enable IRS in the Project namespace on ptwiki (T382061)
  • 12:36 btullis@cumin1002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: Restarting to pick up new JRE for T377938 - btullis@cumin1002 - T377938
  • 12:31 btullis@cumin1002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: Restarting to pick up new JRE for T377938 - btullis@cumin1002 - T377938
  • 12:30 btullis@cumin1002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: Restarting to pick up new JRE for T377938 - btullis@cumin1002 - T377938
  • 12:29 btullis@cumin1002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: Restarting to pick up new JRE for T377938 - btullis@cumin1002 - T377938
  • 12:15 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 12:15 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 12:15 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 12:15 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 12:10 hnowlan@deploy2002: Finished scap sync-world: syncing changes to mediawiki chart vendor dependencies (duration: 09m 30s)
  • 12:07 oblivian@cumin1002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Bugfixes - oblivian@cumin1002 - T382062"
  • 12:06 oblivian@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Bugfixes - oblivian@cumin1002 - T382062
  • 12:06 oblivian@cumin1002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Bugfixes - oblivian@cumin1002 - T382062
  • 12:06 oblivian@cumin1002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Bugfixes - oblivian@cumin1002 - T382062"
  • 12:03 hnowlan@deploy2002: Started scap sync-world: syncing changes to mediawiki chart vendor dependencies
  • 11:41 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1154.eqiad.wmnet with reason: maintenance
  • 11:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db1154.eqiad.wmnet with reason: maintenance
  • 11:40 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 11:33 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
  • 11:33 elukey@deploy2002: helmfile [eqiad] START helmfile.d/admin 'sync'.
  • 11:33 elukey@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'sync'.
  • 11:32 elukey@deploy2002: helmfile [codfw] START helmfile.d/admin 'sync'.
  • 11:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2187.codfw.wmnet with reason: maintenance
  • 11:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2187.codfw.wmnet with reason: maintenance
  • 11:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2186.codfw.wmnet with reason: maintenance
  • 11:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2186.codfw.wmnet with reason: maintenance
  • 11:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2186.codfw.wmnet with reason: maintenance
  • 11:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2186.codfw.wmnet with reason: maintenance
  • 11:19 aqu@deploy2002: Finished deploy [airflow-dags/analytics@0e18d4f]: Backfill webrequest actor label hourly 2024 12 (duration: 02m 52s)
  • 11:16 aqu@deploy2002: Started deploy [airflow-dags/analytics@0e18d4f]: Backfill webrequest actor label hourly 2024 12
  • 11:08 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: update Homer wmf-plugin - cmooney@cumin1002
  • 11:07 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
  • 11:07 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
  • 11:07 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: update Homer wmf-plugin - cmooney@cumin1002
  • 11:04 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: update Homer wmf-plugin - cmooney@cumin1002
  • 11:04 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 11:04 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 11:03 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 11:03 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 11:03 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: update Homer wmf-plugin - cmooney@cumin1002
  • 11:02 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 11:02 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 11:01 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 11:01 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 11:01 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
  • 10:48 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: update Homer wmf-plugin - cmooney@cumin1002
  • 10:44 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: update Homer wmf-plugin - cmooney@cumin1002
  • 10:43 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
  • 10:43 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
  • 10:43 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
  • 10:36 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 10:36 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 10:34 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 10:34 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 10:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 10:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:22 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 09:22 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 09:18 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 09:18 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 09:17 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 09:17 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 09:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1027.eqiad.wmnet with reason: maintenance
  • 09:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on dbproxy1027.eqiad.wmnet with reason: maintenance
  • 08:54 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 08:54 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 08:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1026.eqiad.wmnet with reason: maintenance
  • 08:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on dbproxy1026.eqiad.wmnet with reason: maintenance
  • 08:27 kartik@deploy2002: Finished scap sync-world: Backport for Translate: Enable message group subscription for 7 wikis (T372386) (duration: 20m 05s)
  • 08:23 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 08:23 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 08:21 kartik@deploy2002: kartik, abi: Continuing with sync
  • 08:12 kartik@deploy2002: kartik, abi: Backport for Translate: Enable message group subscription for 7 wikis (T372386) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1024.eqiad.wmnet with reason: maintenance
  • 08:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on dbproxy1024.eqiad.wmnet with reason: maintenance
  • 08:07 kartik@deploy2002: Started scap sync-world: Backport for Translate: Enable message group subscription for 7 wikis (T372386)
  • 08:02 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 08:02 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 08:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1023.eqiad.wmnet with reason: maintenance
  • 08:00 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 08:00 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 08:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on dbproxy1023.eqiad.wmnet with reason: maintenance
  • 08:00 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 08:00 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 07:56 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 07:56 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 07:52 moritzm: installing upx-ucl security updates

2024-12-11

  • 23:24 tzatziki: removing 7 files for legal compliance
  • 23:02 tzatziki: removing 4 files for legal compliance
  • 22:52 tzatziki: removing three files for legal compliance
  • 21:53 eileen: civicrm upgraded from ddda6d67 to 0d7f2866
  • 21:33 TheresNoTime: done UTC late backport window
  • 21:32 samtar@deploy2002: Finished scap sync-world: Backport for Follow-up I9df39fdcc: Convert missed 'this' to 'el' (T381741) (duration: 10m 01s)
  • 21:26 samtar@deploy2002: novemlinguae, samtar: Continuing with sync
  • 21:25 samtar@deploy2002: novemlinguae, samtar: Backport for Follow-up I9df39fdcc: Convert missed 'this' to 'el' (T381741) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:22 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudelastic1011.eqiad.wmnet with OS bullseye
  • 21:22 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudelastic1012.eqiad.wmnet with OS bullseye
  • 21:22 samtar@deploy2002: Started scap sync-world: Backport for Follow-up I9df39fdcc: Convert missed 'this' to 'el' (T381741)
  • 21:14 samtar@deploy2002: Finished scap sync-world: Backport for Enable AutoModerator on bnwiki (T381000) (duration: 11m 01s)
  • 21:11 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS bullseye
  • 21:11 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host cloudelastic1011.eqiad.wmnet with OS bullseye
  • 21:10 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:10 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:08 samtar@deploy2002: kgraessle, samtar: Continuing with sync
  • 21:08 samtar@deploy2002: kgraessle, samtar: Backport for Enable AutoModerator on bnwiki (T381000) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:03 samtar@deploy2002: Started scap sync-world: Backport for Enable AutoModerator on bnwiki (T381000)
  • 21:00 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:00 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 20:42 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on aqs1014.eqiad.wmnet with reason: Hardware replacement
  • 20:42 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on aqs1014.eqiad.wmnet with reason: Hardware replacement
  • 18:18 claime: homer 'lsw1-d6-codfw*' commit 'T379788'
  • 18:17 claime: homer 'lsw1-c1-codfw*' commit 'T379788'
  • 18:16 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wikikube-worker[2180-2183].codfw.wmnet
  • 18:16 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:16 cgoubert@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[2180-2183].codfw.wmnet decommissioned, removing all IPs except the asset tag one - cgoubert@cumin1002"
  • 18:15 cgoubert@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[2180-2183].codfw.wmnet decommissioned, removing all IPs except the asset tag one - cgoubert@cumin1002"
  • 18:10 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 18:06 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 18:06 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 18:05 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 18:05 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 18:05 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 18:05 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 18:04 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 18:04 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 18:00 cgoubert@cumin1002: START - Cookbook sre.hosts.decommission for hosts wikikube-worker[2180-2183].codfw.wmnet
  • 17:59 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 17:59 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 17:58 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 17:58 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 17:55 claime: homer 'lsw1-a6-codfw' commit 'T379788'
  • 17:54 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wikikube-worker[2047,2066,2085-2086].codfw.wmnet
  • 17:54 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:54 cgoubert@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[2047,2066,2085-2086].codfw.wmnet decommissioned, removing all IPs except the asset tag one - cgoubert@cumin1002"
  • 17:53 cgoubert@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[2047,2066,2085-2086].codfw.wmnet decommissioned, removing all IPs except the asset tag one - cgoubert@cumin1002"
  • 17:48 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 17:38 oblivian@cumin1002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "UI improvements, add uncommitted changes warning - oblivian@cumin1002"
  • 17:38 oblivian@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: UI improvements, add uncommitted changes warning - oblivian@cumin1002
  • 17:37 oblivian@cumin1002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: UI improvements, add uncommitted changes warning - oblivian@cumin1002
  • 17:37 oblivian@cumin1002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "UI improvements, add uncommitted changes warning - oblivian@cumin1002"
  • 17:32 cgoubert@cumin1002: START - Cookbook sre.hosts.decommission for hosts wikikube-worker[2047,2066,2085-2086].codfw.wmnet
  • 17:31 bking@cumin2002: END (ERROR) - Cookbook sre.wdqs.restart (exit_code=97)
  • 17:31 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 17:19 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.restart (exit_code=99)
  • 17:16 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2047,2066,2085-2086,2180-2183].codfw.wmnet
  • 17:09 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2047,2066,2085-2086,2180-2183].codfw.wmnet
  • 16:48 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 16:47 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
  • 16:46 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
  • 16:43 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 16:42 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
  • 16:35 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 16:34 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
  • 16:32 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 16:25 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
  • 16:25 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
  • 16:24 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
  • 16:24 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
  • 16:23 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 16:22 otto@deploy2002: Finished scap sync-world: Backport for mediawiki.org/beacon/event/index.php - use EventBus->send (T353817) (duration: 11m 36s)
  • 16:21 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
  • 16:21 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
  • 16:16 otto@deploy2002: otto: Continuing with sync
  • 16:16 otto@deploy2002: otto: Backport for mediawiki.org/beacon/event/index.php - use EventBus->send (T353817) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:10 otto@deploy2002: Started scap sync-world: Backport for mediawiki.org/beacon/event/index.php - use EventBus->send (T353817)
  • 15:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2135.codfw.wmnet with reason: maintenance
  • 15:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2135.codfw.wmnet with reason: maintenance
  • 15:39 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on archiva1002.wikimedia.org with reason: Adding new disk
  • 15:38 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on archiva1002.wikimedia.org with reason: Adding new disk
  • 15:38 klausman@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 15:38 klausman@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 15:36 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 15:36 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 15:36 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp3066.esams.wmnet
  • 15:35 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp3066.esams.wmnet
  • 15:23 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 15:23 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 15:22 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 15:21 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 15:20 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 15:19 klausman@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
  • 15:19 klausman@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
  • 15:15 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wdqs1025.eqiad.wmnet with reason: T376150
  • 15:15 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wdqs1025.eqiad.wmnet with reason: T376150
  • 15:13 elukey@deploy2002: Finished deploy [docker-pkg/deploy@9305554]: Update to 4.0.3 (duration: 00m 37s)
  • 15:13 elukey@deploy2002: Started deploy [docker-pkg/deploy@9305554]: Update to 4.0.3
  • 15:02 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2184-2187].codfw.wmnet
  • 15:02 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2184-2187].codfw.wmnet
  • 15:02 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2085.codfw.wmnet with OS bullseye
  • 15:00 jelto: homer 'lsw1-d3-codfw*' commit 'T377877'
  • 14:58 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1043.eqiad.wmnet with OS bookworm
  • 14:57 jelto: homer 'lsw1-c3-codfw*' commit 'T377877'
  • 14:57 klausman@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 14:56 klausman@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 14:56 jelto: homer 'lsw1-d5-codfw*' commit 'T377877'
  • 14:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2185.codfw.wmnet with OS bookworm
  • 14:51 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2187.codfw.wmnet with OS bookworm
  • 14:48 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2184.codfw.wmnet with OS bookworm
  • 14:45 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2186.codfw.wmnet with OS bookworm
  • 14:42 TheresNoTime: done UTC afternoon backport window
  • 14:41 samtar@deploy2002: Finished scap sync-world: Backport for Add Atieno's public key (duration: 08m 47s)
  • 14:39 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2085.codfw.wmnet with reason: host reimage
  • 14:36 samtar@deploy2002: arlolra, samtar: Continuing with sync
  • 14:36 samtar@deploy2002: arlolra, samtar: Backport for Add Atieno's public key synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:35 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2185.codfw.wmnet with reason: host reimage
  • 14:34 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2085.codfw.wmnet with reason: host reimage
  • 14:32 samtar@deploy2002: Started scap sync-world: Backport for Add Atieno's public key
  • 14:32 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2185.codfw.wmnet with reason: host reimage
  • 14:31 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2187.codfw.wmnet with reason: host reimage
  • 14:30 samtar@deploy2002: Finished scap sync-world: Backport for ve.ui.CodeMirror.v6: Use plugin callback to load the actual module (T374072), styles: Avoid misalignments when line numbering is disabled (T381714) (duration: 10m 42s)
  • 14:28 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2184.codfw.wmnet with reason: host reimage
  • 14:28 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2187.codfw.wmnet with reason: host reimage
  • 14:25 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2184.codfw.wmnet with reason: host reimage
  • 14:25 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2186.codfw.wmnet with reason: host reimage
  • 14:25 samtar@deploy2002: samtar, func: Continuing with sync
  • 14:23 samtar@deploy2002: samtar, func: Backport for ve.ui.CodeMirror.v6: Use plugin callback to load the actual module (T374072), styles: Avoid misalignments when line numbering is disabled (T381714) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:22 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2186.codfw.wmnet with reason: host reimage
  • 14:19 samtar@deploy2002: Started scap sync-world: Backport for ve.ui.CodeMirror.v6: Use plugin callback to load the actual module (T374072), styles: Avoid misalignments when line numbering is disabled (T381714)
  • 14:19 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be2085.codfw.wmnet with OS bullseye
  • 14:18 samtar@deploy2002: Finished scap sync-world: Backport for Remove feature flag which controls wikibase item link location (T377809) (duration: 12m 32s)
  • 14:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2185
  • 14:15 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2185
  • 14:14 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on archiva1002.wikimedia.org with reason: Adding new disk
  • 14:14 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on archiva1002.wikimedia.org with reason: Adding new disk
  • 14:14 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2185
  • 14:14 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2185.codfw.wmnet 89.32.192.10.in-addr.arpa 9.8.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:14 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2185.codfw.wmnet 89.32.192.10.in-addr.arpa 9.8.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:14 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:14 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2185 - jelto@cumin1002"
  • 14:14 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS bookworm
  • 14:14 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2185 - jelto@cumin1002"
  • 14:12 samtar@deploy2002: samtar, joelyrookewmde: Continuing with sync
  • 14:11 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2187
  • 14:11 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2187
  • 14:11 btullis@cumin1002: END (PASS) - Cookbook sre.apifeatureusage.roll-restart-reboot-logstash (exit_code=0) rolling restart_daemons on A:apifeatureusage
  • 14:11 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2187
  • 14:11 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2187.codfw.wmnet 87.48.192.10.in-addr.arpa 7.8.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:11 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:10 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2187.codfw.wmnet 87.48.192.10.in-addr.arpa 7.8.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:10 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:09 samtar@deploy2002: samtar, joelyrookewmde: Backport for Remove feature flag which controls wikibase item link location (T377809) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:09 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2185
  • 14:09 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2184
  • 14:08 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2184
  • 14:08 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:08 btullis@cumin1002: START - Cookbook sre.apifeatureusage.roll-restart-reboot-logstash rolling restart_daemons on A:apifeatureusage
  • 14:07 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2184
  • 14:07 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2184.codfw.wmnet 41.32.192.10.in-addr.arpa 1.4.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:07 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2184.codfw.wmnet 41.32.192.10.in-addr.arpa 1.4.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:07 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:06 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:06 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:05 samtar@deploy2002: Started scap sync-world: Backport for Remove feature flag which controls wikibase item link location (T377809)
  • 14:05 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:05 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2186
  • 14:05 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2186
  • 14:05 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2186
  • 14:05 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2186.codfw.wmnet 180.48.192.10.in-addr.arpa 0.8.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:04 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2186.codfw.wmnet 180.48.192.10.in-addr.arpa 0.8.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:04 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:04 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2186 - jelto@cumin1002"
  • 14:04 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2186 - jelto@cumin1002"
  • 14:00 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:00 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2187
  • 14:00 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2186
  • 14:00 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2184
  • 14:00 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2187.codfw.wmnet with OS bookworm
  • 14:00 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2186.codfw.wmnet with OS bookworm
  • 14:00 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2185.codfw.wmnet with OS bookworm
  • 14:00 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2184.codfw.wmnet with OS bookworm
  • 13:57 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2184.codfw.wmnet wikikube-worker2185.codfw.wmnet wikikube-worker2186.codfw.wmnet wikikube-worker2187.codfw.wmnet on all recursors
  • 13:57 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2184.codfw.wmnet wikikube-worker2185.codfw.wmnet wikikube-worker2186.codfw.wmnet wikikube-worker2187.codfw.wmnet on all recursors
  • 13:54 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2024 to wikikube-worker2187
  • 13:53 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2187
  • 13:53 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2187
  • 13:53 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:53 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2024 to wikikube-worker2187 - jelto@cumin1002"
  • 13:53 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2024 to wikikube-worker2187 - jelto@cumin1002"
  • 13:42 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:42 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2024 to wikikube-worker2187
  • 13:41 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2022 to wikikube-worker2186
  • 13:40 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2186
  • 13:40 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2186
  • 13:40 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:40 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2022 to wikikube-worker2186 - jelto@cumin1002"
  • 13:39 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2022 to wikikube-worker2186 - jelto@cumin1002"
  • 13:34 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:34 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2022 to wikikube-worker2186
  • 13:33 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2021 to wikikube-worker2185
  • 13:32 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2185
  • 13:32 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2185
  • 13:32 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:32 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2021 to wikikube-worker2185 - jelto@cumin1002"
  • 13:31 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2021 to wikikube-worker2185 - jelto@cumin1002"
  • 13:28 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:27 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2021 to wikikube-worker2185
  • 13:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
  • 13:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2017 to wikikube-worker2184
  • 13:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
  • 13:25 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2184
  • 13:25 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2184
  • 13:25 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:25 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2017 to wikikube-worker2184 - jelto@cumin1002"
  • 13:25 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2017 to wikikube-worker2184 - jelto@cumin1002"
  • 13:21 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:21 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2017 to wikikube-worker2184
  • 13:19 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
  • 13:18 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[2017,2021-2022,2024].codfw.wmnet
  • 13:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
  • 13:17 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
  • 13:13 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[2017,2021-2022,2024].codfw.wmnet
  • 13:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 13:04 kart_: Updated cxserver to 2024-12-10-132417-production (T369815)
  • 13:04 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 13:01 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 13:00 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 13:00 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 12:59 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 12:57 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 12:54 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 12:54 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 12:54 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 12:47 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 12:15 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
  • 12:14 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: apply
  • 12:13 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
  • 12:12 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
  • 12:12 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
  • 12:11 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
  • 12:11 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
  • 12:11 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
  • 12:11 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
  • 12:11 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
  • 12:08 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 12:08 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 12:05 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 12:04 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 11:57 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2180-2183].codfw.wmnet
  • 11:57 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2180-2183].codfw.wmnet
  • 11:56 jelto: homer 'lsw1-c1-codfw*' commit 'T377877'
  • 11:54 jelto: homer 'lsw1-d6-codfw*' commit 'T377877'
  • 11:53 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2181.codfw.wmnet with OS bookworm
  • 11:37 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main'.
  • 11:33 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2181.codfw.wmnet with reason: host reimage
  • 11:29 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2181.codfw.wmnet with reason: host reimage
  • 11:25 mszabo@deploy2002: Finished scap sync-world: Backport for Prep pilot wiki config for IRS (T374105) (duration: 11m 04s)
  • 11:20 mszabo@deploy2002: mszabo: Continuing with sync
  • 11:17 mszabo@deploy2002: mszabo: Backport for Prep pilot wiki config for IRS (T374105) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2182.codfw.wmnet with OS bookworm
  • 11:14 mszabo@deploy2002: Started scap sync-world: Backport for Prep pilot wiki config for IRS (T374105)
  • 11:12 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2181
  • 11:12 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2181
  • 11:12 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2181.codfw.wmnet with OS bookworm
  • 11:11 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2181.codfw.wmnet with OS bookworm
  • 11:09 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Revert^2 "Stats: Move StatsFactory flush into emitBufferedStats" (duration: 14m 22s)
  • 11:03 dreamyjazz@deploy2002: dreamyjazz, cwhite: Continuing with sync
  • 10:59 dreamyjazz@deploy2002: dreamyjazz, cwhite: Backport for Revert^2 "Stats: Move StatsFactory flush into emitBufferedStats" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 10:58 fabfur: merging https://gerrit.wikimedia.org/r/c/operations/dns/+/1100084 to direct Argentina, Chile, Uruguay to magru (T359054)
  • 10:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2182.codfw.wmnet with reason: host reimage
  • 10:54 dreamyjazz@deploy2002: Started scap sync-world: Backport for Revert^2 "Stats: Move StatsFactory flush into emitBufferedStats"
  • 10:51 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2182.codfw.wmnet with reason: host reimage
  • 10:39 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2183.codfw.wmnet with OS bookworm
  • 10:33 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2182
  • 10:33 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2182
  • 10:33 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2182.codfw.wmnet with OS bookworm
  • 10:32 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2182.codfw.wmnet with OS bookworm
  • 10:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2180.codfw.wmnet with OS bookworm
  • 10:17 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2183.codfw.wmnet with reason: host reimage
  • 10:14 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2183.codfw.wmnet with reason: host reimage
  • 10:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71699 and previous config saved to /var/cache/conftool/dbconfig/20241211-101051-root.json
  • 10:06 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2180.codfw.wmnet with reason: host reimage
  • 10:02 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2180.codfw.wmnet with reason: host reimage
  • 09:59 aqu@deploy2002: Finished deploy [airflow-dags/analytics@416a3c0]: Backfill webrequest actor metrics rollup hourly 2024 12 (duration: 01m 02s)
  • 09:58 aqu@deploy2002: Started deploy [airflow-dags/analytics@416a3c0]: Backfill webrequest actor metrics rollup hourly 2024 12
  • 09:56 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2183
  • 09:56 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2183
  • 09:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71698 and previous config saved to /var/cache/conftool/dbconfig/20241211-095546-root.json
  • 09:55 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2183
  • 09:55 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2183.codfw.wmnet 29.48.192.10.in-addr.arpa 9.2.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:55 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2183.codfw.wmnet 29.48.192.10.in-addr.arpa 9.2.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:55 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:55 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2183 - jelto@cumin1002"
  • 09:55 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2183 - jelto@cumin1002"
  • 09:51 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:51 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2181
  • 09:51 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2181
  • 09:51 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2181
  • 09:51 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2181.codfw.wmnet 110.32.192.10.in-addr.arpa 0.1.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:51 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2181.codfw.wmnet 110.32.192.10.in-addr.arpa 0.1.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:51 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:51 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2181 - jelto@cumin1002"
  • 09:51 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2181 - jelto@cumin1002"
  • 09:48 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2183
  • 09:47 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2182
  • 09:47 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2182
  • 09:47 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:46 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2182
  • 09:46 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2182.codfw.wmnet 28.48.192.10.in-addr.arpa 8.2.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:46 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2182.codfw.wmnet 28.48.192.10.in-addr.arpa 8.2.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:46 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:44 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2181
  • 09:44 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2180
  • 09:44 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2180
  • 09:44 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:44 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2180
  • 09:44 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2180.codfw.wmnet 109.32.192.10.in-addr.arpa 9.0.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:44 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2180.codfw.wmnet 109.32.192.10.in-addr.arpa 9.0.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:44 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:44 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2180 - jelto@cumin1002"
  • 09:44 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2180 - jelto@cumin1002"
  • 09:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71697 and previous config saved to /var/cache/conftool/dbconfig/20241211-094040-root.json
  • 09:40 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:40 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2183.codfw.wmnet with OS bookworm
  • 09:40 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2182
  • 09:40 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2182.codfw.wmnet with OS bookworm
  • 09:40 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2181.codfw.wmnet with OS bookworm
  • 09:40 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2180
  • 09:39 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2180.codfw.wmnet with OS bookworm
  • 09:37 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2180.codfw.wmnet wikikube-worker2181.codfw.wmnet wikikube-worker2182.codfw.wmnet wikikube-worker2183.codfw.wmnet on all recursors
  • 09:37 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2180.codfw.wmnet wikikube-worker2181.codfw.wmnet wikikube-worker2182.codfw.wmnet wikikube-worker2183.codfw.wmnet on all recursors
  • 09:37 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2014 to wikikube-worker2183
  • 09:36 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2183
  • 09:36 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2183
  • 09:36 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:36 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2014 to wikikube-worker2183 - jelto@cumin1002"
  • 09:36 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2014 to wikikube-worker2183 - jelto@cumin1002"
  • 09:32 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:32 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-lab1002.eqiad.wmnet with OS bookworm
  • 09:32 elukey@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1002"
  • 09:32 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2014 to wikikube-worker2183
  • 09:31 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2013 to wikikube-worker2182
  • 09:31 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2182
  • 09:31 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2182
  • 09:30 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:30 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2013 to wikikube-worker2182 - jelto@cumin1002"
  • 09:30 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2013 to wikikube-worker2182 - jelto@cumin1002"
  • 09:26 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:26 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2013 to wikikube-worker2182
  • 09:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2012 to wikikube-worker2181
  • 09:25 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71696 and previous config saved to /var/cache/conftool/dbconfig/20241211-092535-root.json
  • 09:25 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2181
  • 09:25 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2181
  • 09:25 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:25 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2012 to wikikube-worker2181 - jelto@cumin1002"
  • 09:24 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2012 to wikikube-worker2181 - jelto@cumin1002"
  • 09:20 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:20 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2012 to wikikube-worker2181
  • 09:20 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2011 to wikikube-worker2180
  • 09:19 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2180
  • 09:19 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2180
  • 09:19 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:19 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2011 to wikikube-worker2180 - jelto@cumin1002"
  • 09:18 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2011 to wikikube-worker2180 - jelto@cumin1002"
  • 09:14 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:14 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2011 to wikikube-worker2180
  • 09:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71695 and previous config saved to /var/cache/conftool/dbconfig/20241211-091029-root.json
  • 09:08 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[2011-2014].codfw.wmnet
  • 09:08 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[2011-2014].codfw.wmnet
  • 09:06 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[2011-2014].codfw.wmnet
  • 09:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2136 to upgrade MariaDB 10.11 T378940', diff saved to https://phabricator.wikimedia.org/P71694 and previous config saved to /var/cache/conftool/dbconfig/20241211-090538-marostegui.json
  • 09:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2136.codfw.wmnet with reason: maintenance
  • 09:04 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2136.codfw.wmnet with reason: maintenance
  • 09:04 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[2011-2014].codfw.wmnet
  • 02:30 eileen: civicrm upgraded from 3ef855ca to ddda6d67
  • 01:36 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, xfer wdqs scholarly 2023(public)->2026(internal)) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2027.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 00:50 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, xfer wdqs scholarly 2023(public)->2026(internal)) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2027.codfw.wmnet w/ force delete existing files, repooling both afterwards

2024-12-10

  • 23:35 eileen: config revision changed from b3741848 to ca701cba add phone update job
  • 22:54 jhathaway@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ms-be1088.eqiad.wmnet with reason: T381919
  • 22:54 jhathaway@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on ms-be1088.eqiad.wmnet with reason: T381919
  • 22:49 jhathaway@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1088.eqiad.wmnet with OS bookworm
  • 22:36 cjming: end of UTC late backport window
  • 22:22 jhathaway@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
  • 22:19 jhathaway@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
  • 22:15 cjming@deploy2002: Finished scap sync-world: Backport for Disable stats collection when WMF_MAINTENANCE_OFFLINE is set (T380609) (duration: 11m 24s)
  • 22:10 cjming@deploy2002: cwhite, cjming: Continuing with sync
  • 22:08 cjming@deploy2002: cwhite, cjming: Backport for Disable stats collection when WMF_MAINTENANCE_OFFLINE is set (T380609) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:04 cjming@deploy2002: Started scap sync-world: Backport for Disable stats collection when WMF_MAINTENANCE_OFFLINE is set (T380609)
  • 21:59 cjming@deploy2002: Finished scap sync-world: Backport for Beta Cluster: Enable MetricsPlatform extension on all wikis (T381849 T381853) (duration: 10m 50s)
  • 21:56 jhathaway@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1088.eqiad.wmnet with OS bookworm
  • 21:53 cjming@deploy2002: cjming, phuedx: Continuing with sync
  • 21:52 cjming@deploy2002: cjming, phuedx: Backport for Beta Cluster: Enable MetricsPlatform extension on all wikis (T381849 T381853) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:48 cjming@deploy2002: Started scap sync-world: Backport for Beta Cluster: Enable MetricsPlatform extension on all wikis (T381849 T381853)
  • 21:47 eileen: civicrm upgraded from f9c89e50 to 3ef855ca
  • 21:47 cjming@deploy2002: Finished scap sync-world: Backport for Reader Survey: Increase coverage (T378660) (duration: 10m 02s)
  • 21:41 cjming@deploy2002: cjming, dani: Continuing with sync
  • 21:41 cjming@deploy2002: cjming, dani: Backport for Reader Survey: Increase coverage (T378660) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:37 cjming@deploy2002: Started scap sync-world: Backport for Reader Survey: Increase coverage (T378660)
  • 21:35 cjming@deploy2002: Finished scap sync-world: Backport for LanguageConverter: Ignore content inside <math> and <svg> elements (T381617) (duration: 11m 55s)
  • 21:30 cjming@deploy2002: bvibber, cjming: Continuing with sync
  • 21:27 cjming@deploy2002: bvibber, cjming: Backport for LanguageConverter: Ignore content inside <math> and <svg> elements (T381617) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:23 cjming@deploy2002: Started scap sync-world: Backport for LanguageConverter: Ignore content inside <math> and <svg> elements (T381617)
  • 21:22 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, xfer wdqs scholarly 2023(public)->2026(internal)) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2026.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 20:55 mforns@deploy2002: Finished deploy [airflow-dags/analytics@2af4e1a]: Fix for the Commons Impact Metrics job (duration: 01m 38s)
  • 20:54 mforns@deploy2002: Started deploy [airflow-dags/analytics@2af4e1a]: Fix for the Commons Impact Metrics job
  • 20:47 mforns@deploy2002: Finished deploy [analytics/refinery@25c1946] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@25c1946c] (duration: 00m 27s)
  • 20:46 mforns@deploy2002: Started deploy [analytics/refinery@25c1946] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@25c1946c]
  • 20:46 mforns@deploy2002: Finished deploy [analytics/refinery@25c1946] (thin): Regular analytics weekly train THIN [analytics/refinery@25c1946c] (duration: 00m 31s)
  • 20:45 mforns@deploy2002: Started deploy [analytics/refinery@25c1946] (thin): Regular analytics weekly train THIN [analytics/refinery@25c1946c]
  • 20:45 mforns@deploy2002: Finished deploy [analytics/refinery@25c1946]: Regular analytics weekly train [analytics/refinery@25c1946c] (duration: 13m 12s)
  • 20:38 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, xfer wdqs scholarly 2023(public)->2026(internal)) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2026.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 20:38 ryankemper@cumin2002: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97) (T376150, xfer wdqs scholarly 2023(public)->2026(internal)) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2026.codfw.wmnet, repooling source-only afterwards
  • 20:37 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, xfer wdqs scholarly 2023(public)->2026(internal)) xfer scholarly_articles from wdqs2023.codfw.wmnet -> wdqs2026.codfw.wmnet, repooling source-only afterwards
  • 20:32 mforns@deploy2002: Started deploy [analytics/refinery@25c1946]: Regular analytics weekly train [analytics/refinery@25c1946c]
  • 20:28 jhathaway@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on ms-be1088.eqiad.wmnet with reason: T381919
  • 20:28 jhathaway@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on ms-be1088.eqiad.wmnet with reason: T381919
  • 20:04 jhathaway@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on ms-be1088.eqiad.wmnet with reason: T381919
  • 20:04 jhathaway@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on ms-be1088.eqiad.wmnet with reason: T381919
  • 18:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 100%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71693 and previous config saved to /var/cache/conftool/dbconfig/20241210-183545-root.json
  • 18:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 75%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71692 and previous config saved to /var/cache/conftool/dbconfig/20241210-182040-root.json
  • 18:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 50%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71691 and previous config saved to /var/cache/conftool/dbconfig/20241210-180534-root.json
  • 18:02 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 18:02 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 18:02 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 18:01 elukey@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1002"
  • 18:01 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 18:00 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 18:00 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 17:55 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 17:54 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 17:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 25%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71690 and previous config saved to /var/cache/conftool/dbconfig/20241210-175029-root.json
  • 17:47 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 17:47 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 17:42 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 17:42 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 17:42 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 17:41 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 17:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 10%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71688 and previous config saved to /var/cache/conftool/dbconfig/20241210-173524-root.json
  • 17:30 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-lab1002.eqiad.wmnet with reason: host reimage
  • 17:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2158.codfw.wmnet with reason: maintenance
  • 17:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2158.codfw.wmnet with reason: maintenance
  • 17:25 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-lab1002.eqiad.wmnet with reason: host reimage
  • 17:24 herron@cumin1002: dbctl commit (dc=all): 'depooling db2158 T381901', diff saved to https://phabricator.wikimedia.org/P71687 and previous config saved to /var/cache/conftool/dbconfig/20241210-172424-herron.json
  • 17:18 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 17:18 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 17:13 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host ml-lab1002.eqiad.wmnet with OS bookworm
  • 17:13 swfrench-wmf: deployed shellbox 2024-12-07-073046 for T381830
  • 17:12 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
  • 17:12 klausman@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ml-lab1002.eqiad.wmnet with OS bookworm
  • 17:12 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
  • 17:11 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
  • 17:11 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
  • 17:10 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:10 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:10 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
  • 17:10 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
  • 17:09 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
  • 17:09 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
  • 17:09 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
  • 17:08 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
  • 17:08 otto@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: sync
  • 17:07 otto@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: sync
  • 17:06 otto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: sync
  • 17:05 otto@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: sync
  • 17:05 otto@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: sync
  • 17:04 otto@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics: sync
  • 17:03 ottomata: restarting eventgate-analytics to pick up stream config changes for T381322
  • 17:01 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
  • 17:00 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
  • 16:59 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
  • 16:59 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
  • 16:59 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-lab1002.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:59 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:58 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ml-lab1002.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:58 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:58 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
  • 16:57 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
  • 16:57 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
  • 16:57 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
  • 16:56 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
  • 16:56 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
  • 16:51 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
  • 16:51 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
  • 16:50 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
  • 16:50 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
  • 16:50 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:49 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:49 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
  • 16:49 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
  • 16:49 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
  • 16:48 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
  • 16:48 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
  • 16:47 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
  • 16:38 denisse@deploy2002: Finished deploy [librenms/librenms@f049593]: Upgrade LibreNMS to 24.10.0 - T381785 (duration: 00m 13s)
  • 16:38 denisse@deploy2002: Started deploy [librenms/librenms@f049593]: Upgrade LibreNMS to 24.10.0 - T381785
  • 16:26 klausman@cumin1002: START - Cookbook sre.hosts.reimage for host ml-lab1002.eqiad.wmnet with OS bookworm
  • 16:25 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-lab1002.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:20 klausman@cumin1002: START - Cookbook sre.hosts.provision for host ml-lab1002.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:15 klausman@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ml-lab1002
  • 16:15 klausman@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ml-lab1002
  • 16:13 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 16:13 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 16:13 moritzm: installing postgresql-15 security updates
  • 16:12 klausman@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:12 klausman@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update DNS for newly-provisioned ml-lab1002 - klausman@cumin1002"
  • 16:12 klausman@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update DNS for newly-provisioned ml-lab1002 - klausman@cumin1002"
  • 16:09 elukey@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'sync'.
  • 16:09 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on phab1004.eqiad.wmnet with reason: nftables
  • 16:09 elukey@deploy2002: helmfile [codfw] START helmfile.d/admin 'sync'.
  • 16:09 mutante: phabricator production host needs a maintenance reboot - expect short downtime
  • 16:09 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 0:10:00 on phab1004.eqiad.wmnet with reason: nftables
  • 16:09 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
  • 16:08 elukey@deploy2002: helmfile [eqiad] START helmfile.d/admin 'sync'.
  • 16:08 klausman@cumin1002: START - Cookbook sre.dns.netbox
  • 16:07 moritzm: manually clean out ganeti1009 from puppetdb, decom cookbook got interrupted T381652
  • 16:06 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: ganeti1009.eqiad.wmnet
  • 16:06 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: ganeti1009.eqiad.wmnet
  • 16:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 100%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71685 and previous config saved to /var/cache/conftool/dbconfig/20241210-160322-root.json
  • 15:53 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 15:53 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 15:52 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 15:52 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 15:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 75%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71684 and previous config saved to /var/cache/conftool/dbconfig/20241210-154816-root.json
  • 15:48 moritzm: installing usb.ids updates from Bullseye point release
  • 15:45 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 15:45 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 15:35 moritzm: installing imagemagick security updates
  • 15:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 50%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71683 and previous config saved to /var/cache/conftool/dbconfig/20241210-153311-root.json
  • 15:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 25%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71682 and previous config saved to /var/cache/conftool/dbconfig/20241210-151805-root.json
  • 15:15 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wdqs1025.eqiad.wmnet with reason: T376150
  • 15:15 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wdqs1025.eqiad.wmnet with reason: T376150
  • 15:06 Lucas_WMDE: UTC afternoon backport+config window done
  • 15:05 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Event Logging: Update streamName and schemaId (T364460) (duration: 25m 40s)
  • 15:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 10%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71681 and previous config saved to /var/cache/conftool/dbconfig/20241210-150300-root.json
  • 15:00 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, wangombe: Continuing with sync
  • 14:47 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ml-lab1002.eqiad.wmnet
  • 14:47 klausman@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:47 klausman@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ml-lab1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - klausman@cumin1002"
  • 14:46 klausman@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ml-lab1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - klausman@cumin1002"
  • 14:44 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, wangombe: Backport for Event Logging: Update streamName and schemaId (T364460) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 14:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 14:40 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Event Logging: Update streamName and schemaId (T364460)
  • 14:39 samtar@deploy2002: Finished scap sync-world: Backport for Revert "IS/IS-l: wgUseCodexSpecialBlock for beta, prod test.wiki" (duration: 10m 15s)
  • 14:32 samtar@deploy2002: samtar: Continuing with sync
  • 14:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 14:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 14:32 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1080,1082-1083].eqiad.wmnet
  • 14:32 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1080,1082-1083].eqiad.wmnet
  • 14:32 samtar@deploy2002: samtar: Backport for Revert "IS/IS-l: wgUseCodexSpecialBlock for beta, prod test.wiki" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:31 klausman@cumin1002: START - Cookbook sre.dns.netbox
  • 14:29 TheresNoTime: revert 1101545 for T377121
  • 14:29 samtar@deploy2002: Started scap sync-world: Backport for Revert "IS/IS-l: wgUseCodexSpecialBlock for beta, prod test.wiki"
  • 14:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 14:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 14:18 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1169 (T381532)', diff saved to https://phabricator.wikimedia.org/P71678 and previous config saved to /var/cache/conftool/dbconfig/20241210-141820-marostegui.json
  • 14:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 14:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 14:17 jelto: homer 'lsw1-f3-eqiad*' commit 'T377876', homer 'cr*eqiad*' commit 'T377876'
  • 14:15 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1081.eqiad.wmnet with OS bookworm
  • 13:01 klausman@cumin1002: START - Cookbook sre.hosts.decommission for hosts ml-lab1002.eqiad.wmnet
  • 13:00 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on ml-lab1002.eqiad.wmnet with reason: Moving to analytics network
  • 13:00 klausman@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on ml-lab1002.eqiad.wmnet with reason: Moving to analytics network
  • 12:55 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1081.eqiad.wmnet with OS bookworm
  • 12:53 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1081.eqiad.wmnet with OS bookworm
  • 12:19 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1083.eqiad.wmnet with OS bookworm
  • 12:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1080.eqiad.wmnet with OS bookworm
  • 12:11 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1082.eqiad.wmnet with OS bookworm
  • 12:07 samtar@deploy2002: Finished scap sync-world: Backport for IS/IS-l: wgUseCodexSpecialBlock for beta, prod test.wiki (T377121) (duration: 14m 06s)
  • 12:02 samtar@deploy2002: samtar: Continuing with sync
  • 12:00 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1083.eqiad.wmnet with reason: host reimage
  • 11:58 samtar@deploy2002: samtar: Backport for IS/IS-l: wgUseCodexSpecialBlock for beta, prod test.wiki (T377121) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:56 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1080.eqiad.wmnet with reason: host reimage
  • 11:53 samtar@deploy2002: Started scap sync-world: Backport for IS/IS-l: wgUseCodexSpecialBlock for beta, prod test.wiki (T377121)
  • 11:53 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1082.eqiad.wmnet with reason: host reimage
  • 11:51 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1080.eqiad.wmnet with reason: host reimage
  • 11:49 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1083.eqiad.wmnet with reason: host reimage
  • 11:49 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1082.eqiad.wmnet with reason: host reimage
  • 11:33 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1083.eqiad.wmnet with OS bookworm
  • 11:33 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1082.eqiad.wmnet with OS bookworm
  • 11:33 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1081.eqiad.wmnet with OS bookworm
  • 11:32 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1080.eqiad.wmnet with OS bookworm
  • 11:29 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1080.eqiad.wmnet wikikube-worker1081.eqiad.wmnet wikikube-worker1082.eqiad.wmnet wikikube-worker1083.eqiad.wmnet on all recursors
  • 11:29 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1080.eqiad.wmnet wikikube-worker1081.eqiad.wmnet wikikube-worker1082.eqiad.wmnet wikikube-worker1083.eqiad.wmnet on all recursors
  • 11:29 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1058 to wikikube-worker1083
  • 11:28 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1083
  • 11:27 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1083
  • 11:27 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:27 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1058 to wikikube-worker1083 - jelto@cumin1002"
  • 11:27 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1058 to wikikube-worker1083 - jelto@cumin1002"
  • 11:24 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:23 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1058 to wikikube-worker1083
  • 11:22 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1057 to wikikube-worker1082
  • 11:21 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1082
  • 11:20 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1082
  • 11:20 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:20 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1057 to wikikube-worker1082 - jelto@cumin1002"
  • 11:19 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1057 to wikikube-worker1082 - jelto@cumin1002"
  • 11:16 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:15 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1057 to wikikube-worker1082
  • 11:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1056 to wikikube-worker1081
  • 11:14 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1081
  • 11:14 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1081
  • 11:14 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:14 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1056 to wikikube-worker1081 - jelto@cumin1002"
  • 11:14 claime: Done deploying no-op cfssl-issuer admin_ng change - 1101455
  • 11:14 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1056 to wikikube-worker1081 - jelto@cumin1002"
  • 11:13 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 11:12 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 11:12 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 11:12 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 11:12 cgoubert@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:11 cgoubert@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 11:11 cgoubert@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 11:11 cgoubert@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 11:10 cgoubert@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:10 cgoubert@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 11:10 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:10 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1056 to wikikube-worker1081
  • 11:09 cgoubert@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 11:09 cgoubert@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 11:09 cgoubert@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 11:09 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1055 to wikikube-worker1080
  • 11:08 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1080
  • 11:08 cgoubert@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 11:08 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1080
  • 11:08 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:08 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1055 to wikikube-worker1080 - jelto@cumin1002"
  • 11:08 cgoubert@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:07 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1055 to wikikube-worker1080 - jelto@cumin1002"
  • 11:06 cgoubert@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 11:05 claime: Deploying no-op cfssl-issuer admin_ng change - 1101455
  • 11:02 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:01 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1055 to wikikube-worker1080
  • 10:56 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1055-1058].eqiad.wmnet
  • 10:53 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1055-1058].eqiad.wmnet
  • 10:20 marostegui@cumin1002: dbctl commit (dc=all): 'es2045 (re)pooling @ 100%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71675 and previous config saved to /var/cache/conftool/dbconfig/20241210-102038-root.json
  • 10:18 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 100%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71674 and previous config saved to /var/cache/conftool/dbconfig/20241210-101815-root.json
  • 10:17 aqu@deploy2002: Finished deploy [airflow-dags/analytics@7428c06]: Backfill webrequest actor metrics 2024 12 (duration: 20m 51s)
  • 10:05 marostegui@cumin1002: dbctl commit (dc=all): 'es2045 (re)pooling @ 75%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71673 and previous config saved to /var/cache/conftool/dbconfig/20241210-100532-root.json
  • 10:03 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 75%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71672 and previous config saved to /var/cache/conftool/dbconfig/20241210-100310-root.json
  • 10:00 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1076-1079].eqiad.wmnet
  • 10:00 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1076-1079].eqiad.wmnet
  • 09:57 jelto: homer 'lsw1-f3-eqiad*' commit 'T377876', homer 'lsw1-e3-eqiad*' commit 'T377876'
  • 09:56 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1078.eqiad.wmnet with OS bookworm
  • 09:56 aqu@deploy2002: Started deploy [airflow-dags/analytics@7428c06]: Backfill webrequest actor metrics 2024 12
  • 09:56 aqu@deploy2002: Finished deploy [airflow-dags/analytics@7428c06]: Backfill webrequest actor metrics 2024 12 (duration: 07m 37s)
  • 09:53 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1079.eqiad.wmnet with OS bookworm
  • 09:50 marostegui@cumin1002: dbctl commit (dc=all): 'es2045 (re)pooling @ 50%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71670 and previous config saved to /var/cache/conftool/dbconfig/20241210-095027-root.json
  • 09:49 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1077.eqiad.wmnet with OS bookworm
  • 09:48 aqu@deploy2002: Started deploy [airflow-dags/analytics@7428c06]: Backfill webrequest actor metrics 2024 12
  • 09:48 aqu@deploy2002: Finished deploy [airflow-dags/analytics@7428c06]: Backfill webrequest actor metrics 2024 12 (duration: 07m 22s)
  • 09:48 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 50%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71669 and previous config saved to /var/cache/conftool/dbconfig/20241210-094805-root.json
  • 09:46 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1076.eqiad.wmnet with OS bookworm
  • 09:45 joal@deploy2002: Finished deploy [analytics/refinery@0ffc330] (hadoop-test): Analytics backfill train - TEST [analytics/refinery@0ffc3306] (duration: 00m 26s)
  • 09:44 joal@deploy2002: Started deploy [analytics/refinery@0ffc330] (hadoop-test): Analytics backfill train - TEST [analytics/refinery@0ffc3306]
  • 09:44 joal@deploy2002: Finished deploy [analytics/refinery@0ffc330] (thin): Analytics backfill train - THIN [analytics/refinery@0ffc3306] (duration: 00m 31s)
  • 09:44 kevinbazira@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 09:44 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 09:44 joal@deploy2002: Started deploy [analytics/refinery@0ffc330] (thin): Analytics backfill train - THIN [analytics/refinery@0ffc3306]
  • 09:43 joal@deploy2002: Finished deploy [analytics/refinery@0ffc330]: Analytics backfill train [analytics/refinery@0ffc3306] (duration: 02m 04s)
  • 09:43 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 09:41 joal@deploy2002: Started deploy [analytics/refinery@0ffc330]: Analytics backfill train [analytics/refinery@0ffc3306]
  • 09:41 aqu@deploy2002: Started deploy [airflow-dags/analytics@7428c06]: Backfill webrequest actor metrics 2024 12
  • 09:37 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1078.eqiad.wmnet with reason: host reimage
  • 09:36 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 09:35 marostegui@cumin1002: dbctl commit (dc=all): 'es2045 (re)pooling @ 25%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71668 and previous config saved to /var/cache/conftool/dbconfig/20241210-093522-root.json
  • 09:34 moritzm: rebalance Ganeti cluster in codfw/c following server refresh T376594
  • 09:34 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1079.eqiad.wmnet with reason: host reimage
  • 09:33 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 25%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71667 and previous config saved to /var/cache/conftool/dbconfig/20241210-093259-root.json
  • 09:32 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 09:30 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1077.eqiad.wmnet with reason: host reimage
  • 09:27 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1076.eqiad.wmnet with reason: host reimage
  • 09:24 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1079.eqiad.wmnet with reason: host reimage
  • 09:24 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1078.eqiad.wmnet with reason: host reimage
  • 09:24 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1077.eqiad.wmnet with reason: host reimage
  • 09:23 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1076.eqiad.wmnet with reason: host reimage
  • 09:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1159 (re)pooling @ 100%: 5', diff saved to https://phabricator.wikimedia.org/P71666 and previous config saved to /var/cache/conftool/dbconfig/20241210-092243-root.json
  • 09:20 marostegui@cumin1002: dbctl commit (dc=all): 'es2045 (re)pooling @ 10%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71665 and previous config saved to /var/cache/conftool/dbconfig/20241210-092016-root.json
  • 09:17 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 10%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71664 and previous config saved to /var/cache/conftool/dbconfig/20241210-091754-root.json
  • 09:16 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:07 marostegui@cumin1002: dbctl commit (dc=all): 'db1159 (re)pooling @ 75%: 5', diff saved to https://phabricator.wikimedia.org/P71663 and previous config saved to /var/cache/conftool/dbconfig/20241210-090738-root.json
  • 09:07 marostegui@cumin1002: dbctl commit (dc=all): 'db1210 (re)pooling @ 100%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71662 and previous config saved to /var/cache/conftool/dbconfig/20241210-090732-root.json
  • 09:07 stevemunene@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 09:06 stevemunene@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 09:05 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1079.eqiad.wmnet with OS bookworm
  • 09:05 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1078.eqiad.wmnet with OS bookworm
  • 09:05 marostegui@cumin1002: dbctl commit (dc=all): 'es2045 (re)pooling @ 5%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71661 and previous config saved to /var/cache/conftool/dbconfig/20241210-090511-root.json
  • 09:04 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1077.eqiad.wmnet with OS bookworm
  • 09:04 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1076.eqiad.wmnet with OS bookworm
  • 09:02 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 5%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71660 and previous config saved to /var/cache/conftool/dbconfig/20241210-090248-root.json
  • 09:01 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1076.eqiad.wmnet wikikube-worker1077.eqiad.wmnet wikikube-worker1078.eqiad.wmnet wikikube-worker1079.eqiad.wmnet on all recursors
  • 09:01 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1076.eqiad.wmnet wikikube-worker1077.eqiad.wmnet wikikube-worker1078.eqiad.wmnet wikikube-worker1079.eqiad.wmnet on all recursors
  • 09:00 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1054 to wikikube-worker1079
  • 09:00 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1079
  • 08:59 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1079
  • 08:59 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:59 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1054 to wikikube-worker1079 - jelto@cumin1002"
  • 08:59 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1054 to wikikube-worker1079 - jelto@cumin1002"
  • 08:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet with reason: Alter table
  • 08:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet with reason: Alter table
  • 08:55 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:55 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1054 to wikikube-worker1079
  • 08:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1053 to wikikube-worker1078
  • 08:54 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1078
  • 08:54 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1078
  • 08:54 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:54 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1053 to wikikube-worker1078 - jelto@cumin1002"
  • 08:53 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1053 to wikikube-worker1078 - jelto@cumin1002"
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1159 (re)pooling @ 50%: 5', diff saved to https://phabricator.wikimedia.org/P71659 and previous config saved to /var/cache/conftool/dbconfig/20241210-085232-root.json
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1210 (re)pooling @ 75%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71658 and previous config saved to /var/cache/conftool/dbconfig/20241210-085227-root.json
  • 08:50 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:50 marostegui@cumin1002: dbctl commit (dc=all): 'es2045 (re)pooling @ 1%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71657 and previous config saved to /var/cache/conftool/dbconfig/20241210-085006-root.json
  • 08:50 elukey: manual run of docker-system-prune-all on build2001 to free some space
  • 08:49 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1053 to wikikube-worker1078
  • 08:49 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1052 to wikikube-worker1077
  • 08:49 marostegui@cumin1002: dbctl commit (dc=all): 'Change es2024 weight', diff saved to https://phabricator.wikimedia.org/P71656 and previous config saved to /var/cache/conftool/dbconfig/20241210-084932-marostegui.json
  • 08:49 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1077
  • 08:48 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1077
  • 08:48 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:48 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1052 to wikikube-worker1077 - jelto@cumin1002"
  • 08:48 marostegui@cumin1002: dbctl commit (dc=all): 'Add es2045 to dbctl T381259', diff saved to https://phabricator.wikimedia.org/P71655 and previous config saved to /var/cache/conftool/dbconfig/20241210-084844-marostegui.json
  • 08:48 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1052 to wikikube-worker1077 - jelto@cumin1002"
  • 08:47 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 1%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71654 and previous config saved to /var/cache/conftool/dbconfig/20241210-084743-root.json
  • 08:44 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:44 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1052 to wikikube-worker1077
  • 08:42 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1051 to wikikube-worker1076
  • 08:42 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1076
  • 08:41 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1076
  • 08:41 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:41 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1051 to wikikube-worker1076 - jelto@cumin1002"
  • 08:41 gmodena: UTC morning backport deploys done
  • 08:41 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1051 to wikikube-worker1076 - jelto@cumin1002"
  • 08:39 gmodena@deploy2002: Finished scap sync-world: Backport for EventStreamConfig: add content_history streams. (T381322) (duration: 17m 16s)
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1159 (re)pooling @ 25%: 5', diff saved to https://phabricator.wikimedia.org/P71653 and previous config saved to /var/cache/conftool/dbconfig/20241210-083726-root.json
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1210 (re)pooling @ 50%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71652 and previous config saved to /var/cache/conftool/dbconfig/20241210-083721-root.json
  • 08:37 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:36 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1051 to wikikube-worker1076
  • 08:34 gmodena@deploy2002: gmodena: Continuing with sync
  • 08:34 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1051-1054].eqiad.wmnet
  • 08:31 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1051-1054].eqiad.wmnet
  • 08:26 gmodena@deploy2002: gmodena: Backport for EventStreamConfig: add content_history streams. (T381322) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:22 gmodena@deploy2002: Started scap sync-world: Backport for EventStreamConfig: add content_history streams. (T381322)
  • 08:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1159 (re)pooling @ 10%: 5', diff saved to https://phabricator.wikimedia.org/P71650 and previous config saved to /var/cache/conftool/dbconfig/20241210-082221-root.json
  • 08:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1210 (re)pooling @ 25%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71649 and previous config saved to /var/cache/conftool/dbconfig/20241210-082216-root.json
  • 08:20 marostegui@cumin1002: dbctl commit (dc=all): 'Add db1159 to dbctl depooled T381550', diff saved to https://phabricator.wikimedia.org/P71648 and previous config saved to /var/cache/conftool/dbconfig/20241210-082020-marostegui.json
  • 08:07 marostegui@cumin1002: dbctl commit (dc=all): 'db1210 (re)pooling @ 10%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71647 and previous config saved to /var/cache/conftool/dbconfig/20241210-080710-root.json
  • 05:01 mwpresync@deploy2002: Pruned MediaWiki: 1.44.0-wmf.4 (duration: 01m 25s)

2024-12-09

  • 22:29 ryankemper: [wdqs-internal graph split] Cleared away old categories units on 5 hosts (`wdqs20[18-20],wdqs202[6-7]`)
  • 22:28 cjming: end of UTC late backport window
  • 22:23 cjming@deploy2002: Finished scap sync-world: Backport for idwikivoyage: add timezone, sitename and project namespace (T381080) (duration: 10m 46s)
  • 22:17 cjming@deploy2002: cjming, anzx: Continuing with sync
  • 22:16 cjming@deploy2002: cjming, anzx: Backport for idwikivoyage: add timezone, sitename and project namespace (T381080) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:12 cjming@deploy2002: Started scap sync-world: Backport for idwikivoyage: add timezone, sitename and project namespace (T381080)
  • 22:10 cjming@deploy2002: Finished scap sync-world: Backport for jawiki: lift IP cap on 2024-12-17 and 2025-01-14 for Edit-a-ton (T381729) (duration: 10m 02s)
  • 22:05 cjming@deploy2002: cjming, anzx: Continuing with sync
  • 22:05 cjming@deploy2002: cjming, anzx: Backport for jawiki: lift IP cap on 2024-12-17 and 2025-01-14 for Edit-a-ton (T381729) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:00 cjming@deploy2002: Started scap sync-world: Backport for jawiki: lift IP cap on 2024-12-17 and 2025-01-14 for Edit-a-ton (T381729)
  • 21:57 cjming@deploy2002: Finished scap sync-world: Backport for Disable QuickSurveys for recommendations (T379241 T380778) (duration: 10m 15s)
  • 21:52 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 21:51 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 21:51 cjming@deploy2002: cjming, jdlrobson: Continuing with sync
  • 21:51 cjming@deploy2002: cjming, jdlrobson: Backport for Disable QuickSurveys for recommendations (T379241 T380778) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:47 cjming@deploy2002: Started scap sync-world: Backport for Disable QuickSurveys for recommendations (T379241 T380778)
  • 21:46 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 21:46 cjming@deploy2002: Finished scap sync-world: Backport for Expand support for dark mode for anonymous users (itwiki, enwikivoyage) (T379352) (duration: 11m 08s)
  • 21:44 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 21:40 cjming@deploy2002: jdlrobson, cjming: Continuing with sync
  • 21:39 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 21:39 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 21:38 cjming@deploy2002: jdlrobson, cjming: Backport for Expand support for dark mode for anonymous users (itwiki, enwikivoyage) (T379352) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:34 cjming@deploy2002: Started scap sync-world: Backport for Expand support for dark mode for anonymous users (itwiki, enwikivoyage) (T379352)
  • 21:34 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 21:33 cjming@deploy2002: Finished scap sync-world: Backport for cirrus: Enable mlr-2024 for select wikis (T377128) (duration: 10m 28s)
  • 21:32 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 21:29 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 21:28 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 21:27 cjming@deploy2002: cjming, ebernhardson: Continuing with sync
  • 21:27 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 21:27 cjming@deploy2002: cjming, ebernhardson: Backport for cirrus: Enable mlr-2024 for select wikis (T377128) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:22 cjming@deploy2002: Started scap sync-world: Backport for cirrus: Enable mlr-2024 for select wikis (T377128)
  • 21:21 cjming@deploy2002: Finished scap sync-world: Backport for Actually load IRS in production (T374105) (duration: 12m 29s)
  • 21:14 cjming@deploy2002: cjming, mszabo: Continuing with sync
  • 21:13 cjming@deploy2002: cjming, mszabo: Backport for Actually load IRS in production (T374105) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:09 cjming@deploy2002: Started scap sync-world: Backport for Actually load IRS in production (T374105)
  • 20:25 aqu@deploy2002: Finished deploy [airflow-dags/analytics@1d9b4b5]: Canary events generation: pooling (duration: 01m 46s)
  • 20:23 aqu@deploy2002: Started deploy [airflow-dags/analytics@1d9b4b5]: Canary events generation: pooling
  • 20:07 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching aqs1010.eqiad.wmnet: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 19:58 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching aqs1010.eqiad.wmnet: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 18:17 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
  • 18:17 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
  • 18:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1025.eqiad.wmnet with OS bullseye
  • 17:52 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 17:51 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 17:47 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1025.eqiad.wmnet with reason: host reimage
  • 17:44 cdanis: 💙cdanis@cumin1002.eqiad.wmnet ~ 🕧☕ sudo cumin 'A:cp' 'enable-puppet "cdanis testing in production I464702d8fb T381771"'
  • 17:43 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1025.eqiad.wmnet with reason: host reimage
  • 17:36 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 17:35 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 17:22 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1072,1074-1075].eqiad.wmnet
  • 17:22 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1072,1074-1075].eqiad.wmnet
  • 17:20 jelto: homer 'lsw1-e3-eqiad*' commit 'T377876'
  • 17:18 cdanis: T381771 💙cdanis@cp1107.eqiad.wmnet ~ 🕧☕ sudo run-puppet-agent --force
  • 17:16 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1073.eqiad.wmnet with OS bookworm
  • 17:15 cdanis: 💙cdanis@cumin1002.eqiad.wmnet ~ 🕛☕ sudo cumin 'A:cp' 'disable-puppet "cdanis testing in production I464702d8fb T381771"'
  • 17:14 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1025.eqiad.wmnet with OS bullseye
  • 16:59 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 16:58 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 16:47 hnowlan@deploy2002: Finished scap sync-world: Rebuild and deploy to pick up new php8.1 base (duration: 21m 09s)
  • 16:26 hnowlan@deploy2002: Started scap sync-world: Rebuild and deploy to pick up new php8.1 base
  • 16:12 moritzm: rebalance Ganeti cluster in codfw/B following server refresh T376594
  • 16:06 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2089-2090].codfw.wmnet
  • 16:06 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2089-2090].codfw.wmnet
  • 16:05 hnowlan@deploy2002: Finished scap sync-world: Rebuild and deploy to pick up new php8.1 base (duration: 23m 00s)
  • 15:56 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1073.eqiad.wmnet with OS bookworm
  • 15:55 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1073.eqiad.wmnet with OS bookworm
  • 15:45 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2090.codfw.wmnet with OS bookworm
  • 15:44 hnowlan@deploy2002: Started scap sync-world: Rebuild and deploy to pick up new php8.1 base
  • 15:43 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2089.codfw.wmnet with OS bookworm
  • 15:34 Emperor: depool/restart swift/repool ms-fe1012
  • 15:34 mszabo@deploy2002: Finished scap sync-world: Backport for dialog: Fix wrong title on Types of unacceptable behavior step (T381529), dialog: Fix spacing between buttons in the dialog footer (T381530), Prep IRS config for testwiki (duration: 13m 39s)
  • 15:33 Emperor: depool/restart swift/repool ms-fe1010
  • 15:28 mszabo@deploy2002: mszabo: Continuing with sync
  • 15:25 mszabo@deploy2002: mszabo: Backport for dialog: Fix wrong title on Types of unacceptable behavior step (T381529), dialog: Fix spacing between buttons in the dialog footer (T381530), Prep IRS config for testwiki synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:25 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2090.codfw.wmnet with reason: host reimage
  • 15:22 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2089.codfw.wmnet with reason: host reimage
  • 15:21 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1075.eqiad.wmnet with OS bookworm
  • 15:20 mszabo@deploy2002: Started scap sync-world: Backport for dialog: Fix wrong title on Types of unacceptable behavior step (T381529), dialog: Fix spacing between buttons in the dialog footer (T381530), Prep IRS config for testwiki
  • 15:20 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2090.codfw.wmnet with reason: host reimage
  • 15:18 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1074.eqiad.wmnet with OS bookworm
  • 15:18 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2089.codfw.wmnet with reason: host reimage
  • 15:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1072.eqiad.wmnet with OS bookworm
  • 15:04 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1075.eqiad.wmnet with reason: host reimage
  • 15:01 Lucas_WMDE: UTC afternoon backport+config window done
  • 15:00 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1074.eqiad.wmnet with reason: host reimage
  • 15:00 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2090
  • 15:00 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2090
  • 15:00 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2090.codfw.wmnet with OS bookworm
  • 15:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1025.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:59 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2090.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:59 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for idwikivoyage: add logo, wordmark (T381080) (duration: 11m 44s)
  • 14:59 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2089
  • 14:59 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2089
  • 14:58 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2089.codfw.wmnet with OS bookworm
  • 14:58 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2089.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:56 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1072.eqiad.wmnet with reason: host reimage
  • 14:54 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1075.eqiad.wmnet with reason: host reimage
  • 14:53 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1074.eqiad.wmnet with reason: host reimage
  • 14:53 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, anzx: Continuing with sync
  • 14:53 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1072.eqiad.wmnet with reason: host reimage
  • 14:51 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, anzx: Backport for idwikivoyage: add logo, wordmark (T381080) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:47 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for idwikivoyage: add logo, wordmark (T381080)
  • 14:46 bking@cumin2002: START - Cookbook sre.hosts.provision for host wdqs1025.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:44 jayme@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2089.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:44 jayme@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2090.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:39 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Translate: Enable message group subscription for 6 wikis (T372386) (duration: 14m 34s)
  • 14:36 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1075.eqiad.wmnet with OS bookworm
  • 14:35 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1074.eqiad.wmnet with OS bookworm
  • 14:35 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1073.eqiad.wmnet with OS bookworm
  • 14:35 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1072.eqiad.wmnet with OS bookworm
  • 14:34 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wikikube-worker[2089-2090].codfw.wmnet with reason: reimage
  • 14:34 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wikikube-worker[2089-2090].codfw.wmnet with reason: reimage
  • 14:34 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2089-2090].codfw.wmnet
  • 14:33 lucaswerkmeister-wmde@deploy2002: abi, lucaswerkmeister-wmde: Continuing with sync
  • 14:33 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2089-2090].codfw.wmnet
  • 14:29 lucaswerkmeister-wmde@deploy2002: abi, lucaswerkmeister-wmde: Backport for Translate: Enable message group subscription for 6 wikis (T372386) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:25 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1072.eqiad.wmnet wikikube-worker1073.eqiad.wmnet wikikube-worker1074.eqiad.wmnet wikikube-worker1075.eqiad.wmnet on all recursors
  • 14:25 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1072.eqiad.wmnet wikikube-worker1073.eqiad.wmnet wikikube-worker1074.eqiad.wmnet wikikube-worker1075.eqiad.wmnet on all recursors
  • 14:25 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1050 to wikikube-worker1075
  • 14:24 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1075
  • 14:24 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Translate: Enable message group subscription for 6 wikis (T372386)
  • 14:24 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1075
  • 14:24 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:24 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1050 to wikikube-worker1075 - jelto@cumin1002"
  • 14:23 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Add Metrics Platform stream configuration for translate_extension (T364460) (duration: 17m 12s)
  • 14:23 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1050 to wikikube-worker1075 - jelto@cumin1002"
  • 14:19 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:18 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1050 to wikikube-worker1075
  • 14:17 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1049 to wikikube-worker1074
  • 14:17 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1074
  • 14:16 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, wangombe: Continuing with sync
  • 14:16 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1074
  • 14:16 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:16 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1049 to wikikube-worker1074 - jelto@cumin1002"
  • 14:15 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1049 to wikikube-worker1074 - jelto@cumin1002"
  • 14:12 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:12 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1049 to wikikube-worker1074
  • 14:11 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, wangombe: Backport for Add Metrics Platform stream configuration for translate_extension (T364460) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:11 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1048 to wikikube-worker1073
  • 14:10 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1073
  • 14:10 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1073
  • 14:10 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:10 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1048 to wikikube-worker1073 - jelto@cumin1002"
  • 14:09 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1048 to wikikube-worker1073 - jelto@cumin1002"
  • 14:06 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:06 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Add Metrics Platform stream configuration for translate_extension (T364460)
  • 14:05 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1048 to wikikube-worker1073
  • 14:04 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1047 to wikikube-worker1072
  • 14:03 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1072
  • 14:03 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1072
  • 14:03 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:03 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1047 to wikikube-worker1072 - jelto@cumin1002"
  • 14:02 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1047 to wikikube-worker1072 - jelto@cumin1002"
  • 14:00 Lucas_WMDE: 'Updated the Wikidata property suggester with data from 20241125’s JSON dump: mwscript-k8s --attach -- extensions/PropertySuggester/maintenance/UpdateTable.php --wiki wikidatawiki --file php://stdin < wbs_propertypairs.csv # T377986, T376604'
  • 13:58 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:57 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1047 to wikikube-worker1072
  • 13:54 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1047-1050].eqiad.wmnet
  • 13:49 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1047-1050].eqiad.wmnet
  • 13:46 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2103-2106].codfw.wmnet
  • 13:46 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2103-2106].codfw.wmnet
  • 13:30 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1068,1070-1071].eqiad.wmnet
  • 13:30 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1068,1070-1071].eqiad.wmnet
  • 13:16 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 12:57 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1069.eqiad.wmnet with OS bookworm
  • 12:26 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2106.codfw.wmnet with OS bookworm
  • 12:22 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2105.codfw.wmnet with OS bookworm
  • 12:13 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1068.eqiad.wmnet with OS bookworm
  • 12:12 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2104.codfw.wmnet with OS bookworm
  • 12:08 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2103.codfw.wmnet with OS bookworm
  • 12:07 moritzm: installing reportbug bugfix updates
  • 12:06 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2106.codfw.wmnet with reason: host reimage
  • 12:05 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1069.eqiad.wmnet with OS bookworm
  • 12:04 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1069.eqiad.wmnet with OS bookworm
  • 12:02 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2105.codfw.wmnet with reason: host reimage
  • 11:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1071.eqiad.wmnet with OS bookworm
  • 11:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1068.eqiad.wmnet with reason: host reimage
  • 11:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1070.eqiad.wmnet with OS bookworm
  • 11:51 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2104.codfw.wmnet with reason: host reimage
  • 11:49 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1068.eqiad.wmnet with reason: host reimage
  • 11:48 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2103.codfw.wmnet with reason: host reimage
  • 11:46 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2106.codfw.wmnet with reason: host reimage
  • 11:45 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2105.codfw.wmnet with reason: host reimage
  • 11:45 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2104.codfw.wmnet with reason: host reimage
  • 11:45 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2103.codfw.wmnet with reason: host reimage
  • 11:42 root@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1210.eqiad.wmnet onto db1159.eqiad.wmnet
  • 11:37 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1071.eqiad.wmnet with reason: host reimage
  • 11:34 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1070.eqiad.wmnet with reason: host reimage
  • 11:32 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1068.eqiad.wmnet with OS bookworm
  • 11:30 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1071.eqiad.wmnet with reason: host reimage
  • 11:30 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1070.eqiad.wmnet with reason: host reimage
  • 11:27 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1068.eqiad.wmnet with OS bookworm
  • 11:27 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2106
  • 11:27 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2106
  • 11:26 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2106.codfw.wmnet with OS bookworm
  • 11:26 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2105
  • 11:26 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2105
  • 11:26 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2105.codfw.wmnet with OS bookworm
  • 11:25 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2104
  • 11:25 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2104
  • 11:25 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2104.codfw.wmnet with OS bookworm
  • 11:25 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2103
  • 11:25 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2103
  • 11:25 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2103.codfw.wmnet with OS bookworm
  • 11:21 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2104.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:21 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2103.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:21 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2106.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:21 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2105.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:12 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1071.eqiad.wmnet with OS bookworm
  • 11:12 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1070.eqiad.wmnet with OS bookworm
  • 11:11 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1069.eqiad.wmnet with OS bookworm
  • 11:11 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1068.eqiad.wmnet with OS bookworm
  • 11:05 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1068.eqiad.wmnet wikikube-worker1069.eqiad.wmnet wikikube-worker1070.eqiad.wmnet wikikube-worker1071.eqiad.wmnet on all recursors
  • 11:05 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1068.eqiad.wmnet wikikube-worker1069.eqiad.wmnet wikikube-worker1070.eqiad.wmnet wikikube-worker1071.eqiad.wmnet on all recursors
  • 11:05 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1046 to wikikube-worker1071
  • 11:04 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1071
  • 11:03 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1071
  • 11:03 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:03 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1046 to wikikube-worker1071 - jelto@cumin1002"
  • 11:03 root@cumin1002: START - Cookbook sre.mysql.clone of db1210.eqiad.wmnet onto db1159.eqiad.wmnet
  • 11:03 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1046 to wikikube-worker1071 - jelto@cumin1002"
  • 11:02 jayme@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2106.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:01 jayme@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2105.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:01 jayme@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2104.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:01 jayme@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2103.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: cloning
  • 11:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: cloning
  • 11:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1159.eqiad.wmnet with reason: cloning
  • 11:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1159.eqiad.wmnet with reason: cloning
  • 10:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1210 to clone db1159 T381550', diff saved to https://phabricator.wikimedia.org/P71640 and previous config saved to /var/cache/conftool/dbconfig/20241209-105941-marostegui.json
  • 10:59 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:59 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1046 to wikikube-worker1071
  • 10:58 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1045 to wikikube-worker1070
  • 10:57 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1070
  • 10:56 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1070
  • 10:56 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:56 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1045 to wikikube-worker1070 - jelto@cumin1002"
  • 10:56 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1045 to wikikube-worker1070 - jelto@cumin1002"
  • 10:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2024 to clone es2045', diff saved to https://phabricator.wikimedia.org/P71639 and previous config saved to /var/cache/conftool/dbconfig/20241209-105508-marostegui.json
  • 10:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es2024.codfw.wmnet with reason: cloning
  • 10:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es2024.codfw.wmnet with reason: cloning
  • 10:52 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:52 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1045 to wikikube-worker1070
  • 10:51 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1044 to wikikube-worker1069
  • 10:51 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1069
  • 10:50 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1069
  • 10:50 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:50 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1044 to wikikube-worker1069 - jelto@cumin1002"
  • 10:49 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1044 to wikikube-worker1069 - jelto@cumin1002"
  • 10:45 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:45 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1044 to wikikube-worker1069
  • 10:44 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1043 to wikikube-worker1068
  • 10:44 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1068
  • 10:42 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1068
  • 10:42 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:42 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1043 to wikikube-worker1068 - jelto@cumin1002"
  • 10:42 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1043 to wikikube-worker1068 - jelto@cumin1002"
  • 10:41 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2103-2106].codfw.wmnet
  • 10:39 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2103-2106].codfw.wmnet
  • 10:38 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:38 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wikikube-worker[2103-2106].codfw.wmnet with reason: reimage
  • 10:38 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1043 to wikikube-worker1068
  • 10:38 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wikikube-worker[2103-2106].codfw.wmnet with reason: reimage
  • 10:36 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1043-1046].eqiad.wmnet
  • 10:35 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2074-2075,2091,2124].codfw.wmnet
  • 10:35 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2074-2075,2091,2124].codfw.wmnet
  • 10:34 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1043-1046].eqiad.wmnet
  • 10:30 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2091.codfw.wmnet with OS bookworm
  • 10:25 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2075.codfw.wmnet with OS bookworm
  • 10:23 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2074.codfw.wmnet with OS bookworm
  • 10:20 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1064-1067].eqiad.wmnet
  • 10:20 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1064-1067].eqiad.wmnet
  • 10:10 moritzm: rebalance Ganeti cluster in codfw/A following server refresh T376594
  • 10:10 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2091.codfw.wmnet with reason: host reimage
  • 10:06 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 10:06 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2075.codfw.wmnet with reason: host reimage
  • 10:04 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1065.eqiad.wmnet with OS bookworm
  • 10:03 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2074.codfw.wmnet with reason: host reimage
  • 10:02 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2124.codfw.wmnet with OS bookworm
  • 10:00 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2091.codfw.wmnet with reason: host reimage
  • 09:59 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2075.codfw.wmnet with reason: host reimage
  • 09:59 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2074.codfw.wmnet with reason: host reimage
  • 09:56 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1067.eqiad.wmnet with OS bookworm
  • 09:54 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1066.eqiad.wmnet with OS bookworm
  • 09:49 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1064.eqiad.wmnet with OS bookworm
  • 09:46 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1065.eqiad.wmnet with reason: host reimage
  • 09:42 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2124.codfw.wmnet with reason: host reimage
  • 09:40 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2075.codfw.wmnet with OS bookworm
  • 09:40 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2074.codfw.wmnet with OS bookworm
  • 09:40 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2091.codfw.wmnet with OS bookworm
  • 09:39 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2075.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:39 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2074.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:39 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2091.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:38 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2124.codfw.wmnet with reason: host reimage
  • 09:38 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1067.eqiad.wmnet with reason: host reimage
  • 09:35 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1066.eqiad.wmnet with reason: host reimage
  • 09:31 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1064.eqiad.wmnet with reason: host reimage
  • 09:30 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1067.eqiad.wmnet with reason: host reimage
  • 09:29 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1066.eqiad.wmnet with reason: host reimage
  • 09:28 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1065.eqiad.wmnet with reason: host reimage
  • 09:28 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1064.eqiad.wmnet with reason: host reimage
  • 09:18 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2124.codfw.wmnet with OS bookworm
  • 09:16 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2124.codfw.wmnet with OS bookworm
  • 09:16 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2124.codfw.wmnet with OS bookworm
  • 09:14 jayme@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2091.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:13 jayme@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2075.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:13 jayme@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2074.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:12 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1067.eqiad.wmnet with OS bookworm
  • 09:12 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1066.eqiad.wmnet with OS bookworm
  • 09:10 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1065.eqiad.wmnet with OS bookworm
  • 09:10 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1064.eqiad.wmnet with OS bookworm
  • 09:07 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1064.eqiad.wmnet wikikube-worker1065.eqiad.wmnet wikikube-worker1066.eqiad.wmnet wikikube-worker1067.eqiad.wmnet on all recursors
  • 09:07 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1064.eqiad.wmnet wikikube-worker1065.eqiad.wmnet wikikube-worker1066.eqiad.wmnet wikikube-worker1067.eqiad.wmnet on all recursors
  • 09:07 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1042 to wikikube-worker1067
  • 09:06 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1067
  • 09:05 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1067
  • 09:04 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:04 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1042 to wikikube-worker1067 - jelto@cumin1002"
  • 09:04 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1042 to wikikube-worker1067 - jelto@cumin1002"
  • 09:04 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2074-2075,2091,2124].codfw.wmnet
  • 09:02 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2074-2075,2091,2124].codfw.wmnet
  • 09:01 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wikikube-worker[2074-2075,2091,2124].codfw.wmnet with reason: reimage
  • 09:01 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:00 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wikikube-worker[2074-2075,2091,2124].codfw.wmnet with reason: reimage
  • 09:00 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1042 to wikikube-worker1067
  • 08:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1041 to wikikube-worker1066
  • 08:59 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1066
  • 08:58 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1066
  • 08:58 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:58 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1041 to wikikube-worker1066 - jelto@cumin1002"
  • 08:57 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1041 to wikikube-worker1066 - jelto@cumin1002"
  • 08:54 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:53 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1041 to wikikube-worker1066
  • 08:53 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1040 to wikikube-worker1065
  • 08:52 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1065
  • 08:50 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1065
  • 08:50 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:50 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1040 to wikikube-worker1065 - jelto@cumin1002"
  • 08:50 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1040 to wikikube-worker1065 - jelto@cumin1002"
  • 08:46 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:45 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1040 to wikikube-worker1065
  • 08:43 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1039 to wikikube-worker1064
  • 08:42 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1064
  • 08:41 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:40 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1064
  • 08:40 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:40 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1039 to wikikube-worker1064 - jelto@cumin1002"
  • 08:39 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1039 to wikikube-worker1064 - jelto@cumin1002"
  • 08:36 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:35 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:35 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1039 to wikikube-worker1064
  • 08:35 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:34 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:32 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1039-1042].eqiad.wmnet
  • 08:29 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1039-1042].eqiad.wmnet
  • 07:34 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1056.eqiad.wmnet
  • 07:34 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1056.eqiad.wmnet
  • 07:18 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 06:29 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 06:28 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 05:54 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 05:53 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 05:41 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 05:40 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 05:39 tstarling@deploy2002: Finished deploy [restbase/deploy@8184836]: also deploy to restbase2036-9 T380726 T377896 (duration: 16m 06s)
  • 05:23 tstarling@deploy2002: Started deploy [restbase/deploy@8184836]: also deploy to restbase2036-9 T380726 T377896
  • 04:45 tstarling@deploy2002: Finished deploy [restbase/deploy@0531d4e]: try again after removing decom servers T380790 T380726 (duration: 14m 36s)
  • 04:31 tstarling@deploy2002: Started deploy [restbase/deploy@0531d4e]: try again after removing decom servers T380790 T380726
  • 04:29 tstarling@deploy2002: Finished deploy [restbase/deploy@27f4a8e]: try again, seems like restbase2026 at least was skipped T380726 (duration: 09m 00s)
  • 04:20 tstarling@deploy2002: Started deploy [restbase/deploy@27f4a8e]: try again, seems like restbase2026 at least was skipped T380726
  • 04:08 tstarling@deploy2002: Finished deploy [restbase/deploy@27f4a8e]: add 3 wikis T380726 (duration: 10m 46s)
  • 03:58 tstarling@deploy2002: Started deploy [restbase/deploy@27f4a8e]: add 3 wikis T380726
  • 03:55 tstarling@deploy2002: Finished deploy [restbase/deploy@6d0b97e]: no-op test deploy (duration: 11m 22s)
  • 03:44 tstarling@deploy2002: Started deploy [restbase/deploy@6d0b97e]: no-op test deploy
  • 03:30 tstarling@deploy2002: Finished scap sync-world: Backport for Prepare for migration of the Interwiki extension to core (T33951) (duration: 31m 17s)
  • 03:20 tstarling@deploy2002: tstarling: Continuing with sync
  • 03:10 tstarling@deploy2002: tstarling: Backport for Prepare for migration of the Interwiki extension to core (T33951) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 02:59 tstarling@deploy2002: Started scap sync-world: Backport for Prepare for migration of the Interwiki extension to core (T33951)

2024-12-08

  • 19:25 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 19:24 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply

2024-12-07

  • 00:33 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:32 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply

2024-12-06

  • 23:18 mutante: clouddumps1001/clouddumps1002: rm /srv/dumps/xmldatadumps/public/other/misc/phabricator_public.dump - an uncompressed old file from Sep 2023 - normal dumps are gzipped and current
  • 22:33 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 22:33 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 20:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1025.eqiad.wmnet with OS bullseye
  • 20:12 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1025.eqiad.wmnet with reason: host reimage
  • 20:08 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1025.eqiad.wmnet with reason: host reimage
  • 19:40 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1025.eqiad.wmnet with OS bullseye
  • 19:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1025.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 19:00 bking@cumin2002: START - Cookbook sre.hosts.provision for host wdqs1025.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 17:31 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1057.eqiad.wmnet with OS bookworm
  • 17:29 topranks: splitting codfw -> eqsin traffic over path via ulsfo as direct link is saturated
  • 17:08 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1058-1063].eqiad.wmnet
  • 17:08 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1058-1063].eqiad.wmnet
  • 17:08 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1059.eqiad.wmnet with OS bookworm
  • 16:48 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1059.eqiad.wmnet with reason: host reimage
  • 16:45 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1059.eqiad.wmnet with reason: host reimage
  • 16:30 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1061.eqiad.wmnet with OS bookworm
  • 16:29 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1059.eqiad.wmnet with OS bookworm
  • 16:29 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1059.eqiad.wmnet with OS bookworm
  • 16:28 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1063.eqiad.wmnet with OS bookworm
  • 16:24 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1062.eqiad.wmnet with OS bookworm
  • 16:20 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1060.eqiad.wmnet with OS bookworm
  • 16:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1058.eqiad.wmnet with OS bookworm
  • 16:12 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1061.eqiad.wmnet with reason: host reimage
  • 16:11 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1057.eqiad.wmnet with OS bookworm
  • 16:10 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1057.eqiad.wmnet with OS bookworm
  • 16:09 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1063.eqiad.wmnet with reason: host reimage
  • 16:05 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1062.eqiad.wmnet with reason: host reimage
  • 16:01 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1060.eqiad.wmnet with reason: host reimage
  • 15:59 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1063.eqiad.wmnet with reason: host reimage
  • 15:58 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1062.eqiad.wmnet with reason: host reimage
  • 15:58 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1058.eqiad.wmnet with reason: host reimage
  • 15:58 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1061.eqiad.wmnet with reason: host reimage
  • 15:57 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1060.eqiad.wmnet with reason: host reimage
  • 15:54 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1058.eqiad.wmnet with reason: host reimage
  • 15:43 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1063.eqiad.wmnet with OS bookworm
  • 15:43 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1062.eqiad.wmnet with OS bookworm
  • 15:42 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1061.eqiad.wmnet with OS bookworm
  • 15:41 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1060.eqiad.wmnet with OS bookworm
  • 15:41 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1059.eqiad.wmnet with OS bookworm
  • 15:39 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1058.eqiad.wmnet with OS bookworm
  • 15:36 kamila@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker1058.eqiad.wmnet
  • 15:36 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1058.eqiad.wmnet with OS bullseye
  • 15:36 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1058.eqiad.wmnet with OS bullseye
  • 15:36 kamila@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker1058.eqiad.wmnet
  • 15:34 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1058.eqiad.wmnet wikikube-worker1059.eqiad.wmnet wikikube-worker1060.eqiad.wmnet wikikube-worker1061.eqiad.wmnet wikikube-worker1062.eqiad.wmnet wikikube-worker1063.eqiad.wmnet on all recursors
  • 15:34 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1058.eqiad.wmnet wikikube-worker1059.eqiad.wmnet wikikube-worker1060.eqiad.wmnet wikikube-worker1061.eqiad.wmnet wikikube-worker1062.eqiad.wmnet wikikube-worker1063.eqiad.wmnet on all recursors
  • 15:33 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1434 to wikikube-worker1062
  • 15:33 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1062
  • 15:32 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1062
  • 15:32 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:31 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1435 to wikikube-worker1063
  • 15:30 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1063
  • 15:29 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:29 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1063
  • 15:29 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:28 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1433 to wikikube-worker1061
  • 15:28 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1061
  • 15:27 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:27 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1061
  • 15:26 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:26 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1433 to wikikube-worker1061 - kamila@cumin1002"
  • 15:26 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1433 to wikikube-worker1061 - kamila@cumin1002"
  • 15:24 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1432 to wikikube-worker1060
  • 15:23 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1060
  • 15:22 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:22 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1060
  • 15:22 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:22 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1432 to wikikube-worker1060 - kamila@cumin1002"
  • 15:22 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1432 to wikikube-worker1060 - kamila@cumin1002"
  • 15:20 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1435 to wikikube-worker1063
  • 15:20 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1434 to wikikube-worker1062
  • 15:20 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1433 to wikikube-worker1061
  • 15:19 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1431 to wikikube-worker1059
  • 15:19 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1059
  • 15:18 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:18 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1059
  • 15:18 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:18 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1431 to wikikube-worker1059 - kamila@cumin1002"
  • 15:18 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1432 to wikikube-worker1060
  • 15:18 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1431 to wikikube-worker1059 - kamila@cumin1002"
  • 15:15 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1430 to wikikube-worker1058
  • 15:14 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1058
  • 15:14 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:13 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1058
  • 15:13 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:13 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1430 to wikikube-worker1058 - kamila@cumin1002"
  • 15:13 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1430 to wikikube-worker1058 - kamila@cumin1002"
  • 15:10 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1431 to wikikube-worker1059
  • 15:08 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:08 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1430 to wikikube-worker1058
  • 15:05 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[1430-1435].eqiad.wmnet
  • 15:02 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[1430-1435].eqiad.wmnet
  • 14:50 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1057.eqiad.wmnet with OS bookworm
  • 14:49 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1057.eqiad.wmnet with OS bookworm
  • 14:27 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: ganeti1009.eqiad.wmnet
  • 14:27 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: ganeti1009.eqiad.wmnet
  • 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1020.eqiad.wmnet
  • 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1020.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 14:19 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1020.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 14:11 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 14:10 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1056.eqiad.wmnet with OS bookworm
  • 13:56 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1020.eqiad.wmnet
  • 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1018.eqiad.wmnet
  • 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1018.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 13:50 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1056.eqiad.wmnet with reason: host reimage
  • 13:47 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1018.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 13:46 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1056.eqiad.wmnet with reason: host reimage
  • 13:43 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 13:36 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1018.eqiad.wmnet
  • 13:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1017.eqiad.wmnet
  • 13:35 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1017.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 13:35 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1017.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 13:31 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 13:29 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1057.eqiad.wmnet with OS bookworm
  • 13:29 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1056.eqiad.wmnet with OS bookworm
  • 13:25 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1056.eqiad.wmnet wikikube-worker1057.eqiad.wmnet on all recursors
  • 13:25 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1056.eqiad.wmnet wikikube-worker1057.eqiad.wmnet on all recursors
  • 13:21 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1017.eqiad.wmnet
  • 13:19 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1038 to wikikube-worker1057
  • 13:19 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1057
  • 13:18 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1057
  • 13:18 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:17 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1038 to wikikube-worker1057 - jelto@cumin1002"
  • 13:17 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1038 to wikikube-worker1057 - jelto@cumin1002"
  • 13:13 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:13 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1038 to wikikube-worker1057
  • 13:12 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1037 to wikikube-worker1056
  • 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1016.eqiad.wmnet
  • 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 13:11 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1056
  • 13:11 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 13:10 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1056
  • 13:10 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:10 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1037 to wikikube-worker1056 - jelto@cumin1002"
  • 13:07 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 13:01 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1037 to wikikube-worker1056 - jelto@cumin1002"
  • 12:58 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1016.eqiad.wmnet
  • 12:56 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:56 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1037 to wikikube-worker1056
  • 12:48 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1037-1038].eqiad.wmnet
  • 12:47 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1037-1038].eqiad.wmnet
  • 12:40 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts ganeti1009.eqiad.wmnet
  • 12:40 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:40 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1009.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:39 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1009.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 12:21 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1009.eqiad.wmnet
  • 12:15 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1054-1055].eqiad.wmnet
  • 12:15 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1054-1055].eqiad.wmnet
  • 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host build2002.codfw.wmnet with OS bookworm
  • 11:58 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 11:56 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1055.eqiad.wmnet with OS bookworm
  • 11:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1054.eqiad.wmnet with OS bookworm
  • 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on build2002.codfw.wmnet with reason: host reimage
  • 11:48 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on build2002.codfw.wmnet with reason: host reimage
  • 11:38 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1055.eqiad.wmnet with reason: host reimage
  • 11:34 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1054.eqiad.wmnet with reason: host reimage
  • 11:32 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1055.eqiad.wmnet with reason: host reimage
  • 11:30 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1054.eqiad.wmnet with reason: host reimage
  • 11:30 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host build2002.codfw.wmnet with OS bookworm
  • 11:05 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1055.eqiad.wmnet with OS bookworm
  • 11:05 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1054.eqiad.wmnet with OS bookworm
  • 11:00 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1054.eqiad.wmnet wikikube-worker1055.eqiad.wmnet on all recursors
  • 11:00 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1054.eqiad.wmnet wikikube-worker1055.eqiad.wmnet on all recursors
  • 10:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1036 to wikikube-worker1055
  • 10:58 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1055
  • 10:57 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1055
  • 10:57 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:57 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1036 to wikikube-worker1055 - jelto@cumin1002"
  • 10:57 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1036 to wikikube-worker1055 - jelto@cumin1002"
  • 10:53 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2042.codfw.wmnet to cluster codfw and group D
  • 10:53 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:53 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1036 to wikikube-worker1055
  • 10:52 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2042.codfw.wmnet to cluster codfw and group D
  • 10:49 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1035 to wikikube-worker1054
  • 10:48 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1054
  • 10:47 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1054
  • 10:47 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:47 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1035 to wikikube-worker1054 - jelto@cumin1002"
  • 10:47 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1035 to wikikube-worker1054 - jelto@cumin1002"
  • 10:43 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:43 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1035 to wikikube-worker1054
  • 10:41 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1035-1036].eqiad.wmnet
  • 10:39 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1035-1036].eqiad.wmnet
  • 10:27 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1052-1053].eqiad.wmnet
  • 10:27 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1052-1053].eqiad.wmnet
  • 10:11 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 10:11 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1053.eqiad.wmnet with OS bookworm
  • 10:08 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1052.eqiad.wmnet with OS bookworm
  • 09:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1053.eqiad.wmnet with reason: host reimage
  • 09:48 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1052.eqiad.wmnet with reason: host reimage
  • 09:46 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1053.eqiad.wmnet with reason: host reimage
  • 09:45 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1052.eqiad.wmnet with reason: host reimage
  • 09:28 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1053.eqiad.wmnet with OS bookworm
  • 09:28 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1052.eqiad.wmnet with OS bookworm
  • 09:24 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1052.eqiad.wmnet wikikube-worker1053.eqiad.wmnet on all recursors
  • 09:24 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1052.eqiad.wmnet wikikube-worker1053.eqiad.wmnet on all recursors
  • 09:23 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1034 to wikikube-worker1053
  • 09:22 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1053
  • 09:21 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1053
  • 09:21 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:21 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1034 to wikikube-worker1053 - jelto@cumin1002"
  • 09:20 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1034 to wikikube-worker1053 - jelto@cumin1002"
  • 09:16 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:16 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1034 to wikikube-worker1053
  • 09:16 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1033 to wikikube-worker1052
  • 09:15 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1052
  • 09:14 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1052
  • 09:13 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:13 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1033 to wikikube-worker1052 - jelto@cumin1002"
  • 09:13 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1033 to wikikube-worker1052 - jelto@cumin1002"
  • 09:09 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:09 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1033 to wikikube-worker1052
  • 09:03 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1033-1034].eqiad.wmnet
  • 09:02 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1033-1034].eqiad.wmnet
  • 09:00 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:55 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:43 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:33 moritzm: uploaded ruby-sys-filesystem 1.4.3-1~wmf11u1 to component/puppet7 for Bullseye (needed by the mountpoints fact in facter 4) T381538
  • 08:33 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:30 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:30 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:26 hashar@deploy2002: Finished deploy [gerrit/gerrit@ac50ebe]: Reinstate the banner for the developer survey (duration: 00m 11s)
  • 08:26 hashar@deploy2002: Started deploy [gerrit/gerrit@ac50ebe]: Reinstate the banner for the developer survey
  • 08:18 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:18 elukey@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:17 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:17 elukey@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:17 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:16 elukey@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 07:21 marostegui@cumin1002: dbctl commit (dc=all): 'es2044 (re)pooling @ 100%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71633 and previous config saved to /var/cache/conftool/dbconfig/20241206-072120-root.json
  • 07:20 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 07:19 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 07:07 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 07:06 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 07:06 marostegui@cumin1002: dbctl commit (dc=all): 'es2044 (re)pooling @ 75%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71632 and previous config saved to /var/cache/conftool/dbconfig/20241206-070614-root.json
  • 07:05 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 07:04 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 06:36 marostegui@cumin1002: dbctl commit (dc=all): 'es2044 (re)pooling @ 25%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71630 and previous config saved to /var/cache/conftool/dbconfig/20241206-063603-root.json
  • 06:25 marostegui@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P71629 and previous config saved to /var/cache/conftool/dbconfig/20241206-062527-root.json
  • 06:20 marostegui@cumin1002: dbctl commit (dc=all): 'es2044 (re)pooling @ 10%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71628 and previous config saved to /var/cache/conftool/dbconfig/20241206-062058-root.json
  • 06:10 marostegui@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P71627 and previous config saved to /var/cache/conftool/dbconfig/20241206-061021-root.json
  • 06:05 marostegui@cumin1002: dbctl commit (dc=all): 'es2044 (re)pooling @ 5%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71626 and previous config saved to /var/cache/conftool/dbconfig/20241206-060552-root.json
  • 05:55 marostegui@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P71625 and previous config saved to /var/cache/conftool/dbconfig/20241206-055516-root.json
  • 05:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1154.eqiad.wmnet with reason: Alter table
  • 05:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1154.eqiad.wmnet with reason: Alter table
  • 05:50 marostegui@cumin1002: dbctl commit (dc=all): 'es2044 (re)pooling @ 1%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71624 and previous config saved to /var/cache/conftool/dbconfig/20241206-055047-root.json
  • 05:44 marostegui@cumin1002: dbctl commit (dc=all): 'Add es2044 to dbctl depooled T381259', diff saved to https://phabricator.wikimedia.org/P71623 and previous config saved to /var/cache/conftool/dbconfig/20241206-054457-marostegui.json
  • 05:40 marostegui@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P71622 and previous config saved to /var/cache/conftool/dbconfig/20241206-054010-root.json
  • 01:50 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:49 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:49 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:48 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:47 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:47 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:47 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:47 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:32 TimStarling: on mwmaint2002: deleting MediaWiki:Sitesupport-url pages per T379205
  • 01:16 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:15 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:19 urbanecm: mwmaint2002: foreachwikiindblist growthexperiments extensions/GrowthExperiments/maintenance/revalidateLinkRecommendations.php --all --verbose # T380455
  • 00:19 urbanecm: Delete previously-started mwscript-k8s instances of revalidateLinkRecommendations.php (T380455)

2024-12-05

  • 23:26 jhathaway: looking at puppet failures on an-workers
  • 23:23 urbanecm: Start revalidateLinkRecommendations.php for Add Link-enabled wikis via mwscript-k8s (T380455)
  • 22:53 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2020.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 22:00 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2020.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 21:24 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/chart-renderer: apply
  • 21:23 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/services/chart-renderer: apply
  • 21:21 cdanis@deploy2002: helmfile [codfw] DONE helmfile.d/services/chart-renderer: apply
  • 21:21 cdanis@deploy2002: helmfile [codfw] START helmfile.d/services/chart-renderer: apply
  • 21:20 cdanis@deploy2002: helmfile [staging] DONE helmfile.d/services/chart-renderer: apply
  • 21:19 cdanis@deploy2002: helmfile [staging] START helmfile.d/services/chart-renderer: apply
  • 20:27 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2019.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 19:34 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2019.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 19:28 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.44.0-wmf.6 refs T375665
  • 18:44 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1043.eqiad.wmnet with OS bookworm
  • 18:13 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1048-1049,1051].eqiad.wmnet
  • 18:13 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1048-1049,1051].eqiad.wmnet
  • 18:00 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS bookworm
  • 18:00 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:59 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 17:57 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1051.eqiad.wmnet with OS bookworm
  • 17:53 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:53 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1050.eqiad.wmnet with OS bookworm
  • 17:41 pt1979@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:40 pt1979@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:39 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1051.eqiad.wmnet with reason: host reimage
  • 17:36 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1051.eqiad.wmnet with reason: host reimage
  • 17:36 jhathaway: upgrading facter on bullseye puppet nodes
  • 17:34 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1050.eqiad.wmnet with reason: host reimage
  • 17:30 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1050.eqiad.wmnet with reason: host reimage
  • 17:25 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1125.eqiad.wmnet with reason: Test setup should not alert
  • 17:25 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db1125.eqiad.wmnet with reason: Test setup should not alert
  • 17:21 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:21 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:19 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 17:19 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1051.eqiad.wmnet with OS bookworm
  • 17:19 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 17:12 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1050.eqiad.wmnet with OS bookworm
  • 17:11 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1051.eqiad.wmnet with OS bookworm
  • 17:11 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1050.eqiad.wmnet with OS bookworm
  • 17:08 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1049.eqiad.wmnet with OS bookworm
  • 17:05 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1048.eqiad.wmnet with OS bookworm
  • 17:04 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:03 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:03 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:03 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:01 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:01 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:00 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:00 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:49 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1049.eqiad.wmnet with reason: host reimage
  • 16:46 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1048.eqiad.wmnet with reason: host reimage
  • 16:46 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS bullseye
  • 16:45 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host cloudelastic1011.eqiad.wmnet with OS bullseye
  • 16:43 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1049.eqiad.wmnet with reason: host reimage
  • 16:43 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1048.eqiad.wmnet with reason: host reimage
  • 16:36 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:36 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:31 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:31 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:30 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 16:29 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 16:29 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/benthos-cache-invalidator: apply
  • 16:29 jiji@deploy2002: helmfile [staging] START helmfile.d/services/benthos-cache-invalidator: apply
  • 16:29 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
  • 16:28 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop: apply
  • 16:28 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop: apply
  • 16:28 jiji@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
  • 16:27 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:27 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:26 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1051.eqiad.wmnet with OS bookworm
  • 16:26 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1050.eqiad.wmnet with OS bookworm
  • 16:26 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1049.eqiad.wmnet with OS bookworm
  • 16:25 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1048.eqiad.wmnet with OS bookworm
  • 16:23 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1048.eqiad.wmnet wikikube-worker1049.eqiad.wmnet wikikube-worker1050.eqiad.wmnet wikikube-worker1051.eqiad.wmnet on all recursors
  • 16:23 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1048.eqiad.wmnet wikikube-worker1049.eqiad.wmnet wikikube-worker1050.eqiad.wmnet wikikube-worker1051.eqiad.wmnet on all recursors
  • 16:23 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1032 to wikikube-worker1051
  • 16:22 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:22 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:22 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1051
  • 16:21 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1051
  • 16:21 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:21 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1032 to wikikube-worker1051 - jelto@cumin1002"
  • 16:21 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1032 to wikikube-worker1051 - jelto@cumin1002"
  • 16:20 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
  • 16:20 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:20 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:20 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
  • 16:20 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
  • 16:19 jiji@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
  • 16:19 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 16:18 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 16:18 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
  • 16:18 jiji@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 16:17 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:17 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:17 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:17 jclark@cumin1002: START - Cookbook sre.hosts.provision for host cloudelastic1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:15 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 16:15 jclark@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:15 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added mgmt for cloudelastic - jclark@cumin1002"
  • 16:14 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added mgmt for cloudelastic - jclark@cumin1002"
  • 16:14 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 16:14 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 16:13 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1032 to wikikube-worker1051
  • 16:13 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 16:12 jiji@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 16:11 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 16:11 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2018.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 16:11 jiji@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 16:11 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/benthos-cache-invalidator: apply
  • 16:11 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 16:11 jiji@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 16:11 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 16:11 jiji@deploy2002: helmfile [staging] START helmfile.d/services/benthos-cache-invalidator: apply
  • 16:08 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for kafka-main1010.eqiad.wmnet
  • 16:08 jiji@cumin1002: START - Cookbook sre.hosts.remove-downtime for kafka-main1010.eqiad.wmnet
  • 16:08 jiji@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 16:07 jiji@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 16:07 jiji@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 16:07 jiji@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 16:07 jiji@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 16:07 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs1026.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 16:06 jiji@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 16:06 jiji@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 16:06 jiji@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 16:06 jiji@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 16:06 jiji@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 16:05 jiji@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 16:05 jiji@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 16:05 jiji@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 16:05 jiji@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 16:05 jiji@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 16:04 jiji@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 16:04 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 16:04 jiji@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 16:02 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, initialize wdqs internal scholarly tier) xfer scholarly_articles from wdqs1023.eqiad.wmnet -> wdqs1027.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 15:58 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1031 to wikikube-worker1050
  • 15:58 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1050
  • 15:56 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1050
  • 15:56 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:56 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1031 to wikikube-worker1050 - jelto@cumin1002"
  • 15:55 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1031 to wikikube-worker1050 - jelto@cumin1002"
  • 15:55 moritzm: installing nghttp2 security updates
  • 15:52 jiji@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-eqiad
  • 15:52 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 15:51 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1031 to wikikube-worker1050
  • 15:41 jiji@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-eqiad
  • 15:29 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1030 to wikikube-worker1049
  • 15:28 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1049
  • 15:27 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1049
  • 15:27 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:27 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1030 to wikikube-worker1049 - jelto@cumin1002"
  • 15:26 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1030 to wikikube-worker1049 - jelto@cumin1002"
  • 15:24 moritzm: installing postgresql security updates
  • 15:22 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 15:22 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1030 to wikikube-worker1049
  • 15:20 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1029 to wikikube-worker1048
  • 15:20 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1048
  • 15:18 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1048
  • 15:18 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:18 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1029 to wikikube-worker1048 - jelto@cumin1002"
  • 15:18 dbrant@deploy2002: Finished scap sync-world: Backport for Enable Parsoid Fragment mode on Chart pilot wikis (T381436 T381312 T380758) (duration: 19m 51s)
  • 15:18 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1029 to wikikube-worker1048 - jelto@cumin1002"
  • 15:15 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs1026.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 15:15 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal scholarly tier) xfer scholarly_articles from wdqs1023.eqiad.wmnet -> wdqs1027.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 15:15 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2018.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 15:13 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 15:12 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1029 to wikikube-worker1048
  • 15:10 dbrant@deploy2002: dbrant, cscott: Continuing with sync
  • 15:10 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1029-1032].eqiad.wmnet
  • 15:10 bking@cumin2002: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2018.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 15:10 bking@cumin2002: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97) (T376150, initialize wdqs internal scholarly tier) xfer scholarly_articles from wdqs1023.eqiad.wmnet -> wdqs1027.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 15:10 bking@cumin2002: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs1026.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 15:10 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs1026.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 15:09 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal scholarly tier) xfer scholarly_articles from wdqs1023.eqiad.wmnet -> wdqs1027.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 15:09 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2018.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 15:08 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1029-1032].eqiad.wmnet
  • 15:07 dbrant@deploy2002: dbrant, cscott: Backport for Enable Parsoid Fragment mode on Chart pilot wikis (T381436 T381312 T380758) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:00 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1043.eqiad.wmnet with OS bookworm
  • 14:58 dbrant@deploy2002: Started scap sync-world: Backport for Enable Parsoid Fragment mode on Chart pilot wikis (T381436 T381312 T380758)
  • 14:55 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1045.eqiad.wmnet with OS bookworm
  • 14:55 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 14:53 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 14:37 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1045.eqiad.wmnet with reason: host reimage
  • 14:36 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1046-1047].eqiad.wmnet
  • 14:36 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1046-1047].eqiad.wmnet
  • 14:34 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1045.eqiad.wmnet with reason: host reimage
  • 14:29 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 14:28 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 14:28 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 14:27 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 14:20 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 14:20 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 14:20 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 14:20 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1047.eqiad.wmnet with OS bookworm
  • 14:18 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host es1045.eqiad.wmnet with OS bookworm
  • 14:17 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS bookworm
  • 14:16 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1046.eqiad.wmnet with OS bookworm
  • 14:04 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 14:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1047.eqiad.wmnet with reason: host reimage
  • 13:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1045.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 13:57 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1046.eqiad.wmnet with reason: host reimage
  • 13:54 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1047.eqiad.wmnet with reason: host reimage
  • 13:54 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1046.eqiad.wmnet with reason: host reimage
  • 13:53 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 13:53 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es1045.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 13:42 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2178.codfw.wmnet
  • 13:42 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2178.codfw.wmnet
  • 13:36 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1047.eqiad.wmnet with OS bookworm
  • 13:36 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1046.eqiad.wmnet with OS bookworm
  • 13:32 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1046.eqiad.wmnet wikikube-worker1047.eqiad.wmnet on all recursors
  • 13:32 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1046.eqiad.wmnet wikikube-worker1047.eqiad.wmnet on all recursors
  • 13:23 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2178.codfw.wmnet with OS bookworm
  • 13:08 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kafka-main[1005,1010].eqiad.wmnet with reason: Hardware refresh
  • 13:08 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kafka-main[1005,1010].eqiad.wmnet with reason: Hardware refresh
  • 13:03 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2178.codfw.wmnet with reason: host reimage
  • 12:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1028 to wikikube-worker1047
  • 12:58 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1047
  • 12:57 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1047
  • 12:57 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:57 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1028 to wikikube-worker1047 - jelto@cumin1002"
  • 12:57 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1028 to wikikube-worker1047 - jelto@cumin1002"
  • 12:56 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2178.codfw.wmnet with reason: host reimage
  • 12:53 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:53 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1028 to wikikube-worker1047
  • 12:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1027 to wikikube-worker1046
  • 12:51 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1046
  • 12:50 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1046
  • 12:50 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:50 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1027 to wikikube-worker1046 - jelto@cumin1002"
  • 12:50 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1027 to wikikube-worker1046 - jelto@cumin1002"
  • 12:46 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:45 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1027 to wikikube-worker1046
  • 12:42 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2176-2177,2179].codfw.wmnet
  • 12:42 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2176-2177,2179].codfw.wmnet
  • 12:39 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1027-1028].eqiad.wmnet
  • 12:37 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2178
  • 12:37 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2178
  • 12:36 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2178
  • 12:36 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2178.codfw.wmnet 185.48.192.10.in-addr.arpa 5.8.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:36 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2178.codfw.wmnet 185.48.192.10.in-addr.arpa 5.8.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:36 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:36 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2178 - jayme@cumin2002"
  • 12:36 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2178 - jayme@cumin2002"
  • 12:36 jgleeson: payments updated from 119448ca to ab7e70ec
  • 12:35 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1027-1028].eqiad.wmnet
  • 12:26 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 12:16 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 12:16 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 12:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T371742)', diff saved to https://phabricator.wikimedia.org/P71620 and previous config saved to /var/cache/conftool/dbconfig/20241205-121609-ladsgroup.json
  • 12:15 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1038-1043].eqiad.wmnet
  • 12:15 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1038-1043].eqiad.wmnet
  • 12:06 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2176.codfw.wmnet with OS bookworm
  • 12:02 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2177.codfw.wmnet with OS bookworm
  • 12:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P71617 and previous config saved to /var/cache/conftool/dbconfig/20241205-120102-ladsgroup.json
  • 11:58 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2179.codfw.wmnet with OS bookworm
  • 11:46 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2176.codfw.wmnet with reason: host reimage
  • 11:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P71616 and previous config saved to /var/cache/conftool/dbconfig/20241205-114555-ladsgroup.json
  • 11:43 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2176.codfw.wmnet with reason: host reimage
  • 11:42 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2177.codfw.wmnet with reason: host reimage
  • 11:38 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2177.codfw.wmnet with reason: host reimage
  • 11:38 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2179.codfw.wmnet with reason: host reimage
  • 11:34 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2179.codfw.wmnet with reason: host reimage
  • 11:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T371742)', diff saved to https://phabricator.wikimedia.org/P71615 and previous config saved to /var/cache/conftool/dbconfig/20241205-113048-ladsgroup.json
  • 11:26 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2178
  • 11:25 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2176
  • 11:25 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2176
  • 11:25 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2176
  • 11:25 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2176.codfw.wmnet 81.48.192.10.in-addr.arpa 1.8.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:25 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2176.codfw.wmnet 81.48.192.10.in-addr.arpa 1.8.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:25 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:25 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2176 - jayme@cumin2002"
  • 11:24 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2176 - jayme@cumin2002"
  • 11:20 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 11:19 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2176
  • 11:19 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2177
  • 11:19 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2177
  • 11:19 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2177
  • 11:19 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2177.codfw.wmnet 83.48.192.10.in-addr.arpa 3.8.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:19 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2177.codfw.wmnet 83.48.192.10.in-addr.arpa 3.8.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:19 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:19 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2177 - jayme@cumin2002"
  • 11:19 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2177 - jayme@cumin2002"
  • 11:15 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2176.codfw.wmnet with OS bookworm
  • 11:15 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 11:15 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2176.codfw.wmnet with OS bookworm
  • 11:15 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2177
  • 11:15 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2179
  • 11:15 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2179
  • 11:15 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2179
  • 11:15 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2179.codfw.wmnet 207.48.192.10.in-addr.arpa 7.0.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:15 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2179.codfw.wmnet 207.48.192.10.in-addr.arpa 7.0.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:15 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:15 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2179 - jayme@cumin2002"
  • 11:15 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2179 - jayme@cumin2002"
  • 11:14 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2176.codfw.wmnet with OS bookworm
  • 11:13 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2177.codfw.wmnet with OS bookworm
  • 11:13 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2449 to wikikube-worker2177
  • 11:12 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2177
  • 11:12 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2178.codfw.wmnet with OS bookworm
  • 11:11 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2177
  • 11:11 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:11 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2449 to wikikube-worker2177 - jayme@cumin2002"
  • 11:11 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 11:11 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2179
  • 11:11 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2179.codfw.wmnet with OS bookworm
  • 11:10 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2179.codfw.wmnet with OS bookworm
  • 11:09 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2449 to wikikube-worker2177 - jayme@cumin2002"
  • 11:09 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2179.codfw.wmnet with OS bookworm
  • 11:05 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 11:05 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2449 to wikikube-worker2177
  • 10:54 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2451 to wikikube-worker2179
  • 10:54 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2179
  • 10:52 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2179
  • 10:52 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:51 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2450 to wikikube-worker2178
  • 10:50 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2178
  • 10:50 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:50 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2178
  • 10:50 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:50 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2450 to wikikube-worker2178 - jayme@cumin2002"
  • 10:50 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2450 to wikikube-worker2178 - jayme@cumin2002"
  • 10:47 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2448 to wikikube-worker2176
  • 10:46 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:46 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2176
  • 10:46 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2176
  • 10:46 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:46 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2448 to wikikube-worker2176 - jayme@cumin2002"
  • 10:45 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2448 to wikikube-worker2176 - jayme@cumin2002"
  • 10:43 dcausse: reindexed all wikidata entity schemas (T376252)
  • 10:42 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2451 to wikikube-worker2179
  • 10:42 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.rename (exit_code=93) from mw2449 to wikikube-worker2177
  • 10:42 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2450 to wikikube-worker2178
  • 10:42 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2449 to wikikube-worker2177
  • 10:42 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:41 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2448 to wikikube-worker2176
  • 10:37 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2451.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 10:37 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2450.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 10:37 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2449.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 10:37 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2448.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 10:33 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1044-1045].eqiad.wmnet
  • 10:33 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1044-1045].eqiad.wmnet
  • 10:28 jelto: homer 'lsw1-f3-eqiad*' commit 'T377876'
  • 10:27 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1045.eqiad.wmnet with OS bookworm
  • 10:08 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1045.eqiad.wmnet with reason: host reimage
  • 10:04 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1045.eqiad.wmnet with reason: host reimage
  • 09:56 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2451.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1226 (T371742)', diff saved to https://phabricator.wikimedia.org/P71614 and previous config saved to /var/cache/conftool/dbconfig/20241205-095554-ladsgroup.json
  • 09:55 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1226.eqiad.wmnet with reason: Maintenance
  • 09:55 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2450.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:55 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1226.eqiad.wmnet with reason: Maintenance
  • 09:55 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2449.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:54 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2448.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:48 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1045.eqiad.wmnet with OS bookworm
  • 09:47 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1044.eqiad.wmnet with OS bookworm
  • 09:40 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[2450-2451].codfw.wmnet
  • 09:39 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[2450-2451].codfw.wmnet
  • 09:39 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on mw[2448-2451].codfw.wmnet with reason: reimage
  • 09:38 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on mw[2448-2451].codfw.wmnet with reason: reimage
  • 09:38 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[2448-2449].codfw.wmnet
  • 09:37 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[2448-2449].codfw.wmnet
  • 09:31 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1044.eqiad.wmnet with reason: host reimage
  • 09:25 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1044.eqiad.wmnet with reason: host reimage
  • 09:20 jayme: destroyed unused expiring puppet certs - T381474
  • 09:15 fabfur: deploying haproxykafka also on magru and drmrs (T378578)
  • 09:09 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1044.eqiad.wmnet with OS bookworm
  • 09:06 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1044.eqiad.wmnet wikikube-worker1045.eqiad.wmnet on all recursors
  • 09:06 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1044.eqiad.wmnet wikikube-worker1045.eqiad.wmnet on all recursors
  • 09:04 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1026 to wikikube-worker1045
  • 09:04 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1045
  • 09:03 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1045
  • 09:03 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:03 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1026 to wikikube-worker1045 - jelto@cumin1002"
  • 09:03 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1026 to wikikube-worker1045 - jelto@cumin1002"
  • 08:58 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:58 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1026 to wikikube-worker1045
  • 08:58 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1025 to wikikube-worker1044
  • 08:57 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1044
  • 08:55 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1044
  • 08:55 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:55 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1025 to wikikube-worker1044 - jelto@cumin1002"
  • 08:54 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1025 to wikikube-worker1044 - jelto@cumin1002"
  • 08:49 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:49 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1025 to wikikube-worker1044
  • 08:47 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 08:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 08:46 moritzm: rebalance Ganeti eqiad/D following server refreshes
  • 08:08 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 08:07 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 08:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T371742)', diff saved to https://phabricator.wikimedia.org/P71611 and previous config saved to /var/cache/conftool/dbconfig/20241205-080745-ladsgroup.json
  • 07:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P71610 and previous config saved to /var/cache/conftool/dbconfig/20241205-075237-ladsgroup.json
  • 07:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P71609 and previous config saved to /var/cache/conftool/dbconfig/20241205-073730-ladsgroup.json
  • 07:36 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1025-1026].eqiad.wmnet
  • 07:32 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1025-1026].eqiad.wmnet
  • 07:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T371742)', diff saved to https://phabricator.wikimedia.org/P71608 and previous config saved to /var/cache/conftool/dbconfig/20241205-072223-ladsgroup.json
  • 07:16 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main'.
  • 06:31 marostegui@cumin1002: dbctl commit (dc=all): 'es2043 (re)pooling @ 100%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71607 and previous config saved to /var/cache/conftool/dbconfig/20241205-063132-root.json
  • 06:16 marostegui@cumin1002: dbctl commit (dc=all): 'es2043 (re)pooling @ 75%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71606 and previous config saved to /var/cache/conftool/dbconfig/20241205-061626-root.json
  • 06:06 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 100%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71605 and previous config saved to /var/cache/conftool/dbconfig/20241205-060631-root.json
  • 06:06 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 100%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71604 and previous config saved to /var/cache/conftool/dbconfig/20241205-060612-root.json
  • 06:01 marostegui@cumin1002: dbctl commit (dc=all): 'es2043 (re)pooling @ 50%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71603 and previous config saved to /var/cache/conftool/dbconfig/20241205-060121-root.json
  • 05:58 eileen: civicrm upgraded from 74c059a4 to f9c89e50
  • 05:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1214 (T371742)', diff saved to https://phabricator.wikimedia.org/P71602 and previous config saved to /var/cache/conftool/dbconfig/20241205-055442-ladsgroup.json
  • 05:54 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1214.eqiad.wmnet with reason: Maintenance
  • 05:54 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1214.eqiad.wmnet with reason: Maintenance
  • 05:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T371742)', diff saved to https://phabricator.wikimedia.org/P71601 and previous config saved to /var/cache/conftool/dbconfig/20241205-055420-ladsgroup.json
  • 05:51 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 75%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71600 and previous config saved to /var/cache/conftool/dbconfig/20241205-055125-root.json
  • 05:51 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 75%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71599 and previous config saved to /var/cache/conftool/dbconfig/20241205-055106-root.json
  • 05:46 marostegui@cumin1002: dbctl commit (dc=all): 'es2043 (re)pooling @ 25%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71598 and previous config saved to /var/cache/conftool/dbconfig/20241205-054615-root.json
  • 05:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es2023.codfw.wmnet with reason: cloning
  • 05:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es2023.codfw.wmnet with reason: cloning
  • 05:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2023 to clone es2044', diff saved to https://phabricator.wikimedia.org/P71597 and previous config saved to /var/cache/conftool/dbconfig/20241205-054200-marostegui.json
  • 05:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es2025.codfw.wmnet with reason: cloning
  • 05:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es2025.codfw.wmnet with reason: cloning
  • 05:41 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es2025 to es5 master T381259', diff saved to https://phabricator.wikimedia.org/P71596 and previous config saved to /var/cache/conftool/dbconfig/20241205-054114-marostegui.json
  • 05:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P71595 and previous config saved to /var/cache/conftool/dbconfig/20241205-053912-ladsgroup.json
  • 05:36 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 50%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71593 and previous config saved to /var/cache/conftool/dbconfig/20241205-053620-root.json
  • 05:36 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 50%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71592 and previous config saved to /var/cache/conftool/dbconfig/20241205-053601-root.json
  • 05:31 marostegui@cumin1002: dbctl commit (dc=all): 'es2043 (re)pooling @ 10%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71591 and previous config saved to /var/cache/conftool/dbconfig/20241205-053109-root.json
  • 05:28 marostegui: Failover m3 from db1159 to db1213 - T381365
  • 05:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P71590 and previous config saved to /var/cache/conftool/dbconfig/20241205-052405-ladsgroup.json
  • 05:21 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 25%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71589 and previous config saved to /var/cache/conftool/dbconfig/20241205-052114-root.json
  • 05:20 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 25%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71588 and previous config saved to /var/cache/conftool/dbconfig/20241205-052056-root.json
  • 05:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2134,2160].codfw.wmnet,db[1159,1213,1217].eqiad.wmnet with reason: m3 master switchover T381365
  • 05:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db[2134,2160].codfw.wmnet,db[1159,1213,1217].eqiad.wmnet with reason: m3 master switchover T381365
  • 05:16 marostegui@cumin1002: dbctl commit (dc=all): 'es2043 (re)pooling @ 1%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71587 and previous config saved to /var/cache/conftool/dbconfig/20241205-051604-root.json
  • 05:15 marostegui@cumin1002: dbctl commit (dc=all): 'Add es2043 depooled T381259', diff saved to https://phabricator.wikimedia.org/P71586 and previous config saved to /var/cache/conftool/dbconfig/20241205-051545-marostegui.json
  • 05:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T371742)', diff saved to https://phabricator.wikimedia.org/P71585 and previous config saved to /var/cache/conftool/dbconfig/20241205-050858-ladsgroup.json
  • 05:06 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 10%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71584 and previous config saved to /var/cache/conftool/dbconfig/20241205-050609-root.json
  • 05:05 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 10%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71583 and previous config saved to /var/cache/conftool/dbconfig/20241205-050550-root.json
  • 03:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1211 (T371742)', diff saved to https://phabricator.wikimedia.org/P71578 and previous config saved to /var/cache/conftool/dbconfig/20241205-033803-ladsgroup.json
  • 03:37 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1211.eqiad.wmnet with reason: Maintenance
  • 03:37 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1211.eqiad.wmnet with reason: Maintenance
  • 03:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1209 (T371742)', diff saved to https://phabricator.wikimedia.org/P71577 and previous config saved to /var/cache/conftool/dbconfig/20241205-033751-ladsgroup.json
  • 03:34 eileen: tools upgraded from b230f718 to c7b53ecd
  • 03:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P71576 and previous config saved to /var/cache/conftool/dbconfig/20241205-032245-ladsgroup.json
  • 03:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P71575 and previous config saved to /var/cache/conftool/dbconfig/20241205-030737-ladsgroup.json
  • 02:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1209 (T371742)', diff saved to https://phabricator.wikimedia.org/P71574 and previous config saved to /var/cache/conftool/dbconfig/20241205-025230-ladsgroup.json
  • 02:45 eileen: civicrm upgraded from 6361a578 to 74c059a4
  • 01:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1209 (T371742)', diff saved to https://phabricator.wikimedia.org/P71573 and previous config saved to /var/cache/conftool/dbconfig/20241205-012108-ladsgroup.json
  • 01:21 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 01:20 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 01:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T371742)', diff saved to https://phabricator.wikimedia.org/P71572 and previous config saved to /var/cache/conftool/dbconfig/20241205-012046-ladsgroup.json
  • 01:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P71571 and previous config saved to /var/cache/conftool/dbconfig/20241205-010539-ladsgroup.json
  • 01:03 sukhe: re-enabling puppet on A:lvs [post-wdqs merge]
  • 00:58 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:57 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P71570 and previous config saved to /var/cache/conftool/dbconfig/20241205-005031-ladsgroup.json
  • 00:49 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:49 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T371742)', diff saved to https://phabricator.wikimedia.org/P71569 and previous config saved to /var/cache/conftool/dbconfig/20241205-003524-ladsgroup.json
  • 00:30 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1043.eqiad.wmnet with OS bookworm
  • 00:15 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1085.eqiad.wmnet with OS bullseye
  • 00:15 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1002"
  • 00:15 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1002"
  • 00:00 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 23:00:00 on 8 hosts with reason: T376150 non-prod hosts

2024-12-04

  • 23:59 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 23:00:00 on 8 hosts with reason: T376150 non-prod hosts
  • 23:57 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1085.eqiad.wmnet with reason: host reimage
  • 23:54 vriley@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1085.eqiad.wmnet with reason: host reimage
  • 23:47 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS bookworm
  • 23:43 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1085.eqiad.wmnet with OS bullseye
  • 23:42 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1044.eqiad.wmnet with OS bookworm
  • 23:42 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 23:40 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 23:39 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 23:35 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 23:35 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 23:32 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 23:32 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 23:26 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be1085.eqiad.wmnet with OS bullseye
  • 23:21 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 23:20 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 23:16 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1043.eqiad.wmnet with OS bookworm
  • 23:13 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1042.eqiad.wmnet with OS bookworm
  • 23:10 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1043.eqiad.wmnet with OS bookworm
  • 22:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1043.eqiad.wmnet with reason: host reimage
  • 22:56 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 22:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1203 (T371742)', diff saved to https://phabricator.wikimedia.org/P71567 and previous config saved to /var/cache/conftool/dbconfig/20241204-225545-ladsgroup.json
  • 22:55 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 22:55 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 22:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T371742)', diff saved to https://phabricator.wikimedia.org/P71566 and previous config saved to /var/cache/conftool/dbconfig/20241204-225523-ladsgroup.json
  • 22:54 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1041.eqiad.wmnet with OS bookworm
  • 22:54 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1042.eqiad.wmnet with reason: host reimage
  • 22:51 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1040.eqiad.wmnet with OS bookworm
  • 22:50 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1043.eqiad.wmnet with reason: host reimage
  • 22:50 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1042.eqiad.wmnet with reason: host reimage
  • 22:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P71565 and previous config saved to /var/cache/conftool/dbconfig/20241204-224016-ladsgroup.json
  • 22:38 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1044.eqiad.wmnet with reason: host reimage
  • 22:37 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1045.eqiad.wmnet with OS bookworm
  • 22:35 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1041.eqiad.wmnet with reason: host reimage
  • 22:34 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1043.eqiad.wmnet with OS bookworm
  • 22:34 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1042.eqiad.wmnet with OS bookworm
  • 22:33 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1044.eqiad.wmnet with reason: host reimage
  • 22:32 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1040.eqiad.wmnet with reason: host reimage
  • 22:30 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1041.eqiad.wmnet with reason: host reimage
  • 22:29 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1040.eqiad.wmnet with reason: host reimage
  • 22:26 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS bookworm
  • 22:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P71564 and previous config saved to /var/cache/conftool/dbconfig/20241204-222509-ladsgroup.json
  • 22:18 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1085.eqiad.wmnet with OS bullseye
  • 22:16 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1044.eqiad.wmnet with OS bookworm
  • 22:13 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1041.eqiad.wmnet with OS bookworm
  • 22:13 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1040.eqiad.wmnet with OS bookworm
  • 22:12 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1086.eqiad.wmnet with OS bullseye
  • 22:12 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 22:12 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 22:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T371742)', diff saved to https://phabricator.wikimedia.org/P71563 and previous config saved to /var/cache/conftool/dbconfig/20241204-221001-ladsgroup.json
  • 21:59 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.44.0-wmf.6 refs T375665
  • 21:57 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1043.eqiad.wmnet with OS bookworm
  • 21:57 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1044.eqiad.wmnet with OS bookworm
  • 21:49 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1086.eqiad.wmnet with reason: host reimage
  • 21:46 cjming: end of UTC late backport window
  • 21:46 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host es1045
  • 21:45 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1086.eqiad.wmnet with reason: host reimage
  • 21:45 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host es1045
  • 21:43 jclark@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:43 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added mgmt for ms-be - jclark@cumin1002"
  • 21:43 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added mgmt for ms-be - jclark@cumin1002"
  • 21:43 cjming@deploy2002: Finished scap sync-world: Backport for CSP for banner preview: allow remind me later SMS host (T380232) (duration: 17m 39s)
  • 21:40 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 21:37 cjming@deploy2002: cjming, gjg: Continuing with sync
  • 21:34 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1086.eqiad.wmnet with OS bullseye
  • 21:34 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:34 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:32 cjming@deploy2002: cjming, gjg: Backport for CSP for banner preview: allow remind me later SMS host (T380232) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:26 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs-internal-scholarly.discovery.wmnet on all recursors
  • 21:26 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache wdqs-internal-scholarly.discovery.wmnet on all recursors
  • 21:26 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs-internal-main.discovery.wmnet on all recursors
  • 21:26 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache wdqs-internal-main.discovery.wmnet on all recursors
  • 21:25 cjming@deploy2002: Started scap sync-world: Backport for CSP for banner preview: allow remind me later SMS host (T380232)
  • 21:25 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
  • 21:24 ryankemper: T379334 `ryankemper@dns1004:~$ sudo -i authdns-update` completed
  • 21:23 cjming@deploy2002: Finished scap sync-world: Backport for Enable Chart extension on several pilot wikis (T381436 T381312) (duration: 17m 29s)
  • 21:22 sukhe@cumin1002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
  • 21:21 ryankemper: T379334 Final step (step 9) of spinning up these new services; merged https://gerrit.wikimedia.org/r/c/operations/dns/+/1100165/, next up is the authdns update
  • 21:18 ryankemper@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly
  • 21:18 ryankemper@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-main
  • 21:17 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1045.eqiad.wmnet with OS bookworm
  • 21:14 cjming@deploy2002: cjming, bvibber: Continuing with sync
  • 21:13 cjming@deploy2002: cjming, bvibber: Backport for Enable Chart extension on several pilot wikis (T381436 T381312) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:09 ryankemper: T380555 Rolling out prod change => `ryankemper@cumin2002:~$ sudo cumin -b 8 'A:dnsbox' 'run-puppet-agent'`
  • 21:05 ryankemper: T380555 Moving `wdqs-internal-[main,scholarly]` services into prod by merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/1094074
  • 21:05 cjming@deploy2002: Started scap sync-world: Backport for Enable Chart extension on several pilot wikis (T381436 T381312)
  • 21:03 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2013.codfw.wmnet with reason: rebooting shortly
  • 21:03 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2013.codfw.wmnet with reason: rebooting shortly
  • 21:01 joal@deploy2002: Finished deploy [analytics/refinery@7ba91e1] (hadoop-test): Regular analytics weekly train TEST - HOTFIX 2 [analytics/refinery@7ba91e13] (duration: 00m 29s)
  • 21:00 joal@deploy2002: Started deploy [analytics/refinery@7ba91e1] (hadoop-test): Regular analytics weekly train TEST - HOTFIX 2 [analytics/refinery@7ba91e13]
  • 21:00 joal@deploy2002: Finished deploy [analytics/refinery@7ba91e1] (thin): Regular analytics weekly train THIN - HOTFIX 2 [analytics/refinery@7ba91e13] (duration: 00m 31s)
  • 20:59 joal@deploy2002: Started deploy [analytics/refinery@7ba91e1] (thin): Regular analytics weekly train THIN - HOTFIX 2 [analytics/refinery@7ba91e13]
  • 20:59 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic: apply
  • 20:59 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic: apply
  • 20:59 joal@deploy2002: Finished deploy [analytics/refinery@7ba91e1]: Regular analytics weekly train - HOTFIX 2 [analytics/refinery@7ba91e13] (duration: 01m 48s)
  • 20:57 joal@deploy2002: Started deploy [analytics/refinery@7ba91e1]: Regular analytics weekly train - HOTFIX 2 [analytics/refinery@7ba91e13]
  • 20:51 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1039.eqiad.wmnet with OS bookworm
  • 20:50 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 20:49 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 20:47 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1038.eqiad.wmnet with OS bookworm
  • 20:37 ryankemper: T380555 hosts happily pooled (except that `lvs2013` aka `A:lvs-low-traffic-codfw` cannot talk to `wdqs2026`) and `sudo ipvsadm -L -n` shows `10.2.1.93` and `10.2.1.94` as expected, codfw all done
  • 20:33 sukhe@cumin1002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-low-traffic-eqiad and A:lvs
  • 20:32 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1039.eqiad.wmnet with reason: host reimage
  • 20:32 sukhe@cumin1002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-low-traffic-eqiad and A:lvs
  • 20:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1192 (T371742)', diff saved to https://phabricator.wikimedia.org/P71562 and previous config saved to /var/cache/conftool/dbconfig/20241204-203043-ladsgroup.json
  • 20:30 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 20:30 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 20:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T371742)', diff saved to https://phabricator.wikimedia.org/P71561 and previous config saved to /var/cache/conftool/dbconfig/20241204-203021-ladsgroup.json
  • 20:29 ryankemper@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-low-traffic-codfw and A:lvs
  • 20:28 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1038.eqiad.wmnet with reason: host reimage
  • 20:28 ryankemper: T380555 `sudo cookbook sre.loadbalancer.restart-pybal --query 'A:lvs-low-traffic-codfw' --reason 'rolling out new wdqs-internal-[main,scholarly] services' restart_daemons`
  • 20:28 ryankemper@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-low-traffic-codfw and A:lvs
  • 20:28 ryankemper: T380555 ran `sudo -E cumin 'A:lvs-low-traffic-codfw' 'run-puppet-agent --force'`
  • 20:28 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1020.eqiad.wmnet
  • 20:28 sukhe@cumin1002: START - Cookbook sre.hosts.remove-downtime for lvs1020.eqiad.wmnet
  • 20:26 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1039.eqiad.wmnet with reason: host reimage
  • 20:25 sukhe@cumin1002: END (ERROR) - Cookbook sre.loadbalancer.restart-pybal (exit_code=97) rolling-restart of pybal on A:lvs-secondary-eqiad and A:lvs
  • 20:25 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1038.eqiad.wmnet with reason: host reimage
  • 20:24 ryankemper: T380555 hosts happily pooled and `sudo ipvsadm -L -n` shows `10.2.1.93` and `10.2.1.94` as expected, proceeding to `A:lvs-low-traffic-codfw`
  • 20:23 sukhe@cumin1002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-secondary-eqiad and A:lvs
  • 20:23 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/push-notifications: apply
  • 20:22 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/push-notifications: apply
  • 20:22 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/push-notifications: apply
  • 20:21 ryankemper@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-secondary-codfw and A:lvs
  • 20:21 ryankemper: T380555 `sudo cookbook sre.loadbalancer.restart-pybal --query 'A:lvs-secondary-codfw' --reason 'rolling out new wdqs-internal-[main,scholarly] services' restart_daemons`
  • 20:21 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/push-notifications: apply
  • 20:20 ryankemper@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-secondary-codfw and A:lvs
  • 20:20 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/push-notifications: apply
  • 20:20 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/push-notifications: apply
  • 20:18 ryankemper: T380555 `sudo cookbook sre.loadbalancer.restart-pybal 'A:lvs-secondary-codfw' --reason 'rolling out new wdqs-internal-[main,scholarly] services'`
  • 20:17 ryankemper: T380555 `sudo -E cumin 'A:lvs-secondary-codfw' 'run-puppet-agent --force'`
  • 20:17 ryankemper: T380555 Beginning lvs rolling restarts. first up `A:lvs-secondary-codfw`
  • 20:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P71560 and previous config saved to /var/cache/conftool/dbconfig/20241204-201513-ladsgroup.json
  • 20:12 cdanis@deploy2002: helmfile [codfw] DONE helmfile.d/services/chart-renderer: apply
  • 20:12 cdanis@deploy2002: helmfile [codfw] START helmfile.d/services/chart-renderer: apply
  • 20:12 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/chart-renderer: apply
  • 20:12 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/services/chart-renderer: apply
  • 20:10 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1039.eqiad.wmnet with OS bookworm
  • 20:09 cdanis@deploy2002: helmfile [staging] DONE helmfile.d/services/chart-renderer: apply
  • 20:09 cdanis@deploy2002: helmfile [staging] START helmfile.d/services/chart-renderer: apply
  • 20:08 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1038.eqiad.wmnet with OS bookworm
  • 20:08 ryankemper: T380555 ran `ryankemper@cumin2002:~$ sudo -E cumin 'lvs*' 'disable-puppet T380555'`
  • 20:07 ryankemper: T380555 Disabling puppet on lvs hosts in preparation for merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/1094070 which will move `wdqs-internal-[main,scholarly]` from `service_setup` to `lvs_setup`
  • 20:04 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "T377876 - kamila@cumin1002"
  • 20:04 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "T377876 - kamila@cumin1002"
  • 20:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P71559 and previous config saved to /var/cache/conftool/dbconfig/20241204-200006-ladsgroup.json
  • 19:57 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1038.eqiad.wmnet wikikube-worker1039.eqiad.wmnet wikikube-worker1040.eqiad.wmnet wikikube-worker1041.eqiad.wmnet wikikube-worker1042.eqiad.wmnet wikikube-worker1043.eqiad.wmnet on all recursors
  • 19:57 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1038.eqiad.wmnet wikikube-worker1039.eqiad.wmnet wikikube-worker1040.eqiad.wmnet wikikube-worker1041.eqiad.wmnet wikikube-worker1042.eqiad.wmnet wikikube-worker1043.eqiad.wmnet on all recursors
  • 19:55 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1038.eqiad.wmnet on all recursors
  • 19:55 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1038.eqiad.wmnet on all recursors
  • 19:55 ryankemper: T380555 Running puppet on `wdqs2018`
  • 19:55 ryankemper: T380555 Proceeding to step 5 of new lvs service process. Merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/1094069 to enable lvs::realserver functionality
  • 19:53 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1496 to wikikube-worker1043
  • 19:53 cdanis@deploy2002: helmfile [codfw] DONE helmfile.d/services/chart-renderer: apply
  • 19:53 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1043
  • 19:53 cdanis@deploy2002: helmfile [codfw] START helmfile.d/services/chart-renderer: apply
  • 19:53 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1043
  • 19:53 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:53 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1496 to wikikube-worker1043 - kamila@cumin1002"
  • 19:52 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1496 to wikikube-worker1043 - kamila@cumin1002"
  • 19:52 sukhe: sudo cumin "O:config_master" "run-puppet-agent"
  • 19:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1495 to wikikube-worker1042
  • 19:49 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1042
  • 19:49 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 19:49 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1042
  • 19:49 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:49 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1495 to wikikube-worker1042 - kamila@cumin1002"
  • 19:49 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1495 to wikikube-worker1042 - kamila@cumin1002"
  • 19:45 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1496 to wikikube-worker1043
  • 19:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T371742)', diff saved to https://phabricator.wikimedia.org/P71558 and previous config saved to /var/cache/conftool/dbconfig/20241204-194459-ladsgroup.json
  • 19:44 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1494 to wikikube-worker1041
  • 19:43 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 19:43 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1041
  • 19:43 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1041
  • 19:43 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:43 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1494 to wikikube-worker1041 - kamila@cumin1002"
  • 19:42 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1494 to wikikube-worker1041 - kamila@cumin1002"
  • 19:40 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/chart-renderer: apply
  • 19:40 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1495 to wikikube-worker1042
  • 19:40 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/services/chart-renderer: apply
  • 19:39 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1493 to wikikube-worker1040
  • 19:39 joal@deploy2002: Finished deploy [airflow-dags/analytics@df2cac9]: Regular analytics weekly train [airflow-dags/analytics@df2cac98] (duration: 03m 55s)
  • 19:38 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1040
  • 19:38 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 19:38 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1040
  • 19:38 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:38 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1493 to wikikube-worker1040 - kamila@cumin1002"
  • 19:38 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1493 to wikikube-worker1040 - kamila@cumin1002"
  • 19:37 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1494 to wikikube-worker1041
  • 19:36 cdanis@deploy2002: helmfile [staging] DONE helmfile.d/services/chart-renderer: apply
  • 19:35 cdanis@deploy2002: helmfile [staging] START helmfile.d/services/chart-renderer: apply
  • 19:35 joal@deploy2002: Started deploy [airflow-dags/analytics@df2cac9]: Regular analytics weekly train [airflow-dags/analytics@df2cac98]
  • 19:35 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1492 to wikikube-worker1039
  • 19:34 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 19:34 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1039
  • 19:34 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1039
  • 19:34 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:34 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1492 to wikikube-worker1039 - kamila@cumin1002"
  • 19:33 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1492 to wikikube-worker1039 - kamila@cumin1002"
  • 19:30 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1493 to wikikube-worker1040
  • 19:30 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1491 to wikikube-worker1038
  • 19:29 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1038
  • 19:29 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 19:29 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1038
  • 19:29 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:29 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1491 to wikikube-worker1038 - kamila@cumin1002"
  • 19:29 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1491 to wikikube-worker1038 - kamila@cumin1002"
  • 19:26 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1492 to wikikube-worker1039
  • 19:25 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 19:25 joal@deploy2002: Finished deploy [analytics/refinery@1f94312] (hadoop-test): Regular analytics weekly train TEST - HOTFIX [analytics/refinery@1f94312a] (duration: 00m 26s)
  • 19:25 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1491 to wikikube-worker1038
  • 19:24 joal@deploy2002: Started deploy [analytics/refinery@1f94312] (hadoop-test): Regular analytics weekly train TEST - HOTFIX [analytics/refinery@1f94312a]
  • 19:24 joal@deploy2002: Finished deploy [analytics/refinery@1f94312] (thin): Regular analytics weekly train THIN - HOTFIX [analytics/refinery@1f94312a] (duration: 00m 30s)
  • 19:23 joal@deploy2002: Started deploy [analytics/refinery@1f94312] (thin): Regular analytics weekly train THIN - HOTFIX [analytics/refinery@1f94312a]
  • 19:23 joal@deploy2002: Finished deploy [analytics/refinery@1f94312]: Regular analytics weekly train - HOTFIX [analytics/refinery@1f94312a] (duration: 03m 17s)
  • 19:22 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[1491-1496].eqiad.wmnet
  • 19:20 joal@deploy2002: Started deploy [analytics/refinery@1f94312]: Regular analytics weekly train - HOTFIX [analytics/refinery@1f94312a]
  • 19:19 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[1491-1496].eqiad.wmnet
  • 19:16 ryankemper: T380555 Merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/1094069 to enable `lvs::realserver`
  • 19:09 ryankemper: T379333 Merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/1097542 to establish envoy on `A:wdqs-internal-main` and `A:wdqs-internal-scholarly`; running puppet on `wdqs2018` to test change
  • 19:03 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 19:02 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 19:02 ryankemper: T380555 Merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/1094061 to establish initial service definitions for `wdqs-internal-main` and `wdqs-internal-scholarly`
  • 18:58 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 18:58 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 18:58 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 18:58 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 18:57 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs-internal-main.svc.eqiad.wmnet on all recursors
  • 18:57 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache wdqs-internal-main.svc.eqiad.wmnet on all recursors
  • 18:55 ryankemper: T379334 Successfully ran `sudo authdns-update` on `dns1004`
  • 18:52 ryankemper: T379334 Creating A and PTR records for `wdqs-internal-main` and `wdqs-internal-scholarly` VIPs [merging https://gerrit.wikimedia.org/r/c/operations/dns/+/1100010/ & running authdns update after]
  • 18:48 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic: apply
  • 18:47 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic: apply
  • 18:47 ryankemper: T379330 `wdqs-internal-main` and `wdqs-internal-scholarly` pools created
  • 18:46 ryankemper@cumin2002: conftool action : set/pooled=yes:weight=10; selector: cluster=wdqs-internal-main,service=wdqs-main
  • 18:46 ryankemper@cumin2002: conftool action : set/pooled=yes:weight=10; selector: cluster=wdqs-internal-scholarly,service=wdqs-scholarly
  • 18:35 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/push-notifications: apply
  • 18:35 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/push-notifications: apply
  • 18:34 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/push-notifications: apply
  • 18:33 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/push-notifications: apply
  • 18:30 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/push-notifications: apply
  • 18:30 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/push-notifications: apply
  • 18:13 swfrench@deploy2002: Finished scap sync-world: Deployment to clear noop chart diff from 1081449 - T377040 (duration: 02m 07s)
  • 18:11 swfrench@deploy2002: Started scap sync-world: Deployment to clear noop chart diff from 1081449 - T377040
  • 18:04 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 18:04 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 18:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1178 (T371742)', diff saved to https://phabricator.wikimedia.org/P71556 and previous config saved to /var/cache/conftool/dbconfig/20241204-180114-ladsgroup.json
  • 18:01 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 18:00 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 18:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T371742)', diff saved to https://phabricator.wikimedia.org/P71555 and previous config saved to /var/cache/conftool/dbconfig/20241204-180052-ladsgroup.json
  • 17:55 joal@deploy2002: Finished deploy [analytics/refinery@6e3ee14] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@6e3ee14b] (duration: 00m 31s)
  • 17:54 joal@deploy2002: Started deploy [analytics/refinery@6e3ee14] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@6e3ee14b]
  • 17:54 joal@deploy2002: Finished deploy [analytics/refinery@6e3ee14] (thin): Regular analytics weekly train THIN [analytics/refinery@6e3ee14b] (duration: 00m 37s)
  • 17:54 joal@deploy2002: Started deploy [analytics/refinery@6e3ee14] (thin): Regular analytics weekly train THIN [analytics/refinery@6e3ee14b]
  • 17:52 joal@deploy2002: Finished deploy [analytics/refinery@6e3ee14]: Regular analytics weekly train [analytics/refinery@6e3ee14b] (duration: 02m 05s)
  • 17:50 bd808: Moved SAL fediverse posts to https://wikimedia.social/@sal. Many thanks to botsin.space for providing hosting for so long.
  • 17:50 joal@deploy2002: Started deploy [analytics/refinery@6e3ee14]: Regular analytics weekly train [analytics/refinery@6e3ee14b]
  • 17:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P71554 and previous config saved to /var/cache/conftool/dbconfig/20241204-174544-ladsgroup.json
  • 17:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P71553 and previous config saved to /var/cache/conftool/dbconfig/20241204-173037-ladsgroup.json
  • 17:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T371742)', diff saved to https://phabricator.wikimedia.org/P71551 and previous config saved to /var/cache/conftool/dbconfig/20241204-171530-ladsgroup.json
  • 17:10 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcephmon1003.eqiad.wmnet
  • 17:10 andrew@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:10 andrew@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcephmon1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"
  • 17:08 andrew@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcephmon1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"
  • 17:04 andrew@cumin1002: START - Cookbook sre.dns.netbox
  • 17:00 andrew@cumin1002: START - Cookbook sre.hosts.decommission for hosts cloudcephmon1003.eqiad.wmnet
  • 16:59 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcephmon1002.eqiad.wmnet
  • 16:59 andrew@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:59 andrew@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcephmon1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"
  • 16:59 andrew@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcephmon1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"
  • 16:56 andrew@cumin1002: START - Cookbook sre.dns.netbox
  • 16:52 jgleeson: smashpig-listeners updated from 79b463b4 to 17ac74f2
  • 16:51 andrew@cumin1002: START - Cookbook sre.hosts.decommission for hosts cloudcephmon1002.eqiad.wmnet
  • 16:46 isaranto@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 16:45 isaranto@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 16:38 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcephmon1001.eqiad.wmnet
  • 16:38 andrew@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:38 andrew@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcephmon1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"
  • 16:38 andrew@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcephmon1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"
  • 16:37 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
  • 16:37 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop: apply
  • 16:37 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop: apply
  • 16:37 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 16:36 jiji@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
  • 16:36 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 16:36 jiji@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 16:36 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 16:36 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
  • 16:36 jiji@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 16:35 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2173-2175].codfw.wmnet
  • 16:35 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2173-2175].codfw.wmnet
  • 16:34 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 16:34 andrew@cumin1002: START - Cookbook sre.dns.netbox
  • 16:34 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
  • 16:34 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
  • 16:34 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
  • 16:33 jiji@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
  • 16:33 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 16:33 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 16:33 jiji@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 16:33 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/benthos-cache-invalidator: apply
  • 16:32 jiji@deploy2002: helmfile [staging] START helmfile.d/services/benthos-cache-invalidator: apply
  • 16:32 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2175.codfw.wmnet with OS bookworm
  • 16:27 andrew@cumin1002: START - Cookbook sre.hosts.decommission for hosts cloudcephmon1001.eqiad.wmnet
  • 16:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Schema change
  • 16:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Schema change
  • 16:21 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2174.codfw.wmnet with OS bookworm
  • 16:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P71550 and previous config saved to /var/cache/conftool/dbconfig/20241204-162127-root.json
  • 16:19 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for kafka-main1009.eqiad.wmnet
  • 16:19 jiji@cumin1002: START - Cookbook sre.hosts.remove-downtime for kafka-main1009.eqiad.wmnet
  • 16:18 jiji@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 16:17 jiji@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 16:17 jiji@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 16:17 jiji@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 16:17 jiji@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 16:16 jiji@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 16:16 jiji@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 16:16 jiji@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 16:16 jiji@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 16:15 jiji@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 16:15 jiji@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 16:15 jiji@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 16:15 jiji@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 16:15 jiji@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 16:14 jiji@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 16:14 jiji@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 16:14 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 16:14 jiji@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 16:13 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2173.codfw.wmnet with OS bookworm
  • 16:13 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2175.codfw.wmnet with reason: host reimage
  • 16:12 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1025.eqiad.wmnet with OS bullseye
  • 16:12 bking@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
  • 16:09 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2175.codfw.wmnet with reason: host reimage
  • 16:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P71549 and previous config saved to /var/cache/conftool/dbconfig/20241204-160622-root.json
  • 16:06 hnowlan@deploy2002: Finished scap sync-world: Rebuild and deploy to pick up new php8.1 base (duration: 42m 17s)
  • 16:01 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2174.codfw.wmnet with reason: host reimage
  • 15:55 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2174.codfw.wmnet with reason: host reimage
  • 15:54 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2173.codfw.wmnet with reason: host reimage
  • 15:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P71548 and previous config saved to /var/cache/conftool/dbconfig/20241204-155116-root.json
  • 15:51 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2175
  • 15:51 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2175
  • 15:50 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2173.codfw.wmnet with reason: host reimage
  • 15:46 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2175
  • 15:46 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2175.codfw.wmnet 80.48.192.10.in-addr.arpa 0.8.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:46 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2175.codfw.wmnet 80.48.192.10.in-addr.arpa 0.8.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:46 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:45 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2175 - jayme@cumin2002"
  • 15:45 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2175 - jayme@cumin2002"
  • 15:45 vgutierrez: restarting purged on cp1115
  • 15:41 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1036-1037].eqiad.wmnet
  • 15:41 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1036-1037].eqiad.wmnet
  • 15:39 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 15:37 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2175
  • 15:36 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2174
  • 15:36 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2174
  • 15:36 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2174
  • 15:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P71546 and previous config saved to /var/cache/conftool/dbconfig/20241204-153611-root.json
  • 15:36 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2174.codfw.wmnet 79.48.192.10.in-addr.arpa 9.7.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:36 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2174.codfw.wmnet 79.48.192.10.in-addr.arpa 9.7.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:36 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:36 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2174 - jayme@cumin2002"
  • 15:36 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2174 - jayme@cumin2002"
  • 15:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1177 (T371742)', diff saved to https://phabricator.wikimedia.org/P71545 and previous config saved to /var/cache/conftool/dbconfig/20241204-153234-ladsgroup.json
  • 15:32 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 15:32 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 15:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T371742)', diff saved to https://phabricator.wikimedia.org/P71544 and previous config saved to /var/cache/conftool/dbconfig/20241204-153212-ladsgroup.json
  • 15:31 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 15:31 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2174
  • 15:31 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2173
  • 15:31 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2173
  • 15:30 jiji@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-eqiad
  • 15:30 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2173
  • 15:30 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2173.codfw.wmnet 78.48.192.10.in-addr.arpa 8.7.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:30 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2173.codfw.wmnet 78.48.192.10.in-addr.arpa 8.7.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:30 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:30 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2173 - jayme@cumin2002"
  • 15:30 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2173 - jayme@cumin2002"
  • 15:28 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2175.codfw.wmnet with OS bookworm
  • 15:27 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2174.codfw.wmnet with OS bookworm
  • 15:27 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 15:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1037.eqiad.wmnet with OS bookworm
  • 15:26 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 15:26 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2173.codfw.wmnet wikikube-worker2174.codfw.wmnet wikikube-worker2175.codfw.wmnet on all recursors
  • 15:26 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2173.codfw.wmnet wikikube-worker2174.codfw.wmnet wikikube-worker2175.codfw.wmnet on all recursors
  • 15:26 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2173
  • 15:26 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2173.codfw.wmnet with OS bookworm
  • 15:25 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2447 to wikikube-worker2175
  • 15:24 hnowlan@deploy2002: Started scap sync-world: Rebuild and deploy to pick up new php8.1 base
  • 15:24 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2175
  • 15:24 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2175
  • 15:24 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:22 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2446 to wikikube-worker2174
  • 15:22 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2174
  • 15:22 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 15:21 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2174
  • 15:21 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:21 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2446 to wikikube-worker2174 - jayme@cumin2002"
  • 15:21 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2446 to wikikube-worker2174 - jayme@cumin2002"
  • 15:21 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1045.eqiad.wmnet with OS bookworm
  • 15:21 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2445 to wikikube-worker2173
  • 15:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 10%: After schema change', diff saved to https://phabricator.wikimedia.org/P71543 and previous config saved to /var/cache/conftool/dbconfig/20241204-152105-root.json
  • 15:20 jiji@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-eqiad
  • 15:20 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2173
  • 15:18 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2173
  • 15:18 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:18 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2445 to wikikube-worker2173 - jayme@cumin2002"
  • 15:18 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2445 to wikikube-worker2173 - jayme@cumin2002"
  • 15:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P71542 and previous config saved to /var/cache/conftool/dbconfig/20241204-151705-ladsgroup.json
  • 15:16 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 15:10 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2447 to wikikube-worker2175
  • 15:10 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2446 to wikikube-worker2174
  • 15:10 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 15:10 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2445 to wikikube-worker2173
  • 15:06 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
  • 15:06 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1037.eqiad.wmnet with reason: host reimage
  • 15:03 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1037.eqiad.wmnet with reason: host reimage
  • 15:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P71541 and previous config saved to /var/cache/conftool/dbconfig/20241204-150157-ladsgroup.json
  • 15:01 TheresNoTime: '[samtar@deploy2002 ~]$ mwscript-k8s --comment="T373634" -f -- namespaceDupes.php --wiki hsbwiktionary --fix' for T373634
  • 14:59 samtar@deploy2002: Finished scap sync-world: Backport for Add new namespaces to hsb wiktionary (T373634) (duration: 10m 16s)
  • 14:54 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2015.codfw.wmnet
  • 14:54 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2015.codfw.wmnet
  • 14:52 samtar@deploy2002: samtar, srishakatux: Continuing with sync
  • 14:51 samtar@deploy2002: samtar, srishakatux: Backport for Add new namespaces to hsb wiktionary (T373634) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:50 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2446.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:50 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2447.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:50 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2445.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1025.eqiad.wmnet with reason: host reimage
  • 14:48 samtar@deploy2002: Started scap sync-world: Backport for Add new namespaces to hsb wiktionary (T373634)
  • 14:47 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1037.eqiad.wmnet with OS bookworm
  • 14:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T371742)', diff saved to https://phabricator.wikimedia.org/P71540 and previous config saved to /var/cache/conftool/dbconfig/20241204-144651-ladsgroup.json
  • 14:46 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1036.eqiad.wmnet with OS bookworm
  • 14:46 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1025.eqiad.wmnet with reason: host reimage
  • 14:31 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:29 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Translate: Enable message group subscription for 6 wikis (T372386) (duration: 18m 12s)
  • 14:27 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1036.eqiad.wmnet with reason: host reimage
  • 14:23 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1036.eqiad.wmnet with reason: host reimage
  • 14:22 lucaswerkmeister-wmde@deploy2002: abi, lucaswerkmeister-wmde: Continuing with sync
  • 14:19 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1025.eqiad.wmnet with OS bullseye
  • 14:18 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2015.codfw.wmnet with OS bookworm
  • 14:17 lucaswerkmeister-wmde@deploy2002: abi, lucaswerkmeister-wmde: Backport for Translate: Enable message group subscription for 6 wikis (T372386) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:11 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Translate: Enable message group subscription for 6 wikis (T372386)
  • 14:07 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1036.eqiad.wmnet with OS bookworm
  • 14:05 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1036.eqiad.wmnet wikikube-worker1037.eqiad.wmnet on all recursors
  • 14:05 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1036.eqiad.wmnet wikikube-worker1037.eqiad.wmnet on all recursors
  • 14:01 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2446.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:01 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2447.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:01 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2445.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:00 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1045.eqiad.wmnet with OS bookworm
  • 14:00 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1044.eqiad.wmnet with OS bookworm
  • 14:00 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS bookworm
  • 14:00 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Ensure IP reveal buttons are not shown on Special:MassGlobalBlock (T124607) (duration: 13m 08s)
  • 13:58 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2015.codfw.wmnet with reason: host reimage
  • 13:58 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mw[2445-2447].codfw.wmnet with reason: reimage
  • 13:57 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 4:00:00 on mw[2445-2447].codfw.wmnet with reason: reimage
  • 13:56 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1024 to wikikube-worker1037
  • 13:55 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1037
  • 13:54 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2015.codfw.wmnet with reason: host reimage
  • 13:54 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1037
  • 13:54 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:54 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1024 to wikikube-worker1037 - jelto@cumin1002"
  • 13:54 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1024 to wikikube-worker1037 - jelto@cumin1002"
  • 13:53 dreamyjazz@deploy2002: tchanders, dreamyjazz: Continuing with sync
  • 13:53 dreamyjazz@deploy2002: tchanders, dreamyjazz: Backport for Ensure IP reveal buttons are not shown on Special:MassGlobalBlock (T124607) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:51 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[2445-2447].codfw.wmnet
  • 13:50 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:50 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[2445-2447].codfw.wmnet
  • 13:50 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1024 to wikikube-worker1037
  • 13:49 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1023 to wikikube-worker1036
  • 13:48 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1036
  • 13:47 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1036
  • 13:47 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:47 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1023 to wikikube-worker1036 - jelto@cumin1002"
  • 13:47 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1023 to wikikube-worker1036 - jelto@cumin1002"
  • 13:47 dreamyjazz@deploy2002: Started scap sync-world: Backport for Ensure IP reveal buttons are not shown on Special:MassGlobalBlock (T124607)
  • 13:42 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Stats: Move StatsFactory flush into emitBufferedStats (T380609), Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169), Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169), Revert "Stats: Move StatsFactory flush into emitBufferedSt (gerrit:1100449)
  • 13:41 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2016,2171-2172].codfw.wmnet
  • 13:41 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2016,2171-2172].codfw.wmnet
  • 13:39 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:39 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1023 to wikikube-worker1036
  • 13:35 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2015.codfw.wmnet with OS bookworm
  • 13:35 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2015.codfw.wmnet with OS bookworm
  • 13:33 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 13:33 dreamyjazz@deploy2002: dreamyjazz: Backport for Stats: Move StatsFactory flush into emitBufferedStats (T380609), Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169), Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169), Revert "Stats: Move StatsFactory flush into emitBufferedStats" synced
  • 13:31 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1023-1024].eqiad.wmnet
  • 13:30 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1023-1024].eqiad.wmnet
  • 13:28 dreamyjazz@deploy2002: Started scap sync-world: Backport for Stats: Move StatsFactory flush into emitBufferedStats (T380609), Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169), Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169), Revert "Stats: Move StatsFactory flush into emitBufferedSta (gerrit:1100449)
  • 13:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet with reason: Alter table
  • 13:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet with reason: Alter table
  • 13:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on clouddb1020.eqiad.wmnet with reason: Alter table
  • 13:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on clouddb1020.eqiad.wmnet with reason: Alter table
  • 13:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on clouddb1016.eqiad.wmnet with reason: Alter table
  • 13:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on clouddb1016.eqiad.wmnet with reason: Alter table
  • 13:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1154.eqiad.wmnet with reason: Alter table
  • 13:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1154.eqiad.wmnet with reason: Alter table
  • 13:12 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 13:12 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 13:12 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 13:12 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 13:12 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 13:12 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:09 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 13:09 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 13:09 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 13:09 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 13:09 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 13:09 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1172 (T371742)', diff saved to https://phabricator.wikimedia.org/P71537 and previous config saved to /var/cache/conftool/dbconfig/20241204-130614-ladsgroup.json
  • 13:06 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 13:06 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 13:05 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 13:05 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 13:05 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 13:05 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 13:05 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 13:05 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:05 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 13:04 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 13:04 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 13:04 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 13:04 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 13:04 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:00 dreamyjazz@deploy2002: Started scap sync-world: Backport for Stats: Move StatsFactory flush into emitBufferedStats (T380609), Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169), Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169)
  • 12:59 dreamyjazz@deploy2002: Started scap sync-world: Backport for Stats: Move StatsFactory flush into emitBufferedStats (T380609), Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169), Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169)
  • 12:57 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 12:56 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 12:55 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 12:55 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 12:54 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kafka-main[1004,1009].eqiad.wmnet with reason: Hardware refresh
  • 12:54 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kafka-main[1004,1009].eqiad.wmnet with reason: Hardware refresh
  • 12:52 stevemunene@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub: sync on production
  • 12:47 stevemunene@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub: apply on production
  • 12:47 moritzm: uploaded mailman3 3.3.8-2~deb12u2+wmf1 T377045
  • 12:42 stevemunene@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 12:40 hnowlan: imported debs for mercurius_1.0.2
  • 12:38 stevemunene@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 12:33 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
  • 12:32 moritzm: installing glib2.0 security updates
  • 12:22 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2172.codfw.wmnet with OS bookworm
  • 12:06 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Create a DB list for wikis with continuous MediaModeration scans (T355169) (duration: 13m 02s)
  • 12:02 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2172.codfw.wmnet with reason: host reimage
  • 12:01 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2209.codfw.wmnet with reason: Maintenance
  • 12:01 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2209.codfw.wmnet with reason: Maintenance
  • 11:59 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 11:59 dreamyjazz@deploy2002: dreamyjazz: Backport for Create a DB list for wikis with continuous MediaModeration scans (T355169) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:58 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2172.codfw.wmnet with reason: host reimage
  • 11:53 dreamyjazz@deploy2002: Started scap sync-world: Backport for Create a DB list for wikis with continuous MediaModeration scans (T355169)
  • 11:52 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2016.codfw.wmnet with OS bookworm
  • 11:49 vgutierrez: re-enabling outbound bandwidth limits enforced by haproxy on the upload cluster
  • 11:39 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2172.codfw.wmnet with OS bookworm
  • 11:38 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2172.codfw.wmnet with OS bookworm
  • 11:36 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2171.codfw.wmnet with OS bookworm
  • 11:35 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 11:35 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 11:32 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2016.codfw.wmnet with reason: host reimage
  • 11:26 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2016.codfw.wmnet with reason: host reimage
  • 11:24 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 11:24 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 11:14 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2171.codfw.wmnet with reason: host reimage
  • 11:13 vgutierrez: disabling outbound bandwidth limits enforced by haproxy on the upload cluster (we are getting haproxy crashes)
  • 11:11 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2171.codfw.wmnet with reason: host reimage
  • 11:07 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2016
  • 11:07 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2016
  • 11:07 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2016
  • 11:07 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2016.codfw.wmnet 151.32.192.10.in-addr.arpa 1.5.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:07 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2016.codfw.wmnet 151.32.192.10.in-addr.arpa 1.5.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:07 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:07 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2016 - jayme@cumin2002"
  • 11:07 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2016 - jayme@cumin2002"
  • 11:03 vgutierrez: restarting haproxy on cp1107
  • 10:58 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:58 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2016
  • 10:57 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2015
  • 10:57 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2015
  • 10:57 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2015.codfw.wmnet wikikube-worker2016.codfw.wmnet wikikube-worker2171.codfw.wmnet wikikube-worker2172.codfw.wmnet on all recursors
  • 10:57 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2015.codfw.wmnet wikikube-worker2016.codfw.wmnet wikikube-worker2171.codfw.wmnet wikikube-worker2172.codfw.wmnet on all recursors
  • 10:57 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2015
  • 10:57 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2015.codfw.wmnet 149.32.192.10.in-addr.arpa 9.4.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:57 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2015.codfw.wmnet 149.32.192.10.in-addr.arpa 9.4.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:57 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:57 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2015 - jayme@cumin2002"
  • 10:57 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2015 - jayme@cumin2002"
  • 10:54 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2016.codfw.wmnet with OS bookworm
  • 10:53 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:53 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2015
  • 10:52 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2171
  • 10:52 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2171
  • 10:52 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2171
  • 10:52 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2171.codfw.wmnet 152.32.192.10.in-addr.arpa 2.5.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:52 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2171.codfw.wmnet 152.32.192.10.in-addr.arpa 2.5.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:52 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:50 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:49 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2016.codfw.wmnet with OS bookworm
  • 10:49 brouberol@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-presto1005.eqiad.wmnet
  • 10:49 brouberol@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:48 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2016.codfw.wmnet with OS bookworm
  • 10:47 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2442 to wikikube-worker2016
  • 10:47 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2016
  • 10:46 brouberol@cumin2002: START - Cookbook sre.dns.netbox
  • 10:46 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2016
  • 10:46 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:46 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2442 to wikikube-worker2016 - jayme@cumin2002"
  • 10:46 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2442 to wikikube-worker2016 - jayme@cumin2002"
  • 10:41 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:41 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2172
  • 10:41 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2172
  • 10:41 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2172
  • 10:41 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2172.codfw.wmnet 77.48.192.10.in-addr.arpa 7.7.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:41 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2172.codfw.wmnet 77.48.192.10.in-addr.arpa 7.7.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:41 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:41 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2172 - jayme@cumin2002"
  • 10:40 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2172 - jayme@cumin2002"
  • 10:39 brouberol@cumin2002: START - Cookbook sre.hosts.decommission for hosts an-presto1005.eqiad.wmnet
  • 10:38 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2015.codfw.wmnet with OS bookworm
  • 10:38 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2440 to wikikube-worker2015
  • 10:37 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2442 to wikikube-worker2016
  • 10:37 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2015
  • 10:37 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2171
  • 10:36 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2171.codfw.wmnet with OS bookworm
  • 10:36 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:36 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2015
  • 10:36 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:36 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2172
  • 10:36 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2172.codfw.wmnet with OS bookworm
  • 10:35 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2443 to wikikube-worker2171
  • 10:35 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2171
  • 10:34 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:33 brouberol@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-presto1004.eqiad.wmnet
  • 10:33 moritzm: removing ganeti2018 from active Ganeti nodes T376594
  • 10:33 brouberol@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:30 brouberol@cumin2002: START - Cookbook sre.dns.netbox
  • 10:30 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2444 to wikikube-worker2172
  • 10:30 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2171
  • 10:30 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:30 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2172
  • 10:29 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2172
  • 10:29 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:29 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2444 to wikikube-worker2172 - jayme@cumin2002"
  • 10:28 vgutierrez: enabling outbound bandwidth limits enforced by haproxy on the upload cluster
  • 10:28 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:27 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.rename (exit_code=93) from mw2442 to wikikube-worker2016
  • 10:27 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2444 to wikikube-worker2172 - jayme@cumin2002"
  • 10:27 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2442 to wikikube-worker2016
  • 10:23 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.rename (exit_code=93) from mw2442 to wikikube-worker2016
  • 10:23 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2442 to wikikube-worker2016
  • 10:23 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:23 jayme@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 10:22 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2444 to wikikube-worker2172
  • 10:22 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2443 to wikikube-worker2171
  • 10:22 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.rename (exit_code=93) from mw2442 to wikikube-worker20160
  • 10:22 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 10:22 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2442 to wikikube-worker20160
  • 10:22 brouberol@cumin2002: START - Cookbook sre.hosts.decommission for hosts an-presto1004.eqiad.wmnet
  • 10:21 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 10:21 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:20 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2440 to wikikube-worker2015
  • 10:19 brouberol@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-presto1003.eqiad.wmnet
  • 10:19 brouberol@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:19 brouberol@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-presto1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin2002"
  • 10:19 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 10:19 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 10:19 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 10:19 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 10:13 brouberol@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-presto1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin2002"
  • 10:10 brouberol@cumin2002: START - Cookbook sre.dns.netbox
  • 10:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 8 hosts with reason: Rebooting
  • 10:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on 8 hosts with reason: Rebooting
  • 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2018.codfw.wmnet
  • 10:04 brouberol@cumin2002: START - Cookbook sre.hosts.decommission for hosts an-presto1003.eqiad.wmnet
  • 10:03 brouberol@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-presto1002.eqiad.wmnet
  • 10:03 brouberol@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:03 brouberol@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-presto1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin2002"
  • 10:02 brouberol@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-presto1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin2002"
  • 09:58 brouberol@cumin2002: START - Cookbook sre.dns.netbox
  • 09:56 godog: bump space for prometheus k8s-mlserve in eqiad
  • 09:50 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2444.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:50 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2443.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:46 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2440.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:39 brouberol@cumin2002: START - Cookbook sre.hosts.decommission for hosts an-presto1002.eqiad.wmnet
  • 09:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es2024.codfw.wmnet with reason: cloning
  • 09:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es2024.codfw.wmnet with reason: cloning
  • 09:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2024 to clone es2045', diff saved to https://phabricator.wikimedia.org/P71535 and previous config saved to /var/cache/conftool/dbconfig/20241204-093541-marostegui.json
  • 09:35 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es2023 to es5 master T381259', diff saved to https://phabricator.wikimedia.org/P71534 and previous config saved to /var/cache/conftool/dbconfig/20241204-093519-marostegui.json
  • 09:35 brouberol@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-presto1001.eqiad.wmnet
  • 09:35 brouberol@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:35 brouberol@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-presto1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin2002"
  • 09:34 brouberol@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-presto1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin2002"
  • 09:33 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2444.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:33 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2443.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:32 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2442.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:30 brouberol@cumin2002: START - Cookbook sre.dns.netbox
  • 09:21 brouberol@cumin2002: START - Cookbook sre.hosts.decommission for hosts an-presto1001.eqiad.wmnet
  • 09:15 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2442.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:14 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2440.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:12 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on mw[2440,2442-2444].codfw.wmnet with reason: T377877
  • 09:12 marostegui@cumin1002: dbctl commit (dc=all): 'es2046 (re)pooling @ 100%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71533 and previous config saved to /var/cache/conftool/dbconfig/20241204-091229-root.json
  • 09:12 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on mw[2440,2442-2444].codfw.wmnet with reason: T377877
  • 09:07 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[2440,2442-2444].codfw.wmnet
  • 09:05 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[2440,2442-2444].codfw.wmnet
  • 08:57 marostegui@cumin1002: dbctl commit (dc=all): 'es2046 (re)pooling @ 75%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71532 and previous config saved to /var/cache/conftool/dbconfig/20241204-085724-root.json
  • 08:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es2022.codfw.wmnet with reason: cloning
  • 08:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es2022.codfw.wmnet with reason: cloning
  • 08:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2022 to clone es2043', diff saved to https://phabricator.wikimedia.org/P71531 and previous config saved to /var/cache/conftool/dbconfig/20241204-085143-marostegui.json
  • 08:51 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es2020 to es4 master T381259', diff saved to https://phabricator.wikimedia.org/P71530 and previous config saved to /var/cache/conftool/dbconfig/20241204-085124-marostegui.json
  • 08:46 marostegui@cumin1002: dbctl commit (dc=all): 'es2042 (re)pooling @ 100%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71529 and previous config saved to /var/cache/conftool/dbconfig/20241204-084650-root.json
  • 08:42 marostegui@cumin1002: dbctl commit (dc=all): 'es2046 (re)pooling @ 50%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71528 and previous config saved to /var/cache/conftool/dbconfig/20241204-084219-root.json
  • 08:35 moritzm: rebalance Ganeti eqiad/C following server refreshes
  • 08:31 marostegui@cumin1002: dbctl commit (dc=all): 'es2042 (re)pooling @ 75%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71527 and previous config saved to /var/cache/conftool/dbconfig/20241204-083145-root.json
  • 08:27 marostegui@cumin1002: dbctl commit (dc=all): 'es2046 (re)pooling @ 25%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71526 and previous config saved to /var/cache/conftool/dbconfig/20241204-082714-root.json
  • 08:25 kharlan@deploy2002: Finished scap sync-world: Backport for dialog: Don't duplicate the footer in the behaviour list template (T381189) (duration: 12m 08s)
  • 08:23 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2018.codfw.wmnet
  • 08:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2018.codfw.wmnet
  • 08:18 kharlan@deploy2002: kharlan: Continuing with sync
  • 08:18 kharlan@deploy2002: kharlan: Backport for dialog: Don't duplicate the footer in the behaviour list template (T381189) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:16 marostegui@cumin1002: dbctl commit (dc=all): 'es2042 (re)pooling @ 50%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71525 and previous config saved to /var/cache/conftool/dbconfig/20241204-081640-root.json
  • 08:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2018.codfw.wmnet
  • 08:13 kharlan@deploy2002: Started scap sync-world: Backport for dialog: Don't duplicate the footer in the behaviour list template (T381189)
  • 08:12 marostegui@cumin1002: dbctl commit (dc=all): 'es2046 (re)pooling @ 10%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71524 and previous config saved to /var/cache/conftool/dbconfig/20241204-081208-root.json
  • 08:01 marostegui@cumin1002: dbctl commit (dc=all): 'es2042 (re)pooling @ 25%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71522 and previous config saved to /var/cache/conftool/dbconfig/20241204-080134-root.json
  • 07:57 marostegui@cumin1002: dbctl commit (dc=all): 'es2046 (re)pooling @ 1%: Pooling in es5', diff saved to https://phabricator.wikimedia.org/P71520 and previous config saved to /var/cache/conftool/dbconfig/20241204-075703-root.json
  • 07:54 marostegui@cumin1002: dbctl commit (dc=all): 'Add es2046 to es5 depooled T381259', diff saved to https://phabricator.wikimedia.org/P71519 and previous config saved to /var/cache/conftool/dbconfig/20241204-075427-marostegui.json
  • 07:46 marostegui@cumin1002: dbctl commit (dc=all): 'es2042 (re)pooling @ 10%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71518 and previous config saved to /var/cache/conftool/dbconfig/20241204-074629-root.json
  • 07:08 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 100%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71517 and previous config saved to /var/cache/conftool/dbconfig/20241204-070855-root.json
  • 07:08 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 100%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71516 and previous config saved to /var/cache/conftool/dbconfig/20241204-070829-root.json
  • 06:53 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 75%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71515 and previous config saved to /var/cache/conftool/dbconfig/20241204-065349-root.json
  • 06:53 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 75%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71514 and previous config saved to /var/cache/conftool/dbconfig/20241204-065324-root.json
  • 06:38 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 50%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71513 and previous config saved to /var/cache/conftool/dbconfig/20241204-063844-root.json
  • 06:38 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 50%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71512 and previous config saved to /var/cache/conftool/dbconfig/20241204-063819-root.json
  • 06:23 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 25%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71511 and previous config saved to /var/cache/conftool/dbconfig/20241204-062339-root.json
  • 06:23 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 25%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71510 and previous config saved to /var/cache/conftool/dbconfig/20241204-062313-root.json
  • 06:18 marostegui@cumin1002: dbctl commit (dc=all): 'Add es2042 to dbctl depooled T381259', diff saved to https://phabricator.wikimedia.org/P71509 and previous config saved to /var/cache/conftool/dbconfig/20241204-061821-marostegui.json
  • 06:08 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 10%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71508 and previous config saved to /var/cache/conftool/dbconfig/20241204-060834-root.json
  • 06:08 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 10%: Repooling cloning', diff saved to https://phabricator.wikimedia.org/P71507 and previous config saved to /var/cache/conftool/dbconfig/20241204-060808-root.json
  • 02:40 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1045.eqiad.wmnet with OS bookworm
  • 02:33 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1046.eqiad.wmnet with OS bookworm
  • 02:33 jclark@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 02:32 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1041.eqiad.wmnet with OS bookworm
  • 02:32 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 02:08 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be1085.eqiad.wmnet with OS bullseye
  • 01:56 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 01:46 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1043.eqiad.wmnet with OS bookworm
  • 01:42 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1044.eqiad.wmnet with OS bookworm
  • 01:39 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 01:39 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1046.eqiad.wmnet with reason: host reimage
  • 01:36 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1046.eqiad.wmnet with reason: host reimage
  • 01:23 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:22 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:22 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1041.eqiad.wmnet with reason: host reimage
  • 01:20 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1046.eqiad.wmnet with OS bookworm
  • 01:20 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1045.eqiad.wmnet with OS bookworm
  • 01:19 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1041.eqiad.wmnet with reason: host reimage
  • 01:15 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:15 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 01:03 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1041.eqiad.wmnet with OS bookworm
  • 01:02 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1041.eqiad.wmnet with OS bookworm
  • 01:00 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:00 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:57 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1042.eqiad.wmnet with OS bookworm
  • 00:57 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 00:56 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 00:53 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:52 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:50 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:48 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:48 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:47 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1085.eqiad.wmnet with OS bullseye
  • 00:47 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:43 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:43 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:42 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:42 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:41 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1085.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:40 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1042.eqiad.wmnet with reason: host reimage
  • 00:37 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:36 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1042.eqiad.wmnet with reason: host reimage
  • 00:31 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1085.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:30 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2020.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 00:26 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:18 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1085.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:18 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1085.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:16 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:13 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1084.eqiad.wmnet with OS bullseye
  • 00:13 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1002"
  • 00:09 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:09 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply

2024-12-03

  • 23:58 amastilovic@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 23:52 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1002"
  • 23:50 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1085.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 23:48 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1085.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 23:48 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1044.eqiad.wmnet with OS bookworm
  • 23:42 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 23:41 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS bookworm
  • 23:41 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1042.eqiad.wmnet with OS bookworm
  • 23:40 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1041.eqiad.wmnet with OS bookworm
  • 23:39 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 23:39 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt ms-be1085 - vriley@cumin1002"
  • 23:39 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt ms-be1085 - vriley@cumin1002"
  • 23:37 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1086.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 23:36 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2020.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 23:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2019.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 23:35 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 23:34 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1084.eqiad.wmnet with reason: host reimage
  • 23:30 vriley@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1084.eqiad.wmnet with reason: host reimage
  • 23:29 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1086
  • 23:28 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1086
  • 23:27 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1089.eqiad.wmnet with OS bullseye
  • 23:27 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 23:25 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 23:22 jclark@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 23:22 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added mgmt for ms-be - jclark@cumin1002"
  • 23:22 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added mgmt for ms-be - jclark@cumin1002"
  • 23:20 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1084.eqiad.wmnet with OS bullseye
  • 23:19 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1083.eqiad.wmnet with OS bullseye
  • 23:19 vriley@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1002"
  • 23:19 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 23:12 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1087.eqiad.wmnet with OS bullseye
  • 23:12 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 23:11 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 23:11 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1090.eqiad.wmnet with OS bullseye
  • 23:11 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 23:11 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 23:08 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1088.eqiad.wmnet with OS bullseye
  • 23:08 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 23:08 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1089.eqiad.wmnet with reason: host reimage
  • 23:08 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 23:04 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1089.eqiad.wmnet with reason: host reimage
  • 23:04 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1084.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 23:02 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wdqs[2018-2020,2026-2027].codfw.wmnet with reason: T376150
  • 23:02 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wdqs[2018-2020,2026-2027].codfw.wmnet with reason: T376150
  • 23:02 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wdqs[1026-1027].eqiad.wmnet with reason: T376150
  • 23:01 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wdqs[1026-1027].eqiad.wmnet with reason: T376150
  • 22:57 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1002"
  • 22:53 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1084.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:53 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1089.eqiad.wmnet with OS bullseye
  • 22:52 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1090.eqiad.wmnet with reason: host reimage
  • 22:52 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1089.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:52 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1084.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:52 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1084.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:50 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
  • 22:46 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1087.eqiad.wmnet with reason: host reimage
  • 22:43 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1090.eqiad.wmnet with reason: host reimage
  • 22:43 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
  • 22:43 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2019.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 22:43 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1087.eqiad.wmnet with reason: host reimage
  • 22:42 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1089.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:38 ryankemper@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host (duration: 00m 13s)
  • 22:38 ryankemper@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host
  • 22:37 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1083.eqiad.wmnet with reason: host reimage
  • 22:35 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on wdqs[1026-1027].eqiad.wmnet with reason: T376150
  • 22:35 bking@cumin2002: START - Cookbook sre.hosts.downtime for 4:00:00 on wdqs[1026-1027].eqiad.wmnet with reason: T376150
  • 22:34 vriley@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1083.eqiad.wmnet with reason: host reimage
  • 22:32 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1088.eqiad.wmnet with OS bullseye
  • 22:32 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1090.eqiad.wmnet with OS bullseye
  • 22:32 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1087.eqiad.wmnet with OS bullseye
  • 22:32 ryankemper@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-scholarly host (duration: 00m 13s)
  • 22:32 ryankemper@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-scholarly host
  • 22:32 ryankemper@deploy2002: deploy aborted: deploy to fresh wdqs-internal-scholarly host (duration: 03m 59s)
  • 22:32 dancy@deploy2002: Installation of scap version "4.132.0" completed for 1 hosts
  • 22:31 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1090.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:31 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1088.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:31 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1087.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:31 dancy@deploy2002: Installing scap version "4.132.0" for 1 host(s)
  • 22:28 ryankemper@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-scholarly host
  • 22:23 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1083.eqiad.wmnet with OS bullseye
  • 22:21 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1089.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:21 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1090.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:21 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1088.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:21 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1089.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:21 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1087.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:15 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1084.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:15 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1084.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:12 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:10 brett@cumin2002: START - Cookbook sre.dns.netbox
  • 21:52 ebernhardson@deploy2002: Finished scap sync-world: Backport for cirrus: Configure MLR buckets (T377128) (duration: 17m 47s)
  • 21:45 ebernhardson@deploy2002: ebernhardson: Continuing with sync
  • 21:40 ebernhardson@deploy2002: ebernhardson: Backport for cirrus: Configure MLR buckets (T377128) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:34 ebernhardson@deploy2002: Started scap sync-world: Backport for cirrus: Configure MLR buckets (T377128)
  • 21:32 ebernhardson@deploy2002: Finished scap sync-world: Backport for Rerunning Web browser extension survey (T380778), Reenable non-UI experiment quick survey (T379241), Deploy Vector22 To Wikis (T381041) (duration: 22m 00s)
  • 21:28 swfrench@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:28 swfrench@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Backfill allocations for mw-parsoid LVS VIPs - swfrench@cumin2002"
  • 21:28 swfrench@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Backfill allocations for mw-parsoid LVS VIPs - swfrench@cumin2002"
  • 21:24 ebernhardson@deploy2002: bwang, ebernhardson, lmora, jdrewniak: Continuing with sync
  • 21:23 swfrench@cumin2002: START - Cookbook sre.dns.netbox
  • 21:16 ebernhardson@deploy2002: bwang, ebernhardson, lmora, jdrewniak: Backport for Rerunning Web browser extension survey (T380778), Reenable non-UI experiment quick survey (T379241), Deploy Vector22 To Wikis (T381041) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:10 ebernhardson@deploy2002: Started scap sync-world: Backport for Rerunning Web browser extension survey (T380778), Reenable non-UI experiment quick survey (T379241), Deploy Vector22 To Wikis (T381041)
  • 21:08 dancy@deploy2002: Installation of scap version "4.132.0" completed for 1 hosts
  • 21:07 dancy@deploy2002: Installing scap version "4.132.0" for 1 host(s)
  • 20:49 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1087
  • 20:49 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1087
  • 20:49 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1088
  • 20:48 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1088
  • 20:48 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1090
  • 20:48 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1090
  • 20:48 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1089
  • 20:48 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1089
  • 20:46 jclark@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:46 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added mgmt for ms-be - jclark@cumin1002"
  • 20:46 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added mgmt for ms-be - jclark@cumin1002"
  • 20:42 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 20:38 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1278-1279].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 20:38 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1279.eqiad.wmnet with OS bookworm
  • 20:19 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1279.eqiad.wmnet with reason: host reimage
  • 20:15 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1279.eqiad.wmnet with reason: host reimage
  • 20:01 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs1026.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 20:00 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, initialize wdqs internal scholarly tier) xfer scholarly_articles from wdqs1023.eqiad.wmnet -> wdqs1027.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 19:57 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2018.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 19:55 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1279.eqiad.wmnet with OS bookworm
  • 19:53 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1278.eqiad.wmnet with OS bookworm
  • 19:34 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1278.eqiad.wmnet with reason: host reimage
  • 19:31 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1278.eqiad.wmnet with reason: host reimage
  • 19:19 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
  • 19:18 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.44.0-wmf.6 refs T375665
  • 19:15 topranks: rebooting rpki2003 to clear out tmpfs filesystem which is full
  • 19:15 cmooney@cumin1002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
  • 19:14 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal scholarly tier) xfer scholarly_articles from wdqs1023.eqiad.wmnet -> wdqs1027.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 19:13 ryankemper@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-scholarly host (duration: 00m 07s)
  • 19:13 ryankemper@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-scholarly host
  • 19:13 ryankemper@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host (duration: 01m 09s)
  • 19:11 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs1026.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 19:11 ryankemper@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host
  • 19:11 ryankemper@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host (duration: 02m 45s)
  • 19:10 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1278.eqiad.wmnet with OS bookworm
  • 19:09 ryankemper@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host
  • 19:04 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs1026.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 19:04 kamila@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1278-1279].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 19:02 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs1026.eqiad.wmnet w/ force delete existing files, repooling source-only afterwards
  • 19:00 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer wikidata_main from wdqs2021.codfw.wmnet -> wdqs2018.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 18:59 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T376150, initialize wdqs internal main tier) xfer scholarly_articles from wdqs2021.codfw.wmnet -> wdqs2018.codfw.wmnet, repooling source-only afterwards
  • 18:58 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on wdqs2027.codfw.wmnet with reason: T376150
  • 18:58 bking@cumin2002: START - Cookbook sre.hosts.downtime for 4:00:00 on wdqs2027.codfw.wmnet with reason: T376150
  • 18:56 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T376150, initialize wdqs internal main tier) xfer scholarly_articles from wdqs2021.codfw.wmnet -> wdqs2018.codfw.wmnet, repooling source-only afterwards
  • 18:49 ryankemper@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host (duration: 00m 14s)
  • 18:49 ryankemper@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host
  • 18:47 ryankemper@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host (duration: 00m 14s)
  • 18:47 ryankemper@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host
  • 18:43 ryankemper@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host (duration: 03m 31s)
  • 18:40 ryankemper@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-main host
  • 18:39 ryankemper@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-scholarly host (duration: 00m 11s)
  • 18:39 ryankemper@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-scholarly host
  • 18:39 ryankemper@deploy2002: Finished deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-scholarly host (duration: 00m 11s)
  • 18:39 ryankemper@deploy2002: Started deploy [wdqs/wdqs@9927a5a]: deploy to fresh wdqs-internal-scholarly host
  • 18:35 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1034-1035].eqiad.wmnet
  • 18:35 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1034-1035].eqiad.wmnet
  • 18:23 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 18:11 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1035.eqiad.wmnet with OS bookworm
  • 18:00 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1091.eqiad.wmnet with OS bullseye
  • 18:00 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 17:57 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 17:57 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 17:57 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 17:56 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 17:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1035.eqiad.wmnet with reason: host reimage
  • 17:50 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on wdqs2026.codfw.wmnet with reason: T376150
  • 17:50 bking@cumin2002: START - Cookbook sre.hosts.downtime for 4:00:00 on wdqs2026.codfw.wmnet with reason: T376150
  • 17:48 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1035.eqiad.wmnet with reason: host reimage
  • 17:47 brett@puppetserver1001: conftool action : set/pooled=yes; selector: dc=magru,service=cdn,name=cp7001.magru.wmnet
  • 17:46 brett: Removing RSA certificate support from haproxy/cp (T370837)
  • 17:38 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 17:32 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1035.eqiad.wmnet with OS bookworm
  • 17:30 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1034.eqiad.wmnet with OS bookworm
  • 17:20 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1091.eqiad.wmnet with reason: host reimage
  • 17:17 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1091.eqiad.wmnet with reason: host reimage
  • 17:11 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1034.eqiad.wmnet with reason: host reimage
  • 17:08 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1034.eqiad.wmnet with reason: host reimage
  • 17:07 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1091.eqiad.wmnet with OS bullseye
  • 16:58 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1091.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:52 brett@puppetserver1001: conftool action : set/pooled=no; selector: dc=magru,service=cdn,name=cp7001.magru.wmnet
  • 16:51 sbisson@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main'.
  • 16:51 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1034.eqiad.wmnet with OS bookworm
  • 16:50 urbanecm@deploy2002: Finished scap sync-world: Backport for Revert "Increase Nuke max age to 90 days" (T380846) (duration: 12m 29s)
  • 16:49 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1034.eqiad.wmnet with OS bookworm
  • 16:47 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1091.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:44 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1091.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:44 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1091.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:38 urbanecm@deploy2002: Started scap sync-world: Backport for Revert "Increase Nuke max age to 90 days" (T380846)
  • 16:30 brett: Disabling puppet on A:cp to prep for RSA removal - T370837
  • 16:27 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1091.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:27 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1091.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:19 moritzm: rebalance Ganeti eqiad/B following server refreshes
  • 16:07 moritzm: installing intel-microcode security updates
  • 15:51 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1034.eqiad.wmnet with OS bookworm
  • 15:48 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1034.eqiad.wmnet wikikube-worker1035.eqiad.wmnet on all recursors
  • 15:48 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1034.eqiad.wmnet wikikube-worker1035.eqiad.wmnet on all recursors
  • 15:48 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1022 to wikikube-worker1035
  • 15:47 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1035
  • 15:45 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1035
  • 15:45 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:45 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1022 to wikikube-worker1035 - jelto@cumin1002"
  • 15:43 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1022 to wikikube-worker1035 - jelto@cumin1002"
  • 15:39 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 15:39 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1022 to wikikube-worker1035
  • 15:38 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1021 to wikikube-worker1034
  • 15:37 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1034
  • 15:36 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1034
  • 15:36 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:36 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1021 to wikikube-worker1034 - jelto@cumin1002"
  • 15:35 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1021 to wikikube-worker1034 - jelto@cumin1002"
  • 15:31 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 15:31 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1021 to wikikube-worker1034
  • 15:14 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1021-1022].eqiad.wmnet
  • 15:13 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1021-1022].eqiad.wmnet
  • 15:10 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2242.codfw.wmnet with OS bookworm
  • 15:10 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 15:10 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2241.codfw.wmnet with OS bookworm
  • 15:09 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 14:57 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 14:52 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 14:44 urbanecm@deploy2002: Finished scap sync-world: Backport for fix: show thumbnails in surfacing popups (T381364), fix: show thumbnails in surfacing popups (T381364) (duration: 19m 24s)
  • 14:38 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
  • 14:37 urbanecm@deploy2002: migr, urbanecm: Continuing with sync
  • 14:35 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2242.codfw.wmnet with reason: host reimage
  • 14:32 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
  • 14:32 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2242.codfw.wmnet with reason: host reimage
  • 14:30 urbanecm@deploy2002: migr, urbanecm: Backport for fix: show thumbnails in surfacing popups (T381364), fix: show thumbnails in surfacing popups (T381364) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:25 urbanecm@deploy2002: Started scap sync-world: Backport for fix: show thumbnails in surfacing popups (T381364), fix: show thumbnails in surfacing popups (T381364)
  • 14:22 urbanecm@deploy2002: Finished scap sync-world: Backport for Increase Nuke max age to 90 days (T380846), knwiki: remove module namespace names from core-Namespaces.php (T346583), Remove temporary fix for badly set CentralAuth cookies (duration: 17m 04s)
  • 14:17 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2241.codfw.wmnet with OS bookworm
  • 14:17 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2242.codfw.wmnet with OS bookworm
  • 14:13 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1015-1016].eqiad.wmnet
  • 14:13 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1015-1016].eqiad.wmnet
  • 14:13 urbanecm@deploy2002: matmarex, chlod, urbanecm, anzx: Continuing with sync
  • 14:11 urbanecm@deploy2002: matmarex, chlod, urbanecm, anzx: Backport for Increase Nuke max age to 90 days (T380846), knwiki: remove module namespace names from core-Namespaces.php (T346583), Remove temporary fix for badly set CentralAuth cookies synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:05 urbanecm@deploy2002: Started scap sync-world: Backport for Increase Nuke max age to 90 days (T380846), knwiki: remove module namespace names from core-Namespaces.php (T346583), Remove temporary fix for badly set CentralAuth cookies
  • 13:57 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 13:41 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.restart_sanitarium (exit_code=99) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:41 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:40 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.restart_sanitarium (exit_code=0) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:40 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:39 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.restart_sanitarium (exit_code=99) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:39 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:35 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.restart_sanitarium (exit_code=99) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:34 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:34 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.restart_sanitarium (exit_code=0) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:34 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:33 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.restart_sanitarium (exit_code=0) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:33 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:32 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.restart_sanitarium (exit_code=0) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:32 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:30 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.restart_sanitarium (exit_code=99) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:30 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:28 fabfur: upgrade haproxykafka to version 0.3.4 (https://gitlab.wikimedia.org/repos/sre/haproxykafka/-/commits/main?ref_type=heads) (T380583)
  • 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1022.eqiad.wmnet
  • 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1022.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 13:24 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1022.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 13:23 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
  • 13:22 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
  • 13:22 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
  • 13:22 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
  • 13:21 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:20 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 13:20 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:19 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.restart_sanitarium (exit_code=99) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:19 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:19 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
  • 13:18 jelto@deploy2002: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
  • 13:18 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.restart_sanitarium (exit_code=99) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:18 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1016.eqiad.wmnet with OS bookworm
  • 13:14 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1022.eqiad.wmnet
  • 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1012.eqiad.wmnet
  • 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 13:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 13:13 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.restart_sanitarium (exit_code=99) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:13 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:10 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 13:10 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.restart_sanitarium (exit_code=99) Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:10 arnaudb@cumin1002: START - Cookbook sre.mysql.restart_sanitarium Restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:06 jnuche@deploy2002: Installation of scap version "4.132.0" completed for 1 hosts
  • 13:06 jnuche@deploy2002: Installing scap version "4.132.0" for 1 host(s)
  • 13:04 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1012.eqiad.wmnet
  • 12:57 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1016.eqiad.wmnet with reason: host reimage
  • 12:57 jnuche@deploy2002: Installing scap version "4.132.0" for 207 host(s)
  • 12:56 jnuche@deploy2002: Installation of scap version "4.132.0" completed for 1 hosts
  • 12:55 jnuche@deploy2002: Installing scap version "4.132.0" for 1 host(s)
  • 12:54 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1016.eqiad.wmnet with reason: host reimage
  • 12:54 jnuche@deploy2002: Installation of scap version "4.132.0" completed for 1 hosts
  • 12:53 jnuche@deploy2002: Installing scap version "4.132.0" for 1 host(s)
  • 12:47 klausman@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ml-lab1001.eqiad.wmnet
  • 12:37 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1016.eqiad.wmnet with OS bookworm
  • 12:36 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1015.eqiad.wmnet with OS bookworm
  • 12:35 klausman@cumin1002: START - Cookbook sre.hosts.reboot-single for host ml-lab1001.eqiad.wmnet
  • 12:18 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1015.eqiad.wmnet with reason: host reimage
  • 12:15 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1015.eqiad.wmnet with reason: host reimage
  • 11:58 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1015.eqiad.wmnet with OS bookworm
  • 11:53 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kubernetes1019.eqiad.wmnet wikikube-worker1015.eqiad.wmnet kubernetes1020.eqiad.wmnet wikikube-worker1016.eqiad.wmnet on all recursors
  • 11:53 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache kubernetes1019.eqiad.wmnet wikikube-worker1015.eqiad.wmnet kubernetes1020.eqiad.wmnet wikikube-worker1016.eqiad.wmnet on all recursors
  • 11:50 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1020 to wikikube-worker1016
  • 11:50 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1016
  • 11:49 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1016
  • 11:49 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:49 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1020 to wikikube-worker1016 - jelto@cumin1002"
  • 11:49 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1020 to wikikube-worker1016 - jelto@cumin1002"
  • 11:45 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:44 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1020 to wikikube-worker1016
  • 11:44 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1019 to wikikube-worker1015
  • 11:43 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1015
  • 11:42 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1015
  • 11:42 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:42 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1019 to wikikube-worker1015 - jelto@cumin1002"
  • 11:41 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1019 to wikikube-worker1015 - jelto@cumin1002"
  • 11:37 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:37 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1019 to wikikube-worker1015
  • 11:33 volans@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1061.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:32 volans@cumin1002: START - Cookbook sre.hosts.provision for host cloudvirt1061.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:31 topranks: pushing new nftables rules to cloudgw1001 to block abuse from paws T381078
  • 11:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es2025.codfw.wmnet with reason: cloning
  • 11:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es2025.codfw.wmnet with reason: cloning
  • 11:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2025 to clone es2046', diff saved to https://phabricator.wikimedia.org/P71497 and previous config saved to /var/cache/conftool/dbconfig/20241203-112015-marostegui.json
  • 10:49 volans: installed spicerack v9.0.0 on cumin[12]002
  • 10:42 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[1019-1020].eqiad.wmnet
  • 10:41 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[1019-1020].eqiad.wmnet
  • 10:30 volans@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1061.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 10:27 volans@cumin1002: START - Cookbook sre.hosts.provision for host cloudvirt1061.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 10:21 marostegui@cumin1002: dbctl commit (dc=all): 'es2041 (re)pooling @ 100%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71496 and previous config saved to /var/cache/conftool/dbconfig/20241203-102143-root.json
  • 10:19 volans@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Update hieradata from Netbox - volans@cumin2002"
  • 10:19 volans@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Update hieradata from Netbox - volans@cumin2002"
  • 10:16 robh@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti7004.magru.wmnet with OS bookworm
  • 10:16 robh@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - robh@cumin2002"
  • 10:16 bking@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
  • 10:16 bking@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin1002"
  • 10:06 marostegui@cumin1002: dbctl commit (dc=all): 'es2041 (re)pooling @ 75%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71495 and previous config saved to /var/cache/conftool/dbconfig/20241203-100638-root.json
  • 09:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 09:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 09:52 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1006.eqiad.wmnet
  • 09:52 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1006.eqiad.wmnet
  • 09:51 marostegui@cumin1002: dbctl commit (dc=all): 'es2041 (re)pooling @ 50%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71494 and previous config saved to /var/cache/conftool/dbconfig/20241203-095133-root.json
  • 09:40 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 09:38 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 09:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 09:36 marostegui@cumin1002: dbctl commit (dc=all): 'es2041 (re)pooling @ 25%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71493 and previous config saved to /var/cache/conftool/dbconfig/20241203-093627-root.json
  • 09:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 09:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 09:27 moritzm: rebalance Ganeti eqiad/A following server refreshes
  • 09:24 moritzm: removing ganeti1009 from active Ganeti nodes T378921
  • 09:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1009.eqiad.wmnet
  • 09:21 marostegui@cumin1002: dbctl commit (dc=all): 'es2041 (re)pooling @ 10%: Pooling in production', diff saved to https://phabricator.wikimedia.org/P71492 and previous config saved to /var/cache/conftool/dbconfig/20241203-092122-root.json
  • 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: parse2017.codfw.wmnet
  • 08:45 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: parse2017.codfw.wmnet
  • 08:37 elukey@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be1091.eqiad.wmnet with OS bullseye
  • 08:35 urbanecm@deploy2002: Finished scap sync-world: Backport for Growth: enable temporary Surfacing Alpha on pilot wikis (T379976) (duration: 21m 30s)
  • 08:34 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1091.eqiad.wmnet with OS bullseye
  • 08:32 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:31 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:27 moritzm: installing unbound security updates
  • 08:26 urbanecm@deploy2002: urbanecm, migr: Continuing with sync
  • 08:21 urbanecm@deploy2002: urbanecm, migr: Backport for Growth: enable temporary Surfacing Alpha on pilot wikis (T379976) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:14 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db1213 from dbctl T375593', diff saved to https://phabricator.wikimedia.org/P71489 and previous config saved to /var/cache/conftool/dbconfig/20241203-081434-marostegui.json
  • 08:13 urbanecm@deploy2002: Started scap sync-world: Backport for Growth: enable temporary Surfacing Alpha on pilot wikis (T379976)
  • 08:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1217.eqiad.wmnet with reason: Moving to m3
  • 08:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db1217.eqiad.wmnet with reason: Moving to m3
  • 08:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1213.eqiad.wmnet with reason: Moving to m3
  • 08:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db1213.eqiad.wmnet with reason: Moving to m3
  • 08:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1213', diff saved to https://phabricator.wikimedia.org/P71487 and previous config saved to /var/cache/conftool/dbconfig/20241203-080726-marostegui.json
  • 07:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es2021.codfw.wmnet with reason: cloning
  • 07:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es2021.codfw.wmnet with reason: cloning
  • 07:57 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2021', diff saved to https://phabricator.wikimedia.org/P71486 and previous config saved to /var/cache/conftool/dbconfig/20241203-075751-marostegui.json
  • 07:57 marostegui: Switchover es4 codfw master to es2022 dbmaint (this happened an hour ago) T381259
  • 07:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1009.eqiad.wmnet
  • 07:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1009.eqiad.wmnet
  • 07:21 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1009.eqiad.wmnet
  • 06:41 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 06:41 ryankemper@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change VIPs for wdqs-internal-main and wdqs-internal-scholarly to avoid mw-parsoid collision - ryankemper@cumin2002"
  • 06:41 ryankemper@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change VIPs for wdqs-internal-main and wdqs-internal-scholarly to avoid mw-parsoid collision - ryankemper@cumin2002"
  • 06:37 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
  • 06:34 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es2022 to es4 master T381259', diff saved to https://phabricator.wikimedia.org/P71485 and previous config saved to /var/cache/conftool/dbconfig/20241203-063408-marostegui.json
  • 06:32 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P71484 and previous config saved to /var/cache/conftool/dbconfig/20241203-063234-root.json
  • 06:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es2021.codfw.wmnet with reason: cloning
  • 06:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es2021.codfw.wmnet with reason: cloning
  • 06:17 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P71483 and previous config saved to /var/cache/conftool/dbconfig/20241203-061729-root.json
  • 06:10 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 06:10 ryankemper@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add VIPs for wdqs-internal-main and wdqs-internal-scholarly - ryankemper@cumin2002"
  • 06:10 ryankemper@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add VIPs for wdqs-internal-main and wdqs-internal-scholarly - ryankemper@cumin2002"
  • 06:08 marostegui@cumin1002: dbctl commit (dc=all): 'Add es2041 to es4 with just minimal weight T381259', diff saved to https://phabricator.wikimedia.org/P71482 and previous config saved to /var/cache/conftool/dbconfig/20241203-060847-marostegui.json
  • 06:06 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
  • 06:06 ryankemper: [Netbox] T379334 Aborted netbox sync cookbook due to wrong IPs for wdqs-internal-scholarly. Fixed in UI, re-running cookbook now
  • 06:06 marostegui@cumin1002: dbctl commit (dc=all): 'Add es2041 depooled T381259', diff saved to https://phabricator.wikimedia.org/P71481 and previous config saved to /var/cache/conftool/dbconfig/20241203-060614-marostegui.json
  • 06:06 ryankemper@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 06:02 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P71480 and previous config saved to /var/cache/conftool/dbconfig/20241203-060224-root.json
  • 06:00 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
  • 06:00 ryankemper: [Netbox] T379334 Added VIPs via UI for wdqs-internal-[main,scholarly].svc.[eqiad,codfw].wmnet
  • 05:47 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P71479 and previous config saved to /var/cache/conftool/dbconfig/20241203-054718-root.json
  • 05:44 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on wdqs[2018-2020,2026-2027].codfw.wmnet with reason: T376150 non-prod hosts
  • 05:44 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 12:00:00 on wdqs[2018-2020,2026-2027].codfw.wmnet with reason: T376150 non-prod hosts
  • 05:17 eileen: config revision changed from b3741848 to 694158ae
  • 05:17 eileen: civicrm upgraded from be7e5d33 to 6361a578
  • 05:01 mwpresync@deploy2002: Pruned MediaWiki: 1.44.0-wmf.3 (duration: 01m 27s)
  • 04:51 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.44.0-wmf.6 refs T375665 (duration: 48m 24s)
  • 04:02 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.44.0-wmf.6 refs T375665
  • 02:56 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be1091.eqiad.wmnet with OS bullseye
  • 02:37 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1084.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 02:36 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1084.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 02:20 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 02:20 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt ms-be1084 - vriley@cumin1002"
  • 02:20 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt ms-be1084 - vriley@cumin1002"
  • 02:16 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 01:53 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:47 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:38 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:38 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:36 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1091.eqiad.wmnet with OS bullseye
  • 01:35 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:35 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:26 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be1091.eqiad.wmnet with OS bullseye
  • 01:26 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:26 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:22 pt1979@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:22 pt1979@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:34 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:34 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be1091.eqiad.wmnet with OS bullseye
  • 00:34 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:32 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:32 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:31 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:30 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:25 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1091.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:15 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1091.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART

2024-12-02

  • 23:58 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1091.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 23:50 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1091.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 23:50 jclark@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 23:50 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added mgmt for ms-be - jclark@cumin1002"
  • 23:50 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added mgmt for ms-be - jclark@cumin1002"
  • 23:46 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 22:27 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:27 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:27 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:26 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:25 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:24 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:21 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:21 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:20 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:20 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:18 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:18 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:16 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:16 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:05 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:05 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:45 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:40 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:36 urbanecm@deploy2002: Finished scap sync-world: Backport for testwiki: no growth experiment anymore (T380659), fix(surfacing): don't redirect to desktop (duration: 13m 22s)
  • 21:35 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:29 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:29 urbanecm@deploy2002: migr, urbanecm: Continuing with sync
  • 21:27 urbanecm@deploy2002: migr, urbanecm: Backport for testwiki: no growth experiment anymore (T380659), fix(surfacing): don't redirect to desktop synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:22 urbanecm@deploy2002: Started scap sync-world: Backport for testwiki: no growth experiment anymore (T380659), fix(surfacing): don't redirect to desktop
  • 21:21 urbanecm@deploy2002: Finished scap sync-world: Backport for Enable VisualEditor by default on Indonesian Wikiquote (T381214), votewiki, testwiki: add securepoll-edit-poll to electionadmin (T377531), cawiki: stop Flow being the default for some talk namespaces (T381295) (duration: 13m 40s)
  • 21:15 urbanecm@deploy2002: kemayo, urbanecm, nmw03, sd: Continuing with sync
  • 21:14 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:12 urbanecm@deploy2002: kemayo, urbanecm, nmw03, sd: Backport for Enable VisualEditor by default on Indonesian Wikiquote (T381214), votewiki, testwiki: add securepoll-edit-poll to electionadmin (T377531), cawiki: stop Flow being the default for some talk namespaces (T381295) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:12 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ms-be1083.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:09 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:09 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt ms-be1083 - vriley@cumin1002"
  • 21:09 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt ms-be1083 - vriley@cumin1002"
  • 21:08 urbanecm@deploy2002: Started scap sync-world: Backport for Enable VisualEditor by default on Indonesian Wikiquote (T381214), votewiki, testwiki: add securepoll-edit-poll to electionadmin (T377531), cawiki: stop Flow being the default for some talk namespaces (T381295)
  • 21:04 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 20:10 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3069.esams.wmnet [reason: done: checking icinga alerts]
  • 19:55 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp3069.esams.wmnet [reason: checking icinga alerts]
  • 19:22 volans@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 19:22 volans@cumin1002: START - Cookbook sre.hosts.provision for host sretest1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 19:21 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs3010.esams.wmnet
  • 19:20 volans@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 19:15 dancy@deploy2002: Installation of scap version "4.131.0" completed for 207 hosts
  • 19:14 sukhe@cumin1002: START - Cookbook sre.hosts.reboot-single for host lvs3010.esams.wmnet
  • 19:13 sukhe: rebooting lvs3010 to test CR 1093958
  • 19:11 dancy@deploy2002: Installing scap version "4.131.0" for 207 hosts
  • 19:07 sukhe: disable puppet on A:lvs to finish rolling out CR 1093958: T358260
  • 19:01 volans@cumin1002: START - Cookbook sre.hosts.provision for host sretest1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 18:39 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2242']
  • 18:39 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2241']
  • 18:39 urbanecm@deploy2002: Finished scap sync-world: Backport for ApiQueryLinkRecommendations: Do not use relative protocol URIs (T381277) (duration: 10m 35s)
  • 18:38 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2242']
  • 18:38 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2241']
  • 18:37 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 18:36 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
  • 18:35 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 18:35 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
  • 18:35 jiji@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 18:35 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 18:34 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 18:34 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
  • 18:34 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
  • 18:34 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
  • 18:34 jiji@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
  • 18:33 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop: apply
  • 18:33 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop: apply
  • 18:33 jiji@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
  • 18:33 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 18:32 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 18:32 urbanecm@deploy2002: urbanecm: Backport for ApiQueryLinkRecommendations: Do not use relative protocol URIs (T381277) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 18:32 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 18:32 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 18:31 jiji@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 18:31 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/benthos-cache-invalidator: apply
  • 18:31 jiji@deploy2002: helmfile [staging] START helmfile.d/services/benthos-cache-invalidator: apply
  • 18:28 urbanecm@deploy2002: Started scap sync-world: Backport for ApiQueryLinkRecommendations: Do not use relative protocol URIs (T381277)
  • 18:19 jiji@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 18:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2242.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 18:18 jiji@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 18:17 jiji@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 18:17 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2241.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 18:17 jiji@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 18:17 jiji@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 18:17 jiji@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 18:17 jiji@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 18:16 jiji@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 18:16 jiji@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 18:16 jiji@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 18:16 jiji@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 18:15 jiji@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 18:15 jiji@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 18:15 jiji@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 18:15 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 18:15 jiji@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 18:00 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 18:00 urbanecm@deploy2002: urbanecm: Backport for ApiQueryLinkRecommendations: Do not use relative protocol URIs (T381277) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:58 jiji@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-eqiad
  • 17:57 urbanecm@deploy2002: Started scap sync-world: Backport for ApiQueryLinkRecommendations: Do not use relative protocol URIs (T381277)
  • 17:54 urbanecm@deploy2002: Sync cancelled.
  • 17:54 urbanecm@deploy2002: urbanecm: Backport for ApiQueryLinkRecommendations: Do not use relative protocol URIs (T381277) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:54 fabfur@cumin1002: END (PASS) - Cookbook sre.dns.roll-restart (exit_code=0) rolling restart_daemons on A:dnsbox
  • 17:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1006.eqiad.wmnet with OS bookworm
  • 17:50 urbanecm@deploy2002: Started scap sync-world: Backport for ApiQueryLinkRecommendations: Do not use relative protocol URIs (T381277)
  • 17:48 jiji@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-eqiad
  • 17:48 dancy@deploy2002: Installation of scap version "4.129.0" completed for 207 hosts
  • 17:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2242.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2241.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:44 dancy@deploy2002: Installing scap version "4.129.0" for 207 hosts
  • 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2241-2 to codfw - jhancock@cumin2002"
  • 17:43 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2241-2 to codfw - jhancock@cumin2002"
  • 17:38 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 17:33 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1006.eqiad.wmnet with reason: host reimage
  • 17:31 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1006.eqiad.wmnet with reason: host reimage
  • 17:16 fabfur@cumin1002: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough and A:wikidough
  • 17:13 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1006.eqiad.wmnet with OS bookworm
  • 17:07 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "missing data for wikikube-worker1006 - jayme@cumin1002"
  • 17:07 topranks: resetting ulsfo->eqsin link to normal metric to put all codfw->eqsin traffic back on Arelion cct
  • 17:07 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "missing data for wikikube-worker1006 - jayme@cumin1002"
  • 17:03 fabfur@cumin1002: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling restart_daemons on A:wikidough and A:wikidough
  • 16:55 fabfur@cumin1002: START - Cookbook sre.dns.roll-restart rolling restart_daemons on A:dnsbox
  • 16:54 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 02m 28s)
  • 16:52 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 10m 36s)
  • 16:38 dancy@deploy2002: Installation of scap version "4.130.0" completed for 207 hosts
  • 16:34 dancy@deploy2002: Installing scap version "4.130.0" for 207 hosts
  • 16:32 jan_drewniak: starting portals deploy
  • 16:25 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1006.eqiad.wmnet with OS bookworm
  • 16:00 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wdqs1025.eqiad.wmnet']
  • 16:00 bking@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['wdqs1025.eqiad.wmnet']
  • 16:00 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wdqs1025.eqiad.wmnet']
  • 15:59 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wdqs1025.eqiad.wmnet']
  • 15:58 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host wdqs1025.eqiad.wmnet with OS bullseye
  • 15:50 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1025.eqiad.wmnet with OS bullseye
  • 15:47 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 15:46 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 15:42 volans: uploaded spicerack_9.0.0 to apt.wikimedia.org bullseye-wikimedia
  • 15:42 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 15:42 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 15:32 taavi@deploy2002: Finished scap sync-world: Backport for wikitech: Drop contentadmin group (T375950) (duration: 09m 42s)
  • 15:29 sukhe: sudo cumin -b1 -s10 "A:cp" 'run-puppet-agent --enable "merging CR 1091748"'
  • 15:26 taavi@deploy2002: taavi: Continuing with sync
  • 15:26 taavi@deploy2002: taavi: Backport for wikitech: Drop contentadmin group (T375950) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:24 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet [reason: [done] testing CR 1091748]
  • 15:22 taavi@deploy2002: Started scap sync-world: Backport for wikitech: Drop contentadmin group (T375950)
  • 15:17 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet [reason: testing CR 1091748]
  • 15:14 sukhe: sudo cumin "A:cp" 'disable-puppet "merging CR 1091748"' [trafficserver: remove inbound TLS and related settings]
  • 15:08 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1006.eqiad.wmnet with OS bookworm
  • 15:03 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kubernetes1018.eqiad.wmnet wikikube-worker1006.eqiad.wmnet on all recursors
  • 15:03 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache kubernetes1018.eqiad.wmnet wikikube-worker1006.eqiad.wmnet on all recursors
  • 15:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1018 to wikikube-worker1006
  • 15:01 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1006
  • 14:59 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1006
  • 14:59 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:58 marostegui: Deploy schema change on db1167 dbmaint eqiad - s8 sanitarium master, there will be days of lag in wikireplicas in s8 T367856
  • 14:57 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:55 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:53 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 14:50 sukhe: running authdns-update for CR 1099713
  • 14:44 jelto@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 14:43 urbanecm@deploy2002: Finished scap sync-world: Backport for [Growth] testwiki: Enable Surfacing structured tasks (T379976), Prepare for surfacing structured tasks (squashed) (T379976) (duration: 19m 08s)
  • 14:36 urbanecm@deploy2002: migr, urbanecm: Continuing with sync
  • 14:34 moritzm: installing curl security updates
  • 14:29 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:29 jiji@cumin1002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts mc-gp[1001-1003].eqiad.wmnet
  • 14:29 jiji@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 14:28 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1018 to wikikube-worker1006
  • 14:27 urbanecm@deploy2002: migr, urbanecm: Backport for [Growth] testwiki: Enable Surfacing structured tasks (T379976), Prepare for surfacing structured tasks (squashed) (T379976) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:27 jiji@cumin1002: START - Cookbook sre.dns.netbox
  • 14:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:24 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:24 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:23 urbanecm@deploy2002: Started scap sync-world: Backport for [Growth] testwiki: Enable Surfacing structured tasks (T379976), Prepare for surfacing structured tasks (squashed) (T379976)
  • 14:17 urbanecm@deploy2002: Finished scap sync-world: Backport for Drop $wgWikimediaCampaignEventsEnableCommunityList (T380075) (duration: 14m 37s)
  • 14:11 urbanecm@deploy2002: urbanecm, daimona: Continuing with sync
  • 14:08 jiji@cumin1002: START - Cookbook sre.hosts.decommission for hosts mc-gp[1001-1003].eqiad.wmnet
  • 14:07 urbanecm@deploy2002: urbanecm, daimona: Backport for Drop $wgWikimediaCampaignEventsEnableCommunityList (T380075) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:03 urbanecm@deploy2002: Started scap sync-world: Backport for Drop $wgWikimediaCampaignEventsEnableCommunityList (T380075)
  • 14:00 moritzm: removing ganeti1020 from active Ganeti nodes T378921
  • 13:57 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for kafka-main1007.eqiad.wmnet
  • 13:57 jiji@cumin1002: START - Cookbook sre.hosts.remove-downtime for kafka-main1007.eqiad.wmnet
  • 13:51 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kafka-main[1003,1008].eqiad.wmnet with reason: Hardware refresh
  • 13:51 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kafka-main[1003,1008].eqiad.wmnet with reason: Hardware refresh
  • 13:46 marostegui@cumin1002: dbctl commit (dc=all): 'db1198 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P71471 and previous config saved to /var/cache/conftool/dbconfig/20241202-134648-root.json
  • 13:46 isaranto@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 13:42 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 13:42 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 13:41 isaranto@deploy2002: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 13:37 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kafka-main[1002,1007].eqiad.wmnet with reason: Hardware refresh
  • 13:37 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kafka-main[1002,1007].eqiad.wmnet with reason: Hardware refresh
  • 13:31 marostegui@cumin1002: dbctl commit (dc=all): 'db1198 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P71470 and previous config saved to /var/cache/conftool/dbconfig/20241202-133143-root.json
  • 13:31 effie: replacing kafka-main1003 in production with kafka-main1008 - T363214
  • 13:30 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes1018.eqiad.wmnet
  • 13:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:29 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes1018.eqiad.wmnet
  • 13:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:24 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc-gp[2002-2003].codfw.wmnet
  • 13:24 jiji@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:24 jiji@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc-gp[2002-2003].codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1002"
  • 13:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:21 jiji@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc-gp[2002-2003].codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1002"
  • 13:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:18 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1005.eqiad.wmnet
  • 13:18 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1005.eqiad.wmnet
  • 13:17 jiji@cumin1002: START - Cookbook sre.dns.netbox
  • 13:16 marostegui@cumin1002: dbctl commit (dc=all): 'db1198 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P71469 and previous config saved to /var/cache/conftool/dbconfig/20241202-131638-root.json
  • 13:06 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 13:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 13:01 jiji@cumin1002: START - Cookbook sre.hosts.decommission for hosts mc-gp[2002-2003].codfw.wmnet
  • 13:01 marostegui@cumin1002: dbctl commit (dc=all): 'db1198 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P71467 and previous config saved to /var/cache/conftool/dbconfig/20241202-130132-root.json
  • 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1020.eqiad.wmnet
  • 12:22 topranks: re-routing traffic from Drmrs towards TECHLIB-TCZ - AS2852 - National Library of Technology, Prague, to avoid path via GEANT
  • 12:18 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2005-2006].codfw.wmnet
  • 12:18 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2005-2006].codfw.wmnet
  • 12:13 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc-gp2001.codfw.wmnet
  • 12:13 jiji@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:13 jiji@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc-gp2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1002"
  • 12:06 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2005.codfw.wmnet with OS bookworm
  • 12:05 jiji@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc-gp2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1002"
  • 12:04 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 12:02 jiji@cumin1002: START - Cookbook sre.dns.netbox
  • 12:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1005.eqiad.wmnet with OS bookworm
  • 11:57 moritzm: upload mapnik 4.0.3+ds-2~wmf12u2 (adding a forward ported mapnik-config script to be consumed by node-mapnik even with the switch of mapnik 4 towards pkg-config) T327396
  • 11:56 jiji@cumin1002: START - Cookbook sre.hosts.decommission for hosts mc-gp2001.codfw.wmnet
  • 11:56 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2006.codfw.wmnet with OS bookworm
  • 11:55 marostegui: Stop mariadb on es2020 to clone es2041 T381259
  • 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ms-be1070.eqiad.wmnet
  • 11:52 mvernon@cumin2002: START - Cookbook sre.hosts.remove-downtime for ms-be1070.eqiad.wmnet
  • 11:46 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2005.codfw.wmnet with reason: host reimage
  • 11:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ms-be1070.eqiad.wmnet with reason: vacuum two overlarge container dbs
  • 11:45 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on ms-be1070.eqiad.wmnet with reason: vacuum two overlarge container dbs
  • 11:42 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1005.eqiad.wmnet with reason: host reimage
  • 11:42 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2005.codfw.wmnet with reason: host reimage
  • 11:38 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1005.eqiad.wmnet with reason: host reimage
  • 11:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es2020.codfw.wmnet with reason: cloning
  • 11:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es2020.codfw.wmnet with reason: cloning
  • 11:36 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2006.codfw.wmnet with reason: host reimage
  • 11:33 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2006.codfw.wmnet with reason: host reimage
  • 11:26 ladsgroup@deploy2002: Finished scap sync-world: Backport for Translate: Disable message group subscription feature for legalteamwiki (T372386 T381250) (duration: 11m 21s)
  • 11:23 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2005
  • 11:23 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2005
  • 11:23 topranks: rollback OSPF metric change on cr4-ulsfo to place all codfw to eqsin traffic back on primary transport link
  • 11:22 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1005.eqiad.wmnet with OS bookworm
  • 11:21 marostegui@cumin2002: dbctl commit (dc=all): 'Depool es2020 T381259', diff saved to https://phabricator.wikimedia.org/P71463 and previous config saved to /var/cache/conftool/dbconfig/20241202-112105-marostegui.json
  • 11:19 ladsgroup@deploy2002: abi, ladsgroup: Continuing with sync
  • 11:19 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2005
  • 11:19 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2005.codfw.wmnet 40.32.192.10.in-addr.arpa 0.4.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:19 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2005.codfw.wmnet 40.32.192.10.in-addr.arpa 0.4.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:19 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:19 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2005 - jayme@cumin2002"
  • 11:19 ladsgroup@deploy2002: abi, ladsgroup: Backport for Translate: Disable message group subscription feature for legalteamwiki (T372386 T381250) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:19 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2005 - jayme@cumin2002"
  • 11:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 11:15 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2088.codfw.wmnet with OS bullseye
  • 11:15 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kubernetes1017.eqiad.wmnet wikikube-worker1005.eqiad.wmnet on all recursors
  • 11:15 ladsgroup@deploy2002: Started scap sync-world: Backport for Translate: Disable message group subscription feature for legalteamwiki (T372386 T381250)
  • 11:14 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache kubernetes1017.eqiad.wmnet wikikube-worker1005.eqiad.wmnet on all recursors
  • 11:14 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 11:14 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2005
  • 11:13 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2006
  • 11:13 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2006
  • 11:13 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2006
  • 11:13 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2006.codfw.wmnet 141.32.192.10.in-addr.arpa 1.4.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:13 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2006.codfw.wmnet 141.32.192.10.in-addr.arpa 1.4.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:13 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:13 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2006 - jayme@cumin2002"
  • 11:13 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2006 - jayme@cumin2002"
  • 11:09 ladsgroup@deploy2002: Started scap sync-world: Backport for Translate: Disable message group subscription feature for legalteamwiki (T372386 T381250)
  • 11:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 11:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:07 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 11:07 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1017 to wikikube-worker1005
  • 11:06 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1005
  • 11:05 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 11:05 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1005
  • 11:05 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:05 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1017 to wikikube-worker1005 - jelto@cumin1002"
  • 11:04 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1017 to wikikube-worker1005 - jelto@cumin1002"
  • 11:02 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2005.codfw.wmnet with OS bookworm
  • 11:02 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2006
  • 11:02 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2006.codfw.wmnet with OS bookworm
  • 11:01 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2005.codfw.wmnet wikikube-worker2006.codfw.wmnet on all recursors
  • 11:01 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2005.codfw.wmnet wikikube-worker2006.codfw.wmnet on all recursors
  • 11:00 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:00 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1017 to wikikube-worker1005
  • 10:55 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2437 to wikikube-worker2006
  • 10:55 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2006
  • 10:54 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2006
  • 10:54 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:54 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2437 to wikikube-worker2006 - jayme@cumin2002"
  • 10:54 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2437 to wikikube-worker2006 - jayme@cumin2002"
  • 10:52 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2436 to wikikube-worker2005
  • 10:52 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2088.codfw.wmnet with reason: host reimage
  • 10:51 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2005
  • 10:51 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:51 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2005
  • 10:51 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:51 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2436 to wikikube-worker2005 - jayme@cumin2002"
  • 10:50 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2436 to wikikube-worker2005 - jayme@cumin2002"
  • 10:48 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2088.codfw.wmnet with reason: host reimage
  • 10:46 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2437 to wikikube-worker2006
  • 10:46 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:46 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2436 to wikikube-worker2005
  • 10:45 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes1017.eqiad.wmnet
  • 10:45 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes1017.eqiad.wmnet
  • 10:44 ladsgroup@deploy2002: Finished scap sync-world: Backport for Enable new ParserCache key schema on every page (T373037) (duration: 17m 25s)
  • 10:38 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1017.eqiad.wmnet with OS bookworm
  • 10:37 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 10:35 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be2088.codfw.wmnet with OS bullseye
  • 10:33 ladsgroup@deploy2002: ladsgroup: Backport for Enable new ParserCache key schema on every page (T373037) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 10:32 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 10:32 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 10:31 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2087.codfw.wmnet with OS bullseye
  • 10:26 ladsgroup@deploy2002: Started scap sync-world: Backport for Enable new ParserCache key schema on every page (T373037)
  • 10:16 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2437.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 10:16 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2436.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 10:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1017.eqiad.wmnet with reason: host reimage
  • 10:12 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1017.eqiad.wmnet with reason: host reimage
  • 10:12 marostegui: Deploy schema change on db1167 - s8 sanitarium master, there will be days of lag in wikireplicas in s8 T367856
  • 10:12 marostegui@cumin2002: dbctl commit (dc=all): 'Depool db1167 for an alter table', diff saved to https://phabricator.wikimedia.org/P71461 and previous config saved to /var/cache/conftool/dbconfig/20241202-101225-marostegui.json
  • 10:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: alter
  • 10:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: alter
  • 10:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 8:00:00 on db1167.eqiad.wmnet with reason: alter
  • 10:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 3 days, 8:00:00 on db1167.eqiad.wmnet with reason: alter
  • 10:09 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2087.codfw.wmnet with reason: host reimage
  • 10:05 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2087.codfw.wmnet with reason: host reimage
  • 10:04 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2437.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 10:03 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2436.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:56 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host pc1017.eqiad.wmnet with OS bookworm
  • 09:52 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on mw[2436-2437].codfw.wmnet with reason: rename/reimage
  • 09:52 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on mw[2436-2437].codfw.wmnet with reason: rename/reimage
  • 09:52 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be2087.codfw.wmnet with OS bullseye
  • 09:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:48 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[2436-2437].codfw.wmnet
  • 09:47 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[2436-2437].codfw.wmnet
  • 09:45 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2086.codfw.wmnet with OS bullseye
  • 09:45 elukey@deploy2002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 09:45 elukey@deploy2002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: sync
  • 09:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: optimizing
  • 09:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: optimizing
  • 09:42 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 09:41 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: sync
  • 09:39 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:39 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:35 marostegui: Installing mariadb 10.6.20 on db1198 T378940
  • 09:28 marostegui@cumin2002: dbctl commit (dc=all): 'Depoll db1198 to install 10.6.20', diff saved to https://phabricator.wikimedia.org/P71460 and previous config saved to /var/cache/conftool/dbconfig/20241202-092854-marostegui.json
  • 09:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1198.eqiad.wmnet with reason: testing
  • 09:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db1198.eqiad.wmnet with reason: testing
  • 09:24 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2086.codfw.wmnet with reason: host reimage
  • 09:20 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2086.codfw.wmnet with reason: host reimage
  • 09:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1020.eqiad.wmnet
  • 09:09 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be2086.codfw.wmnet with OS bullseye
  • 08:59 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2086.codfw.wmnet with OS bullseye
  • 08:52 dcausse: restarting blazegraph on wdqs1019 (BlazegraphFreeAllocatorsDecreasingRapidly)
  • 08:49 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:49 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:47 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:47 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:45 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:44 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:42 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:42 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:36 kartik@deploy2002: Finished scap sync-world: Backport for Translate: Enable message group subscription feature for some wikis (T372386) (duration: 23m 39s)
  • 08:35 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be2086.codfw.wmnet with OS bullseye
  • 08:29 kartik@deploy2002: abi, kartik: Continuing with sync
  • 08:25 kartik@deploy2002: abi, kartik: Backport for Translate: Enable message group subscription feature for some wikis (T372386) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:19 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:19 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:19 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 08:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 08:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1020.eqiad.wmnet
  • 08:12 kartik@deploy2002: Started scap sync-world: Backport for Translate: Enable message group subscription feature for some wikis (T372386)
  • 08:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1020.eqiad.wmnet
  • 08:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1009.eqiad.wmnet
  • 08:07 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1009.eqiad.wmnet
  • 08:00 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on pc1017.eqiad.wmnet with reason: T378068, host is not pooled
  • 08:00 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on pc1017.eqiad.wmnet with reason: T378068, host is not pooled
  • 08:00 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on pc1013.eqiad.wmnet with reason: T373037, host is not pooled
  • 08:00 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on pc1013.eqiad.wmnet with reason: T373037, host is not pooled
  • 05:20 TimStarling: foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol=https
  • 05:14 TimStarling: on mwmaint2002: mwscript extensions/Wikibase/lib/maintenance/populateSitesTable.php --wiki=idwikivoyage --force-protocol=https
  • 04:41 TimStarling: installed id.wikivoyage.org
  • 04:39 TimStarling: on db2123: grant alter ON `%wik%`.* TO `wikiadmin2023`@`10.%`
  • 04:26 tstarling@deploy2002: Finished scap sync-world: Backport for Create id.wikivoyage.org (T380726 T352113), Add messages for Indonesian Wikivoyage (idwikivoyage) (T380726) (duration: 31m 05s)
  • 04:13 tstarling@deploy2002: tstarling: Continuing with sync
  • 04:12 tstarling@deploy2002: tstarling: Backport for Create id.wikivoyage.org (T380726 T352113), Add messages for Indonesian Wikivoyage (idwikivoyage) (T380726) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 03:55 tstarling@deploy2002: Started scap sync-world: Backport for Create id.wikivoyage.org (T380726 T352113), Add messages for Indonesian Wikivoyage (idwikivoyage) (T380726)

2024-12-01

  • 23:53 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 23:52 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 23:52 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 23:52 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 23:52 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 23:52 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 23:50 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 23:50 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 23:50 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 23:50 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 23:50 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 23:50 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:17 ladsgroup@cumin1002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1156 gradually with 4 steps - Maint over (T381213)
  • 13:02 ladsgroup@cumin1002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1233 gradually with 4 steps - Maint over (T381213)
  • 12:31 ladsgroup@cumin1002: START - Cookbook sre.mysql.pool db1156 gradually with 4 steps - Maint over (T381213)
  • 12:16 ladsgroup@cumin1002: START - Cookbook sre.mysql.pool db1233 gradually with 4 steps - Maint over (T381213)
  • 12:10 ladsgroup@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1156.eqiad.wmnet onto db1233.eqiad.wmnet
  • 10:45 ladsgroup@cumin1002: START - Cookbook sre.mysql.clone of db1156.eqiad.wmnet onto db1233.eqiad.wmnet
  • 10:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depool to reclone (T381213)', diff saved to https://phabricator.wikimedia.org/P71451 and previous config saved to /var/cache/conftool/dbconfig/20241201-104441-ladsgroup.json
  • 06:18 marostegui@cumin2002: dbctl commit (dc=all): 'Depoll db1233', diff saved to https://phabricator.wikimedia.org/P71450 and previous config saved to /var/cache/conftool/dbconfig/20241201-061841-marostegui.json


Archives

See Server Admin Log/Archives.