Server Admin Log/Archive 34

From Wikitech

2018-04-30

  • 23:38 catrope@tin: Synchronized php-1.32.0-wmf.1/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.DiffPage.init.js: T192755 (duration: 00m 59s)
  • 23:36 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Set $wgKartographerUsePageLanguage to false everywhere (T192955) (duration: 00m 59s)
  • 23:33 catrope@tin: Synchronized php-1.32.0-wmf.1/extensions/AbuseFilter/includes/AbuseFilter.php: Fix notices when disallowing edits (duration: 00m 59s)
  • 23:21 catrope@tin: Synchronized wmf-config/: Use internal cluster for SPARQL services (T192942) (duration: 01m 02s)
  • 23:14 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Config cleanup patches from SWAT (duration: 01m 00s)
  • 23:05 mutante: ores1001: rm -rf /srv/deployment/ores/venv/ (T193422)
  • 21:46 ebernhardson: T192972 increase eqiad elasticsearch disk watermarks from 75/80 to 85/85
  • 20:27 arlolra: Updated Parsoid to 50b0588 (T186358, T191700, T192909)
  • 20:22 awight@tin: Started deploy [ores/deploy@bf182e2]: ORES: Include bot edits in precaching wikidata itemquality; T187927
  • 20:21 arlolra@tin: Finished deploy [parsoid/deploy@d8d7b42]: Updating Parsoid to 50b0588 (duration: 09m 46s)
  • 20:20 awight@tin: Finished deploy [ores/deploy@5b27205]: Rollback ores1001 to master (duration: 02m 56s)
  • 20:19 bsitzmann@tin: Finished deploy [mobileapps/deploy@d3724d2]: Update mobileapps to cc00cae (T191869) (duration: 07m 32s)
  • 20:17 awight@tin: Started deploy [ores/deploy@5b27205]: Rollback ores1001 to master
  • 20:11 bsitzmann@tin: Started deploy [mobileapps/deploy@d3724d2]: Update mobileapps to cc00cae (T191869)
  • 20:11 arlolra@tin: Started deploy [parsoid/deploy@d8d7b42]: Updating Parsoid to 50b0588
  • 20:09 awight@tin: Finished deploy [ores/deploy@4601497]: Trial LFS deployment to ORES canary; T181678 (take 2) (duration: 02m 10s)
  • 20:06 awight@tin: Started deploy [ores/deploy@4601497]: Trial LFS deployment to ORES canary; T181678 (take 2)
  • 20:06 ppchelko@tin: Finished deploy [changeprop/deploy@8cd45ed]: Don't filter bots from the ORES stream T187927 (duration: 01m 15s)
  • 20:05 ppchelko@tin: Started deploy [changeprop/deploy@8cd45ed]: Don't filter bots from the ORES stream T187927
  • 19:10 awight@tin: Finished deploy [ores/deploy@25579e7]: Trial LFS deployment to ORES canary; T181678 (duration: 02m 06s)
  • 19:08 awight@tin: Started deploy [ores/deploy@25579e7]: Trial LFS deployment to ORES canary; T181678
  • 19:01 mutante: hafnium - sudo service navtiming stop; sudo service statsv stop - downtimed in icinga, decom
  • 18:27 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: T191584 (duration: 01m 00s)
  • 18:23 awight@tin: Finished deploy [ores/deploy@5b27205]: Rollback ORES canary to master (duration: 00m 21s)
  • 18:22 awight@tin: Started deploy [ores/deploy@5b27205]: Rollback ORES canary to master
  • 18:17 catrope@tin: Synchronized php-1.32.0-wmf.1/extensions/CodeMirror/resources/modules/ve-cm/ve.ui.CodeMirrorAction.js: T191923 (duration: 01m 00s)
  • 18:16 ottomata: starting rolling reimage of kafka main-eqiad brokers kafka100[123] - T192832
  • 18:06 awight@tin: Finished deploy [ores/deploy@46824bb]: Canary-only test deployment for ORES + git-lfs, T181678 (take 2) (duration: 01m 58s)
  • 18:04 awight@tin: Started deploy [ores/deploy@46824bb]: Canary-only test deployment for ORES + git-lfs, T181678 (take 2)
  • 17:41 awight@tin: Finished deploy [ores/deploy@8c586ab]: Canary-only test deployment for ORES + git-lfs, T181678 (duration: 01m 59s)
  • 17:39 ariel@tin: Finished deploy [dumps/dumps@8398f53]: write checksums of dump files into separate hashfiles, reusing their contents as appropriate (duration: 00m 03s)
  • 17:39 ariel@tin: Started deploy [dumps/dumps@8398f53]: write checksums of dump files into separate hashfiles, reusing their contents as appropriate
  • 17:39 awight@tin: Started deploy [ores/deploy@8c586ab]: Canary-only test deployment for ORES + git-lfs, T181678
  • 17:26 gehel: restart blazegraph and updater on wdqs1003 to activate UseNUMA -T193365
  • 17:15 gehel@tin: Finished deploy [wdqs/wdqs@2579bfa]: deploying wdqs gui (duration: 04m 16s)
  • 17:11 gehel@tin: Started deploy [wdqs/wdqs@2579bfa]: deploying wdqs gui
  • 17:10 gehel: removing stale scap log for wdqs on tin.eqiad.wmnet
  • 16:50 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch LocalRenameUserJob to EventBus for all wikis - T193254 T190327 (duration: 00m 59s)
  • 16:50 ppchelko@tin: Finished deploy [cpjobqueue/deploy@01630f2]: Switch LocalRenameUserJob for all wikis. T193254 (duration: 00m 49s)
  • 16:49 ppchelko@tin: Started deploy [cpjobqueue/deploy@01630f2]: Switch LocalRenameUserJob for all wikis. T193254
  • 15:53 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1056 and db1069 with full weight (duration: 00m 59s)
  • 15:31 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1056 and db1069 with low load (duration: 00m 59s)
  • 14:31 jynus: shutting down db1056 for upgrade/maintenance and cloning
  • 14:27 jynus@tin: Synchronized wmf-config/db-eqiad.php: Move db1069 from s7 to x1, depool db1056 (duration: 00m 59s)
  • 14:27 elukey: upgrade druid on druid100[1-3] from 0.9.2 to 0.10
  • 14:26 marostegui: Power off db2081 for HW maintenance - T193325
  • 14:17 gehel: rolling restart blazegraph on all wdqs nodes for new configuration - T192759
  • 13:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1076 after alter table (duration: 00m 59s)
  • 13:40 zeljkof: EU SWAT finished
  • 13:39 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow bureaucrats to remove flood group for real, allow flooders to strip the group from them (T193350) (duration: 00m 59s)
  • 13:30 zfilipin@tin: Synchronized php-1.32.0-wmf.1/extensions/AbuseFilter: SWAT: Don't use an empty string for block parameters (T189681) (duration: 01m 02s)
  • 13:30 marostegui: Poweroff db1098 for HW maintenance - T193331
  • 13:26 marostegui: Stop MySQL on db1098 - T193331
  • 13:21 ottomata: beginning rolling reimage of kafka200[23] to stretch T192832
  • 13:18 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RCPatrol in cswiki (T193242) (duration: 00m 59s)
  • 13:16 marostegui: Drop unused _old tables from a few wikis - https://phabricator.wikimedia.org/T54932#4167221
  • 13:13 gehel: restarting elasticsearch codfw rolling restart for plugin update and NUMA config - T191543 / T191236
  • 13:11 elukey: reimage analytics1049 and 1050 to Debian Stretch
  • 13:09 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable Datetime Selector on Special:Block on all wikis except Meta, MediaWiki, and German Wikipedia (T192962) (duration: 01m 00s)
  • 12:48 arturo: aborrero@labtestnet2001:~ $ sudo rm /var/log/upstart/nova-api.log.1 <--- disk full, logrotate refuses to work bc that
  • 10:34 vgutierrez: Updating puppet compiler facts
  • 10:30 vgutierrez: Repool (Re-enable BGP) lvs3001 - T191897
  • 10:06 elukey: restart hdfs namenode on analytics1002 to pick up new heap settings (last step of the maintenance)
  • 10:00 elukey: set analytics1001 as active HDFS Namenode using manual failover
  • 09:50 elukey: restart HDFS Namenode on analytics1001 (current standby) again with Xmx/Xms set to 8g
  • 09:47 elukey: restart HDFS Namenode on analytics1001 (current standby)
  • 09:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060, fully pool db1090 (duration: 00m 59s)
  • 09:15 ariel@tin: Finished deploy [dumps/dumps@a6baf69]: do not update existing rss feed file if the dump job it covers is more recent than the one for which a feed is requested (duration: 00m 04s)
  • 09:15 ariel@tin: Started deploy [dumps/dumps@a6baf69]: do not update existing rss feed file if the dump job it covers is more recent than the one for which a feed is requested
  • 09:03 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1090 (duration: 00m 59s)
  • 09:01 vgutierrez: Depool and reimage lvs3001 as stretch - T191897
  • 08:39 marostegui: Deploy schema change on db1076 - T191519 T188299 T190148
  • 08:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1076 for alter table (duration: 00m 59s)
  • 08:38 elukey: restart HDFS namenode on analytics1001 (standby master) to pick up new JVM settings - T193257
  • 08:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1074 after alter table (duration: 01m 00s)
  • 08:23 godog: swift eqiad-prod more weight to ms-be104[0-3] - T191896
  • 08:16 elukey: force a manual failover of the HDFS Namenode from analytics1001 to analytics1002 to test new GC Settings - T193257
  • 08:15 vgutierrez: Repool (Re-enable BGP) in lvs3002 - T191897
  • 08:02 jynus: stopping replication on both db1090 db instances to finish maintenance
  • 07:33 jynus: restarting dbstore1001@s1 to apply config change
  • 07:31 elukey: restart HDFS namenode on analytics1002 (standby master) to pick up new JVM settings - T193257
  • 07:06 marostegui: Restart replication on db1095:s3
  • 07:05 marostegui: Temporary stop replication on db1095:s3
  • 06:48 vgutierrez: Depool and reimage lvs3002 - T191897
  • 06:11 marostegui: Drop table edit_page_tracking from s3 - T57385
  • 06:04 marostegui: Drop table edit_page_tracking from s2 - T57385
  • 05:59 marostegui: Drop table edit_page_tracking from s1 - T57385
  • 05:50 marostegui: Drop table edit_page_tracking from s4, s5 and s7 - T57385
  • 05:47 marostegui: Drop table edit_page_tracking from s6 - T57385
  • 05:28 marostegui: Deploy schema change on db1074 - T191519 T188299 T190148
  • 05:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1074 for alter table (duration: 01m 09s)
  • 02:57 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.1) (duration: 08m 18s)

2018-04-29

  • 17:46 brion: rebuilding image metadata for PDFs on commons on terbium

2018-04-28

  • 23:42 volans@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098 (crashed) (duration: 01m 01s)
  • 15:49 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2081, crashed (duration: 01m 00s)
  • 05:19 apergos: reimaged snapshot1005 to stretch

2018-04-27

  • 22:45 mutante: mw2171,mw2172,mw2173 ff. - reinstalling with stretch and raid1-LVM
  • 22:07 hashar: Running quibble-vendor-mysql-php70-docker against ~ 900 MediaWiki extensions. Triggered with a custom gear-client.py script from contint1001. PID 29710
  • 19:58 tgr: T193254 ran fixStuckGlobalRename.php for: Aliya klein Hasselb Husseinzadeh02 Jswf845 Lorraine Fgr Mikeypugs0134 Ncanty STEEEPGlobal Sunlight me THOR Global Defense Group TPBox Zenas Gao אֲבִי גְדוֹר ぽっぽ大将軍
  • 18:16 mutante: mw2167,mw2168,mw2169 - reinstalling with stretch and raid1-lvm
  • 16:26 imarlier@tin: Finished deploy [performance/navtiming@c059a60]: Deploying navtiming.py with support for enable/disable via etcd (duration: 00m 05s)
  • 16:26 imarlier@tin: Started deploy [performance/navtiming@c059a60]: Deploying navtiming.py with support for enable/disable via etcd
  • 16:19 imarlier@tin: Finished deploy [statsv/statsv@d5108c4]: Update statsv to force the Kafka broker API version (duration: 00m 05s)
  • 16:19 imarlier@tin: Started deploy [statsv/statsv@d5108c4]: Update statsv to force the Kafka broker API version
  • 14:23 anomie: Running populateRevisionLength.php on group 2 for T192189
  • 13:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105 after alter table (duration: 00m 59s)
  • 11:41 moritzm: reimaging mwdebug2002 to stretch
  • 11:21 Amir1: ladsgroup@terbium:/var/log/wikidata$ mwscript updateCollation.php --wiki=fawiki --previous-collation=xx-uca-fa
  • 11:13 moritzm: installing uwsgi/Django security updates on graphite hosts in eqiad
  • 10:39 moritzm: installing uwsgi/Django security updates on graphite2001
  • 09:53 moritzm: reimaging mwdebug1001 to stretch
  • 08:58 elukey: reimage analytics10[51,53] to Debian Stretch
  • 08:46 moritzm: installing mysql 5.5 security update (distro-packaged version) on trusty
  • 08:14 moritzm: reimaging mwdebug2001 to stretch
  • 07:32 godog: swift eqiad-prod more weight to ms-be104[0-3] - T190081
  • 05:31 marostegui: Deploy schema change on db1105:3312 - T191519 T188299 T190148
  • 05:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105 for alter table (duration: 00m 59s)
  • 05:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1113 after alter table (duration: 01m 10s)
  • 05:08 cwd: killed some dedupe queries on staging that were causing alerts

2018-04-26

  • 23:31 reedy@tin: Synchronized php-1.32.0-wmf.1/extensions/PdfHandler/: (no justification provided) (duration: 01m 00s)
  • 23:16 reedy@tin: Synchronized php-1.32.0-wmf.1/extensions/UploadWizard/: (no justification provided) (duration: 01m 00s)
  • 23:10 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 01m 00s)
  • 22:44 ebernhardson: start test measuring elasticsearch master mutation latency in codfw
  • 22:38 Jeff_Green: deployed DNS update for frbast1001.wikimedia.org
  • 22:21 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/429100/ (duration: 01m 00s)
  • 22:11 maxsem@tin: Finished scap: Deploy ACW to test wikis, https://gerrit.wikimedia.org/r/429017 / T192455 (duration: 57m 06s)
  • 21:14 maxsem@tin: Started scap: Deploy ACW to test wikis, https://gerrit.wikimedia.org/r/429017 / T192455
  • 21:13 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/429017 (duration: 00m 59s)
  • 21:05 maxsem@tin: Synchronized php-1.32.0-wmf.1/extensions/ArticleCreationWorkflow/: https://gerrit.wikimedia.org/r/#/c/429111/ (duration: 01m 00s)
  • 20:29 hashar: contint1001: cleaned up old Docker images produced by docker-pkg
  • 20:09 demon@tin: rebuilt and synchronized wikiversions files: group2 to wmf.1
  • 18:12 ottomata: reimaging (some?) kafka200* codfw main kafka nodes to stretch T192832
  • 17:27 awight@tin: Finished deploy [ores/deploy@5b27205]: ORES: update to revscoring 2.2.2, T192917 (duration: 21m 20s)
  • 17:09 ottomata: applying compression_type=snappy to eventbus service kafka producer
  • 17:05 awight@tin: Started deploy [ores/deploy@5b27205]: ORES: update to revscoring 2.2.2, T192917
  • 17:00 moritzm: installing systemd SUA update for stretch
  • 16:28 jynus@tin: Synchronized wmf-config/db-eqiad.php: Fix comment, test scap (duration: 01m 12s)
  • 16:03 mobrovac@tin: Synchronized wmf-config/jobqueue.php: JobQueue: Use EventBus for most jobs for test wikis - T190327 (duration: 01m 15s)
  • 16:03 ppchelko@tin: Finished deploy [cpjobqueue/deploy@bf34e00]: Enable all jobs for test, test2, testwikidata and mediawiki. T190327 (duration: 00m 51s)
  • 16:02 ppchelko@tin: Started deploy [cpjobqueue/deploy@bf34e00]: Enable all jobs for test, test2, testwikidata and mediawiki. T190327
  • 15:37 jynus@tin: Synchronized wmf-config/db-eqiad.php: Add db1090 as multiinstance (duration: 01m 16s)
  • 15:36 jynus@tin: Synchronized wmf-config/db-codfw.php: Add db1090 as multiinstance (duration: 01m 17s)
  • 15:18 mutante: added LDAP user tschumann to "nda" group (T192549)
  • 14:53 ppchelko@tin: Finished deploy [changeprop/deploy@f2f7a84]: Commit offsets for non matched messages from time to time. (duration: 01m 26s)
  • 14:51 ppchelko@tin: Started deploy [changeprop/deploy@f2f7a84]: Commit offsets for non matched messages from time to time.
  • 14:26 anomie: Running populateRevisionLength.php on group 1 for T192189
  • 14:25 jynus: stop db1069 for cloning it away
  • 13:58 marostegui: Compress enwiki on db1116:3311 - T190704
  • 13:55 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1069, repool db1086 (duration: 01m 16s)
  • 13:35 zeljkof: EU SWAT finished
  • 13:31 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change chapcomwikis logo, add HD logo for chapcomwiki (T193024) (duration: 01m 16s)
  • 13:30 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Change chapcomwikis logo, add HD logo for chapcomwiki (T193024) (duration: 01m 16s)
  • 13:19 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add all Hindi projects plus meta as import sources for hiwikimedia (T188366) (duration: 01m 17s)
  • 13:09 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Fix pixelization of new wiki logos (T193028) (duration: 01m 17s)
  • 12:53 marostegui: Deploy schema change on db1113:3312 - T191519 T188299 T190148
  • 12:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1113 for alter table (duration: 01m 33s)
  • 12:51 gehel: reindexing lost updates on elasticsearch - T193112
  • 12:04 mobrovac@tin: Finished deploy [cpjobqueue/deploy@7fbb152]: Support the exclude_topics config stanza (duration: 01m 12s)
  • 12:03 mobrovac@tin: Started deploy [cpjobqueue/deploy@7fbb152]: Support the exclude_topics config stanza
  • 10:35 moritzm: reimaging mw1312 mw1317, mw1339 (API servers) to stretch
  • 10:29 moritzm: reimaging mw1269, mw1323, mw1324 (app servers) to stretch
  • 09:57 marostegui: Drop prefswitch_survey on s1 - T173439
  • 09:50 godog: eqiad-prod: more weight to ms-be104[0-3] for container/account - T190081
  • 09:45 marostegui: Drop prefswitch_survey on s3 - T173439
  • 09:41 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1109 with low load (duration: 01m 16s)
  • 09:30 marostegui: Drop prefswitch_survey on s7 - T173439
  • 09:16 marostegui: Drop prefswitch_survey on s2 - T173439
  • 09:15 mark: Temp disabling cr1-ulsfo:xe-1/2/0 (Chicago transport) due to stability issues
  • 09:13 marostegui: Drop prefswitch_survey on s4 - T173439
  • 09:02 marostegui: Drop prefswitch_survey on s5 and s6 - T173439
  • 09:01 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1086 (duration: 01m 16s)
  • 08:51 moritzm: reimaging mw1320, mw1321, mw1322 (app servers) to stretch
  • 08:32 moritzm: re-attempt reimage of mw1246 (failed yesterday with an error on the puppetmaster, testing whether this can be reproduced)
  • 08:24 jynus: stop and upgrade db1109
  • 07:58 marostegui: Deploy schema change on db1090 - T191519 T188299 T190148
  • 07:45 jynus: stopping db1090 mariadb instance to move its path, port and socket
  • 07:21 gehel: restarting redis masters in codfw - T193112
  • 07:20 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1090, pool db1122 with full weight (duration: 01m 23s)
  • 07:16 gehel: re-enabling puppet on rdb2* - T193112
  • 06:19 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=elasticsearch
  • 05:18 marostegui: Deploy schema change on dbstore1002:s2 - T191519 T188299 T190148
  • 04:39 ebernhardson: unfreeze writes to elasticsearch codfw cluster
  • 03:54 _joe_: stopping redis replication from eqiad to codfw for the jobqueue cluster, we have an issue ongoing with CirrusSearch jobs and replication is broken
  • 03:41 ejegg: re-enabled ingenico recurring charge job
  • 02:05 mutante: mw2163 through mw2166: since the wmf-auto-reimage failed after OS but before puppet run due to "Failed to puppet_generate_certs" i manually logged in with install-console and signed puppet certs (T174431)

2018-04-25

  • 22:55 samwilson@tin: Synchronized wmf-config/InitialiseSettings.php: Undeploy GlobalPreferences T184121 (duration: 01m 16s)
  • 22:21 samwilson@tin: Synchronized wmf-config/InitialiseSettings.php: Deploy GlobalPreferences T189806 (duration: 01m 18s)
  • 21:03 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.1
  • 21:01 demon@tin: Synchronized php: symlink bump (duration: 01m 16s)
  • 20:58 hasharAway: on tin: rebased php-1.31.0-wmf.30 for https://gerrit.wikimedia.org/r/#/c/429018/
  • 20:21 XioNoX: remove test VIP for eqiad ping offload server - T190090
  • 20:18 bsitzmann@tin: Finished deploy [mobileapps/deploy@5a4a282]: Config: Start up to 4 workers in parallel during start-up (duration: 06m 48s)
  • 20:11 bsitzmann@tin: Started deploy [mobileapps/deploy@5a4a282]: Config: Start up to 4 workers in parallel during start-up
  • 19:39 otto@tin: Finished deploy [eventlogging/eventbus@f0783bb]: Fix for logging error https://gerrit.wikimedia.org/r/#/c/428982/ (duration: 01m 45s)
  • 19:37 otto@tin: Started deploy [eventlogging/eventbus@f0783bb]: Fix for logging error https://gerrit.wikimedia.org/r/#/c/428982/
  • 19:12 urandom: altering timeline tables for 6 month TTL -- T192689
  • 19:11 otto@tin: Finished deploy [eventlogging/eventbus@f0783bb]: Fix for logging error https://gerrit.wikimedia.org/r/#/c/428982/ (duration: 00m 11s)
  • 19:11 otto@tin: Started deploy [eventlogging/eventbus@f0783bb]: Fix for logging error https://gerrit.wikimedia.org/r/#/c/428982/
  • 19:09 otto@tin: Started deploy [eventlogging/eventbus@f562c1b]: Fix for logging error https://gerrit.wikimedia.org/r/#/c/428982/
  • 18:55 imarlier@tin: Finished deploy [performance/coal@1e79c79]: deploy fix for coal-web (duration: 00m 06s)
  • 18:55 imarlier@tin: Started deploy [performance/coal@1e79c79]: deploy fix for coal-web
  • 18:16 ejegg: updated CiviCRM from 219798b2c5 to 47197006d5
  • 17:35 urandom: starting cleanups on row 'a' Cassandra nodes -- T189822
  • 17:33 mepps: update civicrm from 6ddeb167ec to 219798b2c5
  • 17:20 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change fawiki uca to the right one (duration: 01m 17s)
  • 17:11 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RemexHtml on frwikiquote T192301 (duration: 01m 17s)
  • 17:00 mutante: powercycling wdqs1004
  • 16:09 mutante: re-imaging mw2258, mw2163, mw2164 ff.
  • 15:28 jynus@tin: Synchronized wmf-config/db-eqiad.php: Pool db1122, db1090 with low load (duration: 01m 14s)
  • 15:22 anomie: Running populateRevisionLength.php on group 0 for T192189
  • 15:05 ottomata: temp disabling puppet, applying ipv6 mapped on kafka200*
  • 15:04 andrewbogott: adding labvirt1016 to the nova-compute scheduling pool
  • 14:37 elukey: restart hive-server2 on analytics1003 to pick up settings in https://gerrit.wikimedia.org/r/428919
  • 14:34 akosiaris: reboot bohrium T150532
  • 14:33 ema: cp3030: upgrade varnish to 5.1.3-1wm7 T192368
  • 14:12 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: depool poolcounter1002 T193025 (duration: 01m 16s)
  • 13:57 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: pool poolcounter1003 T187297 (duration: 01m 16s)
  • 13:53 Amir1: EU SWAT is done!
  • 13:53 ladsgroup@tin: Synchronized php-1.32.0-wmf.1/extensions/Wikibase/lib/includes/Changes: Make sure statements in EntityDiffChangedAspects are not passed around as stdClass (T192085) (duration: 01m 16s)
  • 13:49 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: repool poolcounter1001 T150532 (duration: 01m 16s)
  • 13:43 ladsgroup@tin: Synchronized php-1.31.0-wmf.30/extensions/Wikibase/lib/includes/Changes: Make sure statements in EntityDiffChangedAspects are not passed around as stdClass (T192085) (duration: 01m 17s)
  • 13:40 akosiaris: reboot poolcounter1001 for T150532
  • 13:30 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Mapframe for bgwiki (T192895) (duration: 01m 15s)
  • 13:23 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule for cswiki Wikipedia event (T192898) (duration: 01m 16s)
  • 13:19 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: depool poolcounter1001 T150532 (duration: 01m 17s)
  • 13:17 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule for cswiki Wikipedia event (T192898) (duration: 01m 16s)
  • 13:12 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Remove xx-uca-fa for Persian Wikis except Wikipedia (duration: 01m 17s)
  • 13:06 marostegui: Deploy schema change on s2 codfw master (db2035) - this will generate lag on codfw - T191519 T188299 T190148
  • 12:55 gehel: starting elasticsearch codfw rolling restart for plugin update and NUMA config - T191543 / T191236
  • 12:47 akosiaris: reboot puppetdb1001 for T150532
  • 12:08 moritzm: reimaging mw1251, mw1252, mw1253 (app servers) to stretch
  • 11:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Add db1122 (duration: 01m 16s)
  • 11:29 jynus@tin: Synchronized wmf-config/db-codfw.php: Add db1122 (duration: 03m 24s)
  • 11:19 moritzm: reimaging mw1228, mw1229, mw1230 (API servers) to stretch
  • 10:29 jynus: stopping replication, running optimize table on dbstore2001:s8
  • 09:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 (duration: 01m 16s)
  • 09:58 elukey: reimage analytics106[1,2] to Debian Stretch
  • 09:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1085 after alter table (duration: 01m 30s)
  • 09:09 jynus: stopping db1090 for maintenance
  • 08:51 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1090 (duration: 01m 17s)
  • 08:38 marostegui: Drop user_old and user_temp tables from s3 - T172664
  • 08:23 godog: eqiad-prod: add ms-be104[0-3] with minimal weight - T190081
  • 08:23 moritzm: reimaging mw1247, mw1248, mw1249 (app servers) to stretch
  • 07:35 marostegui: Deploy schema change on db1085 with replication (this will generate lag on labsdb hosts on s6) - T191519 T188299 T190148
  • 07:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1085 for alter table (duration: 01m 16s)
  • 07:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1113:3316 after alter table (duration: 01m 16s)
  • 07:05 akosiaris: starting a very slow rolling reboot of all VMs on codfw ganeti cluster, row_C nodegroup, excluding poolcounter1001 and puppetdb1001. T150532
  • 06:53 moritzm: reimaging mw1314, mw1315, mw1316 (API servers) to stretch
  • 06:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 (duration: 01m 21s)
  • afk: disabled ingenico recurring donation charge job
  • 02:55 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.30) (duration: 07m 23s)
  • 02:52 ejegg: turned fundraising queue consumers back on
  • 01:31 ejegg: disabled fundraising queue consumer jobs
  • 00:31 demon@tin: Synchronized multiversion/defines.php: rm unused defines (duration: 01m 16s)

2018-04-24

  • 23:33 legoktm@tin: Synchronized php-1.32.0-wmf.1/extensions/Kartographer/includes/Tag/MapFrame.php: MapFrame: Allow lang="local" to be passed (duration: 01m 17s)
  • 23:29 urandom: starting Cassandra bootstrap, restbase1010-c -- T189822
  • 23:08 mutante: mw2242.codfw , mw2255.codfw et al.. more stretch reinstalls going on
  • 23:04 demon@tin: Synchronized docroot/m.wikipedia.org/w/mobilelanding.php: unbreak multiversion loading for a totally useless script (duration: 01m 16s)
  • 22:55 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: touch (duration: 01m 18s)
  • 22:53 legoktm@tin: Synchronized wmf-config/CommonSettings.php: Fix wgTidyConfig and restore proper tidy & Remex config - T192855 (duration: 01m 16s)
  • 21:56 mutante: adding LDAP user 'bitpogo' to group 'wmde' (T191523)
  • 21:23 ejegg: re-enabled recurring donations queue consumer
  • 20:55 demon@tin: rebuilt and synchronized wikiversions files: group0 to 1.32.0-wmf.1
  • 20:27 urandom: starting Cassandra bootstrap, restbase1010-b -- T189822
  • 20:23 Dereckson: Run namespaceDupes on gorwiki
  • 20:03 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Use EventBus for all wikis but wikitech - T191464 (duration: 01m 26s)
  • 19:53 bblack: prometheus-fail switched to UNKNOWNs for now in https://gerrit.wikimedia.org/r/#/c/428725/ - may want to look at this further later, intent is to reduce odds of debilitating ops spam for the evening.
  • 19:49 elukey: re-enable ircecho
  • 19:40 demon@tin: Finished scap: bootstrap 1.32.0-wmf.1 (duration: 106m 55s)
  • 19:36 elukey: stop ircecho on einsteinium - icinga shower
  • 19:17 jgleeson: Updating civicrm from 142edbb90b to 6ddeb167ec
  • 18:54 ottomata: temp disabling puppet and applying profile::kafka::broker on kafka100* T192831
  • 17:53 demon@tin: Started scap: bootstrap 1.32.0-wmf.1
  • 17:52 gehel: restarting wdqs-updater on all nodes for prometheus jmx exporter update - T192768
  • 17:51 andrew@tin: Synchronized wmf-config/db-eqiad.php: Renaming 'm5' section to 'wikitech' for T189542, two of two (duration: 00m 59s)
  • 17:49 andrew@tin: Synchronized wmf-config/db-codfw.php: Renaming 'm5' section to 'wikitech' for T189542, one of two (duration: 00m 59s)
  • 17:42 ottomata: temp disabling puppet on kafka200* to apply profile::kafka::broker in main-codfw T192831
  • 17:39 demon@tin: Pruned MediaWiki: 1.31.0-wmf.29 [keeping static files] (duration: 06m 28s)
  • 17:35 XioNoX: removing firewall block on cr1/2-codfw - T175361
  • 17:35 XioNoX: removing firewall block on cr1-eqdfw - T175361
  • 17:29 bstorm_: added MCR tables to labsdb1009 (slots, slot_roles, content_models, content)
  • 17:04 mobrovac@tin: Finished deploy [restbase/deploy@fbce520]: Set the delete probability to 100% [deploy to restbase1010] - T192689 (duration: 02m 04s)
  • 17:02 mobrovac@tin: Started deploy [restbase/deploy@fbce520]: Set the delete probability to 100% [deploy to restbase1010] - T192689
  • 17:01 mobrovac@tin: Finished deploy [restbase/deploy@fbce520]: Set the delete probability to 100% - T192689 (duration: 05m 27s)
  • 16:57 urandom: starting Cassandra bootstrap, restbase1010-a -- T189822
  • 16:55 mobrovac@tin: Started deploy [restbase/deploy@fbce520]: Set the delete probability to 100% - T192689
  • 16:52 mobrovac@tin: Finished deploy [restbase/deploy@fbce520]: Set the delete probability to 100% - T192689 (duration: 11m 40s)
  • 16:45 marostegui: Deploy schema change on db1113:3316 - T191519 T188299 T190148
  • 16:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1113:3316 for alter table (duration: 00m 58s)
  • 16:40 mobrovac@tin: Started deploy [restbase/deploy@fbce520]: Set the delete probability to 100% - T192689
  • 16:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1098:3316 after alter table (duration: 00m 58s)
  • 16:30 elukey: restart hadoop hdfs journalnode on analytics1035/52 to pick up prometheus jmx settings
  • 16:11 elukey: restart hadoop-hdfs-journalnode on analytics1028 to pick up prometheus monitoring
  • 16:10 bstorm_: Added views for new MCR tables on labsdb1011 (slots, slot_roles, content and content_models)
  • 16:08 marostegui: Reload haproxy on dbproxy1010 to repool labsdb1011 - https://phabricator.wikimedia.org/T184446
  • 15:59 godog: reimage restbase1010 after ssd swap - T189822
  • 15:54 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1110 with full weight (duration: 00m 58s)
  • 14:41 elukey: restart hadoop hdfs journalnode on analytics1028 to pick up jmx settings
  • 14:40 sbisson@tin: Finished deploy [kartotherian/deploy@86da82d]: Deploy latest kartotherian with updated fallbacks and support lang=local (duration: 06m 29s)
  • 14:34 sbisson@tin: Started deploy [kartotherian/deploy@86da82d]: Deploy latest kartotherian with updated fallbacks and support lang=local
  • 14:02 Amir1: EU SWAT is done
  • 14:01 hoo@tin: Synchronized wmf-config/abusefilter.php: Grant Meta-Wiki sysops the ability to edit global abusefilter rules (T192722) (duration: 00m 59s)
  • 13:58 hoo@tin: Synchronized wmf-config/: Properly set default for $wmgWikibaseSiteGroup (T188456) (duration: 00m 58s)
  • 13:56 hoo@tin: Synchronized wmf-config/: Properly set default for $wmgWikibaseSiteGroup (T188456) (duration: 01m 00s)
  • 13:43 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Increase the timespan of rate limit in wikidata from 1m to 5m (T192690) (duration: 00m 58s)
  • 13:37 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Clean up old config for logging autopatrol actions (T184485) (duration: 00m 58s)
  • 13:28 ladsgroup@tin: Synchronized wmf-config/Wikibase-production.php: Add badge for good lists (T190976) (duration: 00m 55s)
  • 13:17 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule for IndigenizeWikipedia event, clean obsolete rules (T192827) (duration: 00m 58s)
  • 13:06 hoo@tin: Synchronized wmf-config/InitialiseSettings.php: Set default for $wmgWikibaseSiteGroup (T188456) (duration: 00m 59s)
  • 12:51 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1109 (duration: 00m 58s)
  • 12:31 jynus@tin: Synchronized wmf-config/db-eqiad.php: Increase db1110 load (duration: 00m 58s)
  • 12:28 elukey: cleanup /home/elukey/zookeeper backup files (taken before the 3.4.9 migration) on conf*
  • 12:13 marostegui: Deploy schema change on db1098:3316 - T191519 T188299 T190148
  • 12:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098:3316 for alter table (duration: 00m 58s)
  • 12:10 elukey: reimage analytics106[34] to Debian Stretch
  • 12:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1088 after alter table (duration: 00m 58s)
  • 11:54 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1110 with low load (duration: 00m 59s)
  • 11:44 moritzm: reimaging mw1241, mw1242, mw1243 (app servers) to stretch
  • 10:58 moritzm: reimaging mw1224, mw1225, mw1226 (API servers) to stretch
  • 10:50 elukey: reimage analytics106[56] to Debian Stretch
  • 10:49 arturo: enable puppet in labtestcontrol2001 to sync with repo changes
  • 10:39 akosiaris: starting a very slow rolling reboot of all VMs on codfw ganeti cluster T150532
  • 10:39 akosiaris: upgrade to qemu 2.8 on codfw ganeti cluster. T150532
  • 10:31 jynus: stop and reimage db1110
  • 10:01 apergos: reimaged snapshot1001 for testing with php7/stretch
  • 09:39 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1110 (duration: 00m 58s)
  • 09:28 marostegui: Deploy schema change on db1088 - T191519 T188299 T190148
  • 09:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1088 for alter table (duration: 00m 58s)
  • 09:25 mobrovac@tin: Finished deploy [restbase/deploy@1661f69]: Increase the deletion probability to 50% and expose the CSS end points, take #3 - T192689 T190846 (duration: 04m 30s)
  • 09:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1093 after alter table (duration: 03m 06s)
  • 09:21 moritzm: reimaging mw1221, mw1222, mw1223 (API servers) to stretch
  • 09:21 mobrovac@tin: Started deploy [restbase/deploy@1661f69]: Increase the deletion probability to 50% and expose the CSS end points, take #3 - T192689 T190846
  • 09:21 moritzm: reimaging mw1221, mw1222, mw1223 (app servers) to stretch
  • 09:21 mobrovac@tin: Finished deploy [restbase/deploy@1661f69]: Increase the deletion probability to 50% and expose the CSS end points, take #2 - T192689 T190846 (duration: 03m 03s)
  • 09:18 mobrovac@tin: Started deploy [restbase/deploy@1661f69]: Increase the deletion probability to 50% and expose the CSS end points, take #2 - T192689 T190846
  • 09:17 mobrovac@tin: Finished deploy [restbase/deploy@1661f69]: Increase the deletion probability to 50% and expose the CSS end points - T192689 T190846 (duration: 13m 13s)
  • 09:12 moritzm: reimaging mw1273, mw1274, mw1275 (app servers) to stretch
  • 09:03 mobrovac@tin: Started deploy [restbase/deploy@1661f69]: Increase the deletion probability to 50% and expose the CSS end points - T192689 T190846
  • 08:17 hoo: Finished running populateSitesTable.php for all wikis (T192628, T192632, T192631, T192633)
  • 08:14 elukey: upload druid_0.10.0-3~jessie1 (collection of druid packages) to jessie-wikimedia - T164008
  • 08:05 godog: power off restbase1010 for ssd replacement - T189822
  • 07:50 hoo: Started running populateSitesTable.php for all wikis (T192628, T192632, T192631, T192633)
  • 07:39 marostegui: Rename user_old and user_temp tables on db1077 - T172664
  • 07:28 gehel: restarting blazegraph on wdqs1004 for jvm upgrade
  • 07:23 marostegui: Reload haproxy on dbproxy1010 to depool labsdb1011 - T184446
  • 07:16 vgutierrez: Update puppet compiler facts
  • 06:56 elukey: restart zookeeper on conf200[123] for openjdk upgrades
  • 06:41 moritzm: installing poppler security updates
  • 06:35 marostegui: Deploy schema change on db1093 - T191519 T188299 T190148
  • 06:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 for alter table (duration: 00m 59s)
  • 05:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1096:3316 after alter table (duration: 00m 59s)
  • 05:03 _joe_: rebuilding the docker base images
  • 04:35 mutante: repooled mw2224, reinstalling mw2225 through mw2228
  • 03:08 mutante: reinstalling mw2224.codfw.wmnet with wmf-auto-reimage
  • 02:55 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.30) (duration: 10m 37s)
  • 01:55 cwd: payments, civi, and alerts re-enabled
  • 01:11 ejegg: re-enabled fundraising jobs
  • 01:09 ejegg: updated fundraising python tools from f3ed1d05b8 to 3754f32ab6
  • 00:18 mutante: removing travel@ and travelapproval@ exim aliases, moving to OIT/Google (T127549)

2018-04-23

  • 23:51 eileen: civicrm revision changed from 347e613aa5 to 142edbb90b, config revision is 07dee62bff
  • 23:35 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable non-static internationalized maps on test2wiki (duration: 00m 59s)
  • 23:32 catrope@tin: Synchronized php-1.31.0-wmf.30/extensions/Thanks/includes/EchoCoreThanksPresentationModel.php: Fix fatal error in Thanks notifications (T192711) (duration: 00m 58s)
  • 23:29 eileen: civicrm revision changed from b1e7ccfc4d to 347e613aa5, config revision is 07dee62bff
  • 23:15 XioNoX: changed AMS-IX peering mode to default (filter on radb+rpki)
  • 23:13 cwd: disabled most (all?) frack alerts
  • 23:11 ebernhardson: restart elasticsearch on elastic1031 to apply numa settings
  • 22:56 XioNoX: disabling flapping VCP on asw1-eqsin - T192125
  • 22:37 mutante: phab1001 - deleting duplicate cronjob for public_taskdump.py (the one that did not output to /dev/null) (T188149)
  • 22:21 ebernhardson: restart elasticsearch on elastic1030 to apply numa settings
  • 22:12 ebernhardson: restart elasticsearch on elastic1029 to apply numa settings
  • 21:49 ebernhardson: restart elasticsearch on elastic1028 to apply numa settings
  • 21:40 ejegg: updated fundraising python tools from 7c5c7a5f9e to f3ed1d05b8
  • 21:39 ejegg: updated SmashPig from 1ebee97a45 to a4de12d415
  • 21:36 ebernhardson: restart elasticsearch on elastic1024 to apply numa settings
  • 21:25 ebernhardson: restart elasticsearch on elastic1025 to apply numa settings
  • 20:53 XioNoX: redirect text-lb.eqiad pings to ping1001 on cr1/2-eqiad (24h tests) - T190090
  • 20:47 ppchelko@tin: Finished deploy [restbase/deploy@228caf8]: Log the lack of the index entries, take 2 (duration: 03m 55s)
  • 20:43 ppchelko@tin: Started deploy [restbase/deploy@228caf8]: Log the lack of the index entries, take 2
  • 20:43 ppchelko@tin: Finished deploy [restbase/deploy@228caf8]: Log the lack of the index entries (duration: 14m 19s)
  • 20:40 thcipriani@tin: rebuilt and synchronized wikiversions files: All wikis to 1.31.0-wmf.30
  • 20:32 mholloway-shell@tin: Finished deploy [mobileapps/deploy@5650605]: Update mobileapps to b011b2a (duration: 05m 56s)
  • 20:29 ppchelko@tin: Started deploy [restbase/deploy@228caf8]: Log the lack of the index entries
  • 20:26 mholloway-shell@tin: Started deploy [mobileapps/deploy@5650605]: Update mobileapps to b011b2a
  • 20:15 Dereckson: Purged all languages messages from the cache, for gorwiki (rebuildmessages.php, T189127)
  • 19:49 vgutierrez: Repool (Re-enable BGP) in lvs5001 - T191897
  • 19:34 elukey@tin: Finished deploy [analytics/pivot/deploy@cb9ddee]: Fix 0.10.0 compatibility - T164008 (duration: 00m 17s)
  • 19:34 elukey@tin: Started deploy [analytics/pivot/deploy@cb9ddee]: Fix 0.10.0 compatibility - T164008
  • 18:48 catrope@tin: Synchronized dblists/wikidataclient.dblist: Add ruwikimedia to wikidataclient (T188456) (duration: 01m 15s)
  • 18:33 vgutierrez: Depool lvs5001 - T191897
  • 18:33 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Change timezone for napwiki (T192568) (duration: 01m 31s)
  • 18:28 vgutierrez: Repool (Re-enable BGP) lvs5002 - T191897
  • 18:18 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable WikiLove on sawiki (T192212) (duration: 01m 19s)
  • 18:08 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable internationalized maps on testwiki (duration: 01m 17s)
  • 17:52 ariel@tin: Finished deploy [dumps/dumps@02a3e80]: fix up checks for truncated/binary output files (duration: 00m 04s)
  • 17:52 ariel@tin: Started deploy [dumps/dumps@02a3e80]: fix up checks for truncated/binary output files
  • 17:35 XioNoX: pushing firewall block on cr1-eqdfw - T175361
  • 17:24 XioNoX: pushing firewall block on cr1/2-codfw - T175361
  • 17:18 thcipriani@tin: Synchronized php: Group1 to 1.31.0-wmf.30 (duration: 01m 16s)
  • 17:15 thcipriani@tin: rebuilt and synchronized wikiversions files: Group1 to 1.31.0-wmf.30
  • 17:02 vgutierrez: Depool and reimage lvs5002 as stretch - T191897
  • 16:15 demon@tin: Pruned MediaWiki: 1.31.0-wmf.26 (duration: 03m 28s)
  • 16:07 marostegui: Deploy schema change on db1096:3316 - T191519 T188299 T190148
  • 16:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3316 for alter table (duration: 01m 16s)
  • 16:03 gehel: restarting wdqs-updater on all nodes
  • 15:55 marostegui: Reload haproxy on dbproxy1010 to repool labsdb1010
  • 15:53 bstorm_: Added slots, slot_roles, content and content_models to views on labsdb1010
  • 15:36 dereckson@tin: Finished scap: Rebuild localisation cache to add Gorontalo (T189127) (duration: 08m 29s)
  • 15:28 dereckson@tin: Started scap: Rebuild localisation cache to add Gorontalo (T189127)
  • 15:28 dereckson@tin: scap aborted: Rebuild localisation cache to add Gorontalo (T189127)z (duration: 00m 01s)
  • 15:28 dereckson@tin: Started scap: Rebuild localisation cache to add Gorontalo (T189127)z
  • 15:23 dereckson@tin: scap sync-l10n completed (1.31.0-wmf.29) (duration: 00m 46s)
  • 15:20 dereckson@tin: Synchronized php-1.31.0-wmf.30/languages/messages/MessagesGor.php: Localisation for MediaWiki in Gorontalo (T189127) (duration: 01m 16s)
  • 15:13 dereckson@tin: Synchronized php-1.31.0-wmf.29/languages/messages/MessagesGor.php: Localisation for MediaWiki in Gorontalo (T189127) (duration: 01m 18s)
  • 14:10 ottomata: switching main -> analytics MirrorMaker to --new.consumer (temporarily stopping puppet on kafka101[234]) https://phabricator.wikimedia.org/T192387
  • 14:02 zeljkof: EU SWAT finished
  • 13:57 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: lfnwiki: add logo path and missing namespace names (T183561) (duration: 01m 15s)
  • 13:55 elukey: reimage analytics1067 to Debian Stretch - T192557
  • 13:53 urandom: decommissioning Cassandra, restbase1010-c -- T189822
  • 13:50 urandom: decommissioning Cassandra, restbase1010-c -- T189822
  • 13:43 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: euwikisource: add missing $wgMetaNamespace (T189465) (duration: 01m 16s)
  • 13:35 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: gorwiki: add missing namespaces (T189109) (duration: 01m 17s)
  • 13:28 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add logos for gorwiki (T192669) (duration: 01m 14s)
  • 13:27 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Add logos for gorwiki (T192669) (duration: 01m 16s)
  • 13:17 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Temp rate limit for arwiki due to mass vandalism (T192668) (duration: 01m 17s)
  • 13:12 jynus: restarting es2003 to test gerrit:427902
  • 12:59 marostegui: Deploy schema change on dbstore1002 s6 - T191519 T188299 T190148
  • 12:58 jynus: disabling puppet on several mysql hosts before deploying gerrit:427902
  • 12:40 sbisson@tin: Finished deploy [kartotherian/deploy@2195dde]: Deploy kartotherian with new babel fallback rules (duration: 04m 52s)
  • 12:35 sbisson@tin: Started deploy [kartotherian/deploy@2195dde]: Deploy kartotherian with new babel fallback rules
  • 11:50 moritzm: reimaging mw1238,mw1239,mw1240 (app servers) to stretch
  • 11:46 moritzm: reimaging mw1285 (previous attempt had a hardware problem which failed to trigger the reboot via IPMI) ,mw1287,mw1288 (API servers) to stretch
  • 11:41 moritzm: installing poppler security updates
  • 11:25 mobrovac@tin: Finished deploy [citoid/deploy@b3c0818]: Add support for restful crossRef API and Wikidata QIDs - T108175 T176411 (duration: 03m 36s)
  • 11:22 mobrovac@tin: Started deploy [citoid/deploy@b3c0818]: Add support for restful crossRef API and Wikidata QIDs - T108175 T176411
  • 11:17 mobrovac@tin: Finished deploy [restbase/deploy@3f3f989]: Add lfnwiki, inhwiki, gorwiki and euwikisource, take #2 - T192678 (duration: 07m 21s)
  • 11:10 mobrovac@tin: Started deploy [restbase/deploy@3f3f989]: Add lfnwiki, inhwiki, gorwiki and euwikisource, take #2 - T192678
  • 11:09 mobrovac@tin: Finished deploy [restbase/deploy@3f3f989]: Add lfnwiki, inhwiki, gorwiki and euwikisource - T192678 (duration: 11m 47s)
  • 11:00 gehel: restarting wdqs updater on all wdqs nodes
  • 10:57 mobrovac@tin: Started deploy [restbase/deploy@3f3f989]: Add lfnwiki, inhwiki, gorwiki and euwikisource - T192678
  • 10:26 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 16s)
  • 10:25 jdrewniak@tin: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 17s)
  • 09:56 _joe_: restarting memcached on mc1020-1036 at 1 hour intervals - T184854
  • 09:13 godog: Flashing Smart Array P840 in Slot 3 [ 4.52 -> 6.30 ] on ms-be2034 - T192721 T141756
  • 09:05 _joe_: AMEND: restart memcached on mc1019 (T184854)
  • 09:05 _joe_: restart memcached on mw1019 (Ttail -f /var/log/etcdmirror-conftool-eqiad-wmnet/syslog.log
  • 09:05 vgutierrez: restarting pybal on lvs1006
  • 09:02 _joe_: restarting etcdmirror on conf2002 after restarting nginx on conf1001
  • 08:59 moritzm: reimaging mw1283,mw1285,mw1286 (API servers) to stretch
  • 08:57 marostegui: Deploy schema change on s6 codfw master (db2039) - this will generate lag on codfw - T191519 T188299 T190148
  • 08:56 gehel: rolling restart of blazegraph on wdqs1004, 2004 and 2005 for JVM upgrade
  • 08:55 moritzm: reimaging mw1270,mw1271,mw1272 (app servers) to stretch
  • 08:52 vgutierrez: restarting pybal on esams cluster
  • 08:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1078 (duration: 01m 16s)
  • 08:48 _joe_: upgrading nginx on the config cluster in eqiad (T164456)
  • 08:47 marostegui: Drop table logging_pre_1_10 in s5 - T118859
  • 08:47 marostegui: Dropped table logging_pre_1_10 in s3 - T118859
  • 08:42 vgutierrez: restarting pybal on lvs4006
  • 08:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1078 (duration: 01m 18s)
  • 08:36 vgutierrez: restarting pybal on codfw (once at a time)
  • 08:33 vgutierrez: restart pybal on lvs4007
  • 08:31 vgutierrez: restarting pybal on lvs5002
  • 08:30 vgutierrez: restarting pybal on lvs5001
  • 08:30 marostegui: Drop table logging_pre_1_10 in s4 - T118859
  • 08:27 vgutierrez: restarting pybal on lvs4005
  • 08:27 _joe_: restarting pybal on lvs5003
  • 08:17 _joe_: upgrading nginx on the config cluster in codfw (T164456)
  • 08:13 marostegui: Drop table logging_pre_1_10 in s7 - T118859
  • 08:08 _joe_: restarting memcached in codfw (T184854)
  • 08:08 gehel: restarting blazegraph on wdqs1003 (crazy number of java threads)
  • 08:04 moritzm: upgrading terbium to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 07:58 ema: cp-misc: upgrade varnish to 5.1.3-1wm7
  • 07:55 marostegui: reload haproxy on dbproxy1010 to depool labsdb1010
  • 07:55 marostegui: Depool labsdb1010 - T184446
  • 07:50 marostegui: Drop table logging_pre_1_10 in s2 - T118859
  • 07:47 marostegui: Drop table logging_pre_1_10 in s6 - T118859
  • 07:36 moritzm: upgrading remaining API servers to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 07:35 elukey: reboot ms-be2034 - stuck in com2 console with "sd 0:1:0:1: rejecting I/O to offline device", not responsive to ssh
  • 07:00 moritzm: upgrading remaining app servers to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 06:26 marostegui: Remove logging_pre_1_10 from codfw - T118859
  • 05:28 marostegui: flow_subscription empty table from officewiki - T149936
  • 05:17 marostegui: Deploy schema change on db1070 (s5 primary master) - T191519 T188299 T190148
  • 02:40 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.29) (duration: 05m 56s)

2018-04-22

  • 16:29 ariel@tin: Finished deploy [dumps/dumps@bb7ae96]: creadtedirs date fixup, rerun only missing stub type (duration: 00m 03s)
  • 16:29 ariel@tin: Started deploy [dumps/dumps@bb7ae96]: creadtedirs date fixup, rerun only missing stub type

2018-04-21

2018-04-20

  • 20:45 andrewbogott: re-imaging labvirt1021 and 1022 as Jessie
  • 20:23 ejegg: updated fundraising python tools from 0c50f9e38f to 7c5c7a5f9e
  • 18:23 mutante: add LDAP user "tieu" to group "wmde" (T192256)
  • 17:42 imarlier@tin: Finished deploy [performance/coal@99db58f]: coal - update to submit via graphite. Not yet active, requires puppet changes (duration: 00m 04s)
  • 17:41 imarlier@tin: Started deploy [performance/coal@99db58f]: coal - update to submit via graphite. Not yet active, requires puppet changes
  • 17:35 no_justification: gerrit: update mysql-client and deps 5.5.59 -> 5.5.60
  • 17:28 mutante: phabricator - restarted apache
  • 17:26 mutante: phabricator (phab1001) - upgrading Apache, openssl, mysql-common
  • 17:17 mutante: phab2001 - upgrading apache, openssl, mysql-common
  • 17:04 andrewbogott: rebooting labvirt1021 and 1022
  • 16:44 dcausse@tin: Synchronized php-1.31.0-wmf.30/extensions/CirrusSearch/: T192609: Do not propagate Elastica doc modifications out of DataSender (duration: 01m 34s)
  • 15:07 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2087 (duration: 01m 16s)
  • 14:41 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2086, depool db2087 (duration: 01m 16s)
  • 14:16 andrew@tin: Synchronized dblists: Purging obsolete silver.dblist (duration: 01m 17s)
  • 14:02 moritzm: upgrading labweb* servers to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 14:00 jynus: upgrade and restart db2086
  • 13:59 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2086 (duration: 01m 13s)
  • 13:35 anomie: (re-)creating `slots` table on all wikis, following up T190153 and T184446#4143097
  • 13:25 moritzm: upgrading mysql (as shipped in Debian) on bohrium
  • 13:00 moritzm: installing zsh security updates on trusty servers
  • 12:25 moritzm: upgrading apache on auth* servers
  • 12:18 jynus: upgrading and restarting dbstore2002
  • 12:15 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2070 (duration: 01m 17s)
  • 12:06 moritzm: installing apache security updates on video scalers
  • 12:05 moritzm: upgrading apache on einsteinium/icinga.wikimedia.org
  • 11:53 moritzm: installing apache security updates on netmon1002/2001
  • 11:27 elukey: reimage analytics1068 to Debian Stretch - T192557
  • 11:06 moritzm: installing tiff security updates on trusty
  • 09:58 godog: upload scap 3.8.0-2 - T192124
  • 09:51 moritzm: upgrading deployment servers to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 09:41 jynus: starting reimage of db2070
  • 09:41 moritzm: upgrading mwdebug servers to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 09:33 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2071, depool db2070 (duration: 01m 16s)
  • 09:12 elukey: restart of mw apis showing ~50% cpu utilization as precaution before the weekend - mw[1224,1225,1228,1230,1231,1233-1235,1276-1283,1286,1312,1313,1315,1316,1341,1343,1344,1347,1348]*
  • 09:06 moritzm: upgrading video scalers in codfw to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 08:41 moritzm: upgrading job runners in codfw to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 08:39 marostegui: Going to sanitize gorwiki euwikisource romdwikimedia inhwiki on db1095 - T189112 T189466 T187774 T184375
  • 08:39 elukey: restart hhvm on mw[1226,1232].eqiad.wmnet - high load
  • 08:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1114 in API - T191996 (duration: 01m 16s)
  • 07:57 jynus: starting reimage of db2071
  • 07:52 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2071 (duration: 01m 16s)
  • 07:48 moritzm: upgrading app servers in codfw to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 07:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Give more API traffic to db1114 (duration: 01m 17s)
  • 07:38 ema: cp3041: restart varnish-be due to mbox lag
  • 07:37 akosiaris: upgrade qemu on ganeti2006 to 1:2.8+dfsg-3~bpo8+1 and migrate mwdebug2001 to it T150532
  • 07:32 ema: cp3030: restart varnish-be due to mbox lag
  • 07:30 _joe_: upgrading hhvm on all jobrunners in eqiad
  • 07:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 in API (duration: 01m 15s)
  • 07:09 ema: cp3032/cp3043: restart varnish-be due to mbox lag
  • 07:08 moritzm: upgrading API servers in codfw to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 07:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1110 after alter table (duration: 01m 16s)
  • 06:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore main traffic original weight for db1114 (duration: 01m 15s)
  • 06:26 ema: kafka::analytics remove strongswan leftovers T185136
  • 06:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 (duration: 01m 15s)
  • 06:07 marostegui: Stop mysql db1114 for a reboot
  • 06:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 (duration: 01m 16s)
  • 05:55 _joe_: depooling mw1227 from live traffic for investigation
  • 05:31 marostegui: Start atop on db1114 with "-R" option enabled - T192551
  • 05:31 marostegui: Deploy schema change on db1110 - T191519 T188299 T190148
  • 05:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1110 for alter table (duration: 01m 17s)
  • 05:21 ariel@tin: Finished deploy [dumps/dumps@c2d3bb4]: keep completed stubs/abstracts/logs files around for retries (duration: 00m 04s)
  • 05:20 ariel@tin: Started deploy [dumps/dumps@c2d3bb4]: keep completed stubs/abstracts/logs files around for retries
  • 01:50 krinkle@tin: Synchronized wmf-config/CommonSettings.php: If8fdce707d (duration: 01m 17s)

2018-04-19

  • 23:16 ebernhardson@tin: Synchronized php-1.31.0-wmf.29/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: T187148: Turn off cirrus ab test (duration: 01m 18s)
  • 23:13 ebernhardson@tin: Synchronized php-1.31.0-wmf.30/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: T187148: Turn off cirrus ab test (duration: 01m 17s)
  • 23:04 thcipriani@tin: Synchronized php: complete group1 and group2 wikis back to 1.31.0-wmf.29 (duration: 01m 16s)
  • 22:30 thcipriani@tin: rebuilt and synchronized wikiversions files: group1 and group2 wikis back to 1.31.0-wmf.29
  • 21:41 urandom: Start cleanup, restbase10{07,11,16}-c -- T189822
  • 21:22 urandom: Start cleanup, restbase10{07,11,16}-b -- T189822
  • 21:15 urandom: Start cleanup, restbase10{07,11,16}-a -- T189822
  • 21:12 urandom: restarting cassandra to (temporarily) rollback prometheus jmx exporter, restbase1010-c -- T189822, T192456
  • 21:00 ebernhardson: issue move of enwiki_content shard 2 from overloaded elastic1027 to elastic1017
  • 20:48 urandom: restarting cassandra to (temporarily) rollback prometheus jmx exporter, restbase1010-a -- T189822, T192456
  • 20:48 urandom: restarting cassandra to (temporarily) rollback prometheus jmx exporter -- T189822, T192456
  • 20:32 thcipriani@tin: rebuilt and synchronized wikiversions files: All wikis to 1.31.0-wmf.30
  • 20:27 milimetric@tin: Finished deploy [analytics/refinery@c1c9885]: Correcting hql from last deployment (duration: 05m 09s)
  • 20:22 milimetric@tin: Started deploy [analytics/refinery@c1c9885]: Correcting hql from last deployment
  • 19:53 thcipriani@tin: Synchronized php: group1 to 1.31.0-wmf.30 (duration: 01m 15s)
  • 19:45 thcipriani@tin: rebuilt and synchronized wikiversions files: group1 to 1.31.0-wmf.30
  • 19:35 thcipriani@tin: Synchronized php-1.31.0-wmf.30/includes/page/Article.php: Do not pass USE INDEX to a $dbType parameter T192584 (duration: 01m 17s)
  • 19:33 ejegg: updated fundraising python tools from 626fe02a9f to 0c50f9e38f
  • 19:22 no_justification: gerrit: restarting services to pick up gc & indexing changes
  • 18:32 thcipriani@tin: Synchronized php-1.31.0-wmf.30/resources/src/jquery: jquery.makeCollapsible: Only add "[" "]" to autogenerated toggles T192140 (duration: 01m 17s)
  • 17:21 andrew@tin: Synchronized wmf-config/db-eqiad.php: Moving labtestwikitech to m5, step 3 (duration: 01m 16s)
  • 17:20 andrew@tin: Synchronized wmf-config/db-codfw.php: Moving labtestwikitech to m5, step 2 (duration: 01m 16s)
  • 17:18 andrew@tin: Synchronized docroot/noc/db.php: Moving labtestwikitech to m5, step 1 (duration: 01m 16s)
  • 16:56 ejegg: re-enabled banner impressions loader
  • 16:50 moritzm: uploaded tidy-0.99 to component/ci for apt.wikimedia.org/stretch-wikimedia (T191771)
  • 16:46 ejegg: disabled banner impressions loader in order to run backfill mode
  • 16:28 gehel: restarting tilerator on maps[12].* - T191655
  • 16:20 gehel: shutting down tilerator on maps[12].* for maintenance - T191655
  • 15:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1082 after alter table (duration: 01m 16s)
  • 15:50 fdans@tin: Finished deploy [analytics/refinery@5d0f63f]: deploying to launch page preview job (duration: 06m 34s)
  • 15:48 marostegui: Deploy schema change on dbstore1002 (s5) - T191519 T188299 T190148
  • 15:44 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2074 (duration: 01m 17s)
  • 15:44 fdans@tin: Started deploy [analytics/refinery@5d0f63f]: deploying to launch page preview job
  • 15:42 sbisson@tin: Finished deploy [kartotherian/deploy@0a5a3ef]: Deploy latest kartotherian with new i18n sources (take 3) (duration: 05m 22s)
  • 15:37 sbisson@tin: Started deploy [kartotherian/deploy@0a5a3ef]: Deploy latest kartotherian with new i18n sources (take 3)
  • 15:36 sbisson@tin: Finished deploy [kartotherian/deploy@89c4ca9]: Deploy latest kartotherian with new i18n sources (take 2) (duration: 03m 05s)
  • 15:33 sbisson@tin: Started deploy [kartotherian/deploy@89c4ca9]: Deploy latest kartotherian with new i18n sources (take 2)
  • 15:16 sbisson@tin: Finished deploy [kartotherian/deploy@74121d5]: Deploy latest kartotherian with new i18n sources (duration: 05m 19s)
  • 15:11 sbisson@tin: Started deploy [kartotherian/deploy@74121d5]: Deploy latest kartotherian with new i18n sources
  • 14:48 Dereckson: Erratum: read "User:Andrei Stroe" and not "User:Anderi Store" for the previous entry (T187184)
  • 14:47 Dereckson: Create bureaucrat account for User:Anderi Store on romd.wikimedia (T187184)
  • 14:30 marostegui: Start atop on db1114 without "-R" - T192551
  • 14:29 marostegui: Deploy schema change on db1082 (this will generate lag on s5 on labs hosts) - T191519 T188299 T190148
  • 14:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 for alter table (duration: 01m 13s)
  • 14:19 ejegg: re-enabled queue jobs
  • 14:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1113:3315 after alter table (duration: 01m 16s)
  • 14:12 jynus: starting reimage of db2074
  • 13:56 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2075, depool db2074 (duration: 01m 16s)
  • 13:39 marostegui: Stop atop on db1114 - T191996
  • 13:33 marostegui: Start atop on db1114 - T191996
  • 13:30 Trey314159: reindexing serbian wikis on elastic@eqiad (T189265)
  • 13:30 moritzm: upgrading mw1334-mw1337 (job runners) to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 13:14 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: T192427 T189277 (duration: 01m 17s)
  • 12:58 jynus: starting reimage of db2075
  • 12:48 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2075 (duration: 01m 16s)
  • 11:55 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2076 (duration: 01m 16s)
  • 11:39 moritzm: upgrading eqiad video scalers to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 11:24 marostegui: Run check_private_data on labsdb - T183566
  • 11:21 marostegui: Sanitize lfnwiki - T183566
  • 11:20 moritzm: upgrading app servers mw1238-mw1258 to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 11:14 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: T181121 (duration: 01m 16s)
  • 11:09 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: (no justification provided) (duration: 01m 17s)
  • 11:05 marostegui: Deploy schema change on db1113:3315 - T191519 T188299 T190148
  • 11:03 jynus: starting reimage of db2076
  • 11:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1113:3315 for alter table (duration: 01m 16s)
  • 11:01 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2076 (duration: 01m 18s)
  • 10:34 moritzm: upgrading API servers mw1221-mw1235 to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 10:27 vgutierrez: Repool (Re-enable BGP) lvs4005 - T191897
  • 09:59 elukey: complete migration of zookeeper on conf100[123]
  • 09:55 akosiaris: reboot ganeti VMs on row_B in codfw for cache=none setting. T181121
  • 09:54 vgutierrez: Updating puppet compiler facts
  • 09:51 moritzm: rolling restart of Cassandra on maps completed
  • 09:33 elukey: upgrade zookeeper on conf100[123] from 3.4.5 to 3.4.9 - T182924
  • 09:31 akosiaris: start a force puppet run in all of eqiad with a batch size of 30
  • 09:29 akosiaris: stop ircecho for a while, puppetdb1001 reboot was eventful
  • 09:17 akosiaris: reboot puppetdb1001 for cache=none setting apply. T181121
  • 09:14 moritzm: installing Java security updates on maps* plus rolling restart of Cassandra to pick up new JRE
  • 09:06 vgutierrez: Depool and reimage lvs4005 as stretch - T191897
  • 09:03 moritzm: upgrading API server canaries to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build (T184854)
  • 08:40 vgutierrez: Repool (Re-enable BGP) lvs4006 - T191897
  • 08:14 ema: reboot deploy1001 and arm keyholder T175288
  • 08:14 moritzm: upgrading app server canaries to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build (T184854)
  • 07:47 akosiaris: set cache=none for ganeti VMs in codfw cluster configuration. VM reboots to follow T181121
  • 07:32 vgutierrez: Depool and reimage lvs4006 - T191897
  • 07:24 akosiaris: reboot ganeti VMs on row_A in eqiad for cache=none setting. T181121
  • 07:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1097:3315 after alter table (duration: 01m 17s)
  • 05:36 marostegui: Kill atop on db1114 - T191996
  • 05:33 marostegui: Revert RX buffer changes on db1114 - T191996
  • 05:27 marostegui: Deploy schema change on db1097:3315 - T191519 T188299 T190148
  • 05:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097:3315 for alter table (duration: 01m 33s)
  • 03:18 urandom: decommissioning Cassandra, restbase1010-c -- T189822
  • 02:36 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.29) (duration: 05m 52s)
  • 01:15 eileen: civicrm revision changed from 0ac27e7c0d to b1e7ccfc4d, config revision is 49f5ba45e8
  • 00:12 Dereckson: Wikis creation done
  • 00:12 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Set project namespace for hi.wikimedia (T188366) (duration: 01m 16s)
  • 00:04 Dereckson: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Fix path to hi.wikimedia.org 1x logo (Gerrit:427567)

2018-04-18

  • 23:44 Dereckson: Created bureaucrat account for Suyash.dwivedi at hi.wikimedia (T188366)
  • 23:35 dereckson@tin: Synchronized wmf-config/interwiki.php: New interwiki map for the six newest wikis (duration: 01m 17s)
  • 23:22 Dereckson: HTCP purge for https://hi.wikimedia.org and https://hi.wikimedia.org/
  • 23:19 Dereckson: Create tables for Translate extension on hiwikimedia
  • 23:13 Dereckson: HTCP purge for eu.wikisource logos
  • 23:10 dereckson@tin: Synchronized multiversion/MWMultiVersion.php: +hi.wikimedia.org +romd.wikimedia.org (duration: 01m 15s)
  • 23:05 dereckson@tin: Synchronized langlist: New languages: gor, inh, lfn (duration: 01m 17s)
  • 23:04 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Initial configuration for six wikis (duration: 01m 16s)
  • 23:03 dereckson@tin: rebuilt and synchronized wikiversions files: (no justification provided)
  • 23:02 dereckson@tin: Synchronized dblists: (no justification provided) (duration: 01m 15s)
  • 23:00 Dereckson: Starting syncing to production sequence for six wiki creation
  • 22:58 dereckson@tin: Synchronized static/images/project-logos/: Logos for eu.wikisource (T189465) (duration: 01m 12s)
  • 22:58 Dereckson: Created database and set initial stuff for hi.wikimedia.org (T188366)
  • 22:57 Dereckson: Created database and set initial stuff for romd.wikimedia.org
  • 22:31 Dereckson: Created database and set initial stuff for eu.wikisource.org (T189465)
  • 22:28 Dereckson: Created database and set initial stuff for gor.wikipedia.org (T189109)
  • 22:27 Dereckson: Created database and set initial stuff for inh.wikipedia.org (T184374)
  • 22:24 dereckson@tin: Synchronized php-1.31.0-wmf.29/extensions/WikimediaMaintenance/addWiki.php: Fix MassMessage fatal error (T192468) (duration: 01m 17s)
  • 22:17 Dereckson: Created database for lfn.wikipedia.org (T183561)
  • 21:57 eileen: civicrm revision changed from 00870af548 to 0ac27e7c0d, config revision is 853fcc9111
  • 21:53 ebernhardson: restart elasticsearch on elastic1022 with numa interleave
  • 21:17 eileen: civicrm revision changed from cddfe9416c to 00870af548, config revision is 853fcc9111
  • 20:52 ebernhardson: restart elasticsearch on elastic1020 with numa interleave
  • 20:13 mholloway-shell@tin: Finished deploy [mobileapps/deploy@9328a7d]: Update mobileapps to fb161d7 (duration: 05m 56s)
  • 20:10 ebernhardson: restart elasticsearch on elastic1019 with numa interleave
  • 20:07 mholloway-shell@tin: Started deploy [mobileapps/deploy@9328a7d]: Update mobileapps to fb161d7
  • 19:55 thcipriani@tin: Finished scap: rebuild l10n cache (duration: 58m 57s)
  • 19:28 ppchelko@tin: Finished deploy [restbase/deploy@8d8f1df]: Test concurrent worker startups (duration: 15m 23s)
  • 19:15 ebernhardson: restart elasticsearch on elastic1018 with numa interleave
  • 19:13 ppchelko@tin: Started deploy [restbase/deploy@8d8f1df]: Test concurrent worker startups
  • 18:56 thcipriani@tin: Started scap: rebuild l10n cache
  • 18:35 dereckson@tin: Synchronized php-1.31.0-wmf.30/extensions/CentralNotice: Emit CSP headers on banner preview (duration: 01m 18s)
  • 18:33 ppchelko@tin: Finished deploy [changeprop/deploy@d83fad3]: Support multi-topic rules, rename metrics, update dependencies (duration: 01m 14s)
  • 18:32 ppchelko@tin: Started deploy [changeprop/deploy@d83fad3]: Support multi-topic rules, rename metrics, update dependencies
  • 18:25 imarlier@tin: Finished deploy [performance/coal@3c0ef36]: coal: typoed the run file (duration: 00m 04s)
  • 18:25 imarlier@tin: Started deploy [performance/coal@3c0ef36]: coal: typoed the run file
  • 18:17 imarlier@tin: Finished deploy [performance/coal@f1ca191]: Deploying coal version that includes a runner for service use (duration: 00m 04s)
  • 18:17 imarlier@tin: Started deploy [performance/coal@f1ca191]: Deploying coal version that includes a runner for service use
  • 17:53 ebernhardson: restart elasticsearch on elastic1017
  • 17:35 dereckson@tin: Synchronized wmf-config/CommonSettings.php: Emit CSP headers on banner previews (T190100, no-op for now) (duration: 01m 16s)
  • 17:19 ejegg: updated CiviCRM from 64b26ad377 to cddfe9416c
  • 16:47 andrewbogott: deleted lots of log files (mostly nova-api logs) on labtestnet2001
  • 16:42 reedy@tin: Synchronized wmf-config/interwiki.php: sync! (duration: 01m 15s)
  • 16:32 reedy@tin: Synchronized php-1.31.0-wmf.30/extensions/WikimediaMaintenance: fix addwiki.php (duration: 01m 18s)
  • 16:30 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: translatew for advisorswiki (duration: 01m 16s)
  • 16:26 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: advisorswiki (duration: 01m 15s)
  • 16:24 reedy@tin: rebuilt and synchronized wikiversions files: advisorswiki
  • 16:21 reedy@tin: Synchronized dblists/: advisorswiki (duration: 01m 16s)
  • 16:11 ppchelko@tin: Finished deploy [cpjobqueue/deploy@749ae82]: Update dependencies and reduce dedupe logging rate (duration: 00m 43s)
  • 16:10 ppchelko@tin: Started deploy [cpjobqueue/deploy@749ae82]: Update dependencies and reduce dedupe logging rate
  • 15:49 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2077 (duration: 01m 16s)
  • 15:33 _joe_: depooling mw1227 for investigation in high load
  • 15:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1100 after alter table (duration: 01m 15s)
  • 15:09 urandom: decommissioning Cassandra, restbase1010-b -- T189822
  • 15:08 dcausse: reindexing serbian wikis on elastic@eqiad (T189265)
  • 14:55 urandom: restarting Cassandra, restbase1011-a to test v 0.8 of Prometheus JMX exporter -- T192456
  • 14:51 jynus: starting reimage of db2077
  • 14:37 urandom: restarting Cassandra, restbase1011-a -- T192456
  • 14:35 marostegui: Disable puppet on db1114 - T191996
  • 14:13 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2080, depool db2077 (duration: 01m 16s)
  • 14:04 gehel: powercycle unresponsive maps-test2001
  • 14:00 elukey: restart kafka on kafka1001 and kafka2001 (jobqueues,eventbus) for openjdk-7 upgrades
  • 13:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1100 for alter table (duration: 01m 16s)
  • 13:49 marostegui: Deploy schema change on db1100 - T191519 T188299 T190148
  • 13:44 moritzm: uploaded HHVM 3.18.5+dfsg-1+wmf7+icu57 to apt.wikimedia.org/jessie-wikimedia (includes a patch for the MEMC_VAL_COMPRESSION_ZLIB flag in the memcached module (T184854))
  • 13:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1096:3315 after alter table (duration: 01m 15s)
  • 13:17 Amir1: EU SWAT is done
  • 13:17 moritzm: uploaded HHVM 3.18.5+dfsg-1+wmf7+deb9u1 to apt.wikimedia.org/stretch-wikimedia (includes a patch for the MEMC_VAL_COMPRESSION_ZLIB flag in the memcached module (T184854))
  • 13:16 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Limit page creation and edit rate on Wikidata (T184948) (duration: 01m 17s)
  • 13:00 jynus: starting reimage of db2080
  • 12:57 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2081, depool db2080 (duration: 01m 16s)
  • 11:20 vgutierrez: Repool (Re-enable BGP) lvs2004 - T191897
  • 11:02 jynus: starting reimage of db2081
  • 10:57 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2081, repool db2082, es2013 (duration: 01m 15s)
  • 10:45 vgutierrez: Depool and reimage lvs2004 - T191897
  • 10:27 vgutierrez: Repool (Re-enable BGP) in lvs2005 - T191897
  • 09:49 hoo: Ran scap pull on mwdebug1001 after checking https://gerrit.wikimedia.org/r/427156
  • 09:49 jynus: starting reimage of db2082
  • 09:46 Amir1: start of deleting auto patrol actions in small wikis (T184485)
  • 09:41 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2082 (duration: 01m 15s)
  • 09:37 moritzm: strip apache/nginx/nutcracker/hhvm from former image scaler (now spares)
  • 09:32 vgutierrez: Depool and reimage lvs2005 - T191897
  • 09:30 marostegui: Deploy schema change on db1096:3315 - T191519 T188299 T190148
  • 09:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3315 for alter table (duration: 01m 22s)
  • 09:27 godog: reenable puppet fleetwide after https://gerrit.wikimedia.org/r/c/421860
  • 09:16 moritzm: imported lz4 0.0~r131-2~wmf1+trusty1 for trusty-wikimedia to apt.wikimedia.org (needed to build HHVM 3.18 for trusty)
  • 09:09 godog: stop puppet agent fleetwide before applying https://gerrit.wikimedia.org/r/c/421860/
  • 09:08 moritzm: reimaging mw1281 to stretch
  • 09:04 _joe_: restart HHVM on mw1223,mw1224, also repool them after investigation in crashes
  • 08:59 vgutierrez: Repool (Re-enable BGP) in lvs3003 - T191897
  • 08:44 elukey: execute cumin 'analytics10[28-69]*' 'rm /etc/apt/preferences.d/r_* && apt-get update' to clear jessie backports apt config - T192348
  • 07:39 vgutierrez: Depool and reimage lvs3003 as stretch - T191897
  • 06:49 marostegui: Deploy schema change on s5 codfw master (db2052) this will generate lag in codfw - T191519 T188299 T190148
  • 06:43 moritzm: installing ruby security updates for trusty
  • 05:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 after changing RX buffers - T191996 (duration: 01m 09s)
  • 05:20 marostegui: Change RX buffers on db1114 - T191996
  • 05:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 (duration: 01m 15s)
  • 05:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1092 after alter table (duration: 01m 16s)
  • 05:02 marostegui: Deploy schema change on db1071 (s8 primary master) - T185128 T153182
  • 02:51 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.29) (duration: 05m 55s)
  • 00:05 aaron@tin: Synchronized wmf-config/mc-labs.php: 8ad186728d: use mcrouter key prefixes (deployment-prep only) (duration: 01m 15s)
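
Entries in this log share a regular shape: a time, an author (optionally `user@host`), a free-text message, an optional trailing `(duration: MMm SSs)`, and any number of `T`-prefixed task references. The following is a minimal sketch of a parser for that shape, assuming only what is visible in the entries above; the names `LogEntry` and `parse_entry` are hypothetical and not part of any logging tool used here.

```python
import re
from typing import List, NamedTuple, Optional

# Patterns inferred from the visible entries only (an assumption, not a spec).
ENTRY_RE = re.compile(
    r"^\s*(?:•\s*)?(?P<time>\d{2}:\d{2})\s+(?P<who>[^\s:]+):\s+(?P<message>.*)$"
)
DURATION_RE = re.compile(r"\(duration: (\d+)m (\d+)s\)\s*$")
TASK_RE = re.compile(r"\bT\d+\b")


class LogEntry(NamedTuple):
    time: str
    who: str
    message: str
    duration_seconds: Optional[int]
    tasks: List[str]


def parse_entry(line: str) -> Optional[LogEntry]:
    """Parse one log line; return None for headings and blank lines."""
    match = ENTRY_RE.match(line)
    if match is None:
        return None
    message = match.group("message")
    duration = DURATION_RE.search(message)
    seconds = None
    if duration is not None:
        # Convert "MMm SSs" into a plain number of seconds.
        seconds = int(duration.group(1)) * 60 + int(duration.group(2))
    return LogEntry(
        time=match.group("time"),
        who=match.group("who"),
        message=message,
        duration_seconds=seconds,
        tasks=TASK_RE.findall(message),
    )
```

Applied to the 05:27 entry above, this yields the author `marostegui@tin`, a duration of 69 seconds, and the task list `["T191996"]`. Date headings such as `2018-04-18` do not match and return `None`.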

2018-04-17

  • 23:31 ebernhardson@tin: Synchronized wmf-config/CommonSettings-labs.php: labs config noop (duration: 01m 15s)
  • 23:17 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: T191236: Shift search traffic back to eqiad (duration: 01m 17s)
  • 23:08 gilles: Private wiki thumbnail traffic now going to eqiad T191643
  • 23:07 gilles@tin: Synchronized wmf-config/filebackend.php: Fix private wiki DC configuration: Serve private wiki thumbnails with Thumbor (T191643) (duration: 01m 18s)
  • 21:34 demon@tin: Synchronized wmf-config/CommonSettings.php: ext-dist config changes for rel1_31 (duration: 01m 16s)
  • 20:13 demon@tin: rebuilt and synchronized wikiversions files: group0 to wmf.30
  • 19:59 smalyshev@tin: Started deploy [wdqs/wdqs@f08fbcc]: GUI update
  • 19:48 demon@tin: Finished scap: bootstrap wmf.30 (duration: 112m 35s)
  • 19:01 imarlier@tin: Finished deploy [performance/navtiming@22483a4]: Navtiming refactor for increased testability, and to add wrapper for easy service use (duration: 00m 02s)
  • 19:01 imarlier@tin: Started deploy [performance/navtiming@22483a4]: Navtiming refactor for increased testability, and to add wrapper for easy service use
  • 18:52 urandom: rebooting restbase-dev1006 (kernel oom killer misbehaving)
  • 18:45 urandom: rebooting restbase-dev1005 (kernel oom killer misbehaving)
  • 18:41 urandom: rebooting restbase-dev1004 (kernel oom killer misbehaving)
  • 17:56 demon@tin: Started scap: bootstrap wmf.30
  • 17:27 ejegg: updated payments-wiki from 320a6c2600 to 4a8aada491
  • 17:16 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Enable page previews for 100% of anons for enwiki - T191101 (duration: 00m 59s)
  • 16:57 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1017 fully (duration: 01m 16s)
  • 16:37 elukey: incremental rollout of the new zookeeper jmx config to druid1* and conf*
  • 16:34 urandom: decommissioning Cassandra, restbase1010-a -- T189822
  • 16:02 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Enable page previews for 75% of anons for enwiki - T191101 (duration: 00m 58s)
  • 15:50 arturo: enable puppet in labstore1004
  • 15:37 vgutierrez: Repool (Enable BGP) on lvs3004 - T191897
  • 15:33 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2048 IP - T191193 (duration: 00m 58s)
  • 15:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db2048 IP - T191193 (duration: 00m 58s)
  • 15:23 marostegui: Stopping mysql on db2048 will break replication on codfw s1 slaves
  • 15:23 marostegui: Stop MySQL on db2048 for rack movement - T191193
  • 15:22 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067, es1017 with low load (duration: 01m 02s)
  • 14:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 after changing the network cable - T191996 (duration: 01m 02s)
  • 14:55 gehel: starting data reimport after re-image for wdqs2001 - T189192
  • 14:53 marostegui: Stop MySQL on db2042 to move it to another rack - https://phabricator.wikimedia.org/T191193
  • 14:36 ariel@tin: Finished deploy [dumps/dumps@1073d75]: more exception logging from xmlstream (duration: 00m 03s)
  • 14:36 ariel@tin: Started deploy [dumps/dumps@1073d75]: more exception logging from xmlstream
  • 14:30 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Enable page previews for 50% of anons for enwiki - T191101 (duration: 00m 58s)
  • 14:25 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/JobQueueEventBus.php: Support per-event dispatch of events, file 3/3 - T191464 (duration: 03m 07s)
  • 14:23 jynus: start es1017 reimage
  • 14:22 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/EventBus.php: Support per-event dispatch of events, file 2/3 - T191464 (duration: 03m 06s)
  • 14:16 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/extension.json: Support per-event dispatch of events, file 1/3 - T191464 (duration: 03m 00s)
  • 14:08 vgutierrez: Depool and reimage lvs3004 as stretch - T191897
  • 13:42 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/extension.json: Support per-event dispatch of events, file 1/3 - T191464 (duration: 03m 07s)
  • 13:33 moritzm: removed role::mediawiki::imagescaler from deployment-mediawiki05, per watroles the only use of that role in WMCS
  • 13:32 moritzm: removed role::mediawiki::imagescaler from deployment-prep, per watroles the only use of that role in WMCS
  • 13:30 jynus: starting backup from db1067, may generate some lag
  • 13:26 volans: updating puppet compiler facts
  • 13:25 elukey: completed migration of zookeeper on conf200[123]
  • 13:15 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 (duration: 00m 58s)
  • 13:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 to get it ready for a network cable change (duration: 00m 58s)
  • 13:00 elukey: upgrade zookeeper on conf200[123] to 3.4.9~jessie - T182924
  • 12:31 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Enable page previews for 25% of anons on enwiki, take #2 - T191101 (duration: 00m 58s)
  • 12:04 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Enable page previews for 25% of anons on enwiki - T191101 (duration: 01m 03s)
  • 10:52 ema: lvs100[63] restart pybal to apply https://gerrit.wikimedia.org/r/424553 T188062
  • 10:39 ema: lvs200[63]: restart pybal to apply https://gerrit.wikimedia.org/r/424553 T188062
  • 10:03 mobrovac@tin: Finished deploy [restbase/deploy@e463fcf]: Use keep-alive for connections to AQS (duration: 20m 17s)
  • 09:43 mobrovac@tin: Started deploy [restbase/deploy@e463fcf]: Use keep-alive for connections to AQS
  • 09:37 moritzm: reimaging mw1280, mw1281, mw1282 (API servers) to stretch
  • 09:36 moritzm: reimaging mw1266, mw1267, mw1268 (app servers) to stretch
  • 09:17 godog: restart xenon-log on mwlog* - T169249
  • 08:46 vgutierrez@neodymium: conftool action : set/pooled=yes; selector: name=chromium.wikimedia.org,service=pdns_recursor
  • 08:19 elukey: restart nrpe-server on kafka2001 (kafka check not defined)
  • 08:01 moritzm: rolling restart of HHVM on video scalers to pick up ICU security update
  • 07:42 moritzm: installing ICU security updates
  • 07:27 jynus: restarting dbstore2001
  • 07:14 moritzm: installing perl security updates on trusty
  • 06:48 vgutierrez@neodymium: conftool action : set/pooled=no; selector: name=chromium.wikimedia.org,service=pdns_recursor
  • 06:47 vgutierrez: Depool and reimage chromium as stretch - T187090
  • 06:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1114 in API - T191996 (duration: 00m 58s)
  • 05:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Give more API traffic to db1114 (duration: 00m 58s)
  • 05:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 in API (duration: 00m 58s)
  • 05:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore main traffic original weight for db1114 (duration: 00m 58s)
  • 05:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 (duration: 00m 58s)
  • 05:21 marostegui: Deploy schema change on db1092 - T187089 T185128 T153182
  • 05:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1092 for alter table (duration: 00m 58s)
  • 05:11 marostegui: Stop MySQL and reboot db1114 to boot up with the new kernel
  • 05:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 (duration: 00m 59s)
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.29) (duration: 05m 27s)
  • 01:09 krinkle@tin: Synchronized wmf-config/CommonSettings.php: no-op Ib39022 (duration: 01m 00s)
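
The page previews entries above step a feature from 25% to 50%, 75% and then 100% of anonymous users. One common way to implement such a staged rollout is deterministic bucketing, sketched below. This illustrates the general technique only; it is an assumption and not necessarily how the feature was gated on enwiki. The function name `in_rollout` and the use of a session token as input are hypothetical.

```python
import hashlib


def in_rollout(token: str, percent: int) -> bool:
    """Return True if this token falls inside the first `percent` buckets.

    The token (for example, a hypothetical anonymous session ID) is hashed
    into one of 100 buckets. The bucket is stable for a given token, so
    anyone included at 25% stays included at 50%, 75% and 100%.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent
```

Because the bucket depends only on the token, raising the percentage only ever adds users. That keeps each step of a ramp like the one logged above from switching the feature off for someone who already had it.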

2018-04-16

  • 23:57 eileen: update civicrm revision changed from b3326dbf70 to 64b26ad377, config revision is 853fcc9111
  • 21:03 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/JobQueueEventBus.php: Use the correct way of calculating the domain from the wiki, file 2/2 - T192198 (duration: 00m 58s)
  • 21:02 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/EventBus.php: Use the correct way of calculating the domain from the wiki, file 1/2 - T192198 (duration: 00m 59s)
  • 20:34 imarlier@tin: Finished deploy [performance/navtiming@64d9c90]: null deploy (duration: 00m 02s)
  • 20:33 imarlier@tin: Started deploy [performance/navtiming@64d9c90]: null deploy
  • 20:13 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/EventBus.php: Revert using the wiki of the job runner, file 2/2 (duration: 00m 58s)
  • 20:12 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/JobQueueEventBus.php: Revert using the wiki of the job runner, file 1/2 (duration: 00m 58s)
  • 19:47 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/JobQueueEventBus.php: Use the wiki set in the JobQueue when creating the event, file 2/2 - T192198 (duration: 00m 59s)
  • 19:46 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/EventBus.php: Use the wiki set in the JobQueue when creating the event, file 1/2 - T192198 (duration: 01m 00s)
  • 18:28 ottomata: temporarily stopping puppet on kafka200[123] to apply MirrorMaker --new.consumer https://gerrit.wikimedia.org/r/#/c/424344/ T190940
  • 18:03 ottomata: restarting main <-> main DC kafka mirror maker instances to blacklist job and cp topics T190940 T167039
  • 17:11 moritzm: upgraded HHVM on mediawiki-jobrunner03 to a build with a patch for the MEMC_VAL_COMPRESSION_ZLIB flag in the memcached module (T184854)
  • 15:53 akosiaris: restart hhvm on mw2252
  • 15:29 ppchelko@tin: Finished deploy [cpjobqueue/deploy@2a720fc]: Log HTML for PHP fatal errors from MW (duration: 01m 01s)
  • 15:28 ppchelko@tin: Started deploy [cpjobqueue/deploy@2a720fc]: Log HTML for PHP fatal errors from MW
  • 15:25 moritzm: upgraded HHVM on mediawiki-deployment-07 to a build with a patch for the MEMC_VAL_COMPRESSION_ZLIB flag in the memcached module (T184854)
  • 15:07 jynus: start reimage of es3-codfw master, es2017
  • 15:01 vgutierrez@neodymium: conftool action : set/pooled=yes; selector: name=hydrogen.wikimedia.org,service=pdns_recursor
  • 14:53 vgutierrez: restart pybal on lvs1003 - T187766
  • 14:49 vgutierrez: restart pybal on lvs2003 - T187766
  • 14:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1114 in API - T191996 (duration: 00m 58s)
  • 14:42 vgutierrez: restart pybal on lvs1006 - T187766
  • 14:39 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: cluster=wdqs-internal
  • 14:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Give more API traffic to db1114 (duration: 00m 57s)
  • 14:25 vgutierrez: restarting pybal on lvs2006 - T187766
  • 14:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 in API (duration: 00m 58s)
  • 14:12 moritzm: upgraded HHVM on mediawiki-deployment-09 to a build with a patch for the MEMC_VAL_COMPRESSION_ZLIB flag in the memcached module (T184854)
  • 14:06 jynus: start reimage of es2-codfw master, es2016
  • 14:05 hashar: restarted Jenkins for plugin upgrade T192261
  • 14:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore main traffic original weight for db1114 (duration: 00m 58s)
  • 13:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 (duration: 00m 58s)
  • 13:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool es1017 (duration: 00m 58s)
  • 13:31 marostegui: Stop MySQL on db1114 to reboot with another kernel - T191996
  • 13:30 godog: roll-restart swift-proxy in codfw and eqiad - T188062
  • 13:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 (duration: 00m 54s)
  • 13:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1087 (duration: 00m 59s)
  • 12:12 vgutierrez@neodymium: conftool action : set/pooled=no; selector: name=hydrogen.wikimedia.org,service=pdns_recursor
  • 12:11 vgutierrez: Depool and reimage hydrogen as stretch - T187090
  • 11:50 vgutierrez@neodymium: conftool action : set/pooled=yes; selector: name=acamar.wikimedia.org,service=pdns_recursor
  • 11:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1114 original weight (duration: 00m 59s)
  • 10:50 moritzm: reimaging mw1299 (job runner) to stretch
  • 10:23 ariel@tin: Finished deploy [dumps/dumps@4706d30]: show full stacktrace for dump job errors (duration: 00m 04s)
  • 10:23 ariel@tin: Started deploy [dumps/dumps@4706d30]: show full stacktrace for dump job errors
  • 10:18 godog: upload prometheus-memcached-exporter to stretch-wikimedia - T189056
  • 10:17 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 58s)
  • 10:16 jdrewniak@tin: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 59s)
  • 09:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Give more API traffic to db1114 (duration: 00m 58s)
  • 09:50 vgutierrez@neodymium: conftool action : set/pooled=no; selector: name=acamar.wikimedia.org,service=pdns_recursor
  • 09:49 vgutierrez: Depool and reimage acamar as stretch - T187090
  • 09:43 gehel: rolling restart of wdqs100[35] and wdqs200[123] for kernel upgrade completed
  • 09:40 jynus: restarting dbstore2001:s8 to increase the number of purge threads
  • 09:23 vgutierrez@neodymium: conftool action : set/pooled=yes; selector: name=achernar.wikimedia.org,service=pdns_recursor
  • 09:07 gehel: starting rolling restart of wdqs100[35] and wdqs200[123] for kernel upgrade
  • 09:05 moritzm: pooled mw1276-mw1278 (API app server canaries running stretch)
  • 08:49 gehel: first manual run of populate_admin() for maps[12]001 - T190605
  • 08:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1114 original main traffic weight (duration: 00m 58s)
  • 08:41 moritzm: pooled mw1261-mw1264 (app server canaries running stretch)
  • 08:29 joal@tin: Finished deploy [analytics/refinery@27416a9]: Regular weekly deploy - Mostly bugfixes from previous week huge deploy (duration: 05m 27s)
  • 08:25 _joe_: depooling mw1223 for investigation too
  • 08:23 joal@tin: Started deploy [analytics/refinery@27416a9]: Regular weekly deploy - Mostly bugfixes from previous week huge deploy
  • 08:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 in API (duration: 00m 58s)
  • 08:04 elukey: restart hhvm on mw[1228,1234,1281-1287,1289,1290,1312-1314,1317,1339,1343,1345,1346,1348] - more than 50% cpu usage, prevention scheme for current high load
  • 08:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 (duration: 00m 58s)
  • 07:49 marostegui: Stop MySQL and reboot db1114 - T191996
  • 07:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 (duration: 00m 59s)
  • 07:40 vgutierrez@neodymium: conftool action : set/pooled=no; selector: name=achernar.wikimedia.org,service=pdns_recursor
  • 07:39 vgutierrez: Depool and reimage achernar.wikimedia.org - T187090
  • 07:27 moritzm: installing perl security updates on Debian systems
  • 06:45 TimStarling: depooled mw1230
  • 06:38 _joe_: repooling mw1230
  • 06:20 marostegui: Drop table flow_subscription from x1 - T149936
  • 05:59 elukey: restart hhvm on mw[1221,1233,1280,1347] - high load
  • 05:55 elukey: repool mw1341 after investigation
  • 05:48 elukey: restart hhvm on mw1225, 1315, 1316, 1340, 1341, 1342, 1347 - high load
  • 05:42 marostegui: Reload haproxy on dbproxy1010
  • 05:36 elukey: restart hhvm on mw1226,27,32,88 - high load
  • 05:35 _joe_: depooling mw1341 to further debug the API issue
  • 05:33 marostegui: Deploy schema change on db1087 with replication (this will generate lag in labs) - T187089 T185128 T153182
  • 05:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1087 (duration: 00m 59s)
  • 03:02 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.29) (duration: 11m 09s)

2018-04-15

  • 22:09 ema: cp3037: restart varnish-be
  • 21:45 ema: cp3039: restart varnish-be
  • 21:42 elukey: restart hhvm on mw1286,1317,1339 - high load
  • 21:31 ema: cp3038: restart varnish-be
  • 21:30 ema: cp3036: restart varnish-be
  • 20:52 elukey: restart hhvm on mw13[43,45,46,48] - high load
  • 20:48 elukey: restart hhvm on mw13[12-14] - high load
  • 20:45 elukey: restart hhvm on mw[1285,1287,1289-1290] - high load
  • 20:40 _joe_: restart mw1344, high load
  • 20:38 elukey: restart hhvm on mw12[22,79,82] - high load
  • 20:32 elukey: restart hhvm on mw12[32-35] - high load
  • 20:24 elukey: restart hhvm on mw1229-31 - high load
  • 20:24 _joe_: restarted mw1280-4, high load
  • 20:17 elukey: restart hhvm on mw122[6-8] - high load
  • 20:05 elukey: restart hhvm on mw122[3,4] - high load
  • 13:42 elukey: restart hhvm on mw1227 due to high load (hhvm dump debug in /tmp/hhvm.44071.bt)
  • 10:53 elukey: powercycle mw1272 - not responsive to ssh, mgmt com2 console showing "[OK" and no tty
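
Several entries above name groups of hosts with a bracket shorthand, such as `mw13[43,45,46,48]`, `mw12[32-35]` and `mw[1285,1287,1289-1290]`. The following is a minimal sketch of how that shorthand can be expanded into individual hostnames, assuming a single bracket group holding comma-separated numbers and inclusive ranges. The helper `expand_hosts` is hypothetical. It does not handle the character-class style used elsewhere in this log (for example `druid100[123]`), and it is not the parser used by any of the tools mentioned in these entries.

```python
import re
from typing import List

# One optional bracket group: prefix[body]suffix (an assumed, simplified grammar).
GROUP_RE = re.compile(r"([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)")


def expand_hosts(expr: str) -> List[str]:
    """Expand e.g. 'mw12[32-35]' into ['mw1232', 'mw1233', 'mw1234', 'mw1235']."""
    match = GROUP_RE.fullmatch(expr)
    if match is None:
        # No bracket group: treat the expression as a single hostname.
        return [expr]
    prefix, body, suffix = match.groups()
    hosts = []
    for part in body.split(","):
        low, _, high = part.strip().partition("-")
        high = high or low  # a bare number is a range of one
        width = len(low)    # preserve zero padding, if any
        for number in range(int(low), int(high) + 1):
            hosts.append(f"{prefix}{number:0{width}d}{suffix}")
    return hosts
```

For instance, `expand_hosts("mw[1285,1287,1289-1290]")` gives the four hosts restarted in the 20:45 entry: `mw1285`, `mw1287`, `mw1289` and `mw1290`.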

2018-04-13

  • 20:44 imarlier@tin: Finished deploy [performance/navtiming@8b6ab4e]: initial attempt to deploy navtiming via scap (will not be active) (duration: 00m 02s)
  • 20:44 imarlier@tin: Started deploy [performance/navtiming@8b6ab4e]: initial attempt to deploy navtiming via scap (will not be active)
  • 20:00 demon@tin: Pruned MediaWiki: 1.31.0-wmf.28 [keeping static files] (duration: 01m 34s)
  • 19:23 demon@tin: Pruned MediaWiki: 1.31.0-wmf.25 (duration: 05m 03s)
  • 17:17 andrewbogott: upgraded packages on all labvirts and restarted nova-compute
  • 16:55 arturo: enable puppet in labstore1005
  • 16:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Give db1104 original main traffic weight (duration: 01m 00s)
  • 16:34 andrewbogott: upgrading packages on labvirt1016 and rebooting (1016 is a spare server that won't affect VPS users)
  • 16:26 arturo: disable puppet in labstore1005 to hot-test https://gerrit.wikimedia.org/r/#/c/426103/
  • 16:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Give db1104 some main traffic - T191996 (duration: 01m 00s)
  • 16:04 hashar: cleaning up lost instances in nodepool (nodepool delete XXXXX)
  • 15:50 andrewbogott: upgrading lots of packages and rebooting labservices1002 and 1002
  • 15:43 andrewbogott: restarting nodepool on labnodepool1001
  • 15:27 andrewbogott: upgrading lots of packages and rebooting labnet1001 and labnet1002 for T145919
  • 15:14 bd808: wiki replicas: added page_assessments views for frwiki & huwiki
  • 15:09 chasemp: labstore1004 stop nfs-exportd, cp export.bak to export.d, exportfs -ra (all exports were wiped out)
  • 14:59 andrewbogott: rebooting labcontrol1001
  • 14:42 andrewbogott: upgrading lots of packages on labcontrol1001 and 1002 and rebooting. T145919
  • 14:38 andrewbogott: stopping puppet and nodepool on labnodepool1001
  • 14:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1104 - T191996 (duration: 01m 07s)
  • 14:22 XioNoX: enable flow control on db1114's switch port - T191996
  • 14:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1104 - T191996 (duration: 00m 59s)
  • 14:13 andrewbogott: disabling puppet on labcontrol*, labnet*, labservices*, labvirt* before beginning T145919
  • 14:13 moritzm: installing apache security updates on contint1001
  • 14:09 andrewbogott: silencing alerts for labcontrol*, labnet*, labservices*, labvirt* before beginning T145919
  • 14:06 moritzm: uploaded ivy-debian-helper to apt.wikimedia.org/jessie (needed for zookeeper backport)
  • 13:52 elukey: roll restart druid + zookeeper daemons on druid100[123] for openjdk-7 updates
  • 13:49 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1013 with full weight (duration: 01m 00s)
  • 13:32 elukey: restart druid and zookeeper daemons on druid100[456] for openjdk-7 updates
  • 13:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1104 after alter table (duration: 01m 02s)
  • 13:19 urandom: increasing heap size to 16G -- T186751
  • 12:37 moritzm: installing apache security updates on mendelevium (otrs)
  • 12:36 moritzm: installing apache security updates on bohrium (piwik)
  • 11:58 vgutierrez@neodymium: conftool action : set/pooled=yes; selector: name=maerlant.wikimedia.org,service=pdns_recursor
  • 11:56 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1013 with low load (duration: 01m 04s)
  • 10:59 moritzm: reimaging mw1261-mw1264 to stretch (T174431)
  • 10:40 vgutierrez@neodymium: conftool action : set/pooled=no; selector: name=maerlant.wikimedia.org,service=pdns_recursor
  • 10:38 vgutierrez: Depool and reimage maerlant.wikimedia.org as stretch
  • 10:16 vgutierrez@neodymium: conftool action : set/pooled=yes; selector: name=nescio.wikimedia.org,service=pdns_recursor
  • 10:01 moritzm: installing java security updates on meitnerium/archive.wikimedia.org
  • 09:33 jynus: start reimage of es1013
  • 09:03 moritzm: reimaging mw1276-mw1278 to stretch (T174431)
  • 08:53 vgutierrez@neodymium: conftool action : set/pooled=no; selector: name=nescio.wikimedia.org,service=pdns_recursor
  • 08:52 vgutierrez: depool and reimage nescio.wikimedia.org as stretch
  • 08:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 in API - T191996 (duration: 01m 00s)
  • 08:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully depool db1114 - T191996 (duration: 01m 00s)
  • 07:58 mobrovac@tin: Started restart [electron-render/deploy@94d27d7]: Kick Electron, hanging, take 2 - T174916
  • 07:52 mobrovac@tin: Started restart [electron-render/deploy@94d27d7]: Kick Electron, hanging - T174916
  • 07:22 legoktm: restarting jenkins
  • 07:15 moritzm: pooling mw1265 and mw1279 for production traffic
  • 05:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 from main traffic - T191996 (duration: 01m 00s)
  • 05:37 marostegui: Deploy schema change on db1104 - T187089 T185128 T153182
  • 05:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1104 for alter table (duration: 01m 00s)
  • 05:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 after alter table (duration: 01m 01s)
  • 05:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3318 after alter table (duration: 01m 01s)

2018-04-12

  • 23:33 awight@tin: Finished deploy [ores/deploy@543901a]: Restore ores1001 canary to master branch (duration: 03m 24s)
  • 23:30 awight@tin: Started deploy [ores/deploy@543901a]: Restore ores1001 canary to master branch
  • 23:25 awight@tin: Finished deploy [ores/deploy@a5cec53]: Canary ores1001 only: Limited test of git-lfs for ORES (duration: 02m 31s)
  • 23:22 awight@tin: Started deploy [ores/deploy@a5cec53]: Canary ores1001 only: Limited test of git-lfs for ORES
  • 23:09 dereckson@tin: Synchronized tests/: Update PHPUnit tests to use PHPUnit\Framework\TestCase (no-op) (duration: 01m 01s)
  • 22:07 urandom: restarting Cassandra, restbase2003 -- T192112
  • 21:07 urandom: restarting Cassandra, restbase1010 -- T192112
  • 21:03 urandom: temporarily disabling puppet to make (ephemeral) change to GC settings, restbase1010 -- T192112
  • 20:37 urandom: increase change-prop sample rate in dev env to 100% (from 80) -- T186751
  • 20:34 ppchelko@tin: Finished deploy [cpjobqueue/deploy@bd772eb]: Revert switching TranslationUpdateJob T192107 (duration: 00m 39s)
  • 20:33 ppchelko@tin: Started deploy [cpjobqueue/deploy@bd772eb]: Revert switching TranslationUpdateJob T192107
  • 20:32 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch TranslateUpdateJob back to the Redis-based queue as it is using PHP serialisation - T192107 (duration: 01m 00s)
  • 20:04 XioNoX: all good, revert routing ns1 to radon
  • 19:54 ema: reboot baham for kernel upgrade T188092
  • 19:51 XioNoX: routing ns1 to radon
  • 19:46 XioNoX: all good, revert routing ns0 to baham
  • 19:41 thcipriani@tin: rebuilt and synchronized wikiversions files: All wikis to 1.31.0-wmf.29
  • 19:40 ema: reboot radon for kernel upgrade T188092
  • 19:37 XioNoX: routing ns0 to baham
  • 18:02 arlolra@tin: Finished deploy [parsoid/deploy@1807a38]: Updating Parsoid to 322b6e8 (duration: 15m 09s)
  • 17:47 arlolra@tin: Started deploy [parsoid/deploy@1807a38]: Updating Parsoid to 322b6e8
  • 17:38 herron: puppet master updates complete — re-enabling puppet agents
  • 17:35 moritzm: installing apache security updates on hafnium
  • 17:31 herron: temporarily disabling puppet agents for openssl updates and apache restarts on puppet masters
  • 17:27 moritzm: installing apache security updates on krypton
  • 17:17 moritzm: installing patch security updates on trusty
  • 16:59 urandom: increase change-prop sample rate in dev env to 80% (from 60) -- T186751
  • 16:21 marostegui: Deploy schema change on db1066 - T132416
  • 16:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 to main traffic and depool db1066 for alter table - T191996 (duration: 01m 17s)
  • 16:07 marostegui: Reboot es2013 - T191977
  • 15:27 gehel: rolling restart of elasticsearch cirrus / eqiad for jvm upgrade completed
  • 15:06 moritzm: installing django/apache security updates on labmon*
  • 15:03 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool es2013 (duration: 01m 17s)
  • 14:59 jynus: shutting down es2013's mariadb
  • 14:46 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: No-op: Clean up an unused global var for the EventBus-based JobQueue (duration: 01m 17s)
  • 14:44 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch the second bulk of low-traffic jobs for all wikis - T190327 (duration: 01m 16s)
  • 14:43 ppchelko@tin: Finished deploy [cpjobqueue/deploy@85fbd47]: Enable second bulk of low traffic jobs for all wikis T190327 (duration: 00m 35s)
  • 14:42 ppchelko@tin: Started deploy [cpjobqueue/deploy@85fbd47]: Enable second bulk of low traffic jobs for all wikis T190327
  • 14:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 from main traffic - T191996 (duration: 01m 18s)
  • 14:21 vgutierrez: Reimage lvs2006 as stretch
  • 14:11 moritzm: pooling mw1265 (app server) temporarily for production traffic
  • 14:03 urandom: increase change-prop sample rate in dev env to 60% (from 40) -- T186751
  • 13:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 into API - T191996 (duration: 01m 17s)
  • 13:47 herron: updated puppet-run script to log using syslog and updated rsyslog config to direct puppet-agent logs to /var/log/puppet.log https://gerrit.wikimedia.org/r/425538
  • 13:44 sbisson@tin: Finished deploy [tilerator/deploy@46cc948]: Deploying tilerator@i18n everywhere (duration: 02m 04s)
  • 13:44 marostegui: Deploy schema change on db1101:3318 - T187089 T185128 T153182
  • 13:42 sbisson@tin: Started deploy [tilerator/deploy@46cc948]: Deploying tilerator@i18n everywhere
  • 13:40 gehel: dropping leftover keyspace v2 and v5 on maps / eqiad - T191655
  • 13:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1101:3318 for alter table (duration: 01m 17s)
  • 13:31 moritzm: installing openssl updates
  • 13:31 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool es1013 (duration: 01m 17s)
  • 13:22 gehel: i18n maps will not be available yet, this is only preliminary work
  • 13:22 gehel: deploying maps internationalization, including new keyspace and generating new tiles - T191655
  • 13:18 zeljkof: EU SWAT finished
  • 13:15 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Page Previews for 10% enwiki anon users (T189906) (duration: 01m 18s)
  • 13:01 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1012 with full weight (duration: 01m 17s)
  • 12:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1109 after alter table (duration: 01m 17s)
  • 12:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 from API - T191996 (duration: 01m 17s)
  • 12:14 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1012 with low weight (duration: 01m 19s)
  • 12:13 marostegui: Deploy schema change on s8 dbstore1002 - T187089 T185128 T153182
  • 11:59 moritzm: pooling mw1279 for some brief test production traffic
  • 09:58 jynus: reimage es1012, take 2
  • 08:12 marostegui: Drop table linkscc from s3 codfw primary master
  • 08:11 marostegui: Drop table linkscc from s1
  • 07:55 marostegui: Drop table linkscc from s2 and s7
  • 07:50 marostegui: Drop table linkscc from s4, s5 and s6
  • 07:41 jynus: reimage es1012
  • 07:40 moritzm: enabling production traffic for mw1265
  • 07:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1078 after alter table - T190780 (duration: 01m 16s)
  • 07:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1078 for alter table - T190780 (duration: 01m 17s)
  • 06:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1077 after alter table - T190780 (duration: 01m 17s)
  • 06:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1077 for alter table - T190780 (duration: 01m 17s)
  • 06:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1072 after alter table - T190780 (duration: 01m 16s)
  • 06:42 marostegui: Deploy schema change on db1072 (sanitarium master for s3) - this will generate lag on s3 labsdb - T190780
  • 06:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1072 for alter table - T190780 (duration: 01m 18s)
  • 06:27 marostegui: Deploy schema change on s3 codfw master (db2043) - this will generate lag on s3 codfw - T190780
  • 06:24 marostegui: Deploy schema change on s1 primary master (db1052) - T190780
  • 06:11 marostegui: Deploy schema change on s7 primary master (db1062) - T190780
  • 06:08 elukey: force kill of fuse_dfs (handling /mnt/hdfs) on stat1004, apparently causing a huge load
  • 06:05 elukey: force kill of fuse_dfs (handling /mnt/hdfs) on stat1005, apparently causing a huge load
  • 05:52 marostegui: Deploy schema change on s2 primary master (db1054) - T190780
  • 05:49 marostegui: Deploy schema change on s8 primary master (db1071) - T190780
  • 05:45 marostegui: Deploy schema change on s4 primary master (db1068) - T190780
  • 05:39 marostegui: Deploy schema change on s6 primary master (db1061) - T190780
  • 05:34 marostegui: Deploy schema change on s5 primary master (db1070) - T190780
  • 05:27 marostegui: Deploy schema change on db1109 - T187089 T185128 T153182
  • 05:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1109 for alter table (duration: 01m 17s)
  • 05:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3318 after alter table (duration: 01m 18s)
  • 05:11 marostegui: Reload haproxy on dbproxy1011 to repool labsdb1009
  • 02:38 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.28) (duration: 07m 20s)
  • 01:34 eileen: civicrm revision changed from 07bade75a2 to b3326dbf70, config revision is 853fcc9111 (deploy wmffraud report)
  • 00:44 twentyafterfour: The hotfix that I deployed for phabricator: https://phabricator.wikimedia.org/rPHEX7801b519442eea2bfd47a272ba36959b487ae7d7
  • 00:33 twentyafterfour: phabricator: hotfixing DeadlineEditEngineSubtype.php
  • 00:23 twentyafterfour: phabricator is back
  • 00:18 twentyafterfour: phabricator will be offline for just a moment while I run the upgrade script.
  • 00:15 twentyafterfour: preparing to deploy phabricator rPHDEP/release/2018-04-12/1 https://phabricator.wikimedia.org/project/view/3335/
  • 00:09 mutante: jenkins-bot tests all return -1 due to operations-mw-config-php55lint failing, which says it can't clone on integration-slave-jessie-1003, which is out of disk space in /srv as reported by shinken. It's mostly all /srv/pbuilder
  • 00:08 twentyafterfour: phabricator update will begin shortly, running a bit behind due to a massive upstream merge which will have to wait until later date.
  • 00:08 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/425723/ (duration: 01m 18s)

2018-04-11

  • 23:48 ejegg: enabled new civicrm contact de-dupe job
  • 23:19 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Allow sysops to create Flow boards on euwiki (T190500) (duration: 01m 17s)
  • 23:09 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Stop logging autopatrol actions everywhere (T184485) (duration: 01m 18s)
  • 22:47 samwilson@tin: Synchronized wmf-config/InitialiseSettings.php: Deploy GlobalPreferences T184121 (duration: 01m 17s)
  • 22:47 mutante: ores2* - puppet ran to change venv config, then 'rm -rf /srv/deployment/ores/venv/' via cumin to clean-up (T181071)
  • 22:41 mutante: ores1002-1009 - deleting old venv dir - rm -f /srv/deployment/ores/venv (T181071)
  • 22:37 mutante: ores1001 - rm -rf /srv/deployment/ores/venv/
  • 22:37 mutante: ores - same for codfw instances, change of venv path to /srv/deployment/ores/deploy/venv/
  • 22:30 mutante: ores - all eqiad instances are being restarted by puppet after config change
  • 22:28 mutante: ores - running puppet on all instances to apply venv path change for T181071
  • 22:24 musikanimal@tin: Synchronized wmf-config/InitialiseSettings.php: Enabling PageAssessments on huwiki (T191697) (duration: 01m 17s)
  • 22:23 bstorm_: views updated on labsdb1009
  • 22:13 musikanimal@tin: Synchronized wmf-config/InitialiseSettings.php: Enabling PageAssessments on frwiki (T153393) (duration: 01m 26s)
  • 20:36 urandom: increase change-prop sample rate in dev env to 40% (from 20) -- T186751
  • 20:20 awight@tin: Finished deploy [ores/deploy@b6deb5d]: Transitional virtualenv for ORES (take 2), T181071 (duration: 18m 34s)
  • 20:02 thcipriani@tin: Synchronized php: group1 to 1.31.0-wmf.29 (duration: 01m 16s)
  • 20:02 awight@tin: Started deploy [ores/deploy@b6deb5d]: Transitional virtualenv for ORES (take 2), T181071
  • 20:00 thcipriani@tin: rebuilt and synchronized wikiversions files: group1 to 1.31.0-wmf.29
  • 19:23 thcipriani@tin: rebuilt and synchronized wikiversions files: Group0 to 1.31.0-wmf.29
  • 19:11 thcipriani@tin: rebuilt and synchronized wikiversions files: testwiki back to 1.31.0-wmf.29
  • 19:09 thcipriani@tin: Synchronized php-1.31.0-wmf.29/includes/libs/rdbms/database: rdbms: fix transaction flushing in Database::close T191916 (duration: 01m 01s)
  • 18:47 urandom: restarting cassandra, dev environment (set -XX:+PerfDisableSharedMem) -- T186751
  • 18:11 mutante: deploy1001 is back on stretch once again - it has been removed from scap hosts though (T175288 T185275)
  • 17:40 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Deploy page previews for anons on dewiki T191966 (duration: 00m 54s)
  • 17:30 sbisson@tin: Finished deploy [kartotherian/deploy@4cd5a19]: Deploying kartotherian v0.0.38 everywhere (duration: 02m 27s)
  • 17:29 Krinkle: actually re-enabled puppet on graphite2001
  • 17:28 sbisson@tin: Started deploy [kartotherian/deploy@4cd5a19]: Deploying kartotherian v0.0.38 everywhere
  • 17:24 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RemexHtml on wikis with <50 issues in high priority linter cats T190731 (duration: 00m 59s)
  • 16:53 sbisson@tin: Finished deploy [kartotherian/deploy@4cd5a19]: Deploying kartotherian v0.0.38 to maps-test* (duration: 01m 16s)
  • 16:51 sbisson@tin: Started deploy [kartotherian/deploy@4cd5a19]: Deploying kartotherian v0.0.38 to maps-test*
  • 16:44 elukey: restart hadoop hdfs namenodes on analytics100[12] to pick up HDFS Trash settings - T189051
  • 16:35 robh: cp2018 returned to service
  • 16:33 foks: See T191887
  • 16:24 robh: cp2011 returned to service
  • 16:23 marostegui: Reload haproxy on dbproxy1011 to depool labsdb1009
  • 16:14 elukey: reboot notebook1001 for kernel updates
  • 16:11 urandom: restarting cassandra, dev environment (testing default GC settings) -- T186751
  • 15:58 Krinkle: Re-enabled puppet and coal on graphite2001
  • 15:43 robh: cp2008 repooled after memory swap
  • 15:20 Krinkle: disabling coal service on graphite2001 and disabling puppet – T191239
  • 15:19 jynus: fixing grant issue on db1114
  • 15:14 ema: restart pybal on lvs1003 for logstash-{json,syslog} UDP monitoring config changes https://gerrit.wikimedia.org/r/#/c/425253/
  • 15:08 ema: restart pybal on lvs1006 for logstash-{json,syslog} UDP monitoring config changes https://gerrit.wikimedia.org/r/#/c/425253/
  • 15:06 robh: shutting down cp2008, cp2011, and cp2018 for onsite work
  • 15:01 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool es1012 (duration: 01m 00s)
  • 15:01 marlier: Stopping coal on graphite2001.codfw.wmnet for data replay
  • 14:54 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2013 (duration: 01m 00s)
  • 14:54 gehel: starting rolling restart of elasticsearch cirrus / eqiad for jvm upgrade
  • 14:39 moritzm: rolling restart of restbase in eqiad to pick up openssl update
  • 14:38 Krinkle: Turned regular coal back on (T191239)
  • 14:37 ppchelko@tin: Finished deploy [cpjobqueue/deploy@a090a3c]: Fix the low priority jobs topic names (duration: 00m 38s)
  • 14:36 ppchelko@tin: Started deploy [cpjobqueue/deploy@a090a3c]: Fix the low priority jobs topic names
  • 14:15 jynus: start reimage of es2013
  • 14:14 marostegui: Deploy schema change on db1099:3318 - T187089 T185128 T153182
  • 14:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3318 for alter table (duration: 01m 00s)
  • 14:12 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool es2013 (duration: 01m 00s)
  • 13:44 ppchelko@tin: Finished deploy [cpjobqueue/deploy@3ba6580]: Enable second bulk of low-traffic jobs T190327 take 2 (duration: 00m 49s)
  • 13:44 ppchelko@tin: Started deploy [cpjobqueue/deploy@3ba6580]: Enable second bulk of low-traffic jobs T190327 take 2
  • 13:41 ppchelko@tin: Finished deploy [cpjobqueue/deploy@2b59313]: Enable second bulk of low-traffic jobs T190327 (duration: 08m 27s)
  • 13:37 moritzm: rolling restart of restbase in codfw to pick up openssl update
  • 13:33 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch a bulk of low-traffic jobs to EventBus for testwikis, file 2/2 - T190327 (duration: 01m 00s)
  • 13:32 ppchelko@tin: Started deploy [cpjobqueue/deploy@2b59313]: Enable second bulk of low-traffic jobs T190327
  • 13:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1072 (duration: 01m 07s)
  • 13:31 ppchelko@tin: Started deploy [cpjobqueue/deploy@2b59313]: Enable second bulk of low-traffic jobs T190327
  • 13:27 marostegui: Drop prefstats table on s3 sanitarium master (db1072) this might cause lag on labs - T154490
  • 13:26 moritzm: installing java security updates on kafka/main cluster
  • 13:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1072 (duration: 01m 00s)
  • 13:13 marostegui: Drop prefstats table on s1 codfw master - db2048 (this might generate lag on codfw) - T154490
  • 13:12 elukey: restart kafka brokers on kafka1012->23 for openjdk-7 upgrades
  • 13:09 marostegui: Drop prefstats table on s3 codfw master - db2043 (this might generate lag on codfw) - T154490
  • 13:01 vgutierrez: Reimage lvs4007 as stretch
  • 13:00 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2012 (duration: 01m 00s)
  • 12:39 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch a bulk of low-traffic jobs to EventBus for testwikis, file 1/2 (retry #2) (duration: 01m 01s)
  • 12:32 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch a bulk of low-traffic jobs to EventBus for testwikis, file 1/2 (retry) - T190327 (duration: 01m 00s)
  • 12:21 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch a bulk of low-traffic jobs to EventBus for testwikis, file 1/2 - T190327 (duration: 01m 01s)
  • 12:21 moritzm: enable production traffic for mw1265 (stretch app server) for a brief test period
  • 12:09 jynus: start reimage of es2012
  • 12:05 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2011, depool es2012 (duration: 01m 01s)
  • 11:47 jynus: start reimage of es2011
  • 11:09 ema: start pybal on lvs5001, test completed on lvs5003
  • 11:04 marostegui: Drop table prefstats in s7 - T154490
  • 10:59 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2015, depool es2011 (duration: 00m 59s)
  • 10:56 ema: stop pybal on lvs5001 to test requests through lvs5003, reimaged as stretch T191897
  • 10:50 moritzm: installing openssl updates
  • 10:43 marostegui: Drop table prefstats in s2 - T154490
  • 10:33 marostegui: Drop table prefstats in s4 - T154490
  • 10:31 marostegui: Drop table prefstats in s6 - T154490
  • 10:28 marostegui: Drop table prefstats in s5 - T154490
  • 10:04 jynus: start reimage of es2015
  • 10:00 moritzm: installing java security updates on kafka/jumbo cluster
  • 09:57 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2014, depool es2015 (duration: 01m 02s)
  • 09:52 moritzm: installing java security updates on kafka/analytics cluster
  • 09:29 arturo: doing some testing in labtestvirt2001 mounting instance's qcow2 files into /home/aborrero/mnt
  • 09:17 jynus: start reimage of es2014
  • 09:08 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool es2014 (duration: 01m 03s)
  • 09:03 ema: restart pybal on lvs1003 for UDP monitoring config changes https://gerrit.wikimedia.org/r/#/c/425251/
  • 08:59 moritzm: reimaging mw1265 to stretch (T174431)
  • 08:18 jynus: rerunning eqiad misc backups
  • 08:03 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2069 as candidate master for x1 - T191275 (duration: 01m 03s)
  • 07:45 ema: cp2022: restart varnish-be due to child process crash https://phabricator.wikimedia.org/P6979 T191229
  • 07:27 marostegui: Stop MySQL on db2033 to copy its data away before reimaging - T191275
  • 07:08 vgutierrez: Reimaging lvs5003.eqsin as stretch (2nd attempt)
  • 06:49 elukey: restart Yarn Resource Manager daemons on analytics100[12] to pick up the new Prometheus configuration file
  • 06:20 marostegui: Stop MySQL on db2033 to clone db2069 - T191275
  • 06:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add db2069 to the config as depooled x1 slave - T191275 (duration: 01m 03s)
  • 06:15 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add db2069 to the config as depooled x1 slave - T191275 (duration: 01m 01s)
  • 05:28 Krinkle: manual coal back-fill still running with the normal coal disabled via systemd. Will restore normal coal when I wake up.
  • 05:22 marostegui: Deploy schema change on codfw s8 master (db2045) with replication enabled (this will generate lag on codfw) - T187089 T185128 T153182
  • 05:17 marostegui: Reload haproxy on dbproxy1010 to repool labsdb1011
  • 02:36 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.28) (duration: 05m 41s)
  • 00:12 bstorm_: Updated views and indexes on labsdb1011

2018-04-10

  • 23:32 XioNoX: depooled eqsin due to router issue
  • 23:04 Krinkle: Seemingly from 22:53 - 23:03 global traffic dropped by 30-60%, presumably due to issues in eqiad where 10 Gbits dropped to 3 Gbits sharper than ever before.
  • 22:49 joal@tin: Finished deploy [analytics/refinery@33448cd]: Deploying fixes after today's deploy errors (duration: 04m 46s)
  • 22:45 joal@tin: Started deploy [analytics/refinery@33448cd]: Deploying fixes after today's deploy errors
  • 21:18 sbisson@tin: Finished deploy [kartotherian/deploy@8f3a903]: Rollback kartotherian to v0.0.35 (duration: 06m 27s)
  • 21:12 sbisson@tin: Started deploy [kartotherian/deploy@8f3a903]: Rollback kartotherian to v0.0.35
  • 20:41 sbisson@tin: Finished deploy [kartotherian/deploy@bdf70ed]: Deploying kartotherian pre-i18n everywhere (downgrade snapshot) (duration: 03m 45s)
  • 20:37 sbisson@tin: Started deploy [kartotherian/deploy@bdf70ed]: Deploying kartotherian pre-i18n everywhere (downgrade snapshot)
  • 20:30 mutante: deploy1001 - reinstalled with stretch - re-adding to puppet (T175288)
  • 20:30 mutante: deploy1001 - reinstalled with jessie - re-adding to puppet (T175288)
  • 20:13 urandom: increasing change-prop sample rate to 20% (from 10) in dev environment -- T186751
  • 20:06 thcipriani@tin: rebuilt and synchronized wikiversions files: testwiki back to 1.31.0-wmf.28
  • 20:02 sbisson@tin: Finished deploy [kartotherian/deploy@6e4d666]: Deploying kartotherian pre-i18n everywhere (duration: 04m 34s)
  • 19:58 sbisson@tin: Started deploy [kartotherian/deploy@6e4d666]: Deploying kartotherian pre-i18n everywhere
  • 19:57 sbisson@tin: Finished deploy [tilerator/deploy@3326c14]: Deploying tilerator pre-i18n everywhere (duration: 00m 48s)
  • 19:56 sbisson@tin: Started deploy [tilerator/deploy@3326c14]: Deploying tilerator pre-i18n everywhere
  • 19:48 sbisson@tin: Finished deploy [tilerator/deploy@3326c14]: Deploying tilerator pre-i18n to maps-test* (duration: 00m 27s)
  • 19:48 sbisson@tin: Started deploy [tilerator/deploy@3326c14]: Deploying tilerator pre-i18n to maps-test*
  • 19:16 thcipriani@tin: Finished scap: testwiki to php-1.31.0-wmf.29 and rebuild l10n cache (duration: 66m 28s)
  • 18:10 thcipriani@tin: Started scap: testwiki to php-1.31.0-wmf.29 and rebuild l10n cache
  • 18:07 Krinkle: Stopping coal on graphite1001 to manually repopulate for T191239
  • 18:04 otto@tin: Finished deploy [analytics/refinery@b8ea97f]: refinery 0.0.60 - take 3 (duration: 04m 54s)
  • 17:59 otto@tin: Started deploy [analytics/refinery@b8ea97f]: refinery 0.0.60 - take 3
  • 17:58 otto@tin: Finished deploy [analytics/refinery@b8ea97f]: refinery 0.0.60 - take 2 (duration: 01m 50s)
  • 17:56 otto@tin: Started deploy [analytics/refinery@b8ea97f]: refinery 0.0.60 - take 2
  • 17:56 otto@tin: Started deploy [analytics/refinery@b8ea97f]: refinery 0.0.60 - take 2^
  • 17:49 joal@tin: Finished deploy [analytics/refinery@b8ea97f]: Analytics weekly deploy - Move to spark 2 (duration: 03m 55s)
  • 17:48 joal@tin: (no justification provided)
  • 17:47 joal@tin: (no justification provided)
  • 17:45 joal@tin: Started deploy [analytics/refinery@b8ea97f]: Analytics weekly deploy - Move to spark 2
  • 17:43 chasemp: add static route to neutron poc instance range for codfw 172.16.128.0/21
  • 17:22 papaul: shutting down cp2022 for main board replacement
  • 17:20 awight@tin: Finished deploy [ores/deploy@d35a1e6]: Test deploy virtualenv on ores1001, with logging and forced failure (duration: 02m 44s)
  • 17:17 awight@tin: Started deploy [ores/deploy@d35a1e6]: Test deploy virtualenv on ores1001, with logging and forced failure
  • 17:07 awight@tin: Finished deploy [ores/deploy@1e18fa6]: Test deploy virtualenv on ores1001, with logging (duration: 02m 28s)
  • 17:05 awight@tin: Started deploy [ores/deploy@1e18fa6]: Test deploy virtualenv on ores1001, with logging
  • 16:57 thcipriani: starting branch cut of 1.31.0-wmf.29
  • 16:45 andrew@tin: Synchronized wmf-config/CommonSettings.php: disable new accounts on labtestwikitech (duration: 01m 00s)
  • 16:26 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2045 IP as it is being moved to another rack - T191193 (duration: 00m 59s)
  • 16:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db2045 IP as it is being moved to another rack - T191193 (duration: 00m 59s)
  • 16:21 marostegui: Reload haproxy on dbproxy1010 to depool labsdb1011
  • 16:11 marostegui: Stop MySQL on db2045 (s8 codfw master) to move it to another rack, this will break replication on codfw - T191193
  • 16:07 bstorm_: labsdb1010 now has the latest views available, including the comment table
  • 16:05 marostegui: Reload haproxy on dbproxy1010 to repool labsdb1010
  • 15:42 ottomata: disable puppet on analytics1003 and stop camus crons in preparation for spark 2 upgrade
  • 15:32 marostegui: Reload haproxy on dbproxy1010 to depool labsdb1010
  • 15:26 vgutierrez: Reimage lvs5003 as stretch
  • 15:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db2040 IP as it is being moved to another rack - T191193 (duration: 00m 59s)
  • 15:21 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2040 IP as it is being moved to another rack - T191193 (duration: 00m 59s)
  • 15:08 volans: restarting Icinga on einsteinium, command file not working
  • 15:06 bd808: Wiki replicas: ran `sudo maintain-views --table page_assessments --database arwiki` on all 3 servers for T191455
  • 14:46 marostegui: Stop MySQL on db2040 for server move - this is s7 master, so replication will break in codfw T191193
  • 14:23 volans: restarted nsca server on einsteinium
  • 14:21 vgutierrez: re-enable puppet on primary LVS
  • 14:17 moritzm: installing python-crypto security updates on trusty
  • 13:55 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: T188198 Enable TemplateStyles on ruwiki (duration: 01m 00s)
  • 13:51 vgutierrez: disable puppet on primary LVS to merge safely gerrit/425040 T177961
  • 13:47 zfilipin@tin: Synchronized php-1.31.0-wmf.28/extensions/AbuseFilter/: SWAT: Restore subtract method for backward compatibility (T191696) (duration: 01m 01s)
  • 13:41 moritzm: upgraded HHVM on mediawiki-deployment04/05/06 to a build with a patch for the MEMC_VAL_COMPRESSION_ZLIB flag in the memcached module (T184854)
  • 13:35 elukey: restart kafka on kafka-jumbo1001 for openjdk upgrades
  • 13:32 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Update wikis with consolidate editing feedback" (T168886) (duration: 00m 59s)
  • 13:29 zfilipin@tin: Synchronized php-1.31.0-wmf.28/extensions/AbuseFilter/: SWAT: Disable search for global filters (T191539) (duration: 01m 01s)
  • 13:19 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update wikis with consolidate editing feedback (T168886) (duration: 01m 00s)
  • 13:19 ema: restart pybal on lvs1006 for config changes introduced by https://gerrit.wikimedia.org/r/#/c/425251/
  • 12:02 moritzm: upgrading naos and wasat to ICU57-enabled build of HHVM
  • 12:01 _joe_: uploading mcrouter 0.37.0 to stretch-wikimedia (T190979)
  • 11:59 _joe_: uploading mcrouter 0.37.0 to jessie-wikimedia (T190979)
  • 11:15 mobrovac@tin: Finished deploy [restbase/deploy@29df9db]: Use the MCS-provided content-type in the definition response - T191809 (duration: 24m 19s)
  • 11:07 moritzm: upgrading mwdebug servers in codfw to ICU57-enabled build of HHVM
  • 10:51 mobrovac@tin: Started deploy [restbase/deploy@29df9db]: Use the MCS-provided content-type in the definition response - T191809
  • 10:47 arturo: T188266 reimage labtestservices2002.wikimedia.org
  • 10:23 moritzm: upgrading job runners in codfw to ICU57-enabled build of HHVM
  • 09:29 moritzm: upgrading app servers in codfw to ICU57-enabled build of HHVM
  • 07:52 hoo: Updated operations/dumps/dcat (7ea4e75c..61154ca4) on snapshot1007
  • 07:37 moritzm: upgrading API servers in codfw to ICU57-enabled build of HHVM
  • 05:49 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db2069 from config - T191275 (duration: 00m 58s)
  • 05:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db2069 from config - T191275 (duration: 00m 59s)
  • 05:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 after alter table (duration: 01m 11s)
  • 05:17 marostegui: Deploy alter table on s1 primary master (db1052) - T185128 T153182
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.28) (duration: 05m 39s)

2018-04-09

  • 21:11 XioNoX: cr1-eqsin 24h experiment on applying same local-pref to peers and transits - T186835
  • 20:48 arlolra: Updated Parsoid to edeeb60 (T191281, T187386, T185266)
  • 20:38 awight@tin: Finished deploy [ores/deploy@be69c1d]: Transitional virtualenv for ORES, T181071 (duration: 24m 14s)
  • 20:32 arlolra@tin: Finished deploy [parsoid/deploy@447fab2]: Updating Parsoid to edeeb60 (duration: 11m 03s)
  • 20:21 arlolra@tin: Started deploy [parsoid/deploy@447fab2]: Updating Parsoid to edeeb60
  • 20:14 awight@tin: Started deploy [ores/deploy@be69c1d]: Transitional virtualenv for ORES, T181071
  • 20:12 awight@tin: Finished deploy [ores/deploy@b61c338]: Transitional virtualenv for ORES, T181071 (duration: 00m 19s)
  • 20:12 awight@tin: Started deploy [ores/deploy@b61c338]: Transitional virtualenv for ORES, T181071
  • 20:01 herron: repooled rhodium (puppet master backend) https://gerrit.wikimedia.org/r/425078
  • 19:57 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2017.codfw.wmnet
  • 19:26 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Switch SET on frwiktionary to use wikitexteditor by default (T169741) (duration: 01m 00s)
  • 19:17 sbisson@tin: Finished deploy [kartotherian/deploy@a26712b]: Deploying kartotherian i18n to maps-test* (with updated source and style) (duration: 01m 46s)
  • 19:15 sbisson@tin: Started deploy [kartotherian/deploy@a26712b]: Deploying kartotherian i18n to maps-test* (with updated source and style)
  • 18:58 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2006.codfw.wmnet
  • 18:58 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2010.codfw.wmnet
  • 18:58 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable PageAssessments on arwiki (T185023) (duration: 01m 00s)
  • 18:50 papaul: shutting down cp2017 for memory replacement
  • 18:37 papaul: shutting down cp2010 for memory replacement
  • 18:21 papaul: shutting down cp2006 for memory replacement
  • 18:04 gehel@tin: Finished deploy [wdqs/wdqs@7116a56]: new GUI version (duration: 02m 11s)
  • 18:01 gehel@tin: Started deploy [wdqs/wdqs@7116a56]: new GUI version
  • 17:58 papaul: shutting down cp2022 for memory replacement
  • 16:53 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2017.codfw.wmnet
  • 16:53 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2010.codfw.wmnet
  • 16:52 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2006.codfw.wmnet
  • 15:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1080 after alter table, kernel and mariadb upgrade (duration: 00m 59s)
  • 14:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1080 after alter table, kernel and mariadb upgrade (duration: 00m 59s)
  • 14:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1080 after alter table, kernel and mariadb upgrade (duration: 00m 59s)
  • 14:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1080 after alter table, kernel and mariadb upgrade (duration: 00m 59s)
  • 14:28 dereckson@tin: Synchronized wmf-config/flaggedrevs.php: Always show latest revision even if not reviewed on hu.wikipedia (T121995) (duration: 00m 59s)
  • 14:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1080 after alter table, kernel and mariadb upgrade (duration: 00m 59s)
  • 14:11 marostegui: Deploy schema change on db1067 - T187089 T185128 T153182
  • 14:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 for alter table (duration: 00m 59s)
  • 14:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1080 after alter table, kernel and mariadb upgrade (duration: 00m 59s)
  • 13:52 marostegui@tin: Synchronized wmf-config/db-codfw.php: Pool db2092 in s1 T170662 (duration: 00m 59s)
  • 13:49 zeljkof: EU SWAT finished
  • 13:48 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RelatedArticles for vector at hewiki (T191573) (duration: 00m 59s)
  • 13:43 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add adm.dp.gov.ua to wgCopyUploadDomains, change if.gov.ua to www.if.gov.ua (T191692) (duration: 00m 59s)
  • 13:37 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Fix broken line that includes a group into a group by mistake (T191719) (duration: 00m 59s)
  • 13:29 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable <mapframe> on ku.wikipedia (T190944) (duration: 00m 57s)
  • 13:14 moritzm: upgrading Boost libraries on mwdebug with an ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 13:14 _joe_: started updateCollation.php maintenance script for the ICU 57 migration (T189295)
  • 13:03 marostegui: Stop MySQL on db1080 for mariadb and kernel upgrade
  • 13:03 _joe_: upgrading HHVM / libboost for ICU 57 upgrade (T189295)
  • 13:01 sbisson@tin: Finished deploy [tilerator/deploy@aef010b]: Deploying tilerator i18n to maps-test* (with updated source and style) (duration: 00m 33s)
  • 13:00 sbisson@tin: Started deploy [tilerator/deploy@aef010b]: Deploying tilerator i18n to maps-test* (with updated source and style)
  • 12:54 moritzm: upgrading Boost libraries on mwdebug with an ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 12:39 moritzm: upgrading Boost libraries on job runners with an ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 12:23 _joe_: preparing to run updateCollation from mw1338: stop videoscaler, disable puppet (T189295)
  • 12:05 _joe_: upgrading boost, hhvm on terbium for ICU 57 upgrade (T189295)
  • 12:01 elukey: upgrading Boost libraries on all mediawiki eqiad API servers with an ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 11:50 moritzm: upgrading Boost libraries on remaining app servers with an ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 11:42 moritzm: removed profile::beta::icu57 from deployment-prep Hiera config now that the component is part of the standard app server manifests
  • 11:04 moritzm: upgrading Boost libraries on API server canaries with an ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 10:41 moritzm: upgrading Boost libraries on mw1300 with an ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 10:31 moritzm: upgrading Boost libraries on app server canaries with an ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 10:15 moritzm: upgrading tin/deploy1001 to an ICU 57-enabled HHVM build (T189295)
  • 10:13 elukey: completed upgrade of mw eqiad api appservers to ICU 57-enabled HHVM
  • 10:10 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 59s)
  • 10:09 jdrewniak@tin: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 59s)
  • 09:54 moritzm: upgrading mwdebug servers in eqiad to ICU 57-enabled HHVM build (T189295)
  • 09:33 _joe_: all eqiad jobrunners migrated to ICU 57 (T189295)
  • 09:03 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add db2092 to the config - T170662 (duration: 00m 59s)
  • 09:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add db2092 to the config - T170662 (duration: 00m 58s)
  • 08:45 elukey: upgrading eqiad api appservers to ICU 57-enabled HHVM build (T189295)
  • 08:37 marostegui: Deploy schema change on db1080 - T187089 T185128 T153182
  • 08:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1080 for alter table (duration: 00m 59s)
  • 08:35 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2019 (duration: 00m 59s)
  • 08:32 moritzm: upgrading remaining app servers in eqiad to ICU 57-enabled HHVM build (T189295)
  • 08:32 _joe_: upgrading eqiad jobrunners to ICU 57-enabled HHVM build (T189295)
  • 08:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 after alter table (duration: 00m 58s)
  • 07:56 marostegui: Remove /var/log/wikidata/rebuildTermSqlIndex.log* as per Amir1's request
  • 07:48 moritzm: upgrading mw1276-1279 (API canaries) to ICU 57-enabled HHVM build (T189295)
  • 07:42 _joe_: repooling mw1300 now with ICU 57-enabled HHVM build (T189295)
  • 07:38 _joe_: upgrading mw1300 to ICU 57-enabled HHVM build (T189295)
  • 07:32 moritzm: upgrading mw1262-1265 to ICU 57-enabled HHVM build (T189295)
  • 07:24 moritzm: repooling mw1261 after upgrade to ICU 57-enabled HHVM build (T189295)
  • 07:17 moritzm: upgrading mw1261 to ICU 57-enabled HHVM build (T189295)
  • 07:09 elukey: upgrade burrow to 1.0 on kafkamon[12]* - T188719
  • 06:58 Amir1: start of ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=zhwiktionary --check-old --before 20180223210426 --sleep 2 (T184485)
  • 06:43 marostegui: Reboot db2072 for kernel upgrade
  • 06:41 marostegui: Stop MySQL on db2072 to clone db2092 from it - T170662
  • 06:38 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2072 - T170662 (duration: 00m 59s)
  • 06:24 elukey: upgrade burrow 1.0.0 to stretch/jessie wikimedia
  • 06:21 marostegui: Reboot db2092 for mariadb and kernel upgrade
  • 06:04 marostegui@tin: Synchronized wmf-config/db-codfw.php: db2079 is now s8 candidate master (duration: 00m 59s)
  • 05:54 marostegui: Stop MySQL on db2079 to change its binlog format
  • 05:34 marostegui: Deploy schema change on db1106 with replication enabled (this will generate lag on labs replicas) - T187089 T185128 T153182
  • 05:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 for alter table (duration: 01m 00s)
  • 02:36 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.28) (duration: 05m 57s)

2018-04-07

  • 23:44 Dereckson: OATHAuth disabled for Wikimedia SUL global account Barek (T191708)
  • 07:28 legoktm: disabled and cleaned up spam from @Farjksn on Phabricator
  • 00:14 mutante: bromine - scheduled downtime, reboot for reinstall, upgrade to stretch, misc_static_services switched to codfw (T188163)

2018-04-06

  • 22:35 mutante: rsyncing bugzilla-static raw html from eqiad to codfw VM
  • 19:59 herron: moved rhodium:/var/lib/git/operations/puppet away and triggered puppet agent run to re-create
  • 19:43 ottomata: running puppet-merge on rhodium after clash between puppet-merge and new patch submitted
  • 19:23 demon@tin: Finished scap: Forcing full scap. Mostly no-op, consistency, paranoia, that sort of thing (duration: 11m 51s)
  • 19:13 bd808: wiki replicas: ran maintain-views --database mediawikiwiki --clean on labsdb10{09,10,11} for T191387
  • 19:11 demon@tin: Started scap: Forcing full scap. Mostly no-op, consistency, paranoia, that sort of thing
  • 19:02 demon@tin: scap aborted: Forcing full scap, removed clean plugin updates (duration: 11m 03s)
  • 19:00 herron: depooled rhodium (puppet master backend) again https://gerrit.wikimedia.org/r/#/c/424646/
  • 18:51 demon@tin: Started scap: Forcing full scap, removed clean plugin updates
  • 18:49 demon@tin: scap failed: average error rate on 5/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 18:47 demon@tin: Pruned MediaWiki: 1.31.0-wmf.26 [keeping static files] (duration: 01m 51s)
  • 14:37 herron: repooled rhodium (puppet master backend)
  • 14:08 herron: upgraded apache on fermium for security updates
  • 14:07 anomie: Running populateArchiveRevId.php for group2 for T191307
  • 14:03 herron: apache updated on puppet masters — re-enabling puppet agents
  • 13:55 herron: temporarily disabling puppet agents for apache security update on puppet masters
  • 13:14 moritzm: installing apache security updates on thorium (running several analytics web services)
  • 12:38 moritzm: installing apache security updates on the Kibana nodes of the logstash cluster
  • 11:50 Amir1: start of ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=fawiki --before 20180223210426 --sleep 2 (T184485)
  • 10:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1114 (duration: 01m 00s)
  • 09:45 moritzm: installing apache security updates on graphite hosts
  • 09:39 marostegui: Deploy test alter table on db2038 to test osc_host.py in core
  • 09:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1114 (duration: 00m 59s)
  • 09:24 moritzm: installing apache security updates on planet1001/planet.wikimedia.org
  • 09:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1114 (duration: 00m 59s)
  • 08:57 no_justification: gerrit: restarting services to pick up openjdk updates
  • 08:50 moritzm: installing apache security updates on prometheus hosts
  • 08:45 no_justification: installed apache updates to gerrit2001/cobalt
  • 08:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1114 (duration: 00m 59s)
  • 08:41 moritzm: installing apache security updates on mwlog*
  • 08:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1114 (duration: 00m 59s)
  • 08:28 moritzm: installing apache security updates on releases.wikimedia.org
  • 08:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 (duration: 00m 59s)
  • 08:07 elukey: upgrade prometheus-burrow-exporter on kafkamon1001/2001 - T188719
  • 08:07 elukey: upload prometheus-burrow-exporter 0.0.5 to jessie/stretch-wikimedia - T188719
  • 08:00 marostegui: Stop MySQL on db1114 for kernel and mariadb upgrade
  • 07:40 moritzm: removed mediawiki-deployment07 from deployment-prep (T191578)
  • 07:26 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2047 after changing binlog format, upgrade mariadb and kernel (duration: 00m 59s)
  • 06:33 marostegui: Stop MySQL on db2047 for binlog format change, upgrade kernel and mariadb
  • 06:32 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2047 to change binlog format, upgrade mariadb and kernel (duration: 00m 59s)
  • 06:05 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2046 as candidate master (duration: 00m 59s)
  • 05:59 marostegui: Restart MySQL on db2046 to change its binlog format - T191275
  • 05:44 marostegui: Deploy schema change on db1114 - T187089 T185128 T153182
  • 05:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 for alter table (duration: 00m 53s)
  • 05:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 after alter table (duration: 00m 55s)

2018-04-05

  • 21:44 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2017.codfw.wmnet
  • 21:44 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2002.codfw.wmnet
  • 21:43 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2006.codfw.wmnet
  • 21:43 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2010.codfw.wmnet
  • 21:34 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2008.codfw.wmnet
  • 21:09 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp2008.codfw.wmnet,service=varnish-be
  • 20:10 twentyafterfour@tin: Synchronized php-1.31.0-wmf.28/extensions/Echo/: Sync https://gerrit.wikimedia.org/r/#/c/424379/ refs T183967 (duration: 01m 05s)
  • 20:07 twentyafterfour: deploying https://gerrit.wikimedia.org/r/#/c/424379/ refs T191335
  • 19:59 herron: added rhodium puppet master backend in offline mode
  • 19:52 twentyafterfour@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.28 refs T183967
  • 19:51 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp2008.codfw.wmnet
  • 18:45 catrope@tin: Synchronized wmf-config/Wikibase-production.php: Disable writing wb_terms search fields on Wikidata (T189777) (duration: 01m 16s)
  • 18:25 catrope@tin: Synchronized php-1.31.0-wmf.28/extensions/AbuseFilter/includes/Views/AbuseFilterViewList.php: Unbreak Special:AbuseFilter (T191512) (duration: 01m 17s)
  • 18:08 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Disable logging autopatrol actions on commonswiki (T184485) (duration: 01m 17s)
  • 17:56 Amir1: start of ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=commonswiki --before 20180223210426 --from-id 156008475 (T184485)
  • 17:42 Amir1: finished the script
  • 17:33 Amir1: start of ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=commonswiki --before 20180223210426 (T184485)
  • 17:18 bsitzmann@tin: Finished deploy [mobileapps/deploy@eed7961]: Update mobileapps to dbc0687 (T187430) (duration: 09m 45s)
  • 17:09 bsitzmann@tin: Started deploy [mobileapps/deploy@eed7961]: Update mobileapps to dbc0687 (T187430)
  • 16:41 robh: cp2008 shutting down for firmware updates
  • 16:09 vgutierrez: updating librdkafka1 to 0.11.3 on cache text
  • 15:54 vgutierrez: updating librdkafka1 to 0.11.3 on cache upload
  • 15:44 vgutierrez: updating librdkafka1 to 0.11.3 on cache misc
  • 15:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2039 IP as it is being moved to a different rack - T191193 (duration: 01m 17s)
  • 15:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db2039 IP as it is being moved to a different rack - T191193 (duration: 01m 17s)
  • 15:26 vgutierrez: uploaded pybal 1.15.3 for stretch on apt.w.o
  • 15:17 jynus: stopping mariadb on db2039 T191193
  • 14:59 moritzm: installing apache security updates
  • 14:54 marostegui: Deploy schema change on db1066 - T187089 T185128 T153182
  • 14:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 for alter table (duration: 01m 17s)
  • 14:43 moritzm: uploaded apache2 2.4.10-10+deb8u12+wmf1 to apt.wikimedia.org/jessie-wikimedia (rebase of our local patches against the latest DSA)
  • 14:38 marostegui@tin: Synchronized wmf-config/db-codfw.php: db2053 is no longer a candidate master (duration: 01m 17s)
  • 14:03 andrew@tin: Finished deploy [horizon/deploy@cd1cda6]: Deploying potential fix for T191232 (duration: 03m 17s)
  • 14:00 andrew@tin: Started deploy [horizon/deploy@cd1cda6]: Deploying potential fix for T191232
  • 13:41 anomie: Running populateArchiveRevId.php on group 1 for T191307
  • 13:39 zeljkof: EU SWAT finished
  • 13:32 zfilipin@tin: Synchronized php-1.31.0-wmf.28/extensions/AbuseFilter: SWAT: Make $mode optional for checkAllFilters (T191468) (duration: 01m 20s)
  • 13:23 marostegui: Stop MySQL on db2053 for binlog format change
  • 13:09 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Stop logging autopatrol actions in wikidatawiki (T184485) (duration: 01m 16s)
  • 12:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1083 after alter table (duration: 01m 17s)
  • 12:52 Amir1: finished the script
  • 12:41 Amir1: start of ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=wikidatawiki --before 20180223210426 (T189596)
  • 12:30 moritzm: installing net-snmp security updates on jessie (stretch not affected)
  • 12:12 ariel@tin: Finished deploy [dumps/dumps@88ca17c]: fix monitor to import status module after refactor (duration: 00m 04s)
  • 12:12 ariel@tin: Started deploy [dumps/dumps@88ca17c]: fix monitor to import status module after refactor
  • 12:04 hoo: Manually back-filled hashes for the Wikidata JSON dumps in https://dumps.wikimedia.org/wikidatawiki/entities/20180402/wikidata-20180402-*sums.txt (T190457)
  • 11:58 vgutierrez: updating libssl1.1 to 1.1.0h on cache text cluster (and nginx restart)
  • 11:36 vgutierrez: updating libssl1.1 to 1.1.0h on cache upload cluster (and nginx restart)
  • 11:22 vgutierrez: updating libssl1.1 to 1.1.0h on cache misc cluster (and nginx restart)
  • 10:57 jynus: restart dbstore1001 for RAID re-setup and reimage
  • 10:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Specify that db1106 is sanitarium's master (duration: 01m 16s)
  • 10:33 marostegui: Deploy schema change on db1083 - T187089 T185128 T153182
  • 10:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1083 for alter table (duration: 01m 17s)
  • 10:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:3311 after alter table (duration: 01m 16s)
  • 09:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1089 original weight (duration: 01m 17s)
  • 08:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1089 (duration: 01m 17s)
  • 08:30 jynus: starting backup of es2019, it may create lag T153440
  • 08:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1089 (duration: 01m 16s)
  • 08:23 moritzm: installing net-snmp security updates on jessie (stretch not affected)
  • 08:16 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2015, depool es2019 (duration: 01m 16s)
  • 08:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1089 (duration: 01m 16s)
  • 07:52 moritzm: removed unused/defunct deployment-videoscaler01 from deployment-prep (T191293)
  • 07:51 moritzm: removed unused/defunct deployment-tmh01 from deployment-prep (T191293)
  • 07:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1089 after alter table, mariadb and kernel upgrade (duration: 01m 16s)
  • 07:44 moritzm: upgrading openjdk-7 on contint*
  • 07:36 marostegui: Stop MySQL on db1089 for kernel and mariadb upgrade
  • 07:33 marostegui: Deploy schema change on db1105:3311 - T187089 T185128 T153182
  • 07:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311 for alter table (duration: 01m 16s)
  • 07:20 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2053 as candidate master (duration: 01m 09s)
  • 07:05 marostegui: Restart MySQL on db2053 for binlog format change
  • 06:58 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2038 (duration: 01m 13s)
  • 06:43 marostegui: Stop MySQL on db2038 to change binlog format, upgrade mariadb and kernel
  • 06:43 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2038 (duration: 01m 17s)
  • 06:08 marostegui@tin: Synchronized wmf-config/db-codfw.php: db2058 is now a candidate master for s4 - T191275 (duration: 01m 16s)
  • 05:58 marostegui: Restart MySQL on db2058 to change its binlog to STATEMENT - T191275
  • 05:52 marostegui: Deploy schema change on db1089 - T187089 T185128 T153182
  • 05:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 for alter table (duration: 01m 16s)
  • 05:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 after alter table (duration: 01m 18s)
  • 02:28 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.27) (duration: 07m 06s)

2018-04-04

  • 23:53 andrew@tin: Finished deploy [horizon/deploy@2c55bd5]: (no justification provided) (duration: 03m 10s)
  • 23:50 andrew@tin: Started deploy [horizon/deploy@2c55bd5]: (no justification provided)
  • 23:42 catrope@tin: Synchronized php-1.31.0-wmf.27/extensions/VisualEditor/lib/ve: Fix VE drag-and-drop bugs (T191103) (duration: 01m 17s)
  • 23:36 catrope@tin: Synchronized php-1.31.0-wmf.28/resources/src/mediawiki.rcfilters/: Fix missing bookmark icon (T191366) (duration: 01m 16s)
  • 23:12 catrope@tin: Synchronized wmf-config/CommonSettings.php: Set $wgVisualEditorSourceFeedbackTitle (no-op until later) (T157953) (duration: 01m 16s)
  • 23:09 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Add Txikipedia namespace on euwiki (T191396) (duration: 01m 18s)
  • 22:54 akosiaris: increase the number of mathoid pods to 16 from 4
  • 21:53 bd808: Wiki replicas: ran `sudo maintain-views --table page_assessments --database trwiki` on all 3 servers for T191455
  • 20:27 arlolra: Updated Parsoid to d887aff (T177102, T189474)
  • 20:22 twentyafterfour@tin: Synchronized php-1.31.0-wmf.28/skins/MonoBook: sync https://gerrit.wikimedia.org/r/#/c/424041/ (duration: 01m 16s)
  • 20:22 mholloway-shell@tin: Finished deploy [mobileapps/deploy@0460519]: Update mobileapps to 2d5ab5b (duration: 05m 58s)
  • 20:18 arlolra@tin: Finished deploy [parsoid/deploy@a8e759f]: Updating Parsoid to d887aff (duration: 11m 58s)
  • 20:16 mholloway-shell@tin: Started deploy [mobileapps/deploy@0460519]: Update mobileapps to 2d5ab5b
  • 20:15 mholloway-shell@tin: Started deploy [mobileapps/deploy@940bd48]: Update mobileapps to 58a0a88
  • 20:06 arlolra@tin: Started deploy [parsoid/deploy@a8e759f]: Updating Parsoid to d887aff
  • 19:19 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.28 refs T183967 (duration: 01m 16s)
  • 19:18 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.28 refs T183967
  • food: re-enabled thank you mailer
  • 19:03 hasharAway: upgraded blubber 0.2.0-1 -> 0.3.0-1 on contint1001 and contint2001
  • 18:17 ppchelko@tin: Finished deploy [cpjobqueue/deploy@0125bc4]: Fixed the new metrics names. Again (duration: 00m 37s)
  • 18:17 ppchelko@tin: Started deploy [cpjobqueue/deploy@0125bc4]: Fixed the new metrics names. Again
  • 18:02 ppchelko@tin: Finished deploy [cpjobqueue/deploy@0185e74]: Fix the metric names and support multi-topic rules (duration: 00m 35s)
  • 18:01 ppchelko@tin: Started deploy [cpjobqueue/deploy@0185e74]: Fix the metric names and support multi-topic rules
  • 17:54 madhuvishy: Reset ttl for dumps.wikimedia.org CNAME to 1H post switchover to labstore1007 T188646
  • 17:26 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: gerrit:422414 Enable TemplateStyles on dewiki T190910 (duration: 01m 17s)
  • 17:19 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RemexHtml on all wikiquotes except frwikiquote T190726 (duration: 01m 17s)
  • 17:11 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RemexHtml on all wikimedia wikis T188881 (duration: 01m 18s)
  • 16:58 ppchelko@tin: Finished deploy [cpjobqueue/deploy@60a2292]: Revert: Support multi-topic rules (duration: 00m 21s)
  • 16:58 ppchelko@tin: Started deploy [cpjobqueue/deploy@60a2292]: Revert: Support multi-topic rules
  • 16:55 robh: dbstore1001 rebooting for bios firmware update
  • 16:47 ppchelko@tin: Finished deploy [cpjobqueue/deploy@d4a84ae]: Support multi-topic rules T191238 (duration: 00m 42s)
  • 16:47 ppchelko@tin: Started deploy [cpjobqueue/deploy@d4a84ae]: Support multi-topic rules T191238
  • 16:26 madhuvishy: Move cert for dumps.wikimedia.org to labstore1007 (do_acme: true) T188646
  • 16:22 madhuvishy: Change CNAME for dumps.wikimedia.org to labstore1007 T188646
  • 15:44 jynus: starting backup from es2015 (will create lag)
  • 15:37 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool es2015 (duration: 01m 17s)
  • 15:20 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Clean up config for the rest of high-traffic jobs after the switch - T190327 (duration: 01m 16s)
  • 15:14 madhuvishy: Update ttl for dumps.wikimedia.org CNAME to 1M in prep for switchover to labstore1007 T188646
  • 15:07 mobrovac@tin: Started restart [restbase/deploy@f3a53b6]: Pick up the net.ipv4.tcp_tw_reuse flag change - T190213
  • 15:06 elukey: delete /srv/deployment/prometheus from restbase* as clean up step for T181728
  • 14:30 anomie: Running populateArchiveRevId.php on group0 wikis for T191307
  • 14:20 elukey: apply net.ipv4.tcp_tw_reuse=1 to restbase* via https://gerrit.wikimedia.org/r/#/c/421901 - T190213
  • 14:15 moritzm: updating deployment-prep to HHVM 3.18.5+wmf6
  • 14:11 godog: purge cron smart-data-dump from lvs100[1-6]
  • 14:09 marostegui: Deploy schema change on db1099:3311 - T187089 T185128 T153182
  • 14:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3311 for alter table (duration: 01m 16s)
  • 14:08 moritzm: uploaded HHVM 3.18.5+wmf6 to component/icu57 for jessie-wikimedia (updated build with the security fix for CVE-2018-6334)
  • 13:59 marostegui: Deploy schema change on dbstore1002:s1 - T187089 T185128 T153182
  • 13:56 godog: rollout https://gerrit.wikimedia.org/r/c/423852 across ms-fe machines - T183902
  • 13:32 zeljkof: EU SWAT finished
  • 13:29 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Add namespace to euwiki" (T191396) (duration: 01m 14s)
  • 13:08 godog: upgrade smartmontools to -backports version after https://gerrit.wikimedia.org/r/c/423871/
  • 12:02 elukey: removing /srv/deployment/prometheus from restbase2001/1007 - T181728
  • 12:00 akosiaris: revert scb hosts to apertium-fra-cat_1.2.0~r78602-1+wmf2
  • 11:47 marostegui@tin: Synchronized wmf-config/db-codfw.php: db2057 is now a candidate master for s3 - T191275 (duration: 01m 17s)
  • 11:13 akosiaris: upgrade apertium on all scb hosts. Rolling update in groups of 2 hosts with a 30-second delay
  • 11:06 marostegui: Stop MySQL on db2057 for binlog format change, mariadb and kernel upgrade
  • 11:02 akosiaris: upgrade apertium on scb1001
  • 09:46 marostegui: Deploy schema change on s1 codfw master db2048 (this will generate lag on codfw) - T187089 T185128 T153182
  • 09:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1077 (duration: 01m 16s)
  • 09:25 Amir1: end of the deleteAutoPatrolLogs.php script on mediawikiwiki (T184485)
  • 09:24 marostegui@tin: Synchronized wmf-config/db-codfw.php: db2041 is now a candidate master for s2 - T191275 (duration: 01m 16s)
  • 09:16 elukey: executed systemctl reset-failed kafka-mirror-main-eqiad_to_jumbo-eqiad.service on kafka1020
  • 09:02 Amir1: start of mwscript deleteAutoPatrolLogs.php --wiki=mediawikiwiki --before 20180223210426 --sleep 2 (T184485)
  • 09:02 marostegui: Stop MySQL on db2041 for binlog format change and kernel upgrade
  • 09:01 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2041 (duration: 01m 17s)
  • 08:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1072 (duration: 01m 17s)
  • 08:19 Amir1: start of ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=mediawikiwiki --check-old --before 20160423210426 (T184485)
  • 08:17 Amir1: start of ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=mediawikiwiki --dry-run --check-old --before 20160423210426
  • 08:08 marostegui: Deploy schema change on s3 primary master (db1075) - T153182 T185128
  • 08:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1072 (duration: 01m 17s)
  • 07:59 godog: depool ms-fe2005 to test rewrite.py - T183902
  • 07:53 marostegui: Drop flaggedrevs from s3 mediawikiwiki - T186865
  • 07:37 marostegui@tin: Synchronized wmf-config/db-codfw.php: db2055 is now a candidate master - T191275 (duration: 01m 16s)
  • 07:37 moritzm: running some apache/stretch tests on mw2261
  • 07:36 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2083 - T188279 (duration: 01m 17s)
  • 07:30 ema: finish up cache@eqiad reboots for retpoline kernel updates T188092
  • 07:26 marostegui: Restart MySQL on db2055 to change its binlog to STATEMENT - T191275
  • 05:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db2083 - T188279 (duration: 01m 17s)
  • 05:48 marostegui: Deploy schema change on db1072 - s3 - with replication. This will generate lag on labs T187089 T185128 T153182
  • 05:43 marostegui: Drop click_tracking_events table from where it still exists - T115982
  • 05:21 marostegui: Stop mariadb for upgrade and kernel upgrade on db1072 - this will generate lag on s3 labs
  • 05:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1072 for alter table, kernel and mariadb upgrade (duration: 01m 17s)
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.27) (duration: 05m 31s)
  • 01:02 eileen: update civicrm - civicrm revision changed from d6855cd281 to 7010f0f5d6, config revision is 3b900436c9

2018-04-03

  • 23:55 XioNoX: re-activating graceful-switchover on cr1-codfw - T189588
  • 23:16 dereckson@tin: Synchronized wmf-config/CommonSettings.php: Make a note about the loading order of GlobalPreferences and Echo (Gerrit:422642) (no-op) (duration: 01m 17s)
  • 23:10 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Rollout VirtualPageViews (final stage) (T189906) (duration: 01m 19s)
  • 22:34 mutante: cobalt - puppet disabled temporarily to apply fix to "simplify directory structure" change .. on gerrit2001 first
  • 22:25 mutante: restarting Apache on phab1001 - T182832
  • 22:14 twentyafterfour: Finished MediaWiki Train for group0, 1.31.0-wmf.28 refs T183967
  • 22:12 twentyafterfour@tin: Pruned MediaWiki: 1.31.0-wmf.25 [keeping static files] (duration: 01m 55s)
  • 22:10 twentyafterfour@tin: Pruned MediaWiki: 1.31.0-wmf.24 [keeping static files] (duration: 04m 18s)
  • 21:30 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group0 wikis to 1.31.0-wmf.28 refs T183967
  • 21:15 twentyafterfour@tin: Finished scap: testwikis wikis to 1.31.0-wmf.28 refs T183967 (duration: 46m 38s)
  • 21:13 urandom: (re)starting restbase-dev1004-{a,b} (ooms), and enabling alternately patched cassandra 3.11.2 build - T186751
  • 20:29 twentyafterfour@tin: Started scap: testwikis wikis to 1.31.0-wmf.28 refs T183967
  • 20:22 ejegg: disabled thank you mail sender
  • 19:46 urandom: restarting restbase-dev1004-{a,b} to enable patched cassandra 3.11.2 build - T186751
  • 19:07 twentyafterfour: Preparing to deploy 1.31.0-wmf.28 refs T183967
  • 18:25 urandom: upgrading restbase-dev1006-b to cassandra 3.11.2 - T186751
  • 18:23 urandom: upgrading restbase-dev1006-a to cassandra 3.11.2 - T186751
  • 18:20 urandom: upgrading restbase-dev1005-b to cassandra 3.11.2 - T186751
  • 18:18 urandom: upgrading restbase-dev1005-a to cassandra 3.11.2 - T186751
  • 18:15 urandom: upgrading restbase-dev1004-b to cassandra 3.11.2 - T186751
  • 18:13 urandom: upgrading restbase-dev1004-a to cassandra 3.11.2 - T186751
  • 18:05 mutante: rhodium - closing idle screen session from maintenance work on puppetmasters
  • 18:03 mutante: elnath - fixing and re-enabling Icinga alert about screens, none are running, spare hosts should not have these
  • 17:59 mutante: restarting ferm on bromine
  • 17:40 elukey: manually set net.ipv4.tcp_tw_reuse=1 on restbase1007 as test for T190213
  • 17:35 sbisson@tin: Finished deploy [tilerator/deploy@8e68cb8]: Deploying tilerator i18n to maps-test* (take 2) (duration: 00m 25s)
  • 17:35 sbisson@tin: Started deploy [tilerator/deploy@8e68cb8]: Deploying tilerator i18n to maps-test* (take 2)
  • 17:28 awight@tin: Finished deploy [ores/deploy@7701cee]: ORES versioned virtualenv, T181071 (duration: 01m 27s)
  • 17:28 sbisson@tin: Finished deploy [tilerator/deploy@03add2d]: Deploying tilerator i18n to maps-test* (duration: 04m 09s)
  • 17:27 awight@tin: Started deploy [ores/deploy@7701cee]: ORES versioned virtualenv, T181071
  • 17:25 awight@tin: Finished deploy [ores/deploy@7701cee]: ORES versioned virtualenv, T181071 (duration: 00m 27s)
  • 17:25 awight@tin: Started deploy [ores/deploy@7701cee]: ORES versioned virtualenv, T181071
  • 17:24 sbisson@tin: Started deploy [tilerator/deploy@03add2d]: Deploying tilerator i18n to maps-test*
  • 17:24 moritzm: upgrading HHVM on labweb*
  • 17:18 jynus: reloading labsdb proxy configuration
  • 17:08 elukey: manually set net.ipv4.tcp_tw_reuse=1 on restbase2001 as test for T190213
  • 16:53 demon@tin: Finished deploy [gerrit/gerrit@aa1a1a0]: no-op, pushing empty motd.config file (duration: 00m 11s)
  • 16:53 demon@tin: Started deploy [gerrit/gerrit@aa1a1a0]: no-op, pushing empty motd.config file
  • 16:33 urandom: rebooting restbase-dev1006 - T186751
  • 16:10 urandom: rebooting restbase-dev1004 - T186751
  • 16:04 ariel@tin: Finished deploy [dumps/dumps@77dc467]: split up some large modules, prep work for prefetch changes (duration: 00m 04s)
  • 16:04 ariel@tin: Started deploy [dumps/dumps@77dc467]: split up some large modules, prep work for prefetch changes
  • 15:39 elukey: roll restart of zookeeper on conf100[123] to pick up prometheus monitoring
  • 15:09 godog: depool ms-fe2005 to test rewrite.py - T183902
  • 14:40 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch the remaining high-traffic jobs for all wikis, file 2/2 - T190327 (duration: 00m 59s)
  • 14:39 ppchelko@tin: Finished deploy [cpjobqueue/deploy@60a2292]: Switch all high traffic jobs to kafka T190327 (duration: 00m 44s)
  • 14:39 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch the remaining high-traffic jobs for all wikis, file 1/2 - T190327 (duration: 00m 59s)
  • 14:38 ppchelko@tin: Started deploy [cpjobqueue/deploy@60a2292]: Switch all high traffic jobs to kafka T190327
  • 13:48 anomie@tin: Synchronized php-1.31.0-wmf.27/extensions/intersection/DynamicPageList.hooks.php: Backporting fix for T191116 (gerrit:423689) (duration: 00m 58s)
  • 13:47 anomie@tin: Synchronized php-1.31.0-wmf.27/includes/specials/SpecialWhatlinkshere.php: Backporting fix for T191116 (gerrit:423688) (duration: 00m 58s)
  • 13:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1077 after alter table (duration: 00m 58s)
  • 13:21 marostegui: Reimport s51541_sulwatcher.logging from master to slave - T191020
  • 13:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1077 after alter table (duration: 00m 58s)
  • 13:18 elukey: roll restart of zookeeper on conf200[123] to pick up prometheus monitoring settings
  • 12:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1077 after alter table (duration: 00m 59s)
  • 11:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1077 after alter table (duration: 00m 58s)
  • 11:16 godog: deploy thumbor 1.16 in codfw
  • 11:06 moritzm: installing libdatetime-timezone-perl update from Debian SUA
  • 09:51 godog: deploy thumbor 1.16 in codfw and eqiad - T186528 T179200 T189647 T191028
  • 08:46 marostegui: Deploy schema change on db1077 - s3 - T187089 T185128 T153182
  • 08:41 moritzm: upgrading HHVM on video scalers
  • 08:40 volans: temporarily disabled puppet (and re-enabling it one-by-one) on all prod puppetmasters to deploy g/422907 - T190918
  • 08:36 marostegui: Stop MySQL on db1077 for mysql and kernel upgrade
  • 08:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1077 for alter table (duration: 00m 59s)
  • 08:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1078 (duration: 00m 58s)
  • 08:29 godog: codfw-prod: more weight to ms-be204[0-3] - T189633
  • 08:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1078 (duration: 00m 58s)
  • 08:01 elukey: restart of druid-(overlord|middlemanager) on druid1004[456] as precautionary measure after zk restart
  • 08:01 moritzm: uploaded HHVM 3.18.5-dfsg-1+wmf5+deb9u1 for stretch-security to apt.wikimedia.org
  • 08:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1078 (duration: 00m 58s)
  • 07:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1078 (duration: 00m 58s)
  • 07:50 elukey: roll restart zookeeper on druid100[456] to enable prometheus monitoring
  • 07:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1078 with low weight (duration: 00m 58s)
  • 07:12 jynus: upgrade and restart of labsdb1010
  • 07:10 marostegui: Stop MySQL on db1078 for mariadb and kernel upgrade
  • 06:43 elukey: execute systemctl reset-failed kafka-mirror-main-eqiad_to_jumbo-eqiad.service on kafka102[23]
  • 06:18 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2035 rack comment (duration: 00m 58s)
  • 05:37 marostegui: Deploy schema change on db1078 - s3 - T187089 T185128 T153182
  • 05:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1078 for alter table (duration: 00m 59s)
  • 05:18 marostegui: Enable back gtid on db2035 - T191193
  • 02:44 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.27) (duration: 19m 03s)
  • 00:11 Amir1: Evening SWAT is done

2018-04-02

  • 23:56 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Add several domains of Ukraine government to wgCopyUploadsDomains (T185399) (duration: 00m 59s)
  • 23:45 ladsgroup@tin: Synchronized tests/cirrusTest.php: Shift all search traffic to codfw, part II (T191236) (duration: 00m 58s)
  • 23:44 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Shift all search traffic to codfw (T191236) (duration: 00m 59s)
  • 23:29 Amir1: Persian Wikipedia logos have been purged using purgeList.php on terbium
  • 23:26 ladsgroup@tin: Synchronized static/images/project-logos: Update logo for the Persian Wikipedia (T191174) (duration: 00m 59s)
  • 22:41 twentyafterfour@tin: Synchronized php-1.31.0-wmf.27/includes/libs/CSSMin.php: sync https://gerrit.wikimedia.org/r/423574 (duration: 00m 58s)
  • 22:11 twentyafterfour@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.27
  • 21:59 twentyafterfour@tin: Synchronized php-1.31.0-wmf.27/includes/libs/CSSMin.php: (no justification provided) (duration: 01m 16s)
  • 21:28 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.27 (duration: 01m 14s)
  • 21:26 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.27
  • 21:22 twentyafterfour@tin: Synchronized php-1.31.0-wmf.27/includes/libs/rdbms/database/: Revert ceb7d61 refs T183966 T190960 (duration: 00m 59s)
  • 21:09 twentyafterfour@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.26
  • 20:59 twentyafterfour: MediaWiki Train: rolling back to 1.31.0-wmf.26 refs T183966, T190960
  • 20:38 twentyafterfour@tin: Synchronized php-1.31.0-wmf.27/includes/libs/rdbms/database/DatabaseMysqlBase.php: sync 36c5235 refs T190960 (duration: 01m 16s)
  • 20:22 mholloway-shell@tin: Finished deploy [mobileapps/deploy@940bd48]: Update mobileapps to 58a0a88 (duration: 05m 52s)
  • 20:17 mholloway-shell@tin: Started deploy [mobileapps/deploy@940bd48]: Update mobileapps to 58a0a88
  • 19:56 herron: puppetdb postgres update complete — puppet agents re-enabled
  • 19:46 herron: temporarily disabling puppet agents for puppetdb postgres security update
  • 19:41 twentyafterfour@tin: Synchronized php-1.31.0-wmf.27/includes/libs/rdbms/database/DatabaseMysqlBase.php: sync 779e7fd refs T190960 (duration: 01m 16s)
  • 19:17 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.27 (duration: 01m 15s)
  • 19:16 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.27
  • 19:09 twentyafterfour@tin: Synchronized php-1.31.0-wmf.27/includes/libs/rdbms/: sync I57dd8d refs T183966 T190960 (duration: 01m 19s)
  • 19:06 twentyafterfour: sync rdbms: avoid lag estimates in getLagFromPtHeartbeat ruined by snapshots Bug: T190960 Change-Id: I57dd8d
  • 19:04 twentyafterfour: Getting the train back on track: deploying 1.31.0-wmf.27 to Group0
  • 17:49 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch the remaining high-traffic jobs to EventBus, test wikis only, file 2/2 - T190327 (duration: 01m 15s)
  • 17:48 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9e1b203]: Switch remaining high traffic jobs for test wikis. T190327 (duration: 00m 43s)
  • 17:47 ppchelko@tin: Started deploy [cpjobqueue/deploy@9e1b203]: Switch remaining high traffic jobs for test wikis. T190327
  • 17:47 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch the remaining high-traffic jobs to EventBus, test wikis only, file 1/2 - T190327 (duration: 01m 16s)
  • 17:36 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: Shift search traffic for enwiki to codfw (duration: 01m 17s)
  • 17:21 smalyshev@tin: Finished deploy [wdqs/wdqs@49f4eed]: GUI update (duration: 09m 49s)
  • 17:11 smalyshev@tin: Started deploy [wdqs/wdqs@49f4eed]: GUI update
  • 16:37 madhuvishy: Rolling out new symlinks to /public/dumps for labstore1006 dumps nfs mount T188643
  • 15:59 madhuvishy: Absenting /public/dumps mount from labstore1003 across the VPS fleet T188643
  • 15:56 ebernhardson: restart elasticsearch on elastic1024, been stuck at 100% cpu for 3+ hours
  • 15:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db2035 IP - T191193 (duration: 01m 15s)
  • 15:40 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2035 IP - T191193 (duration: 01m 15s)
  • 15:28 marostegui: Stop MySQL and power off db2035 (s2 codfw master - this will stop replication on s2 codfw slaves) for rack change - T191193
  • 15:06 madhuvishy: Reenabled puppet and rolled out mounting new dumps NFS shares from labstore1006|7 on VPS instances T188643
  • 14:40 cmjohnson1: disabling puppet on decom host db1020
  • 14:28 madhuvishy: Disabling puppet across VPS instances with dumps mounted (https://phabricator.wikimedia.org/P6921) T188643
  • 14:22 marostegui: Drop contest* tables from s3 - T186867
  • 14:12 akosiaris@puppetmaster1001: conftool action : set/weight=15; selector: dc=eqiad,service=recommendation-api,cluster=scb,name=scb1003.*
  • 14:12 akosiaris@puppetmaster1001: conftool action : set/weight=15; selector: dc=eqiad,service=recommendation-api,cluster=scb,name=scb1004.*
  • 14:10 akosiaris: lower weight for scb1001, scb1002 from 10 to 8 for all services. T191199. scb1003, scb1004 have a weight of 15 already
  • 14:09 akosiaris@puppetmaster1001: conftool action : set/weight=8; selector: dc=eqiad,cluster=scb,name=scb1002.*
  • 14:09 akosiaris@puppetmaster1001: conftool action : set/weight=8; selector: dc=eqiad,cluster=scb,name=scb1001.*
  • 13:54 ariel@tin: Finished deploy [dumps/dumps@0363d50]: add check that xml files don't have binary corruption (nulls) after the header (duration: 00m 04s)
  • 13:54 ariel@tin: Started deploy [dumps/dumps@0363d50]: add check that xml files don't have binary corruption (nulls) after the header
  • 13:48 twentyafterfour@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Sync initializesettings for T190445 (duration: 01m 16s)
  • 13:36 twentyafterfour@tin: Synchronized wmf-config/throttle.php: SWAT: Sync throttle rules for T191187 (duration: 01m 15s)
  • 13:30 twentyafterfour@tin: Synchronized wmf-config/throttle.php: SWAT: Sync throttle rules for T191168 (duration: 01m 16s)
  • 13:27 jynus: restarting pdfrender on scb1003 (Socket timeout)
  • 12:49 akosiaris: upgrade mediawiki servers for hhvm upgrade
  • 12:06 marostegui: Deploy schema change on dbstore1002 - s3 - T187089 T185128 T153182
  • 11:51 akosiaris: repool mediawiki canary servers after hhvm upgrade
  • 11:44 akosiaris: depool mediawiki canary servers for hhvm upgrade
  • 10:16 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 16s)
  • 10:15 jdrewniak@tin: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 16s)
  • 09:13 jynus@tin: Synchronized wmf-config/db-eqiad.php: Remove references to virt1000 (duration: 01m 16s)
  • 09:12 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove references to virt1000 (duration: 01m 16s)
  • 08:50 marostegui: Deploy schema change on s3 codfw master db2043 (this will generate lag on codfw) - T187089 T185128 T153182
  • 08:21 jynus: stop mariadb at labsdb1009 and labsdb1010
  • 08:15 marostegui@tin: Synchronized wmf-config/db-codfw.php: Specify current m5 codfw master (duration: 01m 17s)
  • 08:11 jynus: depool labsdb1011 from web wikireplicas
  • 07:21 apergos: restarted pdfrender on scb1004 after poking around there a bit
  • 07:01 apergos: restarted pdfrender on scb1001,2, service paged and no jobs were being processed
  • 06:06 marostegui: Drop localisation table from the hosts where it still existed - T119811
  • 02:50 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.26) (duration: 12m 53s)

2018-03-31

  • 21:15 mutante: bast1001 has been shut down and decom'ed as planned. If you have any issues with shell access, make sure you have replaced it with bast1002 or any other bast host
  • 11:26 urandom: removing corrupt commitlog segment, restbase1009-c
  • 11:25 urandom: removing corrupt commitlog segment, restbase1009-b
  • 11:19 urandom: starting restbase1009-c
  • 11:18 urandom: truncating hints, restbase1009-a
  • 11:14 urandom: restarting restbase1009-b
  • 11:13 urandom: stopping restbase1009-a (high hints storage)

2018-03-30

  • 14:16 akosiaris: T189076 upload apertium-fra-cat to apt.wikimedia.org/jessie-wikimedia/main
  • 12:47 akosiaris: T189076 upload apertium-cat to apt.wikimedia.org/jessie-wikimedia/main
  • 12:47 akosiaris: T189075 upload apertium-lex-tools to apt.wikimedia.org/jessie-wikimedia/main
  • 12:47 akosiaris: T189075 upload apertium-separable to apt.wikimedia.org/jessie-wikimedia/main
  • 12:47 akosiaris: T189076 upload apertium-fra to apt.wikimedia.org/jessie-wikimedia/main
  • 11:44 dcausse: running forceSearchIndex from terbium to cleanup elastic indices for (testwiki, mediawikiwiki, labswiki, labtestwiki, svwiki) (T189694)
  • 11:40 dcausse: elastic@codfw cluster restarts complete (T189239)
  • 10:55 dcausse: resuming elastic@codfw cluster restarts
  • 10:17 elukey: roll restart of zookeeper daemons on druid100[123] (Druid analytics cluster) to pick up the new prometheus jmx agent
  • 09:31 elukey: restart oozie/hive daemons on an1003 for openjdk-8 upgrades
  • 08:38 elukey: rolling restart of hadoop-hdfs-datanode on all the hadoop worker nodes after https://gerrit.wikimedia.org/r/423000
  • 07:39 elukey: rolling restart of yarn-hadoop-nodemanagers on all the hadoop worker nodes after https://gerrit.wikimedia.org/r/423000

2018-03-29

  • 23:47 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T189252: Enable perf oversampling for remaining countries in Asia (duration: 01m 16s)
  • 23:40 ebernhardson@tin: Synchronized php-1.31.0-wmf.27/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: T187148: Start cirrus AB test (duration: 01m 16s)
  • 23:37 ebernhardson@tin: Synchronized php-1.31.0-wmf.26/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: T187148: Start cirrus AB test (duration: 01m 16s)
  • 23:12 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T187148: Configure 5 buckets for cirrus AB test (duration: 01m 17s)
  • 22:10 andrew@tin: Finished deploy [horizon/deploy@14d3e7d]: Updating Horizon with possible fix for T189706 (duration: 03m 16s)
  • 22:06 andrew@tin: Started deploy [horizon/deploy@14d3e7d]: Updating Horizon with possible fix for T189706
  • 20:07 robh: shut down cp2022 for hw testing
  • 18:49 maxsem@tin: Synchronized php-1.31.0-wmf.27/skins/MinervaNeue: https://gerrit.wikimedia.org/r/#/c/423012/ (duration: 01m 17s)
  • 18:27 maxsem@tin: Synchronized php-1.31.0-wmf.26/includes/: Shorten summary length to 500 (duration: 02m 06s)
  • 18:22 maxsem@tin: Synchronized php-1.31.0-wmf.27/includes/: Shorten summary length to 500 (duration: 02m 14s)
  • 17:55 dcausse: pausing restarts of elastic@codfw (6 nodes left)
  • 17:35 mobrovac@tin: Finished deploy [restbase/deploy@af592d6]: Add bawikibooks - T191033 (duration: 30m 35s)
  • 17:30 demon@tin: Synchronized docroot/wwwportal/w/search-redirect.php: removing symlink indirection (duration: 01m 16s)
  • 17:05 mobrovac@tin: Started deploy [restbase/deploy@af592d6]: Add bawikibooks - T191033
  • 14:54 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Cleanup: Use only EventBus for refreshLinks - T185052 (duration: 01m 18s)
  • 14:00 moritzm: restarting parsoid and related service on ruthenium to pick up openssl update
  • 13:52 dcausse: reverted and rebased tin for undeployed patch due to scap issues (https://gerrit.wikimedia.org/r/#/c/422906/ https://gerrit.wikimedia.org/r/#/c/422929/)
  • 13:34 dcausse: aborted scap sync-dir php-1.31.0-wmf.27/extensions/CirrusSearch/ (was taking too much time at: waiting on sync-masters, ok: 1, left: 1)
  • 12:54 moritzm: installing ICU security updates on trusty
  • 12:29 dcausse: recreating replicas for skwiki_content in elastic@codfw due to stalled shard recovery
  • 12:18 ariel@tin: Finished deploy [dumps/dumps@982cebd]: ability to configure production of recombined metacurrent page content file (duration: 00m 02s)
  • 12:18 ariel@tin: Started deploy [dumps/dumps@982cebd]: ability to configure production of recombined metacurrent page content file
  • 11:02 ariel@tin: Finished deploy [dumps/dumps@96ba844]: cleanup 'latest' links, rss files from old runs (duration: 00m 04s)
  • 11:02 ariel@tin: Started deploy [dumps/dumps@96ba844]: cleanup 'latest' links, rss files from old runs
  • 10:50 dcausse: restarting elastic@codfw for JVM and plugin upgrade (T189239)
  • 09:16 elukey: roll restart aqs on aqs100* for icu/openssl upgrades
  • 08:18 akosiaris: T189075 upload apertium_3.5.1-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 08:18 moritzm: installing OpenJDK security updates on elastic* hosts (along with current version of the search plugins package)
  • 08:07 elukey: roll restart of cassandra on aqs* for openjdk-8 upgrades
  • 07:20 moritzm: installing openssl security updates
  • 07:18 ema: reboot cache@eqiad for retpoline kernel updates: T188092
  • 04:35 twentyafterfour: ran scap pull on deploy1001
  • 02:00 l10nupdate@tin: LocalisationUpdate failed: git pull of core failed

2018-03-28

  • 23:50 eileen: update civicrm revision changed from 9478ca39f1 to d6855cd281 (further security module updates, engage import dedupe)
  • 23:38 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T187148: Configure next Cirrus AB test (duration: 01m 16s)
  • 23:18 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T184969: Enable PageAssessments on trwiki (duration: 01m 09s)
  • 23:13 MaxSem: created PageAssessments tables on trwiki
  • 22:17 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.26 (duration: 01m 18s)
  • 22:16 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.26
  • 22:13 twentyafterfour: deploy of 1.31.0-wmf.27 resulted in a lot of SlowTimer errors for SlowTimer [10000ms] at runtime/ext_mysql: slow query: SELECT MASTER_GTID_WAIT(...)
  • 22:12 eileen: civicrm revision changed from 3f6028b24f to 9478ca39f1 (drupal security update)
  • 22:09 twentyafterfour@tin: rebuilt and synchronized wikiversions files: sync https://gerrit.wikimedia.org/r/#/c/422563/ group1 wikis to 1.31.0-wmf.27 refs T183966 T190960
  • 22:08 twentyafterfour: rolling forward group1 to 1.31.0-wmf.27 refs T183966 T190960
  • 22:05 twentyafterfour@tin: Synchronized php-1.31.0-wmf.27/includes/: sync https://gerrit.wikimedia.org/r/#/c/422565/ (duration: 02m 15s)
  • 22:03 twentyafterfour: syncing https://gerrit.wikimedia.org/r/#/c/422565/ refs T190960 T183966
  • 21:53 mutante: deploy1001 - revoking old puppet certs and signing new ones
  • 21:42 twentyafterfour: getting the train back on track, group1 wikis to 1.31.0-wmf.27
  • 20:51 XenoRyet: updated civicrm from 85c89c7d0a to 3f6028b24f
  • 20:50 bsitzmann@tin: Finished deploy [mobileapps/deploy@6a0d877]: Update mobileapps to a5833a0 (duration: 05m 36s)
  • 20:44 bsitzmann@tin: Started deploy [mobileapps/deploy@6a0d877]: Update mobileapps to a5833a0
  • 20:12 mlitn@tin: Finished deploy [3d2png/deploy@c447488]: Updating 3d2png (duration: 02m 26s)
  • 20:09 mlitn@tin: Started deploy [3d2png/deploy@c447488]: Updating 3d2png
  • 19:54 mutante: deploy1001 - schedule downtime for reinstall with jessie, reinstalling (T175288)
  • 19:24 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.26 (duration: 01m 17s)
  • 19:22 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.26
  • 19:20 twentyafterfour: Rolling back to wmf.26 due to increase in fatals: "Replication wait failed: lost connection to MySQL server during query"
  • 19:12 milimetric@tin: Finished deploy [analytics/refinery@c22fd1e]: Fixing python import bug (duration: 02m 48s)
  • 19:09 milimetric@tin: Started deploy [analytics/refinery@c22fd1e]: Fixing python import bug
  • 19:09 milimetric@tin: Started deploy [analytics/refinery@c22fd1e]: (no justification provided)
  • 19:06 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.27 (duration: 01m 17s)
  • 19:05 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.27
  • 19:02 ebernhardson: restore elasticsearch eqiad disk high/low watermarks to 75/80% with all large reindexes complete
  • 18:52 urandom: upgrading restbase-dev1005-{a,b} to cassandra 3.11.2 -- T178905
  • 18:17 urandom: upgrading restbase-dev1004-b to cassandra 3.11.2 (canary) -- T178905
  • 18:12 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group0 wikis to 1.31.0-wmf.27
  • 18:12 urandom: upgrading restbase-dev1004-a to cassandra 3.11.2 (canary) -- T178905
  • 18:03 twentyafterfour: deploying 1.31.0-wmf.27 to group0. group1 in an hour. See T183966 for blockers.
  • 17:38 joal@tin: Finished deploy [analytics/refinery@7135d44]: Regular weekly analytics deploy - Scheduled hadoop jobs updates (duration: 05m 21s)
  • 17:32 joal@tin: Started deploy [analytics/refinery@7135d44]: Regular weekly analytics deploy - Scheduled hadoop jobs updates
  • 16:37 akosiaris: T189075 upload lttoolbox_3.4.0~r84331-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 15:37 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable oversampling for IN, GU, MP in preparation for eqsin (T189252) (duration: 01m 18s)
  • 15:13 andrewbogott: restarting nodepool on labnodepool1001 (cleanup from T189115)
  • 15:08 andrewbogott: restarting nova-fullstack on labnet1001
  • 15:07 andrewbogott: restarting nova-network on labnet1001 in case it's upset by the rabbit outage
  • 15:02 andrewbogott: rebooting labservices1001 and labcontrol1001 for T189115
  • 15:00 andrewbogott: stopping nova-fullstack on labnet1001 for T189115
  • 15:00 andrewbogott: stopping nodepool on labnodepool1001
  • 14:58 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Disable redis queue for cirrusSearch jobs for test wikis, file 2/2 - T189137 (duration: 01m 17s)
  • 14:56 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Disable redis queue for cirrusSearch jobs for test wikis, file 1/2 - T189137 (duration: 01m 17s)
  • 14:54 ppchelko@tin: Finished deploy [cpjobqueue/deploy@c84880a]: Switch CirrusSearch jobs to kafka for test wikis (duration: 00m 44s)
  • 14:54 ppchelko@tin: Started deploy [cpjobqueue/deploy@c84880a]: Switch CirrusSearch jobs to kafka for test wikis
  • 13:51 elukey: reduced number of jobrunner runners on the videoscalers after the last burst of jobs that maxed out the cluster
  • 13:51 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable TemplateStyles on all Wikivoyages (T189838) (duration: 01m 17s)
  • 13:42 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Wikidata description override on enwiki (T184000) (duration: 01m 18s)
  • 13:36 catrope@tin: Synchronized php-1.31.0-wmf.27/extensions/Echo/modules/nojs/mw.echo.badge.less: Prevent FOUC when loading notification badges (duration: 01m 20s)
  • 13:35 jynus: upgrade mariadb client on sarin, neodymium, terbium and wasat
  • 13:18 catrope@tin: Synchronized dblists/flow.dblist: Enable Flow on euwiki (T190500) (duration: 01m 17s)
  • 13:07 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Translate extension on amwikimedia (T180879) (duration: 01m 22s)
  • 12:35 twentyafterfour@tin: Finished scap: test running full scap sync from tin (duration: 46m 05s)
  • 11:49 twentyafterfour@tin: Started scap: test running full scap sync from tin
  • 11:48 twentyafterfour@tin: Synchronized README: test deploy from tin.eqiad.wmnet (duration: 03m 35s)
  • 10:59 volans: performing a few minutes live test of reporting Puppet reports to puppetdb too on puppetmaster1001 - T190918
  • 10:27 godog: reload icinga on einsteinium after https://gerrit.wikimedia.org/r/c/413142
  • 10:05 jynus: upgrade and restart db2093
  • 09:25 godog: disable puppet on icinga servers before merging https://gerrit.wikimedia.org/r/c/413142/
  • 08:25 arturo: reboot labstore200[2,3,4] for T189115
  • 08:25 godog: add more weight to ms-be204[0-3] - T189633
  • 08:18 arturo: reboot labstore2001 for T189115
  • 08:17 arturo: reboot labstore1002 for T189115
  • 08:15 arturo: reboot labstore1001 for T189115
  • 07:49 moritzm: uploaded openssl 1.0.2o to apt.wikimedia.org/jessie-wikimedia
  • 06:51 moritzm: installing remaining ICU security updates
  • 02:28 l10nupdate@deploy1001: scap sync-l10n completed (1.31.0-wmf.26) (duration: 13m 33s)

2018-03-27

  • 23:18 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: T189906: (duration: 00m 55s)
  • 23:08 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: T187148: Update enwiki search ranking model (duration: 00m 54s)
  • 22:56 twentyafterfour@deploy1001: Finished scap: Deploy 1.31.0-wmf.27 to test wikis (duration: 41m 00s)
  • 22:28 mutante: DNS - switching deployment service name to deploy1001 (T175288)
  • 22:15 twentyafterfour@deploy1001: Started scap: Deploy 1.31.0-wmf.27 to test wikis
  • 22:14 demon@deploy1001: Synchronized wmf-config/abusefilter.php: beta-only sync (duration: 00m 53s)
  • 22:12 demon@deploy1001: Synchronized wmf-config/CommonSettings-labs.php: beta-only sync (duration: 02m 32s)
  • 21:26 twentyafterfour@deploy1001: scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="cawikibooks" --list-file="/srv/mediawiki-staging/wmf-config/extension-list" --output="/tmp/tmp.738JVwJRDN" ' returned non-zero exit status 127 (duration: 00m 43s)
  • 21:26 twentyafterfour@deploy1001: Started scap: Deploy 1.31.0-wmf.27 to test wikis
  • 21:25 mutante: deploy1001 rm /var/lock/scap-global-lock to switch to active server, puppet code only adds lock file to inactive servers (T175288)
  • 21:22 twentyafterfour@deploy1001: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap-global-lock"; owner is "root"; reason is "Not the active deployment server, use tin.eqiad.wmnet" (duration: 00m 00s)
  • 21:22 mutante: deployment_server has been switched to deploy1001.eqiad.wmnet. tin is not the active server anymore as of right now
  • 20:55 twentyafterfour@deploy1001: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap-global-lock"; owner is "root"; reason is "Not the active deployment server, use tin.eqiad.wmnet" (duration: 00m 00s)
  • 20:47 twentyafterfour@tin: Finished scap: 2nd Sync to co-masters to initialize deploy1001.eqiad.wmnet (duration: 12m 50s)
  • 20:34 twentyafterfour@tin: Started scap: 2nd Sync to co-masters to initialize deploy1001.eqiad.wmnet
  • 20:32 twentyafterfour@tin: Finished scap: Sync to co-masters to initialize deploy1001.eqiad.wmnet (duration: 21m 30s)
  • 20:11 twentyafterfour@tin: Started scap: Sync to co-masters to initialize deploy1001.eqiad.wmnet
  • 20:09 twentyafterfour@tin: Synchronized README: (no justification provided) (duration: 00m 52s)
  • 19:41 mutante: deploy1001 - deleting /srv and letting puppet recreate it, so _not_ rsyncing manually from tin but just a clean version of what puppet pulls in (T175288)
  • 18:42 twentyafterfour: branching 1.31.0-wmf.27
  • 18:03 andrewbogott: rebooting labsdb1007 for T189115
  • 17:59 demon@tin: Finished deploy [gerrit/gerrit@4910e7c]: motd plugin (duration: 00m 11s)
  • 17:59 demon@tin: Started deploy [gerrit/gerrit@4910e7c]: motd plugin
  • 17:55 andrewbogott: rebooting labsdb1006 for T189115
  • 17:51 foks: disable 2FA from User:Céréales Killer
  • 16:51 madhuvishy: Running rsync catch up job for dumps from ms1001 to labstore1007
  • 16:43 moritzm: uploaded openssl 1.1.0h for jessie-wikimedia to apt.wikimedia.org
  • 16:18 godog: point eqiad puppet traffic to eqiad
  • 15:58 godog: point esams puppet agent traffic to eqiad
  • 15:35 hashar: Bumping operations-puppet-tests-docker job to docker-registry.wikimedia.org/releng/operations-puppet:0.3.1 | https://gerrit.wikimedia.org/r/#/c/422169/ | ping vgutierrez
  • 15:23 godog: reenable puppet fleetwide for CA failover - T189891
  • 15:10 godog: stop puppet fleetwide for CA failover - T189891
  • 14:45 andrewbogott: rebooting labpuppetmaster1001 for T189115
  • 14:36 andrewbogott: rebooting labpuppetmaster1002 for T189115
  • 14:12 ppchelko@tin: Finished deploy [restbase/deploy@e19bad9]: Deploy without feed check to verify that mysterious deploy timeouts still happen (duration: 10m 52s)
  • 14:04 zeljkof: EU SWAT finished
  • 14:01 ppchelko@tin: Started deploy [restbase/deploy@e19bad9]: Deploy without feed check to verify that mysterious deploy timeouts still happen
  • 13:54 ppchelko@tin: Started restart [restbase/deploy@e19bad9]: Restart to verify that mysterious deploy timeouts still happen
  • 13:37 zfilipin@tin: Synchronized wmf-config/abusefilter.php: SWAT: Change wording for AbuseFilter global block durations (T190602) (duration: 00m 57s)
  • 13:31 zfilipin@tin: Synchronized wmf-config/abusefilter.php: SWAT: Enable $wgAbuseFilterProfile on itwiki (T190137) (duration: 00m 57s)
  • 13:30 godog: deactivate/clean iridium.eqiad.wmnet -- decom'd
  • 13:24 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable AbuseFilter runtime profile on more Wikis (T175954) (duration: 00m 58s)
  • 11:36 moritzm: installing ICU security updates
  • 10:50 arturo: reboot labtestvirt2002 to test if it would boot or not
  • 09:44 elukey: reboot aqs1009 for kernel + cassandra upgrades
  • 09:28 elukey: reboot aqs1008 for kernel + cassandra upgrades
  • 09:25 vgutierrez: uploaded mtail-3.0.0~rc5-1 to apt.w.o for jessie-wikimedia
  • 09:09 elukey: reboot aqs1007 for kernel + cassandra upgrades
  • 08:36 kartik@tin: Finished deploy [cxserver/deploy@a6b029f]: Update cxserver to 9e8ebda (Fix etag parsing and T188403) (duration: 03m 09s)
  • 08:33 kartik@tin: Started deploy [cxserver/deploy@a6b029f]: Update cxserver to 9e8ebda (Fix etag parsing and T188403)
  • 08:33 elukey: reboot aqs1006 for kernel + openjdk-8 + cassandra upgrade
  • 08:29 godog: add more weight to ms-be204[0-3] - T189633
  • 08:15 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=aqs1005.eqiad.wmnet
  • 08:11 elukey: reboot aqs1005 for kernel + openjdk-8 + cassandra upgrade
  • 06:59 elukey: powercycle restbase2007 (no ssh, vsp not available via mgmt console)
  • 05:59 Krinkle: Manually purge https://en.wikipedia.org/static/images/project-logos/nds_nlwiki-2x.png T190051
  • 05:59 Krinkle: Manually purge https://en.wikipedia.org/static/images/project-logos/nds_nlwiki-1.5x.png T190051
  • 02:57 Krinkle: Fix retention rules for Whisper files on graphite2001 and graphite1001 per T179622 (/var/lib/carbon/whisper/ve/*)
  • 02:00 l10nupdate@tin: LocalisationUpdate failed: git pull of core failed

2018-03-26

  • 23:41 niharika29@tin: Synchronized static/images/project-logos/: Correct high-density logos for the Dutch Low Saxon Wikipedia T190051 (duration: 00m 59s)
  • 22:38 mutante: syncing /srv from tin.eqiad to deploy1001.eqiad (T175288)
  • 22:09 demon@tin: Finished deploy [gerrit/gerrit@b14b43b]: wikimedia plugin (duration: 00m 10s)
  • 22:09 demon@tin: Started deploy [gerrit/gerrit@b14b43b]: wikimedia plugin
  • 21:43 urandom: rolling restart of restbase dev environment
  • 20:50 demon@tin: Pruned MediaWiki: 1.31.0-wmf.25 [keeping static files] (duration: 01m 26s)
  • 20:46 mholloway-shell@tin: Finished deploy [mobileapps/deploy@e223f51]: Update mobileapps to 534f95d (duration: 05m 23s)
  • 20:41 mholloway-shell@tin: Started deploy [mobileapps/deploy@e223f51]: Update mobileapps to 534f95d
  • 20:29 no_justification: gerrit: restarting services to pick up bugfix
  • 20:26 demon@tin: Finished deploy [gerrit/gerrit@f6c5350]: update to 2.14.7-9-g0f04397dbd (duration: 00m 10s)
  • 20:25 demon@tin: Started deploy [gerrit/gerrit@f6c5350]: update to 2.14.7-9-g0f04397dbd
  • 19:55 andrew@tin: Finished deploy [horizon/deploy@99153e4]: Rolling out fix for security groups, 421983 (duration: 03m 12s)
  • 19:52 andrew@tin: Started deploy [horizon/deploy@99153e4]: Rolling out fix for security groups, 421983
  • 19:44 ejegg: updated payments-wiki from 9e83e7f7a0 to 320a6c2600
  • 19:23 mobrovac@tin: Finished deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations (duration: 01m 22s)
  • 19:21 mobrovac@tin: Started deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations
  • 19:21 mobrovac@tin: Finished deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations (duration: 04m 16s)
  • 19:17 mobrovac@tin: Started deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations
  • 19:17 mobrovac@tin: Finished deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations (duration: 03m 29s)
  • 19:13 mobrovac@tin: Started deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations
  • 19:12 mobrovac@tin: Finished deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations (duration: 03m 49s)
  • 19:09 mobrovac@tin: Started deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations
  • 19:06 mobrovac@tin: Finished deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations (duration: 19m 28s)
  • 18:46 mobrovac@tin: Started deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations
  • 18:28 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Enable mobile-only Mediawiki:MainPageCss styles for Hindi wiki T190101 (duration: 00m 58s)
  • 17:26 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2010.codfw.wmnet
  • 17:26 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2006.codfw.wmnet
  • 16:38 demon@tin: Pruned MediaWiki: 1.31.0-wmf.23 (duration: 05m 03s)
  • 13:55 hashar: restarting CI Jenkins . Upgrades Mail plugin from 1.20 to 1.21 | T190393
  • 13:30 moritzm: restarting HHVM on app server canaries to pick up ICU security update (not rebooting as logged before)
  • 13:30 moritzm: rebooting app server canaries to pick up ICU security update
  • 13:27 zeljkof: EU SWAT finished
  • 13:26 zfilipin@tin: Synchronized php-1.31.0-wmf.26/extensions/MobileFrontend/: SWAT: Squash: Hygiene: Auto namespace ResourceLoader modules and Add $wgMFMobileMainPageCss config flag; Hygiene: Auto namespace ResourceLoader modules; Add $wgMFMobileMainPageCss config flag (T190101) (duration: 01m 01s)
  • 13:23 ottomata: temporarily stopping puppet on kafka102[023] to use --new.consumer mirrormaker consuming from end
  • 13:21 zfilipin@tin: Synchronized wmf-config/abusefilter.php: SWAT: Enable AbuseFilter profiler at zh.wikipedia (T190663) (duration: 01m 00s)
  • 13:13 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add tboverride to engineer at ruwiki (T190619) (duration: 01m 01s)
  • 12:47 godog: add ms-be204[0-3] with minimal weight - T189633
  • 12:40 arturo: reboot labservices1002 for T189115
  • 12:30 arturo: reboot labnet100[2,3,4]* for T189115
  • 12:30 arturo: reboot labbwr100[2,3,4] for T189115
  • 12:00 arturo: reboot labmon100[1,2] for T189115
  • 12:00 moritzm: restarting HHVM on mediawiki canaries to pick up ICU security update
  • 11:47 arturo: reboot labcontrol100[3,4] for T189115
  • 11:31 arturo: reboot labcontrol1002 for T189115
  • 11:16 akosiaris: depool scb hosts for mathoid service. T184919
  • 11:16 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: service=mathoid,cluster=scb,name=scb.*
  • 10:56 moritzm: installing ICU security updates for jessie/stretch
  • 10:39 arturo: reboot silver for T189115
  • 10:34 arturo: reboot californium for T189115
  • 10:26 moritzm: upgrading debdeploy across the fleet to 0.0.99.4
  • 10:23 moritzm: uploaded debdeploy 0.0.99.4 to apt.wikimedia (for trusty/jessie/stretch)
  • 08:17 moritzm: upgrading debdeploy across the fleet to latest release
  • 07:33 elukey: stop eventlogging zmq-forwarder on eventlog1001 as part of decom process - T189566
  • 05:39 _joe_: restarting pdfrenderer on scb1001,1003
  • 02:00 l10nupdate@tin: LocalisationUpdate failed: git pull of core failed

2018-03-24

  • 20:22 foks: rm 2fa from Awight@officewiki
  • 15:00 elukey: rm -rf /srv/mediawiki/core on stat100[456] and force puppet run (git pull returned fatal: protocol error: bad pack header)
  • 02:33 bblack: powercycle cp3048
  • 02:31 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp3048.esams.wmnet
  • 01:27 Krinkle: Correct retention rules for Whisper files on graphite2001 and graphite1001 per T179622 (/var/lib/carbon/whisper/VisualEditor/*)
  • 00:39 Krinkle: Correct retention rules for Whisper files on graphite2001 and graphite1001 per T179622 (/var/lib/carbon/whisper/mw/*)

2018-03-23

  • 21:35 ebernhardson: delete indices for deleted wikis (from deleted.dblist) in eqiad and codfw elasticsearch clusters: alswikiquote, alswiktionary, mowiki, mowiktionary, ukwikimedia
  • 19:24 sbisson@tin: Finished deploy [kartotherian/deploy@f716cde]: Deploying i18n feature to maps-test* (take 4) (duration: 06m 58s)
  • 19:17 sbisson@tin: Started deploy [kartotherian/deploy@f716cde]: Deploying i18n feature to maps-test* (take 4)
  • 19:11 sbisson@tin: Finished deploy [kartotherian/deploy@f716cde]: Deploying i18n feature to maps-test* (take 3) (duration: 04m 19s)
  • 19:07 sbisson@tin: Started deploy [kartotherian/deploy@f716cde]: Deploying i18n feature to maps-test* (take 3)
  • 19:06 sbisson@tin: Finished deploy [kartotherian/deploy@f716cde]: Deploying i18n feature to maps-test* (take 2) (duration: 06m 23s)
  • 18:59 sbisson@tin: Started deploy [kartotherian/deploy@f716cde]: Deploying i18n feature to maps-test* (take 2)
  • 18:28 sbisson@tin: Finished deploy [kartotherian/deploy@a66ff1d]: Deploying i18n feature to maps-test* (duration: 00m 29s)
  • 18:28 sbisson@tin: Started deploy [kartotherian/deploy@a66ff1d]: Deploying i18n feature to maps-test*
  • 15:43 moritzm: uploaded debdeploy 0.0.99.3 to apt.wikimedia.org (now based on Python 3 for the clients)
  • 15:08 ema: cache_codfw: begin reboots for retpoline kernel upgrades T188092
  • 15:02 bawolff@tin: Synchronized php-1.31.0-wmf.26/includes/api/ApiQueryUserContributions.php: T190507 (duration: 00m 59s)
  • 13:24 moritzm: installing postgres security updates on rhenium
  • 12:51 oblivian@puppetmaster2001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=appserver,service=apache2
  • 12:48 oblivian@puppetmaster2001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=api_appserver,service=apache2
  • 11:46 jynus@tin: Synchronized wmf-config/db-eqiad.php: Increase db1072 weight (duration: 00m 59s)
  • 11:19 moritzm: installing libvorbis security updates on trusty (Debian already fixed)
  • 11:09 elukey: restarting jvm daemons on analytics100[12] (Hadoop Masters) for openjdk-8 upgrade
  • 10:59 jynus: deployed new replication filter for labsdb1004 on u2815__p.all_articles T190488
  • 10:49 ladsgroup@tin: Synchronized wmf-config/Wikibase-production.php: Disable reading wb_terms search fields on wikidata (T189777) (duration: 00m 59s)
  • 10:36 elukey: upload cassandra2.2.6-wmf3 to jessie/stretch-wikimedia -C component/cassandra22 - T189529
  • 10:22 moritzm: restarting apache on krypton to pick up curl security update
  • 10:00 moritzm: installing plexus-utils2 security updates
  • 09:49 moritzm: armed keyholder on deploy1001
  • 08:19 elukey: reboot eventlog1001 for kernel upgrades

2018-03-22

  • 23:40 Amir1: Evening SWAT is done
  • 23:40 Amir1: Just to note, if you are seeing any performance regression (specially database-wise) 421333 might be the reason
  • 23:39 ladsgroup@tin: Synchronized wmf-config/Wikibase-production.php: Disable reading wb_terms search fields on wikidata (T189777) (duration: 00m 58s)
  • 23:29 ladsgroup@tin: Synchronized static/images/project-logos/nds_nlwiki.png: Update logo for Dutch Low Saxon Wikipedia (T190051) (duration: 00m 58s)
  • 23:27 ladsgroup@tin: Synchronized static/images/project-logos/nds_nlwiki-2x.png: Update logo for Dutch Low Saxon Wikipedia (T190051) (duration: 00m 56s)
  • 23:26 ladsgroup@tin: Synchronized static/images/project-logos/nds_nlwiki-1.5x.png: Update logo for Dutch Low Saxon Wikipedia (T190051) (duration: 00m 58s)
  • 23:23 ladsgroup@tin: Synchronized static/images/project-logos/nds_nlwiki-1.5x.png: static/images/project-logos/nds_nlwiki-2x.png static/images/project-logos/nds_nlwiki.png Update logo for Dutch Low Saxon Wikipedia (T190051) (duration: 00m 59s)
  • 22:47 mutante: restarting Gerrit to apply config changes gerrit:406145 and gerrit:410474
  • 22:25 mutante: icinga - re-enabling notifications for a LOT of "systemd checks" that were all OK since a longer time but had not been re-enabled after some maintenance
  • 20:18 andrewbogott: reimaged labtestvirt2002
  • 19:52 demon@tin: rebuilt and synchronized wikiversions files: group2 to wmf.26
  • 19:34 cmjohnson1: db1052 replacing disk slot 8
  • 18:52 XioNoX: done with the asw-a/b/c-eqiad switches uplink work
  • 18:43 Amir1: Morning SWAT is done
  • 18:41 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable VirtualPageViews on s6 (ja,ru,fr) wikis (T189906) (duration: 01m 16s)
  • 17:59 ppchelko@tin: Finished deploy [changeprop/deploy@4f9fbe4]: Purge page metadata and references on html change and page deletion. (duration: 01m 16s)
  • 17:57 ppchelko@tin: Started deploy [changeprop/deploy@4f9fbe4]: Purge page metadata and references on html change and page deletion.
  • 17:44 mutante: install1002 - restarted dhcp server to confirm there was no syntax error
  • 17:21 ppchelko@tin: Finished deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints, with increased timeout (duration: 03m 15s)
  • 17:18 ppchelko@tin: Started deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints, with increased timeout
  • 17:14 ppchelko@tin: Finished deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints, take 3 (duration: 02m 54s)
  • 17:11 ppchelko@tin: Started deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints, take 3
  • 17:10 ppchelko@tin: Finished deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints, take 2 (duration: 03m 00s)
  • 17:07 ppchelko@tin: Started deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints, take 2
  • 17:03 ppchelko@tin: Finished deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints (duration: 08m 39s)
  • 16:55 ppchelko@tin: Started deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints
  • 16:42 moritzm: installing postgres security updates on netmon*
  • 16:28 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Redeploy GlobalPreferences to test wikis and mw.org" (T189806) (duration: 01m 14s)
  • 16:28 moritzm: restarting graphite on labmon1001 to pick up uwsgi security update
  • 16:04 XioNoX: starting the asw-a/b/c-eqiad switches uplink work
  • 15:43 sbisson@tin: Finished deploy [tilerator/deploy@e259530]: Weekly progress to production (duration: 00m 43s)
  • 15:42 sbisson@tin: Started deploy [tilerator/deploy@e259530]: Weekly progress to production
  • 15:37 sbisson@tin: Finished deploy [kartotherian/deploy@8f3a903]: Weekly progress to production (duration: 02m 27s)
  • 15:35 sbisson@tin: Started deploy [kartotherian/deploy@8f3a903]: Weekly progress to production
  • 15:23 sbisson@tin: Finished deploy [tilerator/deploy@e259530]: Deploying weekly progress to maps-test* (duration: 00m 26s)
  • 15:23 ottomata: ran puppet-merge on puppetmaster2001, got ssh: connect to host puppetmaster1001.eqiad.wmnet port 22: Connection timed out, hope all is ok. T189891
  • 15:23 sbisson@tin: Started deploy [tilerator/deploy@e259530]: Deploying weekly progress to maps-test*
  • 15:17 moritzm: installing openssh updates from stretch point release
  • 15:14 cmjohnson1: db1054 replacing disk at slot 1
  • 15:10 cmjohnson1: replacing disk slot 11 db1061
  • 15:09 sbisson@tin: Finished deploy [kartotherian/deploy@8f3a903]: Deploying weekly progress to maps-test* (duration: 01m 59s)
  • 15:08 moritzm: installing java-atk-wrapper updates from stretch point release
  • 15:07 sbisson@tin: Started deploy [kartotherian/deploy@8f3a903]: Deploying weekly progress to maps-test*
  • 14:57 moritzm: installing cups update from stretch point release (we only install the client libs)
  • 14:24 jynus: killing ongoing truncate to investigate s3 issues
  • 14:16 elukey: rolling restart of the three hadoop hdfs journal nodes (an1028/35/52) for openjdk-8 upgrades
  • 14:00 godog: reimage puppetmaster1001 - T184562
  • 13:57 zeljkof: EU SWAT finished
  • 13:55 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Properly setup ProofreadPage namespaces for cywikisource (T181406) (duration: 01m 16s)
  • 13:38 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Make eswikibooks logo normal size (T190366) (duration: 01m 16s)
  • 13:29 mobrovac@tin: Finished deploy [zotero/translators@1c30955]: Update translators - T188893 (duration: 00m 08s)
  • 13:29 mobrovac@tin: Started deploy [zotero/translators@1c30955]: Update translators - T188893
  • 13:27 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change bewikibooks logo (T189218) (duration: 01m 15s)
  • 13:25 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Change bewikibooks logo (T189218) (duration: 01m 16s)
  • 13:23 godog: reenabling puppet fleetwide to enable CA switch - T189891
  • 13:11 ladsgroup@tin: Synchronized wmf-config/Wikibase.php: Remove forceWriteTermsTableSearchFields from testwikidatawiki, part II (T189776) (duration: 01m 15s)
  • 13:09 ladsgroup@tin: Synchronized wmf-config/Wikibase-production.php: Remove forceWriteTermsTableSearchFields from testwikidatawiki, part I (T189776) (duration: 01m 16s)
  • 13:05 godog: stop rsync of ca/volatile on puppetmaster1001
  • 12:31 godog: chown puppet:puppet /var/lib/puppet/server/ssl/ca on puppetmaster2001
  • 12:20 godog: running puppet on puppetmaster[21]001 - T189891
  • 12:12 godog: stopping puppet fleetwide for ca migration - T189891
  • 11:20 elukey: rolling restart of the hadoop hdfs datanode daemons on all the analytics hadoop workers for openjdk-8 upgrade
  • 11:18 apergos: and a third time to try updating the puppet compiler facts, this time using puppetmaster2001
  • 11:09 arturo: T189722 reboot labtestvirt2002 to downgrade kernel
  • 11:02 moritzm: installing plexus-utils security updates
  • 11:01 arturo: T189722 reboot labtestvirt2001 to downgrade kernel
  • 10:53 apergos: due to miscommunication, second update of puppet compiler facts happening now. oh well
  • 10:42 elukey: update puppet compiler's fact
  • 10:28 ema: cp-upload_esams: carry on with reboots for retpoline kernel updates T188092
  • 10:10 ema: repool cp3010
  • 09:55 elukey: rolling restart of yarn nodemanagers on the analytics hadoop workers for openjdk-8 upgrade
  • 09:21 marostegui: Truncate updatelog on s3 - T174804
  • 09:19 marostegui: Truncate updatelog on s1 - T174804
  • 09:04 marostegui: Truncate updatelog on s7 - T174804
  • 08:51 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1060 (duration: 01m 15s)
  • 08:45 marostegui: Truncate updatelog on s2 - T174804
  • 08:30 marostegui: Truncate updatelog on s4,s5,s6,s8 - T174804
  • 08:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool pc1006 after kernel, mariadb and socket location upgrade (duration: 01m 11s)
  • 08:21 jynus: upgrade and restart db1060
  • 08:17 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060 (duration: 01m 15s)
  • 08:06 marostegui: Restart pt-heartbeat on pc2006
  • 08:05 marostegui: Restart pt-heartbeat on pc2004 and pc2005
  • 08:04 marostegui: Restart pt-heartbeat on pc1004 and pc1005
  • 07:59 marostegui: Stop MySQL on pc1006 for kernel, mariadb and socket path upgrade
  • 07:58 elukey: depool cp3010 + powercycle (no ssh access, mgmt console frozen)
  • 07:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool pc1006 for kernel, mariadb and socket location upgrade (duration: 01m 16s)
  • 06:25 marostegui: Remove db1001 from tendril - T190262
  • 06:25 marostegui: Stop MySQL on db1001 to get ready to decommission it - T190262
  • 06:16 marostegui: Reload dbproxy1006 to pick up the new standby host - T183469
  • 06:16 marostegui: Reload dbproxy1001 to pick up the new standby host - T183469
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.25) (duration: 07m 46s)
  • 01:52 ebernhardson: increase cluster.routing.allocation.disk.watermark.low to 80% on eqiad elasticsearch due to shards not allocating during reindex
  • 01:10 ebernhardson: started in-place reindex of all wikis on both elasticsearch clusters
  • 00:02 andrewbogott: restarted nova-network on labnet1001 and nova-compute on labvirt1015 as part of debugging T190367
  • 00:00 Amir1: Evening SWAT is done
  • 00:00 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: guwiki: fix rollback -> rollbacker (group) (T190370) (duration: 01m 16s)

2018-03-21

  • 23:53 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Migrate $wgOresModels to the new config system (T189948) (duration: 01m 16s)
  • 23:41 ladsgroup@tin: Synchronized wmf-config/throttle.php: Add new throttle rule and add task for one in comment (duration: 01m 16s)
  • 23:36 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: guwiki: clean up $wg{Add,Remove}Groups configuration (duration: 01m 16s)
  • 23:21 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable $wgAbuseFilterProfile & $wgAbuseFilterRuntimeProfile on eswikibooks, part II (T190264) (duration: 01m 15s)
  • 23:19 ladsgroup@tin: Synchronized wmf-config/abusefilter.php: Enable $wgAbuseFilterProfile & $wgAbuseFilterRuntimeProfile on eswikibooks, part I (T190264) (duration: 01m 15s)
  • 22:33 eileen: civicrm revision changed from 3291ad35c9 to 85c89c7d0a, config revision is 03511638ed
  • 22:32 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: Revert global prefs (duration: 01m 15s)
  • 22:18 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/421185/ (duration: 01m 15s)
  • 21:55 andrew@tin: Synchronized wmf-config/CommonSettings.php: turning off wgReadOnly on labtestwikitech (duration: 01m 16s)
  • 20:34 mlitn@tin: Finished deploy [3d2png/deploy@812a68a]: Updating 3d2png (duration: 02m 57s)
  • 20:31 mlitn@tin: Started deploy [3d2png/deploy@812a68a]: Updating 3d2png
  • 20:23 mholloway-shell@tin: Finished deploy [mobileapps/deploy@675837f]: Update mobileapps to e6b50a0 (duration: 05m 33s)
  • 20:17 mholloway-shell@tin: Started deploy [mobileapps/deploy@675837f]: Update mobileapps to e6b50a0
  • 19:12 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.26
  • 19:09 demon@tin: Synchronized php: symlink bump (duration: 01m 15s)
  • 19:05 demon@tin: Synchronized wmf-config/CommonSettings-labs.php: rvv (duration: 01m 15s)
  • 19:03 anomie: Deleted some 12-year-old open proxy blocks to resolve T189840.
  • 18:36 demon@tin: Synchronized wmf-config/CommonSettings-labs.php: beta-only (duration: 01m 16s)
  • 18:34 demon@tin: Synchronized scap/plugins/prep.py: consistency (duration: 01m 17s)
  • 18:09 oblivian@puppetmaster2001: conftool action : set/pooled=yes; selector: name=mw22(59|[6-9][0-9])\.codfw\.wmnet
  • 18:08 _joe_: pooling all the new codfw appservers
  • 18:05 maxsem@tin: Synchronized wmf-config/throttle.php: https://gerrit.wikimedia.org/r/#/c/420397/ (duration: 01m 15s)
  • 18:02 godog: delete obsolete metrics from prometheus following https://gerrit.wikimedia.org/r/c/421086
  • 17:46 maxsem@tin: Synchronized wmf-config/Wikibase.php: https://gerrit.wikimedia.org/r/#/c/420336/ (duration: 01m 15s)
  • 17:43 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/421046/ (duration: 01m 15s)
  • 17:35 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/420947/ (duration: 01m 15s)
  • 17:30 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/420910/ (duration: 01m 16s)
  • 17:26 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/419528/ (duration: 01m 15s)
  • 17:22 volans@tin: Finished deploy [puppetboard/deploy@d6514d6]: Adjust wsgi config - T184563 (duration: 00m 06s)
  • 17:22 volans@tin: Started deploy [puppetboard/deploy@d6514d6]: Adjust wsgi config - T184563
  • 17:17 maxsem@tin: Synchronized dblists/flow.dblist: https://gerrit.wikimedia.org/r/#/c/420799/ (duration: 01m 12s)
  • 17:11 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/419611/ (duration: 01m 15s)
  • 17:07 ppchelko@tin: Finished deploy [cpjobqueue/deploy@545cb61]: Increase refreshLinks concurrency to 20 per partition (duration: 00m 37s)
  • 17:06 ppchelko@tin: Started deploy [cpjobqueue/deploy@545cb61]: Increase refreshLinks concurrency to 20 per partition
  • 16:53 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 fully, post-silver cleanup (duration: 01m 14s)
  • 16:53 _joe_: running systemd-tmpfiles --create on the new appservers
  • 16:51 jynus@tin: Synchronized wmf-config/db-codfw.php: Post-silver cleanup (duration: 01m 03s)
  • 16:48 andrew@tin: Synchronized wmf-config/CommonSettings.php: one of many wikitech cleanups (duration: 01m 38s)
  • 16:46 andrew@tin: Synchronized wmf-config/InitialiseSettings.php: one of many wikitech cleanups (duration: 03m 12s)
  • 16:42 andrew@tin: Synchronized wmf-config/wikitech.php: first of many wikitech cleanups (duration: 03m 16s)
  • 16:12 andrew@tin: Synchronized wmf-config/filebackend.php: labtestwikitech -> swift (duration: 01m 14s)
  • 16:10 andrew@tin: Synchronized wmf-config/InitialiseSettings.php: labtestwikitech -> swift (duration: 01m 15s)
  • 16:07 oblivian@puppetmaster2001: conftool action : set/pooled=inactive; selector: name=mw22(59|[6-9][0-9])\.codfw\.wmnet
  • 15:53 ppchelko@tin: Finished deploy [cpjobqueue/deploy@b291728]: Partition the refreshLinks topic by DB shard T189738 take 2 (duration: 00m 40s)
  • 15:53 ppchelko@tin: Started deploy [cpjobqueue/deploy@b291728]: Partition the refreshLinks topic by DB shard T189738 take 2
  • 15:51 ppchelko@tin: Finished deploy [cpjobqueue/deploy@0dcdc82]: Partition the refreshLinks topic by DB shard T189738 (duration: 03m 03s)
  • 15:48 ppchelko@tin: Started deploy [cpjobqueue/deploy@0dcdc82]: Partition the refreshLinks topic by DB shard T189738
  • 15:28 ema@neodymium: conftool action : set/pooled=yes; selector: name=cp3032.esams.wmnet,service=varnish-be
  • 15:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool es1019 after socket location upgrade (duration: 01m 12s)
  • 15:24 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 with low load (duration: 01m 15s)
  • 15:11 volans@tin: Finished deploy [puppetboard/deploy@81cd93a]: Adjust wsgi config - T184563 (duration: 00m 06s)
  • 15:11 volans@tin: Started deploy [puppetboard/deploy@81cd93a]: Adjust wsgi config - T184563
  • 15:05 jynus: stop, upgrade and restart db1079
  • 15:05 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 (duration: 01m 15s)
  • 13:39 ema@neodymium: conftool action : set/pooled=no; selector: name=cp3032.esams.wmnet,service=varnish-be
  • 13:23 ema@neodymium: conftool action : set/pooled=yes; selector: name=cp3032.esams.wmnet,service=varnish-be
  • 13:20 zeljkof: EU SWAT finished
  • 13:19 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: config: Enable testwiki NavTiming oversample for a bunch more countries (T190229) (duration: 01m 15s)
  • 13:12 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule (T189778) (duration: 01m 16s)
  • 11:33 moritzm: rolling restart of Kibana/Logstash to pick up OpenJDK security update
  • 11:32 ema: cache_misc@esams: upgrade varnish to 5.1.3-1wm7
  • 11:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Cleanup old hosts (duration: 01m 18s)
  • 11:29 jynus@tin: Synchronized wmf-config/db-codfw.php: Cleanup old hosts (duration: 01m 13s)
  • 11:17 ema: varnish 5.1.3-1wm7 uploaded to apt.w.o
  • 10:51 marostegui: Stop MySQL on db1016 to clone db1065 - T183469
  • 10:47 moritzm: rolling restart of elasticsearch on logstash to pick up OpenJDK security update
  • 10:40 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1065 from config - T183469 (duration: 01m 15s)
  • 10:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1065 from config - T183469 (duration: 01m 15s)
  • 10:37 moritzm: rolling restart of elasticsearch on relforge to pick up OpenJDK security update
  • 10:16 volans: re-enabling puppet on einsteinium (icinga host) see T177253#4067901
  • 09:57 moritzm: installing php5 security updates on trusty (jessie already fixed)
  • 09:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 - T183469 (duration: 01m 15s)
  • 09:47 moritzm: installing tiff security updates on trusty
  • 09:40 marostegui: Stop db1065 and db1106 in sync - this will generate lag on labs
  • 09:23 ema@neodymium: conftool action : set/pooled=no; selector: name=cp3032.esams.wmnet,service=varnish-be
  • 09:11 ema@neodymium: conftool action : set/pooled=yes; selector: name=cp3033.esams.wmnet,service=varnish-be
  • 09:11 marostegui: Stop mysql on db2078 for new socket config
  • 08:56 marostegui: Stop mysql on db2037 for new socket config
  • 08:46 ema@neodymium: conftool action : set/pooled=no; selector: name=cp3033.esams.wmnet,service=varnish-be
  • 08:46 ema@neodymium: conftool action : set/pooled=yes; selector: name=cp3030.esams.wmnet,service=varnish-be
  • 08:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 - T183469 (duration: 01m 14s)
  • 08:35 ema@neodymium: conftool action : set/pooled=no; selector: name=cp3030.esams.wmnet,service=varnish-be
  • 08:35 ema@neodymium: conftool action : set/pooled=yes; selector: name=cp3033.esams.wmnet,service=varnish-be
  • 08:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool pc1005 after kernel, mariadb and socket location upgrade (duration: 01m 15s)
  • 08:20 ema@neodymium: conftool action : set/pooled=no; selector: name=cp3033.esams.wmnet,service=varnish-be
  • 08:19 ema@neodymium: conftool action : set/pooled=no; selector: name=cp3033,service=varnish-be
  • 08:10 hashar: contint1001: deleting some old docker images
  • 08:09 hashar: contint1001: docker image prune ; docker container prune # T178663
  • 08:09 hashar: contint1001: docker image prune ; docker container prune
  • 08:08 marostegui: Stop MySQL on pc1005 for kernel, mariadb and socket path upgrade
  • 08:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool pc1005 for kernel, mariadb and socket location upgrade (duration: 01m 15s)
  • 07:07 marostegui: Remove db1020 from tendril - T189773
  • 07:02 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1020 from config - T189773 (duration: 01m 15s)
  • 07:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1020 from config - T189773 (duration: 01m 13s)
  • 06:50 marostegui: Stop MySQL on db1020 - T189773
  • 06:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1019 after socket location upgrade (duration: 01m 14s)
  • 06:29 marostegui: Stop MySQL on es1019 to upgrade socket path
  • 06:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1019 - socket location upgrade (duration: 01m 21s)
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.25) (duration: 06m 18s)
  • 01:51 herron: codfw puppetdb upgrade complete. eqiad puppetmaster remains depooled T177253

2018-03-20

  • 23:41 Krinkle: Mass no-op resizing of Whisper files on graphite2001 and graphite1001 for T179622 (webpagetest.* namespace)
  • 23:01 MaxSem: Cleaned centralauth.global_preferences after testing
  • 22:58 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: Revert GlobalPreferences (duration: 01m 17s)
  • 22:55 volans@tin: Finished deploy [puppetboard/deploy@0975558]: Initial sync (duration: 00m 05s)
  • 22:55 volans@tin: Started deploy [puppetboard/deploy@0975558]: Initial sync
  • 22:54 maxsem@tin: Finished scap: Test deployment of GlobalPreferences (duration: 39m 31s)
  • 22:41 volans@tin: Finished deploy [puppetboard/deploy@0975558]: Initial sync (duration: 00m 07s)
  • 22:41 volans@tin: Started deploy [puppetboard/deploy@0975558]: Initial sync
  • 22:14 maxsem@tin: Started scap: Test deployment of GlobalPreferences
  • 21:02 volans@tin: Finished deploy [puppetboard/deploy@0975558]: Initial sync (duration: 00m 26s)
  • 21:02 volans@tin: Started deploy [puppetboard/deploy@0975558]: Initial sync
  • 20:55 volans@tin: Finished deploy [puppetboard/deploy@0975558]: Initial sync (duration: 00m 09s)
  • 20:55 volans@tin: Started deploy [puppetboard/deploy@0975558]: Initial sync
  • 20:44 volans@tin: Finished deploy [puppetboard/deploy@0975558]: Initial sync (duration: 02m 19s)
  • 20:42 volans@tin: Started deploy [puppetboard/deploy@0975558]: Initial sync
  • 20:28 papaul: OS install on mw2259-mw2290
  • 19:36 herron: temporarily disabling puppet agents for puppetdb upgrade
  • 19:28 demon@tin: rebuilt and synchronized wikiversions files: group0 to wmf.26
  • 19:23 ejegg: updated payments-wiki from 30f5f3edfb to 9e83e7f7a0
  • 18:54 demon@tin: Finished scap: bootstrap wmf.26 (duration: 42m 16s)
  • 18:33 ema: varnish 5.1.3-1wm6 uploaded to apt.w.o
  • 18:30 eevans@tin: Finished deploy [restbase/deploy@8dbc93c] (dev-cluster): Update Dev environment to current production (T186751) (duration: 02m 43s)
  • 18:27 eevans@tin: Started deploy [restbase/deploy@8dbc93c] (dev-cluster): Update Dev environment to current production (T186751)
  • 18:24 eevans@tin: Finished deploy [restbase/deploy@8dbc93c] (dev-cluster): Update Dev environment to current production (T186751) (duration: 05m 58s)
  • 18:18 eevans@tin: Started deploy [restbase/deploy@8dbc93c] (dev-cluster): Update Dev environment to current production (T186751)
  • 18:12 demon@tin: Started scap: bootstrap wmf.26
  • 18:10 demon@tin: Synchronized wmf-config/CommonSettings.php: instantcommons for labstestwiki (duration: 01m 58s)
  • 17:30 mholloway-shell@tin: Finished deploy [mobileapps/deploy@fad1009]: Update mobileapps to 634a15f (duration: 05m 34s)
  • 17:29 elukey: test a depool/repool action for kafka1001 (eventbus/jobqueue) - part of an investigation to figure out where timeouts come from
  • 17:24 mholloway-shell@tin: Started deploy [mobileapps/deploy@fad1009]: Update mobileapps to 634a15f
  • 17:06 demon@tin: Pruned MediaWiki: 1.31.0-wmf.24 [keeping static files] (duration: 01m 23s)
  • 17:04 demon@tin: Pruned MediaWiki: 1.31.0-wmf.22 (duration: 02m 57s)
  • 16:38 jynus: running reset slave all on db1063 T189655
  • 16:16 akosiaris: restart bacula-dir T189655
  • 16:14 akosiaris: restart etherpad T189655
  • 16:13 jynus: db1063 in read-write (m1) again
  • 16:10 jynus: set m1 in read only
  • 16:09 jynus: heartbeat killed on m1-master
  • 16:02 herron: restarted apache2 on puppetmaster1001
  • 16:00 jynus: disable puppet on db1063, db1016
  • 15:57 jynus: changing replication topology of m1
  • 15:51 no_justification: gerrit: restarting services to pick up 2.14.6 -> 2.14.7 upgrade
  • 15:49 demon@tin: Finished deploy [gerrit/gerrit@09534cb]: gerrit 2.14.7 (duration: 00m 12s)
  • 15:49 demon@tin: Started deploy [gerrit/gerrit@09534cb]: gerrit 2.14.7
  • 15:20 marostegui: Drop empty (confirmed) table slots from s3 - T190153
  • 14:59 herron: codfw puppet masters upgraded to puppetdb4. placing puppet agents into icinga downtime and beginning puppet --noop runs (to send facts to new puppetdb) T177253
  • 14:58 marostegui: Drop empty (confirmed) table slots from s7 - T190153
  • 14:55 marostegui: Drop empty (confirmed) table slots from s6 - T190153
  • 14:53 twentyafterfour@tin: testing scap on tin
  • 14:53 marostegui: Drop empty (confirmed) table slots from s8 - T190153
  • 14:52 marostegui: Drop empty (confirmed) table slots from s5 - T190153
  • 14:52 marostegui: Drop empty (confirmed) table slots from s4 - T190153
  • 14:47 godog: upload scap 3.7.7-1 - T189306
  • 14:42 marostegui: Drop empty (confirmed) table slots from s2 - T190153
  • 14:40 marostegui: Drop empty (confirmed) table slots from s1 - T190153
  • 14:14 moritzm: rolling restart of elasticsearch in deployment-prep for new Java update
  • 14:03 ema: cp3007: upgrade varnish to 5.1.3-1wm5
  • 14:00 ema: upload varnish_5.1.3-1wm5 to apt.w.o
  • 13:59 ayounsi@tin: Finished deploy [netbox/deploy@7e29963]: Fixing netbox typo LDAP (duration: 00m 28s)
  • 13:59 ayounsi@tin: Started deploy [netbox/deploy@7e29963]: Fixing netbox typo LDAP
  • 13:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065, give main traffic to db1106 - T183469 (duration: 00m 58s)
  • 13:29 herron: depooling codfw puppet masters via dns T177253
  • 12:59 moritzm: restarting apache on bohrium/piwik to pick up curl security update
  • 12:53 jynus: applying schema change to wikishared.cx_translations T190133
  • 12:50 arturo: reboot labtestservices2003 for T189722
  • 12:33 arturo: reboot labtestservices2002 for T189722
  • 12:04 arturo: reboot labtestservices2001 for T189722
  • 11:28 godog: run compiler-update-facts
  • 11:07 arturo: reboot labtestnet2002 for T189722
  • 11:03 jynus: upgrade and reboot db1095 - this can create temp. lag on wikireplicas
  • 10:50 arturo: reboot again labtestnet2001 for T189722. Now with a proper grub menu
  • 10:44 jynus: upgrade and reboot db1102 - this can create temporary lag on wikireplicas
  • 10:44 arturo: reboot labtestnet2001 for T189722
  • 09:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool es1012 after kernel, mariadb and socket location upgrade (duration: 00m 58s)
  • 09:28 jynus: repool labsdb1009 after upgrade
  • 09:11 moritzm: restarting apache on netmon* to pick up curl security updates
  • 09:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1012 after kernel, mariadb and socket location upgrade (duration: 00m 57s)
  • 09:01 hashar: restarting Jenkins for java update
  • 08:50 marostegui: Stop MySQL on es1012 for mariadb, kernel and socket location upgrade
  • 08:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1012 for kernel, mariadb and socket location upgrade (duration: 00m 58s)
  • 08:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool pc1004 after kernel, mariadb and socket location upgrade (duration: 00m 58s)
  • 08:41 jynus: upgrade and restart labsdb1009
  • 08:34 jynus: depool labsdb1009
  • 08:25 moritzm: installing curl security updates
  • 08:23 marostegui: Stop MySQL on pc1004 for mariadb, kernel and socket location upgrade
  • 08:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool pc1004 for kernel, mariadb and socket location upgrade (duration: 00m 57s)
  • 07:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1106 in s1 - T183469 (duration: 00m 58s)
  • 06:18 marostegui: Deploy schema change on s4 primary master db1068 - T187089 T185128 T153182
  • 04:18 krinkle@tin: Synchronized wmf-config/throttle-analyze.php: (no justification provided) (duration: 00m 58s)
  • 04:17 krinkle@tin: Synchronized wmf-config/throttle.php: (no justification provided) (duration: 00m 58s)
  • 03:56 Krinkle: Deleting stale webpagetest.* metrics on graphite1001 and graphite2001 (any wsp file last modified 600+ days ago) – T179622
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.25) (duration: 06m 40s)
  • 00:00 reedy@tin: Synchronized wmf-config/CommonSettings.php: Allow protocol-relative URLs in TemplateStyles (duration: 00m 59s)

2018-03-19

  • 23:43 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Wikidata description override on testwiki (duration: 00m 58s)
  • 23:39 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Log ReadingLists warning (duration: 00m 58s)
  • 23:36 ayounsi@tin: Finished deploy [netbox/deploy@f7faa04]: Fixing netbox deploy issue (duration: 00m 38s)
  • 23:35 reedy@tin: Synchronized multiversion/MWRealm.php: T45956 (duration: 00m 57s)
  • 23:35 ayounsi@tin: Started deploy [netbox/deploy@f7faa04]: Fixing netbox deploy issue
  • 23:27 ayounsi@tin: Finished deploy [netbox/deploy@bed8da1]: Fixing netbox deploy issue (duration: 00m 37s)
  • 23:27 ayounsi@tin: Started deploy [netbox/deploy@bed8da1]: Fixing netbox deploy issue
  • 20:32 mutante: signing puppet certs for new host bast1002. initial puppet run, will replace bast1001 soon (T186623)
  • 20:19 bblack: discarding unused vcl on all cp frontends, 1-at-a-time
  • 20:14 bblack: discarding unused vcl on all cp backends, 1-at-a-time
  • 19:53 andrew@tin: Synchronized wmf-config/wikitech.php: fix for T189347 take 2 (duration: 00m 57s)
  • 19:42 eevans@tin: Finished deploy [restbase/deploy@8dbc93c] (dev-cluster): update dev environment to current production (T186751) (duration: 10m 28s)
  • 19:38 andrew@tin: Synchronized wmf-config/wikitech.php: fix for T189347 (duration: 00m 57s)
  • 19:31 eevans@tin: Started deploy [restbase/deploy@8dbc93c] (dev-cluster): update dev environment to current production (T186751)
  • 19:25 mobrovac@tin: (no justification provided)
  • 19:24 herron: upgraded compiler03.puppet3-diffs.eqiad.wmflabs (depooled) to puppetdb4/postgres backend
  • 19:14 eevans@tin: Finished deploy [restbase/deploy@8dbc93c] (dev-cluster): update dev environment to current production (T186751) (duration: 08m 30s)
  • 19:05 eevans@tin: Started deploy [restbase/deploy@8dbc93c] (dev-cluster): update dev environment to current production (T186751)
  • 19:01 mutante: DNS - authdns-gen-zones -f /srv/authdns/git/templates /etc/gdnsd/zones && gdnsd checkconf && gdnsd reload-zones on ns servers to recreate zone files to add new language "gor" to langs.tmpl (T189109)
  • 19:00 mutante: adding gor.wikipedia.org - new language Gorontalo https://www.ethnologue.com/language/gor | https://meta.wikimedia.org/wiki/Requests_for_new_languages/Wikipedia_Gorontalo
  • 18:44 smalyshev@tin: Finished deploy [wdqs/wdqs@d6bc746]: GUI update (duration: 02m 24s)
  • 18:43 eevans@tin: Finished deploy [restbase/deploy@8dbc93c] (dev-cluster): bring dev environment current w/ production (T186751) (duration: 10m 16s)
  • 18:42 smalyshev@tin: Started deploy [wdqs/wdqs@d6bc746]: GUI update
  • 18:33 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable $wgFlowReadOnly on commonswiki (T186463) (duration: 00m 57s)
  • 18:33 eevans@tin: Started deploy [restbase/deploy@8dbc93c] (dev-cluster): bring dev environment current w/ production (T186751)
  • 18:27 catrope@tin: Synchronized dblists/: Uninstall Flow from wikis where it was never used (T188812) (duration: 00m 57s)
  • 18:14 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable mapframe on knwiki (T189883) (duration: 00m 58s)
  • 18:08 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Add enwiki and commons as import sources to mrwikisource (T188486) (duration: 00m 58s)
  • 15:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1091 (duration: 00m 59s)
  • 15:23 elukey: reboot kafka1003 for kernel upgrades (jobqueues/eventbus)
  • 15:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1091 (duration: 01m 01s)
  • 15:05 hashar: upgrading java on contint1001 / contint2001
  • 14:42 akosiaris: T184919 pool all kubernetes for service mathoid.
  • 14:40 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes1004.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=mathoid'])
  • 14:40 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes1003.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=mathoid'])
  • 14:40 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=mathoid'])
  • 14:40 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes2004.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=mathoid'])
  • 14:40 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes2003.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=mathoid'])
  • 14:40 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes2002.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=mathoid'])
  • 14:34 elukey: reboot kafka1002 (eventbus/jobqueue) for kernel upgrades
  • 14:28 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes2001.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=mathoid'])
  • 14:28 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=mathoid'])
  • 14:18 ema: cp3040: discard old VCL T189892
  • 14:09 moritzm: restarting apache on contint1001 to pick up curl security update
  • 13:48 anomie: Cleaning up orphaned image_comment_temp rows on all wikis for T189985
  • 13:44 anomie@tin: Synchronized php-1.31.0-wmf.25/includes/filerepo/file/LocalFile.php: Applying fix for T189985 (duration: 00m 58s)
  • 13:22 zeljkof: EU SWAT finished
  • 13:20 zfilipin@tin: Synchronized wmf-config/flaggedrevs.php: SWAT: Revert "Restrict FlaggedRevs to only operated on NS_MAIN on arwiki" (T148603 T189224) (duration: 00m 58s)
  • 13:10 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable rollbacker user right at arwikiquote (T189732) (duration: 00m 57s)
  • 13:09 moritzm: reimage mw1294-1296 as video scalers
  • 13:02 arturo: labtestcontrol2001: set GRUB_TIMEOUT=30 in /etc/default/grub, the previous value (10) wasn't enough to display the menu via mgmt
  • 12:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1091 (duration: 00m 57s)
  • 12:40 arturo: T189722 reboot labtestcontrol2001
  • 12:37 moritzm: installing curl security updates
  • 12:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore es1016 original weight after kernel, mariadb and socket location upgrade (duration: 00m 58s)
  • 11:55 _joe_: stopping hhvm on terbium for a test.
  • 11:44 moritzm: reimage mw1293 as video scaler
  • 11:29 godog: point codfw puppet to puppetmaster2001
  • 11:27 hashar@tin: Synchronized docroot/wwwportal/portal: (no justification provided) (duration: 00m 57s)
  • 11:17 ema: cache_misc@esams: upgrade to varnish 5.1.3-1wm4
  • 11:14 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Switching portals submodule to portals-deploy (T180777) (duration: 00m 58s)
  • 11:13 jdrewniak@tin: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Switching portals submodule to portals-deploy (T180777) (duration: 00m 58s)
  • 11:06 moritzm: uploaded openjdk-8 8u162-b12-1~bpo8+1 for jessie-wikimedia to apt.wikimedia.org
  • 10:58 godog: point eqsin puppet to puppetmaster2001
  • 10:53 moritzm: restarting jenkins on releases1001 to pick up Java security update
  • 10:47 godog: point ulsfo puppet to puppetmaster2001
  • 10:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1065 - T183469 (duration: 00m 58s)
  • 10:25 marostegui: Remove db1009 from tendril - T189216
  • 10:14 ema: cp3008: upgrade to varnish 5.1.3-1wm4
  • 09:55 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1009 from config - T189216 (duration: 00m 57s)
  • 09:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1009 from config - T189216 (duration: 00m 58s)
  • 09:45 marostegui: Stop MySQL on db1009 - T189216
  • 09:37 elukey: restart hadoop daemons on analytics1070 for openjdk upgrades (canary)
  • 09:27 godog: reimage puppetmaster2001 with stretch - T184562
  • 09:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1016 after kernel, mariadb and socket location upgrade (duration: 00m 58s)
  • 09:10 godog: depool codfw puppetmaster - T184562
  • 09:08 marostegui: Stop MySQL on es1016 for kernel, mariadb and socket location upgrade
  • 09:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1016 for kernel, mariadb and socket location upgrade (duration: 00m 58s)
  • 08:57 moritzm: installing openjdk-8 security updates
  • 08:41 elukey: reboot thorium for kernel security upgrades (hosts all analytics websites, they will go down temporarily)
  • 08:26 moritzm: installing libvorbis security updates
  • 08:22 elukey: revert previous state on aqs1004, the new pkg might need some more work - T189529
  • 08:19 marostegui: Reset slave on db1106 to get it ready for s1 - https://phabricator.wikimedia.org/T183469
  • 08:11 marostegui: Reboot db1106 for kernel upgrade
  • 07:58 elukey: manually installed cassandra-2.2.6-wmf3 on aqs1004 - T189529
  • 07:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 - T183469 (duration: 00m 57s)
  • 07:47 elukey: drain cassandra instances and reboot aqs1004 for kernel upgrades
  • 07:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Move db1106 from s5 to s1 - T183469 (duration: 01m 00s)
  • 07:27 marostegui: Reload dbproxy1002 and dbproxy1007 to get the new config - T189773
  • 06:20 marostegui: Deploy schema change on db1091 - T187089 T185128 T153182
  • 06:13 marostegui: Stop MySQL on db1091 for kernel and mariadb upgrade
  • 06:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1091 for schema change, kernel upgrade and mariadb upgrade (duration: 00m 58s)
  • 02:39 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.25) (duration: 10m 54s)

2018-03-17

  • 18:41 elukey: executed apt-get clean on scb1004 to free some space (root partition disk space warning)
  • 03:09 krinkle@tin: Synchronized docroot/noc/db.php: noc: I410a56431a (duration: 00m 59s)
  • 00:13 mutante: running puppet on all cache::misc to rename director bromine to webserver_misc_static (T188163)

2018-03-16

  • 23:32 mutante: signing puppet cert for vega.codfw.wmnet, initial puppet run after fresh stretch install (T188163)
  • 18:43 mutante: creating new ganeti VM vega.codfw.wmnet to be equivalent of bromine, 1G RAM, 30G disk, 1vCPU (T189899)
  • 18:13 jynus: switching back wikireplica cloud dns to the original config
  • 17:32 jynus: reimage dbproxy1010
  • 16:29 jynus: updating wikireplica_dns 2/3
  • 16:22 moritzm: installing curl security updates
  • 16:09 marostegui: Stop MySQL on db1020 - T189773
  • 14:48 andrewbogott: reset contintcloud quotas as per https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#incorrect_quota_violations
  • 14:48 jynus: reimage dbproxy1011
  • 14:27 andrewbogott: restarting nodepool on nodepool1001
  • 14:25 elukey: reboot druid1002 for kernel updates
  • 14:14 andrewbogott: restarting rabbitmq on labcontrol1001
  • 13:57 andrewbogott: stopping nodepool temporarily during changes to nova.conf
  • 13:41 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2050 (duration: 00m 58s)
  • 13:15 chasemp: disable puppet across cloud things for safe rollout
  • 12:52 moritzm: uploaded libsodium23/php-apcu/php-mailparse to thirdparty/php72 (deps/extensions needed by Phabricator)
  • 12:51 ema: text-esams: reboot for kernel upgrades T188092 and to mitigate https://grafana.wikimedia.org/dashboard/db/varnish-failed-fetches?panelId=7&fullscreen&orgId=1&from=1518746284946&to=1521204628041
  • 12:12 marostegui: Reboot dbproxy1005 for kernel upgrade
  • 12:02 marostegui: Run pt-table-checksum on m2
  • 12:00 marostegui: Run pt-table-checksum on m5
  • 11:11 hashar: zuul: reenqueue all coverage jobs lost when restarting Zuul
  • 10:53 hashar: Upgrading zuul to zuul_2.5.1-wmf4 to resolve a mutex deadlock T189859
  • 10:45 jynus: disable puppet and load balance between 3 wikireplicas on dbproxy1010
  • 10:19 jynus: upgrade and restart of dbproxy1009 (passive)
  • 10:01 elukey: restart eventlogging_sync on db1108 (eventlogging db slave) as a precaution after the change of m4-master.eqiad.wmnet's CNAME
  • 10:00 moritzm: reverting the HHVM/ICU 57 setup on mwdebug2001 which was used for the dry run tests
  • 09:57 elukey: restart eventlogging-consumer@mysql-eventbus on eventlog1002 to force the DNS resolution of m4-master (changed from dbproxy1009 -> dbproxy1004)
  • 09:56 hashar: Zuul coverage pipeline is deadlocked on an unreleased mutex. Will need a new Zuul version.
  • 09:51 elukey: restart eventlogging-consumer@mysql-m4 on eventlog1002 to force the DNS resolution of m4-master (changed from dbproxy1009 -> dbproxy1004)
  • 09:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for es1015 after kernel, mariadb and socket upgrade (duration: 00m 57s)
  • 09:27 oblivian@tin: Finished deploy [netbox/deploy@ccc342a]: Re-deploying with the newly built artifacts/2 (duration: 00m 29s)
  • 09:26 oblivian@tin: Started deploy [netbox/deploy@ccc342a]: Re-deploying with the newly built artifacts/2
  • 09:17 oblivian@tin: (no justification provided)
  • 09:17 oblivian@tin: Finished deploy [netbox/deploy@f3e0159]: Re-deploying with the newly built artifacts (duration: 00m 47s)
  • 09:16 oblivian@tin: Started deploy [netbox/deploy@f3e0159]: Re-deploying with the newly built artifacts
  • 09:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 - T183469 (duration: 00m 57s)
  • 08:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1015 after kernel, mariadb and socket upgrade (duration: 00m 56s)
  • 08:49 jynus: upgrade and restart of dbproxy1004 (passive)
  • 08:41 marostegui: Stop MySQL on es1015 for maintenance
  • 08:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1015 for kernel, mariadb and socket upgrade (duration: 00m 58s)
  • 08:40 elukey: reboot druid1006 for kernel updates
  • 08:29 elukey: reboot druid1005 for kernel updates
  • 07:53 moritzm: reimage mc2036 after mainboard replacement (T185587)
  • 07:15 marostegui: Stop MySQL on es2017 (es3 codfw master) for maintenance
  • 07:06 marostegui: Stop MySQL on es2016 (es2 codfw master) for maintenance
  • 06:52 marostegui: Stop MySQL on db2048 (s1 codfw master) for maintenance
  • 06:41 marostegui: Stop MySQL on db2051 (s4 codfw master) for maintenance
  • 06:28 marostegui: Stop MySQL on db2045 (s8 codfw master) for maintenance
  • 06:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1084 (duration: 00m 58s)
  • 01:46 XioNoX: librenms IRC bot moved to -operations channel. Doc on how to turn it off is on https://wikitech.wikimedia.org/wiki/LibreNMS#IRC_Alerting
  • 01:00 reedy@tin: Synchronized php-1.31.0-wmf.25/includes/specials/pagers/NewFilesPager.php: Fix T189846 (duration: 00m 58s)

2018-03-15

  • 23:25 reedy@tin: Synchronized php-1.31.0-wmf.25/extensions/AbuseFilter/: Fix display issues (duration: 00m 59s)
  • 23:20 ebernhardson@tin: Synchronized php-1.31.0-wmf.25/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: T187148: Turn off Cirrus AB test (duration: 00m 58s)
  • 22:58 reedy@tin: Synchronized php-1.31.0-wmf.25/extensions/AbuseFilter/: add some missing globals (duration: 00m 58s)
  • 20:38 demon@tin: Synchronized robots.txt: minor tidying (duration: 00m 58s)
  • 20:05 chasemp: disable puppet for cloud things for a safe rollout
  • 19:50 XenoRyet: updated civicrm from 9e79d63426 to 3291ad35c9
  • 19:14 demon@tin: rebuilt and synchronized wikiversions files: group2 to wmf.25
  • 18:51 niharika29@tin: Synchronized php-1.31.0-wmf.25/extensions/MobileApp/: https://gerrit.wikimedia.org/r/#/c/419785/; https://gerrit.wikimedia.org/r/#/c/419784/; https://gerrit.wikimedia.org/r/#/c/419776/ (duration: 01m 14s)
  • 18:25 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/417329/ (duration: 01m 15s)
  • 18:11 maxsem@tin: Synchronized wmf-config: https://gerrit.wikimedia.org/r/#/c/419492/ (duration: 01m 16s)
  • 18:09 maxsem@tin: Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/419492/ (duration: 01m 15s)
  • 17:27 ppchelko@tin: Finished deploy [changeprop/deploy@9f4f380]: Purge media endpoint and update sources (duration: 01m 23s)
  • 17:26 bsitzmann@tin: Finished deploy [mobileapps/deploy@97d9085]: Update mobileapps to c5e1522 (T184327) (duration: 05m 38s)
  • 17:25 ppchelko@tin: Started deploy [changeprop/deploy@9f4f380]: Purge media endpoint and update sources
  • 17:20 bsitzmann@tin: Started deploy [mobileapps/deploy@97d9085]: Update mobileapps to c5e1522 (T184327)
  • 17:18 moritzm: installing dbus updates from stretch 9.4 point release
  • 16:43 ppchelko@tin: Finished deploy [restbase/deploy@8dbc93c]: Release lint and media endpoints (duration: 15m 22s)
  • 16:28 ppchelko@tin: Started deploy [restbase/deploy@8dbc93c]: Release lint and media endpoints
  • 16:22 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2050 for data checks (duration: 01m 15s)
  • 15:58 volans: updated facts on both CI puppet-compilers
  • 15:56 moritzm: pruning obsolete packages from jessie-wikimedia/experimental
  • 15:56 marostegui: Stop MySQL on s5 codfw master (db2052) this will break replication on s5 codfw
  • 15:51 godog: repool puppetmaster1002
  • 15:47 moritzm: installing libvirt security updates
  • 15:20 elukey: reboot druid1003 for kernel updates
  • 15:13 marostegui: Stop MySQL on s6 codfw master (db2039) this will break replication on s6 codfw
  • 15:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool es1013 after socket path location update (duration: 01m 15s)
  • 15:05 _joe_: restarted jobrunner, jobchron on the eqiad jobrunners
  • 14:30 elukey: reboot druid1004 for kernel updates
  • 13:51 elukey: reboot kafka1001 (eventbus/job-queues eqiad) for kernel updates
  • 13:49 zeljkof: EU SWAT finished
  • 13:48 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change autoconfirmed settings and Enable flood group at zhwikiquote (T189289) (duration: 01m 14s)
  • 13:33 arturo: T189682 reimage labtestmetal2001 with jessie and a new partition layout, again. Last time didn't pick the right partman config
  • 13:14 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change autoconfirmed settings and Enable flood group at zhwikiquote (T189289) (duration: 01m 15s)
  • 13:09 moritzm: restarting HHVM on canaries to pick up curl security update
  • 13:09 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule, clean expired rules (T189442) (duration: 01m 15s)
  • 12:54 hoo: Updated the Wikidata property suggester with data from Monday's JSON dump and applied the T132839 workarounds
  • 12:36 moritzm: installing curl security updates on jessie/stretch
  • 12:26 arturo: T189682 reimage labtestmetal2001 with jessie and a new partition layout
  • 12:08 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling rdb1007 after kernel security update (duration: 01m 14s)
  • 12:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1013 after socket path location update (duration: 01m 14s)
  • 11:59 moritzm: rebooting rdb1007 for kernel security update
  • 11:56 jmm@tin: Synchronized wmf-config/ProductionServices.php: Depooling rdb1007 for kernel security update (duration: 01m 14s)
  • 11:52 marostegui: Stop MySQL on es1013 for socket path upgrade
  • 11:51 moritzm: rebooted rdb1005 for kernel security update
  • 11:49 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling rdb1005 after kernel security update (duration: 01m 14s)
  • 11:48 godog: reimage puppetmaster1002 with stretch
  • 11:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1013 for socket path location update (duration: 01m 14s)
  • 11:42 godog: depool puppetmaster1002 for stretch reimage
  • 11:29 jmm@tin: Synchronized wmf-config/ProductionServices.php: Depooling rdb1005 for kernel security update (duration: 01m 10s)
  • 11:16 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling rdb1003 after kernel security update (duration: 01m 14s)
  • 11:04 moritzm: rebooting rdb1003 for kernel security update
  • 11:01 jmm@tin: Synchronized wmf-config/ProductionServices.php: Depooling rdb1003 for kernel security update (duration: 01m 14s)
  • 10:48 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling rdb1001 after kernel security update (duration: 01m 14s)
  • 10:32 moritzm: rebooting rdb1001 for kernel security update
  • 10:24 jmm@tin: Synchronized wmf-config/ProductionServices.php: Depooling rdb1001 for kernel security update (duration: 01m 14s)
  • 10:22 ema: apt.w.o: upload varnish=5.1.3-1wm4 to jessie-wikimedia/main (upstream "extrachance" fixes) T174932
  • 10:12 gehel@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=elastic1021.eqiad.wmnet
  • 09:56 ema: apt.w.o: move varnish=5.1.3-1wm3, varnish-modules=0.12.1-1+wmf1, libvmod-netmapper=1.6-1 from jessie-wikimedia/experimental to jessie-wikimedia/main T188545
  • 09:56 moritzm: installing curl security updates on Debian
  • 09:30 godog: repool puppetmaster2002
  • 09:16 jynus: reset slave all @db1051
  • 08:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore normal weight for es1017 (duration: 01m 14s)
  • 08:44 godog: roll-restart thumbor in eqiad/codfw to enable access to swift private container
  • 08:42 jynus: end of maintenance for m2
  • 08:31 jynus: setting m2 as read only
  • 08:29 gilles: setZoneAccess done
  • 08:28 gilles: foreachwikiindblist "% private.dblist" extensions/WikimediaMaintenance/filebackend/setZoneAccess.php --backend=local-multiwrite --private
  • 08:18 jynus: disable puppet on db1051, db1020 for switchover preparation
  • 08:06 ayounsi@tin: Finished deploy [netbox/deploy@278aec4]: Upgrading Netbox to v2.3.1 (duration: 01m 02s)
  • 08:05 ayounsi@tin: Started deploy [netbox/deploy@278aec4]: Upgrading Netbox to v2.3.1
  • 08:01 jynus: switching db2044 to be a direct replica of db1051
  • 07:49 ayounsi@tin: Finished deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1 (duration: 01m 07s)
  • 07:48 ayounsi@tin: Started deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1
  • 07:30 ayounsi@tin: Finished deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1 (duration: 00m 05s)
  • 07:30 ayounsi@tin: Started deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1
  • 07:30 ayounsi@tin: Finished deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1 (duration: 00m 39s)
  • 07:29 ayounsi@tin: Started deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1
  • 07:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1017 (duration: 01m 14s)
  • 07:21 moritzm: reimaging mc2036 after hardware replacement T185587
  • 07:07 marostegui: Stop mariadb on es1017 for kernel, mariadb and socket location upgrade
  • 07:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1017 (duration: 01m 14s)
  • 07:01 marostegui: Deploy schema change on db1084 - T187089 T185128 T153182
  • 07:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1084 (duration: 01m 15s)
  • 06:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1064 (duration: 01m 15s)
  • 06:29 marostegui: Stop MySQL on db1064 for mariadb upgrade
  • 02:38 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.24) (duration: 10m 10s)
  • 00:25 tgr@tin: Synchronized php-1.31.0-wmf.24/extensions/Wikibase/client/includes/RecentChanges/ExternalChangeFactory.php: T189320 Use only local part of username when building the RC line (duration: 01m 18s)
  • 00:22 tgr@tin: Synchronized php-1.31.0-wmf.24/includes/user/ExternalUserNames.php: T189320 Add ExternalUserNames::getLocal() to get local part of username (duration: 01m 15s)
  • 00:20 ejegg: updated payments-wiki from 9068692c32 to 30f5f3edfb
  • 00:08 tgr@tin: Synchronized php-1.31.0-wmf.25/extensions/VisualEditor/: VE fixes followup (duration: 01m 15s)
  • 00:03 tgr@tin: Synchronized php-1.31.0-wmf.25/extensions/VisualEditor/modules/ve-mw: VE fixes: T189267, T189381 (duration: 01m 15s)
  • 00:02 tgr@tin: Synchronized php-1.31.0-wmf.24/extensions/VisualEditor/modules/ve-mw: VE fixes: T189267, T189381 (duration: 01m 16s)

2018-03-14

  • 23:45 XenoRyet: updated payments-wiki from 86715f6e9e to 9068692c32
  • 23:45 tgr@tin: Synchronized wmf-config/Wikibase.php: T184000 Enable Wikidata description override on beta cluster (duration: 01m 14s)
  • 23:43 tgr@tin: Synchronized wmf-config/InitialiseSettings-labs.php: T184000 Enable Wikidata description override on beta cluster (duration: 01m 15s)
  • 23:41 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: T184000 Enable Wikidata description override on beta cluster (duration: 01m 15s)
  • 23:21 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: T181159 Update ORES threshold config to the new syntax (duration: 01m 20s)
  • 23:18 tgr@tin: Synchronized wmf-config/InitialiseSettings-labs.php: T181159 Update ORES threshold config to the new syntax (duration: 01m 15s)
  • 22:13 reedy@tin: Synchronized php-1.31.0-wmf.25/extensions/Thanks: T189752 (duration: 01m 16s)
  • 21:27 hoo: Ran scap pull on mwdebug1001 after testing https://gerrit.wikimedia.org/r/417180
  • 21:26 andrewbogott: rebuilding labtestweb2001 with Debian Stretch
  • 20:34 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.25
  • 20:32 demon@tin: Synchronized php: symlink bump to wmf.25 (duration: 01m 14s)
  • 20:27 mholloway-shell@tin: Finished deploy [mobileapps/deploy@0f9625a]: Update mobileapps to 9f4a80c (duration: 05m 37s)
  • 20:24 demon@tin: Finished scap: trying a php5/hhvm theory (duration: 06m 37s)
  • 20:21 mholloway-shell@tin: Started deploy [mobileapps/deploy@0f9625a]: Update mobileapps to 9f4a80c
  • 20:17 demon@tin: Started scap: trying a php5/hhvm theory
  • 20:16 demon@tin: Finished scap: scapping, pt. 2. prior one failed because i tested something (duration: 69m 43s)
  • 19:06 demon@tin: Started scap: scapping, pt. 2. prior one failed because i tested something
  • 19:06 demon@tin: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap.operations_mediawiki-config.lock"; owner is "demon"; reason is "rebuilding l10n" (duration: 00m 00s)
  • 18:20 jynus: running pt-table-checksum on all m2, some lag will happen on passive replicas
  • 18:16 jynus: running pt-table-checksum on all m1, some lag will happen on passive replicas
  • 17:56 demon@tin: Started scap: rebuilding l10n
  • 17:55 reedy@tin: Synchronized php-1.31.0-wmf.25/extensions/CentralNotice: updates! (duration: 01m 16s)
  • 17:54 demon@tin: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap.operations_mediawiki-config.lock"; owner is "reedy"; reason is "updates!" (duration: 00m 00s)
  • 17:54 reedy@tin: Synchronized php-1.31.0-wmf.24/extensions/CentralNotice: updates! (duration: 01m 18s)
  • 17:26 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/416489 (duration: 01m 14s)
  • 17:18 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/419077 (duration: 01m 15s)
  • 16:58 hoo: Manually running extensions/Wikibase/repo/maintenance/dispatchChanges.php on terbium, so that dispatching can catch up
  • 16:56 jynus: deploying new firewall rules to dbproxy1001 and 7
  • 16:40 moritzm: installing cron updates from stretch 9.4 point release
  • 16:35 demon@tin: Synchronized .gitignore: ignore scap logs (duration: 01m 15s)
  • 16:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1074 original weight (duration: 01m 13s)
  • 16:12 godog: temporarily add back puppetmaster2002 as a low-weight backend
  • 15:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1074 (duration: 01m 15s)
  • 15:47 andrew@tin: Synchronized multiversion/MWMultiVersion.php: wikitech cleanup (duration: 01m 14s)
  • 15:25 XioNoX: Re-enabling BGP on cr2-codfw Zayo transit - T189452
  • 15:12 XioNoX: Disabling BGP on cr2-codfw Zayo transit - T189452
  • 15:02 jynus: disabling puppet in preparation for reimage of dbproxy1002 and 6
  • 14:59 moritzm: installing virt-what updates from stretch point release
  • 14:58 paravoid: rebooting furud
  • 14:44 ottomata: beginning migration of eventlogging analytics from Kafka analytics to Kafka jumbo: T183297
  • 14:33 godog: depool puppetmaster2002 for reimage
  • 14:06 Reedy: created wbc_entity_usage on ruwikimedia T188456
  • 13:50 zeljkof: EU SWAT finished
  • 13:49 zfilipin@tin: Synchronized dblists/wikidataclient.dblist: SWAT: Revert "Add ruwikimedia to wikidataclient" (T188456) (duration: 01m 14s)
  • 13:42 zfilipin@tin: Synchronized docroot/noc/conf/: SWAT: Revert "Publish throttle-analyze at noc" (T187894) (duration: 01m 15s)
  • 13:21 ppchelko@tin: Finished deploy [cpjobqueue/deploy@c879056]: Deduplicate based on the root job dt and sha1 combination. Forgot to pull (duration: 00m 33s)
  • 13:21 ppchelko@tin: Started deploy [cpjobqueue/deploy@c879056]: Deduplicate based on the root job dt and sha1 combination. Forgot to pull
  • 13:21 zfilipin@tin: Synchronized docroot/noc/conf/throttle-analyze.php.txt: SWAT: Publish throttle-analyze at noc (T187894) (duration: 01m 13s)
  • 13:20 ppchelko@tin: Finished deploy [cpjobqueue/deploy@5686f16]: Deduplicate based on the root job dt and sha1 combination (duration: 00m 38s)
  • 13:20 ppchelko@tin: Started deploy [cpjobqueue/deploy@5686f16]: Deduplicate based on the root job dt and sha1 combination
  • 13:12 zfilipin@tin: Synchronized dblists/commonsuploads.dblist: SWAT: Disable upload for non-admins on kowikiversity (T189021) (duration: 01m 14s)
  • 13:06 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Remove obsolete throttle rules, add one new (T189241) (duration: 01m 15s)
  • 12:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1074 (duration: 01m 14s)
  • 12:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1074 (duration: 01m 15s)
  • 12:22 kartik@tin: Finished deploy [cxserver/deploy@c204d9c]: Update cxserver to c355d0c (duration: 03m 12s)
  • 12:19 kartik@tin: Started deploy [cxserver/deploy@c204d9c]: Update cxserver to c355d0c
  • 12:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1074 (duration: 01m 14s)
  • 11:45 marostegui: Stop db1074 for kernel upgrade
  • 11:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1074 for data checks and kernel upgrade (duration: 01m 14s)
  • 11:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for es1018 after kernel and mariadb upgrade (duration: 01m 15s)
  • 11:02 moritzm: rebooting einsteinium / icinga.wikimedia.org for kernel security update
  • 10:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1018 after kernel and mariadb upgrade (duration: 01m 14s)
  • 10:37 marostegui: Stop mariadb on es1018 for kernel and mariadb upgrade + change socket location
  • 10:35 moritzm: rebooting hydrogen for kernel security update
  • 10:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1018 for kernel and mariadb upgrade (duration: 01m 14s)
  • 10:23 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool pc2006 after kernel and mariadb upgrade (duration: 01m 14s)
  • 10:22 jynus: dropping testotrs from m2
  • 10:16 jynus: archiving and dropping bugzilla_testing from m2
  • 10:10 marostegui: Stop mariadb on pc2006 for kernel and mariadb upgrade + change socket location
  • 10:09 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool pc2006 for kernel and mariadb upgrade (duration: 01m 14s)
  • 10:07 jynus: archiving and dropping testblog from m2
  • 10:03 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool pc2005 after kernel and mariadb upgrade (duration: 01m 15s)
  • 09:50 marostegui: Stop mariadb on pc2005 for kernel and mariadb upgrade + change socket location
  • 09:50 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool pc2005 for kernel and mariadb upgrade (duration: 01m 14s)
  • 09:44 moritzm: installing samba security update (just the client side libraries)
  • 09:40 marostegui: Stop mysql on es2015 to upgrade socket path
  • 09:37 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool pc2004 after kernel and mariadb upgrade (duration: 01m 14s)
  • 09:34 marostegui: Stop mysql on es2014 to upgrade socket path
  • 09:23 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool pc2004 for kernel and mariadb upgrade (duration: 01m 14s)
  • 09:23 marostegui: Stop mariadb on pc2004 for kernel upgrade
  • 09:13 marostegui: Stop mysql on es2013 to upgrade socket path
  • 09:08 marostegui: Stop mysql on es2012 to upgrade socket path
  • 08:57 ema: cp3041: restart varnish-be
  • 08:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool es1013 after kernel and mariadb upgrade (duration: 01m 15s)
  • 08:28 ema: cp3040: restart varnish-be
  • 08:21 hashar: Restarting the CI Jenkins
  • 07:45 marostegui: Reboot es2004 for kernel upgrade
  • 07:45 marostegui: Reboot es2003 for kernel upgrade
  • 07:34 marostegui: Reboot es2002 for kernel upgrade
  • 07:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1013 after kernel and mariadb upgrade (duration: 01m 14s)
  • 07:03 marostegui: Stop mariadb on es1013 for mariadb and kernel upgrade
  • 06:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1013 for kernel and mariadb upgrade (duration: 01m 14s)
  • 06:45 marostegui: Deploy schema change on db1064 with replication (this will generate lag on s4 on labs hosts) - T187089 T185128 T153182
  • 06:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1064 for alter table (duration: 01m 14s)
  • 06:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1097:3314 after alter table (duration: 01m 15s)
  • 03:13 mutante: bacula is working again - restored missing file set (https://gerrit.wikimedia.org/r/419341 )
  • 02:49 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.24) (duration: 05m 40s)
  • 02:44 Jamesofur: deleted 46 archived files
  • 02:18 mutante: helium - running bacula-dir with -f in foreground revealed: ERROR TERMINATION at parse_conf.c:485 - Config error: Could not find config Resource mysql-srv-backups - line 7, col 33 of file /etc/bacula/jobs.d/bohrium.eqiad.wmnet-mysql-predump-piwik-Weekly-Wed-production.conf
  • 02:17 mutante: helium - bacula director process failed (Bacula interrupted by signal 11: Segmentation violation), icinga alerted. attempted to restart it. then: bacula-dir - the configtest failed!
  • 00:01 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: crwiki logo (duration: 01m 15s)
  • 00:00 reedy@tin: Synchronized static/images/project-logos/crwiki.png: (no justification provided) (duration: 01m 14s)

2018-03-13

  • 23:46 reedy@tin: Synchronized php-1.31.0-wmf.24/extensions/MobileFrontend/: T188825 (duration: 01m 18s)
  • 23:43 mutante: tin: chmod -R g+w /srv/mediawiki-staging/.git/objects/* ; chmod -R g+w /srv/mediawiki-staging/php-1.31.0-wmf.24/.git/objects/*
  • 23:35 Reedy: that was Enable VirtualPageViews on Hungarian Wikipedia T184793
  • 23:35 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 01m 15s)
  • 23:26 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: moar logos (duration: 01m 15s)
  • 23:24 reedy@tin: Synchronized static/images/project-logos/: YOU GET A LOGO, YOU GET A LOGO. YOU ALL GET LOGOS (duration: 01m 16s)
  • 23:11 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Enable RemexHTML on 96 wikis T188010 (duration: 01m 16s)
  • 23:10 mutante: restbase-dev1006 - reinstalling, manually skipping " Volume group name already in use" (T185494)
  • 22:52 eileen: civicrm revision changed from c8458c4a2f to 9e79d63426, config revision is 08b7e6216e (Benevity comma fix)
  • 20:40 demon@tin: rebuilt and synchronized wikiversions files: group0 to wmf.25
  • 20:09 demon@tin: Finished scap: bootstrap wmf.25 (duration: 67m 17s)
  • 19:02 demon@tin: Started scap: bootstrap wmf.25
  • 18:47 demon@tin: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap.operations_mediawiki-config.lock"; owner is "awight"; reason is "Beta: Fix ORES thresholds and enable JADE, T181159, T176333" (duration: 00m 00s)
  • 18:46 demon@tin: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap.operations_mediawiki-config.lock"; owner is "awight"; reason is "Beta: Fix ORES thresholds and enable JADE, T181159, T176333" (duration: 00m 00s)
  • 18:42 gehel: repool wdqs1004 & wdqs2001 now that data reload is completed T189548
  • 18:39 XenoRyet: updated civicrm from 8652db05f5 to c8458c4a2f
  • 18:37 moritzm: installing reportbug updates from stretch point release
  • 18:32 moritzm: installing w3m updates from stretch point release
  • 17:55 moritzm: installing ncurses updates from stretch point release
  • 17:53 moritzm: installing ncurses updates from stretch point release
  • 17:19 awight@tin: Started scap: Beta: Fix ORES thresholds and enable JADE, T181159, T176333
  • 17:06 godog: cleanup integration-slave-jessie-1001:/srv/pbuilder/build - T189587
  • 16:45 marostegui: Clean iptables rules on dbproxy1001 to leave it as dbproxy1006
  • 16:33 marostegui: Retroactive: cleared iptables rules on dbproxy1007
  • 16:32 jynus: restarting gerrit on cobalt, stalled
  • 16:26 jynus: restarting gerrit on cobalt, stalled
  • 16:18 jynus: update CNAME for m1-master and m2-master
  • 15:50 marostegui: Deploy schema change on db1097:3314 - T187089 T185128 T153182
  • 15:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097:3314 for alter table (duration: 00m 56s)
  • 15:39 jynus: upgrade and restart dbproxy1007
  • 15:33 vgutierrez: upgrading eqiad LVSs to pybal 1.15.2
  • 15:32 jynus: upgrade and restart dbproxy1001
  • 14:55 vgutierrez: upgrading codfw LVSs to pybal 1.15.2
  • 14:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1081 after alter table (duration: 00m 57s)
  • 14:51 jynus: stopping db2044 (this will make proxies complain about redundancy)
  • 14:42 moritzm: rebooting chromium for kernel security update
  • 14:11 chasemp: add chico to wmf-nda (verified nda things with moritz and all the goodness)
  • 13:29 jynus: stop db1001 for maintenance (proxies will temporarily complain about lack of redundancy)
  • 13:20 zeljkof: EU SWAT finished
  • 13:20 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: wmf-config: enable Singapore oversample as default on all wikis (T188652) (duration: 00m 57s)
  • 12:32 akosiaris: reboot ganeti VMs on row_A in eqiad for cache=none setting. T181121
  • 12:26 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: name=argon.eqiad.wmnet,service=kubemaster
  • 12:04 reedy@tin: Synchronized wmf-config/interwiki.php: T188537 (duration: 00m 57s)
  • 11:59 moritzm: rebooting DNS recursors in codfw for kernel security update
  • 11:43 _joe_: include our own etcd package (3.2.16) on stretch
  • 11:37 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: name=argon.eqiad.wmnet,service=kubemaster
  • 11:33 kartik@tin: Finished deploy [cxserver/deploy@30ff3b1]: Update cxserver to bd2ccfc (duration: 03m 30s)
  • 11:30 kartik@tin: Started deploy [cxserver/deploy@30ff3b1]: Update cxserver to bd2ccfc
  • 11:23 jynus: ran update-netboot-stretch.sh
  • 11:21 moritzm: rebooting DNS recursors in esams for kernel security update
  • 10:22 moritzm: rebooting DNS recursors in ulsfo and eqsin for kernel security update
  • 10:17 vgutierrez: upgrading esams LVSs to pybal 1.15.2
  • 10:08 jynus: stopping mysql on db1063 and db1051 to validate the depool before full reimage
  • 10:07 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling poolcounter1001 after kernel security update (duration: 00m 57s)
  • 10:00 gehel: shutting down blazegraph on wdqs2001 for data transfer to wdqs1004 - T189548
  • 09:48 vgutierrez: upgrading ulsfo LVSs to pybal 1.15.2
  • 09:37 moritzm: rebooting poolcounter1001 for kernel security update
  • 09:15 jmm@tin: Synchronized wmf-config/ProductionServices.php: Depooling poolcounter1001 for kernel security update (duration: 00m 56s)
  • 09:05 jynus@tin: Synchronized wmf-config/db-eqiad.php: Remove db1051 and db1063 (duration: 00m 56s)
  • 09:02 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove db1051 and db1063 (duration: 00m 56s)
  • 08:46 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1063 (duration: 00m 57s)
  • 06:58 marostegui: Deploy schema change on dbstore1002 - T187089 T185128 T153182
  • 06:56 marostegui: Deploy schema change on db1081 - T187089 T185128 T153182
  • 06:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1081 for alter table (duration: 00m 56s)
  • 06:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1103:3314 after alter table (duration: 01m 19s)
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.24) (duration: 05m 30s)

2018-03-12

  • 22:52 eileen: update civicrm revision changed from a819d64d98 to 8652db05f5, config revision is 08b7e6216e - update civicrm.settings.php
  • 20:44 arlolra: Updated Parsoid to 16ced34 (T188670, T90902)
  • 20:37 arlolra@tin: Finished deploy [parsoid/deploy@174c87d]: Updating Parsoid to 16ced34 (duration: 10m 16s)
  • 20:36 andrewbogott: updated wikitech-static as detailed in https://wikitech.wikimedia.org/wiki/Wikitech-static#Manual_updates
  • 20:27 arlolra@tin: Started deploy [parsoid/deploy@174c87d]: Updating Parsoid to 16ced34
  • 20:26 andrewbogott: apt-get upgrade and reboot on wikitech-static
  • 20:25 andrewbogott: stopping apache2 on Silver in anticipation of it being decommissioned
  • 20:16 mholloway-shell@tin: Finished deploy [mobileapps/deploy@c764714]: Update mobileapps to 5c90db7 (duration: 05m 29s)
  • 20:11 mholloway-shell@tin: Started deploy [mobileapps/deploy@c764714]: Update mobileapps to 5c90db7
  • 19:53 MaxSem: disabled 2FA for User:Ctac (T189520)
  • 19:48 chasemp: labstore1003:~# service nfs-kernel-server restar
  • 19:44 chasemp: labstore1003:~# exportfs -ra
  • 18:53 Krinkle: Clean up left-over .wsp.bak files under frontend.navtiming* on graphite1001 (following T179622)
  • 18:44 mutante: added to DNS: romd.wikimedia.org (and romd.m) for Wikimedians of Romania and Moldova User Group
  • 18:43 mutante: added to DNS: hi.wikimedia.org (and hi.m) for Hindi Wikimedian User Group
  • 18:25 ppchelko@tin: Finished deploy [restbase/deploy@754aa8c]: Enable ensure_content_type filter for summaries (duration: 15m 25s)
  • 18:09 ppchelko@tin: Started deploy [restbase/deploy@754aa8c]: Enable ensure_content_type filter for summaries
  • 17:48 ottomata: removed kafka.protocol.version setting for varnishkafka webrequest instances; version should now be properly negotiated
  • 17:29 gehel@tin: Finished deploy [wdqs/wdqs@ce72538]: new wdqs updater (duration: 04m 47s)
  • 17:27 _joe_: poweroff mw2097-2134, T189111
  • 17:24 gehel@tin: Started deploy [wdqs/wdqs@ce72538]: new wdqs updater
  • 16:34 joal@tin: Finished deploy [analytics/refinery@1ef2e27]: Deploy patch over regular deploy bug (duration: 08m 50s)
  • 16:25 joal@tin: Started deploy [analytics/refinery@1ef2e27]: Deploy patch over regular deploy bug
  • 15:56 mepps: updated payments-wiki from ce68e8e80b to 86715f6e9e
  • 15:51 gehel: restart blazegraph on wdqs2001 to validate new config - T175919
  • 15:43 vgutierrez: eqsin LVSs: upgrade pybal to 1.15.2
  • 15:39 ottomata: bouncing kafka main-eqiad -> jumbo-eqiad mirror maker instances
  • 15:37 ottomata: disabling puppet on kafka1020,1022,1023 to test partition.assignment.strategy change for mirror maker
  • 15:28 gilles@tin: Synchronized private/PrivateSettings.php.example: Thumbor private wiki support deployment: Set up separate Thumbor Swift user for private containers (T187822) (duration: 00m 54s)
  • 15:26 demon@tin: Pruned MediaWiki: 1.31.0-wmf.23 [keeping static files] (duration: 01m 19s)
  • 15:24 vgutierrez: lvs1007,lvs1010 upgraded pybal to 1.15.2
  • 15:17 demon@tin: Pruned MediaWiki: 1.31.0-wmf.22 [keeping static files] (duration: 01m 22s)
  • 15:12 demon@tin: Pruned MediaWiki: 1.31.0-wmf.21 (duration: 02m 35s)
  • 15:12 ppchelko@tin: Finished deploy [cpjobqueue/deploy@5686f16]: Decrease refreshLinks concurrency to 120 (duration: 00m 31s)
  • 15:11 ppchelko@tin: Started deploy [cpjobqueue/deploy@5686f16]: Decrease refreshLinks concurrency to 120
  • 15:08 joal: Provide correct log message for analytics/refinery scap deploy: Regular deploy of analytics-hadoop code
  • 15:07 joal@tin: Finished deploy [analytics/refinery@fd0a90f]: Regular a (duration: 04m 54s)
  • 15:07 demon@tin: Pruned MediaWiki: 1.31.0-wmf.20 (duration: 03m 58s)
  • 15:02 joal@tin: Started deploy [analytics/refinery@fd0a90f]: Regular a
  • 14:42 jynus: upgrade and restart es2001
  • 14:09 sbisson@tin: Finished deploy [tilerator/deploy@4bcae95]: Deploying tilerator#update-deps for testing on maps-test* (duration: 00m 34s)
  • 14:09 sbisson@tin: Started deploy [tilerator/deploy@4bcae95]: Deploying tilerator#update-deps for testing on maps-test*
  • 14:02 zeljkof: EU SWAT finished
  • 13:59 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Enable caching of constraint check results (T184812) (duration: 00m 57s)
  • 13:31 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Enable caching of constraint check results (T184812) (duration: 03m 08s)
  • 13:24 moritzm: synchronised PHP 7.2.3 to thirdparty/php72 for stretch-wikimedia
  • 13:17 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Enable caching of constraint check results (T184812) (duration: 03m 09s)
  • 12:44 godog: start a catalog compilation on elnath to check for puppetdb4 diffs - T177253
  • 11:26 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling poolcounter1002 after kernel security update (duration: 03m 09s)
  • 11:14 moritzm: reboot poolcounter1002 for kernel security update
  • 11:10 jmm@tin: Synchronized wmf-config/ProductionServices.php: depooling poolcounter1002 for kernel security update (duration: 03m 09s)
  • 10:39 _joe_: running decommission_appserver on mw2097-2134 T189111
  • 10:23 XioNoX: labs->cloud vlan rename in eqiad - T187933
  • 09:56 elukey: restart kafka mirror maker (main eqiad -> jumbo) on kafka1020 (all consumers not assigned to any partition on kafka102*)
  • 09:53 moritzm: installing util-linux security updates
  • 09:31 _joe_: decommission mw2097-mw2134 from conftool T189111
  • 08:40 moritzm: rebooting iron for kernel security update
  • 08:32 ema: cp3033/cp3031: restart varnish-be
  • 08:24 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool es2015 after kernel upgrade (duration: 00m 58s)
  • 08:20 ema: cp3033/cp3031: set transaction_timeout to 60s
  • 08:14 marostegui: Stop MySQL on es2015 for kernel upgrade
  • 08:06 ema: cp3042: restart varnish-be
  • 08:03 ema: cp3042: set transaction_timeout to 30s
  • 07:45 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool es2015 for kernel upgrade (duration: 00m 58s)
  • 07:38 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool es2014 after kernel upgrade (duration: 01m 01s)
  • 07:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool es2014 for kernel upgrade (duration: 00m 59s)
  • 07:26 marostegui: Stop MySQL on es2014 for kernel upgrade
  • 07:24 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool es2014 for kernel upgrade (duration: 00m 58s)
  • 07:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1113:3316 as vslow,dump in s6 - T184161 (duration: 00m 58s)
  • 06:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1113:3315 as vslow,dump in s5 - T184161 (duration: 00m 58s)
  • 06:27 marostegui: Deploy schema change on db1103:3314 - T187089 T185128 T153182
  • 06:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1103:3314 for alter table (duration: 01m 06s)
  • 02:52 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.24) (duration: 11m 56s)

2018-03-11

  • 08:50 elukey: executed sudo rm /etc/logrotate.d/kafkatee-webrequest-analytics on oxygen/rhenium to stop daily cronspam

2018-03-10

  • 14:56 ema: cp1053: restart varnish-be
  • 13:29 ema: cp1068/cp1055: restart varnish-be

2018-03-09

  • 23:29 tgr@tin: Synchronized php-1.31.0-wmf.24/extensions/ReadingLists/src/Api/ApiQueryReadingListEntries.php: T189272 fix stupid ReadingLists typo breaking production (duration: 00m 54s)
  • 19:43 foks: changed global email for User:Mathmensch
  • 19:19 MaxSem: restarted my script on tin, now with more aggressive writes
  • 18:26 reedy@tin: Synchronized php-1.31.0-wmf.24/extensions/AbuseFilter/includes/AbuseFilter.class.php: Unbreak AbuseFilter tagging T189299 (duration: 00m 59s)
  • 17:35 andrew@tin: Finished deploy [horizon/deploy@9c234d6]: Another try at fixing T188458 (duration: 03m 00s)
  • 17:32 andrew@tin: Started deploy [horizon/deploy@9c234d6]: Another try at fixing T188458
  • 16:14 andrewbogott: test log
  • 16:07 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp3034.esams.wmnet
  • 15:59 ppchelko@tin: Finished deploy [cpjobqueue/deploy@5795526]: Enable root_claim_ttl for refreshLinks T189303 (duration: 00m 38s)
  • 15:59 andrewbogott: moving wikitech dns record to point to misc-web and the new labweb cluster, https://gerrit.wikimedia.org/r/#/c/417926/
  • 15:59 ppchelko@tin: Started deploy [cpjobqueue/deploy@5795526]: Enable root_claim_ttl for refreshLinks T189303
  • 15:54 andrew@tin: Finished deploy [horizon/deploy@f59f568]: rolling out a fix for T188458 (duration: 03m 11s)
  • 15:51 andrew@tin: Started deploy [horizon/deploy@f59f568]: rolling out a fix for T188458
  • 15:30 moritzm: installing zsh security update on trusty
  • 15:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1063 after cloning db1113:3316 - T184161 (duration: 00m 58s)
  • 15:15 moritzm: installing sensible-utils security update on trusty (Debian already fixed)
  • 15:11 ema: cp-upload_esams: reboot for retpoline kernel updates T188092
  • 13:12 marostegui: Compress s6 on db1113:3316 - T184161
  • 12:41 elukey: manually executed systemctl reset-failed to some old (not present anymore) units on kafka analytics hosts
  • 12:26 marostegui: Compress s5 on db1113:3315 - T184161
  • 12:16 marostegui: Stop mysql on db1063 to clone db1113:3316 - T184161
  • 12:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1063 to clone db1113:3316 - T184161 (duration: 00m 58s)
  • 12:11 jynus: dropping test databases on dbstore2* instances
  • 12:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 after cloning db1113:3315 - T184161 (duration: 00m 58s)
  • 12:06 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db1051 after cloning db1113:3315 - T184161 (duration: 00m 58s)
  • 11:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add initial config for db1113 multiinstance - T184161 (duration: 00m 58s)
  • 11:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add initial config for db1113 multiinstance - T184161 (duration: 00m 58s)
  • 11:15 marostegui: Stop MySQL on db1051 to clone db1113 - https://phabricator.wikimedia.org/T184161
  • 11:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 to clone db1113 - T184161 (duration: 00m 58s)
  • 09:51 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 with normal load (duration: 00m 58s)
  • 09:22 ema: cp-misc_esams: reboot for retpoline kernel updates T188092
  • 08:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2058 and db2084 (duration: 00m 58s)
  • 08:27 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 with low load (duration: 00m 58s)
  • 07:35 marostegui: Stop mariadb on db2058 and db2084 for mariadb+kernel upgrade
  • 07:34 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2058 and db2084 (duration: 00m 58s)
  • 07:33 marostegui: Logging for the record: es2013 was stopped and rebooted for mariadb and kernel upgrade
  • 07:22 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool es2013 (duration: 00m 58s)
  • 07:07 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool es2012, depool es2013 (duration: 00m 58s)
  • 06:52 marostegui: Stop MariaDB on es2012 to upgrade mariadb and kernel
  • 06:50 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool es2012 for kernel and mariadb upgrade (duration: 00m 58s)
  • 06:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore es1019 normal weight (duration: 00m 59s)
  • 05:00 andrew@tin: Finished deploy [horizon/deploy@930009e]: rebuilding venvs to avoid rogue configs, as was causing T189278 (duration: 02m 59s)
  • 04:57 andrew@tin: Started deploy [horizon/deploy@930009e]: rebuilding venvs to avoid rogue configs, as was causing T189278
  • 00:40 thcipriani@tin: Synchronized static/images/project-logos: SWAT: Update logos for Banyumasan and Urdu Wikipedias T189155 PART IV (duration: 00m 58s)
  • 00:38 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update logos for Banyumasan and Urdu Wikipedias T189155 PART III (duration: 00m 58s)
  • 00:36 thcipriani@tin: Synchronized static/images/project-logos/urwiki-2x.png: SWAT: Update logos for Banyumasan and Urdu Wikipedias T189155 PART II (duration: 00m 58s)
  • 00:33 thcipriani@tin: Synchronized static/images/project-logos/urwiki-1.5x.png: SWAT: Update logos for Banyumasan and Urdu Wikipedias T189155 PART I (duration: 00m 59s)
  • 00:03 urandom: set compression chunk length to 32, parsoid tables (group "enwiki") - T189057

2018-03-08

  • 23:10 urandom: set compression chunk length to 32, parsoid tables (group "wikipedia") - T189057
  • 22:31 urandom: set compression chunk length to 32, parsoid tables (group "commons") - T189057
  • 22:16 reedy@tin: Synchronized php-1.31.0-wmf.24/includes/specials/pagers/BlockListPager.php: T189251 (duration: 00m 59s)
  • 22:07 MaxSem: guess what? trying T187516 again
  • 21:41 urandom: set compression chunk length to 32, parsoid tables (group "others") - T189057
  • 21:15 otto@tin: Synchronized wmf-config/ProductionServices.php: Revert: point monolog avro producer back at Kafka analytics. Too many TCP connections? T188136 (duration: 00m 58s)
  • 21:09 sbisson@tin: Finished deploy [kartotherian/deploy@6dcacbc]: Deploying kartotherian with updated dependencies and zoom level 19 to maps-test* (take 3) (duration: 04m 42s)
  • 21:04 sbisson@tin: Started deploy [kartotherian/deploy@6dcacbc]: Deploying kartotherian with updated dependencies and zoom level 19 to maps-test* (take 3)
  • 20:40 urandom: set compression chunk length to 32, mobile tables - T189057
  • 20:34 urandom: set compression chunk length to 32, page_summary tables - T189057
  • 20:30 thcipriani@tin: rebuilt and synchronized wikiversions files: All wikis to php-1.31.0-wmf.24
  • 20:26 thcipriani@tin: Synchronized php: Ensure symlink for 1.31.0-wmf.24 is up-to-date (duration: 01m 15s)
  • 19:52 niharika29@tin: Synchronized php-1.31.0-wmf.24/extensions/Echo/: https://gerrit.wikimedia.org/r/#/c/417330/ and https://gerrit.wikimedia.org/r/#/c/417340/ (duration: 01m 21s)
  • 19:33 anomie: Running `cleanupUsersWithNoId.php --table recentchanges --prefix wikidata --force` on wikidata client wikis for T181731. This shouldn't create any local SUL accounts.
  • 19:29 niharika29@tin: Synchronized php-1.31.0-wmf.24/extensions/VisualEditor/: Hooks: Don't register beta features if they're enabled for all https://gerrit.wikimedia.org/r/#/c/417277/ (duration: 01m 14s)
  • 19:24 sbisson@tin: Finished deploy [kartotherian/deploy@a839a16]: Deploying kartotherian with updated dependencies and zoom level 19 to maps-test* (duration: 02m 40s)
  • 19:23 niharika29@tin: Synchronized wmf-config/CommonSettings.php: NavigationTiming: Enable oversampling for Singapore T188652 (duration: 01m 15s)
  • 19:22 sbisson@tin: Started deploy [kartotherian/deploy@a839a16]: Deploying kartotherian with updated dependencies and zoom level 19 to maps-test*
  • 19:21 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: NavigationTiming: Enable oversampling for Singapore T188652 (duration: 01m 16s)
  • 18:43 bsitzmann@tin: Finished deploy [mobileapps/deploy@d6819a0]: Update mobileapps to afb0167 (duration: 06m 14s)
  • 18:37 bsitzmann@tin: Started deploy [mobileapps/deploy@d6819a0]: Update mobileapps to afb0167
  • 17:19 andrew@tin: Synchronized wmf-config/wikitech.php: wikitech varnish updates (duration: 01m 15s)
  • 17:05 jynus: stop and reboot db1114 for kernel regression
  • 16:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool es1019 with less weight after HW maintenance (duration: 01m 15s)
  • 16:32 bd808: Running wikireplica_dns from labcontrol1001
  • 16:14 cmjohnson: wdqs1004 down for systemboard replacement
  • 15:56 andrewbogott: restarting nova-fullstack on labnet1001
  • 15:54 andrewbogott: restarting nodepool again
  • 15:42 andrewbogott: stopping nodepool again because something isn't quite right
  • 15:41 marostegui: Power off es1019 - T187530
  • 15:32 otto@tin: Synchronized wmf-config/ProductionServices.php: Point Mediawiki Monolog at new Kafka jumbo-eqiad cluster: T188136 (duration: 01m 16s)
  • 15:29 ottomata: merging and then deploying mediawiki-config to point monolog avro kafka producer at new kafka jumbo cluster: https://phabricator.wikimedia.org/T188136
  • 15:29 andrewbogott: disabling puppet on labnodepool1001
  • 15:17 andrewbogott: silencing nova and other openstack alerts in anticipation of service interruptions for https://phabricator.wikimedia.org/T189005
  • 15:01 marostegui: Disable puppet on db1073 - T189005
  • 15:00 marostegui: Change topology in m5, db2037 to become a slave of db1073 - T189005
  • 14:56 oblivian@tin: Synchronized wmf-config/CommonSettings.php: Use EtcdConfig everywhere (duration: 01m 15s)
  • 14:38 zeljkof: EU SWAT finished
  • 14:38 marostegui: Stop mysql on es1019 - T187530
  • 14:37 zfilipin@tin: Synchronized php-1.31.0-wmf.24/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.Target.js: SWAT: Blacklist Web of Trust junk from being added to pages (T189148) (duration: 01m 15s)
  • 14:35 zfilipin@tin: Synchronized php-1.31.0-wmf.24/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.ArticleTarget.js: SWAT: Follow-up I5357a909: Fix logic for autosave from edited state (T189071) (duration: 01m 16s)
  • 14:28 mobrovac@tin: Finished deploy [cpjobqueue/deploy@4fa1cf0]: Lower the refreshLinks concurrency to 175 - T185052 (duration: 00m 33s)
  • 14:27 mobrovac@tin: Started deploy [cpjobqueue/deploy@4fa1cf0]: Lower the refreshLinks concurrency to 175 - T185052
  • 14:26 vgutierrez: uploaded pybal_1.15.2_all.deb to apt.wikimedia.org jessie-wikimedia
  • 14:26 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: 2017 wikitext editor: Enable by default on officewiki (T188028) (duration: 01m 16s)
  • 14:15 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create the rollbacker group at ar.wikinews (T189206) (duration: 01m 16s)
  • 13:56 gehel: restart wdqs-updater on wdqs1005 to validate new config option - T188716
  • 13:52 sbisson@tin: Finished deploy [kartotherian/deploy@42b3280]: Deploying kartotherian with updated dependencies and zoom level 19 to test servers (duration: 08m 31s)
  • 13:44 moritzm: depooling mwdebug2001, the host will temporarily be using an HHVM build linked against libicu57 to perform some tests
  • 13:43 sbisson@tin: Started deploy [kartotherian/deploy@42b3280]: Deploying kartotherian with updated dependencies and zoom level 19 to test servers
  • 13:40 elukey: eventlogging analytics migrated from eventlog1001 to eventlog1002
  • 13:35 ariel@tin: Finished deploy [dumps/dumps@f26c114]: fix prefetch stubs; retrieval globals more robustly (duration: 00m 03s)
  • 13:35 ariel@tin: Started deploy [dumps/dumps@f26c114]: fix prefetch stubs; retrieval globals more robustly
  • 13:29 ema: cp-ulsfo: reboot for retpoline kernel updates T188092
  • 12:50 oblivian@puppetmaster1001: conftool action : edit; selector: scope=codfw,name=ReadOnly
  • 12:47 oblivian@puppetmaster1001: conftool action : edit; selector: scope=codfw,name=ReadOnly
  • 12:06 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 fully (duration: 01m 16s)
  • 11:32 moritzm: installing isc-dhcp security updates
  • 10:43 moritzm: installing libvpx security updates
  • 10:22 jynus@tin: Synchronized wmf-config/db-eqiad.php: Change db1114 load (duration: 01m 16s)
  • 10:14 akosiaris: conduct IO stresstests on ganeti1005 (sca1004 VM) with cache=none KVM flag on T181121
  • 10:13 akosiaris: conduct IO stresstests on ganeti1005 (sca1004 VM) with cache=none KVM flag on
  • 09:57 dcausse: restarting mjolnir-kafka-daemon.service on relforge1002 to switch to kafka jumbo
  • 09:56 dcausse: restarting mjolnir-kafka-daemon.service on relforge1001 to switch to kafka jumbo
  • 09:56 _joe_: decommissioning mw2017-2099 T187467
  • 09:46 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 partially (duration: 01m 16s)
  • 09:44 moritzm: rearming keyholder on neodymium after reboot
  • 09:40 moritzm: rebooting neodymium for kernel security update
  • 09:22 ema: cp-eqsin: reboot for retpoline kernel updates T188092
  • 09:12 ema: cp3043: varnish-be-restart T189085
  • 09:08 moritzm: rebooting bast1001 for kernel security update
  • 08:58 elukey: restart varnish backend on cp3041 (failed fetches)
  • 08:58 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2046, db2053 and db2060 after kernel upgrade (duration: 01m 15s)
  • 08:58 moritzm: reset RAC on bast1001, serial console was stuck
  • 08:50 elukey: rebooting analytics1003 (Hadoop Hive, Oozie, etc..) for kernel updates
  • 08:32 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2046, db2053 and db2060 for kernel upgrade (duration: 01m 17s)
  • 08:31 elukey: reboot analytics1002 (Hadoop master standby) for kernel upgrades
  • 08:28 marostegui: Stop MySQL on db2046, db2053 and db2060 for kernel upgrade
  • 08:19 elukey: reboot analytics1001 (Hadoop master) for kernel upgrade (temp failover to analytics1002)
  • 08:09 ema: cp3040: varnish-be-restart T189085
  • 08:00 ema: cp3032: varnish-be-restart T189085
  • 07:44 elukey: reboot kafka2003 (eventbus codfw) for kernel updates
  • 07:24 elukey: reboot kafka2002 (eventbus codfw) for kernel updates
  • 07:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1019 for maintenance - T187530 (duration: 01m 16s)
  • 07:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Revert: Depool db1064, it is not performing well with 2 failed disks - T188685 (duration: 01m 31s)
  • 04:27 Krinkle: Running whisper-mass-resize for ResourceLoader.* metrics on graphite1001 and graphite2001 (T179622)
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.23) (duration: 07m 37s)
  • 02:15 tgr@tin: Synchronized wmf-config/throttle.php: T189161 Temporarily remove account creation limit for event on Portuguese Wikipedia on March 08, 2018 (duration: 01m 10s)
  • 01:17 twentyafterfour: phabricator update completed
  • 01:13 twentyafterfour: preparing for phabricator update 2018-03-07/1
  • 00:37 thcipriani@tin: Synchronized wmf-config/db-eqiad.php: SWAT: wikitech: use FQDNs for m5 cluster members (duration: 01m 16s)
  • 00:28 thcipriani@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Add configuration for CirrusSearch to instantly index new Wikidata items T183053 (duration: 01m 15s)
  • 00:16 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable loginOnly mode for local auth provider on group 2 T57420 (duration: 01m 16s)

2018-03-07

  • 23:36 MaxSem: aborted due to growing DB lag
  • 23:08 MaxSem: running script for T187516
  • 23:00 maxsem@tin: Synchronized php-1.31.0-wmf.24/extensions/AntiSpoof/: https://gerrit.wikimedia.org/r/#/c/417013/ (duration: 01m 16s)
  • 22:52 maxsem@tin: Synchronized php-1.31.0-wmf.24/extensions/CentralAuth/: https://gerrit.wikimedia.org/r/#/c/417014/ (duration: 01m 20s)
  • 22:44 MaxSem: dumping centralauth.spoofuser from db1079
  • 22:27 ejegg: deployed patch for T171987 to 1.31.0-wmf.23
  • 22:23 ejegg: deployed patch for T171987 to 1.31.0-wmf.24
  • 21:51 herron: puppetdb server reboots complete — re-enabling puppet agents
  • 21:45 herron: temporarily disabling puppet agents while puppetdb servers nitrogen and nihal are rebooted for kernel updates
  • 21:24 thcipriani@tin: Synchronized wmf-config: Improve load-order documentation for CommonSettings and InitialiseSettings noop doc change (duration: 01m 18s)
  • 21:05 andrew@tin: Synchronized wmf-config/InitialiseSettings.php: Switch wikitech to swift (duration: 01m 15s)
  • 20:58 andrew@tin: Synchronized wmf-config/filebackend.php: Preparing wikitech to use swift for images, step two (duration: 01m 12s)
  • 20:56 andrew@tin: Synchronized wmf-config/CommonSettings.php: Preparing wikitech to use swift for images, step one (duration: 01m 16s)
  • 20:45 andrew@tin: Synchronized multiversion/MWMultiVersion.php: (no justification provided) (duration: 01m 16s)
  • 20:27 thcipriani@tin: rebuilt and synchronized wikiversions files: Group1 to php-1.31.0-wmf.24
  • 19:43 Amir1: ladsgroup@terbium:~$ foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https (T183019)
  • 19:35 Amir1: ladsgroup@terbium:~$ mwscript extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https on fawiki and hewiki (T183019)
  • 19:18 Amir1: ladsgroup@terbium:~$ mwscript extensions/Wikibase/lib/maintenance/populateSitesTable.php --wiki=mediawikiwiki --force-protocol https (T183019)
  • 18:56 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: retry (duration: 01m 15s)
  • 18:42 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: T189116 Update logos for Limburgish and Picardic Wikipedias (duration: 01m 16s)
  • 18:40 tgr@tin: Synchronized static/images/project-logos: T189116 Update logos for Limburgish and Picardic Wikipedias (duration: 01m 17s)
  • 18:30 tgr@tin: Synchronized debug.json: T187468 Switch to mwdebug hosts in codfw too (duration: 01m 15s)
  • 18:26 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: T57420 Enable loginOnly mode for local auth provider on group 1 (duration: 01m 20s)
  • 17:41 moritzm: rebooting restbase-test* for kernel security update
  • 16:55 ema: cp5001: reboot for retpoline kernel updates T188092
  • 16:46 ppchelko@tin: Finished deploy [cpjobqueue/deploy@ff41710]: Increase refreshLinks concurrency to 250 T185052 (duration: 00m 33s)
  • 16:46 ppchelko@tin: Started deploy [cpjobqueue/deploy@ff41710]: Increase refreshLinks concurrency to 250 T185052
  • 16:08 elukey: updating pcc facts for new hosts
  • 15:54 moritzm: rebooting rdb* fallback hosts in eqiad for kernel security update
  • 15:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1064, it is not performing well with 2 failed disks - T188685 (duration: 01m 16s)
  • 15:26 marostegui: Set disk 32:2 on db1064 as offline
  • 15:20 moritzm: rebooting krypton (running grafana among others) for kernel security update
  • 15:17 reedy@tin: Synchronized wmf-config/throttle.php: T189121 (duration: 01m 15s)
  • 14:45 Amir1: EU SWAT is done
  • 14:42 ppchelko@tin: Finished deploy [cpjobqueue/deploy@aee2eb1]: Increase refreshLinks concurrency to 150 T185052 (duration: 00m 36s)
  • 14:41 ppchelko@tin: Started deploy [cpjobqueue/deploy@aee2eb1]: Increase refreshLinks concurrency to 150 T185052
  • 14:37 moritzm: rebooting rdb* hosts in codfw for kernel security update
  • 14:37 ladsgroup@tin: Synchronized php-1.31.0-wmf.23/extensions/Wikibase/lib/tests/phpunit/Sites/SiteMatrixParserTest.php: Add code of special wikis as interwiki when populating sites table, part II (T183019) (duration: 01m 16s)
  • 14:35 ladsgroup@tin: Synchronized php-1.31.0-wmf.23/extensions/Wikibase/lib/includes/Sites/SiteMatrixParser.php: Add code of special wikis as interwiki when populating sites table (T183019) (duration: 01m 16s)
  • 14:31 ladsgroup@tin: Synchronized php-1.31.0-wmf.24/extensions/Wikibase/lib/tests/phpunit/Sites/SiteMatrixParserTest.php: Add code of special wikis as interwiki when populating sites table, part II (T183019) (duration: 01m 15s)
  • 14:27 ladsgroup@tin: Synchronized php-1.31.0-wmf.24/extensions/Wikibase/lib/includes/Sites/SiteMatrixParser.php: Add code of special wikis as interwiki when populating sites table (T183019) (duration: 01m 16s)
  • 14:19 _joe_: adding mwdebug200{1,2} to ganeti in codfw, T187468
  • 14:17 urandom: reducing compression chunk length to 32kb on "wikipedia_T_page__summary".data - T189057
  • 14:10 zfilipin@tin: Synchronized wmf-config/: SWAT: Load Wikibase Quality extensions using extension registration (T106104) (duration: 01m 17s)
  • 14:03 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule (T188626) (duration: 01m 18s)
  • 14:01 urandom: setting trace probability to 0.0, restbase eqiad cassandra cluster - T189057
  • 13:22 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch all refreshLinks jobs to EventBus, file #2 - T185052 (duration: 01m 15s)
  • 13:22 moritzm: rebooting tungsten for kernel security update
  • 13:21 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch all refreshLinks jobs to EventBus - T185052 (duration: 01m 15s)
  • 13:20 ppchelko@tin: Finished deploy [cpjobqueue/deploy@d84286a]: Switch all refreshLinks jobs to kafka T185052 (duration: 00m 43s)
  • 13:20 moritzm: rebooting install2002 for kernel security update
  • 13:19 ppchelko@tin: Started deploy [cpjobqueue/deploy@d84286a]: Switch all refreshLinks jobs to kafka T185052
  • 10:55 marostegui: Deploy schema change on codfw s4 master (db2051) with replication enabled (this will generate lag on codfw) - T187089 T185128 T153182
  • 10:54 moritzm: rearmed keyholders on netmon1002 and netmon2001
  • 10:50 elukey: reboot stat100[56] for kernel upgrades
  • 10:49 moritzm: reboot memcached hosts in codfw for kernel security update
  • 10:34 moritzm: rebooting netmon2001 for kernel security update
  • 10:29 moritzm: rebooting netmon1002 for kernel security update
  • 10:26 moritzm: rebooting boron for kernel security update
  • 10:11 moritzm: rebooting openldap/WMCS servers for kernel security update
  • 10:05 moritzm: rebooting openldap/corp servers for kernel security update
  • 10:03 elukey: reboot analytics10[35,52] for kernel updates - hadoop hdfs journal nodes (didn't manage to complete the work yesterday)
  • 10:03 moritzm: rebooting pool counters in codfw for kernel security update
  • 10:02 akosiaris: upload apertium-rus-ukr_0.2.0~r82706-1+wmf1 on apt.wikimedia.org/jessie-wikimedia/main. T184901
  • 09:56 moritzm: rebooting tureis/roentgenium for kernel security update
  • 09:53 akosiaris: upload apertium-rus_0.2.0~r82706-1+wmf1 and apertium-ukr_0.1.0~r82563-1+wmf1 on apt.wikimedia.org/jessie-wikimedia/main. T184901
  • 09:46 moritzm: rebooting etherpad1001 (etherpad.wikimedia.org) for kernel security update
  • 09:31 moritzm: rebooting darmstadtium (docker registry) for kernel security update
  • 09:24 moritzm: rearming keyholder on sarin after reboot
  • 09:16 moritzm: rebooting sarin for kernel security update
  • 08:57 ema: cp3033: restart varnish-be, backend connections piling up (~12k)
  • 08:40 marostegui: Deploy schema change on s7 primary master db1062 - T153182 T185128
  • 08:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 after alter table (duration: 01m 16s)
  • 07:45 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2089,db2079 and db2065 after mariadb and kernel upgrade (duration: 01m 16s)
  • 07:30 marostegui: Stop mariadb on db2089,db2079 and db2065 for kernel upgrade
  • 07:28 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2089,db2079 and db2065 (duration: 01m 15s)
  • 06:49 marostegui: Deploy schema change on db1079 with replication enabled (this will generate lag on labs) - T187089 T185128 T153182
  • 06:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 for alter table (duration: 01m 16s)
  • 02:27 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.23) (duration: 06m 03s)
  • 00:57 Amir1: Evening SWAT is done
  • 00:32 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Re-enable Wikidata descriptions (T188182) (duration: 01m 16s)

2018-03-06

  • 23:10 MaxSem: cancelled
  • 23:05 MaxSem: refreshing spoofuser
  • 23:00 MaxSem: dumping centralauth.spoofuser from db1094
  • 21:22 mutante: restbase-dev1006 powercycled via console (T185494)
  • 20:49 thcipriani@tin: rebuilt and synchronized wikiversions files: Group0 to 1.31.0-wmf.24
  • 20:44 ottomata: reverted change to point mediawiki monolog kafka producers at kafka jumbo-eqiad until deployment train is done T188136
  • 20:36 mutante: phab1001 (phabricator) - rebooting for maintenance
  • 20:35 ottomata: pointing mediawiki monolog kafka producers at kafka jumbo-eqiad cluster: T188136
  • 20:08 thcipriani@tin: Finished scap: testwiki to php-1.31.0-wmf.24 and rebuild l10n cache (duration: 29m 13s)
  • 19:39 thcipriani@tin: Started scap: testwiki to php-1.31.0-wmf.24 and rebuild l10n cache
  • 18:38 mholloway-shell@tin: Finished deploy [mobileapps/deploy@5986ab7]: Update mobileapps to afbe9af (duration: 05m 28s)
  • 18:32 mholloway-shell@tin: Started deploy [mobileapps/deploy@5986ab7]: Update mobileapps to afbe9af
  • 18:22 godog: puppet-merge Revert: Use hiera3 role/nuyaml backends on >= stretch
  • 17:58 marostegui: Reload haproxy on dbproxy1004 and dbproxy1009
  • 17:53 thcipriani: starting branch cut for 1.31.0-wmf.24
  • 17:53 andrewbogott: disabling puppet and apache on labpuppetmaster1001 and 1002
  • 17:47 moritzm: rebooting dbmonitor1001 for kernel security update
  • 17:42 moritzm: rebooting dbmonitor2001 for kernel security update
  • 17:38 moritzm: rebooting hassaleh for kernel security update
  • 17:34 vgutierrez: update pybal to 1.15.1 on lvs5003
  • 17:32 vgutierrez: update pybal to 1.15.1 on lvs1010
  • 17:28 vgutierrez: uploaded pybal_1.15.1_all.deb to apt.wikimedia.org jessie-wikimedia
  • 17:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1069 after alter table (duration: 00m 58s)
  • 16:58 cmjohnson1: powering off rhenium to reset the idrac
  • 16:44 sbisson@tin: Finished deploy [kartotherian/deploy@255401a]: Testing update-deps2 branch (duration: 05m 47s)
  • 16:38 sbisson@tin: Started deploy [kartotherian/deploy@255401a]: Testing update-deps2 branch
  • 16:11 oblivian@tin: Synchronized wmf-config: Fetch data from etcd on all appservers (duration: 01m 01s)
  • 16:01 marostegui: Deploy schema change on db1069 - T187089 T185128 T153182
  • 16:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1069 for alter table (duration: 00m 57s)
  • 15:54 jynus: deploying new query killer logic to all wikidata (s8) db replicas T188505
  • 15:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1094 after alter table (duration: 00m 57s)
  • 15:51 moritzm: installing libvpx security updates
  • 15:50 oblivian@tin: Synchronized wmf-config: Expose etcd last modified index (duration: 01m 00s)
  • 15:45 moritzm: rebooting ununpentium for kernel security update
  • 15:39 oblivian@tin: Finished scap: Deploying Expose the latest modified index seen by EtcdConfig (duration: 09m 49s)
  • 15:29 oblivian@tin: Started scap: Deploying Expose the latest modified index seen by EtcdConfig
  • 15:28 moritzm: rebooting bromine for kernel security update
  • 15:19 mobrovac@tin: Synchronized php-1.31.0-wmf.23/includes/jobqueue/JobQueueSecondTestQueue.php: [JobQueueSecondTestQueue] Support read-only mode - T185052 (duration: 00m 58s)
  • 15:09 vgutierrez: update to pybal 1.15.0 on lvs5003
  • 15:02 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Article counts: Change 'comma' method to 'any' - T188472 (duration: 01m 00s)
  • 14:50 vgutierrez: update pybal to 1.15.0 on lvs1010
  • 14:46 hashar: tin: /srv/mediawiki-staging/php-1.31.0-wmf.23 rebased on tip of https://gerrit.wikimedia.org/r/#/c/416686/ (which reverts a merge of the master branch)
  • 14:42 gehel: rebooting maps1* (eqiad) for kernel security update completed
  • 14:36 ottomata: beginning migration of webrequest text varnishkafka logs from Kafka analytics to Kafka jumbo-eqiad T185136
  • 14:21 moritzm: rebooting labweb* for kernel security update
  • 14:13 moritzm: rebooting sca* for kernel security update
  • 14:07 gehel: rebooting maps1* (eqiad) for kernel security update
  • 14:07 moritzm: rebooting pybal-test for kernel security update
  • 14:00 _joe_: SWAT is suspended for investigation on tin's git status
  • 14:00 moritzm: rebooting oxygen for kernel security update
  • 13:16 moritzm: powercycling ms-be1038, stuck after reboot
  • 13:10 marostegui: Deploy schema change on db1094 - T187089 T185128 T153182
  • 13:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1094 for alter table (duration: 00m 58s)
  • 12:55 moritzm: rebooting URL downloaders for kernel security update
  • 12:51 mobrovac@tin: Finished deploy [cpjobqueue/deploy@9b0b947]: refreshLinks: Increase concurrency to 100 - T185052 (duration: 00m 34s)
  • 12:50 mobrovac@tin: Started deploy [cpjobqueue/deploy@9b0b947]: refreshLinks: Increase concurrency to 100 - T185052
  • 12:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1086 after alter table (duration: 00m 58s)
  • 12:33 moritzm: rebooting mwlog* for kernel security update
  • 12:04 moritzm: rebooting graphite hosts in eqiad for kernel security update
  • 11:29 moritzm: rebooting k8s masters for kernel security update
  • 11:05 elukey: reboot analytics10[28,35,52] for kernel updates (one at a time, hadoop hdfs journal nodes)
  • 10:46 moritzm: powercycling ms-be1021, stuck after reboot
  • 10:45 akosiaris@tin: Synchronized wmf-config/CommonSettings.php: (no justification provided) (duration: 01m 22s)
  • 10:43 moritzm: rearming keyholder on naos after reboot
  • 10:39 akosiaris: emergency add a captcha in metawiki contact pages like https://meta.wikimedia.org/wiki/Special:Contact/Stewards to stop bot abuse. phab Task to be filed later on
  • 10:39 godog: reboot ms-be1013 to try fix disk ordering
  • 10:35 moritzm: rebooting naos for kernel security update
  • 10:32 moritzm: rearming keyholder on tin after reboot
  • 10:30 gehel: kafka poller active on all production wdqs nodes - T188252
  • 10:28 moritzm: rebooting tin for kernel security update
  • 10:20 gehel: reboot completed for maps2* and maps-test*
  • 09:51 moritzm: rebooting graphite hosts in codfw for kernel security update
  • 09:42 marostegui: Stop MySQL on db1107 for mariadb and kernel upgrade
  • 09:41 vgutierrez: uploaded pybal_1.15.0_all.deb to apt.wikimedia.org jessie-wikimedia
  • 09:40 marostegui: Start proxysql on wasat
  • 09:38 moritzm: rebooting wezen for kernel security update
  • 09:27 elukey: reboot kafka2001 (eventbus codfw) for kernel updates
  • 09:24 marostegui: Deploy schema change on db1086 - T187089 T185128 T153182
  • 09:18 marostegui: Stop and reboot db1086 for kernel and mariadb upgrade
  • 09:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1086 for alter table (duration: 00m 57s)
  • 09:17 moritzm: rebooting swift backend servers in eqiad for kernel security update
  • 09:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3317 after alter table (duration: 00m 57s)
  • 09:05 gehel: rolling restart of maps* for kernel upgrade
  • 08:50 elukey: reboot meitnerium (archiva) for kernel updates
  • 08:38 paravoid: rebooting furud
  • 08:35 moritzm: rebooting wasat for kernel security update
  • 08:30 elukey: drain+reboot analytics[1065-1067] for kernel updates
  • 08:26 marostegui@tin: Synchronized wmf-config/db-codfw.php: Update db1069 IP (duration: 00m 57s)
  • 08:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Update db1069 IP (duration: 00m 57s)
  • 08:15 moritzm: rebooting ruthenium for kernel security update
  • 08:14 marostegui@tin: Synchronized wmf-config/db-codfw.php: Revert depool some codfw hosts for kernel and mariadb upgrade (duration: 00m 57s)
  • 08:10 moritzm: rebooting bast5001 for kernel security update
  • 08:01 elukey: drain+reboot analytics[61,63,64] for kernel updates
  • 07:59 moritzm: rebooting tegmen for kernel security update
  • 07:43 marostegui: Stop mysql on db2090 db2080 db2076 db2073 db2067 for mariadb and kernel upgrade
  • 07:43 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool some codfw hosts for kernel and mariadb upgrade (duration: 00m 58s)
  • 07:36 moritzm: rebooting remaining swift backend servers in codfw for kernel security update
  • 07:18 marostegui: Stop MySQL on db2093 to get some data from the event scheduler
  • 06:56 marostegui: Deploy schema change on db1101:3317 - T187089 T185128 T153182
  • 06:51 marostegui: Stop mysql on db2037 to upgrade it
  • 06:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1101:3317 for alter table (duration: 00m 58s)
  • 05:00 krinkle@tin: Synchronized wmf-config/PhpAutoPrepend.php: T180183: I6d72873b9d3 (duration: 00m 56s)
  • 04:59 krinkle@tin: Synchronized wmf-config/profiler.php: T180183 - Ie5a164a9e2b (duration: 00m 57s)
  • 04:58 krinkle@tin: Synchronized wmf-config/profiler-labs.php: beta: no-op (duration: 00m 54s)
  • 04:57 krinkle@tin: Synchronized wmf-config/PhpAutoPrepend-labs.php: beta: no-op (duration: 00m 57s)
  • 04:29 bblack: eqsin router maintenance starting soon-ish. all of eqsin will be offline and isn't in production service to begin with. We've tried to downtime all the things, but don't be shocked at spurious alerts! - T187807
  • 04:08 krinkle@tin: Synchronized multiversion/MWMultiVersion.php: Ia2acf57c6 (duration: 00m 57s)
  • 04:01 krinkle@tin: Synchronized wmf-config/profiler.php: T180183 (duration: 01m 33s)
  • 02:26 tgr@tin: Synchronized wmf-config/CommonSettings.php: T186296 Increase ReadingLists list size limit to 5k (duration: 01m 06s)
  • 02:07 tgr@tin: Finished scap: T187226#4025352 update ReadingLists (duration: 18m 49s)
  • 01:48 tgr@tin: Started scap: T187226#4025352 update ReadingLists
  • 01:00 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: refresh wmf-config/InitialiseSettings, seems to have stuck in old state on some servers after doing the initial sync in the wrong order (duration: 00m 57s)
  • 00:54 tgr@tin: Synchronized wmf-config: T57420 Enable loginOnly mode for local auth provider on group 0 (duration: 01m 00s)
  • 00:41 krinkle@tin: Synchronized wmf-config/CommonSettings.php: no-op I33f09b164e7 (duration: 00m 58s)
  • 00:38 krinkle@tin: Synchronized wmf-config/CommonSettings-labs.php: beta-only: I02a4d4 (duration: 00m 57s)

2018-03-05

  • 22:44 bawolff@tin: Synchronized php-1.31.0-wmf.23/includes/logging/LogPager.php: T188145 (duration: 00m 58s)
  • 21:32 arlolra: Updated Parsoid to d115592 (T188591)
  • 21:25 arlolra@tin: Finished deploy [parsoid/deploy@232631f]: Updating Parsoid to d115592 (duration: 12m 12s)
  • 21:13 arlolra@tin: Started deploy [parsoid/deploy@232631f]: Updating Parsoid to d115592
  • 20:04 gehel@tin: Finished deploy [wdqs/wdqs@1983ddf]: wdqs GUI update (duration: 01m 36s)
  • 20:03 gehel@tin: Started deploy [wdqs/wdqs@1983ddf]: wdqs GUI update
  • 20:02 hashar@tin: Synchronized php-1.31.0-wmf.23/extensions/Wikibase: Fix empty condition list in metadata lookup - T188313 (duration: 01m 58s)
  • 19:51 maxsem@tin: Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/416219/ (duration: 00m 57s)
  • 19:43 maxsem@tin: Synchronized php-1.31.0-wmf.23/extensions/Cite: https://gerrit.wikimedia.org/r/#/c/416467/ (duration: 00m 58s)
  • 19:30 gehel@tin: Finished deploy [wdqs/wdqs@11c73f0]: rolling back previous GUI update (duration: 02m 36s)
  • 19:28 gehel@tin: Started deploy [wdqs/wdqs@11c73f0]: rolling back previous GUI update
  • 19:23 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/416456/ (duration: 00m 58s)
  • 19:21 gehel@tin: Finished deploy [wdqs/wdqs@11c73f0]: new WDQS GUI and updater version (duration: 01m 23s)
  • 19:20 gehel@tin: Started deploy [wdqs/wdqs@11c73f0]: new WDQS GUI and updater version
  • 19:14 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/416457/ (duration: 00m 58s)
  • 18:54 jynus: stop slave on db2044
  • 18:24 gehel@tin: Finished deploy [wdqs/wdqs@11c73f0]: rolling back to previous state, UI is broken (duration: 00m 54s)
  • 18:23 gehel@tin: Started deploy [wdqs/wdqs@11c73f0]: rolling back to previous state, UI is broken
  • 18:20 gehel@tin: Finished deploy [wdqs/wdqs@11c73f0]: new WDQS GUI and updater version (duration: 03m 08s)
  • 18:16 gehel@tin: Started deploy [wdqs/wdqs@11c73f0]: new WDQS GUI and updater version
  • 17:34 elukey: drain + reboot analytics10[58-60] for kernel updates
  • 17:32 bd808: Added zhuyifei1999_ and chicocvenancio to the "toollabs-trusted" gerrit group
  • 16:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1069 - T186699 (duration: 00m 57s)
  • 16:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1098:3317 after alter table (duration: 00m 57s)
  • 16:00 elukey: test
  • 15:56 akosiaris: upload tiller on apt.wikimedia.org Component: main distros: jessie-wikimedia, stretch-wikimedia T189919
  • 15:56 akosiaris: upload helm on apt.wikimedia.org Component: main distros: jessie-wikimedia, stretch-wikimedia T189919
  • 15:55 urandom: setting trace probability to 0.001 (.1%), eqiad datacenter, restbase cassandra cluster
  • 15:52 urandom: updating `system_traces` keyspace replication strategy, restbase cassandra cluster
  • 15:51 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch all of the cdnPurge to EventBus, file 2/2 - T188540 (duration: 00m 57s)
  • 15:50 marostegui: Deploy schema change on dbstore1002 - T187089 T185128 T153182
  • 15:49 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch all of the cdnPurge to EventBus, file 1/2 - T188540 (duration: 00m 57s)
  • 15:45 ppchelko@tin: Finished deploy [cpjobqueue/deploy@346a2b6]: Switch all cdnPurge jobs to kafka (duration: 00m 35s)
  • 15:45 ppchelko@tin: Started deploy [cpjobqueue/deploy@346a2b6]: Switch all cdnPurge jobs to kafka
  • 15:42 marostegui: stop and poweroff db1069 for rack change - T186699
  • 15:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1069 - T186699 (duration: 00m 57s)
  • 15:41 elukey: drain + reboot analytics 1055->57 for kernel updates
  • 15:38 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch 50% for refreshLinks to EventBus - T185052 (duration: 00m 57s)
  • 15:31 ppchelko@tin: Finished deploy [cpjobqueue/deploy@fe5b1f3]: Enable refreshLinks for 50% of the jobs (duration: 00m 39s)
  • 15:31 ppchelko@tin: Started deploy [cpjobqueue/deploy@fe5b1f3]: Enable refreshLinks for 50% of the jobs
  • 15:28 marostegui: Mark as failed disk 32:9 on db1068 (s4 primary master) - T188187
  • 15:20 mobrovac@tin: Synchronized php-1.31.0-wmf.23/extensions/EventBus/includes/JobExecutor.php: [JobExecutor] Wait for the replicas if the transaction takes too long (duration: 00m 57s)
  • 15:14 moritzm: rebooting webperf2001 for kernel security update
  • 14:57 hashar: European SWAT completed
  • 14:57 hashar@tin: Finished scap: 2017 wikitext editor: Simplify config part 2 (duration: 02m 57s)
  • 14:54 hashar@tin: Started scap: 2017 wikitext editor: Simplify config part 2
  • 14:52 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable translate extension in bdwikimedia - T188853 (duration: 00m 57s)
  • 14:48 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable translate extension in bdwikimedia - T188853 (duration: 00m 57s)
  • 14:44 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable rollbacker user right at arwikiversity - T188633 (duration: 00m 57s)
  • 14:41 hashar@tin: Finished scap: core + Flow, master/replicate race condition - T182358 T184670 (duration: 04m 24s)
  • 14:36 hashar@tin: Started scap: core + Flow, master/replicate race condition - T182358 T184670
  • 14:34 elukey: graphite metrics mw.error.* deprecated in T188749
  • 14:31 hashar@tin: Finished scap: Popups: Remove client side formatters in the REST formatter - T183833 (duration: 23m 08s)
  • 14:11 hashar: mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=bdwikimedia translate # T188853
  • 14:08 hashar@tin: Started scap: Popups: Remove client side formatters in the REST formatter - T183833
  • 14:06 hashar@tin: scap aborted: Popups: Remove client side formatters in the REST formatter - T183833 (duration: 00m 16s)
  • 14:06 hashar@tin: Started scap: Popups: Remove client side formatters in the REST formatter - T183833
  • 13:55 moritzm: rolling reboot of swift backends in codfw for kernel security update
  • 13:49 moritzm: rebooting releases2001 for kernel security update
  • 13:37 moritzm: rebooting neon for kernel security update
  • 13:37 mobrovac@tin: Started restart [cpjobqueue/deploy@b5255f0]: Force RecordLintJob rebalance in Kafka - T188870
  • 13:04 moritzm: rebooting bast4002 for kernel security update
  • 13:00 marostegui: Deploy schema change on db1098:3317 - T187089 T185128 T153182
  • 13:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098:3317 for alter table (duration: 00m 57s)
  • 12:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1011 from config - T184703 (duration: 01m 02s)
  • 12:45 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1011 from config - T184703 (duration: 01m 02s)
  • 12:40 moritzm: rebooting bast4001 for kernel security update
  • 12:30 marostegui: Remove db1011 from tendril as it will be decommissioned - T184703
  • 12:19 moritzm: installing libvpx security updates
  • 12:13 moritzm: installing wavpack security updates
  • 12:08 moritzm: installing freexl security updates
  • 11:59 moritzm: upgrading tor on radium
  • 11:40 moritzm: updating tor packages to 0.3.2.10
  • 11:19 moritzm: running "racadm racreset" on rhenium, mgmt inaccessible
  • 11:09 elukey: drain + reboot analytics10[50,51,53,54] for kernel updates
  • 10:53 moritzm: rebooting bast2001 for kernel security update
  • 10:46 moritzm: rebooting lithium for kernel security update
  • 10:24 elukey: drain + reboot analytics10[46-49] for kernel updates
  • 10:23 moritzm: rolling reboot of logstash* for kernel security update
  • 09:33 godog: roll restart swift in codfw to add thumbor private user
  • 09:15 marostegui: Deploy schema change on s7 codfw master (db2040), this will generate lag on codfw - T187089 T185128 T153182
  • 09:01 godog: roll-restart thumbor to apply https://gerrit.wikimedia.org/r/416240
  • 08:54 marostegui: Stop mariadb on db2037 to copy it to db1073
  • 08:25 marostegui: Stop MySQL on db2078 for mariadb and kernel upgrade
  • 07:20 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1073 from config (duration: 00m 58s)
  • 07:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1073 from config (duration: 00m 59s)
  • 07:06 marostegui: Deploy schema change on s2 primary master db1054 - T185128 T153182
  • 02:08 l10nupdate@tin: LocalisationUpdate failed: git pull of extensions failed

2018-03-04

  • 20:16 tgr: T188721 ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=wikidatawiki --ignorestatus --logwiki=metawiki 'Erik Fastman' 'Glorious Engine'
  • 18:05 musikanimal: T188721 Ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=wikidatawiki --logwiki=metawiki 'Erik Fastman' 'Glorious Engine'
  • 15:59 elukey: powercycle stat1004 - available via mgmt, root login freezes while trying

2018-03-03

  • 14:16 akosiaris: 13:56:20 ema: powercycle ganeti1005 T181121
  • 13:56 ema: powercycle ganeti1005
  • 13:25 andrewbogott: forced quota update in admin-monitoring as well; the reserved fixed_ip value was incorrect
  • 13:23 andrewbogott: forcing quota update in nova with update quota_usages set reserved='-1' where project_id='contintcloud';
  • 13:10 andrewbogott: restarting rabbitmq-server on labcontrol1001
  • 13:08 andrewbogott: restarting nodepool
  • 13:05 andrewbogott: restarting nova-conductor
  • 13:02 andrewbogott: stopping nodepool for a bit while investigating openstack issues
  • 02:14 chasemp: labnodepool1001:~# service nodepool start
  • 01:30 chasemp: root@labnet1001:~# service nova-fullstack restart
  • 01:21 chasemp: labnodepool1001:~# service nodepool stop

2018-03-02

  • 19:44 jynus: restarting labsdb1010
  • 17:22 mepps: updated payments-wiki 498f49a758 to ce68e8e80b
  • 15:19 elukey: drain + reboot analytics10[41-45] for kernel updates
  • 15:15 moritzm: rebooting auth* for kernel security updates
  • 13:46 elukey: drain + reboot analytics10[38,39,40,41] for kernel updates
  • 13:22 elukey: drain + reboot analytics10[33,34,36,37] for kernel updates
  • 13:17 moritzm: upgrading labtest trusty hosts to latest 4.4 kernel
  • 12:23 moritzm: rebooting kubetcd/kubestagetcd for kernel security update
  • 12:00 moritzm: rebooting etcd* for kernel security updates
  • 11:58 elukey: drain + reboot analytics10[29,31,32] for kernel updates
  • 11:33 moritzm: draining restbase1018 for eventual reboot for kernel security update
  • 11:28 akosiaris: upload to apt.wikimedia.org component thirdparty/ci distro jessie-wikimedia docker-ce_17.12.1~ce-0~debian_amd64 T177499
  • 11:07 moritzm: rebooting mwdebug* for kernel security update
  • 10:54 ema: spare LVSs lvs[1011-1012], lvs[4001-4004]: reboot for retpoline kernel updates T188092
  • 10:53 moritzm: draining restbase1017 for eventual reboot for kernel security update
  • 10:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1114 (duration: 00m 57s)
  • 10:18 moritzm: draining restbase1016 for eventual reboot for kernel security update
  • 10:18 jynus: shutting down labsdb1010
  • 10:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1114 (duration: 00m 56s)
  • 10:01 elukey: deleted /etc/burrow/* from zookeeper main eqiad/codfw after https://gerrit.wikimedia.org/r/415818 (garbage to cleanup)
  • 09:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1114 (duration: 00m 57s)
  • 09:40 moritzm: draining restbase1015 for eventual reboot for kernel security update
  • 09:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly pool db1114 in s1 after cloning it from db1073 - T183469 (duration: 01m 01s)
  • 08:57 moritzm: rebooting scb1004 for kernel security update (was omitted from earlier reboots due to hardware issues on scb1003)
  • 08:51 moritzm: repooling scb1003 after memory module was replaced (T188385)
  • 07:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T162807 (duration: 00m 57s)
  • 07:21 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add db1114 to the config - T183469 (duration: 00m 57s)
  • 07:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add db1114 to the config - T183469 (duration: 00m 57s)
  • 07:11 moritzm: rebooting xenon/praseodymium/cerium for kernel security update
  • 07:11 moritzm: rebooting xenon/praseodymium/xenon for kernel security update
  • 06:52 marostegui: Stop MySQL on db1073 to clone db1114 - T183469
  • 06:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073 to clone db1114 - T183469 (duration: 00m 58s)
  • 02:48 legoktm: manually purged ExtensionDistributor cache (T188692)
  • 01:54 mutante: cobalt (gerrit) - rebooting for kernel upgrade
  • 01:46 mutante: LDAP: added lucaswerkmeister-wmde to 'wmde' and 'nda' groups (T188105)
  • 00:49 ebernhardson@tin: Synchronized wmf-config/flaggedrevs.php: SWAT: T148603: (duration: 00m 57s)
  • 00:48 herron: fermium (lists) and mx systems rebooted for kernel update
  • 00:46 ebernhardson@tin: Synchronized php-1.31.0-wmf.23/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT T187148: Start cirrus query explorer AB test (duration: 00m 57s)
  • 00:25 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T187148 Configure Cirrus AB test (step 2) (second try) (duration: 00m 57s)
  • 00:23 ebernhardson@tin: Synchronized wmf-config/CirrusSearch-common.php: SWAT: T187148 Configure Cirrus AB test (step 1) (second try) (duration: 00m 57s)
  • 00:12 ebernhardson@tin: Synchronized wmf-config/: REVERT SWAT: T187148 Configure Cirrus AB test (duration: 00m 59s)
  • 00:09 ebernhardson@tin: Synchronized wmf-config/: SWAT: T187148 Configure Cirrus AB test (duration: 01m 00s)

2018-03-01

  • 22:35 gehel: rolling restart of elasticsearch / cirrus - eqiad complete, cluster is green
  • 21:45 thcipriani@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.23
  • 21:33 bsitzmann@tin: Finished deploy [mobileapps/deploy@bd9924e]: Update mobileapps to 1056fde (T183833) (duration: 05m 15s)
  • 21:28 bsitzmann@tin: Started deploy [mobileapps/deploy@bd9924e]: Update mobileapps to 1056fde (T183833)
  • 21:17 thcipriani@tin: Synchronized php-1.31.0-wmf.23/extensions/GeoData/includes/api/ApiQueryGeoSearchElastic.php: Fix undefined property error in ApiQueryGeoSearchElastic T188659 (duration: 01m 15s)
  • 20:30 thcipriani@tin: Synchronized php: php link to 1.31.0-wmf.23 (duration: 01m 12s)
  • 20:29 andrewbogott: restarting labweb1002
  • 20:28 thcipriani@tin: rebuilt and synchronized wikiversions files: Group1 back to 1.31.0-wmf.23
  • 20:15 thcipriani@tin: Synchronized php-1.31.0-wmf.23/includes/specials/pagers/NewPagesPager.php: SWAT: NewPagesPager: Use array_merge rather than + for RC query info fields T188555 (duration: 01m 14s)
  • 20:15 andrewbogott: rebooting labweb1001
  • 19:56 thcipriani@tin: Synchronized langlist-labs: SWAT: beta: add nlwiki to langlist T188582 (beta-only change) (duration: 01m 13s)
  • 19:50 gehel: new kafka based poller for wdqs now enabled on wdqs2001 - T188252
  • 19:48 thcipriani@tin: Synchronized wmf-config/throttle-analyze.php: SWAT: Revert "Automatically include commons and wikidata in $wmgThrottlingExceptions" (duration: 01m 14s)
  • 19:36 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable rollback for editors at zh_classicalwiki T188064 (duration: 01m 14s)
  • 19:31 gehel@tin: Finished deploy [wdqs/wdqs@86da751]: new updater to fix kafka poller issues (duration: 02m 12s)
  • 19:29 gehel@tin: Started deploy [wdqs/wdqs@86da751]: new updater to fix kafka poller issues
  • 19:24 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable responsive references by default on rowiki T187997 (duration: 01m 15s)
  • 19:21 mutante: depooled scb1003 from all services on scb because it went down, including mgmt
  • 19:20 dzahn@neodymium: conftool action : set/pooled=no; selector: name=scb1003.eqiad.wmnet
  • 19:17 thcipriani@tin: Synchronized wmf-config/throttle.php: SWAT: Make last throttle limit raise work across all wikis T188630 (duration: 01m 13s)
  • 19:15 mutante: powercycling crashed scb1003
  • 19:13 thcipriani@tin: Synchronized wmf-config/throttle.php: SWAT: Fix throttle date for outreach dashboard T188630 (duration: 01m 13s)
  • 18:47 demon@tin: Synchronized wmf-config/: killing extension-list-labs (duration: 01m 17s)
  • 18:45 demon@tin: Synchronized wmf-config/InitialiseSettings.php: disable performance inspector in prod explicitly (duration: 01m 14s)
  • 18:43 demon@tin: Synchronized docroot/noc/: killing extension-list-labs (duration: 01m 14s)
  • 18:13 bsitzmann@tin: Finished deploy [mobileapps/deploy@ada38aa]: Update mobileapps to 0db4a60 (T183833) (duration: 06m 01s)
  • 18:07 bsitzmann@tin: Started deploy [mobileapps/deploy@ada38aa]: Update mobileapps to 0db4a60 (T183833)
  • 17:51 gehel: depooling wdqs2001 and switching to kafka poller - T188252
  • 17:47 gehel: restarting wdqs-updater on wdqs1004 - T188045
  • 17:46 mutante: re-enabling icinga notifications for wdqs1004 services, ethernet cable has been replaced (T188045)
  • 17:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1093 (duration: 01m 14s)
  • 17:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1093 (duration: 01m 28s)
  • 17:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1093 (duration: 01m 13s)
  • 16:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1093 (duration: 01m 13s)
  • 16:41 jynus: reimporting database testreduce_0715 from db1009 to db2037
  • 16:36 marostegui: Restart mariadb on db1093 for binlog format change - T186321
  • 16:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 - T186321 (duration: 01m 13s)
  • 16:14 moritzm: rebooting hafnium for kernel security update
  • 16:06 marostegui: Fix s7 replication on labsdb1010 - T186579
  • 16:00 moritzm: rebooting radium (tor relay) for kernel security update
  • 15:52 moritzm: draining restbase1014 for eventual reboot for kernel security update
  • 15:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1060 as API (duration: 01m 13s)
  • 15:32 bblack: disabling puppet on A:cp for deploy of https://gerrit.wikimedia.org/r/#/c/415204/ and friends
  • 15:30 mobrovac@tin: Synchronized php-1.31.0-wmf.23/extensions/EventBus/includes/JobQueueEventBus.php: EventBus: Specify that EventBus queue supports delayed jobs (wmf/1.31.0-wmf.23) - T188540 (duration: 01m 14s)
  • 15:26 mobrovac@tin: Synchronized php-1.31.0-wmf.22/extensions/EventBus/includes/JobQueueEventBus.php: EventBus: Specify that EventBus queue supports delayed jobs (wmf/1.31.0-wmf.22) - T188540 (duration: 01m 13s)
  • 15:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1074 after alter table (duration: 01m 13s)
  • 15:22 moritzm: draining restbase1013 for eventual reboot for kernel security update
  • 15:19 zeljkof: EU SWAT finished
  • 15:18 zfilipin@tin: Synchronized php-1.31.0-wmf.22/extensions/Popups: SWAT: Fix: dont assume thumbnail URLs contain pixel size (T187955) (duration: 01m 14s)
  • 15:17 moritzm: rolling restart of swift frontends in eqiad for kernel security update
  • 15:12 godog: upload puppetdb 4.4.0-1~wmf1 to component/puppetdb4 - T177253
  • 15:00 ema: eqiad LVSs: reboot for retpoline kernel updates T188092
  • 14:36 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Import sources on maiwikimedia (T188374) (duration: 01m 13s)
  • 14:28 moritzm: rolling restart of swift frontends in codfw for kernel security update
  • 14:26 moritzm: draining restbase1012 for eventual reboot for kernel security update
  • 14:25 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Quiz Extension at zhwikibooks (T188213) (duration: 01m 14s)
  • 14:12 mobrovac@tin: Synchronized wmf-config/jobqueue.php: JobQueue: Enable EventBus for cdnPurge for all but wikipedia, commons and wikidata, file 2/2 - T188540 (duration: 01m 13s)
  • 14:10 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: JobQueue: Enable EventBus for cdnPurge for all but wikipedia, commons and wikidata, file 1/2 - T188540 (duration: 01m 14s)
  • 14:02 ppchelko@tin: Finished deploy [cpjobqueue/deploy@b5255f0]: Enable kafka queue for cdnPurge for all but wikipedia, commons and wikidata. T188540 (duration: 00m 44s)
  • 14:01 ppchelko@tin: Started deploy [cpjobqueue/deploy@b5255f0]: Enable kafka queue for cdnPurge for all but wikipedia, commons and wikidata. T188540
  • 13:54 moritzm: draining restbase1011 for eventual reboot for kernel security update
  • 13:50 ema: codfw LVSs: reboot for retpoline kernel updates T188092
  • 13:33 gehel: force merging enwiki_general index on codfw to reclaim space
  • 13:18 moritzm: draining restbase1010 for eventual reboot for kernel security update
  • 13:17 elukey: reboot kafka-jumbo100[5,6] for kernel updates
  • 13:16 ema: esams LVSs: reboot for retpoline kernel updates T188092
  • 12:44 moritzm: draining restbase1009 for eventual reboot for kernel security update
  • 12:39 moritzm: rolling reboot of parsoid in eqiad for kernel security update
  • 12:27 elukey: reboot kafka-jumbo1004 for kernel updates
  • 12:21 elukey: reboot kafka1023 for kernel updates
  • 11:59 moritzm: draining restbase1008 for eventual reboot for kernel security update
  • 11:48 moritzm: powercycling wtp2013, stuck in reboot
  • 11:36 elukey: reboot kafka-jumbo1003 for kernel updates
  • 11:33 jynus: restarting labsdb1011
  • 11:32 elukey: reboot kafka1022 for kernel updates
  • 11:20 elukey: reboot kafka-jumbo1002 for kernel security updates
  • 11:15 moritzm: draining restbase1007 for eventual reboot for kernel security update
  • 11:13 ema: ulsfo LVSs: reboot for retpoline kernel updates T188092
  • 11:08 elukey: reboot kafka1020 for kernel updates
  • 10:38 ema: eqsin LVSs: reboot for retpoline kernel updates T188092
  • 10:32 moritzm: rolling reboot of parsoid in codfw for kernel security update
  • 10:27 moritzm: draining restbase2012 for eventual reboot for kernel security update
  • 10:20 moritzm: rebooting labnodepool1001 for kernel security update
  • 10:02 moritzm: rebooting contint1001 for kernel security update
  • 09:59 elukey: reboot kafka1014 for kernel security updates
  • 09:57 moritzm: draining restbase2011 for eventual reboot for kernel security update
  • 09:43 elukey: reboot kafka1013 for kernel security updates
  • 09:29 elukey: rebooting analytics1030 for kernel updates
  • 09:17 moritzm: draining restbase2010 for eventual reboot for kernel security update
  • 08:52 moritzm: rebooting prometheus servers in eqiad for kernel security update
  • 08:41 moritzm: draining restbase2009 for eventual reboot for kernel security update
  • 08:34 elukey: reboot kafka1012 for kernel updates - T188594
  • 08:20 gehel: banning elastic1021 from cluster (failed memory) - T188595
  • 07:55 elukey: reboot kafka-jumbo1001 for kernel updates - T188594
  • 07:52 elukey: run kafka preferred-replica-election on kafka1012 to force broker 18 to get back among Kafka topic leaders
  • 07:26 gehel: starting rolling reboot of elasticsearch / cirrus - eqiad (kernel upgrade and config changes)
  • 07:24 demon@tin: Synchronized php-1.31.0-wmf.22/maintenance/sql.php: adding --json output mode (duration: 01m 15s)
  • 06:59 chasemp: restart nova-api on labnet1001
  • 06:57 madhuvishy: Restart nova-conductor on labcontrol1001
  • 06:26 marostegui: Deploy schema change on db1074 - T187089 T185128 T153182
  • 06:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1074 for alter table (duration: 01m 14s)
  • 06:09 marostegui: Reload haproxy on dbproxy1005
  • 02:28 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.22) (duration: 06m 23s)
  • 02:05 demon@tin: Synchronized wmf-config/: removing extension-list-wikitech (duration: 01m 13s)
  • 02:03 demon@tin: Synchronized docroot/noc/: cleanup extension-list-wikitech removal (duration: 01m 12s)
  • 01:49 demon@tin: Synchronized wmf-config/: Undeploying EmailAuth from beta, no-op (duration: 01m 16s)
  • 01:32 eileen: update civicrm revision changed from 341c734a79 to a819d64d98, config revision is 62631813fc (add geocoder extension)
  • 00:43 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Clean up $wgEchoPerUserBlacklist setting (duration: 01m 14s)
  • 00:36 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Remove $wgUsejQueryThree (duration: 01m 14s)
  • 00:27 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ORES filters on eswikibooks (T145394) (duration: 01m 13s)
  • 00:17 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ORES filters on eswiki (T130279) (duration: 01m 14s)

2018-02-28

  • 23:27 eileen: civicrm revision changed from a47eafcbad to 341c734a79, config revision is 62631813fc (update civicrm submodule & vendor but not geocoder extension as yet)
  • 22:11 thcipriani@tin: rebuilt and synchronized wikiversions files: Group1 back to 1.31.0-wmf.22 T188555
  • 22:00 ejegg: updated payments-wiki from 1acfc4a9a0 to 498f49a758
  • 21:57 milimetric@tin: Finished deploy [analytics/refinery@fdd6c25]: Fix error due to invalid docopts comment (duration: 04m 19s)
  • 21:56 thcipriani@tin: rebuilt and synchronized wikiversions files: Group1 to 1.31.0-wmf.23
  • 21:53 milimetric@tin: Started deploy [analytics/refinery@fdd6c25]: Fix error due to invalid docopts comment
  • 21:46 arlolra: Updated Parsoid to 1415a2a (T58756, T169006)
  • 21:26 arlolra@tin: Finished deploy [parsoid/deploy@d376a3c]: Updating Parsoid to 1415a2a (duration: 08m 46s)
  • 21:17 arlolra@tin: Started deploy [parsoid/deploy@d376a3c]: Updating Parsoid to 1415a2a
  • 20:53 thcipriani@tin: rebuilt and synchronized wikiversions files: Group0 (back) to 1.31.0-wmf.23
  • 20:28 thcipriani@tin: rebuilt and synchronized wikiversions files: testwiki to 1.31.0-wmf.23
  • 20:20 thcipriani@tin: Synchronized php-1.31.0-wmf.23/includes/page/WikiPage.php: WikiPage: Avoid $user variable reuse in doDeleteArticleReal() T188479 (duration: 00m 57s)
  • 19:52 demon@tin: Synchronized README: no-op, forcing co-master sync (duration: 00m 57s)
  • 19:29 gehel: rolling reboot of elasticsearch / cirrus - codfw completed
  • 18:56 demon@tin: Finished deploy [gerrit/gerrit@f16f4a4]: GO plugin (duration: 00m 10s)
  • 18:55 demon@tin: Started deploy [gerrit/gerrit@f16f4a4]: GO plugin
  • 18:53 niharika29@tin: Synchronized wmf-config/throttle.php: Clean obsolete rules and add a new one - T188529 (duration: 00m 56s)
  • 18:44 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Reduce the batch size of statement usage tracking to 33 T151717 (duration: 00m 57s)
  • 18:42 niharika29@tin: Synchronized wmf-config/Wikibase.php: Reduce the batch size of statement usage tracking to 33 T151717 (duration: 00m 57s)
  • 18:32 godog: puppet reenable on einsteinium
  • 18:30 niharika29@tin: Synchronized wmf-config/Wikibase-production.php: Enable reading from full term entity id everywhere T114903 (duration: 00m 57s)
  • 18:23 niharika29@tin: Synchronized wmf-config/Wikibase-production.php: Enable Wikibase RC injection for ruwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/415078 (duration: 00m 57s)
  • 18:19 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Deploy Compact Language Links out of Beta on English Wikipedia T187677 (duration: 00m 58s)
  • 18:17 mutante: gerrit2001 - reboot for kernel upgrade
  • 18:12 godog: force a puppet run on failed hosts in eqiad for recovery
  • 18:09 apergos: rebooting dataset1001 (dumps.wm.o) for new kernel
  • 18:06 godog: stop and restart apache2 on puppetmaster1002
  • 17:58 godog: restart apache2 on puppetmaster1002
  • 17:46 milimetric@tin: Finished deploy [analytics/refinery@e551744]: Update sqoop job and orm artifact (duration: 06m 45s)
  • 17:46 kart_: Finished running CLL preference migration script on terbium (T187677)
  • 17:39 milimetric@tin: Started deploy [analytics/refinery@e551744]: Update sqoop job and orm artifact
  • 17:38 mutante: phab2001 - downtimed, rebooting for kernel upgrade
  • 16:44 moritzm: draining restbase2008 for eventual reboot for kernel security update
  • 16:10 moritzm: rebooting prometheus servers in codfw for kernel security update
  • 16:10 ppchelko@tin: Finished deploy [cpjobqueue/deploy@3622e38]: Enable refreshLinks for all but wikipedia, wiktionary and commons (duration: 00m 41s)
  • 16:09 ppchelko@tin: Started deploy [cpjobqueue/deploy@3622e38]: Enable refreshLinks for all but wikipedia, wiktionary and commons
  • 16:02 moritzm: draining restbase2007 for eventual reboot for kernel security update
  • 15:45 godog: repool rhodium as puppet master backend
  • 15:22 moritzm: rebooting ores in eqiad for kernel security update
  • 15:22 ema: upgrade cache_text@eqiad to varnish 5
  • 15:20 moritzm: draining restbase2006 for eventual reboot for kernel security update
  • 15:16 zeljkof: EU SWAT finished
  • 15:15 zfilipin@tin: Synchronized php-1.31.0-wmf.23/extensions/WikibaseQualityConstraints/: SWAT: Don’t query WikiPageEntityMetaDataAccessor with empty list (T188311) Bump cache key for check results (T188384) (duration: 01m 02s)
  • 15:11 zfilipin@tin: Synchronized php-1.31.0-wmf.22/extensions/WikibaseQualityConstraints: SWAT: Bump cache key for check results (T188384) (duration: 01m 02s)
  • 14:54 moritzm: rebooting ores in codfw for kernel security update
  • 14:53 jynus: stopping labsdb1011 to clone it to labsdb1010 T186579
  • 14:50 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Drop the medlem user group and editallpages user right (T184981) (duration: 00m 57s)
  • 14:48 zfilipin@tin: Synchronized php-1.31.0-wmf.22/extensions/WikibaseQualityConstraints: SWAT: Don’t query WikiPageEntityMetaDataAccessor with empty list (T188311) (duration: 01m 02s)
  • 14:47 zfilipin@tin: Synchronized php-1.31.0-wmf.22/extensions/WikibaseQualityConstraints/: SWAT: Only filter statuses after collecting metadata (T188384) (duration: 01m 03s)
  • 14:38 jynus: dropping sqldata on dbstore1001
  • 14:32 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable HTML Previews on all wikipedias (T182319) (duration: 00m 57s)
  • 14:28 moritzm: rebooting kubestage* for kernel security update
  • 14:25 gehel@tin: Finished deploy [tilerator/deploy@455a31a]: adding Brighmed, Meddo and ClearTables to tilerator (duration: 04m 27s)
  • 14:22 moritzm: draining restbase2005 for eventual reboot for kernel security update
  • 14:21 gehel@tin: Started deploy [tilerator/deploy@455a31a]: adding Brighmed, Meddo and ClearTables to tilerator
  • 14:17 zfilipin@tin: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: beta: enable VirtualPagePreviews events on beta cluster (T184793 T186728) (duration: 00m 57s)
  • 13:13 moritzm: draining restbase2004 for eventual reboot for kernel security update
  • 12:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db2011 - T187886 (duration: 00m 59s)
  • 12:42 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db2011 - T187886 (duration: 00m 58s)
  • 12:35 moritzm: draining restbase2003 for eventual reboot for kernel security update
  • 12:00 marostegui: Reboot db1115 tendril master to pick up new my.cnf options - T184704
  • 11:49 moritzm: draining restbase2002 for eventual reboot for kernel security update
  • 11:37 marostegui: Reset slave all on db2093 - T184704
  • 11:35 moritzm: rebooting eqiad job runners for kernel security update
  • 11:18 moritzm: powercycling restbase2001, stuck in reboot
  • 11:10 godog: rollout thumbor 1.15 to codfw/eqiad
  • 10:59 godog: upload python-thumbor-wikimedia 1.15 - T187822 T187350
  • 10:59 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1261.eqiad.wmnet
  • 10:54 moritzm: draining restbase2001 for eventual reboot for kernel security update
  • 10:43 moritzm: rebooting remaining mediawiki app servers in eqiad
  • 09:27 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2083, db2082 and db2081 after kernel upgrade (duration: 00m 57s)
  • 09:25 ema: upgrade cache_text@codfw to varnish 5
  • 09:06 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2083, db2082 and db2081 for kernel upgrade (duration: 00m 56s)
  • 09:06 marostegui: Reboot db2083, db2082 and db2081 for kernel and mariadb upgrade
  • 08:55 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2069 - T162807 (duration: 00m 57s)
  • 08:42 filippo@neodymium: conftool action : set/pooled=yes; selector: name=neodymium.eqiad.wmnet
  • 08:42 filippo@neodymium: conftool action : set/pooled=no; selector: name=neodymium.eqiad.wmnet
  • 08:34 marostegui: Reboot db2069 for kernel upgrade
  • 08:33 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2069 - T162807 (duration: 00m 57s)
  • 08:23 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2062 - T162807 (duration: 00m 57s)
  • 08:10 moritzm: rebooting remaining mediawiki API servers in eqiad
  • 07:51 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2062 - T162807 (duration: 00m 57s)
  • 07:51 marostegui: Reboot db2062 for mariadb and kernel upgrade
  • 07:36 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2085 (duration: 00m 57s)
  • 07:15 marostegui: Upgrade kernel and mariadb on db2085
  • 07:15 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2085 for mariadb and kernel upgrade (duration: 01m 00s)
  • 06:32 marostegui: Deploy schema change on db1060 (with replication) - this will cause lag on labs servers - T187089 T185128 T153182
  • 06:31 kart_: (Re)Starting CLL preference migration script on terbium (T187677)
  • 06:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060 for alter table (duration: 00m 57s)
  • 06:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060 for alter table (duration: 00m 57s)
  • 05:43 demon@tin: rebuilt and synchronized wikiversions files: (no justification provided)
  • 04:55 krinkle@tin: Synchronized wmf-config/profiler.php: Iba417de75a and Ied984d (duration: 01m 06s)
  • 03:01 kart_: Starting CLL preference migration script on terbium (T187677)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.22) (duration: 06m 21s)
  • 00:55 demon@tin: Synchronized scap/plugins/wmfbetaautoupdate.py: no-op (duration: 01m 14s)
  • 00:24 papaul: OS install on wdqs200[4-6]
  • 00:03 thcipriani@tin: Synchronized php-1.31.0-wmf.22/extensions/CentralAuth/includes/LocalRenameJob/LocalRenameUserJob.php: LocalRenameUserJob: escape backreferences in replacement title T188171 (duration: 01m 13s)

2018-02-27

  • 23:38 krinkle@tin: Synchronized dblists/: remove pp_stage1_raw.dblist (duration: 01m 14s)
  • 21:23 thcipriani@tin: Synchronized php-1.31.0-wmf.23/includes/user/User.php: Add a missing check of $wgActorTableSchemaMigrationStage T188437 (duration: 01m 14s)
  • 20:42 ppchelko@tin: Finished deploy [eventstreams/deploy@14e0b03]: Set correct CSP headers, forgot to git pull (duration: 02m 29s)
  • 20:39 ppchelko@tin: Started deploy [eventstreams/deploy@14e0b03]: Set correct CSP headers, forgot to git pull
  • 20:37 ppchelko@tin: Finished deploy [eventstreams/deploy@8f2eec4]: Set correct CSP headers (duration: 00m 25s)
  • 20:36 ppchelko@tin: Started deploy [eventstreams/deploy@8f2eec4]: Set correct CSP headers
  • 20:31 thcipriani@tin: rebuilt and synchronized wikiversions files: Group0 to 1.31.0-wmf.23
  • 20:08 herron: eqiad puppet master reboots finished -- re-enabling puppet agents
  • 20:02 herron: temporarily disabling puppet agents and rebooting eqiad puppet masters for kernel update
  • 20:02 thcipriani@tin: Finished scap: testwiki to php-1.31.0-wmf.23 and rebuild l10n cache (duration: 32m 10s)
  • 19:30 thcipriani@tin: Started scap: testwiki to php-1.31.0-wmf.23 and rebuild l10n cache
  • 19:08 otto@tin: Finished deploy [eventstreams/deploy@8f2eec4]: Publish page change related streams: T187241 (duration: 04m 16s)
  • 19:03 otto@tin: Started deploy [eventstreams/deploy@8f2eec4]: Publish page change related streams: T187241
  • 19:03 otto@tin: Finished deploy [eventstreams/deploy@8f2eec4]: Publish page change related streams: T187241 (scb2002 only) (duration: 00m 22s)
  • 19:03 otto@tin: Started deploy [eventstreams/deploy@8f2eec4]: Publish page change related streams: T187241 (scb2002 only)
  • 18:32 otto@tin: Started restart [eventstreams/deploy@7629e16]: service restart to publish page change related streams: T187241 (scb2001 only)
  • 18:32 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: Config deploy to publish page change related streams: T187241 (scb2001 only) (duration: 00m 03s)
  • 18:32 otto@tin: Started deploy [eventstreams/deploy@7629e16]: Config deploy to publish page change related streams: T187241 (scb2001 only)
  • 18:02 moritzm: rebooting kubernetes workers in eqiad for kernel security update
  • 17:46 moritzm: rebooting kubernetes workers in codfw for kernel security update
  • 17:41 jynus: restarting ferm on db2049, seems failed one day ago
  • 17:38 gehel: restarting wdqs-updater on wdqs1004 - T188045
  • 17:32 thcipriani: starting branch cut for 1.31.0-wmf.23 T183962
  • 17:14 godog: upload puppetdb 2.3.8-1~wmf1+stretch to stretch-wikimedia - T184562
  • 17:10 urandom: restarting Cassandra, restbase1007-a to test jmx_exporter
  • 16:53 elukey: restart cassandra-a on aqs1004 to test the prometheus jmx agent before complete rollout - T184795
  • 16:52 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Setting wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH everywhere (duration: 00m 56s)
  • 16:50 ema: lvs1010: retpoline kernel/libs upgrade T188092
  • 16:46 ema: cp1008: retpoline kernel/libs upgrade T188092
  • 16:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1081 (duration: 02m 04s)
  • 16:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1081 (duration: 00m 55s)
  • 16:26 moritzm: rebooting mw1293-mw1298 for kernel security update
  • 16:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1081 (duration: 00m 56s)
  • 16:10 thcipriani: restarting jenkins for plugin update
  • 16:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1081 (duration: 00m 56s)
  • 16:06 moritzm: rebooting restbase-dev for kernel security update
  • 15:49 awight: Restarting ORES celery workers, changing from 35 -> 45 workers per node.
  • 15:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1081 - T186321 (duration: 00m 56s)
  • 15:37 marostegui: Stop MySQL and reboot db1081 for kernel upgrade, mariadb upgrade and binlog format change - T186321
  • 15:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1081 - T186321 (duration: 00m 55s)
  • 15:33 moritzm: installing squid security updates
  • 15:20 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2049 - T187534 (duration: 00m 57s)
  • 15:20 moritzm: powercycling thumbor1004, stuck during reboot
  • 15:19 ottomata: beginning migration of varnishkafka webrequest upload from Kafka analytics to kafka jumbo
  • 15:11 ema: upgrade cache_text@esams to varnish 5 T184448
  • 15:02 gilles: EU SWAT finished
  • 15:02 gilles@tin: Synchronized private/PrivateSettings.php.example: Thumbor private wiki support deployment: Set up separate Thumbor Swift user for private containers (T187822) (duration: 00m 55s)
  • 15:00 gilles@tin: Synchronized wmf-config/filebackend.php: Thumbor private wiki support deployment: (T187822) (duration: 00m 56s)
  • 14:52 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Fix: Add missed line in wgLogo (T185977) (duration: 00m 56s)
  • 14:44 moritzm: rebooting thumbor in eqiad for kernel security update
  • 14:31 bblack: puppet disable on RPS-using hosts to be careful with RPS hosts https://gerrit.wikimedia.org/r/#/c/414676/ - cp*, lvs*, labstore
  • 14:27 chasemp: silence labvirt1019/1020 in icinga
  • 14:24 ariel@tin: Finished deploy [dumps/dumps@9b7841f]: fix off-by-one error in prefetch stubs generation (duration: 00m 04s)
  • 14:23 ariel@tin: Started deploy [dumps/dumps@9b7841f]: fix off-by-one error in prefetch stubs generation
  • 14:15 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Add new throttle rule (T188292) New throttle rule for cswiki (T187990) New throttle rule (T188034) (duration: 00m 57s)
  • 14:05 marostegui: Update tendril shard table for the "tendril" replication topology - T184704
  • 13:33 gehel: starting rolling restart of elasticsearch / cirrus codfw (config changes + kernel upgrade)
  • 13:25 moritzm: rebooting thumbor in codfw for kernel security update
  • 13:22 godog: upload ruby-mysql 2.9.1-1~bpo9+1 to stretch-wikimedia - T184562
  • 13:00 Amir1: inserting wikidata-related interwikis to site_identifiers table using eval.php in enwiki (T183019)
  • 12:35 marostegui: Remove /srv/tmp/dbstore1001 files from es1017 to free up space - T186596
  • 12:16 Hauskatze: The global rename: Darkweasel94 → Tokfo has FINISHED - T187629
  • 11:56 moritzm: rebooting mw1221-mw1235 (API servers) for kernel security update
  • 11:08 moritzm: rebooting mw1240-mw1258 (app servers) for kernel security update
  • 11:00 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=scb1003.eqiad.wmnet
  • 10:57 moritzm: keeping scb1003 depooled for T188385
  • 10:51 _joe_: updating python-conftool everywhere to 1.0.0
  • 10:51 _joe_: uploaded python-conftool 1.0.0 to stretch-wikimedia
  • 10:49 moritzm: powercycling scb1003, stuck during reboot
  • 10:29 Hauskatze: Starting big global rename: Darkweasel94 → Tokfo - with DBA/OPS green light - T187629
  • 10:07 akosiaris: poweroff sca1004 for T181121 tests
  • 10:05 moritzm: reboot scb in eqiad for kernel security updates
  • 10:03 _joe_: uploading conftool-1.0.0-1 to jessie-wikimedia
  • 09:16 godog: reimage rhodium - T184562
  • 08:42 gehel: powercycling wdqs1004 - T188045
  • 08:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1084 (duration: 00m 56s)
  • 08:24 gilles@tin: Synchronized private/PrivateSettings.php: Separate Thumbor Swift user for private containers (duration: 00m 56s)
  • 08:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1084 and db1103:3312 (duration: 00m 56s)
  • 07:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1084 and db1103:3312 (duration: 00m 56s)
  • 07:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1084 and db1103:3312 (duration: 00m 56s)
  • 07:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1084 (duration: 00m 56s)
  • 07:04 marostegui: Stop MySQL on db1084 for kernel and mariadb upgrade
  • 07:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1084 (duration: 00m 56s)
  • 07:02 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db1084 (duration: 00m 56s)
  • 06:59 demon@tin: Synchronized README: no-op (duration: 00m 56s)
  • 06:51 marostegui@tin: Synchronized wmf-config/db-codfw.php: Increase traffic for db1103:3312 (duration: 00m 56s)
  • 06:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Slowly repool db1103:3312 (duration: 00m 56s)
  • 06:33 marostegui: Deploy schema change on dbstore1002 - T187089 T185128 T153182
  • 06:21 marostegui: Stop MySQL on db1115 to copy it to db2093 - tendril (dbtree) service will be down for this maintenance - T184704
  • 06:20 marostegui: Reload haproxy on dbproxy1005
  • 05:26 krinkle@tin: Synchronized wmf-config/profiler.php: I1e7dc263b43 (duration: 00m 56s)
  • 05:00 krinkle@tin: Synchronized wmf-config/profiler.php: I34687c0569af (duration: 00m 57s)
  • 03:28 krinkle@tin: Synchronized wmf-config/profiler.php: various refactor and clean up for T180183 (no-op) (duration: 00m 54s)
  • 03:12 krinkle@tin: Synchronized wmf-config/profiler-labs.php: beta only (no-op) (duration: 00m 56s)
  • 02:58 demon@tin: Pruned MediaWiki: 1.31.0-wmf.21 [keeping static files] (duration: 01m 24s)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.22) (duration: 06m 11s)
  • 01:39 mutante: install1002 - re-enabling disabled puppet
  • 00:55 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: add very likely bad faith filter on svwiki (T174560) (duration: 00m 57s)
  • 00:49 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ORES filters on svwiki (T174560) (duration: 00m 56s)
  • 00:40 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ORES filters on simplewiki (T182012) (duration: 00m 56s)
  • 00:39 demon@tin: Synchronized wmf-config/CommonSettings.php: beta-only change: lsctorestaticarray (duration: 00m 56s)
  • 00:19 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable RemexHtml on all wikinews wikis (T188000), all private wikis (T188009), test2wiki, loginwiki, votewiki and wikimania2017wiki (T188008) (duration: 00m 56s)

2018-02-26

  • 23:37 bd808@tin: Finished scap: wikitech: use 'labswiki' database on m5-master (T188029) (duration: 03m 21s)
  • 23:34 bd808@tin: Started scap: wikitech: use 'labswiki' database on m5-master (T188029)
  • 23:31 bd808: Pulled T188029 change to silver
  • 22:57 demon@tin: Synchronized wmf-config/: fileimporter/fileexporter improvements (duration: 00m 58s)
  • 22:56 demon@tin: Synchronized wmf-config/InitialiseSettings.php: fileimporter/fileexporter improvements (duration: 00m 57s)
  • 22:09 andrewbogott: hotfixed mediawiki on silver to use m5-master for wikitech. This will be finalized with the merge of https://gerrit.wikimedia.org/r/#/c/414733/
  • 22:07 andrewbogott: made mysql on silver read-only, hopefully for good. T188029
  • 22:05 andrewbogott: logging a log to test logging a log
  • 22:03 andrewbogott: testing the log by logging a test
  • 19:46 catrope@tin: Synchronized php-1.31.0-wmf.22/extensions/WikibaseQualityConstraints/: T184937 (duration: 01m 03s)
  • 19:46 mutante: running puppet on cache::misc servers to add new director for design.wm
  • 19:29 catrope@tin: Synchronized wmf-config/CommonSettings.php: Simplify 2017 wikitext editor config (part 1) (duration: 00m 54s)
  • 19:26 catrope@tin: Synchronized wmf-config/throttle.php: Add throttle rule (T188129) (duration: 00m 56s)
  • 19:22 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Add mushroomobserver.org to wgCopyUploadsDomains (T188203) (duration: 00m 57s)
  • 19:08 herron: codfw puppet master kernel updates complete re-enabling puppet agents
  • 18:31 gehel@tin: Finished deploy [wdqs/wdqs@f74cbd1]: new forAllCategoryWikis.sh (duration: 06m 28s)
  • 18:24 gehel@tin: Started deploy [wdqs/wdqs@f74cbd1]: new forAllCategoryWikis.sh
  • 18:13 demon@tin: Synchronized wmf-config/CommonSettings.php: ExtensionDistributor: Ignore empty repositories (duration: 00m 56s)
  • 17:34 jynus: deploying new query killer to db1109
  • 17:32 akosiaris: shutdown sca1004 on ganeti1005 for T181121
  • 16:39 andrewbogott: making wikitech read-only (via a local patch) while I migrate the database to m5
  • 16:33 marostegui: Reboot db1111 storage crashed - T187526
  • 16:31 papaul: Maintenance: removing Msw-d4-codfw for replacement:T187534
  • 16:29 mutante: restarted stashbot on toolforge because it didn't react to !log
  • 16:26 mutante: test !log
  • 16:08 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2049 - T187534 (duration: 00m 56s)
  • 15:45 andrewbogott: made wikitech read/write again pending a bit more preliminary work
  • 15:43 cmjohnson1: swapping failed disk db1068
  • 15:42 andrewbogott: marking wikitech read-only (via a local edit to CommonSettings.php) for https://phabricator.wikimedia.org/T188029
  • 15:32 addshore: EU SWAT done
  • 15:31 addshore@tin: Finished scap: Updated mediawiki/extensions/AdvancedSearch i18n files for some translations (duration: 11m 29s)
  • 15:19 addshore@tin: Started scap: Updated mediawiki/extensions/AdvancedSearch i18n files for some translations
  • 15:12 Amir1: This might have performance implications roll it back if it affects these wikis too much
  • 15:12 gehel: reboot of relforge completed, cluster is green again
  • 15:11 ladsgroup@tin: Synchronized wmf-config/Wikibase-production.php: Enable reading full entity id from wb_terms table in three wikis (T114903) (duration: 00m 56s)
  • 14:54 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Add patrol rights/groups to fawikisource (T187662) (duration: 00m 56s)
  • 14:52 gehel: rebooting relforge for kernel upgrade
  • 14:50 godog: upload puppetdb 4.4.0-1~wmf1 to stretch-wikimedia - T177253
  • 14:48 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable statement usage tracking in several wikis (T151717) (duration: 00m 57s)
  • 14:40 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add namespaces to urwiktionary (T186393) (duration: 00m 56s)
  • 14:28 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Enable caching of constraint check results (T184812) (duration: 00m 55s)
  • 14:15 moritzm: rebooting scb in codfw for kernel security updates
  • 14:10 zfilipin@tin: Synchronized php-1.31.0-wmf.22/extensions/UniversalLanguageSelector/maintenance/ULSCompactLinksDisablePref.php: SWAT: Added option to continue script from particular User ID Use a replica dedicated to slow queries (if available) (T187880) (duration: 00m 58s)
  • 13:09 moritzm: rebooting video scalers in eqiad for kernel security update
  • 11:12 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 57s)
  • 11:11 jdrewniak@tin: Synchronized portals/prod/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 58s)
  • 11:01 moritzm: powercycling mw1264 (stuck after reboot)
  • 10:10 moritzm: rebooting mw canaries for kernel security update
  • 09:29 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2055 and db2070 (duration: 00m 55s)
  • 09:23 elukey: copied burrow 0.1 from jessie-wikimedia to stretch-wikimedia
  • 08:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1103:3314 (duration: 00m 56s)
  • 07:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1103:3314 (duration: 00m 56s)
  • 07:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1103:3314 (duration: 00m 56s)
  • 07:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1103:3314 after mariadb and kernel upgrade (duration: 00m 56s)
  • 07:08 marostegui: Deploy schema change on db1103:3312 - T187089 T185128 T153182
  • 06:59 marostegui: Stop MySQL on db1103:3312 and 3314 to upgrade it and kernel
  • 06:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1103:3314 (duration: 00m 54s)
  • 06:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1103:3312 (duration: 00m 56s)
  • 06:35 marostegui: Stop MySQL db2070 and db2055 to copy data to db2055 (and upgrade kernel and mariadb)
  • 06:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2055 and db2070 (duration: 01m 07s)
  • 06:15 marostegui: Stop MySQL on db1115 tendril database to copy it to db2093. Tendril (dbtree) service will be down for maintenance - T184704
  • 02:55 XioNoX: labs->cloud vlan rename in codfw - T187933
  • 02:43 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.22) (duration: 07m 12s)
  • 02:15 XioNoX: disabling ALGs on MR routers

2018-02-25

  • 07:35 marostegui: Fix s7 replication on labsdb1010 - T186579

2018-02-24

  • 06:11 marostegui: Reload haproxy on dbproxy1005
  • 01:42 demon@tin: Synchronized docroot/noc/conf/highlight.php: one last time (duration: 00m 57s)
  • 01:18 demon@tin: Synchronized docroot/noc/conf/index.php: fix dblist links from listing (duration: 00m 56s)
  • 01:13 Reedy: added eqsin ipv6 range to botpasswords ip range restriction T188111
  • 01:08 demon@tin: Synchronized docroot/noc/: dblists cleanup (duration: 00m 57s)
  • 01:07 demon@tin: Synchronized tests/: no-op (duration: 00m 59s)

2018-02-23

  • 22:36 demon@tin: Finished deploy [gerrit/gerrit@010ad50]: no-op, removing permission from file (duration: 00m 10s)
  • 22:35 demon@tin: Started deploy [gerrit/gerrit@010ad50]: no-op, removing permission from file
  • 21:27 demon@tin: Finished scap: pos mysql code (duration: 23m 09s)
  • 21:04 demon@tin: Started scap: pos mysql code
  • 20:48 demon@tin: rebuilt and synchronized wikiversions files: group2 to wmf.22
  • 20:39 no_justification: wmf.21, that is
  • 20:38 demon@tin: rebuilt and synchronized wikiversions files: roll wikidatawiki back to wmf.11, busted
  • 20:35 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.22
  • 19:10 ebernhardson: restart relforge elasticsearch cluster to test entity extraction on larger dataset
  • 18:28 Amir1: mwscript extensions/Wikibase/lib/maintenance/populateSitesTable.php --wiki=enwiki --force-protocol https (T183019)
  • 17:22 ema: libvmod-netmapper 1.6-1 uploaded to apt.w.o/experimental T188089
  • 16:37 moritzm: rebooting image scalers in codfw for kernel security updates
  • 16:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1083 (duration: 01m 14s)
  • 15:58 moritzm: rebooting job runners in codfw for kernel security updates
  • 15:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1083 (duration: 02m 21s)
  • 15:15 jynus: about to deploy gerrit:413375 disabling puppet on affected hosts
  • 14:59 elukey: update facts on puppet compiler
  • 14:40 moritzm: installing kernel updates on API servers in codfw
  • 14:09 jynus: restarting tendril database - will cause unavailability of dbtree for a while
  • 13:44 moritzm: reboot ocg1003 for tests
  • 13:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1083 and fully repool db1076 (duration: 01m 13s)
  • 12:28 hashar@tin: Synchronized wmf-config/throttle.php: Define new throttle rule - T188090 (duration: 01m 11s)
  • 12:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1083 (duration: 01m 21s)
  • 12:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1076 - T186321 (duration: 01m 12s)
  • 11:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1076 - T186321 (duration: 01m 13s)
  • 11:29 marostegui: Restart mariadb on db1076 for binlog format change - T186321
  • 11:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1076 for binlog format change - T186321 (duration: 01m 08s)
  • 11:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1090 after alter table (duration: 01m 12s)
  • 11:02 moritzm: installing kernel updates on mw* in codfw
  • 10:30 hashar: releases1001: sudo -u jenkins rm -fR /var/lib/jenkins/jobs/mediawiki-private-nightlies/workspace/BRANCH/REL1_??/mediawiki-snapshot-REL1_??-2018???? # T188080
  • 10:19 jynus@tin: Synchronized wmf-config/db-eqiad.php: Pool db2090 for the first time (duration: 01m 12s)
  • 10:08 jynus@tin: Synchronized wmf-config/db-codfw.php: Pool db2090 for the first time (duration: 01m 12s)
  • 10:01 elukey: restart hhvm on mw1230
  • 09:54 elukey: restart hhvm on mw1286
  • 09:50 elukey: restart hhvm on mw1227
  • 08:05 marostegui: MariaDB and kernel upgrade on db1083
  • 07:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1083, fully repool db1089 - T162807 (duration: 01m 12s)
  • 06:55 marostegui: Reboot db2093 to test /srv auto-mounting
  • 06:40 marostegui: Deploy schema change on db1090 - T187089 T185128 T153182
  • 06:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1090 for alter table (duration: 01m 13s)
  • 05:58 mutante: puppetmaster1001 - signing puppet certs for kafkamon1001/kafkamon2001 - initial puppet runs, adding as role spare (T187901)
  • 05:40 mutante: ganeti1004 - initial startup of kafkamon1001 - booting to PXE, installing stretch (T187901)
  • 04:56 mutante: ganeti: ganeti2004 - creating new VM kafkamon2001 - vcpus=2,memory=8g,disk=60G, row_A codfw (T187901)
  • 04:53 mutante: ganeti: creating new VM kafkamon1001 - vcpus=2,memory=8g,disk=60G, row_A eqiad (T187901)
  • 02:46 demon@tin: Finished deploy [gerrit/gerrit@23ebf75]: deploying webhooks plugin (duration: 00m 10s)
  • 02:46 demon@tin: Started deploy [gerrit/gerrit@23ebf75]: deploying webhooks plugin
  • 02:10 demon@tin: Synchronized docroot/: mw.org docroot moving (duration: 01m 13s)
  • 01:45 eileen: update process control process-control config revision is 1605238b2e
  • 01:20 eileen: update civicrm revision changed from aa251f1a93 to a47eafcbad, config revision is c1787646bc
  • 01:19 demon@tin: Synchronized static/favicon/: smaller favicons (duration: 01m 12s)
  • 01:13 demon@tin: Synchronized wmf-config/InitialiseSettings.php: point mkwikt favicon to en version, dupe (duration: 01m 15s)
  • 01:08 demon@tin: Synchronized wmf-config/InitialiseSettings.php: rtl wikibooks logo (duration: 01m 13s)
  • 01:06 demon@tin: Synchronized static/favicon/wikibooks-rtl.ico: rtl wikibooks logo (duration: 01m 12s)
  • 00:52 demon@tin: Synchronized static/images/project-logos/: new project logos for urdu wikt (duration: 01m 13s)
  • 00:37 krinkle@tin: Synchronized docroot/m.wikipedia.org/w/mobilelanding.php: Ia54cd7 - rm use of MW_LANG (duration: 01m 13s)

2018-02-22

  • 22:33 demon@tin: Synchronized php-1.31.0-wmf.22/includes/filerepo/file/LocalFile.php: Id5cdd8ec (duration: 01m 12s)
  • 22:32 demon@tin: Synchronized php-1.31.0-wmf.22/includes/externalstore/: Id5cdd8ec (duration: 01m 12s)
  • 22:30 demon@tin: Synchronized php-1.31.0-wmf.22/includes/Storage/: Id5cdd8ec (duration: 01m 13s)
  • 22:16 maxsem@tin: Synchronized php-1.31.0-wmf.21/extensions/SyntaxHighlight_GeSHi/: T188019 (duration: 01m 12s)
  • 22:14 maxsem@tin: Synchronized php-1.31.0-wmf.22/extensions/SyntaxHighlight_GeSHi/: T188019 (duration: 01m 14s)
  • 21:51 demon@tin: Synchronized php-1.31.0-wmf.22/includes/externalstore/: I9334d36e (duration: 01m 15s)
  • 21:37 dzahn@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1004.eqiad.wmnet
  • 21:11 gehel: powercycling wdqs1004 (complete loss of network)
  • 20:39 demon@tin: Synchronized php-1.31.0-wmf.22/includes/libs/objectcache/WANObjectCache.php: betterer logging for cache ttl reduction, Iea029e78 (duration: 01m 13s)
  • 19:33 XioNoX: redirecting Facebook bots large source of traffic to codfw ( https://gerrit.wikimedia.org/r/#/c/413446/ )
  • 19:14 akosiaris: rolling restart of eqiad appservers. sudo cumin -b3 -s 30 'A:mw-eqiad' 'restart-hhvm' T188019
  • 19:12 twentyafterfour@tin: Synchronized php-1.31.0-wmf.22/extensions/SyntaxHighlight_GeSHi: deploy https://gerrit.wikimedia.org/r/#/c/413437/ (duration: 01m 13s)
  • 19:09 twentyafterfour@tin: Synchronized php-1.31.0-wmf.21/extensions/SyntaxHighlight_GeSHi: deploy https://gerrit.wikimedia.org/r/#/c/413437/ (duration: 01m 13s)
  • 19:09 twentyafterfour: syncing https://gerrit.wikimedia.org/r/#/c/413437/
  • 19:03 chasemp: baham:~# authdns-update
  • 19:00 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2073 (duration: 01m 12s)
  • 17:23 elukey: installed linux-perf-4.9 on phab1001 to experiment with perf tracing
  • 17:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1076 (duration: 01m 12s)
  • 17:05 XioNoX: rolling back "redirecting ns2 traffic to radon"
  • 17:02 ema: reboot eeden with new kernel 4.9.0-0.bpo.6
  • 16:58 XioNoX: redirecting ns2 traffic to radon
  • 16:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1076 (duration: 01m 12s)
  • 16:28 ejegg: updated CiviCRM from b27e6a5019 to aa251f1a93
  • 16:26 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Use EventBus for refreshLinks in test wikis, file 2/2 - T185052 (duration: 01m 12s)
  • 16:25 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Use EventBus for refreshLinks in test wikis, file 1/2 - T185052 (duration: 01m 12s)
  • 16:23 ppchelko@tin: Finished deploy [cpjobqueue/deploy@ab3d002]: Enable refreshLinks for group0 wikis T185052 (duration: 00m 36s)
  • 16:23 ppchelko@tin: Started deploy [cpjobqueue/deploy@ab3d002]: Enable refreshLinks for group0 wikis T185052
  • 16:22 mobrovac@tin: scap failed: average error rate on 8/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 16:13 jynus: tendril and dbtree database currently under maintenance
  • 16:04 ejegg: updated payments-wiki from fe311c2d26 to 1acfc4a9a0
  • 15:26 ema: finished upgrading cache_text@ulsfo to varnish 5
  • 15:24 elukey: manually removing from cp1008 and cache::misc old files related to the varnishkafka jumbo testing instance (after https://gerrit.wikimedia.org/r/413370)
  • 14:58 matthiasmullie: EU SWAT finished
  • 14:52 mlitn@tin: Synchronized wmf-config/CommonSettings.php: Enable 3D file display (duration: 01m 12s)
  • 14:50 mlitn@tin: Synchronized php-1.31.0-wmf.21/extensions/3D/extension.json: Remove MMV dependency for 3D (duration: 01m 12s)
  • 14:41 ottomata: beginning migration of webrequest_misc from Kafka analytics to jumbo: T185136
  • 14:40 mlitn@tin: scap failed: average error rate on 3/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 14:38 mlitn@tin: Synchronized wmf-config/InitialiseSettings.php: Enable 3D file display (duration: 01m 13s)
  • 14:32 jmm@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2171.codfw.wmnet
  • 14:26 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Show HTML summaries on cswiki (T182321) (duration: 01m 13s)
  • 13:41 ema: bounce pybal on lvs1003 to try establish missing etcd connections (zotero, thumbor, wdqs) https://phabricator.wikimedia.org/P6730
  • 13:30 moritzm: rebooting kubernetes1001
  • 13:21 ema: upgrade pybal on lvs1003 to 1.14.4
  • 12:42 _joe_: ended live-hacking on mwdebug1001 (T185078)
  • 12:24 _joe_: live-hacking ProductionServices.php on mwdebug1001 for testing (T185078)
  • 11:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1089 and slowly repool db1076 (duration: 01m 12s)
  • 11:40 kartik@tin: Finished deploy [cxserver/deploy@300f728]: Update cxserver to b0404d1 (duration: 03m 37s)
  • 11:39 akosiaris: purge ORES from scb hosts T168073 T171851
  • 11:37 kartik@tin: Started deploy [cxserver/deploy@300f728]: Update cxserver to b0404d1
  • 11:19 _joe_: upgrading python-conftool on all cache hosts
  • 10:55 ema: upgrading python-conftool on cp5007
  • 10:51 _joe_: upgrading python-conftool on cp1008
  • 10:42 jynus: stop db2073 for maintenance
  • 10:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1089 and fully repool db1104 (duration: 01m 13s)
  • 10:37 _joe_: benchmarking EtcdConfig failure scenarios on mwdebug1001, T185078
  • 10:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1104 - T186321 (duration: 01m 14s)
  • 10:18 ema: upgrade cache_text @ ulsfo to varnish 5
  • 10:13 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2073 for maintenance (duration: 01m 12s)
  • 10:08 moritzm: uploaded Linux 4.9.82-1~wmf1 for jessie-wikimedia to apt.wikimedia.org (retpoline-enabled kernel)
  • 10:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 with low traffic and depool db1067 - T162807 (duration: 01m 12s)
  • 09:59 akosiaris: reboot kraz.wikimedia.org (irc.wikimedia.org)
  • 09:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1104 - T186321 (duration: 01m 12s)
  • 09:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1104 - T186321 (duration: 01m 12s)
  • 09:20 marostegui: Stop MySQL on db1104 to switch its binlog to statement - T186321
  • 09:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1104 - T186321 (duration: 01m 13s)
  • 09:19 moritzm: rebooting multatuli
  • 09:03 ema: eqiad LVSs: upgrade pybal to 1.14.4
  • 08:48 jynus: tendril and dbtree database currently under maintenance
  • 08:47 ema: codfw LVSs: upgrade pybal to 1.14.4
  • 08:35 marostegui: Stop tendril database (db1011) to copy it to db1115 - tendril will be offline while the copy is in progress - T184704
  • 08:32 ema: esams LVSs: upgrade pybal to 1.14.4
  • 08:24 ema: ulsfo LVSs: upgrade pybal to 1.14.4
  • 08:05 marostegui: Disable puppet on db1011 - T184704
  • 07:48 krinkle@tin: Synchronized wmf-config/FeaturedFeedsWMF.php: I73945d7d - minor clean-up (duration: 01m 13s)
  • 07:32 _joe_: starting tests on mwdebug1001 again
  • 07:32 marostegui: Deploy schema change on db1076 - T187089 T185128 T153182
  • 07:24 marostegui: Stop MySQL on db1076 for mariadb and kernel upgrade + alter table
  • 07:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1076 for alter table (duration: 01m 14s)
  • 06:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105 (duration: 01m 13s)
  • 06:21 marostegui: Stop puppet and mysql on db1011 to get ready to copy its data to db1115 - T184704
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.21) (duration: 05m 53s)
  • 01:05 anomie: Running cleanupBlocks.php on more wikis for T187834: alswiki bgwiki bhwiki cawiki dewiki elwiki eswiki frwiki hewiki hiwiki huwiki hywiki jawiki jawikibooks jawikinews jawikiquote jawikisource jawiktionary kawiki kowiki mswiki mswiktionary rowiki sourceswiki
  • 01:01 anomie: Running cleanupBlocks.php on mediawikiwiki for T187834
  • 00:46 smalyshev@tin: Finished deploy [wdqs/wdqs@5131080]: update whitelist to include categories namespace (duration: 03m 07s)
  • 00:43 smalyshev@tin: Started deploy [wdqs/wdqs@5131080]: update whitelist to include categories namespace
  • 00:41 smalyshev@tin: Finished deploy [wdqs/wdqs@5131080]: update whitelist to include categories namespace (duration: 00m 27s)
  • 00:40 smalyshev@tin: Started deploy [wdqs/wdqs@5131080]: update whitelist to include categories namespace
  • 00:25 tgr@tin: Synchronized wmf-config/CommonSettings-labs.php: T57420 enable loginOnly flag in beta (duration: 01m 12s)
  • 00:23 mholloway-shell@tin: Finished deploy [mobileapps/deploy@8ffb03b]: Update mobileapps to a1339a9 (duration: 06m 05s)
  • 00:17 mholloway-shell@tin: Started deploy [mobileapps/deploy@8ffb03b]: Update mobileapps to a1339a9
  • 00:13 demon@tin: Synchronized php-1.31.0-wmf.22/includes/media/JpegMetadataExtractor.php: T184048 (duration: 01m 13s)
  • 00:12 demon@tin: Synchronized php-1.31.0-wmf.21/includes/media/JpegMetadataExtractor.php: T184048 (duration: 01m 21s)
  • 00:00 mutante: LDAP - added uid 'raz-shuty' to group 'wmde' (T187442)

2018-02-21

  • 21:50 elukey: restart hhvm on mw1224 - high load alarms
  • 21:46 elukey: restart hhvm on mw1235 - high load alarms
  • 21:44 elukey: restart hhvm on mw1233 - high load alarms
  • 21:39 awight@tin: Finished deploy [ores/deploy@addba9c]: T187914 on the scb* cluster (duration: 10m 02s)
  • 21:34 elukey: restart hhvm on mw1232 - high load alarms
  • 21:30 ppchelko@tin: Finished deploy [restbase/deploy@56fffcf]: Do not check for article deletion for update requests T181636 (duration: 15m 59s)
  • 21:30 elukey: restart hhvm on mw1229 - high load alarms
  • 21:29 awight@tin: Started deploy [ores/deploy@addba9c]: T187914 on the scb* cluster
  • 21:28 awight@tin: Finished deploy [ores/deploy@7bbf21f]: T187914 on the ores* cluster (duration: 13m 03s)
  • 21:27 elukey: restart hhvm on mw1227 - high load alarms
  • 21:23 elukey: restart hhvm on mw1221 - high load alarms
  • 21:15 awight@tin: Started deploy [ores/deploy@7bbf21f]: T187914 on the ores* cluster
  • 21:14 ppchelko@tin: Started deploy [restbase/deploy@56fffcf]: Do not check for article deletion for update requests T181636
  • 20:53 twentyafterfour: MediaWiki Train for 1.31.0-wmf.22 is blocked by T187942
  • 20:39 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.21 (duration: 01m 12s)
  • 20:38 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.21
  • 20:34 twentyafterfour: rolling back group1 to wmf.21
  • 20:28 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.22 (duration: 01m 08s)
  • 20:27 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.22
  • 20:10 mutante: phab2001 - testing phab restart cron
  • 19:34 ebernhardson@tin: Synchronized wmf-config/PoolCounterSettings.php: Increase pool counter workers for cirrus namespace lookup (duration: 01m 13s)
  • 19:24 ottomata: applying changes to kafkatee module, first rhenium then oxygen. will require manual config fixings
  • 18:59 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add sitename for Burmese Wiktionary T187882 (duration: 01m 06s)
  • 18:48 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add namespace localization for sdwiki T186943 (duration: 01m 13s)
  • 18:39 thcipriani@tin: Synchronized wmf-config/throttle.php: SWAT: Added new throttle rule for Wikipedia Women in Red editathon T187803 (duration: 01m 12s)
  • 18:37 chasemp: labsdb rm -fR /usr/local/lib/mediawiki-config && puppet agent --test
  • 18:24 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set Topic namespace alias of zhwiki T187546 (duration: 01m 13s)
  • 18:12 _joe_: stopped testing on mwdebug1001 for SWAT window
  • 17:43 ema: eqsin LVSs: upgrade pybal to 1.14.4
  • 17:34 _joe_: resuming tests on mwdebug1001
  • 17:17 ema: eqiad LVSs: bounce pybal for labweb proxyfetch config changes
  • 17:12 ppchelko@tin: Finished deploy [changeprop/deploy@e9a6bb0]: Use post for ORES precache rules T158437 (duration: 01m 23s)
  • 17:11 ppchelko@tin: Started deploy [changeprop/deploy@e9a6bb0]: Use post for ORES precache rules T158437
  • 17:07 _joe_: finished testing on mwdebug1001 for swat
  • 16:56 oblivian@puppetmaster1001: conftool action : edit; selector: name=ReadOnly,scope=eqiad
  • 16:40 _joe_: testing various etcd failure scenarios on mwdebug1001, T185078
  • 16:39 ppchelko@tin: Finished deploy [changeprop/deploy@1be63aa]: Simplify ORES precaching by using the new endpoint T158437 (duration: 01m 33s)
  • 16:37 ppchelko@tin: Started deploy [changeprop/deploy@1be63aa]: Simplify ORES precaching by using the new endpoint T158437
  • 16:27 ema: lvs1010: restart pybal
  • 16:00 godog: restart rsyslogd on lithium and wezen - T136312
  • 15:50 gilles@tin: Synchronized wmf-config/filebackend.php: Thumbor private wiki support deployment: Serve private wiki thumbnails with Thumbor (T169144) (duration: 01m 12s)
  • 15:44 no_justification: pruned old 1.29.x and 1.30.x versions that somehow stuck around. Also 1.31.0-wmf.* cache/ directories for unused branches. T157030
  • 15:37 gilles@tin: Synchronized wmf-config/filebackend.php: Thumbor private wiki support deployment: Serve officewiki thumbnails with Thumbor (T169144) (duration: 01m 11s)
  • 15:27 gilles@tin: Synchronized private/PrivateSettings.php.example: Thumbor private wiki support deployment: Add Thumbor/Mediawiki shared secret (T169144) (duration: 01m 11s)
  • 15:24 chasemp: reboot labtestservices2002
  • 15:24 gilles@tin: Synchronized wmf-config/filebackend.php: Thumbor private wiki support deployment: Add Thumbor/Mediawiki shared secret (T169144) (duration: 01m 12s)
  • 15:19 gilles: Thumbor private wiki support deployment
  • 15:08 zeljkof: EU SWAT finished
  • 15:08 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Removing Mobile beta feedback link (T187712) (duration: 01m 12s)
  • 15:03 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable Page Previews EventLogging instrumentation (T185973) (duration: 01m 13s)
  • 14:52 _joe_: rolling restart another 4 api appservers
  • 14:49 oblivian@tin: Synchronized wmf-config: Serve configuration to mwdebug hosts via etcd (duration: 01m 16s)
  • 14:42 _joe_: restarted hhvm on mwdebug1001 too
  • 14:38 _joe_: restarting hhvm on mwdebug1002
  • 14:06 _joe_: restarting hhvm on misbehaving api appservers
  • 14:02 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Add new throttle rule (T187870) (duration: 01m 13s)
  • 13:28 marostegui: Reboot db2092 for a kernel upgrade
  • 13:26 moritzm: powercycling ganeti1007
  • 12:43 _joe_: rolling restart of hhvm on api servers under high load
  • 12:38 elukey: restart hhvm on mw1234 - high load
  • 12:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Clarify that db1067 is now s1 candidate master - T186321 (duration: 01m 13s)
  • 12:26 elukey: restart hhvm on mw1231 - high load, hhvm-dump-debug in /home/elukey/hhvm.6759.bt
  • 12:21 elukey: restart hhvm on mw1227 - high load, hhvm-dump-debug in /home/elukey/hhvm.23382.bt
  • 12:10 moritzm: uploading retpoline-enabled gcc-4.9 to apt.wikimedia.org / jessie-wikimedia to be able to use it on boron for building Linux (trying to adapt our pbuilder setup to also include security.debian.org ran into a few proxy-related problems and this is really a rare corner case anyway)
  • 12:02 ema: lvs5003: pybal upgraded to 1.14.4
  • 12:01 ema: pybal 1.14.4 uploaded to apt.w.o
  • 11:17 moritzm: installing db5.3 security updates
  • 11:12 jynus: cloning db2011 to db2044
  • 10:40 kart_: Finished running CLL preference migration script dry-run on terbium (T187677)
  • 10:33 marostegui: Reload haproxy on dbproxy1005 - T187722
  • 10:26 marostegui: Remove db2030 from tendril - T187768
  • 10:09 moritzm: installing openssh bugfix updates from jessie/stretch point releases
  • 10:01 kart_: Running CLL preference migration script dry-run on terbium (T187677)
  • 09:46 moritzm: installing dbus updates from stretch point release
  • 09:23 moritzm: installing sqlite security updates on stretch
  • 08:35 godog: roll-restart thumbor in codfw and eqiad to apply https://gerrit.wikimedia.org/r/c/412980
  • 08:20 gilles: foreachwikiindblist "% private.dblist" extensions/WikimediaMaintenance/filebackend/setZoneAccess.php --backend=local-multiwrite --private
  • 07:20 marostegui: Stop Mariadb on db1108 for kernel upgrade
  • 06:36 marostegui: Deploy schema change on db1105:3312 - T187089 T185128 T153182
  • 06:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105 for alter table (duration: 01m 17s)
  • 05:00 eileen: enable major gifts address job
  • 04:41 eileen: update civicrm revision changed from 43a7641597 to b27e6a5019, config revision is ef884a2c5d
  • 04:13 andrew@tin: Finished deploy [horizon/deploy@0e7783d]: updating branded graphics slightly more (duration: 02m 45s)
  • 04:10 andrew@tin: Started deploy [horizon/deploy@0e7783d]: updating branded graphics slightly more
  • 03:34 andrew@tin: Finished deploy [horizon/deploy@0e28f49]: updating branded graphics (duration: 02m 49s)
  • 03:31 andrew@tin: Started deploy [horizon/deploy@0e28f49]: updating branded graphics
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.21) (duration: 06m 18s)
  • 02:15 no_justification: running `initSiteStats.php --update` for all wikis in medium.dblist. T187845
  • 02:01 no_justification: running `initSiteStats.php --update` for all wikis in small.dblist. T187845
  • 01:54 no_justification: WikipediaMobileFirefoxOS submodule references caused labsdb* (and related) puppet failures. They should recover now (self reverted my docroot changes). Filed T187850
  • 01:51 demon@tin: Synchronized docroot/: revert docroot improvements. some servers don't like improvements (duration: 01m 12s)
  • 01:36 demon@tin: Synchronized docroot/: Swapping wikimedia.org docroot for symlink (second try, old WPFirefoxMobileOS cleanup was still needed) (duration: 01m 12s)
  • 01:16 eileen: update civicrm revision changed from efba904b06 to 43a7641597, config revision is ef884a2c5d
  • 01:10 cwd: disabled process-control
  • 01:08 eileen: start outage to upgrade civicrm to 4.7.31
  • 00:56 mutante: gerrit2001 - restarted gerrit to test that gerrit:411397 and gerrit:411394 don't break anything - didn't touch cobalt right now to minimize affecting users and their logins
  • 00:43 thcipriani@tin: Synchronized wmf-config/abusefilter.php: SWAT: Allow CheckUsers and Stewards to access private data from the AbuseLog T160357 (duration: 01m 12s)
  • 00:29 thcipriani@tin: Synchronized php-1.31.0-wmf.21/includes/page/WikiPage.php: SWAT: site_stats: Unbreak counting newly created pages (duration: 01m 12s)
  • 00:26 thcipriani@tin: Synchronized php-1.31.0-wmf.21/resources/src/mediawiki/mediawiki.ForeignStructuredUpload.js: SWAT: Follow-up I0bb4ed7f7: Use correct "this" T187523 (duration: 01m 13s)
  • 00:14 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable x-kill feature everywhere T186714 T184322 (duration: 01m 13s)

2018-02-20

  • 22:58 ejegg: restarted donations queue consumer
  • 22:26 ejegg: turned off donations queue consumer for timing test
  • 22:25 demon@tin: Synchronized php-1.31.0-wmf.21/extensions/Thanks/modules/ext.thanks.revthank.js: T187757 (duration: 01m 14s)
  • 22:20 chasemp: T184209 create labs-instance-transport1-b-codfw
  • 22:06 eileen: update civicrm revision changed from 915a4419c8 to efba904b06, config revision is 8c7ce87207 (extended report update for regex)
  • 21:44 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group0 wikis to 1.31.0-wmf.22
  • 21:39 no_justification: ran `namespaceDupes.php --wiki=enwikiversity` for T187660
  • 21:18 twentyafterfour@tin: Finished scap: Sync 1.31.0-wmf.22 and promote test wikis - refs T183961 (duration: 46m 59s)
  • 20:34 ejegg: updated CiviCRM from 31115684f6 to 915a4419c8
  • 20:31 twentyafterfour@tin: Started scap: Sync 1.31.0-wmf.22 and promote test wikis - refs T183961
  • 20:20 chasemp: labtestmetal2001:~# aptitude install linux-image-4.4.0-109-generic && aptitude install linux-image-extra-4.4.0-109-generic
  • 20:17 chasemp: labtestmetal mkfs -t xfs -i size=512 /dev/mapper/labtestmetal2001--vg-data
  • 20:16 andrew@tin: Finished deploy [horizon/deploy@b02c819]: trying to get a clean deploy (duration: 01m 54s)
  • 20:14 andrew@tin: Started deploy [horizon/deploy@b02c819]: trying to get a clean deploy
  • 20:10 andrew@tin: Finished deploy [horizon/deploy@b02c819]: a couple of bug fixes (duration: 02m 55s)
  • 20:07 andrew@tin: Started deploy [horizon/deploy@b02c819]: a couple of bug fixes
  • 20:07 andrew@tin: Started deploy [horizon/deploy@6a40f84]: a couple of bug fixes
  • 19:57 twentyafterfour: Cutting new branch wmf/1.31.0-wmf.22 - Deployment blockers: T183961
  • 19:45 demon@tin: Synchronized docroot/mediawiki/keys/: symlink magic (duration: 00m 56s)
  • 19:26 mobrovac@tin: Started restart [changeprop/deploy@5fdc03a]: (no justification provided)
  • 19:00 ppchelko@tin: Finished deploy [restbase/deploy@e9bef90]: Do not return the response for summary right away, store first T179875 take 2 (duration: 02m 47s)
  • 18:57 ppchelko@tin: Started deploy [restbase/deploy@e9bef90]: Do not return the response for summary right away, store first T179875 take 2
  • 18:57 ppchelko@tin: Finished deploy [restbase/deploy@e9bef90]: Do not return the response for summary right away, store first T179875 (duration: 14m 02s)
  • 18:43 ppchelko@tin: Started deploy [restbase/deploy@e9bef90]: Do not return the response for summary right away, store first T179875
  • 18:34 arlolra@tin: Finished deploy [parsoid/deploy@5fbabfc]: Updating Parsoid to e5e8113 (duration: 10m 37s)
  • 18:23 arlolra@tin: Started deploy [parsoid/deploy@5fbabfc]: Updating Parsoid to e5e8113
  • 18:03 ppchelko@tin: Finished deploy [restbase/deploy@dca0290]: Switch summary implementation to MCS T179875 (duration: 16m 01s)
  • 17:52 moritzm: installing cups updates from jessie point release
  • 17:50 gilles: mwscript extensions/WikimediaMaintenance/filebackend/setZoneAccess.php --wiki=officewiki --backend=local-multiwrite --private
  • 17:47 ppchelko@tin: Started deploy [restbase/deploy@dca0290]: Switch summary implementation to MCS T179875
  • 17:41 andrew@tin: Finished deploy [striker/deploy@3684a73]: rolling stretch-ready striker out to labweb hosts (duration: 00m 55s)
  • 17:40 andrew@tin: Started deploy [striker/deploy@3684a73]: rolling stretch-ready striker out to labweb hosts
  • 17:11 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Setting wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH on group 1 (duration: 00m 56s)
  • 16:33 godog: roll-restart thumbor in codfw/eqiad to apply https://gerrit.wikimedia.org/r/412935
  • 16:25 moritzm: installing initramfs-tools update from jessie point release
  • 16:17 jynus: drop s3 from dbstore2001
  • 16:14 gilles@tin: Synchronized private/PrivateSettings.php: Add Thumbor secret to Swift configuration (duration: 00m 56s)
  • 15:37 oblivian@puppetmaster1001: conftool action : edit; selector: dc=esams,name=cp3033.esams.wmnet
  • 15:36 bblack: eqsin: restarting all varnish backends for storage changes (not in prod traffic flow, yet!)
  • 15:27 _joe_: upgrading conftool on swift proxies, thumbor
  • 15:25 _joe_: upgrading conftool on parsoid,wdqs
  • 15:23 _joe_: upgrading conftool on aqs, restbase, ores clusters
  • 15:19 _joe_: upgrading conftool on the mediawiki appservers
  • 15:15 _joe_: upgrading conftool on the maps cluster
  • 15:10 _joe_: installing python-conftool on puppetmasters, cumin masters
  • 14:53 godog: roll-restart thumbor after rollback
  • 14:50 volans: running puppet on thumbor1002 (was already logged in)
  • 14:40 zeljkof: EU SWAT finished
  • 14:39 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update the sitename of newiki (T186952) (duration: 00m 55s)
  • 14:25 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Draft namespace to hiwikiversity. (T187535) (duration: 00m 56s)
  • 14:16 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add suppressredirect to autoconfirmed at zhwikt (T187018) (duration: 00m 55s)
  • 14:10 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule (T187171) (duration: 00m 55s)
  • 14:03 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: throttle: add new rule for Wikidata edit-a-thon (T187655) (duration: 00m 56s)
  • 13:29 marostegui: Upgrade kernel and reboot db1113 and db1114
  • 13:23 marostegui: Stop MySQL and reboot db1111 for kernel and mariadb upgrade
  • 13:17 marostegui: Stop MySQL and reboot db1112 for kernel and mariadb upgrade
  • 13:03 moritzm: installing libav security updates
  • 12:11 _joe_: upgrading conftool to 1.0.0~beta2 on scb*
  • 11:24 jynus: upgrading mariadb-client on neodymium and sarin
  • 11:09 marostegui: Deploy schema change on labtestweb2001 - T153182 T185128 T187089
  • 11:00 marostegui: Deploy schema change on s2 codfw master (db2035) with replication, this will generate lag on codfw - T187089 T185128 T153182
  • 11:00 jynus@tin: Synchronized wmf-config/db-eqiad.php: Remove db2037 and db2044 (duration: 00m 55s)
  • 10:58 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove db2037 and db2044 (duration: 00m 53s)
  • 10:49 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2037 and db2044 (duration: 00m 55s)
  • 10:20 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db2030 from config - T187768 (duration: 00m 55s)
  • 10:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db2030 from config - T187768 (duration: 00m 56s)
  • 10:13 volans: unified python-requests-mock packages in apt.wikimedia.org jessie-wikimedia to be 1.3.0-3~wmf1, removed binaries for 1.3.0-3
  • 09:49 marostegui: Deploy schema change on s6 primary master db1061 - T185128 T153182
  • 09:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1088 after alter table (duration: 00m 55s)
  • 09:16 marostegui: Data checks for db2037 before removing it from s4 - T187722
  • 09:14 elukey: restart zookeeper on druid1001 (follower) to verify that the last changes are no-op
  • 09:12 marostegui: Deploy schema change on db1088 - T187089 T185128 T153182
  • 09:12 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1088 for alter table (duration: 00m 55s)
  • 09:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1096:3316 and db1085 (duration: 00m 55s)
  • 09:03 oblivian@puppetmaster2001: conftool action : set/val=false; selector: scope=eqiad,name=ReadOnly
  • 09:03 oblivian@puppetmaster2001: conftool action : set/val=false; selector: scope=eqiad,name=ReadOnly
  • 09:02 oblivian@puppetmaster2001: conftool action : set/val=false; selector: scope=eqiad,name=ReadOnly
  • 09:01 oblivian@puppetmaster2001: conftool action : edit; selector: scope=codfw
  • 08:56 oblivian@puppetmaster2001: conftool action : edit; selector: scope=codfw
  • 08:51 oblivian@puppetmaster2001: conftool action : edit; selector: scope=common
  • 08:32 _joe_: uploading conftool 1.0.0~beta1 on stretch
  • 08:26 _joe_: uploading conftool 1.0.0~beta1 to jessie
  • 08:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1085 (duration: 00m 55s)
  • 08:09 godog: powercycle ganeti1006
  • 08:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1085 (duration: 01m 10s)
  • 07:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1085 (duration: 00m 55s)
  • 07:27 marostegui: Deploy schema change on db1096:3316 - T187089 T185128 T153182
  • 07:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3316 for alter table (duration: 00m 56s)
  • 07:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1085 (duration: 00m 55s)
  • 06:58 marostegui: Upgrade mariadb and kernel on db1085
  • 06:26 marostegui: Deploy schema change on db1085 (with replication - this will generate lag on labs hosts) - T187089 T185128 T153182
  • 06:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1085 for alter table (duration: 00m 56s)
  • 04:56 krinkle@tin: Synchronized docroot/mediawiki/keys/: Ie26638ed0c - rm old 2009 keys file (duration: 00m 56s)
  • 04:27 krinkle@tin: Synchronized w/extract2.php: Ib6d77e863b - clean up MW_LANG indirection (duration: 00m 55s)
  • 03:40 krinkle@tin: Synchronized wmf-config/CommonSettings.php: Ie4c7879f8ac - Clean up TemplateSandboxEditNamespaces config (duration: 00m 57s)
  • 03:37 Krinkle: It seems 'scap pull' on mwdebug1002 is acting weird (prompt doesn't return until 3-5 minutes after last line of "Finished rsync common")
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.21) (duration: 05m 50s)

2018-02-19

  • 23:21 eileen: re-enable omnirecipient jobs - process-control config revision is 8c7ce87207
  • 22:03 volans: uploaded cumin_3.0.1-1_amd64.deb to apt.wikimedia.org jessie-wikimedia
  • 20:03 volans: uploaded cumin_3.0.0-1_amd64.deb to apt.wikimedia.org jessie-wikimedia
  • 19:29 volans: uploaded python3-requests-mock, python-requests-mock and python-requests-mock-doc for version 1.3.0-3~wmf1 to apt.wikimedia.org jessie-wikimedia
  • 18:53 volans: disabled all notifications on Icinga for db2030
  • 18:04 volans: uploaded clustershell_1.8-1~wmf1_all.deb, python-clustershell_1.8-1~wmf1_all.deb and python3-clustershell_1.8-1~wmf1_all.deb to apt.wikimedia.org jessie-wikimedia
  • 17:04 elukey@tin: Finished deploy [eventlogging/analytics@8bebdf7]: (no justification provided) (duration: 00m 05s)
  • 17:04 elukey@tin: Started deploy [eventlogging/analytics@8bebdf7]: (no justification provided)
  • 16:29 _joe_: uploading conftool 1.0.0beta1 to reprepro for jessie
  • 16:22 andrew@tin: Finished deploy [striker/deploy@8a79195]: further attempt to cram striker onto labweb1001 and 1002 (duration: 00m 10s)
  • 16:22 andrew@tin: Started deploy [striker/deploy@8a79195]: further attempt to cram striker onto labweb1001 and 1002
  • 16:11 andrew@tin: Finished deploy [striker/deploy@8a79195]: deploying striker to labweb1001 and 1002 (duration: 00m 22s)
  • 16:10 andrew@tin: Started deploy [striker/deploy@8a79195]: deploying striker to labweb1001 and 1002
  • 16:10 andrew@tin: Finished deploy [striker/deploy@8a79195]: deploying striker to labweb1001 and 1002 (duration: 00m 17s)
  • 16:09 andrew@tin: Started deploy [striker/deploy@8a79195]: deploying striker to labweb1001 and 1002
  • 14:59 jynus: testing new dbproxy1010 configuration locally to pool labsdb1010 for analytics
  • 13:44 godog: roll-restart prometheus after retention period bump
  • 13:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1063 (duration: 00m 55s)
  • 13:19 marostegui: Deploy schema change on dbstore1002 - T187089 T185128 T153182
  • 13:16 ema: upgrade cache_text@eqsin to varnish 5
  • 12:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1098 s6 and s7 (duration: 00m 55s)
  • 12:27 marostegui: Deploy schema change on db1063 - T187089 T185128 T153182
  • 12:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1063 for alter table (duration: 00m 55s)
  • 12:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1093 (duration: 00m 55s)
  • 12:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1098 s6 and s7 (duration: 00m 56s)
  • 11:07 jdrewniak@tin: Synchronized portals: Wikimedia portals Update: Bumping portals to master (T128546) (duration: 00m 57s)
  • 11:06 jdrewniak@tin: Synchronized portals/prod/wikipedia.org/assets: Wikimedia portals Update: Bumping portals to master (T128546) (duration: 00m 56s)
  • 10:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1098 s6 and s7 (duration: 00m 55s)
  • 10:35 marostegui: Deploy schema change on db1093 - T187089 T185128 T153182
  • 10:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 for alter table (duration: 00m 56s)
  • 10:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1098 s6 and s7 (duration: 00m 56s)
  • 10:10 marostegui: Upgrade mariadb and kernel on db1098
  • 09:59 marostegui: Enable GTID on dbstore2002:3313 and dbstore2001:3316
  • 09:57 marostegui: Enable GTID on dbstore2002 and dbstore2001 for x1
  • 09:55 jynus: reenable gtid replication on db1053 and db2042
  • 09:53 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: mw1260.eqiad.wmnet
  • 09:53 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: mw1259.eqiad.wmnet
  • 09:43 marostegui: Upgrade mariadb and kernel on db2033
  • 09:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1090 (duration: 00m 55s)
  • 09:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105 - T162807 (duration: 00m 55s)
  • 08:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098:3317 for mariadb and kernel upgrade (duration: 00m 55s)
  • 08:49 marostegui: Deploy schema change on db1098:3316 - T187089 T185128 T153182
  • 08:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098:3316 for alter table (duration: 00m 55s)
  • 08:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1090 (duration: 00m 55s)
  • 08:11 godog: repool mw1227 - T149287
  • 08:02 marostegui@tin: Synchronized wmf-config/db-codfw.php: Promote db2034 to x1 codfw master - T184888 (duration: 00m 56s)
  • 07:58 moritzm: installing werkzeug security updates on trusty
  • 07:42 marostegui: Change topology on x1 codfw - T184888
  • 07:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1090 (duration: 00m 55s)
  • 07:01 marostegui: Reboot db1090 for kernel upgrade, mariadb upgrade, socket path location upgrade
  • 07:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1090 (duration: 00m 55s)
  • 06:44 marostegui: Stop MySQL on db1089 to update its socket path
  • 06:42 marostegui: Deploy schema change on s6 codfw master (db2039), this will generate lag on codfw - T187089 T185128 T153182
  • 06:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 and db1105 - T162807 (duration: 00m 56s)
  • 05:29 andrew@tin: Finished deploy [horizon/deploy@6a40f84]: rolling out several horizon bugfixes (duration: 03m 14s)
  • 05:26 andrew@tin: Started deploy [horizon/deploy@6a40f84]: rolling out several horizon bugfixes
  • 02:50 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.21) (duration: 10m 59s)

2018-02-18

  • 15:49 _joe_: rolling restart (1 at a time, staggered by 2 minutes) of 18 api appservers in eqiad

2018-02-17

  • 17:33 twentyafterfour: restarting apache on phab1001 to clear deadlocked workers. refs T182832
  • 03:15 demon@tin: Pruned MediaWiki: 1.31.0-wmf.20 [keeping static files] (duration: 01m 17s)
  • 03:12 demon@tin: Pruned MediaWiki: 1.31.0-wmf.17 (duration: 04m 32s)

2018-02-16

  • 21:12 hashar: Upgraded Zuul to https://gerrit.wikimedia.org/r/#/c/411322/3 | T187567
  • 20:40 andrew@tin: Finished deploy [horizon/deploy@efcba2b]: sudo dashboard update (duration: 01m 16s)
  • 20:39 andrew@tin: Started deploy [horizon/deploy@efcba2b]: sudo dashboard update
  • 20:11 andrew@tin: Finished deploy [horizon/deploy@1fdd122]: two more small fixes (duration: 01m 21s)
  • 20:10 andrew@tin: Started deploy [horizon/deploy@1fdd122]: two more small fixes
  • 19:54 andrew@tin: Finished deploy [horizon/deploy@bdcc12b]: ocata branch with sidebar fix (duration: 03m 12s)
  • 19:51 andrew@tin: Started deploy [horizon/deploy@bdcc12b]: ocata branch with sidebar fix
  • 18:34 hashar: upgraded zuul
  • 16:21 andrew@tin: Finished deploy [horizon/deploy@16f3d8e]: ocata branch with upper new requirements (duration: 08m 00s)
  • 16:13 andrew@tin: Started deploy [horizon/deploy@16f3d8e]: ocata branch with upper new requirements
  • 16:06 cmjohnson1: labstore1006 and labstore1007 down for rack relocation
  • 16:03 andrew@tin: Finished deploy [horizon/deploy@16d0b17]: ocata branch with upper constraints (duration: 02m 18s)
  • 16:00 andrew@tin: Started deploy [horizon/deploy@16d0b17]: ocata branch with upper constraints
  • 15:40 andrew@tin: Finished deploy [horizon/deploy@29f9afb]: second attempt at ocata branch (duration: 03m 22s)
  • 15:37 andrew@tin: Started deploy [horizon/deploy@29f9afb]: second attempt at ocata branch
  • 15:29 andrew@tin: Finished deploy [horizon/deploy@58d2718]: first attempt at ocata branch (duration: 01m 28s)
  • 15:28 andrew@tin: Started deploy [horizon/deploy@58d2718]: first attempt at ocata branch
  • 15:27 godog: shut ms-be1018 for bbu swap - T186988
  • 15:16 akosiaris: run T181121#3978654 oneliner once more on sca1004, this time the VM has no DRBD
  • 15:14 akosiaris: poweroff sca1004, switch from DRBD to plain disk template T181121
  • 14:15 akosiaris: doing more IO stress tests on ganeti1005. T181121. Seems like we can reproduce
  • 14:06 chasemp: T184209 initial setup of labs-instances2-b-codfw and hosts
  • 13:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1094 (duration: 00m 56s)
  • 13:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 and db1067 - T162807 (duration: 00m 55s)
  • 13:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1094 (duration: 00m 56s)
  • 12:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1094 (duration: 00m 56s)
  • 12:46 jynus: reload dbproxy1008 configuration
  • 12:44 jynus: reload dbproxy1003 configuration
  • 12:37 ema: cp3049: restart varnish-fe to clear 'child restarted' alert
  • 12:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1094 (duration: 00m 56s)
  • 12:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 - T162807 (duration: 00m 56s)
  • 12:17 marostegui: Stop MySQL on db1094 for mariadb upgrade, kernel upgrade and socket location upgrade
  • 12:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1094 (duration: 00m 56s)
  • 12:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1093 (duration: 00m 56s)
  • 11:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1093 (duration: 00m 56s)
  • 11:35 jynus: stopping mysql on db1043, db2012 for cloning data away
  • 11:33 jynus: changing socket location on phabricator db hosts T148507
  • 11:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1093 (duration: 00m 56s)
  • 11:28 ema: cp3036: restart varnish-fe to clear 'child restarted' alert
  • 11:28 hashar: Switching operations/mediawiki-config job for composer to Docker | https://gerrit.wikimedia.org/r/#/c/411206/
  • 11:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1093 (duration: 00m 56s)
  • 11:09 elukey: restart nfacctd on rhenium to see if it picks up the new kafka topic config (3 partitions)
  • 11:06 marostegui: Stop MySQL on db1093 for mariadb and kernel upgrade, also update socket path
  • 11:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 (duration: 00m 56s)
  • 09:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3311 - T162807 (duration: 00m 56s)
  • 09:55 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove db1053 (duration: 00m 56s)
  • 09:53 jynus@tin: Synchronized wmf-config/db-eqiad.php: Remove db1053 (duration: 00m 56s)
  • 08:48 akosiaris: doing IO stress tests on ganeti1005. T181121
  • 08:34 akosiaris: manually allocate logstash1008 on ganeti1005 to undo the manual override of sensible allocation rules by ganeti
  • 08:30 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1053 (duration: 00m 57s)
  • 08:14 akosiaris: powercycle ganeti1006 T181121
  • 08:13 akosiaris: powercycle ganeti1006
  • 06:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 and db1067 - T162807 (duration: 00m 59s)
  • 06:41 moritzm: installing quagga security updates
  • 06:35 marostegui: Deploy schema change on s5 primary master db1070 - T185128 T153182
  • 00:16 ebernhardson@tin: Synchronized php-1.31.0-wmf.21/extensions/ProofreadPage/modules/page/ext.proofreadpage.page.edit.js: SWAT: T187454 fix text selection on #wpTextbox1 (duration: 00m 58s)

2018-02-15

  • 23:43 demon@tin: Synchronized scap/plugins/clean.py: no-op (duration: 00m 56s)
  • 22:54 demon@tin: Synchronized php-1.31.0-wmf.21/extensions/MassMessage/includes/MassMessage.php: fix use statement, T187510 (duration: 00m 57s)
  • 21:50 ejegg: updated CiviCRM from 61acc9175e to 31115684f6
  • 20:22 twentyafterfour: 1.31.0-wmf.21 deployed: no apparent change in fatalmonitor error rate. refs T183960
  • 20:18 twentyafterfour@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.21
  • 20:11 twentyafterfour@tin: Synchronized php-1.31.0-wmf.21/extensions/TwoColConflict/includes/TwoColConflictHooks.php: sync https://gerrit.wikimedia.org/r/#/c/410809/ (duration: 01m 13s)
  • 20:09 twentyafterfour: syncing a patch before deploying 1.31.0-wmf.21 to all wikis.
  • 19:55 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Follow-up 77be427a1: Enable the Beta Feature on all wikis T185708 (duration: 01m 12s)
  • 19:34 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set Portal and Portal talk namespace alias of zhwiki T184866 (duration: 01m 13s)
  • 19:13 thcipriani@tin: Synchronized wmf-config/CirrusSearch-common.php: SWAT: Set SPARQL endpoint for category search T184840 (duration: 01m 12s)
  • 18:42 arlolra@tin: Finished deploy [parsoid/deploy@6da4591]: Updating Parsoid to 0650195 (duration: 08m 34s)
  • 18:33 arlolra@tin: Started deploy [parsoid/deploy@6da4591]: Updating Parsoid to 0650195
  • 18:11 bsitzmann@tin: Finished deploy [mobileapps/deploy@0bfafa9]: Update mobileapps to d219d1b (T187475) (duration: 05m 54s)
  • 18:06 bsitzmann@tin: Started deploy [mobileapps/deploy@0bfafa9]: Update mobileapps to d219d1b (T187475)
  • 17:24 foks: removed 2FA from User:Lea Lacroix (WMDE)
  • 17:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 (duration: 01m 12s)
  • 16:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1097:3315, db1089, db1066 (duration: 01m 12s)
  • 16:32 andrew@tin: Finished deploy [horizon/deploy@4e7ccc5]: lots of updates (duration: 03m 13s)
  • 16:29 andrew@tin: Started deploy [horizon/deploy@4e7ccc5]: lots of updates
  • 16:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1097:3315 (duration: 01m 12s)
  • 15:34 ema: upgrade upload @ eqsin to varnish 5
  • 15:27 marostegui: Deploy schema change on db1051 - T187089 T185128 T153182
  • 15:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051, fully repool db1097:3314, increase weight for db1097:3315 (duration: 01m 13s)
  • 15:15 zeljkof: EU SWAT finished
  • 15:14 zfilipin@tin: Synchronized wmf-config/abusefilter.php: SWAT: Log accessing private abusefilter details (T160357) (duration: 01m 12s)
  • 14:58 moritzm: installing erlang security updates on labcontrol1001
  • 14:53 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable the visual diff beta feature (T185708) (duration: 01m 12s)
  • 14:37 zfilipin@tin: Synchronized php-1.31.0-wmf.21/includes/Revision.php: SWAT: Log the reason why revision->getContent() returns null (T184670) (duration: 01m 12s)
  • 14:35 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable log channel T184670 (T184670) (duration: 01m 12s)
  • 14:22 jynus@tin: Synchronized wmf-config/db-eqiad.php: Remove db2042 (duration: 01m 11s)
  • 14:20 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove db2042 (duration: 01m 12s)
  • 14:09 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add RevisionStore to wmgMonologChannels: (duration: 01m 13s)
  • 12:01 addshore: script run for T185738 done
  • 11:59 milimetric@tin: Finished deploy [analytics/refinery@26d4e50]: Deploying Refinery jobs with new 0.0.58 jars (duration: 09m 33s)
  • 11:58 addshore: addshore@terbium:~$ mwscript extensions/Cognate/maintenance/populateCognatePages.php --wiki elwiktionary --batchsize 1000 # T185738
  • 11:49 milimetric@tin: Started deploy [analytics/refinery@26d4e50]: Deploying Refinery jobs with new 0.0.58 jars
  • 10:58 marostegui: Stop replication in sync on db1089 and db1066
  • 10:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1097:3314 and slowly repool db1097:3315 (duration: 01m 12s)
  • 10:38 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2042 fully (duration: 01m 12s)
  • 10:28 marostegui: Upgrade mariadb on db1066
  • 10:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1097:3314 (duration: 01m 12s)
  • 09:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1097:3314 (duration: 01m 12s)
  • 09:48 marostegui: Deploy schema change on db1097:3315 - T187089 T185128 T153182
  • 09:39 marostegui: Upgrade kernel and mariadb on db1097
  • 09:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097 for s4 and s5 (duration: 01m 12s)
  • 09:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1082 (duration: 01m 12s)
  • 09:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1082 (duration: 01m 12s)
  • 08:54 moritzm: installing erlang security updates on labtestcontrol*
  • 08:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1082 (duration: 01m 13s)
  • 08:18 marostegui: Upgrade kernel + mariadb on db1082 (sanitarium master in s5)
  • 07:55 marostegui: Stop replication in sync on db1089 and db1066 - T162807
  • 07:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 and db1066 - T162807 (duration: 01m 12s)
  • 07:39 marostegui: Deploy schema change on db1082 (sanitarium master) with replication, this will generate lag on labs - T187089 T185128 T153182
  • 07:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 (duration: 01m 13s)
  • 07:35 moritzm: installing libvorbis security updates on stretch
  • 07:30 twentyafterfour: phabricator upgrade finished. phd is back online.
  • 07:27 twentyafterfour: phabricator database migration finished
  • 07:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1110 (duration: 01m 12s)
  • 07:09 jynus: reimage dbproxy1003 to stretch
  • 07:04 twentyafterfour: Applying patch "phabricator:20180215.maniphest.02.populate.php" to host "m3-master.eqiad.wmnet"...
  • 07:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1110 (duration: 01m 13s)
  • 06:57 twentyafterfour: apache restarted, update appears to be successful
  • 06:57 twentyafterfour: phabricator database migrations applied
  • 06:50 twentyafterfour: shutting down apache on phab1001 to deploy update, downtime should be only a couple of minutes
  • 06:49 twentyafterfour: starting phabricator upgrade tagged release/2018-02-15/1
  • 06:45 twentyafterfour: restarted apache on phab1001 and reset cluster.read-only to false
  • 06:44 jynus: set db1059 in read-write
  • 06:38 jynus: merging dns update for phabricator db
  • 06:35 jynus: set db1043 as read only
  • 06:34 twentyafterfour: set cluster.read-only in phabricator
  • 06:33 jynus: about to set phabricator.wikimedia.org as read only
  • 06:28 jynus: scheduling downtime for phabricator on phab1001
  • 06:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1110 (duration: 01m 13s)
  • 06:06 marostegui: Upgrade mysql on db1110
  • 05:57 jynus: restarting dbproxy1008 for kernel upgrade
  • 05:43 andrew@tin: Finished deploy [horizon/deploy@c355366]: testing a couple of cherry-picks in horizon (duration: 03m 06s)
  • 05:40 andrew@tin: Started deploy [horizon/deploy@c355366]: testing a couple of cherry-picks in horizon
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.20) (duration: 07m 25s)
  • 02:01 mutante: phab1001 - restarted apache to fix server status page
  • 01:27 twentyafterfour: restarting apache2 on phab1001 to free deadlocked php processes.
  • 01:03 twentyafterfour: using the current phabricator maintenance window to deploy https://gerrit.wikimedia.org/r/#/c/410626/
  • 01:03 twentyafterfour: the scheduled phabricator upgrade is delayed until 06:00 UTC Thursday because of large database migrations. Doing the upgrade at a time when DBAs are available to assist.
  • 00:52 maxsem@tin: Synchronized wmf-config/: https://gerrit.wikimedia.org/r/#/c/410267/ (duration: 01m 14s)
  • 00:49 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/410267/ (duration: 01m 13s)

2018-02-14

  • 23:39 AaronSchulz: Running initSiteStats.php on s3 for T186947
  • 22:04 aaron@tin: Synchronized php-1.31.0-wmf.20/includes/SiteStats.php: f549559dc0 (duration: 01m 13s)
  • 21:52 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1088 with full weight (duration: 01m 13s)
  • 21:38 mholloway-shell@tin: Finished deploy [mobileapps/deploy@9bad612]: Update mobileapps to f23519f (duration: 06m 01s)
  • 21:32 mholloway-shell@tin: Started deploy [mobileapps/deploy@9bad612]: Update mobileapps to f23519f
  • 21:30 arlolra@tin: Finished deploy [parsoid/deploy@7961b3f]: Updating Parsoid to caee2ed (duration: 15m 12s)
  • 21:15 arlolra@tin: Started deploy [parsoid/deploy@7961b3f]: Updating Parsoid to caee2ed
  • 21:00 ema: upgrade cp1099 to varnish 5 (last upload@eqiad host)
  • 20:54 twentyafterfour@tin: Synchronized php-1.31.0-wmf.21/extensions/CentralNotice: Sync CentralNotice again after proper rebase (duration: 01m 14s)
  • 20:43 ema: upgrade cp1074 to varnish 5
  • 20:42 twentyafterfour@tin: Synchronized php-1.31.0-wmf.21/extensions/CentralNotice/: sync https://gerrit.wikimedia.org/r/#/c/410346/ for Ejegg (duration: 01m 15s)
  • 20:40 twentyafterfour: Group1 wikis are now running MediaWiki 1.31.0-wmf.21 - still no blockers on T183960
  • 20:38 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.21 (duration: 01m 12s)
  • 20:37 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.21
  • 20:33 ema: upgrade cp1073 to varnish 5
  • 20:05 ema: upgrade cp1072 to varnish 5
  • 19:44 ema: upgrade cp1071 to varnish 5
  • 19:25 XioNoX: enabling netflow on cr1-eqiad
  • 19:24 no_justification: ran namespaceDupes.php --fix for hiwiki
  • 19:24 demon@tin: Synchronized wmf-config/InitialiseSettings.php: portal aliases for hiwiki (duration: 01m 13s)
  • 19:22 ema: upgrade cp1064 to varnish 5
  • 19:20 no_justification: running updateCollation.php on nowikimedia
  • 19:19 demon@tin: Synchronized wmf-config/InitialiseSettings.php: nowikimedia collation, T185630 (duration: 01m 13s)
  • 19:16 andrewbogott: rebooting labvirt1019 so I can have a look at the raid setup, for T172538
  • 19:14 no_justification: ran namespaceDupes.php --fix on wawiktionary
  • 19:13 demon@tin: Synchronized wmf-config/InitialiseSettings.php: wawiktionary namespaces, T185289 (duration: 01m 13s)
  • 19:11 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Revert prior, busted the canaries (duration: 01m 15s)
  • 19:08 demon@tin: scap failed: average error rate on 7/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 19:06 demon@tin: rebuilt and synchronized wikiversions files: namespace aliases for zhwiki, T184866
  • 19:00 ema: upgrade cp1063 to varnish 5
  • 17:43 ema: upgrade cp1062 to varnish 5
  • 17:42 moritzm: updated jenkins packages on apt.wikimedia.org for stretch (thirdparty/ci) and jessie (thirdparty) to 2.89.4
  • 17:39 hashar: CI Jenkins seems all happy following the upgrade ^o^
  • 17:34 moritzm: updating remaining python-cryptography updates from jessie point release
  • 17:32 hashar: Upgrading Jenkins on contint1001 / contint2001
  • 17:30 godog: roll-restart ms-fe to pick up https://gerrit.wikimedia.org/r/c/410199/
  • 17:22 moritzm: installing uwsgi jessie update on graphite*
  • 17:20 godog: roll-upgrade thumbor 1.14 in eqiad/codfw
  • 16:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T162807 (duration: 01m 09s)
  • 16:56 ema: upgrade cp1050 to varnish 5
  • 16:50 marostegui: Deploy schema change on db1110 - T187089 T185128 T153182
  • 16:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1110 (duration: 01m 12s)
  • 16:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T162807 (duration: 01m 12s)
  • 16:35 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1088 with low weight (duration: 01m 12s)
  • 16:19 ema: upgrade cp1049 to varnish 5
  • 15:59 jynus: upgrade and restart db1088
  • 15:52 moritzm: rolling out debdeploy 0.0.99.2 (cumin masters already upgraded for a while, just synching the clients)
  • 15:51 andrewbogott: powering down labvirt1008 so chris can re-apply thermal paste
  • 15:45 moritzm: installing libgcrypt security updates on trusty
  • 15:31 zeljkof: EU SWAT finished
  • 15:24 godog: roll-upgrade thumbor to 1.13 - T187159 T179954 T187088
  • 15:19 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Add suppressredirect to autoconfirmed at zhwikt" (T187018) (duration: 01m 13s)
  • 15:18 ema: upgrade cp1048 to varnish 5
  • 14:47 moritzm: installing PHP security updates
  • 14:44 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable flood flag at zhwikt (T187018) (duration: 01m 12s)
  • 14:37 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Require 7 days & 10 edits for autoconfirmed at zhwiktionary (T187018) (duration: 01m 13s)
  • 14:30 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Make alias from old NS_PROJECT to new NS_PROJECT at hiwikiversity (T185347) (duration: 01m 12s)
  • 14:21 akosiaris: reboot ganeti1008 for kernel upgrade T181121
  • 14:14 zfilipin@tin: Synchronized wmf-config/reverse-proxy.php: SWAT: wgSquidServersNoPurge: add eqsin, remove dead IP (T156027) (duration: 01m 12s)
  • 14:11 mlitn@tin: Synchronized php-1.31.0-wmf.20/extensions/3D/modules/mmv.3d.head.js: Fix 3D badge (duration: 01m 12s)
  • 14:10 mlitn@tin: Synchronized php-1.31.0-wmf.20/extensions/3D/modules/ext.3d.js: Fix 3D badge and Webkit thumb load detection (duration: 01m 13s)
  • 13:44 elukey: rollback java 8 upgrade for archiva - issues with Analytics builds
  • 13:34 elukey: installed openjdk-8 on meitnerium, manually upgraded java-update-alternatives to java8, restarted archiva
  • 13:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1104 original weight (duration: 01m 12s)
  • 13:16 jynus: stop slave and rolling schema change on db1059 m3 replica
  • 13:14 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1088 (duration: 01m 12s)
  • 13:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1106 (duration: 01m 12s)
  • 12:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Give more traffic to db1106 (duration: 01m 12s)
  • 12:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1106 (duration: 01m 12s)
  • 11:25 marostegui: Deploy schema change on db1106 - T187089 T185128 T153182
  • 11:16 marostegui: Stop MySQL and reboot db1106 for mysql and kernel upgrade
  • 11:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 (duration: 01m 12s)
  • 11:14 filippo@tin: Synchronized wmf-config/ProductionServices.php: repool poolcounter1002 after disk replacement (duration: 01m 12s)
  • 11:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1100 (duration: 01m 12s)
  • 10:46 jynus: dropping test databases from m5 T186585
  • 10:42 marostegui: Stop replication in sync on db1089 and db1067 - T162807
  • 10:28 moritzm: installing libvorbis security updates on trusty systems
  • 10:13 marostegui: Deploy schema change on db1100 - T187089 T185128 T153182
  • 10:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1100 (duration: 01m 12s)
  • 10:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1096:3316,3315 (duration: 01m 12s)
  • 09:50 akosiaris: set standard weight for all ores* hosts
  • 09:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1096:3316,3315 (duration: 01m 12s)
  • 09:49 akosiaris@puppetmaster1001: conftool action : set/weight=10; selector: all (tags: ['dc=eqiad', 'cluster=ores', 'service=ores'])
  • 09:49 akosiaris@puppetmaster1001: conftool action : set/weight=10; selector: all (tags: ['dc=codfw', 'cluster=ores', 'service=ores'])
  • 09:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1096:3316,3315 (duration: 01m 12s)
  • 09:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool slowly db1096:3316,3315 (duration: 01m 13s)
  • 09:08 marostegui: Deploy schema change on s5 dbstore1002 https://phabricator.wikimedia.org/T187089 https://phabricator.wikimedia.org/T185128 https://phabricator.wikimedia.org/T153182
  • 09:02 marostegui: Stop MySQL on db1096:3315 and 3316 for mysql+kernel upgrade
  • 08:45 jynus@tin: Synchronized wmf-config/db-eqiad.php: Rebalance s8 (duration: 01m 13s)
  • 08:38 akosiaris: pybal restart on lvs1003 to pickup https://gerrit.wikimedia.org/r/410398
  • 08:29 akosiaris: pybal restart on lvs1006, lvs1009, lvs1012 to pickup https://gerrit.wikimedia.org/r/410398
  • 08:08 _joe_: powercycled ganeti1008
  • 07:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3316 (duration: 01m 12s)
  • 06:44 marostegui: Stop replication in sync on db1089 and db1067 - T162807
  • 06:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 - T162807 (duration: 01m 12s)
  • 06:31 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db1096:3315 for alter table (duration: 01m 13s)
  • 06:30 marostegui: Deploy schema change on db1096:3315 - T187089 T185128 T153182
  • 05:55 andrew@tin: Finished deploy [horizon/deploy@c355366]: updating sudo-dashboard (duration: 03m 13s)
  • 05:52 andrew@tin: Started deploy [horizon/deploy@c355366]: updating sudo-dashboard
  • 05:52 andrew@tin: Finished deploy [horizon/deploy@c355366]: updating sudo-dashboard (duration: 00m 20s)
  • 05:51 andrew@tin: Started deploy [horizon/deploy@c355366]: updating sudo-dashboard
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.20) (duration: 05m 39s)
  • 02:02 demon@tin: Synchronized fonts/: removing executable bits, no-op (duration: 01m 15s)
  • 01:33 demon@tin: Finished deploy [gerrit/gerrit@b234c85]: rm reviewers plugin (for now) (duration: 00m 11s)
  • 01:32 demon@tin: Started deploy [gerrit/gerrit@b234c85]: rm reviewers plugin (for now)
  • 00:25 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add uploader user group to mznwiki and make it automagically added T187187 (duration: 01m 12s)
  • 00:12 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable xkill on top wikis that use x aspect T187265 (duration: 01m 14s)

2018-02-13

  • 21:19 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group0 wikis to 1.31.0-wmf.21
  • 21:07 andrew@tin: Finished deploy [horizon/deploy@c355366]: another try with static content (duration: 00m 49s)
  • 21:07 andrew@tin: Started deploy [horizon/deploy@c355366]: another try with static content
  • 21:06 andrew@tin: Finished deploy [horizon/deploy@c355366]: another try with static content (duration: 00m 03s)
  • 21:06 andrew@tin: Started deploy [horizon/deploy@c355366]: another try with static content
  • 20:43 twentyafterfour@tin: Finished scap: T183960 Build l10n cache & Deploy wmf/1.31.0-wmf.21 to test wikis (duration: 31m 01s)
  • 20:41 andrew@tin: Finished deploy [horizon/deploy@c355366]: updated static content collection process (duration: 00m 21s)
  • 20:41 andrew@tin: Started deploy [horizon/deploy@c355366]: updated static content collection process
  • 20:26 jynus: upgrading labsdb1010 database - proxies will complain for some time
  • 20:18 andrew@tin: Finished deploy [horizon/deploy@c355366]: updated static content collection process (duration: 01m 17s)
  • 20:17 andrew@tin: Started deploy [horizon/deploy@c355366]: updated static content collection process
  • 20:12 twentyafterfour@tin: Started scap: T183960 Build l10n cache & Deploy wmf/1.31.0-wmf.21 to test wikis
  • 20:11 twentyafterfour: Currently there are no blockers listed on T183960 and the train is leaving the station.
  • 20:05 twentyafterfour: MediaWiki Train 1.31.0-wmf.21 branched, prepped and patched | Changelog uploaded to https://www.mediawiki.org/wiki/MediaWiki_1.31/wmf.21/Changelog | Blockers: T183960
  • 19:03 jynus: upgrade and restart db2042
  • 18:53 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2042 (duration: 01m 58s)
  • 18:25 elukey: Analytics Hadoop cluster upgrade to Java 8 about to start - complete cluster shutdown is needed - T166248
  • 18:23 mholloway-shell@tin: Finished deploy [mobileapps/deploy@e488cee]: Update mobileapps to 5851dfc (duration: 05m 28s)
  • 18:17 mholloway-shell@tin: Started deploy [mobileapps/deploy@e488cee]: Update mobileapps to 5851dfc
  • 18:00 twentyafterfour: Preparing to cut new MediaWiki branch wmf/1.31.0-wmf.21 - report deployment blockers for this branch in phabricator: T183960
  • 17:54 godog: repool mw1256 after disk swap - T186535
  • 17:20 demon@tin: Synchronized README: forcing git config sync, setting core.sharedRepository=group, T187076 (duration: 01m 12s)
  • 17:13 cmjohnson1: sorry snapshot1001 is going down for rack relocation
  • 17:12 cmjohnson1: stat1001 going down for rack relocation
  • 17:04 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: all (tags: ['dc=codfw', 'cluster=ores', 'service=ores'])
  • 17:03 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: all (tags: ['dc=eqiad', 'cluster=ores', 'service=ores'])
  • 16:36 demon@tin: Synchronized scap/plugins/clean.py: no-op, consistency (duration: 00m 55s)
  • 16:23 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Setting wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH on group 0 (duration: 00m 56s)
  • 16:17 cmjohnson1: replacing disk poolcounter1002
  • 15:35 marostegui: Deploy schema change on s5 codfw master (db2052), this will generate lag on codfw - T187089 T185128 T153182
  • 15:30 bblack: deploying changes to URL-encoding normalization on caches - https://gerrit.wikimedia.org/r/407488
  • 15:20 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2066 (duration: 00m 55s)
  • 15:01 zeljkof: EU SWAT finished
  • 14:59 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update logo for urwikibooks, add hd logo (T185977) (duration: 00m 55s)
  • 14:58 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Update logo for urwikibooks, add hd logo (T185977) (duration: 00m 54s)
  • 14:37 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Change logos for sdwiki (T185865) (duration: 00m 55s)
  • 14:25 zfilipin@tin: Synchronized php-1.31.0-wmf.20/extensions/ContentTranslation/extension.json: SWAT: Add ext.cx.widgets.overlay dependency to template editor (T187119) (duration: 00m 55s)
  • 14:22 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add sitename for sdwiki (T184521) (duration: 00m 57s)
  • 13:51 marostegui: Reboot db2066 to pick up new kernel
  • 13:50 marostegui: Deploy schema change on dbstore2001 - T187089 T185128 T153182
  • 12:51 mlitn@tin: Synchronized wmf-config/CommonSettings.php: Enable STL uploads on Commons (duration: 00m 56s)
  • 12:20 mlitn@tin: Synchronized wmf-config/InitialiseSettings.php: Enable STL uploads on Commons (duration: 00m 55s)
  • 12:19 mlitn@tin: Synchronized wmf-config/CommonSettings.php: Enable STL uploads on Commons (duration: 00m 55s)
  • 12:07 mlitn@tin: Synchronized php-1.31.0-wmf.20/extensions/3D/modules/ext.3d.js: Fix 3D badge (duration: 00m 56s)
  • 11:57 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2066 (duration: 00m 55s)
  • 11:56 marostegui: Deploy schema change on db2066 - T187089 T185128 T153182
  • 11:50 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2038 and db2059 (duration: 00m 55s)
  • 11:47 jynus: reenabling puppet on all eqiad databases
  • 11:41 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099 (duration: 00m 56s)
  • 11:37 marostegui: Stop MySQL on db2059 and db2038 for kernel upgrade
  • 11:29 ema: lvs1003: restart pybal to reconnect to etcd
  • 11:27 ema: lvs1006/1010: restart pybal to reconnect to etcd
  • 11:26 ema: lvs4005: restart pybal to reconnect to etcd
  • 11:23 ema: esams primary LVSs: restart pybal to reconnect to etcd
  • 11:21 ema: esams secondary LVSs: restart pybal to properly reconnect to etcd
  • 11:14 ema: repool cp3007
  • 11:13 ema: depool cp3007 to test pybal's behavior on lvs3002
  • 10:51 filippo@tin: Synchronized wmf-config/ProductionServices.php: depool poolcounter1002 for disk replacement (duration: 00m 56s)
  • 10:28 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099 (duration: 00m 54s)
  • 10:08 godog: roll-restart ms-fe in codfw/eqiad after applying https://gerrit.wikimedia.org/r/c/409942/
  • 10:03 ema: restart pybal on lvs2003
  • 09:58 ema: restart pybal on lvs2006
  • 09:52 filippo@neodymium: conftool action : set/pooled=no; selector: name=ms-fe2005.codfw.wmnet
  • 09:49 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2075, depool db2038 and db2059 (duration: 00m 55s)
  • 09:32 marostegui: Stop mysql on db2075 for mysql and kernel upgrade
  • 09:30 marostegui: Stop replication in sync on db1089 and dbstore1002 - T162807
  • 09:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1065 - T162807 (duration: 00m 54s)
  • 09:22 elukey: powercycle analytics1062 - not reachable via ssh, frozen via serial console
  • 09:22 jynus: disabling puppet on all eqiad databases
  • 09:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 - T162807 (duration: 00m 55s)
  • 09:20 marostegui: Stop replication in sync on db1089 and db1065 - T162807
  • 09:12 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2084:3315, depool db2075 (duration: 00m 55s)
  • 09:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 - T162807 (duration: 00m 55s)
  • 08:52 marostegui: Stop replication in sync on db1089 and db1099:3311 - T162807
  • 08:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089, db1099 - T162807 (duration: 00m 55s)
  • 08:41 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2084:3315 (duration: 00m 56s)
  • 08:37 hashar: tin.eqiad.wmnet: removing live hack in /srv/mediawiki-staging/scap/plugins/clean.py | T187160
  • 08:32 moritzm: installing wavpack security updates
  • 08:09 moritzm: installing exim security updates on trusty hosts
  • 07:02 marostegui: Deploy schema change on s5 db2089 db2084 db2075 db2039 db2059 - T187089
  • 06:28 marostegui: reload haproxy on dbproxy1005
  • 05:10 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp50(0[12345789]|1[12]).eqsin.wmnet
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.20) (duration: 05m 29s)
  • 00:24 cwd: re-enabled p-c
  • 00:16 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/VisualEditor/modules/ve-mw/ui/pages/: T187112 (duration: 00m 56s)
  • 00:10 cwd: disabled p-c jobs for reboot
  • 00:04 demon@tin: Synchronized wmf-config/: cleanup Sentry inclusion for labs, should be no-op (duration: 00m 57s)
  • 00:03 demon@tin: Synchronized wmf-config/InitialiseSettings.php: cleanup Sentry inclusion for labs, should be no-op (duration: 00m 56s)

2018-02-12

  • 23:47 demon@tin: Finished deploy [gerrit/gerrit@6adde70]: reviewers plugin (duration: 00m 12s)
  • 23:46 demon@tin: Started deploy [gerrit/gerrit@6adde70]: reviewers plugin
  • 23:32 mutante: terbium,wasat: touch /var/log/mediawwiki/purge_abusefilter.log ; set owner/permissions like other logfiles
  • 23:13 elukey: manual restart of Yarn Node Managers on analytics1058/31 (failed due to root partition filled up for the issue logged before)
  • 23:09 elukey: cleaned up tmp files on all analytics hadoop worker nodes, job filling up tmp
  • 21:27 andrew@tin: Finished deploy [horizon/deploy@2f70002]: updating several submodules, probably breaking static content (duration: 03m 18s)
  • 21:24 andrew@tin: Started deploy [horizon/deploy@2f70002]: updating several submodules, probably breaking static content
  • 21:06 mholloway-shell@tin: Finished deploy [mobileapps/deploy@0639c31]: Update mobileapps to f14bdd5 (duration: 05m 46s)
  • 21:00 mholloway-shell@tin: Started deploy [mobileapps/deploy@0639c31]: Update mobileapps to f14bdd5
  • 20:21 andrew@tin: Finished deploy [horizon/deploy@c009388]: updating puppet dashboard (duration: 03m 22s)
  • 20:17 andrew@tin: Started deploy [horizon/deploy@c009388]: updating puppet dashboard
  • 20:13 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/Flow/includes/Model/UUID.php: T186909 (duration: 00m 56s)
  • 20:08 andrew@tin: Finished deploy [horizon/deploy@cba66d2]: more submodule tinkering (duration: 01m 15s)
  • 20:07 ppchelko@tin: Finished deploy [restbase/deploy@b257b4f]: Support batching in the reading lists API (duration: 15m 10s)
  • 20:07 andrew@tin: Started deploy [horizon/deploy@cba66d2]: more submodule tinkering
  • 20:01 andrew@tin: Finished deploy [horizon/deploy@1fcd9ff]: fixes to post-install checks (duration: 01m 02s)
  • 20:00 andrew@tin: Started deploy [horizon/deploy@1fcd9ff]: fixes to post-install checks
  • 19:58 andrew@tin: Finished deploy [horizon/deploy@9d73005]: fixes to post-install checks (duration: 01m 01s)
  • 19:57 andrew@tin: Started deploy [horizon/deploy@9d73005]: fixes to post-install checks
  • 19:52 ppchelko@tin: Started deploy [restbase/deploy@b257b4f]: Support batching in the reading lists API
  • 19:50 andrew@tin: Finished deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes -- take two (duration: 00m 45s)
  • 19:50 andrew@tin: Started deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes -- take two
  • 19:48 andrew@tin: Finished deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes -- take two (duration: 00m 03s)
  • 19:47 andrew@tin: Started deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes -- take two
  • 19:44 andrew@tin: Finished deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes (duration: 01m 06s)
  • 19:43 andrew@tin: Started deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes
  • 19:17 niharika29@tin: Synchronized wmf-config/filebackend.php: Proxy public wiki thumb.php requests through Thumbor T169144 (duration: 00m 55s)
  • 19:13 andrew@tin: Finished deploy [horizon/deploy@01021b4]: trying another force (duration: 00m 17s)
  • 19:13 andrew@tin: Started deploy [horizon/deploy@01021b4]: trying another force
  • 19:12 niharika29@tin: Synchronized php-1.31.0-wmf.20/extensions/PageAssessments/: Fix 500 error with PageAssessments API T185037 (duration: 00m 56s)
  • 19:07 niharika29@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Stop PHP errors from going to the hhvm channel T45086 (duration: 00m 56s)
  • 18:58 akosiaris@tin: Finished deploy [ores/deploy@f7e23f4]: T171851 (duration: 07m 39s)
  • 18:50 akosiaris@tin: Started deploy [ores/deploy@f7e23f4]: T171851
  • 18:48 akosiaris@tin: Finished deploy [ores/deploy@f7e23f4]: T171851 (duration: 12m 14s)
  • 18:35 akosiaris@tin: Started deploy [ores/deploy@f7e23f4]: T171851
  • 18:34 akosiaris@tin: Finished deploy [ores/deploy@f7e23f4]: T171851 (duration: 06m 47s)
  • 18:27 akosiaris@tin: Started deploy [ores/deploy@f7e23f4]: T171851
  • 18:23 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: ores1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=ores', 'service=ores'])
  • 18:12 akosiaris@tin: Finished deploy [ores/deploy@f7e23f4]: T171851 (duration: 12m 30s)
  • 18:09 gehel@tin: Finished deploy [wdqs/wdqs@b6bd483]: new WDQS GUI (duration: 01m 53s)
  • 18:07 gehel@tin: Started deploy [wdqs/wdqs@b6bd483]: new WDQS GUI
  • 18:00 akosiaris@tin: Started deploy [ores/deploy@f7e23f4]: T171851
  • 17:47 akosiaris@tin: Finished deploy [ores/deploy@f7e23f4]: T171851 (duration: 13m 18s)
  • 17:45 andrew@tin: Finished deploy [horizon/deploy@01021b4]: rolling out new dashboards again (duration: 00m 17s)
  • 17:45 andrew@tin: Started deploy [horizon/deploy@01021b4]: rolling out new dashboards again
  • 17:34 gilles: added thumborUrl to PrivateSettings.php on labs, in preparation for https://gerrit.wikimedia.org/r/#/c/407611/
  • 17:34 akosiaris@tin: Started deploy [ores/deploy@f7e23f4]: T171851
  • 17:21 demon@tin: Pruned MediaWiki: 1.31.0-wmf.17 [keeping static files] (duration: 02m 08s)
  • 17:18 elukey: home dirs on stat1004 moved to /srv/home (/home symlinks to it)
  • 17:10 andrew@tin: Finished deploy [horizon/deploy@01021b4]: rolling out new dashboards (duration: 00m 54s)
  • 17:09 andrew@tin: Started deploy [horizon/deploy@01021b4]: rolling out new dashboards
  • 16:56 bblack@neodymium: conftool action : set/pooled=yes; selector: name=dns5001.wikimedia.org
  • 16:52 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/VisualEditor/ApiVisualEditor.php: T186934 (duration: 00m 57s)
  • 16:27 andrew@tin: Finished deploy [horizon/deploy@4d1bdeb]: updating requirements.txt (duration: 01m 04s)
  • 16:26 andrew@tin: Started deploy [horizon/deploy@4d1bdeb]: updating requirements.txt
  • 16:16 andrew@tin: Finished deploy [horizon/deploy@de72527]: scap debugging run (duration: 00m 24s)
  • 16:16 andrew@tin: Started deploy [horizon/deploy@de72527]: scap debugging run
  • 15:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T162807 (duration: 00m 55s)
  • 15:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:3311 - T162807 (duration: 00m 55s)
  • 15:28 marostegui: Stop replication in sync on db1089 and db1105:3311 - T162807
  • 15:23 moritzm: installing libtasn security updates
  • 15:02 reedy@tin: Synchronized php-1.31.0-wmf.20/extensions/AbuseFilter/maintenance/: Fix maintenance scripts (duration: 00m 56s)
  • 15:01 godog: roll-upgrade thumbor to 1.12 - T186500 T186594 T186492
  • 14:54 elukey: upload prometheus-burrow-exporter 0.0.4 on jessie/stretch-wikimedia
  • 14:51 ottomata: emitting IP field from varnishkafka-eventlogging instance T186833
  • 14:51 zeljkof: EU SWAT finished
  • 14:47 filippo@neodymium: conftool action : set/pooled=no; selector: name=mw1227.eqiad.wmnet
  • 14:44 addshore@tin: Finished scap: T186612 gerrit:409063 TwoColConflict wmf.20 (Remove hint and link from twoColConflict-beta-feature-description) (duration: 19m 56s)
  • 14:44 andrew@tin: Finished deploy [horizon/deploy@de72527]: just checking that this still doesn't work (duration: 00m 04s)
  • 14:44 andrew@tin: Started deploy [horizon/deploy@de72527]: just checking that this still doesn't work
  • 14:38 moritzm: uploading cassandra 3.11.0-wmf5 to component/cassandra311 for stretch-wikimedia/apt.wikimedia.org (T186619)
  • 14:24 addshore@tin: Started scap: T186612 gerrit:409063 TwoColConflict wmf.20 (Remove hint and link from twoColConflict-beta-feature-description)
  • 14:22 otto@tin: Finished deploy [eventlogging/analytics@01d5761]: T186833 (duration: 00m 04s)
  • 14:22 otto@tin: Started deploy [eventlogging/analytics@01d5761]: T186833
  • 14:20 godog: grant group write for wikidev on tin on /srv/mediawiki-staging/php-1.31.0-wmf.20/.git
  • 13:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311 - T162807 (duration: 01m 06s)
  • 13:11 marostegui: Deploy schema change on db2084 and db2075 - T185128 T153182
  • 12:03 moritzm: upgrading jessie-based servers in deployment-prep/beta to the HHVM build using ICU 57 (component/icu57)
  • 11:15 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 58s)
  • 11:14 jdrewniak@tin: Synchronized portals/prod/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 57s)
  • 10:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 - T162807 (duration: 00m 55s)
  • 10:07 marostegui: Stop replication in sync on db1089 and db1066 - T162807
  • 10:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 - T162807 (duration: 00m 55s)
  • 09:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T162807 (duration: 00m 55s)
  • 09:51 elukey: reboot mw1302 (hhvm defunct processes, hangs registered in dmesg, very high load)
  • 09:46 marostegui: Stop replication in sync on db1089 and db1067 - T162807
  • 09:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 - T162807 (duration: 00m 56s)
  • 09:29 moritzm: installing libdatetime-timezone-perl SUA update
  • 09:25 godog: install swift stretch updates on ms-be eqiad - T177739
  • 09:19 marostegui: Deploy schema change on s5 - T185128 T153182
  • 09:05 marostegui: Stop replication in sync on db1089 and db2048 - T162807
  • 09:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T162807 (duration: 00m 55s)
  • 08:57 moritzm: installing glibc security updates on trusty (harmless in our environment; CVE-2018-1000001 is non-exploitable due to disabled unprivileged user name spaces)
  • 08:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1087 - T184599 (duration: 00m 55s)
  • 08:36 marostegui: Reboot db1087 to pick new kernel
  • 08:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1092, depool db1087 - T184599 (duration: 00m 55s)
  • 08:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3318, depool db1092 - T184599 (duration: 00m 55s)
  • 08:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3318, depool db1099:3318 - T184599 (duration: 00m 55s)
  • 08:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1104, depool db1101:3318 - T184599 (duration: 00m 55s)
  • 08:01 hashar: Upgrading CI Jenkins plugins
  • 07:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1109, depool db1104 - T184599 (duration: 00m 55s)
  • 07:46 moritzm: installing exim security updates on remaining hosts
  • 07:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1109 - T184599 (duration: 00m 55s)
  • 07:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1109 - T184599 (duration: 00m 55s)
  • 07:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1109 - T184599 (duration: 00m 55s)
  • 06:53 marostegui: Reboot db1109 to pick up new kernel
  • 06:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1109 - T184599 (duration: 00m 56s)
  • 06:40 marostegui: Drop dewiki database from s8 servers - T184599
  • 02:38 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.20) (duration: 11m 40s)

2018-02-11

  • 14:06 moritzm: installing exim4 security updates on MXs

2018-02-10

  • 16:51 legoktm@tin: Synchronized php-1.31.0-wmf.20/includes/specials/SpecialLog.php: SpecialLog: Fix results when no offender is specified - T186950 (duration: 00m 57s)
  • 01:10 demon@tin: Finished deploy [gerrit/gerrit@5d5193e]: one last gitiles rebuild before the weekend (duration: 00m 10s)
  • 01:10 demon@tin: Started deploy [gerrit/gerrit@5d5193e]: one last gitiles rebuild before the weekend

2018-02-09

  • 23:28 tgr@tin: Synchronized php-1.31.0-wmf.20/extensions/MobileFrontend/includes/api/ApiMobileView.php: emergency fix for T186927 (now includes actual code change!) (duration: 00m 55s)
  • 23:26 tgr@tin: Synchronized php-1.31.0-wmf.20/extensions/TextExtracts/includes/ApiQueryExtracts.php: emergency fix for T186927 (now includes actual code change!) (duration: 00m 55s)
  • 23:01 jynus: restart haproxy on dbproxy1005
  • 22:47 halfak@tin: Finished deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901 (trying again again) (duration: 00m 03s)
  • 22:47 halfak@tin: Started deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901 (trying again again)
  • 22:45 tgr@tin: Synchronized php-1.31.0-wmf.20/extensions/TextExtracts/includes/ApiQueryExtracts.php: emergency fix for T186927 (duration: 00m 55s)
  • 22:43 tgr@tin: Synchronized php-1.31.0-wmf.20/extensions/MobileFrontend/includes/api/ApiMobileView.php: emergency fix for T186927 (duration: 00m 55s)
  • 22:42 halfak@tin: Finished deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901 (trying again) (duration: 00m 40s)
  • 22:42 tgr@tin: Synchronized php-1.31.0-wmf.20/includes/parser/ParserOutput.php: emergency fix for T186927 (duration: 00m 57s)
  • 22:42 halfak@tin: Started deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901 (trying again)
  • 22:36 halfak@tin: Finished deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901 (duration: 09m 59s)
  • 22:26 halfak@tin: Started deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901
  • 22:10 andrew@tin: Finished deploy [horizon/deploy@de72527]: Doing this while halfaker watches, again (duration: 00m 03s)
  • 22:10 andrew@tin: Started deploy [horizon/deploy@de72527]: Doing this while halfaker watches, again
  • 22:08 andrew@tin: Finished deploy [horizon/deploy@de72527]: Doing this while halfaker watches (duration: 00m 17s)
  • 22:08 andrew@tin: Started deploy [horizon/deploy@de72527]: Doing this while halfaker watches
  • 21:40 andrew@tin: Finished deploy [horizon/deploy@de72527]: At this point I'm just hoping scap will really deploy the wheels on my second try (duration: 00m 14s)
  • 21:40 andrew@tin: Started deploy [horizon/deploy@de72527]: At this point I'm just hoping scap will really deploy the wheels on my second try
  • 21:28 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/AbuseFilter/includes/api/ApiQueryAbuseLog.php: T186914 (duration: 00m 54s)
  • 21:20 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/Flow/includes/Block/TopicList.php: T186911 (duration: 00m 55s)
  • 21:10 ejegg@tin: Synchronized php-1.31.0-wmf.20/extensions/CentralNotice/CentralNoticePageLogPager.php: Sync CentralNotice for banner content log fix (duration: 00m 56s)
  • 20:12 demon@tin: Synchronized php-1.31.0-wmf.20/includes/user/User.php: Avoid pointless DB_MASTER connections in User::clearSharedCache() (duration: 00m 55s)
  • 20:08 demon@tin: Synchronized php-1.31.0-wmf.20/includes/libs/rdbms/loadbalancer/LoadBalancer.php: Catch Error exceptions in MediaWiki::run() (duration: 00m 55s)
  • 20:07 demon@tin: Synchronized php-1.31.0-wmf.20/includes/MediaWiki.php: Catch Error exceptions in MediaWiki::run() (duration: 00m 57s)
  • 19:16 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/Scribunto/common/Hooks.php: silence divide by zero / no such index 0 errors (duration: 00m 56s)
  • 18:31 demon@tin: rebuilt and synchronized wikiversions files: group2 to wmf.20
  • 18:12 demon@tin: Synchronized php-1.31.0-wmf.20/includes/filerepo/file/LocalFile.php: Fix CommentStore->createComment() call in LocalFile.php (duration: 01m 12s)
  • 18:08 bblack: cp4023: after a brief period of levelling off a bit: sharp, steep recovery of mbox lag ramp back to ~6K. not sure if this is a new floor or will drop further, but seems pretty ok.
  • 18:03 bblack: cp4023: now seems to be leveling off on lag and decreasing objhdr locks. either expiry thread prio helped (which argues for our prio-related patches) or it was naturally going to end?
  • 17:44 bblack: cp4023: experimental, "renice -19 39007" (backend cache-timeout aka expiry thread), to see if mbox lag resolves on its own quicker
  • 17:19 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.20
  • 16:53 andrew@tin: Finished deploy [horizon/deploy@de72527]: Rolling out pyldap wheel (duration: 02m 26s)
  • 16:51 andrew@tin: Started deploy [horizon/deploy@de72527]: Rolling out pyldap wheel
  • 16:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T162807 (duration: 01m 12s)
  • 16:29 demon@tin: Finished deploy [gerrit/gerrit@7ca3b02]: no-op to gerrit: deploying scap config change (duration: 00m 10s)
  • 16:29 demon@tin: Started deploy [gerrit/gerrit@7ca3b02]: no-op to gerrit: deploying scap config change
  • 15:49 akosiaris: upgrade etherpad.wikimedia.org to 1.6.3-1 T186866
  • 15:47 akosiaris: upgrade etherpad.wikimedia.org to 1.6.3-1
  • 15:47 akosiaris: upload etherpad-lite 1.6.3-1 to apt.wikimedia.org/jessie-wikimedia/main T186866
  • 15:00 herron: upgraded mailman on fermium for security updates
  • 14:24 demon@tin: Synchronized php-1.31.0-wmf.20/tests/phpunit/includes/db/LBFactoryTest.php: no-op to prior (duration: 01m 12s)
  • 13:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T162807 (duration: 01m 12s)
  • 13:33 demon@tin: Finished deploy [gerrit/gerrit@9c0acf6]: updating gitiles plugin (duration: 00m 10s)
  • 13:33 demon@tin: Started deploy [gerrit/gerrit@9c0acf6]: updating gitiles plugin
  • 10:51 marostegui: Stop replication in sync on db1067 and db1089 - T162807
  • 10:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 and db1067 for data checksumming - T162807 (duration: 01m 11s)
  • 10:36 moritzm: uploaded php-luasandbox 2.0.14~stretch2 for stretch-wikimedia to apt.wikimedia.org (this removes the php-luasandbox binary from our internal luasandbox build in favour of the php-luasandbox package maintained by legoktm from stretch-backports). As such, the php-luasandbox source package we build internally now only provides the HHVM extension (and we can retire it entirely when migrating to PHP7)
  • 10:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1080 - T162807 (duration: 01m 11s)
  • 10:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1080 - T162807 (duration: 01m 12s)
  • 09:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1080 - T162807 (duration: 01m 11s)
  • 09:06 marostegui: Fix data drifts on db1067 - T162807
  • 08:45 demon@tin: Synchronized wmf-config/: rm cleanchanges (duration: 01m 14s)
  • 08:44 demon@tin: Synchronized multiversion/submodules.json: rm CleanChanges (duration: 01m 13s)
  • 07:57 marostegui: Stop replication on labsdb1004 to fix replication issues
  • 07:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1080 - T162807 (duration: 01m 11s)
  • 07:39 elukey: forced remount of /mnt/hdfs on stat1005
  • 06:52 marostegui: Fix replication on labsdb1010 - T186579
  • 06:47 marostegui: Fix data drifts, upgrade kernel, mariadb and socket path on db1080 - T162807
  • 06:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1080 - T162807 (duration: 01m 12s)
  • 02:41 andrew@tin: Finished deploy [horizon/deploy@60cac8e]: updating with designate dashboard (duration: 02m 42s)
  • 02:38 andrew@tin: Started deploy [horizon/deploy@60cac8e]: updating with designate dashboard
  • 00:18 demon@tin: rebuilt and synchronized wikiversions files: surprise, it broke. revert group1 back to wmf.20
  • 00:16 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.20 *duck and cover*

2018-02-08

  • 23:49 ppchelko@tin: Finished deploy [restbase/deploy@c0f0dcd]: Fix a type that prevented the mobile partial content to have an etag (duration: 15m 44s)
  • 23:33 ppchelko@tin: Started deploy [restbase/deploy@c0f0dcd]: Fix a type that prevented the mobile partial content to have an etag
  • 22:37 bsitzmann@tin: Finished deploy [mobileapps/deploy@75a2ebb]: Update mobileapps to e93ab95 (duration: 05m 07s)
  • 22:32 bsitzmann@tin: Started deploy [mobileapps/deploy@75a2ebb]: Update mobileapps to e93ab95
  • 22:17 demon@tin: rebuilt and synchronized wikiversions files: mw.org back to wmf.20
  • 22:08 XioNoX: rebooting cr1-eqsin
  • 21:59 andrew@tin: Finished deploy [horizon/deploy@5e53829]: updating with designate dashboard -- take... six, I guess? (duration: 00m 03s)
  • 21:58 andrew@tin: Started deploy [horizon/deploy@5e53829]: updating with designate dashboard -- take... six, I guess?
  • 21:53 ottomata: finished upgrade of scb to librdkafka 0.11 and node-rdkafka 2
  • 21:49 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Finally we're there (duration: 00m 49s)
  • 21:49 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Finally we're there
  • 21:48 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Finally we're there (duration: 00m 35s)
  • 21:48 ppchelko@tin: Started deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Finally we're there
  • 21:48 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: upgrade to librdkafka 0.11 node-rdkafka 2 (duration: 00m 46s)
  • 21:47 otto@tin: Started deploy [eventstreams/deploy@7629e16]: upgrade to librdkafka 0.11 node-rdkafka 2
  • 21:40 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary (duration: 00m 15s)
  • 21:40 ppchelko@tin: Started deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary
  • 21:40 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary (duration: 00m 04s)
  • 21:40 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary
  • 21:39 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary (duration: 00m 47s)
  • 21:38 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: upgrade to librdkafka 0.11 node-rdkafka 2 (duration: 00m 24s)
  • 21:38 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary
  • 21:38 otto@tin: Started deploy [eventstreams/deploy@7629e16]: upgrade to librdkafka 0.11 node-rdkafka 2
  • 21:32 herron: restarted rsyslogd services on lithium and wezen to clear rsyslog tls listener on port 6514 icinga alerts
  • 21:23 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary (duration: 00m 54s)
  • 21:23 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: (no justification provided) (duration: 01m 03s)
  • 21:23 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary
  • 21:22 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary (duration: 00m 25s)
  • 21:22 ppchelko@tin: Started deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary
  • 21:22 otto@tin: Started deploy [eventstreams/deploy@7629e16]: (no justification provided)
  • 21:13 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary (duration: 00m 13s)
  • 21:12 ppchelko@tin: Started deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary
  • 21:11 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary (duration: 00m 45s)
  • 21:10 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary
  • 21:09 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: (no justification provided) (duration: 00m 21s)
  • 21:09 otto@tin: Started deploy [eventstreams/deploy@7629e16]: (no justification provided)
  • 21:01 andrew@tin: Finished deploy [horizon/deploy@5e53829]: updating with designate dashboard -- take five (duration: 01m 25s)
  • 21:00 andrew@tin: Started deploy [horizon/deploy@5e53829]: updating with designate dashboard -- take five
  • 20:52 andrew@tin: Finished deploy [horizon/deploy@7d4a2d9]: updating with designate dashboard -- take four (duration: 01m 36s)
  • 20:50 andrew@tin: Started deploy [horizon/deploy@7d4a2d9]: updating with designate dashboard -- take four
  • 20:34 ppchelko@tin: Started restart [changeprop/deploy@5fdc03a]: Restart CP to force rule rebalance
  • 20:27 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary (duration: 00m 46s)
  • 20:26 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary
  • 20:26 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary (duration: 00m 13s)
  • 20:26 ppchelko@tin: Started deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary
  • 20:24 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: (no justification provided) (duration: 00m 22s)
  • 20:24 otto@tin: Started deploy [eventstreams/deploy@7629e16]: (no justification provided)
  • 20:20 ottomata: starting deploy process to update scb cluster to librdkafka 0.11 and node-rdkafka 2. we will depool, stop puppet, deploy, test, start puppet on each node
  • 20:03 no_justification: gerrit: killed about 12 parallel clones of mediawiki/extensions/Math that had been running between 2-3 days (wtf?)
  • 19:24 catrope@tin: Synchronized php-1.31.0-wmf.20/extensions/Flow/includes/Model/AbstractRevision.php: T186077 (duration: 01m 11s)
  • 19:19 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable TemplateStyles on svwiki (T176082) (duration: 01m 11s)
  • 19:17 catrope@tin: Synchronized php-1.31.0-wmf.20/extensions/Campaigns/CampaignsSecondaryAuthenticationProvider.php: T185870 (duration: 01m 13s)
  • 19:02 bsitzmann@tin: Finished deploy [mobileapps/deploy@541a7f7]: Update mobileapps to e6fbc94 (duration: 08m 21s)
  • 19:00 bblack: lvs@ulsfo - all back to normal
  • 18:55 bblack: lvs@ulsfo - puppet disabled, trying tagged vlan deploy
  • 18:54 bsitzmann@tin: Started deploy [mobileapps/deploy@541a7f7]: Update mobileapps to e6fbc94
  • 18:38 arlolra: Updated Parsoid to 961a5cf (T186630)
  • 18:27 arlolra@tin: Finished deploy [parsoid/deploy@1367057]: Updating Parsoid to 961a5cf (duration: 08m 11s)
  • 18:26 andrew@tin: Finished deploy [horizon/deploy@9e9d458]: updating with designate dashboard -- take three (duration: 01m 16s)
  • 18:25 andrew@tin: Started deploy [horizon/deploy@9e9d458]: updating with designate dashboard -- take three
  • 18:19 arlolra@tin: Started deploy [parsoid/deploy@1367057]: Updating Parsoid to 961a5cf
  • 18:10 ema: upgrade cp2026 to varnish 5
  • 17:55 bblack@neodymium: conftool action : set/pooled=yes; selector: name=dns400[12].wikimedia.org
  • 17:21 akosiaris: repool sca1004 (zotero) for T181121
  • 17:21 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: sca1004.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=sca', 'service=zotero'])
  • 17:16 ema: upgrade cp2024 to varnish 5
  • 16:58 ema: upgrade cp2022 to varnish 5
  • 16:39 moritzm: installing PHP7 security updates
  • 16:32 moritzm: installing mysql security updates on auth*
  • 16:31 ema: upgrade cp2020 to varnish 5
  • 16:30 bblack: puppet disabled on all ntp servers for initial ulsfo recdns/ntp config process
  • 16:25 bblack: puppet disabled on lvs400[67] for initial ulsfo recdns config process
  • 16:23 elukey: stop archiva on meitnerium to swap /var/lib/archiva from the root partition to a new separate one - T186020
  • 16:20 akosiaris: reboot ganeti1005 T181121
  • 16:18 akosiaris: depool sca1004 (zotero) for T181121
  • 16:17 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: sca1004.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=sca', 'service=zotero'])
  • 16:13 bblack: rebooting dns400[12] (downtimed, currently spare::system)
  • 16:13 ema: upgrade cp2017 to varnish 5
  • 16:11 andrew@tin: Finished deploy [horizon/deploy@9af532a]: updating with designate dashboard -- take two (duration: 01m 24s)
  • 16:10 andrew@tin: Started deploy [horizon/deploy@9af532a]: updating with designate dashboard -- take two
  • 16:05 bblack: ntp servers back to normal
  • 16:04 andrew@tin: Finished deploy [horizon/deploy@2f176e2]: updating with designate dashboard (duration: 01m 11s)
  • 16:03 andrew@tin: Started deploy [horizon/deploy@2f176e2]: updating with designate dashboard
  • 15:57 ema: upgrade cp2014 to varnish 5
  • 15:48 moritzm: installing libio-socket-ssl-perl update from jessie point release
  • 15:47 bblack: disabling puppet on all global dns recursors for controlled config deploy
  • 15:35 ema: upgrade cp2011 to varnish 5
  • 15:18 ema: upgrade cp2008 to varnish 5
  • 15:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1073 - T162807 (duration: 01m 12s)
  • 14:59 moritzm: installing icu security updates from jessie/stretch point releases
  • 14:56 ema: upgrade cp2005 to varnish 5
  • 14:49 akosiaris: migrate all running VMs off ganeti1005 T181121
  • 14:47 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Setting wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH on meta and mediawiki.org (duration: 01m 12s)
  • 14:43 zeljkof: EU SWAT finished
  • 14:31 moritzm: upgrading deployment-mediawiki04 to HHVM linked against ICU 57
  • 14:23 ema: upgrade cp2002 to varnish 5
  • 13:54 marostegui: Rename dewiki tables on s8 slaves - T184599
  • 13:53 ariel@tin: Finished deploy [dumps/dumps@9b7841f]: make sure all hashes appear in dumpstatus file, T185454 (duration: 00m 02s)
  • 13:53 ariel@tin: Started deploy [dumps/dumps@9b7841f]: make sure all hashes appear in dumpstatus file, T185454
  • 13:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1073 with low weight - T162807 (duration: 01m 11s)
  • 13:41 marostegui: Drop dewiki already renamed tables and database on s8 master (db1071) - T184599
  • 13:22 marostegui: Fixing data drifts on db1073, also upgrade kernel, socket location and mysql - T162807
  • 13:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073 - T162807 (duration: 01m 12s)
  • 13:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 - T184599 (duration: 01m 12s)
  • 13:09 moritzm: upgrade deployment servers and script runners to HHVM 3.18.7
  • 13:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 - T184599 (duration: 01m 11s)
  • 13:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1110 - T184599 (duration: 01m 11s)
  • 13:02 moritzm: upgrade mwdebug servers to HHVM 3.18.7
  • 12:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1110 - T184599 (duration: 01m 11s)
  • 12:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 - T184599 (duration: 01m 11s)
  • 12:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 - T184599 (duration: 01m 11s)
  • 12:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1100 - T184599 (duration: 01m 11s)
  • 12:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1100 - T184599 (duration: 01m 11s)
  • 11:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1097:3315 - T184599 (duration: 01m 11s)
  • 11:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097:3315 - T184599 (duration: 01m 11s)
  • 11:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1096:3315 - T184599 (duration: 01m 11s)
  • 11:37 marostegui: Fix replication on labsdb1010 - T186579
  • 11:33 akosiaris: reboot ganeti1005 T181121
  • 11:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3315 - T184599 (duration: 01m 11s)
  • 11:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1082 (duration: 01m 11s)
  • 11:12 akosiaris: migrate all running VMs off ganeti1005 T181121
  • 11:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 (duration: 01m 12s)
  • 11:00 marostegui: Drop wikidata renamed tables and database from s5 eqiad hosts - T184599
  • 10:07 marostegui: Drop deleted databases from sanitarium and labsdb hosts - T186685
  • 10:07 moritzm: upgrading remaining nginx-full packages on mw* in eqiad to 1.13.6-2+wmf1~jessie1
  • 08:07 moritzm: upgrade remaining app servers to HHVM 3.18.7
  • 07:27 _joe_: depooled mw1256 from traffic, scap (faulty disk, T186535); now powering it off
  • 02:26 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 58s)
  • 02:20 eileen: Update CiviCRM civicrm revision changed from 71b1e35b99 to 61acc9175e (deploy citibank, benevity import updates)
  • 01:30 andrew@tin: Started deploy [horizon/deploy@9223ba7]: Now with static content, I hope
  • 01:30 andrew@tin: Finished deploy [horizon/deploy@9223ba7]: Now with static content, I hope (duration: 01m 15s)
  • 01:29 andrew@tin: Started deploy [horizon/deploy@9223ba7]: Now with static content, I hope
  • 00:35 ebernhardson@tin: Synchronized php-1.31.0-wmf.20/extensions/VisualEditor/: Revert "Use wgEditSubmitButtonLabelPublish from upstream", Assume wpTextbox1 has an API registered already (duration: 01m 12s)
  • 00:33 ebernhardson@tin: Synchronized php-1.31.0-wmf.20/extensions/CirrusSearch/: T186765: Add special handling for profiles into config dump (duration: 01m 27s)

2018-02-07

  • 23:59 mutante: restarted icinga-wm, too quiet
  • 21:53 ebernhardson: mwdebug1001 back to standard deployed versions
  • 21:51 bsitzmann@tin: Finished deploy [mobileapps/deploy@fe3cd60]: Update mobileapps to 7a3b19c (T186745 T186643) (duration: 06m 41s)
  • 21:44 bsitzmann@tin: Started deploy [mobileapps/deploy@fe3cd60]: Update mobileapps to 7a3b19c (T186745 T186643)
  • 21:40 otto@tin: Finished deploy [eventstreams/deploy@ee854df]: (no justification provided) (duration: 00m 02s)
  • 21:40 otto@tin: Started deploy [eventstreams/deploy@ee854df]: (no justification provided)
  • 21:39 otto@tin: Finished deploy [eventstreams/deploy@ee854df]: (no justification provided) (duration: 00m 02s)
  • 21:39 otto@tin: Started deploy [eventstreams/deploy@ee854df]: (no justification provided)
  • 21:33 mlitn@tin: Finished deploy [3d2png/deploy@8135c2d]: Updating 3d2png (duration: 03m 55s)
  • 21:29 mlitn@tin: Started deploy [3d2png/deploy@8135c2d]: Updating 3d2png
  • 21:27 ebernhardson: deploying wmf.20 to en* (except enwiki) on mwdebug1001 to debug new cirrus errors in wmf.20/wmf.19 mixed sister search
  • 21:13 andrew@tin: Finished deploy [horizon/deploy@9773454]: This isn't working and I'm going to do this 1000 times (duration: 01m 24s)
  • 21:12 andrew@tin: Started deploy [horizon/deploy@9773454]: This isn't working and I'm going to do this 1000 times
  • 21:07 demon@tin: rebuilt and synchronized wikiversions files: mw.org also back to wmf.17
  • 21:06 andrew@tin: Finished deploy [horizon/deploy@9773454]: (no justification provided) (duration: 00m 03s)
  • 21:06 andrew@tin: Started deploy [horizon/deploy@9773454]: (no justification provided)
  • 21:04 andrew@tin: Finished deploy [horizon/deploy@9773454]: (no justification provided) (duration: 02m 38s)
  • 21:01 andrew@tin: Started deploy [horizon/deploy@9773454]: (no justification provided)
  • 21:01 andrew@tin: Started deploy [horizon/deploy@9773454]: (no justification provided)
  • 21:01 andrew@tin: Finished deploy [horizon/deploy@9773454]: (no justification provided) (duration: 00m 44s)
  • 21:00 andrew@tin: Started deploy [horizon/deploy@9773454]: (no justification provided)
  • 21:00 andrew@tin: Finished deploy [horizon/deploy@9773454]: (no justification provided) (duration: 00m 05s)
  • 21:00 andrew@tin: Started deploy [horizon/deploy@9773454]: (no justification provided)
  • 20:39 demon@tin: rebuilt and synchronized wikiversions files: revert, huge spike in db lag
  • 20:36 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.20
  • 19:47 ejegg: updated SmashPig from 1f56978c0c to 1ebee97a45
  • 19:43 ejegg: updated payments-wiki from 39a7ef32e5 to fe311c2d26
  • 19:16 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add NS_MAIN to $wgNamespacesWithSubpages for cawikimedia T185436 (duration: 01m 12s)
  • 19:11 chasemp: after conversation with andrew we moved labweb to public for T186729
  • 19:09 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Rename Project NS on Wikimedia Canada Chapter wiki T185661 (duration: 01m 11s)
  • 18:55 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove old "accountcreator" rules now handled by default T185417 T186462 (duration: 01m 12s)
  • 18:16 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Tidy: Re-do this as a sorted negative list that gets shorter over time (duration: 01m 13s)
  • 18:07 jynus: fixing ferm breakage by restarting the service on db1051
  • 17:38 awight: ORES celery workers restarted on scb100[1-4]
  • 16:53 legoktm@tin: Synchronized php-1.31.0-wmf.20/includes/http/MWHttpRequest.php: MWHttpRequest: Restore ability to pass null for $options - https://gerrit.wikimedia.org/r/408718 (Unbreak ExtensionDistributor) (duration: 01m 12s)
  • 16:47 gehel: upgrade of tilerator / kartotherian on maps eqiad completed, sorry for the noise...
  • 16:46 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 21s)
  • 16:46 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:44 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 18s)
  • 16:44 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:43 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 00m 24s)
  • 16:42 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:39 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 18s)
  • 16:39 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:38 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 00m 24s)
  • 16:38 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:37 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 17s)
  • 16:37 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:34 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:31 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 20s)
  • 16:31 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:30 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 01m 17s)
  • 16:28 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:28 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1001.eqiad.wmnet
  • 16:27 gehel: upgrading tilerator / kartotherian on maps eqiad
  • 16:00 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: mw1271.eqiad.wmnet
  • 14:37 moritzm: installing poppler security updates
  • 14:33 zeljkof: EU SWAT finished
  • 14:30 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Updates to enable transliteration for crhwiki (T23582) (duration: 01m 11s)
  • 14:18 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add "Portal" namespace on it.wikiquote (T185232) (duration: 01m 13s)
  • 14:05 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 01m 47s)
  • 14:03 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:58 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: (no justification provided) (duration: 03m 02s)
  • 13:55 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: (no justification provided)
  • 13:38 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 02m 45s)
  • 13:36 moritzm: installing p7zip security updates
  • 13:35 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:35 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 01m 21s)
  • 13:34 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:33 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 00m 06s)
  • 13:32 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:20 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 00m 55s)
  • 13:19 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:18 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 01m 22s)
  • 13:17 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:16 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: (no justification provided)
  • 13:16 marostegui: Drop wikidata tables and database from s5 codfw hosts - T184599
  • 13:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1089 (duration: 01m 11s)
  • 12:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1089 weight (duration: 01m 11s)
  • 12:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 with low weight (duration: 01m 40s)
  • 11:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1069 - T186321 (duration: 01m 11s)
  • 11:09 elukey: install libc6-dbg on phab1001 to get a more precise gdb stack trace - T182832
  • 11:04 marostegui: Stop MySQL on db1069 for MySQL upgrade, kernel upgrade and change binlog format to statement - T186321
  • 10:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1069 - T186321 (duration: 01m 09s)
  • 09:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1051 after the BBU change - T186049 (duration: 01m 14s)
  • 09:41 kartik@tin: Finished deploy [cxserver/deploy@eabb6d7]: Update cxserver to e164ead and Matxin MT deployment (T184901) (duration: 03m 44s)
  • 09:38 marostegui: Failover back labsdb1010 to labsdb1009 - T174569
  • 09:37 kartik@tin: Started deploy [cxserver/deploy@eabb6d7]: Update cxserver to e164ead and Matxin MT deployment (T184901)
  • 09:18 marostegui: Failover labsdb1009 to labsdb1010 - T174569
  • 09:16 marostegui: Failover back labsdb1010 to labsdb1011 - T174569
  • 09:05 marostegui: Failover labsdb1011 to labsdb1010 - T174569
  • 08:43 marostegui: Change triggers for s3 on db1095 - T174569
  • 08:21 marostegui: Change triggers for s1 on db1095 - T174569
  • 08:11 marostegui: Change triggers for s5 on db1095 - T174569
  • 07:53 marostegui: Change triggers for s8 on db1095 - T174569
  • 07:17 marostegui: Change triggers for s7 on db1102 - T174569
  • 07:05 marostegui: Change triggers for s6 on db1102 - T174569
  • 07:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Start repooling db1051 after the BBU change - T186049 (duration: 01m 15s)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 06m 34s)
  • 01:14 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable fine grained usage tracking, another batch. (T186645) (duration: 01m 11s)
  • 01:05 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable AICaptcha data collection everywhere (T186244) (duration: 01m 11s)
  • 00:45 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Support fallback values for referrer policy (T180921) (duration: 01m 12s)
  • 00:31 ladsgroup@tin: Synchronized php-1.31.0-wmf.20/includes/http/MWHttpRequest.php: MWHttpRequest: Restore ability to pass null for $options (duration: 01m 11s)
  • 00:28 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable RemexHtml on wikis with < 10 errors in all high-priority categories (T184656) (duration: 01m 09s)

2018-02-06

  • 23:02 andrew@tin: Finished deploy [horizon/deploy@48c51e9]: (no justification provided) (duration: 00m 04s)
  • 23:02 andrew@tin: Started deploy [horizon/deploy@48c51e9]: (no justification provided)
  • 23:00 andrew@tin: Finished deploy [horizon/deploy@48c51e9]: (no justification provided) (duration: 00m 03s)
  • 23:00 andrew@tin: Started deploy [horizon/deploy@48c51e9]: (no justification provided)
  • 22:56 andrew@tin: Finished deploy [horizon/deploy@48c51e9]: (no justification provided) (duration: 02m 45s)
  • 22:53 andrew@tin: Started deploy [horizon/deploy@48c51e9]: (no justification provided)
  • 22:42 ejegg: updated SmashPig standalone from 778e8f87b4 to 1f56978c0c
  • 22:23 hashar: Zuul/CI seems to work all fine now
  • 21:49 hashar: Flushing Zuul queue and upgrading to zuul_2.5.1-wmf2 | T186381
  • 21:49 hashar: Flushing Zuul queue and upgrading
  • 21:41 hashar: Going to shutdown Zuul in a few for an emergency hotfix | T186381
  • 21:35 andrew@tin: Finished deploy [horizon/deploy@a316e45]: (no justification provided) (duration: 01m 00s)
  • 21:34 andrew@tin: Started deploy [horizon/deploy@a316e45]: (no justification provided)
  • 21:14 legoktm: restarted zuul due to patch being stuck (T186381)
  • 20:25 andrew@tin: Finished deploy [horizon/deploy@fbf761e]: (no justification provided) (duration: 01m 21s)
  • 20:23 andrew@tin: Started deploy [horizon/deploy@fbf761e]: (no justification provided)
  • 20:13 demon@tin: rebuilt and synchronized wikiversions files: group0 to wmf.20
  • 20:11 demon@tin: Synchronized php: symlink swap (duration: 01m 17s)
  • 19:25 hashar: Restarted Zuul due to T186381
  • 18:55 demon@tin: Finished scap: bootstrap wmf.20 @ testwiki (duration: 26m 09s)
  • 18:55 mlitn@tin: Finished deploy [3d2png/deploy@8135c2d]: Updating 3d2png repo (duration: 00m 15s)
  • 18:55 mlitn@tin: Started deploy [3d2png/deploy@8135c2d]: Updating 3d2png repo
  • 18:47 arlolra: Updated Parsoid to 8a0ff6c (T183515, T129372, T181408)
  • 18:46 mlitn@tin: Finished deploy [3d2png/deploy@8135c2d]: Updating 3d2png repo (duration: 06m 23s)
  • 18:40 mlitn@tin: Started deploy [3d2png/deploy@8135c2d]: Updating 3d2png repo
  • 18:39 arlolra@tin: Finished deploy [parsoid/deploy@211ea5d]: Updating Parsoid to 8a0ff6c (duration: 03m 47s)
  • 18:35 arlolra@tin: Started deploy [parsoid/deploy@211ea5d]: Updating Parsoid to 8a0ff6c
  • 18:29 demon@tin: Started scap: bootstrap wmf.20 @ testwiki
  • 18:22 demon@tin: Pruned MediaWiki: 1.31.0-wmf.16 (duration: 07m 29s)
  • 18:15 arlolra@tin: Started deploy [parsoid/deploy@211ea5d]: Updating Parsoid to 8a0ff6c
  • 16:56 elukey: restart httpd on phab1001
  • 16:50 gehel: upgrading kartotherian / tilerator on maps codfw completed
  • afk: restarting jenkins for updates
  • 16:41 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2004.codfw.wmnet
  • 16:41 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 31s)
  • 16:40 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:40 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 00m 36s)
  • 16:39 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:38 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2004.codfw.wmnet
  • 16:37 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2003.codfw.wmnet
  • 16:36 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 30s)
  • 16:36 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:35 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 01m 01s)
  • 16:34 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:32 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2003.codfw.wmnet
  • 16:30 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 31s)
  • 16:30 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:29 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 02m 34s)
  • 16:29 mutante: mw1262 started hhvm, it had Unhandled server exception: Class undefined: Psr\Log\LogLevel
  • 16:27 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:26 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2001.codfw.wmnet
  • 16:24 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 34s)
  • 16:24 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:17 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2001.codfw.wmnet
  • 16:15 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2001.codfw.wmnet
  • 16:14 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 00m 22s)
  • 16:14 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:11 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2001.codfw.wmnet
  • 16:10 gehel: upgrading kartotherian / tilerator on maps codfw
  • 15:36 elukey: drain + shutdown of analytics1038 to replace faulty BBU - T185409
  • 15:02 zeljkof: EU SWAT finished
  • 15:01 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow eliminators to undelete at urwiki (T185829) (duration: 00m 55s)
  • 14:53 marostegui: Poweroff db1051 for BBU replacement - T186049
  • 14:50 akosiaris: upgrade service-checker to 0.1.4 on scb1001
  • 14:45 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Typo, it's 2018 not 2017 (T185794) (duration: 00m 55s)
  • 14:39 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule (T186530) (duration: 00m 55s)
  • 14:35 chasemp: disable puppet on labs things for a cautious change rollout
  • 14:33 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Setting wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH on test wikis (duration: 00m 56s)
  • 14:28 marostegui: Changing triggers on s2 - T174569
  • 14:26 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RemexHtml on fiwiki, hewiki, ruwiki, svwiki (T185945) (duration: 00m 55s)
  • 14:14 mlitn@tin: Synchronized php-1.31.0-wmf.17/extensions/UploadWizard/resources/details/uw.DescriptionsDetailsWidget.js: T184380 (duration: 00m 55s)
  • 14:10 ladsgroup@tin: Synchronized wmf-config/Wikibase.php: Add entityUsageModifierLimits config for Wikibase (T185693) (duration: 00m 55s)
  • 14:07 urandom: re-enable smartpath on restbase1010 (revert experiment) - T178177
  • 13:35 gehel: upgrading prometheus-elasticsearch-exporter across all elasticsearch nodes
  • 12:32 marostegui: Power cycled dbstore1001 after it crashed - T186596
  • 11:54 marostegui: Sanitize s4 - T174569
  • 11:11 _joe_: forcing a resync of /dev/md1 on conf2001 to verify if the higher timeouts avoid consensus loss in etcd
  • 11:02 ema: restart pybal on codfw primary LVSs to make them reconnect to etcd
  • 11:01 ema: restart pybal on codfw secondary LVSs to make them reconnect to etcd
  • 10:57 ema: restart pybal on eqiad primary LVSs to make them reconnect to etcd
  • 10:55 ema: restart eqiad secondary LVSs to make them reconnect to etcd
  • 10:47 _joe_: rolling restart of the eqiad etcd cluster
  • 10:39 _joe_: rolling restart of the codfw cluster to pick up the config changes
  • 09:38 marostegui: Sanitizing s2 - T174569
  • 08:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1077 (duration: 00m 55s)
  • 08:21 elukey: rollback apache/httpd changes on phab1001 (restart required)
  • 08:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1077 weight (duration: 00m 55s)
  • 07:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1077 weight (duration: 00m 55s)
  • 07:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1077 with low weight (duration: 00m 53s)
  • 07:06 marostegui: Stop MySQL on db1077 for a full upgrade
  • 07:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1077 for MariaDB and kernel upgrade (duration: 00m 56s)
  • 06:49 marostegui: Fix replication on labsdb1010 - T186579
  • 03:32 demon@tin: Finished deploy [gerrit/gerrit@f25f017]: adding gitiles plugin (duration: 00m 10s)
  • 03:32 demon@tin: Started deploy [gerrit/gerrit@f25f017]: adding gitiles plugin
  • 03:17 foks: reset email for User:Andrewman327
  • 02:32 demon@tin: Synchronized tests/Defines.php: no op (duration: 00m 55s)
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 06m 15s)
  • 01:47 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable AICaptcha data collection on group0/group1 T186244 (duration: 00m 56s)
  • 00:25 thcipriani@tin: Synchronized static/images/mobile/copyright/wikipedia-wordmark-ps.svg: SWAT: Update the ps mobile wordmark T184442 (duration: 00m 55s)
  • 00:14 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Configure settings feedback link T182217 (duration: 00m 56s)

2018-02-05

  • 23:21 mutante: nihal - restarted puppetdb service
  • 23:07 mobrovac@tin: Finished deploy [citoid/deploy@7bbc583]: Fix TypeError bug - T186395 (duration: 03m 29s)
  • 23:04 mobrovac@tin: Started deploy [citoid/deploy@7bbc583]: Fix TypeError bug - T186395
  • 22:46 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: EventBus: Enable htmlCacheUpdate jobs for all projects - T182023 (duration: 00m 55s)
  • 22:45 mobrovac@tin: Synchronized wmf-config/jobqueue.php: EventBus: Enable htmlCacheUpdate jobs for all projects - T182023 (duration: 00m 56s)
  • 22:43 ppchelko@tin: Finished deploy [cpjobqueue/deploy@4543102]: Revert the switch to librdkafka 0.11 and enable htmlCacheUpdate (duration: 00m 54s)
  • 22:42 ppchelko@tin: Started deploy [cpjobqueue/deploy@4543102]: Revert the switch to librdkafka 0.11 and enable htmlCacheUpdate
  • 21:47 mholloway-shell@tin: Finished deploy [mobileapps/deploy@6cae404]: Update mobileapps to 3140b1a (duration: 06m 38s)
  • 21:47 ppchelko@tin: Finished deploy [cpjobqueue/deploy@aebfded]: Enable htmlCacheUpdate job for all wikis T182023 (duration: 02m 27s)
  • 21:45 chasemp: asw-b-codfw# rollback 0 pending questions on T183167
  • 21:45 ppchelko@tin: Started deploy [cpjobqueue/deploy@aebfded]: Enable htmlCacheUpdate job for all wikis T182023
  • 21:41 mholloway-shell@tin: Started deploy [mobileapps/deploy@6cae404]: Update mobileapps to 3140b1a
  • 21:07 tgr@tin: Finished scap: T186244 backporting patches and enabling AICaptcha data collection on testwiki (duration: 18m 24s)
  • 20:48 tgr@tin: Started scap: T186244 backporting patches and enabling AICaptcha data collection on testwiki
  • 19:44 demon@tin: Synchronized wmf-config/InitialiseSettings.php: collation for abwiki (duration: 00m 55s)
  • 19:32 demon@tin: Finished scap: adding collation for Abkhaz (duration: 05m 12s)
  • 19:27 demon@tin: Started scap: adding collation for Abkhaz
  • 19:26 demon@tin: Synchronized multiversion/MWWikiversions.php: drop php5.3 support (duration: 00m 56s)
  • 19:22 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ArticlePlaceholder extension for urwiki (duration: 00m 56s)
  • 19:05 elukey: executed 'echo '/srv/apache2_dump/core.%h.%e.%p.%t' > /proc/sys/kernel/core_pattern' on phab1001 - T182832
  • 18:57 ppchelko@tin: Finished deploy [restbase/deploy@44f2d2b]: Pass cache-control headers to /sys/mobileapps (duration: 14m 42s)
  • 18:42 ppchelko@tin: Started deploy [restbase/deploy@44f2d2b]: Pass cache-control headers to /sys/mobileapps
  • 18:37 mutante: added bstorm to acl*operations-team (project 29) on Phabricator (T185493)
  • 18:35 elukey: add 'ulimit -c unlimited' to /etc/default/apache2 to see if httpd's CoreDumpDirectory works properly on phab1001
  • 18:35 mutante: welcome new root shell user bstorm
  • 18:31 mutante: added bstorm to the 'wmf' and 'ops' LDAP groups (modify-ldap-groups on terbium) (T185493)
  • 18:30 ppchelko@tin: Finished deploy [restbase/deploy@55e9d87]: Enable ensure_content_type filter for mobile content (duration: 12m 04s)
  • 18:18 ppchelko@tin: Started deploy [restbase/deploy@55e9d87]: Enable ensure_content_type filter for mobile content
  • 18:07 gehel@tin: Finished deploy [wdqs/wdqs@d7eb899]: wdqs blazegraph + gui + updater upgrade (duration: 02m 36s)
  • 18:04 gehel@tin: Started deploy [wdqs/wdqs@d7eb899]: wdqs blazegraph + gui + updater upgrade
  • 17:52 ejegg: updated payments-wiki from 341cb573a1 to 39a7ef32e5
  • 17:38 mholloway-shell@tin: Finished deploy [mobileapps/deploy@d970b61]: Update mobileapps to 7a9fab3 (duration: 05m 45s)
  • 17:32 mholloway-shell@tin: Started deploy [mobileapps/deploy@d970b61]: Update mobileapps to 7a9fab3
  • 16:10 marostegui: Renaming wikidata tables on s5 on eqiad - T184599
  • 16:03 marostegui: Renaming wikidata tables on s5 on codfw - T184599
  • 15:54 mholloway-shell@tin: Finished deploy [mobileapps/deploy@c9c774e]: Update mobileapps to 1411ccb (duration: 06m 06s)
  • 15:48 mholloway-shell@tin: Started deploy [mobileapps/deploy@c9c774e]: Update mobileapps to 1411ccb
  • 15:26 elukey: temporarily setting CoreDumpDirectory to /srv/apache2_dump for httpd on phab1001 (+ httpd reload) to investigate core dumps for T182832
  • 14:46 hashar: European SWAT completed. I have not deployed matmarex patches to change Abkhaz collation ( https://gerrit.wikimedia.org/r/#/c/406185/ https://gerrit.wikimedia.org/r/#/c/406187/ )
  • 14:41 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ArticlePlaceholder for Estonian Wikipedia (etwiki) - T186107 (duration: 00m 55s)
  • 14:35 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Set wgNamespaceRobotPolicies on ptwiki's NS_USER to noindex - T185660 (duration: 00m 55s)
  • 14:31 hashar@tin: Synchronized wmf-config/throttle.php: Add throttle rules - T185794 T185811 (duration: 00m 55s)
  • 14:24 hashar@tin: Synchronized php-1.31.0-wmf.17/extensions/Flow/includes/Import/OptInController.php: OptInController catch both errors and exception - T184670 (duration: 00m 55s)
  • 14:22 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Fix typo in arwikibooks rollbacker group - T185720 (duration: 00m 56s)
  • 14:14 hashar@tin: Synchronized wmf-config/throttle.php: Add throttle rule for an event - T185930 (duration: 00m 55s)
  • 14:11 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Add 'rollbacker' group at arwikibooks - T185720 (duration: 00m 56s)
  • 13:20 marostegui: Rename dewiki tables on s8 master (db1071 - with no replication) before dropping them - T184599
  • 12:20 marostegui: Drop empty wikidata database from s5 master (db1070) - T184599
  • 12:17 marostegui: Drop old and renamed wikidata tables from s5 master (db1070) - T184599
  • 11:30 godog: expand smart metrics checking rollout with https://gerrit.wikimedia.org/r/#/c/403621/
  • 11:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Clarify db1078 comment as it is the new candidate master for s3 (duration: 00m 55s)
  • 11:04 hashar: Upgraded jenkins-debian-glue to 0.18.4-wmf1 | T186494
  • 11:03 elukey: restart eventlogging/forwarder legacy-zmq on eventlog1001 due to slow memory leak over time (cached memory down to zero)
  • 10:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1078 (duration: 00m 55s)
  • 09:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1078 traffic (duration: 00m 55s)
  • 09:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1078 traffic - T186321 (duration: 00m 55s)
  • 09:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1078 with low traffic - T186321 (duration: 00m 53s)
  • 08:44 marostegui: Stop MySQL on db1078, upgrade mariadb, kernel and socket location - T186321
  • 08:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1078 - T186321 (duration: 00m 55s)
  • 08:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T162807 (duration: 00m 56s)
  • 07:45 marostegui: Deploy schema change on s8 primary master (db1071) - T174569
  • 07:43 elukey: install libjson-c2-dbg on phab1001 to allow better debugging of httpd/mod-php stuck process - T182832
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 06m 03s)

2018-02-04

  • 22:40 elukey: restart aphlict.service on phab1001 to force it to pick up the new logfile (/var/log/aphlict/aphlict.log rather than the .log.1)
  • 06:18 _joe_: reduced raid resync speed on conf2* to 5000 KB/s
  • 04:33 _joe_: restarted etcdmirror on conf2002, failure caused by raid resyncs in codfw

2018-02-03

  • 03:55 legoktm: restarting zuul to drop 407165,3 from the queue

2018-02-02

  • 23:40 no_justification: gerrit: one last restart to try and force gerrit/phab session restart
  • 22:42 jynus: reloading m2 dbproxy
  • 22:08 no_justification: cobalt/gerrit2001: purged libbcprov-java libbcpkix-java, cleaned up old symlinks
  • 21:45 demon@tin: Finished deploy [gerrit/gerrit@98f5d9a]: Gerrit 2.14.6 (duration: 00m 14s)
  • 21:45 demon@tin: Started deploy [gerrit/gerrit@98f5d9a]: Gerrit 2.14.6
  • 21:42 no_justification: cobalt: disabling puppet so it doesn't restart gerrit
  • 21:41 no_justification: bringing down gerrit for upgrade
  • 20:54 demon@tin: Synchronized docroot/wikipedia.org/spec.yaml: expose swagger spec (duration: 00m 56s)
  • 20:47 elukey: truncated /var/log/aphlict/aphlict.log to 1G (was 26G) to avoid overhead for the upcoming first logrotate on phab1001
  • 16:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original traffic for db1100 (duration: 00m 54s)
  • 16:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1100 (duration: 00m 55s)
  • 16:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1100 (duration: 00m 54s)
  • 16:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1100 - T186321 (duration: 00m 54s)
  • 15:50 marostegui: Restart MySQL on db1100 - T186321
  • 15:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1100 - T186321 (duration: 00m 55s)
  • 15:34 moritzm: uploaded HHVM 3.18.5+dfsg+wmf5+icu57 to jessie-wikimedia/component/icu57 (HHVM 3.18.8 linked against an ICU 57 backport from stretch)
  • 15:25 mutante: ganeti1004 - stopped and started VM ununpentium
  • 14:53 akosiaris: reboot ganeti1005 after emptying it. T181121
  • 13:59 elukey: reboot meitnerium via gnt-instance reboot on ganeti1005 to pick up new disk config - T186020
  • 13:16 moritzm: installing w3m security updates on trusty
  • 12:57 moritzm: installing updated kernels on remaining jessie DB servers
  • 12:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T162807 (duration: 00m 55s)
  • 11:57 godog: roll-restart nginx on thumbor and swift-proxy on ms-fe to apply https://gerrit.wikimedia.org/r/407411
  • 11:39 moritzm: uploaded php-wikidiff2 1.5.1+deb9u2 to apt.wikimedia.org (despite the source package name, this package only builds hhvm-wikidiff2 now as php-wikidiff2 is instead updated via stretch-backports, the old internal package will eventually be phased out when we move to PHP7)
  • 11:12 ema: cache_upload: repool cp4026 (varnish 5)
  • 11:07 ema: cache_upload: upgrade cp4026 to varnish 5
  • 10:43 ema: cache_upload: repool cp4025 (varnish 5)
  • 10:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 - T162807 (duration: 00m 55s)
  • 10:39 ema: cache_upload: upgrade cp4025 to varnish 5
  • 10:24 ema: cache_upload: repool cp4024 (varnish 5)
  • 10:20 ema: cache_upload: upgrade cp4024 to varnish 5
  • 10:18 moritzm: installing ruby security updates on trusty
  • 09:57 godog: roll-upgrade thumbor to 1.11 - T178072 T185478 T185483 T185485 T183907 T179954
  • 09:46 gilles: Add thumborUrl to Swift config in PrivateSettings.php
  • 09:13 ema: cache_upload: repool cp4023 (varnish 5)
  • 09:08 ema: cache_upload: upgrade cp4023 to varnish 5
  • 09:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1065 - T162807 (duration: 00m 54s)
  • 08:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T162807 (duration: 00m 55s)
  • 08:37 elukey: apt-get install php5-dbg on phab1001 as an attempt to get better gdb output for T182832
  • 08:35 ema: cache_upload: repool cp4022 (varnish 5)
  • 08:29 ema: cache_upload: upgrade cp4022 to varnish 5
  • 08:23 marostegui: Stop replication in sync db1089 - db1065 - T162807
  • 08:23 moritzm: installing curl security updates on trusty (Debian already updated)
  • 08:21 ema: cache_upload: repool cp4021 (varnish 5)
  • 08:14 ema: cache_upload: upgrade cp4021 to varnish 5
  • 07:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T162807 (duration: 00m 55s)
  • 07:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 - T162807 (duration: 00m 55s)
  • 07:10 marostegui: Fixing data drifts on db1065 - T162807
  • 05:37 elukey: truncate /var/log/aphlict/aphlict.log to 25G as a temporary measure to keep phab1001's root partition from filling up

2018-02-01

  • 23:37 mutante: creating new 100GB virtual disk for ganeti VM meitnerium (T186020)
  • 23:12 eileen: update civicrm revision changed from 849bba4186 to 71b1e35b99 (deploy civitoken)
  • 22:37 ejegg: updated payments-wiki from 40145892e7 to 341cb573a1
  • 21:56 raita: Removed 2FA from User:Jehochman
  • 21:52 raita: Removed 2FA from User:Superzerocool (on Mon, Jan 29): https://phabricator.wikimedia.org/T185731
  • 20:50 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool of db1083 (duration: 00m 55s)
  • 20:41 jynus: deployed modified query killer to enwiki replicas
  • 20:35 jynus@tin: Synchronized wmf-config/db-eqiad.php: emergency depool of db1083 (duration: 00m 55s)
  • 19:19 chasemp: labservices1001:~# logrotate --force /etc/logrotate.conf
  • 19:17 chasemp: labservices1002:~# logrotate --force /etc/logrotate.conf
  • 19:04 demon@tin: Pruned MediaWiki: 1.31.0-wmf.16 [keeping static files] (duration: 01m 16s)
  • 19:02 demon@tin: Pruned MediaWiki: 1.31.0-wmf.15 (duration: 14m 55s)
  • 16:26 andrewbogott: apt-get install 'designate' on labservices1001 and 1002 — routine upgrade
  • 15:39 moritzm: upgrading nginx on mw1266-mw1299 (for T164456)
  • 15:27 moritzm: restarting apache/HHVM on deployment servers to pick up libxml2/curl security updates
  • 15:14 moritzm: installing curl security updates
  • 14:48 moritzm: installing tiff security updates
  • 14:44 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2019 (duration: 00m 57s)
  • 14:40 moritzm: restarting nginx on sodium to pick up libxml2 security update
  • 14:34 moritzm: restarting apache on rutherfordium to pick up libxml2 security update
  • 14:01 moritzm: restarting nginx on puppetdb hosts to pick up libxml2 security update
  • 13:56 moritzm: restarting nginx on meitnerium/archiva to pick up libxml2 security update
  • 13:42 gehel: restarting nginx on wdqs* for upgrade
  • 13:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Update db1051 reason for depooling (duration: 00m 56s)
  • 13:23 akosiaris: force puppet run on all postgres servers for https://gerrit.wikimedia.org/r/407424
  • 13:20 jynus: stop and reimage es2019
  • 13:13 moritzm: restarting apache on krypton to pick up libxml2 security update
  • 13:13 gehel: restarting postgresql and nodejs services on maps*
  • 13:09 gehel: upgrade nginx on elastic*
  • 12:54 moritzm: restarting nginx on debug proxies to pick up libxml2 security update
  • 12:53 moritzm: restarting apache on hafnium to pick up libxml2 security update
  • 12:06 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2018, depool es2019 (duration: 00m 57s)
  • 12:03 moritzm: restarting squid on URL downloaders to pick up libxml2 security update
  • 11:53 moritzm: installing libxml2 security updates
  • 10:33 godog: roll restart thumbor to lower subprocess timeout - T185479
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 53s)

2018-01-31

  • 23:56 mutante: restarting apache on phabricator server, same pattern as described in T182832
  • 23:06 bblack: re-pooling ulsfo in DNS - T185228
  • 23:04 bblack: re-pooling ulsfo in DNS
  • 23:00 bblack: restarting ulsfo varnish-fe processes
  • 22:55 bblack: un-downtiming various ulsfo things
  • 22:28 mepps: updated civicrm from c70f01cd83 to 849bba4186
  • 22:04 mepps: updated civicrm from c70f01cd83 to 63c918837c
  • 21:57 mholloway-shell@tin: Finished deploy [mobileapps/deploy@18d263a]: Update mobileapps to 3d717fa (duration: 06m 11s)
  • 21:51 mholloway-shell@tin: Started deploy [mobileapps/deploy@18d263a]: Update mobileapps to 3d717fa
  • 21:35 mutante: fixed icinga config for cp4024 parents
  • 20:29 demon@tin: Synchronized .gitmodules: consistency (duration: 00m 54s)
  • 20:25 demon@tin: Synchronized docroot/wikimedia.org/: bye bye firefox os. you will (not) be missed (duration: 00m 58s)
  • 18:54 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4032.ulsfo.wmnet
  • 18:54 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4031.ulsfo.wmnet
  • 18:54 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4030.ulsfo.wmnet
  • 18:54 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4029.ulsfo.wmnet
  • 18:54 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4027.ulsfo.wmnet
  • 18:53 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4028.ulsfo.wmnet
  • 18:53 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4026.ulsfo.wmnet
  • 18:53 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4025.ulsfo.wmnet
  • 18:53 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4024.ulsfo.wmnet
  • 18:53 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4023.ulsfo.wmnet
  • 18:52 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4022.ulsfo.wmnet
  • 18:52 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4021.ulsfo.wmnet
  • 18:47 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4032.ulsfo.wmnet
  • 18:47 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4031.ulsfo.wmnet
  • 18:47 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4030.ulsfo.wmnet
  • 18:46 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4029.ulsfo.wmnet
  • 18:46 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4028.ulsfo.wmnet
  • 18:46 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4027.ulsfo.wmnet
  • 18:46 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4026.ulsfo.wmnet
  • 18:46 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4025.ulsfo.wmnet
  • 18:46 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4024.ulsfo.wmnet
  • 18:45 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4023.ulsfo.wmnet
  • 18:45 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4022.ulsfo.wmnet
  • 18:45 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4021.ulsfo.wmnet
  • 18:40 robh: putting all ulsfo servers into maint mode
  • 18:10 XioNoX: deactivating bgp session from ulsfo to office
  • 16:17 marostegui: Optimize wbc_entity_usage on s6 on db1102
  • 16:15 robh: depooling ulsfo for https://phabricator.wikimedia.org/T185228
  • 15:44 akosiaris: reimage ores100{1..9} T171851
  • 14:37 godog: bump prometheus global instance retention to 15 months - T160677
  • 12:25 marostegui: Fix replication on labsdb1004
  • 09:32 moritzm: rolling restart of thumbor/nginx to pick up libxml security update
  • 08:21 moritzm: installing clamav security update on fermium
  • 07:55 moritzm: installing libxml security updates
  • 07:48 marostegui: Stop MySQL on db1030 - T184397
  • 07:47 marostegui: Remove db1030 from tendril - T184397
  • 07:13 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1030, will be decommissioned - T184397 (duration: 00m 56s)
  • 07:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1030, will be decommissioned - T184397 (duration: 00m 57s)
  • 07:08 marostegui: Force BBU relearn on db1051 - T186049
  • 06:19 elukey: restart varnish backend on cp4024 - failed fetches / 503s
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 46s)
  • 01:51 mutante: catchpoint: recycled gwicke's user and turned it into a user for volans, upgraded him to admin (T162857)
  • 00:55 krinkle@tin: Synchronized wmf-config: no-op, adding files for beta cluster (duration: 00m 59s)
  • 00:51 krinkle@tin: Synchronized wmf-config/profiler.php: no-op (comment-only) (duration: 00m 58s)

2018-01-30

  • 21:39 demon@tin: Synchronized docroot/noc/conf/open.dblist: (no justification provided) (duration: 00m 57s)
  • 21:38 demon@tin: Synchronized dblists/open.dblist: Adding open.dblist (duration: 00m 57s)
  • 19:37 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 (duration: 00m 57s)
  • 18:32 mutante: powercycling amslvs4, to be reinstalled as bast3003
  • 18:08 moritzm: installing PHP security updates
  • 15:52 moritzm: installing libxml2 security updates
  • 15:35 moritzm: installing libxcursor security updates
  • 15:30 jynus: stop and reimage es2018
  • 14:42 moritzm: installing curl security updates on app server canaries along with HHVM restart
  • 13:15 moritzm: installing rsync security updates on trusty
  • 12:15 moritzm: installing libxtst updates
  • 10:57 moritzm: installing ffmpeg security updates
  • 09:34 moritzm: installing wireshark security updates
  • 08:35 moritzm: installing libxml2 security updates
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 58s)
  • 00:31 demon@tin: rebuilt and synchronized wikiversions files: not changing versions, testing something

2018-01-29

2018-01-28

  • 18:39 bblack: testme

2018-01-26

  • 16:32 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: BETA: Enable FileImporter on testwiki with open config PT 2/2 (duration: 00m 56s)
  • 16:30 addshore@tin: Synchronized wmf-config/CommonSettings-labs.php: BETA: Enable FileImporter on testwiki with open config PT 1/2 (duration: 00m 58s)
  • 06:40 niharika29@tin: Finished deploy [scholarships/scholarships@5d2fca4]: Update deadline for scholarships application (duration: 00m 03s)
  • 06:39 niharika29@tin: Started deploy [scholarships/scholarships@5d2fca4]: Update deadline for scholarships application
  • 04:31 urandom: bootstrapping restbase2009-c - T184100
  • 02:32 urandom: bootstrapping restbase2009-b - T184100

2018-01-25

  • 22:24 mutante: restarting gerrit service to apply a few small config changes https://gerrit.wikimedia.org/r/#/q/topic:gerrit-trivial-tweaks+(status:open+OR+status:merged)
  • 22:07 mutante: restarting apache on phabricator server
  • 22:06 urandom: bootstrapping restbase2009-a - T184100
  • 18:13 _joe_: restart hhvm on a few api appservers, high cpu load
  • 14:52 urandom: bootstrapping restbase2008-c - T184100
  • 07:44 urandom: bootstrapping restbase2008-b - T184100
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 32s)
  • 01:43 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es2011 (duration: 00m 56s)
  • 01:41 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2011 (duration: 00m 57s)
  • 00:24 urandom: bootstrapping restbase2008-a - T184100

2018-01-24

  • 23:15 ema: cp4025: restart varnish backend due to mbox lag
  • 19:57 jynus: starting es2011 reimage
  • 19:41 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool es2011 (duration: 00m 57s)
  • 18:35 no_justification: gerrit: restarting services, will be back momentarily
  • 18:32 urandom: bootstrapping restbase2007-c - T184100
  • 08:16 ema: cp4024: restart varnish-be due to 503s
  • 06:26 urandom: bootstrapping restbase2007-b - T184100
  • 02:22 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 33s)
  • 01:14 matt_flaschen: SWAT complete
  • 01:10 matt_flaschen: Deployed 'T185304: NWE: Don't attempt to set selection on unattached textarea' in extensions/VisualEditor
  • 01:02 mattflaschen@tin: Synchronized php-1.31.0-wmf.17/extensions/VisualEditor/: (no justification provided) (duration: 00m 58s)
  • 00:40 mobrovac@tin: Finished deploy [zotero/translators@8f53531]: Update translators to 528296d (duration: 00m 08s)
  • 00:39 mobrovac@tin: Started deploy [zotero/translators@8f53531]: Update translators to 528296d
  • 00:08 urandom: bootstrapping restbase2007-a - T184100

2018-01-23

  • 17:37 robh: mc2036 offline until mainboard fix
  • 14:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Unify comments about sanitarium masters (duration: 00m 56s)
  • 14:36 zeljkof: EU SWAT finished
  • 14:31 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update the project namespace in Nepali Wikipedia (T184865) (duration: 00m 56s)
  • 14:17 zeljkof: continuing EU SWAT
  • 14:14 zeljkof: EU SWAT finished
  • 14:13 zfilipin@tin: Synchronized php-1.31.0-wmf.17/extensions/WikibaseQualityConstraints/: SWAT: Add missing DISTINCT to SPARQL query (T184705) (duration: 01m 02s)
  • 13:03 moritzm: installing libxtst, libxfixes, libxrandr, libxi security updates
  • 10:56 moritzm: installing libx11 security updates
  • 10:43 moritzm: installing sudo security updates
  • 09:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1089 (duration: 00m 56s)
  • 08:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1089 (duration: 00m 56s)
  • 08:24 moritzm: installing gdk-pixbuf security updates
  • 07:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1089 (duration: 00m 56s)
  • 07:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:3311 - T162807 (duration: 00m 56s)
  • 06:50 elukey: restart varnish backend on cp4021, 503s and mailbox lag
  • 06:47 marostegui: Stop replication in sync on db2048 and db1089 - T162807
  • 06:23 marostegui: Stop replication in sync db1089 and db1105:3311 - T162807
  • 06:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311 - T162807 (duration: 00m 57s)
  • 02:22 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 31s)

2018-01-22

  • 22:38 mutante: rebooting the-server-formerly-known-as-amslvs4 to PXE to reinstall it as bast3003. doesn't work
  • 21:02 ottomata: restarting archiva
  • 19:36 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Add Draft namespace to newiki (T184157) (duration: 00m 56s)
  • 19:34 catrope@tin: Synchronized php-1.31.0-wmf.17/extensions/UploadWizard/resources/details/uw.DescriptionDetailsWidget.js: T184380 (duration: 00m 56s)
  • 19:31 catrope@tin: Synchronized php-1.31.0-wmf.17/extensions/InputBox/InputBox.hooks.php: T185367 (duration: 00m 58s)
  • 18:16 gehel@tin: Finished deploy [wdqs/wdqs@f59ed29]: (no justification provided) (duration: 02m 12s)
  • 18:15 gehel: updating wdqs GUI
  • 18:14 gehel@tin: Started deploy [wdqs/wdqs@f59ed29]: (no justification provided)
  • 17:11 joal@tin: Finished deploy [analytics/refinery@5b8edb8]: Regular weekly deploy (before freeze for all-hands) (duration: 10m 14s)
  • 17:01 joal@tin: Started deploy [analytics/refinery@5b8edb8]: Regular weekly deploy (before freeze for all-hands)
  • 16:51 volans: upgraded debdeploy and cumin to latest released on neodymium/sarin - T182575
  • 15:49 moritzm: upgrade image scalers in eqiad to HHVM 3.18.7
  • afk: restarting jenkins
  • 14:59 moritzm: upgrade mw1221-mw1235 to HHVM 3.18.7
  • 14:43 zeljkof: EU SWAT finished
  • 14:41 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update officewiki logo, add HD logo for officewiki (T184575) (duration: 00m 56s)
  • 14:39 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Update officewiki logo, add HD logo for officewiki (T184575) (duration: 00m 56s)
  • 14:36 elukey: truncate (again) /var/log/upstart/neutron-server.log on labtestnet2001
  • 14:31 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow bureaucrats@mr.wiki to grant&revoke accountcreator (T184553) (duration: 00m 56s)
  • 14:26 moritzm: uploaded debdeploy 0.0.99.2 for jessie-wikimedia, stretch-wikimedia, trusty-wikimedia to apt.wikimedia.org
  • 14:23 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add https://audiovis.nac.gov.pl to $wgCopyUploadsDomains (T184853) (duration: 00m 56s)
  • 14:12 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Remove $wgWBQualityConstraintsIncludeDetailInApi setting (T180614) (duration: 00m 56s)
  • 14:11 gehel: cleanup leftover logrotate configuration on wdqs*
  • 14:05 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable fine grained lua tracking for arwiki, fawiki, viwiki (T185032) (duration: 00m 57s)
  • 13:46 marostegui: Force BBU relearn on db1016 - T166344
  • 12:38 volans: upgraded cumin on labpuppetmasters hosts to 2.0.0
  • 12:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1063 as vslow - T184397 (duration: 00m 56s)
  • 12:22 moritzm: upgrade mw1238-mw1258 to HHVM 3.18.7
  • 12:01 marostegui: Change x1 codfw topology: db2034 to replicate from eqiad T184888
  • 11:45 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2036 (duration: 00m 57s)
  • 11:38 volans: uploaded cumin_2.0.0-1_amd64.deb to apt.wikimedia.org jessie-wikimedia
  • 09:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 - T162807 (duration: 00m 56s)
  • 09:50 jynus: running heavy reads on db2043, db2036 to try to reproduce s3 codfw crash
  • 09:25 marostegui: Stop replication in sync db1099:3311 and db1089 - T162807
  • 09:21 marostegui: Stop MySQL on db1030 to clone db1063 - T184397
  • 09:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1030 - T184397 (duration: 00m 56s)
  • 09:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066, depool db1099:3311 - T162807 (duration: 00m 56s)
  • 08:41 jynus@tin: Synchronized wmf-config/db-eqiad.php: Increase db1067 weight (duration: 00m 56s)
  • 08:31 moritzm: upgrading video scalers to HHVM 3.18.7
  • 07:51 marostegui: Stop replication in sync db1089 and db1066 - T162807
  • 07:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067, depool db1066 - T162807 (duration: 00m 56s)
  • 07:11 marostegui: Stop replication in sync db1089 and db1067 - T162807
  • 07:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 and db1067 - T162807 (duration: 00m 56s)
  • 07:04 elukey: truncated /var/log/upstart/neutron-server.log on labtestnet2001 - / disk space exhausted
  • 06:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Move db1063 from s8 to s6 - T184397 (duration: 00m 58s)
  • 06:18 marostegui: Compress ruwiki on db1102 - T182450
  • 02:37 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 07m 24s)

2018-01-21

  • 17:21 marostegui: Compress frwiki and jawiki on db1102 - T182450
  • 12:03 marostegui: Defragment s2 on db1102 - T182450
  • 02:35 urandom: bootstrapping restbase2012-b - T184100

2018-01-20

  • 23:20 urandom: bootstrapping restbase2012-a - T184100
  • 17:36 elukey: forced bbu learn cycle on analytics1038 (cache policy flapping from WriteBack to WriteThrough)
  • 16:57 urandom: bootstrapping restbase2011-c - T184100
  • 12:53 urandom: bootstrapping restbase2011-b - T184100
  • 03:32 urandom: bootstrapping restbase2011-a - T184100

2018-01-19

  • 22:53 matt_flaschen: Ran (time foreachwikiindblist flow.dblist extensions/Flow/maintenance/FlowFixInconsistentBoards.php --force) 2>&1|tee --append ~/FlowFixInconsistentBoards_all_2018-01-19_actual_force.txt
  • 21:28 urandom: bootstrapping restbase2010-c - T184100
  • 19:43 mutante: ms-be3003 - power up via mgmt to check if still connected and usable as temp bastion (T184936)
  • 18:58 urandom: bootstrapping restbase2010-b - T184100
  • 17:58 chasemp: labcontrol1002:~# ip addr del 208.80.154.12/32 dev eth0
  • 17:58 chasemp: labcontrol1002:~# ip addr del 208.80.154.102/32 dev eth0
  • 17:55 chasemp: labcontrol1001:~# ip addr del 208.80.154.94/32 dev eth0
  • 17:50 reedy@tin: Synchronized dblists/s3.dblist: alphasort and remove dupes (duration: 01m 01s)
  • 17:11 jynus: stopping mariadb on db2016,17,18,19,23,28&29 T184090
  • 16:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1089 and depool db1067 (duration: 00m 56s)
  • 16:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1089 - T162807 (duration: 00m 56s)
  • 16:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1109 - T174569 (duration: 00m 56s)
  • 15:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1089 - T162807 (duration: 00m 56s)
  • 15:43 jynus@tin: Synchronized wmf-config/db-eqiad.php: Tune s1 and s3 database weights (duration: 00m 57s)
  • 15:38 anomie: Running migrateArchiveText.php on all wikis that need it (T184629)
  • 15:24 anomie: Running migrateArchiveText.php on metawiki (T184629)
  • 15:23 godog: bootstrap cassandra-a on restbase2010 - T184100
  • 14:48 anomie: Running migrateArchiveText.php on testwiki (T184629)
  • 14:31 moritzm: installing krb5 updates from jessie point release
  • 12:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1065 - T162807 (duration: 00m 56s)
  • 12:03 moritzm: installing imagemagick security updates
  • 11:46 moritzm: installing sensible-utils security update
  • 11:29 jynus@tin: Synchronized wmf-config/db-eqiad.php: Decommission old codfw masters (duration: 00m 55s)
  • 11:20 moritzm: upgrading tor on radium to 0.3.2.9
  • 11:18 jynus@tin: Synchronized wmf-config/db-codfw.php: Decommission old codfw masters (duration: 00m 56s)
  • 11:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 - T162807 (duration: 00m 56s)
  • 10:55 jynus: restarting es2002
  • 10:53 moritzm: updated tor packages on apt.wikimedia.org to 0.3.2.9-1~d80
  • 10:19 jynus: stop mariadb at db2018 to clone it away
  • 10:02 jynus: restarting es2001
  • 09:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:3311 - T162807 (duration: 00m 54s)
  • 09:56 ema: cp4026 restart varnish-be because of mbox lag
  • 09:10 marostegui: Stop replication in sync db1089 and db1105:3311 - T162807
  • 09:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311 - T162807 (duration: 00m 57s)
  • 09:08 godog: start cassandra-a on restbase1015 - T184100
  • 07:11 marostegui: Stop x1 on dbstore2002 to copy its content to db2034 - T184888
  • 07:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 - T162807 (duration: 00m 55s)
  • 06:31 marostegui: Stop replication in sync db1089 and db1099:3311 - T162807
  • 06:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3311 - T162807 (duration: 00m 56s)
  • 06:22 marostegui: Deploy schema change on db1109 - T174569
  • 06:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1109 - T174569 (duration: 00m 57s)
  • 03:21 TimStarling: on bast1001: restarting bacula-fd with master key decryption enabled, restarting restore job
  • 01:20 TimStarling: attempting to restore home_pmtpa from bacula to bast1001
  • 00:19 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: T185246: Removing unused citizendium from $wgRelatedSitesPrefixes (duration: 00m 56s)
  • 00:11 ebernhardson@tin: Synchronized wmf-config/CirrusSearch-common.php: T185250 Switch wiktionary sister search on enwiki to title only (step 2) (duration: 00m 56s)
  • 00:09 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: T185250 Switch wiktionary sister search on enwiki to title only (step 1) (duration: 00m 57s)

2018-01-18

  • 23:11 urandom: bootstrapping restbase1015-b -- T184100
  • 22:36 herron: added ruby-rgen-0.7.0-1 (backported package from jessie) to trusty-wikimedia apt repo (T182894)
  • 21:03 arlolra@tin: Finished deploy [parsoid/deploy@a95fede]: Update Parsoid config, again (duration: 09m 39s)
  • 20:53 arlolra@tin: Started deploy [parsoid/deploy@a95fede]: Update Parsoid config, again
  • 20:31 thcipriani@tin: rebuilt and synchronized wikiversions files: All wikis to 1.31.0-wmf.17
  • 20:09 mutante: releases1001 - /srv/patches got created, initial manual rsync using /usr/local/sbin/sync-srv-patches created by rsync::quickdatacopy, mw patches exists on nightlies server now
  • 20:09 thcipriani@tin: Synchronized php-1.31.0-wmf.17/extensions/Score/includes/Score.php: SWAT: Always pass FileBackend instance to `new FileRepo()` T185204 (duration: 01m 12s)
  • 20:01 arlolra@tin: Finished deploy [parsoid/deploy@8736b8c]: (no justification provided) (duration: 01m 09s)
  • 20:00 arlolra@tin: Started deploy [parsoid/deploy@8736b8c]: (no justification provided)
  • 20:00 arlolra@tin: Finished deploy [parsoid/deploy@8736b8c]: (no justification provided) (duration: 00m 44s)
  • 19:59 arlolra@tin: Started deploy [parsoid/deploy@8736b8c]: (no justification provided)
  • 19:56 arlolra@tin: Finished deploy [parsoid/deploy@8736b8c]: Updating Parsoid config (duration: 02m 01s)
  • 19:54 arlolra@tin: Started deploy [parsoid/deploy@8736b8c]: Updating Parsoid config
  • 19:53 thcipriani@tin: Synchronized php-1.31.0-wmf.17/extensions/VisualEditor/modules/ve-mw/ui/pages/ve.ui.MWTemplatePlaceholderPage.js: SWAT: Update TitleInput getTitle to getMWTitle (duration: 01m 09s)
  • 19:24 arlolra: Updated Parsoid to af06386 (T45094)
  • 19:20 ema: cache_upload: upgrade cp3049 to varnish 5
  • 19:20 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update linter stats for commonswiki less frequently T184280 (duration: 01m 13s)
  • 19:17 arlolra@tin: Finished deploy [parsoid/deploy@fcc2b63]: Updating Parsoid to af06386 (duration: 09m 32s)
  • 19:08 arlolra@tin: Started deploy [parsoid/deploy@fcc2b63]: Updating Parsoid to af06386
  • 19:00 ema: cache_upload: repool cp3046 (varnish 5)
  • 18:58 mattflaschen@tin: Synchronized wmf-config/InitialiseSettings.php: T184670: Hide Flow beta feature everywhere but testwiki (duration: 01m 10s)
  • 18:54 ema: cache_upload: upgrade cp3046 to varnish 5
  • 18:47 bsitzmann@tin: Finished deploy [mobileapps/deploy@669fb5b]: Update mobileapps to 2690899 (T184328 T184557 T177007 T184669 T177430 T185050) (duration: 07m 03s)
  • 18:40 bsitzmann@tin: Started deploy [mobileapps/deploy@669fb5b]: Update mobileapps to 2690899 (T184328 T184557 T177007 T184669 T177430 T185050)
  • 18:39 ema: cache_upload: repool cp3045 (varnish 5)
  • 18:33 ema: cache_upload: upgrade cp3045 to varnish 5
  • 18:23 mlitn@tin: Finished deploy [3d2png/deploy@74b1ed7]: Updating 3d2png repo (duration: 00m 50s)
  • 18:22 mlitn@tin: Started deploy [3d2png/deploy@74b1ed7]: Updating 3d2png repo
  • 18:00 ema: cache_upload: repool cp3044 (varnish 5)
  • 17:55 ema: cache_upload: upgrade cp3044 to varnish 5
  • 17:39 moritzm: rebooting sodium (and temporarily disable icinga-wm due to some expected spam due to clients failing to run apt-get update)
  • 17:33 jynus: starting compare.py on s3 codfw (it triggered db2036 crash before)
  • 17:31 ema: cache_upload: repool cp3039 (varnish 5)
  • 17:26 ema: cache_upload: upgrade cp3039 to varnish 5
  • 17:02 ema: cache_upload: repool cp3036 (varnish 5)
  • 16:55 ema: cache_upload: upgrade cp3036 to varnish 5
  • 15:54 ema: cache_upload: repool cp3048 (varnish 5)
  • 15:49 ema: cache_upload: upgrade cp3048 to varnish 5
  • 15:40 moritzm: rebooting labsdb1004 for kernel security update
  • 15:40 ema: cache_upload: repool cp3047 (varnish 5)
  • 15:34 ema: cache_upload: upgrade cp3047 to varnish 5
  • 15:33 moritzm: reboot labsdb1006 (OSM slave) for kernel security update
  • 15:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 - T162807 (duration: 01m 12s)
  • 15:15 mforns@tin: Finished deploy [analytics/refinery@78f98d9]: deploying refinery to add ISO codes to pageviews by country (duration: 04m 12s)
  • 15:11 mforns@tin: Started deploy [analytics/refinery@78f98d9]: deploying refinery to add ISO codes to pageviews by country
  • 15:01 marostegui: Stop replication in sync db1089 and db1066 - T162807
  • 15:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 - T162807 (duration: 01m 11s)
  • 14:58 moritzm: installing bind security updates (we only use the client-side tools)
  • 14:58 volans: reprepro includedeb jessie-wikimedia python-requests-mock_1.3.0-3_all.deb
  • 14:45 ema: cache_upload: repool cp3038 (varnish 5)
  • 14:44 herron: disabling puppet agents during deploy of 404587, 404689
  • 14:39 ema: cache_upload: upgrade cp3038 to varnish 5
  • 14:39 godog: restart hhvm on mw1233
  • 14:31 _joe_: restarting hhvm on a few API appservers
  • 14:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1087 - T174569 (duration: 01m 12s)
  • 14:28 ema: cache_upload: repool cp3035 (varnish 5)
  • 14:25 marostegui@tin: Synchronized wmf-config/db-codfw.php: Promote db2043 to s3 master after db2036 crash (duration: 01m 12s)
  • 14:25 godog: restart hhvm on mw1227
  • 14:23 ema: cache_upload: upgrade cp3035 to varnish 5
  • 14:19 jynus: starting mysql on db2043
  • 14:17 jynus: stopping mysql on db2043
  • 14:10 zeljkof: EU SWAT finished
  • 14:10 ema: cache_upload: repool cp3037 (varnish 5)
  • 14:09 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change autoconfirmed settings and Enable flood group at zhwikibooks (T185182) (duration: 01m 13s)
  • 13:54 ema: cache_upload: upgrade cp3037 to varnish 5
  • 13:49 moritzm: upgrade mw* servers in eqiad running 3.18.5+dfsg-1+wmf3 (recent installations) to 3.18.5+dfsg-1+wmf4
  • 13:19 jynus: changing topology of codfw s3 databases
  • 13:05 akosiaris: reboot poolcounter2001 for PCID/INVPCID CPU feature enabling
  • 13:03 akosiaris: reboot webperf1001 for PCID, INVPCID feature enabling (INVPCID not supported on current hardware, but still enabling it cluster wide)
  • 12:57 akosiaris: enable puppet across the fleet after nitrogen (puppetdb) reboot
  • 12:56 akosiaris: reboot nitrogen for PCID, INVPCID feature enabling (INVPCID not supported on current hardware, but still enabling it cluster wide)
  • 12:52 jgleeson: turned on donations queue consumer process-control job (actual time of change 17/01/18 ~16:20)
  • 12:45 akosiaris: reboot seaborgium for PCID, INVPCID feature enabling (INVPCID not supported on current hardware, but still enabling it cluster wide)
  • 12:43 elukey: bohrium rebooted for kernel upgrades
  • 12:43 akosiaris: disable puppet across the fleet for nitrogen (puppetdb) reboot
  • 12:40 elukey: set piwik in readonly mode and stopped mysql on bohrium (prep step for reboot)
  • 12:36 akosiaris: reboot chlorine.eqiad.wmnet etcd1003.eqiad.wmnet etcd1005.eqiad.wmnet fermium.wikimedia.org install1002.wikimedia.org krypton.eqiad.wmnet kubestagetcd1003.eqiad.wmnet logstash1009.eqiad.wmnet mwdebug1001.eqiad.wmnet sca1004.eqiad.wmnet for PCID, INVPCID feature enabling (INVPCID not supported on current hardware, but still enabling it cluster wide)
  • 11:34 akosiaris: reboot logstash1008 etcd1002 kubestagetcd1002.eqiad.wmnet for PCID, INVPCID feature enabling (INVPCID not supported on current hardware, but still enabling it cluster wide)
  • 11:12 ema: cp3046: restart varnish-be due to mbox lag
  • 11:06 volans: disabled puppet on tegmen to test impact on puppetdb - T170740
  • 10:57 akosiaris: reboot actinium.wikimedia.org aluminium.wikimedia.org argon.eqiad.wmnet boron.eqiad.wmnet bromine.eqiad.wmnet darmstadtium.eqiad.wmnet dbmonitor1001.wikimedia.org dubnium.wikimedia.org dysprosium.wikimedia.org etcd1001.eqiad.wmnet etcd1004.eqiad.wmnet fermium.wikimedia.org hassium.eqiad.wmnet kubestagetcd1001.eqiad.wmnet logstash1007.eqiad.wmnet meitnerium.wikimedia.org mendelevium.eqiad.wmnet mwdebug1002.eqiad.wmnet m
  • 10:45 ema: cp3034: restart varnishxcps and varnishmedia, they were both using 100% of a cpu core
  • 10:35 Amir1: ladsgroup@terbium:/srv/mediawiki/php-1.31.0-wmf.17$ mwscript extensions/WikibaseQualityConstraints/maintenance/ImportConstraintStatements.php --wiki wikidatawiki (T184720)
  • 10:30 akosiaris: reboot etherpad1001 for PCID, INVPCID feature enabling (INVPCID not supported on current hardware, but still enabling it cluster wide)
  • 10:29 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db2034 from s1 as it will be in x1 - T184888 (duration: 01m 12s)
  • 10:25 mobrovac@tin: Finished deploy [restbase/deploy@5c353f7]: Use stable package names, normalise cache-control headers, update top definition, take #2 - T184199 T184833 T184541 (duration: 12m 18s)
  • 10:12 mobrovac@tin: Started deploy [restbase/deploy@5c353f7]: Use stable package names, normalise cache-control headers, update top definition, take #2 - T184199 T184833 T184541
  • 10:10 mobrovac@tin: Finished deploy [restbase/deploy@04e7cdb]: Use stable package names, normalise cache-control headers, update top definition - T184199 T184833 T184541 (duration: 02m 29s)
  • 10:07 mobrovac@tin: Started deploy [restbase/deploy@04e7cdb]: Use stable package names, normalise cache-control headers, update top definition - T184199 T184833 T184541
  • 10:07 moritzm: rebooting rdb1002/rdb1004/rdb1006/rdb1008 for kernel security update
  • 09:58 akosiaris: reboot etcd1006 for PCID, INVPCID feature enabling (INVPCID not supported on current hardware, but still enabling it cluster wide)
  • 09:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T162807 (duration: 01m 12s)
  • 09:43 ema: cache_upload: repooled cp3034 running varnish 5
  • 09:38 elukey: reboot thorium (analytics webserver) for security upgrade - This maintenance will cause temporary unavailability of the Analytics websites
  • 09:27 marostegui: Stop replication in sync db1089 and db2048 (codfw master) - T162807
  • 09:26 jynus: reimage es2003 to stretch
  • 09:21 elukey: reboot druid1001 for kernel upgrades
  • 09:20 akosiaris: reboot oresrdb2001 for PCID/INVPCID CPU feature enabling
  • 09:10 akosiaris: reboot alcyone pollux sca2004 poolcounter2002 serpens for PCID/INVPCID CPU feature enabling
  • 09:07 marostegui: Stop replication in sync db1089 db1067 - T162807
  • 08:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 - T162807 (duration: 01m 13s)
  • 08:37 godog: bootstrap cassandra-c on restbase1013
  • 08:30 moritzm: reboot iron for kernel security update
  • 06:27 marostegui: Deploy schema change on s8 db1087 (sanitarium master) with replication (this will generate lag on labs servers) - T174569
  • 06:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1087 - T174569 (duration: 01m 12s)
  • 06:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3318 - T174569 (duration: 01m 13s)
  • 02:27 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.16) (duration: 07m 18s)
  • 01:08 twentyafterfour: phabricator deployment finished without incident.
  • 01:01 twentyafterfour: Evening SWAT completed. Starting phabricator deployment of #phabricator-2018-07-17 [release/2017-01-17/1]
  • 01:00 twentyafterfour@tin: Finished scap: Evening SWAT (duration: 24m 29s)
  • 00:35 twentyafterfour@tin: Started scap: Evening SWAT

2018-01-17

  • 23:38 mutante: [terbium:~] $ echo 'https://annual.wikimedia.org' | mwscript purgeList.php
  • 22:54 urandom: bootstrapping restbase1013-b - T184100
  • 22:00 andrewbogott: rebooting californium, silver, labcontrol1001, labservices1001
  • 21:03 thcipriani@tin: Synchronized php: group1 to 1.31.0-wmf.17 (duration: 01m 11s)
  • 20:57 thcipriani@tin: rebuilt and synchronized wikiversions files: group1 to 1.31.0-wmf.17
  • 20:45 thcipriani@tin: Synchronized php-1.31.0-wmf.17/vendor/wikibase/data-model-services: Add missing files from wikibase/data-model-services 3.9.0 (duration: 01m 15s)
  • 20:41 thcipriani@tin: Synchronized php-1.31.0-wmf.17/includes/ServiceWiring.php: [MCR] RevisionStore::getTitle final logged fallback to master PART II (duration: 01m 12s)
  • 20:40 thcipriani@tin: Synchronized php-1.31.0-wmf.17/includes/Storage/RevisionStore.php: [MCR] RevisionStore::getTitle final logged fallback to master PART I (duration: 01m 04s)
  • 20:35 pnorman@tin: Finished deploy [kartotherian/deploy@ecdda41]: (no justification provided) (duration: 05m 44s)
  • 20:30 pnorman@tin: Started deploy [kartotherian/deploy@ecdda41]: (no justification provided)
  • 20:05 andrewbogott: rebooted labservices1002, labcontrol1002, labnet1002
  • 19:56 andrewbogott: rebooting labpuppetmaster1001
  • 19:46 andrewbogott: rebooting labpuppetmaster1002
  • 19:45 papaul: Powering down mw2140 for main board replacement
  • 18:20 niharika29@tin: Synchronized php-1.31.0-wmf.17/includes/EditPage.php: Update Save/Publish button flag from 'constructive' to 'progressive' https://gerrit.wikimedia.org/r/#/c/404733/ (duration: 01m 14s)
  • 18:09 moritzm: uploading HHVM 3.18.5+wmf4 for stretch-wikimedia to apt.wikimedia.org (3.18.7 with the patch https://github.com/facebook/hhvm/commit/bd7b2bcfe70b053a3a001480653012f68599250f backed out)
  • 18:08 ejegg: turned off main silverpop recipient data fetch job
  • 17:55 mutante: gerrit login page design changed (https://gerrit.wikimedia.org/r/402665) in case you were worried it was a fake page trying to steal your login, heh
  • 17:44 moritzm: resetting RAC on labsdb1004 (serial console inaccessible)
  • 17:17 chasemp: reboot labstore2003
  • 17:12 madhuvishy: Rebooting labstore2004
  • 17:08 godog: bootstrap cassandra-a on restbase1013
  • 17:06 ema: upgrade pybal on primary LVSs to 1.14.3 T184715, T184721
  • 16:52 ema: upgrade secondary LVSs to pybal 1.14.3 T184715, T184721
  • 16:33 XioNoX: routing ns2 to radon
  • 16:26 ema: reboot baham (codfw authdns) for kernel upgrade
  • 16:24 XioNoX: routing ns1 to eqiad
  • 16:17 chasemp: labmon1001:~# service grafana-server
  • 16:17 ema: reboot radon (eqiad authdns) for kernel upgrade
  • 16:13 jgleeson: updated civicrm from 354f32fe8a to c70f01cd83
  • 16:12 chasemp: labmon1001:~# /sbin/reboot
  • 16:09 XioNoX: routing ns0 to codfw (baham)
  • 16:07 moritzm: upgrading HHVM in codfw to 3.18.7 (wmf4)
  • 16:06 moritzm: upgrading nginx on mwdebug servers to 1.13.6-2+wmf1~jessie1
  • 16:05 jgleeson: turned off donations queue consumer process-control job
  • 16:00 ema: pybal 1.14.3 uploaded to apt.w.o
  • 15:51 chasemp: labstore1002:~# /sbin/reboot
  • 15:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1065 after fixing data drifts - T162807 (duration: 01m 12s)
  • 15:41 _joe_: dropping ruwiki htmlCacheUpdate records stuck in the old jobqueue
  • 15:36 moritzm: upgrading nginx on mw servers in codfw to 1.13.6-2+wmf1~jessie1
  • 15:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1104 (duration: 01m 12s)
  • 14:57 moritzm: resetting RAC on labsdb1007 (serial console inaccessible)
  • 14:53 moritzm: resetting RAC on labsdb1006 (serial console inaccessible)
  • 14:38 chasemp: labstore1001:~# /sbin/reboot
  • 14:27 zeljkof: EU SWAT finished
  • 14:23 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create "eliminator" user group on ur.wikipedia (T184607) (duration: 01m 12s)
  • 14:14 moritzm: repooling chromium
  • 14:14 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Draft Namespace in enwikiversity (T184957) (duration: 01m 12s)
  • 14:07 moritzm: rebooting chromium for kernel security update
  • 14:04 gehel: restart of elasticsearch / cirrus eqiad completed (cluster still recovering)
  • 14:03 moritzm: depooling chromium
  • 13:51 chasemp: reboot labstore2003
  • 13:46 akosiaris: reboot sca2003 webperf2001 planet2001 poolcounter2002 mx2001 kubetcd200{1,2,3} install2002 dbmonitor2001 alsafi acrux hassaleh diadem nihal pybal-test200{1,2,3} releases2001 tureis for PCID, INVPCID
  • 13:45 chasemp: labstore2002:~# sudo update-grub && /sbin/reboot
  • 13:40 chasemp: labstore2001:~# /sbin/reboot
  • 13:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1104 (duration: 01m 13s)
  • 13:31 akosiaris: reboot acrab for PCID,INVPCID enabling
  • 13:22 marostegui: Deploy schema change on db1099:3318 - https://phabricator.wikimedia.org/T174569
  • 13:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3318 - T174569 (duration: 01m 12s)
  • 13:17 moritzm: upgrading app server canaries to 3.18.5+wmf4
  • 13:12 marostegui: Fixing drifts on db1065 - T162807
  • 12:28 moritzm: uploading HHVM 3.18.5+wmf4 for jessie-wikimedia to apt.wikimedia.org (3.18.7 with the patch https://github.com/facebook/hhvm/commit/bd7b2bcfe70b053a3a001480653012f68599250f backed out)
  • 12:10 moritzm: updating HHVM in deployment-prep to 3.18.5+wmf4
  • 11:40 godog: bootstrap cassandra-b on restbase1016
  • 11:28 moritzm: rearmed keyholder on neodymium
  • 11:24 moritzm: rebooting neodymium for kernel security update
  • 11:19 _joe_: restarted nginx on mw1346, was in a bad state
  • 10:51 moritzm: reset RAC on chromium, serial console is inaccessible
  • 10:42 moritzm: repooling hydrogen
  • 10:39 moritzm: rebooting hydrogen for kernel security update
  • 10:34 moritzm: depooling hydrogen again
  • 10:22 moritzm: repooling hydrogen (and pdns-recursor restarted), experiment concluded
  • 10:14 moritzm: depooling hydrogen (and keeping pdns-recursor stopped for a few minutes to check whether problems with load-balanced recdns traffic are still an issue)
  • 10:11 moritzm: reset RAC on hydrogen, serial console was inaccessible
  • 10:01 godog: start cassandra-a on restbase1016
  • 09:52 elukey: reboot druid1002 for kernel upgrades
  • 09:46 elukey: removed upstart config for brrd on eventlog1001 (failing and spamming syslog, old leftover?)
  • 09:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Full repool db1101:3318 (duration: 01m 11s)
  • 09:30 moritzm: rebooting flerovium and furud for kernel security update
  • 09:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1101:3318 (duration: 01m 12s)
  • 09:14 godog: reimage restbase1016 - T184100
  • 09:06 elukey: reboot analytics1003 for kernel upgrades
  • 09:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 - T162807 (duration: 01m 11s)
  • 08:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1101:3318 (duration: 15m 42s)
  • 08:44 elukey: reboot stat100[456] for kernel upgrades
  • 07:48 elukey: restart varnish backend on cp4024 (ton of 503s, icinga alerting for mailbox lag)
  • 07:46 oblivian@neodymium: conftool action : set/pooled=inactive; selector: cluster=appserver,name=mw12([0-1][0-9]|20)\.eqiad\.wmnet
  • 07:45 _joe_: depooling mw1209-1220 from the appserver cluster for decommissioning, T185004
  • 06:47 marostegui: Remove labsdb1001 and labsdb1003 from tendril - T184832
  • 06:40 marostegui: Stop MySQL on labsdb1001 (already dead) and labsdb1003 - T184832
  • 06:29 marostegui: Stop replication in sync on db1089 and s1 codfw master (db2048) - T162807
  • 06:28 marostegui: Deploy schema change on db1104 - T174569
  • 06:21 marostegui: Upgrade mariadb and kernel on db1104
  • 06:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1104 - T174569 (duration: 01m 14s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.16) (duration: 07m 11s)
  • 00:28 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T182616 Remove cirrus AB test config for hewiki (duration: 01m 09s)
  • 00:26 ebernhardson@tin: Synchronized fc-list: SWAT: T184664 Updating fonts list and sorting it (duration: 01m 12s)
  • 00:21 ebernhardson@tin: Synchronized fc-list: SWAT: T184664 Updating fonts list and sorting it (duration: 01m 12s)
  • 00:10 ebernhardson@tin: Synchronized php-1.31.0-wmf.16/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: T182616 Turn off cirrus AB test on hewiki (duration: 01m 12s)
  • 00:08 ebernhardson@tin: Synchronized php-1.31.0-wmf.17/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: T182616 Turn off cirrus AB test on hewiki (duration: 01m 14s)

2018-01-16

  • 22:57 niharika29@tin: Finished deploy [scholarships/scholarships@728d203]: Update privacy statement and delete invalidated translation files. T184659 (duration: 00m 02s)
  • 22:57 niharika29@tin: Started deploy [scholarships/scholarships@728d203]: Update privacy statement and delete invalidated translation files. T184659
  • 22:53 thcipriani@tin: rebuilt and synchronized wikiversions files: group0 to 1.31.0-wmf.17
  • 22:40 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Use EventBus for htmlCacheUpdate jobs for all wikis but en, commons and wikidata - T182023 (duration: 01m 12s)
  • 22:39 ppchelko@tin: Finished deploy [cpjobqueue/deploy@19b9bdd]: Switch htmlCacheUpdates for all but en, commons, wikidata T182023 (duration: 00m 35s)
  • 22:39 ppchelko@tin: Started deploy [cpjobqueue/deploy@19b9bdd]: Switch htmlCacheUpdates for all but en, commons, wikidata T182023
  • 22:19 thcipriani@tin: Synchronized php-1.31.0-wmf.17/extensions/WikimediaMessages/WikimediaMessages.hooks.php: Update access to ORES isModelEnabled() (duration: 01m 13s)
  • 22:19 ottomata: apt-get install librdkafka1=0.9.4-1~jessie1 librdkafka++1=0.9.4-1~jessie1 on scb* to put librdkafka back at node-rdkafka compat version (somehow this was upgraded yesterday...very dangerous!!)
  • 22:16 thcipriani@tin: Finished scap: testwiki to php-1.31.0-wmf.17 and rebuild l10n cache (duration: 25m 45s)
  • 21:50 thcipriani@tin: Started scap: testwiki to php-1.31.0-wmf.17 and rebuild l10n cache
  • 21:15 andrewbogott: rebooting labvirt1014 and 1015
  • 21:04 thcipriani@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.16
  • 20:59 andrewbogott: rebooting labvirt1013
  • 20:42 demon@tin: Finished scap: wmf.17 files, no bootstrap of i18n tho (x2) (duration: 06m 33s)
  • 20:35 demon@tin: Started scap: wmf.17 files, no bootstrap of i18n tho (x2)
  • 20:34 demon@tin: scap aborted: wmf.17 files, no bootstrap of i18n tho (duration: 08m 59s)
  • 20:34 andrewbogott: rebooting labvirt1011
  • 20:32 herron: re-enabling puppet agents
  • 20:25 demon@tin: Started scap: wmf.17 files, no bootstrap of i18n tho
  • 20:24 herron: temporarily disabling puppet agents while troubleshooting puppet crl
  • 20:21 andrewbogott: rebooting labvirt1010
  • 20:07 thcipriani@tin: rebuilt and synchronized wikiversions files: group1 to 1.31.0-wmf.16
  • 20:03 andrewbogott: rebooting labvirt1009
  • 19:46 andrewbogott: rebooting labvirt1008
  • 19:46 thcipriani@tin: Synchronized php-1.31.0-wmf.16/includes/Storage/RevisionStore.php: RevisionStore, fix loadSlotContent with no $blobFlags T184749 (duration: 01m 13s)
  • 19:30 twentyafterfour: restarted wikibugs (several attempts, eventually it worked)
  • 18:50 chasemp: reboot labvirt1020
  • 18:44 chasemp: reboot labvirt1019
  • 18:35 andrewbogott: rebooting labvirt1007
  • 18:30 arlolra@tin: Finished deploy [parsoid/deploy@1026fd2]: Updating Parsoid to 231bfff (duration: 13m 13s)
  • 18:25 herron: removing ganeti VM puppetcompiler1001
  • 18:19 moritzm: rebooting labmon1002 for kernel security update
  • 18:17 arlolra@tin: Started deploy [parsoid/deploy@1026fd2]: Updating Parsoid to 231bfff
  • 17:59 moritzm: rebooting labnet100[34] and labcontrol100[34] for kernel security update
  • 17:52 herron: re-enabled puppet agents
  • 17:50 andrewbogott: rebooting labvirt1005
  • 17:45 herron: disabled puppet agents troubleshooting T184444
  • 17:31 andrewbogott: rebooting labvirt1004
  • 17:11 andrewbogott: upgrading and rebooting labvirt1002
  • 17:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1092 original weight (duration: 01m 12s)
  • 16:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1092 (duration: 01m 12s)
  • 16:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1092 (duration: 01m 12s)
  • 16:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1092 - T174569 (duration: 01m 08s)
  • 16:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1101:3317 (duration: 01m 12s)
  • 16:11 oblivian@neodymium: conftool action : set/pooled=no; selector: cluster=api_appserver,name=mw120[1-8]\.eqiad\.wmnet
  • 16:10 _joe_: depooling mw1201-1208 from the API cluster, T185004
  • 16:09 moritzm: rebooting praseodymium for kernel security update
  • 16:08 godog: bootstrap cassandra-c on restbase1018
  • 16:04 chasemp: add arturo to acl*operations-team
  • 16:03 moritzm: rebooting labweb* hosts for kernel security update
  • 15:57 andrewbogott: rebooting labvirt1001
  • 15:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1101:3317 after kernel upgrade (duration: 01m 12s)
  • 15:44 elukey: reboot druid1003 for kernel upgrades
  • 15:41 moritzm: rebooting achernar for kernel security update
  • 15:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1101:3317 after kernel upgrade (duration: 01m 12s)
  • 15:31 moritzm: rebooting acamar for kernel security update
  • 15:30 marostegui: Deploy schema change on db1101:3318 - T174569
  • 15:11 marostegui: Upgrade mariadb and kernel on db1101
  • 15:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1101:3317 db1101:3318 for schema change, mariadb upgrade and kernel upgrade - T162807 (duration: 01m 12s)
  • 14:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:3311 - T162807 (duration: 01m 09s)
  • 14:52 moritzm: rebooting graphite1001 for kernel security update
  • 14:35 moritzm: rebooting graphite1003 for kernel security update
  • 14:25 moritzm: powercycling labtestservices2003, stuck in reboot
  • 14:18 moritzm: powercycling labtestservices2001, stuck in reboot
  • 14:13 zeljkof: EU SWAT finished
  • 14:12 elukey: reboot druid100[56] for kernel upgrades
  • 14:11 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Restrict sending mails to new users" config change (T184470) (duration: 01m 13s)
  • 14:09 godog: bootstrap cassandra-b on restbase1018
  • 14:01 moritzm: rebooting labtest* hosts for kernel security update
  • 13:56 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=druid1004.*.wmnet
  • 13:54 moritzm: rebooting graphite1002 for kernel security update
  • 13:52 elukey: reboot druid1004 for kernel upgrades
  • 13:46 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=druid1004.*.wmnet
  • 13:46 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=druid1004*.wmnet
  • 13:44 moritzm: rebooting graphite2002 for kernel security update
  • 13:31 oblivian@neodymium: conftool action : set/weight=25; selector: cluster=api_appserver,name=mw134[3-8[B]\.eqiad\.wmnet
  • 13:28 moritzm: rebooting graphite2001 for kernel security update
  • 13:20 oblivian@neodymium: conftool action : set/pooled=yes; selector: cluster=api_appserver,name=mw134[3-7]\.eqiad\.wmnet
  • 12:56 elukey: reboot kafka100[23] for kernel upgrades
  • 11:59 ariel@tin: Finished deploy [dumps/dumps@c165ca0]: enable 7z prefetch files for page content dumps (duration: 00m 04s)
  • 11:59 ariel@tin: Started deploy [dumps/dumps@c165ca0]: enable 7z prefetch files for page content dumps
  • 11:51 moritzm: rebooting mc2* hosts for kernel security update
  • 11:27 elukey: reboot kafka1001 for kernel upgrades
  • 11:12 moritzm: reboot maerlant for kernel security update
  • 11:08 moritzm: uploaded HHVM 3.18.7 for stretch-wikimedia to apt.wikimedia.org
  • 10:59 godog: roll-restart swift object server - T167400
  • 10:57 moritzm: reboot nescio for kernel security update
  • 10:13 godog: start cassandra-a on restbase1018 - T184100
  • 09:56 moritzm: upgrading canary app servers to HHVM 3.18.7
  • 09:50 marostegui: Stop replication in sync db1089 - db1105:3311 - T162807
  • 09:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311 - T162807 (duration: 01m 12s)
  • 09:38 _joe_: started refreshLinks additional jobs for commonswiki,ruwiki
  • 09:30 oblivian@neodymium: conftool action : set/weight=10; selector: cluster=api_appserver,name=mw13(39|4[12]).eqiad.wmnet
  • 09:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 - T162807 (duration: 01m 12s)
  • 09:24 oblivian@neodymium: conftool action : set/pooled=yes:weight=1; selector: cluster=api_appserver,name=mw13(39|4[12]).eqiad.wmnet
  • 09:16 moritzm: installing libxml2 security updates on mw* servers (so that it gets picked up along the HHVM 3.18.7 rollout)
  • 09:12 moritzm: installing krb5 security updates (we're just using rev deps)
  • 09:08 jynus: upgrade and reboot db1031 after switchover
  • 08:49 moritzm: rearmed keyholder on sarin
  • 08:45 moritzm: rebooting sarin for kernel security update
  • 08:38 marostegui: Stop replication in sync db1089 - db1099:3311 - T162807
  • 08:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066, depool db1099:3311 - T162807 (duration: 01m 12s)
  • 08:28 jynus: master x1 eqiad failover has finished
  • 08:19 jynus@tin: Synchronized wmf-config/db-eqiad.php: Promote db1055 as the new x1 master (duration: 00m 49s)
  • 08:17 jynus: setting db1031 (x1 master) as read only
  • 08:11 jynus: start x1 eqiad master failover
  • 08:02 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055 and db1056 after maintenance (duration: 00m 49s)
  • 07:42 jynus: moving replication topology of x1 replicas
  • 07:34 marostegui: Deploy schema change on dbstore1001 (s8) - T174569
  • 07:30 marostegui: Stop replication in sync db1066 and db1089 - T162807
  • 07:30 jynus: upgrade and reboot db1056
  • 07:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 and db1089 - T162807 (duration: 01m 13s)
  • 07:29 oblivian@neodymium: conftool action : set/weight=25; selector: name=mw1340.eqiad.wmnet
  • 07:17 jynus: upgrade and reboot db1055
  • 07:14 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1055 and db1056 for maintenance (duration: 01m 12s)
  • 07:03 marostegui: Deploy schema change on dbstore1002 (s8) - T174569
  • 06:32 marostegui: Deploy schema change on db1092 - T174569
  • 06:24 marostegui: Upgrade kernel on db1092
  • 06:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1092 - T174569 (duration: 01m 32s)
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.15) (duration: 06m 10s)

2018-01-15

  • 23:40 demon@tin: Synchronized wmf-config/InitialiseSettings.php: turn educationprogram back on for cs.wikipedia -- turns out there was no consensus and a patch should never have been written 😡 (duration: 01m 13s)
  • 18:50 _joe_: pooled mw1340 as an api appserver
  • 18:43 oblivian@puppetmaster1001: conftool action : set/pooled=active; selector: name=mw1338.eqiad.wmnet
  • 18:42 oblivian@neodymium: conftool action : set/pooled=active; selector: name=mw1338.eqiad.wmnet
  • 18:34 oblivian@neodymium: conftool action : set/pooled=active; selector: name=mw1338.eqiad.wmnet
  • 18:01 moritzm: uploading HHVM 3.18.7 (3.18.5+dfsg-1+wmf3) for jessie-wikimedia to apt.wikimedia.org
  • 17:44 moritzm: updating HHVM in deployment-prep to HHVM 3.18.7
  • 17:08 godog: bootstrap cassandra-c on restbase1017
  • 16:53 jynus: upgrade and restart db2018
  • 16:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 and db1089 - T162807 (duration: 01m 12s)
  • 16:49 jynus: finished codfw s3 master switchover
  • 16:49 jynus@tin: Synchronized wmf-config/db-codfw.php: Switchover s3 codfw master from db2018 to db2036 (duration: 01m 12s)
  • 16:41 _joe_: restarting hhvm on mw1227, threads stuck in HPHP::jit::enterTCImpl
  • 16:31 marostegui: Force WB on db2033 - T184888
  • 16:24 jynus: restarting db2036 to set as master
  • 16:20 jynus: starting codfw s3 master switchover
  • 15:55 marostegui: Stop replication in sync db1067 and db1089 - T162807
  • 15:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 and db1089 - T162807 (duration: 01m 12s)
  • 15:44 jynus: upgrade and restart db2074
  • 15:33 jynus: upgrade and restart db2057
  • 15:08 jynus: upgrade and restart db2050
  • 14:58 jynus: upgrade and restart db2043
  • 14:46 jynus: upgrade and restart db2036
  • 14:41 zeljkof: EU SWAT finished
  • 14:40 zfilipin@tin: Synchronized php-1.31.0-wmf.16/extensions/ContentTranslation: SWAT: CX1: Fix translation view UI overlaps (T184662 T184130) (duration: 01m 16s)
  • 14:08 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable lua fine grained usage tracking in some wikis (T184322) (duration: 01m 14s)
  • 14:05 moritzm: reboot rdb* hosts in codfw for kernel security update
  • 13:41 gehel: starting rolling reboot of elasticsearch / cirrus eqiad for kernel upgrade
  • 13:38 elukey: reboot eventlog1001 for kernel updates
  • 13:20 elukey: reboot kafka2003 for kernel upgrades
  • 12:04 jynus: upgrade and restart db2017
  • 11:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1065 - T162807 (duration: 01m 12s)
  • 11:54 moritzm: rebooting ores1* for kernel security update
  • 11:36 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw13(3[8-9]|4[0-9]).*
  • 11:21 godog: upload scap 3.7.6-1 - T127762
  • 11:10 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 14s)
  • 11:09 jdrewniak@tin: Synchronized portals/prod/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 14s)
  • 11:08 godog: bootstrap cassandra-a on restbase1017
  • 10:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T162807 (duration: 01m 12s)
  • 10:52 gehel: lowering disk watermark on elasticsearch eqiad to shuffle shards around
  • 10:51 jynus: s2 codfw master switchover finished
  • 10:51 hashar: Upgrading zuul to 2.5.1 on contint1001 / contint2001 | T158243
  • 10:51 jynus@tin: Synchronized wmf-config/db-codfw.php: Switchover codfw s2 master from db2017 to db2035 (duration: 01m 12s)
  • 10:50 elukey: reboot kafka2002 for kernel updates
  • 10:48 hashar: Upgrading zuul to 2.5.1 on contint1001 / contint2001
  • 10:27 jynus: upgrade and restart db2035
  • 10:22 jynus: starting codfw s2 master switchover
  • 10:16 jynus: start proxysql on terbium
  • 10:15 moritzm: reboot wasat for kernel security update
  • 09:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T162807 (duration: 01m 09s)
  • 09:58 elukey: rolling reboots of aqs hosts (1005->1009) for kernel updates
  • 09:45 marostegui: Deploy schema change on s8 codfw master (db2045) with replication (this will generate lag on s8 codfw) - T174569
  • 09:32 elukey: reboot kafka2001 for kernel updates
  • 09:11 hashar: upgrading Zuul on contint2001 (zuul-merger) | https://gerrit.wikimedia.org/r/#/c/356181/
  • 09:09 hashar: upgrading Zuul on contint1001 | https://gerrit.wikimedia.org/r/#/c/356181/
  • 09:07 elukey: reboot aqs1004 for kernel updates
  • 08:44 jynus: disconnecting codfw -> eqiad replication for x1
  • 08:42 moritzm: reboot wezen for kernel security update
  • 08:22 moritzm: rebooting bast1001 for kernel security update
  • 08:15 moritzm: rebooting terbium for kernel security update
  • 08:11 ema: lvs400[56]: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 07:58 _joe_: reenabling puppet on all systems where it was previously enabled, after various testing
  • 07:50 _joe_: forcing puppet run on the puppetmasters to force pluginsync for function change
  • 07:41 _joe_: disabling puppet in all of production before merging https://gerrit.wikimedia.org/r/402345
  • 07:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Replace db1063 with db1087 as vslow in s8 (duration: 01m 12s)
  • 07:11 marostegui: Deploy schema change on silver (labswiki) and labtestweb2001 (labtestwiki) - T174569
  • 06:52 marostegui: Upgrade MariaDB on db1065
  • 06:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 to fix data drifts on the archive table - T162807 (duration: 01m 13s)
  • 06:13 marostegui: Deploy schema change on db1070 (s5 master) - T174569
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.15) (duration: 07m 50s)

2018-01-12

  • 20:07 mutante: mw1227 hhvm-restart
  • 20:07 mutante: mw1227 - high load: hhvm-dump-debug > /root/hhvm-dump-debug-2017012.log | Backtrace saved as /tmp/hhvm.2203.bt.
  • 19:19 ejegg: disabled Omnimail recipient load backfill job
  • 19:09 bblack: leftover cruft from expired digicert-2016 certs all cleaned up now :)
  • 19:08 jynus: upgrade and restart db2091
  • 18:32 jynus: upgrade and restart db2088
  • 18:28 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1014.eqiad.wmnet
  • 17:59 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1014.eqiad.wmnet
  • 17:58 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1012.eqiad.wmnet
  • 17:41 jynus: upgrade and restart db2064
  • 17:34 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1012.eqiad.wmnet
  • 17:33 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1010.eqiad.wmnet
  • 17:31 demon@tin: Synchronized docroot/mediawiki/: prettier keys page (duration: 01m 13s)
  • 17:28 cwd: re-enabled payments,civi,listener,p-c
  • 17:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1089 (duration: 01m 09s)
  • 17:11 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1010.eqiad.wmnet
  • 17:11 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1009.eqiad.wmnet
  • 17:06 cwd: disabled payments/civi/listener
  • 17:06 cwd: disabled process-control jobs
  • 17:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1089 (duration: 01m 12s)
  • 16:52 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1009.eqiad.wmnet
  • 16:52 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1008.eqiad.wmnet
  • 16:46 jynus: upgrade and restart db2063
  • 16:34 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1008.eqiad.wmnet
  • 16:34 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1007.eqiad.wmnet
  • 16:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1089 (duration: 01m 12s)
  • 16:19 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1007.eqiad.wmnet
  • 16:18 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2006.codfw.wmnet
  • 16:15 jynus: upgrade and restart db2056
  • 15:44 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2006.codfw.wmnet
  • 15:44 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2005.codfw.wmnet
  • 15:39 jynus: upgrade and restart db2049
  • 14:27 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2005.codfw.wmnet
  • 13:24 jynus: upgrade and restart db2041
  • 12:55 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2004.codfw.wmnet
  • 12:37 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2004.codfw.wmnet
  • 12:37 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2003.codfw.wmnet
  • 12:27 jynus: stop db2035 replication for maintenance
  • 12:23 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2035 for maintenance (duration: 01m 13s)
  • 12:15 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2003.codfw.wmnet
  • 12:14 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2002.codfw.wmnet
  • 11:42 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2002.codfw.wmnet
  • 11:41 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2001.codfw.wmnet
  • 11:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1066 (duration: 01m 12s)
  • 11:07 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2001.codfw.wmnet
  • 10:50 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2007.codfw.wmnet
  • 10:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1105:3311 and slowly repool db1066 (duration: 01m 13s)
  • 10:33 twentyafterfour@tin: Finished deploy [phabricator/deployment@61f1099]: (no justification provided) (duration: 03m 49s)
  • 10:33 elukey: reboot analytics1066->69 for kernel updates
  • 10:30 twentyafterfour@tin: Started deploy [phabricator/deployment@61f1099]: (no justification provided)
  • 10:29 twentyafterfour@tin: Finished deploy [phabricator/deployment@61f1099]: (no justification provided) (duration: 00m 07s)
  • 10:29 twentyafterfour@tin: Started deploy [phabricator/deployment@61f1099]: (no justification provided)
  • 10:24 moritzm: reboot job runners in codfw for kernel security update (along with update to HHVM 3.18.6)
  • 10:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight db1105:3311 (duration: 01m 13s)
  • 10:11 godog: upload scap 3.7.5-1 - T184774
  • 10:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1082 and db1100 (duration: 01m 22s)
  • 10:02 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: mw2140.codfw.wmnet
  • 09:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1100, db1105:3311, db1105:3312 (duration: 01m 23s)
  • 09:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1100 (duration: 01m 22s)
  • 09:11 godog: reboot ms-be2023 - sdn failed and raid controller isn't happy
  • 09:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1105:3311 and db1105:3312 (duration: 01m 23s)
  • 09:07 elukey: reboot analytics1063->65 for kernel updates
  • 09:04 marostegui: Upgrade kernel on db1100
  • 09:00 elukey: forced remount of /mnt/hdfs on stat1005 after OOM
  • 08:46 moritzm: reboot remaining API servers in codfw for kernel security update (along with update to HHVM 3.18.6)
  • 08:14 moritzm: reboot video scalers in codfw for kernel security update
  • 07:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:3312 with low weight (duration: 01m 22s)
  • 07:01 marostegui: Stop replication in sync db1089 db1105:3311 - T162807
  • 06:46 marostegui: Update mariadb and kernel on db1105 - T184256
  • 06:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311, db1105:3312 - T162807 T184256 (duration: 01m 22s)
  • 06:24 marostegui: Deploy schema change on db1100 - T174569
  • 06:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1100 - T174569 (duration: 01m 22s)
  • 00:51 thcipriani@tin: Synchronized php-1.31.0-wmf.15/extensions/WikibaseQualityConstraints/extension.json: SWAT: Declare dependency on jquery.makeCollapsible (duration: 01m 21s)
  • 00:43 thcipriani@tin: Synchronized php-1.31.0-wmf.15/extensions/WikibaseQualityConstraints/modules/ui/ConstraintReportGroup.less: SWAT: Do not hide default [Expand] link (duration: 01m 22s)
  • 00:40 thcipriani@tin: Synchronized php-1.31.0-wmf.16/extensions/WikibaseQualityConstraints/modules/ui/ConstraintReportGroup.less: SWAT: Do not hide default [Expand] link (duration: 01m 24s)

2018-01-11

  • 22:35 ottomata: restarting kafka-jumbo brokers to apply https://gerrit.wikimedia.org/r/403774
  • 22:04 ottomata: restarting kafka-jumbo brokers to apply https://gerrit.wikimedia.org/r/#/c/403762/
  • 20:57 ottomata: restarting kafka-jumbo brokers to apply https://gerrit.wikimedia.org/r/#/c/403753/
  • 20:52 andrewbogott: rebooting labvirt1003
  • 20:12 twentyafterfour@tin: rebuilt and synchronized wikiversions files: Rollback group1 to wmf.15 due to T184749 refs T180749
  • 20:12 andrewbogott: rebooting labvirt1017 for kernel upgrade
  • 20:04 catrope@tin: Finished scap: SWAT (duration: 30m 12s)
  • 19:58 gehel: elasticsearch / cirrus / codfw rolling reboot completed. Cluster still recovering
  • 19:34 catrope@tin: Started scap: SWAT
  • 19:20 catrope@tin: Synchronized php-1.31.0-wmf.16/includes/: Deprecate old interwiki search result widget (duration: 02m 17s)
  • 19:09 catrope@tin: Synchronized php-1.31.0-wmf.16/extensions/Flow/modules/styles/flow/widgets/editor/mw.flow.ui.EditorWidget.less: T184631 (duration: 01m 22s)
  • 18:06 ema: lvs4007: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 18:00 jynus: upgrade and restart db1102 - it may add some minutes of lag to some wikis on wikireplicas
  • 17:32 jynus: shutting down db1059 for maintenance
  • 16:57 akosiaris: upgrade apertium on scb100* nodes done
  • 16:55 godog: start rolling restart of restbase-test / restbase-dev cluster
  • 16:54 jynus: upgrade and restart db1095 - it may add some minutes of lag to some wikis on wikireplicas
  • 16:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1099:3311 (duration: 01m 22s)
  • 16:28 moritzm: rebooting mwlog2001 for kernel security update
  • 16:19 moritzm: rebooting mwlog1001 for kernel security update
  • 16:05 moritzm: rebooting notebook1001 for kernel security update
  • 16:05 akosiaris: upgrade apertium on scb200* nodes
  • 15:59 moritzm: reboot lithium for kernel security update
  • 15:51 moritzm: reboot oxygen for kernel security update
  • 15:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1099:3311 weight (duration: 01m 21s)
  • 15:28 moritzm: reboot ruthenium for kernel security update
  • 15:26 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1008.eqiad.wmnet
  • 15:18 akosiaris: clear trusty-wikimedia from apertium packages. The apertium service has been on jessie for a long time now and all users should have migrated by now. If not, they should
  • 15:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1099:3311 weight (duration: 01m 23s)
  • 15:08 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1008.eqiad.wmnet
  • 15:05 moritzm: rolling reboot of prometheus in eqiad for kernel security update
  • 15:01 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1007.eqiad.wmnet
  • 14:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067,db1099:3318,db1099:3311, depool db1066 (duration: 01m 19s)
  • 14:58 marostegui: Upgrade mariadb and kernel on db1066
  • 14:47 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1007.eqiad.wmnet
  • 14:47 godog: continue swift frontend eqiad roll-restart, ms-fe1007 / ms-fe1008
  • 14:45 jynus@tin: Synchronized wmf-config/db-codfw.php: Promote db2040 as the new codfw-s7 master (duration: 01m 22s)
  • 14:40 moritzm: rolling reboot of prometheus in codfw for kernel security update
  • 14:37 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: mw1271.eqiad.wmnet
  • 14:36 joal@tin: Finished deploy [analytics/refinery@ed8ecbc]: Patching interlanguage link and manually add a jar to our collection (duration: 04m 10s)
  • 14:36 jynus: running scap pull on mw1271
  • 14:32 joal@tin: Started deploy [analytics/refinery@ed8ecbc]: Patching interlanguage link and manually add a jar to our collection
  • 14:26 moritzm: powercycling mw1271
  • 14:25 zeljkof: EU SWAT finished
  • 14:17 jynus: upgrade and restart db2029
  • 14:16 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create extendedconfirmed for kowiki (T184675) (duration: 01m 23s)
  • 14:14 akosiaris: set migration_downtime to 2000ms for seaborgium
  • 14:01 moritzm: reboot hafnium for kernel security update
  • 14:00 moritzm: reboot tungsten for kernel security update
  • 13:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1099:3318 weight (duration: 01m 15s)
  • 13:56 jynus: perform master switchover of s7 codfw
  • 13:42 moritzm: rebooting ores2* for kernel security update
  • 13:34 jynus: upgrade and restart db2077
  • 13:34 moritzm: rebooting bast2001 for kernel security update
  • 13:31 moritzm: migrating instances off ganeti1001 for subsequent reboot for kernel security update
  • 13:27 moritzm: failover the ganeti master in eqiad to ganeti1004
  • 12:39 volans: Icinga failover back to einsteinium completed - T170353
  • 12:38 moritzm: rearmed keyholder on naos
  • 12:36 moritzm: migrating instances off ganeti1007 for subsequent reboot for kernel security update
  • 12:34 moritzm: rebooting naos for kernel security update
  • 12:28 volans: Start Icinga failover back to einsteinium - T170353
  • 12:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3318 with low weight (duration: 01m 44s)
  • 12:07 marostegui: Stop replication in sync db1089 db1099:3311 - T162807
  • 12:03 moritzm: migrating instances off ganeti1006 for subsequent reboot for kernel security update
  • 11:33 moritzm: migrating instances off ganeti1005 for subsequent reboot for kernel security update
  • 11:14 moritzm: migrating instances off ganeti1004 for subsequent reboot for kernel security update
  • 11:07 moritzm: reboot remaining job runners in eqiad for kernel security update (along with update to HHVM 3.18.6)
  • 11:02 akosiaris: upload cg3_1.0.0~r12254-1+wmf1_amd64 to apt.wikimedia.org/jessie-wikimedia/main
  • 11:02 moritzm: migrating instances off ganeti1003 for subsequent reboot for kernel security update
  • 10:56 akosiaris: upload apertium_3.4.2~r68466-3+wmf1_amd64 to apt.wikimedia.org/jessie-wikimedia/main T181464
  • 10:54 akosiaris: set kvm:migration_downtime to 30ms for both eqiad/codfw ganeti clusters. Then set migration_downtime 30000 for nitrogen/nihal
  • 10:52 moritzm: rearmed keyholder on tin
  • 10:47 moritzm: rebooting tin for kernel security update
  • 10:43 marostegui: Upgrade and restart db1099:3311 and db1099:3318
  • 10:41 jynus: upgrade and restart db2068
  • 10:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1110 original weight (duration: 01m 04s)
  • 10:27 moritzm: rolling reboot of sca/zotero clusters for kernel security update
  • 10:23 jynus: upgrade and restart db2061
  • 10:20 akosiaris: upload hfst_3.13.0~r3461-1+wmf1_amd64 to apt.wikimedia.org/jessie-wikimedia/main T181463
  • 10:14 moritzm: migrating instances off ganeti1002 for subsequent reboot for kernel security update
  • 10:07 jynus: upgrade and restart db2054
  • 10:06 moritzm: rebooting rhenium for kernel security update
  • 10:00 elukey: reboot analytics1059-61 for kernel updates
  • 10:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1110 weight (duration: 01m 06s)
  • 09:41 moritzm: reboot bast4002 for kernel security update
  • 09:34 elukey: reboot analytics1055->1058 for kernel updates
  • 09:32 godog: cleanup ores metrics older than 30d - T169969
  • 09:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1110 with low weight - T174569 (duration: 01m 08s)
  • 09:24 gehel: relforge reboot completed
  • 09:08 gehel: reboot of relforge* for kernel upgrade
  • 09:04 elukey: reboot analytics1051->1054 for kernel updates
  • 09:00 gehel: logstash rolling restart completed
  • 08:57 moritzm: reboot remaining mediawiki app servers in eqiad for kernel security update (along with update to HHVM 3.18.6)
  • 08:55 marostegui: Upgrade db1110 kernel - T184256
  • 08:36 moritzm: powercycling wtp2013 (apparently didn't come back up after reboot)
  • 08:27 marostegui: Fix data drifts on enwiki.archive on codfw - T162807
  • 08:21 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=logstash1007.eqiad.wmnet
  • 08:17 gehel: rolling restart of logstash for kernel upgrade
  • 07:50 marostegui: Deploy schema change on db1110 - T174569
  • 07:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1110 - T174569 (duration: 01m 03s)
  • 07:47 moritzm: reboot remaining mediawiki API servers for kernel security update (along with update to HHVM 3.18.6)
  • 07:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1082 - T174569 (duration: 01m 03s)
  • 07:24 marostegui: Drop external_user table from s3 - T184247
  • 07:17 foks: Removed 2FA from Amjaabc
  • 07:12 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3311 db1099:3318 - T162807 T184256 (duration: 01m 02s)
  • 06:32 marostegui: Deploy schema change on db1082.s5 with replication (this will generate lag on labs) - T174569
  • 06:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 - T174569 (duration: 01m 02s)
  • 06:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1096:3315 - T174569 (duration: 01m 03s)
  • 06:21 marostegui: Upgrade mariadb+kernel on db1089
  • 06:17 marostegui: Force BBU relearn on db1059 - T184160
  • 02:37 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.15) (duration: 11m 10s)
  • 00:57 urandom: bootstrapping restbase1011-c -- T184100

2018-01-10

  • 23:50 eileen: civicrm revision changed from 429a5c5385 to 354f32fe8a, deploy contact change, contact search fixes, install cleanup
  • 21:57 twentyafterfour: group1 looks stable. This concludes the MediaWiki train for today.
  • 21:54 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.16 (duration: 01m 02s)
  • 21:53 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.16
  • 21:47 twentyafterfour@tin: Finished scap: group0 to 1.31.0-wmf.16 refs T180749 (duration: 38m 29s)
  • 21:09 twentyafterfour@tin: Started scap: group0 to 1.31.0-wmf.16 refs T180749
  • 20:49 twentyafterfour@tin: Synchronized php-1.31.0-wmf.16: Sync wmf.16 to deploy multiple patches from addshore refs T180749 (duration: 10m 23s)
  • 20:14 otto@tin: Finished deploy [eventstreams/deploy@ee854df]: Update eventstreams with newer service-template-node: T171011 (duration: 04m 11s)
  • 20:09 otto@tin: Started deploy [eventstreams/deploy@ee854df]: Update eventstreams with newer service-template-node: T171011
  • 20:09 otto@tin: Finished deploy [eventstreams/deploy@ee854df]: Update eventstreams deploy test to scb2002: T171011 (duration: 00m 24s)
  • 20:09 otto@tin: Started deploy [eventstreams/deploy@ee854df]: Update eventstreams deploy test to scb2002: T171011
  • 20:05 jynus: upgrade and restart dbstore2002
  • 20:00 jynus: upgrade and restart dbstore2001
  • 19:45 jynus: upgrade and restart db2047
  • 19:32 urandom: bootstrapping restbase1011-b -- T184100
  • 19:22 thcipriani@tin: Synchronized wmf-config/throttle.php: SWAT: Add throttle rule for Paris University and sort other by date T184618 (duration: 01m 03s)
  • 19:00 jynus: upgrade and restart db1059
  • 18:45 chasemp: reboot labtestvirt2002.codfw.wmnet w/ new kernel
  • 18:40 andrewbogott: upgrading labvirt1018 kernel and rebooting
  • 18:23 jynus: upgrade and restart db2040
  • 17:59 jynus: upgrade and restart db2087
  • 17:48 andrewbogott: installing linux-image-generic-lts-xenial on labtestvirt2003
  • 17:44 jynus: upgrade and restart db2086
  • 16:55 elukey: reboot analytics1047->50 for kernel updates
  • 16:43 akosiaris: wtp* rolling restarts for meltdown finished
  • 16:39 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1006.eqiad.wmnet
  • 16:38 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1008.eqiad.wmnet
  • 16:35 godog: bounce thumbor-instances on thumbor1001
  • 16:26 anomie: Running cleanupUsersWithNoId.php on dewiki and wikidatawiki
  • 16:22 ottomata: restarting kafka jumbo brokers to apply java.security certpath restrictions
  • 16:08 godog: roll-restart swift frontend in eqiad for kernel upgrade
  • 16:06 moritzm: migrating instances off ganeti2001 for subsequent reboot for kernel security update
  • 16:05 moritzm: switched ganeti master node in codfw to ganeti2004
  • 16:03 marostegui: Deploy schema change on db1096.s5 - https://phabricator.wikimedia.org/T174569
  • 16:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3315 - T174569 (duration: 01m 02s)
  • 15:59 godog: start cassandra-a on restbase1011
  • 15:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1097:3315 - T174569 (duration: 01m 03s)
  • 15:32 moritzm: rebooting yubico auth servers for kernel security update
  • 15:14 moritzm: reboot netmon1002 / netmon2001 for kernel security update
  • 14:54 ema: codfw LVSs: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 14:51 godog: start cassandra-a on restbase1011 - T184100
  • 14:50 zeljkof: EU SWAT finished
  • 14:50 jynus: dropping dewiki from dbstore2001:3318 T184599
  • 14:47 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: translationadmin: remove configuration equal to CommonSettings.php (T184314) (duration: 01m 02s)
  • 14:46 zfilipin@tin: Synchronized wmf-config/CommonSettings.php: SWAT: translationadmin: typo fix (duration: 01m 03s)
  • 14:42 chasemp: new meltdown images are live in cloud land
  • 14:34 jynus: dropping wikidatawiki from dbstore2001:3315 T184599
  • 14:09 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Lift the cap on IP address to create accounts on mrwiki (T184579) (duration: 01m 04s)
  • 14:05 moritzm: migrating instances off ganeti2002 for subsequent reboot for kernel security update
  • 13:37 moritzm: migrating instances off ganeti2003 for subsequent reboot for kernel security update
  • 13:26 _joe_: restarting pybal on lvs2003
  • 13:03 mobrovac@tin: Finished deploy [restbase/deploy@a2aabfb]: API: add top-by-country, change recommendation route, fix duplicates in onthisday - T181520 T170877 T175974 (duration: 08m 00s)
  • 12:55 mobrovac@tin: Started deploy [restbase/deploy@a2aabfb]: API: add top-by-country, change recommendation route, fix duplicates in onthisday - T181520 T170877 T175974
  • 12:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097:3315 - T174569 (duration: 01m 03s)
  • 12:54 marostegui: Deploy schema change on db1097:3315 - https://phabricator.wikimedia.org/T174569
  • 12:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 - T174569 (duration: 01m 03s)
  • 12:38 moritzm: migrating instances off ganeti2004 for subsequent reboot for kernel security update
  • 12:19 moritzm: migrating instances off ganeti2005 for subsequent reboot for kernel security update
  • 12:11 moritzm: rebooting einsteinium for kernel security update
  • 11:51 moritzm: migrating instances off ganeti2006 for subsequent reboot for kernel security update
  • 11:45 godog: downtime decommissioned restbase cassandra 2 hosts
  • 11:39 moritzm: rebooting mw1201-mw1208 for kernel security update (along with update to HHVM 3.18.6)
  • 11:33 marostegui: Deploy schema change on db1106 - T174569
  • 11:26 elukey: reboot analytics1044->47 for kernel updates
  • 11:23 moritzm: migrating instances off ganeti2007 for subsequent reboot for kernel security update
  • 11:19 volans: Icinga failover to tegmen completed - T170353
  • 11:12 moritzm: migrating instances off ganeti2008 for subsequent reboot for kernel security update
  • 11:07 volans: start failover of Icinga to tegmen - T170353
  • 10:55 elukey: reboot analytics1040->43 for kernel updates
  • 10:29 godog: reimage restbase1011 to test HBA mode - T184100
  • 10:16 moritzm: rebooting bast4001 for kernel security update
  • 10:06 elukey: rebooting analytics1035 (hadoop worker node and hdfs journal node) for kernel updates
  • 10:02 moritzm: rebooting tegmen for kernel security update
  • 09:50 godog: shut cassandra 2 on restbase legacy nodes - T184100
  • 09:40 moritzm: rebooting kubernetes workers (plus staging hosts) for kernel security update
  • 09:39 ema: eqiad LVSs: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 09:32 marostegui: Upgrade kernel on db1067
  • 09:27 godog: stop restbase on cassandra 2 nodes - T184100
  • 09:15 marostegui: Deploy schema change on db1051 - T174569
  • 09:12 moritzm: rebooting radium (tor relay) for kernel security update
  • 08:42 marostegui: Stop replication in sync on db1089 and db1067 - T162807
  • 08:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 and db1089 - T162807 (duration: 01m 05s)
  • 08:38 marostegui: Deploy schema change on s5 dbstore1001 - T174569
  • 08:33 moritzm: rebooting mw1299-mw1306 (job runners) for kernel security update (along with update to HHVM 3.18.6)
  • 08:28 hashar: contint1001: upgraded Zuul 2.5.0-8-gcbc7f62-wmf4jessie1 .. 2.5.0-8-gcbc7f62-wmf6 | T158243
  • 08:13 marostegui: Deploy schema change on s5 dbstore1002 - T174569
  • 07:44 moritzm: rebooting mw1262-mw1275 for kernel security update (along with update to HHVM 3.18.6)
  • 07:37 marostegui: Drop external_user from wikidatawiki - T184247
  • 06:17 marostegui: Deploy schema change on s5 codfw master (db2052) with replication (this will generate lag on codfw) - T174569
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.15) (duration: 06m 02s)
  • 01:39 mutante: mw1226 - high load - hhvm-dump-debug > /root/hhvm-dump-debug-20170109-1739PST.log ; restart-hhvm
  • 00:43 mutante: rebooting gerrit server for kernel upgrade
  • 00:18 mutante: rebooting phabricator server for kernel upgrade

2018-01-09

  • 22:52 godog: ms-be1033 truncate unrotated and big server.log
  • 22:22 aaron@tin: Synchronized php-1.31.0-wmf.16/includes/Setup.php: 68b4bbf (duration: 01m 15s)
  • 22:20 mutante: netmon2001 - arming keyholder for rancid
  • 21:10 mepps: updated SmashPig from 45aa62650c to 778e8f87b4
  • 20:57 twentyafterfour@tin: Finished scap: Deploy 1.31.0-wmf.16 to test wikis and rebuild l10n. refs T180749 (attempt 2) (duration: 36m 34s)
  • 20:21 twentyafterfour@tin: Started scap: Deploy 1.31.0-wmf.16 to test wikis and rebuild l10n. refs T180749 (attempt 2)
  • 20:14 twentyafterfour@tin: scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="test2wiki" --outdir="/tmp/scap_l10n_3984299293" --threads=10 --lang en --quiet' returned non-zero exit status 1 (duration: 02m 44s)
  • 20:13 mutante: netmon2001 - rebooting
  • 20:12 twentyafterfour@tin: Started scap: Deploy 1.31.0-wmf.16 to test wikis and rebuild l10n. refs T180749
  • 20:04 mutante: gerrit2001 - rebooting
  • 20:00 mutante: phab2001 - reboot for upgrade
  • 19:20 mepps: rolledback SmashPig from 0c45b1a684 to 45aa62650c
  • 19:07 mepps: updated SmashPig from 45aa62650c to 0c45b1a684
  • 18:42 mutante: ms-fe3002,ms-fe3001 - powering down, removing from puppet and icinga, ms-be* removing from puppet/icinga (T169518)
  • 18:38 mutante: ms-fe3001 - shutting down for decom, removed from puppet
  • 18:38 mutante: mw1227 still not showing recovery, using restart-hhvm
  • 18:29 mutante: mw1227 killed it one more time and also restarted apache.. now load going down
  • 18:26 mutante: mw1227 hhvm-dump-debug > /root/hhvm-dump-debug-20170109-1024PST.log ; then killed hhvm and restarted it with systemctl
  • 17:56 twentyafterfour: MediaWiki Train: Branching 1.31.0-wmf.16
  • 17:41 moritzm: rebooting image scalers in codfw for kernel security update (along with HHVM update)
  • 17:30 volans: re-enabled Icinga event handlers on RAID checks for lvs3001
  • 17:17 ema: failover traffic back to lvs3001, raid rebuilt
  • 17:15 godog: depool restbase cassandra 2 nodes - T184100
  • 16:35 cmjohnson1: disabling puppet for decom on mw1180-1200
  • 16:28 volans: disabled Icinga event handlers on RAID checks for lvs3001, WIP on the host
  • 16:18 gehel: starting cluster reboot for elasticsearch / cirrus codfw
  • 16:09 bd808: data-services: added s8.{analytics,web}.db.svc.eqiad.wmflabs and aliases (T181643, T184179)
  • 16:09 elukey: re-started mysql on dbstore1002 (and slave replication) after hw maintenance
  • 15:44 godog: roll-restart swift frontends in codfw and eqiad
  • 15:40 akosiaris@tin: Finished deploy [servermon/servermon@10e165e]: Testing scap check (duration: 00m 02s)
  • 15:40 akosiaris@tin: Started deploy [servermon/servermon@10e165e]: Testing scap check
  • 15:31 gehel: reboot maps-test* for kernel upgrade
  • 15:30 elukey: stop mysql on dbstore1002 as prep step for shutdown (stop all slaves, mysql stop)
  • 15:23 herron: puppet master reboots complete. re-enabling puppet agents
  • 15:18 ema: lvs3001 disk swap: failover traffic to lvs3003 T166965
  • 15:10 elukey: reboot analytics1028 (hadoop worker and hdfs journal node) for kernel updates
  • 15:07 anomie: Creating MCR tables on all wikis (T183486)
  • 15:01 herron: temporarily disabling puppet agents and rebooting puppet masters for security updates
  • 15:00 elukey: reboot kafka-jumbo1006 for kernel updates
  • 14:59 ema: lvs3001: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267, replace sdb T166965
  • 14:48 moritzm: rolling reboot of scb in eqiad for kernel security update
  • 14:41 elukey: reboot kafka-jumbo1005 for kernel updates
  • 14:36 godog: upgrade and roll-restart thumbor in codfw/eqiad - T182656 T183907 T169144
  • 14:32 elukey: reboot kafka1023 for kernel updates
  • 14:21 elukey: reboot kafka-jumbo1004 for kernel updates
  • 14:14 moritzm: rolling reboot of scb in codfw for kernel security update
  • 14:14 ema: lvs3003: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 14:07 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Save -> Publish on remaining Wikinewses which haven't updated - https://gerrit.wikimedia.org/r/#/c/403077/ (duration: 00m 53s)
  • 14:06 ema: lvs3002: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 14:04 elukey: reboot kafka1022 for kernel updates
  • 14:01 godog: copy poolcounter from jessie-wikimedia into stretch-wikimedia - T183385
  • 13:51 elukey: reboot kafka-jumbo1003 for kernel updates
  • 13:34 moritzm: rebooting remaining video scalers in eqiad for kernel security update (along with HHVM update)
  • 13:10 elukey: reboot kafka1020 for kernel updates
  • 13:07 mobrovac@tin: Finished deploy [restbase/deploy@837f5a9]: Force deploy on all targets - T184110 (duration: 07m 23s)
  • 13:00 mobrovac@tin: Started deploy [restbase/deploy@837f5a9]: Force deploy on all targets - T184110
  • 12:58 moritzm: rebooting labnodepool* for kernel security update
  • 12:55 akosiaris@tin: Finished deploy [servermon/servermon@10e165e]: Update servermon (duration: 00m 02s)
  • 12:54 akosiaris@tin: Started deploy [servermon/servermon@10e165e]: Update servermon
  • 12:23 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1014.eqiad.wmnet
  • 12:22 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1012.eqiad.wmnet
  • 12:22 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1011.eqiad.wmnet
  • 12:22 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1010.eqiad.wmnet
  • 12:19 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1009.eqiad.wmnet
  • 12:17 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1008.eqiad.wmnet
  • 12:17 moritzm: rebooting scb2001 for kernel security update
  • 12:09 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1007.eqiad.wmnet
  • 12:07 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2006.codfw.wmnet
  • 12:05 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2005.codfw.wmnet
  • 12:04 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2004.codfw.wmnet
  • 12:03 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2003.codfw.wmnet
  • 11:58 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2002.codfw.wmnet
  • 11:56 godog: roll-restart restbase c3 nodes in codfw/eqiad
  • 11:50 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2001.codfw.wmnet
  • 11:43 moritzm: rebooting app servers mw1238-mw1258 for kernel security update (along with update to HHVM 3.18.6 where applicable)
  • 11:25 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2001.codfw.wmnet
  • 11:17 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2001.codfw.wmnet
  • 11:03 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2004.codfw.wmnet
  • 11:03 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2001.codfw.wmnet
  • 11:02 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2002.codfw.wmnet
  • 11:02 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2006.codfw.wmnet
  • 11:02 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2006.codfw.wmnet
  • 11:02 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2004.codfw.wmnet
  • 11:01 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2001.codfw.wmnet
  • 10:59 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2002.codfw.wmnet
  • 10:07 ema: cp3041 soft lockup, rebooting
  • 10:03 elukey: reboot kafka-jumbo1002 for kernel updates
  • 09:59 ema: failover traffic lvs3002 -> lvs3004 (new kernel)
  • 09:51 ema: lvs3004: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 09:35 elukey: reboot kafka1014 for kernel updates
  • 09:32 godog: deploy restbase to cassandra 3 nodes
  • 09:11 godog: roll restart swift in eqiad for kernel upgrade
  • 08:39 moritzm: rebooting app servers in codfw for kernel security update
  • 08:15 jynus: stopping dbstore2001:s5 for cloning to s8
  • 06:32 _joe_: restarting pdfrender on scb1003
  • 06:29 marostegui@tin: Synchronized docroot/noc/conf/s8.dblist: Deploy the dblist files with the correct databases after the split (duration: 00m 48s)
  • 06:27 marostegui@tin: Synchronized dblists/s8.dblist: Deploy the dblist files with the correct databases after the split (duration: 00m 50s)
  • 06:26 marostegui@tin: Synchronized dblists/s5.dblist: Deploy the dblist files with the correct databases after the split (duration: 00m 53s)
  • 06:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove read_only from s5 and s8 T177208 T181645 (duration: 00m 27s)
  • 06:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Splitting s5 and s8 T177208 T181645 (duration: 00m 50s)
  • 06:07 jynus: stopping slave and resetting on db1071
  • 06:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Set s5 on read-only to start failover T177208 T181645 (duration: 00m 50s)
  • 05:12 marostegui: Start pre-failover tasks T177208 T181645
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.15) (duration: 05m 31s)
  • 00:43 mutante: phabricator servers: upgraded php5-*, openssh
  • 00:17 mutante: netmon1002/2001 - upgraded php7.0 related packages | krypton (webserver_misc_apps) - upgraded php5 packages
  • 00:08 mutante: contint1001/2001 - upgraded php5-related packages
  • 00:06 mutante: releases1001/2001 - upgraded kernel image, planet - upgraded openssl et al

2018-01-08

  • 23:56 mutante: rutherfordium (people.wm.org) - upgrading PHP5
  • 21:52 bsitzmann@tin: Finished deploy [mobileapps/deploy@1bfd4b0]: Update mobileapps to d20915c (T184430 T184429) (duration: 05m 33s)
  • 21:47 bsitzmann@tin: Started deploy [mobileapps/deploy@1bfd4b0]: Update mobileapps to d20915c (T184430 T184429)
  • 21:30 arlolra: Updated Parsoid to e133312 (T182349, T183893, T159985)
  • 21:22 arlolra@tin: Finished deploy [parsoid/deploy@1dac474]: Updating Parsoid to e133312 (duration: 10m 31s)
  • 21:12 arlolra@tin: Started deploy [parsoid/deploy@1dac474]: Updating Parsoid to e133312
  • 21:05 mutante: new Wikipedia language: "inh" - recreating/reloading DNS zones to add "inh" (Ingush) from langs.tmpl (T184374) https://wikitech.wikimedia.org/wiki/Add_a_wiki#DNS
  • 20:09 ejegg: rolled back smashpig payments listener from 0e703f502d to 45aa62650c
  • 19:34 ottomata: rebooting analytics1002 and then analytics1001 to apply proxyuser changes and kernel update
  • 19:22 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove language button from Wikidata and MediaWiki T183665 (duration: 00m 51s)
  • 19:04 ejegg: updated SmashPig payments listener from 45aa62650c to 0e703f502d
  • 18:15 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2040 (duration: 00m 50s)
  • 18:10 gehel@tin: Finished deploy [wdqs/wdqs@c680f55]: (no justification provided) (duration: 02m 03s)
  • 18:08 gehel@tin: Started deploy [wdqs/wdqs@c680f55]: (no justification provided)
  • 16:57 milimetric@tin: Finished deploy [analytics/refinery@f99e7dd]: Update and re-run interlanguage job (duration: 11m 28s)
  • 16:45 milimetric@tin: Started deploy [analytics/refinery@f99e7dd]: Update and re-run interlanguage job
  • 16:36 jynus: stopping replication on db2040
  • 16:28 cormacparle: About to run refreshFileHeaders.php on all wikis to fix https://phabricator.wikimedia.org/T178849
  • 15:23 elukey: reboot kafka1013 for kernel updates
  • 15:17 jynus@tin: Synchronized wmf-config/db-codfw.php: Fix db2039 comments (duration: 00m 50s)
  • 15:12 ema: cache_upload: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 15:03 hashar@tin: Synchronized dblists/group1-wikipedia.dblist: Add test2wiki as a group1 wiki - T182326 (duration: 00m 50s)
  • 14:57 gehel: rolling reboot of maps servers for kernel upgrade
  • 14:56 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable fine grained usage tracking in hewiki - T172914 (duration: 00m 50s)
  • 14:51 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Add Translation: namespace on Punjabi Wikisource - T179807 (duration: 00m 50s)
  • 14:48 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Turn on mapframe for Arabic Wikipedia - T183764 (duration: 00m 51s)
  • 14:33 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Add new namespace aliases on zhwiki - T183711 (duration: 00m 50s)
  • 14:28 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable commons import in tawikisource - T181774 (duration: 00m 48s)
  • 14:27 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Update logo for chrwiki, add the HD version T180553 (duration: 00m 50s)
  • 14:26 ema: cache_text: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 14:25 hashar@tin: Synchronized static/images/project-logos: Update logo for chrwiki, add the HD version T180553 (duration: 00m 51s)
  • 14:23 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Move wiktionary HD logo to wiktionaries - T183922 (duration: 00m 50s)
  • 14:21 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable wgKartographerStaticMapframe for lvwiki - T183981 (duration: 00m 51s)
  • 14:16 hashar@tin: Synchronized wmf-config/Wikibase-production.php: Don’t check constraints on example properties - T183267 (duration: 00m 51s)
  • 13:50 moritzm: rebooting mw image scalers in eqiad for kernel security update (along with update to HHVM 3.18.6 where applicable)
  • 13:42 gehel: rolling restart of wdqs servers for kernel upgrades
  • 13:41 elukey: reboot analytics10[36-39] for kernel updates
  • 13:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1109 original status (duration: 00m 50s)
  • 13:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Warm up db1109 (duration: 00m 52s)
  • 13:07 joal@tin: Finished deploy [analytics/aqs/deploy@ab85797]: Add pageview top-by-country endpoint (duration: 17m 57s)
  • 13:05 moritzm: rebooting mw1259/mw1260 (video scalers) for kernel security update (along with update to HHVM 3.18.6 where applicable)
  • 12:59 elukey: reboot kafka1012 for kernel updates
  • 12:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Warm up s8 future hosts - T177208 (duration: 00m 27s)
  • 12:49 joal@tin: Started deploy [analytics/aqs/deploy@ab85797]: Add pageview top-by-country endpoint
  • 12:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Warm up s8 future hosts - T177208 (duration: 00m 59s)
  • 12:37 fdans@tin: Finished deploy [analytics/aqs/deploy@ab85797]: (no justification provided) (duration: 00m 16s)
  • 12:37 fdans@tin: Started deploy [analytics/aqs/deploy@ab85797]: (no justification provided)
  • 12:35 moritzm: rebooting mw1209-mw1220 for kernel security update (along with update to HHVM 3.18.6 where applicable)
  • 12:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: revert warm up s8 future hosts - T177208 (duration: 02m 58s)
  • 12:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Warm up s8 future hosts - T177208 (duration: 00m 52s)
  • 12:18 akosiaris@tin: Finished deploy [servermon/servermon@b9832c5]: Update servermon (duration: 00m 02s)
  • 12:18 akosiaris@tin: Started deploy [servermon/servermon@b9832c5]: Update servermon
  • 12:01 moritzm: rebooting mw1221-mw1235 for kernel security update (along with update to HHVM 3.18.6 where applicable)
  • 11:38 moritzm: rebooting mwdebug* for kernel security update
  • 11:28 godog: puppet node deactivate wtp10[568] - T177374
  • 11:06 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 51s)
  • 11:05 jdrewniak@tin: Synchronized portals/prod/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 51s)
  • 10:50 godog: roll restart swift in codfw for kernel upgrades
  • 10:40 akosiaris@tin: Finished deploy [servermon/servermon@53b81d8]: Update servermon (duration: 00m 02s)
  • 10:40 akosiaris@tin: Started deploy [servermon/servermon@53b81d8]: Update servermon
  • 10:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 and db1089 - T162807 (duration: 00m 50s)
  • 10:26 hashar: Started docker on contint1001 / contint2001 . They were missing the overlay/overlayfs kernel modules | T184410
  • 10:04 elukey: drain + reboot analytics1029,1031->1034 for kernel updates
  • 10:03 jynus: fixing wrong events on db2039, db1071, db2023, db2045, db2052, db1100
  • 09:53 godog: Flashing Smart Array P840 in Slot 3 [ 4.52 -> 6.06 ] on ms-be2037 - T184390 T141756
  • 09:46 hashar: rebooting CI
  • 09:46 godog: reboot ms-be2037 - T184390
  • 09:39 elukey: set sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=65 to mw1261,mw2251,mw1276 and all videoscalers (Recently rebooted/reimaged)
  • 09:38 hashar: upgrading contint1001 / contint1002 | T184267
  • 09:24 ema: cache_misc: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 09:17 _joe_: starting 3 manual loops for consuming refreshLinks jobs for ruwiki
  • 09:14 marostegui: Force BBU relearn on db1059 - T184160
  • 08:30 moritzm: installing remaining openssl updates
  • 07:24 marostegui: Stop MySQL on db1039 for decommission - T184262
  • 07:17 marostegui: Remove db1039 from tendril - T184262
  • 07:05 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1039 as it will be decommissioned - T184262 (duration: 00m 50s)
  • 07:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1039 as it will be decommissioned - T184262 (duration: 00m 50s)
  • 06:51 marostegui: Stop replication in sync on db1067 and db1089 - T162807
  • 06:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 and db1089 - T162807 (duration: 00m 51s)
  • 06:32 marostegui: Disable BBU auto-learn on db1011
  • 06:17 marostegui: Deploy schema change on s7 primary master (db1062) - T174569
  • 02:33 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.15) (duration: 06m 17s)

2018-01-07

  • 20:25 demon@tin: Synchronized wmf-config/interwiki.php: auto-sync with my plugin was busted 🙃 (duration: 00m 50s)
  • 19:56 demon@tin: Synchronized php-1.31.0-wmf.15/maintenance/Maintenance.php: fix stuff (duration: 00m 51s)
  • 19:32 demon@tin: Finished scap: Delete alswik(ibooks|iquote|tionary), mowik(ipedia|tionary) (duration: 21m 32s)
  • 19:10 demon@tin: Started scap: Delete alswik(ibooks|iquote|tionary), mowik(ipedia|tionary)
  • 08:52 elukey: re-enabled puppet on db110[78] - eventlogging_sync restarted on db1108 (analytics-slave) - T168414

2018-01-06

  • 08:09 elukey: re-enable eventlogging mysql consumers after database maintenance - T168414
  • 06:59 elukey: set sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=65 on mw[1329-1333] (new appservers, was 120)
  • 06:49 elukey: set sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=65 on mw1335 (new jobrunner, was 120)

2018-01-05

  • 22:27 tgr: T184263 ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=eswiki --logwiki=metawiki "Mega849" "Mega809"
  • 20:47 demon@tin: Pruned MediaWiki: 1.31.0-wmf.12 [keeping static files] (duration: 02m 11s)
  • 18:15 otto@tin: Finished deploy [analytics/superset/deploy@990bc38]: Running superset with python3 (fingers crossed) (duration: 00m 19s)
  • 18:15 otto@tin: Started deploy [analytics/superset/deploy@990bc38]: Running superset with python3 (fingers crossed)
  • 18:14 otto@tin: Finished deploy [analytics/superset/deploy@990bc38]: Running superset with python3 (fingers crossed) (duration: 02m 11s)
  • 18:11 otto@tin: Started deploy [analytics/superset/deploy@990bc38]: Running superset with python3 (fingers crossed)
  • 16:40 jynus: upgrade and restart labsdb1010
  • 16:29 akosiaris@tin: Finished deploy [servermon/servermon@cf88f3f]: Update servermon to 3c8538a (duration: 00m 02s)
  • 16:29 akosiaris@tin: Started deploy [servermon/servermon@cf88f3f]: Update servermon to 3c8538a
  • 16:07 akosiaris@tin: Finished deploy [servermon/servermon@3c8538a]: Update servermon to 3c8538a (duration: 00m 02s)
  • 16:07 akosiaris@tin: Started deploy [servermon/servermon@3c8538a]: Update servermon to 3c8538a
  • 16:06 akosiaris@tin: Started deploy [servermon/servermon@3c8538a]: Update servermon to 3c8538a
  • 16:05 akosiaris@tin: Finished deploy [servermon/servermon@3c8538a]: Update servermon to 3c8538a (duration: 00m 23s)
  • 16:04 akosiaris@tin: Started deploy [servermon/servermon@3c8538a]: Update servermon to 3c8538a
  • 15:50 marostegui: Upgrade db2071 kernel - T184256
  • 15:48 moritzm: rebooting multatuli for kernel update
  • 15:41 marostegui: Upgrade db2072 (mariadb and kernel) - T184256
  • 14:25 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1002.eqiad.wmnet
  • 14:18 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1002.eqiad.wmnet
  • 14:17 gehel: reboot maps1002 for kernel upgrade
  • 14:03 fdans@tin: (no justification provided)
  • 13:57 elukey@tin: Finished deploy [analytics/aqs/deploy@792c95d]: Add pageviews by country endpoint (duration: 01m 12s)
  • 13:56 elukey@tin: Started deploy [analytics/aqs/deploy@792c95d]: Add pageviews by country endpoint
  • 13:54 ema: upgrade cp3046 to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 13:53 fdans@tin: Finished deploy [analytics/aqs/deploy@792c95d]: (no justification provided) (duration: 00m 18s)
  • 13:52 fdans@tin: Started deploy [analytics/aqs/deploy@792c95d]: (no justification provided)
  • 13:37 gehel: rebooting wdqs1003 for kernel upgrade
  • 13:24 fdans@tin: Finished deploy [analytics/aqs/deploy@792c95d]: (no justification provided) (duration: 01m 32s)
  • 13:22 fdans@tin: Started deploy [analytics/aqs/deploy@792c95d]: (no justification provided)
  • 13:22 gehel: rebooting elastic1017 for kernel upgrade
  • 13:19 fdans: deploying Analytics Query Service
  • 12:44 elukey: reboot kafka-jumbo1001 for kernel updates
  • 12:43 ema: upgrade cp3007 to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 12:04 jynus: upgrade and restart labsdb1011
  • 12:03 ema: reboot cp1008 into linux 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 10:15 godog: reboot restbase2004 to test kernel upgrade
  • 10:14 jynus: reboot labsdb1009
  • 09:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1098:3317 - T163190 (duration: 00m 27s)
  • 09:20 elukey: drain and reboot analytics1030 for kernel updates
  • 09:11 godog: reboot ms-be1014 to test update stretch kernel
  • 08:54 elukey: ran git checkout modules/role/manifests/puppetmaster/standalone.pp on labs-puppetmaster.wikimedia.org to unblock sync from prod
  • 08:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1094 - T163190 (duration: 00m 28s)
  • 07:37 _joe_: rebooting mw1276 too, kernel upgrade
  • 07:25 _joe_: rebooting mw1261
  • 06:49 marostegui: Stop replication in sync on db1039 and db1098:3317 - T163190
  • 06:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098:3317 - T163190 (duration: 00m 27s)
  • 06:24 marostegui: Deploy schema change on db1094 - T174569
  • 06:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1094 - T163190 (duration: 00m 51s)
  • 03:54 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Undeploy EducationProgram from test2wiki (duration: 00m 48s)

2018-01-04

  • 23:30 apergos: rebooted releases1001 and 2001 (new kernel)
  • 22:09 moritzm: uploaded linux-meta 1.16 for jessie-wikimedia to apt.wikimedia.org (which installs the new KPTI-enabled kernel with the new ABI)
  • 22:03 twentyafterfour@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.15
  • 22:00 twentyafterfour: No blockers remain for T180748, proceeding to deploy wmf.15 to all wikis
  • 21:53 twentyafterfour@tin: Synchronized php-1.31.0-wmf.15/extensions/TitleBlacklist/TitleBlacklistPreAuthenticationProvider.php: Deploy 332fab0 to stop logspam and unblock the train (duration: 01m 02s)
  • 21:37 moritzm: uploaded linux-4.9.65-3+deb9u1~bpo8+2 for jessie-wikimedia to apt.wikimedia.org (provides KPTI backport)
  • 21:35 twentyafterfour@tin: Synchronized php-1.31.0-wmf.15/includes/parser/Parser.php: Deploy 601cf9d (duration: 01m 03s)
  • 21:33 twentyafterfour: deploying patches to unblock the train
  • 21:25 moritzm: reboot multatuli for kernel update
  • 20:06 twentyafterfour: There are still open blockers for wmf.15 - see T180748 .. attempting to resolve them to unblock the train.
  • 20:03 twentyafterfour: preparing to deploy the train (filling in for no_justification)
  • 19:51 joal@tin: Finished deploy [analytics/refinery@a69a2cd]: Regular analytics deploy (duration: 04m 38s)
  • 19:46 joal@tin: Started deploy [analytics/refinery@a69a2cd]: Regular analytics deploy
  • 18:58 bsitzmann@tin: Finished deploy [mobileapps/deploy@8bcffa9]: Update mobileapps to a4ba9fd (T182330 T177430 T170690 T182652 T184198) (duration: 06m 01s)
  • 18:52 bsitzmann@tin: Started deploy [mobileapps/deploy@8bcffa9]: Update mobileapps to a4ba9fd (T182330 T177430 T170690 T182652 T184198)
  • 18:27 jynus: upgrade and restart labsdb1009
  • 17:42 moritzm: upgrading HHVM on eqiad video scalers to 3.18.6
  • 17:40 demon@tin: Finished deploy [gerrit/gerrit@1e1a79d]: deploying hooks plugin (duration: 00m 10s)
  • 17:40 demon@tin: Started deploy [gerrit/gerrit@1e1a79d]: deploying hooks plugin
  • 16:38 jynus: upgrade and restart db2089 (s5/s6)
  • 16:14 jynus: upgrade and restart db2087 (s6/s7)
  • 15:44 jynus: upgrade and restart db2076
  • 15:36 jynus: upgrade and restart db2067
  • 15:31 demon@tin: Synchronized php-1.31.0-wmf.15/extensions/ActiveAbstract/: unbreak, T184177 (duration: 01m 02s)
  • 15:17 jynus: upgrade and restart db2060
  • 15:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3317 - T163190 (duration: 01m 02s)
  • 15:03 moritzm: upgrading HHVM on eqiad image scalers to 3.18.6
  • 14:54 jynus: restart db2046 database to move socket location
  • 14:24 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Adding Movepage-summary to wgForceUIMsgAsContentMsg T183848 (duration: 01m 02s)
  • 14:11 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Restrict sending mails to new users T182541 (duration: 01m 02s)
  • 13:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 - T163190 (duration: 01m 01s)
  • 13:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 - T163190 (duration: 01m 02s)
  • 13:35 marostegui: Stop replication in sync db1079 db1101:3317 T163190
  • 13:17 moritzm: upgrading HHVM on mw1180-mw1220 to 3.18.6
  • 12:53 moritzm: upgrading HHVM on mwdebug* to 3.18.6
  • 12:45 mobrovac@tin: Finished deploy [restbase/deploy@66b7efe]: Switch Mathoid to Cassandra 3 and drop Cassandra 2 references - T179419 (duration: 04m 05s)
  • 12:41 mobrovac@tin: Started deploy [restbase/deploy@66b7efe]: Switch Mathoid to Cassandra 3 and drop Cassandra 2 references - T179419
  • 12:07 mobrovac@tin: Finished deploy [mathoid/deploy@c9957ce]: Mathoid v0.7.1 - T172767 (duration: 05m 05s)
  • 12:02 mobrovac@tin: Started deploy [mathoid/deploy@c9957ce]: Mathoid v0.7.1 - T172767
  • 12:00 moritzm: upgrading HHVM on API canaries (mw1276-mw1279) to HHVM 3.18.6
  • 10:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 - T163190 (duration: 01m 01s)
  • 10:39 marostegui: Stop replication in sync on db1079 and db1101:3317 - T163190
  • 10:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 - T163190 (duration: 01m 02s)
  • 10:16 mobrovac@tin: Finished deploy [mathoid/deploy@7f664ff]: Update Mathoid in codfw to v0.7.0, take #2 - T183557 (duration: 02m 38s)
  • 10:14 mobrovac@tin: Started deploy [mathoid/deploy@7f664ff]: Update Mathoid in codfw to v0.7.0, take #2 - T183557
  • 09:58 jynus: restart and upgrade db2053
  • 09:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1101:3317 - T163190 (duration: 03m 09s)
  • 09:38 moritzm: rebooting mw1307 and wtp1025 for kernel update
  • 09:13 moritzm: rebooting kubernetes1001 for kernel update
  • 08:57 elukey: set sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=65 on mw133[67] (new jobrunners)
  • 08:53 marostegui: Fixing inconsistencies on s7 - T163190
  • 08:48 marostegui: Deploy schema change on db1069 (s7) - T174569
  • 08:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 - T174569 (duration: 01m 02s)
  • 06:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 - T174569 (duration: 01m 02s)
  • 06:48 marostegui: Deploy schema change on db1079 (s7) with replication enabled - this will generate lag on labs replicas - T174569
  • 06:27 marostegui: Deploy schema change on db1068 (s4) master - T174569
  • 06:23 marostegui: Issue a BBU re-learn cycle on db1059 - T184160
  • 02:49 legoktm@tin: Synchronized php-1.31.0-wmf.15/extensions/Flow/Hooks.php: Fix CheckUser type check thingy - T182834 (duration: 01m 01s)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.12) (duration: 07m 50s)
  • 01:50 ladsgroup@tin: Synchronized dblists/group0.dblist: SWAT: Move testwiki2 from group0 to group1 (T182326) (duration: 01m 02s)

2018-01-03

  • 23:02 twentyafterfour: restarted apache on phab1001 to clear hung workers (refs T182832)
  • 22:31 bd808@tin: Finished deploy [striker/deploy@69f1b15]: Enhance membership request workflow and fix Diffusion repo creation (T168027, T182142) (duration: 00m 31s)
  • 22:31 bd808@tin: Started deploy [striker/deploy@69f1b15]: Enhance membership request workflow and fix Diffusion repo creation (T168027, T182142)
  • 21:41 ejegg: re-enabled ingenico audit
  • 21:27 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.15 (duration: 01m 01s)
  • 21:26 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.15
  • 21:26 twentyafterfour: deploying 1.31.0-wmf.15 to "Group 1" wikis
  • 21:01 ottomata: deleting stale topics from main kafka clusters: T149594
  • 20:56 mutante: uranium - revoked puppet cert, node deactivate, removing from DNS (T183209)
  • 20:50 mutante: uranium (ex-ganglia-web) is going into eternal downtime on Icinga.. shutdown -h RIP (T183209)
  • 20:23 thcipriani: updateCollation for eswiki running in screen as thcipriani on terbium
  • 20:19 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Do not enable lua fine grained tracking for any wiki T172914 (duration: 01m 02s)
  • 20:16 thcipriani@tin: Synchronized php-1.31.0-wmf.15/extensions/VisualEditor/lib/ve: SWAT: Update VE core submodule to master T182907 T183590 (duration: 01m 06s)
  • 20:06 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Close wikimania2017.wikimedia.org PART II T182493 (duration: 01m 04s)
  • 20:04 thcipriani@tin: Synchronized dblists/closed.dblist: SWAT: Close wikimania2017.wikimedia.org PART I T182493 (duration: 01m 02s)
  • 19:53 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Extension:Translate default permissions for Wikimedia wikis T178793 (duration: 01m 02s)
  • 19:42 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set category collation to uca-es-u-kn for eswiki T183802 (duration: 01m 02s)
  • 19:22 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Setup some namespace aliases for eswiki T183612 (duration: 01m 02s)
  • 19:14 thcipriani@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Add configuration deboosting scientific articles on Wikidata T183510 (duration: 01m 02s)
  • 18:53 volans: restarted ircecho on einsteinium
  • 18:37 moritzm: uploaded hhvm 3.18.5+dfsg-1+wmf2+deb9u1 for stretch-wikimedia to apt.wikimedia.org
  • 18:25 ottomata: deploying change to produce statsv metrics to main kafka clusters from varnishkafka. statsv on hafnium will be restarted to consume from main. might cause a short blip in statsv metrics.
  • 18:19 jynus@tin: Synchronized wmf-config/db-eqiad.php: Decom db2028 (duration: 01m 01s)
  • 18:17 jynus@tin: Synchronized wmf-config/db-codfw.php: Decom db2028, repool pc2005 (duration: 01m 01s)
  • 17:47 otto@tin: Finished deploy [statsv/statsv@362d1a9]: statsv (duration: 00m 02s)
  • 17:47 otto@tin: Started deploy [statsv/statsv@362d1a9]: statsv
  • 17:35 godog: upload prometheus-jmx-exporter 0.10-3 to jessie/stretch
  • 17:35 demon@tin: Synchronized php-1.31.0-wmf.15/extensions/Wikibase: I9da46c36 (duration: 02m 00s)
  • 17:35 jynus: restart and upgrade db2046
  • 17:07 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw2251.*.wmnet
  • 17:02 jynus: performing schema change on db2039 (s6)
  • 16:51 papaul: powering down pc2005 for maintenance
  • 16:18 otto@tin: Finished deploy [statsv/statsv@0a86be8]: revert (duration: 00m 02s)
  • 16:18 otto@tin: Started deploy [statsv/statsv@0a86be8]: revert
  • 16:18 otto@tin: Finished deploy [statsv/statsv@c390cdf]: no-op deployment of statsv with support for multiple topics (duration: 00m 03s)
  • 16:18 otto@tin: Started deploy [statsv/statsv@c390cdf]: no-op deployment of statsv with support for multiple topics
  • 16:14 papaul: powering down mw2251 for memory replacement and firmware upgrade
  • 16:02 urandom: drop unused keyspaces in legacy restbase cluster - T183745
  • 15:51 niharika29@tin: Finished deploy [scholarships/scholarships@ec05ae7]: Remove outdated i18n files (duration: 00m 02s)
  • 15:51 niharika29@tin: Started deploy [scholarships/scholarships@ec05ae7]: Remove outdated i18n files
  • 15:48 jynus: stop pc2005's database for maintenance T183750
  • 15:46 jynus@tin: Synchronized wmf-config/db-codfw.php: "Depool" pc2005 (duration: 01m 02s)
  • 15:38 elukey@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw2251.*.wmnet
  • 15:28 niharika29@tin: Finished deploy [scholarships/scholarships@ec05ae7]: Update i18n files (duration: 00m 02s)
  • 15:28 niharika29@tin: Started deploy [scholarships/scholarships@ec05ae7]: Update i18n files
  • 15:14 jynus@tin: Synchronized wmf-config/db-codfw.php: Switchover s6-master db2028 to db2039 (duration: 01m 01s)
  • 15:08 jynus: stopping db2028's mysql to apply new config
  • 15:01 godog: roll-restart thumbor in eqiad after upgrade - T183907
  • 15:00 ottomata: restarting kafka-jumbo brokers to enable tls version and cipher suite restrictions
  • 14:55 jynus: switchover db2028 to db2039 as codfw-s6-master
  • 14:39 godog: rollout python-thumbor-wikimedia 1.8 - T183907
  • 14:30 zeljkof: EU SWAT finished
  • 14:29 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Translation NS for kowikisource (T183836) (duration: 01m 00s)
  • 14:16 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add patrol to Image-reviewer on Commons (T183835) (duration: 01m 02s)
  • 13:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1086 - T174569 (duration: 01m 02s)
  • 13:17 moritzm: upgrading mw1261-mw1265 to HHVM 3.18.5+dfsg-1+wmf2
  • 13:07 moritzm: uploaded hhvm 3.18.5+dfsg-1+wmf2 (including the fixes from 3.18.6) for jessie-wikimedia to apt.wikimedia.org
  • 12:53 moritzm: importing linux 4.9.65-3+deb9u1~bpo8+1 for jessie-wikimedia to apt.wikimedia.org
  • 12:14 mobrovac@tin: Finished deploy [mathoid/deploy@63b2ddc]: Bring back codfw in sync with eqiad - T183557 (duration: 02m 10s)
  • 12:12 mobrovac@tin: Started deploy [mathoid/deploy@63b2ddc]: Bring back codfw in sync with eqiad - T183557
  • 11:57 moritzm: upgrading app servers in deployment-prep to hhvm 3.18.5+dfsg-1+wmf2 (which contains the patches from 3.18.6)
  • 11:52 jynus: upgrade and restart db2039
  • 11:49 jynus: disabling puppet on db2039 and db2028 in preparation for gerrit:401706 deployment
  • 11:47 akosiaris: boot ganeti1006. It exhibited page allocation stalls on Jan 1. T181121
  • 11:39 marostegui: Deploy schema change on db1086 - T174569
  • 11:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1086 - T174569 (duration: 01m 01s)
  • 11:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3317 - T174569 (duration: 01m 02s)
  • 11:32 mobrovac@tin: Finished deploy [mathoid/deploy@91648aa]: Update to Mathoid v0.7.0 in codfw only for T183557 (duration: 02m 15s)
  • 11:29 mobrovac@tin: Started deploy [mathoid/deploy@91648aa]: Update to Mathoid v0.7.0 in codfw only for T183557
  • 11:28 mobrovac@tin: Finished deploy [mathoid/deploy@91648aa]: (no justification provided) (duration: 00m 40s)
  • 11:28 mobrovac@tin: Started deploy [mathoid/deploy@91648aa]: (no justification provided)
  • 11:03 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1336.*.eqiad.wmnet
  • 11:00 mobrovac@tin: Started restart [changeprop/deploy@3c4f51d]: Pick up the new RESTBase DNS
  • 10:48 mobrovac@tin: Started restart [mobileapps/deploy@bf85a55]: Pick up the new RESTBase DNS
  • 10:45 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=restbase,name=codfw
  • 09:57 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1337.*.eqiad.wmnet
  • 08:59 elukey: stop eventlogging mysql insertion on eventlog1001 to allow db1107 maintenance - T168414
  • 06:57 marostegui: Deploy schema change on db1101:3317 - T174569
  • 06:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1101:3317 - T174569 (duration: 01m 01s)
  • 06:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1098:3317 - T174569 (duration: 01m 10s)
  • 06:47 kartik@tin: Finished deploy [cxserver/deploy@66e384e]: Update cxserver to cc01477 (duration: 04m 49s)
  • 06:43 kartik@tin: Started deploy [cxserver/deploy@66e384e]: Update cxserver to cc01477
  • 06:37 marostegui: Deploy schema change on s1 master db1052 - T174569
  • 02:36 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.12) (duration: 06m 56s)
  • 01:26 eileen: civicrm updated civicrm revision changed from ffa9d7fc7a to 429a5c5385, config revision is a7b9b58595
  • 01:18 eileen: update process-control to use different reference to civicrm_root (symlinks) process-control config revision is a7b9b58595
  • 01:01 reedy@tin: Synchronized php-1.31.0-wmf.15/resources/src/mediawiki/mediawiki.editfont.css: T182320 (duration: 01m 01s)
  • 00:59 reedy@tin: Synchronized php-1.31.0-wmf.15/extensions/Flow: T182320 (duration: 01m 18s)
  • 00:58 reedy@tin: Synchronized php-1.31.0-wmf.15/extensions/CodeMirror: T182320 (duration: 00m 59s)
  • 00:51 eileen: rollback smashPig SmashPig revision changed from ab7802d5b3 to 45aa62650c (locked), config revision is 4a4c61ae1b
  • 00:38 reedy@tin: Synchronized php-1.31.0-wmf.12/resources/src/mediawiki.rcfilters/dm/: RCFilters (duration: 01m 02s)
  • 00:36 reedy@tin: Synchronized php-1.31.0-wmf.15/resources/src/mediawiki.rcfilters/dm/: RCFilters (duration: 01m 02s)
  • 00:09 reedy@tin: Synchronized wmf-config/CirrusSearch-common.php: Lower ElasticSearch index refresh interval for Wikidata to 5s (duration: 01m 02s)
  • 00:06 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Add wmgCirrusSearchRefreshInterval (duration: 01m 02s)

2018-01-02

  • 22:04 herron: upgrading trusty puppet agents to puppet 4
  • 21:00 demon@tin: rebuilt and synchronized wikiversions files: group0 to wmf.15
  • 20:59 demon@tin: Synchronized php-1.31.0-wmf.15/includes/Setup.php: Aaron made me do it (duration: 01m 04s)
  • 20:48 ottomata: restarting kafka-jumbo brokers for version 1.0 upgrade
  • 19:15 demon@tin: Finished scap: wmf.15 bootstrap (duration: 34m 55s)
  • 18:46 subbu: started linter-reparse script on terbium to reprocess itwiki pages (safe to kill -9 the script at any point)
  • 18:40 demon@tin: Started scap: wmf.15 bootstrap
  • 18:37 ebernhardson: T183053 update index.refresh_interval for wikidatawiki_{content,general} on eqiad to 5s
  • 18:30 jgleeson: Updating Smashpig from 45aa62650c to ab7802d5b3
  • 18:21 arlolra@tin: Finished deploy [parsoid/deploy@4d55952]: Updating Parsoid to 28d7734 (duration: 11m 57s)
  • 18:20 awight@tin: Finished deploy [ores/deploy@eb0f776]: Update ORES service to eb0f776: T182614 (duration: 19m 55s)
  • 18:10 moritzm: rebooting multatuli for kernel test
  • 18:09 arlolra@tin: Started deploy [parsoid/deploy@4d55952]: Updating Parsoid to 28d7734
  • 18:00 awight@tin: Started deploy [ores/deploy@eb0f776]: Update ORES service to eb0f776: T182614
  • 17:28 demon@tin: Pruned MediaWiki: 1.31.0-wmf.11 (duration: 01m 24s)
  • 17:23 demon@tin: Pruned MediaWiki: 1.31.0-wmf.10 (duration: 01m 29s)
  • 17:05 ejegg: updated payments-wiki from e91db27108 to 40145892e7
  • 17:01 jynus: add missing mysql grants to db1103:s4
  • 16:53 jynus: add missing mysql grants to db1097:s4
  • 16:51 herron: restarted exim and spamd services on fermium, mx1001 and mx2001 for openssl update
  • 16:48 elukey@puppetmaster1001: conftool action : set/weight=30; selector: name=mw13(29|3[0-3]).*.eqiad.wmnet
  • 16:24 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1029 (duration: 00m 51s)
  • 15:55 moritzm: installing openssl updates on restbase* hosts
  • 15:53 elukey@puppetmaster1001: conftool action : set/weight=20; selector: name=mw13(29|3[0-3]).*.eqiad.wmnet
  • 15:33 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1335.*.eqiad.wmnet
  • 15:30 jynus@tin: Synchronized wmf-config/db-eqiad.php: Increase db1055 & db1056 x1 weight (duration: 00m 50s)
  • 15:16 akosiaris: boot ganeti1008 with older 4.4 kernel and migrate multiple VMs to it. T181121
  • 15:05 zeljkof: EU SWAT finished
  • 15:04 zfilipin@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Set watchcreations preference to true by default on Commons (T178750) (duration: 00m 51s)
  • 15:03 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set watchcreations preference to true by default on Commons (T178750) (duration: 00m 51s)
  • 14:54 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable mapframe on lvwiki (T183661) (duration: 00m 51s)
  • 14:42 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Revert "Switch Wikipedias from $wgLogoHD to direct using of a SVG" (T178942) (duration: 00m 51s)
  • 14:41 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Switch Wikipedias from $wgLogoHD to direct using of a SVG" (T178942) (duration: 00m 51s)
  • 14:33 zfilipin@tin: scap failed: average error rate on 6/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 14:32 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Switch Wikipedias from $wgLogoHD to direct using of a SVG (T178942) (duration: 01m 59s)
  • 14:17 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add suppressredirect to autoreview/editor at ruwikt (T183719) (duration: 00m 51s)
  • 14:12 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create rollbacker user group for ruwiktionary (T183655) (duration: 00m 52s)
  • 14:07 foks: removed 2FA for Martin_Urbanec
  • 14:00 moritzm: installing further openssl updates
  • 13:49 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1333.*.eqiad.wmnet
  • 13:48 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1332.*.eqiad.wmnet
  • 13:47 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1331.*.eqiad.wmnet
  • 13:45 marostegui: Deploy alter table db1098:3317 - T174569
  • 13:45 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1330.*.eqiad.wmnet
  • 13:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098:3317 - T174569 (duration: 00m 51s)
  • 13:42 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1329.*.eqiad.wmnet
  • 13:41 elukey: enable live traffic for new appservers mw1329->mw1333 (T165519)
  • 13:00 moritzm: installing openssl updates on remaining mw* hosts in eqiad
  • 12:25 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db1055 & db1056 as x1 replicas (duration: 00m 51s)
  • 12:24 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055 & db1056 as x1 replicas (2nd try) (duration: 00m 51s)
  • 12:21 akosiaris: empty ganeti1008 for kernel downgrade. T181121
  • 12:11 jynus: add missing mysql grants to db1055 and db1056
  • 11:42 moritzm: installing ncurses security updates
  • 11:39 jynus@tin: Synchronized wmf-config/db-eqiad.php: Revert: Repool db1055 & db1056 as x1 replicas (duration: 00m 51s)
  • 11:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055 & db1056 as x1 replicas (duration: 00m 50s)
  • 11:31 mobrovac@tin: Finished deploy [citoid/deploy@ee0bdf4]: Update to service template v0.5.4 - T151394 (duration: 04m 19s)
  • 11:30 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db1055 & db1056 as x1 replicas (duration: 00m 50s)
  • 11:27 mobrovac@tin: Started deploy [citoid/deploy@ee0bdf4]: Update to service template v0.5.4 - T151394
  • 09:45 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: cluster=api_appserver,name=mw1(1.*|200).eqiad.wmnet
  • 09:44 _joe_: setting api_appservers in the mw1180-1200 range to pooled=inactive, T183895
  • 09:43 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: cluster=appserver,name=mw1(1.*|200).eqiad.wmnet
  • 09:37 _joe_: setting appservers in the mw1180-1200 range to pooled=inactive, T183895
  • 09:28 godog: reboot ms-be1033 - T183724
  • 08:52 _joe_: restarting also mw1226-8, mw1223, mw1201,mw1203, mw1205-7
  • 08:36 _joe_: likewise for mw1285,mw1235,mw1232
  • 08:29 _joe_: restarting hhvm on mw1280,1282 for the same reasons
  • 08:26 _joe_: restarting hhvm on mw1317, multiple threads stuck in HPHP::jit::enterTCImpl
  • 08:23 elukey: restart druid coordinators on druid* to pick up new jvm settings
  • 08:19 _joe_: restarting hhvm on mw1313, concurrency HPHP::VariableUnserializer::unserializeVariant
  • 08:06 marostegui: Deploy alter table on db1039 (already depooled) - T174569
  • 07:56 marostegui: Deploy schema change on dbstore1001.s7 - T174569
  • 06:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1110 (duration: 00m 52s)
  • 06:42 marostegui: Stop db1110 and dbstore1002.s5 replication in sync
  • 06:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1110 to reimport dewiki.langlinks on dbstore1002 (duration: 00m 50s)

2018-01-01
