
User:BryanDavis/Scap3 in a Cloud VPS project

From Wikitech

Setting up a deploy server to use scap3 in a Cloud VPS project

  • Create an m1.small instance
  • Add the role and some dummy Hiera config values needed by ::profile::mediawiki::deployment::server to Hiera:$PROJECT/host/$HOST:
    ---
    classes:
        - profile::mediawiki::deployment::server
        - profile::keyholder::server
    profile::keyholder::server::require_encrypted_keys: nope
    scap::dsh::groups:
      mediawiki-installation:
        hosts:
        - 127.0.0.1
  • Add some project-wide Hiera config values to set the scap and scap3 master server in Hiera:$PROJECT:
    ---
    scap::deployment_server: <FQDN of project deploy server>
    profile::mediawiki::deployment::server::rsync_host: <FQDN of project deploy server>
    scap::wmflabs_master: <FQDN of project deploy server>
    deployment_server: <FQDN of project deploy server>
    network::allow_deployment_from_ips:
    - <FQDN of project deploy server>
    # A bunch of Hiera settings that we have to stub out because the profiles
    # are not well factored for the scap3-only use case
    profile::rsyslog::kafka_shipper::kafka_brokers: []
    profile::mediawiki::php::enable_fpm: false
    profile::mediawiki::php::version: "7.2"
    profile::mediawiki::apc_shm_size: 128M
    has_lvs: false
    lvs::configuration::lvs_service_ips: {}
    lvs::configuration::lvs_services: {}
    
  • Force another puppet run!
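
The first Puppet run on a fresh instance can fail partway through (see the errors listed further down this page), so it helps to check the agent's exit status rather than skim the output. A minimal sketch, assuming the stock puppet agent CLI, where --test turns on detailed exit codes (0 for no changes, 2 for changes applied, anything else for a failure). The puppet_run_ok function and the PUPPET_CMD override are hypothetical names used only for illustration:

```shell
# Hypothetical helper: run the Puppet agent once and summarize the result.
# PUPPET_CMD is an illustrative override (not a Puppet setting) for dry runs.
puppet_run_ok() {
    local rc=0
    # --test implies --detailed-exitcodes, so exit status 2 means "changes applied".
    ${PUPPET_CMD:-sudo puppet agent --test} || rc=$?
    case "$rc" in
        0) echo "no changes" ;;
        2) echo "changes applied" ;;
        *) echo "puppet run failed (exit $rc)"; return 1 ;;
    esac
}
```

Call puppet_run_ok on the deploy server and repeat until it reports "no changes".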

Adding a scap3 project

Add a Keyholder agent and a scap source for the project to Hiera, for example for Striker:

profile::keyholder::server::agents:
    deploy-service:
        trusted_groups:
            - wikidev

scap::sources:
    striker/deploy:
        repository: labs/striker/deploy
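
Once Puppet has applied these settings, the repository should be cloned on the deploy server. A quick sanity check, sketched under two assumptions: that each scap::sources entry is checked out under /srv/deployment/<name> (as the /srv/deployment/striker/deploy path used later on this page suggests), and that a scap3 repository keeps its configuration in scap/scap.cfg. check_scap_source is a hypothetical helper name:

```shell
# Hypothetical helper: check that a scap3 source has been cloned and has a scap config.
# $1 = source name as written under scap::sources (e.g. striker/deploy)
# $2 = checkout root; /srv/deployment is assumed when omitted
check_scap_source() {
    local name=$1 root=${2:-/srv/deployment}
    local dir="$root/$name"
    [ -e "$dir/.git" ] || { echo "missing checkout: $dir"; return 1; }
    [ -f "$dir/scap/scap.cfg" ] || { echo "missing scap/scap.cfg in $dir"; return 1; }
    echo "ok: $dir"
}
```

For the example above this would be run as check_scap_source striker/deploy.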

Syncing with the cluster

  • Accept the ssh host keys of all of the target nodes as the user you will be deploying as (e.g. bd808 if you happen to be BryanDavis)
  • Arm Keyholder on your deploy server.
    • Running sudo keyholder status should list all of the needed keys; if it does not, something went wrong.
    • To verify that everything runs as expected, pick one of the target hosts of the deployment and run the following (the ssh connection should succeed): SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh -l <KEYHOLDER-IDENTITY> -oBatchMode=yes <TARGET-HOSTNAME>
    • Interesting corner case: the analytics team owns the keyholder identity analytics-deploy and needs to deploy to hosts using the username analytics. This is what I had to do to resolve ssh problems when using scap: SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh -l analytics-deploy -oBatchMode=yes analytics@hadoop-coordinator-2.analytics.eqiad.wmflabs
  • Deploy stuff!
    $ cd $MY_DEPLOY_DIR  # (e.g. /srv/deployment/striker/deploy)
    $ scap deploy
    21:14:10 Started Deploy: striker/deploy
    Entering 'public_html/staticfiles'
    Entering 'striker'
    Entering 'wheels'
    21:14:10
    == DEFAULT ==
    :* striker-uwsgi01.striker.eqiad.wmflabs
    striker/deploy: fetch stage(s): 100% (ok: 1; fail: 0; left: 0)
    striker/deploy: config_deploy stage(s): 100% (ok: 1; fail: 0; left: 0)
    striker/deploy: promote and restart_service stage(s):   0% (ok: 0; fail: 0; left: 1)
    striker/deploy: promote and restart_service stage(s): 100% (ok: 1; fail: 0; left: 0)
    striker/deploy: promote and restart_service stage(s): 100% (ok: 1; fail: 0; left: 0)
    21:14:13 Finished Deploy: striker/deploy (duration: 00m 02s)
    $ scap deploy-log
    -- Opening log file: '/srv/deployment/striker/deploy/scap/log/scap-sync-2016-07-28-0007.log'
    21:14:10 [striker-deploy03] Started Deploy: striker/deploy
    21:14:10 [striker-deploy03]
    == DEFAULT ==
    :* striker-uwsgi01.striker.eqiad.wmflabs
    21:14:11 [striker-uwsgi01.striker.eqiad.wmflabs] Revision directory already exists (use --force to override)
    21:14:12 [striker-uwsgi01.striker.eqiad.wmflabs] Starting new HTTP connection (1): striker-deploy03.striker.eqiad.wmflabs
    21:14:13 [striker-uwsgi01.striker.eqiad.wmflabs] /srv/deployment/striker/deploy-cache/revs/47ea97dc38677aab79bd0c89f1e3ffe7fdc2cbfb is already live (use --force to override)
    21:14:13 [striker-deploy03] Finished Deploy: striker/deploy (duration: 00m 02s)
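
With more than one target host, the Keyholder ssh check described in the list above can be looped before deploying. A minimal sketch, assuming the proxy socket path shown above. check_targets and the SSH_CMD override are hypothetical names, and the trailing true only makes each ssh session exit immediately:

```shell
# Hypothetical helper: try a Keyholder-backed ssh login on each target host.
# $1 = Keyholder identity (ssh login name), remaining arguments = target hosts.
# SSH_CMD is an illustrative override (not an ssh setting) for dry runs.
KEYHOLDER_SOCK=/run/keyholder/proxy.sock

check_targets() {
    local identity=$1 host failed=0
    shift
    for host in "$@"; do
        # BatchMode makes ssh fail instead of prompting, so an unaccepted
        # host key or an unarmed key shows up as a failure here.
        if SSH_AUTH_SOCK=$KEYHOLDER_SOCK ${SSH_CMD:-ssh} -l "$identity" -oBatchMode=yes "$host" true; then
            echo "ok: $host"
        else
            echo "FAILED: $host"
            failed=1
        fi
    done
    return "$failed"
}
```

Run it as the deploying user, passing your own Keyholder identity followed by the target hostnames; a non-zero exit status means at least one host needs attention before scap deploy will work.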
    

Errors I saw on initial provision

Done: https://gerrit.wikimedia.org/r/#/c/301403/

Error: Could not set 'file' on ensure: No such file or directory - /etc/firejail/mediawiki-imagemagick.profile20160726-18098-h7ugbr.lock at 45:/etc/puppet/modules/mediawiki/manifests/init.pp
Error: Could not set 'file' on ensure: No such file or directory - /etc/firejail/mediawiki-imagemagick.profile20160726-18098-h7ugbr.lock at 45:/etc/puppet/modules/mediawiki/manifests/init.pp
Wrapped exception:
No such file or directory - /etc/firejail/mediawiki-imagemagick.profile20160726-18098-h7ugbr.lock
Error: /Stage[main]/Mediawiki/File[/etc/firejail/mediawiki-imagemagick.profile]/ensure: change from absent to file failed: Could not set 'file' on ensure: No such file or directory - /etc/firejail/mediawiki-imagemagick.profile20160726-18098-h7ugbr.lock at 45:/etc/puppet/modules/mediawiki/manifests/init.pp

Done: https://gerrit.wikimedia.org/r/#/c/301404

Error: Could not set 'file' on ensure: No such file or directory - /etc/php5/apache2/php.ini20160726-18098-10h241o.lock at 21:/etc/puppet/modules/mediawiki/manifests/php.pp
Error: Could not set 'file' on ensure: No such file or directory - /etc/php5/apache2/php.ini20160726-18098-10h241o.lock at 21:/etc/puppet/modules/mediawiki/manifests/php.pp
Wrapped exception:
No such file or directory - /etc/php5/apache2/php.ini20160726-18098-10h241o.lock
Error: /Stage[main]/Mediawiki::Php/File[/etc/php5/apache2/php.ini]/ensure: change from absent to file failed: Could not set 'file' on ensure: No such file or directory - /etc/php5/apache2/php.ini20160726-18098-10h241o.lock at 21:/etc/puppet/modules/mediawiki/manifests/php.pp

Done: https://gerrit.wikimedia.org/r/#/c/301405

Error: Could not set 'file' on ensure: No such file or directory - /home/l10nupdate/.gitconfig20160726-18098-axu2js.lock at 81:/etc/puppet/modules/scap/manifests/l10nupdate.pp
Error: Could not set 'file' on ensure: No such file or directory - /home/l10nupdate/.gitconfig20160726-18098-axu2js.lock at 81:/etc/puppet/modules/scap/manifests/l10nupdate.pp
Wrapped exception:
No such file or directory - /home/l10nupdate/.gitconfig20160726-18098-axu2js.lock
Error: /Stage[main]/Scap::L10nupdate/File[/home/l10nupdate/.gitconfig]/ensure: change from absent to file failed: Could not set 'file' on ensure: No such file or directory - /home/l10nupdate/.gitconfig20160726-18098-axu2js.lock at 81:/etc/puppet/modules/scap/manifests/l10nupdate.pp

Done: https://gerrit.wikimedia.org/r/#/c/301408

Notice: /Stage[main]/Mediawiki::Scap/Exec[fetch_mediawiki]/returns: 22:25:55 pull failed: <CalledProcessError> Command '['sudo', '-u', 'mwdeploy', '-n', '--', '/usr/bin/rsync', '--archive', '--delete-delay', '--delay-updates', '--compress', '--delete', '--exclude=**/cache/l10n/*.cdb', '--exclude=*.swp', '--no-perms', '--exclude=**/.git', 'deployment-tin.eqiad.wmflabs::common', '/srv/mediawiki']' returned non-zero exit status 10
Notice: /Stage[main]/Mediawiki::Scap/Exec[fetch_mediawiki]/returns:
Error: /usr/bin/scap pull returned 70 instead of one of [0]
Error: /Stage[main]/Mediawiki::Scap/Exec[fetch_mediawiki]/returns: change from notrun to 0 failed: /usr/bin/scap pull returned 70 instead of one of [0]

What's wrong here

  1. Trebuchet and all of its baggage, including a bunch of cloned repos
  2. Full MediaWiki scap setup
  3. l10nupdate (via MW scap)
  4. Full MW runtime setup (via MW scap)
  5. Why oh why is hiera('mediawiki::redis_servers::eqiad') embedded in role::memcached?
  6. Several resource ordering issues on initial provision