
Portal:Cloud VPS/Admin/notes/Neutron Migration

This page contains historical information: it was only relevant while we were migrating from nova-network to Neutron.

Clearly a lot of this can be automated. Once we've done a few projects without errors, we can mash all this into one big super-script.
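A very rough sketch of what such a super-script might look like, using only the commands from the steps below (untested; the filename and structure are just an illustration):

#!/bin/bash
# migrate-project.sh <project-name> -- rough sketch only, not an existing tool
set -e
project="$1"
source /root/novaenv.sh

# Copy quotas and security groups for the project
wmcs-region-migrate-quotas "$project"
wmcs-region-migrate-security-groups "$project"

# Migrate every VM in the project, one at a time.
# In practice you probably want to check each VM before moving on to the next.
for id in $(OS_TENANT_NAME="$project" openstack server list -f value -c ID); do
    wmcs-region-migrate "$id"
done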

Steps

  • On labcontrol1001, source the nova credentials, then copy the project's quotas and security groups to the new region
root@labcontrol1001:~# cd /root
root@labcontrol1001:~# source ~/novaenv.sh
root@labcontrol1001:~# wmcs-region-migrate-quotas <project-name>
Updated quotas using <QuotaSet cores=12, fixed_ips=200, floating_ips=0, injected_file_content_bytes=10240, injected_file_path_bytes=255, injected_files=5, instances=8, key_pairs=100, metadata_items=128, ram=24576, security_group_rules=20, security_groups=10, server_group_members=10, server_groups=10>
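  • To double-check, compare the quotas against the output above (standard openstack client command; assumes your sourced environment points at the region you want to inspect)
root@labcontrol1001:~# openstack quota show <project-name>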
root@labcontrol1001:~# wmcs-region-migrate-security-groups <project-name>
deleting rule {u'remote_group_id': u'2c908284-84ef-4a4a-8f1a-11e84b6256db', u'direction': u'ingress', u'protocol': None, u'description': u'', u'ethertype': u'IPv4', u'remote_ip_prefix': None, u'port_range_max': None, u'security_group_id': u'2c908284-84ef-4a4a-8f1a-11e84b6256db', u'port_range_min': None, u'tenant_id': u'hhvm', u'id': u'6ea1ac47-3876-4676-bb01-bf89cb6f4363'}
deleting rule {u'remote_group_id': u'2c908284-84ef-4a4a-8f1a-11e84b6256db', u'direction': u'ingress', u'protocol': None, u'description': u'', u'ethertype': u'IPv6', u'remote_ip_prefix': None, u'port_range_max': None, u'security_group_id': u'2c908284-84ef-4a4a-8f1a-11e84b6256db', u'port_range_min': None, u'tenant_id': u'hhvm', u'id': u'85eabca5-63f4-43f2-aeb2-64d2e70779f1'}
Updating group default in dest
copying rule: {u'from_port': None, u'group': {u'tenant_id': u'hhvm', u'name': u'default'}, u'ip_protocol': None, u'to_port': None, u'parent_group_id': 357, u'ip_range': {}, u'id': 1526}
copying rule: {u'from_port': -1, u'group': {}, u'ip_protocol': u'icmp', u'to_port': -1, u'parent_group_id': 357, u'ip_range': {u'cidr': u'0.0.0.0/0'}, u'id': 1527}
copying rule: {u'from_port': 22, u'group': {}, u'ip_protocol': u'tcp', u'to_port': 22, u'parent_group_id': 357, u'ip_range': {u'cidr': u'10.0.0.0/8'}, u'id': 1528}
copying rule: {u'from_port': 5666, u'group': {}, u'ip_protocol': u'tcp', u'to_port': 5666, u'parent_group_id': 357, u'ip_range': {u'cidr': u'10.0.0.0/8'}, u'id': 1529}
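  • Optionally, list the copied security groups and rules in the destination to confirm they match (standard openstack client commands; exact output depends on the installed client version)
root@labcontrol1001:~# OS_TENANT_NAME=<project-name> openstack security group list
root@labcontrol1001:~# OS_TENANT_NAME=<project-name> openstack security group rule list default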
  • Start 'screen' because the next bit is going to take a while
root@labcontrol1001:~# screen
  • Get a list of all VMs in the project
root@labcontrol1001:~# OS_TENANT_NAME=<project-name> openstack server list
+--------------------------------------+------------------+--------+--------------------+
| ID                                   | Name             | Status | Networks           |
+--------------------------------------+------------------+--------+--------------------+
| d4730c86-a6cc-4cb1-9ebe-a84f26926f24 | hhvm-jmm-vp9     | ACTIVE | public=10.68.19.57 |
| 34522cd3-9628-4035-9faa-6d12e55b0f9f | hhvm-stretch-jmm | ACTIVE | public=10.68.20.46 |
| db3a0098-8707-49bd-846f-9b9629c63658 | hhvm-jmm         | ACTIVE | public=10.68.16.91 |
+--------------------------------------+------------------+--------+--------------------+
  • Migrate VMs one by one
root@labcontrol1001:~# wmcs-region-migrate d4730c86-a6cc-4cb1-9ebe-a84f26926f24
  • See what broke
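  • One way to check: confirm the migrated instance is ACTIVE again and skim its console log (standard openstack client commands, not part of the migration tooling)
root@labcontrol1001:~# OS_TENANT_NAME=<project-name> openstack server show d4730c86-a6cc-4cb1-9ebe-a84f26926f24
root@labcontrol1001:~# OS_TENANT_NAME=<project-name> openstack console log show d4730c86-a6cc-4cb1-9ebe-a84f26926f24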

Special Concerns for Kubernetes Nodes

When moving a Kubernetes worker (or anything that connects to the flannel network for that matter), you must reload ferm on every flannel etcd node (currently that means tools-flannel-etcd-0[1-3].tools.eqiad.wmflabs). After that, run puppet on the worker node to put everything to rights.

Don't forget that some worker nodes were built from a broken image and still have a bad /etc/resolv.conf, so check for that as well.

Therefore, the process for moving a worker node is (see the example command sequence after the list):

  1. drain and cordon
  2. move with wmcs-region-migrate
  3. fix resolv.conf if needed
  4. sudo systemctl reload ferm on tools-flannel-etcd-0[1-3]
  5. run puppet
  6. uncordon after validating the node is "Ready" in kubectl get nodes
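A sketch of the whole sequence (host and node names are only examples; run kubectl wherever you normally administer the cluster):

# on the kubernetes master: drain the worker (drain also cordons it)
kubectl drain tools-worker-1001 --ignore-daemonsets
# on labcontrol1001, with novaenv.sh sourced: migrate the instance
wmcs-region-migrate <instance-id-of-tools-worker-1001>
# on the worker: check /etc/resolv.conf and fix it if it came from the broken image
# on each of tools-flannel-etcd-01 through -03:
sudo systemctl reload ferm
# back on the worker:
sudo puppet agent -t -v
# on the master: wait until the node reports Ready, then uncordon
kubectl get nodes
kubectl uncordon tools-worker-1001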

Common issues

  • Puppet not working because of certificate issues. Run sudo rm -rf /var/lib/puppet/ssl on the instance and then run sudo puppet agent -t -v again.
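For example, on the affected instance (hostname is a placeholder):
<user>@<instance>:~$ sudo rm -rf /var/lib/puppet/ssl
<user>@<instance>:~$ sudo puppet agent -t -v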