Incidents/2018-11-06 maps

From Wikitech

document status: final

Summary

On 6 November 2018, Tilerator failed on maps100[1-3]. Tilerator is a non-public service that prepares vector tiles (data blobs) from the OSM database and stores them in Cassandra. Icinga first reported the failure around 00:18 UTC.

Timeline

This is a step-by-step outline of what happened during the incident and how it was remedied.

00:18 UTC: Icinga reported connection failures on the Tilerator port:

PROBLEM - tilerator on maps1003 is CRITICAL: connect to address 10.64.32.117 and port 6534: Connection refused
1:19 AM PROBLEM - tilerator on maps1002 is CRITICAL: connect to address 10.64.16.42 and port 6534: Connection refused
1:19 AM PROBLEM - tilerator on maps1001 is CRITICAL: connect to address 10.64.0.79 and port 6534: Connection refused
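The alerts above come from a plain TCP connection check against port 6534. As a rough illustration only (this is not the actual Icinga check definition, and the function names `port_is_open` and `check_hosts` are hypothetical), a similar probe could be sketched in Python, using the addresses and port taken from the alert text:

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused", timeouts, and unreachable hosts.
        return False


def check_hosts(hosts, port: int = 6534, timeout: float = 5.0) -> dict:
    """Probe every host on the given port and map host -> reachable (bool)."""
    return {host: port_is_open(host, port, timeout) for host in hosts}


if __name__ == "__main__":
    # Addresses as they appear in the Icinga alerts for maps100[1-3].
    tilerator_hosts = ["10.64.0.79", "10.64.16.42", "10.64.32.117"]
    for host, ok in check_hosts(tilerator_hosts).items():
        print(f"{host}:6534 {'OK' if ok else 'CRITICAL'}")
```

A probe like this only confirms that something is listening on the port. It does not say whether the service behind it is processing jobs correctly.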

07:25 UTC: The Tilerator service was restarted on maps100[1-3].

07:26 UTC: The Tilerator service came back up.

Conclusions

Maps Runbook: Maps/RunBook

Actionables

NOTE: Please add the #wikimedia-incident Phabricator project to these follow-up tasks and move them to the "follow-up/actionable" column.