User:Brion/todo/dumps

From Wikitech

mailing list notes

dump process... again... *sigh*

  • [live slave db]:
    1) internally consistent sql db dumps for our own backup purposes, use gzip compression
    2) gzipped xml stub archives
  • [secondary db]:
    3) gzipped xml full archives
  • [xml-based]: -> can run simultaneously
    4) gzip -> bzip2
    5) gzip -> 7zip
    6) search index update
    7) yahoo xml [fixme!]
  • [cleanup]:
    archive or remove older dumps
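
A rough sketch of step 4 (gzip -> bzip2), using only the Python standard library. The function name and file paths are made up for illustration and are not the actual dump scripts. It streams the data, so a full-history archive never has to fit in memory. Because it only reads the finished gzip file, several of these could run at once alongside the other xml-based steps.

```python
import bz2
import gzip
import shutil


def recompress_gzip_to_bzip2(src_path, dst_path, chunk_size=1024 * 1024):
    """Stream a .gz file into a .bz2 file, one chunk at a time.

    Hypothetical helper: the name and the default chunk size are assumptions.
    """
    with gzip.open(src_path, "rb") as src, bz2.open(dst_path, "wb") as dst:
        # copyfileobj reads and writes in chunk_size pieces until EOF.
        shutil.copyfileobj(src, dst, chunk_size)
```

Usage would look like recompress_gzip_to_bzip2("stub.xml.gz", "stub.xml.bz2"), with both file names being placeholders.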


  1. one master process
  2. multiple slave processes
  3. be sensitive to status changes -- added, removed, private wikis during runs
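
One possible shape for the three points above, as a minimal sketch. Everything here is hypothetical: dump_wiki is a placeholder for the real per-wiki work, the status values are invented, and threads stand in for the separate slave processes. It also assumes that private and removed wikis should simply be skipped. The point is that each job re-reads the wiki's status just before it starts, so changes made during a run are noticed.

```python
from concurrent.futures import ThreadPoolExecutor


def dump_wiki(name):
    # Placeholder for the real per-wiki dump work (hypothetical).
    return f"dumped {name}"


def run_master(get_wiki_status, max_workers=4):
    """Run one job per wiki from a single master.

    get_wiki_status() is assumed to return a dict mapping wiki name to
    "public", "private", or "removed"; it is called again inside each job.
    """
    def job(name):
        # Re-check right before starting, in case the status changed mid-run.
        status = get_wiki_status().get(name, "removed")
        if status != "public":
            return f"skipped {name} ({status})"
        return dump_wiki(name)

    names = sorted(get_wiki_status())
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as names.
        return dict(zip(names, pool.map(job, names)))
```

Wikis added after the initial listing are not picked up by this sketch; a longer-running master would need to poll the list again between batches.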

questions

  • Do we have enough space on benet for two complete, clean versions?
    • If not, we need to pull a bigger box to run on.
  • What about historical archives? How much should we preserve?
    • Recommend at *least* keeping the current-articles and full-history .7z versions of each dump around. Online if possible, offline if not.
    • Where do we have space for historical archives?
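
For the first question, a quick way to check free space against an estimated dump size. This is a sketch only: the function name is invented, and the dump size has to come from measuring an existing run, since the notes give no figure.

```python
import shutil


def has_room_for_copies(path, dump_size_bytes, copies=2):
    """Return True if the filesystem holding path has room for the copies.

    Hypothetical helper; dump_size_bytes is an estimate supplied by the caller.
    """
    free = shutil.disk_usage(path).free
    return free >= copies * dump_size_bytes
```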