Jump to content

Category:Data pipelines

From Wikitech
The printable version is no longer supported and may have rendering errors. Please update your browser bookmarks and please use the default browser print function instead.

Documentation of data ingestion and processing pipelines.

  • Includes documentation describing how specific datasets are derived or computed, for example: MediaWiki history computation (ingestion from DB, history rebuilding, computation of metrics, extraction onto other systems, ad-hoc querying).
  • Does not include documentation for the data platform infrastructure or system components that implement a given data pipeline, for example: Airflow, Gobblin.