Search Platform/Weekly Updates/2023-10-06

Summary

We're starting a new quarter. Our goals for this quarter are almost ready and will be published on wiki shortly (keep an eye on https://wikitech.wikimedia.org/wiki/Search_Platform/Goals).

Overall, this was a short week due to Wikimedia Connect. We're making good progress towards deploying the Search Update Pipeline, with the testing of standard operations completed. We've identified a number of performance optimizations for our work on improving the multilingual zero-results rate. We're also getting started on experimenting with the WDQS graph split.

What we've accomplished

Search Update Pipeline

  • We have tested all relevant operations and are ready for a production deployment of the Search Update Pipeline on Flink, with k8s operators - https://phabricator.wikimedia.org/T342149
  • Migration of the WDQS updater to use newer Flink connectors
  • Started work on better isolation of the WDQS updater error streams. A quick patch disables them to unblock testing of the flink-k8s-op; a better solution is still WIP - https://phabricator.wikimedia.org/T347515

Improve multilingual zero-results rate

  • Performance optimization is in progress. In particular, consolidating character mappings brings a 9.3% improvement to indexing times, and implementing custom mapping code instead of the heavyweight Elasticsearch machinery is ~50% faster.
  • A new Elasticsearch plugin will be created to isolate this and allow for easier rollout
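
As a rough illustration of what consolidating character mappings can mean, the sketch below merges several single-character mapping passes into one lookup table, so the text is scanned once instead of once per mapping. This is not the team's actual code (which lives in the Elasticsearch analysis chain); the function names and the example mappings are hypothetical, and the sketch assumes every mapping key is a single character.

```python
def apply_sequential(text, mappings):
    """Apply each mapping in turn: one full pass over the text per mapping."""
    for mapping in mappings:
        text = "".join(mapping.get(ch, ch) for ch in text)
    return text


def consolidate(*mappings):
    """Merge mappings (applied in order) into a single equivalent mapping.

    Assumes every key is a single character; values may be any string.
    """
    combined = {}
    for mapping in mappings:
        # Push the outputs of earlier mappings through the new mapping.
        for src in list(combined):
            combined[src] = "".join(mapping.get(ch, ch) for ch in combined[src])
        # Add characters that only the new mapping handles.
        for src, dst in mapping.items():
            combined.setdefault(src, dst)
    return combined


def apply_once(text, combined):
    """Apply the consolidated mapping in a single pass."""
    return text.translate(str.maketrans(combined))


# Hypothetical example mappings, for illustration only.
fold_quotes = {"\u2019": "'", "\u2018": "'"}   # curly quotes -> straight quote
strip_apostrophe = {"'": ""}                   # drop straight apostrophes

table = consolidate(fold_quotes, strip_apostrophe)
sample = "l\u2019\u00e9t\u00e9"
result = apply_once(sample, table)
```

The consolidated table gives the same result as running the mappings one after another, but it is built once up front and applied in a single scan of the text.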

WDQS Graph Split

Misc