Search Platform/Weekly Updates/2023-05-18

From Wikitech

Summary

Working on the post-mortem of the WDQS outage, Search Update pipeline, and optimizing Wikibase index settings.

What we've accomplished

Search - Analysis

  • Continuing data analysis for apostrophe-like characters (T315118). There are 22 candidate characters, and they are treated differently by different tokenizers, by ICU normalization, and by ICU folding. The Hebrew tokenizer, for example, converts 5 of them straight to apostrophes, including the Hebrew geresh, which I had never noticed before!
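
As a rough illustration of the kind of per-character comparison described above, here is a minimal sketch using only Python's standard `unicodedata` module. Two assumptions to keep in mind: the `SAMPLE` list is a hand-picked illustrative handful, not the 22 candidates from T315118, and NFKC normalization is only a stand-in here; it is not the ICU normalization or ICU folding configured in the search analysis chain, which may behave differently. The helper names `describe` and `report` are hypothetical.

```python
import unicodedata

# Illustrative sample only (an assumption): NOT the 22 candidates from T315118.
SAMPLE = [
    "\u0027",  # APOSTROPHE
    "\u0060",  # GRAVE ACCENT
    "\u02BC",  # MODIFIER LETTER APOSTROPHE
    "\u05F3",  # HEBREW PUNCTUATION GERESH
    "\u2019",  # RIGHT SINGLE QUOTATION MARK
    "\uFF07",  # FULLWIDTH APOSTROPHE
]


def describe(ch):
    """Return basic Unicode properties of one character and its NFKC form."""
    # NFKC is a stand-in for illustration; ICU folding is a different transform.
    nfkc = unicodedata.normalize("NFKC", ch)
    return {
        "codepoint": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        "nfkc": nfkc,
        "changed_by_nfkc": nfkc != ch,
    }


def report(chars=SAMPLE):
    """Describe every character in the given list."""
    return [describe(c) for c in chars]


if __name__ == "__main__":
    for row in report():
        nfkc_points = " ".join(f"U+{ord(c):04X}" for c in row["nfkc"])
        print(
            f'{row["codepoint"]}  {row["category"]}  '
            f'{row["name"]:<32} NFKC -> {nfkc_points}'
        )
```

Even this small sample shows why the characters cannot be treated uniformly: they fall into different general categories (punctuation, modifier letter, symbol), and only some are changed by a compatibility normalization. Tokenizer behavior, such as the Hebrew tokenizer's conversions, is a separate layer that this sketch does not model.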

Operations / SRE