
Follow-up AB test of dym language model variants
Open, Needs Triage | Public | 3 Estimated Story Points

Description

We ran an AB test in T404647 that compared the default language model (title + redirect.title) against a variant (opening_text). We were surprised to find that the variant did not improve performance over the default language model. One possibility is that there are patterns in the titles that are not found in the opening_text, and we need both. Run a test that compares title + redirect.title against title + redirect.title + opening_text.

Event Timeline

Change #1196747 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[mediawiki/extensions/CirrusSearch@master] Revert "Limit suggest_variant to only opening_text"

https://gerrit.wikimedia.org/r/1196747

Change #1196747 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] Revert "Limit suggest_variant to only opening_text"

https://gerrit.wikimedia.org/r/1196747

pfischer set the point value for this task to 3. Oct 20 2025, 3:33 PM

Change #1277701 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[operations/mediawiki-config@master] cirrus: AB test query suggester variants

https://gerrit.wikimedia.org/r/1277701

This got backburnered, but on review it looks like everything necessary is in place.

I wrote a small bash script to check the clusters and verify everything has been reindexed; the script and output are at P91683.

  • It is expected that wikidata does not have the configuration, since did-you-mean is disabled there
  • It turns out labtestwiki has been deleted, and thus did not get updated with the reindex. These indices have now been deleted from the production clusters.

With everything ready I've put up the patch to start the test. This uses the same test configuration as before, only changing the test name for uniqueness.

Change #1277701 merged by jenkins-bot:

[operations/mediawiki-config@master] cirrus: AB test query suggester variants

https://gerrit.wikimedia.org/r/1277701

Mentioned in SAL (#wikimedia-operations) [2026-04-28T20:22:33Z] <ebernhardson@deploy1003> Started scap sync-world: Backport for [[gerrit:1277701|cirrus: AB test query suggester variants (T407432)]], [[gerrit:1278498|ExtensionDistributor: mark 1.46 as development (T423262)]]

Mentioned in SAL (#wikimedia-operations) [2026-04-28T20:24:25Z] <ebernhardson@deploy1003> ebernhardson, macfan4000: Backport for [[gerrit:1277701|cirrus: AB test query suggester variants (T407432)]], [[gerrit:1278498|ExtensionDistributor: mark 1.46 as development (T423262)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2026-04-28T20:29:08Z] <ebernhardson@deploy1003> Finished scap sync-world: Backport for [[gerrit:1277701|cirrus: AB test query suggester variants (T407432)]], [[gerrit:1278498|ExtensionDistributor: mark 1.46 as development (T423262)]] (duration: 06m 35s)

Change #1286997 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[operations/mediawiki-config@master] Revert "cirrus: AB test query suggester variants"

https://gerrit.wikimedia.org/r/1286997

Change #1286997 merged by jenkins-bot:

[operations/mediawiki-config@master] Revert "cirrus: AB test query suggester variants"

https://gerrit.wikimedia.org/r/1286997

Mentioned in SAL (#wikimedia-operations) [2026-05-13T20:16:29Z] <ebernhardson@deploy1003> Started scap sync-world: Backport for [[gerrit:1286997|Revert "cirrus: AB test query suggester variants" (T407432)]]

Mentioned in SAL (#wikimedia-operations) [2026-05-13T20:18:27Z] <ebernhardson@deploy1003> ebernhardson: Backport for [[gerrit:1286997|Revert "cirrus: AB test query suggester variants" (T407432)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2026-05-13T20:23:35Z] <ebernhardson@deploy1003> Finished scap sync-world: Backport for [[gerrit:1286997|Revert "cirrus: AB test query suggester variants" (T407432)]] (duration: 07m 06s)

Change #1287899 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[mediawiki/extensions/CirrusSearch@master] Include xff in search logs

https://gerrit.wikimedia.org/r/1287899

Change #1287899 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] Include xff in search logs

https://gerrit.wikimedia.org/r/1287899

Change #1288924 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[mediawiki/extensions/CirrusSearch@wmf/1.47.0-wmf.2] Include xff in search logs

https://gerrit.wikimedia.org/r/1288924

The AB test shows a significant fraction (~40% of enwiki sessions) receiving multiple AB test buckets. I verified this by joining search IDs from the AB test events against our backend logging: we did indeed use different buckets on different requests. The logs contain all the input data to bucketing except the x-forwarded-for header. The patch above is scheduled for deployment today and should get us enough data to understand where things went wrong.

Rough plan:

  • Ship the logging for x-forwarded-for
  • Turn the AB test back on
  • Dig through the first day's data to understand where the bucketing problem is and fix it
  • Let the AB test run with fixed bucketing for a week and re-run analysis.
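For reference, the multi-bucket check described above could be sketched roughly as follows. This is a minimal illustration in Python, not the analysis code that was actually run: the record shapes (`session_id`, `search_id`, `bucket`) and the function names are assumptions made for the example.

```python
from collections import defaultdict


def buckets_per_session(frontend_events, backend_logs):
    """Join frontend events to backend logs on search id.

    frontend_events: iterable of (session_id, search_id) pairs (hypothetical shape)
    backend_logs:    iterable of (search_id, bucket) pairs (hypothetical shape)
    Returns a dict mapping session_id to the set of buckets seen for it.
    """
    bucket_by_search = {search_id: bucket for search_id, bucket in backend_logs}
    sessions = defaultdict(set)
    for session_id, search_id in frontend_events:
        bucket = bucket_by_search.get(search_id)
        if bucket is not None:  # skip searches with no matching backend log line
            sessions[session_id].add(bucket)
    return sessions


def multi_bucket_fraction(sessions):
    """Fraction of sessions that were served more than one bucket."""
    if not sessions:
        return 0.0
    multi = sum(1 for buckets in sessions.values() if len(buckets) > 1)
    return multi / len(sessions)


# Tiny made-up sample: s2 and s4 each see two different buckets.
frontend = [("s1", "a1"), ("s1", "a2"), ("s2", "b1"), ("s2", "b2"),
            ("s3", "c1"), ("s4", "d1"), ("s4", "d2")]
backend = [("a1", "control"), ("a2", "control"), ("b1", "control"),
           ("b2", "variant"), ("c1", "variant"), ("d1", "variant"),
           ("d2", "control")]

sessions = buckets_per_session(frontend, backend)
fraction = multi_bucket_fraction(sessions)
```

With correct bucketing every session should map to exactly one bucket, so the fraction would be expected to sit at or very near zero.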

Change #1288924 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@wmf/1.47.0-wmf.2] Include xff in search logs

https://gerrit.wikimedia.org/r/1288924

Change #1293800 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[mediawiki/extensions/CirrusSearch@master] identity: Prune private ips from x-forwarded-for

https://gerrit.wikimedia.org/r/1293800

Change #1293800 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] identity: Prune private ips from x-forwarded-for

https://gerrit.wikimedia.org/r/1293800
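As a rough illustration of what pruning private IPs from an x-forwarded-for value can look like, here is a small Python sketch. It is not the CirrusSearch change itself (that lives in the patch linked above), and it assumes that "private" can be approximated by the Python `ipaddress` module's private, loopback, and link-local checks; the function name is made up for the example.

```python
import ipaddress


def prune_private_ips(xff_header):
    """Return the addresses from an X-Forwarded-For value that are not private.

    Malformed entries are dropped. "Private" here follows Python's ipaddress
    module, which is an assumption and may differ from the real patch.
    """
    kept = []
    for part in xff_header.split(","):
        candidate = part.strip()
        if not candidate:
            continue
        try:
            addr = ipaddress.ip_address(candidate)
        except ValueError:
            continue  # not a parseable IP address
        if addr.is_private or addr.is_loopback or addr.is_link_local:
            continue  # internal hop, drop it
        kept.append(candidate)
    return kept


pruned = prune_private_ips("8.8.8.8, 10.64.0.12, 192.168.1.5")
```

In this sketch only the public address survives, so two requests that differ only in which internal hops they passed through would end up with the same pruned value.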

Change #1294373 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[mediawiki/extensions/CirrusSearch@wmf/1.47.0-wmf.4] identity: Prune private ips from x-forwarded-for

https://gerrit.wikimedia.org/r/1294373

Change #1294374 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[operations/mediawiki-config@master] Revert^2 "cirrus: AB test query suggester variants"

https://gerrit.wikimedia.org/r/1294374

Change #1294374 merged by jenkins-bot:

[operations/mediawiki-config@master] Revert^2 "cirrus: AB test query suggester variants"

https://gerrit.wikimedia.org/r/1294374

Change #1294373 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@wmf/1.47.0-wmf.4] identity: Prune private ips from x-forwarded-for

https://gerrit.wikimedia.org/r/1294373

Mentioned in SAL (#wikimedia-operations) [2026-05-27T20:44:14Z] <ebernhardson@deploy1003> Started scap sync-world: Backport for [[gerrit:1294373|identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374|Revert^2 "cirrus: AB test query suggester variants" (T407432)]]

Mentioned in SAL (#wikimedia-operations) [2026-05-27T20:46:07Z] <ebernhardson@deploy1003> ebernhardson: Backport for [[gerrit:1294373|identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374|Revert^2 "cirrus: AB test query suggester variants" (T407432)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2026-05-27T20:51:45Z] <ebernhardson@deploy1003> Finished scap sync-world: Backport for [[gerrit:1294373|identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374|Revert^2 "cirrus: AB test query suggester variants" (T407432)]] (duration: 07m 30s)

The test is rolled out, but the fix only applies to wmf.4. When analyzing the results we must ignore all events prior to May 29th.
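A minimal sketch of that cutoff filter in Python, under stated assumptions: events carry an ISO 8601 timestamp in a field called `dt` (a hypothetical name), the year is 2026 (inferred from the SAL entries above), and the cutoff is midnight UTC. The exact boundary should be adjusted to when wmf.4 actually reached the wikis under test.

```python
from datetime import datetime, timezone

# Assumed cutoff: start of May 29th, 2026, UTC.
CUTOFF = datetime(2026, 5, 29, tzinfo=timezone.utc)


def events_from_cutoff(events, cutoff=CUTOFF):
    """Keep only events whose timestamp is at or after the cutoff."""
    return [e for e in events if datetime.fromisoformat(e["dt"]) >= cutoff]


# Made-up sample events straddling the cutoff.
sample_events = [
    {"dt": "2026-05-28T23:59:59+00:00", "bucket": "control"},
    {"dt": "2026-05-29T00:00:00+00:00", "bucket": "variant"},
    {"dt": "2026-06-02T12:30:00+00:00", "bucket": "control"},
]

kept_events = events_from_cutoff(sample_events)
```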