
Create mediawiki.user_change event stream
Open, High, Public

Description

MediaWiki user change events are needed for Incremental MediaWiki History to support weekly delivery of core contributor metrics.

This Incremental MWH sheet tracks the data field requirements.

MediaWiki hooks available for use (as of 2026-04):

Some prior work producing this kind of event data exists in the legacy ServerSideAccountCreation EventLogging stream, emitted by Extension:Campaign. Rather than using a convenient hook, Extension:Campaign implements a SecondaryAuthenticationProvider that MediaWiki core calls when a user logs in. If the login also happens to be an account creation, the provider constructs an event and produces it.
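To make the pattern above concrete, here is a minimal Python sketch of "get called on every login, emit only when the login is also an account creation". It is not the Extension:Campaign code (which is PHP); `LoginResult`, `AccountCreationEmitter`, `on_login`, and the event field names are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LoginResult:
    """Hypothetical stand-in for what an auth provider sees on login."""
    user_name: str
    is_new_account: bool


@dataclass
class AccountCreationEmitter:
    """Called on every login; produces an event only for new accounts."""
    produce: Callable[[dict], None]  # hypothetical event producer callback

    def on_login(self, result: LoginResult) -> Optional[dict]:
        # Ordinary logins are ignored: there is nothing to emit.
        if not result.is_new_account:
            return None
        # Field names here are illustrative, not a real schema.
        event = {"event_type": "account_creation", "user_name": result.user_name}
        self.produce(event)
        return event


# Usage: collect produced events in a list instead of sending them anywhere.
sent = []
emitter = AccountCreationEmitter(produce=sent.append)
emitter.on_login(LoginResult("Alice", is_new_account=False))
emitter.on_login(LoginResult("Bob", is_new_account=True))
```

The downside this sketch shows is that the emitter runs on every login and has to filter for account creations itself, which a dedicated hook would avoid.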

There doesn't seem to be a relevant CentralAuth hook, so we'll have to figure out how global accounts relate.

Done is

  • new mediawiki/user/change(?) schema modeled on entity/user schema fragment
  • new mediawiki.user_change stream emitted from EventBus
  • new mediawiki.user_change has more signal than ServerSideAccountCreation does for Incremental MWH purposes


Event Timeline

Met today with MW folks to learn more. Notes here.

Change #1277724 had a related patch set uploaded (by Ottomata; author: Ottomata):

[mediawiki/extensions/EventBus@master] WIP - create new user_change event stream

https://gerrit.wikimedia.org/r/1277724

We might need to do {T348252} for this too.

I just ran into an annoying issue with how we have been modeling user groups in event schemas. I filed T425360: EventBus - user entity schema should differentiate between explicit and implicit user groups. Fixing it may be a semi-backwards-incompatible change.
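To illustrate the explicit vs. implicit distinction, here is a small Python sketch that splits a flat list of group names into the two kinds. It assumes MediaWiki's default implicit groups (`*`, `user`, `autoconfirmed`); wikis can configure this set differently. The function name and the output field names are hypothetical and are not what T425360 proposes.

```python
# Assumption: MediaWiki's default implicit groups. Individual wikis may
# configure a different set.
IMPLICIT_GROUPS = frozenset({"*", "user", "autoconfirmed"})


def split_user_groups(groups):
    """Split a flat list of group names into explicit and implicit groups.

    Explicit groups are assigned to the user directly (e.g. sysop, bot);
    implicit groups are derived automatically by MediaWiki.
    """
    explicit = sorted(g for g in groups if g not in IMPLICIT_GROUPS)
    implicit = sorted(g for g in groups if g in IMPLICIT_GROUPS)
    # Output keys are illustrative only, not a real schema.
    return {"explicit": explicit, "implicit": implicit}


result = split_user_groups(["sysop", "user", "*", "bot"])
```

A consumer that only sees one flat list cannot tell whether `autoconfirmed` was granted or derived, which is the ambiguity the task describes.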

Since user groups are not needed for T418032, I won't implement user group changes for now. I already have code that does it, so I'll split that out.

For posterity, the user group change code I had can be seen by diffing patchset 8 to 7:
https://gerrit.wikimedia.org/r/c/mediawiki/extensions/EventBus/+/1277724/8..7

I worked out ^ as described in this comment: T425360#11887651

First draft of user_change ready! After review, we could merge and declare a .dev0 stream and try it!

Now to find a reviewer!

I commented on the schema, but I'm not comfortable reviewing the mediawiki code :)


Same.

Ottomata triaged this task as High priority. (May 9 2026, 2:14 AM)

Change #1285525 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/mediawiki-config@master] EventStreamConfig - add mediawiki.user_change.dev0

https://gerrit.wikimedia.org/r/1285525

Test wiki created on Patch demo by Ottomata using patch(es) linked to this task:
https://a199b51d8e.catalyst.wmcloud.org/w/

Change #1285525 merged by jenkins-bot:

[operations/mediawiki-config@master] EventStreamConfig - add mediawiki.user_change.dev0

https://gerrit.wikimedia.org/r/1285525

Mentioned in SAL (#wikimedia-operations) [2026-05-11T12:59:23Z] <otto@deploy1003> Started scap sync-world: Backport for [[gerrit:1285525|EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]]

Mentioned in SAL (#wikimedia-operations) [2026-05-11T13:01:08Z] <otto@deploy1003> otto: Backport for [[gerrit:1285525|EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2026-05-11T13:07:28Z] <otto@deploy1003> Finished scap sync-world: Backport for [[gerrit:1285525|EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]] (duration: 08m 05s)

Change #1277724 merged by jenkins-bot:

[mediawiki/extensions/EventBus@master] Create new user_change event stream

https://gerrit.wikimedia.org/r/1277724

A new consideration: T425986: Add log_id to wmf.mediawiki_history. For this new user_change stream we would also need a log_id field.

Change #1286434 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/mediawiki-config@master] EventStreamConfig - ingest mediawiki.user_change into the Data Lake

https://gerrit.wikimedia.org/r/1286434

Change #1286434 merged by jenkins-bot:

[operations/mediawiki-config@master] EventStreamConfig - ingest mediawiki.user_change into the Data Lake

https://gerrit.wikimedia.org/r/1286434

Mentioned in SAL (#wikimedia-operations) [2026-05-12T17:40:36Z] <otto@deploy1003> Started scap sync-world: Backport for [[gerrit:1286434|EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]]

Mentioned in SAL (#wikimedia-operations) [2026-05-12T17:42:31Z] <otto@deploy1003> otto: Backport for [[gerrit:1286434|EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2026-05-12T17:56:44Z] <otto@deploy1003> Finished scap sync-world: Backport for [[gerrit:1286434|EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]] (duration: 16m 08s)

We have Hive event.mediawiki_user_change_dev0 records!


Awesome! I'll hopefully start looking into these next week.

We have done multiple test runs ingesting data from event.mediawiki_user_change_dev0, and so far have found no data quality issues.
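For anyone repeating this kind of check, here is a rough Python sketch. The stream-to-table naming rule is inferred only from the names in this task (`mediawiki.user_change.dev0` became `event.mediawiki_user_change_dev0`) and is not the ingestion pipeline's actual code. The helper names and the example field names are hypothetical.

```python
import re


def hive_table_for_stream(stream_name: str, database: str = "event") -> str:
    """Guess the Hive table for a stream by replacing every character
    other than letters, digits, and underscores with an underscore.

    Assumption: rule inferred from the names seen in this task.
    """
    table = re.sub(r"[^A-Za-z0-9_]", "_", stream_name)
    return f"{database}.{table}"


def rows_missing_fields(rows, required_fields):
    """Return the rows where any required field is absent or None."""
    return [
        row for row in rows
        if any(row.get(name) is None for name in required_fields)
    ]


table_name = hive_table_for_stream("mediawiki.user_change.dev0")

# Hypothetical sample rows and field names, for illustration only.
sample_rows = [
    {"user_id": 1, "dt": "2026-05-12T17:56:44Z"},
    {"user_id": None, "dt": "2026-05-12T18:00:00Z"},
]
bad_rows = rows_missing_fields(sample_rows, ["user_id", "dt"])
```

A null check like this only catches one class of problem; it says nothing about duplicates or late events.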

Can we move this dev stream to production?

CC @JMonton-WMF

Change #1299422 had a related patch set uploaded (by JavierMonton; author: JavierMonton):

[operations/mediawiki-config@master] stream: mediawiki.user_change

https://gerrit.wikimedia.org/r/1299422

Thank you! I just discussed this with Javier and Xabriel. We can do this, but I suggested waiting a while, perhaps until MWHInc v2.

We aren't yet sure how T424685: Emit comprehensive mediawiki user block change information in an event stream relates to this work, or whether it would change the user/change data model if we decide to include block changes in the user_change stream.