Attribution API: Include wprov parameters in response URLs
Closed, Resolved | Public | 2 Estimated Story Points

Description

We'd like to include wprov parameters (see the Provenance documentation) in (1) the URL of the essential link attribute, which navigates to the source page of an article or media file, and (2) the default URLs served as part of the calls_to_action signal category.

Why do this:

Acceptance criteria
  • The link signal's URL includes a default wprov=afsw1 parameter. Example: "https://en.wikipedia.org/wiki/Earth?wprov=afsw1"
  • The participation_ctas' URLs include a default wprov=afcw1
  • We transparently communicate the presence of these parameters to users of the Attribution API

Event Timeline

I'm looking at the Description of wprov parameter (https://wikitech.wikimedia.org/wiki/Provenance#Description_of_wprov_parameter) and it should be

<3_char_feature><platform_one_char><major_version_of_feature_uint>.

Therefore, we need to go with something like

afw + [s|d|p|l|c] + 1

where
afw -> Attribution FrameWork
s -> source
d -> donation
p -> participation
l -> learn more
c -> create account
1 -> current version

I need to check, but I don't think that it's good to use the major_version to define which link someone clicked. I'll follow up on this.

Thanks @pmiazga. @RHo and I agree to follow the predetermined format more closely. Following your lead, this would leave us with:

link --> afws1
donation_ctas --> afwd1
participation_ctas --> afwp1
learn_more --> afwl1
create_account --> afwc1

If (1) it's not acceptable to use the platform slot to specify the type of CTA (platform is irrelevant in our case) and (2) platform can be omitted, we could do:

link --> afs1
donation_ctas --> afd1
participation_ctas --> afp1
learn_more --> afl1
create_account --> afc1

We'll update the description with whichever variant you find more compliant. Thank you!

After deeper thought, I don't think we need different wprov values for each link, as we will know which link was clicked from the link (URL) itself. Therefore we may be totally fine with just two values, af[s|c]w1, which translate to

afsw1 -> Attribution Framework, Source, Web, ver 1
afcw1 -> Attribution Framework, Calls to Action, Web, ver 1

HCoplin-WMF set the point value for this task to 2. May 7 2026, 3:02 PM
HCoplin-WMF subscribed.

Notes from estimation:

  • We can use linker tool/library to append the right values to the title link
  • Values are assumed to be directly concatenated to the hardcoded CTAs
  • Add a test that confirms presence of wprov
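As a rough illustration of the first and last notes, here is a minimal Python sketch of appending the parameter and testing for it. The actual patch lives in a MediaWiki extension, so this is not the implementation: the helper name `with_wprov` and the constants are hypothetical, and it parses the query string rather than concatenating directly, so that URLs which already carry a query or fragment stay well-formed.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Values discussed in this task; the constant and helper names are hypothetical.
WPROV_SOURCE = "afsw1"  # essential link
WPROV_CTA = "afcw1"     # calls_to_action URLs


def with_wprov(url: str, wprov: str) -> str:
    """Return url with wprov set, replacing any existing wprov value."""
    parts = urlsplit(url)
    # Keep every other query parameter and drop a pre-existing wprov.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "wprov"
    ]
    query.append(("wprov", wprov))
    return urlunsplit(parts._replace(query=urlencode(query)))


def test_link_has_wprov() -> None:
    # Mirrors the example URL from the acceptance criteria.
    url = with_wprov("https://en.wikipedia.org/wiki/Earth", WPROV_SOURCE)
    assert url == "https://en.wikipedia.org/wiki/Earth?wprov=afsw1"
```

Direct concatenation, as assumed in the notes, is enough for hardcoded CTA URLs without an existing query string; the parsing variant only matters once a URL already contains `?` or `#`.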

Alright. Thank you @pmiazga! I think, though, that you might have meant afws1 and afwc1, right? Just making sure. I'll update the task description now 🙏🏻

@HCoplin-WMF I'm not sure whether this will impact the estimation, but I had forgotten a crucial AC: We need to communicate to users that the URLs include the wprov parameters in a clear and accessible way (either via the spec or the Attribution API docs — wherever it sounds most convenient). A note on the attribution framework site sounds necessary too, so I'll draft a short heads-up there.

I think afsw1 and afcw1 are correct:

afs | afc -> the first 3 chars are the feature: af stands for Attribution Framework, s for source, and c for CTAs
w -> platform, one char -> w for web
1 -> version
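To make the slot order concrete, here is a small Python sketch that splits a wprov value into the three parts described above. The `Wprov` type and `parse_wprov` function are hypothetical names for illustration, and limiting characters to [a-zA-Z0-9] is an assumption, not something the Provenance documentation is known to require.

```python
import re
from typing import NamedTuple


class Wprov(NamedTuple):
    feature: str   # 3 characters, e.g. "afs"
    platform: str  # 1 character, e.g. "w"
    version: int   # unsigned major version of the feature


# Assumption: feature and platform characters are limited to [a-zA-Z0-9].
_WPROV_RE = re.compile(r"([a-zA-Z0-9]{3})([a-zA-Z0-9])([0-9]+)")


def parse_wprov(value: str) -> Wprov:
    """Split a wprov value into feature, platform, and version."""
    match = _WPROV_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not a valid wprov value: {value!r}")
    feature, platform, version = match.groups()
    return Wprov(feature, platform, int(version))
```

Under this reading, `parse_wprov("afsw1")` yields feature `afs` and platform `w`, while `parse_wprov("afws1")` would put `s` in the platform slot, which is the difference between the two orderings discussed here.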

@dr0ptp4kt can you confirm?

@pmiazga confirmed, I think.

If I understand correctly, this is for https://wikimedia-attribution.toolforge.org/attribution-signals/link.html , and so here is what an ordinary request and response might look like

$ curl -A \
'GardenQLNiceAgenticBrowserBot (https://gitlab.wikimedia.org/dr0ptp4kt/gardenql; abaso@wikimedia.org)'  \
  -X GET \
  'https://en.wikipedia.org/w/rest.php/attribution/v0-beta/pages/Neuro-symbolic_AI/signals?redirect=true' \
  -H 'accept: application/json'

{"essential":{"title":"Neuro-symbolic AI","license":{"url":"https://creativecommons.org/licenses/by-sa/4.0/deed.en","title":"Creative Commons Attribution-Share Alike 4.0"},"link":"https://en.wikipedia.org/wiki/Neuro-symbolic_AI","default_brand_marks":[{"name":"Default logo","url":"https://en.wikipedia.org/static/images/project-logos/enwiki-25.png","type":"logo"},{"name":"Site icon","url":"https://en.wikipedia.org/static/images/icons/enwiki-25.svg","type":"icon"},{"name":"Sound logo","url":"https://upload.wikimedia.org/wikipedia/commons/9/91/Wikimedia_Sonic_Logo_-_4-seconds.wav","type":"audio"}],"source_wiki":{"site_id":"enwiki","site_language":"en","page_language":"en"}},"source_wiki":{"site_name":"English Wikipedia","project_family":"wikipedia"}
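For a client's view of this change, the following Python sketch reads the wprov value back out of the `essential.link` field. The sample dictionary is a trimmed, hypothetical response: it keeps only two fields from the output above and appends the `afsw1` value from the acceptance criteria, which the response shown here does not yet contain. The function name `wprov_of` is likewise made up for illustration.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Trimmed, hypothetical response: only two "essential" fields are kept, and
# the link carries the wprov value proposed in the acceptance criteria.
sample_response = {
    "essential": {
        "title": "Neuro-symbolic AI",
        "link": "https://en.wikipedia.org/wiki/Neuro-symbolic_AI?wprov=afsw1",
    }
}


def wprov_of(url: str) -> Optional[str]:
    """Return the wprov query value of url, or None if it is absent."""
    values = parse_qs(urlsplit(url).query).get("wprov")
    return values[0] if values else None


print(wprov_of(sample_response["essential"]["link"]))  # prints afsw1
```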

Even if the thing calling the API is not a web browser (and indeed, I'd anticipate a wide variety of source user agents), I think the idea here is that the original sourcing of the Attribution Framework is fundamentally the website for the Attribution Framework for this class of URL. So, even if the Attribution Framework website were found by means of some agentic browser, let's say, you'd still say that the fundamental sourcing was web. So I think the w here is appropriate enough.

I was about to say use m for multiple, but upon working through this example out loud to myself I'm convinced by use of w as well (I could be wrong, so feel free to refute as you all have final judgment on this; I just had to convince myself one way or the other in the interim 😊).

Now, as for the af, that makes sense. The third character still has another 60 options (if we restrict ourselves to [a-zA-Z0-9]), so there's plenty of room to grow for other kinds of sourcing by means of the Attribution Framework, that is, cases where the Attribution Framework and its surrounding API/discovery architecture provide the hyperlinks. In this case, that's the link element provided by the API linked to from link.html, as mentioned above.

Just updating the docs is a minor addition; it doesn't bump the estimate.

Change #1287028 had a related patch set uploaded (by Pmiazga; author: Pmiazga):

[mediawiki/extensions/WikimediaCustomizations@master] Attribution: Add provenance parameter to Attribution URLS

https://gerrit.wikimedia.org/r/1287028

@Sarai-WMF @HCoplin-WMF I also added wprov param info to the schema documentation:

image.png (1,521×1,233 px, 152 KB)

Does this meet your expectations regarding the criterion "We transparently communicate the presence of these parameters to users of the Attribution API"?

Hey @pmiazga! Confirmed on our end that including descriptions in the schema docs is sufficient to satisfy that criterion.
I'd like to suggest new copy for the description, if that sounds good. What about:

The URL value includes a Wikimedia-specific provenance parameter (wprov). This helps Wikimedia evaluate whether participation calls to action are effective, without changing the destination page or attribution data. We recommend keeping the parameter when possible.

Thank you!

The schema description is my preferred approach here, too. I don't think it's key information for folks who are casually perusing the docs, but agree it should be mentioned somewhere.

pmiazga changed the task status from Open to In Progress. Mon, May 18, 2:19 PM
pmiazga claimed this task.

Change #1287028 merged by jenkins-bot:

[mediawiki/extensions/WikimediaCustomizations@master] Attribution: Add provenance parameter to Attribution URLS

https://gerrit.wikimedia.org/r/1287028

pmiazga updated the task description.

Marking as resolved as part of sprint close out.