
RFC: Shadow namespaces
Open, Stalled, High, Public

Description

Implement shadow namespaces: if a page doesn't exist on the local wiki, it is transparently fetched from a remote wiki.

For example, if Template:Hi does not exist on wiki A but does exist on the linked wiki B, then adding {{Hi}} to a page on wiki A will show Template:Hi from wiki B.

This is just like how InstantCommons and foreign file repos currently work (if [[File:Example.png]] does not exist on this wiki but exists on Wikimedia Commons, the Commons image is retrieved and used).
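The fallback described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Wiki`, `ResolvedPage` and `resolve_page` are hypothetical names rather than MediaWiki APIs, and the sketch assumes the local page always wins when it exists, as in the Template:Hi example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Wiki:
    """Hypothetical stand-in for a wiki: a name plus a title -> wikitext map."""
    name: str
    pages: Dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class ResolvedPage:
    title: str
    text: str
    source_wiki: str  # which wiki the content actually came from
    is_local: bool    # lets the UI mark shadowed pages for readers


def resolve_page(title: str, local: Wiki, remote: Wiki) -> Optional[ResolvedPage]:
    """Return the local page if it exists, else the remote one, else None."""
    if title in local.pages:
        return ResolvedPage(title, local.pages[title], local.name, True)
    if title in remote.pages:
        return ResolvedPage(title, remote.pages[title], remote.name, False)
    return None


wiki_a = Wiki("wiki A", {"Template:Local": "local text"})
wiki_b = Wiki("wiki B", {"Template:Hi": "Hello!", "Template:Local": "remote text"})

hi = resolve_page("Template:Hi", wiki_a, wiki_b)
local_first = resolve_page("Template:Local", wiki_a, wiki_b)
missing = resolve_page("Template:Missing", wiki_a, wiki_b)
```

Carrying `source_wiki` and `is_local` on the result is one way to let the interface tell readers where a page came from.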

For more details see https://www.mediawiki.org/wiki/Requests_for_comment/Shadow_namespaces


Event Timeline


In any case, local users need to know if the available object is local or central. Perhaps from the style of the title and from a dedicated sign.

If the central version has priority, the local user cannot delete or rename this local object, and cannot choose which object to use. The local user also cannot rename it and re-use it under another version name. That also needs an option to manage all cases.

In any case, local users need to know if the available object is local or central. Perhaps from the style of the title and from a dedicated sign.

Sure. We already have user-visible notices for files that are coming from a foreign file repository such as Wikimedia Commons. We also already have user-visible notices for global user pages coming from Meta-Wiki.

If the central version has priority, the local user cannot delete or rename this local object, and cannot choose which object to use. The local user also cannot rename it and re-use it under another version name. That also needs an option to manage all cases.

While I agree with what you're saying, I think this is essentially the cost of having a centralized repository. You give up some local control over an item in exchange for having fewer versions/editions of a similar global item. (I'm reminded of Brexit!) We make this trade-off in a couple of places already, such as in the File and User namespaces. We want to generalize the functionality to make it easier to extend to other namespaces such as Help, as I understand it.

OK for central modules which have the priority and replace local ones.
I see 2 ways to retrieve local versions:

  • The replacement is registered in the history of the object page. Then any user can find and reuse the object as it was before the central replacement.
  • Admins could have a right for this case.

@ahroni has an unconference session scheduled for today. Please sign up.

I added a session to the 2017 Dev Summit, which is closely related to this task. Hope to see some of this task's subscribers there. Sorry about the super-short notice, I'll do my best to make it useful nevertheless.

I started an Etherpad for the session: https://etherpad.wikimedia.org/p/devsummit17-xwiki-templates

People who understand the technology inside Shadow namespaces are welcome to add to it.

On the enwiki, templates are created, put in use, merged and deleted all the time. It may be more complex to manage them on a shared repository for other projects, but "no real way" is simply wrong.

On the enwiki, templates are created, put in use, merged and deleted all the time.

Of course, and the same is true for each single wiki.

It may be more complex to manage them on a shared repository for other projects, but "no real way" is simply wrong.

"No real way" refers to sharing templates between wikis. And there is no real way for that. There is currently no shared repository, and there's no way to write a template on one wiki and use it on another, except importing it and adapting it manually, which creates a fork and doesn't get automatic updates from the source template.

Note-taker(s) of this session: Follow the instructions here: https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit/2017/Session_Guidelines#NOTE-TAKER.28S.29 After the session, DO NOT FORGET to copy the relevant notes and summary into a new wiki page following the template here: https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit/2017/Your_Session and also link this from the All Session Notes page: https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit/2017/All_Session_Notes. The EtherPad links are also now linked from the Schedule page (https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit/2017/Schedule) for you!

Thanks a lot for this, @Legoktm. I plan to follow up further.

Does it have to be "templates"? What if I want to transclude a poem from Wikisource into a Wikipedia article about the poem? Or a quotation into a Wiktionary entry, to show how a word gets used?

More information: https://www.mediawiki.org/w/index.php?title=Topic:T6ukmc98uuun7ydc (especially @Billinghurst's comment)

What's the current status of the Shadow namespace work? Has it gotten stalled?

This RFC seems to be stalled. If there is currently no interest in driving this further, it should for now be removed from the RFC work board.
If there is interest in continuing the RFC process, please let us (TechCom) know who will be working on this RFC, and who commits to implementing it if approved, and in what time frame.

@Fjalapeno "shadow namespaces" seems like a bit of infrastructure that audiences may be asking for. Is this a topic for ATWG?

@daniel potentially… there is definitely a desire to be able to reliably parse information from main pages, news portals, featured article pages, etc…

Global templates may help this. But so may storing data via MCR.

I'd say that the topic of "making content machine readable and easier to parse" is a high level goal worthy of discussion.

@Fjalapeno Shadow namespaces don't really do anything for easier parsing, they are unrelated to MCR. This is about cross-wiki access to page content. Global templates would be one application of shadow namespaces. Global Lua modules and global Gadgets would be others.

Reading would like more mobile-friendly and possibly more machine-readable pages, and since most complex markup and most data that machines would care about is in templates, that would mainly be achievable by template standardization. (Putting machine-readable data into an MCR slot instead of the current mark-it-up-in-HTML-then-parse-out approach would be cool, but seems quite far away, and in any case would still happen through templates in some form.) Cross-wiki template standardization and the creation of a specialist class of users focusing on that (much like the Commons community for images or Wikidatans for data) would increase the chance of that happening (in the case of most small wikis, increase it from zero) and that depends on shadow namespaces. So there is definitely a product need IMO.

Also, global gadgets might or might not be implemented via shadow namespaces, and since frontend developers are negatively affected by broken / misbehaving gadgets that interfere with the code they are working on, anything that would improve average gadget quality is also a win.

That said, I thought the main driver for this was CommTech (with global gadgets & templates the #3 wish in 2015 and global gadgets the #1 in 2016)?

@Fjalapeno Shadow namespaces don't really do anything for easier parsing,

Sorry… probably need some context: Global templates, made possible by shadow namespaces, make parsing easier because they can be used to enforce markup consistency across wikis. So if the same templates are used across all news portals on all wikis, then we can use them to apply the same markup to the content. This is of course brittle, but it is a technique used by several services maintained by Readers - they parse content from main pages and news portals by looking at HTML tags. This makes it hard to roll out to every project, and we add new languages on a case by case basis. So this is the case I was talking about.

they are unrelated to MCR.

Yes, it is unrelated to MCR, but MCR is a better way to serve the use case I am talking about. Because at the end of the day, parsing this content is brittle, even with global templates. What would be better is if pages could store the data that is needed in a structured way so that the services do not have to parse the page HTML to get the most recent news story. Instead they can just look in an MCR slot to get data that represents the news story.

Hopefully this makes more sense.

@Fjalapeno Ah, now I see the connection. So Reading's main interest in this is in standardizing templates, to make it easier to parse information out of wikitext and/or HTML? It seems to me T114251: [RFC] Magic Infobox implementation would be more relevant for that use case.

Most of the information is not infobox related (e.g. news entries or DYK entries on the main page). IMO the nice solution would be T156876: Structured data side channel for wikitext. But in some ways it's always going to rely on templates.

There is also the whole responsive templates thing (use TemplateStyles to add responsive CSS to templates which reacts meaningfully to screen width, instead of inline desktop-only CSS) which is an independent issue.

@brion: Should this task still be assigned to you? I think not, given that the concept of "shepherd" has gone away now?

Here's how I remember this task: I believe @Legoktm started working toward making a general shadow namespaces implementation so that it could be used with file description pages from a central wiki like Commons and global user pages from a central wiki like Meta-Wiki. When we discussed this during an IRC meeting, Tim suggested that templates and Scribunto modules would be a different beast than file and user pages (and maybe help pages?). Where we left off was that we would do the "easier" part first (File, User, and Help pages) and then figure out how to deal with the more complex part (Template and Module pages).

I think any work toward resolving this task has stalled now.

Removing my claim per note above, we've dropped the 'shepherding' notion and this is not under active work on my end.

daniel changed the task status from Open to Stalled. Mar 28 2019, 10:57 AM
daniel moved this task from Under discussion to Old on the TechCom-RFC board.

@MZMcBride said:

When we discussed this during an IRC meeting, Tim suggested that templates and Scribunto modules would be a different beast than file and user pages (and maybe help pages?). Where we left off was that we would do the "easier" part first (File, User, and Help pages) and then figure out how to deal with the more complex part (Template and Module pages).

I agree with this analysis: sharing any "active" content (like templates, modules, user scripts, gadgets, etc) across wikis is going to be more tricky. For one thing, we presently have no ability to track such cross-wiki usage.

Making "plain" pages work across wikis is much easier, especially if we go with the "remote parsing" option as described on the wiki page, where the content is parsed on its home wiki, and the HTML is then shown on other wikis.
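A minimal sketch of that "remote parsing" flow, assuming a hypothetical `HomeWiki` object: the home wiki turns its own page into HTML, and the consuming wiki only wraps that HTML in a notice without parsing anything itself. The toy "parser" here merely escapes the text; none of these names are MediaWiki APIs.

```python
import html
from typing import Dict, Optional


class HomeWiki:
    """Hypothetical home wiki that owns the page and does the parsing."""

    def __init__(self, name: str, pages: Dict[str, str]) -> None:
        self.name = name
        self.pages = pages

    def parse_to_html(self, title: str) -> Optional[str]:
        # Toy "parser": a real wiki would run its own parser with its own
        # templates and configuration; here the text is only escaped and wrapped.
        text = self.pages.get(title)
        if text is None:
            return None
        return "<p>" + html.escape(text) + "</p>"


def view_shadow_page(title: str, home: HomeWiki) -> Optional[str]:
    """What a consuming wiki would display: the home wiki's HTML plus a notice."""
    body = home.parse_to_html(title)
    if body is None:
        return None
    notice = '<div class="shadow-notice">Content from ' + html.escape(home.name) + "</div>"
    return notice + body


meta = HomeWiki("Meta-Wiki", {"User:Example": "Hi <b>there</b>"})
shown = view_shadow_page("User:Example", meta)
not_found = view_shadow_page("User:Nobody", meta)
```

Because the consuming wiki never sees wikitext in this model, it needs none of the home wiki's templates or modules, which is part of why "plain" pages are the easier case.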

On the other hand - what use cases do we have for that beyond user pages and file description pages, which both already work?

I think any work toward resolving this task has stalled now.

Yes, this seems to be stalled. Moving it to the backlog for now. Anyone should feel free to drop it into TechCom's inbox again if they feel they want to drive this to completion.

I agree with this analysis: sharing any "active" content (like templates, modules, user scripts, gadgets, etc) across wikis is going to be more tricky.

Not only more tricky but potentially requiring a different approach. For code you want code reviews, CI infrastructure and so on. Reinventing all that in MediaWiki might not be the best approach.

(Templates are more borderline - not quite code, but not quite content either.)

For one thing, we presently have no ability to track such cross-wiki usage.

We do for images, via the GlobalUsage extension. That approach could probably be extended to other things.
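As a rough illustration of what extending that kind of tracking to other shared pages could look like, here is a hypothetical in-memory index. It is not GlobalUsage's actual schema or API; `CrossWikiUsageIndex` and its methods are made up for this sketch, and the wiki names and titles are arbitrary examples.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set, Tuple

Usage = Tuple[str, str]  # (using wiki, using page)


class CrossWikiUsageIndex:
    """Hypothetical index: shared page title -> pages on other wikis using it."""

    def __init__(self) -> None:
        self._users: DefaultDict[str, Set[Usage]] = defaultdict(set)

    def record(self, shared_title: str, wiki: str, page: str) -> None:
        # Called when a page on `wiki` starts using the shared page.
        self._users[shared_title].add((wiki, page))

    def forget(self, shared_title: str, wiki: str, page: str) -> None:
        # Called when that usage is removed again.
        self._users[shared_title].discard((wiki, page))

    def used_by(self, shared_title: str) -> List[Usage]:
        return sorted(self._users.get(shared_title, set()))


index = CrossWikiUsageIndex()
index.record("Module:Example", "enwiki", "Talk:Foo")
index.record("Module:Example", "dewiki", "Benutzer:Bar")
index.record("Module:Example", "enwiki", "Talk:Foo")  # duplicate, ignored
index.forget("Module:Example", "dewiki", "Benutzer:Bar")
remaining = index.used_by("Module:Example")
```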

In T91162#5065385, @Tgr wrote:

For one thing, we presently have no ability to track such cross-wiki usage.

We do for images, via the GlobalUsage extension. That approach could probably be extended to other things.

Indeed. That feature is also (ab)used for user scripts (see T35355):
https://commons.wikimedia.org/wiki/Special:WhatLinksHere/Template:GlobalJsUsage

Krinkle moved this task from Old to P1: Define on the TechCom-RFC board.

I'm reviewing the current situation for T425752: Produnto repository viewer: Package namespace.

NS_MEDIAWIKI is a use case which is missing from previous discussions. GlobalUserPage and Commons NS_FILE operate by overriding the view UI, whereas NS_MEDIAWIKI also provides access to the source wikitext via transclusion and edit page preload.

GlobalUserPage and ImagePage use remote rendering, whereas NS_MEDIAWIKI uses local rendering.

I think for T425752 I want local rendering, and there might be a use case for transclusion, but probably not edit preload for now. So T425752 is more like NS_MEDIAWIKI than ImagePage, which is unfortunate since NS_MEDIAWIKI is a completely unrefactored hack, distributed throughout core.

What does a shadow content provider provide? What is the data model?

  • For GlobalUserPage, data similar to a ParserOutput. Commons NS_FILE provides HTML which could be wrapped in a ParserOutput.
  • For Commons NS_FILE, a header.
  • For GlobalUserPage, a footer.
  • For GlobalUserPage and Commons NS_FILE, a rel=canonical URL.
  • GlobalUserPage and Commons NS_FILE provide isLocal() and getWikiDisplayName() to the skin.
  • NS_MEDIAWIKI provides source text to PreloadedContentBuilder, EditPage::showDiff and Parser::statelessFetchTemplate().
  • NS_MEDIAWIKI provides a fake RevisionRecord to REST PageContentHelper.
  • GlobalUserPage and NS_MEDIAWIKI provide existence to various consumers. Commons NS_FILE also overrides existence, although it's file existence, not shadow content existence.

I imagine there would be a service which, given a PageReference, would return an object encapsulating all these things.
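One possible shape for such a service, purely as a sketch: every class, field and method name below is hypothetical and is not taken from MediaWiki core or from any patch. It only bundles the items from the list above into one object that is looked up by a page reference. The namespace ids 2 and 8 are MediaWiki's standard values for NS_USER and NS_MEDIAWIKI; the URL and page contents are illustrative.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

NS_USER = 2       # MediaWiki's namespace id for user pages
NS_MEDIAWIKI = 8  # MediaWiki's namespace id for interface messages

# Stand-in for a PageReference: (namespace id, title without prefix).
PageRef = Tuple[int, str]


@dataclass(frozen=True)
class ShadowContent:
    """Hypothetical bundle of what the existing providers hand out."""
    wiki_display_name: str               # shown by the skin
    canonical_url: Optional[str] = None  # rel=canonical target, if any
    rendered_html: Optional[str] = None  # remote rendering (user/file pages)
    source_text: Optional[str] = None    # local rendering, transclusion, preload
    header_html: str = ""
    footer_html: str = ""
    is_local: bool = False


class ShadowContentService:
    """Hypothetical lookup service: PageRef in, ShadowContent (or None) out."""

    def __init__(self) -> None:
        self._content: Dict[PageRef, ShadowContent] = {}

    def register(self, ref: PageRef, content: ShadowContent) -> None:
        self._content[ref] = content

    def get(self, ref: PageRef) -> Optional[ShadowContent]:
        return self._content.get(ref)

    def exists(self, ref: PageRef) -> bool:
        return ref in self._content


service = ShadowContentService()
service.register(
    (NS_USER, "Example"),
    ShadowContent(
        wiki_display_name="Meta-Wiki",
        canonical_url="https://meta.wikimedia.org/wiki/User:Example",
        rendered_html="<p>Hello</p>",
        footer_html="<p>This page is shown from Meta-Wiki.</p>",
    ),
)
service.register(
    (NS_MEDIAWIKI, "Sidebar"),
    ShadowContent(wiki_display_name="software default", source_text="* navigation"),
)
user_page = service.get((NS_USER, "Example"))
message = service.get((NS_MEDIAWIKI, "Sidebar"))
```

Keeping `rendered_html` and `source_text` as separate optional fields mirrors the split between remote rendering and local rendering noted above; a provider would fill in whichever it supports.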

ImagePage is awkward because the File is already loaded, and it has a special way of loading it, so a shadow content provider for Commons really wants the File, not the PageReference. The implementation is mostly in File. Arguably, shadow content is not an easily separable module of ImagePage or File.

What would be nice to have?

  • Batch existence is desired (T88644), and is not implemented by any of the three.
  • Maybe non-main slots (MCR) could be shadowed. Although, to be clear, I'm not suggesting that this should be a layer behind RevisionStore. I'm suggesting that the shadow content service acts as a second source for content alongside RevisionStore. The UI layer generally needs to be aware of it.
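Batch existence could look something like the sketch below, which groups the requested pages by namespace and asks each provider once per batch rather than once per page. `DictProvider` and `batch_exists` are hypothetical names, and the `calls` counter exists only to show how many lookups were made.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, Iterable, Set, Tuple

PageRef = Tuple[int, str]  # (namespace id, title); stand-in for a PageReference


class DictProvider:
    """Hypothetical provider that answers existence for many titles in one call."""

    def __init__(self, titles: Iterable[str]) -> None:
        self._titles = set(titles)
        self.calls = 0  # counts round trips, for illustration only

    def existing(self, titles: Set[str]) -> Set[str]:
        self.calls += 1
        return self._titles & titles


def batch_exists(refs: Iterable[PageRef],
                 providers: Dict[int, DictProvider]) -> Dict[PageRef, bool]:
    """Check many pages with one provider call per namespace."""
    by_ns: DefaultDict[int, Set[str]] = defaultdict(set)
    for ns, title in refs:
        by_ns[ns].add(title)

    result: Dict[PageRef, bool] = {}
    for ns, titles in by_ns.items():
        provider = providers.get(ns)
        found = provider.existing(titles) if provider is not None else set()
        for title in titles:
            result[(ns, title)] = title in found
    return result


user_provider = DictProvider(["Alice", "Bob"])
answers = batch_exists(
    [(2, "Alice"), (2, "Bob"), (2, "Carol"), (10, "Hi")],
    {2: user_provider},
)
```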

Caching

  • GlobalUserPage performs cache invalidation in response to various hooks by pushing a job for every wiki the user is attached on. It doesn't override the purge action.
  • ForeignDBFile::getDescriptionText() fetches the remote page_touched and uses it as part of the memcached key. I don't think GlobalUsage purges file description pages. The purge action purges the description cache.
  • NS_MEDIAWIKI has no caching besides MessageCache and the CDN. It overrides action=purge, purging MessageCache.
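The second strategy above, where the remote page_touched value is part of the cache key, can be illustrated with a small sketch. The key format and the `DescriptionCache` class are hypothetical and are not what ForeignDBFile actually uses; the point is only that a changed timestamp yields a new key, so stale entries are never read and no explicit purge is needed.

```python
from typing import Callable, Dict


def description_cache_key(wiki: str, title: str, touched: str) -> str:
    # Hypothetical key layout; including `touched` makes old entries unreachable
    # as soon as the remote page changes.
    return f"shadow-description:{wiki}:{title}:{touched}"


class DescriptionCache:
    """Hypothetical cache keyed by wiki, title and remote touched timestamp."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self._render = render
        self._store: Dict[str, str] = {}
        self.renders = 0  # counts cache misses, for illustration only

    def get(self, wiki: str, title: str, touched: str) -> str:
        key = description_cache_key(wiki, title, touched)
        if key not in self._store:
            self.renders += 1
            self._store[key] = self._render(title)
        return self._store[key]


cache = DescriptionCache(lambda title: f"<p>Description of {title}</p>")
first = cache.get("commonswiki", "File:Example.png", "20190328105700")
second = cache.get("commonswiki", "File:Example.png", "20190328105700")  # hit
after_edit = cache.get("commonswiki", "File:Example.png", "20190401000000")  # miss
```

The trade-off is that every read still has to fetch the remote timestamp, whereas the job-based approach pays the cost at edit time, once per affected wiki.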

Change #1289735 had a related patch set uploaded (by Tim Starling; author: Tim Starling):

[mediawiki/core@master] Introduce ShadowPage concept

https://gerrit.wikimedia.org/r/1289735

I'm reviewing the current situation for T425752: Produnto repository viewer: Package namespace

NS_MEDIAWIKI is a use case which is missing from previous discussions. GlobalUserPage and Commons NS_FILE operate by overriding the view UI, whereas NS_MEDIAWIKI also provides access to the source wikitext via transclusion and edit page preload.

Something to note: cross-wiki NS_FILE allows showing content from pages in a different database cluster (e.g. non-WMF wikis using files from Wikimedia Commons), and MediaWiki natively supports serving files (including file description pages) from multiple repos together (e.g. wikis on Miraheze can use files from Wikimedia Commons, Miraheze Commons, their own local wiki, plus any other wikis a site bureaucrat has defined). GlobalUserPage does not yet support that (T237770: GlobalUserPage should not require the global wiki to be in the same database cluster). If we have a central Lua package repo, using modules from Wikimedia is clearly a valid use case for third-party wikis (potentially in addition to their own Produnto installation).

Also note the current implementation of GlobalUserPage is buggy, e.g. see T89916 and T90978.

Change #1289735 merged by jenkins-bot:

[mediawiki/core@master] Introduce ShadowPage concept

https://gerrit.wikimedia.org/r/1289735

Change #1293827 had a related patch set uploaded (by Tim Starling; author: Tim Starling):

[mediawiki/core@master] ShadowPage: Allow ShadowPage providers to be registered by extensions

https://gerrit.wikimedia.org/r/1293827

Change #1293827 merged by jenkins-bot:

[mediawiki/core@master] ShadowPage: Allow ShadowPage providers to be registered by extensions

https://gerrit.wikimedia.org/r/1293827