[Share Highlights] Support CJK text better for fragment for zh pilot
Closed, Resolved · Public · 3 Estimated Story Points

Assigned To

Authored By

	bvibber
	Feb 10 2026, 8:08 PM

Project Tags

Referenced Files

	F73861345: Screenshot 2026-03-27 at 2.36.17 PM.png
	Mar 27 2026, 6:55 PM

Subscribers

Description

The initial implementation of long text fragmentation in textFragment.js tries to take the first 5 words and the last 5 words and remove the middle; this falls down in a couple places:

If there are in fact fewer than 10 words, you get duplicates.
Languages like Chinese and Japanese don't put spaces between words, so a long string gets treated as a single word, making that situation more likely.
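
To make the two failure modes concrete, here is a minimal sketch of the first-5/last-5 approach as described above. It is a reconstruction for illustration only, not the actual code in textFragment.js; the function name `naiveFragment` and the whitespace split are assumptions.

```javascript
// Hypothetical reconstruction of the approach described in this task;
// NOT the real textFragment.js implementation.
function naiveFragment( text, count = 5 ) {
	// Splitting on whitespace yields a single "word" for unspaced CJK text
	const words = text.trim().split( /\s+/ );
	return {
		// First `count` words
		textStart: words.slice( 0, count ).join( ' ' ),
		// Last `count` words; overlaps textStart when there are fewer than 2 * count words
		textEnd: words.slice( -count ).join( ' ' )
	};
}
```

With seven space-separated words, `textStart` and `textEnd` share three words; with an unspaced Chinese sentence, both come back as the entire string.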

For instance sharing a link to an entire paragraph of Chinese text from the zh "Paris" article gave me a URL over 5000 characters long (tweaked to point at zhwiki instead of my localhost):

Consider treating each Han character as an entire word (which may help with Chinese and Japanese specifically), or cropping the total length of each side of the pair.
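
One possible shape for the "each Han character is a word" idea, as a rough sketch. The names `splitTokens` and `fragmentPair`, the default of 5 tokens per side, and the choice to omit `textEnd` for short selections are all assumptions made here for illustration; the merged patch may well do something different.

```javascript
// Sketch only: tokenization where every Han character counts as its own word.
// Function names and the `count` default are hypothetical.
function splitTokens( text ) {
	// A token is either one Han character or a run of non-space, non-Han characters.
	// \p{Script=Han} needs the `u` flag.
	const pattern = /\p{Script=Han}|[^\s\p{Script=Han}]+/gu;
	return Array.from( text.matchAll( pattern ), ( m ) => ( {
		start: m.index,
		end: m.index + m[ 0 ].length
	} ) );
}

function fragmentPair( text, count = 5 ) {
	const tokens = splitTokens( text );
	if ( tokens.length <= count * 2 ) {
		// Too short to split without overlap: use the whole text as a single fragment
		return { textStart: text.trim() };
	}
	const head = tokens[ count - 1 ];
	const tail = tokens[ tokens.length - count ];
	return {
		// Slice the original string so spacing and punctuation are preserved exactly
		textStart: text.slice( tokens[ 0 ].start, head.end ),
		textEnd: text.slice( tail.start, tokens[ tokens.length - 1 ].end )
	};
}
```

Because the start and end are only emitted when there are more than `2 * count` tokens, the two sides cannot share a token. Note that punctuation such as "，" and "。" is not in the Han script, so it forms its own token in this sketch.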

Acceptance criteria:

Never repeat words on both sides of the "long string" start/end pair in a Chinese string
If cropping within long strings, do not break Unicode surrogate pairs
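
For the second criterion, a minimal sketch of length-based cropping that backs off by one UTF-16 code unit when the cut would land inside a surrogate pair. The helper names `cropStart` and `cropEnd` and the `maxUnits` parameter are hypothetical and do not come from the patch.

```javascript
// Sketch only: crop by UTF-16 code units without splitting a surrogate pair.
// Helper names and the maxUnits parameter are hypothetical.
function cropStart( text, maxUnits ) {
	if ( text.length <= maxUnits ) {
		return text;
	}
	let end = maxUnits;
	const last = text.charCodeAt( end - 1 );
	// A trailing high surrogate would be separated from its low surrogate: drop it
	if ( last >= 0xD800 && last <= 0xDBFF ) {
		end--;
	}
	return text.slice( 0, end );
}

function cropEnd( text, maxUnits ) {
	if ( text.length <= maxUnits ) {
		return text;
	}
	let start = text.length - maxUnits;
	const first = text.charCodeAt( start );
	// A leading low surrogate would be separated from its high surrogate: skip it
	if ( first >= 0xDC00 && first <= 0xDFFF ) {
		start++;
	}
	return text.slice( start );
}
```

This matters for Han characters outside the Basic Multilingual Plane (CJK Extension B and later), which occupy two code units each in a JavaScript string.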

Details

Related Changes in Gerrit:

	Subject	Repo	Branch	Lines +/-
	[ShareHighlight] Text fragment generation tests and fixes	mediawiki/extensions/ReaderExperiments	master	+152 -8


Related Objects

		Status	Subtype	Assigned	Task
		Open		None	T416432 [Epic] Share Card experiment
		Resolved		KSarabia-WMF	T417067 [Share Highlights] Support CJK text better for fragment for zh pilot

Event Timeline

bvibber created this task. Feb 10 2026, 8:08 PM

Restricted Application added subscribers: Dragoniez, Stang, Aklapper. Feb 10 2026, 8:08 PM

bvibber renamed this task from [Share Highlights] Support CJK text better for ellipsis for zh pilot to [Share Highlights] Support CJK text better for fragment for zh pilot. Feb 10 2026, 8:08 PM

bvibber added a parent task: T416432: [Epic] Share Card experiment.

egardner mentioned this in T416434: [Share Highlights] Set up codebase for Share Highlights. Feb 10 2026, 9:18 PM

Stang unsubscribed. Feb 10 2026, 11:43 PM

Shizhao added a project: Chinese-Sites. Feb 11 2026, 2:59 AM

egardner mentioned this in T417105: [Share Highlights] Text fragment URL generation. Feb 11 2026, 4:42 AM

egardner moved this task from Incoming/Inbox to Needs Refinement on the Reader Growth Team board. Feb 11 2026, 4:49 AM

SherryYang-WMF triaged this task as Low priority. Feb 11 2026, 5:30 PM

SherryYang-WMF raised the priority of this task from Low to Medium.

Shizhao moved this task from Backlog to Research on the Chinese-Sites board. Feb 24 2026, 4:01 AM

egardner mentioned this in T416432: [Epic] Share Card experiment. Mar 3 2026, 7:53 PM

egardner moved this task from Needs Refinement to Backlog on the Reader Growth Team board. Mar 3 2026, 10:01 PM

SherryYang-WMF moved this task from Backlog to Needs Refinement on the Reader Growth Team board. Mar 10 2026, 10:42 PM

HSwan-WMF set the point value for this task to 3. Mar 12 2026, 4:36 PM

SherryYang-WMF moved this task from Needs Refinement to Ready on the Reader Growth Team board. Mar 17 2026, 12:00 AM

Change #1259248 had a related patch set uploaded (by Bvibber; author: Bvibber):

[mediawiki/extensions/ReaderExperiments@master] [ShareHighlight] Text fragment generation tests and fixes

https://gerrit.wikimedia.org/r/1259248

gerritbot added a project: Patch-For-Review. Mar 24 2026, 7:44 PM

Change #1259248 merged by jenkins-bot:

[mediawiki/extensions/ReaderExperiments@master] [ShareHighlight] Text fragment generation tests and fixes

https://gerrit.wikimedia.org/r/1259248

mfossati edited projects, added: Reader Growth Team (Sprint 5 (Mar 18 - Mar 31) Q3 25/26); removed: Reader Growth Team. Mar 25 2026, 7:46 PM

mfossati moved this task from Committed to QA on the Reader Growth Team (Sprint 5 (Mar 18 - Mar 31) Q3 25/26) board.

ReleaseTaggerBot added a project: MW-1.46-notes (1.46.0-wmf.22; 2026-03-31). Mar 25 2026, 8:00 PM

Maintenance_bot removed a project: Patch-For-Review. Mar 25 2026, 8:31 PM

Checked on enwiki beta on 第85屆奧斯卡金像獎 (the article was imported from zhwiki).

The resulting url from the shared highlighted text (see the screenshot below) - https://en.wikipedia.beta.wmcloud.org/wiki/%E7%AC%AC85%E5%B1%86%E5%A5%A7%E6%96%AF%E5%8D%A1%E9%87%91%E5%83%8F%E7%8D%8E#:~:text=2012%E5%B9%B412%E6%9C%881%E6%97%A5%EF%BC%8C%E5%AD%A6%E9%99%A2%E5%9C%A8%E5%A5%BD%E8%8E%B1%E5%9D%9E,%E9%A2%81%E5%8F%91%E4%BA%86%E5%A5%A5%E6%96%AF%E5%8D%A1%E7%A7%91%E6%8A%80%E6%88%90%E6%9E%9C%E5%A5%96%5B7%5D%E3%80%82

Screenshot 2026-03-27 at 2.36.17 PM.png (2,378×1,274 px, 563 KB)

Compare with the link for the same highlighted text on zhwiki wmf.21:

https://zh.wikipedia.org/wiki/%E7%AC%AC85%E5%B1%86%E5%A5%A7%E6%96%AF%E5%8D%A1%E9%87%91%E5%83%8F%E7%8D%8E#:~:text=2012%E5%B9%B412%E6%9C%881%E6%97%A5%EF%BC%8C%E5%AD%A6%E9%99%A2%E5%9C%A8%E5%A5%BD%E8%8E%B1%E5%9D%9E%E9%AB%98%E5%9C%B0%E4%B8%AD%E5%BF%83%E7%9A%84%E5%AE%B4%E4%BC%9A%E5%A4%A7%E7%A4%BC%E5%A0%82%E4%B8%BE%E8%A1%8C%E4%BA%86%E7%AC%AC%E5%9B%9B%E5%B1%8A%E5%B9%B4%E5%BA%A6%E5%AD%A6%E9%99%A2%E4%B8%BB%E5%B8%AD%E5%A5%96%E9%A2%81%E5%A5%96%E6%99%9A%E4%BC%9A%5B8%5D%E3%80%822013%E5%B9%B42%E6%9C%889%E6%97%A5%EF%BC%8C%E5%85%8B%E9%87%8C%E6%96%AF%C2%B7%E6%BD%98%E6%81%A9%E5%92%8C%E4%BD%90%E4%BC%8A%C2%B7%E7%B4%A2%E5%B0%94%E8%BE%BE%E5%A8%9C%E4%B8%80%E8%B5%B7%E5%9C%A8%E6%AF%94%E4%BD%9B%E5%88%A9%E5%B1%B1%E7%9A%84%E6%AF%94%E4%BD%9B%E5%88%A9%E5%B1%B1%E9%85%92%E5%BA%97%E4%B8%BB%E6%8C%81%E9%A2%81%E5%8F%91%E4%BA%86%E5%A5%A5%E6%96%AF%E5%8D%A1%E7%A7%91%E6%8A%80%E6%88%90%E6%9E%9C%E5%A5%96%5B9%5D%E3%80%82,2012%E5%B9%B412%E6%9C%881%E6%97%A5%EF%BC%8C%E5%AD%A6%E9%99%A2%E5%9C%A8%E5%A5%BD%E8%8E%B1%E5%9D%9E%E9%AB%98%E5%9C%B0%E4%B8%AD%E5%BF%83%E7%9A%84%E5%AE%B4%E4%BC%9A%E5%A4%A7%E7%A4%BC%E5%A0%82%E4%B8%BE%E8%A1%8C%E4%BA%86%E7%AC%AC%E5%9B%9B%E5%B1%8A%E5%B9%B4%E5%BA%A6%E5%AD%A6%E9%99%A2%E4%B8%BB%E5%B8%AD%E5%A5%96%E9%A2%81%E5%A5%96%E6%99%9A%E4%BC%9A%5B8%5D%E3%80%822013%E5%B9%B42%E6%9C%889%E6%97%A5%EF%BC%8C%E5%85%8B%E9%87%8C%E6%96%AF%C2%B7%E6%BD%98%E6%81%A9%E5%92%8C%E4%BD%90%E4%BC%8A%C2%B7%E7%B4%A2%E5%B0%94%E8%BE%BE%E5%A8%9C%E4%B8%80%E8%B5%B7%E5%9C%A8%E6%AF%94%E4%BD%9B%E5%88%A9%E5%B1%B1%E7%9A%84%E6%AF%94%E4%BD%9B%E5%88%A9%E5%B1%B1%E9%85%92%E5%BA%97%E4%B8%BB%E6%8C%81%E9%A2%81%E5%8F%91%E4%BA%86%E5%A5%A5%E6%96%AF%E5%8D%A1%E7%A7%91%E6%8A%80%E6%88%90%E6%9E%9C%E5%A5%96%5B9%5D%E3%80%82
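
When comparing two such links, it can help to decode the text directive rather than read the percent-encoded form. The helper below is a rough sketch for that purpose: `parseTextDirective` is a hypothetical name, and it only handles the plain `text=start,end` form (the optional `prefix-` and `-suffix` parts of the text fragment syntax are not handled).

```javascript
// Sketch only: decode the start/end parts of a "#:~:text=start,end" directive.
// parseTextDirective is a hypothetical helper; prefix-/-suffix parts are not handled.
function parseTextDirective( url ) {
	const marker = ':~:text=';
	const at = url.indexOf( marker );
	if ( at === -1 ) {
		return null;
	}
	// Further directives, if any, are separated by "&"
	const raw = url.slice( at + marker.length ).split( '&' )[ 0 ];
	// Literal commas in the text are percent-encoded, so "," only separates parts
	const parts = raw.split( ',' ).map( ( part ) => decodeURIComponent( part ) );
	return {
		textStart: parts[ 0 ],
		textEnd: parts[ 1 ],
		// Decoded length in UTF-16 code units, for a quick size comparison
		decodedLength: parts.join( '' ).length
	};
}
```

Running this over the two links above gives a quick check on whether `textStart` and `textEnd` are identical and how much text each link carries.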

lwatson added a project: ReaderExperiments-ShareHighlight. Mar 31 2026, 6:19 PM

LGTM

Stang moved this task from Research to Closed on the Chinese-Sites board. Mar 31 2026, 8:31 PM
