CDN/Headers
The Wikimedia CDN uses headers for security, analytical, and functional purposes. Some headers are sent to clients and some are only seen through internal systems.
X-Analytics
An HTTP header used for measurement purposes, including in cache log format and the webrequest data stream. A MediaWiki extension implements the capability of extracting this information into the data lake.
Generally, values are added to the header on the server side; the only keys accepted from the client side are preview and pageview.
Format
The X-Analytics header is formatted as a list of key=value pairs separated by semicolons, like mf-m=b or zero=123-45;mf-m=b. If a key occurs more than once, it is undefined which one takes precedence.
The special value - must be interpreted as the empty string.
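The format rules above can be illustrated with a minimal parsing sketch. The function name `parse_x_analytics` is hypothetical and not part of any Wikimedia codebase. Because precedence for repeated keys is undefined, this sketch simply keeps the last occurrence; skipping entries without an `=` is likewise an assumption.

```python
def parse_x_analytics(header: str) -> dict[str, str]:
    """Parse an X-Analytics header value into a dict of key/value strings."""
    result: dict[str, str] = {}
    for pair in header.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        key, sep, value = pair.partition("=")
        if not sep:
            # Assumption: entries without "=" are malformed and are skipped.
            continue
        # The special value "-" must be interpreted as the empty string.
        result[key] = "" if value == "-" else value
    return result


print(parse_x_analytics("zero=123-45;mf-m=b"))
```

Note that values may themselves contain commas or hyphens (for example `mf-m=b,amc`), so only `;` and the first `=` are treated as delimiters.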
Keys
| Key | Value | Origin | Since | Until | Team | Contact | Use case |
|---|---|---|---|---|---|---|---|
| mf-m | b, amc, b,amc, or not set | appserver | ? | Current | Readers Web | Phuedx | If set, then the value b indicates that the user is opted into the beta mode of the mobile site (mf-m=b), the value amc indicates Advanced Mobile Contributions (mf-m=amc), and b,amc indicates both (mf-m=b,amc). See MobileContext.php. |
| proxy | Proxy name, e.g. Opera | varnish | ? | Current | Wikipedia Zero | Yurik | If set, indicates that this request has been received via one of the trusted proxies such as Opera Mini servers. |
| https | 1 | varnish | ? | Current | SRE Traffic | BBlack | If set, will be equal to "1", indicating HTTPS protocol. Currently set for the vast majority of requests, including all that are served with content from canonical WMF domains. If it is missing and the HTTP status is 301, the request was sent using HTTP and met with an HTTP redirect response, most likely to the corresponding HTTPS URL. For other response codes <400 (non-errors), it is assumed that the absence of this field also indicates an HTTP request. For some rare cases involving response codes >= 400, it may be possible that this field is not set even though the request was over HTTPS. |
| ismobile | 1 | varnish | June 2025 | Current | ? | ? | If set, will equal "1", indicating that the request came from a mobile client (i.e. mobile user agent or mobile opt-in cookie), and is thus routed to MediaWiki with an X-Subdomain header to enable MobileFrontend. Launch task: T390924. |
| wmfuuid | UUID v4 value | varnish | ? | Current | Mobile apps | dr0ptp4kt | If set, will be equal to a hyphen-separated value, and indicates a unique app installation. The ID may span multiple requests, as it is generated once, at app install time, using an appropriate library (Java, Objective C), and conforms to RFC 4122 version 4. Older versions of the app may contain an appInstallID parameter in the request URL instead, or may contain both the appInstallID parameter in the URL as well as the wmfuuid X-Analytics value. Later versions of the software should only contain the wmfuuid X-Analytics value and not the appInstallID parameter in the URL. Requests from the app will not contain this header if the user has turned off "Send usage reports" in the settings menu of the app. |
| WMF-Last-Access | dd-MMM-yyyy, e.g. 06-May-2015 | varnish | ? | Current | Analytics (Infrastructure) | Milimetric | Date of last site access. If set, will be equal to the latest date on which a device issued a request to the specific host, in dd-MMM-yyyy format (e.g. 06-May-2015), with an expiration date set to ~31 days in the future. More explanation at Analytics/Unique_clients/Last_access_solution. |
| preview | 1 | client | ? |  | Analytics (Infrastructure) | Milimetric | Whether this is a preview request (not present if not). At the time of this writing, preview requests by mobile apps are not considered pageviews. |
| pageview | 1 | client | ? | Current | Analytics (Infrastructure) | Milimetric | If set, the request is counted as a pageview regardless of its other attributes. |
| nocookies | 1 | varnish | ? | Current | Analytics (Infrastructure) | Madhuvishy or Nuria | If set, tags the request as a nocookie request. This means that it is either a fresh browser session, a user browsing with cookies disabled, or possibly a bot request. We expect that the majority of requests tagged with nocookies will belong to bots. Please see: change 244626. |
| loggedIn | 1 | appserver (WikimediaEvents) | ? |  | WMDE-Analytics | Addshore | If set, will be equal to "1", and indicates that the request came from a logged-in user (see also code). |
| page_id | Page ID | appserver (WikimediaEvents) | ? |  | WMDE-Analytics | Addshore, Ori.livneh | If set, will be a string of a positive integer. |
| ns | Namespace ID | appserver (WikimediaEvents) | ? |  | WMDE-Analytics | Addshore, Ori.livneh | If set, will be a string integer (can be negative for negative namespace IDs). |
| special | Special page name | appserver (WikimediaEvents) | ? |  | WMDE-Analytics | Addshore | Set for special pages only. This will be the base name of the special page, so if the user is browsing a page via an alias the actual page name will be here. |
| translationengine | Identifier, e.g. GT | varnish | Nov 2018 | Current | Product | ABaso | If set, indicates that the request was served through a known intermediary service for machine translations. "GT" stands for Google Translate. Added in T208795. |
| wprov | <3_char_feature><1_char_platform><major_version> | client or varnish | ? | Current | ? | ? | See Provenance. |
| debug | 1 | varnish | Jan 2021 | Current | Analytics (Infrastructure) | Milimetric | Added in T263683. |
| client_port | medium-size int | varnish | Jan 2021 | Current | Analytics (Infrastructure) | JAllemandou | Added in T271953. |
| public_cloud | 1 | varnish | May 2021 | Current | SRE | CDanis | Added in T279380. |
| sessioncookie | 1 | varnish | November 2022 | Current | SRE | Vgutierrez | Added in T319324. |
| prefetch_sec_purpose | chrome_private_prefetch, chrome_prerender, chrome_preview, 1, or nonstandard | varnish | January 2024 | Current | Analytics (Infrastructure) | ABaso / AOtto | Added in T346463. |
| chrome_private_prefetch_version | 1, or later on an incremented version | varnish | January 2024 | Current | Analytics (Infrastructure) | ABaso / AOtto | Added in T346463. |
| prefetch_purpose | 1 | varnish | January 2024 | Current | Analytics (Infrastructure) | ABaso / AOtto | Added in T346463. Note that this may be present in conjunction with other prefetch tagged values. |
| prefetch_x_moz | 1 | varnish | January 2024 | Current | Analytics (Infrastructure) | ABaso / AOtto | Added in T346463. |
| rev_id | Revision ID | appserver (WikimediaEvents) | February 2024 | Current | ? | ? | Added in T346350. |
| authorization | OAuth, or Bearer, or unknown. Possibly more in the future | varnish | February 2025 | Current | SRE | CDanis | A summary of the HTTP Authorization header sent by the client, if any. |
| wmfuniq_days | integers 0 .. 8 | varnish | October 2025 | Current | SRE | CDanis | 0 => no valid Edge Unique cookie in the request; 1..8 => number of days the same valid cookie has been returned to us, rounded up, capped at 8. |
| wmfuniq_weeks | integers 0 .. 52 | varnish | October 2025 | Current | SRE | CDanis | 0 => no valid Edge Unique cookie in the request; 1..52 => number of weeks the same valid cookie has been returned to us, rounded up, capped at 52. |
| wmfuniq_freq | integers 0 .. 10 | varnish | October 2025 | Current | SRE | CDanis | 0 => no valid Edge Unique cookie in the request, or the user has visited the site on fewer than 10% of distinct weeks since cookie issuance; 1..10 => the user has visited the site on (freq/10) * 100% of the distinct weeks since cookie issuance. |
| ja3n | MD5 hash | HAProxy | September 2025 | Current | SRE | Vgutierrez | JA3N fingerprint of the client performing the request. |
| auth_type | string unknown-$session->getProvider() | appserver (WikimediaEvents) | March 2026 | Current | MediaWiki Interfaces | HCoplin | Added in T418606. |
| Key | Value | Origin | Since | Until | Team | Contact | Use case |
|---|---|---|---|---|---|---|---|
| php | zend, or hhvm | appserver | ? | Jan 2015 | SRE | _joe_ | If set, marks the PHP implementation used. This tag was only set between September 2014 and January 2015 during the migration from Zend to HHVM. (See I46ff99 and I75b30b.) |
| zero | MCC-MNC of a zero carrier, e.g. 404‑01. | varnish | ? | July 2019 | Wikipedia Zero | Yurik | If set, indicates that this request has been associated with the given carrier. It does not mean that the request qualifies as a page view. Removal in T213769. |
| zeronet | Subdivision of a carrier, e.g. b | varnish | ? | July 2019 | Wikipedia Zero | Yurik | Used to disambiguate between different parts/configurations of a single carrier, such as broadband vs. special access points. Removal in T213769. |
| max-snippet | 1, 0, or not set | appserver (WikimediaEvents) | Mar 2022 | Oct 2022 | Readers Web | cjming | If set, the value 1 indicates the page's robots meta tag contains the max-snippet directive. The value 0 indicates the page's robots meta tag does not contain the max-snippet directive. If set, both 1 and 0 indicate that the page is part of an A/B test, in the treatment and control groups respectively. Added in T301584. Removed in I65ce99b04acc as part of T310267. |
X-Cache
| Origin | Returned to client? |
|---|---|
| HAProxy, Varnish | Yes |
A comma-separated list of cache hostnames with information such as hit/miss status for each entry. This header is read right-to-left: the rightmost entry is the outermost cache, and entries further to the left progress deeper towards the application layer. The rightmost cache is the in-memory cache while all others are disk caches. In case of a cache hit, the number of times the object has been returned is also specified. Once "hit" is encountered while reading right to left, everything to the left of that "hit" is part of the cached object that got hit: those entries record whether the deeper caches missed, passed, or hit when that object was first pulled into the hitting cache.
Possible values are:
- hit: a cache hit in cache storage. There was no need to query a deeper cache server (or the applayer, if already at the last cache server). A hit may still need to reach an inner layer if the content is stale and must-revalidate is set. In this scenario the cache server sends a conditional request to an inner layer, and if a 304 Not Modified is obtained, the response is sent from the cache.
- int: locally-generated response from the cache, for example a 301 redirect. The cache did not use a cache object and it didn't need to contact another server. Backend errors will trigger an int response as well: if a backend responds with a 429 without a response body, the cache internally generates an error response after contacting the applayer.
- miss: the object might be cacheable, but we don't have it.
- pass: the object was uncacheable, talk to a deeper level.
Some subtleties on "pass": different caches (e.g. in-memory vs. on-disk) might disagree on whether the object is cacheable or not. A pass on the in-memory cache (for example, because the object is too big) could be a hit for an on-disk cache. Also, it's sometimes not clear that an object is uncacheable until the moment we fetch it. In that case, we cache for a short while the fact that the object is uncacheable. In Varnish terminology, this is a "hit-for-pass".
If we don't know an object is uncacheable until after we fetch it, the request is initially identical to a normal miss. That means request coalescing applies: other requests for the same object wait for the first response. But that first fetch returns an uncacheable object, which can't answer the other requests that have queued up. Because of that, they all get serialized, and we destroy the performance of hot (high-parallelism) objects that are uncacheable. "hit-for-pass" is the answer to that problem. When we make that first request (with no prior knowledge) and get an uncacheable response, we create a special cache entry that says something like "this object cannot be cached, remember it for 10 minutes". All remaining requests for the next 10 minutes then proceed in parallel without coalescing, because it's already known the object isn't cacheable.
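The hit-for-pass idea can be sketched with a toy in-memory cache. This is an illustration only, not how Varnish implements it: the class `ToyCache`, its methods, and the injected clock are all hypothetical, and the 600-second lifetime mirrors the "10 minutes" used as an example above.

```python
import time

HIT_FOR_PASS_TTL = 600  # seconds; the "10 minutes" from the example above


class ToyCache:
    """Toy cache that remembers which objects turned out to be uncacheable."""

    def __init__(self, clock=time.monotonic):
        self._objects = {}      # key -> cached body
        self._hfp_expiry = {}   # key -> time at which the hit-for-pass marker expires
        self._clock = clock

    def lookup(self, key):
        if key in self._objects:
            return "hit"
        expiry = self._hfp_expiry.get(key)
        if expiry is not None and self._clock() < expiry:
            # Known to be uncacheable: fetch in parallel, do not coalesce.
            return "hit-for-pass"
        # Nothing known yet: requests would be coalesced behind one fetch.
        return "miss"

    def store(self, key, body, cacheable):
        if cacheable:
            self._objects[key] = body
        else:
            # Record only the fact that the object cannot be cached.
            self._hfp_expiry[key] = self._clock() + HIT_FOR_PASS_TTL
```

A caller would coalesce concurrent requests only when `lookup` returns `"miss"`; for `"hit-for-pass"` every request goes to the deeper layer independently.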
The content of the X-Cache header is recorded for every request in the webrequest log table.
- Example
- X-Cache: cp1066 hit/6, cp3043 hit/1, cp3040 hit/26603
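To make the right-to-left reading order concrete, here is a small parsing sketch. `CacheEntry` and `parse_x_cache` are hypothetical names, and the sketch assumes every entry has the shape `host status` or `host status/count`, as in the example above.

```python
from typing import NamedTuple, Optional


class CacheEntry(NamedTuple):
    host: str
    status: str          # "hit", "int", "miss" or "pass"
    hits: Optional[int]  # hit counter, present only when the header gives one


def parse_x_cache(value: str) -> list[CacheEntry]:
    """Return the entries ordered outermost-first, i.e. read right-to-left."""
    entries = []
    for part in reversed(value.split(",")):
        host, _, state = part.strip().partition(" ")
        status, _, count = state.partition("/")
        entries.append(CacheEntry(host, status, int(count) if count else None))
    return entries


for entry in parse_x_cache("cp1066 hit/6, cp3043 hit/1, cp3040 hit/26603"):
    print(entry)
```

The first element of the returned list is the outermost cache, so the first entry with status `hit` is the cache that actually served the object.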
X-Cache-Status
| Origin | Returned to client? |
|---|---|
| HAProxy, Varnish | Yes |
This header condenses the various X-Cache values into a single value to describe the overall cache status.
Possible values are:
- hit-front: A hit came from the outer-most cache level (Varnish).
- hit-local: A hit came not from the outer-most cache level (Varnish) but instead from an inner level (ATS).
- int-front: An int came from the outer-most cache level (Varnish).
- int-local: An int came not from the outer-most cache level (Varnish) but instead from an inner level (ATS).
- int-tls: The request only hit the TLS termination layer (HAProxy) and not the caches. This indicates HTTP→HTTPS redirection.
- miss: the object might be cacheable, but no portion of the stack had it.
- pass: the object was uncacheable by any portion of the stack.
- unknown: Catch-all value when internal parsing mechanisms fail to categorize as any of the above values. You should never see this.
- Examples
- X-Cache: cp4038 miss, cp4038 hit/45761 → X-Cache-Status: hit-front
- X-Cache: cp4051 hit, cp4051 miss → X-Cache-Status: hit-local
- X-Cache: cp5021 int → X-Cache-Status: int-tls
- X-Cache: cp5021 int → X-Cache-Status: int-front
- X-Cache: cp5018 int, cp5018 pass → X-Cache-Status: int-local
X-Client-IP
| Origin | Returned to client? |
|---|---|
| Varnish | Yes |
Reports the IP address of the client (user agent) as seen at layer 3; no HTTP headers are parsed to populate this header.
- Examples
- X-Client-IP: 185.15.58.224
- X-Client-IP: 2a02:ec80:600:ed1a::1
X-Client-Port
| Origin | Returned to client? |
|---|---|
| HAProxy | No |
Reports the source port of the connection on the client side, which is the port the client connected from.
- Example
- X-Client-Port: 25312
X-Connection-Properties
| Origin | Returned to client? |
|---|---|
| HAProxy | No |
A multi-value header that lists various properties of the request. These properties always include the following key=value properties delimited by semi-colons (;):
- H2: Represents whether HTTP/2 is used. Possible values are 0 or 1.
- SSR: Indicates whether the TLS session has been resumed through the use of the SSL session cache or TLS tickets on an incoming connection over an SSL/TLS transport layer. Possible values are 0 or 1.
- SSL: The name of the protocol used when the incoming connection was made over a TLS transport layer.
- C: The name of the cipher used when the incoming connection was made over a TLS transport layer.
- EC: The elliptic curve used.
- Example
- X-Connection-Properties: H2=1;SSR=0;SSL=TLSv1.3;C=TLS_CHACHA20_POLY1305_SHA256;EC=X25519
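As an illustration of how an internal consumer might read these properties, the sketch below maps the five documented keys onto a typed record. The class `ConnectionProperties`, its field names, and the choice to default missing keys to `False` or an empty string are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionProperties:
    http2: bool        # H2
    tls_resumed: bool  # SSR
    tls_protocol: str  # SSL
    cipher: str        # C
    curve: str         # EC


def parse_connection_properties(value: str) -> ConnectionProperties:
    raw = {}
    for item in value.split(";"):
        key, sep, val = item.strip().partition("=")
        if sep:
            raw[key] = val
    return ConnectionProperties(
        http2=raw.get("H2") == "1",
        tls_resumed=raw.get("SSR") == "1",
        tls_protocol=raw.get("SSL", ""),
        cipher=raw.get("C", ""),
        curve=raw.get("EC", ""),
    )


props = parse_connection_properties(
    "H2=1;SSR=0;SSL=TLSv1.3;C=TLS_CHACHA20_POLY1305_SHA256;EC=X25519"
)
print(props)
```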
X-Image-Generator
| Origin | Returned to client? |
|---|---|
| HAProxy | No |
An internal header used within the Wikimedia CDN and request classification systems to signal the source of a link to a given image or thumbnail. It provides early, lightweight identification of known traffic types, helping optimize filtering and rate-limiting decisions for media content.
This header is meant to:
- Tag media traffic, based on how it's being accessed and directed, as specified by MediaWiki.
- Apply rate-limiting based on the indicated use-case.
- Allow Requestctl, HAProxy and Varnish logic to apply differentiated rules based on known usage.
The header follows the form X-Image-Generator: value where value identifies the source/generator of the URL. Possible values are:
- api
- imageinfo
- index
- parser
- rest
Any other value will be considered invalid.
- Examples
- X-Image-Generator: api
- X-Image-Generator: parser
X-Is-Browser
| Origin | Returned to client? |
|---|---|
| HAProxy | No |
This header contains, for requests in class E and F of #X-Trusted-Request, a score indicating how likely it is that the request is coming from a browser and not a script. Values above 80 indicate a high likelihood that the request is coming from a browser, and not from a script. Conversely, a value below 20 indicates a high likelihood of the request not coming from a browser.
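A backend could bucket the score along the two thresholds described above. The sketch below assumes the header value is an integer written as a decimal string, which the text does not state explicitly; the function name and the bucket labels are hypothetical.

```python
def classify_is_browser(header_value: str) -> str:
    """Bucket an X-Is-Browser score using the 80 and 20 thresholds."""
    try:
        score = int(header_value.strip())
    except ValueError:
        # Assumption: anything that is not an integer is treated as unknown.
        return "unknown"
    if score > 80:
        return "likely-browser"
    if score < 20:
        return "likely-script"
    return "uncertain"


print(classify_is_browser("92"))
```

Scores between the two thresholds are deliberately left in a separate bucket, since the description only makes claims about values above 80 and below 20.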
X-JA3N
| Origin | Returned to client? |
|---|---|
| HAProxy | No |
JA3N fingerprint for help with identifying abuse.
- Example
- X-JA3N: e7d705a3286e19ea42f587b344ee6865 (Tor client)
X-JA4H
| Origin | Returned to client? |
|---|---|
| HAProxy | No |
JA4H fingerprint for help with identifying abuse.
- Example
- X-JA4H: t13d1516h2_8daaf6152771_02713d6af862 (Chrome)
X-Provenance
| Origin | Returned to client? |
|---|---|
| HAProxy | No |
An internal header used within the Wikimedia CDN and request classification systems to signal the origin or trust level of a request. It provides early, lightweight identification of known traffic sources, helping optimize filtering and rate-limiting decisions, such as bypassing generic rate limits/Requestctl rules for trusted sources, or serving as an input to moat-mode rules or future trust scoring systems.
This header is meant to:
- Tag traffic based on its origin before deeper inspection (e.g. session token validation or UA classification)
- Enable fast-path handling (e.g. skip filtering, assign different rate limits)
- Allow Requestctl, HAProxy and Varnish logic to apply differentiated rules based on known provenance
In the future it will also:
- Integrate with session/token-based identification
- Help shape rate-limiting tiers dynamically
- Expand label taxonomy to support more trusted classes
The header follows the form X-Provenance: label1=value1;labelN=valueN where label identifies the provenance of the request.
- Examples
- X-Provenance: net: used to flag internal requests or requests coming from trusted network ranges
- X-Provenance: abuser: request coming from a known abuser
- X-Provenance: client: request coming from a known client ipblock
- X-Provenance: cloud: request coming from a known cloud
- X-Provenance: isp: ISP data provided by the MaxMind ISP database
- X-Provenance: net=unknown: default fallback value
- X-Provenance: datacenter=true: indicates the request is coming from a datacenter, not from an eyeballs provider. Data is provided at the moment by the Spur datacenter feed
- X-Provenance: id: request coming from a verified client, for which we have both a matching user agent and a matching provenance expression. For instance, a request with user-agent "Googlebot" coming from the IP ranges of Googlebot.
X-Forwarded-Proto
| Origin | Returned to client? |
|---|---|
| HAProxy | No |
Identifies the protocol (HTTP or HTTPS) used by the connecting client. The value of this header is hard-coded to https.
- Example
- X-Forwarded-Proto: https
X-Trusted-Request
| Origin | Returned to client? |
|---|---|
| HAProxy | No |
This header expresses the level of trust of a request from the point of view of identification: do we know who is making this request, and if so, do we trust them? The values go from A to F; see the table below for their meaning.
| Value | Meaning |
|---|---|
| A | The request comes from a trusted network, like WMCS or another Wikimedia network, and is exempted from most rate-limiting and requestctl filters. |
| B | The request comes from verified crawlers and bots which we identify by their User-Agent and IP range. These requests have allocated rate-limits in the CDN, and are excluded from any other filtering rule. |
| C | The request has a valid logged-in MediaWiki session (correctly signed JWT session token). The request is exempted from most requestctl filters, and rate-limiting is based on the MediaWiki account rather than the IP (via the encrypted JWT subject ID). |
| D | The request is from a bot that identifies itself with a user-agent compliant with our robot policy but is not otherwise authenticated. Requests from these bots are automatically rate-limited based on their contact information, according to our robot policy. |
| E | Generic, unidentified traffic. This includes most of the logged-out human traffic and bots that do not honor our UA policy. This traffic is subject to all requestctl filtering rules, and it also gets a score indicating the probability of being a browser (see X-Is-Browser). Depending on the score, rate-limiting (which is based on the wmfuniq cookie, or IP as a fallback) will be more lax or stricter. |
| F | Traffic from abusive networks. It should mostly be blocked or heavily rate-limited. |
On the backend, this information can help you make decisions about performing expensive operations, or setting different limits on resource consumption.
- Examples
- X-Trusted-Request: B
- X-Trusted-Request: -
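As a sketch of how a backend might use this header to gate an expensive operation: the helper names below are hypothetical, the cut-off at class C is an arbitrary example policy rather than a recommendation from this page, and treating any value outside A to F (including `-`) as untrusted is an assumption.

```python
from typing import Optional

TRUST_CLASSES = "ABCDEF"  # A is the most trusted class, F the least


def trust_rank(header_value: str) -> Optional[int]:
    """Return 0 for class A through 5 for class F, or None if unrecognised."""
    value = header_value.strip().upper()
    if len(value) == 1 and value in TRUST_CLASSES:
        return TRUST_CLASSES.index(value)
    return None


def allow_expensive_operation(header_value: str, worst_allowed: str = "C") -> bool:
    """Example policy: allow only classes at least as trusted as worst_allowed."""
    rank = trust_rank(header_value)
    return rank is not None and rank <= TRUST_CLASSES.index(worst_allowed)


print(allow_expensive_operation("B"), allow_expensive_operation("-"))
```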
X-UA-Contact
| Origin | Returned to client? |
|---|---|
| HAProxy | No |
This header is present in requests of classes C through F of #X-Trusted-Request and contains the contact information for automated clients (bots) that respect our policy. This can be either a URL or an email address indicated in the User-Agent header by the client. If the client provides both kinds of contact information in the User-Agent header, the email is preferred and saved in the X-UA-Contact header sent downstream.
Historical headers
Headers that are no longer used and only retained here for historical information.
X-Varnish-Cluster
This header was used to signal to the back-end caching layer which Varnish cluster handled a request. The value of this header was hard-coded to misc.
- Example
- X-Varnish-Cluster: misc
See also
- The wprov URL parameter documented under Provenance is likewise logged in the x_analytics_map field of the webrequest table.