Return a content-encoding header for resource timing and more #1796
Thanks for taking the time to pick this up. However, it doesn't seem like this addresses all the issues with #1742? I recommend studying the feedback on that PR.
Hi Anne! I think I should have put up some background information here.
Therefore, in this Does this sound right to you? I am new to
This CL introduce a contentEncoding field to Performance resource timing object. This field is behind a feature flag. PR to resource timing specification: w3c/resource-timing#411 PR to fetch specification: whatwg/fetch#1796 Bug: 327941462 Change-Id: I70cad190fe658fb3dbf8b401ff8393bc1d0782f0
This CL introduce a contentEncoding field to Performance resource timing object. This field is behind a feature flag. PR to resource timing specification: w3c/resource-timing#411 PR to fetch specification: whatwg/fetch#1796 Bug: 327941462 Change-Id: I70cad190fe658fb3dbf8b401ff8393bc1d0782f0 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/6098321 Commit-Queue: Guohui Deng <[email protected]> Reviewed-by: Noam Rosenthal <[email protected]> Reviewed-by: Matthew Denton <[email protected]> Reviewed-by: Yoav Weiss (@Shopify) <[email protected]> Cr-Commit-Position: refs/heads/main@{#1407331}
…ourceTiming, a=testonly Automatic update from web-platform-tests Expose contentEncoding in PerformanceResourceTiming This CL introduce a contentEncoding field to Performance resource timing object. This field is behind a feature flag. PR to resource timing specification: w3c/resource-timing#411 PR to fetch specification: whatwg/fetch#1796 Bug: 327941462 Change-Id: I70cad190fe658fb3dbf8b401ff8393bc1d0782f0 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/6098321 Commit-Queue: Guohui Deng <[email protected]> Reviewed-by: Noam Rosenthal <[email protected]> Reviewed-by: Matthew Denton <[email protected]> Reviewed-by: Yoav Weiss (@Shopify) <[email protected]> Cr-Commit-Position: refs/heads/main@{#1407331} -- wpt-commits: 1df2c3e47bcb6379ecf3a07735bd967101d02a5b wpt-pr: 50115
1) Formatting; 2) "gzip, GZIP" is OK because they are a case-insensitive match; 3) fixed a mistake saying that "contentEncoding" consists of digits; 4) no longer returns "contentEncoding" for data URLs.
That's on the client side getting the response header.
Updated the patch: I just added the content encoding to the body info struct, and added the clause that updates it.
restore a new line.
Very sorry for so many mistakes, folks. Thanks for your patience.
No worries, we've all been there! (Or at least I have...)
I think it should be specified here as that matches how we do MIME types and that reduces the chances of someone inadvertently exposing the information. In other words: the guarantee should come from Fetch, not from the caller.
The "raw" contentEncoding value can be arbitrary proprietary compression the app uses, and it's leaked as a response header. Meanwhile I think moving the filtering here guarantees that the only place where the raw If there is any concern please let me know. Thanks.
To be clear, the header is not exposed to the website passively embedding the resource, but this getter is. I don't think I understand your suggestion, could you rephrase?
Specifically, it needs to be explicitly filtered when assigned to the response body info struct.
It should be before copying the "content-encoding" value to response body info.
Got it, Thanks! I updated the PR accordingly.
I think the website can get the arbitrary value like this: (I am new to this area so please correct me if I am wrong)
And the reason for that is some use cases involving service workers. See but the
You would only get access to
Yea, so filtering them when assigning to the struct wouldn't change anything observable, but any future user of that struct would get the filtered value.
Thank you Noam! @annevk: Would you take one more look? (I also left a response at WebKit/standards-positions#467.)
I noticed you're participating on behalf of Microsoft. That means you cannot sign the contributor's agreement as an individual. Microsoft has already signed up for the Fetch Workstream so you have to join the relevant GitHub organization (MicrosoftWHATWGContributors) and make your membership thereof public.
@@ -6319,6 +6321,24 @@ optional boolean <var>forceNewConnection</var> (default false), run these steps:
<li><p>Let <var>codings</var> be the result of <a>extracting header list values</a> given
`<code>Content-Encoding</code>` and <var>response</var>'s <a for=response>header list</a>.

<li><p>Let <var>filteredCoding</var> be "<code>unknown</code>".
Why is this a distinct value from the empty string? It also seems to squat on the value space of the registry, which doesn't seem ideal?
"unknown" means there is a compression that's not recognized by the browser; the empty string means there is no compression.
We would like to distinguish the two. In this discussion thread, w3c/resource-timing#381, nhelfman points out that the two situations need to be distinguishable.
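To illustrate why the two values are kept apart, here is a minimal sketch of how a page might interpret the filtered string. The helper name `classifyContentEncoding` and the `EncodingKind` labels are hypothetical, and the sentinel values are the ones drafted in this PR, which may still change during review.

```typescript
// Hypothetical labels for the cases discussed in this thread.
type EncodingKind = "none" | "multiple" | "unrecognized" | "known";

// Classify a filtered contentEncoding string as drafted in this PR.
// Assumption: "" = no compression, "unknown" = a coding the browser does
// not recognize, "multiple" = more than one coding was listed.
function classifyContentEncoding(value: string): EncodingKind {
  if (value === "") return "none";
  if (value === "multiple") return "multiple";
  if (value === "unknown") return "unrecognized";
  return "known";
}
```

In a browser that ships the field, the input would presumably come from the `contentEncoding` attribute of a `PerformanceResourceTiming` entry; that wiring is left out so the sketch stays self-contained.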
Okay, what about the value space concern?
Also, at the moment "unknown" is also used when the header could not be parsed. Maybe that's okay. Is that tested?
- Could you help me understand the "value space concern" problem?
- Yes, I made two test cases where the contentEncoding value is filtered to the "unknown" value:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/web_tests/external/wpt/resource-timing/content-encoding.https.html;l=36
Well, in principle someone could register "unknown" as a content coding and user agents could implement it, but this API would not be able to distinguish it from the "unknown" case.
Ah I see! I think a solution is to make "unknown" a reserved word so that it cannot be used as contentEncoding in the response header. There is already one reserved word, "identity".
Do you think I should try to make that happen? I couldn't find how to propose changes to that IANA "http parameters". There is a "contact" section on that page but the emails are obviously out of date. (they are @sun.com :) )
Can you email Mark Nottingham [email protected] and copy [email protected] (that's me) with the information stated at https://httpwg.org/specs/rfc9110.html#content.coding.extensibility?
> Do you think I should try to make that happen? I couldn't find how to propose changes to that IANA "http parameters". There is a "contact" section on that page but the emails are obviously out of date. (they are @sun.com :) )
These are contacts for existing registrations; not the contact point for new registrations. Blame IANA for the confusion.
I sent an email to Mark (and copied Anne).
fetch.bs
Outdated
<li><p>Otherwise, if <var>codings</var> contains two strings or more, set <var>filteredCoding</var> to
"<code>multiple</code>".
then set*
And this should probably compare with Infra's size concept.
Added "then". But I am not sure how to "compare with Infra's size concept". I looked at the "infra" section but I didn't see a title related to "size". Would you please help me understand what else needs to be done with this paragraph? Thanks.
I see! I made changes accordingly.
fetch.bs
Outdated
<li><p>Otherwise, if <var>codings[0]</var> is the empty string, or it is supported by the user agent,
and is listed in the <a href="https://www.iana.org/assignments/http-parameters/http-parameters.xhtml#content-coding">
content encoding registry on IANA</a>, set <var>filteredCoding</var> to <var>codings[0]</var>.
You want <cite>HTTP Content Coding Registry</cite> inline and then use a reference for the actual URL.
(And also apply the earlier comments.)
Done.
This isn't quite done. The casing is incorrect and you didn't move the URL into the references section.
Oh, I see another example of reference, so I did the same: with an inline link reference in place, I also made an entry in the "reference section".
The reference is not in Specref yet, so I submitted this PR: tobie/specref#860
I cannot verify this yet because the specref PR is not merged.
Does this look correct to you?
It doesn't have to be in specref, you can also modify <pre class=biblio> in this document.
Done. Do you think I should withdraw the PR to the specref? Thanks.
Now with the filtering in place it's probably slightly more reasonable to leave the existing parsing issue unsolved for now, assuming there's adequate test coverage.
Thanks @annevk! I am working on MicrosoftWHATWGContributors membership right now.
fetch.bs
Outdated
<li><p>Otherwise, if <var>codings</var>[0] is the empty string, or it is supported by the user agent,
and is listed in the <a href="https://www.iana.org/assignments/http-parameters/http-parameters.xhtml#content-coding">
<cite>content encoding registry on IANA</cite></a>, then set <var>filteredCoding</var> to <var>codings</var>[0].
What if the server specifies "GZIP"? Does it get lowercased?
- Yes, all output should be lowercased, and I made changes in the text accordingly. Thanks for pointing that out.
- In the current text, a value is "allowed" as long as it is a case-insensitive match for a registered value, and it will be lowercased before it is exposed. Is that OK?
Thanks.
fetch.bs
Outdated
<li><p>If <var>codings</var> is null, then set <var>filteredCoding</var> to the empty string.

<li><p>Otherwise, if <var>codings</var>'s <a for=list>size</a> is 2 or more, then set
<li><p>Otherwise, if <var>codings</var>'s <a for=list>size</a> is 2 or more, then set
<li><p>Otherwise, if <var>codings</var>'s <a for=list>size</a> is greater than 1, then set
Done.
Done.
fetch.bs
Outdated
<li><p>Set <var>response</var>'s <a for=response>body info</a>'s
<a for="response body info">content encoding</a> to the result of
<a lt=byte-lowercased>byte-lowercasing</a> <var>filteredCoding</var>.
<a lt=byte-lowercased>byte-lowercasing</a> <var>filteredCoding</var>.
<a lt=byte-lowercased>byte-lowercasing</a> <var>filteredCoding</var>.
Done. This sentence is modified, and the superfluous space is gone.
fetch.bs
Outdated
<li><p>Otherwise, if <var>codings</var>[0] is the empty string, or it is supported by the user agent,
and is a <a>byte-case-insensitive</a> match for an entry listed in the
<a href="https://www.iana.org/assignments/http-parameters/http-parameters.xhtml#content-coding">
<cite>HTTP Content Coding Registry</cite></a> of [[!IANA-HTTP-PARAMS]], then set
<var>filteredCoding</var> to <var>codings</var>[0].
I think it would be clearer if you did the lowercasing here. There's no reason to lowercase the other branches.
Good idea. Done.
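For readers following the thread, the filtering steps as they stand at this point in the review can be sketched roughly as below. This is an illustration only, not the spec text or any browser's implementation: the function name `filterContentEncoding` is hypothetical, and `SUPPORTED_REGISTERED_CODINGS` is an assumed stand-in for "supported by the user agent and listed in the HTTP Content Coding Registry".

```typescript
// Assumption: a stand-in for codings that are both supported by the user
// agent and registered with IANA. The real set depends on the browser.
const SUPPORTED_REGISTERED_CODINGS: ReadonlySet<string> = new Set([
  "gzip",
  "deflate",
  "br",
  "zstd",
]);

// Lowercase only ASCII A-Z, approximating the "byte-lowercase" operation.
function asciiLowercase(input: string): string {
  return input.replace(/[A-Z]/g, (c) => c.toLowerCase());
}

// `codings` models the result of extracting the Content-Encoding header
// values: null when the header is absent, otherwise a list of strings.
function filterContentEncoding(codings: string[] | null): string {
  // Default: a coding is present but is not one we are willing to expose.
  let filtered = "unknown";
  if (codings === null) {
    filtered = "";
  } else if (codings.length > 1) {
    filtered = "multiple";
  } else {
    const first: string | undefined = codings[0];
    if (first !== undefined) {
      const lowered = asciiLowercase(first);
      // Lowercasing happens only in this branch, as suggested in review.
      if (lowered === "" || SUPPORTED_REGISTERED_CODINGS.has(lowered)) {
        filtered = lowered;
      }
    }
  }
  return filtered;
}
```

Under these assumptions, `filterContentEncoding(["GZIP"])` yields `"gzip"`, while a proprietary coding name yields `"unknown"`, so the raw header value never reaches the caller.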
The major change is to add content-encoding to the response header list. This PR also adds a description of how content-encoding is determined (content negotiation). The purpose is to pass this value to resource timing. Further details are available at w3c/resource-timing#381.
Note: Per discussion at the 12/05/2024 WebPerfWG call (https://docs.google.com/document/d/1mpFDrAWuV6IgvJ1KiL9sgIlcboC5uArtF8r_oqS1Sco/edit?tab=t.0#heading=h.af6v74wysf4m), we decided to allow arbitrary "content-encoding" values in "fetch". We only filter the value on the client side, before passing it to resource timing.
Related PR to modify resource timing specification:
w3c/resource-timing#411
At least two implementers are interested (and none opposed):
Likely. Discussed and agreed at the Feb 29, 2024 W3C WebPerf call; Chromium is already receiving content-encoding through fetch, but we are not sure about other browsers.
Tests are written and can be reviewed and commented upon at:
[to be updated with a new link] https://chromium-review.googlesource.com/c/chromium/src/+/5958411
Implementation bugs are filed:
Chromium: https://issues.chromium.org/issues/327941462
Gecko: https://bugzilla.mozilla.org/show_bug.cgi?id=1886107
WebKit: https://bugs.webkit.org/show_bug.cgi?id=271632
Deno (not for CORS changes): …
MDN issue is filed:
New PerformanceResourceTiming.contentEncoding field mdn/content#32823
The top of this comment includes a clear commit message to use.
(See WHATWG Working Mode: Changes for more details.)
Bug: w3c/resource-timing#381