Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Polyfill http.validateHeaderName and http.validateHeaderValue #843

Merged
merged 2 commits into from
Jun 9, 2020
Merged

Polyfill http.validateHeaderName and http.validateHeaderValue #843

merged 2 commits into from
Jun 9, 2020

Conversation

tinovyatkin
Copy link
Member

@tinovyatkin tinovyatkin commented May 28, 2020

This PR makes use of native header/value validation functions added in Node 14.3 and adjusts error message/code for our functions polyfilling the same behavior.

@tinovyatkin tinovyatkin changed the title prefer-native-functions Use Node 14.3 native functions for headers/value validation May 28, 2020
@Richienb
Copy link
Member

Richienb commented May 28, 2020

I'm putting this on hold until at least April 2022 at which point Node.js 14 would be the last LTS version.

@tinovyatkin
Copy link
Member Author

Did you check the code? It uses feature detection and native functions when available and rolls-back to our implementation if they don’t. No experimental warnings, etc.

This approach was used several times in this library in the past (remember gated Stream.destroy for example) and I can’t understand why you decided to put it on hold in this case?

@Richienb
Copy link
Member

Is there a difference between http.validateHeaderName and our implementation?

@tinovyatkin
Copy link
Member Author

RegExes are different: https://github.com/nodejs/node/blob/26f150022f9b4d4709c1d4ad450c6f9e9fce8d0e/lib/_http_common.js#L204
Ours are negating.
It's a big chance, like with may other features in the past, that this will be backported to Node 12.

@Richienb
Copy link
Member

Since adding this support doesn't really change anything, either we copy over the regex or wait for the v12 backport and which we'll most likely support in node-fetch v4.

@LinusU
Copy link
Member

LinusU commented May 28, 2020

Personally, I feel that it's better to either wait until we can drop support for Node.js versions that doesn't have this, or copy their regex here and only use our copy. That way there won't be subtle differences between different versions, since potentially we aren't behaving in the exact same way...

@tinovyatkin tinovyatkin changed the title Use Node 14.3 native functions for headers/value validation Polyfill http.validateHeaderName and http.validateHeaderValue May 28, 2020
@tinovyatkin
Copy link
Member Author

tinovyatkin commented May 28, 2020

Colleagues, I'm really lost here...
This PR is basically polyfilling functions added into Node 14 as they directly related to our functionality (adjusting the shape of our error messages and adding wrong header name into value error)
It's the practice that moved forward the whole JavaScript development community. It's something this project did before with Readable.destroy, Brotli and a bunch of other features.
The fetch itself is still a polyfill in many cases browser-side.

But now @Richienb put it on hold until Node 14 becomes LTS, and @LinusU says that just copying source code is better than attempt to use native feature first.

Is there something particularly wrong with these functions that we should use different approach from earlier cases here?

@tinovyatkin tinovyatkin requested a review from bitinn May 28, 2020 13:43
Copy link
Member

@xxczaki xxczaki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since in the previous versions of node-fetch we did use features, which were not available in some Node.js versions, which we supported and therefore had to feature-check and/or polyfill, I think we should do it here as well.

@LinusU
Copy link
Member

LinusU commented May 28, 2020

So, I'm quite new to this project so I'm not entirely up to speed on what has been the policy currently. Maybe I could have made that clearer, I was just speaking of my personal opinion which is coming from maintaining other packages...

With that said, I will try to explain myself more throughly, but I will stress that this just my personal opinion on what technically best solution is:

Since our function and the built-in Node.js function doesn't use the same regex, it's likely that there will be subtle differences in the behaviour of those two functions. This means that the behaviour of our library will change, when running on Node.js 14.3 or newer.

Since we have to maintain a code path that works before Node.js 14.3 anyways, I think it's better that we only have one code path, since I don't see any downside with this.

The idea of not using the built-in version even when available is sometimes called a ponyfill, and that link has some more information on why it's preferred.

As I said, this is just my personal opinion...

@tinovyatkin
Copy link
Member Author

tinovyatkin commented May 28, 2020

@LinusU we all speak personal opinions here and nobody is the ultimate source of true :-)
If there is no written policy then I assume previously accepted contributions are the policy and we did the use of gated features here in the past (see above).

We certainly can copy the source from Node code, but:

  1. Our code works and somehow battle-tested, but the Node functions are just released as public API and subject to internal changes
  2. However, where they are available our headers will be verified by them anyway as soon as they get to nativehttp.request. So, it will be a shame to have a header that passed our check but will be thrown by native Node check
  3. If we decide to just copy without attempting to use native function first then we can easily forget to switch to the native functions on a future refactoring when these functions will be available on our minimum required platform
  4. If we just copy, we also should track changes to these functions in Node to stay update with bug fixes. We should not discard the case when Node may decide to move these functions into C++.

So, I see the current approach as compliance with JavaScript industry best practice, this library's past approaches, and overall future-proof.

@Richienb
Copy link
Member

3. we can easily forget to switch to the native functions on a future refactoring

Just add a TODO comment for reminding.

@jimmywarting
Copy link
Collaborator

jimmywarting commented May 29, 2020

hmm, on the fence, 14.3 just got out so many won't benefit from it just yet. I like native things too. I would probably say use native first if possible, if not copy over nodes regex to our code also instead of using two different. but i also would want to see at least some differences between the two, which is faster/slower, is one or the other doing things wrong, dose something pass throught the cracks that should or shouldn't be a valid header, who wrote the current regex that we use today, where dose it come from? (always wish someone leve comment/references when it comes to regex)

@tinovyatkin
Copy link
Member Author

tinovyatkin commented May 29, 2020

@jimmywarting
Our header validation was initially borrowed from Node code (it was non-Regex version): 2cafdcb

Then copying RegExp version from Node core again: a8f6d79

Both changes were made by @TimothyGu

But at some moment we stop tracking, so, without putting native code call we probably will forget again 😄

Our current code at master:

const invalidTokenRegex = /[^`\-\w!#$%&'*+.|~]/;
const invalidHeaderCharRegex = /[^\t\u0020-\u007E\u0080-\u00FF]/;

function validateName(name) {
	name = String(name);
	if (invalidTokenRegex.test(name) || name === '') {
		throw new TypeError(`'${name}' is not a legal HTTP header name`);
	}
}

function validateValue(value) {
	value = String(value);
	if (invalidHeaderCharRegex.test(value)) {
		throw new TypeError(`'${value}' is not a legal HTTP header value`);
	}
}

Node current code is bigger as it public API now, but RegExp part is as below:

const tokenRegExp = /^[\^_`a-zA-Z\-0-9!#$%&'*+.|~]+$/;
/**
 * Verifies that the given val is a valid HTTP token
 * per the rules defined in RFC 7230
 * See https://tools.ietf.org/html/rfc7230#section-3.2.6
 */
function checkIsHttpToken(val) {
  return tokenRegExp.test(val);
}

const headerCharRegex = /[^\t\x20-\x7e\x80-\xff]/;
/**
 * True if val contains an invalid field-vchar
 *  field-value    = *( field-content / obs-fold )
 *  field-content  = field-vchar [ 1*( SP / HTAB ) field-vchar ]
 *  field-vchar    = VCHAR / obs-text
 */
function checkInvalidHeaderChar(val) {
  return headerCharRegex.test(val);
}

@tinovyatkin
Copy link
Member Author

@jimmywarting Node.JS announces High severity security release upcoming this Sat, so, probably everybody on Node 14.x will upgrade and have this feature.

@jimmywarting
Copy link
Collaborator

can you switch to using node's regex as a fallback if http.validateHeaderName and http.validateHeaderValue isn't available?

@tinovyatkin
Copy link
Member Author

@jimmywarting done. We need #846 to be merged to actually test on Node 14.3, as for now all tests are running on LTS.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants