Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename HttpData.of to HttpData.wrap, add copyOf, and deprecate offset. #1797

Merged
merged 11 commits into from
May 31, 2019

Conversation

anuraaga
Copy link
Collaborator

@anuraaga anuraaga commented May 28, 2019

Currently, we allow creating a HttpData as a slice of a byte[]. However, in practice we almost never use it, only when constructing an HttpData after decoding a CharSequence (not String) which is very rare usage, so I've changed it to copy out the slice. We previously also did so when retrieving the result of Thrift serialization, but replaced that with using ByteBufHttpData. So instead, we can change the slicing APIs to use ByteBuf and remove the need for users to worry about offset / length when accessing the data array.

Without the ability to slice HttpData, a user can access the array of an HttpData for processing by simply calling .array(), where it has been easy to forget to call .offset() and .length() as is currently required. It's especially problematic because it still usually works, but couldn't be guaranteed to work in all situations.

Fixes #1771

@codecov
Copy link

codecov bot commented May 28, 2019

Codecov Report

Merging #1797 into master will decrease coverage by 0.02%.
The diff coverage is 70%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #1797      +/-   ##
============================================
- Coverage     72.85%   72.83%   -0.03%     
- Complexity     8636     8687      +51     
============================================
  Files           770      772       +2     
  Lines         34016    34145     +129     
  Branches       4185     4196      +11     
============================================
+ Hits          24781    24868      +87     
- Misses         7105     7131      +26     
- Partials       2130     2146      +16
Impacted Files Coverage Δ Complexity Δ
.../com/linecorp/armeria/common/AbstractHttpData.java 37.5% <ø> (ø) 4 <0> (ø) ⬇️
...n/java/com/linecorp/armeria/client/HttpClient.java 50% <ø> (+6.66%) 17 <0> (+4) ⬆️
.../java/com/linecorp/armeria/common/HttpRequest.java 75.51% <0%> (+1.51%) 24 <0> (ø) ⬇️
...orp/armeria/client/thrift/THttpClientDelegate.java 75.96% <0%> (ø) 22 <0> (ø) ⬇️
...inecorp/armeria/common/AggregatedHttpResponse.java 64.15% <0%> (+4.5%) 8 <0> (ø) ⬇️
...om/linecorp/armeria/common/HttpResponseWriter.java 28.78% <0%> (+2.02%) 3 <0> (ø) ⬇️
...m/linecorp/armeria/server/thrift/THttpService.java 71.98% <0%> (ø) 49 <0> (ø) ⬇️
...ecorp/armeria/server/grpc/UnframedGrpcService.java 82.72% <0%> (ø) 19 <0> (ø) ⬇️
...a/com/linecorp/armeria/unsafe/ByteBufHttpData.java 78.94% <0%> (-3.11%) 17 <2> (-2)
...orp/armeria/server/tomcat/Tomcat90InputBuffer.java 62.5% <0%> (ø) 6 <0> (ø) ⬇️
... and 50 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 939ddd2...9f26c17. Read the comment docs.

Copy link
Contributor

@hyangtack hyangtack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@trustin
Copy link
Member

trustin commented May 28, 2019

I wonder if it's a good idea to remove the overloaded methods that accept offset and length or not. It introduces ambiguity that forces a user to guess if the specified array will be copied or not, because no copy may be made if offset is 0 and length is same with the length of the array. However, it can be convenient in cases where you want to avoid unnecessary array copy:

if (offset == 0 && length == data.length) {
    return HttpData.of(data);
} else {
    return HttpData.of(Arrays.copyOfRange(data, offset, offset + length));
}

@trustin
Copy link
Member

trustin commented May 28, 2019

Perhaps we could rename all factory methods of HttpData for clarity:

  • of(byte[]) and of(ByteBuf) to wrap() or wrapperOf()
  • of(byte[], int offset, int length) to wrapOrCopy(byte[], int, int) or wrapperOrCopyOf(...)

Thoughts?

@anuraaga
Copy link
Collaborator Author

Yeah I was definitely thinking of renaming of(byte[]) to wrap, seems like a good idea.

wrapOrCopy seems confusing since having a method that may reflect mutations in the input or may not seems unpredictable. In that case I would consider having copyOf(byte[]), copyOf(byte[], int, int) and wrap(byte[], int, int) which is implemented as return new ByteBufHttpData(Unpooled.wrappedBuffer(bytes, offset, size), false). If we expose these on Aggregated* too though it'd be a whole lot of APIs.

@trustin
Copy link
Member

trustin commented May 28, 2019

wrapOrCopy seems confusing since having a method that may reflect mutations in the input or may not seems unpredictable. In that case I would consider having copyOf(byte[]), copyOf(byte[], int, int) and wrap(byte[], int, int) which is implemented as return new ByteBufHttpData(Unpooled.wrappedBuffer(bytes, offset, size), false).

+1

If we expose these on Aggregated* too though it'd be a whole lot of APIs.

Agreed. Let's leave them removed in Aggregated* and let users use HttpData when they want to slice the array.

@anuraaga anuraaga force-pushed the no-httpdata-slicing branch from 24578bf to 3ba7d6d Compare May 28, 2019 10:23
@anuraaga
Copy link
Collaborator Author

Tried changing to the proposed API let me know if it looks ok

@anuraaga
Copy link
Collaborator Author

Also I'm wondering if the ByteBuf versions should be prefixed with unsafe or be in a different package

Copy link
Member

@trustin trustin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API change looks great to me. Thanks!

@@ -192,7 +192,7 @@ public void channelRead(ChannelHandlerContext ctx, Object msg) throws Exception
fail(ctx, ContentTooLargeException.get());
return;
} else {
res.write(HttpData.of(data));
res.write(HttpData.copyOf(data));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious if we can use wrap() here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah noticed this - didn't want to think too much about it but changed to wrap(retain) since seems ok.

@@ -62,18 +62,13 @@ public boolean isEndOfStream() {

@Override
public byte[] array() {
if (buf.hasArray()) {
if (buf.hasArray() && buf.arrayOffset() == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

&& buf.array().length == length ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof - added tests

@trustin
Copy link
Member

trustin commented May 29, 2019

Pressed a wrong button. I actually requested changes. 😉

@anuraaga anuraaga changed the title Remove ability to slice an array into an HttpData. Rename HttpData.of to HttpData.wrap, add copyOf, and deprecate offset. May 29, 2019
@@ -192,7 +192,7 @@ public void channelRead(ChannelHandlerContext ctx, Object msg) throws Exception
fail(ctx, ContentTooLargeException.get());
return;
} else {
res.write(HttpData.of(data));
res.write(HttpData.copyOf(data));
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah noticed this - didn't want to think too much about it but changed to wrap(retain) since seems ok.

@@ -62,18 +62,13 @@ public boolean isEndOfStream() {

@Override
public byte[] array() {
if (buf.hasArray()) {
if (buf.hasArray() && buf.arrayOffset() == 0) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof - added tests

Copy link
Contributor

@minwoox minwoox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

Copy link
Contributor

@hyangtack hyangtack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still LGTM :-)

@anuraaga anuraaga force-pushed the no-httpdata-slicing branch from 0d27285 to 07bdb98 Compare May 29, 2019 10:08
@anuraaga
Copy link
Collaborator Author

I found a gotcha with wrap(byte[], int, int) when implemented using an unpooled bytebuffer - previously it was possible to publish the same HttpData created that way multiple times, but now it isn't since the first subscriber would read all the bytes. Do we want to support this?

@trustin
Copy link
Member

trustin commented May 30, 2019

I found a gotcha with wrap(byte[], int, int) when implemented using an unpooled bytebuffer - previously it was possible to publish the same HttpData created that way multiple times, but now it isn't since the first subscriber would read all the bytes. Do we want to support this?

That's a good point. Perhaps we should introduce another HttpData implementation dedicated to wrap(byte[], int, int) which does not involve a ByteBuf?

@anuraaga
Copy link
Collaborator Author

Tried adding a separate implementation, how does it look?

@trustin
Copy link
Member

trustin commented May 31, 2019

@anuraaga Looks good. Left just some comments.

@trustin trustin merged commit e22dc1d into line:master May 31, 2019
@trustin
Copy link
Member

trustin commented May 31, 2019

Thanks, @anuraaga!

fmguerreiro pushed a commit to fmguerreiro/armeria that referenced this pull request Sep 19, 2020
line#1797)

Motivation:

Currently, we allow creating a `HttpData` as a slice of a `byte[]`. However, in practice we almost never use it, only when constructing an `HttpData` after decoding a `CharSequence` (not `String`) which is very rare usage, so I've changed it to copy out the slice. We previously also did so when retrieving the result of Thrift serialization, but replaced that with using `ByteBufHttpData`. So instead, we can change the slicing APIs to use `ByteBuf` and remove the need for users to worry about `offset` / `length` when accessing the data array.

Without the ability to slice `HttpData`, a user can access the array of an `HttpData` for processing by simply calling `.array()`, where it has been easy to forget to call `.offset()` and `.length()` as is currently required. It's especially problematic because it still usually works, but couldn't be guaranteed to work in all situations.

Modifications:

* Deprecate `offset()` and add `HttpData.wrap()` and `HttpData.copyOf()`
* Remove usage of offset
* Clarify Javadocs for `byte[]` factories.
* Add wrapping implementation

Result:

- Fixes line#1771
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove HttpData.offset / length?
4 participants