Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8335231: [macos] Test java/awt/print/PrinterJob/Cancel/PrinterJobCancel.java failed on macOS because the case didn't get the expected PrintAbortException #20027

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

prsadhuk
Copy link
Contributor

@prsadhuk prsadhuk commented Jul 4, 2024

When a printjob is cancelled midway, PrinterAbortException was not thrown in macos. because
firstly, cancelCheck invokes LWCToolkit.invokeLater with null as parameter causing it to fail with NPE and
secondly PrinterAbortException was consumed silently when printLoop throws any exception
which is rectified to throw the PrinterAbortException when encountered..


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8335231: [macos] Test java/awt/print/PrinterJob/Cancel/PrinterJobCancel.java failed on macOS because the case didn't get the expected PrintAbortException (Bug - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/20027/head:pull/20027
$ git checkout pull/20027

Update a local copy of the PR:
$ git checkout pull/20027
$ git pull https://git.openjdk.org/jdk.git pull/20027/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 20027

View PR using the GUI difftool:
$ git pr show -t 20027

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/20027.diff

Webrev

Link to Webrev Comment

…el.java failed on macOS because the case didn't get the expected PrintAbortException
@bridgekeeper
Copy link

bridgekeeper bot commented Jul 4, 2024

👋 Welcome back psadhukhan! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jul 4, 2024

@prsadhuk This change is no longer ready for integration - check the PR body for details.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jul 4, 2024
@openjdk
Copy link

openjdk bot commented Jul 4, 2024

@prsadhuk The following label will be automatically applied to this pull request:

  • client

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the client client-libs-dev@openjdk.org label Jul 4, 2024
@mlbridge
Copy link

mlbridge bot commented Jul 4, 2024

Copy link
Contributor

@TejeshR13 TejeshR13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did verified in MacOS and the fix is working fine. The issue is in windows too, getting the same error. Is there a separate bug for windows?

@prsadhuk
Copy link
Contributor Author

prsadhuk commented Jul 5, 2024

I am not able to reproduce in windows, and I guess neither the submitter who raised this issue as it is mentioned in JBS
Is it a platform-specific issue: Yes. This issue only happens on macOS.
Maybe there is some issue in your physical printer configuration in which case you can try with "Microsoft Print-to-pdf" option

@TejeshR13
Copy link
Contributor

I am not able to reproduce in windows, and I guess neither the submitter who raised this issue as it is mentioned in JBS Is it a platform-specific issue: Yes. This issue only happens on macOS. Maybe there is some issue in your physical printer configuration in which case you can try with "Microsoft Print-to-pdf" option

Yeah, Print-to-pdf works fine. Not sure about printer config issue.

@prsadhuk
Copy link
Contributor Author

prsadhuk commented Jul 8, 2024

I am not able to reproduce in windows, and I guess neither the submitter who raised this issue as it is mentioned in JBS Is it a platform-specific issue: Yes. This issue only happens on macOS. Maybe there is some issue in your physical printer configuration in which case you can try with "Microsoft Print-to-pdf" option

Yeah, Print-to-pdf works fine. Not sure about printer config issue.

It seems it's timing related...For some network printers, it might take more time to setup..for example, the printer I tested Xerox Altalink it was taking time in Win32PrintService.getMediaPrintableArea called from RasterPrinterJob.setAttributes and it has around 66 media sizes namely "A3, A4, A5, Postcard, envelope, Letter" etc so looping through all media sizes were taking time, so
increasing time between actual printing start and cancelling (which is 5s presently in the testcase) helps for those printers..

Copy link
Contributor

@TejeshR13 TejeshR13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I've tested the fix in macos and it works as expected.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jul 9, 2024
Copy link
Contributor

@kumarabhi006 kumarabhi006 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it required to change the copyright year for test as well ?

otherwise tested with the current changes and it looks good to me.

@@ -371,6 +371,8 @@ public SecondaryLoop run() {

try {
printLoop(true, firstPage, lastPage);
} catch (PrinterAbortException pex) {
throw new PrinterAbortException(pex.getMessage());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not re-throw the original exception ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you miss this question ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh NM, this comments view doesn't show that it was fixed.

}
}}, null);
} catch (java.lang.reflect.InvocationTargetException ite) {}
cancelDoc();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments about deadlock refer to isCancelled() but is that not also why there was an invokeLater() ?
It is not at all clear to me. Perhaps you can explain how the deadlock would occur and why it is not a problem for calling cancelDoc() ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment was there from initial macosx port. Since RasterPrinterJob.isCancelled relies on synchronized block and since this cancelCheck method was called from native, it might have been suspected that it might block the native AppKit thread, so invokeLater was added.
But the problem was LWCToolkit.invokeLater is called with null argument,
which was not a problem in initial macosx port but JDK-8042087 got it modified in jdk9 to throw an exception if component argument is null, which is the case in CPrinterJob.m#cancelCheck

LWCToolkit.inokeAndWait is called on the Appkit thread, but if the component argument is null it uses EventQueue.invokeLater which would fail with an NPE in plugin mode. The method should be updated to throw an exception in case the provided component is null.

Coming to your question, I guess if there is a deadlock, calling cancelDoc directly will also face the same issue as it also relies on same synchronized(this) block but I could not find any deadlock issue with the printing for which this invokeLater safeguard was added...

So, Is the above comment " fail with an NPE in plugin mode" still applicable i.e do we still support plugin mode?
If not, we can restore invokeLater call in this code and use Toolkit.getToolkit if the component is null as it was previously to JDK-8042087

Copy link
Contributor Author

@prsadhuk prsadhuk Jul 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont see any deadlock in my testing so I am using isCancelled .
How do you normally "cancel" ongoing printing in macosx? I tried clicking on "x" in "Print Centre" when the specific print job is shown in the printer spooler and I dont see any deadlock for long printing tests like PrintAllFonts etc..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plugin mode is no longer a concern.

So it seems to me that the worry is that some other thread holds the lock on the PrinterJob, and
is waiting for a print loop to return from native code running on the AppKit thread, which then up-calls to here and then needs to synchronize on the PrinterJob.
Other cases that I see aren't from native so aren't a problem.

"Coming to your question, I guess if there is a deadlock, calling cancelDoc directly will also face the same"

Yes whereas In the existing code it is run on the AppKit thread to avoid it ...
It seems safest to restore the invokeLater.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok..Restore invokeLater and handled/thrown PrinterAbortException in native..

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Jul 30, 2024
@@ -759,6 +761,7 @@ private boolean cancelCheck() {
// no-op, let the native side handle it
}
}}, null);
} catch (NullPointerException ex) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused. I thought you were going to do something about the null ?
Unless you don't this code won't work.
Restoring using invokeLater isn't the same thing as restoring the bug.

Look at the code that you are calling - it won't run the Runnable if there's an NPE.
"new Component()" instead of null for example.
FWIW I do not understand why this code uses LWCToolkit.invokeLater()
Seems unnecessary.
Why don't it use EventQueue.invokeLater() directly ?
That'll solve a couple of problems - simpler code, no null check because no Component arg needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, actually I wanted to avoid changing the common LWCToolkit.invokeLater (as it needs to use Toolkit.getToolkit if component is null as I was not sure if it is ok to pass any adhoc component via new Component in CPrinterJob) which is also used in accessibility and InputMethod which would have affected and require testing in those areas
but yes, I didn't see having NPE will render invokeLater a no-op

Changed to use EventQueue.invokeLater..

@prrace
Copy link
Contributor

prrace commented Aug 4, 2024

I think we still have some things to look at here

(1) The job still gets queued for printing - should it ? Ideally not if there's some thing we can do.
(2) The "Error during printing" dialog I get after running the test is from macOS not us. Should it happen ?
(3) There's also an NSException printed which is because a call to CHECK_EXCEPTION thinks
the pending Java Exception is an error from an upcall made after terminating the print loop.

Here is the NSException printed for (3) :

2024-08-02 13:34:54.360 java[50622:37988054] Java Exception
2024-08-02 13:34:54.360 java[50622:37988054] (
0 CoreFoundation 0x00007ff8045ed11a __exceptionPreprocess + 242
1 libobjc.A.dylib 0x00007ff8041120b7 objc_exception_throw + 48
2 CoreFoundation 0x00007ff8045ecf80 +[NSException raise:format:] + 214
3 libawt_lwawt.dylib 0x0000000127355ed8 -[PrinterView complete:] + 552
4 libawt_lwawt.dylib 0x0000000127354050 -[PrintModel safePrintLoop:withEnv:] + 176
5 libawt_lwawt.dylib 0x00000001272f68ef Java_sun_lwawt_macosx_CPrinterJob_printLoop + 3871
6 ??? 0x000000011455ca87 0x0 + 4636134023
)

Note it is not a Java Exception, it is an NSException and we log this from JNI_COCOA_EXIT after we catch it.
Perhaps we should only log this if a debugging flag is set .. but at least in this case it helped me see the problem (3).

So what's going on with (3)? - I can partly help explain that.
When we get an unexpected Java Exception from a JNI call and need to bail back up to Java we use NSException to do this.
Unexpected exceptions mean bad method lookups, eg up calls that create an unexpected exception etc.
If an NSException is raised Objective-C transfers control to the next "catch" - which is in this case JNI_COCOA_EXIT

This particular NSException happens because after the printLoop we need to make an up-call from "complete"

(*env)->CallVoidMethod(env, fPrinterJob, jf_completePrintLoop);
CHECK_EXCEPTION();

But there's this PrinterAbortException already pending, which means we really should not be making the up-call without clearing it first, which is another issue (#4, or still part of #3 ?)

After the upcall CHECK_EXCEPTION sees a pending Java Exception which it supposes is from a problem during the upcall and raises the NSException to "bail" up to the JNI_COCOA_EXIT where we catch it and log it as seen above.
So the NSExceptions do not escape to Java and the Java Exception is seen by Java - which is OK so far as it goes.
BUT we've jumped over a lot of printer job clean up which might contribute to (1) and (2) above but probably isn't the whole story.

So how can we get that PrinterAbortException back up to Java without causing problems for subsequent upcalls?
If we just remove the CHECK_EXCEPTION then that would probably do it, but really we should have that check.
If we don't make the call at all with a pending exception what does that do ?
Some special handling of this whole case seems likely to be needed but I don't have time to look into this deeply enough to test a solution.
Perhaps we could
(1) grab the exception
(2) clear it from the current thread, but we still have a ref - it is just a Java object !
(3) store it in some field of the CPrinterJob
(4) then check for it and re-throw when we reach Java code. \

Don't forget about (1) - the job being queued.
I'm also unclear if the NSException is detected by the macOS printing code and plays a role in the popup.
Hmm, perhaps not, since I still get it if I comment that out. Perhaps this something else we are skipping past that has nothing to do with the NSException

@prsadhuk
Copy link
Contributor Author

prsadhuk commented Aug 5, 2024

(1) The job still gets queued for printing - should it ? Ideally not if there's some thing we can do.

I guess you are testing via your standalone PJC.java or jtreg's PrinterJobCancel.java which starts the print job and after 5sec does call PrinterJob.cancel
so I guess it should start/queue the printjob..The printjob is supposed to print 6 pages if it doesn't get cancelled
I see it prints the 1st page and then it aborts in normal run with the above testcase...
Are you not seeing that or am I misunderstanding your statement?

(2) The "Error during printing" dialog I get after running the test is from macOS not us. Should it happen ?

I suspected it might be because of abortDoc native call but even after commenting cancelDoc, this dialog is coming.

(3) There's also an NSException printed

I thought it is because of this line in CHECK_EXCEPTION macro
[NSException raise:NSGenericException format:@"Java Exception"]; \
which was added by you in JDK-8259343:[macOS] Update JNI error handling in Cocoa code.

I found one issue where I see that cancelDoc exits the printingLoop and so does completePrintLoop called from PrinterView.m#complete so I think we should not be calling completePrintLoop from complete when cancelDoc is called to abort the printjob. Also, I think we should not also check the CHECK_EXCEPTION in complete in that case because now only PrinterAbortException will be there which is expected

I modified the PR to do that which stops the NSException being thrown so that should solve (3) but (2) is still an enigma..

@prsadhuk
Copy link
Contributor Author

prsadhuk commented Aug 7, 2024

I modified the PR to implement this
(1) grab the exception
(2) clear it from the current thread, but we still have a ref - it is just a Java object !
(3) store it in some field of the CPrinterJob
(4) then check for it and re-throw when we reach Java code.

which solves the (3) NSException log
but (2) macos error popup is still there.. I guess it is because it prints 1st page but there is a blank 2nd page which seems to suggest macos printer driver spooled the page but failed to print as it was aborted so it threw the error popup..

@victordyakov
Copy link

@prrace please take a look

@bridgekeeper
Copy link

bridgekeeper bot commented Sep 18, 2024

@prsadhuk This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@prsadhuk
Copy link
Contributor Author

prsadhuk commented Oct 7, 2024

@prrace Waiting for your relook on this...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
client client-libs-dev@openjdk.org rfr Pull request is ready for review
Development

Successfully merging this pull request may close these issues.

6 participants