Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #5487: Fix load issue with multiple same images #6170

Merged
merged 5 commits into from
Feb 10, 2019

Conversation

ankita240796
Copy link
Contributor

Explanation

Fixes #5487: Adds loadtime to image object to avoid overriding of multiple callbacks with same objectUrl.

@ankita240796
Copy link
Contributor Author

Should we also add an e2e test for this issue? @seanlip @DubeySandeep

@codecov-io
Copy link

codecov-io commented Jan 26, 2019

Codecov Report

Merging #6170 into develop will increase coverage by 0.04%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #6170      +/-   ##
===========================================
+ Coverage    45.33%   45.37%   +0.04%     
===========================================
  Files          525      525              
  Lines        30869    30923      +54     
  Branches      4615     4631      +16     
===========================================
+ Hits         13993    14032      +39     
- Misses       16876    16891      +15
Impacted Files Coverage Δ
...lates/dev/head/services/AssetsBackendApiService.js 70.83% <ø> (-0.48%) ⬇️
..._editor/translation_tab/TranslationTabDirective.js 1.28% <0%> (-8.72%) ⬇️
.../dev/head/domain/question/QuestionUpdateService.js 93.22% <0%> (-6.78%) ⬇️
...loration_editor/editor_tab/ExplorationEditorTab.js 33.33% <0%> (-4.45%) ⬇️
...ndDropSortInput/directives/DragAndDropSortInput.js 48.31% <0%> (-2.88%) ⬇️
...dev/head/pages/exploration_editor/RouterService.js 16.8% <0%> (-0.7%) ⬇️
...es/exploration_editor/EditorNavigationDirective.js 1.81% <0%> (-0.27%) ⬇️
...d/pages/question_editor/QuestionEditorDirective.js 2.53% <0%> (-0.25%) ⬇️
...ory_editor/main_editor/StoryNodeEditorDirective.js 0.87% <0%> (-0.19%) ⬇️
...d/pages/topic_editor/TopicEditorNavbarDirective.js 1.53% <0%> (-0.13%) ⬇️
... and 28 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4773604...ca1592f. Read the comment docs.

@seanlip
Copy link
Member

seanlip commented Jan 28, 2019

Hi @ankita240796, would you mind explaining exactly what was causing the problem, and how this solves it? I'm not sure I understand the reasoning behind the fix.

Thanks!

@ankita240796
Copy link
Contributor Author

Hi @seanlip, I tried adding breakpoints to check if the last request was overriding the callback for the other requests due to same objectUrl for all requests. (https://github.com/oppia/oppia/pull/6170/files#diff-91b5db2bf68bf29fd29d98b9cec1002fR63) and that was the reason for the issue.

So, to make the callbacks different, I added loadtime property and objectUrl property in a new object which is now used on promise resolution.

@seanlip
Copy link
Member

seanlip commented Jan 28, 2019

Hm, what do you mean by "overriding the callback", though? E.g. if I were to call successCallback() twice with the same args, shouldn't there still be two executions of the callback function? That's what I'd have expected, so I'm still a little confused about what is going on here.

(Not saying your diagnosis is wrong, just that it would be helpful to have a bit more detail about how exactly this overriding is taking place. Part of why I'm drilling down here is also to double-check that we aren't breaking our caching functionality by doing this, since adding timestamps makes every request unique and may result in cache hits turning into cache misses.)

Thanks!

@ankita240796
Copy link
Contributor Author

ankita240796 commented Jan 28, 2019

Hi @seanlip, here is what I have tried:

I added breakpoints at these two line of codes:
image

This was the exploration for the same: https://oppiatestserver.appspot.com/explore/kZ4H2LdrtenU

The execution was paused thrice for the first line for each image but for the next line only one pause was observed i.e. there are three callbacks but strangely only one is getting executed.

I feel it is better if you cross check the diagnosis because the behaviour observed is strange in this case.

Thanks!

@seanlip
Copy link
Member

seanlip commented Jan 28, 2019

Hi @ankita240796, I understand (and believe) what you're saying, but the next step is to debug it to understand why it is happening. Until we understand why, it's dangerous to make a change (similar to "Chesterton's fence").

So what I would suggest is that you try to reproduce this locally on your dev instances. If you can do so, then start adding logging statements to getImageUrl() and the surrounding calls until you can come up with a hypothesis for why some of the calls are getting swallowed. I also suggest that you take a look at line 30 and maybe add a breakpoint there -- is that function getting called the other two times?

@ankita240796
Copy link
Contributor Author

Sure @seanlip, I agree this is a correct way to find a fix. I will try doing the steps suggested by you and share the updates here. Thanks!

@ankita240796
Copy link
Contributor Author

ankita240796 commented Jan 29, 2019

Hi @seanlip, here are the things I tried:

  • I tried to produce this behaviour locally. It was reproducible in a very asynchronous way. Sometimes I was able to reproduce and sometimes it did not happen at all. I had to clear the cache a few times after which it was reproduced.
  • I added console.log(statements) as follows:

image
If there were 10 same images, hi1 was displayed 10 times on console. Also the first 9 images did not load.

image
If there were 10 images, the first 9 did not load and no message was displayed on console.

image
image
In both of these cases, strangely all the 10 images loaded and both hi1 and hi2 were displayed 10 times.

In ImagePreloaderService.js
image.
This was executed 3 times as expected.

In ImageFileObjectFactory.js
image
This was executed only once.

In AssestsBackendApiService.js
image
The execution order for the 3 images was as follows: 158, 159, 162, 163, 158, 159, 162, 158, 159, 162

This shows that no image file is being returned for two of the images since neither of the if else is true.

This is due to images being stored according to their names when they are requested here. Due to this in a set of multiple images, only the last is being loaded.

Would it be fine to store image names and their request time to solve this issue?

@seanlip @DubeySandeep what do you suggest?

@seanlip
Copy link
Member

seanlip commented Jan 30, 2019

@ankita240796, this is a great start. Thank you! I think we are close, but let's do one more pass to ensure that we fully understand the issue.

Given your observations, can you draw a sequence diagram to explain what happens when multiple same images are requested? Show how key variables change as the image requests come in, and how these changes lead to a cache miss (the identification of this line is a good start, but where is it being used and how is it being modified as the image requests come in?). You should be able to use this model to explain the observations you've made above.

Once you've done that, we can then figure out whether your solution is what we need to do, or whether we can fix the asset tracking logic instead in a way that's less invasive. For example, other solutions might include allowing retries for a cache miss, or maintaining a persistent image cache. Caching issues are typically very easy to get wrong (and "fixes" to them can end up causing more issues in the process), so I think we should proceed carefully here.

Thanks!

@ankita240796
Copy link
Contributor Author

Hi @seanlip, here is the sequence diagram for calls made for loading two same images in AssetsBackendApiService.js.

The issue is due to image filename being added to cache before the first request and being removed after the first request is complete. In the meanwhile the second request is not executed due to the filename already being in cache.

PTAL, Thanks!

@seanlip
Copy link
Member

seanlip commented Jan 31, 2019

Ah, I see! So, just to recap: the issue arises because, when the call to retrieve the second image comes in, nothing happens because the image is still in the "being fetched" phase, and the frontend cache hasn't got a copy of the image yet. So the execution just drops out of the bottom of the if/else clause because there's no "else" block to catch the third case. Is that correct? If so, then thanks for the very clear explanation -- this is an excellent analysis, and I understand clearly what's going on now.

It seems to me that the correct fix is to add an "else" clause. There are three ways I can think of for doing this:

  1. Wait until the existing http request comes in, then somehow resolve the new incoming call with the response of that http request. I'm not sure if there's even a way to do that -- it seems quite messy to me.
  2. Make a new request to the backend anyway, following a principle of "if it's not in the cache, I'll just need to ask manually for it". I think this is fine since it guarantees a response -- we might be making a few "wasteful" queries but that's probably OK. The main thing is to make sure that we do the correct thing when updating the cache, but I think we can probably just update the cache with the latest response.
  3. Resolve the second request with something that indicates "ask me later". It is then the responsibility of the client to ask for the image to be loaded again after a fixed time interval. This would result in fewer wasteful queries to the server, but might degrade the user experience a bit if the time interval is too long (or risk jamming things up with too many frontend pings if the time interval is too short).

I think options 2 and 3 are the plausible ones. Would you agree, and if so, which one would you prefer to go with?

Also, just for clarity, now that I understand what's going on, I'm not sure how the existing fix in this PR solves the issue -- it seems to me like execution can still drop out of the bottom of the if/else-if clause. Is there a reason why this doesn't happen?

/cc @DubeySandeep because I think this would theoretically affect audio assets too.

Thanks!

@ankita240796
Copy link
Contributor Author

ankita240796 commented Jan 31, 2019

Hi @seanlip,

Is that correct?

Yes, that is correct and is exactly what is happening with the requests.

If so, then thanks for the very clear explanation -- this is an excellent analysis, and I understand clearly what's going on now.

You're welcome! It was great to learn a proper way to debug an issue.

I think options 2 and 3 are the plausible ones. Would you agree, and if so, which one would you prefer to go with?

I feel that option 2 is the best one to go with since such cases where user uploads multiple same images will be rare. So, it is ok to make a few wasteful queries and also there will be no effect on user experience. It is better to go with an option which effects user experience as less as possible.

Also, just for clarity, now that I understand what's going on, I'm not sure how the existing fix in this PR solves the issue -- it seems to me like execution can still drop out of the bottom of the if/else-if clause. Is there a reason why this doesn't happen?

Yes, the existing PR will not fix the issue. When I checked the PR locally, I had a console.log(obj.loadTime) in the same way as part 3 in this comment and with that case all the images loaded correctly. I am not sure why that happens though.

Thanks!

@seanlip
Copy link
Member

seanlip commented Jan 31, 2019

OK, that sounds good on all counts. Go for it! If you could fix the audio part too, that would be great.

Also, I'm not sure what's going on with console.log -- maybe it blocks the execution flow in some weird way. Thanks for checking!

@ankita240796
Copy link
Contributor Author

ankita240796 commented Jan 31, 2019

OK, on it! I will try fixing the audio part as well.

Yes, I too feel it may be affecting the execution time of requests such that they may not be interfering with each other but the behaviour locally was not that consistent so it was difficult to figure out the reason behind that.

@ankita240796
Copy link
Contributor Author

Hi @seanlip, sorry for the delay here. I have made the changes, PTAL. Thanks!

Copy link
Member

@seanlip seanlip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ankita240796! Left a few notes.

@DubeySandeep could you PTAL at this as well, since it relates to audio too? Thanks!

assetsCache[filename] = assetBlob;

if (!assetsCache.hasOwnProperty(filename)) {
assetsCache[filename] = assetBlob;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, would it be better to just assign the value in all cases? That way, the cache always has the most up-to-date version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, seems good.

@@ -216,6 +219,9 @@ oppia.factory('AssetsBackendApiService', [
ASSET_TYPE_AUDIO)) {
_fetchFile(explorationId, filename, ASSET_TYPE_AUDIO,
resolve, reject);
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to combine this clause with the previous one? Ditto below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that would be better.

@@ -216,6 +219,9 @@ oppia.factory('AssetsBackendApiService', [
ASSET_TYPE_AUDIO)) {
_fetchFile(explorationId, filename, ASSET_TYPE_AUDIO,
resolve, reject);
} else {
_fetchFile(explorationId, filename, ASSET_TYPE_AUDIO,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any difference in else if and else block? maybe we can just have else block? (ditto below)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@seanlip, has already covered this in his review!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.

@ankita240796
Copy link
Contributor Author

ankita240796 commented Feb 4, 2019

Hi @seanlip @DubeySandeep, I have made the required changes, PTAL. Thanks!

I just have a doubt -- We are issuing a fetch request whenever we do not find an image in the cache but in this process we have removed the check for whether the same request is not issued again. Will that result in issues?

Copy link
Member

@seanlip seanlip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ankita240796!

@DubeySandeep -- any further comments prior to merge?

@DubeySandeep
Copy link
Member

DubeySandeep commented Feb 4, 2019

@ankita240796, Can we remove the _isAssetCurrentlyBeingRequested function entirely if it's of no use?

EDIT: I can see it's used in other places. (So scratch this comment!)

@DubeySandeep
Copy link
Member

DubeySandeep commented Feb 4, 2019

@ankita240796, There are two things we use to do earlier here once some assets are requested:

  1. We use to add this to beingRequested list before making https request.
  2. Once loaded we use to add them in assetsCache and remove from beingRequested,

I think this PR changes create a scenario where we don't need beingRequested part? If so then I think we can clean up the code a little more!

We are issuing a fetch request whenever we do not find an image in the cache but in this process, we have removed the check for whether the same request is not issued again. Will, that result in issues?

I'm not sure but think about a case where a user navigating back and forth a card quickly in slower network connectivity! (and assuming loading image/audio for the first time is still in process.)

@ankita240796
Copy link
Contributor Author

ankita240796 commented Feb 5, 2019

Hi @DubeySandeep, the scenario you mentioned can definitely imapct the user. So, should we try to find a way which handles multiple same requests as well ensures that a single request is not executed repeatedly?

@DubeySandeep
Copy link
Member

@ankita240796, yeah I think we should take care of that case as well, so yeah I'll suggest you look into this and see If you can find a way which handles multiple same requests as well ensures that a single request is not executed repeatedly?

@seanlip any suggestion on this? (Also, if you want this functionality then can you please un-approved the PR (so that it doesn't get merged by mistake) until the new functionality is implemented?)

@seanlip
Copy link
Member

seanlip commented Feb 6, 2019

I think we discussed something like this in an earlier comment, but I don't know of a clean way to join a new async call to a existing pending response, so we decided to just let new responses go through (as the least bad option).

I think it's fine to keep that (i.e. merge this PR) and file a feature request for the additional improvement.

@DubeySandeep
Copy link
Member

@seanlip, thanks SGTM!

@ankita240796 have you gone through this? do you think it will be possibile to remove this?

@ankita240796, There are two things we use to do earlier once some assets are requested:

  1. We use to add this to beingRequested list before making https request.
  2. Once loaded we use to add them in assetsCache and remove from beingRequested,

I think this PR changes create a scenario where we don't need beingRequested part? If so then I think we can clean up the code a little more!

@ankita240796
Copy link
Contributor Author

Hi @DubeySandeep, I don't think we can remove the code for adding and removal of assets in fetch function as that will be required in a situation where we want to abort all downloads(L105)

@seanlip should I file an issue for the additional improvement?

@seanlip
Copy link
Member

seanlip commented Feb 7, 2019

Yes, go ahead!

@ankita240796
Copy link
Contributor Author

Filed #6224.

@ankita240796
Copy link
Contributor Author

Hi @DubeySandeep may you please take a look here, Thanks!

Copy link
Member

@DubeySandeep DubeySandeep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ankita240796, LGTM!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants