Skip to content
This repository has been archived by the owner on Mar 3, 2023. It is now read-only.

File Recovery Service #11828

Merged
merged 34 commits into from
May 27, 2016
Merged

File Recovery Service #11828

merged 34 commits into from
May 27, 2016

Conversation

as-cii
Copy link
Contributor

@as-cii as-cii commented May 24, 2016

Depends on #11826 (to run spec/browser/file-recovery-service.spec.js).

Fixes #11708.

Recently, we have been observing several issues related to losing data when Atom hard-crashes. We think that this problem manifests whenever the following happens:

  1. User/Package triggers a file save.
  2. fs.writeFileSync(path) gets called.
    1. fd = fs.openSync(path, "w") gets called. This causes the file to be truncated.
    2. Before fs.writeSync(fd, contents) has the chance to run, a hard crash occurs.

The scenario described above leaves the file in a "truncated state", thus erasing all of its contents. In the past we have experimented with backup files to solve this issue, but that creates other problems such as paths that are too long on Windows, and unnecessary events triggered on fs watchers that are monitoring the saved file's directory. It turns out that atomic saves are not a good trade-off either, as described on #3158 (comment).

With this pull-request we want to provide a file recovery service running on the main process that keeps track of pending changes, and restores files to their original state when a renderer process crashes while a save is in flight. If the recovery is unsuccessful for any reason (e.g. problems with permissions, unmounted network drives, etc.), we log a message like the following in the system console and in a message box too, so that users can perform the recovery themselves:

There was a crash while saving "/private/etc/profile", so this file might be blank or corrupted.
Atom couldn't recover it automatically, but a recovery file has been saved at: "/Users/as-cii/.atom/recovery/profile-950635".

By default we will use ~/.atom/recovery as our recovery directory, where we temporarily store a backup of the saved file and delete it upon successful save/recovery. The nice thing about using the file system as our recovery storage (as opposed to IndexedDb), is that users can inspect it pretty easily and copy/paste content via Terminal.app or Finder.app (or their equivalent on other platforms).

As a convention, backup files stored under the recovery directory, will have the following format:

{originalFilename}-{randomHexNumber}.{originalExtension}

Where:

  • originalFilename is the name of the file without its extension. This string will be truncated if it's longer than 34 characters.
  • randomHexNumber is a randomly generated hexadecimal number. This string will be 6 characters long by default.
  • originalExtension is the extension of the file.

This should be long enough to make filenames readable (and more inspectable), but short enough to avoid path length issues on platforms like Windows.

@@ -360,7 +360,6 @@ class Project extends Model

addBuffer: (buffer, options={}) ->
@addBufferAtIndex(buffer, @buffers.length, options)
@subscribeToBuffer(buffer)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently, we were subscribing to buffers twice, once here and once in addBufferAtIndex.

@as-cii
Copy link
Contributor Author

as-cii commented May 24, 2016

/cc: @atom/core for 👀

@BinaryMuse
Copy link
Contributor

This is lookin' great, @as-cii, thanks very much for taking this on.

I have a couple questions / comments:

  1. Is there any chance that the timing won't work out in some cases? I'm imagining the following scenario:

    1. onWillSave calls emitWillSavePath
    2. emitWillSavePath emits will-save-path. This is asynchronous.
    3. The renderer process beings saving, truncating the file
    4. The main process receives the will-save-path event. It's too late to back up the file

    If so, perhaps this is one of the few occasions where sendSync would be appropriate?

  2. This has all got me wondering if this service is actually necessary at all... would it be reasonable to just ask the main process to do all file saving? The renderer process could just ship it the contents to save, either via IPC or some other mechanism (temp file?) and then let the main process handle the saving. I'm not actually sure if this is better or not, or even reasonable, but it's a thought. :)

this.recoveryPathsByWindowAndFilePath.get(window).set(path, recoveryPath)

if (!this.crashListeners.has(window)) {
window.on('crashed', () => this.recoverFilesForWindow(window))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we should remove the listener when the window closes? Would the callback's reference to window retain it and keep it from being removed from the WeakSet?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great point, thanks! 💯

@BinaryMuse
Copy link
Contributor

This has all got me wondering if this service is actually necessary at all... would it be reasonable to just ask the main process to do all file saving?

Now that I think of it, @as-cii didn't you experiment with saving via a different process and it was too slow? The suggested process where the renderer process creates a temp file somewhere containing the data to be saved and the main process takes that and copies it into the real save location isn't that different from what we have here... except maybe we don't have to worry as much about crashes? What do you think?

@as-cii
Copy link
Contributor Author

as-cii commented May 24, 2016

This has all got me wondering if this service is actually necessary at all... would it be reasonable to just ask the main process to do all file saving?

What makes me nervous about this solution is handing possibly huge chunks of memory to another process through ipc. Maybe this isn't a huge deal in practice, but we ran into some troubles when sending lots of contents down an ipc channel in the past (for spell-check), so it felt more natural to keep the communication between the two processes as lightweight as possible.

Now that I think of it, @as-cii didn't you experiment with saving via a different process and it was too slow?

Yeah, in that case we also had the huge overhead of spawning a new process, whereas here we have the main process already running, which makes it an (almost) zero cost facility that we can control through ipc.

Is there any chance that the timing won't work out in some cases? I'm imagining the following scenario:

Good call, @BinaryMuse! 👍

}

recoverSync () {
fs.writeFileSync(this.originalPath, fs.readFileSync(this.recoveryPath))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the files are large, will storeSync and recoverSync greatly increase memory usage? I wonder if reading some number of bytes at a time, or using some synchronous version of pipe, would be better?

@BinaryMuse
Copy link
Contributor

@as-cii Do you have a testing strategy I can use to test this on Windows? I'm not 100% sure how to reproduce artificially.

/cc @lee-dohm

@as-cii as-cii force-pushed the as-file-recovery-service branch from 72f1ae9 to 6c34844 Compare May 26, 2016 09:28
@as-cii as-cii force-pushed the as-file-recovery-service branch from bd47130 to 8355e7f Compare May 26, 2016 17:06
@as-cii as-cii merged commit 1c843e7 into master May 27, 2016
@as-cii as-cii deleted the as-file-recovery-service branch May 27, 2016 09:08
@as-cii
Copy link
Contributor Author

as-cii commented May 27, 2016

🚀

Thanks everyone for the great feedback! ❤️💚💙💜

as-cii pushed a commit that referenced this pull request Aug 11, 2016
After shipping #11828 Atom always creates a backup file in the
`~/.atom/recovery` directory, trying to restore it automatically via the
main process in case there is a hard crash on the renderer process. This
approach should be resilient enough to allow us to delete this config
option altogether.

We can still keep it for a while in text-buffer, in case we need to use
it again at some point, but it's probably fine to remove it from there
too at some point in the future.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants