File Recovery Service #11828

as-cii · 2016-05-24T07:26:44Z

~~Depends on #11826 (to run spec/browser/file-recovery-service.spec.js).~~

Recently, we have been observing several issues related to losing data when Atom hard-crashes. We think that this problem manifests whenever the following happens:

1. User/Package triggers a file save.
2. fs.writeFileSync(path) gets called.
   1. fd = fs.openSync(path, "w") gets called. This causes the file to be truncated.
   2. Before fs.writeSync(fd, contents) has the chance to run, a hard crash occurs.
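
For illustration, the truncation window in the steps above can be reproduced with plain Node.js `fs` calls. This is a minimal sketch that assumes a throwaway temp file (the directory prefix and file name are made up); it only shows that the file is already empty on disk between the open and the write.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Create a scratch file with some content (hypothetical path, for the demo only).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'truncate-demo-'))
const file = path.join(dir, 'notes.txt')
fs.writeFileSync(file, 'important contents')

// Step 1: opening with the "w" flag truncates the file immediately.
const fd = fs.openSync(file, 'w')

// If the process died here, this is the size that would be left on disk.
const sizeAfterOpen = fs.statSync(file).size

// Step 2: the write that a hard crash would prevent.
fs.writeSync(fd, 'new contents')
fs.closeSync(fd)

console.log(sizeAfterOpen)
```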

The scenario described above leaves the file in a "truncated state", thus erasing all of its contents. In the past we have experimented with backup files to solve this issue, but that creates other problems, such as paths that are too long on Windows and unnecessary events triggered on fs watchers that are monitoring the saved file's directory. It turns out that atomic saves are not a good trade-off either, as described in #3158 (comment).

With this pull request we want to provide a file recovery service, running on the main process, that keeps track of pending changes and restores files to their original state when a renderer process crashes while a save is in flight. If the recovery is unsuccessful for any reason (e.g. problems with permissions or unmounted network drives), we log a message like the following in the system console and also show it in a message box, so that users can perform the recovery themselves:

There was a crash while saving "/private/etc/profile", so this file might be blank or corrupted.
Atom couldn't recover it automatically, but a recovery file has been saved at: "/Users/as-cii/.atom/recovery/profile-950635".

By default we will use ~/.atom/recovery as our recovery directory, where we temporarily store a backup of the saved file and delete it upon successful save/recovery. The nice thing about using the file system as our recovery storage (as opposed to IndexedDB) is that users can inspect it pretty easily and copy/paste content via Terminal.app or Finder.app (or their equivalent on other platforms).
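
To make the store/delete/restore lifecycle concrete, here is a rough sketch of the idea. It is not the code in this pull request: `RecoverySketch`, its method bodies, and the counter-based backup names are hypothetical simplifications, and it relies on `fs.copyFileSync` from current Node.js versions.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Hypothetical, simplified stand-in for the service described above.
class RecoverySketch {
  constructor (recoveryDirectory) {
    this.recoveryDirectory = recoveryDirectory
    this.recoveryPathsByFilePath = new Map()
    this.nextId = 0
  }

  // Before a save starts: copy the current contents into the recovery directory.
  willSavePath (filePath) {
    if (!fs.existsSync(filePath) || this.recoveryPathsByFilePath.has(filePath)) return
    fs.mkdirSync(this.recoveryDirectory, {recursive: true})
    const recoveryPath = path.join(
      this.recoveryDirectory,
      `${path.basename(filePath)}-${this.nextId++}`
    )
    fs.copyFileSync(filePath, recoveryPath)
    this.recoveryPathsByFilePath.set(filePath, recoveryPath)
  }

  // After a successful save: the backup is no longer needed.
  didSavePath (filePath) {
    const recoveryPath = this.recoveryPathsByFilePath.get(filePath)
    if (!recoveryPath) return
    fs.unlinkSync(recoveryPath)
    this.recoveryPathsByFilePath.delete(filePath)
  }

  // After a crash: restore every file whose save never completed.
  recoverPendingFiles () {
    for (const [filePath, recoveryPath] of this.recoveryPathsByFilePath) {
      fs.copyFileSync(recoveryPath, filePath)
      fs.unlinkSync(recoveryPath)
    }
    this.recoveryPathsByFilePath.clear()
  }
}

// Usage: simulate a crash between truncation and write.
const scratch = fs.mkdtempSync(path.join(os.tmpdir(), 'recovery-sketch-'))
const file = path.join(scratch, 'notes.txt')
fs.writeFileSync(file, 'original contents')

const service = new RecoverySketch(path.join(scratch, 'recovery'))
service.willSavePath(file)
fs.closeSync(fs.openSync(file, 'w')) // file is truncated, then the "crash" happens
service.recoverPendingFiles()
const restored = fs.readFileSync(file, 'utf8')
console.log(restored)
```

The sketch ignores error handling (permissions, missing drives) and multiple windows saving the same file, both of which a real service has to deal with.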

As a convention, backup files stored under the recovery directory will have the following format:

{originalFilename}-{randomHexNumber}.{originalExtension}

Where:

originalFilename is the name of the file without its extension. This string will be truncated if it's longer than 34 characters.
randomHexNumber is a randomly generated hexadecimal number. This string will be 6 characters long by default.
originalExtension is the extension of the file.

This should be long enough to make filenames readable (and more inspectable), but short enough to avoid path length issues on platforms like Windows.
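
A small sketch of how such a name could be generated, assuming Node's `path` and `crypto` modules. `recoveryFileName` and its parameters are hypothetical names for this example and not necessarily what the service uses.

```javascript
const crypto = require('crypto')
const path = require('path')

// Hypothetical helper following the naming convention described above.
function recoveryFileName (filePath, maxNameLength = 34, hexLength = 6) {
  // extname includes the leading dot, or is '' when there is no extension.
  const extension = path.extname(filePath)
  // Strip the extension, then truncate overly long names.
  const name = path.basename(filePath, extension).slice(0, maxNameLength)
  // Each random byte yields two hex characters.
  const randomHex = crypto
    .randomBytes(Math.ceil(hexLength / 2))
    .toString('hex')
    .slice(0, hexLength)
  return `${name}-${randomHex}${extension}`
}

console.log(recoveryFileName('/private/etc/profile'))
console.log(recoveryFileName('/tmp/' + 'x'.repeat(50) + '.js'))
```

A file without an extension, such as /private/etc/profile, ends up with just the name and the random suffix, which matches the shape of the recovery path in the message shown earlier.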


as-cii · 2016-05-24T07:29:54Z

src/project.coffee

@@ -360,7 +360,6 @@ class Project extends Model

  addBuffer: (buffer, options={}) ->
    @addBufferAtIndex(buffer, @buffers.length, options)
-    @subscribeToBuffer(buffer)


Apparently, we were subscribing to buffers twice, once here and once in addBufferAtIndex.

as-cii · 2016-05-24T07:32:38Z

/cc: @atom/core for 👀

BinaryMuse · 2016-05-24T18:46:32Z

This is lookin' great, @as-cii, thanks very much for taking this on.

I have a couple questions / comments:

Is there any chance that the timing won't work out in some cases? I'm imagining the following scenario:
1. onWillSave calls emitWillSavePath
2. emitWillSavePath emits will-save-path. This is asynchronous.
3. The renderer process begins saving, truncating the file
4. The main process receives the will-save-path event. It's too late to back up the file
If so, perhaps this is one of the few occasions where sendSync would be appropriate?
This has all got me wondering if this service is actually necessary at all... would it be reasonable to just ask the main process to do all file saving? The renderer process could just ship it the contents to save, either via IPC or some other mechanism (temp file?) and then let the main process handle the saving. I'm not actually sure if this is better or not, or even reasonable, but it's a thought. :)
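
To make the ordering concern in the timing scenario concrete, here is a simulation in plain Node.js. It does not use Electron at all: `simulateSave` and the two delivery functions are hypothetical stand-ins for the renderer notifying the main process asynchronously versus synchronously.

```javascript
// Simulation only: plain functions stand in for the renderer and main process.
function simulateSave (notifyMainProcess) {
  const log = []
  const backUp = () => log.push('main: back up file')
  notifyMainProcess(backUp) // corresponds to emitting will-save-path
  log.push('renderer: truncate and write file')
  return log
}

// Asynchronous delivery: the handler only runs on a later tick,
// so the renderer truncates the file before the backup exists.
const asyncLog = simulateSave(handler => setImmediate(handler))

// Synchronous delivery: the handler finishes before the save continues.
const syncLog = simulateSave(handler => handler())

console.log(asyncLog.slice())
console.log(syncLog)
```

With asynchronous delivery the first entry recorded is the renderer's write; with synchronous delivery the backup always comes first, which is the property the recovery service needs.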

kuychaco · 2016-05-24T20:45:13Z

src/browser/file-recovery-service.js

+    this.recoveryPathsByWindowAndFilePath.get(window).set(path, recoveryPath)
+
+    if (!this.crashListeners.has(window)) {
+      window.on('crashed', () => this.recoverFilesForWindow(window))


Do you think we should remove the listener when the window closes? Would the callback's reference to window retain it and keep it from being removed from the WeakSet?

That's a great point, thanks! 💯
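
One possible shape for this cleanup, sketched with a plain `EventEmitter` standing in for the window. `CrashWatcher` and its methods are hypothetical; the only assumption about the window is that it emits 'crashed' and 'closed' events.

```javascript
const {EventEmitter} = require('events')

// Hypothetical helper: attach one crash listener per window and detach it on close.
class CrashWatcher {
  constructor (onCrash) {
    this.onCrash = onCrash
    this.listenersByWindow = new Map()
  }

  watch (window) {
    if (this.listenersByWindow.has(window)) return // already subscribed
    const crashed = () => this.onCrash(window)
    const closed = () => this.forget(window)
    this.listenersByWindow.set(window, {crashed, closed})
    window.on('crashed', crashed)
    window.once('closed', closed)
  }

  forget (window) {
    const listeners = this.listenersByWindow.get(window)
    if (!listeners) return
    window.removeListener('crashed', listeners.crashed)
    window.removeListener('closed', listeners.closed)
    this.listenersByWindow.delete(window)
  }
}

// Usage with a fake window.
const fakeWindow = new EventEmitter()
let crashCount = 0
const watcher = new CrashWatcher(() => { crashCount++ })
watcher.watch(fakeWindow)
watcher.watch(fakeWindow) // no second listener is added
fakeWindow.emit('crashed')
fakeWindow.emit('closed')
fakeWindow.emit('crashed') // ignored: the listener was removed on close
console.log(crashCount)
```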

BinaryMuse · 2016-05-24T21:42:00Z

> This has all got me wondering if this service is actually necessary at all... would it be reasonable to just ask the main process to do all file saving?

Now that I think of it, @as-cii didn't you experiment with saving via a different process and it was too slow? The suggested process where the renderer process creates a temp file somewhere containing the data to be saved and the main process takes that and copies it into the real save location isn't that different from what we have here... except maybe we don't have to worry as much about crashes? What do you think?

as-cii · 2016-05-24T22:10:53Z

> This has all got me wondering if this service is actually necessary at all... would it be reasonable to just ask the main process to do all file saving?

What makes me nervous about this solution is handing possibly huge chunks of memory to another process through ipc. Maybe this isn't a huge deal in practice, but we ran into some troubles when sending lots of contents down an ipc channel in the past (for spell-check), so it felt more natural to keep the communication between the two processes as lightweight as possible.

> Now that I think of it, @as-cii didn't you experiment with saving via a different process and it was too slow?

Yeah, in that case we also had the huge overhead of spawning a new process, whereas here we have the main process already running, which makes it an (almost) zero cost facility that we can control through ipc.

> Is there any chance that the timing won't work out in some cases? I'm imagining the following scenario:

Good call, @BinaryMuse! 👍

BinaryMuse · 2016-05-25T17:54:29Z

src/main-process/file-recovery-service.js

+  }
+
+  recoverSync () {
+    fs.writeFileSync(this.originalPath, fs.readFileSync(this.recoveryPath))


If the files are large, will storeSync and recoverSync greatly increase memory usage? I wonder if reading some number of bytes at a time, or using some synchronous version of pipe, would be better?
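
As a sketch of the "some number of bytes at a time" idea, a synchronous copy can go through a fixed-size buffer using only Node's `fs` primitives. `copyFileInChunksSync` and the 64 KiB default are hypothetical choices for this example.

```javascript
const fs = require('fs')

// Hypothetical helper: copy a file synchronously through a fixed-size buffer,
// so memory use stays bounded regardless of the file's size.
function copyFileInChunksSync (sourcePath, destinationPath, chunkSize = 64 * 1024) {
  const buffer = Buffer.alloc(chunkSize)
  const source = fs.openSync(sourcePath, 'r')
  try {
    const destination = fs.openSync(destinationPath, 'w')
    try {
      let bytesRead
      // A null position means "read from the current file position".
      while ((bytesRead = fs.readSync(source, buffer, 0, chunkSize, null)) > 0) {
        let written = 0
        // writeSync may write fewer bytes than requested, so loop until done.
        while (written < bytesRead) {
          written += fs.writeSync(destination, buffer, written, bytesRead - written)
        }
      }
    } finally {
      fs.closeSync(destination)
    }
  } finally {
    fs.closeSync(source)
  }
}
```

Compared with `fs.writeFileSync(dest, fs.readFileSync(src))`, this never holds more than one chunk in memory at a time, at the cost of more system calls.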

BinaryMuse · 2016-05-25T18:04:47Z

@as-cii Do you have a testing strategy I can use to test this on Windows? I'm not 100% sure how to reproduce artificially.

/cc @lee-dohm

as-cii · 2016-05-27T09:08:48Z

🚀

Thanks everyone for the great feedback! ❤️💚💙💜

After shipping #11828, Atom always creates a backup file in the `~/.atom/recovery` directory and tries to restore it automatically via the main process if the renderer process hard-crashes. This approach should be resilient enough to allow us to delete the editor.backUpBeforeSaving config option altogether (#12393). We can still keep it for a while in text-buffer, in case we need to use it again, but it's probably fine to remove it from there too at some point in the future.

Antonio Scandurra added 13 commits May 23, 2016 18:37

Emit {will,did}SavePath on ipcMain before/after a buffer is saved

ef3ab03

🐛 🔥 Remove double subscription to the same buffer

049321a

Make dialog asynchronous when a renderer process crashes

de599f9

…so that other event handlers have the chance to execute even if the user doesn't choose an option in the message box. This will allow us to recover files when a window crashes.

Create FileRecoveryService to restore corrupted files after a crash

b58ce49

🎨 event.sender -> window

770199c

🔥 Unnecessary variable assignments

ca32c13

✅ Write specs for FileRecoveryService

57195d7

💚

97dd25d

🎨 file-recovery-service-spec.js -> file-recovery-service.spec.js

25fece4

📝 Better description on spec

858740c

Emit informative warning when a file can't be recovered

c7f4b33

👕 Fix linter errors

4f1efe6

🔥 Unused requires on specs

7a12984

as-cii reviewed May 24, 2016

Merge branch 'master' into as-file-recovery-service

152c28a

as-cii mentioned this pull request May 24, 2016

Unable to modify files owned by root in mac like it used to be able to #11794

Closed

6 tasks

Merge branch 'master' into as-file-recovery-service

e57b35f

kuychaco reviewed May 24, 2016

Emit {will,did}SavePath events synchronously

b84feeb

as-cii mentioned this pull request May 25, 2016

Add --main-process flag to run specs in the main process #11826

Merged

Antonio Scandurra added 2 commits May 25, 2016 11:54

Forget window when it gets closed

3b4c101

Handle recovery when many windows save the same file simultaneously

c8fae11

Antonio Scandurra added 12 commits May 25, 2016 14:04

Merge branch 'master' into as-file-recovery-service

3ce7d0a

👕 Fix linting issues

3030723

Generate readable recovery filenames

a2a734a

Add sinon

c6a87b9

Log a more informative message when cannot recover a file

1a7858c

Make coupling looser between the recovery service and the windows

c2b01d5

🎨 Move RecoveryFile down

8ba275a

Show also a message box when recovery is not successful

49a603a

🔥 Remove extra comma

3f8f3c9

🔥 Extra imports

8733b52

Be a little more defensive when retaining/releasing recovery files

d8564ad

🔥 Remove unneeded WeakSet

aefcbcd

BinaryMuse reviewed May 25, 2016

🐛 Don't try to recover the same file twice

6c34844

as-cii force-pushed the as-file-recovery-service branch from 72f1ae9 to 6c34844 May 26, 2016 09:28

Antonio Scandurra added 2 commits May 26, 2016 11:39

Return early when a recovery file can't be stored

df263a2

Merge branch 'master' into as-file-recovery-service

5e0e65b

as-cii mentioned this pull request May 26, 2016

Add copyFileSync(sourceFilePath, destinationFilePath) atom/fs-plus#30

Merged

Use fs.copyFileSync for buffered copy

8355e7f

as-cii force-pushed the as-file-recovery-service branch from bd47130 to 8355e7f May 26, 2016 17:06

as-cii merged commit 1c843e7 into master May 27, 2016

as-cii deleted the as-file-recovery-service branch May 27, 2016 09:08

hultberg mentioned this pull request Jun 3, 2016

Data loss after power outage #11406

Open

bronson mentioned this pull request Jun 13, 2016

Uncaught Error: UnknownSystemError: Unknown system error, fdatasync #9196

Closed

as-cii mentioned this pull request Aug 11, 2016

Remove config option editor.backUpBeforeSaving #12393

Merged

as-cii mentioned this pull request Nov 10, 2017

Main process gets into a state where input is blocked for hundreds of milliseconds #16151

Closed
