Failed to copy Media, FallbackPDFs #119

Closed
isbee opened this issue Dec 30, 2024 · 8 comments
Labels
bug need-to-reproduce For things that can't yet be reliably reproduced in testing

Comments

@isbee

isbee commented Dec 30, 2024

Describe the bug
apple_cloud_notes_parser fails to copy files from Media and FallbackPDFs. Only Previews are copied.

To Reproduce
Just run ruby notes_cloud_ripper.rb -m "/Users/user/Library/Group Containers/group.com.apple.notes/"

Expected behavior
I have PDFs and images in my notes, so they should be copied to the output directory.

Screenshots

D, [2024-12-30T13:57:06.212727 #31694] DEBUG -- : Note 45: Created a new Embedded Object of type com.apple.paper.doc.pdf
D, [2024-12-30T13:57:06.212786 #31694] DEBUG -- : Checking if Accounts/488ED16B-1168-44DB-A8D0-464031965FAF/Media// exists as a real file on disk
D, [2024-12-30T13:57:06.212815 #31694] DEBUG -- : Found Accounts/488ED16B-1168-44DB-A8D0-464031965FAF/Media//! Creating a new AppleStoredFileResult
D, [2024-12-30T13:57:06.212873 #31694] DEBUG -- : Copying /Users/user/Library/Group Containers/group.com.apple.notes/Accounts/488ED16B-1168-44DB-A8D0-464031965FAF/Media// to ./output/2024_12_30-13_57_06/files/Accounts/488ED16B-1168-44DB-A8D0-464031965FAF/Media
E, [2024-12-30T13:57:06.212989 #31694] ERROR -- : Failed to copy /Users/user/Library/Group Containers/group.com.apple.notes/Accounts/488ED16B-1168-44DB-A8D0-464031965FAF/Media// to ./output/2024_12_30-13_57_06/files/Accounts/488ED16B-1168-44DB-A8D0-464031965FAF/Media

...

D, [2024-12-30T13:57:06.221645 #31694] DEBUG -- : Note 73: Created a new Embedded Object of type com.adobe.pdf
D, [2024-12-30T13:57:06.221720 #31694] DEBUG -- : Checking if Accounts/488ED16B-1168-44DB-A8D0-464031965FAF/Media/D40EC9F7-CC7B-4D2D-90E2-709C10A90C20/2402.07596v2.pdf exists as a real file on disk
D, [2024-12-30T13:57:06.221778 #31694] DEBUG -- : Found Accounts/488ED16B-1168-44DB-A8D0-464031965FAF/Media/D40EC9F7-CC7B-4D2D-90E2-709C10A90C20/2402.07596v2.pdf! Creating a new AppleStoredFileResult
E, [2024-12-30T13:57:06.221919 #31694] ERROR -- : AppleNoteStore: NoteStore tried to rip Note 73 but had to rescue error: File exists @ dir_s_mkdir - ./output/2024_12_30-13_57_06/files/Accounts/488ED16B-1168-44DB-A8D0-464031965FAF/Media
E, [2024-12-30T13:57:06.221933 #31694] ERROR -- : Backtrace: /opt/homebrew/lib/ruby/gems/3.4.0/gems/fileutils-1.7.3/lib/fileutils.rb:403:in 'Dir.mkdir'
	/opt/homebrew/lib/ruby/gems/3.4.0/gems/fileutils-1.7.3/lib/fileutils.rb:403:in 'FileUtils.fu_mkdir'
	/opt/homebrew/lib/ruby/gems/3.4.0/gems/fileutils-1.7.3/lib/fileutils.rb:381:in 'block (2 levels) in FileUtils#mkdir_p'
	/opt/homebrew/lib/ruby/gems/3.4.0/gems/fileutils-1.7.3/lib/fileutils.rb:379:in 'Array#reverse_each'
	/opt/homebrew/lib/ruby/gems/3.4.0/gems/fileutils-1.7.3/lib/fileutils.rb:379:in 'block in FileUtils#mkdir_p'
	/opt/homebrew/lib/ruby/gems/3.4.0/gems/fileutils-1.7.3/lib/fileutils.rb:371:in 'Array#each'
	/opt/homebrew/lib/ruby/gems/3.4.0/gems/fileutils-1.7.3/lib/fileutils.rb:371:in 'FileUtils#mkdir_p'
	/opt/homebrew/Cellar/ruby/3.4.1/lib/ruby/3.4.0/pathname.rb:590:in 'Pathname#mkpath'
	/Users/user/practice/apple_cloud_notes_parser/lib/AppleBackup.rb:238:in 'AppleBackup#back_up_file'
	/Users/user/practice/apple_cloud_notes_parser/lib/AppleNotesEmbeddedPDF.rb:38:in 'AppleNotesEmbeddedPDF#initialize'
	/Users/user/practice/apple_cloud_notes_parser/lib/AppleNotesEmbeddedObject.rb:544:in 'Class#new'
	/Users/user/practice/apple_cloud_notes_parser/lib/AppleNotesEmbeddedObject.rb:544:in 'block (2 levels) in AppleNotesEmbeddedObject.generate_embedded_objects'
	/opt/homebrew/lib/ruby/gems/3.4.0/gems/sqlite3-2.5.0-arm64-darwin/lib/sqlite3/database.rb:254:in 'block (2 levels) in SQLite3::Database#execute'
	/opt/homebrew/lib/ruby/gems/3.4.0/gems/sqlite3-2.5.0-arm64-darwin/lib/sqlite3/resultset.rb:50:in 'SQLite3::ResultSet#each'
	/opt/homebrew/lib/ruby/gems/3.4.0/gems/sqlite3-2.5.0-arm64-darwin/lib/sqlite3/database.rb:253:in 'block in SQLite3::Database#execute'
	/opt/homebrew/lib/ruby/gems/3.4.0/gems/sqlite3-2.5.0-arm64-darwin/lib/sqlite3/database.rb:220:in 'SQLite3::Database#prepare'
	/opt/homebrew/lib/ruby/gems/3.4.0/gems/sqlite3-2.5.0-arm64-darwin/lib/sqlite3/database.rb:248:in 'SQLite3::Database#execute'
	/Users/user/practice/apple_cloud_notes_parser/lib/AppleNotesEmbeddedObject.rb:442:in 'block in AppleNotesEmbeddedObject.generate_embedded_objects'
	/Users/user/practice/apple_cloud_notes_parser/lib/AppleNotesEmbeddedObject.rb:422:in 'Google::Protobuf::RepeatedField#each'
	/Users/user/practice/apple_cloud_notes_parser/lib/AppleNotesEmbeddedObject.rb:422:in 'AppleNotesEmbeddedObject.generate_embedded_objects'
	/Users/user/practice/apple_cloud_notes_parser/lib/AppleNote.rb:207:in 'AppleNote#replace_embedded_objects'
	/Users/user/practice/apple_cloud_notes_parser/lib/AppleNote.rb:163:in 'AppleNote#process_note'
	/Users/user/practice/apple_cloud_notes_parser/lib/AppleNoteStore.rb:915:in 'block in AppleNoteStore#rip_note'
	/opt/homebrew/lib/ruby/gems/3.4.0/gems/sqlite3-2.5.0-arm64-darwin/lib/sqlite3/database.rb:254:in 'block (2 levels) in SQLite3::Database#execute'
	/opt/homebrew/lib/ruby/gems/3.4.0/gems/sqlite3-2.5.0-arm64-darwin/lib/sqlite3/resultset.rb:50:in 'SQLite3::ResultSet#each'
	/opt/homebrew/lib/ruby/gems/3.4.0/gems/sqlite3-2.5.0-arm64-darwin/lib/sqlite3/database.rb:253:in 'block in SQLite3::Database#execute'
	/opt/homebrew/lib/ruby/gems/3.4.0/gems/sqlite3-2.5.0-arm64-darwin/lib/sqlite3/database.rb:220:in 'SQLite3::Database#prepare'
	/opt/homebrew/lib/ruby/gems/3.4.0/gems/sqlite3-2.5.0-arm64-darwin/lib/sqlite3/database.rb:248:in 'SQLite3::Database#execute'
	/Users/user/practice/apple_cloud_notes_parser/lib/AppleNoteStore.rb:805:in 'AppleNoteStore#rip_note'
	/Users/user/practice/apple_cloud_notes_parser/lib/AppleNoteStore.rb:686:in 'block in AppleNoteStore#rip_notes'
	/opt/homebrew/lib/ruby/gems/3.4.0/gems/sqlite3-2.5.0-arm64-darwin/lib/sqlite3/database.rb:254:in 'block (2 levels) in SQLite3::Database#execute'
	/opt/homebrew/lib/ruby/gems/3.4.0/gems/sqlite3-2.5.0-arm64-darwin/lib/sqlite3/resultset.rb:50:in 'SQLite3::ResultSet#each'
	/opt/homebrew/lib/ruby/gems/3.4.0/gems/sqlite3-2.5.0-arm64-darwin/lib/sqlite3/database.rb:253:in 'block in SQLite3::Database#execute'
	/opt/homebrew/lib/ruby/gems/3.4.0/gems/sqlite3-2.5.0-arm64-darwin/lib/sqlite3/database.rb:220:in 'SQLite3::Database#prepare'
	/opt/homebrew/lib/ruby/gems/3.4.0/gems/sqlite3-2.5.0-arm64-darwin/lib/sqlite3/database.rb:248:in 'SQLite3::Database#execute'
	/Users/user/practice/apple_cloud_notes_parser/lib/AppleNoteStore.rb:684:in 'AppleNoteStore#rip_notes'
	/Users/user/practice/apple_cloud_notes_parser/lib/AppleNoteStore.rb:227:in 'AppleNoteStore#rip_all_objects'
	/Users/user/practice/apple_cloud_notes_parser/lib/AppleBackup.rb:315:in 'block in AppleBackup#rip_notes'
	/Users/user/practice/apple_cloud_notes_parser/lib/AppleBackup.rb:312:in 'Array#each'
	/Users/user/practice/apple_cloud_notes_parser/lib/AppleBackup.rb:312:in 'AppleBackup#rip_notes'
	notes_cloud_ripper.rb:209:in '<main>'

The first PDF was uploaded from an iPhone. The debug log shows a malformed path ending in //. There might be a faulty null check, or a recent iOS/macOS Notes release may have broken compatibility in the SQLite database.
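
For illustration, a minimal Ruby sketch of how a path ending in // can arise when the per-object folder and the filename are both empty, together with a guard against it. The helper names naive_media_path and guarded_media_path are hypothetical and are not taken from apple_cloud_notes_parser; the actual cause inside the tool is not confirmed at this point in the thread.

```ruby
# Hypothetical helper: builds the path by plain string interpolation.
# nil components interpolate as empty strings, leaving a trailing "//".
def naive_media_path(account_dir, uuid, filename)
  "#{account_dir}/Media/#{uuid}/#{filename}"
end

# Hypothetical guarded variant: returns nil instead of a directory-like path
# when either component is missing, so the caller can skip the copy.
def guarded_media_path(account_dir, uuid, filename)
  return nil if uuid.to_s.empty? || filename.to_s.empty?

  File.join(account_dir, "Media", uuid, filename)
end

account = "Accounts/488ED16B-1168-44DB-A8D0-464031965FAF"

puts naive_media_path(account, nil, nil)
# prints Accounts/488ED16B-1168-44DB-A8D0-464031965FAF/Media//

p guarded_media_path(account, nil, nil)
# prints nil
```

With the guard, an object that has no stored filename yields nil rather than a path that points at the Media directory itself.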

The second PDF was uploaded from a MacBook. In this case apple_cloud_notes_parser located the PDF correctly but still produced an error. So it can locate and parse the PDF, but it fails while creating the output directory (see the backtrace above).

I only provided the PDF-related logs, but the same error also happens with images.

Desktop (please complete the following information):

  • OS: Macbook
  • Version Ventura 13.3
  • Ruby Version 3.4.1

Smartphone Source (please complete the following information, if applicable):

  • Device: iPhone XS
  • OS: iOS 18.1.1
  • Type of backup: I didn't trigger backup on iOS, but uploaded notes/pdfs.

Command used
ruby notes_cloud_ripper.rb -m "/Users/user/Library/Group Containers/group.com.apple.notes/"

Please confirm the following

  • Error occurs on the latest version of this program on GitHub [Y/N] Y
  • You have run bundle install [Y/N] Y

Additional context

threeplanetssoftware added the need-to-reproduce label Dec 30, 2024
@threeplanetssoftware
Owner

Wow, thank you for the detailed report, I sincerely appreciate it. I'll work on reproducing this to understand exactly what is causing it and try to get a fix out as soon as I can.

@isbee
Author

isbee commented Dec 30, 2024

@threeplanetssoftware Thank you very much. If you want me to run some queries against my SQLite database for debugging, I can do that.

@threeplanetssoftware
Owner

I was able to reproduce the com.apple.paper.doc.pdf issue. That is fixed in 65c4802.

Still need to figure out why you are getting that File exists error, but try this version first and see if they both go away.

@threeplanetssoftware
Owner

Ultimately, I couldn't reproduce the mkpath error. I added a check to only call mkpath if the path doesn't exist, so it should be fixed. Please let me know if it isn't.
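
One way the File exists @ dir_s_mkdir error from the backtrace can occur is when a regular file already occupies the path that mkpath is asked to create. Whether that is what happened here is an assumption; the thread does not confirm it. The sketch below reproduces that situation in a temporary directory and shows a guard in the spirit of the check described above. ensure_directory and demo_mkpath_collision are hypothetical names, not the project's code.

```ruby
require "pathname"
require "tmpdir"

# Hypothetical guard: create the directory only when nothing occupies the path.
# Returns true if the path is a directory afterwards, false otherwise.
def ensure_directory(path)
  dir = Pathname.new(path)
  dir.mkpath unless dir.exist?
  dir.directory?
end

# Puts a regular file where a directory is expected, then compares an
# unguarded mkpath call with the guarded helper.
def demo_mkpath_collision
  Dir.mktmpdir do |tmp|
    target = Pathname.new(tmp) + "files" + "Media"
    target.dirname.mkpath
    File.write(target, "") # a regular file now sits at .../files/Media

    unguarded = begin
      target.mkpath
      :ok
    rescue Errno::EEXIST
      :file_exists
    end

    [unguarded, ensure_directory(target)]
  end
end

p demo_mkpath_collision
# prints [:file_exists, false]
```

Note that the guard only avoids the exception. If a regular file sits at the path, ensure_directory returns false, and a later copy into that directory would still need separate handling.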

@isbee
Author

isbee commented Dec 31, 2024

Media is now parsed successfully. I can see the com.adobe.pdf file in output/.../Media, whose source was located in group.com.apple.notes/Media/. Images in Media work the same way. Thanks for the fast feedback.

But it still fails to parse FallbackPDFs, so I can't see the com.apple.paper.doc.pdf file whose source is located in FallbackPDFs.

  • Previews are parsed, though, so I can see the preview images.

There is no error log about FallbackPDFs. I found some clues about this:

  1. debug_log.txt has no entry for group.com.apple.notes/FallbackPDFs/D335816F-4333-4191-868F-D4DB48CC00FA.pdf (a file that exists on the same machine where I ran the command), so the file may not have been scanned.
  2. This can be verified in note_store_embedded_objects_1.csv, which contains the row "42","45","","D335816F-4333-4191-868F-D4DB48CC00FA","com.apple.paper.doc.pdf","","","","","","". Both the Object Filepath on Computer and Object Filepath on Phone columns are empty. I uploaded this file from the iPhone, so the latter column should be filled.
  3. note_store_notes_1.csv also has the entry {Embedded Object com.apple.paper.doc.pdf: D335816F-4333-4191-868F-D4DB48CC00FA with scan in }. The directory at the end of the entry is missing.
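
The check in item 2 can be automated. The sketch below lists rows of an embedded-objects CSV in which both filepath columns are empty. The two filepath column names come from the comment above; the other headers in the sample ("Object UUID", "Object Type") and the sample rows are assumptions made for illustration and may not match the real file layout.

```ruby
require "csv"

# Column names quoted in the comment above.
PATH_COLUMNS = ["Object Filepath on Computer", "Object Filepath on Phone"].freeze

# Returns the rows in which every filepath column is blank or absent.
def rows_missing_filepaths(csv_text, path_columns = PATH_COLUMNS)
  CSV.parse(csv_text, headers: true).select do |row|
    path_columns.all? { |column| row[column].to_s.strip.empty? }
  end
end

# Illustrative sample with assumed headers, not a copy of the real CSV.
sample = <<~ROWS
  "Object UUID","Object Type","Object Filepath on Phone","Object Filepath on Computer"
  "D335816F-4333-4191-868F-D4DB48CC00FA","com.apple.paper.doc.pdf","",""
  "D40EC9F7-CC7B-4D2D-90E2-709C10A90C20","com.adobe.pdf","","Media/D40EC9F7-CC7B-4D2D-90E2-709C10A90C20/2402.07596v2.pdf"
ROWS

rows_missing_filepaths(sample).each { |row| puts row["Object UUID"] }
# prints D335816F-4333-4191-868F-D4DB48CC00FA
```

To run it against a real export, pass File.read("note_store_embedded_objects_1.csv") instead of the sample string.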

@threeplanetssoftware
Owner

This is odd because I have com.apple.paper.doc.pdf files in my test corpus and they're working for me on both Linux and Mac test devices. Could you please give me some more info?

  1. At the start of your debug log file should be some lines that look like this, can you please share them for both your Mac and iPhone versions?
D, [2024-12-31] DEBUG -- : Ruby version: ruby 3.2.4 (2024-04-23 revision af471c0e01) [arm64-darwin24]
D, [2024-12-31] DEBUG -- : User asserted this is a MAC_BACKUP
D, [2024-12-31] DEBUG -- : Guessed Notes Version: 18 on Mac
D, [2024-12-31] DEBUG -- : Backup is valid, ripping notes
D, [2024-12-31] DEBUG -- : Apple Backup: Ripping notes from Note Store version 18 on Mac

  2. The filepath you give is not the structure I know of for com.apple.paper.doc.pdf objects so I might just need to add another place to search for them. I just pushed a potential fix, please try it out.
  3. You seem to be showing two different sets of paths. In your backtrace, you have Accounts/[uuid]/Media/[stuff] but above you pasted group.com.apple.notes/FallbackPDFs/[stuff] (lacking the Accounts folder). Are those both from the same backup? If so, that may complicate how I do the file lookups. I'm hoping those are two separate backups.
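
As a rough illustration of "another place to search", the sketch below tries a list of candidate locations and returns the first one that exists. It is not the project's lookup code. The function names are hypothetical, and the two candidate shapes (with and without the Accounts/[uuid] prefix discussed above) are assumptions about where a fallback PDF might live.

```ruby
require "fileutils"
require "tmpdir"

# Relative locations to try for a fallback PDF. Treating these two shapes
# as the complete search list is an assumption.
def fallback_pdf_candidates(account_uuid, object_uuid)
  filename = "#{object_uuid}.pdf"
  [
    File.join("Accounts", account_uuid, "FallbackPDFs", filename),
    File.join("FallbackPDFs", filename)
  ]
end

# Returns the first candidate that exists as a regular file under
# backup_root, or nil when none of them do.
def first_existing_file(backup_root, candidates)
  candidates.map { |relative| File.join(backup_root, relative) }
            .find { |full_path| File.file?(full_path) }
end

# Usage against a throwaway directory tree.
Dir.mktmpdir do |root|
  candidates = fallback_pdf_candidates(
    "488ED16B-1168-44DB-A8D0-464031965FAF",
    "D335816F-4333-4191-868F-D4DB48CC00FA"
  )
  target = File.join(root, candidates.first)
  FileUtils.mkdir_p(File.dirname(target))
  File.write(target, "%PDF-")

  puts first_existing_file(root, candidates) == target
  # prints true
end
```

Keeping the candidates in an ordered list means a newly discovered layout can be supported by appending one entry, without touching the lookup itself.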

@isbee
Author

isbee commented Dec 31, 2024

OK, first the good news: it works now with 8253041 :) FallbackPDFs is parsed.

> At the start of your debug log file should be some lines that look like this, can you please share them for both your Mac and iPhone versions?

I'll give you two debug_log.txt files.

  1. debug_log_previous.txt. This is the output from before 8253041.
  2. debug_log_current.txt

I'll also give you two note_store_embedded_objects_1.csv files for more context.

  1. note_store_embedded_objects_1_previous.csv
  2. note_store_embedded_objects_1_current.csv

> You seem to be showing two different sets of paths. In your backtrace, you have Accounts/[uuid]/Media/[stuff] but above you pasted group.com.apple.notes/FallbackPDFs/[stuff] (lacking the Accounts folder). Are those both from the same backup? If so, that may complicate how I do the file lookups. I'm hoping those are two separate backups.

Oh, that's my mistake 😅

It's actually group.com.apple.notes/Accounts/488ED16B-1168-44DB-A8D0-464031965FAF/FallbackPDFs/D335816F-4333-4191-868F-D4DB48CC00FA.pdf. I omitted Accounts/[uuid].

@threeplanetssoftware
Owner

Great to hear! Glad it is working. Happy New Year! 🎊
