
hfs2dfxml

Utility to parse hfsutils output and produce DFXML for HFS-formatted disk images

Caveat

This script is still in development. It is recommended that you check your results against another tool. If you encounter an error or unexpected results, please file an issue on GitHub, or use the contact information below.

System requirements and setup

BitCurator and macOS installation directions

The following directions were tested on BitCurator 1.7.98. The following installation/configuration steps require the Terminal.

  • Test dependencies
    • which xmllint - Output should be /usr/bin/xmllint
    • which git - Output should be /usr/bin/git
    • which hmount - Output should be /usr/local/bin/hmount (no output means hfsutils is not installed)
  • Install hfsutils (only if which hmount fails; see above)
    • BitCurator
      • sudo apt-get install hfsutils (to install hfsutils)
      • which hmount - Output should be /usr/bin/hmount (no output means hfsutils is not installed)
    • macOS
      • brew install hfsutils
  • Get python magic
    • BitCurator - python3-magic
      • sudo apt-get install python3-magic
    • macOS - libmagic
      • brew install libmagic
  • Download hfs2dfxml and set up DFXML libraries and schema
    • cd ~/Desktop
    • git clone https://github.com/cul-it/hfs2dfxml
    • cd ~/Desktop/hfs2dfxml
    • git submodule update --init --recursive
    • cd ../tests
    • git clone https://github.com/dfxml-working-group/dfxml_schema
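
The dependency checks above can also be scripted. The following is a minimal sketch, not part of hfs2dfxml: it uses Python's standard `shutil.which` to look up the same three commands the checklist names, and the helper names `find_tools` and `missing_tools` are hypothetical.

```python
import shutil

# Command names taken from the "Test dependencies" checklist above.
REQUIRED_TOOLS = ("xmllint", "git", "hmount")


def find_tools(names=REQUIRED_TOOLS):
    """Return {tool name: resolved path or None}, like running `which` on each."""
    return {name: shutil.which(name) for name in names}


def missing_tools(found):
    """Return the sorted names of tools that were not found on the PATH."""
    return sorted(name for name, path in found.items() if path is None)


if __name__ == "__main__":
    found = find_tools()
    for name, path in found.items():
        print(f"{name}: {path or 'NOT FOUND'}")
    if "hmount" in missing_tools(found):
        # No path for hmount means hfsutils is not installed.
        print("hfsutils appears to be missing; see the install step above.")
```

The exact paths reported will vary by system, so the sketch only distinguishes "found" from "not found".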

How to use

To generate DFXML for an HFS-formatted volume:

  • cd ~/Desktop/hfs2dfxml/hfs2dfxml
  • python3 hfs2dfxml.py [HFS volume] [output file]

Note: [output file] must not already exist.
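
If you call the script from your own code, it may help to check for an existing output file before launching it. The following is a rough sketch under stated assumptions: `build_command` and `run` are hypothetical helpers, the script is assumed to be in the current directory, and placing the optional flags after the positional arguments is an assumption about how the script parses its command line.

```python
import subprocess
from pathlib import Path

# Delimiter values listed in this README.
DELIMITERS = ("classic", "macosx", "osx", "companion")


def build_command(volume, output, delimiter=None, strict=False,
                  script="hfs2dfxml.py"):
    """Build the argument list for one hfs2dfxml run (hypothetical helper)."""
    if Path(output).exists():
        # The README states that the output file must not already exist.
        raise FileExistsError(f"output already exists: {output}")
    if delimiter is not None and delimiter not in DELIMITERS:
        raise ValueError(f"unknown delimiter: {delimiter}")
    cmd = ["python3", script, str(volume), str(output)]
    if delimiter is not None:
        cmd += ["-d", delimiter]
    if strict:
        cmd.append("-s")
    return cmd


def run(cmd):
    """Run the command and raise CalledProcessError on a non-zero exit."""
    subprocess.run(cmd, check=True)
```

For example, `run(build_command("image.hfs", "image.dfxml"))` would launch the script from the directory that contains `hfs2dfxml.py`; both file names here are placeholders.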

Optional parameters:

  • -d, --delimiter [classic, macosx, osx, companion]: Choose how paths and resource forks are reported
    • classic: Output paths with a colon delimiter; reports resource forks as filename:rsrc
    • macosx: Output paths with a slash delimiter; reports resource forks as filename/rsrc
    • osx: Output paths with a slash delimiter; reports resource forks as filename/..namedfork/rsrc
    • companion: Output paths with a slash delimiter; reports resource forks as ._filename
  • -s, --strict: Use xmllint and generate easy-to-read XML
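
To make the four delimiter styles concrete, here is an illustrative sketch of how a resource fork path would look under each one. It is not the script's own code: `resource_fork_path` is a hypothetical function, and details such as whether real output paths carry a leading delimiter or a volume name are not covered here.

```python
def resource_fork_path(components, style):
    """Render the resource fork path for a file under one delimiter style.

    `components` is the list of path parts ending in the file name,
    for example ["Folder", "ReadMe"].
    """
    *parents, filename = components
    if style == "classic":
        # Colon delimiter; fork reported as filename:rsrc
        return ":".join(components) + ":rsrc"
    if style == "macosx":
        # Slash delimiter; fork reported as filename/rsrc
        return "/".join(components) + "/rsrc"
    if style == "osx":
        # Slash delimiter; fork reported as filename/..namedfork/rsrc
        return "/".join(components) + "/..namedfork/rsrc"
    if style == "companion":
        # Slash delimiter; fork reported as a sibling file named ._filename
        return "/".join(parents + ["._" + filename])
    raise ValueError(f"unknown delimiter style: {style}")


if __name__ == "__main__":
    for style in ("classic", "macosx", "osx", "companion"):
        print(style, resource_fork_path(["Folder", "ReadMe"], style))
```

For the file ReadMe inside Folder, this yields Folder:ReadMe:rsrc, Folder/ReadMe/rsrc, Folder/ReadMe/..namedfork/rsrc, and Folder/._ReadMe respectively.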

Optionally, place hfs2dfxml in your Python path and import it in your own code to call hfs_volobj. This function returns a standalone DFXML Volume object.

If you encounter the error "magic module object has no attribute 'open'", ensure that python3-magic is installed. If you encounter an error with ascii codecs, ensure you're running the code with python3.

For macOS, use the hfs2dfxml magic branch:

  • cd ~/Desktop/hfs2dfxml
  • git pull
  • git checkout magic
  • Then follow the same use instructions above.

Known limitations (and implied to-do list)

  • HFS namespace is projected and not yet officially part of the DFXML schema. See: dfxml-working-group/dfxml_schema#23
  • Byte runs not reported for fileobjects
  • Timestamps only include day/month/year and not specific time
  • Tested only on a CD-ROM disc image of an HFS volume; submissions of additional HFS volumes for further testing will be happily accepted.

If you encounter a bug or issue not listed above, please feel free to file an issue on GitHub. There is a DEBUG flag at the top of the script; setting it to True produces a log file named DEBUG_hfs2dfxml.txt to assist in debugging. The flag can also be used directly in the debug_raw_hfs.py file.

Contact

dd388 AT cornell DOT edu

Thank you

  • Alex Nelson for DFXML guidance and help prioritizing code features.
  • Kate Tasker for testing various HFS disk images and supplying data for debugging.
