Skip to content

Commit

Permalink
feat: Add parquet file detection
Browse files Browse the repository at this point in the history
Adds parquet file detection. See [docs](https://parquet.apache.org/docs/file-format/)
for specification
  • Loading branch information
Keith Kelly committed Sep 23, 2024
1 parent 4cc383c commit a893098
Show file tree
Hide file tree
Showing 4 changed files with 6 additions and 2 deletions.
2 changes: 2 additions & 0 deletions internal/magic/binary.go
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,8 @@ var (
SWF = prefix([]byte("CWS"), []byte("FWS"), []byte("ZWS"))
// Torrent has bencoded text in the beginning.
Torrent = prefix([]byte("d8:announce"))
// PAR1 matches a parquet file.
Par1 = prefix([]byte{0x50, 0x41, 0x52, 0x31})
)

// Java bytecode and Mach-O binaries share the same magic number.
Expand Down
3 changes: 2 additions & 1 deletion supported_mimes.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
## 175 Supported MIME types
## 176 Supported MIME types
This file is automatically generated when running tests. Do not edit manually.

Extension | MIME type | Aliases
Expand Down Expand Up @@ -143,6 +143,7 @@ Extension | MIME type | Aliases
**.glb** | model/gltf-binary | -
**.cab** | application/x-installshield | -
**.jxr** | image/jxr | image/vnd.ms-photo
**.parquet** | application/vnd.apache.parquet | -
**.txt** | text/plain | -
**.html** | text/html | -
**.svg** | image/svg+xml | -
Expand Down
Binary file added testdata/parquet.parquet
Binary file not shown.
3 changes: 2 additions & 1 deletion tree.go
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@ var root = newMIME("application/octet-stream", "",
avi, flv, mkv, asf, aac, voc, m3u, rmvb, gzip, class, swf, crx, ttf, woff,
woff2, otf, ttc, eot, wasm, shx, dbf, dcm, rar, djvu, mobi, lit, bpg,
sqlite3, dwg, nes, lnk, macho, qcp, icns, hdr, mrc, mdb, accdb, zstd, cab,
rpm, xz, lzip, torrent, cpio, tzif, xcf, pat, gbr, glb, cabIS, jxr,
rpm, xz, lzip, torrent, cpio, tzif, xcf, pat, gbr, glb, cabIS, jxr, parquet,
// Keep text last because it is the slowest check.
text,
)
Expand Down Expand Up @@ -258,4 +258,5 @@ var (
xfdf = newMIME("application/vnd.adobe.xfdf", ".xfdf", magic.Xfdf)
glb = newMIME("model/gltf-binary", ".glb", magic.Glb)
jxr = newMIME("image/jxr", ".jxr", magic.Jxr).alias("image/vnd.ms-photo")
parquet = newMIME("application/vnd.apache.parquet", ".parquet", magic.Par1)
)

0 comments on commit a893098

Please sign in to comment.