Skip to content

qpdf-rs does not model QPDF's lifetimes well #2

Closed
@teythoon

Description

@teythoon

I'm writing a tool to collate pages from several PDFs. I found it incredibly easy to crash my program because the bindings do not model the lifetimes of references in QPDF very well. For example, using QPdf::read_from_memory requires that the buffer outlives the QPdf object, but that is not enforced using Rust's type system:

use qpdf::*;

fn main() {
    let usage = "crash <PDF>";
    let buf = std::fs::read(std::env::args().nth(1).expect(usage)).unwrap();

    // Works fine:
    let source = QPdf::read_from_memory(&buf).unwrap();
    let sink = QPdf::empty();
    sink.add_page(&source.get_pages().unwrap()[0], false).unwrap();

    // Segfaults:
    let source = QPdf::read_from_memory(buf).unwrap();
    let sink = QPdf::empty();
    sink.add_page(&source.get_pages().unwrap()[0], false).unwrap();
}

Further, I have found this problem while writing out the resulting PDF. QPDF's documentation mentions this (https://qpdf.readthedocs.io/en/latest/design.html#copying-objects-from-other-pdf-files), but even if I explicitly use this, I still need to keep the source buffer around, which again the Rust type system does not enforce:

use qpdf::*;

fn main() {
    let usage = "crash <PDF>";
    let buf = std::fs::read(std::env::args().nth(1).expect(usage)).unwrap();

    let source = QPdf::read_from_memory(&buf).unwrap();
    let sink = QPdf::empty();

    // I understand this to make a deep copy.
    let page = sink.copy_from_foreign(&source.get_pages().unwrap()[0]);

    // So we should be able to drop source here, and we can.
    drop(source);

    // But if we drop buf, we will segfault...
    drop(buf);

    sink.add_page(page, false).unwrap();

    sink.writer()
        .static_id(true)
        .force_pdf_version("1.7")
        .normalize_content(true)
        .preserve_unreferenced_objects(false)
        .object_stream_mode(ObjectStreamMode::Preserve)
        .compress_streams(false)
        .stream_data_mode(StreamDataMode::Preserve)
        // ... while writing the PDF here:
        .write_to_memory().unwrap();
}

Activity

ancwrd1

ancwrd1 commented on Jan 21, 2023

@ancwrd1
Owner

Nice catch, thanks! It's actually not mentioned in QPDF C bindings that the qpdf_read_memory expects the buffer to not be freed.
I think the best option here is to use the same lifetime for QPdf and the buffer, otherwise we need to make a copy and waste a memory.

ancwrd1

ancwrd1 commented on Jan 21, 2023

@ancwrd1
Owner

On a second thought making the buffer owned is probably a much better option in terms of ergonomics and usage.

ancwrd1

ancwrd1 commented on Jan 21, 2023

@ancwrd1
Owner

Regarding foreign objects: I think it's probably a bug in QPDF, I'll check it further and open an issue there if needed.

added a commit that references this issue on Jan 21, 2023
ancwrd1

ancwrd1 commented on Jan 21, 2023

@ancwrd1
Owner

Should be fixed in bb9dff0
Will publish version 0.1.5

teythoon

teythoon commented on Jan 23, 2023

@teythoon
Author

Sweet, thanks for the quick fixes!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions

      qpdf-rs does not model QPDF's lifetimes well · Issue #2 · ancwrd1/qpdf-rs