pdf2video.py is a Python script that combines
- (selected pages of) a PDF presentation, and
- a text script
into a video narrated by the Amazon Polly text-to-speech engine. It can be used to generate, for instance, educational videos.
Please see this sample video, produced with the tool, for a short introduction.
Using pdf2video.py requires the following external tools and services:
- Python version 3.6 or later.
- The pdfinfo and pdftoppm command line tools provided in the poppler PDF rendering library. In Ubuntu Linux, you can install these with sudo apt-get install poppler-utils.
- The ffmpeg command line tool from the FFmpeg framework. In Ubuntu Linux, you can install it with sudo apt-get install ffmpeg.
- Access to Amazon Web Services.
- The AWS Command Line Interface configured with a profile that can access the Polly service. To use the neural voices (recommended for the best quality), remember to select a region in which they are supported.
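Before the first run, it can help to verify that the command line tools listed above are on the PATH. The following is a minimal sketch, not part of pdf2video.py; the function name missing_tools is hypothetical, and it assumes the AWS Command Line Interface is installed under its usual executable name aws. It does not check that the AWS profile can actually reach Polly.

```python
import shutil

# Command line tools named in the requirements list above.
# "aws" is assumed to be the executable name of the AWS CLI.
REQUIRED_TOOLS = ["pdfinfo", "pdftoppm", "ffmpeg", "aws"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required command line tools were found.")
```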
In the simplest case,
python3 pdf2video.py presentation.pdf script.txt video.mp4
converts the PDF file presentation.pdf and the script script.txt into the video video.mp4, narrated by the default voice (Amazon Polly standard voice Joanna in the current version).
The video includes SRT subtitles that can be displayed by most video players.
In addition, for HTML use, WebVTT subtitles are produced in a separate file as well.
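The two subtitle formats are close relatives: an SRT cue carries a running counter and uses a comma before the milliseconds, while a WebVTT cue uses a period and the file starts with a WEBVTT header line. The sketch below only illustrates that difference; it is not taken from pdf2video.py, and the function names are hypothetical.

```python
def format_timestamp(milliseconds, decimal_mark):
    """Format a time offset as HH:MM:SS<mark>mmm."""
    hours, rest = divmod(milliseconds, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return "%02d:%02d:%02d%s%03d" % (hours, minutes, seconds, decimal_mark, millis)


def srt_cue(index, start_ms, end_ms, text):
    """One SRT cue: counter, time range with comma decimals, then the text."""
    return "%d\n%s --> %s\n%s\n" % (
        index,
        format_timestamp(start_ms, ","),
        format_timestamp(end_ms, ","),
        text,
    )


def vtt_cue(start_ms, end_ms, text):
    """One WebVTT cue: time range with period decimals, then the text."""
    return "%s --> %s\n%s\n" % (
        format_timestamp(start_ms, "."),
        format_timestamp(end_ms, "."),
        text,
    )


if __name__ == "__main__":
    print(srt_cue(1, 0, 2500, "Hello and welcome."))
    # A WebVTT file begins with the line "WEBVTT" followed by a blank line.
    print("WEBVTT\n\n" + vtt_cue(0, 2500, "Hello and welcome."))
```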
The selected PDF pages as well as the narration voice can be changed easily. For instance, the sample video was produced with the command
python3 pdf2video.py sample.pdf sample.txt --pages "1,2,4-6" --voice Matthew --neural --conversational sample.mp4
All the options can be printed with python3 pdf2video.py --help.
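The --pages value in the sample command, "1,2,4-6", selects pages 1, 2, 4, 5, and 6. As an illustration of how such a selection can be expanded, here is a minimal sketch; it assumes only comma-separated page numbers and inclusive first-last ranges, and the function name parse_pages is hypothetical rather than the tool's own code.

```python
def parse_pages(spec):
    """Expand a page selection such as "1,2,4-6" into a list of page numbers."""
    pages = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # An inclusive range, e.g. "4-6" gives 4, 5, 6.
            first, last = part.split("-", 1)
            pages.extend(range(int(first), int(last) + 1))
        else:
            pages.append(int(part))
    return pages


if __name__ == "__main__":
    print(parse_pages("1,2,4-6"))
```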
The script file is formatted as follows.
The script for each presentation page starts with a line #page, and the text that follows, up to the next #page line, is the script for that page.
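To make the page structure concrete, the sketch below splits a script text into one string per page. It is an illustration under assumptions, not the tool's parser: it treats a line consisting solely of #page as the separator and ignores anything before the first such line. The function name split_script is hypothetical.

```python
def split_script(text):
    """Split script text into one string per #page section."""
    sections = []
    current = None
    for line in text.splitlines():
        if line.strip() == "#page":
            # A new page starts; collect its lines in a fresh list.
            current = []
            sections.append(current)
        elif current is not None:
            current.append(line)
    return ["\n".join(lines).strip() for lines in sections]


if __name__ == "__main__":
    example = "#page\nWelcome to the talk.\n#page\nThis is the second page.\n"
    for number, script in enumerate(split_script(example), start=1):
        print(number, repr(script))
```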
In the script text, one can use
- *text* to read text in an emphasized style,
- @xyz@ to spell xyz as characters,
- #high/text/ to use higher pitch for text,
- #low/text/ to use lower pitch for text,
- #n, where n is a positive integer, to have a pause of length n*100ms,
- #ph/word/pronunciation/ to speak word with the X-SAMPA pronunciation, and
- #sub/text/subtitle/ to use subtitle as the subtitle instead of the spoken text.
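Since the narration is produced by Amazon Polly, each of these markers has a natural counterpart in SSML. The sketch below shows one plausible mapping for the spoken side only; it is an assumption about how such a translation could look, not the code used by pdf2video.py. The names TOKEN, render, and to_ssml are hypothetical, nested markers are not handled, and the choice of emphasis level is arbitrary.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# One alternative per marker from the list above; matched in a single pass so
# that, for example, an "@" inside an X-SAMPA string is not read as @xyz@.
TOKEN = re.compile(
    r"#ph/(?P<ph_word>[^/]*)/(?P<ph>[^/]*)/"
    r"|#sub/(?P<sub_text>[^/]*)/(?P<sub_alt>[^/]*)/"
    r"|#(?P<pitch>high|low)/(?P<pitch_text>[^/]*)/"
    r"|#(?P<pause>\d+)"
    r"|\*(?P<emph>[^*]+)\*"
    r"|@(?P<spell>[^@]+)@"
)


def render(match):
    """Translate one matched marker into an SSML fragment."""
    g = match.groupdict()
    if g["ph"] is not None:
        return "<phoneme alphabet=\"x-sampa\" ph=%s>%s</phoneme>" % (
            quoteattr(g["ph"]), escape(g["ph_word"]))
    if g["sub_text"] is not None:
        # Only the spoken text goes to the speech engine; the subtitle
        # alternative would be used when writing the subtitle files.
        return escape(g["sub_text"])
    if g["pitch"] is not None:
        return '<prosody pitch="%s">%s</prosody>' % (g["pitch"], escape(g["pitch_text"]))
    if g["pause"] is not None:
        return '<break time="%dms"/>' % (int(g["pause"]) * 100)
    if g["emph"] is not None:
        return '<emphasis level="moderate">%s</emphasis>' % escape(g["emph"])
    return '<say-as interpret-as="characters">%s</say-as>' % escape(g["spell"])


def to_ssml(text):
    """Convert script text for one page into an SSML document."""
    parts = []
    position = 0
    for match in TOKEN.finditer(text):
        parts.append(escape(text[position:match.start()]))
        parts.append(render(match))
        position = match.end()
    parts.append(escape(text[position:]))
    return "<speak>" + "".join(parts) + "</speak>"


if __name__ == "__main__":
    print(to_ssml("Read *this* slowly, #5 then spell @AWS@."))
```

Handling all markers with one combined pattern keeps already translated spans from being scanned again, which matters because X-SAMPA itself uses characters such as @.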
Please see the file sample.txt for examples.
The pdf2video.py tool is released under the MIT License.