thombashi/pathvalidate

 
 

pathvalidate is a Python library for sanitizing and validating strings such as file names and file paths.

  • Sanitize/Validate a string as a:
    • file name
    • file path
  • Sanitizing will:
    • Remove invalid characters for a target platform
    • Replace reserved names for a target platform
    • Normalize
    • Remove unprintable characters
  • Argument validator/sanitizer for argparse and click
  • Multi platform support:
    • Linux
    • Windows
    • macOS
    • POSIX: POSIX-compliant systems (Linux, macOS, etc.)
    • universal: platform independent
  • Multibyte character support
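
The sanitization steps listed above can be pictured with a small standard-library sketch. This is not pathvalidate's implementation: the function name `toy_sanitize`, the character set, and the short reserved-name list are assumptions chosen for illustration, and appending an underscore to a reserved name is just one possible policy.

```python
import re

# Assumed, reduced rule set for illustration; the library's real tables are larger.
INVALID_CHARS = re.compile(r'[\\/:*?"<>|]')
RESERVED_NAMES = {"CON", "PRN", "AUX", "NUL", "COM1", "LPT1"}


def toy_sanitize(name: str) -> str:
    """Hypothetical sanitizer mirroring the listed steps."""
    # Step 1: drop unprintable characters such as NUL.
    printable = "".join(ch for ch in name if ch.isprintable())
    # Step 2: remove characters that are invalid on the assumed target platform.
    cleaned = INVALID_CHARS.sub("", printable)
    # Step 3: make a reserved name usable by appending an underscore.
    if cleaned.upper() in RESERVED_NAMES:
        cleaned += "_"
    return cleaned


print(toy_sanitize('fi:l*e/p"a?t>h|.t<xt'))  # filepath.txt
print(toy_sanitize("COM1"))                  # COM1_
```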

You can find this package's command line interface tool at the pathvalidate-cli repository.

Sample Code:
from pathvalidate import sanitize_filename

fname = "fi:l*e/p\"a?t>h|.t<xt"
print(f"{fname} -> {sanitize_filename(fname)}\n")

fname = "\0_a*b:c<d>e%f/(g)h+i_0.txt"
print(f"{fname} -> {sanitize_filename(fname)}\n")
Output:
fi:l*e/p"a?t>h|.t<xt -> filepath.txt

_a*b:c<d>e%f/(g)h+i_0.txt -> _abcde%f(g)h+i_0.txt

The default target platform is universal, i.e., the sanitized file name is valid on any platform.
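
One way to read "universal" is as the strictest combination of all per-platform rules. The sketch below illustrates that idea with standard-library code only; the character tables and the name `toy_is_valid` are assumptions for illustration, not the library's actual data or API.

```python
# Assumed, reduced tables of characters that cannot appear in a file name.
INVALID_CHARS = {
    "posix": set("/\0"),
    "windows": set('\\/:*?"<>|\0'),
}
# "universal" rejects anything that any single platform rejects.
INVALID_CHARS["universal"] = set().union(*INVALID_CHARS.values())


def toy_is_valid(name: str, platform: str = "universal") -> bool:
    """Hypothetical check: True if the name has no invalid character for the platform."""
    return not (set(name) & INVALID_CHARS[platform])


print(toy_is_valid("a:b", platform="posix"))  # True
print(toy_is_valid("a:b"))                    # False
```

Under these assumed tables, a name that passes the universal check also passes every per-platform check, which is the property the default is meant to give.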

Sample Code:
from pathvalidate import sanitize_filepath

fpath = "fi:l*e/p\"a?t>h|.t<xt"
print(f"{fpath} -> {sanitize_filepath(fpath)}\n")

fpath = "\0_a*b:c<d>e%f/(g)h+i_0.txt"
print(f"{fpath} -> {sanitize_filepath(fpath)}\n")
Output:
fi:l*e/p"a?t>h|.t<xt -> file/path.txt

_a*b:c<d>e%f/(g)h+i_0.txt -> _abcde%f/(g)h+i_0.txt
Sample Code:
import sys
from pathvalidate import ValidationError, validate_filename

try:
    validate_filename("fi:l*e/p\"a?t>h|.t<xt")
except ValidationError as e:
    print(f"{e}\n", file=sys.stderr)

try:
    validate_filename("COM1")
except ValidationError as e:
    print(f"{e}\n", file=sys.stderr)
Output:
[PV1100] invalid characters found: platform=universal, description=invalids=('/'), value='fi:l*e/p"a?t>h|.t<xt'

[PV1002] found a reserved name by a platform: 'COM1' is a reserved name, platform=universal, reusable_name=False
Sample Code:
from pathvalidate import is_valid_filename, sanitize_filename

fname = "fi:l*e/p\"a?t>h|.t<xt"
print(f"is_valid_filename('{fname}') return {is_valid_filename(fname)}\n")

sanitized_fname = sanitize_filename(fname)
print(f"is_valid_filename('{sanitized_fname}') return {is_valid_filename(sanitized_fname)}\n")
Output:
is_valid_filename('fi:l*e/p"a?t>h|.t<xt') return False

is_valid_filename('filepath.txt') return True
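
A common way to combine the two calls is to accept valid names unchanged and sanitize only when validation fails. The helper below is a sketch of that pattern; `ensure_valid` is a hypothetical name, and the checker and sanitizer are passed in as parameters so the block runs with the standard library alone. The two stand-in functions are assumptions, not pathvalidate code.

```python
from typing import Callable


def ensure_valid(
    name: str,
    is_valid: Callable[[str], bool],
    sanitize: Callable[[str], str],
) -> str:
    """Return name unchanged if valid, otherwise its sanitized form."""
    if is_valid(name):
        return name
    cleaned = sanitize(name)
    if not is_valid(cleaned):
        # Sanitizing can leave nothing usable, e.g. an empty string.
        raise ValueError(f"cannot derive a valid name from {name!r}")
    return cleaned


# Stand-ins for illustration only (assumed rules: no "/" and not empty).
def toy_is_valid(name: str) -> bool:
    return bool(name) and "/" not in name


def toy_sanitize(name: str) -> str:
    return name.replace("/", "")


print(ensure_valid("a/b", toy_is_valid, toy_sanitize))  # ab
```

With pathvalidate installed, passing is_valid_filename and sanitize_filename in place of the stand-ins should give the same flow.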
Sample Code:
from argparse import ArgumentParser

from pathvalidate.argparse import validate_filename_arg, validate_filepath_arg

parser = ArgumentParser()
parser.add_argument("--filename", type=validate_filename_arg)
parser.add_argument("--filepath", type=validate_filepath_arg)
options = parser.parse_args()

if options.filename:
    print(f"filename: {options.filename}")

if options.filepath:
    print(f"filepath: {options.filepath}")
Output:
$ ./examples/argparse_validate.py --filename eg
filename: eg
$ ./examples/argparse_validate.py --filename e?g
usage: argparse_validate.py [-h] [--filename FILENAME] [--filepath FILEPATH]
argparse_validate.py: error: argument --filename: [PV1100] invalid characters found: invalids=('?'), value='e?g', platform=Windows

Note

validate_filepath_arg treats the platform as "auto" when the input is an absolute file path.
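
For readers unfamiliar with how argparse `type=` callables report errors, the sketch below shows the general mechanism using only the standard library. `reject_invalid_chars` and its character set are hypothetical stand-ins and are not how validate_filename_arg is implemented.

```python
import argparse

# Assumed character set for illustration only.
INVALID_CHARS = set('\\/:*?"<>|')


def reject_invalid_chars(value: str) -> str:
    """Hypothetical argparse type callable that validates its input."""
    found = sorted(set(value) & INVALID_CHARS)
    if found:
        # argparse turns this exception into a usage error for the user.
        raise argparse.ArgumentTypeError(f"invalid characters found: {found}")
    return value


parser = argparse.ArgumentParser()
parser.add_argument("--filename", type=reject_invalid_chars)

# Parse an explicit argument list instead of sys.argv to keep the sketch self-contained.
options = parser.parse_args(["--filename", "eg"])
print(f"filename: {options.filename}")  # filename: eg
```

Passing `e?g` instead would make argparse print the usage text and exit with an error, similar in shape to the output above.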

Sample Code:
from argparse import ArgumentParser

from pathvalidate.argparse import sanitize_filename_arg, sanitize_filepath_arg


parser = ArgumentParser()
parser.add_argument("--filename", type=sanitize_filename_arg)
parser.add_argument("--filepath", type=sanitize_filepath_arg)
options = parser.parse_args()

if options.filename:
    print(f"filename: {options.filename}")

if options.filepath:
    print(f"filepath: {options.filepath}")
Output:
$ ./examples/argparse_sanitize.py --filename e/g
filename: eg

Note

sanitize_filepath_arg sets the platform to "auto".

Sample Code:
import click

from pathvalidate.click import validate_filename_arg, validate_filepath_arg


@click.command()
@click.option("--filename", callback=validate_filename_arg)
@click.option("--filepath", callback=validate_filepath_arg)
def cli(filename: str, filepath: str) -> None:
    if filename:
        click.echo(f"filename: {filename}")
    if filepath:
        click.echo(f"filepath: {filepath}")


if __name__ == "__main__":
    cli()
Output:
$ ./examples/click_validate.py --filename ab
filename: ab
$ ./examples/click_validate.py --filename e?g
Usage: click_validate.py [OPTIONS]
Try 'click_validate.py --help' for help.

Error: Invalid value for '--filename': [PV1100] invalid characters found: invalids=('?'), value='e?g', platform=Windows
Sample Code:
import click

from pathvalidate.click import sanitize_filename_arg, sanitize_filepath_arg


@click.command()
@click.option("--filename", callback=sanitize_filename_arg)
@click.option("--filepath", callback=sanitize_filepath_arg)
def cli(filename: str, filepath: str) -> None:
    if filename:
        click.echo(f"filename: {filename}")
    if filepath:
        click.echo(f"filepath: {filepath}")


if __name__ == "__main__":
    cli()
Output:
$ ./examples/click_sanitize.py --filename a/b
filename: ab

More examples can be found at https://pathvalidate.rtfd.io/en/latest/pages/examples/index.html

Installation: pip
pip install pathvalidate

Installation: conda
conda install conda-forge::pathvalidate

Installation: apt
sudo add-apt-repository ppa:thombashi/ppa
sudo apt update
sudo apt install python3-pathvalidate

Python 3.9+, with no external dependencies.

https://pathvalidate.rtfd.io/

  • ex-sponsor: Charles Becker (chasbecker)
  • ex-sponsor: 時雨堂 (shiguredo)
  • onetime: Dmitry Belyaev (b4tman)
  • onetime: Arturi0
  • onetime: GitHub (github)
