⚡ A fast, async, stream-based link checker written in Rust.
Finds broken hyperlinks and mail addresses inside Markdown, HTML,
reStructuredText, or any other text file or website!
Available as a command-line utility, a library and a GitHub Action.
pacman -S lychee-link-checker
brew install lychee
docker pull lycheeverse/lychee
nix-env -iA nixos.lychee
pkg install lychee
pkg install lychee
We provide binaries for Linux, macOS, and Windows for every release.
You can download them from the releases page.
On APT/dpkg-based Linux distros (e.g. Debian, Ubuntu, Linux Mint and Kali Linux)
the following commands will install all required build dependencies, including
the Rust toolchain and cargo:
curl -sSf 'https://sh.rustup.rs' | sh
apt install gcc pkg-config libc6-dev libssl-dev
cargo install lychee
This comparison is made on a best-effort basis. Please create a PR to fix outdated information.
Feature | lychee | awesome_bot | muffet | broken-link-checker | linkinator | linkchecker | markdown-link-check | fink |
---|---|---|---|---|---|---|---|---|
Language | Rust | Ruby | Go | JS | TypeScript | Python | JS | PHP |
Async/Parallel | ||||||||
JSON output | 1 | |||||||
Static binary | ||||||||
Markdown files | ||||||||
HTML files | ||||||||
Text files | ||||||||
Website support | ||||||||
Chunked encodings | ||||||||
GZIP compression | ||||||||
Basic Auth | ||||||||
Custom user agent | ||||||||
Relative URLs | ||||||||
Skip relative URLs | ||||||||
Include patterns | ||||||||
Exclude patterns | ||||||||
Handle redirects | ||||||||
Ignore insecure SSL | ||||||||
File globbing | ||||||||
Limit scheme | ||||||||
Custom headers | ||||||||
Summary | ||||||||
HEAD requests | ||||||||
Colored output | ||||||||
Filter status code | ||||||||
Custom timeout | ||||||||
E-mail links | ||||||||
Progress bar | ||||||||
Retry and backoff | ||||||||
Skip private domains | ||||||||
Use as library | ||||||||
Quiet mode | ||||||||
Config file | ||||||||
Recursion | ||||||||
Amazing lychee logo | ||||||||
1 Other machine-readable formats like CSV are supported.
Recursively check all links in supported files inside the current directory:
lychee .
You can also specify various types of inputs:
# check links in specific local file(s):
lychee README.md
lychee test.html info.txt
# check links on a website:
lychee https://endler.dev
# check links in directory but block network requests
lychee --offline path/to/directory
# check links in a remote file:
lychee https://raw.githubusercontent.com/lycheeverse/lychee/master/README.md
# check links in local files via shell glob:
lychee ~/projects/*/README.md
# check links in local files (lychee supports advanced globbing and ~ expansion):
lychee "~/projects/big_project/**/README.*"
# ignore case when globbing and check result for each link:
lychee --glob-ignore-case --verbose "~/projects/**/[r]eadme.*"
# check links from epub file (requires atool: https://www.nongnu.org/atool)
acat -F zip {file.epub} "*.xhtml" "*.html" | lychee -
lychee parses other file formats as plaintext and extracts links using linkify. This generally works well if there are no format or encoding specifics, but in case you need dedicated support for a new file format, please consider creating an issue.
Here's how to mount a local directory into the container and check some input
with lychee. The --init parameter is passed so that lychee can be stopped
from the terminal. We also pass -it to start an interactive terminal, which
is required to show the progress bar.
docker run --init -it -v `pwd`:/input lycheeverse/lychee /input/README.md
To avoid getting rate-limited while checking GitHub links, you can optionally
set an environment variable with your GitHub token like so: GITHUB_TOKEN=xxxx,
or use the --github-token CLI option. It can also be set in the config file.
Here is an example config file.
The token can be generated in your GitHub account settings page. A personal token with no extra permissions is enough to check links to public repositories.
There is an extensive list of command-line parameters to customize the behavior. See below for a full list.
USAGE:
lychee [FLAGS] [OPTIONS] <inputs>...
FLAGS:
--cache Use request cache stored on disk at `.lycheecache`
--dump Don't perform any link checking. Instead, dump all the links extracted from inputs that
would be checked
-E, --exclude-all-private Exclude all private IPs from checking.
Equivalent to `--exclude-private --exclude-link-local --exclude-loopback`
--exclude-link-local Exclude link-local IP address range from checking
--exclude-loopback Exclude loopback IP address range and localhost from checking
--exclude-mail Exclude all mail addresses from checking
--exclude-private Exclude private IP address ranges from checking
--glob-ignore-case Ignore case when expanding filesystem path glob inputs
--help Prints help information
--include-verbatim Find links in verbatim sections like `pre`- and `code` blocks
-i, --insecure Proceed for server connections considered insecure (invalid TLS)
-n, --no-progress Do not show progress bar.
This is recommended for non-interactive shells (e.g. for continuous integration)
--offline Only check local files and block network requests
--require-https When HTTPS is available, treat HTTP links as errors
--skip-missing Skip missing input files (default is to error if they don't exist)
-V, --version Prints version information
-v, --verbose Verbose program output
OPTIONS:
-a, --accept <accept> Comma-separated list of accepted status codes for valid links
-b, --base <base> Base URL or website root directory to check relative URLs e.g.
https://example.com or `/path/to/public`
--basic-auth <basic-auth> Basic authentication support. E.g. `username:password`
-c, --config <config-file> Configuration file to use [default: ./lychee.toml]
--exclude <exclude>... Exclude URLs from checking (supports regex)
--exclude-file <exclude-file>... Deprecated; use `--exclude-path` instead
--exclude-path <exclude-path>... Exclude file path from getting checked
-f, --format <format> Output format of final status report (compact, detailed, json, markdown)
[default: compact]
--github-token <github-token> GitHub API token to use when checking github.com links, to avoid rate
limiting [env: GITHUB_TOKEN]
-h, --headers <headers>... Custom request headers
--include <include>... URLs to check (supports regex). Has preference over all excludes
--max-cache-age <max-cache-age> Discard all cached requests older than this duration [default: 1d]
--max-concurrency <max-concurrency> Maximum number of concurrent network requests [default: 128]
-m, --max-redirects <max-redirects> Maximum number of allowed redirects [default: 5]
--max-retries <max-retries> Maximum number of retries per request [default: 3]
-X, --method <method> Request method [default: get]
-o, --output <output> Output file of status report
--remap <remap>... Remap URI matching pattern to different URI
-r, --retry-wait-time <retry-wait-time> Minimum wait time in seconds between retries of failed requests [default:
1]
-s, --scheme <scheme>... Only test links with the given schemes (e.g. http and https)
-T, --threads <threads> Number of threads to utilize. Defaults to number of cores available to
the system
-t, --timeout <timeout> Website timeout in seconds from connect to response finished [default:
20]
-u, --user-agent <user-agent> User agent [default: lychee/0.10.1]
ARGS:
<inputs>... The inputs (where to get links to check from). These can be: files (e.g. `README.md`), file globs
(e.g. `"~/git/*/README.md"`), remote URLs (e.g. `https://example.com/README.md`) or standard
input (`-`). NOTE: Use `--` to separate inputs from options that allow multiple arguments
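The help text above says that `--exclude-all-private` is equivalent to combining `--exclude-private`, `--exclude-link-local` and `--exclude-loopback`. As a rough illustration of what those three ranges mean, here is a minimal sketch using only the Rust standard library. The helper name `is_excluded_by_all_private` is hypothetical, this is not lychee's actual implementation, and for IPv6 the sketch only covers the loopback address.

```rust
use std::net::IpAddr;

/// Returns true if `ip` falls into one of the three ranges that
/// `--exclude-all-private` is documented to cover.
/// Hypothetical helper for illustration; not taken from lychee's source.
fn is_excluded_by_all_private(ip: IpAddr) -> bool {
    match ip {
        // IPv4: private ranges, link-local range, and loopback range.
        IpAddr::V4(v4) => v4.is_private() || v4.is_link_local() || v4.is_loopback(),
        // IPv6: this sketch only handles the loopback address (::1).
        IpAddr::V6(v6) => v6.is_loopback(),
    }
}

fn main() {
    for candidate in ["192.168.1.10", "169.254.0.5", "127.0.0.1", "::1", "93.184.216.34"] {
        let ip: IpAddr = candidate.parse().expect("valid IP literal");
        println!("{candidate}: excluded = {}", is_excluded_by_all_private(ip));
    }
}
```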
- 0 for success (all links checked successfully or excluded/skipped as configured)
- 1 for missing inputs and any unexpected runtime failures or config errors
- 2 for link check failures (if any non-excluded link failed the check)
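If you call lychee from a script or another program, the exit codes listed above can be mapped to distinct outcomes. The following is a minimal sketch of such a mapping; the `LycheeOutcome` enum and `classify_exit_code` function are hypothetical names chosen for illustration and are not part of lychee or lychee_lib.

```rust
/// The three documented exit codes, plus a catch-all for anything else.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
enum LycheeOutcome {
    /// Exit code 0: all links checked successfully or excluded/skipped.
    Success,
    /// Exit code 1: missing inputs, unexpected runtime failures or config errors.
    RuntimeOrConfigError,
    /// Exit code 2: at least one non-excluded link failed the check.
    LinkCheckFailure,
    /// Any code that is not documented above.
    Unknown(i32),
}

fn classify_exit_code(code: i32) -> LycheeOutcome {
    match code {
        0 => LycheeOutcome::Success,
        1 => LycheeOutcome::RuntimeOrConfigError,
        2 => LycheeOutcome::LinkCheckFailure,
        other => LycheeOutcome::Unknown(other),
    }
}

fn main() {
    for code in [0, 1, 2, 42] {
        println!("{code} -> {:?}", classify_exit_code(code));
    }
}
```

A caller that spawns lychee with `std::process::Command` could obtain the code from `ExitStatus::code()`, which returns an `Option<i32>`. Separating code 1 from code 2 lets a CI job tell a broken setup apart from broken links.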
You can exclude links from getting checked by specifying regex patterns
with --exclude (e.g. --exclude example\.(com|org)).
If a file named .lycheeignore exists in the current working directory, its
contents are excluded as well. The file allows you to list multiple regular
expressions for exclusion (one pattern per line).
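To make the "one pattern per line" layout concrete, here is a minimal sketch of how the contents of such a file could be split into individual patterns. The function name `parse_ignore_patterns` is hypothetical, and trimming whitespace and skipping blank lines are assumptions of this sketch rather than documented lychee behavior. Compiling the patterns as regular expressions is left out, since that would require a regex library.

```rust
/// Splits the contents of an ignore file into one pattern per line.
/// Assumption of this sketch: surrounding whitespace is trimmed and
/// blank lines are skipped.
fn parse_ignore_patterns(contents: &str) -> Vec<String> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(String::from)
        .collect()
}

fn main() {
    // The first pattern is the --exclude example from the text;
    // the second one is made up for illustration.
    let contents = "example\\.(com|org)\n\n^https://localhost\n";
    for pattern in parse_ignore_patterns(contents) {
        println!("exclude pattern: {pattern}");
    }
}
```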
If the --cache flag is set, lychee will cache responses in a file called
.lycheecache in the current directory. If the file exists and the flag is set,
then the cache will be loaded on startup. This can greatly speed up future runs.
Note that by default lychee will not store any data on disk.
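The --max-cache-age option in the help text discards cached requests older than a given duration (default: 1d). The idea behind such an age check can be sketched as follows. The function `is_fresh` is a hypothetical helper, and treating timestamps from the future as stale is an assumption of this sketch; lychee's real cache logic may differ.

```rust
use std::time::{Duration, SystemTime};

/// A cached entry is reusable only if it is not older than `max_age`.
/// Hypothetical helper for illustration; not lychee's actual cache code.
fn is_fresh(stored_at: SystemTime, now: SystemTime, max_age: Duration) -> bool {
    match now.duration_since(stored_at) {
        Ok(age) => age <= max_age,
        // `stored_at` lies in the future (e.g. clock skew):
        // this sketch treats the entry as stale.
        Err(_) => false,
    }
}

fn main() {
    let one_day = Duration::from_secs(24 * 60 * 60);
    let now = SystemTime::now();
    let an_hour_ago = now - Duration::from_secs(60 * 60);
    let two_days_ago = now - Duration::from_secs(2 * 24 * 60 * 60);

    println!("entry from an hour ago is fresh: {}", is_fresh(an_hour_ago, now, one_day));
    println!("entry from two days ago is fresh: {}", is_fresh(two_days_ago, now, one_day));
}
```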
You can use lychee as a library for your own projects! Here is a "hello world" example:
use lychee_lib::Result;

#[tokio::main]
async fn main() -> Result<()> {
    let response = lychee_lib::check("https://github.com/lycheeverse/lychee").await?;
    println!("{response}");
    Ok(())
}
This is equivalent to the following snippet, in which we build our own client:
use lychee_lib::{ClientBuilder, Result};

#[tokio::main]
async fn main() -> Result<()> {
    // Build a client with default settings
    let client = ClientBuilder::default().client()?;
    let response = client.check("https://github.com/lycheeverse/lychee").await?;
    assert!(response.status().is_success());
    Ok(())
}
The client builder is very customizable:
let client = lychee_lib::ClientBuilder::builder()
    .includes(includes)
    .excludes(excludes)
    .max_redirects(cfg.max_redirects)
    .user_agent(cfg.user_agent)
    .allow_insecure(cfg.insecure)
    .custom_headers(headers)
    .method(method)
    .timeout(timeout)
    .github_token(cfg.github_token)
    .scheme(cfg.scheme)
    .accepted(accepted)
    .build()
    .client()?;
All options that you set will be used for all link checks. See the builder documentation for all options. For more information, check out the examples folder.
A GitHub Action that uses lychee is available as a separate repository: lycheeverse/lychee-action which includes usage instructions.
We'd be thankful for any contribution.
We try to keep the issue-tracker up-to-date so you can quickly find a task to work on.
Try one of these links to get started:
Lychee is written in Rust. Install rustup to get started. Begin by making sure the following commands succeed without errors.
cargo test # runs tests
cargo clippy # lints code
cargo install cargo-publish-all
cargo-publish-all --dry-run --yes # dry run release
Lychee makes heavy use of async code to be resource-friendly while still being performant. Async code can be difficult to troubleshoot with most tools, however. Therefore we provide experimental support for tokio-console. It provides a top(1)-like overview for async tasks!
If you want to give it a spin, download and start the console:
git clone https://github.com/tokio-rs/console
cd console
cargo run
Then run lychee with some special flags and features enabled.
RUSTFLAGS="--cfg tokio_unstable" cargo run --features tokio-console -- <input1> <input2> ...
If you find a way to make lychee faster, please do reach out.
We collect a list of common workarounds for various websites in our troubleshooting guide.
- https://github.com/opensearch-project/OpenSearch
- https://github.com/ramitsurana/awesome-kubernetes
- https://github.com/papers-we-love/papers-we-love
- https://github.com/pingcap/docs
- https://github.com/microsoft/WhatTheHack
- https://github.com/Azure/ResourceModules
- https://github.com/nix-community/awesome-nix
- https://github.com/balena-io/docs
- https://github.com/launchdarkly/LaunchDarkly-Docs
- https://github.com/pawroman/links
- https://github.com/analysis-tools-dev/static-analysis
- https://github.com/analysis-tools-dev/dynamic-analysis
- https://github.com/mre/idiomatic-rust
- https://github.com/lycheeverse/lychee (yes, the lychee docs are checked with lychee 🤯)
If you are using lychee for your project, please add it here.
The first prototype of lychee was built in episode 10 of Hello Rust. Thanks to all GitHub and Patreon sponsors for supporting the development since the beginning. Also, thanks to all the great contributors who have since made this project more mature.
lychee is licensed under either of
- Apache License, Version 2.0, (LICENSE-APACHE or https://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or https://opensource.org/licenses/MIT)
at your option.