Browsertrix Crawler 1.x

Browsertrix Crawler is a standalone, browser-based, high-fidelity crawling system designed to run a complex, customizable crawl in a single Docker container. It uses Puppeteer to control one or more Brave Browser windows in parallel. Data is captured in the browser through the Chrome DevTools Protocol (CDP).
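
To make the "windows in parallel" idea concrete, here is a minimal toy sketch of several workers draining one shared queue of URLs. It is not Browsertrix Crawler code and does not use Puppeteer, Brave, or CDP: `fetch_page`, `window_worker`, and `crawl` are hypothetical names, and `fetch_page` is a stub that stands in for loading a page in a real browser window.

```python
import asyncio


async def fetch_page(url: str) -> str:
    """Hypothetical stand-in for loading and capturing a page in one window."""
    await asyncio.sleep(0)  # yield control, as a real page load would
    return f"captured:{url}"


async def window_worker(queue: asyncio.Queue, results: dict) -> None:
    """One simulated window: take URLs from the shared queue until it is empty."""
    while True:
        try:
            url = queue.get_nowait()
        except asyncio.QueueEmpty:
            return
        results[url] = await fetch_page(url)


async def crawl(urls: list[str], windows: int = 2) -> dict:
    """Run `windows` workers concurrently over one shared queue of URLs."""
    queue: asyncio.Queue = asyncio.Queue()
    for url in urls:
        queue.put_nowait(url)
    results: dict = {}
    await asyncio.gather(*(window_worker(queue, results) for _ in range(windows)))
    return results


if __name__ == "__main__":
    pages = ["https://example.com/a", "https://example.com/b"]
    print(asyncio.run(crawl(pages, windows=2)))
```

Because the workers share a single queue, each URL is handled by exactly one worker, and adding workers changes only how many pages are in flight at once. The real crawler's scheduling may differ; see the hosted documentation for its actual options.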

For information on how to use and develop Browsertrix Crawler, see the hosted Browsertrix Crawler documentation.

For information on how to build the docs locally, see the docs page.

Support

Initial support for the 0.x versions of Browsertrix Crawler was provided by Kiwix. The initial functionality was developed to support the zimit project in a collaboration between Webrecorder and Kiwix, and the project has since been split off from Zimit into a core component of Webrecorder.

Additional support for Browsertrix Crawler, including for the development of the 0.4.x version, has been provided by Portico.

License

AGPLv3 or later, see LICENSE for more details.
