Third party libraries
- Introduction
- Unmanaged development dependencies
- Managed dependencies
- Security
- Upgrade dependencies
- Add dependencies
At Oppia we have a lot of dependencies, and we have many different ways of managing them. Some of these dependencies are required for our production application, while others are only used for development. Below, we will discuss how all these dependencies are managed.
Oppia has some development dependencies that the user has to manage themselves. These are:
- Python 3.9
- Google Chrome
- git
- Tools commonly found on Linux and macOS systems, like bash
In our setup instructions, we walk developers through installing these.
The rest of Oppia's development and production dependencies are managed by the Oppia code. These dependencies are installed to the following places:
- oppia_tools/ stores development tools that aren't used in production. This folder is a sibling of the repository root (oppia/), so if you are in the root of the repository, this folder is at ../oppia_tools/. However, we'll use oppia_tools/ here for simplicity.
- third_party/python_libs stores our production Python dependencies.
- third_party/static stores the production Node.js dependencies that aren't installed from npm. These dependencies are downloaded and compiled by our own code.
- node_modules stores the production Node.js dependencies that we install from npm. On developer machines, it also stores development dependencies. These dependencies are downloaded by yarn and compiled by webpack.
- third_party/ also stores some other production dependencies like protobuf files.
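To make the layout above concrete, here is a minimal sketch of how these locations could be resolved from the repository root. The function name managed_dependency_dirs and the dictionary keys are hypothetical and are not taken from Oppia's scripts; only the directory names come from the list above.

```python
import os


def managed_dependency_dirs(repo_root):
    """Return the install locations described above for a given repo root.

    Hypothetical helper for illustration; not part of Oppia's scripts.
    """
    repo_root = os.path.abspath(repo_root)
    third_party = os.path.join(repo_root, 'third_party')
    return {
        # Sibling of the repository root, i.e. ../oppia_tools/.
        'dev_tools': os.path.join(os.path.dirname(repo_root), 'oppia_tools'),
        'prod_python': os.path.join(third_party, 'python_libs'),
        'prod_frontend_non_npm': os.path.join(third_party, 'static'),
        'npm_modules': os.path.join(repo_root, 'node_modules'),
        'other_prod': third_party,
    }


# Example with a made-up checkout location.
for label, path in managed_dependency_dirs('/home/dev/opensource/oppia').items():
    print(label, '->', path)
```

Note that only the development tools live outside the repository; every other location sits under the repository root.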
When you run python -m scripts.start, a chain of scripts installs and/or upgrades dependencies as necessary:
flowchart TD
first("start.py") -->|"calls install_third_party_libs.main() when start.py is executed or imported"| ITPL("install_third_party_libs.py")
ITPL -->|"main() calls install_third_party.main()"|ITP("install_third_party.py")
ITPL -->|"calls install_python_dev_dependencies.main() when executed or imported"| IPDD("install_python_dev_dependencies.py")
ITP -->|"main() calls install_python_prod_dependencies.main()"| IPPD("install_python_prod_dependencies.py")
We'll look at each of these scripts below.
When you start a development server, you execute scripts/start.py. This file does not itself install any dependencies, but it contains a call to install_third_party_libs.main() that will be executed whenever start.py is executed, or even imported.
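The "or even imported" behavior follows from where the call sits in the file. The sketch below is not Oppia's code: it builds a throwaway module (the name fake_start and its contents are made up) whose installer call is at module level, and shows that merely importing it triggers the call.

```python
import importlib.util
import os
import tempfile

# Source of a hypothetical module that, like start.py, calls an installer
# at module level instead of inside an `if __name__ == '__main__':` guard.
MODULE_SOURCE = '''
CALLS = []

def install():
    CALLS.append('installed')

install()  # Module-level call: runs on execution AND on import.
'''


def import_from_source(source, name='fake_start'):
    """Write source to a temporary file and import it as a module."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, name + '.py')
        with open(path, 'w', encoding='utf-8') as f:
            f.write(source)
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        return module


fake_start = import_from_source(MODULE_SOURCE)
print(fake_start.CALLS)  # prints ['installed']
```

If the call were placed under an `if __name__ == '__main__':` guard instead, the import would leave CALLS empty.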
Whenever install_third_party_libs.py is executed or imported, it calls install_python_dev_dependencies.main() to install Python development dependencies to the active virtual environment. These dependencies are listed in requirements_dev.txt, which install_python_dev_dependencies.py asserts matches what would be produced by compiling requirements_dev.in. If this assertion fails, the script still performs the compilation, but it then exits with an error to make sure you commit the updated requirements_dev.txt file.
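The check-then-fail pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the code in install_python_dev_dependencies.py: the function name and its three callable parameters are hypothetical stand-ins for reading requirements_dev.txt, compiling requirements_dev.in, and writing the result.

```python
def check_compiled_requirements(
        read_compiled, compile_requirements, write_compiled):
    """Fail if the committed compiled requirements are stale.

    All three arguments are hypothetical callables standing in for file
    I/O and for the real compilation step.
    """
    expected = compile_requirements()
    if read_compiled() == expected:
        return
    # Refresh the file first so the developer only has to commit it.
    write_compiled(expected)
    # Then fail, so the stale file cannot go unnoticed.
    raise RuntimeError(
        'requirements_dev.txt was out of date and has been regenerated; '
        'please commit the updated file.')


# Usage with an in-memory stand-in for the file on disk.
store = {'txt': 'example-package==1.0.0\n'}
try:
    check_compiled_requirements(
        lambda: store['txt'],
        lambda: 'example-package==1.0.1\n',
        lambda content: store.update(txt=content))
except RuntimeError as error:
    print(error)
```

Regenerating before failing means a second run succeeds without further action, while the non-zero exit still draws attention to the uncommitted change.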
install_third_party_libs.py also installs the frontend dependencies from package.json using yarn. The versions of all these Node.js modules are pinned by yarn.lock, and they get installed to node_modules/.
We also install the protoc and buf tools for protobuf (these versions are pinned). By the time we install those tools, we've already called install_third_party.main(), so we have downloaded the proto files. We use buf to generate code from those proto files (i.e. compile the proto files).
We use a dependencies.json file to specify other dependencies that we have but which aren't Python packages or Node.js modules. This file has a dependencies key, under which it has the following keys:
- proto: Files here are installed to third_party/
- frontend: Files here are installed to third_party/static/
- oppiaTools: Files here are installed to oppia_tools/
Under each of these keys is a collection of key-value pairs where each key is a dependency name and each value is an object describing how to download and install the dependency. We support three types of dependencies:
- Zip archives of files. The zip archive must expand to a single file or folder. The following keys are required in the object describing this dependency:
  - version: The dependency's version string.
  - url: The full URL to the zip archive.
  - downloadFormat: Must be set to zip.

  The object must specify one of the following keys:
  - rootDir: The base name of the expanded file or folder.
  - rootDirPrefix: The prefix of the base name of the expanded file or folder. The prefix concatenated with the version string should produce the full base name.

  The object must also specify one of the following keys:
  - targetDir: The base name to which the expanded file or folder should be renamed.
  - targetDirPrefix: The prefix of the base name to which the expanded file or folder should be renamed. The prefix concatenated with the version string should produce the full base name.
- Collections of files. The following keys are required in the object describing this dependency:
  - version: The dependency's version string.
  - url: The prefix of the URL to the files.
  - downloadFormat: Must be set to files.
  - files: List of the files to download. Each file will be downloaded from the URL formed by joining url with the item in files, using a slash (/) as a delimiter. Files are not renamed after they are downloaded.
  - targetDirPrefix: A string which, when suffixed with version, yields the base name of the folder to which each file will be downloaded.
- Tar archives of files. These work just like zip archives, except that there are no optional keys. Instead of the rootDir, rootDirPrefix, targetDir, and targetDirPrefix keys, these two keys are required:
  - tarRootDirPrefix: Same as rootDirPrefix for zip files.
  - targetDirPrefix: Same as targetDirPrefix for zip files.
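The naming rules above can be illustrated with a short sketch. The helper functions and both example entries below are hypothetical (the names, versions, and example.com URLs are made up); only the key names and the concatenation rules come from the description above, and this is not how Oppia's installer is actually written.

```python
def root_dir_name(entry):
    """Return the base name an archive is expected to expand to."""
    if 'tarRootDirPrefix' in entry:
        return entry['tarRootDirPrefix'] + entry['version']
    if 'rootDir' in entry:
        return entry['rootDir']
    return entry['rootDirPrefix'] + entry['version']


def target_dir_name(entry):
    """Return the base name a dependency is installed under."""
    if 'targetDir' in entry:
        return entry['targetDir']
    return entry['targetDirPrefix'] + entry['version']


def file_urls(entry):
    """Return the download URL of each file in a 'files' dependency."""
    return [entry['url'] + '/' + name for name in entry['files']]


# Made-up entries in the shape described above.
EXAMPLE_ZIP = {
    'version': '1.2.3',
    'url': 'https://example.com/example-lib/archive/1.2.3.zip',
    'downloadFormat': 'zip',
    'rootDirPrefix': 'example-lib-',
    'targetDirPrefix': 'example-lib-',
}
EXAMPLE_FILES = {
    'version': '4.5.6',
    'url': 'https://example.com/example-files/4.5.6',
    'downloadFormat': 'files',
    'files': ['example.min.js', 'example.min.css'],
    'targetDirPrefix': 'example-files-',
}

print(root_dir_name(EXAMPLE_ZIP), '->', target_dir_name(EXAMPLE_ZIP))
print(file_urls(EXAMPLE_FILES))
```

Deriving folder names from a prefix plus the version means that bumping the version string is usually the only change an upgrade needs.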
New dependencies should not be added to dependencies.json, as we are trying to remove this method of installing dependencies. Instead, you should use the node modules method.
We download and install the Redis CLI and Elasticsearch development server. We pin the versions of these dependencies that we install, but we don't automatically upgrade old versions. Specifically, we try to execute the Redis and Elasticsearch binaries. If the binaries execute successfully, then we don't install anything.
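The "install only if the binary does not run" check can be sketched as below. The function names are hypothetical and the real scripts may structure this differently; the sketch only shows the general technique of probing a binary with subprocess and skipping installation when the probe succeeds.

```python
import subprocess
import sys


def binary_runs(command):
    """Return True if the command starts and exits with status 0."""
    try:
        subprocess.run(
            command, check=True,
            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    except (OSError, subprocess.CalledProcessError):
        return False
    return True


def install_if_missing(check_command, install):
    """Call install() only when check_command cannot be executed.

    Returns True if an installation was performed.
    """
    if binary_runs(check_command):
        # An existing binary is left alone, even if it is an old version.
        return False
    install()
    return True


# The running Python interpreter stands in for an already-installed binary.
print(install_if_missing([sys.executable, '-c', 'pass'], lambda: None))
```

Because the probe only tests that the binary executes, an older pinned version already on disk passes the check, which is why old versions are not upgraded automatically.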
install_python_prod_dependencies.py uses pip to install the Python dependencies we need in production to third_party/python_libs. We define these dependencies using requirements.in, which lists the libraries we depend on directly. Then install_backend_python_libs.py generates requirements.txt, which lists those direct dependencies, plus all the packages that our direct dependencies need, and so on. Both requirements.in and requirements.txt specify versions, so requirements.txt is analogous to yarn.lock in that it pins all the versions of all the Python packages we use in production.
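As an illustration of what "pins all the versions" means, the sketch below flags requirement lines that are not pinned with ==. It is a hypothetical helper, not part of Oppia's scripts, and it assumes a simple file layout: one requirement per line, optional # comments, and option lines (such as hash flags) that start with a dash.

```python
import re

# A package name, optional extras in brackets, then '==' and a version.
_PINNED = re.compile(
    r'^[A-Za-z0-9][A-Za-z0-9._-]*(\[[A-Za-z0-9._,-]+\])?==[A-Za-z0-9.+!_-]+$')


def unpinned_requirements(lines):
    """Return the requirement lines that are not pinned with '=='."""
    unpinned = []
    for raw_line in lines:
        # Drop comments, line continuations, and surrounding whitespace.
        line = raw_line.split('#', 1)[0].strip()
        line = line.rstrip('\\').strip()
        if not line or line.startswith('-'):
            # Skip blank lines and option lines such as hash flags.
            continue
        # Ignore any environment marker after a semicolon.
        requirement = line.split(';', 1)[0].strip()
        if not _PINNED.match(requirement):
            unpinned.append(requirement)
    return unpinned


EXAMPLE_LINES = [
    '# Made-up requirements for illustration.',
    'my_dependency==1.0.5',
    'other_dependency>=2.0',
    'third_dependency',
]
print(unpinned_requirements(EXAMPLE_LINES))
```

A check like this could be run against both the .in and the .txt file, since both are expected to specify exact versions.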
All dependencies we add to our projects introduce risks. These include security risks (an attacker could inject malicious code into Oppia if they control one of our dependencies) and maintainability risks (if a dependency isn't well-maintained, its bugs could break Oppia). To mitigate these risks, we should do the following:
- Minimize the number of dependencies we have and the extent to which we rely on them. Especially for small dependencies, it may be better to implement them ourselves. That said, for security-related operations (especially cryptographic ones), relying on a trusted library may be better than rolling our own and possibly making mistakes.
- Vet any dependencies we do have. This means checking that we trust the maintainer to refrain from adding malicious code, to maintain security measures that stop someone else from adding malicious code, and to keep the dependency healthy by fixing bugs. For small dependencies, this may mean reviewing their code manually. For larger ones, we may have to rely on reputation.
- Pin dependencies to a specific, immutable version. Here are some examples of types of dependencies we use often:
  - PyPI: Use == to specify a particular version number, e.g. my_dependency==1.0.5.
    Warning: Do not use >=, which tells pip to accept any later version, so a newly published release could be installed without our involvement.
  - NPM: Use yarn.lock to pin dependency versions (this happens automatically for all NPM dependencies handled by yarn).
  - GitHub Actions: Use a commit hash ("SHA value" in the docs) when specifying a third-party dependency. For example:
        - uses: actions/javascript-action@172239021f7ba04fe7327647b213799853a9eb89
- Hash dependencies using a hashing algorithm approved by NIST and compare that hash to a list of the values we expect. In most cases, we should take advantage of package managers' built-in functionality to verify checksums. This ensures that even if someone manages to change the code of a supposedly immutable dependency version (for example, there are ways to do this in PyPI where packages can be provided either as source code or as binaries), we won't install the malicious dependency.
- Upgrade dependencies every month to ensure we benefit from any security fixes.
Ideally, we'd also constrain dependencies to limit the damage they can do, but that isn't really supported by any of the dependency ecosystems we use yet. Also note that not all of these measures are in place yet. Work on implementing them is tracked in oppia/oppia#16991.
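Where a package manager does not verify checksums for us, the hash comparison from the list above could look like the following sketch. It uses SHA-256, which is among the NIST-approved hash functions; the function names are hypothetical, and in practice the expected value would come from a trusted, committed list rather than being computed on the spot as in the demonstration.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha256(path, expected_hex):
    """Raise ValueError if the file does not have the expected digest."""
    actual = sha256_of_file(path)
    if not hmac.compare_digest(actual, expected_hex.lower()):
        raise ValueError('SHA-256 mismatch for %s' % path)


# Demonstration with a throwaway file standing in for a downloaded archive.
with tempfile.TemporaryDirectory() as tmp_dir:
    archive_path = os.path.join(tmp_dir, 'example.zip')
    with open(archive_path, 'wb') as f:
        f.write(b'example archive contents')
    # Computed here only for the demo; normally this is a stored constant.
    expected = hashlib.sha256(b'example archive contents').hexdigest()
    verify_sha256(archive_path, expected)
    print('digest verified')
```

The verification should run before the archive is unpacked or executed, so that a tampered download is rejected without ever being used.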
Before upgrading any backend dependencies, you should check for any breaking changes between the currently installed version and the new version you want to upgrade the library to. You should:
- Check the library's changelog for breaking changes.
- Important: Test that everything works correctly after you install the upgraded version (see below).
Note that we don't have an automatic way to detect outdated backend dependencies, so to upgrade all outdated backend dependencies, you have to look up each library online to find its latest version and compare that to the version currently specified in our code. We explain where our code specifies version numbers below.
A production library (these are listed in requirements.in) can be upgraded as follows:
- Update the library's version number in requirements.in.
- Run scripts/start.py to update requirements.txt based on the new requirements.in.
A development library (these are listed in requirements_dev.in) can be upgraded as follows:
- Update the library's version number in requirements_dev.in.
- Run scripts/start.py to update requirements_dev.txt based on the new requirements_dev.in.
Backend libraries that are not installed using pip also have their versions specified in install_third_party_libs.py, but they use custom code for each library. You should read the code to find the section for your library.
You can update all frontend libraries that we install from npm as follows. Note that <yarn version> specifies the currently-installed version of yarn (there should only be one version in oppia_tools). Also, all commands should be run from the repository root.
- Run ../oppia_tools/yarn-<yarn version>/bin/yarn upgrade to upgrade libraries that should not have any breaking changes.
- Run ../oppia_tools/yarn-<yarn version>/bin/yarn outdated to show outdated libraries whose newer versions might have breaking changes.
- Check for breaking changes in the outdated libraries from the previous step. You should:
  - Check the library's changelog for breaking changes.
  - Important: Test that everything works correctly after you install the upgraded version.
- Manually update the versions in package.json for all packages you decide to upgrade.
- Run ../oppia_tools/yarn-<yarn version>/bin/yarn install to install the versions you selected.
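Since the yarn path in the steps above depends on the installed version, a small helper can resolve it instead of typing the version by hand. The sketch below is hypothetical (the function names are not from Oppia's scripts, and yarn-1.0.0 is a made-up version); it relies only on the stated layout, where exactly one yarn-<yarn version> folder exists in oppia_tools.

```python
import glob
import os
import tempfile


def find_yarn_binary(oppia_tools_dir):
    """Return the path to the single yarn binary under oppia_tools_dir."""
    pattern = os.path.join(oppia_tools_dir, 'yarn-*', 'bin', 'yarn')
    candidates = sorted(glob.glob(pattern))
    if len(candidates) != 1:
        raise RuntimeError(
            'Expected exactly one yarn installation, found %d'
            % len(candidates))
    return candidates[0]


def yarn_command(oppia_tools_dir, *args):
    """Build the argument list for a yarn subcommand such as 'outdated'."""
    return [find_yarn_binary(oppia_tools_dir)] + list(args)


# Demonstration against a throwaway directory with a made-up yarn version.
with tempfile.TemporaryDirectory() as tools_dir:
    bin_dir = os.path.join(tools_dir, 'yarn-1.0.0', 'bin')
    os.makedirs(bin_dir)
    open(os.path.join(bin_dir, 'yarn'), 'w').close()
    print(yarn_command(tools_dir, 'outdated'))
```

The resulting list could be passed to subprocess.run from the repository root; failing loudly when zero or several yarn folders exist avoids silently picking the wrong version.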
Other frontend libraries are specified in dependencies.json. You can upgrade these libraries as follows:
- Check for breaking changes in the libraries you want to upgrade. You should:
  - Check the library's changelog for breaking changes.
  - Test that after you install the upgraded version (see below), everything works correctly.
- Change the version in dependencies.json to the new version you want to install.
Note that we don't have an automatic way to detect outdated libraries in dependencies.json, so to identify outdated dependencies, you have to look up each library online to find its latest version and compare that to the version specified in dependencies.json.
Note that all dependencies must be compatible with our Apache 2 license. All additions must also be approved by @vojtechjelinek.
Note that if a library is needed both for development and in production, then it should be added according to both of the following sets of instructions.
Here's how to add a backend, production library that can be installed from pip:
- Add the library to requirements.in in the form <package-name>==<version>.
- Generate requirements.txt from requirements.in by running scripts/start.py.
Here's how to add a backend, development library that can be installed from pip:
- Add the library to requirements_dev.in in the form <package-name>==<version>.
- Generate requirements_dev.txt from requirements_dev.in by running scripts/start.py.
If a library cannot be installed from pip, you'll have to add custom code to scripts/install_third_party_libs.py to handle installation.
If the library is available from npm, you can install it like this:
- Add it to package.json. If the dependency is needed in production, add it under "dependencies". Otherwise, add it under "devDependencies". If a dependency is needed in both production and for development, only add it under "dependencies".
- Run ../oppia_tools/yarn-<yarn version>/bin/yarn install from the repository root. Note that <yarn version> specifies the version of yarn.
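The rule for choosing between "dependencies" and "devDependencies" can be expressed as a small sketch. The helper below is hypothetical and operates on an in-memory dictionary rather than the real package.json; the package names and versions are made up.

```python
import copy


def add_npm_dependency(package_json, name, version, needed_in_production):
    """Return a copy of package_json with the dependency added.

    A library needed in production (even if it is also needed for
    development) goes under "dependencies"; anything else goes under
    "devDependencies".
    """
    updated = copy.deepcopy(package_json)
    section = 'dependencies' if needed_in_production else 'devDependencies'
    updated.setdefault(section, {})[name] = version
    return updated


# Made-up package.json contents for illustration.
package_json = {'dependencies': {'example-runtime-lib': '1.0.0'}}
package_json = add_npm_dependency(
    package_json, 'example-test-tool', '2.3.4', needed_in_production=False)
print(package_json)
```

After editing package.json this way, running yarn install is still required so that yarn.lock records the pinned version.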
If your library is not available from npm, you can add it to dependencies.json. However, you should check with @vojtechjelinek first as we are trying to move away from dependencies.json.