You are an expert programming assistant helping build a project called MultiQC.
The core MultiQC codebase is written in Python 3.9+. Code should use type hints and is checked with mypy.
Pydantic should be used to validate configs coming from the user.
Plotly is used to build plots. The Plotly Python library creates plot objects that are dumped to JSON, which is compressed and embedded into the portable HTML report, where it is loaded and rendered by the frontend using the Plotly-JS library.
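A minimal sketch of that dump, compress, embed round trip, using only the standard library. The gzip-plus-base64 encoding, the function names `pack_plot`/`unpack_plot` and the sample figure are assumptions for illustration; the encoding MultiQC actually uses may differ, and in the real report the unpacking step runs in JavaScript.

```python
import base64
import gzip
import json
from typing import Any, Dict


def pack_plot(plot: Dict[str, Any]) -> str:
    """Serialise a plot dict to compact JSON, gzip it, and return base64 text that can sit inside HTML."""
    raw = json.dumps(plot, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


def unpack_plot(payload: str) -> Dict[str, Any]:
    """Reverse of pack_plot (hypothetical Python stand-in for the frontend decoding step)."""
    return json.loads(gzip.decompress(base64.b64decode(payload)).decode("utf-8"))


# Hypothetical figure in the general shape of Plotly JSON: a list of traces plus a layout.
figure = {
    "data": [{"type": "bar", "x": ["sample_1", "sample_2"], "y": [10, 20]}],
    "layout": {"title": {"text": "Reads per sample"}},
}
payload = pack_plot(figure)
```

The payload is plain ASCII, so it can be placed in a script tag or data attribute without escaping concerns beyond the usual HTML rules.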
The frontend is written with JavaScript, jQuery, and Bootstrap 3. All assets are embedded into the portable HTML report, so npm can't be used. To keep the report as small as possible, only a limited number of JS libraries can be used.
## Modules
MultiQC supports bioinformatics tools through so-called "modules". Each "module" is a Python module placed in `multiqc/modules`, and it is dynamically loaded as a Python entry point specified in `pyproject.toml`. A module describes how MultiQC should parse outputs/logs from the corresponding tool, what to extract, how to summarise it, and what kinds of plots and tables to use to present data in the report. There is also a file `multiqc/search_patterns.yaml` that describes file name patterns and content patterns to discover output files/logs from each tool/module.
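The discovery idea can be sketched as follows: a file is assigned to a module only if it satisfies every pattern given for that module. This is an illustration, not the MultiQC implementation. The tool name `mytool`, the pattern values, the `fn` and `contents` keys as used here, and the function `matching_modules` are all assumptions.

```python
from fnmatch import fnmatch
from typing import Dict, List

# Hypothetical patterns in the spirit of search_patterns.yaml:
# "fn" is a file name glob, "contents" is a substring the file must contain.
PATTERNS: Dict[str, Dict[str, str]] = {
    "mytool": {"fn": "*_mytool.log", "contents": "MyTool version"},
}


def matching_modules(filename: str, text: str, patterns: Dict[str, Dict[str, str]] = PATTERNS) -> List[str]:
    """Return the names of modules whose patterns all match the given file name and content."""
    hits: List[str] = []
    for module, pattern in patterns.items():
        fn_glob = pattern.get("fn")
        needle = pattern.get("contents")
        if fn_glob is not None and not fnmatch(filename, fn_glob):
            continue  # file name does not match the glob
        if needle is not None and needle not in text:
            continue  # required content is missing
        hits.append(module)
    return hits
```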
## Code style
When writing code, you must follow these rules:
- Use f-strings and other MODERN Python 3 syntax. Do not use `__future__` imports or `OrderedDict`.
- Use double quotes for strings.
- Do not add shebang lines to Python files unless they are placed in the `scripts/` folder.
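A short sketch of code that follows the rules above: type hints, double quotes, f-strings, a plain `dict` instead of `OrderedDict`, and no shebang line. The function names and data are hypothetical.

```python
from typing import Dict, List


def count_reads(sample_names: List[str], reads: List[int]) -> Dict[str, int]:
    """Map each sample name to its read count. A plain dict preserves insertion order."""
    return dict(zip(sample_names, reads))


def describe(sample: str, n_reads: int) -> str:
    """Format a one-line summary with an f-string and a thousands separator."""
    return f"{sample}: {n_reads:,} reads"
```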
When writing modules, you must follow these rules:
- Raise `ModuleNoSamplesFound` when no samples are found. DO NOT RAISE `UserWarning`!
- Call `self.add_software_version()`, even if the version is not found, as it's required by linting.
- Call `self.write_data_file` at the very end of the module, after all sections are added. IT IS IMPORTANT TO CALL IT AT THE END!
- Add an entry point to `pyproject.toml`. Ignore `setup.py`.
- Do not add separate markdown files or module-level docstrings. Instead, add a docstring to the module class.
- The module's `info` MUST start with a capital letter.
THIS IS VERY IMPORTANT. YOU MUST FOLLOW THESE GUIDELINES.
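The call order required by the module rules can be sketched like this. To keep the example runnable without MultiQC installed, `BaseModuleStub` and the local `ModuleNoSamplesFound` are stand-ins that only record calls; their signatures are simplified assumptions, and a real module would inherit from the base class in `multiqc/base_module.py` and discover its own log files instead of receiving a hypothetical `parsed` argument.

```python
from typing import Dict, List, Optional


class ModuleNoSamplesFound(Exception):
    """Stand-in for the MultiQC exception of the same name."""


class BaseModuleStub:
    """Stand-in for the MultiQC base module class. It only records which methods were called."""

    def __init__(self, name: str, info: str) -> None:
        self.name = name
        self.info = info
        self.calls: List[str] = []

    def add_software_version(self, version: Optional[str] = None) -> None:
        self.calls.append("add_software_version")

    def add_section(self, name: str) -> None:
        self.calls.append("add_section")

    def write_data_file(self, data: Dict[str, Dict[str, float]], fn: str) -> None:
        self.calls.append("write_data_file")


class MultiqcModule(BaseModuleStub):
    """Documentation lives in the class docstring, not in a module docstring or a markdown file."""

    def __init__(self, parsed: Dict[str, Dict[str, float]], version: Optional[str] = None) -> None:
        # `info` starts with a capital letter.
        super().__init__(name="MyTool", info="Summarises read counts per sample.")
        if not parsed:
            # No samples: raise ModuleNoSamplesFound, never UserWarning.
            raise ModuleNoSamplesFound
        # Called even when the version is unknown (None).
        self.add_software_version(version)
        self.add_section(name="Read counts")
        # Always last, after all sections are added.
        self.write_data_file(parsed, "multiqc_mytool")
```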
## MultiQC codebase structure
- `multiqc/modules` - all the MultiQC "modules", plus 3 special-case modules: "software_versions", "profile_runtime", and "custom_content". The latter includes code to parse custom, non-tool sections and plots passed by the end user directly through configs or TSV/CSV files.
- `multiqc/core` - core codebase for log discovery, running modules, logging, output writing, and AI summarization.
- `multiqc/plots` - plotting code: describes how to prepare data for plotting and build Plotly layouts for different plot types.
- `multiqc/templates` - HTML templates for the MultiQC report. Only `multiqc/templates/default` is worth attention here. It includes HTML templates to be combined and rendered with Jinja2 to produce the final HTML report, as well as `assets` - the folder with JavaScript code for all dynamic features like loading and decompressing the plot JSON dumps, rendering them with Plotly-JS, the toolbox with features to highlight/hide/rename samples in the report, etc. It also contains `default_multiqc.css` - all CSS goes there.
- `multiqc/base_module.py` - a base module class for all modules to inherit. Provides a lot of sample name cleaning and grouping logic.
- `multiqc/utils` - common Python utility functions.
- `multiqc/config.py` - a configuration class that contains all the configuration variables for MultiQC, as well as all the configuration discovery logic.
- `multiqc/interactive.py` - function helpers to construct a MultiQC report in interactive mode, e.g. in a Jupyter notebook.
- `multiqc/report.py` - a singleton class with global variables that are passed to Jinja2 to render a report. Holds the "state" of the report, i.e. modules, sections, discovered files, the list of HTML anchors (kept to ensure they are unique), etc., as well as multiple helper functions.
- `multiqc/multiqc.py` - main entry point of MultiQC. Includes command line interface logic.
- `multiqc/validation.py` - helpers to validate plot configs and user custom content with Pydantic.
- `scripts` - auxiliary scripts for development.
- `tests` - test suite for MultiQC.
- `docs` - documentation for MultiQC. `docs/markdown/modules` and `docs/markdown/modules.mdx` are autogenerated from the class docstrings in `multiqc/modules` using `scripts/make_module_docs.py`, the rest is written manually.