Two-Stage Coefficient Estimation in Symbolic Regression for Scientific Discovery

This repository contains the official implementation of "Two-Stage Coefficient Estimation in Symbolic Regression for Scientific Discovery", accepted at the Machine Learning and the Physical Sciences Workshop @ NeurIPS 2024.

Setup

To reproduce experiments in the workshop paper

Make a virtual environment.

$ pyenv global 3.11
$ python -m venv env
$ source env/bin/activate
$ pip install --upgrade pip

Install the srsd-coeff-optim library.

$ pip install .

Install the libraries needed for the experiments.

$ pip install -r expt/requirements.txt

To install our library for your own project

$ pip install git+https://github.com/omron-sinicx/srsd-coeff-optim

Reproduction experiments

Make a data directory under expt, move into it, and clone the srsd-benchmark datasets there.

$ pwd
***/srsd-coeff-optim/expt/data
$ git lfs install
$ git clone https://huggingface.co/datasets/yoshitomo-matsubara/srsd-feynman_easy
$ git clone https://huggingface.co/datasets/yoshitomo-matsubara/srsd-feynman_medium
$ git clone https://huggingface.co/datasets/yoshitomo-matsubara/srsd-feynman_hard
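
Before running the experiments, it may help to confirm that all three dataset repositories were cloned. The following is a minimal sketch, assuming the clones sit directly under expt/data with their default directory names. The helper name missing_datasets is hypothetical and not part of this repository.

```python
from pathlib import Path

# Directory names follow the default `git clone` targets of the three URLs above.
EXPECTED = ["srsd-feynman_easy", "srsd-feynman_medium", "srsd-feynman_hard"]


def missing_datasets(data_dir):
    """Return the expected dataset directories that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED if not (root / name).is_dir()]


if __name__ == "__main__":
    # Assumption: the script is run from the repository root.
    missing = missing_datasets("expt/data")
    print("missing:", missing if missing else "none")
```

An empty result means every expected directory is present. It does not verify that git lfs actually fetched the data files.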

Estimate coefficients given the ground truth skeleton

Example:

$ python expt/main.py --dataset srsd --n_data 20 --discrete_method brute-force --continuous_method lm-jump --max_contiter 100 --outiter 2 --beamsize 10 --init_method uniform --n_core 8 --seed 0
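
To repeat the run above over several seeds, the command line can be generated programmatically. This is a sketch only: the flags and values are copied from the example command, and it assumes that --seed can be varied independently of the other options. The helper name build_command is hypothetical.

```python
import shlex

# Flags and values copied from the example command above; only --seed varies.
BASE_ARGS = {
    "dataset": "srsd",
    "n_data": 20,
    "discrete_method": "brute-force",
    "continuous_method": "lm-jump",
    "max_contiter": 100,
    "outiter": 2,
    "beamsize": 10,
    "init_method": "uniform",
    "n_core": 8,
}


def build_command(seed):
    """Build the argv list for one run of expt/main.py with the given seed."""
    argv = ["python", "expt/main.py"]
    for flag, value in BASE_ARGS.items():
        argv += [f"--{flag}", str(value)]
    argv += ["--seed", str(seed)]
    return argv


for seed in range(3):
    # Print instead of executing; pass the list to subprocess.run to launch it.
    print(shlex.join(build_command(seed)))
```

Printing the commands first makes it easy to inspect them before launching long-running jobs.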

Evaluate the estimation results

Example:

$ python expt/analyze.py --eval_method stats_by_difficulty --dataset srsd --n_data 20 --discrete_method brute-force --continuous_method lm-jump --outiter 2 --beamsize 10 --init_method uniform --seed 0

Generate figures in the paper or poster

Example:

$ python expt/figure.py

Call our optimizer in your project

All you need to do is instantiate an Optimizer object and call its optimize method.

Example:

import math

import sympy

from srsd_coeff_optim import Optimizer

optimizer = Optimizer(
    discrete_method="brute-force",
    continuous_method="lm-jump",
    max_contiter=100,
    outiter=2,
    beamsize=10,
    seed=0,
    discrete_candidates=[-1, 0.5, -0.5, 1.5, -1.5, 2, -2, 3, -3, 4, -4, 5, -5],
)

f = sympy.sympify("x0 * c0 + log(c1 + x1 ** c2)")  # x0 * 1 + log(2 + x1^3)
data = [
    [2.0, 4.0, 2.0 + math.log(2.0 + 4.0**3.0)],
    [3.0, 6.0, 3.0 + math.log(2.0 + 6.0**3.0)],
    [4.0, 1.0, 4.0 + math.log(2.0 + 1.0**3.0)],
]  # [x0, x1, y]
coeff_init = None
init_method = "uniform"
allinfo = True

(
    formula,
    status,
    ffinal,
    coeff_opt,
    disc_time_hist,
    cont_time_hist,
    expo_coeffs,
    other_coeffs,
    disc_coeffs,
    cont_coeffs,
    coeff_opt_hist,
    cont_error_hist,
) = optimizer.optimize(f, data, coeff_init, init_method, allinfo)
assert abs(coeff_opt[0] - 1.0) < 1e-6  # c0
assert abs(coeff_opt[1] - 2.0) < 1e-6  # c1
assert coeff_opt[2] == 3.0  # c2
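
As an independent sanity check on a set of coefficients, the skeleton can be evaluated directly on the data. The following is a minimal sketch using only the standard library. The helper names model and max_abs_residual are hypothetical and not part of srsd_coeff_optim, and the coefficients passed in are the ground-truth values from the example above rather than actual optimizer output.

```python
import math


def model(x0, x1, coeffs):
    """Evaluate the skeleton x0 * c0 + log(c1 + x1 ** c2)."""
    c0, c1, c2 = coeffs
    return x0 * c0 + math.log(c1 + x1 ** c2)


def max_abs_residual(data, coeffs):
    """Largest |y - model(x0, x1)| over rows of the form [x0, x1, y]."""
    return max(abs(y - model(x0, x1, coeffs)) for x0, x1, y in data)


# Same three rows as in the example above.
data = [
    [2.0, 4.0, 2.0 + math.log(2.0 + 4.0**3.0)],
    [3.0, 6.0, 3.0 + math.log(2.0 + 6.0**3.0)],
    [4.0, 1.0, 4.0 + math.log(2.0 + 1.0**3.0)],
]

# Ground-truth coefficients (c0, c1, c2) = (1, 2, 3) should fit almost exactly.
print(max_abs_residual(data, [1.0, 2.0, 3.0]))
# A wrong exponent should leave a clearly visible residual.
print(max_abs_residual(data, [1.0, 2.0, 2.0]))
```

In a real project, the list passed as coeffs would be the coeff_opt value returned by optimize.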

Citation

@inproceedings{
    negishi2024twostage,
    title={Two-Stage Coefficient Estimation in Symbolic Regression for Scientific Discovery},
    author={Negishi, Masahiro and Matsubara, Yoshitomo and Chiba, Naoya and Igarashi, Ryo and Ushiku, Yoshitaka},
    booktitle={Machine Learning and the Physical Sciences Workshop @ NeurIPS 2024},
    year={2024},
    url={https://ml4physicalsciences.github.io/2024/files/NeurIPS_ML4PS_2024_11.pdf}
}
