**Describe the bug**
I have a Flask app with an endpoint that looks something like this, where `model` is a `FullBayesianForecaster`:

```python
@app.route("/predict", methods=["POST"])
def predict():
    df = get_df()
    result = model.predict(df, seed=1234)
    return result.to_dict(orient="records")
```
I tested it locally with curl and it worked great: the flask app returned the same predictions for the same input. I then deployed it to a test environment, and another service that talks to it immediately started getting non-deterministic results for requests with the same parameters. I checked the prediction results with another curl request to the test environment, and got the same result back every time, as expected.
After scratching my head a bit, I tried wrapping the call in a `threading.Lock`, like this:

```python
with lock:
    result = model.predict(df, seed=1234)
```
And then the non-determinism went away. This leads me to believe that model prediction is not thread-safe. I haven't yet dug down far enough to know for sure what's causing the issue, but this seems to be at least one likely culprit:
https://github.com/uber/orbit/blob/27371ec/orbit/forecaster/full_bayes.py#L96-L97
If multiple threads call `model.predict` concurrently, they can race on the value of `self._prediction_meta` here, which seems likely to make the predictions non-deterministic:
https://github.com/uber/orbit/blob/c232980/orbit/forecaster/forecaster.py#L389
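To make the suspected failure mode concrete, here is a minimal toy sketch — not orbit code; `ToyForecaster`, `locked_predict`, and the busy-loop are all hypothetical stand-ins — of how stashing per-call state on `self` interacts with the lock workaround:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class ToyForecaster:
    """Hypothetical stand-in for the pattern linked above: predict()
    writes per-call metadata to an attribute on self, then reads it
    back later in the same call."""

    def predict(self, df, seed=None):
        self._prediction_meta = df   # shared mutable state on self
        for _ in range(1000):        # simulate work between write and read
            pass
        return self._prediction_meta  # may observe another thread's write

model = ToyForecaster()
lock = threading.Lock()

def locked_predict(df):
    # Serializing calls through a lock is the workaround described above.
    with lock:
        return model.predict(df, seed=1234)

with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(locked_predict, range(200)))

# With the lock held around each call, every call reads back exactly
# the value it wrote.
assert results == list(range(200))
```

Without the lock, two threads can interleave between the write and the read, so one call silently returns metadata belonging to another request.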
**To Reproduce**
Call `model.predict` with different parameters from multiple threads, passing `seed=a_fixed_number`.
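A generic harness along these lines (the names here are mine, not orbit's) could be used to check for the divergence:

```python
from concurrent.futures import ThreadPoolExecutor

def check_determinism(predict, df, seed, n_threads=8, n_calls=64):
    """Call predict(df, seed=seed) concurrently from many threads and
    return True only if every call produced the same result."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        results = list(pool.map(lambda _: predict(df, seed=seed), range(n_calls)))
    return all(r == results[0] for r in results)

# Hypothetical usage against an orbit model; converting via to_dict so
# the results support == comparison:
# check_determinism(
#     lambda df, seed: model.predict(df, seed=seed).to_dict(orient="records"),
#     df, seed=1234,
# )
```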
**Expected behavior**
Calling `model.predict` with `seed=a_fixed_number` from multiple threads returns deterministic predictions.
**Screenshots**
N/A
**Environment (please complete the following information):**
- OS: Ubuntu
- Python Version: 3.9
- Versions of Major Dependencies (`pandas`, `scikit-learn`, `cython`): `pandas==1.3.5`, `scikit-learn==<not installed>`, `cython==0.29.30`
**Additional context**
N/A