I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
[FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
Please do not modify this template :) and fill in all the required fields.
Dify version
v0.14.1
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
1. Create a workflow with parallel nodes:
Create a new workflow in the application
Add multiple parallel nodes that each make Anthropic API calls
Configure the workflow to process these parallel nodes concurrently
2. Execute the workflow:
When running this workflow, all parallel nodes attempt to call the Anthropic API simultaneously
Due to Anthropic's rate limits (concurrent request limits), some requests receive HTTP 429 responses
System automatically initiates retries with backoff:
HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 429 Too Many Requests"
Retrying request to /v1/messages in 51.000000 seconds
3. During retry period:
Some parallel nodes complete successfully
Others are waiting for retry due to rate limits
The long wait during retry (50+ seconds) appears to outlive the database session, leaving the WorkflowRun instance detached
When retried requests complete, the workflow attempts to finalize
At this point, accessing workflow_run.created_by_role fails because the instance is no longer bound to a session
This scenario specifically occurs because:
Parallel execution triggers rate limits
Rate limits cause long retry delays
During these delays, database sessions become invalid
Session detachment causes SQLAlchemy errors when accessing object attributes
✔️ Expected Behavior
Hi Dify team! First of all, I want to express my sincere gratitude for your amazing project. I've been regularly using Dify to build my productivity tools, and it has been invaluable to my work.
I've encountered a bug and would like to contribute to its resolution. While my backend database expertise is limited, I've worked with Claude to analyze the issue as thoroughly as possible to provide a comprehensive report.
Issue Overview
The bug occurs during parallel workflow execution when multiple Anthropic API calls trigger rate limiting. Here's what happens:
Multiple parallel nodes attempt concurrent Anthropic API calls
Rate limiting (HTTP 429) triggers retry mechanisms with ~50s delays
During these delays, SQLAlchemy sessions become detached
When the workflow tries to finalize, accessing workflow_run.created_by_role fails
sqlalchemy.orm.exc.DetachedInstanceError: Instance <WorkflowRun> is not bound to a Session; attribute refresh operation cannot proceed
I've analyzed the code and logs, and it appears to be a session management issue with long-running operations in parallel workflows.
I'm happy to help resolve this bug, whether by implementing a fix myself with your guidance or providing any additional information you need. If my analysis is helpful, I'd love to collaborate on a solution. If not, I welcome your professional team's insights on how to best address this.
Thank you again for creating such an excellent project! Let me know if you need any clarification or additional details about the issue.
Best regards
❌ Actual Behavior
An SQLAlchemy DetachedInstanceError occurs during workflow completion handling when the system attempts to access created_by_role on a detached WorkflowRun instance. This issue appears to be triggered by long-running operations, particularly when API rate limiting causes retries.
Error Details
The core error:
sqlalchemy.orm.exc.DetachedInstanceError: Instance <WorkflowRun at 0x7fbb786d7dd0> is not bound to a Session; attribute refresh operation cannot proceed
Stack Trace
The error occurs in the following call chain:
_workflow_finish_to_stream_response() in workflow_cycle_manage.py
Attempts to access workflow_run.created_by_role
SQLAlchemy tries to refresh the attribute
Fails because the instance is detached from its session
Reproduction Scenario
Start a workflow execution
During execution, API calls to Anthropic hit rate limits (HTTP 429)
System initiates retry with significant delay (50+ seconds)
When workflow completion handling occurs, the database session has expired
Accessing created_by_role triggers the error
Technical Analysis
The issue stems from session management in long-running operations
The database session expires/detaches while waiting for API retry
The code attempts to access lazy-loaded attributes after session detachment
Current implementation doesn't properly handle session lifecycle for extended operations
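The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not Dify's code: the `WorkflowRun` class here is a hypothetical two-column stand-in for the real model, and an in-memory SQLite database replaces the real backend. It assumes SQLAlchemy 1.4 or later with the default `expire_on_commit=True`.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker
from sqlalchemy.orm.exc import DetachedInstanceError

Base = declarative_base()


class WorkflowRun(Base):
    """Hypothetical stand-in for the real WorkflowRun model."""

    __tablename__ = "workflow_runs"
    id = Column(Integer, primary_key=True)
    created_by_role = Column(String, nullable=False)


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)  # expire_on_commit defaults to True

session = Session()
run = WorkflowRun(created_by_role="account")
session.add(run)
session.commit()  # commit expires every loaded attribute on `run`
session.close()   # `run` is now detached, and its attributes are still expired

try:
    # Reading an expired attribute requires a refresh query,
    # but the instance no longer has a session to run it on.
    run.created_by_role
    raised = False
except DetachedInstanceError:
    raised = True

print("DetachedInstanceError raised:", raised)
```

The key ingredients are an expired attribute and a closed session. Whether the real workflow reaches that state through a commit, an explicit close, or another code path during the retry wait is not established by this sketch.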
Impact
Workflow execution fails to complete properly
User experience is disrupted
Error responses are sent to the client
Potential inconsistency in workflow state
Suggested Solutions
Short-term Fix
Implement proper session handling in _workflow_finish_to_stream_response:
Hey @leozanee! I'm here to help you with bug fixes, questions, and contributions. Let's work together to resolve the issue you've encountered.
To resolve the SQLAlchemy DetachedInstanceError in the WorkflowCycleManage module caused by session management issues during long-running tasks with parallel nodes and API rate limiting, you can follow these steps:
Close Sessions Immediately: After performing any database operation, ensure that the session is closed to release the connection. This prevents holding onto connections during long-running tasks.
Pass IDs Instead of Model Objects: When working with data across different tasks or nodes, pass the ID of the record instead of the model object. This avoids detachment issues when the session is closed.
Retrieve Data Within Each Task: For long-running tasks, retrieve the necessary data within the task itself using the ID, ensuring that each task manages its own session.
By implementing these practices, you can mitigate the DetachedInstanceError and manage database connections more effectively during long-running and concurrent tasks [1].
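The three practices above could look roughly like the sketch below. It is an illustration under assumptions, not the project's implementation: `WorkflowRun` is a hypothetical stand-in model, `create_run` and `finish_run` are made-up helper names, and in-memory SQLite replaces the real database. It assumes SQLAlchemy 1.4 or later.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class WorkflowRun(Base):
    """Hypothetical stand-in for the real WorkflowRun model."""

    __tablename__ = "workflow_runs"
    id = Column(Integer, primary_key=True)
    created_by_role = Column(String, nullable=False)


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)


def create_run(role: str) -> int:
    """Persist a run and hand back only its primary key (hypothetical helper)."""
    with Session() as session:
        run = WorkflowRun(created_by_role=role)
        session.add(run)
        session.commit()
        # Still inside the session, so the expired `id` can be refreshed.
        return run.id


def finish_run(run_id: int) -> dict:
    """Re-load the run by ID in a fresh, short-lived session (hypothetical helper)."""
    with Session() as session:
        run = session.get(WorkflowRun, run_id)
        # Copy plain values out while the session is open, so the caller
        # never touches an ORM instance after the session closes.
        return {"id": run.id, "created_by_role": run.created_by_role}


run_id = create_run("account")
# ... long-running work, such as a 50-second retry wait, would happen here ...
result = finish_run(run_id)
print(result)
```

Only the integer ID crosses the long wait, so no session or ORM instance has to stay valid while the API retry is pending.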
Long-term Solutions
Consider scoped_session for better session lifecycle management
Additional Context
[Logs will be attached showing the sequence of events, including rate limiting and retry attempts]
Environment
Related Issues
I have added a screenshot of the logs from when the error happened below: