Skip to content

Concurrent access to a Google Drive folder via the API may result in only 502 Bad Gateway being returned #10661

Open
@yacchin1205

Description

Accessing a Google Drive folder via the API at the same time may cause it to only return a 502 Bad Gateway afterwards.

Reproduction steps:

  1. Create a project in osf.io.
  2. Attach Google Drive to the created project.
  3. Create a folder (in this case MultipleObjectTest) under Google Drive.
  4. Acquire a Personal Access Token from the user's Settings.
  5. Execute the following commands from the user's shell.
$ export OSF_TOKEN=(Personal Access Token obtained.)
$ export NODE_ID=(GUID of the created project)
# Sending requests to the API in three parallel jobs.
$ seq 5 | xargs -P 3 -I{} curl -H "Authorization: Bearer ${OSF_TOKEN}" https://api.osf.io/v2/nodes/${NODE_ID}/files/googledrive/MultipleObjectTest/

After a few runs, it only returns a 502 Bad Gateway.

$ seq 5 | xargs -P 3 -I{} curl -H "Authorization: Bearer ${OSF_TOKEN}" https://api.osf.io/v2/nodes/${NODE_ID}/files/googledrive/MultipleObjectTest/
<html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
<hr><center>nginx</center>
</body>
</html>
<html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
<hr><center>nginx</center>
</body>
</html>
...

From that point afterwards, a 502 Bad Gateway will be returned for a single request.

When I tried the same test for a container launched by docker-compose as a development environment, the following logs were output.

...
  File "./api/nodes/views.py", line 1199, in get_default_queryset
    file_ids = [f.id for f in self.bulk_get_file_nodes_from_wb_resp(files_list)]
  File "./api/base/views.py", line 642, in bulk_get_file_nodes_from_wb_resp
    file_obj = base_class.objects.get(target_object_id=node.id, target_content_type=content_type, _path='/' + attrs['path'].lstrip('/'))
  File "/usr/lib/python3.6/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/django/db/models/query.py", line 384, in get
    (self.model._meta.object_name, num)
addons.googledrive.models.MultipleObjectsReturned: get() returned more than one GoogleDriveFile -- it returned 2!

It appears that multiple BaseFileNodes have been created, which would normally be expected to be one for a path.

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions