Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CVAT_HOST to CSRF_TRUSTED_ORIGINS #6322

Closed
wants to merge 2 commits into from
Closed

Conversation

pktiuk
Copy link
Contributor

@pktiuk pktiuk commented Jun 15, 2023

Closes: #6321

Motivation and context

CVAT uses django.middleware.csrf.CsrfViewMiddleware, but it does not set variable CSRF_TRUSTED_ORIGINS. It may cause some issues. Especially in configs with changed traefik configurations (links with http:// and https:// are treated as separate entries and may cause blocking some requests to server)

How has this been tested?

It was built and launched with docker

Checklist

  • I submit my changes into the develop branch
  • I have added a description of my changes into the CHANGELOG file
    - [ ] I have updated the documentation accordingly
    - [ ] I have added tests to cover my changes
  • I have linked related issues (see GitHub docs)
    - [ ] I have increased versions of npm packages if it is necessary
    (cvat-canvas,
    cvat-core,
    cvat-data and
    cvat-ui)

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.

@pktiuk pktiuk marked this pull request as ready for review June 15, 2023 10:46
@pktiuk
Copy link
Contributor Author

pktiuk commented Jun 19, 2023

@azhavoro
Will this change make it for the next release?

@codecov
Copy link

codecov bot commented Jun 19, 2023

Codecov Report

Merging #6322 (67c1195) into develop (8962c12) will increase coverage by 0.02%.
The diff coverage is n/a.

@@             Coverage Diff             @@
##           develop    #6322      +/-   ##
===========================================
+ Coverage    82.49%   82.52%   +0.02%     
===========================================
  Files          370      370              
  Lines        39831    39831              
  Branches      3549     3549              
===========================================
+ Hits         32860    32869       +9     
+ Misses        6971     6962       -9     
Components Coverage Δ
cvat-ui 77.56% <ø> (+0.03%) ⬆️
cvat-server 86.87% <ø> (+<0.01%) ⬆️

@pktiuk
Copy link
Contributor Author

pktiuk commented Jun 26, 2023

@bsekachev

Could I get any feedback?

@jevansbio
Copy link

I had the issue and implementing this PR fixed them

@pktiuk
Copy link
Contributor Author

pktiuk commented Jul 24, 2023

@azhavoro
Could I get some feedback about this PR?

@brunodrugowick
Copy link

Is there a workaround available while this is not merged other than cherry-picking this commit?

@pktiuk
Copy link
Contributor Author

pktiuk commented Jul 25, 2023

@brunodrugowick
I am not aware of any other fix addressing this issue.
I have no idea why this is not merged yet.

@bsekachev
Copy link
Member

Hi @pktiuk ,

It is not merged yet, because it is not totally clear for now, why we need to modify CSRF_TRUSTED_ORIGIN list.

According to Django documentation:

CsrfViewMiddleware verifies the Origin header, if provided by the browser, against the current host and the CSRF_TRUSTED_ORIGINS setting. This provides protection against cross-subdomain attacks.

As I understand, it means, that there is not necessarity to specify CVAT_HOST in CSRF_TRUSTED_ORIGINS, because it is already included "by default". On the other hand it does not disable CSRF token check itself. According to the documentation, check is triggered anyway:

For all incoming requests that are not using HTTP GET, HEAD, OPTIONS or TRACE, a CSRF cookie must be present, and the ‘csrfmiddlewaretoken’ field must be present and correct. If it isn’t, the user will get a 403 error.

and I am not really sure how it solves your problem. My only assumption for now is that somehow in your configuration server origin is not the same as client origin.

Finally, we have https://app.cvat.ai where .. we do not configure CSRF_TRUSTED_ORIGIN and it works.

P.S. I am more frontend than backend dev, and maybe misunderstood something

@jloebl
Copy link

jloebl commented Sep 6, 2023

My only assumption for now is that somehow in your configuration server origin is not the same as client origin.

As we see by the linked issues, more people are facing this issue after updating cvat. Sadly, we cant use the update if we can not manage users anymore. This is a huge deal breaker!

@pktiuk
Copy link
Contributor Author

pktiuk commented Sep 12, 2023

Hello @bsekachev ,
Sorry for delay. I think I can explain why this fix works (this is just my guess, not 100% sure).
As you can see in django docs

These checks prevent, for example, a POST request from subdomain.example.com from succeeding against api.example.com. If you need cross-origin unsafe requests, continuing the example, add 'https://subdomain.example.com' to this list (and/or http://... if requests originate from an insecure page).

variable CSRF_TRUSTED_ORIGINS differentiates between HTTP and HTTPS origins. In our case this may be problematic, when we use docker-compose.https.yml, because Django generates http, which are later encrypted by traefik and it gets back responses in https (or at least marked as https). My changes adds both http and https to trusted origins to be sure, that both configurations work.

@baudneo
Copy link

baudneo commented Sep 26, 2023

Using CVAT and cloudflared causes the CSRF 403 Forbidden on Django admin page POSTS. This PR should be merged as my setup would seem to be common. Users who want/need control over CORS/CSRF, should have it.

@baudneo

This comment was marked as outdated.

@pktiuk

This comment was marked as outdated.

@baudneo baudneo mentioned this pull request Sep 27, 2023
@baudneo

This comment was marked as outdated.

@pktiuk

This comment was marked as off-topic.

@baudneo

This comment was marked as off-topic.

@baudneo

This comment was marked as off-topic.

@pktiuk
Copy link
Contributor Author

pktiuk commented Oct 9, 2023

@brunodrugowick @bsekachev
I see that this PR is approved, so can it be merged?

@baudneo I think you should mark your older comments as outdated to avoid spam.

@kodai2199
Copy link

In my experience I was running CVAT using HTTP, but accessing it using a nginx HTTPS proxy and proxied cloudflare dns. After setting up CVAT to use HTTPS natively, and updating the nginx configuration, the issue was resolved without additional fixes.

@olliethomas
Copy link

Hi, is there anything that I can help with to get this merged? The issue that this resolves is stopping us from deploying CVAT on our infrastructure :(

@marc31
Copy link

marc31 commented Apr 17, 2024

In my experience I was running CVAT using HTTP, but accessing it using a nginx HTTPS proxy and proxied cloudflare dns. After setting up CVAT to use HTTPS natively, and updating the nginx configuration, the issue was resolved without additional fixes.

I'm in the same situation, but in my infrastructure, I can't use CVAT to directly manage its certificate. Do you think this will be merged?

@brunodrugowick
Copy link

I don't know if it helps others, but I'm deploying CVAT with a Cloudflare tunnel under my domain.

The only thing needed is to set the CVAT_HOST variable (if I'm not mistaken) to the final domain where it will be accessed, e.g. CVAT_HOST=cvat.yourdomain.com.

If you're deploying to a local network instead you just set the same var to the IP address of the box.

Sorry if this is not directly related to the problem or just too shallow into the problem, it's been a while since I've read this whole thread.

@marc31
Copy link

marc31 commented Apr 18, 2024

I'm deploying to a server, and my CVAT_HOST variable correctly holds the value of the final domain.
I'm using CVAT's Docker Compose for deployment without the HTTPS component, as ports 80 and 443 are already allocated by an NGINX server acting as a reverse proxy.
Without including the CSRF_TRUSTED_ORIGINS variable with my final domain, including the protocol, I encounter CSRF errors.

@brunodrugowick
Copy link

I'm deploying to a server, and my CVAT_HOST variable correctly holds the value of the final domain. I'm using CVAT's Docker Compose for deployment without the HTTPS component, as ports 80 and 443 are already allocated by an NGINX server acting as a reverse proxy. Without including the CSRF_TRUSTED_ORIGINS variable with my final domain, including the protocol, I encounter CSRF errors.

Ah, you're right. I do have to set CSRF_TRUSTED_ORIGINS:

#!/bin/bash

if [[ -z $1 ]]; then
    echo "internal or external?"
    exit 1
elif [[ $1 == "internal" ]]; then
    export CVAT_HOST=192.168.1.5
elif [[ $1 == "external" ]]; then
    export CVAT_HOST=cvat.whatever.com
    export CSRF_TRUSTED_ORIGINS=cvat.whatever.com
fi

#docker-compose up -d

#export CVAT_NUCLIO_DEFAULT_TIMEOUT=600
docker-compose -f docker-compose.yml -f components/serverless/docker-compose.serverless.yml up -d

@marc31
Copy link

marc31 commented Apr 22, 2024

I'm deploying to a server, and my CVAT_HOST variable correctly holds the value of the final domain. I'm using CVAT's Docker Compose for deployment without the HTTPS component, as ports 80 and 443 are already allocated by an NGINX server acting as a reverse proxy. Without including the CSRF_TRUSTED_ORIGINS variable with my final domain, including the protocol, I encounter CSRF errors.

Ah, you're right. I do have to set CSRF_TRUSTED_ORIGINS:

#!/bin/bash

if [[ -z $1 ]]; then
    echo "internal or external?"
    exit 1
elif [[ $1 == "internal" ]]; then
    export CVAT_HOST=192.168.1.5
elif [[ $1 == "external" ]]; then
    export CVAT_HOST=cvat.whatever.com
    export CSRF_TRUSTED_ORIGINS=cvat.whatever.com
fi

#docker-compose up -d

#export CVAT_NUCLIO_DEFAULT_TIMEOUT=600
docker-compose -f docker-compose.yml -f components/serverless/docker-compose.serverless.yml up -d

Which version of CVAT are you using? I'm surprised because when I search throughout the entire project for CSRF_TRUSTED_ORIGINS, I only find one reference in the file cvat/cvat/settings/development.py, indicating that Django employs it only when is running in dev mode.

Additionally, in the docker-compose.yml files, there isn't anything specified for the container:

services:
    ...
    cvat_server:
        ...
        environment:
            ...
            CSRF_TRUSTED_ORIGINS=${CSRF_TRUSTED_ORIGINS}
            ...
        ...
    ...
...        

This implies that the variable you're assigning is never passed to the container and therefore isn't utilized.

@nmanovic
Copy link
Contributor

I will close the PR because the change looks unnecessary.

@nmanovic nmanovic closed this Apr 26, 2024
@marc31
Copy link

marc31 commented Apr 26, 2024

@nmanovic Can you explain why this PR looks unnecessary please ?

@brunodrugowick
Copy link

brunodrugowick commented Apr 27, 2024

I'm deploying to a server, and my CVAT_HOST variable correctly holds the value of the final domain. I'm using CVAT's Docker Compose for deployment without the HTTPS component, as ports 80 and 443 are already allocated by an NGINX server acting as a reverse proxy. Without including the CSRF_TRUSTED_ORIGINS variable with my final domain, including the protocol, I encounter CSRF errors.

Ah, you're right. I do have to set CSRF_TRUSTED_ORIGINS:

#!/bin/bash

if [[ -z $1 ]]; then
    echo "internal or external?"
    exit 1
elif [[ $1 == "internal" ]]; then
    export CVAT_HOST=192.168.1.5
elif [[ $1 == "external" ]]; then
    export CVAT_HOST=cvat.whatever.com
    export CSRF_TRUSTED_ORIGINS=cvat.whatever.com
fi

#docker-compose up -d

#export CVAT_NUCLIO_DEFAULT_TIMEOUT=600
docker-compose -f docker-compose.yml -f components/serverless/docker-compose.serverless.yml up -d

Which version of CVAT are you using? I'm surprised because when I search throughout the entire project for CSRF_TRUSTED_ORIGINS, I only find one reference in the file cvat/cvat/settings/development.py, indicating that Django employs it only when is running in dev mode.

Additionally, in the docker-compose.yml files, there isn't anything specified for the container:

services:
    ...
    cvat_server:
        ...
        environment:
            ...
            CSRF_TRUSTED_ORIGINS=${CSRF_TRUSTED_ORIGINS}
            ...
        ...
    ...
...        

This implies that the variable you're assigning is never passed to the container and therefore isn't utilized.

Hi, I'm in 2.11.0. Apparently I added this var in one more place (other than my own script):

 >> grep -rnw . -e 'CSRF_TRUSTED_ORIGINS'
./docker-compose.yml:260:      CSRF_TRUSTED_ORIGINS: <redacted>
./run-proxy.sh:10:    export CSRF_TRUSTED_ORIGINS=<redacted>
./components/serverless/run-proxy.sh:10:    export CSRF_TRUSTED_ORIGINS=<redacted>
./cvat/settings/development.py:30:CSRF_TRUSTED_ORIGINS = [UI_URL]

I see that CSRF_TRUSTED_ORIGINS is an option from Django and CVAT uses Django:

/path/to/cvat ((HEAD detached at v2.11.0))
 >> grep -rnw . -e 'django=='
./.github/workflows/pylint.yml:26:              $(egrep "^django==" ./cvat/requirements/base.txt)
/path/to/cvat ((HEAD detached at v2.11.0))
 >> cat cvat/requirements/base.txt | grep django==
django==4.2.6

I don't know much about Django, but I have a lot of experience with Spring (JVM world) and you can change A LOT of things without touching the source code, only by setting runtime environment variables, so I wouldn't be surprised that this has an effect even though you can't find references in CVAT's source code.

Also I can see empirically that setting it does have an effect:

/path/to/cvat ((HEAD detached at v2.11.0))
 >> ./run-proxy.sh external
WARNING: Found orphan containers (cvat_redis) for this project. If you removed or renamed this service in your compose file, you can run this command with the --remove-orphans flag to clean it up.
cvat_redis_ondisk is up-to-date
cvat_redis_inmem is up-to-date
cvat_db is up-to-date
cvat_opa is up-to-date
Starting cvat_grafana ...
traefik is up-to-date
cvat_clickhouse is up-to-date
nuclio is up-to-date
cvat_worker_import is up-to-date
cvat_utils is up-to-date
cvat_worker_quality_reports is up-to-date
cvat_worker_analytics_reports is up-to-date
cvat_worker_webhooks is up-to-date
cvat_worker_export is up-to-date
cvat_worker_annotation is up-to-date
cvat_server is up-to-date
cvat_vector is up-to-date
Starting cvat_grafana ... done
/path/to/cvat ((HEAD detached at v2.11.0))
 >> wget <redacted - previously cvat.whatever.com>
--2024-04-27 14:16:20--  http://cvat.whatever.com/
Resolving cvat.whatever.com (cvat.whatever.com)... <redacted>, <redacted>, ...
Connecting to cvat.whatever.com (cvat.whatever.com)|<redacted>|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘index.html’

index.html          [ <=>   ]   1,24K  --.-KB/s    in 0s

2024-04-27 14:16:20 (120 MB/s) - ‘index.html’ saved [1265]

/path/to/cvat ((HEAD detached at v2.11.0))
 >> wget 192.168.0.17:8080
--2024-04-27 14:16:33--  http://192.168.0.17:8080/
Connecting to 192.168.0.17:8080... connected.
HTTP request sent, awaiting response... 404 Not Found
2024-04-27 14:16:33 ERROR 404: Not Found.

/path/to/cvat ((HEAD detached at v2.11.0))
 >> ./run-proxy.sh internal
WARNING: Found orphan containers (cvat_redis) for this project. If you removed or renamed this service in your compose file, you can run this command with the --remove-orphans flag to clean it up.
cvat_clickhouse is up-to-date
cvat_redis_inmem is up-to-date
Recreating traefik ...
cvat_db is up-to-date
cvat_redis_ondisk is up-to-date
Recreating cvat_grafana ...
nuclio is up-to-date
cvat_opa is up-to-date
cvat_vector is up-to-date
cvat_worker_annotation is up-to-date
cvat_utils is up-to-date
cvat_worker_import is up-to-date
cvat_worker_quality_reports is up-to-date
cvat_worker_export is up-to-date
Recreating traefik      ... done
Recreating cvat_grafana ... done
Recreating cvat_server  ... done
Recreating cvat_ui      ... done
/path/to/cvat ((HEAD detached at v2.11.0))
 >> wget cvat.whatever.com
--2024-04-27 14:16:51--  http://cvat.whatever.com/
Resolving cvat.whatever.com (cvat.whatever.com)... <redacted>, <redacted>, <redacted>, ...
Connecting to cvat.whatever.com (cvat.whatever.com)|<redacted>|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2024-04-27 14:16:51 ERROR 404: Not Found.

/path/to/cvat ((HEAD detached at v2.11.0))
 >> wget 192.168.0.17:8080
--2024-04-27 14:16:55--  http://192.168.0.17:8080/
Connecting to 192.168.0.17:8080... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1265 (1,2K) [text/html]
Saving to: ‘index.html.1’

index.html.1    100%[======>]   1,24K  --.-KB/s    in 0s

2024-04-27 14:16:55 (406 MB/s) - ‘index.html.1’ saved [1265/1265]

@marc31
Copy link

marc31 commented Apr 28, 2024

@brunodrugowick Thanks you

I notice that you have a modified version of the docker-compose.yml file. Indeed, when I look at https://github.com/cvat-ai/cvat/blob/v2.11.0/docker-compose.yml, I see that the string CSRF_TRUSTED_ORIGINS is not present, contrary to what I can see in your grep output.

Regarding the tests with wget, they are not relevant for CSRF_TRUSTED_ORIGINS because it would require POST requests sending response data to a form. However, these tests show us that the CVAT_HOST variable is taken into account to define container labels, allowing Traefik to redirect correctly (cf.

- traefik.http.routers.cvat.rule=Host(`${CVAT_HOST:-localhost}`) &&
). This does not necessarily mean that the variable is accessible for the django container, which you can verify by running:

docker compose exec -it cvat_server /bin/bash -c 'echo "$CVAT_HOST"'
docker compose exec -it cvat_server /bin/bash -c 'echo "$CSRF_TRUSTED_ORIGINS"'

Moreover, even if the variable were accessible in the docker container, it would still need to be added to the python file in the settings directory.

Note: There is another way to make this work without adding the CSRF_TRUSTED_ORIGINS variable to Django's settings, but it would still require setting the USE_X_FORWARDED_HOST variable to true in Django's configuration and modifying Traefik's configuration.

@brunodrugowick
Copy link

@marc31 ,

Regarding the tests with wget, they are not relevant for CSRF_TRUSTED_ORIGINS because it would require POST requests sending response data to a form. However, these tests show us that the CVAT_HOST variable is taken into account to define container labels, allowing Traefik to redirect correctly

Indeed... I completely ignored what CSRF actually is! 🤦🏼

I got into all this because I was not able to upload things to CVAT unless I was running locally, so I guess at some point I ran into CSRF (or maybe not?). Sorry for the confusion.

@pktiuk
Copy link
Contributor Author

pktiuk commented Apr 28, 2024

Things would be much simpler for users if this patch would be merged over half a year ago.

@st0w
Copy link

st0w commented Jul 29, 2024

I came across this because I was running into the same issue. However I figured out how this can be solved without needing to patch the existing codebase.

Brief background: we are running CVAT on Google Cloud, and have an application load balancer that terminates SSL. So we were running into the same issue with CSRF origin failures when POSTing to the admin section. I believe it is because when is_secure() is called in the Django CSRF middleware, it returns false - as expected, since SSL is terminated at the load balancer and the connection coming to the app server is HTTP. But this ends up with CVAT/Django seeing the origin as different, because locally it thinks it is http:// and not https://. It kept reporting that the https://cvat_host.tld was not a valid origin.

As has been discussed, part of the issue is that the production CVAT settings do not specify the CSRF_TRUSTED_ORIGINS list, which the development settings do. Setting the variable in the container environment doesn't matter, because it needs to be used by Django - which it isn't in the current CVAT production settings.

The solution is to use an overlay to patch the Django settings, in the same fashion as described in the CVAT documentation for LDAP authentication: https://docs.cvat.ai/docs/administration/advanced/ldap/

The only changes that are needed are as follows:

Create a local-settings.py file that will be used to overlay the standard production settings. Contents:

# Overlaying production
from cvat.settings.production import *

CSRF_TRUSTED_ORIGINS = ['https://your.domain.tld']

Then create a file for docker-compose to apply the settings; I named it docker-compose.settings_overlay.local.yml. Contents:

services:
  cvat_server:
    environment:
      DJANGO_SETTINGS_MODULE: settings
    volumes:
      - ./local-settings.py:/home/django/settings.py:ro

And then apply it along with the standard docker-compose.yml and any other additional configurations (e.g., I also use the external DB config):

docker compose -f /opt/cvat/cvat/docker-compose.yml \
        -f /opt/cvat/cvat/docker-compose.external_db.local.yml \
        -f /opt/cvat/cvat/docker-compose.settings_overlay.local.yml \
        up -d

No source code changes needed, and this will persist across rebuilds, updates, deployments, etc.

@ducviet00-h2
Copy link

It would help user alot if this PR was merged

@jpfleischer
Copy link

@st0w , thanks very much for your fix that works extremely well. CVAT is essentially broken when trying to deploy over cloudflare tunnel without doing this fix. It should be included in the main documentation.

@danbiagini
Copy link

@st0w this worked great. I ended up forking github.com/cvat-ai/cvat and creating a light overlay with these changes and also piping in the email parameters to the settings. I'm going to attempt to keep it up to date with cvat releases.

https://github.com/danbiagini/cvat-light

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CSRF verification failed. Request aborted - after update