Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request]: Restrict Special Characters in Autogenerated Image Filenames for Security and Compatibility #21429

Open
TARishabh opened this issue Dec 12, 2024 · 7 comments
Assignees
Labels
enhancement Label to indicate an issue is a feature/improvement triage needed

Comments

@TARishabh
Copy link
Collaborator

Is your feature request related to a problem? Please describe.

Currently, the system autogenerates image filenames, but there is no enforced restriction on disallowing certain special characters such as <, >, &, and others that could lead to potential security or compatibility issues. Introduce security vulnerabilities like XSS or injection attacks if filenames are not sanitized properly.

Describe the solution (or solutions) you'd like

Restrict allowed characters in filenames, make sure autogenerated filenames are limited to the following characters:

  • Alphanumeric characters (a-z, A-Z, 0-9)
  • Hyphens (-)
  • Underscores (_)
  • Periods (.) for extensions

Describe alternatives you've considered and rejected

No response

Additional context

invalid_file_names.mp4
@TARishabh TARishabh added triage needed enhancement Label to indicate an issue is a feature/improvement labels Dec 12, 2024
@TARishabh
Copy link
Collaborator Author

Hi @seanlip , Could you please confirm if the solutions I’ve suggested are correct? Also, are there any additional characters you suggest restricting in filenames for better security (beyond <, >, etc)? Thanks!

@seanlip
Copy link
Member

seanlip commented Dec 13, 2024

Hi @TARishabh, the approach of going with an allowlist for filenames seems good. Better to do that than a denylist.

One question: for your example video, how do the images currently get locally stored in your local GCS? What are the filenames there? (Need this info to assess how important this issue is.)

@TARishabh
Copy link
Collaborator Author

@seanlip Sorry for the delay ! Currently, in my local setup, images are stored in the following format within the GCS file system:
<entity>/<entity-id>/assets/image/<image_name>

For example:

Entity: exploration
Entity ID: 2eMUrx1Y2uRb
Image name: img_20241215_181758_c1er5we4z9_height_350_width_450.svg

This results in the GCS file path:
exploration/2eMUrx1Y2uRb/assets/image/img_20241215_181758_c1er5we4z9_height_350_width_450.svg

When accessed locally during development, the full URL to the image is:
http://localhost:8181/assetsdevhandler/exploration/2eMUrx1Y2uRb/assets/image/img_20241215_181758_c1er5we4z9_height_350_width_450.svg

Thanks !

@seanlip
Copy link
Member

seanlip commented Dec 16, 2024

Thanks, sounds good. I think this means we autogenerate the image names so this might not be super-high priority, but still would be good to fix. Thanks!

@seanlip
Copy link
Member

seanlip commented Dec 16, 2024

@TARishabh I looked at this again and I think the issue, as written, needs more clarification.

It looks like autogenerated filenames already follow the restrictions you laid out. Shouldn't the issue be to update the backend code that processes the creation of files on GCS? If so, could you please link to the relevant file in the issue description? If not, could you please clarify where the validation should happen (backend, frontend, connection to GCS, etc.)?

Thanks!

@TARishabh
Copy link
Collaborator Author

@seanlip , Sorry for the delay.

To clarify, while the autogenerated filenames currently follow a basic pattern, the goal of this issue is to further restrict special characters (like <, >, &, etc.) to enhance security and compatibility, especially in cases where the filenames might be exposed to external systems or used in URLs.

I agree that the validation needs to be enforced during the backend file creation process before files are uploaded to GCS. I believe it would be best to handle the sanitization there, as this would ensure consistency across all scenarios, whether the files are being accessed locally or remotely.

I'll update the issue description to include this information and link the relevant backend file that processes file creation on GCS. Does this sound good?

Thanks !

@seanlip
Copy link
Member

seanlip commented Dec 31, 2024

@TARishabh Yes, seems fine, thanks. If it's a simple fix, feel free to make it directly too. Thanks!

@TARishabh TARishabh self-assigned this Jan 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Label to indicate an issue is a feature/improvement triage needed
Projects
None yet
Development

No branches or pull requests

2 participants