Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: docker build #65

Closed
wants to merge 6 commits into from
Closed

fix: docker build #65

wants to merge 6 commits into from

Conversation

chrismclarke
Copy link
Member

@chrismclarke chrismclarke commented Jan 11, 2025

  • Add gitattributes so that required shell scripts preserve correct line endings on windows
  • Remove WORKDIR from dockerfile to fix issue related to https://github.com/fastapi/fastapi/issues/1988 on windows
  • Update dockerfile to avoid nested app/app folder structures
  • Remove dockerfile ENTRYPOINT script in favour of existing CMD (could see argued either way, but in practice using the entrypoint instead of leaving standard bash would limit what binaries available to subsequent CMD)
  • Add uvicorn to explicit list of requirements
  • Remove local app folder binding override (not required as container inlines build, although could be nice in the future if instead updating docker image with deps update only, independent of app code changes)

@istride
I'm not sure which of the updates above are strictly required to fix the current build, just that the combined fixes appear to be working as per the test action run I triggered via https://github.com/IDEMSInternational/epicsa-climate-api/actions/runs/12720672214?pr=65

Should also note that tests fail independently of this code, as the test scripts are currently used to quality-check production data which itself has issues. Recommend decoupling these two types of testing in the future, but beyond scope of this PR

@chrismclarke chrismclarke requested a review from istride January 11, 2025 23:36
@chrismclarke chrismclarke marked this pull request as ready for review January 11, 2025 23:36
@istride
Copy link
Contributor

istride commented Jan 13, 2025

I'd prefer that we did know exactly which changes are positively contributing to the fix, otherwise I'll feel that we're just guessing.

From my point of view (on Linux), the only problem is that uvicorn was not installed in the container - installing it fixes that problem. I don't have a Windows machine, so I cannot tell what other issues exist beyond that.

The /app/app/ folder structure looks redundant, but it is not intended to be. The first 'app' is the working directory, intended to hold all files pertaining to the running of the application in the container. I think it's important that this directory not be /, so that app and OS files are not mixed together. The second 'app' is the root Python module of the application and exists here as a convenience - as a mount point for application source files, and access to the tests. My preference is for this structure to remain as it was.

The entrypoint script is intended to make the Google credentials file available to the app at the correct location.The credentials are currently required when running the app (via uvicorn) and when running the tests (via pytest). The script is intended to do its thing and then run whatever command is passed to the container, or the default command. If you really want to temporarily skip the entrypoint, I recommend using the --entrypoint flag or entrypoint directive in Compose. I would prefer that the entrypoint configuration remain as it was because I cannot see how it would affect how the container runs in Windows.

The app directory mount in the Compose file is a convenience for developers, so that the container does not have to be rebuilt after every code change.

The .gitattributes update seems unnecessary to me, but I can't be sure because I don't know what problem it's attempting to solve.

Comment on lines 17 to 25
######################
# Debugging
######################

# entrypoint: /bin/bash -l -c "tail -f /dev/null"
# command: tail -f /dev/null
# volumes:
# - "./app:/app"
# - "./service-account.json:/app/service-account.json"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer not to have commented directives. I recommend storing this in a docker-compose.override.yml file (which should be git ignored), if these are personal preferences. If they are intended to be used by others, I suggest creating a separate docker compose file - e.g. docker-compose.debug.yml.

Copy link
Member Author

@chrismclarke chrismclarke Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good suggestions, I've moved over to the troubleshooting readme with 552c24e

@chrismclarke
Copy link
Member Author

chrismclarke commented Jan 13, 2025

I'd prefer that we did know exactly which changes are positively contributing to the fix, otherwise I'll feel that we're just guessing.

Fair point. The suggestion was that all the proposed changes were positive contributions in one way or another, although good to take the case-by-case comments as you've outlined.

From my point of view (on Linux), the only problem is that uvicorn was not installed in the container - installing it fixes that problem. I don't have a Windows machine, so I cannot tell what other issues exist beyond that.

Possible, feel free if you want to open an alternate PR to just fix the core build and I can revert this to draft/follow-up. For now these changes do manage to unblock the work I need to get done on the picsa app, so not super time-sensitive (but would be nice to have a working build soon)

The /app/app/ folder structure looks redundant, but it is not intended to be. The first 'app' is the working directory, intended to hold all files pertaining to the running of the application in the container. I think it's important that this directory not be /, so that app and OS files are not mixed together. The second 'app' is the root Python module of the application and exists here as a convenience - as a mount point for application source files, and access to the tests. My preference is for this structure to remain as it was.

Oh, good to know. I think it would be better to use some alternate naming conventions to make it obvious when referring to app what layer we are currently in (given the possibility for both absolute and relative paths in code). If there is strong reason to keep /app/app then this should be commented inline in the code as it is an otherwise somewhat unexpected/confusing pattern.

The entrypoint script is intended to make the Google credentials file available to the app at the correct location.The credentials are currently required when running the app (via uvicorn) and when running the tests (via pytest). The script is intended to do its thing and then run whatever command is passed to the container, or the default command. If you really want to temporarily skip the entrypoint, I recommend using the --entrypoint flag or entrypoint directive in Compose. I would prefer that the entrypoint configuration remain as it was because I cannot see how it would affect how the container runs in Windows.

I understand the need for this, I'm just not entirely convinced this is the best way to go about it. The entrypoint is designed to be a gateway to the shell/binary/cli where commands should run from, e.g. running python which can take a name of script/args to execute. If we really want to use an entrypoint to configure the environment, I think best practice would be to also use that shell to trigger the python script
https://www.docker.com/blog/docker-best-practices-choosing-between-run-cmd-and-entrypoint/

The app directory mount in the Compose file is a convenience for developers, so that the container does not have to be rebuilt after every code change.

The build step from the code changes only take a couple seconds due to the docker cache layers, so I don't think this is significant. The place it could save time is if developers are running the api locally through the docker container, although if promoting this we should update the readme accordingly. Should also note that live-reload will not work in windows due to the way file change detection is propogated through containers (not tested myself on linux, would be interested to know if it does).

The .gitattributes update seems unnecessary to me, but I can't be sure because I don't know what problem it's attempting to solve.

This comment seems to ignore what I wrote in the description, which is a little frustrating.

Add gitattributes so that required shell scripts preserve correct line endings on windows

To re-iterate, when checking out a git repo on windows line endings are converted to CRLF by default, unless a user has specifically changed global git config. This means if building locally any bash file will no longer be executable within a linux container. Only bash files are sensitive to these types of changes, so makes sense to just handle with a .gitattributes file to preserve those line-endings. It is a minor fix required just for windows, and easier than explaining to potential devs how to configure global line-endings. More info at:
https://docs.github.com/en/get-started/getting-started-with-git/configuring-git-to-handle-line-endings

@istride
Copy link
Contributor

istride commented Jan 14, 2025

I did not know that git converts line endings by default on Windows - thanks for explaining. I now fully understand why the .gitattributes change is necessary. It seems to me that this change and installing uvicorn are the only two changes that are fixing the build and Windows-specific issues. I consider the other changes to be contentious, so I would prefer them not to be part of this PR.

Live reload works on Linux and Mac, so it would be best to keep the mount point and binding in place.

I too would prefer not to use entrypoint, and I think the way to avoid it is to change the app so that the location of the credentials file is configurable (via env vars) and not hardcoded, as it is currently. However, I consider this to be out of scope.

Comment on lines +135 to +144
**Docker container fails**
This can happen for many reasons, check the logs for any specific errors raised.

If troubleshooting the code in the container, it may be useful to bypass default entrypoint or commands so that the container can be kept running and inspected.
Typically this can be done by overriding the `docker-compose.yml`, such as:

```yml
entrypoint: /bin/bash -l -c "tail -f /dev/null"
command: tail -f /dev/null
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to go on about this - I'm not trying to torment you, honestly.

I would like to know how this helps with troubleshooting since I have never done anything like this before. What is the method?

I would like to avoid any suggestion that the Docker Compose file should be modified temporarily to accommodate a short-term interest, like debugging. The reason being to avoid accidental commits of overrides. I would like to make it very clear that short-term or personal overrides should go in docker-compose.override.yml.

Copy link
Member Author

@chrismclarke chrismclarke Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's quite often times that I find a docker container fails and the messages reported not always very clear or actionable. A simple case like a python "module not found" error could be related to a file not being present, having incorrect permissions, or any number of other issues related to the code, dependencies etc. Usually as a starting point for debugging I various other potential issues which are more easily debugged by spinning up the container, looking at the files, and trying to execute commands from a shell within the container itself.

The trouble will this is if a container fails it immediately shuts down. If it has a restart policy it will try to restart, but fail again and shut down ad-infinitum.

So two tools I frequently find myself falling back on are:

  1. The ability to spin up a container and run a different command, with the sole purpose of getting it to stay running. There's various ways to do this depending on OS and shell in use, one link I have bookmarked for such occasions is:
    https://kodekloud.com/blog/keep-docker-container-running/

Most of the time you can simply provide an override command in the docker-compose to do this. Although in the case above standard commands like cat and tail were not working due to a custom entrypoint being specified, meaning this needed override too (as in the troubleshooting example).

  1. The ability to remote into the container to view files and execute directly on the container. While this is possible through docker commands like docker exec -it /bin/bash ... (https://kodekloud.com/blog/docker-exec/), I find it easier to use the vscode devcontainers extension to attach to a running container, from where you can view all files within the container and run commands
    https://code.visualstudio.com/docs/devcontainers/attach-container

Debugging is indeed a short term interest, but here I think the priority should be on efficiency more than isolation. I would say most developers tend to lean into things like adding console logs and temporary file overrides as part of the debug process, and it is routine to rely on developer competencies, review processes and automated testing to try make sure things that shouldn't be included in production don't get committed (same goes for most code in general). I think this is better than trying to setup temporary environments just for debugging, as even with a dedicated compose override developers will still need to make ongoing modifications to production files while troubleshooting. If we need more strict safeguards I would recommend looking more into linting or other CI automation that can flag issues as part of review process.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not clear enough that the intention is to use 'exec' or something like devcontainers once the container is running. I understand the advice you have given as your personal preference for debugging, in which case, those changes should be kept in your override file. I think it is worthwhile to encourage people to use the override file first, then, if the change is generally applicable, to change the default Compose file.

As you have noted, there are several ways to keep a container running indefinitely. When I need to inspect a container I tend to use a run command. For this project, it would be:

docker compose run --rm app bash

This keeps the container running, so I assume that it would also be compatible with the devcontainers extension. If so, there is no need to change the Docker Compose file for the sake of debugging. If the entrypoint also needs to be overriden, the --entrypoint flag will do that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please try to clarify which of your comments are blocking on this PR vs not. I'm not sure whether you are strongly opposed to me having that text in the readme or not, and am quite keen to get some form of PR merged to unblock the main build.

Alternatively if you are able to open a PR with just the changes you deem necessary I'd be happy to review, merge and deal with whatever other follow-up in a separate PR

@istride
Copy link
Contributor

istride commented Jan 24, 2025

I intended to extract the essential changes that fix the Docker issues and create a new PR, but I have ended up pushing those changes directly to main. This is an honest mistake on my part - sorry about that.

Nevertheless, would you check that the main branch works for you. Please raise a new issue or PR to address any remaining problems. If you agree, I will close this PR.

@chrismclarke chrismclarke marked this pull request as draft January 24, 2025 19:47
@chrismclarke
Copy link
Member Author

chrismclarke commented Jan 24, 2025

I intended to extract the essential changes that fix the Docker issues and create a new PR, but I have ended up pushing those changes directly to main. This is an honest mistake on my part - sorry about that.

Nevertheless, would you check that the main branch works for you. Please raise a new issue or PR to address any remaining problems. If you agree, I will close this PR.

No worries, it happens.

I checked out out main and for the most part worked, just needed to install python-dotenv which I've also added to the requirements. Will close here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants