Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Version of Frozen Lake based on the implementation #186

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

LooseTerrifyingSpaceMonkey

from python's gymnasium.

@Sid-Bhatia-0
Copy link
Member

Thank you for the pull request @LooseTerrifyingSpaceMonkey . Will review it by this weekend.

@Sid-Bhatia-0
Copy link
Member

Hi @LooseTerrifyingSpaceMonkey . Thanks for the PR.

I cloned your branch and tried testing the environment. But couldn't get it to work. Had you tested the environment? I am getting the following error (This probably has something to do with line 33 where default value of map_name = String.):

(base)  (frozen_lake_impl) example $ julia --project=.
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.8.5 (2023-01-08)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

(example) pkg> st
Status `~/projects/temp/GridWorlds.jl/example/Project.toml` (empty project)

(example) pkg> dev --local ../../GridWorlds.jl/
   Resolving package versions...
    Updating `~/projects/temp/GridWorlds.jl/example/Project.toml`
  [e15a9946] + GridWorlds v0.5.0 `..`
    Updating `~/projects/temp/GridWorlds.jl/example/Manifest.toml`
⌅ [1520ce14] + AbstractTrees v0.3.4
  [d842c3ba] + CommonRLInterface v0.3.1
  [34da2185] + Compat v4.6.1
  [864edb3b] + DataStructures v0.18.13
  [e15a9946] + GridWorlds v0.5.0 `..`
  [1914dd2f] + MacroTools v0.5.10
  [bac558e1] + OrderedCollections v1.6.0
⌅ [e575027e] + ReinforcementLearningBase v0.9.7
  [56f22d72] + Artifacts
  [2a0f44e3] + Base64
  [ade2ca70] + Dates
  [b77e0a4c] + InteractiveUtils
  [8f399da3] + Libdl
  [37e2e46d] + LinearAlgebra
  [56ddb016] + Logging
  [d6f4376e] + Markdown
  [de0858da] + Printf
  [3fa0cd96] + REPL
  [9a3f8284] + Random
  [ea8e919c] + SHA v0.7.0
  [9e88b42a] + Serialization
  [6462fe0b] + Sockets
  [8dfed614] + Test
  [cf7118a7] + UUIDs
  [4ec0a83e] + Unicode
  [e66e0078] + CompilerSupportLibraries_jll v1.0.1+0
  [4536629a] + OpenBLAS_jll v0.3.20+0
  [8e850b90] + libblastrampoline_jll v5.1.1+0
        Info Packages marked with ⌅ have new versions available but compatibility constraints restrict them from upgrading. To see why use `status --outdated -m`

julia> import GridWorlds as GW
[ Info: Precompiling GridWorlds [e15a9946-cd7f-4d03-83e2-6c30bacb0043]

julia> # Each environment `Env` lives in its own module `EnvModule`
       # For example, the `SingleRoomUndirected` environment lives inside the `SingleRoomUndirectedModule` module

       env = GW.FrozenLakeUndirectedModule.FrozenLakeUndirected()
Obstacle Positions: CartesianIndex{2}[CartesianIndex(140689686970704, 140689686970960), CartesianIndex(140689686971216, 140689686971472), CartesianIndex(140689686972080, 140689686972496), CartesianIndex(140689686972528, 140689686972560)] Height: 8 Width: 8
ERROR: BoundsError: attempt to access 4×10×10 BitArray{3} at index [4, 140689686970704, 140689686970960]
Stacktrace:
 [1] throw_boundserror(A::BitArray{3}, I::Tuple{Int64, Int64, Int64})
   @ Base ./abstractarray.jl:703
 [2] checkbounds
   @ ./abstractarray.jl:668 [inlined]
 [3] _setindex!
   @ ./abstractarray.jl:1366 [inlined]
 [4] setindex!
   @ ./abstractarray.jl:1344 [inlined]
 [5] update_obstacles_on_map(tile_map::BitArray{3}, obstacle_positions::Vector{CartesianIndex{2}})
   @ GridWorlds.FrozenLakeUndirectedModule ~/projects/temp/GridWorlds.jl/src/envs/frozen_lake_undirected.jl:87
 [6] GridWorlds.FrozenLakeUndirectedModule.FrozenLakeUndirected(; map_name::Type, R::Type, height::Int64, width::Int64, num_obstacles::Int64, rng::Random._GLOBAL_RNG, is_slippery::Bool)
   @ GridWorlds.FrozenLakeUndirectedModule ~/projects/temp/GridWorlds.jl/src/envs/frozen_lake_undirected.jl:71
 [7] GridWorlds.FrozenLakeUndirectedModule.FrozenLakeUndirected()
   @ GridWorlds.FrozenLakeUndirectedModule ~/projects/temp/GridWorlds.jl/src/envs/frozen_lake_undirected.jl:33
 [8] top-level scope
   @ REPL[4]:4

julia>

Copy link
Member

@Sid-Bhatia-0 Sid-Bhatia-0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need to mention this environment in the README.

I used some tool (maybe asciinema) to generate the GIFs in the README. But we can figure that out later.

Also, please make sure to test the code thoroughly this time 😅

Thanks.

src/envs/frozen_lake_undirected.jl Outdated Show resolved Hide resolved
src/envs/frozen_lake_undirected.jl Outdated Show resolved Hide resolved
src/envs/frozen_lake_undirected.jl Outdated Show resolved Hide resolved
src/envs/frozen_lake_undirected.jl Outdated Show resolved Hide resolved
src/envs/frozen_lake_undirected.jl Outdated Show resolved Hide resolved
src/envs/frozen_lake_undirected.jl Outdated Show resolved Hide resolved
src/envs/frozen_lake_undirected.jl Outdated Show resolved Hide resolved
src/envs/frozen_lake_undirected.jl Outdated Show resolved Hide resolved
changed sizing to include walls,
updated README.md with description
(still need images...),
renamed OBSTACLE to HOLE,
changed the hole character.
src/envs/frozen_lake_undirected.jl Outdated Show resolved Hide resolved
src/envs/frozen_lake_undirected.jl Outdated Show resolved Hide resolved

agent_position = CartesianIndex(2, 2)
tile_map[AGENT, agent_position] = true
if randomize_start_end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This randomization should also be done when resetting the environment.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a significant change from the gymnasium model.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't used the gymnasium one. Do you mean for every episode, the map remains the same?
I think it is valuable to have an option to be able to randomize the map on every episode to facilitate generalization. Most other environments in this package are like that. We could add a boolean to toggle it off if a user wants to keep the same map and adhere to the gym specification. Does that sound reasonable?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the map remains the same. From what I have seen it is used for q-learning and dynamic programming examples. If the map changed after every episode the agent wouldn't learn. I will set a boolean so the user has the option to turn on the resetting.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you provide some guidance on unit testing with all these different flags? I don't understand how the settings are passed in runtests.jl

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a boolean works.

Unfortunately, automatic unit testing all flag combinations for these games is a bit hard, and isn't implemented in runtests.jl yet. So for now, you can just test the default settings for an environment in runtests.jl:

env = GW.RLBaseEnv(Env(R = R))
and test the rest of the combinations manually by directly playing these games directly in the terminal by doing GW.play!(env).

src/envs/frozen_lake_undirected.jl Outdated Show resolved Hide resolved
obstacle_position = GW.sample_empty_position(rng, tile_map)
obstacle_positions[i] = obstacle_position
if map_name == ""
function get_neighbors(state::CartesianIndex)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reasons of having these functions defined inside the constructor? If not, we can move them outside. Also, we will need to run A* when resetting the environment too. So we will be needing them there.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, because they are never used elsewhere. They won't be used when resetting the environment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reopening it as per above

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, just saying, I haven't seen a lot of code where functions are put inside other functions, even if they are not used elsewhere. In closures it could be useful when you want to capture other variable and such. But other than that, I haven't seen it happening much unless there is some specific reason. I may be wrong, but I think it is clearer to have them outside, unless there is a specific need, in my opinion.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually see this quite a bit within Python code written by academics. I don't see it by people in industry. I don't know why. I can change it though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Please go ahead with the change. Thanks.


The objective of the agent is to navigate its way to the goal while avoiding falling into the holes in the lake. When the agent reaches the goal, it receives a reward of 1 and the environment terminates. If the agent collides falls into a hole, the agent receives a reward of -1 and the environment terminates. The probablility of moving in the direction given by an agent is 1/3 while there is 1/3 chance to move in either perpendicular direction (for example: 1/3 chance to move up, 1/3 chance to move left and 1/3 chance to move right if the agent chose up). The scenario is based on the [Frozen Lake environment](https://gymnasium.farama.org/environments/toy_text/frozen_lake/) in Python's gymnasium. In the Python version there are two preset maps: "4x4" and "8x8". The GridWorlds implementation includes the walls as part of the dimensions, so the equivalent maps in GridWorlds is "6x6" and "10x10" respectively. The start, goal, and holes are located in the same positions in the lake as the Python version. If specifying custom height and widths keep in mind it is going to add walls all around the map so the actual surface of the lake is (height - 2, width - 2).

<img src="https://user-images.githubusercontent.com/32610387/126910030-d93a714d-10b7-4117-887c-773afe78c625.png">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The screenshot and gif are from DynamicObstaclesUndirected. We need to create new ones for FrozenLakeUndirected.

We can do that in the end.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are correct.

src/envs/frozen_lake_undirected.jl Outdated Show resolved Hide resolved
src/envs/frozen_lake_undirected.jl Outdated Show resolved Hide resolved

agent_position = CartesianIndex(2, 2)
tile_map[AGENT, agent_position] = true
if randomize_start_end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't used the gymnasium one. Do you mean for every episode, the map remains the same?
I think it is valuable to have an option to be able to randomize the map on every episode to facilitate generalization. Most other environments in this package are like that. We could add a boolean to toggle it off if a user wants to keep the same map and adhere to the gym specification. Does that sound reasonable?

obstacle_position = GW.sample_empty_position(rng, tile_map)
obstacle_positions[i] = obstacle_position
if map_name == ""
function get_neighbors(state::CartesianIndex)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reopening it as per above

src/envs/frozen_lake_undirected.jl Show resolved Hide resolved
obstacle_position = GW.sample_empty_position(rng, tile_map)
obstacle_positions[i] = obstacle_position
if map_name == ""
function get_neighbors(state::CartesianIndex)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, just saying, I haven't seen a lot of code where functions are put inside other functions, even if they are not used elsewhere. In closures it could be useful when you want to capture other variable and such. But other than that, I haven't seen it happening much unless there is some specific reason. I may be wrong, but I think it is clearer to have them outside, unless there is a specific need, in my opinion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants