Assignment 5

Due Date: 3 April at 5:00 p.m.  (Note that this assignment is to be completed individually)


The task in this assignment is to implement the value iteration planning algorithm.  The environment is a simple grid-based world which is simulated in ROS through a node called gridsim.  The agent that lives in this world can move to any of the 8 adjacent cells in the grid, as long as the map does not indicate that the destination cell is an obstacle.
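
As a rough illustration of the movement rule above, the following sketch lists the cells reachable in one move.  The function name, the flat row-major grid layout, and the obstacle encoding are assumptions made for this example: it treats any cell value at or above a threshold as an obstacle, in the style of the nav_msgs/OccupancyGrid convention (0 to 100 for occupancy, -1 for unknown), which may not match how the assignment's map is actually encoded.

```python
# Hypothetical helper, not part of the provided starter code.
def free_neighbours(grid, width, height, i, j, obstacle_threshold=50):
    """Return the (column, row) cells reachable in one move from (i, j).

    grid is assumed to be a flat, row-major list of occupancy values.
    Values at or above obstacle_threshold are treated as obstacles
    (an assumption; unknown cells marked -1 count as free here).
    """
    result = []
    for dj in (-1, 0, 1):
        for di in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue  # staying put is not one of the 8 moves
            ni, nj = i + di, j + dj
            if not (0 <= ni < width and 0 <= nj < height):
                continue  # destination lies off the map
            if grid[width * nj + ni] >= obstacle_threshold:
                continue  # destination cell is an obstacle
            result.append((ni, nj))
    return result
```

A cell in the interior of an empty map has 8 neighbours under this rule, while corner and edge cells have fewer.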

Download the following zip file which contains all necessary components:

Your job is to fill in the value_it method in  You can make the following simplifying assumptions:

  • The same payoff function as discussed in class can be used.
  • The movement of the robot can be considered deterministic.  This means that in any state x_i when an action u is taken, only one new state x_j is possible.


To test your code, execute the following: roslaunch comp4766_a5 office.launch.  Take a look at the launch file to see what is going on.  The following nodes are created:

  • The simulator, gridsim
  • The map server, map_server, which provides the other nodes with access to the map
  • rviz
  • A Planner object, defined in  The Planner will call your value_it function

Once the launch file is executed, you should see the map, overlaid with a green square indicating the agent’s position.  Nothing will happen until a goal position is selected.  Click on the “2D Nav Goal” button, then click somewhere within the map (this tool allows a pose to be selected, but our agents have no orientation, so the selected direction is ignored).  If you uncomment the starter code provided for value_it, you will see a random value function superimposed over the map in blue.  You should also see randomly oriented arrows, which are intended to show the control policy.  The agent may make a few movements.  However, since the control policy is random, we cannot expect much to happen.

Once your implementation is complete, run the launch file again.  You should then see a sensible value function and control policy.
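
For reference, a control policy can be read off a converged value function by pointing each cell at its best neighbour.  The sketch below is only an illustration: the function name, the (di, dj) move representation, and the obstacle threshold are assumptions, and the form in which the Planner expects the policy is not specified here.  It also assumes every move has the same payoff, so comparing neighbour values alone is enough.

```python
# Hypothetical sketch; the policy format the Planner expects may differ.
def greedy_policy_sketch(values, grid, width, height, obstacle_threshold=50):
    """Return a flat list holding a (di, dj) move per cell, or None.

    A cell gets None if it is an obstacle or if no neighbour has a
    strictly higher value than the cell itself (for example the goal).
    """
    moves = [(di, dj) for dj in (-1, 0, 1) for di in (-1, 0, 1)
             if (di, dj) != (0, 0)]
    policy = [None] * (width * height)
    for j in range(height):
        for i in range(width):
            index = width * j + i
            if grid[index] >= obstacle_threshold:
                continue  # no action is defined inside an obstacle
            best_value = values[index]
            for di, dj in moves:
                ni, nj = i + di, j + dj
                if not (0 <= ni < width and 0 <= nj < height):
                    continue
                n_index = width * nj + ni
                if grid[n_index] >= obstacle_threshold:
                    continue
                if values[n_index] > best_value:
                    best_value = values[n_index]
                    policy[index] = (di, dj)
    return policy
```

In a sensible result, following the arrows from any reachable cell leads to the goal through strictly increasing values.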


  • The data stored in occupancy_grid is a one-dimensional list that holds two-dimensional data.  This data is stored in row-major order, which means that it is filled row-by-row.  To access column i, row j, do the following: [width * j + i]
    where width is defined as follows:
    width =
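
The row-major indexing rule can be checked on a small made-up list.  In this sketch the helper names are hypothetical and width is set to an arbitrary 4 for the example; it is not how the assignment obtains the real map width.

```python
# Hypothetical helpers illustrating row-major indexing.
def cell_index(i, j, width):
    """Flat index of column i, row j in a row-major list."""
    return width * j + i

def cell_coordinates(index, width):
    """Inverse mapping: flat index back to (column, row)."""
    j, i = divmod(index, width)
    return i, j

example_width = 4                # made-up width for this example
example_grid = list(range(12))   # 3 rows of 4 cells, filled row-by-row
value = example_grid[cell_index(1, 2, example_width)]  # column 1, row 2
```

Because the list is filled row-by-row, moving one column to the right adds 1 to the flat index, and moving one row down adds width.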


Submit by the deadline given above.