My main research focuses on subgoal identification in reinforcement learning. Subgoal identification helps an agent that is learning its environment accomplish tasks more efficiently. The difference between classical Q-learning and Macro-Q learning with dynamic subgoal identification is evident in the two animations below, prepared by Alper Demir.
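As background for the comparison, here is a minimal sketch of the two-rooms domain together with a flat tabular Q-learning step. The grid layout (two 5x5 rooms joined by one doorway) and the learning parameters are illustrative assumptions, not the settings used in the experiments described here.

```python
import random
from collections import defaultdict

# Illustrative layout: columns 0-4 form the left room, column 5 is the
# wall, columns 6-10 form the right room. These dimensions are assumed.
HEIGHT, WIDTH = 5, 11
DOOR = (2, 5)                  # the single doorway cell in the wall
GOAL = (4, 10)                 # bottom-right corner of the right room
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1   # assumed learning parameters

def is_free(row, col):
    """A cell is free if it lies inside the grid and is not part of the wall."""
    inside = 0 <= row < HEIGHT and 0 <= col < WIDTH
    return inside and (col != 5 or (row, col) == DOOR)

def step(state, action):
    """Apply a primitive action; moves into the wall leave the agent in place."""
    dr, dc = ACTIONS[action]
    nxt = (state[0] + dr, state[1] + dc)
    if not is_free(*nxt):
        nxt = state
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

def q_update(Q, s, a, r, s_next):
    """One-step Q-learning: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[(s_next, a2)] for a2 in ACTIONS)
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])

def choose(Q, s):
    """Epsilon-greedy selection over the primitive actions."""
    if random.random() < EPSILON:
        return random.choice(list(ACTIONS))
    return max(ACTIONS, key=lambda a: Q[(s, a)])
```

With only primitive actions and a sparse goal reward, the agent has no incentive to pass through the doorway until it stumbles upon it, which is what makes the bottleneck cell such a useful subgoal.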
The following figure shows the classical two-rooms grid domain with a single doorway. The agent starts each episode in a random cell of the left room and tries to reach the bottom-right corner of the right room.
For Q-learning without subgoal identification, this is the trace of the agent's history in the 100th episode on this domain:
When the agent detects a subgoal close to the doorway (the cell colored white), as shown in the figure below:
then the agent generates an option to reach that state, and the trace of the agent's history in the 100th episode becomes noticeably more goal-oriented, as can be seen in the animation below:
The gain in learning efficiency that the agent obtains from dynamic subgoal identification is significant. For details of these learning processes, please refer to Alper's webpage.
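Conceptually, the generated option is a temporally extended action in the sense of the options framework of Sutton, Precup and Singh: an initiation set, an internal policy, and a termination condition at the subgoal. The sketch below is illustrative only; the field names and the hand-written doorway policy are assumptions, not the learned option shown in the animations.

```python
from dataclasses import dataclass
from typing import Callable, Set, Tuple

State = Tuple[int, int]

@dataclass
class Option:
    """An option: initiation set, internal policy, and termination condition."""
    initiation_set: Set[State]         # states where the option may be invoked
    policy: Callable[[State], str]     # internal policy steering toward the subgoal
    subgoal: State                     # the identified subgoal state

    def terminates(self, state: State) -> bool:
        # Terminate on reaching the subgoal or on leaving the initiation set.
        return state == self.subgoal or state not in self.initiation_set

# Illustrative option that walks straight toward a doorway at (2, 5);
# in practice the internal policy would itself be learned.
door_option = Option(
    initiation_set={(2, c) for c in range(5)},
    policy=lambda s: "right",
    subgoal=(2, 5),
)
```

Once such an option is available, Macro-Q learning can update its value alongside the primitive actions, which is what lets the agent head for the doorway instead of wandering the left room.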