Learning tabula rasa can be unnecessarily slow
Humans can use past information
- Soccer with different numbers of players
- Different state variables and actions
Agents: leverage learned knowledge in novel/modified tasks
Model-Free
- Q-Learning, Sarsa, etc.
- Learn values of actions
- In example: ~256 actions
Model-Based
- Dyna-Q, R-Max, etc.
- Learn effects of actions (“what is the next state?” → planning)
- In example: ~36 actions
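The contrast between the two families can be sketched in a few lines. This is a minimal tabular illustration, not the method used in the talk: the learning rate, discount factor, state and action names, and the deterministic one-entry-per-pair model are all assumptions chosen for brevity.

```python
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.9  # assumed learning rate and discount factor

def q_learning_update(Q, s, a, r, s_next, actions):
    """Model-free: move Q(s, a) toward the one-step bootstrapped target."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])

def model_update(model, s, a, r, s_next):
    """Model-based: remember the observed effect of taking a in s."""
    model[(s, a)] = (r, s_next)

def plan(Q, model, actions, sweeps=10):
    """Dyna-style planning: replay remembered transitions to update Q."""
    for _ in range(sweeps):
        for (s, a), (r, s_next) in model.items():
            q_learning_update(Q, s, a, r, s_next, actions)

actions = ["left", "right"]          # hypothetical action set
Q = defaultdict(float)
model = {}
model_update(model, 0, "right", 0.0, 1)  # two real experiences
model_update(model, 1, "right", 1.0, 2)
plan(Q, model, actions)                  # many updates, no new experience
```

The model-based learner reuses each stored transition many times during planning, which is why instances transferred from a source task can stand in for real target-task experience.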
Transferring Instances for Model-Based REinforcement Learning (TIMBREL)
Transfer between
- Model-learning RL algorithms
- Different state variables and actions
- Continuous state spaces
In this paper, we use:
χ_X: s_target → s_source
- Given a state variable in the target task (some x from s = x1, x2, …, xn)
- Return the corresponding state variable in the source task
χ_A: a_target → a_source
Intuitive mappings exist in some domains (Oracle)
Mappings can be learned (e.g., Taylor, Kuhlmann, and Stone (2008))
3D Mountain Car
- x, y, ẋ, ẏ
- Neutral, West, East, South, North
χ_X
χ_A
- Neutral → Neutral
- West, South → Left
- East, North → Right
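The mappings above can be applied to a recorded source-task instance roughly as follows. This is a simplified sketch rather than the exact TIMBREL procedure: the action mapping follows the slide, while the state-variable mapping `CHI_X`, the variable names `x_dot` and `y_dot`, the dictionary-based state format, and the helper `transfer_instance` are assumptions made for illustration.

```python
# Inter-task mappings from the target task (3D Mountain Car) to the
# source task (2D Mountain Car). CHI_A follows the slide; CHI_X and the
# source variable names are assumptions made for illustration.
CHI_A = {"Neutral": "Neutral", "West": "Left", "South": "Left",
         "East": "Right", "North": "Right"}
CHI_X = {"x": "x", "y": "x", "x_dot": "x_dot", "y_dot": "x_dot"}

def transfer_instance(source_instance, target_action):
    """Translate one source (s, a, r, s') tuple into a target-task tuple.

    Returns None when the source action is not the image of target_action
    under CHI_A, because such an instance says nothing about that action.
    """
    s, a, r, s_next = source_instance
    if CHI_A[target_action] != a:
        return None
    # Each target variable takes the value of its mapped source variable.
    s_target = {var: s[CHI_X[var]] for var in CHI_X}
    s_next_target = {var: s_next[CHI_X[var]] for var in CHI_X}
    return (s_target, target_action, r, s_next_target)

# Hypothetical recorded source instance: (state, action, reward, next state).
source = ({"x": -0.5, "x_dot": 0.0}, "Right", -1.0,
          {"x": -0.49, "x_dot": 0.01})
east = transfer_instance(source, "East")   # usable: East maps to Right
west = transfer_instance(source, "West")   # None: West maps to Left
```

The translated tuples could then be added to the data used to fit the target-task model in regions where real target data is still sparse.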
Fitted R-Max balances:
- sample complexity
- computational complexity
- asymptotic performance
Instance Transfer in Fitted Q Iteration
Transferring Regression Model of Transition Function
- Atkeson and Santamaria, 1997
Ordering Prioritized Sweeping via Transfer
- Tanaka and Yamamura, 2003
- Wilson et al., 2007
Implement with other model-learning methods
- Dyna-Q
- R-Max
- Fitted Q Iteration
Guard against U-shaped curve in Fitted R-Max?
- Can TIMBREL improve performance on real-world problems?
Significantly increases speed of learning
Results suggest less data needed to learn than
Transfer performance depends on:
- Source task and target task similarity
- Amount of source task data collected
Model-Free:
- Value Function [Taylor, Liu, & Stone JMLR-07]
- Policy [Taylor, Whiteson, & Stone AAMAS-07]
- Rules [Taylor & Stone, ICML-07]
Full Model?