... application of the RL methodology to the addressed problem
agent collaboration:
- none
- proactive
- reactive
reward function: is it immediate?
environment: model-free or model-based? (model-based if the agent learns the transition matrix)
exploration vs. exploitation: policy convergence studies
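To make the distinctions in these notes concrete, here is a minimal tabular sketch. It is an illustration under assumptions, not a method taken from any particular paper: the function names, the uniform fallback for unvisited pairs, and the values of `alpha`, `gamma`, and `epsilon` are all hypothetical. It shows a count-based estimate of the transition matrix (the model-based case), an epsilon-greedy rule for the exploration vs. exploitation trade-off, and a Q-learning update that uses only the immediate reward (the model-free case).

```python
import random


def estimate_transition_matrix(transitions, n_states, n_actions):
    """Model-based side: estimate P[s][a][s'] by counting observed (s, a, s') triples."""
    counts = [[[0] * n_states for _ in range(n_actions)] for _ in range(n_states)]
    for s, a, s_next in transitions:
        counts[s][a][s_next] += 1
    probs = []
    for s in range(n_states):
        row = []
        for a in range(n_actions):
            total = sum(counts[s][a])
            # Assumption: unvisited (s, a) pairs fall back to a uniform guess.
            row.append([c / total if total else 1.0 / n_states for c in counts[s][a]])
        probs.append(row)
    return probs


def epsilon_greedy(q_row, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the best-known action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])


def q_update(q, s, a, reward, s_next, alpha=0.5, gamma=0.9):
    """Model-free side: update Q in place from one sampled step and its immediate reward."""
    target = reward + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


if __name__ == "__main__":
    observed = [(0, 0, 1), (0, 0, 1), (0, 0, 0), (1, 1, 0)]
    print(estimate_transition_matrix(observed, n_states=2, n_actions=2)[0][0])
    q_table = [[0.0, 0.0], [0.0, 1.0]]
    action = epsilon_greedy(q_table[1], epsilon=0.1, rng=random.Random(0))
    q_update(q_table, s=0, a=1, reward=1.0, s_next=1)
    print(action, q_table[0][1])
```

A policy convergence study would typically track how the greedy policy derived from `q` changes across episodes as `epsilon` is decayed. That loop is left out here because it depends on the environment being analysed.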