...
application of the RL methodology to the addressed problem
agent collaboration:
- no
- proactive
- reactive
reward function: is it immediate?
environment: model-free or model-based? (i.e. whether the agent learns the transition matrix)
exploration vs exploitation: policy
convergence studies
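
sketch for the model-free and exploration vs exploitation points: a minimal tabular example, assuming an epsilon-greedy policy and a Q-learning update. These notes do not name a specific algorithm, so both choices are assumptions; the names `epsilon_greedy`, `q_update`, `alpha`, `gamma` are hypothetical and only for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Choose an action index from a list of action values.

    With probability epsilon pick a uniformly random action (exploration),
    otherwise pick the action with the highest value (exploitation).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One model-free Q-learning step on a table q[state][action].

    The agent never estimates the transition matrix: it only uses the
    observed (state, action, reward, next_state) sample.
    alpha (learning rate) and gamma (discount) are illustrative defaults.
    """
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


# Usage: two states, two actions, all values hypothetical.
q_table = [[0.0, 0.0], [1.0, 2.0]]
chosen = epsilon_greedy(q_table[1], epsilon=0.0)  # epsilon 0 means purely greedy
q_update(q_table, state=0, action=1, reward=1.0, next_state=1)
```

a model-based variant would additionally keep an estimate of the transition matrix and plan with it; that part is not sketched here.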