Author(s) |
Abed-Alguni, Bilal
Paul, David
Chalup, Stephan
Henskens, Frans
|
Publication Date |
2016
|
Abstract |
Cooperative reinforcement learning algorithms such as BEST-Q, AVE-Q, PSO-Q, and WSS use Q-value sharing strategies between reinforcement learners to accelerate the learning process. This paper presents a comparison study of the performance of these cooperative algorithms, as well as of an algorithm that aggregates their results. In addition, this paper studies the effects of the frequency of Q-value sharing on the learning speed of independent learners that share their Q-values with one another. The algorithms are compared using the taxi problem (a multi-task problem) and different instances of the shortest path problem (a single-task problem). The experimental results when learners have equal levels of experience suggest that sharing of Q-values is not beneficial and produces results similar to single-agent Q-learning. However, the experimental results when learners have different levels of experience suggest that most of the cooperative Q-learning algorithms perform similarly to each other, but better than single-agent Q-learning, especially when Q-value sharing is highly frequent. This paper then places Q-value sharing in the context of modern reinforcement learning techniques and suggests some future directions for research.
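The sharing strategies named above can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes tabular Q-learning, that BEST-Q-style sharing takes the elementwise best (maximum) Q-value across learners, and that AVE-Q-style sharing takes the elementwise average; the algorithms as defined in the paper may differ in detail.

```python
import numpy as np

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Standard single-agent tabular Q-learning update for one transition."""
    Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])

def best_q_share(q_tables):
    """Assumed BEST-Q-style sharing: elementwise maximum over learners' Q-tables."""
    return np.maximum.reduce(q_tables)

def ave_q_share(q_tables):
    """Assumed AVE-Q-style sharing: elementwise mean over learners' Q-tables."""
    return np.mean(np.stack(q_tables), axis=0)
```

In a cooperative setting, each independent learner would run `q_update` on its own experience and periodically replace its table with the output of a sharing function; the sharing frequency is the variable the paper studies.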
|
Citation |
International Journal of Artificial Intelligence, 14(1), pp. 71-93
|
ISSN |
0974-0635
|
Publisher |
Centre for Environment, Social and Economic Research Publications
|
Title |
A Comparison Study of Cooperative Q-learning Algorithms for Independent Learners
|
Type of document |
Journal Article
|
Entity Type |
Publication
|