A Comparison Study of Cooperative Q-learning Algorithms for Independent Learners

Title
A Comparison Study of Cooperative Q-learning Algorithms for Independent Learners
Publication Date
2016
Author(s)
Abed-Alguni, Bilal
Paul, David
( author )
OrcID: https://orcid.org/0000-0002-2428-5667
Email: dpaul4@une.edu.au
UNE Id: une-id:dpaul4
Chalup, Stephan
Henskens, Frans
Type of document
Journal Article
Language
en
Entity Type
Publication
Publisher
Centre for Environment, Social and Economic Research Publications
Place of publication
India
UNE publication id
une:18867
Abstract
Cooperative reinforcement learning algorithms such as BEST-Q, AVE-Q, PSO-Q, and WSS use Q-value sharing strategies between reinforcement learners to accelerate the learning process. This paper presents a comparison study of the performance of these cooperative algorithms as well as an algorithm that aggregates their results. In addition, this paper studies the effects of the frequency of Q-value sharing on the learning speed of independent learners that share their Q-values with one another. The algorithms are compared using the taxi problem (a multi-task problem) and different instances of the shortest path problem (a single-task problem). When learners have equal levels of experience, the experimental results suggest that sharing Q-values is not beneficial and produces results similar to single-agent Q-learning. However, when learners have different levels of experience, the results suggest that most of the cooperative Q-learning algorithms perform similarly to one another, but better than single-agent Q-learning, especially when Q-values are shared frequently. This paper then places Q-value sharing in the context of modern reinforcement learning techniques and suggests some future directions for research.
Citation
International Journal of Artificial Intelligence, 14(1), pp. 71-93
ISSN
0974-0635
Start page
71
End page
93
