Cooperative reinforcement learning algorithms such as BEST-Q, AVE-Q, PSO-Q, and WSS use Q-value sharing strategies between reinforcement learners to accelerate the learning process. This paper presents a comparative study of the performance of these cooperative algorithms, as well as of an algorithm that aggregates their results. In addition, it studies the effect of the frequency of Q-value sharing on the learning speed of independent learners that share their Q-values with one another. The algorithms are compared using the taxi problem (a multi-task problem) and different instances of the shortest path problem (a single-task problem). The experimental results suggest that when learners have equal levels of experience, sharing Q-values is not beneficial and produces results similar to single-agent Q-learning. When learners have different levels of experience, however, most of the cooperative Q-learning algorithms perform similarly to one another but better than single-agent Q-learning, especially when Q-value sharing is highly frequent. The paper then places Q-value sharing in the context of modern reinforcement learning techniques and suggests directions for future research.
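For concreteness, the sketch below illustrates the general idea of periodic Q-value sharing among independent tabular Q-learners, using simple averaging of Q-tables in the spirit of AVE-Q. The chain environment, agent count, and all hyperparameters are illustrative assumptions, not the paper's experimental setup (the paper uses the taxi and shortest path problems).

```python
# Minimal sketch: two independent Q-learners that periodically average their
# Q-tables (AVE-Q-style sharing). All details here are illustrative assumptions.
import numpy as np

N_STATES, N_ACTIONS = 6, 2        # tiny chain MDP: move left (0) or right (1)
GOAL = N_STATES - 1               # reaching the last state yields reward 1
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1
SHARE_EVERY = 10                  # sharing frequency: episodes between exchanges

rng = np.random.default_rng(0)

def step(state, action):
    """Deterministic chain: action 1 moves right, action 0 moves left."""
    nxt = min(state + 1, GOAL) if action == 1 else max(state - 1, 0)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

def run_episode(Q):
    """One episode of epsilon-greedy tabular Q-learning on one agent's table."""
    state, done = 0, False
    while not done:
        if rng.random() < EPSILON:
            action = int(rng.integers(N_ACTIONS))
        else:
            action = int(np.argmax(Q[state]))
        nxt, reward, done = step(state, action)
        target = reward + (0.0 if done else GAMMA * np.max(Q[nxt]))
        Q[state, action] += ALPHA * (target - Q[state, action])
        state = nxt

# Two independent learners; every SHARE_EVERY episodes they exchange Q-values.
tables = [np.zeros((N_STATES, N_ACTIONS)) for _ in range(2)]
for episode in range(1, 201):
    for Q in tables:
        run_episode(Q)
    if episode % SHARE_EVERY == 0:          # cooperative step: share Q-values
        shared = np.mean(tables, axis=0)    # aggregate by averaging (AVE-Q style)
        for Q in tables:
            Q[:] = shared

print(np.round(tables[0], 3))  # both tables agree after the final exchange
```

Other sharing strategies fit the same loop by replacing the averaging line: for example, BEST-Q-style sharing would select, for each state-action pair, the Q-value from the most reliable table rather than the mean. Raising or lowering SHARE_EVERY corresponds to the sharing-frequency effect studied in the paper.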