Identifying malicious web domains using machine learning techniques with online credibility and performance data

Hu, Zhongyi; Chiong, Raymond; Pranata, Ilung; Susilo, Willy; Bao, Yukun

doi:10.1109/CEC.2016.7748347

Title

Identifying malicious web domains using machine learning techniques with online credibility and performance data

Publication Date

2016

Author(s)

Hu, Zhongyi

Chiong, Raymond

(author)
ORCID: https://orcid.org/0000-0002-8285-1903
Email: rchiong@une.edu.au
UNE Id: une-id:rchiong

Pranata, Ilung

Susilo, Willy

Bao, Yukun

Type of document

Conference Publication

Language

en

Entity Type

Publication

Publisher

IEEE

Place of publication

United States of America

DOI

10.1109/CEC.2016.7748347

UNE publication id

une:1959.11/61466

Abstract

Malicious web domains pose a serious threat to web users' privacy and security. Given the large amount of freely available data on the Internet about web domains' popularity and performance, this study investigated how well-known machine learning techniques perform when used with this type of online data to identify malicious web domains. Two datasets consisting of malware and phishing domains were collected to build and evaluate the machine learning classifiers. Five single classifiers and four ensemble classifiers were applied to distinguish malicious domains from benign ones. In addition, a binary particle swarm optimisation (BPSO) based feature selection method was used to improve the performance of the single classifiers. Experimental results show that, based on features drawn from the web domains' popularity and performance data, the examined machine learning techniques can accurately identify malicious domains in different ways. Furthermore, the BPSO-based feature selection procedure is shown to be an effective way to improve the performance of classifiers.
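
The abstract does not describe the BPSO variant, the fitness function, or the classifiers used, so the following is only a minimal sketch of how BPSO-based feature selection can work in general, not the authors' implementation. It assumes the common sigmoid-transfer form of binary PSO, uses a nearest-centroid classifier as a stand-in for the single classifiers, and scores each feature subset by training accuracy minus a small penalty on subset size. The function names (`fitness`, `bpso_select`, `make_toy_data`), all parameter values, and the synthetic data are hypothetical and chosen for illustration.

```python
import math
import random


def fitness(X, y, mask, penalty=0.01):
    """Score a feature subset: nearest-centroid training accuracy
    minus a small penalty on subset size (both are assumptions)."""
    idx = [j for j, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # an empty subset cannot classify anything
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in idx]

    def predict(x):
        # Pick the class whose centroid is closest on the selected features.
        return min(
            centroids,
            key=lambda c: sum((x[j] - m) ** 2 for j, m in zip(idx, centroids[c])),
        )

    accuracy = sum(predict(x) == t for x, t in zip(X, y)) / len(y)
    return accuracy - penalty * len(idx) / len(mask)


def bpso_select(X, y, n_particles=10, n_iters=20, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO: each particle is a 0/1 mask over the features."""
    rng = random.Random(seed)
    d = len(X[0])
    pos = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_particles)]
    vel = [[0.0] * d for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(X, y, p) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(d):
                v = (
                    w * vel[i][j]
                    + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                    + c2 * rng.random() * (gbest[j] - pos[i][j])
                )
                vel[i][j] = max(-6.0, min(6.0, v))  # clamp to keep sigmoid stable
                # Sigmoid of the velocity gives the probability that the bit is 1.
                prob_one = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob_one else 0
            fit = fitness(X, y, pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
            if fit > gbest_fit:
                gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


def make_toy_data(n=60, d=6, seed=1):
    """Synthetic two-class data: features 0 and 1 are informative, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(3.0 * label if j < 2 else 0.0, 1.0) for j in range(d)])
        y.append(label)
    return X, y


X, y = make_toy_data()
mask, score = bpso_select(X, y)
print("selected feature mask:", mask)
print("fitness of selected subset:", round(score, 3))
```

In practice the fitness would more plausibly be a cross-validated score of the actual classifier on real domain features, since training accuracy on synthetic data, as used here, says nothing about the results reported in the paper.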

Citation

Proceedings of the 2016 IEEE Congress on Evolutionary Computation, pp. 5186-5194

ISBN

9781509006236

9781509006229

Start page

5186

End page

5194
