SenseInput: An Image-Based Sensitive Input Detection Scheme for Phishing Website Detection

Lin, Shih Chun; Wu, Pang Cheng; Chen, Hong Yen; Morikawa, Tomohiro; Takahashi, Takeshi; TSUNG-NAN LIN

doi:10.1109/ICC45855.2022.9838653

Journal

IEEE International Conference on Communications

Journal Volume

2022-May

ISBN

9781538683477

Date Issued

2022-01-01

Author(s)

Lin, Shih Chun

Wu, Pang Cheng

Chen, Hong Yen

Morikawa, Tomohiro

Takahashi, Takeshi

TSUNG-NAN LIN

DOI

10.1109/ICC45855.2022.9838653

URI

https://scholars.lib.ntu.edu.tw/handle/123456789/632633

URL

https://api.elsevier.com/content/abstract/scopus_id/85137261494

Abstract

Phishing has persistently threatened the World Wide Web as phishing websites have evolved over the years. Many previous works were devoted to extracting useful features and focused on the essential components of phishing websites. One such component is the sensitive input, a field that requests sensitive information. Yet, because web designs vary widely, detecting the presence of sensitive inputs is not trivial. Some previous works provided rule-based approaches that use HTML code to detect login forms, which contain sensitive inputs. However, newer phishing websites modify their HTML code to evade the detection rules, which makes detection less accurate. To overcome this limitation, we proposed SenseInput, which uses hybrid deep learning models to detect the presence of sensitive inputs and sensitive information, because phishing websites must eventually present sensitive inputs in their visual content. SenseInput achieved a 96.94% F1-score for sensitive input detection on our dataset and a 96.73% F1-score on a public dataset, Phishpedia Phish30K. Next, we used 22 features, including the seven proposed statistical features and two sensitive input features, for phishing detection. The experiments show that our approach achieves F1-scores of 98.48% and 95.87% on our validation dataset and the Phishpedia dataset, respectively, outperforming previous approaches. Finally, we investigated the influence of the sensitive input features. The results show that our sensitive input features are more effective than the rule-based login form approach. The experiments also indicate that the proposed sensitive input features can reduce the impact of bias between different datasets.

Subjects

computer vision | machine learning | object detection | phishing detection

Type

conference paper
