PHD Discussions

Ask, Learn and Accelerate in your PhD Research

Where to Find Real Phishing & Malware Data for Your Security Project

 I'm building a machine learning model to detect phishing sites or malware. Where can I find large, reliable datasets of actual malicious URLs or files to train and test my model?

All Answers (1 Answer in All)

By Shreya K Answered 2 months ago

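 For phishing URLs, check out PhishTank (a community-verified list) and OpenPhish (a live feed). Academics often use the URL datasets published by the University of New Brunswick's Canadian Institute for Cybersecurity. For malware samples, VirusShare is a massive repository, and Contagio Malware Dump has good curated samples. A word of caution: always handle malware in a secure, isolated lab environment, such as a sandboxed VM with no access to your host machine or network. The phishing URL feeds typically come as CSV, JSON, or plain text, which makes them easy to load for research. The malware repositories distribute the sample files themselves, usually as archives, so plan on a feature-extraction step before training.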
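
 As a rough illustration of how a downloaded phishing feed might be turned into training rows, here is a minimal Python sketch. It assumes a CSV export with `url` and `verified` columns (the column names, the inline `SAMPLE` data, and the helper names `load_verified_urls` and `lexical_features` are assumptions for illustration, so check the header of the file you actually download). It only reads URL strings and never fetches them.

```python
import csv
import io
from urllib.parse import urlsplit


def load_verified_urls(csv_text):
    """Return de-duplicated URLs from rows whose 'verified' field is 'yes'.

    Assumes the CSV has 'url' and 'verified' columns; adjust to the real header.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    urls = []
    for row in reader:
        url = (row.get("url") or "").strip()
        verified = (row.get("verified") or "").strip().lower()
        if not url or verified != "yes":
            continue
        if url not in seen:  # keep first occurrence, preserve feed order
            seen.add(url)
            urls.append(url)
    return urls


def lexical_features(url):
    """Compute a few simple string-based features of a URL (baseline only)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "length": len(url),
        "num_dots": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at": "@" in url,
        "uses_https": parts.scheme == "https",
    }


# Hypothetical stand-in for a downloaded feed file (reserved .test domains).
SAMPLE = """phish_id,url,verified
1,http://login.example-bank.test/verify?id=123,yes
2,http://login.example-bank.test/verify?id=123,yes
3,https://unconfirmed.example.test/,no
"""

urls = load_verified_urls(SAMPLE)
features = [lexical_features(u) for u in urls]
print(urls)
print(features)
```

 To train a classifier you would also need a set of benign URLs processed the same way, with a label column added to tell the two classes apart.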
