Zin Mar Win, Nyein Aye
Abstract: Today, the internet is one of the most powerful tools in the world. However, the explosive growth of unsolicited email has prompted the development of numerous spam filtering techniques. Spam needlessly burdens the entire email system, and spammers keep creating new ways to defeat anti-spam technology. By the end of 2006, the nature of spam had shifted markedly, and its newest form is image-based spam. In general terms, image spam is a type of email in which the text message is presented as a picture in an image file. This prevents text-based spam filters from detecting and blocking such messages. Several anti-spam techniques are available (DNSBL, greylisting, spamtraps, etc.). Each has its own advantages and disadvantages, and because of their respective weaknesses their relative merits remain contested. This paper presents a general study of image spam detection using the histogram and the Hough transform, which are explained in the following sections. The proposed methods are tested on a spam archive dataset and are found to be effective in identifying spam images that contain (1) only images or (2) both text and images. The goal is to automatically classify an image directly as spam or ham. The proposed method is able to identify a large number of malicious images while being computationally inexpensive.
Keywords: histogram, Hough transform, anti-spam technology, image spam detection, spam archive dataset