International Journal of Science and Research (IJSR)

Call for Papers | Fully Refereed | Open Access | Double Blind Peer Reviewed

ISSN: 2319-7064



Research Paper | Computer Science & Engineering | Kuwait | Volume 5 Issue 11, November 2016


Improving Compression Methods for Arabic Text Using Dedicated Character Mapping

Hedaya Ghanim Alshammar | Dina Hamad Alghurair


Abstract: Natural language text compression has been discussed thoroughly in the literature in recent years, and many methodologies have been introduced and implemented; most, however, focus on English and other European languages. Comparatively few studies have addressed the Arabic language. Of those, some use statistical approaches, others use dictionary-based compression techniques, and some exploit features of the Arabic language and its derivation rules in an attempt to increase the compression ratio. In this paper, we introduce several statistical methods for natural language compression and apply them to Arabic text. We provide an implementation of each method and compare them in terms of performance, compression ratio, the resources required to run the algorithms, and areas of application and usage. The Golomb, Elias gamma, and Huffman codes are implemented and compared as sample statistical algorithms. We also introduce a dedicated Arabic character mapping technique for use with the Elias, Golomb, and Huffman algorithms. The results show that it yields a major improvement in compression ratio over the original methods, which operate on the binary data and ignore the underlying language, and that it can outperform even LZW on small Arabic sample files. Two data sets are tested: the first consists of random Arabic text, and the second of real text from complete Arabic stories and books.
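The abstract does not spell out how the dedicated character mapping is built, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a hypothetical mapping that ranks characters by frequency in the input (rank 1 for the most frequent) and then writes each rank with the standard Elias gamma code, so that common Arabic letters receive the shortest codewords. The function names `build_mapping`, `elias_gamma`, `encode`, and `decode` are invented for this example.

```python
from collections import Counter


def build_mapping(text):
    """Hypothetical mapping: rank 1 for the most frequent character, 2 for the next, etc."""
    counts = Counter(text)
    # Sort by descending frequency, then by code point so the order is deterministic.
    ordered = sorted(counts, key=lambda ch: (-counts[ch], ch))
    return {ch: rank for rank, ch in enumerate(ordered, start=1)}


def elias_gamma(n):
    """Elias gamma code of a positive integer, returned as a string of '0'/'1'."""
    if n < 1:
        raise ValueError("Elias gamma is defined only for positive integers")
    binary = bin(n)[2:]  # binary digits of n, no leading zeros
    # Prefix with (length - 1) zeros so the decoder knows how many bits to read.
    return "0" * (len(binary) - 1) + binary


def encode(text, mapping):
    """Concatenate the gamma codes of the mapped ranks."""
    return "".join(elias_gamma(mapping[ch]) for ch in text)


def decode(bits, mapping):
    """Invert encode(): count leading zeros, then read that many + 1 bits."""
    inverse = {rank: ch for ch, rank in mapping.items()}
    out = []
    i = 0
    while i < len(bits):
        zeros = 0
        while bits[i] == "0":
            zeros += 1
            i += 1
        value = int(bits[i:i + zeros + 1], 2)
        i += zeros + 1
        out.append(inverse[value])
    return "".join(out)


sample = "السلام عليكم"
mapping = build_mapping(sample)
bits = encode(sample, mapping)
print(len(bits), "bits versus", 8 * len(sample.encode("utf-8")), "bits in UTF-8")
```

In this sketch the mapping itself would also have to be stored or agreed on in advance for the decoder to work, and that cost is not counted in the printed comparison. A fixed, language-wide mapping (rather than one derived from each file) would avoid transmitting it, which may matter most for the small files the abstract mentions.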


Keywords: Arabic Text Compression, Golomb Code, Elias Code, Huffman Code, Improved Arabic Character Mapping


Edition: Volume 5 Issue 11, November 2016


Pages: 1379 - 1387

