Improving Compression Methods for Arabic Text Using Dedicated Character Mapping
International Journal of Science and Research (IJSR)

ISSN: 2319-7064

Research Paper | Computer Science & Engineering | Kuwait | Volume 5 Issue 11, November 2016

Hedaya Ghanim Alshammar, Dina Hamad Alghurair

Natural-language text compression methods have been discussed thoroughly in the literature in recent years, and different methodologies have been introduced and implemented; most, however, focus on English and other European languages. Relatively few studies have addressed Arabic. Of these, some use statistical approaches, others use dictionary-based compression techniques, and some exploit features of the Arabic language and its derivation rules in an attempt to increase the compression ratio. In this paper, we introduce several statistical methods for natural-language compression and apply them to Arabic text. We also provide an implementation of each method and compare them in terms of performance, compression ratio, the resources required to run the algorithms, and areas of application and usage. The Golomb, Elias gamma, and Huffman methods are implemented and compared as sample statistical algorithms. We also introduce a dedicated Arabic character mapping technique for use with the Elias, Golomb, and Huffman algorithms. The results show that it yields a major improvement in compression ratio over the original methods, which operate on the binary data and ignore the underlying language, and that it can outperform even LZW on small Arabic sample files. Two data sets are tested: the first consists of random Arabic text, and the second of real text from complete Arabic stories and books.
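
To illustrate the general idea of pairing a character mapping with a universal code, the following is a minimal sketch and not the authors' implementation. It assumes a simple frequency-ranked mapping (the most frequent character receives the integer 1) followed by Elias gamma coding of those integers; the paper's actual mapping table is not given in this abstract, and the function names build_mapping, elias_gamma, encode and decode are hypothetical.

```python
from collections import Counter


def build_mapping(text):
    """Assumed mapping: rank characters by descending frequency, starting at 1.

    Ties are broken by code point so the mapping is deterministic.
    """
    counts = Counter(text)
    ranked = sorted(counts, key=lambda ch: (-counts[ch], ch))
    return {ch: rank for rank, ch in enumerate(ranked, start=1)}


def elias_gamma(n):
    """Elias gamma code of a positive integer, returned as a bit string."""
    if n < 1:
        raise ValueError("Elias gamma is defined for integers >= 1")
    binary = bin(n)[2:]                      # binary digits without the '0b' prefix
    return "0" * (len(binary) - 1) + binary  # unary length prefix, then the value


def encode(text, mapping):
    """Concatenate the gamma codes of each character's mapped integer."""
    return "".join(elias_gamma(mapping[ch]) for ch in text)


def decode(bits, mapping):
    """Invert encode(): read the zero prefix, then that many bits plus one."""
    inverse = {rank: ch for ch, rank in mapping.items()}
    out = []
    i = 0
    while i < len(bits):
        zeros = 0
        while bits[i] == "0":
            zeros += 1
            i += 1
        value = int(bits[i:i + zeros + 1], 2)
        i += zeros + 1
        out.append(inverse[value])
    return "".join(out)


if __name__ == "__main__":
    sample = "السلام عليكم"
    mapping = build_mapping(sample)
    bits = encode(sample, mapping)
    assert decode(bits, mapping) == sample
    print(len(bits), "bits vs", len(sample.encode("utf-8")) * 8, "bits in UTF-8")
```

Because frequent characters map to small integers, they receive short gamma codes, whereas coding the raw byte values would not exploit the small Arabic alphabet. A real codec would also need to store or agree on the mapping table, a cost this sketch ignores.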

Keywords: Arabic Text Compression, Golomb Code, Elias Code, Huffman Code, Improved Arabic Character Mapping

Pages: 1379 - 1387

How to Cite this Article?

Hedaya Ghanim Alshammar, Dina Hamad Alghurair, "Improving Compression Methods for Arabic Text Using Dedicated Character Mapping", International Journal of Science and Research (IJSR), https://www.ijsr.net/search_index_results_paperid.php?id=ART20163069, Volume 5 Issue 11, November 2016, 1379 - 1387
