Center for Artificial Intelligence and Cybersecurity – AIRI

Comparison of Entropy and Dictionary Based Text Compression in English, German, French, Italian, Czech, Hungarian, Finnish, and Croatian

01.07.2020

The rapid growth in the amount of data in the digital world creates a need for data compression, that is, for reducing the number of bits needed to represent a text file, an image, audio, or video content. Compressing data saves storage capacity and speeds up data transmission. In this paper, we focus on text compression and compare two algorithms, entropy-based arithmetic coding and the dictionary-based Lempel–Ziv–Welch (LZW) method, on texts in different languages (Croatian, Finnish, Hungarian, Czech, Italian, French, German, and English). The main goal is to answer the question: "How does the language of a text affect the compression ratio?" The results indicated that the compression ratio is affected by the size of the language's alphabet and by the size or type of the text. For example, The European Green Deal was compressed by 75.79%, 76.17%, 77.33%, 76.84%, 73.25%, 74.63%, 75.14%, and 74.51% using the LZW algorithm, and by 72.54%, 71.47%, 72.87%, 73.43%, 69.62%, 69.94%, 72.42%, and 72% using the arithmetic algorithm for the English, German, French, Italian, Czech, Hungarian, Finnish, and Croatian versions, respectively.
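To make the two families of methods concrete, the following is a minimal Python sketch, not the authors' implementation. It shows a textbook byte-level LZW encoder and the order-0 Shannon entropy of a text, which is the bits-per-symbol figure that an arithmetic coder with a static order-0 model approaches. The function names, the unbounded dictionary, the fixed code width, and the definition of "compressed by X%" as 1 minus compressed size over original size are all assumptions made for illustration; the paper may use different settings.

```python
import math
from collections import Counter


def lzw_compress(data: bytes) -> list[int]:
    """Encode bytes as a list of LZW dictionary codes (textbook variant)."""
    # Start with one dictionary entry per possible byte value.
    dictionary = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in dictionary:
            # Keep extending the phrase while it is still known.
            current = candidate
        else:
            # Emit the longest known phrase and learn the new one.
            codes.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = bytes([value])
    if current:
        codes.append(dictionary[current])
    return codes


def order0_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte under a static order-0 model."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def lzw_space_saving(data: bytes) -> float:
    """Percentage saved by LZW, assuming every code is stored at one fixed width.

    Assumption: the width is the smallest number of bits (at least 9) that
    fits the largest emitted code; real LZW tools often grow the width over time.
    """
    if not data:
        return 0.0
    codes = lzw_compress(data)
    width = max(9, max(codes).bit_length())
    return 100.0 * (1.0 - len(codes) * width / (8 * len(data)))
```

Calling `lzw_space_saving(text.encode("utf-8"))` and `order0_entropy(text.encode("utf-8"))` on the same document in several languages gives a rough feel for the comparison described above. Working on UTF-8 bytes is itself a design choice: letters with diacritics occupy two bytes, so the effective alphabet and the input size both differ between languages.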

Authors:
Matea Ignatoski, Jonatan Lerga, Ljubiša Stanković, Miloš Daković
Journal:
Mathematics
Publishing date:
01.07.2020
