R&D: Enhancing Performance of E-Government Information Systems with SSD-based Hadoop MapReduce
Study proposes data address-based shuffle mechanism optimized for Hadoop clusters equipped with SSDs, aiming to enhance data processing performance in e-government applications
This is a Press Release edited by StorageNewsletter.com on November 10, 2025 at 2:00 pm

Scientific Reports has published an article written by Fredrick Ishengoma, Department of Information Systems and Technology, College of Informatics and Virtual Education, The University of Dodoma, Dodoma, Tanzania.
Abstract: “E-government applications generate and process large volumes of heterogeneous data that demand high-throughput and low-latency computation. Although Hadoop MapReduce is commonly used for such tasks, its performance is often limited by disk I/O constraints and network delays during the shuffle phase. This study proposes a data address-based shuffle mechanism optimized for Hadoop clusters equipped with Solid-State Drives (SSDs), aiming to enhance data processing performance in e-government applications. The mechanism introduces three key components: address-based sorting, address-based merging, and pre-transmission of intermediate data, which collectively reduce disk I/O and network transfer overhead. Experimental evaluations using Terasort and Wordcount benchmarks demonstrate execution time reductions of 8% and 1%, respectively, with statistical significance confirmed through 95% confidence intervals. Scalability assessments on a simulated 50-node cluster and energy profiling further validate the approach, showing improved performance, reduced network congestion, and a 31% decrease in energy consumption compared to HDD-based systems. The findings establish the proposed mechanism as a cost-effective and efficient solution for large-scale data processing in public sector computing environments.”
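The abstract names address-based sorting and merging but does not describe their internals. As a rough illustration only, one plausible reading is that the shuffle sorts and merges small (key, offset, length) index entries and fetches each value by its address only when needed, which suits the cheap random reads of an SSD. The Python sketch below follows that assumption; the function names `write_records`, `address_sort`, and `address_merge` are hypothetical and are not taken from the paper or from Hadoop.

```python
import heapq

def write_records(records):
    """Append values to one buffer and record each value's address.

    Returns (buffer, index), where index holds (key, offset, length) entries.
    Hypothetical helper: stands in for a map task spilling its output.
    """
    buf = bytearray()
    index = []
    for key, value in records:
        index.append((key, len(buf), len(value)))
        buf += value
    return bytes(buf), index

def address_sort(index):
    """Sort only the small index entries by key; values stay where they are."""
    return sorted(index, key=lambda entry: entry[0])

def address_merge(runs):
    """Merge several (buffer, sorted_index) runs in key order.

    Only index entries are compared; each value is read by address
    when it is emitted (assumed to be a cheap random read on an SSD).
    """
    tagged = (
        [(key, run_id, offset, length) for key, offset, length in index]
        for run_id, (_, index) in enumerate(runs)
    )
    for key, run_id, offset, length in heapq.merge(*tagged):
        buf = runs[run_id][0]
        yield key, buf[offset:offset + length]

# Two map-side runs, each sorted by address index rather than by moving values.
buf1, idx1 = write_records([(b"b", b"2"), (b"a", b"1")])
buf2, idx2 = write_records([(b"c", b"3"), (b"a", b"0")])
merged = list(address_merge([(buf1, address_sort(idx1)),
                             (buf2, address_sort(idx2))]))
```

In this sketch the values are never copied during sorting; only the index entries move. Whether the published mechanism works this way is an assumption, and the third component, pre-transmission of intermediate data, is not modeled here.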










