Informatics Report Series


Report   

EDI-INF-RR-1217


Related Pages

Report (by Number) Index
Report (by Date) Index
Author Index
Institute Index

Home
Title:Sorting Hierarchical Data in External Memory
Authors: Ioannis Koltsidas ; Heiko Mueller ; Stratis Viglas
Date: 2007
Abstract:
Sorting hierarchical data in external memory is needed in a wide variety of applications including archiving scientific data and dealing with large XML datasets. The topic of sorting hierarchical data has received little attention form the research community so far. In this paper, we focus on sorting arbitrary hierarchical datasets that exceed the size of physical memory. We propose HErMeS, an algorithm that generalizes the most widely-used techniques for sorting flat data in external memory, namely multiway merge-sort and replacement selection. HErMeS efficiently takes into consideration the hierarchical nature of the data in order to minimize the number of disk accesses and optimize the usage of available memory. The algorithm's theoretical bounds with respect to the structure of the hierarchical dataset are extracted, while an experimental study demonstrates that our implementation of the algorithm meets its theoretical expectations. Using several workloads, we compare our algorithm to existing approaches and show that it outperforms them by a significant factor. These results, we believe, prove our technique to be a viable and scalable solution.
Links To Paper
1st Link
Bibtex format
@Misc{EDI-INF-RR-1217,
author = { Ioannis Koltsidas and Heiko Mueller and Stratis Viglas },
title = {Sorting Hierarchical Data in External Memory},
year = 2007,
url = {http://homepages.inf.ed.ac.uk/s0679010/hermes-TR.pdf},
}


Home : Publications : Report 

Please mail <reports@inf.ed.ac.uk> with any changes or corrections.
Unless explicitly stated otherwise, all material is copyright The University of Edinburgh