Title | A Greedy Approach with New Cost Model for Intermediate Datasets Storage Problem in General Workflows |
---|---|
Parallel title | A Greedy Approach with New Cost Model for Intermediate Datasets Storage Problem in General Workflows |
Authors | Zimao Li, Yingying Wang |
Abstract | Running a scientific workflow on the cloud generates a large volume of intermediate datasets, many of which contain valuable information that can be used for further study, but storing them all is prohibitively expensive given the enormous data size. A feasible solution is to store some of the intermediate datasets and re-compute the others when needed. The intermediate dataset storage problem asks for a tradeoff that minimizes the total cost of storing or re-generating each intermediate dataset. This paper focuses on a new cost model for the problem in general workflows, which additionally incorporates delay tolerance, usage frequency, and transfer cost to make the model more general. Based on a directed acyclic graph describing the dependence relationships between datasets, a greedy approach to the problem is proposed and implemented. Experimental results demonstrate the effectiveness and efficiency of our algorithm. |
Pages | 166-174 |
Keywords | delay tolerance, greedy algorithm, intermediate datasets storage, transfer cost, usage rate |
Journal | 電腦學刊 (Journal of Computers) |
Issue | 2018-02 (Vol. 29, No. 1) |
Previous article in this journal | On the Node Searching Spanning Tree Problem |
Next article in this journal | Locality Preserving Semi-Supervised Canonical Correlation Analysis for Localization in Wireless Sensor Network |