Optimization of workflows for data analytics
Abstract
Automated big data analysis attracts considerable attention because of the growing need for end-to-end processing of such data. Modern data analysis workflows, or simply data flows, are adopted to process and analyze large volumes of data. However, data flows are becoming increasingly complex and operate in highly dynamic, parallel, distributed, and heterogeneous environments. This thesis deals with cost-based data flow optimization and proposes task ordering techniques that aim to minimize the total execution cost of the data flow tasks. Additionally, a set of engine selection techniques is proposed for allocating tasks to specific heterogeneous engines, with the aim of minimizing the flow execution cost. More specifically, the contributions of the thesis are summarized as follows: (i) a thorough survey of the data flow optimization research area is presented, (ii) effective, accurate algorithms are presented for finding the optimal order of tasks tha ...
All items in the National Archive of PhD Theses are protected by copyright.






