10.6084/m9.figshare.1008319.v1
Claas-thido Pfaff
Claas-thido
Pfaff
Karin Nadrowski
Karin
Nadrowski
Sophia Ratcliffe
Sophia
Ratcliffe
Christian Wirth
Christian
Wirth
Helge Bruelheide
Helge
Bruelheide
Data used to quantify the complexity of the workflow on biodiversity-ecosystem functioning
f1000research.com
2014
workflow
data complexity
quality
biodiversity
ecosystem
carbon pools
reproducibility
data reuse
Ecology
2014-05-15 11:33:09
Dataset
https://f1000.figshare.com/articles/dataset/Data_used_to_quantify_the_complexity_of_the_workflow_on_biodiversity_ecosystem_functioning/1008319
<p>The dataset represents the data directly derived from the workflow. Each row represents one port in the workflow. As most actors have multiple ports, multiple rows represent one actor. The rows belonging to one actor are further set apart, as they are separated by rows of NA. The dataset contains information about the purpose of the actor, a description of that purpose, the position in the workflow, the lines of R code in the actor, the count of R functions used, and which additional R packages have been used. Furthermore, it contains the port name, whether the port has been used, whether the port is an input or output port, and the overall count of input and output ports of an actor. The dataset also contains information about the variable each port handles (header from the original dataset) and which dataset the data come from. Information about the structure of the input for each port is given, as well as the length of the input and a lifecycle of the variable (how often it has been used).</p>
<p>Summary dataset:</p>
<p>The dataset represents an aggregate by actor, derived from the full dataset, in which each line represents a port. In this dataset each line represents an actor. It contains information such as the ratio of output to input ports of an actor, a count of input and output ports, the actor purpose and the R functions used in the actor. It also sums up the datasets, identified by their ids, that an actor deals with (domain_ids). It provides information about the total lines of code and the percent contribution of each actor to the total lines of code. It also holds information about the R packages used, the total input and output port count, the actor position, a count of R packages used, a total count of datasets (domains), the total of R functions used and the values for complexity (absolute and relative).</p>