DW Handbook

This page shows the name of the construct and its description.

Attribute Value
Name Aggregate
Description None
Function None
Aim Allow a data engineer (user) to create a summarised form of specific numerical attributes of the dataset, using a set of statistical functions (sum, count, mean, min, max, median or a combination of these) and grouped by specific attributes.
Context This operation creates a new dataset by aggregating the values of a specified attribute with a statistical function across the distinct values of another specified attribute. The result serves as input to the target data analysis.
Rationale Creating a summarised form of the dataset can increase the value obtained from the analysis the dataset is being prepared for.
Mechanism Aggregate a dataset by creating a statistical summary of it. This can be done using the facilities provided by GUI-based tools or programming-language functions.
Formalism <indices of grouping attribute(s)> (R) = {(<indices of grouping attribute(s)>, func_aj) | <indices of grouping attribute(s)> and j are index values of attributes ∈ R}, where: R is a relation with n columns; func is an aggregation function (such as sum, count, average, min, max); func_aj represents the result of func applied to the column aj (Elmasri and Navathe, 2015).
Relational Algebra (RA) Similar to the RA operation Aggregation (γ)
Type Atomic
Class Unary
Transformation_category N:1
Inputs Input dataset, index of grouping attribute, function to aggregate by applied to index of aggregated attribute
Number of input datasets 1
Outputs Aggregated dataset
Number of output datasets 1
Used in stage(s) Structuring2
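
The construct described in the table can be illustrated with a minimal sketch in Python, using only the standard library. It is not taken from the handbook: the function name aggregate, the FUNCS lookup table and the sample sales data are all hypothetical and chosen for illustration. The sketch follows the listed inputs (an input dataset, the index or indices of the grouping attribute(s), and an aggregation function applied to the index of the aggregated attribute) and returns one aggregated dataset.

```python
from collections import defaultdict
from statistics import mean, median

# Aggregation functions named in the Aim row; this lookup table is illustrative.
FUNCS = {
    "sum": sum,
    "count": len,
    "mean": mean,
    "min": min,
    "max": max,
    "median": median,
}


def aggregate(rows, group_idx, agg_idx, func):
    """Group rows by the attributes at group_idx and apply func to column agg_idx.

    rows      -- the input dataset, a list of equal-length tuples (relation R)
    group_idx -- list of indices of the grouping attribute(s)
    agg_idx   -- index j of the attribute to aggregate
    func      -- name of an aggregation function, a key of FUNCS
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[i] for i in group_idx)
        groups[key].append(row[agg_idx])
    # One output tuple per distinct combination of grouping values (N:1).
    return [key + (FUNCS[func](values),) for key, values in groups.items()]


# Made-up sample data: (region, product, units_sold)
sales = [
    ("North", "A", 10),
    ("North", "B", 5),
    ("South", "A", 7),
    ("South", "A", 3),
]

totals = aggregate(sales, group_idx=[0], agg_idx=2, func="sum")
print(totals)  # [('North', 15), ('South', 10)]
```

Each output tuple holds the grouping values followed by the aggregated value, which mirrors the (<indices of grouping attribute(s)>, func_aj) pairs in the Formalism row. Passing several indices in group_idx groups by a combination of attributes. In practice a GUI-based tool or a dataframe library's group-by facility would usually take the place of this hand-written loop.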
