Section 3.2. Aggregations

Another common ETL operation is aggregation.

Suppose, for example, that you have a series of contracts, possibly more than one per customer, and you would like to know how much each customer has paid to the company. You need to sum the contract amounts for each customer; sum is an aggregation method. If you have a series of temperature measurements and would like to know their average, average is another aggregation method. If you have people distributed across countries and would like to know the percentage, or even just the absolute number, of people living in each country, then percentage and count are aggregation methods as well. And so on.
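The four aggregation methods mentioned above (sum, average, count, and percentage) can be sketched in a few lines of plain Python. This is only a minimal illustration using the standard library; the sample data and the variable names (`contracts`, `temperatures`, `countries`) are invented for the example and are not taken from any real dataset.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical sample data, invented for illustration only.
contracts = [
    ("alice", 100.0),
    ("bob", 250.0),
    ("alice", 50.0),
]
temperatures = [20.0, 22.0, 24.0]
countries = ["IT", "DE", "IT", "FR"]

# Sum: total amount paid by each customer across all of their contracts.
total_per_customer = defaultdict(float)
for customer, amount in contracts:
    total_per_customer[customer] += amount

# Average: mean of all temperature measurements.
average_temperature = mean(temperatures)

# Count: absolute number of people per country.
count_per_country = Counter(countries)

# Percentage: share of people per country, out of all people.
percentage_per_country = {
    country: 100.0 * n / len(countries)
    for country, n in count_per_country.items()
}

print(dict(total_per_customer))
print(average_temperature)
print(dict(count_per_country))
print(percentage_per_country)
```

Note that sum and average reduce a numeric column to one value per group, while count and percentage only need to know which group each row belongs to.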

In an aggregation operation, aggregation methods can be applied to groups and subgroups of the data. Groups and subgroups are identified by the values of some of the dataset features (columns): for example, men and women through the gender feature, or income bins through the income feature.
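As a rough sketch of how groups and subgroups work, the snippet below groups rows by one feature (groups) or by several features at once (subgroups) and applies an aggregation method to each. The `aggregate` helper, the `people` records, and the column names are all hypothetical and exist only for this illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records, invented for illustration only.
people = [
    {"gender": "F", "income_bin": "low", "income": 20_000},
    {"gender": "F", "income_bin": "high", "income": 80_000},
    {"gender": "M", "income_bin": "low", "income": 25_000},
    {"gender": "M", "income_bin": "low", "income": 15_000},
]


def aggregate(rows, group_by, value, method):
    """Apply `method` to the `value` column within each group.

    `group_by` is a list of feature names: one name defines groups,
    several names define subgroups.
    """
    groups = defaultdict(list)
    for row in rows:
        # The group key is the tuple of values of the grouping features.
        key = tuple(row[name] for name in group_by)
        groups[key].append(row[value])
    return {key: method(values) for key, values in groups.items()}


# Groups: average income for each gender.
mean_income_by_gender = aggregate(people, ["gender"], "income", mean)

# Subgroups: number of people for each (gender, income bin) pair.
count_by_gender_and_bin = aggregate(
    people, ["gender", "income_bin"], "income", len
)

print(mean_income_by_gender)
print(count_by_gender_and_bin)
```

Adding a feature name to `group_by` splits every existing group into finer subgroups, while the aggregation method itself stays the same.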