Hashing aggregate

Populate an ephemeral hash table as the DBMS scans the table. For each record, check whether there is already an entry in the hash table:

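DISTINCT: If the key is already present, discard the duplicate; otherwise insert it.

GROUP BY: If the group is already present, update its running aggregate; otherwise insert a new entry for it.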
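
The two cases above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not how any particular DBMS implements it: the function names `hash_distinct` and `hash_group_by_avg`, the `key`/`value` accessor parameters, and the choice of AVG as the aggregate are all assumptions made for the example. A Python `set` and `dict` stand in for the ephemeral hash table.

```python
def hash_distinct(records, key):
    """DISTINCT: emit each key the first time it is seen (hypothetical helper)."""
    seen = set()              # ephemeral hash table of keys seen so far
    result = []
    for rec in records:
        k = key(rec)
        if k in seen:         # entry already exists: discard the duplicate
            continue
        seen.add(k)           # first occurrence: insert and emit
        result.append(k)
    return result


def hash_group_by_avg(records, key, value):
    """GROUP BY with AVG: keep a running (count, sum) per group (hypothetical helper)."""
    table = {}                # group key -> (count, sum)
    for rec in records:
        k, v = key(rec), value(rec)
        if k in table:        # entry already exists: update the running aggregate
            count, total = table[k]
            table[k] = (count + 1, total + v)
        else:                 # no entry yet: start a new group
            table[k] = (1, v)
    # Finalize: turn each running (count, sum) into the average.
    return {k: total / count for k, (count, total) in table.items()}
```

Keeping `(count, sum)` rather than the average itself is what lets the aggregate be updated one record at a time; other aggregates such as MIN, MAX, SUM, or COUNT need only a single running value per group.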

If everything fits in memory, this is easy: a single scan with the hash table held in memory is enough.

If the hash table does not fit in memory and the DBMS must spill data to disk, then we need to be smarter...
