SQL - Can we omit column names in GROUP BY when we JOIN two tables?

2024-03-15 04:00:15
While practicing SQL on pgexercises.com, I came across a solution with one line that didn't make sense to me. You can see the question at the link below:

https://pgexercises.com/questions/aggregates/rankmembers.html

The solution query is shown below. It works, but I can't see why it doesn't raise an error, given what looks like an incomplete GROUP BY:

select firstname, surname, hours, rank() over (order by hours desc) from
    (select firstname, surname,
        ((sum(bks.slots)+10)/20)*10 as hours

        from cd.bookings bks
        inner join cd.members mems
            on bks.memid = mems.memid
        group by mems.memid
    ) as subq
order by rank, surname, firstname;

What doesn't click for me is: why doesn't omitting firstname and surname from GROUP BY cause an error? As far as I know, GROUP BY must include every non-aggregated column in the select list. Is this because we JOIN on the primary key, which identifies each row uniquely, and that somehow removes the need to list the displayed columns in the GROUP BY clause?

Solution:

Your guess is exactly right: the PK uniquely identifies the row, and things cannot be more unique than just unique. If memid is unique, any combination of that column with any other column from the same table is also unique, so adding firstname and surname to the GROUP BY would not change the groups.
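To make that concrete, here is a sketch of the inner query with the displayed columns spelled out in the GROUP BY. It uses only the cd.bookings and cd.members tables from the question. Because memid is the primary key of cd.members, it should form exactly the same groups, and return the same rows, as grouping by mems.memid alone:

```sql
-- Same aggregation as the subquery in the question, with the
-- displayed columns listed explicitly in GROUP BY.
select mems.firstname, mems.surname,
       ((sum(bks.slots) + 10) / 20) * 10 as hours
from cd.bookings bks
inner join cd.members mems
    on bks.memid = mems.memid
-- memid alone already determines firstname and surname,
-- so the two extra columns cannot split any group further.
group by mems.memid, mems.firstname, mems.surname;
```

This longer form is also the one to reach for when the query has to run on a database that does not infer the dependency from the primary key.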

Note that this works only for a primary key: although a UNIQUE NOT NULL column is just as unique, PostgreSQL's GROUP BY does not treat it the same way. It also works regardless of the join. Demo.
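A minimal sketch of that difference, with no join involved. The tables pk_members and uq_members and their rows are made up for illustration and are not part of the pgexercises schema. The error text in the comment is what PostgreSQL reports for this kind of query in the versions I have used, so treat the exact wording as an assumption:

```sql
-- Hypothetical tables: identical except for how memid is declared.
create temporary table pk_members (
    memid     integer primary key,
    firstname text not null
);
create temporary table uq_members (
    memid     integer unique not null,
    firstname text not null
);

insert into pk_members values (1, 'Ann'), (2, 'Bob');
insert into uq_members values (1, 'Ann'), (2, 'Bob');

-- Accepted: memid is the primary key, so firstname is
-- functionally dependent on the grouped column.
select memid, firstname, count(*)
from pk_members
group by memid;

-- Rejected, even though memid is just as unique here:
-- ERROR:  column "uq_members.firstname" must appear in the
-- GROUP BY clause or be used in an aggregate function
select memid, firstname, count(*)
from uq_members
group by memid;
```

Running the two selects one at a time makes the contrast easiest to see, since the second one aborts with the error.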

From the PostgreSQL GROUP BY documentation (emphasis mine):

When GROUP BY is present, or any aggregate functions are present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or when the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column. A functional dependency exists if the grouped columns (or a subset thereof) are the primary key of the table containing the ungrouped column.
