Culbreath: Ok here’s a sanity check. For this problem, I said that the second and third queries WILL return the proper count: http://pastebin.com/bD0BYKgj
Culbreath: But not the first and fourth
Culbreath: I feel sketchy about the third one. but sqlfiddle confirms it does properly count what I’m after
Varieur: Culbreath: show your fiddle.
Culbreath: Varieur: http://sqlfiddle.com/#!9/096f4/1
Varieur: Culbreath: http://sqlfiddle.com/#!9/096f4/5 corrected sql
Varieur: Culbreath: http://sqlfiddle.com/#!9/096f4/7 is more complete
Varieur: Culbreath: if you use WHERE, the rows are filtered *before* the grouping operation. With HAVING, they are filtered *after*
Varieur: Culbreath: that’s the main difference.
Varieur: Culbreath: it can have disastrous effects on performance, as the HAVING form will scan the entire table.
Culbreath: Varieur: the second and fourth queries in your fiddle don’t fit the problem spec
Varieur: Culbreath: no. They show variations on aggregation, however.
Varieur: Culbreath: I’m still unconvinced that you grasp the concept well.
Varieur: Culbreath: the only way to expose your misconceptions is to show variations of the same query, and have you look at the output.
Culbreath: Looks pretty straightforward in the fiddle at least. Where do you think I need to shore up?
Varieur: Culbreath: anyway, did you understand my comment regarding HAVING vs WHERE ?
Culbreath: What I don’t understand is how in the query from my problem, there is ” HAVING first_name=’John’ ”
Varieur: Culbreath: someone is abusing HAVING to filter the results, that is all
Culbreath: Does it translate to “Elliminate all groups who do not contain a row where first_name is ‘John’ ” ?
Varieur: Culbreath: when the grouping operation is done, you’re left with tuples rows
Culbreath: Oh I understand it. It works when specifying a column like that because that column was grouped on
Varieur: Culbreath: so to correct you: “It removes resulting tuples where the first name was not ‘John'”
Varieur: Culbreath: the grouping operation is always done *before* the HAVING clause is applied. WHERE is done *before*
Culbreath: Given these records: ‘a’, ‘red’ ‘b’, ‘red’ ‘c’, ‘red’ and the schema ‘letter’, ‘color’, what type of tuple will GROUP BY color return?
Varieur: Culbreath: add it to the fiddle and see.
Manross: Indeterminate for letter
Varieur: Culbreath: ‘a’, ‘b’ or ‘c’.
Manross: Culbreath: and it can be different for each time you execute the sql
Manross: Culbreath: so relying on HAVING to fix this is incorrect
Varieur: I really pity the students of that course, if the teacher can’t explain those concepts.
Manross: Varieur: doesn’t mysql version 5.7 default to requiring all the columns in select be in the group by?
Culbreath: I don’t think he’s gotten to it yet. AFAIK we’re mostly interested in the inner workings of the db like parallelism, sorting/hashing, etc. sql is just a 2-week tangent
Manross: Varieur: the former sql_mode specification?
Varieur: Manross: I have to install 5.7 to see, but that would break 95% of the sql MySQL users run, nowadays.
Manross: Varieur: bingo – https://dev.mysql.com/doc/refman/5.7/en/group-by-handling.html
Varieur: Manross: FWIW, allowing the user to select other columns if the column you group by is deterministic is fine.
Varieur: Manross: and other rdbms like MSSQL simply don’t allow this
Manross: I don’t think pg does either, but that’s just from some casual reading, never used it
Manross: A reliance on seekwill, as it were. make him come back
Varieur: Manross: you think I have power in that regard?
Manross: He was an entertaining mofo