Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I appreciate that this may appear to many as a dum question but I cannot find a clear explanation anywhere as to what the effect of "group by" has on a select max(...) from SQL statement.

I have the following data (there is another column image of type mediumblob which is not shown):

id  title                      test_id 

1  bomb                             0 
2  Soft watch                       2 
3  Dali                             1 
4  Narciss                          1 
5  The Woman In Green               0 
6  A summer in Vetheuil             0 
7  Artist's Garden                  2 
8  Beech Forest                     2 
9  Claude Monet                     0 

I know if I perform

select max(id) from images 
where image is not null; 

I get the max value of id i.e.:

max(id) 

9

However can someone please explain what is happening when I perform

select max(id), title, test_id 
from images 
where image is not null 
group by id; 

I find that the max(id) serves no useful purpose (results shown below)?

  max(id)  title                      test_id 

  1  bomb                              0 
  2  Soft watch                        2 
  3  Dali                              1 
  4  Narciss                           1 
  5  The Woman In Green                0 
  6  A summer in Vetheuil              0 
  7  Artist's Garden                   2 
  8  Beech Forest                      2 
  9  Claude Monet                      0 
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
249 views
Welcome To Ask or Share your Answers For Others

1 Answer

In the case of using MAX() the GROUP BY clause essentially tells the query engine how to group the items from which to determine a maximum. In your first example you were selecting only a single column, so there was no need for grouping. But in your second example you had multiple columns. So you need to tell the query engine how to determine which ones are going to be compared to find a maximum.

You told it to group by the id column. Which means that it's going to compare records which have the same id and give you the maximum one for each unique id. Since every record has a different id, you essentially didn't do anything with that clause.

It grouped all records with an id of 1 (which was a single record), and returned the record with the maximum id from that group (which was that record). It did the same for 2, 3, etc.

In the case of the three columns shown here, the only place where it would make sense to group your records would be on the test_id column. Something like this:

SELECT MAX(id), title, test_id
FROM images
WHERE image IS NOT null 
GROUP BY test_id

This would group them by the test_id, so the results will include records 6 (the maximum id for test_id 0), 4 (the maximum id for test_id 1), and 8 (the maximum id for test_id 2). By splitting the records into those three groups based on the three unique values in the test_id column, it can effectively find a "maximum" id within each group.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...