当前位置 : 首页 » 互动问答 » 正文

How can I SELECT rows with MAX(Column value), DISTINCT by another column in SQL?

分类 : 互动问答 | 发布时间 : 2009-03-05 04:14:26 | 评论 : 17 | 浏览 : 1283970 | 喜欢 : 667

My table is:

id  home  datetime     player   resource
---|-----|------------|--------|---------
1  | 10  | 04/03/2009 | john   | 399 
2  | 11  | 04/03/2009 | juliet | 244
5  | 12  | 04/03/2009 | borat  | 555
3  | 10  | 03/03/2009 | john   | 300
4  | 11  | 03/03/2009 | juliet | 200
6  | 12  | 03/03/2009 | borat  | 500
7  | 13  | 24/12/2008 | borat  | 600
8  | 13  | 01/01/2009 | borat  | 700

I need to select each distinct home holding the maximum value of datetime.

Result would be:

id  home  datetime     player   resource 
---|-----|------------|--------|---------
1  | 10  | 04/03/2009 | john   | 399
2  | 11  | 04/03/2009 | juliet | 244
5  | 12  | 04/03/2009 | borat  | 555
8  | 13  | 01/01/2009 | borat  | 700

I have tried:

-- 1 ..by the MySQL manual: 

SELECT DISTINCT
  home,
  id,
  datetime AS dt,
  player,
  resource
FROM topten t1
WHERE datetime = (SELECT
  MAX(t2.datetime)
FROM topten t2
GROUP BY home)
GROUP BY datetime
ORDER BY datetime DESC

Doesn't work. Result-set has 130 rows although database holds 187. Result includes some duplicates of home.

-- 2 ..join

SELECT
  s1.id,
  s1.home,
  s1.datetime,
  s1.player,
  s1.resource
FROM topten s1
JOIN (SELECT
  id,
  MAX(datetime) AS dt
FROM topten
GROUP BY id) AS s2
  ON s1.id = s2.id
ORDER BY datetime 

Nope. Gives all the records.

-- 3 ..something exotic: 

With various results.

回答(17)

  • 1楼
  • You are so close! All you need to do is select BOTH the home and its max date time, then join back to the topten table on BOTH fields:

    SELECT tt.*
    FROM topten tt
    INNER JOIN
        (SELECT home, MAX(datetime) AS MaxDateTime
        FROM topten
        GROUP BY home) groupedtt 
    ON tt.home = groupedtt.home 
    AND tt.datetime = groupedtt.MaxDateTime
    
  • 2楼
  • Here goes T-SQL version:

    -- Test data
    DECLARE @TestTable TABLE (id INT, home INT, date DATETIME, 
      player VARCHAR(20), resource INT)
    INSERT INTO @TestTable
    SELECT 1, 10, '2009-03-04', 'john', 399 UNION
    SELECT 2, 11, '2009-03-04', 'juliet', 244 UNION
    SELECT 5, 12, '2009-03-04', 'borat', 555 UNION
    SELECT 3, 10, '2009-03-03', 'john', 300 UNION
    SELECT 4, 11, '2009-03-03', 'juliet', 200 UNION
    SELECT 6, 12, '2009-03-03', 'borat', 500 UNION
    SELECT 7, 13, '2008-12-24', 'borat', 600 UNION
    SELECT 8, 13, '2009-01-01', 'borat', 700
    
    -- Answer
    SELECT id, home, date, player, resource 
    FROM (SELECT id, home, date, player, resource, 
        RANK() OVER (PARTITION BY home ORDER BY date DESC) N
        FROM @TestTable
    )M WHERE N = 1
    
    -- and if you really want only home with max date
    SELECT T.id, T.home, T.date, T.player, T.resource 
        FROM @TestTable T
    INNER JOIN 
    (   SELECT TI.id, TI.home, TI.date, 
            RANK() OVER (PARTITION BY TI.home ORDER BY TI.date) N
        FROM @TestTable TI
        WHERE TI.date IN (SELECT MAX(TM.date) FROM @TestTable TM)
    )TJ ON TJ.N = 1 AND T.id = TJ.id
    

    EDIT
    Unfortunately, there are no RANK() OVER function in MySQL.
    But it can be emulated, see Emulating Analytic (AKA Ranking) Functions with MySQL.
    So this is MySQL version:

    SELECT id, home, date, player, resource 
    FROM TestTable AS t1 
    WHERE 
        (SELECT COUNT(*) 
                FROM TestTable AS t2 
                WHERE t2.home = t1.home AND t2.date > t1.date
        ) = 0
    
  • 3楼
  • The fastest MySQL solution, without inner queries and without GROUP BY:

    SELECT m.*                    -- get the row that contains the max value
    FROM topten m                 -- "m" from "max"
        LEFT JOIN topten b        -- "b" from "bigger"
            ON m.home = b.home    -- match "max" row with "bigger" row by `home`
            AND m.datetime < b.datetime           -- want "bigger" than "max"
    WHERE b.datetime IS NULL      -- keep only if there is no bigger than max
    

    Explanation:

    Join the table with itself using the home column. The use of LEFT JOIN ensures all the rows from table m appear in the result set. Those that don't have a match in table b will have NULLs for the columns of b.

    The other condition on the JOIN asks to match only the rows from b that have bigger value on the datetime column than the row from m.

    Using the data posted in the question, the LEFT JOIN will produce this pairs:

    +------------------------------------------+--------------------------------+
    |              the row from `m`            |    the matching row from `b`   |
    |------------------------------------------|--------------------------------|
    | id  home  datetime     player   resource | id    home   datetime      ... |
    |----|-----|------------|--------|---------|------|------|------------|-----|
    | 1  | 10  | 04/03/2009 | john   | 399     | NULL | NULL | NULL       | ... | *
    | 2  | 11  | 04/03/2009 | juliet | 244     | NULL | NULL | NULL       | ... | *
    | 5  | 12  | 04/03/2009 | borat  | 555     | NULL | NULL | NULL       | ... | *
    | 3  | 10  | 03/03/2009 | john   | 300     | 1    | 10   | 04/03/2009 | ... |
    | 4  | 11  | 03/03/2009 | juliet | 200     | 2    | 11   | 04/03/2009 | ... |
    | 6  | 12  | 03/03/2009 | borat  | 500     | 5    | 12   | 04/03/2009 | ... |
    | 7  | 13  | 24/12/2008 | borat  | 600     | 8    | 13   | 01/01/2009 | ... |
    | 8  | 13  | 01/01/2009 | borat  | 700     | NULL | NULL | NULL       | ... | *
    +------------------------------------------+--------------------------------+
    

    Finally, the WHERE clause keeps only the pairs that have NULLs in the columns of b (they are marked with * in the table above); this means, due to the second condition from the JOIN clause, the row selected from m has the biggest value in column datetime.

    Read the SQL Antipatterns: Avoiding the Pitfalls of Database Programming book for other SQL tips.

  • 4楼
  • This will work even if you have two or more rows for each home with equal DATETIME's:

    SELECT id, home, datetime, player, resource
    FROM   (
           SELECT (
                  SELECT  id
                  FROM    topten ti
                  WHERE   ti.home = t1.home
                  ORDER BY
                          ti.datetime DESC
                  LIMIT 1
                  ) lid
           FROM   (
                  SELECT  DISTINCT home
                  FROM    topten
                  ) t1
           ) ro, topten t2
    WHERE  t2.id = ro.lid
    
  • 5楼
  • I think this will give you the desired result:

    SELECT   home, MAX(datetime)
    FROM     my_table
    GROUP BY home
    

    BUT if you need other columns as well, just make a join with the original table (check Michael La Voie answer)

    Best regards.

  • 6楼
  • Since people seem to keep running into this thread (comment date ranges from 1.5 year) isn't this much simpler:

    SELECT * FROM (SELECT * FROM topten ORDER BY datetime DESC) tmp GROUP BY home

    No aggregation functions needed...

    Cheers.

  • 7楼
  • You can also try this one and for large tables query performance will be better. It works when there no more than two records for each home and their dates are different. Better general MySQL query is one from Michael La Voie above.

    SELECT t1.id, t1.home, t1.date, t1.player, t1.resource
    FROM   t_scores_1 t1 
    INNER JOIN t_scores_1 t2
       ON t1.home = t2.home
    WHERE t1.date > t2.date
    

    Or in case of Postgres or those dbs that provide analytic functions try

    SELECT t.* FROM 
    (SELECT t1.id, t1.home, t1.date, t1.player, t1.resource
      , row_number() over (partition by t1.home order by t1.date desc) rw
     FROM   topten t1 
     INNER JOIN topten t2
       ON t1.home = t2.home
     WHERE t1.date > t2.date 
    ) t
    WHERE t.rw = 1
    
  • 8楼
  • This works on Oracle:

    with table_max as(
      select id
           , home
           , datetime
           , player
           , resource
           , max(home) over (partition by home) maxhome
        from table  
    )
    select id
         , home
         , datetime
         , player
         , resource
      from table_max
     where home = maxhome
    
  • 9楼
  • SELECT  tt.*
    FROM    TestTable tt 
    INNER JOIN 
            (
            SELECT  coord, MAX(datetime) AS MaxDateTime 
            FROM    rapsa 
            GROUP BY
                    krd 
            ) groupedtt
    ON      tt.coord = groupedtt.coord
            AND tt.datetime = groupedtt.MaxDateTime
    
  • 10楼
  • Try this for SQL Server:

    WITH cte AS (
       SELECT home, MAX(year) AS year FROM Table1 GROUP BY home
    )
    SELECT * FROM Table1 a INNER JOIN cte ON a.home = cte.home AND a.year = cte.year
    
  • 11楼
  • SELECT c1, c2, c3, c4, c5 FROM table1 WHERE c3 = (select max(c3) from table)
    
    SELECT * FROM table1 WHERE c3 = (select max(c3) from table1)
    
  • 12楼
  • Another way to gt the most recent row per group using a sub query which basically calculates a rank for each row per group and then filter out your most recent rows as with rank = 1

    select a.*
    from topten a
    where (
      select count(*)
      from topten b
      where a.home = b.home
      and a.`datetime` < b.`datetime`
    ) +1 = 1
    

    DEMO

    Here is the visual demo for rank no for each row for better understanding

    By reading some comments what about if there are two rows which have same 'home' and 'datetime' field values?

    Above query will fail and will return more than 1 rows for above situation. To cover up this situation there will be a need of another criteria/parameter/column to decide which row should be taken which falls in above situation. By viewing sample data set i assume there is a primary key column id which should be set to auto increment. So we can use this column to pick the most recent row by tweaking same query with the help of CASE statement like

    select a.*
    from topten a
    where (
      select count(*)
      from topten b
      where a.home = b.home
      and  case 
           when a.`datetime` = b.`datetime`
           then a.id < b.id
           else a.`datetime` < b.`datetime`
           end
    ) + 1 = 1
    

    DEMO

    Above query will pick the row with highest id among the same datetime values

    visual demo for rank no for each row

  • 13楼
  • Here is MySQL version which prints only one entry where there are duplicates MAX(datetime) in a group.

    You could test here http://www.sqlfiddle.com/#!2/0a4ae/1

    Sample Data

    mysql> SELECT * from topten;
    +------+------+---------------------+--------+----------+
    | id   | home | datetime            | player | resource |
    +------+------+---------------------+--------+----------+
    |    1 |   10 | 2009-04-03 00:00:00 | john   |      399 |
    |    2 |   11 | 2009-04-03 00:00:00 | juliet |      244 |
    |    3 |   10 | 2009-03-03 00:00:00 | john   |      300 |
    |    4 |   11 | 2009-03-03 00:00:00 | juliet |      200 |
    |    5 |   12 | 2009-04-03 00:00:00 | borat  |      555 |
    |    6 |   12 | 2009-03-03 00:00:00 | borat  |      500 |
    |    7 |   13 | 2008-12-24 00:00:00 | borat  |      600 |
    |    8 |   13 | 2009-01-01 00:00:00 | borat  |      700 |
    |    9 |   10 | 2009-04-03 00:00:00 | borat  |      700 |
    |   10 |   11 | 2009-04-03 00:00:00 | borat  |      700 |
    |   12 |   12 | 2009-04-03 00:00:00 | borat  |      700 |
    +------+------+---------------------+--------+----------+
    

    MySQL Version with User variable

    SELECT *
    FROM (
        SELECT ord.*,
            IF (@prev_home = ord.home, 0, 1) AS is_first_appear,
            @prev_home := ord.home
        FROM (
            SELECT t1.id, t1.home, t1.player, t1.resource
            FROM topten t1
            INNER JOIN (
                SELECT home, MAX(datetime) AS mx_dt
                FROM topten
                GROUP BY home
              ) x ON t1.home = x.home AND t1.datetime = x.mx_dt
            ORDER BY home
        ) ord, (SELECT @prev_home := 0, @seq := 0) init
    ) y
    WHERE is_first_appear = 1;
    +------+------+--------+----------+-----------------+------------------------+
    | id   | home | player | resource | is_first_appear | @prev_home := ord.home |
    +------+------+--------+----------+-----------------+------------------------+
    |    9 |   10 | borat  |      700 |               1 |                     10 |
    |   10 |   11 | borat  |      700 |               1 |                     11 |
    |   12 |   12 | borat  |      700 |               1 |                     12 |
    |    8 |   13 | borat  |      700 |               1 |                     13 |
    +------+------+--------+----------+-----------------+------------------------+
    4 rows in set (0.00 sec)
    

    Accepted Answers' outout

    SELECT tt.*
    FROM topten tt
    INNER JOIN
        (
        SELECT home, MAX(datetime) AS MaxDateTime
        FROM topten
        GROUP BY home
    ) groupedtt ON tt.home = groupedtt.home AND tt.datetime = groupedtt.MaxDateTime
    +------+------+---------------------+--------+----------+
    | id   | home | datetime            | player | resource |
    +------+------+---------------------+--------+----------+
    |    1 |   10 | 2009-04-03 00:00:00 | john   |      399 |
    |    2 |   11 | 2009-04-03 00:00:00 | juliet |      244 |
    |    5 |   12 | 2009-04-03 00:00:00 | borat  |      555 |
    |    8 |   13 | 2009-01-01 00:00:00 | borat  |      700 |
    |    9 |   10 | 2009-04-03 00:00:00 | borat  |      700 |
    |   10 |   11 | 2009-04-03 00:00:00 | borat  |      700 |
    |   12 |   12 | 2009-04-03 00:00:00 | borat  |      700 |
    +------+------+---------------------+--------+----------+
    7 rows in set (0.00 sec)
    
  • 14楼
  • Try this

    select * from mytable a join
    (select home, max(datetime) datetime
    from mytable
    group by home) b
     on a.home = b.home and a.datetime = b.datetime
    

    Regards K

  • 15楼
  • Why not using: SELECT home, MAX(datetime) AS MaxDateTime,player,resource FROM topten GROUP BY home Did I miss something?

  • 16楼
  • this is the query you need:

     SELECT b.id, a.home,b.[datetime],b.player,a.resource FROM
     (SELECT home,MAX(resource) AS resource FROM tbl_1 GROUP BY home) AS a
    
     LEFT JOIN
    
     (SELECT id,home,[datetime],player,resource FROM tbl_1) AS b
     ON  a.resource = b.resource WHERE a.home =b.home;
    
  • 17楼
  • @Michae The accepted answer will working fine in most of the cases but it fail for one for as below.

    In case if there were 2 rows having HomeID and Datetime same the query will return both rows, not distinct HomeID as required, for that add Distinct in query as below.

    SELECT DISTINCT tt.home  , tt.MaxDateTime
    FROM topten tt
    INNER JOIN
        (SELECT home, MAX(datetime) AS MaxDateTime
        FROM topten
        GROUP BY home) groupedtt 
    ON tt.home = groupedtt.home 
    AND tt.datetime = groupedtt.MaxDateTime
    

相关阅读:

How can I tell what edition of SQL Server runs on the machine?

How do I limit the number of rows returned by an Oracle query after ordering?

Oracle SQL: Update a table with data from another table

MySQL date format DD/MM/YYYY select query?

How to insert a value that contains an apostrophe (single quote)?

Count work days between two dates

Multiple select statements in Single query

How to access remote server with local phpMyAdmin client?

Equivalent of LIMIT and OFFSET for SQL Server?

Grant **all** privileges on database