|
| 1 | +{ |
| 2 | + "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "markdown", |
| 5 | + "id": "eeeadbe0", |
| 6 | + "metadata": {}, |
| 7 | + "source": [ |
| 8 | + "# MySQL 8.0.40\n", |
| 9 | + "\n", |
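| 10 | + "## 0) Assumptions\n", |
| 11 | + "\n", |
| 12 | + "* Engine: **MySQL 8**\n", |
| 13 | + "* Result order: arbitrary (no `ORDER BY` is added)\n", |
| 14 | + "* `NOT IN` is avoided because of its NULL pitfall\n", |
| 15 | + "* Decisions are **ID-based**; the output uses the column names and order given in the spec\n", |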
| 16 | + "\n", |
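| 17 | + "## 1) Problem\n", |
| 18 | + "\n", |
| 19 | + "* From `Orders`, return the **`customer_number` of the customer who placed the most orders**\n", |
| 20 | + " (In the tests the maximum is unique. Follow up: return every customer when several tie for the maximum)\n", |
| 21 | + "\n", |
| 22 | + "* Input table example:\n", |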
| 23 | + "\n", |
| 24 | + " ```markdown\n", |
| 25 | + " Table: Orders\n", |
| 26 | + " +-----------------+----------+\n", |
| 27 | + " | Column Name | Type |\n", |
| 28 | + " +-----------------+----------+\n", |
| 29 | + " | order_number | int | -- PK\n", |
| 30 | + " | customer_number | int |\n", |
| 31 | + " +-----------------+----------+\n", |
| 32 | + " ```\n", |
| 33 | + "\n", |
| 34 | + "* Output spec:\n", |
| 35 | + "\n", |
| 36 | + " ```markdown\n", |
| 37 | + " +-----------------+\n", |
| 38 | + " | customer_number |\n", |
| 39 | + " +-----------------+\n", |
| 40 | + " ```\n", |
| 41 | + "\n", |
| 42 | + "## 2) Best solution (single query)\n", |
| 43 | + "\n", |
| 44 | + "> **Window function + pre-aggregation** in one query. The `ORDER BY` inside `OVER` exists only for ranking, so the final `SELECT` needs no `ORDER BY`.\n", |
| 45 | + "\n", |
| 46 | + "```sql\n", |
| 47 | + "WITH cnt AS (\n", |
| 48 | + " SELECT\n", |
| 49 | + " customer_number,\n", |
| 50 | + " COUNT(*) AS order_cnt\n", |
| 51 | + " FROM Orders\n", |
| 52 | + " GROUP BY customer_number\n", |
| 53 | + "),\n", |
| 54 | + "win AS (\n", |
| 55 | + " SELECT\n", |
| 56 | + " customer_number,\n", |
| 57 | + " DENSE_RANK() OVER (ORDER BY order_cnt DESC) AS rnk\n", |
| 58 | + " FROM cnt\n", |
| 59 | + ")\n", |
| 60 | + "SELECT\n", |
| 61 | + " customer_number\n", |
| 62 | + "FROM win\n", |
| 63 | + "WHERE rnk = 1;\n", |
| 64 | + "\n", |
| 65 | + "-- Runtime 429 ms\n", |
| 66 | + "-- Beats 75.41%\n", |
| 67 | + "\n", |
| 68 | + "```\n", |
| 69 | + "\n", |
| 70 | + "* This handles both a **unique maximum** and **several customers tied for the maximum** (satisfies the Follow up)\n", |
| 71 | + "\n", |
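| 72 | + "## 3) Alternative solution\n", |
| 73 | + "\n", |
| 74 | + "> **Compute the maximum in a subquery and filter on equality**. Useful where window functions are expensive, or for compatibility.\n", |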
| 75 | + "\n", |
| 76 | + "```sql\n", |
| 77 | + "WITH cnt AS (\n", |
| 78 | + " SELECT customer_number, COUNT(*) AS order_cnt\n", |
| 79 | + " FROM Orders\n", |
| 80 | + " GROUP BY customer_number\n", |
| 81 | + "),\n", |
| 82 | + "mx AS (\n", |
| 83 | + " SELECT MAX(order_cnt) AS max_cnt FROM cnt\n", |
| 84 | + ")\n", |
| 85 | + "SELECT c.customer_number\n", |
| 86 | + "FROM cnt AS c\n", |
| 87 | + "JOIN mx ON c.order_cnt = mx.max_cnt;\n", |
| 88 | + "\n", |
| 89 | + "-- Runtime 469 ms\n", |
| 90 | + "-- Beats 41.83%\n", |
| 91 | + "\n", |
| 92 | + "```\n", |
| 93 | + "\n", |
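| 94 | + "Note: `NOT IN` is not used. `ORDER BY ... LIMIT 1` would also work, but because **result order is arbitrary** and **every customer tied for the maximum** should be returned, the approach above is safer: it returns ties naturally.\n", |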
| 95 | + "\n", |
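| 96 | + "## 4) Key points\n", |
| 97 | + "\n", |
| 98 | + "* **Approach**: first count rows per `customer_number` → pick the top (`DENSE_RANK` or a `MAX` match) → project only the column the spec asks for.\n", |
| 99 | + "* **NULL / duplicates**:\n", |
| 100 | + "\n", |
| 101 | + " * If rows with a NULL `customer_number` exist, add `WHERE customer_number IS NOT NULL` before aggregating (the problem statement does not expect this, but it adds robustness).\n", |
| 102 | + " * `order_number` is the PK, so there are no duplicates.\n", |
| 103 | + "* **Stability**: output order does not matter, so **no final `ORDER BY` is needed**. Top selection is completed inside the window.\n", |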
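| | + "\n", |
| | + "As a minimal sketch of the NULL guard mentioned above (assuming NULL `customer_number` rows can actually occur, which the problem statement does not say), the filter can go inside the counting CTE so that NULLs never form a group:\n", |
| | + "\n", |
| | + "```sql\n", |
| | + "-- Sketch only: same Orders table as above; the NULL rows are an assumption.\n", |
| | + "WITH cnt AS (\n", |
| | + " SELECT customer_number, COUNT(*) AS order_cnt\n", |
| | + " FROM Orders\n", |
| | + " WHERE customer_number IS NOT NULL -- drop NULL customers before grouping\n", |
| | + " GROUP BY customer_number\n", |
| | + ")\n", |
| | + "SELECT customer_number\n", |
| | + "FROM cnt\n", |
| | + "WHERE order_cnt = (SELECT MAX(order_cnt) FROM cnt); -- keeps every tied customer\n", |
| | + "```\n", |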
| 104 | + "\n", |
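| 105 | + "## 5) Complexity (rough estimate)\n", |
| 106 | + "\n", |
| 107 | + "* `GROUP BY` aggregation: **O(N)** to **O(N log N)** (depending on whether a heap table or a sort is used)\n", |
| 108 | + "* Window `DENSE_RANK` (the alternative is a `MAX` match): with M the number of distinct customers after aggregation, **O(M log M)** (including the internal sort)\n", |
| 109 | + " Note: the alternative (`MAX` match) is **O(M)** and slightly lighter.\n", |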
| 110 | + "\n", |
| 111 | + "## 6) Diagram (Mermaid, ultra-conservative version)\n", |
| 112 | + "\n", |
| 113 | + "```mermaid\n", |
| 114 | + "flowchart TD\n", |
| 115 | + " A[Orders] --> B[COUNT per customer_number]\n", |
| 116 | + " B --> C[Rank with DENSE_RANK or match against MAX]\n", |
| 117 | + " C --> D[Keep rnk=1 or order_cnt=max_cnt]\n", |
| 118 | + " D --> E[Output customer_number]\n", |
| 119 | + "```\n" |
| 120 | + ] |
| 121 | + }, |
| 122 | + { |
| 123 | + "cell_type": "markdown", |
| 124 | + "id": "4392213e", |
| 125 | + "metadata": {}, |
| 126 | + "source": [ |
| 127 | + "## Conclusion (best by use case)\n", |
| 128 | + "\n", |
| 129 | + "### 1) If the maximum is unique (this problem's premise), the shortest query tends to be the fastest\n", |
| 130 | + "\n", |
| 131 | + "```sql\n", |
| 132 | + "SELECT customer_number\n", |
| 133 | + "FROM Orders\n", |
| 134 | + "GROUP BY customer_number\n", |
| 135 | + "ORDER BY COUNT(*) DESC\n", |
| 136 | + "LIMIT 1;\n", |
| 137 | + "\n", |
| 138 | + "-- Runtime 470 ms\n", |
| 139 | + "-- Beats 41.18%\n", |
| 140 | + "\n", |
| 141 | + "```\n", |
| 142 | + "\n", |
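| 143 | + "* No extra CTE, join, or window function is needed.\n", |
| 144 | + "* After `GROUP BY`, it sorts by count descending and returns **only the first row**, so both the implementation and the execution plan stay simple.\n", |
| 145 | + "\n", |
| 146 | + "### 2) Return every customer tied for the maximum (general Follow-up)\n", |
| 147 | + "\n", |
| 148 | + "One query **without window functions**:\n", |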
| 149 | + "\n", |
| 150 | + "```sql\n", |
| 151 | + "SELECT customer_number\n", |
| 152 | + "FROM Orders\n", |
| 153 | + "GROUP BY customer_number\n", |
| 154 | + "HAVING COUNT(*) = (\n", |
| 155 | + " SELECT COUNT(*) AS mx\n", |
| 156 | + " FROM Orders\n", |
| 157 | + " GROUP BY customer_number\n", |
| 158 | + " ORDER BY mx DESC\n", |
| 159 | + " LIMIT 1\n", |
| 160 | + ");\n", |
| 161 | + "\n", |
| 162 | + "-- Runtime 433 ms\n", |
| 163 | + "-- Beats 72.43%\n", |
| 164 | + "\n", |
| 165 | + "```\n", |
| 166 | + "\n", |
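| 167 | + "* The inner subquery fetches only the maximum count as a single row → the outer query filters on equality.\n", |
| 168 | + "* It is often lighter than your `cnt→mx→JOIN` version because it has no join.\n", |
| 169 | + "\n", |
| 170 | + "### 3) If you prefer window functions (readability first), drop the CTE and use a single level\n", |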
| 171 | + "\n", |
| 172 | + "```sql\n", |
| 173 | + "SELECT customer_number\n", |
| 174 | + "FROM (\n", |
| 175 | + " SELECT\n", |
| 176 | + " customer_number,\n", |
| 177 | + " DENSE_RANK() OVER (ORDER BY COUNT(*) DESC) AS rnk\n", |
| 178 | + " FROM Orders\n", |
| 179 | + " GROUP BY customer_number\n", |
| 180 | + ") t\n", |
| 181 | + "WHERE rnk = 1;\n", |
| 182 | + "\n", |
| 183 | + "-- Runtime 461 ms\n", |
| 184 | + "-- Beats 48.24%\n", |
| 185 | + "\n", |
| 186 | + "```\n", |
| 187 | + "\n", |
| 188 | + "* `COUNT(*)` is used directly in the window's `ORDER BY` and fed straight into `DENSE_RANK`.\n", |
| 189 | + "* This can be faster to the extent that it avoids materializing the intermediate `cnt` CTE.\n", |
| 190 | + "\n", |
| 191 | + "---\n", |
| 192 | + "\n", |
| 193 | + "## Practical tuning tips\n", |
| 194 | + "\n", |
| 195 | + "1. **Index**\n", |
| 196 | + "\n", |
| 197 | + "```sql\n", |
| 198 | + "CREATE INDEX idx_orders_customer ON Orders(customer_number);\n", |
| 199 | + "```\n", |
| 200 | + "\n", |
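| 201 | + "* This makes the `GROUP BY customer_number` aggregation much cheaper: the index can be scanned in key order instead of reading the full table rows, which can remove the sort or temporary table.\n", |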
| 202 | + "\n", |
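| 203 | + "2. **Keep CTEs to the minimum needed**\n", |
| 204 | + " In MySQL 8 a CTE is materialized in some cases, so folding an intermediate table that is referenced only once into a **derived table** is often faster (that is what 3) above does).\n", |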
| 205 | + "\n", |
| 206 | + "3. **LeetCode runtimes are noisy**\n", |
| 207 | + " In a real environment, check the execution plan with **EXPLAIN**: look at the `rows` estimate, whether a filesort occurs, whether a temporary table is used, and so on.\n", |
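| | + "\n", |
| | + "A hedged sketch of that check, assuming the `Orders` table from this notebook (the actual plan depends on the data, the indexes, and the server version):\n", |
| | + "\n", |
| | + "```sql\n", |
| | + "-- Shows the estimated plan; inspect the rows and Extra columns\n", |
| | + "-- (for example Using temporary, Using filesort, Using index).\n", |
| | + "EXPLAIN\n", |
| | + "SELECT customer_number\n", |
| | + "FROM Orders\n", |
| | + "GROUP BY customer_number\n", |
| | + "ORDER BY COUNT(*) DESC\n", |
| | + "LIMIT 1;\n", |
| | + "\n", |
| | + "-- MySQL 8.0.18 and later: executes the query and reports actual row counts and timings.\n", |
| | + "EXPLAIN ANALYZE\n", |
| | + "SELECT customer_number\n", |
| | + "FROM Orders\n", |
| | + "GROUP BY customer_number\n", |
| | + "ORDER BY COUNT(*) DESC\n", |
| | + "LIMIT 1;\n", |
| | + "```\n", |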
| 208 | + "\n", |
| 209 | + "---\n", |
| 210 | + "\n", |
| 211 | + "## Summary\n", |
| 212 | + "\n", |
| 213 | + "* **Unique maximum assumed** ⇒ `GROUP BY ... ORDER BY COUNT(*) DESC LIMIT 1` is the strongest candidate.\n", |
| 214 | + "* **Ties must also be returned** ⇒ `HAVING COUNT(*) = (SELECT ... LIMIT 1)` is simple and often fast.\n", |
| 215 | + "* **Using a window function** ⇒ use the single-level form without the intermediate CTE.\n", |
| 216 | + "* **Physical tuning** ⇒ put an index on `customer_number`.\n" |
| 217 | + ] |
| 218 | + } |
| 219 | + ], |
| 220 | + "metadata": { |
| 221 | + "language_info": { |
| 222 | + "name": "python" |
| 223 | + } |
| 224 | + }, |
| 225 | + "nbformat": 4, |
| 226 | + "nbformat_minor": 5 |
| 227 | +} |