@@ -0,0 +1,227 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "eeeadbe0",
"metadata": {},
"source": [
"# MySQL 8.0.40\n",
"\n",
"## 0) 前提\n",
"\n",
"* エンジン: **MySQL 8**\n",
"* 並び順: 任意(`ORDER BY` を付けない)\n",
"* `NOT IN` は NULL 罠のため回避\n",
"* 判定は **ID 基準**、表示は仕様どおりの列名と順序\n",
"\n",
"## 1) 問題\n",
"\n",
"* `Orders` から **最も多く注文を行った顧客の `customer_number`** を返す\n",
" (テストでは最大が一意。Follow up: 同数最大が複数いても全件返す)\n",
"\n",
"* 入力テーブル例:\n",
"\n",
" ```markdown\n",
" Table: Orders\n",
" +-----------------+----------+\n",
" | Column Name | Type |\n",
" +-----------------+----------+\n",
" | order_number | int | -- PK\n",
" | customer_number | int |\n",
" +-----------------+----------+\n",
" ```\n",
"\n",
"* 出力仕様:\n",
"\n",
" ```markdown\n",
" +-----------------+\n",
" | customer_number |\n",
" +-----------------+\n",
" ```\n",
"\n",
"## 2) 最適解(単一クエリ)\n",
"\n",
"> **ウィンドウ関数+事前集計**で 1 クエリ。`OVER` 内の `ORDER BY` は順位付けのためで、最終 `SELECT` に `ORDER BY` は不要。\n",
"\n",
"```sql\n",
"WITH cnt AS (\n",
" SELECT\n",
" customer_number,\n",
" COUNT(*) AS order_cnt\n",
" FROM Orders\n",
" GROUP BY customer_number\n",
"),\n",
"win AS (\n",
" SELECT\n",
" customer_number,\n",
" DENSE_RANK() OVER (ORDER BY order_cnt DESC) AS rnk\n",
" FROM cnt\n",
")\n",
"SELECT\n",
" customer_number\n",
"FROM win\n",
"WHERE rnk = 1;\n",
"\n",
"Runtime 429 ms\n",
"Beats 75.41%\n",
"\n",
"```\n",
"\n",
"* これで **一意最大**も**同数最大が複数**も対応(Follow up 充足)\n",
"\n",
"## 3) 代替解\n",
"\n",
"> **最大値をサブクエリで求めて一致フィルタ**。ウィンドウが重い環境や互換用に。\n",
"\n",
"```sql\n",
"WITH cnt AS (\n",
" SELECT customer_number, COUNT(*) AS order_cnt\n",
" FROM Orders\n",
" GROUP BY customer_number\n",
"),\n",
"mx AS (\n",
" SELECT MAX(order_cnt) AS max_cnt FROM cnt\n",
")\n",
"SELECT c.customer_number\n",
"FROM cnt AS c\n",
"JOIN mx ON c.order_cnt = mx.max_cnt;\n",
"\n",
"Runtime 469 ms\n",
"Beats 41.83%\n",
"\n",
"```\n",
"\n",
"※ `NOT IN` は未使用。`ORDER BY ... LIMIT 1` でも実現できるが、本要件では**結果順は任意**かつ **最大同率全件**を自然に返せる上記方式が安全。\n",
"\n",
"## 4) 要点解説\n",
"\n",
"* **方針**: まず `customer_number` 単位で件数を集計 → 上位判定(`DENSE_RANK` か `MAX` 照合) → 仕様列のみ投影。\n",
"* **NULL / 重複**:\n",
"\n",
" * `customer_number` が NULL の行が存在するなら集計前に `WHERE customer_number IS NOT NULL` を入れる(問題文では想定外だが堅牢性の観点)。\n",
" * `order_number` は PK のため重複はなし。\n",
"* **安定性**: 出力順は問わないため **最終 `ORDER BY` は不要**。上位選別はウィンドウ内で完結。\n",
"\n",
"## 5) 計算量(概算)\n",
"\n",
"* `GROUP BY` 集計: **O(N)**~**O(N log N)**(ヒープ/ソート次第)\n",
"* ウィンドウ `DENSE_RANK`(代替は `MAX` 照合): 集計後のユニーク顧客数を M として **O(M log M)**(内部ソート含む)\n",
" ※ 代替解(`MAX` 照合)は **O(M)** でやや軽量。\n",
"\n",
"## 6) 図解(Mermaid 超保守版)\n",
"\n",
"```mermaid\n",
"flowchart TD\n",
" A[Orders] --> B[customer_number ごとに COUNT]\n",
" B --> C[順位付け DENSE_RANK または MAX 照合]\n",
" C --> D[rnk=1 または order_cnt=max_cnt を抽出]\n",
" D --> E[出力 customer_number]\n",
"```\n"
]
},
{
"cell_type": "markdown",
"id": "4392213e",
"metadata": {},
"source": [
"## 結論(用途別ベスト)\n",
"\n",
"### 1) 最大が一意(本問題の前提)なら最短クエリが最速になりやすい\n",
"\n",
"```sql\n",
"SELECT customer_number\n",
"FROM Orders\n",
"GROUP BY customer_number\n",
"ORDER BY COUNT(*) DESC\n",
"LIMIT 1;\n",
"\n",
"Runtime 470 ms\n",
"Beats 41.18%\n",
"\n",
"```\n",
"\n",
"* 余計な CTE・結合・ウィンドウ不要。\n",
"* `GROUP BY` 後に件数降順で **先頭 1 件だけ**返すので、実装や実行計画的にもシンプル。\n",
"\n",
"### 2) 同数最大をすべて返したい(Follow-up 汎用)\n",
"\n",
"**ウィンドウ関数なし**で 1 クエリ:\n",
"\n",
"```sql\n",
"SELECT customer_number\n",
"FROM Orders\n",
"GROUP BY customer_number\n",
"HAVING COUNT(*) = (\n",
" SELECT COUNT(*) AS mx\n",
" FROM Orders\n",
" GROUP BY customer_number\n",
" ORDER BY mx DESC\n",
" LIMIT 1\n",
");\n",
"\n",
"Runtime 433 ms\n",
"Beats 72.43%\n",
"\n",
"```\n",
"\n",
"* 内側サブクエリで「最大件数」だけを 1 行取得 → 外側で一致フィルタ。\n",
"* あなたの `cnt→mx→JOIN` 版よりも結合が無いぶん軽くなることが多いです。\n",
"\n",
"### 3) ウィンドウ関数派(可読性重視)なら CTE をやめて 1 段で\n",
"\n",
"```sql\n",
"SELECT customer_number\n",
"FROM (\n",
" SELECT\n",
" customer_number,\n",
" DENSE_RANK() OVER (ORDER BY COUNT(*) DESC) AS rnk\n",
" FROM Orders\n",
" GROUP BY customer_number\n",
") t\n",
"WHERE rnk = 1;\n",
"\n",
"Runtime 461 ms\n",
"Beats 48.24%\n",
"\n",
"```\n",
"\n",
"* `COUNT(*)` を直接 `ORDER BY` に使い、そのまま `DENSE_RANK`。\n",
"* 中間 `cnt` CTE のマテリアライズを避けられる分だけ有利になる場合があります。\n",
"\n",
"---\n",
"\n",
"## 実務チューニングのヒント\n",
"\n",
"1. **インデックス**\n",
"\n",
"```sql\n",
"CREATE INDEX idx_orders_customer ON Orders(customer_number);\n",
"```\n",
"\n",
"* `GROUP BY customer_number` の集約が大幅にラクになります(全表スキャン回避/ソート削減)。\n",
"\n",
"2. **CTE は必要最小限に**\n",
" MySQL 8 では CTE がマテリアライズされるケースがあり、単回参照の中間表は**派生表**に畳んだ方が速いことが多いです(上の「3)」がそれ)。\n",
"\n",
"3. **LeetCode の Runtime はノイズ大**\n",
" 実環境では **EXPLAIN** で実行計画を確認し、`rows` 見積もり・ファイルソート有無・テンポラリ使用などをチェックしてく さい。\n",
"\n",
"---\n",
"\n",
"## まとめ\n",
"\n",
"* **一意最大前提**なら ⇒ `GROUP BY ... ORDER BY COUNT(*) DESC LIMIT 1` が最有力。\n",
"* **同率最大も返す**なら ⇒ `HAVING COUNT(*) = (SELECT ... LIMIT 1)` がシンプル&速いことが多い。\n",
"* **ウィンドウ採用**なら ⇒ 中間 CTE 省略の 1 段構成に。\n",
"* **物理対策** ⇒ `customer_number` にインデックス。\n"
]
}
],
"metadata": {
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 5
}