{
"cells": [
{
"cell_type": "markdown",
"id": "41fb9eda-8d3e-43e6-ac29-ed5bb545f0e3",
"metadata": {},
"source": [
"# データサイエンス100本ノック 21~30"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "bfa0c462-fa3a-443e-b011-1094b8e3d629",
"metadata": {},
"outputs": [],
"source": [
"import polars as pl\n",
"from helper.polars import load_100knocks_data\n",
"pl.Config.set_fmt_str_lengths(100)\n",
"df_customer, df_category, df_product, df_receipt, df_store, df_geocode = load_100knocks_data()"
]
},
{
"cell_type": "markdown",
"id": "8cd96d80-1213-4ca1-8b44-adce30c50d68",
"metadata": {},
"source": [
"## P-021\n",
"\n",
"レシート明細データ(df_receipt)に対し、件数をカウントせよ。"
]
},
{
"cell_type": "code",
"execution_count": 35,
"id": "f38ef610-0171-4880-9686-659f395ec9f5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"104681"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_receipt.shape[0]"
]
},
{
"cell_type": "code",
"execution_count": 36,
"id": "c28699b4-3554-4f5e-bbf6-a09914617373",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"104681"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(df_receipt)"
]
},
{
"cell_type": "markdown",
"id": "fcf3d91e-7156-4a97-8f77-3a34cdb3da7c",
"metadata": {},
"source": [
"## P-022\n",
"\n",
"レシート明細データ(df_receipt)の顧客ID(customer_id)に対し、ユニーク件数をカウントせよ。"
]
},
{
"cell_type": "code",
"execution_count": 37,
"id": "dc524952-78f2-439f-a6cd-af621c8c0e28",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"8307"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".select(pl.col.customer_id.n_unique())\n",
".item()\n",
")"
]
},
{
"cell_type": "markdown",
"id": "7b7e7f55-c6f5-41bb-9c3a-e4b1cc42fc05",
"metadata": {},
"source": [
"## P-023\n",
"\n",
"レシート明細データ(df_receipt)に対し、店舗コード(store_cd)ごとに売上金額(amount)と売上数量(quantity)を合計せよ。"
]
},
{
"cell_type": "code",
"execution_count": 39,
"id": "9942b42b-3702-45f1-bee1-5c8c967817ae",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"
shape: (5, 3)store_cd | amount | quantity |
---|
str | i64 | i64 |
"S12007" | 638761 | 2099 |
"S12013" | 787513 | 2425 |
"S12014" | 725167 | 2358 |
"S12029" | 794741 | 2555 |
"S12030" | 684402 | 2403 |
"
],
"text/plain": [
"shape: (5, 3)\n",
"┌──────────┬────────┬──────────┐\n",
"│ store_cd ┆ amount ┆ quantity │\n",
"│ --- ┆ --- ┆ --- │\n",
"│ str ┆ i64 ┆ i64 │\n",
"╞══════════╪════════╪══════════╡\n",
"│ S12007 ┆ 638761 ┆ 2099 │\n",
"│ S12013 ┆ 787513 ┆ 2425 │\n",
"│ S12014 ┆ 725167 ┆ 2358 │\n",
"│ S12029 ┆ 794741 ┆ 2555 │\n",
"│ S12030 ┆ 684402 ┆ 2403 │\n",
"└──────────┴────────┴──────────┘"
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".group_by('store_cd')\n",
".agg(pl.col('amount', 'quantity').sum())\n",
".sort(by='store_cd')\n",
".head()\n",
")"
]
},
{
"cell_type": "markdown",
"id": "c251677d-6642-4027-a6c4-b6b666510307",
"metadata": {},
"source": [
"## P-024\n",
"\n",
"レシート明細データ(df_receipt)に対し、顧客ID(customer_id)ごとに最も新しい売上年月日(sales_ymd)を求め、10件表示せよ。"
]
},
{
"cell_type": "code",
"execution_count": 40,
"id": "d6a62bab-6712-4c72-a2bc-5fef42bc958c",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (5, 2)customer_id | sales_ymd |
---|
str | i64 |
"CS006214000001" | 20190908 |
"CS008415000097" | 20190417 |
"CS028414000014" | 20191023 |
"ZZ000000000000" | 20191031 |
"CS025415000050" | 20191008 |
"
],
"text/plain": [
"shape: (5, 2)\n",
"┌────────────────┬───────────┐\n",
"│ customer_id ┆ sales_ymd │\n",
"│ --- ┆ --- │\n",
"│ str ┆ i64 │\n",
"╞════════════════╪═══════════╡\n",
"│ CS006214000001 ┆ 20190908 │\n",
"│ CS008415000097 ┆ 20190417 │\n",
"│ CS028414000014 ┆ 20191023 │\n",
"│ ZZ000000000000 ┆ 20191031 │\n",
"│ CS025415000050 ┆ 20191008 │\n",
"└────────────────┴───────────┘"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".group_by(pl.col.customer_id, maintain_order=True)\n",
".agg(pl.col('sales_ymd').max())\n",
".head()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 41,
"id": "932c80f8-9cbd-4e0a-8052-87e070f3b4cc",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (5, 9)customer_id | sales_ymd | sales_epoch | store_cd | receipt_no | receipt_sub_no | product_cd | quantity | amount |
---|
str | i64 | i64 | str | i64 | i64 | str | i64 | i64 |
"CS006214000001" | 20190908 | 1567900800 | "S14006" | 112 | 1 | "P071302010" | 1 | 770 |
"CS008415000097" | 20190417 | 1555459200 | "S13008" | 1192 | 1 | "P090405001" | 1 | 200 |
"CS028414000014" | 20191023 | 1571788800 | "S14028" | 1122 | 2 | "P070301019" | 1 | 225 |
"ZZ000000000000" | 20191031 | 1572480000 | "S13019" | 112 | 1 | "P080102015" | 1 | 30 |
"CS025415000050" | 20191008 | 1570492800 | "S14025" | 1182 | 2 | "P080801009" | 1 | 258 |
"
],
"text/plain": [
"shape: (5, 9)\n",
"┌────────────────┬───────────┬─────────────┬──────────┬───┬────────────────┬────────────┬──────────┬────────┐\n",
"│ customer_id ┆ sales_ymd ┆ sales_epoch ┆ store_cd ┆ … ┆ receipt_sub_no ┆ product_cd ┆ quantity ┆ amount │\n",
"│ --- ┆ --- ┆ --- ┆ --- ┆ ┆ --- ┆ --- ┆ --- ┆ --- │\n",
"│ str ┆ i64 ┆ i64 ┆ str ┆ ┆ i64 ┆ str ┆ i64 ┆ i64 │\n",
"╞════════════════╪═══════════╪═════════════╪══════════╪═══╪════════════════╪════════════╪══════════╪════════╡\n",
"│ CS006214000001 ┆ 20190908 ┆ 1567900800 ┆ S14006 ┆ … ┆ 1 ┆ P071302010 ┆ 1 ┆ 770 │\n",
"│ CS008415000097 ┆ 20190417 ┆ 1555459200 ┆ S13008 ┆ … ┆ 1 ┆ P090405001 ┆ 1 ┆ 200 │\n",
"│ CS028414000014 ┆ 20191023 ┆ 1571788800 ┆ S14028 ┆ … ┆ 2 ┆ P070301019 ┆ 1 ┆ 225 │\n",
"│ ZZ000000000000 ┆ 20191031 ┆ 1572480000 ┆ S13019 ┆ … ┆ 1 ┆ P080102015 ┆ 1 ┆ 30 │\n",
"│ CS025415000050 ┆ 20191008 ┆ 1570492800 ┆ S14025 ┆ … ┆ 2 ┆ P080801009 ┆ 1 ┆ 258 │\n",
"└────────────────┴───────────┴─────────────┴──────────┴───┴────────────────┴────────────┴──────────┴────────┘"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".group_by(pl.col.customer_id, maintain_order=True)\n",
".agg(\n",
" pl.all()\n",
" .sort_by('sales_ymd', descending=True)\n",
" .first()\n",
")\n",
".head()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 42,
"id": "ecdbc6b6-288c-4642-9941-1f979ddff51e",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (5, 9)customer_id | sales_ymd | sales_epoch | store_cd | receipt_no | receipt_sub_no | product_cd | quantity | amount |
---|
str | i64 | i64 | str | i64 | i64 | str | i64 | i64 |
"CS006214000001" | 20190908 | 1567900800 | "S14006" | 112 | 1 | "P071302010" | 1 | 770 |
"CS008415000097" | 20190417 | 1555459200 | "S13008" | 1192 | 1 | "P090405001" | 1 | 200 |
"CS028414000014" | 20191023 | 1571788800 | "S14028" | 1122 | 2 | "P070301019" | 1 | 225 |
"ZZ000000000000" | 20191031 | 1572480000 | "S13019" | 112 | 1 | "P080102015" | 1 | 30 |
"CS025415000050" | 20191008 | 1570492800 | "S14025" | 1182 | 2 | "P080801009" | 1 | 258 |
"
],
"text/plain": [
"shape: (5, 9)\n",
"┌────────────────┬───────────┬─────────────┬──────────┬───┬────────────────┬────────────┬──────────┬────────┐\n",
"│ customer_id ┆ sales_ymd ┆ sales_epoch ┆ store_cd ┆ … ┆ receipt_sub_no ┆ product_cd ┆ quantity ┆ amount │\n",
"│ --- ┆ --- ┆ --- ┆ --- ┆ ┆ --- ┆ --- ┆ --- ┆ --- │\n",
"│ str ┆ i64 ┆ i64 ┆ str ┆ ┆ i64 ┆ str ┆ i64 ┆ i64 │\n",
"╞════════════════╪═══════════╪═════════════╪══════════╪═══╪════════════════╪════════════╪══════════╪════════╡\n",
"│ CS006214000001 ┆ 20190908 ┆ 1567900800 ┆ S14006 ┆ … ┆ 1 ┆ P071302010 ┆ 1 ┆ 770 │\n",
"│ CS008415000097 ┆ 20190417 ┆ 1555459200 ┆ S13008 ┆ … ┆ 1 ┆ P090405001 ┆ 1 ┆ 200 │\n",
"│ CS028414000014 ┆ 20191023 ┆ 1571788800 ┆ S14028 ┆ … ┆ 2 ┆ P070301019 ┆ 1 ┆ 225 │\n",
"│ ZZ000000000000 ┆ 20191031 ┆ 1572480000 ┆ S13019 ┆ … ┆ 1 ┆ P080102015 ┆ 1 ┆ 30 │\n",
"│ CS025415000050 ┆ 20191008 ┆ 1570492800 ┆ S14025 ┆ … ┆ 2 ┆ P080801009 ┆ 1 ┆ 258 │\n",
"└────────────────┴───────────┴─────────────┴──────────┴───┴────────────────┴────────────┴──────────┴────────┘"
]
},
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".group_by(pl.col.customer_id, maintain_order=True)\n",
".agg(\n",
" pl.all()\n",
" .get(pl.col.sales_ymd.arg_max())\n",
")\n",
".head()\n",
")"
]
},
{
"cell_type": "markdown",
"id": "eb29cb39-03b6-4e12-beb2-2d704f472769",
"metadata": {},
"source": [
"## P-025\n",
"\n",
"レシート明細データ(df_receipt)に対し、顧客ID(customer_id)ごとに最も古い売上年月日(sales_ymd)を求め、10件表示せよ。"
]
},
{
"cell_type": "code",
"execution_count": 43,
"id": "21dfc47d-c5df-4db5-8112-6da055afb244",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (5, 2)customer_id | sales_ymd |
---|
str | i64 |
"CS006214000001" | 20170509 |
"CS008415000097" | 20170328 |
"CS028414000014" | 20170403 |
"ZZ000000000000" | 20170101 |
"CS025415000050" | 20170619 |
"
],
"text/plain": [
"shape: (5, 2)\n",
"┌────────────────┬───────────┐\n",
"│ customer_id ┆ sales_ymd │\n",
"│ --- ┆ --- │\n",
"│ str ┆ i64 │\n",
"╞════════════════╪═══════════╡\n",
"│ CS006214000001 ┆ 20170509 │\n",
"│ CS008415000097 ┆ 20170328 │\n",
"│ CS028414000014 ┆ 20170403 │\n",
"│ ZZ000000000000 ┆ 20170101 │\n",
"│ CS025415000050 ┆ 20170619 │\n",
"└────────────────┴───────────┘"
]
},
"execution_count": 43,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".group_by(pl.col.customer_id, maintain_order=True)\n",
".agg(pl.col('sales_ymd').min())\n",
".head()\n",
")"
]
},
{
"cell_type": "markdown",
"id": "0568f5c5-3dd1-40b5-9df0-c12631b68e12",
"metadata": {},
"source": [
"## P-026\n",
"\n",
"レシート明細データ(df_receipt)に対し、顧客ID(customer_id)ごとに最も新しい売上年月日(sales_ymd)と古い売上年月日を求め、両者が異なるデータを10件表示せよ。"
]
},
{
"cell_type": "code",
"execution_count": 44,
"id": "61abb4d0-e3d9-4925-b2c8-7bfc7d38fd18",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (5, 3)customer_id | sales_ymd_min | sales_ymd_max |
---|
str | i64 | i64 |
"CS006214000001" | 20170509 | 20190908 |
"CS008415000097" | 20170328 | 20190417 |
"CS028414000014" | 20170403 | 20191023 |
"ZZ000000000000" | 20170101 | 20191031 |
"CS025415000050" | 20170619 | 20191008 |
"
],
"text/plain": [
"shape: (5, 3)\n",
"┌────────────────┬───────────────┬───────────────┐\n",
"│ customer_id ┆ sales_ymd_min ┆ sales_ymd_max │\n",
"│ --- ┆ --- ┆ --- │\n",
"│ str ┆ i64 ┆ i64 │\n",
"╞════════════════╪═══════════════╪═══════════════╡\n",
"│ CS006214000001 ┆ 20170509 ┆ 20190908 │\n",
"│ CS008415000097 ┆ 20170328 ┆ 20190417 │\n",
"│ CS028414000014 ┆ 20170403 ┆ 20191023 │\n",
"│ ZZ000000000000 ┆ 20170101 ┆ 20191031 │\n",
"│ CS025415000050 ┆ 20170619 ┆ 20191008 │\n",
"└────────────────┴───────────────┴───────────────┘"
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".lazy()\n",
".group_by('customer_id', maintain_order=True)\n",
".agg(\n",
" pl.col.sales_ymd.min().alias('sales_ymd_min'),\n",
" pl.col.sales_ymd.max().alias('sales_ymd_max'), \n",
")\n",
".filter(\n",
" pl.col.sales_ymd_min != pl.col.sales_ymd_max\n",
")\n",
".head(5)\n",
".collect()\n",
")"
]
},
{
"cell_type": "markdown",
"id": "877a00f9-bd76-47b2-ac87-b736105f0369",
"metadata": {},
"source": [
"## P-027\n",
"\n",
"レシート明細データ(df_receipt)に対し、店舗コード(store_cd)ごとに売上金額(amount)の平均を計算し、降順でTOP5を表示せよ。"
]
},
{
"cell_type": "code",
"execution_count": 45,
"id": "b1aba2c1-55df-4775-a196-0a8cf8401e2a",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (5, 2)store_cd | amount |
---|
str | f64 |
"S13052" | 402.86747 |
"S13015" | 351.11196 |
"S13003" | 350.915519 |
"S14010" | 348.791262 |
"S13001" | 348.470386 |
"
],
"text/plain": [
"shape: (5, 2)\n",
"┌──────────┬────────────┐\n",
"│ store_cd ┆ amount │\n",
"│ --- ┆ --- │\n",
"│ str ┆ f64 │\n",
"╞══════════╪════════════╡\n",
"│ S13052 ┆ 402.86747 │\n",
"│ S13015 ┆ 351.11196 │\n",
"│ S13003 ┆ 350.915519 │\n",
"│ S14010 ┆ 348.791262 │\n",
"│ S13001 ┆ 348.470386 │\n",
"└──────────┴────────────┘"
]
},
"execution_count": 45,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".group_by('store_cd')\n",
".agg(pl.col.amount.mean())\n",
".sort(by='amount', descending=True)\n",
".head(5)\n",
")"
]
},
{
"cell_type": "markdown",
"id": "b7c66477-1fb7-4327-95bf-cb69865e6339",
"metadata": {},
"source": [
"## P-028\n",
"\n",
"レシート明細データ(df_receipt)に対し、店舗コード(store_cd)ごとに売上金額(amount)の中央値を計算し、降順でTOP5を表示せよ。"
]
},
{
"cell_type": "code",
"execution_count": 46,
"id": "6540861d-8ac7-4634-89f1-650e8fb61d1a",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (5, 2)store_cd | amount |
---|
str | f64 |
"S13052" | 190.0 |
"S14010" | 188.0 |
"S14050" | 185.0 |
"S13018" | 180.0 |
"S13003" | 180.0 |
"
],
"text/plain": [
"shape: (5, 2)\n",
"┌──────────┬────────┐\n",
"│ store_cd ┆ amount │\n",
"│ --- ┆ --- │\n",
"│ str ┆ f64 │\n",
"╞══════════╪════════╡\n",
"│ S13052 ┆ 190.0 │\n",
"│ S14010 ┆ 188.0 │\n",
"│ S14050 ┆ 185.0 │\n",
"│ S13018 ┆ 180.0 │\n",
"│ S13003 ┆ 180.0 │\n",
"└──────────┴────────┘"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".group_by('store_cd')\n",
".agg(pl.col.amount.median())\n",
".sort(by='amount', descending=True)\n",
".head(5)\n",
")"
]
},
{
"cell_type": "markdown",
"id": "8b665aa8-7406-49b8-a052-5f009f5dba89",
"metadata": {},
"source": [
"## P-029\n",
"\n",
"レシート明細データ(df_receipt)に対し、店舗コード(store_cd)ごとに商品コード(product_cd)の最頻値を求め、10件表示させよ。"
]
},
{
"cell_type": "code",
"execution_count": 47,
"id": "a1e32925-4e63-45b3-aa88-255cc3312a98",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (10, 3)store_cd | product_cd | count |
---|
str | list[str] | u32 |
"S12007" | ["P060303001"] | 72 |
"S12013" | ["P060303001"] | 107 |
"S12014" | ["P060303001"] | 65 |
"S12029" | ["P060303001"] | 92 |
"S12030" | ["P060303001"] | 115 |
"S13001" | ["P060303001"] | 67 |
"S13002" | ["P060303001"] | 78 |
"S13003" | ["P071401001"] | 65 |
"S13004" | ["P060303001"] | 88 |
"S13005" | ["P040503001"] | 36 |
"
],
"text/plain": [
"shape: (10, 3)\n",
"┌──────────┬────────────────┬───────┐\n",
"│ store_cd ┆ product_cd ┆ count │\n",
"│ --- ┆ --- ┆ --- │\n",
"│ str ┆ list[str] ┆ u32 │\n",
"╞══════════╪════════════════╪═══════╡\n",
"│ S12007 ┆ [\"P060303001\"] ┆ 72 │\n",
"│ S12013 ┆ [\"P060303001\"] ┆ 107 │\n",
"│ S12014 ┆ [\"P060303001\"] ┆ 65 │\n",
"│ S12029 ┆ [\"P060303001\"] ┆ 92 │\n",
"│ S12030 ┆ [\"P060303001\"] ┆ 115 │\n",
"│ S13001 ┆ [\"P060303001\"] ┆ 67 │\n",
"│ S13002 ┆ [\"P060303001\"] ┆ 78 │\n",
"│ S13003 ┆ [\"P071401001\"] ┆ 65 │\n",
"│ S13004 ┆ [\"P060303001\"] ┆ 88 │\n",
"│ S13005 ┆ [\"P040503001\"] ┆ 36 │\n",
"└──────────┴────────────────┴───────┘"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mode = pl.col.product_cd.mode()\n",
"(\n",
"df_receipt\n",
".group_by('store_cd')\n",
".agg(\n",
" mode,\n",
" (pl.col.product_cd == mode.first())\n",
" .sum()\n",
" .alias('count')\n",
")\n",
".sort(by='store_cd')\n",
".head(10)\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 48,
"id": "fb41be78-0bd7-443a-8867-d6dc342e9568",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"C:\\Users\\ruoyu\\AppData\\Local\\Temp\\ipykernel_25276\\4003027574.py:4: DeprecationWarning: `pl.count()` is deprecated. Please use `pl.len()` instead.\n",
" .agg(pl.count())\n"
]
},
{
"data": {
"text/html": [
"\n",
"
shape: (10, 3)store_cd | product_cd | count |
---|
str | str | u32 |
"S12007" | "P060303001" | 72 |
"S12013" | "P060303001" | 107 |
"S12014" | "P060303001" | 65 |
"S12029" | "P060303001" | 92 |
"S12030" | "P060303001" | 115 |
"S13001" | "P060303001" | 67 |
"S13002" | "P060303001" | 78 |
"S13003" | "P071401001" | 65 |
"S13004" | "P060303001" | 88 |
"S13005" | "P040503001" | 36 |
"
],
"text/plain": [
"shape: (10, 3)\n",
"┌──────────┬────────────┬───────┐\n",
"│ store_cd ┆ product_cd ┆ count │\n",
"│ --- ┆ --- ┆ --- │\n",
"│ str ┆ str ┆ u32 │\n",
"╞══════════╪════════════╪═══════╡\n",
"│ S12007 ┆ P060303001 ┆ 72 │\n",
"│ S12013 ┆ P060303001 ┆ 107 │\n",
"│ S12014 ┆ P060303001 ┆ 65 │\n",
"│ S12029 ┆ P060303001 ┆ 92 │\n",
"│ S12030 ┆ P060303001 ┆ 115 │\n",
"│ S13001 ┆ P060303001 ┆ 67 │\n",
"│ S13002 ┆ P060303001 ┆ 78 │\n",
"│ S13003 ┆ P071401001 ┆ 65 │\n",
"│ S13004 ┆ P060303001 ┆ 88 │\n",
"│ S13005 ┆ P040503001 ┆ 36 │\n",
"└──────────┴────────────┴───────┘"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".group_by('store_cd', 'product_cd')\n",
".agg(pl.count())\n",
".filter(\n",
" (pl.col.count == pl.col.count.max()).over('store_cd')\n",
")\n",
".sort(by='store_cd')\n",
".head(10)\n",
")"
]
},
{
"cell_type": "markdown",
"id": "d2faae52-32f9-41ca-bc62-75aed368bfb2",
"metadata": {},
"source": [
"## P-030\n",
"\n",
"レシート明細データ(df_receipt)に対し、店舗コード(store_cd)ごとに売上金額(amount)の分散を計算し、降順で5件表示せよ。"
]
},
{
"cell_type": "code",
"execution_count": 49,
"id": "9391f79b-c544-4fd2-8f48-f2ece51ffe9c",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (5, 2)store_cd | vars_amount |
---|
str | f64 |
"S13052" | 440088.701311 |
"S14011" | 306314.558164 |
"S14034" | 296920.081011 |
"S13001" | 295431.993329 |
"S13015" | 295294.361116 |
"
],
"text/plain": [
"shape: (5, 2)\n",
"┌──────────┬───────────────┐\n",
"│ store_cd ┆ vars_amount │\n",
"│ --- ┆ --- │\n",
"│ str ┆ f64 │\n",
"╞══════════╪═══════════════╡\n",
"│ S13052 ┆ 440088.701311 │\n",
"│ S14011 ┆ 306314.558164 │\n",
"│ S14034 ┆ 296920.081011 │\n",
"│ S13001 ┆ 295431.993329 │\n",
"│ S13015 ┆ 295294.361116 │\n",
"└──────────┴───────────────┘"
]
},
"execution_count": 49,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".group_by('store_cd')\n",
".agg(\n",
" vars_amount=pl.col.amount.var(ddof=0)\n",
")\n",
".sort(by='vars_amount', descending=True)\n",
".head()\n",
")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.2"
}
},
"nbformat": 4,
"nbformat_minor": 5
}