{
"cells": [
{
"cell_type": "markdown",
"id": "41fb9eda-8d3e-43e6-ac29-ed5bb545f0e3",
"metadata": {},
"source": [
"# データサイエンス100本ノック 61~70"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "bfa0c462-fa3a-443e-b011-1094b8e3d629",
"metadata": {},
"outputs": [],
"source": [
"import polars as pl\n",
"from helper.polars import load_100knocks_data\n",
"pl.Config.set_fmt_str_lengths(100)\n",
"df_customer, df_category, df_product, df_receipt, df_store, df_geocode = load_100knocks_data()"
]
},
{
"cell_type": "markdown",
"id": "cef2d8ef-7332-45c8-9dfb-ca942aee4fba",
"metadata": {},
"source": [
"## P-061\n",
"\n",
"レシート明細データ(df_receipt)の売上金額(amount)を顧客ID(customer_id)ごとに合計し、売上金額合計を常用対数化(底10)して顧客ID、売上金額合計とともに10件表示せよ。ただし、顧客IDが\"Z\"から始まるのものは非会員を表すため、除外して計算すること。"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "f141f3df-e59a-46f1-a9b1-ab2134504c15",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"
shape: (5, 3)customer_id | sum_amount | log_amount |
---|
str | i64 | f64 |
"CS025515000057" | 8013 | 3.903822 |
"CS013414000081" | 5343 | 3.727826 |
"CS010315000024" | 202 | 2.306425 |
"CS001515000463" | 784 | 2.894593 |
"CS007512000029" | 328 | 2.516535 |
"
],
"text/plain": [
"shape: (5, 3)\n",
"┌────────────────┬────────────┬────────────┐\n",
"│ customer_id ┆ sum_amount ┆ log_amount │\n",
"│ --- ┆ --- ┆ --- │\n",
"│ str ┆ i64 ┆ f64 │\n",
"╞════════════════╪════════════╪════════════╡\n",
"│ CS025515000057 ┆ 8013 ┆ 3.903822 │\n",
"│ CS013414000081 ┆ 5343 ┆ 3.727826 │\n",
"│ CS010315000024 ┆ 202 ┆ 2.306425 │\n",
"│ CS001515000463 ┆ 784 ┆ 2.894593 │\n",
"│ CS007512000029 ┆ 328 ┆ 2.516535 │\n",
"└────────────────┴────────────┴────────────┘"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".group_by('customer_id')\n",
".agg(\n",
" pl.col.amount.sum().alias('sum_amount'),\n",
" (pl.col.amount.sum() + 0.5).log10().alias('log_amount')\n",
")\n",
".head()\n",
")"
]
},
{
"cell_type": "markdown",
"id": "1ae58f1a-d1c5-49f7-a03a-2bfa3d07ec5f",
"metadata": {},
"source": [
"## P-062\n",
"\n",
"レシート明細データ(df_receipt)の売上金額(amount)を顧客ID(customer_id)ごとに合計し、売上金額合計を自然対数化(底e)して顧客ID、売上金額合計とともに10件表示せよ。ただし、顧客IDが\"Z\"から始まるのものは非会員を表すため、除外して計算すること。"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "e09cefaa-c74c-415c-882c-dc9aed82fbcd",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (5, 3)customer_id | sum_amount | log_amount |
---|
str | i64 | f64 |
"CS037615000085" | 202 | 5.31074 |
"CS030212000045" | 208 | 5.339939 |
"CS028415000215" | 4062 | 8.309554 |
"CS005214000025" | 4028 | 8.301149 |
"CS034513000213" | 2908 | 7.975393 |
"
],
"text/plain": [
"shape: (5, 3)\n",
"┌────────────────┬────────────┬────────────┐\n",
"│ customer_id ┆ sum_amount ┆ log_amount │\n",
"│ --- ┆ --- ┆ --- │\n",
"│ str ┆ i64 ┆ f64 │\n",
"╞════════════════╪════════════╪════════════╡\n",
"│ CS037615000085 ┆ 202 ┆ 5.31074 │\n",
"│ CS030212000045 ┆ 208 ┆ 5.339939 │\n",
"│ CS028415000215 ┆ 4062 ┆ 8.309554 │\n",
"│ CS005214000025 ┆ 4028 ┆ 8.301149 │\n",
"│ CS034513000213 ┆ 2908 ┆ 7.975393 │\n",
"└────────────────┴────────────┴────────────┘"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".group_by('customer_id')\n",
".agg(\n",
" pl.col.amount.sum().alias('sum_amount'),\n",
" (pl.col.amount.sum() + 0.5).log().alias('log_amount')\n",
")\n",
".head()\n",
")"
]
},
{
"cell_type": "markdown",
"id": "e4dc441f-0537-450a-bc7f-498702fe1312",
"metadata": {},
"source": [
"## P-063\n",
"\n",
"商品データ(df_product)の単価(unit_price)と原価(unit_cost)から各商品の利益額を算出し、結果を10件表示せよ。"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "34adc762-2a34-46e1-8653-02299bc3636f",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (5, 4)product_cd | unit_price | unit_cost | unit_profit |
---|
str | i64 | i64 | i64 |
"P040101001" | 198 | 149 | 49 |
"P040101002" | 218 | 164 | 54 |
"P040101003" | 230 | 173 | 57 |
"P040101004" | 248 | 186 | 62 |
"P040101005" | 268 | 201 | 67 |
"
],
"text/plain": [
"shape: (5, 4)\n",
"┌────────────┬────────────┬───────────┬─────────────┐\n",
"│ product_cd ┆ unit_price ┆ unit_cost ┆ unit_profit │\n",
"│ --- ┆ --- ┆ --- ┆ --- │\n",
"│ str ┆ i64 ┆ i64 ┆ i64 │\n",
"╞════════════╪════════════╪═══════════╪═════════════╡\n",
"│ P040101001 ┆ 198 ┆ 149 ┆ 49 │\n",
"│ P040101002 ┆ 218 ┆ 164 ┆ 54 │\n",
"│ P040101003 ┆ 230 ┆ 173 ┆ 57 │\n",
"│ P040101004 ┆ 248 ┆ 186 ┆ 62 │\n",
"│ P040101005 ┆ 268 ┆ 201 ┆ 67 │\n",
"└────────────┴────────────┴───────────┴─────────────┘"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_product\n",
".select(\n",
" pl.col.product_cd,\n",
" pl.col.unit_price,\n",
" pl.col.unit_cost,\n",
" (pl.col.unit_price - pl.col.unit_cost).alias('unit_profit')\n",
")\n",
".head()\n",
")"
]
},
{
"cell_type": "markdown",
"id": "2cfeb849-0ca3-4555-97b3-45d38dde1f79",
"metadata": {},
"source": [
"## P-064 \n",
"\n",
"商品データ(df_product)の単価(unit_price)と原価(unit_cost)から、各商品の利益率の全体平均を算出せよ。ただし、単価と原価には欠損が生じていることに注意せよ。"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "3c26f9a7-8a2a-4a9b-9f3f-7f854407726b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.24911389885177007"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_product\n",
".select(\n",
" ((pl.col.unit_price - pl.col.unit_cost) / pl.col.unit_price).mean()\n",
")\n",
".item()\n",
")"
]
},
{
"cell_type": "markdown",
"id": "20a8f113-146f-4c76-9162-22d90584ef4e",
"metadata": {},
"source": [
"## P-065\n",
"\n",
"商品データ(df_product)の各商品について、利益率が30%となる新たな単価を求めよ。ただし、1円未満は切り捨てること。そして結果を10件表示させ、利益率がおよそ30%付近であることを確認せよ。ただし、単価(unit_price)と原価(unit_cost)には欠損が生じていることに注意せよ。"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "2f988e75-e48b-4b43-9c85-555b6bf8436a",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (10, 5)product_cd | unit_price | unit_cost | new_price | new_profit_rate |
---|
str | i64 | i64 | i32 | f64 |
"P040101001" | 198 | 149 | 212 | 0.29717 |
"P040101002" | 218 | 164 | 234 | 0.299145 |
"P040101003" | 230 | 173 | 247 | 0.299595 |
"P040101004" | 248 | 186 | 265 | 0.298113 |
"P040101005" | 268 | 201 | 287 | 0.299652 |
"P040101006" | 298 | 224 | 320 | 0.3 |
"P040101007" | 338 | 254 | 362 | 0.298343 |
"P040101008" | 420 | 315 | 450 | 0.3 |
"P040101009" | 498 | 374 | 534 | 0.299625 |
"P040101010" | 580 | 435 | 621 | 0.299517 |
"
],
"text/plain": [
"shape: (10, 5)\n",
"┌────────────┬────────────┬───────────┬───────────┬─────────────────┐\n",
"│ product_cd ┆ unit_price ┆ unit_cost ┆ new_price ┆ new_profit_rate │\n",
"│ --- ┆ --- ┆ --- ┆ --- ┆ --- │\n",
"│ str ┆ i64 ┆ i64 ┆ i32 ┆ f64 │\n",
"╞════════════╪════════════╪═══════════╪═══════════╪═════════════════╡\n",
"│ P040101001 ┆ 198 ┆ 149 ┆ 212 ┆ 0.29717 │\n",
"│ P040101002 ┆ 218 ┆ 164 ┆ 234 ┆ 0.299145 │\n",
"│ P040101003 ┆ 230 ┆ 173 ┆ 247 ┆ 0.299595 │\n",
"│ P040101004 ┆ 248 ┆ 186 ┆ 265 ┆ 0.298113 │\n",
"│ P040101005 ┆ 268 ┆ 201 ┆ 287 ┆ 0.299652 │\n",
"│ P040101006 ┆ 298 ┆ 224 ┆ 320 ┆ 0.3 │\n",
"│ P040101007 ┆ 338 ┆ 254 ┆ 362 ┆ 0.298343 │\n",
"│ P040101008 ┆ 420 ┆ 315 ┆ 450 ┆ 0.3 │\n",
"│ P040101009 ┆ 498 ┆ 374 ┆ 534 ┆ 0.299625 │\n",
"│ P040101010 ┆ 580 ┆ 435 ┆ 621 ┆ 0.299517 │\n",
"└────────────┴────────────┴───────────┴───────────┴─────────────────┘"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_product\n",
".select(\n",
" pl.col.product_cd,\n",
" pl.col.unit_price,\n",
" pl.col.unit_cost,\n",
" (pl.col.unit_cost / 0.7).floor().cast(pl.Int32).alias('new_price')\n",
")\n",
".with_columns(\n",
" new_profit_rate=(pl.col.new_price - pl.col.unit_cost) / pl.col.new_price\n",
")\n",
".head(10)\n",
")"
]
},
{
"cell_type": "markdown",
"id": "c28cc267-2054-495f-8d19-e0381a05f8a4",
"metadata": {},
"source": [
"## P-066\n",
"\n",
"商品データ(df_product)の各商品について、利益率が30%となる新たな単価を求めよ。今回は、1円未満を丸めること(四捨五入または偶数への丸めで良い)。そして結果を10件表示させ、利益率がおよそ30%付近であることを確認せよ。ただし、単価(unit_price)と原価(unit_cost)には欠損が生じていることに注意せよ。"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "c91a4b3a-c8ff-4dfe-acb1-e7ede9586b9f",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (10, 5)product_cd | unit_price | unit_cost | new_price | new_profit_rate |
---|
str | i64 | i64 | i32 | f64 |
"P040101001" | 198 | 149 | 213 | 0.300469 |
"P040101002" | 218 | 164 | 234 | 0.299145 |
"P040101003" | 230 | 173 | 247 | 0.299595 |
"P040101004" | 248 | 186 | 266 | 0.300752 |
"P040101005" | 268 | 201 | 287 | 0.299652 |
"P040101006" | 298 | 224 | 320 | 0.3 |
"P040101007" | 338 | 254 | 363 | 0.300275 |
"P040101008" | 420 | 315 | 450 | 0.3 |
"P040101009" | 498 | 374 | 534 | 0.299625 |
"P040101010" | 580 | 435 | 621 | 0.299517 |
"
],
"text/plain": [
"shape: (10, 5)\n",
"┌────────────┬────────────┬───────────┬───────────┬─────────────────┐\n",
"│ product_cd ┆ unit_price ┆ unit_cost ┆ new_price ┆ new_profit_rate │\n",
"│ --- ┆ --- ┆ --- ┆ --- ┆ --- │\n",
"│ str ┆ i64 ┆ i64 ┆ i32 ┆ f64 │\n",
"╞════════════╪════════════╪═══════════╪═══════════╪═════════════════╡\n",
"│ P040101001 ┆ 198 ┆ 149 ┆ 213 ┆ 0.300469 │\n",
"│ P040101002 ┆ 218 ┆ 164 ┆ 234 ┆ 0.299145 │\n",
"│ P040101003 ┆ 230 ┆ 173 ┆ 247 ┆ 0.299595 │\n",
"│ P040101004 ┆ 248 ┆ 186 ┆ 266 ┆ 0.300752 │\n",
"│ P040101005 ┆ 268 ┆ 201 ┆ 287 ┆ 0.299652 │\n",
"│ P040101006 ┆ 298 ┆ 224 ┆ 320 ┆ 0.3 │\n",
"│ P040101007 ┆ 338 ┆ 254 ┆ 363 ┆ 0.300275 │\n",
"│ P040101008 ┆ 420 ┆ 315 ┆ 450 ┆ 0.3 │\n",
"│ P040101009 ┆ 498 ┆ 374 ┆ 534 ┆ 0.299625 │\n",
"│ P040101010 ┆ 580 ┆ 435 ┆ 621 ┆ 0.299517 │\n",
"└────────────┴────────────┴───────────┴───────────┴─────────────────┘"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_product\n",
".select(\n",
" pl.col.product_cd,\n",
" pl.col.unit_price,\n",
" pl.col.unit_cost,\n",
" (pl.col.unit_cost / 0.7)\n",
" .round()\n",
" .cast(pl.Int32)\n",
" .alias('new_price')\n",
")\n",
".with_columns(\n",
" new_profit_rate=(pl.col.new_price - pl.col.unit_cost) / pl.col.new_price\n",
")\n",
".head(10)\n",
")"
]
},
{
"cell_type": "markdown",
"id": "146499a6-daae-4bdc-bc25-4b24720d903c",
"metadata": {},
"source": [
"## P-067\n",
"\n",
"商品データ(df_product)の各商品について、利益率が30%となる新たな単価を求めよ。今回は、1円未満を切り上げること。そして結果を10件表示させ、利益率がおよそ30%付近であることを確認せよ。ただし、単価(unit_price)と原価(unit_cost)には欠損が生じていることに注意せよ。"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "02ef7918-2cc1-42aa-8684-b0d0820b38f8",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (10, 5)product_cd | unit_price | unit_cost | new_price | new_profit_rate |
---|
str | i64 | i64 | i32 | f64 |
"P040101001" | 198 | 149 | 213 | 0.300469 |
"P040101002" | 218 | 164 | 235 | 0.302128 |
"P040101003" | 230 | 173 | 248 | 0.302419 |
"P040101004" | 248 | 186 | 266 | 0.300752 |
"P040101005" | 268 | 201 | 288 | 0.302083 |
"P040101006" | 298 | 224 | 320 | 0.3 |
"P040101007" | 338 | 254 | 363 | 0.300275 |
"P040101008" | 420 | 315 | 450 | 0.3 |
"P040101009" | 498 | 374 | 535 | 0.300935 |
"P040101010" | 580 | 435 | 622 | 0.300643 |
"
],
"text/plain": [
"shape: (10, 5)\n",
"┌────────────┬────────────┬───────────┬───────────┬─────────────────┐\n",
"│ product_cd ┆ unit_price ┆ unit_cost ┆ new_price ┆ new_profit_rate │\n",
"│ --- ┆ --- ┆ --- ┆ --- ┆ --- │\n",
"│ str ┆ i64 ┆ i64 ┆ i32 ┆ f64 │\n",
"╞════════════╪════════════╪═══════════╪═══════════╪═════════════════╡\n",
"│ P040101001 ┆ 198 ┆ 149 ┆ 213 ┆ 0.300469 │\n",
"│ P040101002 ┆ 218 ┆ 164 ┆ 235 ┆ 0.302128 │\n",
"│ P040101003 ┆ 230 ┆ 173 ┆ 248 ┆ 0.302419 │\n",
"│ P040101004 ┆ 248 ┆ 186 ┆ 266 ┆ 0.300752 │\n",
"│ P040101005 ┆ 268 ┆ 201 ┆ 288 ┆ 0.302083 │\n",
"│ P040101006 ┆ 298 ┆ 224 ┆ 320 ┆ 0.3 │\n",
"│ P040101007 ┆ 338 ┆ 254 ┆ 363 ┆ 0.300275 │\n",
"│ P040101008 ┆ 420 ┆ 315 ┆ 450 ┆ 0.3 │\n",
"│ P040101009 ┆ 498 ┆ 374 ┆ 535 ┆ 0.300935 │\n",
"│ P040101010 ┆ 580 ┆ 435 ┆ 622 ┆ 0.300643 │\n",
"└────────────┴────────────┴───────────┴───────────┴─────────────────┘"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_product\n",
".select(\n",
" pl.col.product_cd,\n",
" pl.col.unit_price,\n",
" pl.col.unit_cost,\n",
" (pl.col.unit_cost / 0.7)\n",
" .ceil()\n",
" .cast(pl.Int32)\n",
" .alias('new_price')\n",
")\n",
".with_columns(\n",
" new_profit_rate=(pl.col.new_price - pl.col.unit_cost) / pl.col.new_price\n",
")\n",
".head(10)\n",
")"
]
},
{
"cell_type": "markdown",
"id": "3427ba24-66a0-4b8c-af8b-48ad033f9e8e",
"metadata": {},
"source": [
"## P-068\n",
"\n",
"商品データ(df_product)の各商品について、消費税率10%の税込み金額を求めよ。1円未満の端数は切り捨てとし、結果を10件表示せよ。ただし、単価(unit_price)には欠損が生じていることに注意せよ。"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "d30b8ab7-fe83-4407-9bd7-b1d66d499278",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (10, 3)product_cd | unit_price | tex_price |
---|
str | i64 | i32 |
"P040101001" | 198 | 217 |
"P040101002" | 218 | 239 |
"P040101003" | 230 | 253 |
"P040101004" | 248 | 272 |
"P040101005" | 268 | 294 |
"P040101006" | 298 | 327 |
"P040101007" | 338 | 371 |
"P040101008" | 420 | 462 |
"P040101009" | 498 | 547 |
"P040101010" | 580 | 638 |
"
],
"text/plain": [
"shape: (10, 3)\n",
"┌────────────┬────────────┬───────────┐\n",
"│ product_cd ┆ unit_price ┆ tex_price │\n",
"│ --- ┆ --- ┆ --- │\n",
"│ str ┆ i64 ┆ i32 │\n",
"╞════════════╪════════════╪═══════════╡\n",
"│ P040101001 ┆ 198 ┆ 217 │\n",
"│ P040101002 ┆ 218 ┆ 239 │\n",
"│ P040101003 ┆ 230 ┆ 253 │\n",
"│ P040101004 ┆ 248 ┆ 272 │\n",
"│ P040101005 ┆ 268 ┆ 294 │\n",
"│ P040101006 ┆ 298 ┆ 327 │\n",
"│ P040101007 ┆ 338 ┆ 371 │\n",
"│ P040101008 ┆ 420 ┆ 462 │\n",
"│ P040101009 ┆ 498 ┆ 547 │\n",
"│ P040101010 ┆ 580 ┆ 638 │\n",
"└────────────┴────────────┴───────────┘"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_product\n",
".select(\n",
" pl.col.product_cd,\n",
" pl.col.unit_price,\n",
" (pl.col.unit_price * 1.1)\n",
" .floor()\n",
" .cast(pl.Int32)\n",
" .alias('tex_price')\n",
")\n",
".head(10)\n",
")"
]
},
{
"cell_type": "markdown",
"id": "140440e5-49cd-4437-a37e-f0ac6749115f",
"metadata": {},
"source": [
"## P-069\n",
"\n",
"レシート明細データ(df_receipt)と商品データ(df_product)を結合し、顧客毎に全商品の売上金額合計と、カテゴリ大区分コード(category_major_cd)が\"07\"(瓶詰缶詰)の売上金額合計を計算の上、両者の比率を求めよ。抽出対象はカテゴリ大区分コード\"07\"(瓶詰缶詰)の売上実績がある顧客のみとし、結果を10件表示せよ。"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "cc266504-60fb-4173-9d76-2ca38c13dcb7",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (10, 4)customer_id | sum_all | sum_07 | sales_rate |
---|
str | i64 | i64 | f64 |
"CS006214000001" | 7364 | 4713 | 0.640005 |
"CS008415000097" | 1895 | 1337 | 0.705541 |
"CS028414000014" | 6222 | 3701 | 0.594825 |
"ZZ000000000000" | 12395003 | 6943009 | 0.560146 |
"CS025415000050" | 5736 | 3536 | 0.616457 |
"CS003515000195" | 5412 | 4581 | 0.846452 |
"CS024514000042" | 533 | 0 | 0.0 |
"CS040415000178" | 6149 | 4802 | 0.78094 |
"CS027514000015" | 2788 | 1475 | 0.529053 |
"CS025415000134" | 4902 | 3430 | 0.699714 |
"
],
"text/plain": [
"shape: (10, 4)\n",
"┌────────────────┬──────────┬─────────┬────────────┐\n",
"│ customer_id ┆ sum_all ┆ sum_07 ┆ sales_rate │\n",
"│ --- ┆ --- ┆ --- ┆ --- │\n",
"│ str ┆ i64 ┆ i64 ┆ f64 │\n",
"╞════════════════╪══════════╪═════════╪════════════╡\n",
"│ CS006214000001 ┆ 7364 ┆ 4713 ┆ 0.640005 │\n",
"│ CS008415000097 ┆ 1895 ┆ 1337 ┆ 0.705541 │\n",
"│ CS028414000014 ┆ 6222 ┆ 3701 ┆ 0.594825 │\n",
"│ ZZ000000000000 ┆ 12395003 ┆ 6943009 ┆ 0.560146 │\n",
"│ CS025415000050 ┆ 5736 ┆ 3536 ┆ 0.616457 │\n",
"│ CS003515000195 ┆ 5412 ┆ 4581 ┆ 0.846452 │\n",
"│ CS024514000042 ┆ 533 ┆ 0 ┆ 0.0 │\n",
"│ CS040415000178 ┆ 6149 ┆ 4802 ┆ 0.78094 │\n",
"│ CS027514000015 ┆ 2788 ┆ 1475 ┆ 0.529053 │\n",
"│ CS025415000134 ┆ 4902 ┆ 3430 ┆ 0.699714 │\n",
"└────────────────┴──────────┴─────────┴────────────┘"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".join(df_product, on='product_cd', how='left')\n",
".group_by('customer_id', maintain_order=True)\n",
".agg(\n",
" sum_all=pl.col.amount.sum(),\n",
" sum_07=pl.col.amount\n",
" .filter(pl.col.category_major_cd == '07')\n",
" .sum()\n",
")\n",
".with_columns(\n",
" sales_rate=pl.col.sum_07 / pl.col.sum_all\n",
")\n",
".head(10)\n",
")"
]
},
{
"cell_type": "markdown",
"id": "514d4b6e-e29b-46de-8ac4-e85425c88dcf",
"metadata": {},
"source": [
"## P-070\n",
"\n",
"レシート明細データ(df_receipt)の売上日(sales_ymd)に対し、顧客データ(df_customer)の会員申込日(application_date)からの経過日数を計算し、顧客ID(customer_id)、売上日、会員申込日とともに10件表示せよ(sales_ymdは数値、application_dateは文字列でデータを保持している点に注意)。"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "d40a2bf3-6d0b-4877-938a-584b7bc2245d",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (5, 4)customer_id | sales_ymd | application_date | elapsed_days |
---|
str | date | date | i64 |
"CS020414000112" | 2019-02-04 | 2015-04-27 | 1379 |
"CS021515000011" | 2017-03-12 | 2015-06-29 | 622 |
"CS014514000085" | 2017-01-21 | 2015-06-06 | 595 |
"CS014415000088" | 2017-10-09 | 2015-12-21 | 658 |
"CS008514000055" | 2018-12-03 | 2015-02-12 | 1390 |
"
],
"text/plain": [
"shape: (5, 4)\n",
"┌────────────────┬────────────┬──────────────────┬──────────────┐\n",
"│ customer_id ┆ sales_ymd ┆ application_date ┆ elapsed_days │\n",
"│ --- ┆ --- ┆ --- ┆ --- │\n",
"│ str ┆ date ┆ date ┆ i64 │\n",
"╞════════════════╪════════════╪══════════════════╪══════════════╡\n",
"│ CS020414000112 ┆ 2019-02-04 ┆ 2015-04-27 ┆ 1379 │\n",
"│ CS021515000011 ┆ 2017-03-12 ┆ 2015-06-29 ┆ 622 │\n",
"│ CS014514000085 ┆ 2017-01-21 ┆ 2015-06-06 ┆ 595 │\n",
"│ CS014415000088 ┆ 2017-10-09 ┆ 2015-12-21 ┆ 658 │\n",
"│ CS008514000055 ┆ 2018-12-03 ┆ 2015-02-12 ┆ 1390 │\n",
"└────────────────┴────────────┴──────────────────┴──────────────┘"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".select('customer_id', 'sales_ymd')\n",
".unique()\n",
".join(df_customer, on='customer_id')\n",
".select(\n",
" pl.col.customer_id,\n",
" pl.col.sales_ymd.cast(str).str.strptime(pl.Date, '%Y%m%d'),\n",
" pl.col.application_date.cast(str).str.strptime(pl.Date, '%Y%m%d')\n",
")\n",
".with_columns(\n",
" elapsed_days=(pl.col.sales_ymd - pl.col.application_date).dt.total_days()\n",
")\n",
".head()\n",
")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}