{ "cells": [ { "cell_type": "markdown", "id": "41fb9eda-8d3e-43e6-ac29-ed5bb545f0e3", "metadata": {}, "source": [ "# データサイエンス100本ノック 61~70" ] }, { "cell_type": "code", "execution_count": 3, "id": "bfa0c462-fa3a-443e-b011-1094b8e3d629", "metadata": {}, "outputs": [], "source": [ "import polars as pl\n", "from helper.polars import load_100knocks_data\n", "pl.Config.set_fmt_str_lengths(100)\n", "df_customer, df_category, df_product, df_receipt, df_store, df_geocode = load_100knocks_data()" ] }, { "cell_type": "markdown", "id": "cef2d8ef-7332-45c8-9dfb-ca942aee4fba", "metadata": {}, "source": [ "## P-061\n", "\n", "レシート明細データ(df_receipt)の売上金額(amount)を顧客ID(customer_id)ごとに合計し、売上金額合計を常用対数化(底10)して顧客ID、売上金額合計とともに10件表示せよ。ただし、顧客IDが\"Z\"から始まるのものは非会員を表すため、除外して計算すること。" ] }, { "cell_type": "code", "execution_count": 4, "id": "f141f3df-e59a-46f1-a9b1-ab2134504c15", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "shape: (5, 3)
customer_idsum_amountlog_amount
stri64f64
"CS025515000057"80133.903822
"CS013414000081"53433.727826
"CS010315000024"2022.306425
"CS001515000463"7842.894593
"CS007512000029"3282.516535
" ], "text/plain": [ "shape: (5, 3)\n", "┌────────────────┬────────────┬────────────┐\n", "│ customer_id ┆ sum_amount ┆ log_amount │\n", "│ --- ┆ --- ┆ --- │\n", "│ str ┆ i64 ┆ f64 │\n", "╞════════════════╪════════════╪════════════╡\n", "│ CS025515000057 ┆ 8013 ┆ 3.903822 │\n", "│ CS013414000081 ┆ 5343 ┆ 3.727826 │\n", "│ CS010315000024 ┆ 202 ┆ 2.306425 │\n", "│ CS001515000463 ┆ 784 ┆ 2.894593 │\n", "│ CS007512000029 ┆ 328 ┆ 2.516535 │\n", "└────────────────┴────────────┴────────────┘" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(\n", "df_receipt\n", ".group_by('customer_id')\n", ".agg(\n", " pl.col.amount.sum().alias('sum_amount'),\n", " (pl.col.amount.sum() + 0.5).log10().alias('log_amount')\n", ")\n", ".head()\n", ")" ] }, { "cell_type": "markdown", "id": "1ae58f1a-d1c5-49f7-a03a-2bfa3d07ec5f", "metadata": {}, "source": [ "## P-062\n", "\n", "レシート明細データ(df_receipt)の売上金額(amount)を顧客ID(customer_id)ごとに合計し、売上金額合計を自然対数化(底e)して顧客ID、売上金額合計とともに10件表示せよ。ただし、顧客IDが\"Z\"から始まるのものは非会員を表すため、除外して計算すること。" ] }, { "cell_type": "code", "execution_count": 5, "id": "e09cefaa-c74c-415c-882c-dc9aed82fbcd", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "shape: (5, 3)
customer_idsum_amountlog_amount
stri64f64
"CS037615000085"2025.31074
"CS030212000045"2085.339939
"CS028415000215"40628.309554
"CS005214000025"40288.301149
"CS034513000213"29087.975393
" ], "text/plain": [ "shape: (5, 3)\n", "┌────────────────┬────────────┬────────────┐\n", "│ customer_id ┆ sum_amount ┆ log_amount │\n", "│ --- ┆ --- ┆ --- │\n", "│ str ┆ i64 ┆ f64 │\n", "╞════════════════╪════════════╪════════════╡\n", "│ CS037615000085 ┆ 202 ┆ 5.31074 │\n", "│ CS030212000045 ┆ 208 ┆ 5.339939 │\n", "│ CS028415000215 ┆ 4062 ┆ 8.309554 │\n", "│ CS005214000025 ┆ 4028 ┆ 8.301149 │\n", "│ CS034513000213 ┆ 2908 ┆ 7.975393 │\n", "└────────────────┴────────────┴────────────┘" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(\n", "df_receipt\n", ".group_by('customer_id')\n", ".agg(\n", " pl.col.amount.sum().alias('sum_amount'),\n", " (pl.col.amount.sum() + 0.5).log().alias('log_amount')\n", ")\n", ".head()\n", ")" ] }, { "cell_type": "markdown", "id": "e4dc441f-0537-450a-bc7f-498702fe1312", "metadata": {}, "source": [ "## P-063\n", "\n", "商品データ(df_product)の単価(unit_price)と原価(unit_cost)から各商品の利益額を算出し、結果を10件表示せよ。" ] }, { "cell_type": "code", "execution_count": 6, "id": "34adc762-2a34-46e1-8653-02299bc3636f", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "shape: (5, 4)
product_cdunit_priceunit_costunit_profit
stri64i64i64
"P040101001"19814949
"P040101002"21816454
"P040101003"23017357
"P040101004"24818662
"P040101005"26820167
" ], "text/plain": [ "shape: (5, 4)\n", "┌────────────┬────────────┬───────────┬─────────────┐\n", "│ product_cd ┆ unit_price ┆ unit_cost ┆ unit_profit │\n", "│ --- ┆ --- ┆ --- ┆ --- │\n", "│ str ┆ i64 ┆ i64 ┆ i64 │\n", "╞════════════╪════════════╪═══════════╪═════════════╡\n", "│ P040101001 ┆ 198 ┆ 149 ┆ 49 │\n", "│ P040101002 ┆ 218 ┆ 164 ┆ 54 │\n", "│ P040101003 ┆ 230 ┆ 173 ┆ 57 │\n", "│ P040101004 ┆ 248 ┆ 186 ┆ 62 │\n", "│ P040101005 ┆ 268 ┆ 201 ┆ 67 │\n", "└────────────┴────────────┴───────────┴─────────────┘" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(\n", "df_product\n", ".select(\n", " pl.col.product_cd,\n", " pl.col.unit_price,\n", " pl.col.unit_cost,\n", " (pl.col.unit_price - pl.col.unit_cost).alias('unit_profit')\n", ")\n", ".head()\n", ")" ] }, { "cell_type": "markdown", "id": "2cfeb849-0ca3-4555-97b3-45d38dde1f79", "metadata": {}, "source": [ "## P-064 \n", "\n", "商品データ(df_product)の単価(unit_price)と原価(unit_cost)から、各商品の利益率の全体平均を算出せよ。ただし、単価と原価には欠損が生じていることに注意せよ。" ] }, { "cell_type": "code", "execution_count": 7, "id": "3c26f9a7-8a2a-4a9b-9f3f-7f854407726b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.24911389885177007" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(\n", "df_product\n", ".select(\n", " ((pl.col.unit_price - pl.col.unit_cost) / pl.col.unit_price).mean()\n", ")\n", ".item()\n", ")" ] }, { "cell_type": "markdown", "id": "20a8f113-146f-4c76-9162-22d90584ef4e", "metadata": {}, "source": [ "## P-065\n", "\n", "商品データ(df_product)の各商品について、利益率が30%となる新たな単価を求めよ。ただし、1円未満は切り捨てること。そして結果を10件表示させ、利益率がおよそ30%付近であることを確認せよ。ただし、単価(unit_price)と原価(unit_cost)には欠損が生じていることに注意せよ。" ] }, { "cell_type": "code", "execution_count": 8, "id": "2f988e75-e48b-4b43-9c85-555b6bf8436a", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "shape: (10, 5)
product_cdunit_priceunit_costnew_pricenew_profit_rate
stri64i64i32f64
"P040101001"1981492120.29717
"P040101002"2181642340.299145
"P040101003"2301732470.299595
"P040101004"2481862650.298113
"P040101005"2682012870.299652
"P040101006"2982243200.3
"P040101007"3382543620.298343
"P040101008"4203154500.3
"P040101009"4983745340.299625
"P040101010"5804356210.299517
" ], "text/plain": [ "shape: (10, 5)\n", "┌────────────┬────────────┬───────────┬───────────┬─────────────────┐\n", "│ product_cd ┆ unit_price ┆ unit_cost ┆ new_price ┆ new_profit_rate │\n", "│ --- ┆ --- ┆ --- ┆ --- ┆ --- │\n", "│ str ┆ i64 ┆ i64 ┆ i32 ┆ f64 │\n", "╞════════════╪════════════╪═══════════╪═══════════╪═════════════════╡\n", "│ P040101001 ┆ 198 ┆ 149 ┆ 212 ┆ 0.29717 │\n", "│ P040101002 ┆ 218 ┆ 164 ┆ 234 ┆ 0.299145 │\n", "│ P040101003 ┆ 230 ┆ 173 ┆ 247 ┆ 0.299595 │\n", "│ P040101004 ┆ 248 ┆ 186 ┆ 265 ┆ 0.298113 │\n", "│ P040101005 ┆ 268 ┆ 201 ┆ 287 ┆ 0.299652 │\n", "│ P040101006 ┆ 298 ┆ 224 ┆ 320 ┆ 0.3 │\n", "│ P040101007 ┆ 338 ┆ 254 ┆ 362 ┆ 0.298343 │\n", "│ P040101008 ┆ 420 ┆ 315 ┆ 450 ┆ 0.3 │\n", "│ P040101009 ┆ 498 ┆ 374 ┆ 534 ┆ 0.299625 │\n", "│ P040101010 ┆ 580 ┆ 435 ┆ 621 ┆ 0.299517 │\n", "└────────────┴────────────┴───────────┴───────────┴─────────────────┘" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(\n", "df_product\n", ".select(\n", " pl.col.product_cd,\n", " pl.col.unit_price,\n", " pl.col.unit_cost,\n", " (pl.col.unit_cost / 0.7).floor().cast(pl.Int32).alias('new_price')\n", ")\n", ".with_columns(\n", " new_profit_rate=(pl.col.new_price - pl.col.unit_cost) / pl.col.new_price\n", ")\n", ".head(10)\n", ")" ] }, { "cell_type": "markdown", "id": "c28cc267-2054-495f-8d19-e0381a05f8a4", "metadata": {}, "source": [ "## P-066\n", "\n", "商品データ(df_product)の各商品について、利益率が30%となる新たな単価を求めよ。今回は、1円未満を丸めること(四捨五入または偶数への丸めで良い)。そして結果を10件表示させ、利益率がおよそ30%付近であることを確認せよ。ただし、単価(unit_price)と原価(unit_cost)には欠損が生じていることに注意せよ。" ] }, { "cell_type": "code", "execution_count": 9, "id": "c91a4b3a-c8ff-4dfe-acb1-e7ede9586b9f", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "shape: (10, 5)
product_cdunit_priceunit_costnew_pricenew_profit_rate
stri64i64i32f64
"P040101001"1981492130.300469
"P040101002"2181642340.299145
"P040101003"2301732470.299595
"P040101004"2481862660.300752
"P040101005"2682012870.299652
"P040101006"2982243200.3
"P040101007"3382543630.300275
"P040101008"4203154500.3
"P040101009"4983745340.299625
"P040101010"5804356210.299517
" ], "text/plain": [ "shape: (10, 5)\n", "┌────────────┬────────────┬───────────┬───────────┬─────────────────┐\n", "│ product_cd ┆ unit_price ┆ unit_cost ┆ new_price ┆ new_profit_rate │\n", "│ --- ┆ --- ┆ --- ┆ --- ┆ --- │\n", "│ str ┆ i64 ┆ i64 ┆ i32 ┆ f64 │\n", "╞════════════╪════════════╪═══════════╪═══════════╪═════════════════╡\n", "│ P040101001 ┆ 198 ┆ 149 ┆ 213 ┆ 0.300469 │\n", "│ P040101002 ┆ 218 ┆ 164 ┆ 234 ┆ 0.299145 │\n", "│ P040101003 ┆ 230 ┆ 173 ┆ 247 ┆ 0.299595 │\n", "│ P040101004 ┆ 248 ┆ 186 ┆ 266 ┆ 0.300752 │\n", "│ P040101005 ┆ 268 ┆ 201 ┆ 287 ┆ 0.299652 │\n", "│ P040101006 ┆ 298 ┆ 224 ┆ 320 ┆ 0.3 │\n", "│ P040101007 ┆ 338 ┆ 254 ┆ 363 ┆ 0.300275 │\n", "│ P040101008 ┆ 420 ┆ 315 ┆ 450 ┆ 0.3 │\n", "│ P040101009 ┆ 498 ┆ 374 ┆ 534 ┆ 0.299625 │\n", "│ P040101010 ┆ 580 ┆ 435 ┆ 621 ┆ 0.299517 │\n", "└────────────┴────────────┴───────────┴───────────┴─────────────────┘" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(\n", "df_product\n", ".select(\n", " pl.col.product_cd,\n", " pl.col.unit_price,\n", " pl.col.unit_cost,\n", " (pl.col.unit_cost / 0.7)\n", " .round()\n", " .cast(pl.Int32)\n", " .alias('new_price')\n", ")\n", ".with_columns(\n", " new_profit_rate=(pl.col.new_price - pl.col.unit_cost) / pl.col.new_price\n", ")\n", ".head(10)\n", ")" ] }, { "cell_type": "markdown", "id": "146499a6-daae-4bdc-bc25-4b24720d903c", "metadata": {}, "source": [ "## P-067\n", "\n", "商品データ(df_product)の各商品について、利益率が30%となる新たな単価を求めよ。今回は、1円未満を切り上げること。そして結果を10件表示させ、利益率がおよそ30%付近であることを確認せよ。ただし、単価(unit_price)と原価(unit_cost)には欠損が生じていることに注意せよ。" ] }, { "cell_type": "code", "execution_count": 10, "id": "02ef7918-2cc1-42aa-8684-b0d0820b38f8", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "shape: (10, 5)
product_cdunit_priceunit_costnew_pricenew_profit_rate
stri64i64i32f64
"P040101001"1981492130.300469
"P040101002"2181642350.302128
"P040101003"2301732480.302419
"P040101004"2481862660.300752
"P040101005"2682012880.302083
"P040101006"2982243200.3
"P040101007"3382543630.300275
"P040101008"4203154500.3
"P040101009"4983745350.300935
"P040101010"5804356220.300643
" ], "text/plain": [ "shape: (10, 5)\n", "┌────────────┬────────────┬───────────┬───────────┬─────────────────┐\n", "│ product_cd ┆ unit_price ┆ unit_cost ┆ new_price ┆ new_profit_rate │\n", "│ --- ┆ --- ┆ --- ┆ --- ┆ --- │\n", "│ str ┆ i64 ┆ i64 ┆ i32 ┆ f64 │\n", "╞════════════╪════════════╪═══════════╪═══════════╪═════════════════╡\n", "│ P040101001 ┆ 198 ┆ 149 ┆ 213 ┆ 0.300469 │\n", "│ P040101002 ┆ 218 ┆ 164 ┆ 235 ┆ 0.302128 │\n", "│ P040101003 ┆ 230 ┆ 173 ┆ 248 ┆ 0.302419 │\n", "│ P040101004 ┆ 248 ┆ 186 ┆ 266 ┆ 0.300752 │\n", "│ P040101005 ┆ 268 ┆ 201 ┆ 288 ┆ 0.302083 │\n", "│ P040101006 ┆ 298 ┆ 224 ┆ 320 ┆ 0.3 │\n", "│ P040101007 ┆ 338 ┆ 254 ┆ 363 ┆ 0.300275 │\n", "│ P040101008 ┆ 420 ┆ 315 ┆ 450 ┆ 0.3 │\n", "│ P040101009 ┆ 498 ┆ 374 ┆ 535 ┆ 0.300935 │\n", "│ P040101010 ┆ 580 ┆ 435 ┆ 622 ┆ 0.300643 │\n", "└────────────┴────────────┴───────────┴───────────┴─────────────────┘" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(\n", "df_product\n", ".select(\n", " pl.col.product_cd,\n", " pl.col.unit_price,\n", " pl.col.unit_cost,\n", " (pl.col.unit_cost / 0.7)\n", " .ceil()\n", " .cast(pl.Int32)\n", " .alias('new_price')\n", ")\n", ".with_columns(\n", " new_profit_rate=(pl.col.new_price - pl.col.unit_cost) / pl.col.new_price\n", ")\n", ".head(10)\n", ")" ] }, { "cell_type": "markdown", "id": "3427ba24-66a0-4b8c-af8b-48ad033f9e8e", "metadata": {}, "source": [ "## P-068\n", "\n", "商品データ(df_product)の各商品について、消費税率10%の税込み金額を求めよ。1円未満の端数は切り捨てとし、結果を10件表示せよ。ただし、単価(unit_price)には欠損が生じていることに注意せよ。" ] }, { "cell_type": "code", "execution_count": 11, "id": "d30b8ab7-fe83-4407-9bd7-b1d66d499278", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "shape: (10, 3)
product_cdunit_pricetex_price
stri64i32
"P040101001"198217
"P040101002"218239
"P040101003"230253
"P040101004"248272
"P040101005"268294
"P040101006"298327
"P040101007"338371
"P040101008"420462
"P040101009"498547
"P040101010"580638
" ], "text/plain": [ "shape: (10, 3)\n", "┌────────────┬────────────┬───────────┐\n", "│ product_cd ┆ unit_price ┆ tex_price │\n", "│ --- ┆ --- ┆ --- │\n", "│ str ┆ i64 ┆ i32 │\n", "╞════════════╪════════════╪═══════════╡\n", "│ P040101001 ┆ 198 ┆ 217 │\n", "│ P040101002 ┆ 218 ┆ 239 │\n", "│ P040101003 ┆ 230 ┆ 253 │\n", "│ P040101004 ┆ 248 ┆ 272 │\n", "│ P040101005 ┆ 268 ┆ 294 │\n", "│ P040101006 ┆ 298 ┆ 327 │\n", "│ P040101007 ┆ 338 ┆ 371 │\n", "│ P040101008 ┆ 420 ┆ 462 │\n", "│ P040101009 ┆ 498 ┆ 547 │\n", "│ P040101010 ┆ 580 ┆ 638 │\n", "└────────────┴────────────┴───────────┘" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(\n", "df_product\n", ".select(\n", " pl.col.product_cd,\n", " pl.col.unit_price,\n", " (pl.col.unit_price * 1.1)\n", " .floor()\n", " .cast(pl.Int32)\n", " .alias('tex_price')\n", ")\n", ".head(10)\n", ")" ] }, { "cell_type": "markdown", "id": "140440e5-49cd-4437-a37e-f0ac6749115f", "metadata": {}, "source": [ "## P-069\n", "\n", "レシート明細データ(df_receipt)と商品データ(df_product)を結合し、顧客毎に全商品の売上金額合計と、カテゴリ大区分コード(category_major_cd)が\"07\"(瓶詰缶詰)の売上金額合計を計算の上、両者の比率を求めよ。抽出対象はカテゴリ大区分コード\"07\"(瓶詰缶詰)の売上実績がある顧客のみとし、結果を10件表示せよ。" ] }, { "cell_type": "code", "execution_count": 12, "id": "cc266504-60fb-4173-9d76-2ca38c13dcb7", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "shape: (10, 4)
customer_idsum_allsum_07sales_rate
stri64i64f64
"CS006214000001"736447130.640005
"CS008415000097"189513370.705541
"CS028414000014"622237010.594825
"ZZ000000000000"1239500369430090.560146
"CS025415000050"573635360.616457
"CS003515000195"541245810.846452
"CS024514000042"53300.0
"CS040415000178"614948020.78094
"CS027514000015"278814750.529053
"CS025415000134"490234300.699714
" ], "text/plain": [ "shape: (10, 4)\n", "┌────────────────┬──────────┬─────────┬────────────┐\n", "│ customer_id ┆ sum_all ┆ sum_07 ┆ sales_rate │\n", "│ --- ┆ --- ┆ --- ┆ --- │\n", "│ str ┆ i64 ┆ i64 ┆ f64 │\n", "╞════════════════╪══════════╪═════════╪════════════╡\n", "│ CS006214000001 ┆ 7364 ┆ 4713 ┆ 0.640005 │\n", "│ CS008415000097 ┆ 1895 ┆ 1337 ┆ 0.705541 │\n", "│ CS028414000014 ┆ 6222 ┆ 3701 ┆ 0.594825 │\n", "│ ZZ000000000000 ┆ 12395003 ┆ 6943009 ┆ 0.560146 │\n", "│ CS025415000050 ┆ 5736 ┆ 3536 ┆ 0.616457 │\n", "│ CS003515000195 ┆ 5412 ┆ 4581 ┆ 0.846452 │\n", "│ CS024514000042 ┆ 533 ┆ 0 ┆ 0.0 │\n", "│ CS040415000178 ┆ 6149 ┆ 4802 ┆ 0.78094 │\n", "│ CS027514000015 ┆ 2788 ┆ 1475 ┆ 0.529053 │\n", "│ CS025415000134 ┆ 4902 ┆ 3430 ┆ 0.699714 │\n", "└────────────────┴──────────┴─────────┴────────────┘" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(\n", "df_receipt\n", ".join(df_product, on='product_cd', how='left')\n", ".group_by('customer_id', maintain_order=True)\n", ".agg(\n", " sum_all=pl.col.amount.sum(),\n", " sum_07=pl.col.amount\n", " .filter(pl.col.category_major_cd == '07')\n", " .sum()\n", ")\n", ".with_columns(\n", " sales_rate=pl.col.sum_07 / pl.col.sum_all\n", ")\n", ".head(10)\n", ")" ] }, { "cell_type": "markdown", "id": "514d4b6e-e29b-46de-8ac4-e85425c88dcf", "metadata": {}, "source": [ "## P-070\n", "\n", "レシート明細データ(df_receipt)の売上日(sales_ymd)に対し、顧客データ(df_customer)の会員申込日(application_date)からの経過日数を計算し、顧客ID(customer_id)、売上日、会員申込日とともに10件表示せよ(sales_ymdは数値、application_dateは文字列でデータを保持している点に注意)。" ] }, { "cell_type": "code", "execution_count": 13, "id": "d40a2bf3-6d0b-4877-938a-584b7bc2245d", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "shape: (5, 4)
customer_idsales_ymdapplication_dateelapsed_days
strdatedatei64
"CS020414000112"2019-02-042015-04-271379
"CS021515000011"2017-03-122015-06-29622
"CS014514000085"2017-01-212015-06-06595
"CS014415000088"2017-10-092015-12-21658
"CS008514000055"2018-12-032015-02-121390
" ], "text/plain": [ "shape: (5, 4)\n", "┌────────────────┬────────────┬──────────────────┬──────────────┐\n", "│ customer_id ┆ sales_ymd ┆ application_date ┆ elapsed_days │\n", "│ --- ┆ --- ┆ --- ┆ --- │\n", "│ str ┆ date ┆ date ┆ i64 │\n", "╞════════════════╪════════════╪══════════════════╪══════════════╡\n", "│ CS020414000112 ┆ 2019-02-04 ┆ 2015-04-27 ┆ 1379 │\n", "│ CS021515000011 ┆ 2017-03-12 ┆ 2015-06-29 ┆ 622 │\n", "│ CS014514000085 ┆ 2017-01-21 ┆ 2015-06-06 ┆ 595 │\n", "│ CS014415000088 ┆ 2017-10-09 ┆ 2015-12-21 ┆ 658 │\n", "│ CS008514000055 ┆ 2018-12-03 ┆ 2015-02-12 ┆ 1390 │\n", "└────────────────┴────────────┴──────────────────┴──────────────┘" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(\n", "df_receipt\n", ".select('customer_id', 'sales_ymd')\n", ".unique()\n", ".join(df_customer, on='customer_id')\n", ".select(\n", " pl.col.customer_id,\n", " pl.col.sales_ymd.cast(str).str.strptime(pl.Date, '%Y%m%d'),\n", " pl.col.application_date.cast(str).str.strptime(pl.Date, '%Y%m%d')\n", ")\n", ".with_columns(\n", " elapsed_days=(pl.col.sales_ymd - pl.col.application_date).dt.total_days()\n", ")\n", ".head()\n", ")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.8" } }, "nbformat": 4, "nbformat_minor": 5 }