{
"cells": [
{
"cell_type": "markdown",
"id": "5e21c65a-cbb3-4a2b-9af2-616c22dc6476",
"metadata": {},
"source": [
"# データサイエンス100本ノック 01~10"
]
},
{
"cell_type": "markdown",
"id": "64ed2fa0-5457-4e30-82a3-066350aaf86e",
"metadata": {},
"source": [
"## はじめに"
]
},
{
"cell_type": "markdown",
"id": "4fdc1344-e7bb-4a2c-a2e3-584d8565060b",
"metadata": {},
"source": [
"[100knocks-preprocess](https://github.com/The-Japan-DataScientist-Society/100knocks-preprocess) からサンプルデータをダウンロードしてください。\n",
"\n",
"- 初めに以下のセルを実行してください。\n",
"- 必要なライブラリのインポートとデータベース(PostgreSQL)からのデータ読み込みを行います。\n",
"- pandas等、利用が想定されるライブラリは以下セルでインポートしています。\n",
"- その他利用したいライブラリがあれば適宜インストールしてください(\"!pip install ライブラリ名\"でインストールも可能)\n",
"- 処理は複数回に分けても構いません。\n",
"- 名前、住所等はダミーデータであり、実在するものではありません。"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "d00dfa5d-5ee3-4480-87d5-f162e26dcf50",
"metadata": {},
"outputs": [],
"source": [
"import polars as pl\n",
"from helper.polars import load_100knocks_data\n",
"pl.Config.set_fmt_str_lengths(100)\n",
"df_customer, df_category, df_product, df_receipt, df_store, df_geocode = load_100knocks_data()"
]
},
{
"cell_type": "markdown",
"id": "2d20b42e-adae-4714-960c-d55e6d717343",
"metadata": {},
"source": [
"## P-001\n",
"\n",
"レシート明細データ(df_receipt)から全項目の先頭10件を表示し、どのようなデータを保有しているか目視で確認せよ。"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "964e327a-ee93-42de-8b29-5ee5832fbe43",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"
shape: (5, 9)sales_ymd | sales_epoch | store_cd | receipt_no | receipt_sub_no | customer_id | product_cd | quantity | amount |
---|
i64 | i64 | str | i64 | i64 | str | str | i64 | i64 |
20181103 | 1541203200 | "S14006" | 112 | 1 | "CS006214000001" | "P070305012" | 1 | 158 |
20181118 | 1542499200 | "S13008" | 1132 | 2 | "CS008415000097" | "P070701017" | 1 | 81 |
20170712 | 1499817600 | "S14028" | 1102 | 1 | "CS028414000014" | "P060101005" | 1 | 170 |
20190205 | 1549324800 | "S14042" | 1132 | 1 | "ZZ000000000000" | "P050301001" | 1 | 25 |
20180821 | 1534809600 | "S14025" | 1102 | 2 | "CS025415000050" | "P060102007" | 1 | 90 |
"
],
"text/plain": [
"shape: (5, 9)\n",
"┌───────────┬─────────────┬──────────┬────────────┬───┬────────────────┬────────────┬──────────┬────────┐\n",
"│ sales_ymd ┆ sales_epoch ┆ store_cd ┆ receipt_no ┆ … ┆ customer_id ┆ product_cd ┆ quantity ┆ amount │\n",
"│ --- ┆ --- ┆ --- ┆ --- ┆ ┆ --- ┆ --- ┆ --- ┆ --- │\n",
"│ i64 ┆ i64 ┆ str ┆ i64 ┆ ┆ str ┆ str ┆ i64 ┆ i64 │\n",
"╞═══════════╪═════════════╪══════════╪════════════╪═══╪════════════════╪════════════╪══════════╪════════╡\n",
"│ 20181103 ┆ 1541203200 ┆ S14006 ┆ 112 ┆ … ┆ CS006214000001 ┆ P070305012 ┆ 1 ┆ 158 │\n",
"│ 20181118 ┆ 1542499200 ┆ S13008 ┆ 1132 ┆ … ┆ CS008415000097 ┆ P070701017 ┆ 1 ┆ 81 │\n",
"│ 20170712 ┆ 1499817600 ┆ S14028 ┆ 1102 ┆ … ┆ CS028414000014 ┆ P060101005 ┆ 1 ┆ 170 │\n",
"│ 20190205 ┆ 1549324800 ┆ S14042 ┆ 1132 ┆ … ┆ ZZ000000000000 ┆ P050301001 ┆ 1 ┆ 25 │\n",
"│ 20180821 ┆ 1534809600 ┆ S14025 ┆ 1102 ┆ … ┆ CS025415000050 ┆ P060102007 ┆ 1 ┆ 90 │\n",
"└───────────┴─────────────┴──────────┴────────────┴───┴────────────────┴────────────┴──────────┴────────┘"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_receipt.head(5)"
]
},
{
"cell_type": "markdown",
"id": "1daa8262-befc-4318-93fb-88799b6bcd0e",
"metadata": {},
"source": [
"## P-002\n",
"\n",
"レシート明細データ(df_receipt)から売上年月日(sales_ymd)、顧客ID(customer_id)、商品コード(product_cd)、売上金額(amount)の順に列を指定し、10件表示せよ。"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "9a8debbf-0b94-4589-ac9c-0fb76f4a67c4",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (5, 4)sales_ymd | customer_id | product_cd | amount |
---|
i64 | str | str | i64 |
20181103 | "CS006214000001" | "P070305012" | 158 |
20181118 | "CS008415000097" | "P070701017" | 81 |
20170712 | "CS028414000014" | "P060101005" | 170 |
20190205 | "ZZ000000000000" | "P050301001" | 25 |
20180821 | "CS025415000050" | "P060102007" | 90 |
"
],
"text/plain": [
"shape: (5, 4)\n",
"┌───────────┬────────────────┬────────────┬────────┐\n",
"│ sales_ymd ┆ customer_id ┆ product_cd ┆ amount │\n",
"│ --- ┆ --- ┆ --- ┆ --- │\n",
"│ i64 ┆ str ┆ str ┆ i64 │\n",
"╞═══════════╪════════════════╪════════════╪════════╡\n",
"│ 20181103 ┆ CS006214000001 ┆ P070305012 ┆ 158 │\n",
"│ 20181118 ┆ CS008415000097 ┆ P070701017 ┆ 81 │\n",
"│ 20170712 ┆ CS028414000014 ┆ P060101005 ┆ 170 │\n",
"│ 20190205 ┆ ZZ000000000000 ┆ P050301001 ┆ 25 │\n",
"│ 20180821 ┆ CS025415000050 ┆ P060102007 ┆ 90 │\n",
"└───────────┴────────────────┴────────────┴────────┘"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".select('sales_ymd', 'customer_id', 'product_cd', 'amount')\n",
".head(5)\n",
")"
]
},
{
"cell_type": "markdown",
"id": "a60cdad9-ecf8-4e1a-8b0f-4c99f817dd74",
"metadata": {},
"source": [
"## P-003\n",
"\n",
"レシート明細データ(df_receipt)から売上年月日(sales_ymd)、顧客ID(customer_id)、商品コード(product_cd)、売上金額(amount)の順に列を指定し、10件表示せよ。ただし、sales_ymdをsales_dateに項目名を変更して抽出すること。"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "ad332325-426b-4b00-9eaa-d84f2424296a",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (5, 4)sales_date | customer_id | product_cd | amount |
---|
i64 | str | str | i64 |
20181103 | "CS006214000001" | "P070305012" | 158 |
20181118 | "CS008415000097" | "P070701017" | 81 |
20170712 | "CS028414000014" | "P060101005" | 170 |
20190205 | "ZZ000000000000" | "P050301001" | 25 |
20180821 | "CS025415000050" | "P060102007" | 90 |
"
],
"text/plain": [
"shape: (5, 4)\n",
"┌────────────┬────────────────┬────────────┬────────┐\n",
"│ sales_date ┆ customer_id ┆ product_cd ┆ amount │\n",
"│ --- ┆ --- ┆ --- ┆ --- │\n",
"│ i64 ┆ str ┆ str ┆ i64 │\n",
"╞════════════╪════════════════╪════════════╪════════╡\n",
"│ 20181103 ┆ CS006214000001 ┆ P070305012 ┆ 158 │\n",
"│ 20181118 ┆ CS008415000097 ┆ P070701017 ┆ 81 │\n",
"│ 20170712 ┆ CS028414000014 ┆ P060101005 ┆ 170 │\n",
"│ 20190205 ┆ ZZ000000000000 ┆ P050301001 ┆ 25 │\n",
"│ 20180821 ┆ CS025415000050 ┆ P060102007 ┆ 90 │\n",
"└────────────┴────────────────┴────────────┴────────┘"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".select(\n",
" pl.col('sales_ymd').alias('sales_date'), \n",
" 'customer_id', \n",
" 'product_cd', \n",
" 'amount')\n",
".head(5)\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "53f141d7-2136-404c-ab1c-6a28bd0c01a8",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (5, 4)sales_date | customer_id | product_cd | amount |
---|
i64 | str | str | i64 |
20181103 | "CS006214000001" | "P070305012" | 158 |
20181118 | "CS008415000097" | "P070701017" | 81 |
20170712 | "CS028414000014" | "P060101005" | 170 |
20190205 | "ZZ000000000000" | "P050301001" | 25 |
20180821 | "CS025415000050" | "P060102007" | 90 |
"
],
"text/plain": [
"shape: (5, 4)\n",
"┌────────────┬────────────────┬────────────┬────────┐\n",
"│ sales_date ┆ customer_id ┆ product_cd ┆ amount │\n",
"│ --- ┆ --- ┆ --- ┆ --- │\n",
"│ i64 ┆ str ┆ str ┆ i64 │\n",
"╞════════════╪════════════════╪════════════╪════════╡\n",
"│ 20181103 ┆ CS006214000001 ┆ P070305012 ┆ 158 │\n",
"│ 20181118 ┆ CS008415000097 ┆ P070701017 ┆ 81 │\n",
"│ 20170712 ┆ CS028414000014 ┆ P060101005 ┆ 170 │\n",
"│ 20190205 ┆ ZZ000000000000 ┆ P050301001 ┆ 25 │\n",
"│ 20180821 ┆ CS025415000050 ┆ P060102007 ┆ 90 │\n",
"└────────────┴────────────────┴────────────┴────────┘"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".select('sales_ymd', 'customer_id', 'product_cd', 'amount')\n",
".rename({'sales_ymd':'sales_date'})\n",
".head(5)\n",
")"
]
},
{
"cell_type": "markdown",
"id": "f848ff4c-c026-42b8-97db-6bbc26a929a3",
"metadata": {},
"source": [
"## P-004\n",
"\n",
"レシート明細データ(df_receipt)から売上日(sales_ymd)、顧客ID(customer_id)、商品コード(product_cd)、売上金額(amount)の順に列を指定し、以下の条件を満たすデータを抽出せよ。\n",
"\n",
"- 顧客ID(customer_id)が\"CS018205000001\""
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "fe849782-6cd6-4e94-a7b4-460eca1726cd",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (5, 4)sales_ymd | customer_id | product_cd | amount |
---|
i64 | str | str | i64 |
20180911 | "CS018205000001" | "P071401012" | 2200 |
20180414 | "CS018205000001" | "P060104007" | 600 |
20170614 | "CS018205000001" | "P050206001" | 990 |
20170614 | "CS018205000001" | "P060702015" | 108 |
20190216 | "CS018205000001" | "P071005024" | 102 |
"
],
"text/plain": [
"shape: (5, 4)\n",
"┌───────────┬────────────────┬────────────┬────────┐\n",
"│ sales_ymd ┆ customer_id ┆ product_cd ┆ amount │\n",
"│ --- ┆ --- ┆ --- ┆ --- │\n",
"│ i64 ┆ str ┆ str ┆ i64 │\n",
"╞═══════════╪════════════════╪════════════╪════════╡\n",
"│ 20180911 ┆ CS018205000001 ┆ P071401012 ┆ 2200 │\n",
"│ 20180414 ┆ CS018205000001 ┆ P060104007 ┆ 600 │\n",
"│ 20170614 ┆ CS018205000001 ┆ P050206001 ┆ 990 │\n",
"│ 20170614 ┆ CS018205000001 ┆ P060702015 ┆ 108 │\n",
"│ 20190216 ┆ CS018205000001 ┆ P071005024 ┆ 102 │\n",
"└───────────┴────────────────┴────────────┴────────┘"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".filter(pl.col.customer_id == 'CS018205000001')\n",
".select('sales_ymd', 'customer_id', 'product_cd', 'amount')\n",
".head(5)\n",
")"
]
},
{
"cell_type": "markdown",
"id": "be1aa091-89e6-49e0-8e0b-0185deadcba0",
"metadata": {},
"source": [
"## P-005\n",
"\n",
"レシート明細データ(df_receipt)から売上日(sales_ymd)、顧客ID(customer_id)、商品コード(product_cd)、売上金額(amount)の順に列を指定し、以下の全ての条件を満たすデータを抽出せよ。\n",
"\n",
"- 顧客ID(customer_id)が\"CS018205000001\"\n",
"- 売上金額(amount)が1,000以上"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "a7033673-d9e8-44d4-bbfc-77c9ae10612a",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (3, 4)sales_ymd | customer_id | product_cd | amount |
---|
i64 | str | str | i64 |
20180911 | "CS018205000001" | "P071401012" | 2200 |
20190226 | "CS018205000001" | "P071401020" | 2200 |
20180911 | "CS018205000001" | "P071401005" | 1100 |
"
],
"text/plain": [
"shape: (3, 4)\n",
"┌───────────┬────────────────┬────────────┬────────┐\n",
"│ sales_ymd ┆ customer_id ┆ product_cd ┆ amount │\n",
"│ --- ┆ --- ┆ --- ┆ --- │\n",
"│ i64 ┆ str ┆ str ┆ i64 │\n",
"╞═══════════╪════════════════╪════════════╪════════╡\n",
"│ 20180911 ┆ CS018205000001 ┆ P071401012 ┆ 2200 │\n",
"│ 20190226 ┆ CS018205000001 ┆ P071401020 ┆ 2200 │\n",
"│ 20180911 ┆ CS018205000001 ┆ P071401005 ┆ 1100 │\n",
"└───────────┴────────────────┴────────────┴────────┘"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".filter(\n",
" (pl.col.customer_id == 'CS018205000001')\n",
" & (pl.col.amount >= 1000)\n",
")\n",
".select('sales_ymd', 'customer_id', 'product_cd', 'amount')\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "e313773d-3acf-4a0a-9fc2-64bf53e75787",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (3, 4)sales_ymd | customer_id | product_cd | amount |
---|
i64 | str | str | i64 |
20180911 | "CS018205000001" | "P071401012" | 2200 |
20190226 | "CS018205000001" | "P071401020" | 2200 |
20180911 | "CS018205000001" | "P071401005" | 1100 |
"
],
"text/plain": [
"shape: (3, 4)\n",
"┌───────────┬────────────────┬────────────┬────────┐\n",
"│ sales_ymd ┆ customer_id ┆ product_cd ┆ amount │\n",
"│ --- ┆ --- ┆ --- ┆ --- │\n",
"│ i64 ┆ str ┆ str ┆ i64 │\n",
"╞═══════════╪════════════════╪════════════╪════════╡\n",
"│ 20180911 ┆ CS018205000001 ┆ P071401012 ┆ 2200 │\n",
"│ 20190226 ┆ CS018205000001 ┆ P071401020 ┆ 2200 │\n",
"│ 20180911 ┆ CS018205000001 ┆ P071401005 ┆ 1100 │\n",
"└───────────┴────────────────┴────────────┴────────┘"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".filter(\n",
" pl.col.customer_id\n",
" .eq('CS018205000001')\n",
" .and_(pl.col.amount.ge(1000))\n",
")\n",
".select('sales_ymd', 'customer_id', 'product_cd', 'amount')\n",
")"
]
},
{
"cell_type": "markdown",
"id": "6e706d21-e5e1-4f35-867e-d2cdae919337",
"metadata": {},
"source": [
"## P-006\n",
"\n",
"レシート明細データ(df_receipt)から売上日(sales_ymd)、顧客ID(customer_id)、商品コード(product_cd)、売上数量(quantity)、売上金額(amount)の順に列を指定し、以下の全ての条件を満たすデータを抽出せよ。\n",
"\n",
"- 顧客ID(customer_id)が\"CS018205000001\"\n",
"- 売上金額(amount)が1,000以上または売上数量(quantity)が5以上"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "f460c947-5acf-4667-8ac9-64731f9709dd",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (5, 5)sales_ymd | customer_id | product_cd | amount | quantity |
---|
i64 | str | str | i64 | i64 |
20180911 | "CS018205000001" | "P071401012" | 2200 | 1 |
20180414 | "CS018205000001" | "P060104007" | 600 | 6 |
20170614 | "CS018205000001" | "P050206001" | 990 | 5 |
20190226 | "CS018205000001" | "P071401020" | 2200 | 1 |
20180911 | "CS018205000001" | "P071401005" | 1100 | 1 |
"
],
"text/plain": [
"shape: (5, 5)\n",
"┌───────────┬────────────────┬────────────┬────────┬──────────┐\n",
"│ sales_ymd ┆ customer_id ┆ product_cd ┆ amount ┆ quantity │\n",
"│ --- ┆ --- ┆ --- ┆ --- ┆ --- │\n",
"│ i64 ┆ str ┆ str ┆ i64 ┆ i64 │\n",
"╞═══════════╪════════════════╪════════════╪════════╪══════════╡\n",
"│ 20180911 ┆ CS018205000001 ┆ P071401012 ┆ 2200 ┆ 1 │\n",
"│ 20180414 ┆ CS018205000001 ┆ P060104007 ┆ 600 ┆ 6 │\n",
"│ 20170614 ┆ CS018205000001 ┆ P050206001 ┆ 990 ┆ 5 │\n",
"│ 20190226 ┆ CS018205000001 ┆ P071401020 ┆ 2200 ┆ 1 │\n",
"│ 20180911 ┆ CS018205000001 ┆ P071401005 ┆ 1100 ┆ 1 │\n",
"└───────────┴────────────────┴────────────┴────────┴──────────┘"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".filter(\n",
" pl.col.customer_id\n",
" .eq('CS018205000001')\n",
" .and_(\n",
" pl.col.amount\n",
" .ge(1000)\n",
" .or_(pl.col.quantity >= 5)\n",
" )\n",
")\n",
".select('sales_ymd', 'customer_id', 'product_cd', 'amount', 'quantity')\n",
")"
]
},
{
"cell_type": "markdown",
"id": "31b1e579-0592-4d16-ad6a-6561d07cf28c",
"metadata": {},
"source": [
"## P-007\n",
"\n",
"レシート明細データ(df_receipt)から売上日(sales_ymd)、顧客ID(customer_id)、商品コード(product_cd)、売上金額(amount)の順に列を指定し、以下の全ての条件を満たすデータを抽出せよ。\n",
"\n",
"- 顧客ID(customer_id)が\"CS018205000001\"\n",
"- 売上金額(amount)が1,000以上2,000以下"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "26720de0-2167-4f24-89a8-02d0f34f19a5",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (1, 4)sales_ymd | customer_id | product_cd | amount |
---|
i64 | str | str | i64 |
20180911 | "CS018205000001" | "P071401005" | 1100 |
"
],
"text/plain": [
"shape: (1, 4)\n",
"┌───────────┬────────────────┬────────────┬────────┐\n",
"│ sales_ymd ┆ customer_id ┆ product_cd ┆ amount │\n",
"│ --- ┆ --- ┆ --- ┆ --- │\n",
"│ i64 ┆ str ┆ str ┆ i64 │\n",
"╞═══════════╪════════════════╪════════════╪════════╡\n",
"│ 20180911 ┆ CS018205000001 ┆ P071401005 ┆ 1100 │\n",
"└───────────┴────────────────┴────────────┴────────┘"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".filter(\n",
" (pl.col.customer_id == 'CS018205000001')\n",
" & pl.col.amount.is_between(1000, 2000)\n",
")\n",
".select('sales_ymd', 'customer_id', 'product_cd', 'amount')\n",
")"
]
},
{
"cell_type": "markdown",
"id": "4d38e1b9-edd6-454c-a7f6-bba98d44f67e",
"metadata": {},
"source": [
"## P-008\n",
"\n",
"レシート明細データ(df_receipt)から売上日(sales_ymd)、顧客ID(customer_id)、商品コード(product_cd)、売上金額(amount)の順に列を指定し、以下の全ての条件を満たすデータを抽出せよ。\n",
"\n",
"- 顧客ID(customer_id)が\"CS018205000001\"\n",
"- 商品コード(product_cd)が\"P071401019\"以外"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "8f3bc3b1-22b5-48fc-9323-315f457a68c6",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (12, 4)sales_ymd | customer_id | product_cd | amount |
---|
i64 | str | str | i64 |
20180911 | "CS018205000001" | "P071401012" | 2200 |
20180414 | "CS018205000001" | "P060104007" | 600 |
20170614 | "CS018205000001" | "P050206001" | 990 |
20170614 | "CS018205000001" | "P060702015" | 108 |
20190216 | "CS018205000001" | "P071005024" | 102 |
… | … | … | … |
20190924 | "CS018205000001" | "P060805001" | 495 |
20190226 | "CS018205000001" | "P071401020" | 2200 |
20180911 | "CS018205000001" | "P071401005" | 1100 |
20190216 | "CS018205000001" | "P040101002" | 218 |
20190924 | "CS018205000001" | "P091503001" | 280 |
"
],
"text/plain": [
"shape: (12, 4)\n",
"┌───────────┬────────────────┬────────────┬────────┐\n",
"│ sales_ymd ┆ customer_id ┆ product_cd ┆ amount │\n",
"│ --- ┆ --- ┆ --- ┆ --- │\n",
"│ i64 ┆ str ┆ str ┆ i64 │\n",
"╞═══════════╪════════════════╪════════════╪════════╡\n",
"│ 20180911 ┆ CS018205000001 ┆ P071401012 ┆ 2200 │\n",
"│ 20180414 ┆ CS018205000001 ┆ P060104007 ┆ 600 │\n",
"│ 20170614 ┆ CS018205000001 ┆ P050206001 ┆ 990 │\n",
"│ 20170614 ┆ CS018205000001 ┆ P060702015 ┆ 108 │\n",
"│ 20190216 ┆ CS018205000001 ┆ P071005024 ┆ 102 │\n",
"│ … ┆ … ┆ … ┆ … │\n",
"│ 20190924 ┆ CS018205000001 ┆ P060805001 ┆ 495 │\n",
"│ 20190226 ┆ CS018205000001 ┆ P071401020 ┆ 2200 │\n",
"│ 20180911 ┆ CS018205000001 ┆ P071401005 ┆ 1100 │\n",
"│ 20190216 ┆ CS018205000001 ┆ P040101002 ┆ 218 │\n",
"│ 20190924 ┆ CS018205000001 ┆ P091503001 ┆ 280 │\n",
"└───────────┴────────────────┴────────────┴────────┘"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_receipt\n",
".filter(\n",
" (pl.col.customer_id == 'CS018205000001')\n",
" & (pl.col.product_cd != 'P071401019')\n",
")\n",
".select('sales_ymd', 'customer_id', 'product_cd', 'amount')\n",
")"
]
},
{
"cell_type": "markdown",
"id": "6ee2b9ea-66db-4ba6-838d-b97558768f0f",
"metadata": {},
"source": [
"## P-009\n",
"\n",
"以下の処理において、出力結果を変えずにORをANDに書き換えよ。\n",
"\n",
"```\n",
"df_store.query('not(prefecture_cd == \"13\" | floor_area > 900)')\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "823a5eed-ced1-4a2d-8a7d-99eb4696b3a0",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (3, 10)store_cd | store_name | prefecture_cd | prefecture | address | address_kana | tel_no | longitude | latitude | floor_area |
---|
str | str | str | str | str | str | str | f64 | f64 | f64 |
"S14046" | "北山田店" | "14" | "神奈川県" | "神奈川県横浜市都筑区北山田一丁目" | "カナガワケンヨコハマシツヅキクキタヤマタイッチョウメ" | "045-123-4049" | 139.5916 | 35.56189 | 831.0 |
"S14011" | "日吉本町店" | "14" | "神奈川県" | "神奈川県横浜市港北区日吉本町四丁目" | "カナガワケンヨコハマシコウホククヒヨシホンチョウヨンチョウメ" | "045-123-4033" | 139.6316 | 35.54655 | 890.0 |
"S12013" | "習志野店" | "12" | "千葉県" | "千葉県習志野市芝園一丁目" | "チバケンナラシノシシバゾノイッチョウメ" | "047-123-4002" | 140.022 | 35.66122 | 808.0 |
"
],
"text/plain": [
"shape: (3, 10)\n",
"┌──────────┬────────────┬───────────────┬────────────┬───┬──────────────┬───────────┬──────────┬────────────┐\n",
"│ store_cd ┆ store_name ┆ prefecture_cd ┆ prefecture ┆ … ┆ tel_no ┆ longitude ┆ latitude ┆ floor_area │\n",
"│ --- ┆ --- ┆ --- ┆ --- ┆ ┆ --- ┆ --- ┆ --- ┆ --- │\n",
"│ str ┆ str ┆ str ┆ str ┆ ┆ str ┆ f64 ┆ f64 ┆ f64 │\n",
"╞══════════╪════════════╪═══════════════╪════════════╪═══╪══════════════╪═══════════╪══════════╪════════════╡\n",
"│ S14046 ┆ 北山田店 ┆ 14 ┆ 神奈川県 ┆ … ┆ 045-123-4049 ┆ 139.5916 ┆ 35.56189 ┆ 831.0 │\n",
"│ S14011 ┆ 日吉本町店 ┆ 14 ┆ 神奈川県 ┆ … ┆ 045-123-4033 ┆ 139.6316 ┆ 35.54655 ┆ 890.0 │\n",
"│ S12013 ┆ 習志野店 ┆ 12 ┆ 千葉県 ┆ … ┆ 047-123-4002 ┆ 140.022 ┆ 35.66122 ┆ 808.0 │\n",
"└──────────┴────────────┴───────────────┴────────────┴───┴──────────────┴───────────┴──────────┴────────────┘"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_store\n",
".filter(\n",
" (pl.col.prefecture_cd != '13')\n",
" & (pl.col.floor_area <= 900)\n",
")\n",
")"
]
},
{
"cell_type": "markdown",
"id": "70f4a496-23a4-490b-b071-c6fe39685804",
"metadata": {},
"source": [
"## P-010\n",
"\n",
"店舗データ(df_store)から、店舗コード(store_cd)が\"S14\"で始まるものだけ全項目抽出し、10件表示せよ。"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "7013d377-eab8-4146-9b5c-768c05f89861",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (5, 10)store_cd | store_name | prefecture_cd | prefecture | address | address_kana | tel_no | longitude | latitude | floor_area |
---|
str | str | str | str | str | str | str | f64 | f64 | f64 |
"S14010" | "菊名店" | "14" | "神奈川県" | "神奈川県横浜市港北区菊名一丁目" | "カナガワケンヨコハマシコウホククキクナイッチョウメ" | "045-123-4032" | 139.6326 | 35.50049 | 1732.0 |
"S14033" | "阿久和店" | "14" | "神奈川県" | "神奈川県横浜市瀬谷区阿久和西一丁目" | "カナガワケンヨコハマシセヤクアクワニシイッチョウメ" | "045-123-4043" | 139.4961 | 35.45918 | 1495.0 |
"S14036" | "相模原中央店" | "14" | "神奈川県" | "神奈川県相模原市中央二丁目" | "カナガワケンサガミハラシチュウオウニチョウメ" | "042-123-4045" | 139.3716 | 35.57327 | 1679.0 |
"S14040" | "長津田店" | "14" | "神奈川県" | "神奈川県横浜市緑区長津田みなみ台五丁目" | "カナガワケンヨコハマシミドリクナガツタミナミダイゴチョウメ" | "045-123-4046" | 139.4994 | 35.52398 | 1548.0 |
"S14050" | "阿久和西店" | "14" | "神奈川県" | "神奈川県横浜市瀬谷区阿久和西一丁目" | "カナガワケンヨコハマシセヤクアクワニシイッチョウメ" | "045-123-4053" | 139.4961 | 35.45918 | 1830.0 |
"
],
"text/plain": [
"shape: (5, 10)\n",
"┌──────────┬──────────────┬───────────────┬────────────┬───┬──────────────┬───────────┬──────────┬────────────┐\n",
"│ store_cd ┆ store_name ┆ prefecture_cd ┆ prefecture ┆ … ┆ tel_no ┆ longitude ┆ latitude ┆ floor_area │\n",
"│ --- ┆ --- ┆ --- ┆ --- ┆ ┆ --- ┆ --- ┆ --- ┆ --- │\n",
"│ str ┆ str ┆ str ┆ str ┆ ┆ str ┆ f64 ┆ f64 ┆ f64 │\n",
"╞══════════╪══════════════╪═══════════════╪════════════╪═══╪══════════════╪═══════════╪══════════╪════════════╡\n",
"│ S14010 ┆ 菊名店 ┆ 14 ┆ 神奈川県 ┆ … ┆ 045-123-4032 ┆ 139.6326 ┆ 35.50049 ┆ 1732.0 │\n",
"│ S14033 ┆ 阿久和店 ┆ 14 ┆ 神奈川県 ┆ … ┆ 045-123-4043 ┆ 139.4961 ┆ 35.45918 ┆ 1495.0 │\n",
"│ S14036 ┆ 相模原中央店 ┆ 14 ┆ 神奈川県 ┆ … ┆ 042-123-4045 ┆ 139.3716 ┆ 35.57327 ┆ 1679.0 │\n",
"│ S14040 ┆ 長津田店 ┆ 14 ┆ 神奈川県 ┆ … ┆ 045-123-4046 ┆ 139.4994 ┆ 35.52398 ┆ 1548.0 │\n",
"│ S14050 ┆ 阿久和西店 ┆ 14 ┆ 神奈川県 ┆ … ┆ 045-123-4053 ┆ 139.4961 ┆ 35.45918 ┆ 1830.0 │\n",
"└──────────┴──────────────┴───────────────┴────────────┴───┴──────────────┴───────────┴──────────┴────────────┘"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(\n",
"df_store\n",
".filter(pl.col.store_cd.str.starts_with('S14'))\n",
".head(5)\n",
")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.2"
}
},
"nbformat": 4,
"nbformat_minor": 5
}