{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "221db5d2-27ce-4178-ba69-003a64169cd4",
   "metadata": {},
   "source": [
    "# Python関数で処理"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "22d62603-e6af-43c3-b63a-790ffe05e2e6",
   "metadata": {},
   "source": [
    "Polarsでは、ユーザー関数を呼び出してデータの変換や処理を行うことができます。ユーザー関数を使用することで、既存の関数や演算子だけでは実現できない特定の処理を追加することが可能です。本章では、polarsでユーザー関数をどのように定義し、呼び出すかについて詳しく解説します。ユーザー関数を利用することで、より柔軟で効率的なデータ処理を実現することができます。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "d43f4cf2-2ca2-4033-8410-4e5581b04480",
   "metadata": {},
   "outputs": [],
   "source": [
    "import polars as pl\n",
    "import numpy as np\n",
    "from helper.jupyter import row"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1b579dba-e226-4c59-805c-93aaf7dd9aa1",
   "metadata": {},
   "source": [
    "## pipe\n",
    "\n",
    "Polars では、コードの可読性を向上させたり、関数を柔軟に適用したりするために、`pipe` メソッドを利用できます。このメソッドは、主に以下の2つの場面で使用されます。\n",
    "\n",
    "1. `DataFrame.pipe()`\n",
    "2. `Expr.pipe()`\n",
    "\n",
    "それぞれの使い方を具体例を交えて解説します。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "32ed7fd5-830a-4ef3-878f-9f543f22ea7b",
   "metadata": {},
   "source": [
    "### DataFrame.pipe\n",
    "\n",
    "`DataFrame.pipe` メソッドは、関数をチェーン処理の中で適用するための便利なツールです。これにより、複雑なデータ処理を段階的に記述でき、コードの可読性と保守性を向上させることができます。\n",
    "\n",
    "`pipe()` の最初の引数には、処理を実行する関数を渡します。この関数の最初の引数として、`DataFrame` 自体が自動的に渡されます。また、`pipe()` のその他の位置引数やキーワード引数は、そのまま渡された関数に引き継がれます。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "1ce483b8-72ae-44d6-941b-a37557bff6c7",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table><tr><td><div><style>\n",
       ".dataframe > thead > tr,\n",
       ".dataframe > tbody > tr {\n",
       "  text-align: right;\n",
       "  white-space: pre-wrap;\n",
       "}\n",
       "</style>\n",
       "<small>shape: (3, 3)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>A</th><th>B</th><th>C</th></tr><tr><td>i64</td><td>i64</td><td>i64</td></tr></thead><tbody><tr><td>1</td><td>4</td><td>7</td></tr><tr><td>2</td><td>5</td><td>8</td></tr><tr><td>3</td><td>6</td><td>9</td></tr></tbody></table></div></td><td><div><style>\n",
       ".dataframe > thead > tr,\n",
       ".dataframe > tbody > tr {\n",
       "  text-align: right;\n",
       "  white-space: pre-wrap;\n",
       "}\n",
       "</style>\n",
       "<small>shape: (3, 4)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>A</th><th>B</th><th>C</th><th>D</th></tr><tr><td>i64</td><td>i64</td><td>i64</td><td>i32</td></tr></thead><tbody><tr><td>2</td><td>4</td><td>7</td><td>10</td></tr><tr><td>4</td><td>5</td><td>8</td><td>10</td></tr><tr><td>6</td><td>6</td><td>9</td><td>10</td></tr></tbody></table></div></td></tr></table>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "df = pl.DataFrame({\n",
    "    \"A\": [1, 2, 3],\n",
    "    \"B\": [4, 5, 6],\n",
    "    \"C\": [7, 8, 9]\n",
    "})\n",
    "\n",
    "def add_column(df, col_name, value):\n",
    "    return df.with_columns(pl.lit(value).alias(col_name))\n",
    "\n",
    "def multiply_column(df, col_name, factor):\n",
    "    return df.with_columns((pl.col(col_name) * factor).alias(col_name))\n",
    "\n",
    "df_res = (\n",
    "    df.pipe(add_column, col_name=\"D\", value=10)\n",
    "      .pipe(multiply_column, col_name=\"A\", factor=2)\n",
    ")\n",
    "\n",
    "row(df, df_res)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5d37640e-d435-4020-977a-170344d2e733",
   "metadata": {},
   "source": [
    "### Expr.pipe\n",
    "\n",
    "演算式の `pipe` メソッドを使用することで、演算式に対して柔軟なデータ変換やカスタムロジックを適用できます。以下は、2次式を計算する関数 `quadratic()` を定義し、`pipe` メソッドを利用して複数の演算式に処理を適用する例です。\n",
    "\n",
    "`pipe()` の最初の引数には、`Expr` を処理する関数を指定します。この関数の最初の引数には、演算式が自動的に渡されます。また、`pipe()` のその他の位置引数やキーワード引数は、そのまま指定した関数に引き継がれます。\n",
    "\n",
    "次のコードでは、`.pipe(quadratic, ...)`を使って次のような計算を行います。\n",
    "\n",
    "  - `A`列 に対して $ x^2 + 2 \\cdot x + 3 $ を計算し、列名に `\"_2\"` を追加します。\n",
    "  - `B`列 と`C`列 の合計に対して $ x^2 + 2 \\cdot x $ を計算し、列名に `\"_q\"` を追加します。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "6b4c1b4c-c1c0-41be-bf8b-31eda51ce344",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table><tr><td><div><style>\n",
       ".dataframe > thead > tr,\n",
       ".dataframe > tbody > tr {\n",
       "  text-align: right;\n",
       "  white-space: pre-wrap;\n",
       "}\n",
       "</style>\n",
       "<small>shape: (3, 3)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>A</th><th>B</th><th>C</th></tr><tr><td>i64</td><td>i64</td><td>i64</td></tr></thead><tbody><tr><td>1</td><td>4</td><td>7</td></tr><tr><td>2</td><td>5</td><td>8</td></tr><tr><td>3</td><td>6</td><td>9</td></tr></tbody></table></div></td><td><div><style>\n",
       ".dataframe > thead > tr,\n",
       ".dataframe > tbody > tr {\n",
       "  text-align: right;\n",
       "  white-space: pre-wrap;\n",
       "}\n",
       "</style>\n",
       "<small>shape: (3, 5)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>A</th><th>B</th><th>C</th><th>A_2</th><th>B_q</th></tr><tr><td>i64</td><td>i64</td><td>i64</td><td>i64</td><td>i64</td></tr></thead><tbody><tr><td>1</td><td>4</td><td>7</td><td>6</td><td>143</td></tr><tr><td>2</td><td>5</td><td>8</td><td>11</td><td>195</td></tr><tr><td>3</td><td>6</td><td>9</td><td>18</td><td>255</td></tr></tbody></table></div></td></tr></table>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "def quadratic(x, a=0, b=0, c=0, suffix='_2'):\n",
    "    return (a * x ** 2 + b * x + c).name.suffix(suffix)\n",
    "\n",
    "df_res = (\n",
    "    df.with_columns(\n",
    "        pl.col(\"A\").pipe(quadratic, 1, 2, 3),\n",
    "        (pl.col(\"B\") + pl.col(\"C\")).pipe(quadratic, a=1, b=2, suffix='_q')\n",
    "    )\n",
    ")\n",
    "\n",
    "row(df, df_res)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5b6ecb88-0bf3-4aba-83dc-9c6c1c2ab337",
   "metadata": {},
   "source": [
    "## map_batches"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d766b6eb-5d44-4848-932d-b8082eb11738",
   "metadata": {},
   "source": [
    "`pl.map_batches`は、DataFrameの複数の列を一括で処理する際に便利な関数です。この関数を使うと、指定した列をユーザー定義関数に渡してカスタムの計算を実行し、その結果を新しい列として追加することができます。\n",
    "\n",
    "```python\n",
    "pl.map_batches(column_names, f: Callable[[list[pl.Series]], pl.Series | Any])\n",
    "```\n",
    "\n",
    "- `column_names`：ユーザー関数に渡す列名或いは演算式のリスト。\n",
    "- `f`: データを処理するためのユーザー定義関数。この関数は、指定された列の`pl.Series`リストを受け取り、新しい`pl.Series`または他の値を返します。\n",
    "\n",
    "次の例では、`a`と`b`の列を使ってユーザー定義関数`hypot`を適用し、その結果を新しい列`c`としてDataFrameに追加します。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "f284dffe-979b-4f83-8c5a-f2dcc11317da",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div><style>\n",
       ".dataframe > thead > tr,\n",
       ".dataframe > tbody > tr {\n",
       "  text-align: right;\n",
       "  white-space: pre-wrap;\n",
       "}\n",
       "</style>\n",
       "<small>shape: (4, 4)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>a</th><th>b</th><th>g</th><th>c</th></tr><tr><td>i64</td><td>i64</td><td>str</td><td>f64</td></tr></thead><tbody><tr><td>3</td><td>4</td><td>&quot;A&quot;</td><td>5.0</td></tr><tr><td>3</td><td>12</td><td>&quot;B&quot;</td><td>12.369317</td></tr><tr><td>3</td><td>6</td><td>&quot;A&quot;</td><td>6.708204</td></tr><tr><td>4</td><td>7</td><td>&quot;B&quot;</td><td>8.062258</td></tr></tbody></table></div>"
      ],
      "text/plain": [
       "shape: (4, 4)\n",
       "┌─────┬─────┬─────┬───────────┐\n",
       "│ a   ┆ b   ┆ g   ┆ c         │\n",
       "│ --- ┆ --- ┆ --- ┆ ---       │\n",
       "│ i64 ┆ i64 ┆ str ┆ f64       │\n",
       "╞═════╪═════╪═════╪═══════════╡\n",
       "│ 3   ┆ 4   ┆ A   ┆ 5.0       │\n",
       "│ 3   ┆ 12  ┆ B   ┆ 12.369317 │\n",
       "│ 3   ┆ 6   ┆ A   ┆ 6.708204  │\n",
       "│ 4   ┆ 7   ┆ B   ┆ 8.062258  │\n",
       "└─────┴─────┴─────┴───────────┘"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = pl.DataFrame(\n",
    "    {\n",
    "        \"a\": [3, 3, 3, 4],\n",
    "        \"b\": [4, 12, 6, 7],\n",
    "        \"g\": ['A', 'B', 'A', 'B']\n",
    "    }\n",
    ")\n",
    "\n",
    "def hypot(args:list[pl.Series]):\n",
    "    a, b = args\n",
    "    return (a**2 + b**2)**0.5\n",
    "    \n",
    "df.with_columns(\n",
    "    pl.map_batches(['a', 'b'], hypot).alias('c')\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4e3d185a-9976-46d4-91dd-c7c6697e293c",
   "metadata": {},
   "source": [
    "`column_names`引数には演算式を使用することができます。下のコードでは、`a`列の値に1を加えた結果と`b`列の値を使って計算します。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "24983fab-8e3a-4122-97fc-8e44c4ef10b0",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div><style>\n",
       ".dataframe > thead > tr,\n",
       ".dataframe > tbody > tr {\n",
       "  text-align: right;\n",
       "  white-space: pre-wrap;\n",
       "}\n",
       "</style>\n",
       "<small>shape: (4, 4)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>a</th><th>b</th><th>g</th><th>c</th></tr><tr><td>i64</td><td>i64</td><td>str</td><td>f64</td></tr></thead><tbody><tr><td>3</td><td>4</td><td>&quot;A&quot;</td><td>5.656854</td></tr><tr><td>3</td><td>12</td><td>&quot;B&quot;</td><td>12.649111</td></tr><tr><td>3</td><td>6</td><td>&quot;A&quot;</td><td>7.211103</td></tr><tr><td>4</td><td>7</td><td>&quot;B&quot;</td><td>8.602325</td></tr></tbody></table></div>"
      ],
      "text/plain": [
       "shape: (4, 4)\n",
       "┌─────┬─────┬─────┬───────────┐\n",
       "│ a   ┆ b   ┆ g   ┆ c         │\n",
       "│ --- ┆ --- ┆ --- ┆ ---       │\n",
       "│ i64 ┆ i64 ┆ str ┆ f64       │\n",
       "╞═════╪═════╪═════╪═══════════╡\n",
       "│ 3   ┆ 4   ┆ A   ┆ 5.656854  │\n",
       "│ 3   ┆ 12  ┆ B   ┆ 12.649111 │\n",
       "│ 3   ┆ 6   ┆ A   ┆ 7.211103  │\n",
       "│ 4   ┆ 7   ┆ B   ┆ 8.602325  │\n",
       "└─────┴─────┴─────┴───────────┘"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.with_columns(\n",
    "    pl.map_batches([pl.col('a') + 1, 'b'], hypot).alias('c')\n",
    "\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "512895b4-c5ee-48f8-8d2b-ac5b80443dfb",
   "metadata": {},
   "source": [
    "演算式には、`map_batches()`というメソッドもあります。以下は`Expr.map_batches()`を使って演算式の計算結果をユーザー関数で処理する例です。2列のデータは２回に分かり、`square()`関数に渡します。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "85c9a540-f8d4-4fec-abe8-ab031ca6ddc1",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div><style>\n",
       ".dataframe > thead > tr,\n",
       ".dataframe > tbody > tr {\n",
       "  text-align: right;\n",
       "  white-space: pre-wrap;\n",
       "}\n",
       "</style>\n",
       "<small>shape: (4, 2)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>a</th><th>b</th></tr><tr><td>i64</td><td>i64</td></tr></thead><tbody><tr><td>9</td><td>16</td></tr><tr><td>9</td><td>144</td></tr><tr><td>9</td><td>36</td></tr><tr><td>16</td><td>49</td></tr></tbody></table></div>"
      ],
      "text/plain": [
       "shape: (4, 2)\n",
       "┌─────┬─────┐\n",
       "│ a   ┆ b   │\n",
       "│ --- ┆ --- │\n",
       "│ i64 ┆ i64 │\n",
       "╞═════╪═════╡\n",
       "│ 9   ┆ 16  │\n",
       "│ 9   ┆ 144 │\n",
       "│ 9   ┆ 36  │\n",
       "│ 16  ┆ 49  │\n",
       "└─────┴─────┘"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def square(s):\n",
    "    return s**2\n",
    "\n",
    "df.select(\n",
    "    pl.col('a', 'b').map_batches(square)\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "083cc88b-fbf6-4f62-8b15-e1a6ad1fdd23",
   "metadata": {},
   "source": [
    "`pl.struct()`を使って複数の列を一つの構造体列に変換し、その後で`map_batches()`を使ってカスタム関数を適用することができます。以下は、複数の列を処理するためのサンプルコードです。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "44f87979-ba75-47f5-8d65-74b66135f3a6",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div><style>\n",
       ".dataframe > thead > tr,\n",
       ".dataframe > tbody > tr {\n",
       "  text-align: right;\n",
       "  white-space: pre-wrap;\n",
       "}\n",
       "</style>\n",
       "<small>shape: (4, 1)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>a</th></tr><tr><td>i64</td></tr></thead><tbody><tr><td>25</td></tr><tr><td>153</td></tr><tr><td>45</td></tr><tr><td>65</td></tr></tbody></table></div>"
      ],
      "text/plain": [
       "shape: (4, 1)\n",
       "┌─────┐\n",
       "│ a   │\n",
       "│ --- │\n",
       "│ i64 │\n",
       "╞═════╡\n",
       "│ 25  │\n",
       "│ 153 │\n",
       "│ 45  │\n",
       "│ 65  │\n",
       "└─────┘"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def square2(s):\n",
    "    return s.struct.field('a')**2 + s.struct.field('b')**2\n",
    "\n",
    "df.select(\n",
    "    pl.struct('a', 'b').map_batches(square2)\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "1ca108be-df30-474a-853e-13f75fae78bd",
   "metadata": {},
   "source": [
    "(map_batches_in_agg)=\n",
    "\n",
    "`GroupBy.agg()` のコンテキストで `map_batches()` を使用する場合、次の2つの引数がユーザー関数の入出力に影響します。  \n",
    "\n",
    "- `returns_scalar`（デフォルト: `False`）  \n",
    "  - `True` の場合、ユーザー関数の戻り値は 1 つのスカラー値になります。  \n",
    "  - `False` の場合、スカラー値はリストに変換されます。  \n",
    "\n",
    "- `agg_list`（デフォルト: `False`）  \n",
    "  - `True` の場合、各グループの値をリスト型の `Series` オブジェクトとしてユーザー関数に渡します。  \n",
    "  - `False` の場合、各グループの値を `Series` として渡し、ユーザー関数はグループごとに複数回実行されます。  \n",
    "\n",
    "次のプログラムは、`returns_scalar` 引数の影響を比較します。デフォルト値 `False` の場合、結果の列には 1 つの値が入ったリストが格納されます。  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "4a69bcc9-57c1-42f7-882d-f9dd86db12ce",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table><tr><td><div><style>\n",
       ".dataframe > thead > tr,\n",
       ".dataframe > tbody > tr {\n",
       "  text-align: right;\n",
       "  white-space: pre-wrap;\n",
       "}\n",
       "</style>\n",
       "<small>shape: (3, 2)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>g</th><th>x</th></tr><tr><td>str</td><td>list[f64]</td></tr></thead><tbody><tr><td>&quot;A&quot;</td><td>[2.5]</td></tr><tr><td>&quot;B&quot;</td><td>[4.333333]</td></tr><tr><td>&quot;C&quot;</td><td>[3.0]</td></tr></tbody></table></div></td><td><div><style>\n",
       ".dataframe > thead > tr,\n",
       ".dataframe > tbody > tr {\n",
       "  text-align: right;\n",
       "  white-space: pre-wrap;\n",
       "}\n",
       "</style>\n",
       "<small>shape: (3, 2)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>g</th><th>x</th></tr><tr><td>str</td><td>f64</td></tr></thead><tbody><tr><td>&quot;A&quot;</td><td>2.5</td></tr><tr><td>&quot;B&quot;</td><td>4.333333</td></tr><tr><td>&quot;C&quot;</td><td>3.0</td></tr></tbody></table></div></td></tr></table>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "df = pl.DataFrame(\n",
    "    dict(\n",
    "        g=['A', 'B', 'C', 'A', 'B', 'B'], \n",
    "        x=[1, 2, 3, 4, 5, 6])\n",
    ")\n",
    "\n",
    "def func1(s):\n",
    "    return s.mean()\n",
    "\n",
    "g = df.group_by('g', maintain_order=True)\n",
    "\n",
    "row(\n",
    "    g.agg(pl.col('x').map_batches(func1)),\n",
    "    g.agg(pl.col('x').map_batches(func1, returns_scalar=True))\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b2194bbd-ce88-4d12-84b5-14914ba28066",
   "metadata": {},
   "source": [
    "次のプログラムでは、ユーザー関数内で入力データを `print` しています。結果から、ユーザー関数がグループごとに実行されていることがわかります。この場合、演算効率が低下する可能性があります。  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "6744ec5b-89f8-4d69-8e32-897e2f992e8b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "shape: (3,)\n",
      "Series: '' [i64]\n",
      "[\n",
      "\t2\n",
      "\t5\n",
      "\t6\n",
      "]\n",
      "shape: (2,)\n",
      "Series: '' [i64]\n",
      "[\n",
      "\t1\n",
      "\t4\n",
      "]\n",
      "shape: (1,)\n",
      "Series: '' [i64]\n",
      "[\n",
      "\t3\n",
      "]\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div><style>\n",
       ".dataframe > thead > tr,\n",
       ".dataframe > tbody > tr {\n",
       "  text-align: right;\n",
       "  white-space: pre-wrap;\n",
       "}\n",
       "</style>\n",
       "<small>shape: (3, 2)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>g</th><th>x</th></tr><tr><td>str</td><td>f64</td></tr></thead><tbody><tr><td>&quot;A&quot;</td><td>2.5</td></tr><tr><td>&quot;B&quot;</td><td>4.333333</td></tr><tr><td>&quot;C&quot;</td><td>3.0</td></tr></tbody></table></div>"
      ],
      "text/plain": [
       "shape: (3, 2)\n",
       "┌─────┬──────────┐\n",
       "│ g   ┆ x        │\n",
       "│ --- ┆ ---      │\n",
       "│ str ┆ f64      │\n",
       "╞═════╪══════════╡\n",
       "│ A   ┆ 2.5      │\n",
       "│ B   ┆ 4.333333 │\n",
       "│ C   ┆ 3.0      │\n",
       "└─────┴──────────┘"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def func2(s):\n",
    "    print(s)\n",
    "    return s.mean()\n",
    "\n",
    "g.agg(pl.col('x').map_batches(func2, returns_scalar=True))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e889e685-96e9-4bd5-94e6-168d21f1953f",
   "metadata": {},
   "source": [
    "`agg_list` 引数を `True` に設定すると、各グループの値がリストとしてユーザー関数に渡されます。これにより、ユーザー関数は 1 回だけ実行されます。  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "id": "8156276c-6b1c-4c0f-a2ac-6a9d405eea63",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "shape: (3,)\n",
      "Series: 'x' [list[i64]]\n",
      "[\n",
      "\t[1, 4]\n",
      "\t[2, 5, 6]\n",
      "\t[3]\n",
      "]\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div><style>\n",
       ".dataframe > thead > tr,\n",
       ".dataframe > tbody > tr {\n",
       "  text-align: right;\n",
       "  white-space: pre-wrap;\n",
       "}\n",
       "</style>\n",
       "<small>shape: (3, 2)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>g</th><th>x</th></tr><tr><td>str</td><td>f64</td></tr></thead><tbody><tr><td>&quot;A&quot;</td><td>2.5</td></tr><tr><td>&quot;B&quot;</td><td>4.333333</td></tr><tr><td>&quot;C&quot;</td><td>3.0</td></tr></tbody></table></div>"
      ],
      "text/plain": [
       "shape: (3, 2)\n",
       "┌─────┬──────────┐\n",
       "│ g   ┆ x        │\n",
       "│ --- ┆ ---      │\n",
       "│ str ┆ f64      │\n",
       "╞═════╪══════════╡\n",
       "│ A   ┆ 2.5      │\n",
       "│ B   ┆ 4.333333 │\n",
       "│ C   ┆ 3.0      │\n",
       "└─────┴──────────┘"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def func3(s):\n",
    "    print(s)\n",
    "    return s.list.mean()\n",
    "\n",
    "df.group_by('g', maintain_order=True).agg(pl.col('x').map_batches(func3, agg_list=True))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f3563ce8-e1dd-4600-b337-eacd8c183e8e",
   "metadata": {},
   "source": [
    "## map_groups\n",
    "\n",
    "`pl.map_groups()` を使用して各グループのデータをカスタム関数で処理し、その結果を収集することができます。以下のプログラムは、グループごとに、`a`列の最大値と`b`の最大値の比を計算します。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "bbe98338-2eff-45c2-b28d-74f4721ac1c6",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div><style>\n",
       ".dataframe > thead > tr,\n",
       ".dataframe > tbody > tr {\n",
       "  text-align: right;\n",
       "  white-space: pre-wrap;\n",
       "}\n",
       "</style>\n",
       "<small>shape: (2, 2)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>g</th><th>a</th></tr><tr><td>str</td><td>f64</td></tr></thead><tbody><tr><td>&quot;B&quot;</td><td>0.333333</td></tr><tr><td>&quot;A&quot;</td><td>0.5</td></tr></tbody></table></div>"
      ],
      "text/plain": [
       "shape: (2, 2)\n",
       "┌─────┬──────────┐\n",
       "│ g   ┆ a        │\n",
       "│ --- ┆ ---      │\n",
       "│ str ┆ f64      │\n",
       "╞═════╪══════════╡\n",
       "│ B   ┆ 0.333333 │\n",
       "│ A   ┆ 0.5      │\n",
       "└─────┴──────────┘"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def ratio_max(args):\n",
    "    a, b = args\n",
    "    return a.max() / b.max()\n",
    "\n",
    "df.group_by('g').agg(\n",
    "    pl.map_groups(['a', 'b'], ratio_max)\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5435fa52-f324-4427-9791-5eaef2f4b414",
   "metadata": {},
   "source": [
    "## map_elements\n",
    "\n",
    "`Expr.map_elements()`を使用して演算式の各個値をカスタム関数に渡し、その結果を新しい列として計算することができます。以下の例では、`f` 関数を使って各値を処理する例です。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "e88a790e-8dc2-4eef-ada2-10994041b259",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div><style>\n",
       ".dataframe > thead > tr,\n",
       ".dataframe > tbody > tr {\n",
       "  text-align: right;\n",
       "  white-space: pre-wrap;\n",
       "}\n",
       "</style>\n",
       "<small>shape: (4, 4)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>a</th><th>b</th><th>g</th><th>b_category</th></tr><tr><td>i64</td><td>i64</td><td>str</td><td>str</td></tr></thead><tbody><tr><td>3</td><td>4</td><td>&quot;A&quot;</td><td>&quot;small&quot;</td></tr><tr><td>3</td><td>12</td><td>&quot;B&quot;</td><td>&quot;large&quot;</td></tr><tr><td>3</td><td>6</td><td>&quot;A&quot;</td><td>&quot;middle&quot;</td></tr><tr><td>4</td><td>7</td><td>&quot;B&quot;</td><td>&quot;middle&quot;</td></tr></tbody></table></div>"
      ],
      "text/plain": [
       "shape: (4, 4)\n",
       "┌─────┬─────┬─────┬────────────┐\n",
       "│ a   ┆ b   ┆ g   ┆ b_category │\n",
       "│ --- ┆ --- ┆ --- ┆ ---        │\n",
       "│ i64 ┆ i64 ┆ str ┆ str        │\n",
       "╞═════╪═════╪═════╪════════════╡\n",
       "│ 3   ┆ 4   ┆ A   ┆ small      │\n",
       "│ 3   ┆ 12  ┆ B   ┆ large      │\n",
       "│ 3   ┆ 6   ┆ A   ┆ middle     │\n",
       "│ 4   ┆ 7   ┆ B   ┆ middle     │\n",
       "└─────┴─────┴─────┴────────────┘"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def f(x):\n",
    "    if x > 10:\n",
    "        return 'large'\n",
    "    elif x > 5:\n",
    "        return 'middle'\n",
    "    else:\n",
    "        return 'small'\n",
    "\n",
    "df.with_columns(\n",
    "    pl.col.b.map_elements(f, return_dtype=pl.String).alias('b_category')\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5278f638-ebd9-40ad-9ab1-20b5f679eaed",
   "metadata": {},
   "source": [
    "`Expr.map_elements()`は各値を逐一処理するため、データ量が多い場合は演算速度が遅くなる可能性があります。そのため、ベクトル演算やPolarsの組み込み関数で処理できない場合にのみ使用することが推奨されます。例えば、上記の例では、以下のようにベクトル化された条件分岐を使用することもできます："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "84935dde-780b-4ed1-99c5-990b76b56b5a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div><style>\n",
       ".dataframe > thead > tr,\n",
       ".dataframe > tbody > tr {\n",
       "  text-align: right;\n",
       "  white-space: pre-wrap;\n",
       "}\n",
       "</style>\n",
       "<small>shape: (4, 4)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>a</th><th>b</th><th>g</th><th>b_category</th></tr><tr><td>i64</td><td>i64</td><td>str</td><td>str</td></tr></thead><tbody><tr><td>3</td><td>4</td><td>&quot;A&quot;</td><td>&quot;small&quot;</td></tr><tr><td>3</td><td>12</td><td>&quot;B&quot;</td><td>&quot;large&quot;</td></tr><tr><td>3</td><td>6</td><td>&quot;A&quot;</td><td>&quot;middle&quot;</td></tr><tr><td>4</td><td>7</td><td>&quot;B&quot;</td><td>&quot;middle&quot;</td></tr></tbody></table></div>"
      ],
      "text/plain": [
       "shape: (4, 4)\n",
       "┌─────┬─────┬─────┬────────────┐\n",
       "│ a   ┆ b   ┆ g   ┆ b_category │\n",
       "│ --- ┆ --- ┆ --- ┆ ---        │\n",
       "│ i64 ┆ i64 ┆ str ┆ str        │\n",
       "╞═════╪═════╪═════╪════════════╡\n",
       "│ 3   ┆ 4   ┆ A   ┆ small      │\n",
       "│ 3   ┆ 12  ┆ B   ┆ large      │\n",
       "│ 3   ┆ 6   ┆ A   ┆ middle     │\n",
       "│ 4   ┆ 7   ┆ B   ┆ middle     │\n",
       "└─────┴─────┴─────┴────────────┘"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.with_columns(\n",
    "    pl.when(pl.col('b') > 10).then(pl.lit('large'))\n",
    "      .when(pl.col('b') > 5).then(pl.lit('middle'))\n",
    "      .otherwise(pl.lit('small'))\n",
    "      .alias('b_category')\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c31e12fe-6264-4db9-a6c4-ac205c2cec44",
   "metadata": {},
   "source": [
    "`agg()`と`over()`のコンテキストで`map_elements()`を使用する場合、ユーザー関数の引数には各グループに該当する値の`Series`オブジェクトが渡されます。次のコードでは、ユーザー関数`f`と演算式を用いて同じ計算を実装します。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "8743b640-eeb3-4400-bdc8-bbf3bacbb733",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table><tr><td><div><style>\n",
       ".dataframe > thead > tr,\n",
       ".dataframe > tbody > tr {\n",
       "  text-align: right;\n",
       "  white-space: pre-wrap;\n",
       "}\n",
       "</style>\n",
       "<small>shape: (2, 3)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>g</th><th>a</th><th>b</th></tr><tr><td>str</td><td>f64</td><td>f64</td></tr></thead><tbody><tr><td>&quot;A&quot;</td><td>0.0</td><td>2.0</td></tr><tr><td>&quot;B&quot;</td><td>0.5</td><td>12.5</td></tr></tbody></table></div></td><td><div><style>\n",
       ".dataframe > thead > tr,\n",
       ".dataframe > tbody > tr {\n",
       "  text-align: right;\n",
       "  white-space: pre-wrap;\n",
       "}\n",
       "</style>\n",
       "<small>shape: (2, 3)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>g</th><th>a</th><th>b</th></tr><tr><td>str</td><td>f64</td><td>f64</td></tr></thead><tbody><tr><td>&quot;B&quot;</td><td>0.5</td><td>12.5</td></tr><tr><td>&quot;A&quot;</td><td>0.0</td><td>2.0</td></tr></tbody></table></div></td></tr></table>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "def f(s):\n",
    "    return ((s - s[0])**2).mean()\n",
    "\n",
    "cols = pl.col('a', 'b')\n",
    "    \n",
    "df1 = (\n",
    "df\n",
    ".group_by('g', maintain_order=True)\n",
    ".agg(cols.map_elements(f, return_dtype=pl.Float64))\n",
    ")\n",
    "\n",
    "df2 = (\n",
    "df\n",
    ".group_by('g')\n",
    ".agg(((cols - cols.first())**2).mean())\n",
    ")\n",
    "\n",
    "row(df1, df2)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6ca694b5-a9f6-48ba-a2ec-4b79bc70c7c3",
   "metadata": {},
   "source": [
    "## rolling_map"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f5b0baeb-ea79-4e28-9cb7-324b55af321e",
   "metadata": {},
   "source": [
    "`pl.rolling_mean()`などの移動ウィンドウ処理関数で対応できない場合に、`rolling_map()`を使用してウィンドウ内のデータをカスタムユーザー関数で処理できます。具体的には、以下のコードは、各ウィンドウ内のデータをユーザー関数に渡し、処理した結果を新しい列として追加します。\n",
    "\n",
    ":::{tip}\n",
    "`pl.Series._s`：Seriesオブジェクトの内部データにアクセスするために使用します。これにより、実際のデータやそのメモリのIDを確認できます。\n",
    ":::\n",
    "\n",
    "`arguments`リストに、各ウィンドウ内のデータとその Series オブジェクトのメモリIDが記録されます。効率的にメモリを利用するため、同じSeriesオブジェクトが複数回渡されていることが確認できます。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "id": "34478149-11b2-404b-9846-f64ba7ba15d8",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2294865850272 [2, 4]\n",
      "2294865850272 [2, 4, 0]\n",
      "2294865850272 [4, 0, 3]\n",
      "2294865850272 [0, 3, 1]\n",
      "2294865850272 [3, 1, 6]\n",
      "2294865850272 [1, 6]\n"
     ]
    }
   ],
   "source": [
    "df = pl.DataFrame(\n",
    "    {\n",
    "        \"a\": [2, 4, 0, 3, 1, 6]\n",
    "    }\n",
    ")\n",
    "\n",
    "arguments = []\n",
    "def f(s):\n",
    "    arguments.append((id(s._s), s.to_list()))\n",
    "    return s.mean()\n",
    "\n",
    "df.select(\n",
    "    pl.col('a').rolling_map(f, window_size=3, center=True, min_periods=1)\n",
    ")\n",
    "\n",
    "for id_, arg in arguments:\n",
    "    print(id_, arg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ca9ae9d3-6f4b-49f8-b41e-93acc2cec433",
   "metadata": {},
   "source": [
    "## ufunc"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e0d41ae4-fc6d-44f2-8b3d-3311a1222603",
   "metadata": {},
   "source": [
    "`NumPy` の `ufunc`関数を直接使用して列あるいは演算式に対して演算を行うことができます。例えば、`np.hypot` は二つの値の平方根の和を計算する関数で、Polarsでは次のように使用することができます。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "id": "b677bf71-177c-480d-8a24-b174ef0e9465",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div><style>\n",
       ".dataframe > thead > tr,\n",
       ".dataframe > tbody > tr {\n",
       "  text-align: right;\n",
       "  white-space: pre-wrap;\n",
       "}\n",
       "</style>\n",
       "<small>shape: (4, 4)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>a</th><th>b</th><th>g</th><th>c</th></tr><tr><td>i64</td><td>i64</td><td>str</td><td>f64</td></tr></thead><tbody><tr><td>3</td><td>4</td><td>&quot;A&quot;</td><td>5.0</td></tr><tr><td>3</td><td>12</td><td>&quot;B&quot;</td><td>12.369317</td></tr><tr><td>3</td><td>6</td><td>&quot;A&quot;</td><td>6.708204</td></tr><tr><td>4</td><td>7</td><td>&quot;B&quot;</td><td>8.062258</td></tr></tbody></table></div>"
      ],
      "text/plain": [
       "shape: (4, 4)\n",
       "┌─────┬─────┬─────┬───────────┐\n",
       "│ a   ┆ b   ┆ g   ┆ c         │\n",
       "│ --- ┆ --- ┆ --- ┆ ---       │\n",
       "│ i64 ┆ i64 ┆ str ┆ f64       │\n",
       "╞═════╪═════╪═════╪═══════════╡\n",
       "│ 3   ┆ 4   ┆ A   ┆ 5.0       │\n",
       "│ 3   ┆ 12  ┆ B   ┆ 12.369317 │\n",
       "│ 3   ┆ 6   ┆ A   ┆ 6.708204  │\n",
       "│ 4   ┆ 7   ┆ B   ┆ 8.062258  │\n",
       "└─────┴─────┴─────┴───────────┘"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = pl.DataFrame(\n",
    "    {\n",
    "        \"a\": [3, 3, 3, 4],\n",
    "        \"b\": [4, 12, 6, 7],\n",
    "        \"g\": ['A', 'B', 'A', 'B']\n",
    "    }\n",
    ")\n",
    "\n",
    "df.with_columns(\n",
    "    np.hypot(pl.col(\"a\"), pl.col(\"b\")).alias(\"c\")\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e694949e-cb3b-4530-8dd3-f1aec66d4674",
   "metadata": {},
   "source": [
    "`np.hypot()`を使って列を計算する際、Polarsの内部でどのように処理されるかを理解するために、次のように演算式を見てみます。この演算式は`a`列を構造体列に変換し、`b`列を\"argument_1\"として追加します。構造体列に対して、`python_udf()`で処理します。この関数でNumPyの`np.hypot()`を呼び出します。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "6c5c047f-6a86-44aa-97b3-95ee895fdc25",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "col(\"a\").as_struct([col(\"b\").alias(\"argument_1\")]).python_udf()"
      ],
      "text/plain": [
       "<Expr ['col(\"a\").as_struct([col(\"b\").a…'] at 0x2164EF48090>"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.hypot(pl.col(\"a\"), pl.col(\"b\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "27c49839-8964-4ad1-85b7-f1a68299d133",
   "metadata": {},
   "source": [
    "## 列名とフィールド名変更"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4163a972-cd4d-4749-b56a-3a1b132ad7bd",
   "metadata": {},
   "source": [
    "**`Expr.name.map()`**\n",
    "\n",
    "列名をユーザー関数で変更します。\n",
    "\n",
    "**`Expr.name.map_fields()`**\n",
    "\n",
    "Structのfield名をユーザー関数で変更します。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "04798bfd-54fd-438d-ab2b-844070bdacee",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div><style>\n",
       ".dataframe > thead > tr,\n",
       ".dataframe > tbody > tr {\n",
       "  text-align: right;\n",
       "  white-space: pre-wrap;\n",
       "}\n",
       "</style>\n",
       "<small>shape: (4, 3)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>Aa</th><th>Bb</th><th>Gg</th></tr><tr><td>i64</td><td>i64</td><td>str</td></tr></thead><tbody><tr><td>3</td><td>4</td><td>&quot;A&quot;</td></tr><tr><td>3</td><td>12</td><td>&quot;B&quot;</td></tr><tr><td>3</td><td>6</td><td>&quot;A&quot;</td></tr><tr><td>4</td><td>7</td><td>&quot;B&quot;</td></tr></tbody></table></div>"
      ],
      "text/plain": [
       "shape: (4, 3)\n",
       "┌─────┬─────┬─────┐\n",
       "│ Aa  ┆ Bb  ┆ Gg  │\n",
       "│ --- ┆ --- ┆ --- │\n",
       "│ i64 ┆ i64 ┆ str │\n",
       "╞═════╪═════╪═════╡\n",
       "│ 3   ┆ 4   ┆ A   │\n",
       "│ 3   ┆ 12  ┆ B   │\n",
       "│ 3   ┆ 6   ┆ A   │\n",
       "│ 4   ┆ 7   ┆ B   │\n",
       "└─────┴─────┴─────┘"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def rename(n):\n",
    "    return n.upper() + n\n",
    "    \n",
    "df.select(pl.all().name.map(rename))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "953f6978-3401-448c-9529-48c32a7dff51",
   "metadata": {},
   "source": [
    "`DataFrame.map_rows()`\n",
    "\n",
    "行をTupleとして、ユーザー関数に渡します。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "94a74b31-cb7b-43a0-9d69-b32fa1d3a240",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div><style>\n",
       ".dataframe > thead > tr,\n",
       ".dataframe > tbody > tr {\n",
       "  text-align: right;\n",
       "  white-space: pre-wrap;\n",
       "}\n",
       "</style>\n",
       "<small>shape: (4, 2)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>add</th><th>sub</th></tr><tr><td>str</td><td>str</td></tr></thead><tbody><tr><td>&quot;A:7&quot;</td><td>&quot;A:-1&quot;</td></tr><tr><td>&quot;B:15&quot;</td><td>&quot;B:-9&quot;</td></tr><tr><td>&quot;A:9&quot;</td><td>&quot;A:-3&quot;</td></tr><tr><td>&quot;B:11&quot;</td><td>&quot;B:-3&quot;</td></tr></tbody></table></div>"
      ],
      "text/plain": [
       "shape: (4, 2)\n",
       "┌──────┬──────┐\n",
       "│ add  ┆ sub  │\n",
       "│ ---  ┆ ---  │\n",
       "│ str  ┆ str  │\n",
       "╞══════╪══════╡\n",
       "│ A:7  ┆ A:-1 │\n",
       "│ B:15 ┆ B:-9 │\n",
       "│ A:9  ┆ A:-3 │\n",
       "│ B:11 ┆ B:-3 │\n",
       "└──────┴──────┘"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def f(row):\n",
    "    a, b, g = row\n",
    "    return f\"{g}:{a + b}\", f\"{g}:{a - b}\"\n",
    "    \n",
    "df.map_rows(f).rename({\"column_0\": \"add\", \"column_1\": \"sub\"})"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "19397399-7a13-4ea5-bfac-b06b4ade4e82",
   "metadata": {},
   "source": [
    "辞書で行を処理した場合は、`DataFrame.iter_rows()`を使います。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "a914b898-7908-41a9-ae9c-c4f10f3292de",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'a': 3, 'b': 4, 'g': 'A'}\n",
      "{'a': 3, 'b': 12, 'g': 'B'}\n",
      "{'a': 3, 'b': 6, 'g': 'A'}\n",
      "{'a': 4, 'b': 7, 'g': 'B'}\n"
     ]
    }
   ],
   "source": [
    "for row in df.iter_rows(named=True):\n",
    "    print(row)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}