{ "cells": [ { "cell_type": "markdown", "id": "1f711dce-0591-4a37-ac37-0d560a7f1987", "metadata": {}, "source": [ "# 日本の大学データ\n", "\n", "日本の大学に関するデータを用いて、Polarsライブラリの基本的な使い方を解説します。Polarsは、高性能かつ効率的なデータ操作を可能にするRust製のDataFrameライブラリであり、大量のデータを扱う際に特に有用です。以下のセクションでは、データの読み込みから基本的な操作まで、Polarsの基本機能について順を追って説明します。\n", "\n", "次のプログラムは、Polarsライブラリを使用してCSVファイルを読み込み、その最初の2行を表示します。Polarsライブラリを`pl`というエイリアスでインポートするのは一般的です。" ] }, { "cell_type": "markdown", "id": "74d2011a-e2d3-4f58-9213-60e6b03d632d", "metadata": {}, "source": [ "## ファイル読み込み" ] }, { "cell_type": "code", "execution_count": 1, "id": "62b1630f-ed73-4303-a010-f81c983e34d8", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "shape: (2, 22)
codenamename_jptypetype_jpaddresspostal_codephonestatestate_jplatitudelongitudefoundfaculty_countdepartment_counthas_gradhas_remotereview_ratingreview_countdifficulty_SDdifficulty_rank
i64strstrstrstrstrstrstrstrstrstrf64f64stri64i64boolboolf64f64f64str
0"F101110100010""Hokkaido University""北海道大学""National""国立""北海道札幌市北区北8条西5丁目""060-0808""011-716-2111""Hokkai Do""北海道"43.070446141.347153"1876-08"3378truefalse4.161389.060.4"A"
1"F101110100029""Hokkaido University of Educati…"北海道教育大学""National""国立""北海道札幌市北区あいの里5条3-1-3""002-8501""011-778-0206""Hokkai Do""北海道"43.170498141.393753"1943-04"38truefalse3.79544.047.1"D"
" ], "text/plain": [ "shape: (2, 22)\n", "┌─────┬─────────┬─────────┬─────────┬───┬─────────┬─────────┬─────────┬────────┐\n", "│ ┆ code ┆ name ┆ name_jp ┆ … ┆ review_ ┆ review_ ┆ difficu ┆ diffic │\n", "│ --- ┆ --- ┆ --- ┆ --- ┆ ┆ rating ┆ count ┆ lty_SD ┆ ulty_r │\n", "│ i64 ┆ str ┆ str ┆ str ┆ ┆ --- ┆ --- ┆ --- ┆ ank │\n", "│ ┆ ┆ ┆ ┆ ┆ f64 ┆ f64 ┆ f64 ┆ --- │\n", "│ ┆ ┆ ┆ ┆ ┆ ┆ ┆ ┆ str │\n", "╞═════╪═════════╪═════════╪═════════╪═══╪═════════╪═════════╪═════════╪════════╡\n", "│ 0 ┆ F101110 ┆ Hokkaid ┆ 北海道 ┆ … ┆ 4.16 ┆ 1389.0 ┆ 60.4 ┆ A │\n", "│ ┆ 100010 ┆ o Unive ┆ 大学 ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ rsity ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ 1 ┆ F101110 ┆ Hokkaid ┆ 北海道 ┆ … ┆ 3.79 ┆ 544.0 ┆ 47.1 ┆ D │\n", "│ ┆ 100029 ┆ o Unive ┆ 教育大 ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ rsity ┆ 学 ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ of Educ ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ ati… ┆ ┆ ┆ ┆ ┆ ┆ │\n", "└─────┴─────────┴─────────┴─────────┴───┴─────────┴─────────┴─────────┴────────┘" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import polars as pl\n", "df = pl.read_csv('data/japanese_universities.csv')\n", "df.head(2)" ] }, { "cell_type": "markdown", "id": "8737489a-611b-4d8d-af58-8f3c2062559c", "metadata": {}, "source": [ "`columns`でデータフレームのすべての列名をリストとして取得できます。" ] }, { "cell_type": "markdown", "id": "71d8dd5c-c429-40c6-a9a7-c624f815b6ab", "metadata": {}, "source": [ "## 列の情報" ] }, { "cell_type": "code", "execution_count": 2, "id": "72143b70-de18-4c68-a288-b512c9ce3904", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['',\n", " 'code',\n", " 'name',\n", " 'name_jp',\n", " 'type',\n", " 'type_jp',\n", " 'address',\n", " 'postal_code',\n", " 'phone',\n", " 'state',\n", " 'state_jp',\n", " 'latitude',\n", " 'longitude',\n", " 'found',\n", " 'faculty_count',\n", " 'department_count',\n", " 'has_grad',\n", " 'has_remote',\n", " 'review_rating',\n", " 'review_count',\n", " 'difficulty_SD',\n", " 'difficulty_rank']" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], 
"source": [ "df.columns" ] }, { "cell_type": "markdown", "id": "3f8a5214-2587-4a50-a91e-bf926f5ee7c5", "metadata": {}, "source": [ "列名とそのデータ型を取得するには、`schema`を使用します。`schema`は、列名をキー、データ型を値とする辞書を返します。" ] }, { "cell_type": "code", "execution_count": 3, "id": "a785acbc-93e9-4d8c-8b80-5d26cfff9d99", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Schema([('', Int64),\n", " ('code', String),\n", " ('name', String),\n", " ('name_jp', String),\n", " ('type', String),\n", " ('type_jp', String),\n", " ('address', String),\n", " ('postal_code', String),\n", " ('phone', String),\n", " ('state', String),\n", " ('state_jp', String),\n", " ('latitude', Float64),\n", " ('longitude', Float64),\n", " ('found', String),\n", " ('faculty_count', Int64),\n", " ('department_count', Int64),\n", " ('has_grad', Boolean),\n", " ('has_remote', Boolean),\n", " ('review_rating', Float64),\n", " ('review_count', Float64),\n", " ('difficulty_SD', Float64),\n", " ('difficulty_rank', String)])" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.schema" ] }, { "cell_type": "markdown", "id": "6762791d-0ee4-4272-bf47-06bf0f95a7c6", "metadata": {}, "source": [ "## 列操作" ] }, { "cell_type": "markdown", "id": "19954604-b667-4e98-8283-d3931f73195c", "metadata": {}, "source": [ "`drop()`メソッドを使って、特定の列を削除します。Polarsのほとんどのメソッドは新しいデータフレームを返すため、変数`df2`に新しいデータフレームを保存します。" ] }, { "cell_type": "code", "execution_count": 4, "id": "927042a0-a586-45bb-bdc2-4097032882eb", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "shape: (2, 15)
name_jptype_jpaddressstate_jplatitudelongitudefoundfaculty_countdepartment_counthas_gradhas_remotereview_ratingreview_countdifficulty_SDdifficulty_rank
strstrstrstrf64f64stri64i64boolboolf64f64f64str
"北海道大学""国立""北海道札幌市北区北8条西5丁目""北海道"43.070446141.347153"1876-08"3378truefalse4.161389.060.4"A"
"北海道教育大学""国立""北海道札幌市北区あいの里5条3-1-3""北海道"43.170498141.393753"1943-04"38truefalse3.79544.047.1"D"
" ], "text/plain": [ "shape: (2, 15)\n", "┌─────────┬─────────┬─────────┬────────┬───┬────────┬────────┬────────┬────────┐\n", "│ name_jp ┆ type_jp ┆ address ┆ state_ ┆ … ┆ review ┆ review ┆ diffic ┆ diffic │\n", "│ --- ┆ --- ┆ --- ┆ jp ┆ ┆ _ratin ┆ _count ┆ ulty_S ┆ ulty_r │\n", "│ str ┆ str ┆ str ┆ --- ┆ ┆ g ┆ --- ┆ D ┆ ank │\n", "│ ┆ ┆ ┆ str ┆ ┆ --- ┆ f64 ┆ --- ┆ --- │\n", "│ ┆ ┆ ┆ ┆ ┆ f64 ┆ ┆ f64 ┆ str │\n", "╞═════════╪═════════╪═════════╪════════╪═══╪════════╪════════╪════════╪════════╡\n", "│ 北海道 ┆ 国立 ┆ 北海道 ┆ 北海道 ┆ … ┆ 4.16 ┆ 1389.0 ┆ 60.4 ┆ A │\n", "│ 大学 ┆ ┆ 札幌市 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ 北区北8 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ 条西5丁 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ 目 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ 北海道 ┆ 国立 ┆ 北海道 ┆ 北海道 ┆ … ┆ 3.79 ┆ 544.0 ┆ 47.1 ┆ D │\n", "│ 教育大 ┆ ┆ 札幌市 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ 学 ┆ ┆ 北区あ ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ いの里5 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ 条3-1 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ -3 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "└─────────┴─────────┴─────────┴────────┴───┴────────┴────────┴────────┴────────┘" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df2 = df.drop('', 'code', 'name', 'type', 'state', 'postal_code', 'phone')\n", "df2.head(2)" ] }, { "cell_type": "markdown", "id": "939334fd-9ef3-4c59-bc7f-991bf4f6293a", "metadata": {}, "source": [ "`with_columns`メソッドは、データフレームに対して新しい列を追加したり、既存の列を変換するために使用します。このメソッドを使うと、複数の列に対して同時に操作を行うことができます。次のコードでは、`df2`に対して`with_columns`メソッドを使って次のような演算を行い、新しいデータフレーム`df3`を作成します。\n", "\n", "* `found`列を文字列から日付型に変換します。\n", "* `type_jp`と`difficulty_rank`列を文字列型からカテゴリカル型に変換します。カテゴリカル型は、文字列型に比べてメモリ効率が良く、分析に便利です。\n", "\n", "`with_columns`に渡した引数は演算の結果ではなく、演算そのものを表す演算式です。[演算式の詳細について](expression)" ] }, { "cell_type": "code", "execution_count": 5, "id": "31577d1c-a9ce-4acb-b1f0-e38272d9930a", "metadata": {}, "outputs": [], "source": [ "df3 = df2.with_columns(\n", " pl.col('found').str.to_date(\"%Y-%m\"),\n", " pl.col('type_jp', 'difficulty_rank').cast(pl.Categorical)\n", ")" ] }, { "cell_type": "markdown", "id": 
"bbefbc2a-488e-4ee0-997b-d5c76923c549", "metadata": {}, "source": [ "確認のため、`select()`メソッドを使ってデータ型変換が行われた列を選択し、先頭の5行を表示します。" ] }, { "cell_type": "code", "execution_count": 6, "id": "f2453419-b870-400c-a885-6e88e7b0796a", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "shape: (5, 4)
name_jpfoundtype_jpdifficulty_rank
strdatecatcat
"北海道大学"1876-08-01"国立""A"
"北海道教育大学"1943-04-01"国立""D"
"室蘭工業大学"1897-05-01"国立""F"
"小樽商科大学"1910-03-01"国立""C"
"帯広畜産大学"1941-04-01"国立""B"
" ], "text/plain": [ "shape: (5, 4)\n", "┌────────────────┬────────────┬─────────┬─────────────────┐\n", "│ name_jp ┆ found ┆ type_jp ┆ difficulty_rank │\n", "│ --- ┆ --- ┆ --- ┆ --- │\n", "│ str ┆ date ┆ cat ┆ cat │\n", "╞════════════════╪════════════╪═════════╪═════════════════╡\n", "│ 北海道大学 ┆ 1876-08-01 ┆ 国立 ┆ A │\n", "│ 北海道教育大学 ┆ 1943-04-01 ┆ 国立 ┆ D │\n", "│ 室蘭工業大学 ┆ 1897-05-01 ┆ 国立 ┆ F │\n", "│ 小樽商科大学 ┆ 1910-03-01 ┆ 国立 ┆ C │\n", "│ 帯広畜産大学 ┆ 1941-04-01 ┆ 国立 ┆ B │\n", "└────────────────┴────────────┴─────────┴─────────────────┘" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df3.select('name_jp', 'found', 'type_jp', 'difficulty_rank').head()" ] }, { "cell_type": "markdown", "id": "8aef14ab-92c6-4c67-894e-a21f0a7dcb8b", "metadata": {}, "source": [ ":::{tip}\n", "`select()`と`with_columns()`は、データフレームでの列操作に使われるメソッドです。違いは、`select()`は選択された列のみを出力し、その他の列は含まれませんが、`with_columns()`は操作を行った列だけでなく、操作されなかった列もそのまま出力します。\n", ":::" ] }, { "cell_type": "markdown", "id": "f571953d-5e70-440d-8866-66362aa4b195", "metadata": {}, "source": [ "## 行フィルタ" ] }, { "cell_type": "markdown", "id": "a39fc882-2f44-4953-947d-08bbe4952b06", "metadata": {}, "source": [ "`filter()`メソッドで指定した条件を満たす行を選択することができます。このメソッドは新しいデータフレームを返し、元のデータフレームは変更されません。次のコードは`difficulty_SD`列(偏差値)の値が65より大きい行を選択しています。" ] }, { "cell_type": "code", "execution_count": 7, "id": "3ddf8485-3146-4287-b1ba-54943ae2639e", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "shape: (8, 15)
name_jptype_jpaddressstate_jplatitudelongitudefoundfaculty_countdepartment_counthas_gradhas_remotereview_ratingreview_countdifficulty_SDdifficulty_rank
strcatstrstrf64f64datei64i64boolboolf64f64f64cat
"東京大学""国立""東京都文京区本郷7-3-1""東京都"35.714146139.7632141877-04-0125132truefalse4.342206.070.5"S"
"一橋大学""国立""東京都国立市中2-1""東京都"35.694389139.4436491920-04-011015truefalse4.26432.067.9"S"
"京都大学""国立""京都府京都市左京区吉田本町""京都府"35.026962135.7819671886-04-012894truefalse4.21434.065.6"S"
"国際教養大学""公立""秋田県秋田市雄和椿川字奥椿岱""秋田県"39.62706140.1811072004-04-0124truefalse4.18100.069.7"S"
"慶應義塾大学""私立""東京都港区三田2-15-45""東京都"35.649147139.7428281890-01-012859truetrue4.192682.066.3"S"
"日本医科大学""私立""東京都文京区千駄木1-1-5""東京都"35.721233139.7584531904-04-0122truefalse3.9931.070.0"S"
"早稲田大学""私立""東京都新宿区戸塚町1-104""東京都"35.710083139.7222441902-10-013594truetrue4.144280.065.7"S"
"国際基督教大学""私立""東京都三鷹市大沢3-10-2""東京都"35.686474139.5274811953-03-0126truefalse4.43206.068.0"S"
" ], "text/plain": [ "shape: (8, 15)\n", "┌─────────┬─────────┬─────────┬────────┬───┬────────┬────────┬────────┬────────┐\n", "│ name_jp ┆ type_jp ┆ address ┆ state_ ┆ … ┆ review ┆ review ┆ diffic ┆ diffic │\n", "│ --- ┆ --- ┆ --- ┆ jp ┆ ┆ _ratin ┆ _count ┆ ulty_S ┆ ulty_r │\n", "│ str ┆ cat ┆ str ┆ --- ┆ ┆ g ┆ --- ┆ D ┆ ank │\n", "│ ┆ ┆ ┆ str ┆ ┆ --- ┆ f64 ┆ --- ┆ --- │\n", "│ ┆ ┆ ┆ ┆ ┆ f64 ┆ ┆ f64 ┆ cat │\n", "╞═════════╪═════════╪═════════╪════════╪═══╪════════╪════════╪════════╪════════╡\n", "│ 東京大 ┆ 国立 ┆ 東京都 ┆ 東京都 ┆ … ┆ 4.34 ┆ 2206.0 ┆ 70.5 ┆ S │\n", "│ 学 ┆ ┆ 文京区 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ 本郷7- ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ 3-1 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ 一橋大 ┆ 国立 ┆ 東京都 ┆ 東京都 ┆ … ┆ 4.26 ┆ 432.0 ┆ 67.9 ┆ S │\n", "│ 学 ┆ ┆ 国立市 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ 中2-1 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ 京都大 ┆ 国立 ┆ 京都府 ┆ 京都府 ┆ … ┆ 4.2 ┆ 1434.0 ┆ 65.6 ┆ S │\n", "│ 学 ┆ ┆ 京都市 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ 左京区 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ 吉田本 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ 町 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ 国際教 ┆ 公立 ┆ 秋田県 ┆ 秋田県 ┆ … ┆ 4.18 ┆ 100.0 ┆ 69.7 ┆ S │\n", "│ 養大学 ┆ ┆ 秋田市 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ 雄和椿 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ 川字奥 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ 椿岱 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ 慶應義 ┆ 私立 ┆ 東京都 ┆ 東京都 ┆ … ┆ 4.19 ┆ 2682.0 ┆ 66.3 ┆ S │\n", "│ 塾大学 ┆ ┆ 港区三 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ 田2-15 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ -45 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ 日本医 ┆ 私立 ┆ 東京都 ┆ 東京都 ┆ … ┆ 3.99 ┆ 31.0 ┆ 70.0 ┆ S │\n", "│ 科大学 ┆ ┆ 文京区 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ 千駄木1 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ -1-5 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ 早稲田 ┆ 私立 ┆ 東京都 ┆ 東京都 ┆ … ┆ 4.14 ┆ 4280.0 ┆ 65.7 ┆ S │\n", "│ 大学 ┆ ┆ 新宿区 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ 戸塚町 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ 1-1 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ 04 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ 国際基 ┆ 私立 ┆ 東京都 ┆ 東京都 ┆ … ┆ 4.43 ┆ 206.0 ┆ 68.0 ┆ S │\n", "│ 督教大 ┆ ┆ 三鷹市 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ 学 ┆ ┆ 大沢3- ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ 10-2 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "└─────────┴─────────┴─────────┴────────┴───┴────────┴────────┴────────┴────────┘" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df3.filter(\n", " pl.col('difficulty_SD') > 65\n", ")" 
] }, { "cell_type": "markdown", "id": "7994cfb1-4c23-45a0-8a5c-ab1805ca9202", "metadata": {}, "source": [ "Pythonのビット演算子`~, |, &`を使って、複数の条件をANDやORで組み合わせて行を選択することができます。次のコードは、大阪にあり偏差値が55より高い大学を抽出します。" ] }, { "cell_type": "code", "execution_count": 8, "id": "551bd4a6-7ec8-4a55-a9ab-89ae65f993ca", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "shape: (3, 15)
name_jptype_jpaddressstate_jplatitudelongitudefoundfaculty_countdepartment_counthas_gradhas_remotereview_ratingreview_countdifficulty_SDdifficulty_rank
strcatstrstrf64f64datei64i64boolboolf64f64f64cat
"大阪大学""国立""大阪府吹田市山田丘1-1""大阪府"34.819496135.5211641917-04-012675truefalse4.121840.061.3"A"
"大阪医科薬科大学""私立""大阪府高槻市大学町2-7""大阪府"34.85186135.6244051904-05-0169truefalse4.119.057.0"B"
"関西大学""私立""大阪府吹田市山手町3-3-35""大阪府"34.769669135.5101621904-01-012944truefalse3.952661.055.6"B"
" ], "text/plain": [ "shape: (3, 15)\n", "┌─────────┬─────────┬─────────┬────────┬───┬────────┬────────┬────────┬────────┐\n", "│ name_jp ┆ type_jp ┆ address ┆ state_ ┆ … ┆ review ┆ review ┆ diffic ┆ diffic │\n", "│ --- ┆ --- ┆ --- ┆ jp ┆ ┆ _ratin ┆ _count ┆ ulty_S ┆ ulty_r │\n", "│ str ┆ cat ┆ str ┆ --- ┆ ┆ g ┆ --- ┆ D ┆ ank │\n", "│ ┆ ┆ ┆ str ┆ ┆ --- ┆ f64 ┆ --- ┆ --- │\n", "│ ┆ ┆ ┆ ┆ ┆ f64 ┆ ┆ f64 ┆ cat │\n", "╞═════════╪═════════╪═════════╪════════╪═══╪════════╪════════╪════════╪════════╡\n", "│ 大阪大 ┆ 国立 ┆ 大阪府 ┆ 大阪府 ┆ … ┆ 4.12 ┆ 1840.0 ┆ 61.3 ┆ A │\n", "│ 学 ┆ ┆ 吹田市 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ 山田丘1 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ -1 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ 大阪医 ┆ 私立 ┆ 大阪府 ┆ 大阪府 ┆ … ┆ 4.11 ┆ 9.0 ┆ 57.0 ┆ B │\n", "│ 科薬科 ┆ ┆ 高槻市 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ 大学 ┆ ┆ 大学町2 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ -7 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ 関西大 ┆ 私立 ┆ 大阪府 ┆ 大阪府 ┆ … ┆ 3.95 ┆ 2661.0 ┆ 55.6 ┆ B │\n", "│ 学 ┆ ┆ 吹田市 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ 山手町3 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "│ ┆ ┆ -3-35 ┆ ┆ ┆ ┆ ┆ ┆ │\n", "└─────────┴─────────┴─────────┴────────┴───┴────────┴────────┴────────┴────────┘" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df3.filter(\n", " (pl.col('difficulty_SD') > 55) & (pl.col('state_jp').str.starts_with(\"大阪\"))\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "id": "c20e3323-e1e2-4103-9ab3-4b4d77f6fd34", "metadata": {}, "source": [ "## メソッドチェーン\n", "\n", "メソッドチェーンとは、複数のメソッドを連続して呼び出すプログラミングスタイルのことです。Polarsライブラリでは、データフレームに対して複数の操作を直感的に連鎖させるためにこのスタイルをよく使用します。メソッドチェーンを使うことで、コードが読みやすくなり、一連のデータ処理操作を一つの流れとして視覚的に把握することができます。\n", "\n", "以下のコードは、データフレーム`df3`に対して一連の操作をメソッドチェーンを使って行っています:\n", "\n", "* `filter()`で`type_jp`列が「国立」である行をフィルタリングします。\n", "* `sort()`で`difficulty_SD`列に基づいて、データを降順(大きい順)にソートします。\n", "* `drop_nulls()`で`difficulty_SD`列にNULL(欠損値)が含まれる行を削除します。\n", "* `select()`で`name_jp`列と`difficulty_SD`列だけを選択します。\n", "* `head()`で最初の5行を取得します。" ] }, { "cell_type": "code", "execution_count": 9, "id": "8ef761c0-dd4f-42e6-befc-cd86cb131f4d", "metadata": {}, "outputs": [ 
{ "data": { "text/html": [ "
\n", "shape: (5, 2)
name_jpdifficulty_SD
strf64
"東京大学"70.5
"一橋大学"67.9
"京都大学"65.6
"東京工業大学"65.0
"浜松医科大学"64.4
" ], "text/plain": [ "shape: (5, 2)\n", "┌──────────────┬───────────────┐\n", "│ name_jp ┆ difficulty_SD │\n", "│ --- ┆ --- │\n", "│ str ┆ f64 │\n", "╞══════════════╪═══════════════╡\n", "│ 東京大学 ┆ 70.5 │\n", "│ 一橋大学 ┆ 67.9 │\n", "│ 京都大学 ┆ 65.6 │\n", "│ 東京工業大学 ┆ 65.0 │\n", "│ 浜松医科大学 ┆ 64.4 │\n", "└──────────────┴───────────────┘" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(df3\n", ".filter(pl.col.type_jp == '国立')\n", ".sort('difficulty_SD', descending=True)\n", ".drop_nulls('difficulty_SD')\n", ".select('name_jp', 'difficulty_SD')\n", ".head(5)\n", ")" ] }, { "cell_type": "markdown", "id": "e28dc06a-21a0-4f02-98a9-b1ad34f6b2c2", "metadata": {}, "source": [ "## グループ処理" ] }, { "cell_type": "markdown", "id": "9269f465-1eed-4e35-bf47-79a6997b3aa5", "metadata": {}, "source": [ "次のコードで、`type_jp`列のユニークな値を取得します。結果から、国立、公立、私立の3種類の大学のデータがあることがわかります。" ] }, { "cell_type": "code", "execution_count": 10, "id": "c8511b44-2cb7-48ac-bbfc-301e2f166d94", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "shape: (3, 1)
type_jp
cat
"国立"
"公立"
"私立"
" ], "text/plain": [ "shape: (3, 1)\n", "┌─────────┐\n", "│ type_jp │\n", "│ --- │\n", "│ cat │\n", "╞═════════╡\n", "│ 国立 │\n", "│ 公立 │\n", "│ 私立 │\n", "└─────────┘" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df3.select(pl.col('type_jp').unique())" ] }, { "cell_type": "markdown", "id": "c73408a7-cc4e-4782-a5fc-bdf9d821ad85", "metadata": {}, "source": [ "`type_jp`列のような分類列に対して、グループ化し、各個グループのデータを集計を行うために、`group_by()`と`agg()`を使います。\n", "\n", "* `group_by()`: 指定した列の値ごとにデータをグループ化します。\n", "* `agg()`: グループ化した後に、各グループに対して特定の集計処理(例えば、カウント、合計、平均など)を実行します。\n", "\n", "次の例では`pl.len()`で各個グループの長さを取得します。" ] }, { "cell_type": "code", "execution_count": 11, "id": "70977191-5ec0-48a7-b10a-9cc80e39d83f", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "shape: (3, 2)
type_jplen
catu32
"国立"86
"公立"101
"私立"626
" ], "text/plain": [ "shape: (3, 2)\n", "┌─────────┬─────┐\n", "│ type_jp ┆ len │\n", "│ --- ┆ --- │\n", "│ cat ┆ u32 │\n", "╞═════════╪═════╡\n", "│ 国立 ┆ 86 │\n", "│ 公立 ┆ 101 │\n", "│ 私立 ┆ 626 │\n", "└─────────┴─────┘" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df3.group_by('type_jp').agg(pl.len())" ] }, { "cell_type": "markdown", "id": "0ed5638a-d9f1-428a-a28e-b0011009bf22", "metadata": {}, "source": [ "次のコードは、各グループのサイズに加えて、偏差値列(difficulty_SD)の最小値、中間値、平均値、最大値を求めます。キーワード引数を使用して列名を指定します。" ] }, { "cell_type": "code", "execution_count": 12, "id": "341f6ebe-0d0b-4147-988c-95c3212f602b", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "shape: (3, 6)
type_jpcountminmedianmeanmax
catu32f64f64f64f64
"国立"8635.751.752.91707370.5
"公立"10140.050.050.04239169.7
"私立"62635.038.540.80103470.0
" ], "text/plain": [ "shape: (3, 6)\n", "┌─────────┬───────┬──────┬────────┬───────────┬──────┐\n", "│ type_jp ┆ count ┆ min ┆ median ┆ mean ┆ max │\n", "│ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │\n", "│ cat ┆ u32 ┆ f64 ┆ f64 ┆ f64 ┆ f64 │\n", "╞═════════╪═══════╪══════╪════════╪═══════════╪══════╡\n", "│ 国立 ┆ 86 ┆ 35.7 ┆ 51.7 ┆ 52.917073 ┆ 70.5 │\n", "│ 公立 ┆ 101 ┆ 40.0 ┆ 50.0 ┆ 50.042391 ┆ 69.7 │\n", "│ 私立 ┆ 626 ┆ 35.0 ┆ 38.5 ┆ 40.801034 ┆ 70.0 │\n", "└─────────┴───────┴──────┴────────┴───────────┴──────┘" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(df3\n", ".group_by('type_jp')\n", ".agg(\n", " count = pl.len(),\n", " min = pl.col('difficulty_SD').min(),\n", " median = pl.col('difficulty_SD').median(),\n", " mean = pl.col('difficulty_SD').mean(),\n", " max = pl.col('difficulty_SD').max(),\n", ")\n", ")" ] }, { "cell_type": "markdown", "id": "0665baf0-bbba-4630-8af1-164e7e8b4389", "metadata": {}, "source": [ "## グラフ" ] }, { "cell_type": "markdown", "id": "0cebd59c-3cf1-4e45-889d-b96628d1a44e", "metadata": {}, "source": [ "Polars自体にはグラフを出力する機能がありませんが、次のコマンドで`hvplot`と`geoviews`をインストールすれば、インタラクティブなグラフを作成することができます。\n", "\n", "```\n", "conda install hvplot geoviews\n", "```" ] }, { "cell_type": "markdown", "id": "c7b8b6f9-9a40-4ebb-92de-f8ed789e2552", "metadata": {}, "source": [ "### 棒グラフ" ] }, { "cell_type": "markdown", "id": "8c7b15b3-1a9b-4fbe-8993-18588076f7f9", "metadata": {}, "source": [ "以下のプログラムは、都道府県ごとにグループ化し、各都道府県にある大学の数を計算して、降順に並べ替え、最終的に棒グラフを作成します。グラフ作成関連のメソッドは全部`.plot`ネーミングスペースの下にあります。`bar()`メソッドの各個引数は以下のようです。\n", "\n", "* X軸に`x_label`(都道府県)、Y軸に`y_label`(大学の数)を設定します。\n", "* `rot=90`はX軸のラベル(都道府県名)を90度回転させ、縦向きに表示します。\n", "* `frame_width=900`はグラフの幅を900ピクセルに設定します。" ] }, { "cell_type": "code", "execution_count": 13, "id": "db6cd0f4-c798-447b-967d-701951e84382", "metadata": {}, "outputs": [ { "data": { "application/javascript": [ "(function(root) {\n", " function now() {\n", " return new Date();\n", " }\n", "\n", " 
var force = true;\n", " var py_version = '3.4.2'.replace('rc', '-rc.').replace('.dev', '-dev.');\n", " var reloading = false;\n", " var Bokeh = root.Bokeh;\n", "\n", " if (typeof (root._bokeh_timeout) === \"undefined\" || force) {\n", " root._bokeh_timeout = Date.now() + 5000;\n", " root._bokeh_failed_load = false;\n", " }\n", "\n", " function run_callbacks() {\n", " try {\n", " root._bokeh_onload_callbacks.forEach(function(callback) {\n", " if (callback != null)\n", " callback();\n", " });\n", " } finally {\n", " delete root._bokeh_onload_callbacks;\n", " }\n", " console.debug(\"Bokeh: all callbacks have finished\");\n", " }\n", "\n", " function load_libs(css_urls, js_urls, js_modules, js_exports, callback) {\n", " if (css_urls == null) css_urls = [];\n", " if (js_urls == null) js_urls = [];\n", " if (js_modules == null) js_modules = [];\n", " if (js_exports == null) js_exports = {};\n", "\n", " root._bokeh_onload_callbacks.push(callback);\n", "\n", " if (root._bokeh_is_loading > 0) {\n", " console.debug(\"Bokeh: BokehJS is being loaded, scheduling callback at\", now());\n", " return null;\n", " }\n", " if (js_urls.length === 0 && js_modules.length === 0 && Object.keys(js_exports).length === 0) {\n", " run_callbacks();\n", " return null;\n", " }\n", " if (!reloading) {\n", " console.debug(\"Bokeh: BokehJS not loaded, scheduling load and callback at\", now());\n", " }\n", "\n", " function on_load() {\n", " root._bokeh_is_loading--;\n", " if (root._bokeh_is_loading === 0) {\n", " console.debug(\"Bokeh: all BokehJS libraries/stylesheets loaded\");\n", " run_callbacks()\n", " }\n", " }\n", " window._bokeh_on_load = on_load\n", "\n", " function on_error() {\n", " console.error(\"failed to load \" + url);\n", " }\n", "\n", " var skip = [];\n", " if (window.requirejs) {\n", " window.requirejs.config({'packages': {}, 'paths': {}, 'shim': {}});\n", " root._bokeh_is_loading = css_urls.length + 0;\n", " } else {\n", " root._bokeh_is_loading = css_urls.length + js_urls.length 
+ js_modules.length + Object.keys(js_exports).length;\n", " }\n", "\n", " var existing_stylesheets = []\n", " var links = document.getElementsByTagName('link')\n", " for (var i = 0; i < links.length; i++) {\n", " var link = links[i]\n", " if (link.href != null) {\n", "\texisting_stylesheets.push(link.href)\n", " }\n", " }\n", " for (var i = 0; i < css_urls.length; i++) {\n", " var url = css_urls[i];\n", " if (existing_stylesheets.indexOf(url) !== -1) {\n", "\ton_load()\n", "\tcontinue;\n", " }\n", " const element = document.createElement(\"link\");\n", " element.onload = on_load;\n", " element.onerror = on_error;\n", " element.rel = \"stylesheet\";\n", " element.type = \"text/css\";\n", " element.href = url;\n", " console.debug(\"Bokeh: injecting link tag for BokehJS stylesheet: \", url);\n", " document.body.appendChild(element);\n", " } var existing_scripts = []\n", " var scripts = document.getElementsByTagName('script')\n", " for (var i = 0; i < scripts.length; i++) {\n", " var script = scripts[i]\n", " if (script.src != null) {\n", "\texisting_scripts.push(script.src)\n", " }\n", " }\n", " for (var i = 0; i < js_urls.length; i++) {\n", " var url = js_urls[i];\n", " if (skip.indexOf(url) !== -1 || existing_scripts.indexOf(url) !== -1) {\n", "\tif (!window.requirejs) {\n", "\t on_load();\n", "\t}\n", "\tcontinue;\n", " }\n", " var element = document.createElement('script');\n", " element.onload = on_load;\n", " element.onerror = on_error;\n", " element.async = false;\n", " element.src = url;\n", " console.debug(\"Bokeh: injecting script tag for BokehJS library: \", url);\n", " document.head.appendChild(element);\n", " }\n", " for (var i = 0; i < js_modules.length; i++) {\n", " var url = js_modules[i];\n", " if (skip.indexOf(url) !== -1 || existing_scripts.indexOf(url) !== -1) {\n", "\tif (!window.requirejs) {\n", "\t on_load();\n", "\t}\n", "\tcontinue;\n", " }\n", " var element = document.createElement('script');\n", " element.onload = on_load;\n", " 
element.onerror = on_error;\n", " element.async = false;\n", " element.src = url;\n", " element.type = \"module\";\n", " console.debug(\"Bokeh: injecting script tag for BokehJS library: \", url);\n", " document.head.appendChild(element);\n", " }\n", " for (const name in js_exports) {\n", " var url = js_exports[name];\n", " if (skip.indexOf(url) >= 0 || root[name] != null) {\n", "\tif (!window.requirejs) {\n", "\t on_load();\n", "\t}\n", "\tcontinue;\n", " }\n", " var element = document.createElement('script');\n", " element.onerror = on_error;\n", " element.async = false;\n", " element.type = \"module\";\n", " console.debug(\"Bokeh: injecting script tag for BokehJS library: \", url);\n", " element.textContent = `\n", " import ${name} from \"${url}\"\n", " window.${name} = ${name}\n", " window._bokeh_on_load()\n", " `\n", " document.head.appendChild(element);\n", " }\n", " if (!js_urls.length && !js_modules.length) {\n", " on_load()\n", " }\n", " };\n", "\n", " function inject_raw_css(css) {\n", " const element = document.createElement(\"style\");\n", " element.appendChild(document.createTextNode(css));\n", " document.body.appendChild(element);\n", " }\n", "\n", " var js_urls = [\"https://cdn.bokeh.org/bokeh/release/bokeh-3.4.2.min.js\", \"https://cdn.bokeh.org/bokeh/release/bokeh-gl-3.4.2.min.js\", \"https://cdn.bokeh.org/bokeh/release/bokeh-widgets-3.4.2.min.js\", \"https://cdn.bokeh.org/bokeh/release/bokeh-tables-3.4.2.min.js\", \"https://cdn.holoviz.org/panel/1.4.4/dist/panel.min.js\"];\n", " var js_modules = [];\n", " var js_exports = {};\n", " var css_urls = [];\n", " var inline_js = [ function(Bokeh) {\n", " Bokeh.set_log_level(\"info\");\n", " },\n", "function(Bokeh) {} // ensure no trailing comma for IE\n", " ];\n", "\n", " function run_inline_js() {\n", " if ((root.Bokeh !== undefined) || (force === true)) {\n", " for (var i = 0; i < inline_js.length; i++) {\n", "\ttry {\n", " inline_js[i].call(root, root.Bokeh);\n", "\t} catch(e) {\n", "\t if (!reloading) 
{\n", "\t throw e;\n", "\t }\n", "\t}\n", " }\n", " // Cache old bokeh versions\n", " if (Bokeh != undefined && !reloading) {\n", "\tvar NewBokeh = root.Bokeh;\n", "\tif (Bokeh.versions === undefined) {\n", "\t Bokeh.versions = new Map();\n", "\t}\n", "\tif (NewBokeh.version !== Bokeh.version) {\n", "\t Bokeh.versions.set(NewBokeh.version, NewBokeh)\n", "\t}\n", "\troot.Bokeh = Bokeh;\n", " }} else if (Date.now() < root._bokeh_timeout) {\n", " setTimeout(run_inline_js, 100);\n", " } else if (!root._bokeh_failed_load) {\n", " console.log(\"Bokeh: BokehJS failed to load within specified timeout.\");\n", " root._bokeh_failed_load = true;\n", " }\n", " root._bokeh_is_initializing = false\n", " }\n", "\n", " function load_or_wait() {\n", " // Implement a backoff loop that tries to ensure we do not load multiple\n", " // versions of Bokeh and its dependencies at the same time.\n", " // In recent versions we use the root._bokeh_is_initializing flag\n", " // to determine whether there is an ongoing attempt to initialize\n", " // bokeh, however for backward compatibility we also try to ensure\n", " // that we do not start loading a newer (Panel>=1.0 and Bokeh>3) version\n", " // before older versions are fully initialized.\n", " if (root._bokeh_is_initializing && Date.now() > root._bokeh_timeout) {\n", " root._bokeh_is_initializing = false;\n", " root._bokeh_onload_callbacks = undefined;\n", " console.log(\"Bokeh: BokehJS was loaded multiple times but one version failed to initialize.\");\n", " load_or_wait();\n", " } else if (root._bokeh_is_initializing || (typeof root._bokeh_is_initializing === \"undefined\" && root._bokeh_onload_callbacks !== undefined)) {\n", " setTimeout(load_or_wait, 100);\n", " } else {\n", " root._bokeh_is_initializing = true\n", " root._bokeh_onload_callbacks = []\n", " var bokeh_loaded = Bokeh != null && (Bokeh.version === py_version || (Bokeh.versions !== undefined && Bokeh.versions.has(py_version)));\n", " if (!reloading && !bokeh_loaded) {\n", 
"\troot.Bokeh = undefined;\n", " }\n", " load_libs(css_urls, js_urls, js_modules, js_exports, function() {\n", "\tconsole.debug(\"Bokeh: BokehJS plotting callback run at\", now());\n", "\trun_inline_js();\n", " });\n", " }\n", " }\n", " // Give older versions of the autoload script a head-start to ensure\n", " // they initialize before we start loading newer version.\n", " setTimeout(load_or_wait, 100)\n", "}(window));" ], "application/vnd.holoviews_load.v0+json": "(function(root) {\n function now() {\n return new Date();\n }\n\n var force = true;\n var py_version = '3.4.2'.replace('rc', '-rc.').replace('.dev', '-dev.');\n var reloading = false;\n var Bokeh = root.Bokeh;\n\n if (typeof (root._bokeh_timeout) === \"undefined\" || force) {\n root._bokeh_timeout = Date.now() + 5000;\n root._bokeh_failed_load = false;\n }\n\n function run_callbacks() {\n try {\n root._bokeh_onload_callbacks.forEach(function(callback) {\n if (callback != null)\n callback();\n });\n } finally {\n delete root._bokeh_onload_callbacks;\n }\n console.debug(\"Bokeh: all callbacks have finished\");\n }\n\n function load_libs(css_urls, js_urls, js_modules, js_exports, callback) {\n if (css_urls == null) css_urls = [];\n if (js_urls == null) js_urls = [];\n if (js_modules == null) js_modules = [];\n if (js_exports == null) js_exports = {};\n\n root._bokeh_onload_callbacks.push(callback);\n\n if (root._bokeh_is_loading > 0) {\n console.debug(\"Bokeh: BokehJS is being loaded, scheduling callback at\", now());\n return null;\n }\n if (js_urls.length === 0 && js_modules.length === 0 && Object.keys(js_exports).length === 0) {\n run_callbacks();\n return null;\n }\n if (!reloading) {\n console.debug(\"Bokeh: BokehJS not loaded, scheduling load and callback at\", now());\n }\n\n function on_load() {\n root._bokeh_is_loading--;\n if (root._bokeh_is_loading === 0) {\n console.debug(\"Bokeh: all BokehJS libraries/stylesheets loaded\");\n run_callbacks()\n }\n }\n window._bokeh_on_load = on_load\n\n 
function on_error() {\n console.error(\"failed to load \" + url);\n }\n\n var skip = [];\n if (window.requirejs) {\n window.requirejs.config({'packages': {}, 'paths': {}, 'shim': {}});\n root._bokeh_is_loading = css_urls.length + 0;\n } else {\n root._bokeh_is_loading = css_urls.length + js_urls.length + js_modules.length + Object.keys(js_exports).length;\n }\n\n var existing_stylesheets = []\n var links = document.getElementsByTagName('link')\n for (var i = 0; i < links.length; i++) {\n var link = links[i]\n if (link.href != null) {\n\texisting_stylesheets.push(link.href)\n }\n }\n for (var i = 0; i < css_urls.length; i++) {\n var url = css_urls[i];\n if (existing_stylesheets.indexOf(url) !== -1) {\n\ton_load()\n\tcontinue;\n }\n const element = document.createElement(\"link\");\n element.onload = on_load;\n element.onerror = on_error;\n element.rel = \"stylesheet\";\n element.type = \"text/css\";\n element.href = url;\n console.debug(\"Bokeh: injecting link tag for BokehJS stylesheet: \", url);\n document.body.appendChild(element);\n } var existing_scripts = []\n var scripts = document.getElementsByTagName('script')\n for (var i = 0; i < scripts.length; i++) {\n var script = scripts[i]\n if (script.src != null) {\n\texisting_scripts.push(script.src)\n }\n }\n for (var i = 0; i < js_urls.length; i++) {\n var url = js_urls[i];\n if (skip.indexOf(url) !== -1 || existing_scripts.indexOf(url) !== -1) {\n\tif (!window.requirejs) {\n\t on_load();\n\t}\n\tcontinue;\n }\n var element = document.createElement('script');\n element.onload = on_load;\n element.onerror = on_error;\n element.async = false;\n element.src = url;\n console.debug(\"Bokeh: injecting script tag for BokehJS library: \", url);\n document.head.appendChild(element);\n }\n for (var i = 0; i < js_modules.length; i++) {\n var url = js_modules[i];\n if (skip.indexOf(url) !== -1 || existing_scripts.indexOf(url) !== -1) {\n\tif (!window.requirejs) {\n\t on_load();\n\t}\n\tcontinue;\n }\n var element = 
document.createElement('script');\n element.onload = on_load;\n element.onerror = on_error;\n element.async = false;\n element.src = url;\n element.type = \"module\";\n console.debug(\"Bokeh: injecting script tag for BokehJS library: \", url);\n document.head.appendChild(element);\n }\n for (const name in js_exports) {\n var url = js_exports[name];\n if (skip.indexOf(url) >= 0 || root[name] != null) {\n\tif (!window.requirejs) {\n\t on_load();\n\t}\n\tcontinue;\n }\n var element = document.createElement('script');\n element.onerror = on_error;\n element.async = false;\n element.type = \"module\";\n console.debug(\"Bokeh: injecting script tag for BokehJS library: \", url);\n element.textContent = `\n import ${name} from \"${url}\"\n window.${name} = ${name}\n window._bokeh_on_load()\n `\n document.head.appendChild(element);\n }\n if (!js_urls.length && !js_modules.length) {\n on_load()\n }\n };\n\n function inject_raw_css(css) {\n const element = document.createElement(\"style\");\n element.appendChild(document.createTextNode(css));\n document.body.appendChild(element);\n }\n\n var js_urls = [\"https://cdn.bokeh.org/bokeh/release/bokeh-3.4.2.min.js\", \"https://cdn.bokeh.org/bokeh/release/bokeh-gl-3.4.2.min.js\", \"https://cdn.bokeh.org/bokeh/release/bokeh-widgets-3.4.2.min.js\", \"https://cdn.bokeh.org/bokeh/release/bokeh-tables-3.4.2.min.js\", \"https://cdn.holoviz.org/panel/1.4.4/dist/panel.min.js\"];\n var js_modules = [];\n var js_exports = {};\n var css_urls = [];\n var inline_js = [ function(Bokeh) {\n Bokeh.set_log_level(\"info\");\n },\nfunction(Bokeh) {} // ensure no trailing comma for IE\n ];\n\n function run_inline_js() {\n if ((root.Bokeh !== undefined) || (force === true)) {\n for (var i = 0; i < inline_js.length; i++) {\n\ttry {\n inline_js[i].call(root, root.Bokeh);\n\t} catch(e) {\n\t if (!reloading) {\n\t throw e;\n\t }\n\t}\n }\n // Cache old bokeh versions\n if (Bokeh != undefined && !reloading) {\n\tvar NewBokeh = root.Bokeh;\n\tif 
(Bokeh.versions === undefined) {\n\t Bokeh.versions = new Map();\n\t}\n\tif (NewBokeh.version !== Bokeh.version) {\n\t Bokeh.versions.set(NewBokeh.version, NewBokeh)\n\t}\n\troot.Bokeh = Bokeh;\n }} else if (Date.now() < root._bokeh_timeout) {\n setTimeout(run_inline_js, 100);\n } else if (!root._bokeh_failed_load) {\n console.log(\"Bokeh: BokehJS failed to load within specified timeout.\");\n root._bokeh_failed_load = true;\n }\n root._bokeh_is_initializing = false\n }\n\n function load_or_wait() {\n // Implement a backoff loop that tries to ensure we do not load multiple\n // versions of Bokeh and its dependencies at the same time.\n // In recent versions we use the root._bokeh_is_initializing flag\n // to determine whether there is an ongoing attempt to initialize\n // bokeh, however for backward compatibility we also try to ensure\n // that we do not start loading a newer (Panel>=1.0 and Bokeh>3) version\n // before older versions are fully initialized.\n if (root._bokeh_is_initializing && Date.now() > root._bokeh_timeout) {\n root._bokeh_is_initializing = false;\n root._bokeh_onload_callbacks = undefined;\n console.log(\"Bokeh: BokehJS was loaded multiple times but one version failed to initialize.\");\n load_or_wait();\n } else if (root._bokeh_is_initializing || (typeof root._bokeh_is_initializing === \"undefined\" && root._bokeh_onload_callbacks !== undefined)) {\n setTimeout(load_or_wait, 100);\n } else {\n root._bokeh_is_initializing = true\n root._bokeh_onload_callbacks = []\n var bokeh_loaded = Bokeh != null && (Bokeh.version === py_version || (Bokeh.versions !== undefined && Bokeh.versions.has(py_version)));\n if (!reloading && !bokeh_loaded) {\n\troot.Bokeh = undefined;\n }\n load_libs(css_urls, js_urls, js_modules, js_exports, function() {\n\tconsole.debug(\"Bokeh: BokehJS plotting callback run at\", now());\n\trun_inline_js();\n });\n }\n }\n // Give older versions of the autoload script a head-start to ensure\n // they initialize before we start 
loading newer version.\n setTimeout(load_or_wait, 100)\n}(window));" }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/javascript": [ "\n", "if ((window.PyViz === undefined) || (window.PyViz instanceof HTMLElement)) {\n", " window.PyViz = {comms: {}, comm_status:{}, kernels:{}, receivers: {}, plot_index: []}\n", "}\n", "\n", "\n", " function JupyterCommManager() {\n", " }\n", "\n", " JupyterCommManager.prototype.register_target = function(plot_id, comm_id, msg_handler) {\n", " if (window.comm_manager || ((window.Jupyter !== undefined) && (Jupyter.notebook.kernel != null))) {\n", " var comm_manager = window.comm_manager || Jupyter.notebook.kernel.comm_manager;\n", " comm_manager.register_target(comm_id, function(comm) {\n", " comm.on_msg(msg_handler);\n", " });\n", " } else if ((plot_id in window.PyViz.kernels) && (window.PyViz.kernels[plot_id])) {\n", " window.PyViz.kernels[plot_id].registerCommTarget(comm_id, function(comm) {\n", " comm.onMsg = msg_handler;\n", " });\n", " } else if (typeof google != 'undefined' && google.colab.kernel != null) {\n", " google.colab.kernel.comms.registerTarget(comm_id, (comm) => {\n", " var messages = comm.messages[Symbol.asyncIterator]();\n", " function processIteratorResult(result) {\n", " var message = result.value;\n", " console.log(message)\n", " var content = {data: message.data, comm_id};\n", " var buffers = []\n", " for (var buffer of message.buffers || []) {\n", " buffers.push(new DataView(buffer))\n", " }\n", " var metadata = message.metadata || {};\n", " var msg = {content, buffers, metadata}\n", " msg_handler(msg);\n", " return messages.next().then(processIteratorResult);\n", " }\n", " return messages.next().then(processIteratorResult);\n", " })\n", " }\n", " }\n", "\n", " JupyterCommManager.prototype.get_client_comm = function(plot_id, comm_id, msg_handler) {\n", " if (comm_id in window.PyViz.comms) {\n", " return window.PyViz.comms[comm_id];\n", " } else if (window.comm_manager || 
((window.Jupyter !== undefined) && (Jupyter.notebook.kernel != null))) {\n", " var comm_manager = window.comm_manager || Jupyter.notebook.kernel.comm_manager;\n", " var comm = comm_manager.new_comm(comm_id, {}, {}, {}, comm_id);\n", " if (msg_handler) {\n", " comm.on_msg(msg_handler);\n", " }\n", " } else if ((plot_id in window.PyViz.kernels) && (window.PyViz.kernels[plot_id])) {\n", " var comm = window.PyViz.kernels[plot_id].connectToComm(comm_id);\n", " comm.open();\n", " if (msg_handler) {\n", " comm.onMsg = msg_handler;\n", " }\n", " } else if (typeof google != 'undefined' && google.colab.kernel != null) {\n", " var comm_promise = google.colab.kernel.comms.open(comm_id)\n", " comm_promise.then((comm) => {\n", " window.PyViz.comms[comm_id] = comm;\n", " if (msg_handler) {\n", " var messages = comm.messages[Symbol.asyncIterator]();\n", " function processIteratorResult(result) {\n", " var message = result.value;\n", " var content = {data: message.data};\n", " var metadata = message.metadata || {comm_id};\n", " var msg = {content, metadata}\n", " msg_handler(msg);\n", " return messages.next().then(processIteratorResult);\n", " }\n", " return messages.next().then(processIteratorResult);\n", " }\n", " }) \n", " var sendClosure = (data, metadata, buffers, disposeOnDone) => {\n", " return comm_promise.then((comm) => {\n", " comm.send(data, metadata, buffers, disposeOnDone);\n", " });\n", " };\n", " var comm = {\n", " send: sendClosure\n", " };\n", " }\n", " window.PyViz.comms[comm_id] = comm;\n", " return comm;\n", " }\n", " window.PyViz.comm_manager = new JupyterCommManager();\n", " \n", "\n", "\n", "var JS_MIME_TYPE = 'application/javascript';\n", "var HTML_MIME_TYPE = 'text/html';\n", "var EXEC_MIME_TYPE = 'application/vnd.holoviews_exec.v0+json';\n", "var CLASS_NAME = 'output';\n", "\n", "/**\n", " * Render data to the DOM node\n", " */\n", "function render(props, node) {\n", " var div = document.createElement(\"div\");\n", " var script = 
document.createElement(\"script\");\n", " node.appendChild(div);\n", " node.appendChild(script);\n", "}\n", "\n", "/**\n", " * Handle when a new output is added\n", " */\n", "function handle_add_output(event, handle) {\n", " var output_area = handle.output_area;\n", " var output = handle.output;\n", " if ((output.data == undefined) || (!output.data.hasOwnProperty(EXEC_MIME_TYPE))) {\n", " return\n", " }\n", " var id = output.metadata[EXEC_MIME_TYPE][\"id\"];\n", " var toinsert = output_area.element.find(\".\" + CLASS_NAME.split(' ')[0]);\n", " if (id !== undefined) {\n", " var nchildren = toinsert.length;\n", " var html_node = toinsert[nchildren-1].children[0];\n", " html_node.innerHTML = output.data[HTML_MIME_TYPE];\n", " var scripts = [];\n", " var nodelist = html_node.querySelectorAll(\"script\");\n", " for (var i in nodelist) {\n", " if (nodelist.hasOwnProperty(i)) {\n", " scripts.push(nodelist[i])\n", " }\n", " }\n", "\n", " scripts.forEach( function (oldScript) {\n", " var newScript = document.createElement(\"script\");\n", " var attrs = [];\n", " var nodemap = oldScript.attributes;\n", " for (var j in nodemap) {\n", " if (nodemap.hasOwnProperty(j)) {\n", " attrs.push(nodemap[j])\n", " }\n", " }\n", " attrs.forEach(function(attr) { newScript.setAttribute(attr.name, attr.value) });\n", " newScript.appendChild(document.createTextNode(oldScript.innerHTML));\n", " oldScript.parentNode.replaceChild(newScript, oldScript);\n", " });\n", " if (JS_MIME_TYPE in output.data) {\n", " toinsert[nchildren-1].children[1].textContent = output.data[JS_MIME_TYPE];\n", " }\n", " output_area._hv_plot_id = id;\n", " if ((window.Bokeh !== undefined) && (id in Bokeh.index)) {\n", " window.PyViz.plot_index[id] = Bokeh.index[id];\n", " } else {\n", " window.PyViz.plot_index[id] = null;\n", " }\n", " } else if (output.metadata[EXEC_MIME_TYPE][\"server_id\"] !== undefined) {\n", " var bk_div = document.createElement(\"div\");\n", " bk_div.innerHTML = output.data[HTML_MIME_TYPE];\n", " 
var script_attrs = bk_div.children[0].attributes;\n", " for (var i = 0; i < script_attrs.length; i++) {\n", " toinsert[toinsert.length - 1].childNodes[1].setAttribute(script_attrs[i].name, script_attrs[i].value);\n", " }\n", " // store reference to server id on output_area\n", " output_area._bokeh_server_id = output.metadata[EXEC_MIME_TYPE][\"server_id\"];\n", " }\n", "}\n", "\n", "/**\n", " * Handle when an output is cleared or removed\n", " */\n", "function handle_clear_output(event, handle) {\n", " var id = handle.cell.output_area._hv_plot_id;\n", " var server_id = handle.cell.output_area._bokeh_server_id;\n", " if (((id === undefined) || !(id in PyViz.plot_index)) && (server_id !== undefined)) { return; }\n", " var comm = window.PyViz.comm_manager.get_client_comm(\"hv-extension-comm\", \"hv-extension-comm\", function () {});\n", " if (server_id !== null) {\n", " comm.send({event_type: 'server_delete', 'id': server_id});\n", " return;\n", " } else if (comm !== null) {\n", " comm.send({event_type: 'delete', 'id': id});\n", " }\n", " delete PyViz.plot_index[id];\n", " if ((window.Bokeh !== undefined) & (id in window.Bokeh.index)) {\n", " var doc = window.Bokeh.index[id].model.document\n", " doc.clear();\n", " const i = window.Bokeh.documents.indexOf(doc);\n", " if (i > -1) {\n", " window.Bokeh.documents.splice(i, 1);\n", " }\n", " }\n", "}\n", "\n", "/**\n", " * Handle kernel restart event\n", " */\n", "function handle_kernel_cleanup(event, handle) {\n", " delete PyViz.comms[\"hv-extension-comm\"];\n", " window.PyViz.plot_index = {}\n", "}\n", "\n", "/**\n", " * Handle update_display_data messages\n", " */\n", "function handle_update_output(event, handle) {\n", " handle_clear_output(event, {cell: {output_area: handle.output_area}})\n", " handle_add_output(event, handle)\n", "}\n", "\n", "function register_renderer(events, OutputArea) {\n", " function append_mime(data, metadata, element) {\n", " // create a DOM node to render to\n", " var toinsert = 
this.create_output_subarea(\n", " metadata,\n", " CLASS_NAME,\n", " EXEC_MIME_TYPE\n", " );\n", " this.keyboard_manager.register_events(toinsert);\n", " // Render to node\n", " var props = {data: data, metadata: metadata[EXEC_MIME_TYPE]};\n", " render(props, toinsert[0]);\n", " element.append(toinsert);\n", " return toinsert\n", " }\n", "\n", " events.on('output_added.OutputArea', handle_add_output);\n", " events.on('output_updated.OutputArea', handle_update_output);\n", " events.on('clear_output.CodeCell', handle_clear_output);\n", " events.on('delete.Cell', handle_clear_output);\n", " events.on('kernel_ready.Kernel', handle_kernel_cleanup);\n", "\n", " OutputArea.prototype.register_mime_type(EXEC_MIME_TYPE, append_mime, {\n", " safe: true,\n", " index: 0\n", " });\n", "}\n", "\n", "if (window.Jupyter !== undefined) {\n", " try {\n", " var events = require('base/js/events');\n", " var OutputArea = require('notebook/js/outputarea').OutputArea;\n", " if (OutputArea.prototype.mime_types().indexOf(EXEC_MIME_TYPE) == -1) {\n", " register_renderer(events, OutputArea);\n", " }\n", " } catch(err) {\n", " }\n", "}\n" ], "application/vnd.holoviews_load.v0+json": "\nif ((window.PyViz === undefined) || (window.PyViz instanceof HTMLElement)) {\n window.PyViz = {comms: {}, comm_status:{}, kernels:{}, receivers: {}, plot_index: []}\n}\n\n\n function JupyterCommManager() {\n }\n\n JupyterCommManager.prototype.register_target = function(plot_id, comm_id, msg_handler) {\n if (window.comm_manager || ((window.Jupyter !== undefined) && (Jupyter.notebook.kernel != null))) {\n var comm_manager = window.comm_manager || Jupyter.notebook.kernel.comm_manager;\n comm_manager.register_target(comm_id, function(comm) {\n comm.on_msg(msg_handler);\n });\n } else if ((plot_id in window.PyViz.kernels) && (window.PyViz.kernels[plot_id])) {\n window.PyViz.kernels[plot_id].registerCommTarget(comm_id, function(comm) {\n comm.onMsg = msg_handler;\n });\n } else if (typeof google != 'undefined' && 
google.colab.kernel != null) {\n google.colab.kernel.comms.registerTarget(comm_id, (comm) => {\n var messages = comm.messages[Symbol.asyncIterator]();\n function processIteratorResult(result) {\n var message = result.value;\n console.log(message)\n var content = {data: message.data, comm_id};\n var buffers = []\n for (var buffer of message.buffers || []) {\n buffers.push(new DataView(buffer))\n }\n var metadata = message.metadata || {};\n var msg = {content, buffers, metadata}\n msg_handler(msg);\n return messages.next().then(processIteratorResult);\n }\n return messages.next().then(processIteratorResult);\n })\n }\n }\n\n JupyterCommManager.prototype.get_client_comm = function(plot_id, comm_id, msg_handler) {\n if (comm_id in window.PyViz.comms) {\n return window.PyViz.comms[comm_id];\n } else if (window.comm_manager || ((window.Jupyter !== undefined) && (Jupyter.notebook.kernel != null))) {\n var comm_manager = window.comm_manager || Jupyter.notebook.kernel.comm_manager;\n var comm = comm_manager.new_comm(comm_id, {}, {}, {}, comm_id);\n if (msg_handler) {\n comm.on_msg(msg_handler);\n }\n } else if ((plot_id in window.PyViz.kernels) && (window.PyViz.kernels[plot_id])) {\n var comm = window.PyViz.kernels[plot_id].connectToComm(comm_id);\n comm.open();\n if (msg_handler) {\n comm.onMsg = msg_handler;\n }\n } else if (typeof google != 'undefined' && google.colab.kernel != null) {\n var comm_promise = google.colab.kernel.comms.open(comm_id)\n comm_promise.then((comm) => {\n window.PyViz.comms[comm_id] = comm;\n if (msg_handler) {\n var messages = comm.messages[Symbol.asyncIterator]();\n function processIteratorResult(result) {\n var message = result.value;\n var content = {data: message.data};\n var metadata = message.metadata || {comm_id};\n var msg = {content, metadata}\n msg_handler(msg);\n return messages.next().then(processIteratorResult);\n }\n return messages.next().then(processIteratorResult);\n }\n }) \n var sendClosure = (data, metadata, buffers, 
disposeOnDone) => {\n return comm_promise.then((comm) => {\n comm.send(data, metadata, buffers, disposeOnDone);\n });\n };\n var comm = {\n send: sendClosure\n };\n }\n window.PyViz.comms[comm_id] = comm;\n return comm;\n }\n window.PyViz.comm_manager = new JupyterCommManager();\n \n\n\nvar JS_MIME_TYPE = 'application/javascript';\nvar HTML_MIME_TYPE = 'text/html';\nvar EXEC_MIME_TYPE = 'application/vnd.holoviews_exec.v0+json';\nvar CLASS_NAME = 'output';\n\n/**\n * Render data to the DOM node\n */\nfunction render(props, node) {\n var div = document.createElement(\"div\");\n var script = document.createElement(\"script\");\n node.appendChild(div);\n node.appendChild(script);\n}\n\n/**\n * Handle when a new output is added\n */\nfunction handle_add_output(event, handle) {\n var output_area = handle.output_area;\n var output = handle.output;\n if ((output.data == undefined) || (!output.data.hasOwnProperty(EXEC_MIME_TYPE))) {\n return\n }\n var id = output.metadata[EXEC_MIME_TYPE][\"id\"];\n var toinsert = output_area.element.find(\".\" + CLASS_NAME.split(' ')[0]);\n if (id !== undefined) {\n var nchildren = toinsert.length;\n var html_node = toinsert[nchildren-1].children[0];\n html_node.innerHTML = output.data[HTML_MIME_TYPE];\n var scripts = [];\n var nodelist = html_node.querySelectorAll(\"script\");\n for (var i in nodelist) {\n if (nodelist.hasOwnProperty(i)) {\n scripts.push(nodelist[i])\n }\n }\n\n scripts.forEach( function (oldScript) {\n var newScript = document.createElement(\"script\");\n var attrs = [];\n var nodemap = oldScript.attributes;\n for (var j in nodemap) {\n if (nodemap.hasOwnProperty(j)) {\n attrs.push(nodemap[j])\n }\n }\n attrs.forEach(function(attr) { newScript.setAttribute(attr.name, attr.value) });\n newScript.appendChild(document.createTextNode(oldScript.innerHTML));\n oldScript.parentNode.replaceChild(newScript, oldScript);\n });\n if (JS_MIME_TYPE in output.data) {\n toinsert[nchildren-1].children[1].textContent = 
output.data[JS_MIME_TYPE];\n }\n output_area._hv_plot_id = id;\n if ((window.Bokeh !== undefined) && (id in Bokeh.index)) {\n window.PyViz.plot_index[id] = Bokeh.index[id];\n } else {\n window.PyViz.plot_index[id] = null;\n }\n } else if (output.metadata[EXEC_MIME_TYPE][\"server_id\"] !== undefined) {\n var bk_div = document.createElement(\"div\");\n bk_div.innerHTML = output.data[HTML_MIME_TYPE];\n var script_attrs = bk_div.children[0].attributes;\n for (var i = 0; i < script_attrs.length; i++) {\n toinsert[toinsert.length - 1].childNodes[1].setAttribute(script_attrs[i].name, script_attrs[i].value);\n }\n // store reference to server id on output_area\n output_area._bokeh_server_id = output.metadata[EXEC_MIME_TYPE][\"server_id\"];\n }\n}\n\n/**\n * Handle when an output is cleared or removed\n */\nfunction handle_clear_output(event, handle) {\n var id = handle.cell.output_area._hv_plot_id;\n var server_id = handle.cell.output_area._bokeh_server_id;\n if (((id === undefined) || !(id in PyViz.plot_index)) && (server_id !== undefined)) { return; }\n var comm = window.PyViz.comm_manager.get_client_comm(\"hv-extension-comm\", \"hv-extension-comm\", function () {});\n if (server_id !== null) {\n comm.send({event_type: 'server_delete', 'id': server_id});\n return;\n } else if (comm !== null) {\n comm.send({event_type: 'delete', 'id': id});\n }\n delete PyViz.plot_index[id];\n if ((window.Bokeh !== undefined) & (id in window.Bokeh.index)) {\n var doc = window.Bokeh.index[id].model.document\n doc.clear();\n const i = window.Bokeh.documents.indexOf(doc);\n if (i > -1) {\n window.Bokeh.documents.splice(i, 1);\n }\n }\n}\n\n/**\n * Handle kernel restart event\n */\nfunction handle_kernel_cleanup(event, handle) {\n delete PyViz.comms[\"hv-extension-comm\"];\n window.PyViz.plot_index = {}\n}\n\n/**\n * Handle update_display_data messages\n */\nfunction handle_update_output(event, handle) {\n handle_clear_output(event, {cell: {output_area: handle.output_area}})\n 
handle_add_output(event, handle)\n}\n\nfunction register_renderer(events, OutputArea) {\n function append_mime(data, metadata, element) {\n // create a DOM node to render to\n var toinsert = this.create_output_subarea(\n metadata,\n CLASS_NAME,\n EXEC_MIME_TYPE\n );\n this.keyboard_manager.register_events(toinsert);\n // Render to node\n var props = {data: data, metadata: metadata[EXEC_MIME_TYPE]};\n render(props, toinsert[0]);\n element.append(toinsert);\n return toinsert\n }\n\n events.on('output_added.OutputArea', handle_add_output);\n events.on('output_updated.OutputArea', handle_update_output);\n events.on('clear_output.CodeCell', handle_clear_output);\n events.on('delete.Cell', handle_clear_output);\n events.on('kernel_ready.Kernel', handle_kernel_cleanup);\n\n OutputArea.prototype.register_mime_type(EXEC_MIME_TYPE, append_mime, {\n safe: true,\n index: 0\n });\n}\n\nif (window.Jupyter !== undefined) {\n try {\n var events = require('base/js/events');\n var OutputArea = require('notebook/js/outputarea').OutputArea;\n if (OutputArea.prototype.mime_types().indexOf(EXEC_MIME_TYPE) == -1) {\n register_renderer(events, OutputArea);\n }\n } catch(err) {\n }\n}\n" }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.holoviews_exec.v0+json": "", "text/html": [ "
\n", "
\n", "
\n", "" ] }, "metadata": { "application/vnd.holoviews_exec.v0+json": { "id": "p1002" } }, "output_type": "display_data" }, { "data": {}, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.holoviews_exec.v0+json": "", "text/html": [ "
\n", "
\n", "
\n", "" ], "text/plain": [ ":Bars [都道府県] (大学の数)" ] }, "execution_count": 13, "metadata": { "application/vnd.holoviews_exec.v0+json": { "id": "p1004" } }, "output_type": "execute_result" } ], "source": [ "x_label = \"都道府県\"\n", "y_label = \"大学の数\"\n", "(df3\n", ".group_by(pl.col(\"state_jp\").alias(x_label))\n", ".agg(\n", " pl.len().alias(y_label)\n", ")\n", ".sort(y_label, descending=True)\n", ".plot.bar(\n", " x=x_label, y=y_label, \n", " rot=90, frame_width=900\n", ")\n", ")" ] }, { "cell_type": "markdown", "id": "5024c4ca-ace1-40d5-ab02-2177c8841209", "metadata": {}, "source": [ "### Stacked bar chart" ] }, { "cell_type": "markdown", "id": "2c5174f3-1d44-4351-8697-bca0a0c696d0", "metadata": {}, "source": [ "The following program groups the data not only by prefecture, as the previous program does, but also by university type (national, public, private), and visualizes the result as a stacked bar chart. The main differences from the program above are as follows.\n", "\n", "1. The `type_jp` column is added to `group_by()`, so the data is grouped by prefecture and university type.\n", "2. The rows are sorted in descending order by the sum of the per-type counts within each prefecture, so the prefectures appear in order of their total number of universities. Here `over()` performs the grouping inside the expression. See [How to use over()](over).\n", "3. In the arguments to `bar()`, `by=\"type_jp\"` draws each university type in a different color, which makes the distribution of types within each prefecture easy to see, and `stacked=True` stacks the bars for the types on top of one another." ] }, { "cell_type": "code", "execution_count": 14, "id": "6e1f1ab8-2cce-45f1-906e-cf7d654e40f7", "metadata": {}, "outputs": [ { "data": {}, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.holoviews_exec.v0+json": "", "text/html": [ "
\n", "
\n", "
\n", "" ], "text/plain": [ ":Bars [都道府県,type_jp] (大学の数)" ] }, "execution_count": 14, "metadata": { "application/vnd.holoviews_exec.v0+json": { "id": "p1066" } }, "output_type": "execute_result" } ], "source": [ "(df3\n", ".group_by(\n", " pl.col(\"state_jp\").alias(x_label), \n", " \"type_jp\"\n", ")\n", ".agg(\n", " pl.len().alias(y_label)\n", ")\n", ".sort(pl.col(y_label).sum().over(x_label), descending=True)\n", ".plot.bar(\n", " x=x_label, y=y_label, by=\"type_jp\", \n", " stacked=True, rot=90, \n", " frame_width=900, frame_height=450\n", ")\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "id": "0f4fe39e-222a-4cd7-98af-27409de8ed32", "metadata": {}, "source": [ "### Scatter plot on a map\n", "\n", "The following program creates a point plot from longitude and latitude, colors the points by university type, and displays the result as an interactive map.\n", "\n", "* `by=\"type_jp\"`: Colors the points by university type (`type_jp`), so each type of university is shown in a different color.\n", "* `hover_cols=['name_jp']`: Specifies the university name (`name_jp`) as the information shown when the mouse hovers over a point.\n", "* `tiles=True`: Enables map tiles, so a map is drawn in the background and the geographic position of each point is easy to see.\n", "* `geo=True`: Enables geographic plotting, so the points are placed on the map according to their longitude and latitude." ] }, { "cell_type": "code", "execution_count": 17, "id": "35ebe82a-782f-4f85-b17f-a06af601f4ce", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "C:\\micromamba\\envs\\jupyter4\\Lib\\site-packages\\holoviews\\core\\data\\pandas.py:239: FutureWarning: The default of observed=False is deprecated and will be changed to True in a future version of pandas. Pass observed=False to retain current behavior or observed=True to adopt the future default and silence this warning.\n", " dataset.data.groupby(group_by, sort=False)]\n" ] }, { "data": {}, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.holoviews_exec.v0+json": "", "text/html": [ "
\n", "
\n", "
\n", "" ], "text/plain": [ ":Overlay\n", " .WMTS.I :WMTS [Longitude,Latitude]\n", " .NdOverlay.I :NdOverlay [type_jp]\n", " :Points [longitude,latitude] (name_jp)" ] }, "execution_count": 17, "metadata": { "application/vnd.holoviews_exec.v0+json": { "id": "p1450" } }, "output_type": "execute_result" } ], "source": [ "df3.plot.points(\n", " 'longitude', 'latitude', \n", " by=\"type_jp\", \n", " hover_cols=['name_jp'],\n", " tiles=True, \n", " geo=True, \n", " frame_width=600, \n", ")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.2" } }, "nbformat": 4, "nbformat_minor": 5 }