Python: Matplotlib and Seaborn Practice Exercises (26)

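In workspaces like this one, you can practice the visualization techniques covered in the course. In this workspace, you will practice creating univariate plots for categorical data.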

# prerequisite package imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sb

%matplotlib inline

# solution script imports
from solutions_univ import bar_chart_solution_1, bar_chart_solution_2

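The dataset you will work with in this workspace consists of attributes of the creatures from the video game series Pokémon. The data was assembled from the information database in this GitHub repository.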

pokemon = pd.read_csv('./data/pokemon.csv')
pokemon.head()
id species generation_id height weight base_experience type_1 type_2 hp attack defense speed special-attack special-defense
0 1 bulbasaur 1 0.7 6.9 64 grass poison 45 49 49 45 65 65
1 2 ivysaur 1 1.0 13.0 142 grass poison 60 62 63 60 80 80
2 3 venusaur 1 2.0 100.0 236 grass poison 80 82 83 80 100 100
3 4 charmander 1 0.6 8.5 62 fire NaN 39 52 43 65 60 50
4 5 charmeleon 1 1.1 19.0 142 fire NaN 58 64 58 80 80 65

Task 1: The series has introduced many species of Pokémon. How many were introduced in each generation? Plot a frequency bar chart of the 'generation_id' column.

# YOUR CODE HERE
base_color = sb.color_palette()[0]
sb.countplot(data = pokemon, x = 'generation_id', color = base_color)
<matplotlib.axes._subplots.AxesSubplot at 0x7f51fb5935f8>


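Once you have created your chart, run the cell below to check the output from our solution. Your visualization does not need to match ours exactly, but it should support the same conclusions.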

bar_chart_solution_1()
I used seaborn's countplot function to generate this chart. I also added an additional argument so that each bar has the same color.
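
The bar heights that countplot draws can also be checked numerically. Here is a minimal sketch, assuming a small made-up Series in place of the real 'generation_id' column (the values are illustrative and not taken from the dataset):

```python
import pandas as pd

# Hypothetical stand-in for pokemon['generation_id']; values are made up.
generation_id = pd.Series([1, 1, 1, 2, 2, 3, 3, 3, 3])

# value_counts() tallies each category; sort_index() lists the generations
# in ascending order, the same order in which they appear on the axis.
gen_counts = generation_id.value_counts().sort_index()
print(gen_counts)
```

Running the same two calls on pokemon['generation_id'] should give the numbers behind each bar, which is a quick way to confirm the chart reads as expected.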


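Task 2: Each Pokémon species has one or two 'types' that play a part in its offensive and defensive capabilities. How frequent is each type? The code below creates a new DataFrame that puts all of the type counts in a single column.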

pkmn_types = pokemon.melt(id_vars = ['id','species'], 
                          value_vars = ['type_1', 'type_2'], 
                          var_name = 'type_level', value_name = 'type').dropna()
pkmn_types.head()
id species type_level type
0 1 bulbasaur type_1 grass
1 2 ivysaur type_1 grass
2 3 venusaur type_1 grass
3 4 charmander type_1 fire
4 5 charmeleon type_1 fire
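
To see what the melt call does, it may help to run it on a tiny table. The sketch below assumes a hypothetical three-row frame named toy with the same column names as the real data; it is only an illustration of the reshaping step:

```python
import pandas as pd

# Hypothetical three-row stand-in for the pokemon table.
toy = pd.DataFrame({
    'id': [1, 4, 6],
    'species': ['bulbasaur', 'charmander', 'charizard'],
    'type_1': ['grass', 'fire', 'fire'],
    'type_2': ['poison', None, 'flying'],
})

# melt() stacks type_1 and type_2 into a single 'type' column;
# dropna() then discards the rows for missing second types.
toy_types = toy.melt(id_vars=['id', 'species'],
                     value_vars=['type_1', 'type_2'],
                     var_name='type_level', value_name='type').dropna()

print(toy_types)
print(len(toy), 'species ->', len(toy_types), 'type rows')
```

Each species contributes one row per type it has, so the melted frame has more rows than there are species. That difference matters when choosing a denominator for proportions.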

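Your task is to use this DataFrame to create a relative frequency plot of the proportion of Pokémon with each type, sorted from most frequent to least. Hint: the bars should sum to more than 100%, since many Pokémon have two types. Keep this in mind when choosing the denominator for the relative frequencies.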
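
The effect of the denominator can be checked on a small example. The sketch below assumes a hypothetical long-format table (one row per species and type pair) and compares two candidate denominators; the names prop_of_rows and prop_of_species are made up for illustration:

```python
import pandas as pd

# Hypothetical long-format table: one row per (species, type) pair.
toy_types = pd.DataFrame({
    'species': ['bulbasaur', 'bulbasaur', 'charmander', 'charizard', 'charizard'],
    'type': ['grass', 'poison', 'fire', 'fire', 'flying'],
})

type_counts = toy_types['type'].value_counts()

# Denominator 1: number of rows (type slots). These proportions sum to 1.
prop_of_rows = type_counts / toy_types.shape[0]

# Denominator 2: number of distinct species. These proportions sum to more
# than 1 because a dual-type species is counted once for each of its types.
n_species = toy_types['species'].nunique()
prop_of_species = type_counts / n_species

print(prop_of_rows.sum())
print(prop_of_species.sum())
```

Only the second denominator answers the question "what fraction of Pokémon have this type?", which is why the hint expects the bars to total more than 100%.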

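# YOUR CODE HERE: relative frequency
# Use the number of distinct species as the denominator, not the number of
# rows in pkmn_types, so that the bars can sum to more than 100%.
n_points = pkmn_types['species'].unique().shape[0]
max_count = pkmn_types['type'].value_counts().max()
max_prop = max_count / n_points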

# generate tick mark locations and names
tick_props = np.arange(0, max_prop, 0.05)
tick_names = ['{:0.2f}'.format(v) for v in tick_props]

# create the plot
base_color = sb.color_palette()[0]
sb.countplot(data = pkmn_types, y = 'type',  order = pkmn_types['type'].value_counts(sort = True).index, color = base_color)
plt.xticks(tick_props * n_points, tick_names)
plt.xlabel('proportion')
Text(0.5,0,'proportion')
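
The plot above still draws raw counts; only the tick labels are changed to proportions. A minimal sketch of that arithmetic, assuming made-up values for n_points and max_count (they are not the real dataset figures):

```python
import numpy as np

# Hypothetical numbers for illustration only.
n_points = 200   # denominator used for the proportions
max_count = 34   # count of the most frequent category

max_prop = max_count / n_points

# Proportions to label, in steps of 0.05 up to (not including) max_prop.
tick_props = np.arange(0, max_prop, 0.05)

# Where those proportions fall on the count axis that countplot draws.
tick_locs = tick_props * n_points

# Text shown at each tick position.
tick_names = ['{:0.2f}'.format(v) for v in tick_props]

for loc, name in zip(tick_locs, tick_names):
    print(loc, '->', name)
```

Passing tick_locs and tick_names to plt.xticks places a proportion label at the count that corresponds to it, so the bars keep their lengths while the axis reads as a relative frequency.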


bar_chart_solution_2()
I created a horizontal bar chart since there are a lot of Pokemon types. The unique() method was used to get the number of different Pokemon species. I also added an xlabel call to make sure it was clear the bar length represents a relative frequency.

