{DAY 28} Matplotlib Plotting 2

Introduction

This post builds on what we covered yesterday: we adjust the plotting parameters and draw more kinds of charts.

The post covers:
3. Line charts, scatter plots, and histograms
4. Bar charts
5. Plotting on subplots

Line charts, scatter plots, and histograms

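Today we revisit the charts drawn yesterday and change a few parameters to adjust how they are displayed.

figsize sets the figure size, style sets the line style, color sets the line colors, and legend turns the legend on or off (pandas takes the legend labels from the column names).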

group_b.plot(y=["math score", "reading score"], figsize=(20, 5), style="--", color=["purple", "pink"], legend=True)
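To try these parameters without the original dataset, here is a minimal sketch. The `demo` DataFrame and its random scores are made up for illustration and only stand in for `group_b`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for group_b: 30 students with random scores.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "math score": rng.integers(40, 100, size=30),
    "reading score": rng.integers(40, 100, size=30),
})

# A single style string applies to every line;
# the colors are matched to the columns in order.
ax = demo.plot(
    y=["math score", "reading score"],
    figsize=(20, 5),
    style="--",
    color=["purple", "pink"],
    legend=True,
)
```

The call returns a Matplotlib Axes object, so further tweaks such as ax.set_title() can be applied to the same chart.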

Next, put the same two columns on a scatter plot to see how each student's scores are distributed.

plt.scatter(group_b.index ,group_b["math score"],color="purple")
plt.scatter(group_b.index, group_b["reading score"],color="pink")
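With two colors on one chart it helps to say which is which. The sketch below shows one way to do that, by giving each scatter call a label and then drawing a legend. The `demo` data is made up and only stands in for `group_b`.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for group_b.
rng = np.random.default_rng(1)
demo = pd.DataFrame({
    "math score": rng.integers(40, 100, size=30),
    "reading score": rng.integers(40, 100, size=30),
})

fig, ax = plt.subplots(figsize=(10, 5))
# label= is what ax.legend() picks up for each series.
ax.scatter(demo.index, demo["math score"], color="purple", label="math score")
ax.scatter(demo.index, demo["reading score"], color="pink", label="reading score")
ax.set_xlabel("student index")
ax.set_ylabel("score")
ax.legend()
```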

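To see how the scores are distributed by value rather than student by student, use a histogram.

alpha sets the transparency and color sets the fill color, which makes overlapping histograms easier to compare.

plt.title() sets the title.

plt.xlabel() and plt.ylabel() set the x-axis and y-axis labels.

plt.legend() draws the legend from the label given to each plot.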

plt.hist(group_b["math score"], color="g", alpha=0.3, label="math score")
plt.hist(group_b["reading score"], alpha=0.4, label="reading score")
plt.hist(group_b["writing score"], alpha=0.5, color="pink", label="writing score")
plt.title("score distribution")
plt.xlabel("score")
plt.ylabel("numbers")
plt.legend()
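By default each hist() call chooses its own bin edges from its own data, so overlaid histograms may not line up bar for bar. One option, sketched here with made-up scores, is to pass the same `bins` array to every call. The 0 to 100 range and the bin width of 5 are assumptions about the score scale.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up scores, clipped to an assumed 0-100 scale.
rng = np.random.default_rng(2)
math_scores = rng.normal(65, 15, 200).clip(0, 100)
reading_scores = rng.normal(70, 14, 200).clip(0, 100)

# 21 edges give 20 bins of width 5, shared by both histograms.
bins = np.linspace(0, 100, 21)

fig, ax = plt.subplots()
math_counts, edges, _ = ax.hist(math_scores, bins=bins, color="g", alpha=0.3, label="math score")
reading_counts, _, _ = ax.hist(reading_scores, bins=bins, alpha=0.4, label="reading score")
ax.set_title("score distribution")
ax.set_xlabel("score")
ax.set_ylabel("numbers")
ax.legend()
```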

Bar charts

To compare each group's average score in each subject, use a bar chart.

As before, start with .groupby() and then call .mean() to compute the averages.

race_ethnicity = df.groupby("race/ethnicity").mean(numeric_only=True)  # average only the numeric (score) columns
race_ethnicity
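To see what the grouping step produces, here is a small sketch on a made-up four-row table (`toy` and its values are hypothetical). It passes numeric_only=True because, depending on the pandas version, .mean() may raise an error when the frame also holds text columns such as gender.

```python
import pandas as pd

# Made-up data: two groups, one text column, two score columns.
toy = pd.DataFrame({
    "race/ethnicity": ["group A", "group A", "group B", "group B"],
    "gender": ["female", "male", "female", "male"],
    "math score": [60, 80, 70, 90],
    "reading score": [65, 75, 85, 95],
})

# One row per group, one column per numeric column; "gender" is left out.
means = toy.groupby("race/ethnicity").mean(numeric_only=True)
print(means)
```

The group names become the index of the result, which is why they later show up as the x-axis categories of the bar chart.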

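There are two ways to draw the chart:

Call .plot.bar() directly on the aggregated DataFrame.

Call .plot() and set the chart type and the other options inside the parentheses.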

race_ethnicity.plot(kind='bar',  # chart type
                    title='scores in different group',  # title
                    xlabel='group',  # x-axis label
                    ylabel='score',  # y-axis label
                    legend=True,  # show the legend
                    figsize=(10, 5))  # figure size
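The two approaches should produce the same chart. The sketch below draws both from a small made-up table of averages (`means` and its numbers are hypothetical) so the results can be compared.

```python
import pandas as pd

# Made-up group averages, shaped like the groupby result.
means = pd.DataFrame(
    {"math score": [60.0, 65.0, 70.0], "reading score": [62.0, 68.0, 72.0]},
    index=pd.Index(["group A", "group B", "group C"], name="race/ethnicity"),
)

# Way 1: the .plot.bar() accessor.
ax1 = means.plot.bar(title="scores in different group", figsize=(10, 5))

# Way 2: .plot() with kind="bar".
ax2 = means.plot(kind="bar", title="scores in different group", figsize=(10, 5))
```

Each call opens its own figure here because no ax= argument is given. Both end up with six bars: three groups times two subjects.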

Plotting on subplots

Now let's practice placing the charts drawn above on a single canvas, arranged as subplots.

First create four subplots:

fig = plt.figure(figsize=(20,10))
axe1 = fig.add_subplot(2, 2, 1) 
axe2 = fig.add_subplot(2, 2, 2) 
axe3 = fig.add_subplot(2, 2, 3) 
axe4 = fig.add_subplot(2, 2, 4)
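As an alternative to four add_subplot() calls, plt.subplots() can create the figure and the whole grid in one step. A minimal sketch, reusing the same variable names for convenience:

```python
import matplotlib.pyplot as plt

# A 2x2 grid; axes is a 2D NumPy array of Axes objects.
fig, axes = plt.subplots(2, 2, figsize=(20, 10))

# Flatten row by row: top-left, top-right, bottom-left, bottom-right.
axe1, axe2, axe3, axe4 = axes.flatten()
```

The order matches the position numbers 1 to 4 used by add_subplot(2, 2, n).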

Then draw the four charts onto them.

A subplot's title can be set in either of two ways:

ax.title.set_text(" ")
ax.set_title(" ")
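Both forms end up setting the same title text, which this small sketch checks on two empty subplots (the names `left` and `right` are only for illustration):

```python
import matplotlib.pyplot as plt

fig, (left, right) = plt.subplots(1, 2)

# Way 1: change the text of the existing title object.
left.title.set_text("portion of gender")

# Way 2: the set_title() method.
right.set_title("portion of groups")

print(left.get_title(), "|", right.get_title())
```

set_title() also accepts optional arguments such as loc and pad, so it is the more flexible of the two when the title needs positioning.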

# Subplot 1: share of each gender in the whole dataset
axe1.pie(numbers_of_gender, labels=type_of_gender,autopct="%0.2f%%")
axe1.title.set_text("portion of gender")

# Subplot 2: share of each group in the whole dataset
axe2.pie(amounts, labels=category,autopct="%0.2f%%")
axe2.title.set_text("portion of groups")

# Subplot 3: average score of each group in each subject
axe3.set_title("scores on different group")
race_ethnicity.plot.bar(ax=axe3)

# Subplot 4: distribution of scores across the whole dataset
axe4.hist(df["math score"], color="g", alpha=0.3)  # remember to adjust the transparency
axe4.hist(df["reading score"],alpha=0.4)
axe4.hist(df["writing score"],alpha=0.6,color="pink")
score_labels=["math score","reading score","writing score"] 
axe4.legend(labels=score_labels)
axe4.set_title("scores distribution of all")
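When several subplots share one canvas, titles and tick labels can run into neighboring plots. A possible finishing step is fig.tight_layout(), sketched here on a 2x2 grid with made-up data. The gender counts and scores below are placeholders, not values from the dataset.

```python
import matplotlib.pyplot as plt
import numpy as np

# Placeholder data for illustration only.
rng = np.random.default_rng(3)
scores = rng.normal(66, 15, 300).clip(0, 100)

fig, axes = plt.subplots(2, 2, figsize=(20, 10))

axes[0, 0].pie([52, 48], labels=["female", "male"], autopct="%0.2f%%")
axes[0, 0].set_title("portion of gender")

axes[1, 1].hist(scores, color="g", alpha=0.3)
axes[1, 1].set_title("scores distribution of all")

# Recompute the spacing so titles and labels do not overlap.
fig.tight_layout()
```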