气象学家公众号 posted on 2024-3-6 20:44:27

Python-matplotlib academic scatter plots: computing and plotting EE statistics

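Introduction
Two earlier posts, Python-matplotlib 学术散点图完善 (refining the academic scatter plot) and Python-matplotlib 学术型散点图绘制 (drawing an academic-style scatter plot), already gave a fairly complete tutorial on academic scatter plots. While preparing the 论文图表再现计划 (paper figure reproduction project) over the past few days, and from messages left by readers, I noticed that those correlation scatter plots never showed how many points fall in each Expected Error (EE) range. In other words, the information in the lower-left corner of the figure below was not drawn.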



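Adding this information mainly involves one topic: the simple data processing needed to compute the statistics. The steps are explained in detail below.
Data processing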
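To compute the statistics, first build the three fitted lines:
# Import NumPy and the linear regression function
import numpy as np
from scipy.stats import linregress
x2 = np.linspace(-10,10)
# 1:1 reference line (defined the same way in the full code below)
y2 = x2
# Data for the upper line
up_y2 = 1.15*x2 + 0.05
# Data for the lower line
down_y2 = 0.85*x2 - 0.05
# Fit each line
line_01 = linregress(x2,y2)
line_top = linregress(x2,up_y2)
line_bopptom = linregress(x2,down_y2)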

For how up_y2 and down_y2 are chosen, see the earlier post Python-matplotlib 学术散点图完善. The result returned by linregress() looks like this:


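slope is the slope, intercept is the intercept, rvalue is the correlation coefficient, pvalue is the p-value, and stderr is the standard error of the estimated slope.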
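As a minimal sketch of reading these fields, the snippet below fits the upper line from the article and pulls out its slope and intercept. The names `slope` and `intercept` on the left-hand side are introduced here for illustration only.

```python
import numpy as np
from scipy.stats import linregress

# Upper line from the article: y = 1.15x + 0.05
x2 = np.linspace(-10, 10)
up_y2 = 1.15 * x2 + 0.05

line_top = linregress(x2, up_y2)

# Fields can be read by attribute name ...
slope = line_top.slope
intercept = line_top.intercept

# ... or by position, since the result also behaves like a tuple
slope_by_index = line_top[0]
intercept_by_index = line_top[1]

print(slope, intercept, line_top.rvalue)
```

Because the input lies exactly on a straight line, the fit simply recovers the coefficients that were used to build it.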

The original dataset looks like this (excerpt):


Next, for each x value we compute the corresponding y value on each of the three lines:
data_select = data_select.copy()

data_select['true_y'] = data_select.true_data.values
data_select['top_y'] = data_select['true_data'].apply(lambda x : line_top.slope*x + line_top.intercept)
data_select['bottom_y'] = data_select['true_data'].apply(lambda x : line_bopptom.slope*x + line_bopptom.intercept)
data_select.head()
This uses apply(), a function that comes up constantly in everyday pandas data processing and is worth knowing well. The resulting data looks like this:



Each point's Expected Error category is then decided by comparing model01_estimated against the constructed top_y and bottom_y.
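The comparison can be sketched on a tiny table. The four rows below are made-up values for illustration (only the column names `true_data` and `model01_estimated` come from the article), the `ee_class` column is a hypothetical helper, and the envelope columns are computed with vectorized arithmetic instead of apply(), which should give the same numbers.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the article
data_select = pd.DataFrame({
    "true_data": [0.2, 0.5, 1.0, 1.5],
    "model01_estimated": [0.40, 0.50, 0.70, 1.60],
})

# Envelope lines from the article: y = 1.15x + 0.05 and y = 0.85x - 0.05
data_select["top_y"] = 1.15 * data_select["true_data"] + 0.05
data_select["bottom_y"] = 0.85 * data_select["true_data"] - 0.05

# Label each row by where the estimate sits relative to the envelope
est = data_select["model01_estimated"]
data_select["ee_class"] = "within"
data_select.loc[est > data_select["top_y"], "ee_class"] = "above"
data_select.loc[est < data_select["bottom_y"], "ee_class"] = "below"

print(data_select)
```

Keeping the label in a column makes it easy to inspect individual rows before counting anything.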

Counting
With the data in place, filter it with pandas boolean selection and count the rows in each subset:
# Build the selection conditions
top_condi = (data_select['model01_estimated'] > data_select['top_y'])
bottom_condi = (data_select['model01_estimated'] < data_select['bottom_y'])
bottom_top = ((data_select['model01_estimated'] < data_select['top_y']) & \
            (data_select['model01_estimated'] > data_select['bottom_y']))
# Count all rows, then the rows matching each condition
all_data = len(data_select)
top_counts = len(data_select[top_condi])
bottom_counts = len(data_select[bottom_condi])
bottom_top_counts = len(data_select[bottom_top])
This gives the number of points in each Expected Error range. For this example the results are:
all_data = 4348
top_counts = 1681
bottom_counts = 404
bottom_top_counts = 2263
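The three counts above add up to the total (1681 + 404 + 2263 = 4348). With strict inequalities on both sides, a point lying exactly on one of the lines would be counted nowhere, so a variant that treats "within" as everything not above and not below may be safer. The helper `count_ee` and its inputs below are hypothetical, a sketch of that idea and not the article's code.

```python
import numpy as np

def count_ee(estimated, top, bottom):
    """Count points above, below and within an EE envelope.

    'within' is defined as the complement of above/below, so points that
    sit exactly on a boundary line are counted as within.
    """
    estimated = np.asarray(estimated, dtype=float)
    top = np.asarray(top, dtype=float)
    bottom = np.asarray(bottom, dtype=float)

    above = estimated > top
    below = estimated < bottom
    within = ~(above | below)

    return {
        "all": int(estimated.size),
        "above": int(above.sum()),
        "below": int(below.sum()),
        "within": int(within.sum()),
    }

# Made-up values; the last estimate lies exactly on the upper line
counts = count_ee(
    estimated=[0.40, 0.50, 0.70, 1.20],
    top=[0.28, 0.625, 1.2, 1.2],
    bottom=[0.12, 0.375, 0.8, 0.8],
)
print(counts)
```

Summing a boolean mask counts its True entries, which avoids building the filtered DataFrame just to take its length.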
Data visualization
The plotting itself is fairly simple and mostly the same as in the earlier posts. The only difference is the newly added text, shown here:
text_font = {'size':'15','weight':'medium','color':'black'}
ax.text(.7,.25,s='Within EE = ' + '{:.0%}'.format(bottom_top_counts/all_data),transform = ax.transAxes,
      ha='left', va='center',fontdict=text_font)
ax.text(.7,.18,s='Above EE = ' + '{:.0%}'.format(top_counts/all_data),transform = ax.transAxes,
      ha='left', va='center',fontdict=text_font)
ax.text(.7,.11,s='Below EE = ' + '{:.0%}'.format(bottom_counts/all_data),transform = ax.transAxes,
      ha='left', va='center',fontdict=text_font)
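To see what the '{:.0%}' format produces before it goes onto the figure, the label strings can be built on their own. This sketch reuses the counts reported in the article; the `counts` dictionary and the `labels` list are names introduced here for illustration.

```python
# Counts reported in the article for this example
all_data = 4348
counts = {"Within EE": 2263, "Above EE": 1681, "Below EE": 404}

# '{:.0%}' multiplies the fraction by 100 and rounds to a whole percent
labels = ["{} = {:.0%}".format(name, n / all_data) for name, n in counts.items()]

for label in labels:
    print(label)
# Within EE = 52%
# Above EE = 39%
# Below EE = 9%
```

Since each share is rounded separately, the displayed percentages will not always sum to exactly 100%, although they do here.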

The full code is as follows:
import pandas as pd
import numpy as np
from scipy import optimize
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error,r2_score
from matplotlib.pyplot import MultipleLocator
# Set the font globally
plt.rcParams['font.family'] = ['Arial']

N = len(test_data['true_data'])
x = test_data['true_data'].values.ravel() # true values
y = test_data['model01_estimated'].values.ravel() # estimated values
C=round(r2_score(x,y),4)
rmse = round(np.sqrt(mean_squared_error(x,y)),3)
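# Build the 1:1 line and fit the regression line
x2 = np.linspace(-10,10)
y2=x2
def f_1(x, A, B):
    return A*x + B
# curve_fit returns (popt, pcov); the fitted parameters are in popt
A1, B1 = optimize.curve_fit(f_1, x, y)[0]
y3 = A1*x + B1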
# Start plotting
fig, ax = plt.subplots(figsize=(7,5),dpi=200)
ax.scatter(x, y,edgecolor=None, c='k', s=12,marker='s')
ax.plot(x2,y2,color='k',linewidth=1.5,linestyle='-',zorder=2)
ax.plot(x,y3,color='r',linewidth=2,linestyle='-',zorder=2)

# Add the upper and lower lines
ax.plot(x2,up_y2,color='k',lw=1.,ls='--',zorder=2,alpha=.8)
ax.plot(x2,down_y2,color='k',lw=1.,ls='--',zorder=2,alpha=.8)
fontdict1 = {"size":17,"color":"k",}
ax.set_xlabel("True Values", fontdict=fontdict1)
ax.set_ylabel("Estimated Values ", fontdict=fontdict1)
ax.grid(which='major',axis='y',ls='--',c='k',alpha=.7)
ax.set_axisbelow(True)
ax.set_xlim((0, 2.0))
ax.set_ylim((0, 2.0))
ax.set_xticks(np.arange(0, 2.2, step=0.2))
ax.set_yticks(np.arange(0, 2.2, step=0.2))

# Set the tick spacing
# x_major_locator=MultipleLocator(.5)
# # Set the x-axis tick spacing to .5 and store it in a variable
# y_major_locator=MultipleLocator(.5)
# ax.xaxis.set_major_locator(x_major_locator)
# # Set the x-axis major ticks to multiples of .5
# ax.yaxis.set_major_locator(y_major_locator)

for spine in ['top','left','right']:
    ax.spines[spine].set_visible(False)
ax.spines['bottom'].set_color('k')
ax.tick_params(bottom=True,direction='out',labelsize=14,width=1.5,length=4,
            left=False)
#ax.tick_params()
# Add the title
titlefontdict = {"size":20,"color":"k",}
ax.set_title('Scatter plot of True data and Model Estimated',titlefontdict,pad=20)
#ax.set_title()
fontdict = {"size":16,"color":"k",'weight':'bold'}
ax.text(0.1,1.8,r'$R^2=$'+str(round(C,3)),fontdict=fontdict)
ax.text(0.1,1.6,"RMSE="+str(rmse),fontdict=fontdict)
ax.text(0.1,1.4,r'$y=$'+str(round(A1,3))+'$x$'+" + "+str(round(B1,3)),fontdict=fontdict)
ax.text(0.1,1.2,r'$N=$'+ str(N),fontdict=fontdict)

# Add the EE statistics for the upper and lower lines
text_font = {'size':'15','weight':'medium','color':'black'}
label_font = {'size':'22','weight':'medium','color':'black'}
ax.text(.9,.9,"(a)",transform = ax.transAxes,fontdict=text_font,zorder=4)

ax.text(.7,.25,s='Within EE = ' + '{:.0%}'.format(bottom_top_counts/all_data),transform = ax.transAxes,
      ha='left', va='center',fontdict=text_font)
ax.text(.7,.18,s='Above EE = ' + '{:.0%}'.format(top_counts/all_data),transform = ax.transAxes,
      ha='left', va='center',fontdict=text_font)
ax.text(.7,.11,s='Below EE = ' + '{:.0%}'.format(bottom_counts/all_data),transform = ax.transAxes,
      ha='left', va='center',fontdict=text_font)

ax.text(.8,.056,'\nVisualization by DataCharm',transform = ax.transAxes,
      ha='center', va='center',fontsize = 10,color='black')
# plt.savefig(r'E:\Data_resourses\DataCharm 公众号\Python\学术图表绘制\scatter_EE.png',
#             width=7,height=4,dpi=900,bbox_inches='tight')
plt.show()
The result is shown below:



Source: WeChat official account 气象学家

