Python scikit-learn (metrics): difference between r2_score and explained_variance_score?

I noticed that 'r2_score' and 'explained_variance_score' are both built-in sklearn.metrics methods for regression problems.

I was always under the impression that r2_score is the percent variance explained by the model. How is it different from ‘explained_variance_score’?

When would you choose one over the other?

Thanks!

 

OK, look at this example:

In [123]:
import numpy as np
from sklearn import metrics
#data
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
print(metrics.explained_variance_score(y_true, y_pred))
print(metrics.r2_score(y_true, y_pred))
0.957173447537
0.948608137045
In [124]:
#what explained_variance_score really is
1-np.cov(np.array(y_true)-np.array(y_pred))/np.cov(y_true)
Out[124]:
0.95717344753747324
In [125]:
#what r^2 really is
1-((np.array(y_true)-np.array(y_pred))**2).sum()/(4*np.array(y_true).std()**2)
Out[125]:
0.94860813704496794
In [126]:
#Notice that the mean residual is not 0
(np.array(y_true)-np.array(y_pred)).mean()
Out[126]:
-0.25
In [127]:
#if the predicted values are changed so that the mean residual IS 0:
y_pred=[2.5, 0.0, 2, 7]
(np.array(y_true)-np.array(y_pred)).mean()
Out[127]:
0.0
In [128]:
#The two scores now coincide
print(metrics.explained_variance_score(y_true, y_pred))
print(metrics.r2_score(y_true, y_pred))
0.982869379015
0.982869379015

 

So, when the mean residual is 0, the two scores are the same. Which one to choose depends on your needs, that is: is the mean residual supposed to be 0?

 

Most of the answers I found (including the one here) focus on what separates R2 from the Explained Variance Score, namely the mean residual (i.e. the Mean of Error).

However, an important question is left unanswered: why do I need to consider the Mean of Error at all?


Refresher:

R2 is the Coefficient of Determination, which measures the proportion of variation explained by the (least-squares) Linear Regression.

You can look at it from a different angle for the purpose of evaluating the predicted values of y like this:

Variance(y_actual) × R2 = Variance(y_predicted)

 

So intuitively, the closer R2 is to 1, the closer y_actual and y_predicted are to having the same variance (i.e. the same spread).
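The variance relation above can be checked numerically. The following is a minimal sketch, assuming a simple one-variable least-squares fit with an intercept, evaluated on the same data it was fitted to (the relation is not guaranteed for arbitrary predictions). The x and y values are made up for illustration and do not come from the original post.

```python
from statistics import mean, pvariance

# Hypothetical toy data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# Closed-form least-squares fit of y = intercept + slope * x.
x_bar, y_bar = mean(x), mean(y)
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
y_hat = [intercept + slope * xi for xi in x]

# R2 = 1 - SSR / SST
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
sst = sum((yi - y_bar) ** 2 for yi in y)
r_squared = 1 - ssr / sst

lhs = pvariance(y) * r_squared   # Variance(y_actual) x R2
rhs = pvariance(y_hat)           # Variance(y_predicted)
print(lhs, rhs)                  # the two agree up to rounding
```

Because the fit includes an intercept, the residuals of this fit also average to zero, which is the case where R2 and the Explained Variance Score coincide.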


As previously mentioned, the main difference is the Mean of Error, and the formulas confirm it:

R2 = 1 - [(Sum of Squared Residuals / n) / Variance(y_actual)]

Explained Variance Score = 1 - [Variance(y_predicted - y_actual) / Variance(y_actual)]

in which:

Variance(y_predicted - y_actual) = (Sum of Squared Residuals / n) - (Mean Error)^2

So the only difference is that the Explained Variance Score subtracts the squared Mean Error from the residual term of the first formula. But why?
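To make the role of the Mean Error concrete, here is a small sketch that recomputes both scores by hand for the data used in the earlier example. It relies only on the standard variance identity Var(e) = mean(e^2) - mean(e)^2 and on population variances (dividing by n). The variable names are illustrative, and this is not scikit-learn's own implementation.

```python
from statistics import mean, pvariance

y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
n = len(y_true)

residuals = [t - p for t, p in zip(y_true, y_pred)]
mse = sum(e * e for e in residuals) / n   # Sum of Squared Residuals / n
mean_error = mean(residuals)              # -0.25 for this data
var_resid = pvariance(residuals)          # equals mse - mean_error ** 2
var_y = pvariance(y_true)

r2 = 1 - mse / var_y
explained_variance = 1 - var_resid / var_y

# The gap between the two scores is the squared Mean Error,
# scaled by the variance of the actual values.
gap = explained_variance - r2
print(r2, explained_variance, gap, mean_error ** 2 / var_y)
```

The gap is never negative, so the Explained Variance Score is always at least as large as R2, with equality exactly when the Mean Error is zero.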


When we compare the R2 Score with the Explained Variance Score, we are basically checking the Mean Error: if R2 = Explained Variance Score, then the Mean Error is zero.

The Mean Error reflects the tendency of our estimator, that is, whether it is biased or unbiased.


In Summary:

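If you want an unbiased estimator, so that the model neither underestimates nor overestimates on average, you may consider taking the Mean of Error into account. R2 penalizes a nonzero Mean Error, whereas the Explained Variance Score ignores it.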
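One way to see this in practice is to add a constant offset to every prediction, which mimics a model that systematically underestimates. The sketch below uses hand-written formulas rather than the sklearn.metrics functions, and the helper name r2_and_explained_variance is hypothetical. Under these assumptions the Explained Variance Score stays the same for every offset while R2 drops.

```python
from statistics import pvariance

def r2_and_explained_variance(y_true, y_pred):
    """Return (R2, explained variance) using population variances."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    var_y = pvariance(y_true)
    mse = sum(e * e for e in residuals) / len(residuals)
    r2 = 1 - mse / var_y
    ev = 1 - pvariance(residuals) / var_y
    return r2, ev

y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 7]   # mean residual is 0 for these predictions

results = {}
for offset in (0.0, 1.0, 3.0):
    # Shift every prediction by the same amount to introduce a bias.
    shifted = [p + offset for p in y_pred]
    results[offset] = r2_and_explained_variance(y_true, shifted)
    print(offset, results[offset])
```

With a large enough offset R2 even becomes negative, while the Explained Variance Score still reports the unshifted value, so the latter alone cannot reveal a systematic bias.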

 

Reference link: https://stackoverflow.com/questions/24378176/python-sci-kit-learn-metrics-difference-between-r2-score-and-explained-varian

Published by: 全栈程序员-用户IM. When reposting, please credit the source: https://javaforall.cn/119588.html Original link: https://javaforall.cn