R-squared intuition (article) | Khan Academy (2024)

  • ivan08urbieta

    Posted 7 years ago.

    Which parameter is better for evaluating the fit of a line to a data set: the correlation coefficient (r) or the coefficient of determination (r^2)?

    (26 votes)

    • Nahuel Prieto

      Posted 7 years ago.

      The short answer is this: In the case of the Least Squares Regression Line, according to traditional statistics literature, the metric you're looking for is r^2.

      Longer answer:
      IMHO, neither r nor r^2 is the best for this. r is calculated using the standard deviation, a statistic that has long been questioned because it squares the deviations just to remove the sign and then takes a square root only AFTER adding them up, which resembles a Euclidean distance more than a good dispersion statistic (it introduces an error to the result that is never fully removed). Here is a paper on that topic, presented at the British Educational Research Association Annual Conference in 2004: https://www.leeds.ac.uk/educol/documents/00003759.htm .

      If we used the MAD (mean absolute deviation) instead of the standard deviation to calculate both r and the regression line, then the line, as well as r as a metric of its effectiveness, would be more realistic, and we would not even need to square r at all.

      This is a very extensive subject and there are still lots of different opinions out there, so I encourage other people to complement my answer with what they think.

      Hope you found my answer helpful or at least interesting.

      Cheers!
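
To make the standard deviation versus MAD comparison in this answer concrete, here is a minimal sketch that computes both for one small sample. The data values are made up for illustration, and the population form of the standard deviation (dividing by n) is an assumption chosen to keep the arithmetic simple.

```python
from math import sqrt

# Hypothetical sample, chosen only so the arithmetic comes out cleanly.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)
deviations = [x - mean for x in data]

# Mean absolute deviation: average distance from the mean.
mad = sum(abs(d) for d in deviations) / len(data)

# Population standard deviation: square, average, then take the root.
sd = sqrt(sum(d ** 2 for d in deviations) / len(data))

print(mad)  # 1.5
print(sd)   # 2.0
```

The standard deviation comes out larger here because squaring gives the largest deviations more weight before the root is taken.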

      (114 votes)

  • morecmy

    Posted 5 years ago.

    What's the difference between R-squared and the total sum of squared residuals?

    (9 votes)

  • Shannon Hegewald

    Posted 3 years ago.

    They lost me at the squares

    (6 votes)

    • deka

      Posted 2 years ago.

      don't worry about them too much
      they're simply a visualization of squaring numbers then summing them like 3^2 + 7^2 + 13^2 to assess how far they are from a regression line
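
As a small sketch of the squaring-then-summing described above, assume three hypothetical residuals (observed y minus predicted y) with the sizes from the example, one of them negative:

```python
# Hypothetical residuals (observed y minus the line's predicted y).
residuals = [3, -7, 13]

# Squaring makes every term non-negative, so positive and negative
# residuals cannot cancel each other out when summed.
squared = [e ** 2 for e in residuals]
sum_of_squares = sum(squared)

print(squared)         # [9, 49, 169]
print(sum_of_squares)  # 227
```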

      (5 votes)

  • Maryam Azmat

    Posted 6 years ago.

    If you have two models of a set of data, a linear model and a quadratic model, and you have worked out the R-squared value through linear regression, and are then asked to explain what the R-squared value of the quadratic model is, without using any figures, what would this explanation be?

    (2 votes)

    • Ian Pulizzotto

      Posted 6 years ago.

      A quadratic model has one extra parameter (the coefficient on x^2) compared to a linear model. Setting that coefficient to 0 recovers the linear model, so when both are fit by least squares to the same data, the quadratic model is either as accurate as, or more accurate than, the linear model. Recall that the greater the accuracy of the model (i.e. the smaller the sum of squared residuals), the higher the R^2. So the R^2 for the quadratic model is greater than or equal to the R^2 for the linear model.

      Have a blessed, wonderful day!
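
The claim in this answer can be checked numerically. The sketch below is one possible illustration, not the article's method: it fits both models by least squares using the normal equations in plain Python, and the data points and the helper names `fit_polynomial` and `r_squared` are made up for the example.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [c0, c1, ...] via the normal equations."""
    m = degree + 1
    # Augmented matrix [A^T A | A^T y], where A has columns 1, x, x^2, ...
    aug = [[sum(x ** (i + j) for x in xs) for j in range(m)]
           + [sum(y * x ** i for x, y in zip(xs, ys))]
           for i in range(m)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(m):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][m] / aug[i][i] for i in range(m)]


def r_squared(xs, ys, coeffs):
    """R^2 = 1 - SSE/SST for the polynomial with the given coefficients."""
    mean_y = sum(ys) / len(ys)
    preds = [sum(c * x ** k for k, c in enumerate(coeffs)) for x in xs]
    sse = sum((y - p) ** 2 for y, p in zip(ys, preds))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst


# Hypothetical data with visible curvature.
xs = [0, 1, 2, 3, 4, 5]
ys = [1.0, 1.8, 4.1, 8.9, 16.2, 24.8]

r2_linear = r_squared(xs, ys, fit_polynomial(xs, ys, 1))
r2_quadratic = r_squared(xs, ys, fit_polynomial(xs, ys, 2))

print(r2_linear, r2_quadratic)
print(r2_quadratic >= r2_linear)  # True
```

A higher R^2 for the quadratic fit does not by itself mean the quadratic model is the better choice, since the extra parameter can never make the fit to the same data worse.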

      (5 votes)

  • Brown Wang

    Posted 7 years ago.

    How do we predict the sum of squares for the regression line?

    (2 votes)

    • 347231

      Posted 6 years ago.

      Tbh, you really cannot get around squaring every number. I guess if you have decimals, you could round them off, but really, other than that, there's no shortcut. It is difficult to predict because the powers have to be applied to each and every number. You could always do a bit of mental math and round things off into easier numbers, but it's not always reliable.

      (5 votes)

  • Bo Stoknes

    Posted 2 months ago.

    Why do we square the residuals? I get that we need a positive value for all residuals to calculate the sum of the prediction error, but wouldn't it be easier to just calculate the sum of the absolute values of the residuals?

    (3 votes)

  • 24pearcetc

    Posted 8 months ago.

    Is there a shorter way to create an estimate without taking all the steps to solve the problem? Is there a hack or a way to do it quickly?

    (2 votes)

    • Jose Prieto Lechuga

      Posted 8 months ago.

      Maybe using software like JASP?

      (1 vote)

  • Neel Kumar

    Posted 6 years ago.

    Can I get the exact data set that this dot plot was created from?

    (2 votes)

  • gembaindonesia

    Posted 9 months ago.

    Hi. In several cases lately I have observed that when you remove several obvious outliers from a data set in "one go", the R-squared actually gets lower. This is rather counterintuitive, to me at least, and looking at the formula for how r is calculated, it doesn't seem to make much sense either. Any insights into this?
    b/r
    Niels

    (2 votes)

    • daniella

      Posted 4 months ago.

      The phenomenon you described, where removing outliers from a data set results in a lower R^2 value, can occur in certain cases. One possible reason is that the outliers were exerting a disproportionate influence on the correlation between the variables: points that lie far from the rest of the data but roughly along the overall trend add a lot to the total variation in y, and the line accounts for most of it. Removing them can shrink the total sum of squares much more than the residual sum of squares, so the ratio of residual to total variation rises and R^2 = 1 - (residual sum of squares)/(total sum of squares) falls, even though the fitted line may be a more accurate estimate of the true relationship among the remaining points. Additionally, it's important to consider the nature of the outliers and the underlying data generating process to fully understand the impact of their removal on the regression analysis.
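
A minimal sketch of one way this can happen, using made-up data: five clustered points with a weak trend plus two far-away points that happen to lie along that trend. The data and the helper name `fit_stats` are assumptions for illustration only.

```python
def fit_stats(xs, ys):
    """Fit a least-squares line and return (sse, sst, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares: variation the line does not explain.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total sum of squares: variation of y around its own mean.
    sst = sum((y - mean_y) ** 2 for y in ys)
    return sse, sst, 1 - sse / sst


# Hypothetical data: the last two points sit far from the cluster.
xs_all = [1, 2, 3, 4, 5, 20, 21]
ys_all = [2, 1, 3, 2, 3, 20, 22]

# Same data with the two far-away points removed.
xs_trimmed = xs_all[:5]
ys_trimmed = ys_all[:5]

sse_all, sst_all, r2_all = fit_stats(xs_all, ys_all)
sse_trim, sst_trim, r2_trim = fit_stats(xs_trimmed, ys_trimmed)

print(round(r2_all, 2))   # about 0.98
print(round(r2_trim, 2))  # about 0.32
```

In this example the total sum of squares falls far more sharply than the residual sum of squares once the two points are removed, which is why R^2 drops.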

      (1 vote)

  • Scott Samuel

    Posted 3 years ago.

    How do you calculate r^2?

    (1 vote)

    • deka

      Posted 2 years ago.

      If you already have r, simply compute r*r.

      Yes, for a least-squares line with a single predictor, r^2 is nothing but the square of r (the correlation coefficient).
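
As a rough check of this shortcut, the sketch below computes r directly and also computes 1 - SSE/SST from the fitted least-squares line, then compares the two. The data set and the function names are hypothetical, and the equality shown holds for a straight-line fit with one predictor.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Correlation coefficient r between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def r_squared_from_line(xs, ys):
    """1 - SSE/SST for the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst


# Hypothetical data set.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
print(round(r * r, 6))                        # 0.6
print(round(r_squared_from_line(xs, ys), 6))  # 0.6
```

The sum of squared residuals (SSE) is a raw total in squared y-units, while R^2 rescales it against the total variation (SST) to give a unitless proportion.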

      (3 votes)

Article information

Author: Foster Heidenreich CPA

