Correlation And Regression Assignment help




In statistics, regression analysis includes any techniques for modelling and analyzing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables. More specifically, regression analysis helps us understand how the typical value of the dependent variable changes when any one of the independent variables is varied, while the other independent variables are held fixed. Most commonly, regression analysis estimates the conditional expectation of the dependent variable given the independent variables — that is, the average value of the dependent variable when the independent variables are held fixed. Less commonly, the focus is on a quantile, or other location parameter of the conditional distribution of the dependent variable given the independent variables. In all cases, the estimation target is a function of the independent variables called the regression function. In regression analysis, it is also of interest to characterize the variation of the dependent variable around the regression function, which can be described by a probability distribution.

Multiple correlation is a linear relationship among more than two variables. It is measured by the coefficient of multiple determination, denoted R^2, which is a measure of the fit of a linear regression. A regression's R^2 falls between zero and one (assuming a constant term has been included in the regression); a higher value indicates a stronger relationship among the variables, with a value of one indicating that all data points fall exactly on the fitted plane in multidimensional space and a value of zero indicating no linear relationship at all between the independent variables collectively and the dependent variable. Unlike the coefficient of determination in a regression involving just two variables, the coefficient of multiple determination is not computationally commutative: a regression of y on x and z will in general have a different R^2 than will a regression of z on x and y. For example, suppose that in a particular sample the variable z is uncorrelated with both x and y, while x and y are linearly related to each other. Then a regression of z on y and x will yield an R^2 of zero, while a regression of y on x and z will yield a positive R^2.
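The asymmetry in the example above can be checked numerically. The sketch below is only an illustration under stated assumptions: the four-point sample is made up so that z is uncorrelated with both x and y, the R^2 values are obtained from the standard two-predictor identity R^2 = (r_t1^2 + r_t2^2 - 2 r_t1 r_t2 r_12) / (1 - r_12^2), and the function names pearson_r and r_squared_two_predictors are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def r_squared_two_predictors(target, p1, p2):
    """R^2 of a linear regression (with a constant) of target on p1 and p2."""
    r_t1 = pearson_r(target, p1)
    r_t2 = pearson_r(target, p2)
    r_12 = pearson_r(p1, p2)
    return (r_t1 ** 2 + r_t2 ** 2 - 2 * r_t1 * r_t2 * r_12) / (1 - r_12 ** 2)


# Made-up sample (an assumption): z is uncorrelated with x and with y,
# while x and y are strongly, but not perfectly, linearly related.
x = [-3, -1, 1, 3]
y = [-7, 1, -1, 7]
z = [1, -1, -1, 1]

r2_y_on_xz = r_squared_two_predictors(y, x, z)  # regression of y on x and z
r2_z_on_xy = r_squared_two_predictors(z, x, y)  # regression of z on x and y
print(round(r2_y_on_xz, 3), round(r2_z_on_xy, 3))
```

For this sample the first value is about 0.8 and the second is zero, which matches the statement that swapping the roles of the variables changes R^2.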

Correlation & Regression:


          The term correlation refers to the relationship between two or more variables. If a change in one variable is accompanied by a change in another variable, the variables are said to be correlated.


There are basically three types of correlation, namely:

  • Positive correlation
  • Negative correlation
  • Zero correlation.

Correlation

Positive correlation:

          If the values of two variables deviate (change) in the same direction, i.e. if an increase (or decrease) in one variable is accompanied by a corresponding increase (or decrease) in the other, the correlation between them is said to be positive.

Examples of Positive correlation:

  •    The heights and weights of the individuals
  •    The income and expenditure
  •    Experience and salary

 

Negative correlation:

          If the values of two variables deviate (change) in opposite directions, i.e. if an increase (or decrease) in one variable is accompanied by a corresponding decrease (or increase) in the other, the correlation between them is said to be negative.

Examples of negative correlation:

  • The price of a product and the quantity demanded
  • The speed of a vehicle and the time taken to cover a fixed distance
  • Altitude and air temperature

 

Zero correlation:

          If there is no linear relationship between two variables, the correlation between them is said to be zero.

Formula:
Correlation Co-Efficient:
Correlation(r) = (NΣXY - (ΣX)(ΣY)) / SQRT([NΣX^2 - (ΣX)^2][NΣY^2 - (ΣY)^2])
where
              N = Number of values or elements
              X = First Score
              Y = Second Score
              ΣXY = Sum of the product of first and Second Scores
              ΣX = Sum of First Scores
              ΣY = Sum of Second Scores
              ΣX^2 = Sum of squared First Scores
              ΣY^2 = Sum of squared Second Scores
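The formula above translates almost line by line into code. The following is a minimal sketch, assuming the scores arrive as two equal-length Python lists; the function name pearson_r and the three small data sets are made up for illustration, one for each type of correlation described earlier.

```python
import math


def pearson_r(xs, ys):
    """Correlation coefficient r from the raw-sum formula."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)                                   # N
    sum_x = sum(xs)                               # ΣX
    sum_y = sum(ys)                               # ΣY
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # ΣXY
    sum_x2 = sum(x * x for x in xs)               # ΣX^2
    sum_y2 = sum(y * y for y in ys)               # ΣY^2
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator


# Made-up data sets, one per type of correlation.
positive = pearson_r([1, 2, 3, 4], [2, 4, 5, 8])          # y rises with x
negative = pearson_r([1, 2, 3, 4], [8, 5, 4, 2])          # y falls as x rises
zero = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])      # no linear trend
print(positive > 0, negative < 0, zero == 0)
```

The third data set follows y = x^2, so the variables are clearly related, yet r is zero because the relationship is not linear.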

Solved example on correlation:

Ex 1:  Determine the correlation coefficient for the given set of X and Y values.

X Values    Y Values
32          1.5
45          2.1
17          3.2
39          2.6
21          4.4
25          5.4

Sol: 

Let us count the number of values.
            N = 6

Determine the values of XY, X^2 and Y^2 for each pair.

X Value    Y Value    X*Y      X*X     Y*Y
32         1.5        48       1024    2.25
45         2.1        94.5     2025    4.41
17         3.2        54.4     289     10.24
39         2.6        101.4    1521    6.76
21         4.4        92.4     441     19.36
25         5.4        135      625     29.16


Determine the following values: ΣX, ΣY, ΣXY, ΣX^2, ΣY^2.
ΣX = 179
ΣY = 19.2
ΣXY = 525.7
ΣX^2 = 5925
ΣY^2 = 72.18

Correlation (r) = `(NΣXY - (ΣX)(ΣY)) / sqrt([NΣX^2 - (ΣX)^2][NΣY^2 - (ΣY)^2])`

                        = `(6(525.7) - (179)(19.2)) / sqrt([6(5925) - (179)^2][6(72.18) - (19.2)^2])`

                        = `(3154.2 - 3436.8) / sqrt([35550 - 32041][433.08 - 368.64])`

                        = `(-282.6) / sqrt([3509][64.44])`

                        = `(-282.6) / sqrt(226119.96)`

                        = `(-282.6) / (475.52)`

                  (r)  = -0.59

Ans:

Correlation (r) = -0.59, so X and Y are negatively correlated in this sample.
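The arithmetic of Ex 1 can be re-checked with a few lines of code. This is a small sketch that reuses the six X and Y values from the example; the variable names are made up. Note that the numerator 3154.2 - 3436.8 is negative, so the coefficient comes out negative.

```python
import math

# Data from Ex 1.
x_values = [32, 45, 17, 39, 21, 25]
y_values = [1.5, 2.1, 3.2, 2.6, 4.4, 5.4]

n = len(x_values)                                            # N
sum_x = sum(x_values)                                        # ΣX
sum_y = sum(y_values)                                        # ΣY
sum_xy = sum(x * y for x, y in zip(x_values, y_values))      # ΣXY
sum_x2 = sum(x * x for x in x_values)                        # ΣX^2
sum_y2 = sum(y * y for y in y_values)                        # ΣY^2

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(f"numerator   = {numerator:.1f}")    # -282.6
print(f"denominator = {denominator:.2f}")  # 475.52
print(f"r           = {r:.2f}")            # -0.59
```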

Practice problem on correlation:

x:        15             12            10         8

y:        13              11             9         6

Regression

Regression Definition:
     Regression is a statistical analysis that assesses the association between two variables. Simple linear regression fits a straight line that predicts the value of one variable (y) from the value of the other (x).


Regression Formula:
Regression Equation(y) = a + bx
Slope(b) = (NΣXY - (ΣX)(ΣY)) / (NΣX^2 - (ΣX)^2)
Intercept(a) = (ΣY - b(ΣX)) / N

where
              x and y are the variables.
              b = the slope of the regression line
              a = the intercept, i.e. the point where the line crosses the y axis.
              N = Number of values or elements
              X = First Score
              Y = Second Score
              ΣXY = Sum of the product of first and Second Scores
              ΣX = Sum of First Scores
              ΣY = Sum of Second Scores
              ΣX^2 = Sum of squared First Scores
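The slope and intercept formulas can be sketched in code as follows. This is a minimal illustration, not a full regression routine: the function name fit_line is made up, the data are the X and Y values from Ex 1 in the correlation section, and the input x = 30 used for the prediction is an arbitrary choice.

```python
def fit_line(xs, ys):
    """Return (a, b) for the regression line y = a + b*x."""
    n = len(xs)                                   # N
    sum_x = sum(xs)                               # ΣX
    sum_y = sum(ys)                               # ΣY
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # ΣXY
    sum_x2 = sum(x * x for x in xs)               # ΣX^2
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return intercept, slope


# Data from Ex 1 of the correlation section.
x_values = [32, 45, 17, 39, 21, 25]
y_values = [1.5, 2.1, 3.2, 2.6, 4.4, 5.4]

a, b = fit_line(x_values, y_values)
predicted = a + b * 30   # predicted y for an arbitrary x of 30
print(f"y = {a:.4f} + ({b:.4f})x")
print(f"predicted y at x = 30: {predicted:.2f}")
```

The slope is negative here, which agrees with the negative numerator found in the correlation example: the slope and r share the same numerator, so they always have the same sign.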


