Unit 8 Section 5: Equation of the Line of Best Fit

Once you have drawn a line of best fit, you can determine its equation. You will remember that the equation of a straight line is given by

y = mx + c

where m is the gradient and c is the intercept.

Example 1

The points with coordinates (0, 6), (2, 7), (4, 8) and (6, 9) lie on a straight line.
Draw the line and determine its equation.

The points and the line are shown on the graph.

The intercept is 6.
The gradient = 3/6 = ½, so the equation of the line is
y = ½x + 6
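
The calculation in Example 1 can be checked with a short Python sketch. The function name `line_through` is illustrative and not part of the original text; exact fractions are used so that the gradient appears as 1/2 rather than 0.5.

```python
from fractions import Fraction

def line_through(p, q):
    """Return (m, c) for the straight line y = mx + c through points p and q."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        raise ValueError("a vertical line has no gradient")
    m = Fraction(y2 - y1, x2 - x1)   # gradient = change in y / change in x
    c = y1 - m * x1                  # intercept: value of y when x = 0
    return m, c

points = [(0, 6), (2, 7), (4, 8), (6, 9)]
m, c = line_through(points[0], points[-1])
print(m, c)  # 1/2 6

# If all the points lie on one line, each must satisfy y = mx + c.
print(all(y == m * x + c for x, y in points))  # True
```

Any two of the four points give the same gradient and intercept, because the points lie exactly on the line.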

Example 2

The following graph shows a scatter plot and a line of best fit:

(a)

Determine the equation of the line of best fit.

The intercept and the gradient can be found from the graph, as shown on the following diagram. (Note that the scales on the vertical and horizontal axes are not the same.)
c = 5, m = ½
so the line of best fit has equation y = ½x + 5.
(b)

Use the equation to estimate y when x = 4.

Substitute x = 4 into the equation.
y = ½ × 4 + 5 = 2 + 5 = 7
(c)

Use the equation to estimate x when y = 18.

Substitute y = 18 into the equation for the line of best fit and solve the equation this gives.
18 = ½x + 5
13 = ½x (subtracting 5 from both sides)
x = 2 × 13 (multiplying both sides by 2)
= 26
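
Parts (b) and (c) of Example 2 can be sketched in Python as follows, taking the gradient as ½ and the intercept as 5, as in the worked solution. The function names `estimate_y` and `estimate_x` are illustrative only.

```python
from fractions import Fraction

M = Fraction(1, 2)  # gradient of the line of best fit in Example 2
C = 5               # intercept of the line of best fit in Example 2

def estimate_y(x):
    """Estimate y from x using y = Mx + C."""
    return M * x + C

def estimate_x(y):
    """Estimate x from y by rearranging y = Mx + C to x = (y - C) / M."""
    return (y - C) / M

print(estimate_y(4))   # 7
print(estimate_x(18))  # 26
```

Dividing by ½ in `estimate_x` is the same step as multiplying both sides by 2 in the written solution.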

Note of Warning!

In (b) above, the value of x used was within the range of values of x provided by the original data. We can be confident that the estimate we obtain is reasonable. This process is called interpolation.

In (c) above, the value of x we obtain is well outside the range of values of x provided by the original data. This process is called extrapolation and the results must be treated with caution as they may be very unreliable. In some cases, extrapolation produces predictions that are meaningless.
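
The distinction between the two processes comes down to a simple range check, which might be sketched in Python like this. The function name `classify_estimate` and the x-values in the list are hypothetical, chosen only for illustration; they are not the data from Example 2.

```python
def classify_estimate(x, data_xs):
    """Label an estimate made at x as interpolation or extrapolation."""
    lowest, highest = min(data_xs), max(data_xs)
    if lowest <= x <= highest:
        return "interpolation"   # x lies inside the range of the data
    return "extrapolation"       # x lies outside the range of the data

# Hypothetical x-values from a data set, for illustration only
xs = [1, 3, 4, 6, 8]
print(classify_estimate(4, xs))   # interpolation
print(classify_estimate(26, xs))  # extrapolation
```

The check only says which process is being used. It does not measure how unreliable an extrapolated value is.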

Exercises

Question 1

Each set of points below lies on a straight line. Determine the equation of each line.

(a)

(0, 3), (5, 5), (10, 7) and (15, 9)

y =
(b)

(1, 5.3), (3, 5.5), (5, 5.7) and (7, 5.9)

y =
(c)

(0, 6), (3, 5.4), (5, 5) and (8, 4.4)

y =
Question 2
x 0 100 200 300 400
L 6 6.4 6.9 7.3 7.6

The relationship between two quantities L and x is to be investigated using the data shown.

(a)

Draw a scatter graph with x on the horizontal axis and draw a line of best fit.

(b)

Determine the equation of the line of best fit.

L =
(c)

Use the equation to estimate L when x = 250 and 500.

x = 250, L =
x = 500, L =
The first estimate should be reliable, because it has been produced by interpolation, since x = 250 is within the range of the original data. The second estimate may not be as reliable, because it has been produced by extrapolation, since x = 500 is outside of the range of the original data.
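
If you want to compare a line drawn by eye with a calculated one, a common numerical approach is the least-squares line. This is a different technique from the one used in this section, and the sketch below is only one way to do it, using the data from Question 2. The function name `least_squares_line` is illustrative.

```python
def least_squares_line(xs, ys):
    """Return (m, c) for the least-squares line y = mx + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Gradient = sum of products of deviations / sum of squared x deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    c = mean_y - m * mean_x  # the line passes through (mean_x, mean_y)
    return m, c

xs = [0, 100, 200, 300, 400]
Ls = [6, 6.4, 6.9, 7.3, 7.6]
m, c = least_squares_line(xs, Ls)
print(f"L = {m:.4f}x + {c:.2f}")

for x in (250, 500):
    inside = min(xs) <= x <= max(xs)
    kind = "interpolation" if inside else "extrapolation"
    print(f"x = {x}: L is about {m * x + c:.3f} ({kind})")
```

A line of best fit drawn by eye will usually give a slightly different gradient and intercept, so treat the calculated values as a check rather than as the only acceptable answer.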
Question 3

In the calibration of a thermometer, the height, H cm, of the mercury is recorded at different temperatures. The results are listed below.

Temperature(°C) 5 20 35 50 80
H(cm) 4.5 21.0 35.2 51.2 78.6
(a)

Draw a scatter graph and a line of best fit.

(b)

Determine the equation of the line of best fit.

H =
(c)

Estimate H when the temperature is 60 °C and 120 °C.

T = 60 °C, H =
T = 120 °C, H =
(d)

Which of your estimates is the more reliable?

The first is more reliable, because T = 60 °C is within the range of the results.
Question 4

Refer back to the scatter graphs and lines of best fit you used in each of Questions 1 to 5 of the Exercises in Section 8.4. Determine the equation of the line of best fit for each question.

y =
y =
y =
y =
y =
Question 5

A long distance lorry driver records the times it takes to make journeys of different lengths. This information is recorded below:

Length of Journey (miles) 150 229 260 290 320
Time Taken (hours) 3¼ 4½ 6¼ 6½ 7¾
(a)

Comment on the way that the driver records the time taken.

He has given the times correct to the nearest quarter of an hour.
(b)

Plot the data and draw a line of best fit.

(c)

Determine the equation of the line of best fit.

T =
Question 6

In an experiment a flask of water is heated. The temperature of the water is recorded at two minute intervals. The results are recorded in the following table:

Time(minutes) 0 2 4 6 8 10
Temperature(°C) 18 30 42 56 71 84
(a)

Plot the data on a graph and determine the equation of the line of best fit.

T =
(b)

Use the equation to predict the temperature after 11 minutes.

Approximately °C
(c)

Would it be wise to use the line of best fit to predict temperatures for later times than 11 minutes?

It would not be wise to use the line of best fit to predict temperatures for times later than 11 minutes, because these are beyond the range of the original data and, as the answer to part (b) shows, the water is getting close to its boiling point, at which the experimental conditions will change.
Question 7

A driver records the petrol consumed on a number of journeys of different lengths. The data is presented in the table below:

Journey Length (miles) 100 180 250 300 320 350
Petrol Consumption (gallons) 3.5 5.6 7.9 8.4 9.3 10.9
(a)

Plot a graph of petrol consumed (vertical axis) against journey length (horizontal axis) and determine the equation of the line of best fit. Use this to predict the petrol needed for a journey of 280 miles.


Line of best fit has equation P = .

Approximately gallons of petrol are needed for a journey of 280 miles.

Question 8

The number of triplets and higher order births per 100 000 of the population, as recorded for various years between 1984 and 1994, is given in the following table:

Year 1984 1987 1988 1989 1991 1992 1994
No. of Triplets and Higher Order Births per 100 000 of the Population 13 21 20 29 32 31 40
(a)

Plot a graph to illustrate this data and draw a line of best fit.

(b)

Determine the equation of the line of best fit.

T =
(c)

Estimate the number of triplets and higher order births per 100 000 of the population in the year 2020.

Approximately:
This figure is likely to be extremely unreliable because we are extrapolating well beyond the original range of years into the future.