Linear Systems: Why Does Linear Combination Work (Graphically)?

A system of linear equations consists of multiple linear equations. You can think of this as multiple lines graphed on one coordinate plane. Three situations can arise when looking at such a graph. Either:
1) No point is shared by all of the lines shown
2) There is exactly one point that all of the lines pass through
3) The lines lie on top of one another, so they have infinitely many points in common.
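For two lines written in slope-intercept form, the three cases above can be told apart from the slopes and intercepts alone. The following is a minimal sketch, assuming exactly two non-vertical lines; the function name `classify` and its parameter names are my own choices for illustration.

```python
def classify(m1, b1, m2, b2):
    """Classify the system y = m1*x + b1, y = m2*x + b2 (two non-vertical lines)."""
    if m1 != m2:
        return "one point"        # different slopes: the lines cross exactly once
    if b1 == b2:
        return "same line"        # same slope and intercept: every point is shared
    return "no common point"      # same slope, different intercepts: parallel lines


print(classify(-3, 2, 1, -6))   # the system examined later in this post
print(classify(1, 0, 1, 3))     # two parallel lines
print(classify(2, 5, 2, 5))     # the same line written twice
```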

There are four commonly used tools for solving linear systems:
– graphing
– substitution
– linear combination
– matrices
Each has its own advantages and disadvantages in various situations. However, I often used to wonder why the linear combination approach works. My earlier post explains why it works from an algebraic perspective. This post will try to explain why it works from a graphical perspective.

Consider the linear system:

$\begin{cases}y=-3x+2\\y=x-6\end{cases}$

which, when graphed, looks like:

The linear combination technique for solving a system tells us to add two equations together… what happens graphically when we do that? Suppose we add the above two equations:

$\begin{cases}y=-3x+2\\y=x-6\end{cases}\\*~\\*~\\*2y=-2x-4~~~(1)\\*~\\*y=-x-2~~~~~~(2)$

If we graph this new equation (2) on the same graph as the original system, we get:

Note that the new equation, a linear combination of the original two equations, shares the one point that both original lines have in common. Why does that happen? Will it always happen?

The left side of equation (1) above is the sum of the two y-terms, or “2y”. The right side is the sum of the x-terms and constants that define those two y-terms. In other words, we have added the y-coordinates from both lines together: on the left as the two y variables, and on the right as their definitions in terms of x.

Then, as we move from equation (1) to equation (2), we divide both sides by two. What is the vocabulary term that describes adding two numbers together, then dividing by two? Averaging. Pick any x-coordinate on the graph above, and draw a vertical line through it… the y-coordinate where your line intersects the blue line (equation (2)) is the average of the two y-coordinates where your line intersects the black lines (the original two equations).
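The averaging claim can be checked numerically. This is a small sketch using the two original equations and equation (2); the function names `line_a`, `line_b`, and `combined` are labels I made up for illustration.

```python
def line_a(x):
    return -3 * x + 2   # first original equation: y = -3x + 2


def line_b(x):
    return x - 6        # second original equation: y = x - 6


def combined(x):
    return -x - 2       # equation (2): y = -x - 2


# For each x, compare equation (2) with the average of the two original y-values.
for x in [-4, 0, 2, 5]:
    average = (line_a(x) + line_b(x)) / 2
    print(x, line_a(x), line_b(x), combined(x), average)
```

At every x you try, the last two columns should agree. At x = 2 the first two y-values agree with each other as well.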

When we average the y-coordinates from the black lines, if both black lines have the exact same y-coordinate, the average MUST also have that exact same value. Therefore, the only y-coordinate that will remain unchanged by the y-averaging process is one where both lines already have the same y-coordinate. That is why the sum of the two equations produces an equation that MUST pass through the point at which the original two lines intersected.
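The same argument can be written in symbols, which suggests it is not specific to this example. As notation of my own (not used elsewhere in this post), let $f(x)$ and $g(x)$ stand for the right-hand sides of the two original equations, and let $\bar{y}(x)$ stand for the averaged line:

$\bar{y}(x)=\frac{f(x)+g(x)}{2}$

If the two lines meet at $x=x_0$, so that $f(x_0)=g(x_0)=y_0$, then

$\bar{y}(x_0)=\frac{y_0+y_0}{2}=y_0$

For the system above, $x_0=2$ and $y_0=-4$, and the average of $-4$ and $-4$ is $-4$.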

At this point, you may be asking yourself: but wait a minute, what happens when we multiply both sides of an equation by a constant before adding? This corresponds to finding a “weighted average”. When the total is divided by the sum of the weights (provided that sum is not zero), you arrive at the same result: the only unchanged points will be those that satisfied both equations.
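To see the weighted-average idea in action, the sketch below multiplies the two equations of this post's system by weights `w1` and `w2`, adds them, and divides by the total weight. The function name `weighted_line` and the particular weights tried are my own choices; exact fractions are used so the comparisons are not affected by rounding.

```python
from fractions import Fraction


def weighted_line(w1, w2):
    """Return (slope, intercept) of the weighted average of
    y = -3x + 2 (weight w1) and y = x - 6 (weight w2)."""
    total = w1 + w2
    if total == 0:
        # With a total weight of zero, y cancels and there is nothing to divide by.
        raise ValueError("weights sum to zero")
    slope = Fraction(w1 * -3 + w2 * 1, total)
    intercept = Fraction(w1 * 2 + w2 * -6, total)
    return slope, intercept


solution_x, solution_y = 2, -4   # the point both original lines share
for w1, w2 in [(1, 1), (2, 1), (1, 3), (5, -2)]:
    slope, intercept = weighted_line(w1, w2)
    passes = slope * solution_x + intercept == solution_y
    print((w1, w2), slope, intercept, passes)
```

Each pair of weights gives a different line, but every one of them should pass through the shared point. The weights (1, 3) happen to cancel the x terms, which produces a horizontal line.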

So subtracting one of the original equations from the other corresponds to calculating a weighted average where the weights are 1 and -1. In this example, that causes the y terms to cancel one another out, so we never need to divide by the total weight of 0 (which is a good thing, as division by 0 is undefined):

$\begin{cases}y=-3x+2\\y=x-6\end{cases}\\*~\\*~\\*0=-4x+8\\*~\\*x=2~~~(3)$

Note that this result is another equation whose graph (the vertical line x = 2) also passes through the one point both original lines had in common:

However, this weighted average of the equations (3) is much more useful than equation (2) above. The weighting used in this example caused y to disappear from the resulting equation entirely, leaving the vertical line x = 2. In general, eliminating one variable this way results in either a horizontal or a vertical line. With such a line, one that is perpendicular to one of the axes, it is easy to determine one of the coordinates of the point that is the solution to the system.

And once we know one coordinate of the solution, we can plug that coordinate into ANY of the equations (the original two, or any weighted sum of the original two) to find the other coordinate of the solution to the system, since ALL linear combinations of the original two equations will pass through the solution point. Or, we could repeat the linear combination process in a way that cancels out the other variable, resulting in a line perpendicular to the other axis, then read off the value of the remaining coordinate of the solution from the graph.
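The elimination and back-substitution steps described above can be sketched as follows for this post's system. The variable names (`m1`, `b1`, `m2`, `b2`, and the `y_from_...` names) are my own and stand for the slopes, intercepts, and results of each substitution.

```python
from fractions import Fraction

# The system from this post, written as y = m1*x + b1 and y = m2*x + b2.
m1, b1 = -3, 2
m2, b2 = 1, -6

# Subtracting the second equation from the first cancels y:
#   0 = (m1 - m2)*x + (b1 - b2)
# Solving that for x gives the vertical line through the solution.
x = Fraction(b2 - b1, m1 - m2)

# Plug x into each original equation, and into equation (2), to find y.
y_from_first = m1 * x + b1
y_from_second = m2 * x + b2
y_from_sum = -x - 2

print(x, y_from_first, y_from_second, y_from_sum)
```

All three substitutions should give the same y-coordinate, since every one of these lines passes through the solution point.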

To summarize, the linear combination process relies on the fact that a weighted average of the equations in the system will always result in an equation that passes through the point that is the solution to the original system. If we choose our weights cleverly before averaging, we can produce a line perpendicular to one of the axes (and which passes through the solution to the system). This perpendicular line makes it easy to determine the value that one variable must have, at which point we have “solved for” one coordinate of the point that is the solution to the system.

3 comments

Anonymous says:

March 25, 2018 at 2:14 pm

Very nice explanation sir. I have never thought this way. Thanks.

Clarissa says:

February 1, 2019 at 1:21 pm

Thank you for saving me and my friend for our math test.

1. Whit Ford says:
  
  February 1, 2019 at 1:58 pm
  
  You are welcome! That’s one of the many reasons this site exists…