To begin to generate theories and hypotheses from data, it helps to be able to see the relationship between two variables. We can do this in a plot by adding a trendline to the figure, which fits a line to the data in order to identify the relationship. To do this, let's first download a new dataset here. These data allow us to see the relationship between unemployment rates and voting behavior, as well as look at voting behavior over time.
First, let's make a scatterplot of total voting percentages versus unemployment rates. Refer to earlier in the tutorial if you don't remember how to do this. We get the following figure:
Now we want to fit a line to this scatterplot that best represents the relationship between the two variables. It looks like higher unemployment rates correspond to higher voting rates. To check whether a linear relationship supports this impression, we fit a line to the figure. In the "Chart Layout" section at the top, there's an option called "Trendline." Clicking that will give you the following options:
Selecting the "Linear Trendline" option will produce the following result supporting our assumption that higher unemployment rates correspond to higher voting rates:
Trendlines come in several forms. Sometimes we hypothesize that the relationship between our variables is not simply linear. For example, we hypothesize that the relationship between voting behavior and time is not linear. In the following case, we chose to plot a polynomial trendline, and we can see that it bends downwards and then upwards as time increases. This relationship would not be visible if we used a linear trendline.
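To see what a polynomial trendline is doing behind the scenes, here is a minimal sketch in Python of a least-squares polynomial fit. The function name `fit_polynomial` and the numbers in `years_since_start` and `turnout` are made up for illustration; they are not the tutorial dataset, and this is not necessarily the exact routine the spreadsheet uses.

```python
def fit_polynomial(xs, ys, degree=2):
    """Least-squares polynomial fit.

    Returns coefficients [c0, c1, ..., c_degree] for
    y = c0 + c1*x + ... + c_degree * x**degree.
    """
    n = degree + 1
    # Build the normal equations: A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i)
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]

    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]

    # Back substitution
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / A[i][i]
    return coeffs


# Hypothetical illustration data (NOT the tutorial dataset):
# years since the first election in the series, and turnout in percent.
years_since_start = [0, 4, 8, 12, 16, 20]
turnout = [58.0, 54.5, 52.0, 52.5, 55.0, 59.5]

coeffs = fit_polynomial(years_since_start, turnout, degree=2)
print("intercept, linear term, squared term:", coeffs)
```

A positive coefficient on the squared term corresponds to a curve that bends downwards and then upwards, which is the shape a straight line cannot capture.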
You can edit your trendline and see more options by right-clicking the trendline on the chart to select it and choosing "Format Trendline" from the menu.
Apart from editing the visual appearance of our line, we can choose a number of other options. One of the most important is to display a measure of how well the fitted line matches your data. The R-squared value summarizes the proportion of the variance in the Y-variable that a particular line explains. In the format box, check the boxes to display the equation and the R-squared value on your graph.
We can now see the equation of the line and the proportion of variance in the data explained by the linear relationship. This is the visual equivalent of running a regression of the Y-variable on the X-variable.
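If you want to check the displayed equation and R-squared value by hand, the following sketch computes them with ordinary least squares in plain Python. The helper names `fit_line` and `r_squared` and the sample numbers are assumptions for illustration only; they are not taken from the tutorial dataset.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical illustration data (NOT the tutorial dataset):
# unemployment rate in percent and voting rate in percent.
unemployment = [3.5, 4.2, 5.0, 6.1, 7.3, 8.0]
voting_rate = [52.0, 55.5, 54.0, 58.5, 61.0, 60.5]

slope, intercept = fit_line(unemployment, voting_rate)
r2 = r_squared(unemployment, voting_rate, slope, intercept)
print(f"y = {slope:.3f}x + {intercept:.3f}")
print(f"R-squared = {r2:.3f}")
```

An R-squared near 1 means the line accounts for most of the variation in the Y-variable; a value near 0 means it accounts for very little.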
You can also add error bars to a chart to display estimates of uncertainty. The "Chart Layout" section also has an "Error Bars" option. If you click there and select "Error Bars with Standard Errors," the original graph would now look like this instead:
You can also add custom error bars to a chart. Use these when you have calculated your own error value for each point. To do this, first right-click on the error bars on your existing graph. Select the Format Error Bars option.
This will bring up the following menu options. Select Error Bars, then Custom, then Specify Value. Make sure the cap settings are to your liking: you can choose between cap or no cap, and the style (both, plus, minus).
Then select your custom error values and enter them in both the positive and negative error value fields. Then select OK.
Follow the same steps for the horizontal error bars if needed.
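Custom error bars need one error value per point. As a rough sketch of how you might calculate those values, the snippet below computes the standard error of the mean for repeated measurements at each point. The `samples` dictionary and the helper name `standard_error` are hypothetical; substitute your own measurements and whichever error measure your discipline expects.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical repeated measurements for each x value (illustration only).
samples = {
    3.5: [51.0, 52.5, 52.5],
    5.0: [53.0, 54.0, 55.0],
    7.3: [59.5, 61.0, 62.5],
}

x_values = sorted(samples)
means = [statistics.mean(samples[x]) for x in x_values]
errors = [standard_error(samples[x]) for x in x_values]

# Each row gives a point and the error value to use for its error bar.
for x, mean, err in zip(x_values, means, errors):
    print(f"x = {x}: mean = {mean:.2f}, error = {err:.2f}")
```

The resulting error column can be placed next to your data in the worksheet and selected as the custom error values.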
Gridlines can improve your graph's clarity and overall quality, and some disciplines have specific conventions for formatting them. First, click on your graph. Then select Chart Layout in the ribbon. Under Axes, select Gridlines.
Then, use the drop-down menu to format your horizontal or vertical gridlines. Adjust your horizontal gridlines first, then do the same with your vertical gridlines.
The original graph can now look like this: