I teach statistics. I'm a sociologist, but I'm also sort of a math geek -- always have been. That's not what people expect when they think about social scientists.
The other day when I was trying to find something to do to avoid working on fixing all the dates in my on-line class (the most truly boring aspect of teaching on-line), I thought of something I've been meaning to do for years: getting data on electricity use and temperature for my classes to learn about the difference between an association between two variables and a correlation between them.
What I wanted was a spreadsheet with one column for how many kilowatt-hours we used each month for the past few years and another column for the average temperature in our region for each of those months. The kilowatt-hours were easy -- the utility company prints the last 12 months of actual kilowatt-hour usage on each bill, so I just had to locate the January bills for the past several years. Getting the average monthly temperature was a little trickier. But the College of Agriculture at the University of Kentucky has growing season data that provides monthly averages. Some months the information is more detailed and actually breaks down the data by climate zone, but since that was only available half the time (and because I saw little difference between the overall averages for Kentucky and this region) I went with the mean temperature for the state of Kentucky.
It actually took me two days to gather the temperature data, since I had to read through separate narrative descriptions for each year and each month to ferret out the relevant figures. Once I had all the data for the past three years, I created a scatter plot of the points. It didn't look right. There was a slight linear relationship between the two data sets, about r = -.40, where there should have been a strong curvilinear relationship. Then, duh, it hit me. The temperatures were for the correct month, but the electricity usage was off by one month -- one gets the bill for May usage in June, not May. So I moved the entire column of figures up by one month, and voilà! Here's what the scatter plot looks like:
Exactly as one would expect. Because we don't have whole-house air conditioning (only in two rooms, and we only use fans at night), the upturn for warmer months is much smaller than one would expect in a home with a heat pump.
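For anyone who wants to play with the idea before collecting their own bills, here is a minimal sketch in Python. The numbers are made up purely for illustration (they are not my bills or the Kentucky data), and the U-shaped `usage_for` function and the 60-degree comfort point are assumptions I picked to mimic a home that uses more electricity as it gets colder or hotter. It shows two things: a perfectly curved relationship can produce only a moderate linear r, and a one-month billing lag can be undone by shifting the usage column up one row.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical monthly mean temperatures (degrees F), January..December.
temps = [30, 35, 45, 55, 65, 73, 77, 76, 69, 57, 46, 35]

def usage_for(temp):
    # Hypothetical U-shaped usage in kWh: lowest near 60 degrees,
    # rising as the month gets colder or hotter.
    return 500 + 0.8 * (temp - 60) ** 2

usage = [usage_for(t) for t in temps]

# Bills arrive a month late: the figure printed for month i is really
# the usage from month i-1 (wrapping around the year for simplicity).
billed = [usage[-1]] + usage[:-1]

# The fix: move the whole column up by one month.
realigned = billed[1:] + billed[:1]

r_misaligned = pearson_r(temps, billed)
r_linear = pearson_r(temps, realigned)

# Correlate usage with squared distance from 60 degrees instead of
# raw temperature, which captures the curve.
curve_term = [(t - 60) ** 2 for t in temps]
r_curve = pearson_r(curve_term, realigned)

print(f"misaligned, linear r: {r_misaligned:.2f}")
print(f"realigned, linear r:  {r_linear:.2f}")
print(f"realigned, curved r:  {r_curve:.2f}")
```

In this toy data usage is completely determined by temperature, yet the linear r stays well short of a perfect correlation because a straight line is the wrong shape. That is the association-versus-correlation point I want students to see.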
Now I have great data for my statistics class to play with -- the only trouble is that no one is enrolled for statistics this fall. Enrollments in all our classes are way down, probably due to issues with the economy. Gas prices are up here as elsewhere, as are food prices, and tuition is up, too; but unlike elsewhere in the nation, employment in the coal industry is up. Coal companies are hiring, and therefore so are other types of businesses in the region. Young people faced with rising costs and better employment opportunities choose work over school, and the wives of working husbands stay home from school to reduce gas costs. Bad news for community colleges.