# Plotting Linear Trendlines on Periodic Data Sets

#1
I am analysing a set of recorded temperatures using Excel. These have been taken once per day over several years. The hypothesis is that there is a linear trend in the data at an annual level. Of course, at a daily and monthly level, temperatures are periodic (cold in winter, warm in summer). I have calculated the annual SLOPE of the linear trend in 2 different ways: 1) by finding the annual averages and then finding the SLOPE of those averages; 2) by finding the SLOPE of the daily (periodic) data and then multiplying it by 365 to get an annual estimate. I appreciate these 2 methods won't be exactly equivalent, but my results are vastly different. Can anyone explain why? Which method gives the more accurate description of the true annual trend? Can anyone point me towards some literature that discusses this kind of problem? Thanks.
2 months ago
#2
(In reply to cameron.clark, post #1)
I'd guess your first approach is the correct one - just look at how such stats are reported / plotted in real life.
https://www.carbonbrief.org/met-offi...year-on-record
I'm really not sure what you mean by the second one. Multiplying by 365 seems strange (average maybe ...?)
#3
(In reply to mqb2766, post #2)
Sorry if I wasn't clear re. my 2nd method: by finding the SLOPE of the daily data, I am finding the linear increase DAY ON DAY. I wanted to convert this to an increase YEAR ON YEAR, so I multiplied it by 365 (days in a year). That way I can compare SLOPES from method 1 with method 2.
I'd really love to find some mathematical literature regarding this problem, any ideas?
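
One possible contributor to the gap between the two methods, offered as a sketch rather than a diagnosis of the actual spreadsheet: a straight line fitted to daily data can partly absorb the seasonal cycle, and how much depends on where in the cycle the record starts and how many years it covers. Averaging over whole years removes the cycle before fitting. The Python example below uses synthetic data; the function name `compare_slopes` and every parameter value (trend, amplitude, phase, record length) are assumptions made up for illustration.

```python
import numpy as np

def compare_slopes(n_years=10, trend_per_year=0.05, amplitude=10.0,
                   phase=0.0, noise_sd=0.0, seed=0):
    """Return (method 1 slope, method 2 slope), both in degrees per year."""
    rng = np.random.default_rng(seed)
    days = np.arange(n_years * 365)  # leap days ignored for simplicity
    seasonal = amplitude * np.sin(2.0 * np.pi * days / 365 + phase)
    temps = (10.0 + trend_per_year * days / 365 + seasonal
             + rng.normal(0.0, noise_sd, days.size))

    # Method 1: average each year, then fit a line to the annual means.
    annual_means = temps.reshape(n_years, 365).mean(axis=1)
    slope_annual = np.polyfit(np.arange(n_years), annual_means, 1)[0]

    # Method 2: fit a line to the daily values, then scale to per-year.
    slope_daily = np.polyfit(days, temps, 1)[0] * 365
    return slope_annual, slope_daily

slope_annual, slope_daily = compare_slopes()
print(f"annual means: {slope_annual:.3f}  daily x 365: {slope_daily:.3f}")
```

In this synthetic setup the annual-mean slope recovers the assumed trend, while the daily slope is pulled away from it by the seasonal term. Changing `phase` or `n_years` changes the size and sign of that pull.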
2 months ago
#4
(In reply to cameron.clark, post #3)
Tbh, I'd just forget about your second method. I'm not sure if this is what you did, but if you developed 365 models for Jan 1, Jan 2, ... Dec 31 and then averaged them (so sum and divide by 365), that would be closer to what you'd get with your first method. However, I can't see any real advantage (and several disadvantages) in doing this if the aim is to assess the yearly increase. Fitting a regression model to the averaged yearly temperature would be the standard way to pose the problem. If this isn't what you meant by DAY ON DAY, can you describe clearly what the input and output are?

What is the context of the problem? Degree project, A level, ...?
Last edited by mqb2766; 2 months ago
2 months ago
#5
(In reply to cameron.clark, post #1)
The appropriate way of analysing data like this is to use a special form of linear regression, where you use one term corresponding to the trend that you want to detect, and then a sine and a cosine term that deal with the periodicity and phase of the periodic part. So if $s(t)$ and $c(t)$ are sine and cosine functions with period one year, then you fit $T(t) = a + bt + \alpha s(t) + \beta c(t)$. What you are interested in is the value of $b$.

If you are not sure of the period, or you are not sure whether the periodicity is sinusoidal, you can use "basis" functions other than the simple sin and cosine (such as more terms in a Fourier series) until your model shows a good fit.
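
A minimal sketch of the trend-plus-sine-and-cosine regression described above, in Python with NumPy. The function name `fit_trend_with_annual_cycle`, the 365-day period and the synthetic temperatures are assumptions for illustration, not anything taken from the original data.

```python
import numpy as np

def fit_trend_with_annual_cycle(days, temps, period=365.0):
    """Least-squares fit of temps ~ a + b*days + alpha*sin + beta*cos."""
    days = np.asarray(days, dtype=float)
    angle = 2.0 * np.pi * days / period
    # Columns: constant, linear trend, sine term, cosine term.
    X = np.column_stack([np.ones_like(days), days,
                         np.sin(angle), np.cos(angle)])
    coef, *_ = np.linalg.lstsq(X, np.asarray(temps, dtype=float), rcond=None)
    return coef  # [a, b, alpha, beta]

# Synthetic example: 10 years, an assumed trend of 0.05 degrees per year
# and a seasonal cycle with an arbitrary phase shift.
days = np.arange(10 * 365)
temps = 10.0 + 0.05 * days / 365 + 8.0 * np.sin(2.0 * np.pi * days / 365 + 1.0)

a, b, alpha, beta = fit_trend_with_annual_cycle(days, temps)
annual_trend = b * 365  # b is per day, so scale to per year
print(f"estimated annual trend: {annual_trend:.4f}")
```

The sine and cosine coefficients together absorb both the amplitude and the phase of the cycle, so the starting day of the record does not need to line up with any particular point in the season.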
#6
(In reply to Gregorius, post #5)
Hi Gregorius, this is some amazing insight, thanks!

So I know that the period is 365 days (1 year). However, the periodicity is definitely not sinusoidal and there is a seasonal drift (change in shape). I have found some literature (https://hal.archives-ouvertes.fr/hal-01980565/document) on basis functions/Fourier series but I am still struggling to understand how to apply it to my data. This paper says that it uses the first 13 Fourier basis functions... can you help explain what that means? And what it might look like in application? Thanks.
2 months ago
#7
(In reply to cameron.clark, post #6)
I’ll say more tomorrow when I have more time, but the basic idea of Fourier series is to approximate periodic functions with a series of sine and cosine functions… the more you take, the closer the approximation is. If you have a periodic function with a linear trend, then that trig series plus a linear term will approximate that. So take sine and cosine functions with the fundamental period (1 year), and start adding their harmonics… sine and cosine functions with periods 1/2 year, 1/3 year, 1/4 year, etc.
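
Written out, the model described here could look like the following. The symbols $a$, $b$, $\alpha_k$, $\beta_k$, the period $P$ (one year in the units of $t$) and the number of harmonics $K$ are notation assumed for illustration.

```latex
T(t) \approx a + b\,t
  + \sum_{k=1}^{K} \left[
      \alpha_k \sin\!\left(\frac{2\pi k t}{P}\right)
    + \beta_k \cos\!\left(\frac{2\pi k t}{P}\right)
    \right]
```

The $k = 1$ terms have the fundamental period $P$, and each higher $k$ gives a harmonic with period $P/k$. The trend of interest is the coefficient $b$.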
1 month ago
#8
(In reply to Gregorius, post #7)
What that paper means by "the first 13 Fourier basis functions" is that you should take the constant term plus the first six sine terms and the first six cosine terms of a Fourier series. So if $t$ is your day number (counting from t = 0 as the first day) then you need $\sin(2\pi k t/365)$ and $\cos(2\pi k t/365)$ for k = 1,2,...,6. Practically speaking, you should have a column of t-values and a corresponding column of temperature values. You then need to calculate the value of each of those sine and cosine terms for k = 1,2,...,6, giving you an additional 12 columns of data. Then you feed all of this into a linear regression, with the temperature column as the outcome, and the t, sine and cosine columns as covariates (the software should give you the constant term automatically).

Depending on the complexity of the periodicity of the temperature data, this may or may not fit well (and you must check the regression diagnostics!) and you may have to use a more sophisticated technique to deal with the periodicity - but then consulting a local statistician would be a good idea.
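
The column-building recipe above can be sketched in Python with NumPy as follows. The helper name `fourier_design_matrix`, the 365-day period and the synthetic temperatures are assumptions for illustration. The constant column is added by hand here because `np.linalg.lstsq` does not add one automatically.

```python
import numpy as np

def fourier_design_matrix(t, period=365.0, n_harmonics=6):
    """Columns: constant, t, then sin and cos for k = 1..n_harmonics."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t), t]
    for k in range(1, n_harmonics + 1):
        angle = 2.0 * np.pi * k * t / period
        cols.append(np.sin(angle))
        cols.append(np.cos(angle))
    return np.column_stack(cols)

# Synthetic example: 8 years of daily values with an assumed trend of
# 0.04 degrees per year and a seasonal shape that is not a pure sinusoid.
rng = np.random.default_rng(1)
t = np.arange(8 * 365)
season = (8.0 * np.sin(2.0 * np.pi * t / 365)
          + 3.0 * np.cos(4.0 * np.pi * t / 365)
          + 1.5 * np.sin(6.0 * np.pi * t / 365))
temps = 12.0 + 0.04 * t / 365 + season + rng.normal(0.0, 0.5, t.size)

X = fourier_design_matrix(t)           # 1 + 1 + 12 = 14 columns
coef, *_ = np.linalg.lstsq(X, temps, rcond=None)
annual_trend = coef[1] * 365           # coefficient on t, scaled to per year
residuals = temps - X @ coef           # inspect these as a basic diagnostic
print(f"estimated annual trend: {annual_trend:.4f}")
```

Plotting `residuals` against `t` is one simple way to start on the regression diagnostics mentioned above: leftover seasonal structure would suggest the six harmonics are not capturing the cycle.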