The Student Room Group

two-factor factorial design

I'm doing a "two-factor factorial design" in SPSS but having trouble getting outputs.

Post hoc tests are not performed for Date because error term has zero degrees of freedom.
Post hoc tests are not performed for SamplePoint because error term has zero degrees of freedom.
All absolute deviations are constant within each cell. Levene F statistics cannot be computed.

These error messages are appearing.
What could be the reason?

I'd need to know a few more specifics about what you're doing, but it sounds as if you have level combinations with too few elements.
Reply 2
Original post by Gregorius
I'd need to know a few more specifics about what you're doing, but it sounds as if you have level combinations with too few elements.

I'm doing river water quality monitoring.
The parameters are pH, turbidity, dissolved oxygen, biological oxygen demand, Fe, nitrate, ammonia content, etc.
There are 5 sampling points, sampled monthly for 5 months.

Objectives
To find the reasons for the water quality changes with time on a certain point
To find the reasons for the water quality changes along the stretch at a certain time

is this information enough?

What analyses are you attempting to perform?
Reply 4
Original post by Gregorius
What analyses are you attempting to perform?


That's where I'm stuck.
I have never studied statistics, so I have no idea which analysis to choose.

So let me see if I understand exactly the data that you have collected. If I’m reading you correctly, you are measuring:

(i) at least seven parameters (pH, turbidity, dissolved oxygen, biological oxygen demand, Fe, nitrate, ammonia content, etc.)

(ii) at five geographically separated locations in a river (your sampling points).

(iii) each at five different times (monthly for five months).

So for each parameter (such as pH), you have 25 measurements. Is that correct?

Now, you’ve also given me two objectives that involve a quantity called “water quality”. Can you tell me what the definition of this is?
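As an aside on the original SPSS messages: with a single measurement for each Date and SamplePoint combination, a full two-factor model that includes the interaction leaves no degrees of freedom for error. A minimal sketch of that bookkeeping, assuming a balanced 5 x 5 layout with one value per cell (the function name `two_factor_df` is made up for illustration):

```python
def two_factor_df(levels_a, levels_b, reps_per_cell, include_interaction=True):
    """Degrees of freedom for a balanced two-factor factorial ANOVA."""
    n_total = levels_a * levels_b * reps_per_cell
    df = {
        "A": levels_a - 1,
        "B": levels_b - 1,
        "AxB": (levels_a - 1) * (levels_b - 1) if include_interaction else 0,
    }
    # Whatever is left of the total (n - 1) after the model terms is error.
    df["error"] = (n_total - 1) - df["A"] - df["B"] - df["AxB"]
    return df

# 5 dates x 5 sampling points, one measurement per combination
full = two_factor_df(5, 5, 1)
additive = two_factor_df(5, 5, 1, include_interaction=False)
print(full["error"])      # 0
print(additive["error"])  # 16
```

With one value per cell, each value is also its own cell mean, so every within-cell deviation is zero, which is consistent with the Levene message. Dropping the interaction term frees 16 degrees of freedom for error, though whether a model without interaction is appropriate depends on the data.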
Reply 6
You understood perfectly :smile:
"Water quality" just means the parameters I'm checking, nothing else.
I've attached the statistical methodology of a similar study,
but I can't understand in which order these steps must be done.
Can you help me with that?


To be honest, I think you’re going to need local statistical help. The paper that you’ve quoted is using quite advanced techniques (PCA and cluster analysis) that really need quite a lot of statistical knowledge to do properly. I’m also worried that you don’t have enough data to apply these sorts of techniques.

If we go back to your first objective: “To find the reasons for the water quality changes with time on a certain point”, then for each parameter (let’s take pH, for example) you have 5 measurements at each location. You could (only just, as this is a very small sample size) do a linear regression with pH as outcome and time as explanatory variable at each location. This would tell you whether pH was changing over time separately at each location. You could instead assume that the same processes are happening at all the locations and lump the data together by time; this gives you 25 observations, and linear regression would be OK with this sample size. You would lose the differentiation between locations, though.
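The two regression options just described (one line per location, or one pooled line over all 25 observations) can be sketched roughly as follows. This is only an illustration: the pH values and the location labels P1 to P5 are made up, and `ols_line` is a hand-written least-squares helper, not anything SPSS provides.

```python
def ols_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Made-up pH readings, NOT real data: one list per location, months 0..4.
months = [0, 1, 2, 3, 4]
ph = {
    "P1": [7.0, 7.1, 7.2, 7.3, 7.4],
    "P2": [6.8, 6.8, 6.9, 6.9, 7.0],
    "P3": [7.2, 7.2, 7.2, 7.2, 7.2],
    "P4": [7.5, 7.4, 7.3, 7.2, 7.1],
    "P5": [6.9, 7.0, 7.0, 7.1, 7.2],
}

# Option 1: a separate regression at each location (5 points each).
slopes = {loc: ols_line(months, values)[1] for loc, values in ph.items()}

# Option 2: pool all locations into one regression (25 points).
pooled_x = [m for _ in ph for m in months]
pooled_y = [v for values in ph.values() for v in values]
pooled_intercept, pooled_slope = ols_line(pooled_x, pooled_y)

for loc, slope in slopes.items():
    print(loc, round(slope, 3))
print("pooled", round(pooled_slope, 3))
```

The sketch only produces slope estimates. Judging whether a slope differs from zero also needs its standard error, which a statistics package's regression output reports.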

But you have many more than one parameter measured at each location, and this is where I suspect that you’re meant to use PCA or cluster analysis. Instead of either (a) using each parameter separately in a regression or (b) doing a fully multivariate analysis (i.e. taking the vector of all parameter measurements as outcome), the idea here is that you take all the different types of parameter measurement and you reduce their dimension using PCA or cluster analysis, and then you stick this reduced outcome variable into a regression. But, as I say, this is hard stuff!
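To make the "reduce the dimension, then regress" idea a little more concrete, here is a rough sketch that computes a first principal component score from several standardized parameters. Everything in it is an assumption for illustration: the values are made up, only three parameters and six samples are shown, and the leading component is found by simple power iteration rather than the full eigendecomposition a statistics package would use.

```python
import math

def standardize(columns):
    """Centre each column and scale it to unit sample standard deviation."""
    z_cols = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        z_cols.append([(v - mean) / sd for v in col])
    return z_cols

def correlation_matrix(z_cols):
    """Correlation matrix of already-standardized columns."""
    n = len(z_cols[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in z_cols]
            for ci in z_cols]

def leading_eigenvector(matrix, iterations=200):
    """Approximate the leading eigenvector by power iteration."""
    p = len(matrix)
    v = [1.0 / (i + 1) for i in range(p)]  # arbitrary non-zero start
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Made-up values for illustration only, NOT real measurements.
parameters = {
    "pH":        [7.0, 7.1, 7.3, 7.2, 7.4, 7.5],
    "DO":        [8.0, 7.8, 7.5, 7.6, 7.2, 7.0],
    "turbidity": [5.0, 5.5, 6.0, 5.8, 6.6, 7.0],
}

z_cols = standardize(list(parameters.values()))
loadings = leading_eigenvector(correlation_matrix(z_cols))

# One score per sample: the weighted sum of its standardized values.
pc1_scores = [sum(w * z[i] for w, z in zip(loadings, z_cols))
              for i in range(len(z_cols[0]))]
print([round(w, 3) for w in loadings])
print([round(s, 3) for s in pc1_scores])
```

The resulting `pc1_scores` could then stand in for the individual parameters as the outcome in a regression on time. The caveat above still applies: with so few observations the component may be poorly determined.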

For your second objective: “To find the reasons for the water quality changes along the stretch at a certain time”, I’m not at all sure how you would do this unless you had some sort of additional spatial model of how the different locations interact.
Reply 8
Original post by Gregorius
To be honest, I think you’re going to need local statistical help. The paper that you’ve quoted is using quite advanced techniques (PCA and cluster analysis) that really need quite a lot of statistical knowledge to do properly. I’m also worried that you don’t have enough data to apply these sorts of techniques.

So then I'll use regression for each parameter separately for the 1st objective.
For the second one I'll ask my research supervisor.
Thanks a lot for your time.
Reply 9
I have another question.
In the regression, I can input 1 dependent variable and many independent variables. But for me, the "parameters" are the dependent variables, and the independent variables are time (sample collection dates) and the sampling points.
How can I input the time (dates) as the independent variable and my parameters as dependent variables?
Reply 10
I was thinking of inputting the sampling points by taking the upstream point as "0" and then giving each of the next ones its distance from the first point.
Would it be OK to do it that way?
If so, how should I input them?
There is another issue, though: for each date or sample point there will be 5 sets of data.
How do I deal with that?
Reply 11
Can't I do a principal component analysis?
Original post by Narendran
I have another question.
In the regression, I can input 1 dependent variable and many independent variables. But for me, the "parameters" are the dependent variables, and the independent variables are time (sample collection dates) and the sampling points.
How can I input the time (dates) as the independent variable and my parameters as dependent variables?


One way of dealing with time as an independent variable in such a regression is simply to measure time from the first measurement - so the first measurement is at time zero, the second at one month, etc.
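A small sketch of that coding, assuming one sample per calendar month. The dates below are hypothetical, and `months_since_first` is just an illustrative helper:

```python
from datetime import date

def months_since_first(sample_dates):
    """Whole calendar months elapsed since the earliest sampling date."""
    first = min(sample_dates)
    return [(d.year - first.year) * 12 + (d.month - first.month)
            for d in sample_dates]

# Hypothetical sampling dates, one per month for five months.
sample_dates = [
    date(2016, 1, 15),
    date(2016, 2, 14),
    date(2016, 3, 16),
    date(2016, 4, 15),
    date(2016, 5, 17),
]

time_index = months_since_first(sample_dates)
print(time_index)  # [0, 1, 2, 3, 4]
```

If the sampling were not evenly spaced, days elapsed since the first sample (for example `(d - first).days`) might be a better time variable than whole months.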
Original post by Narendran
I was thinking of inputting the sampling points by taking the upstream point as "0" and then giving each of the next ones its distance from the first point.
Would it be OK to do it that way?
If so, how should I input them?
There is another issue, though: for each date or sample point there will be 5 sets of data.
How do I deal with that?


Dealing with the sampling points in that way might be reasonable, but this is what I meant by saying that you really need a spatial model of how the sampling points are related to each other. It might be best to use distance from the upstream point, it might be best to use the square of the distance, etc. If you had more sampling points, I'd recommend that you simply plot each parameter versus distance from the upstream point and look to see whether the relationship looks linear. If not, you'd have to transform the distance variable in some way. But you really don't have enough data to do that.
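On the practical question of how to enter the data: one common layout is "long" format, with one row per date and sampling point combination, so the 5 values collected on each date simply become 5 rows. A rough sketch, in which the distances, point labels and column names are all hypothetical:

```python
# Hypothetical distances (km) of each sampling point from the upstream point.
distances_km = {"P1": 0.0, "P2": 1.2, "P3": 2.5, "P4": 4.0, "P5": 6.3}
months = [0, 1, 2, 3, 4]  # months since the first sampling date

rows = []
for month in months:
    for point, dist in distances_km.items():
        rows.append({
            "month": month,
            "point": point,
            "distance_km": dist,
            "distance_km_sq": dist ** 2,  # one possible transformed version
            "pH": None,                   # fill in the measured value here
        })

print(len(rows))  # 25
print(rows[6])
```

Each row then carries its own time and distance values next to the measured parameter, so both can be supplied as explanatory variables in the same regression.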
Original post by Narendran
Can't I do a principal component analysis?


Possibly. But it looks to me that you have very little data for that approach.
