
# Best algorithm watch

1. I have a graph with two sets of p-values.
The significance level is p=0.05.
The regions of interest are where ONE value is significant and the other is not.
(the red region = excluded region).

I want to find an algorithm or equation that will give me the point on my graph with the most significant p-value on one axis and the least significant on the other, over all the data points.

I have tried measuring the distance to the origin, but this method does not work when one value is p=0.049 and the other is p=0.999.

(the set p=0.049, p=0.999 is clearly less ideal than p=0.025, p=0.7, but using distance to origin, where the origin is p=0.05, p=0.05, it appears to be better)
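To illustrate the misranking, here is a quick Python sketch (the pair values are the ones quoted above):

```python
import math

origin = (0.05, 0.05)  # "bare significance" on both axes

for pair in [(0.049, 0.999), (0.025, 0.7)]:
    print(pair, round(math.dist(pair, origin), 3))

# (0.049, 0.999) lies farther from the origin (~0.949) than
# (0.025, 0.7) (~0.650), so plain distance ranks the worse pair higher.
```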

Graph is attached.
Thanks

2. (Original post by jsmith6131)
I want to find an algorithm or equation that will give me the point on my graph with the most significant p-value on one axis and the least significant on the other, over all the data points.
As stated, for general sets of pairs of p-values, this problem doesn't have a solution. That is, there need not be such a point. The set {(0.01, 0.99), (0.001, 0.8)} is a counterexample: the second point has the "most significant" first p-value, the first point has the "least significant" second p-value; there is no point for which both conditions are simultaneously true.

Perhaps you meant something different?
3. (Original post by Gregorius)
As stated, for general sets of pairs of p-values, this problem doesn't have a solution. That is, there need not be such a point. The set {(0.01, 0.99), (0.001, 0.8)} is a counterexample: the second point has the "most significant" first p-value, the first point has the "least significant" second p-value; there is no point for which both conditions are simultaneously true.

Perhaps you meant something different?
I want to find the solution that gives the best combination of high p-value for one and small p-value for the other.
Does that make it clearer?
4. (Original post by jsmith6131)
I want to find the solution that gives the best combination of high p-value for one and small p-value for the other.
Does that make it clearer?
The problem is that "best combination" is not defined. Presumably what is considered to be so will come from the context - care to elaborate?
5. (Original post by Gregorius)
The problem is that "best combination" is not defined. Presumably what is considered to be so will come from the context - care to elaborate?
Well I am measuring how a number of different parameters change in two population groups. I am trying to work out which parameters best describe the difference in the two populations.

I have p-values for each parameter in the two populations

I want to find the parameters where the p-values are most significant in one group and least significant in the other, at the same time.

Does that make sense?
6. (Original post by jsmith6131)
Well I am measuring how a number of different parameters change in two population groups. I am trying to work out which parameters best describe the difference in the two populations.

I have p-values for each parameter in the two populations

I want to find the parameters where the p-values are most significant in one group and least significant in the other, at the same time.

Does that make sense?
So is the situation something like the following?

You have two populations X and Y from which you have drawn random samples A and B respectively. A set of n parameters are measured in both A and B; some sort of intervention is applied to both A and B and the n parameters are measured again in A and B. This results in measurements $a_i^{pre}$ and $a_i^{post}$ made in A before and after the intervention, and $b_i^{pre}$ and $b_i^{post}$ made in B before and after the intervention. The effect of the intervention is measured in A and B separately by calculating p-values for the differences $a_i^{post} - a_i^{pre}$ and $b_i^{post} - b_i^{pre}$. This gives you p-values $p_{A,i}$ and $p_{B,i}$. You now wish to find which of the parameters is "most significantly" changed in A whilst being "least significantly" changed in B and vice-versa.

Am I close?
7. (Original post by Gregorius)
So is the situation something like the following?

You have two populations X and Y from which you have drawn random samples A and B respectively. A set of n parameters are measured in both A and B; some sort of intervention is applied to both A and B and the n parameters are measured again in A and B. This results in measurements $a_i^{pre}$ and $a_i^{post}$ made in A before and after the intervention, and $b_i^{pre}$ and $b_i^{post}$ made in B before and after the intervention. The effect of the intervention is measured in A and B separately by calculating p-values for the differences $a_i^{post} - a_i^{pre}$ and $b_i^{post} - b_i^{pre}$. This gives you p-values $p_{A,i}$ and $p_{B,i}$. You now wish to find which of the parameters is "most significantly" changed in A whilst being "least significantly" changed in B and vice-versa.

Am I close?
Thank you
Yes, that's basically what is going on.
I tried to find a statistical test that would perform this analysis, but I don't think there is one that compares a list of n unrelated parameters in populations A and B.
8. (Original post by jsmith6131)
Thank you
Yes, that's basically what is going on.
I tried to find a statistical test that would perform this analysis, but I don't think there is one that compares a list of n unrelated parameters in populations A and B.
OK, I'll try and get back to you this evening after I've had a think.
9. (Original post by jsmith6131)
Thank you
Yes, that's basically what is going on.
I tried to find a statistical test that would perform this analysis, but I don't think there is one that compares a list of n unrelated parameters in populations A and B.
Yes, this is a tricky one. The first thing that comes to my mind is some sort of MANOVA or MANCOVA, but I can't immediately see how to bend it into the shape that you want. I'll carry on thinking along these lines, but for the moment...

Perhaps a reasonable approach is to go back, not to p-values, but to normalized effect sizes. There is a one-to-one correspondence between them, but the scale of effect size would seem to make more sense than the non-linear transform that you go through to get a p-value.

So, consider one of the parameters that you're interested in and consider its values before and after the intervention in samples A and B. In each of these samples calculate the effect size (after minus before) and divide it by its standard deviation to get standardized effect sizes $d_A$ and $d_B$. Then to get a comparison between $d_A$ and $d_B$, either subtract one from the other, or take a ratio, or something like that. Then order the differences (or the ratios).

This approach has the advantage of simplicity; but it has the disadvantage that there is no obvious analytic way of telling whether one parameter really differentiates the effect in A and B more than another. It can be done, using a computationally intensive technique called the bootstrap, but this would require a bit of non-trivial statistical programming.
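A rough sketch of that bootstrap in Python (the data layout and function names are my own illustration, assuming paired before/after measurements per subject):

```python
import random
import statistics

def standardized_effect(before, after):
    """Standardized effect size: mean within-subject change divided by
    the standard deviation of the changes."""
    diffs = [a - b for a, b in zip(after, before)]
    return statistics.mean(diffs) / statistics.stdev(diffs)

def bootstrap_ratio_ci(before_a, after_a, before_b, after_b,
                       n_boot=2000, seed=0):
    """Bootstrap a rough 95% interval for the ratio d_A / d_B of
    standardized effect sizes, resampling subjects with replacement
    within each sample."""
    rng = random.Random(seed)
    n_a, n_b = len(before_a), len(before_b)
    ratios = []
    for _ in range(n_boot):
        ia = [rng.randrange(n_a) for _ in range(n_a)]
        ib = [rng.randrange(n_b) for _ in range(n_b)]
        d_a = standardized_effect([before_a[i] for i in ia],
                                  [after_a[i] for i in ia])
        d_b = standardized_effect([before_b[i] for i in ib],
                                  [after_b[i] for i in ib])
        ratios.append(d_a / d_b)
    ratios.sort()
    return ratios[int(0.025 * n_boot)], ratios[int(0.975 * n_boot)]
```

If the interval for one parameter's ratio sits clearly away from another's, that is informal evidence that it differentiates the effect in A and B more.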
10. (Original post by Gregorius)
Yes, this is a tricky one. The first thing that comes to my mind is some sort of MANOVA or MANCOVA, but I can't immediately see how to bend it into the shape that you want. I'll carry on thinking along these lines, but for the moment...

Perhaps a reasonable approach is to go back, not to p-values, but to normalized effect sizes. There is a one-to-one correspondence between them, but the scale of effect size would seem to make more sense than the non-linear transform that you go through to get a p-value.

So, consider one of the parameters that you're interested in and consider its values before and after the intervention in samples A and B. In each of these samples calculate the effect size (after minus before) and divide it by its standard deviation to get standardized effect sizes $d_A$ and $d_B$. Then to get a comparison between $d_A$ and $d_B$, either subtract one from the other, or take a ratio, or something like that. Then order the differences (or the ratios).

This approach has the advantage of simplicity; but it has the disadvantage that there is no obvious analytic way of telling whether one parameter really differentiates the effect in A and B more than another. It can be done, using a computationally intensive technique called the bootstrap, but this would require a bit of non-trivial statistical programming.
Thank you for this. As statistical significance is important, though, I don't think I can use this method.

I have tried a number of different ratios between the p-values (e.g. p(A)/p(B), (p(A) - p(B))/(p(A) + p(B)), ...) and thankfully my dataset is small enough that I can tell my approaches are obviously wrong.

If you think there is no "best" solution then I suppose I can present the "best" of the possible solutions I have come up with as at least I will have something!

But thanks for taking time to help me with my issue. I really appreciate it
11. (Original post by Gregorius)
Yes, this is a tricky one. The first thing that comes to my mind is some sort of MANOVA or MANCOVA, but I can't immediately see how to bend it into the shape that you want. I'll carry on thinking along these lines, but for the moment...

Perhaps a reasonable approach is to go back, not to p-values, but to normalized effect sizes. There is a one-to-one correspondence between them, but the scale of effect size would seem to make more sense than the non-linear transform that you go through to get a p-value.

So, consider one of the parameters that you're interested in and consider its values before and after the intervention in samples A and B. In each of these samples calculate the effect size (after minus before) and divide it by its standard deviation to get standardized effect sizes $d_A$ and $d_B$. Then to get a comparison between $d_A$ and $d_B$, either subtract one from the other, or take a ratio, or something like that. Then order the differences (or the ratios).

This approach has the advantage of simplicity; but it has the disadvantage that there is no obvious analytic way of telling whether one parameter really differentiates the effect in A and B more than another. It can be done, using a computationally intensive technique called the bootstrap, but this would require a bit of non-trivial statistical programming.

I came up with a method actually and was wondering if you agree.
Let's assume A = p < 0.05 ALWAYS
and B = p>0.05 ALWAYS

If we say
Score = (1/A) * B = B/A
that seems to produce a very good overall score

Do you agree?
thanks
12. (Original post by jsmith6131)
I came up with a method actually and was wondering if you agree.
Let's assume A = p < 0.05 ALWAYS
and B = p>0.05 ALWAYS

If we say
Score = (1/A) * B = B/A
that seems to produce a very good overall score

Do you agree?
thanks
I think that this corresponds to the suggestion I made about taking the ratio of the effect sizes - but doing it on the scale of the p-values.
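As a quick check, that score (B/A) does rank the problem pair from the first post the intuitive way. A minimal Python sketch (the function name is mine):

```python
def score(p_sig, p_nonsig):
    """Rank a parameter by combining significance in one group
    (small p, in the denominator) with non-significance in the
    other (large p, in the numerator)."""
    return p_nonsig / p_sig

print(score(0.025, 0.7))    # ~28.0
print(score(0.049, 0.999))  # ~20.4 -- lower, as desired
```

So the set (0.025, 0.7), which distance-to-origin had wrongly ranked below (0.049, 0.999), now comes out on top.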
13. (Original post by Gregorius)
I think that this corresponds to the suggestion I made about taking the ratio of the effect sizes - but doing it on the scale of the p-values.
Yes, that is how I came up with the idea, thanks.

But I felt the method had to be done on the p-values, not the original values, to ensure that only parameters that demonstrated a significant change in one of the two populations (A or B) were included. If I took values that significantly changed in both populations, then that parameter would not actually be capable of distinguishing the two groups.

But thanks for confirming

Updated: March 15, 2016