The Student Room Group

Best algorithm

I have a graph with two sets of p-values.
The significance level is p=0.05.
The regions of interest are where ONE value is significant and the other is not.
(the red region = excluded region).

I want to find an algorithm or equation that will give me the point on my graph that is the most significant p-value on one axis and the least significant on the other, for all the data points.

I have tried measuring the distance to the origin, BUT this method does not work when one value is p=0.049 and the other is p=0.999.

(The pair p=0.049, p=0.999 is clearly less ideal than p=0.025, p=0.7, but using distance to the origin, where the origin is p=0.05, p=0.05, it appears to be better.)
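The failure of the distance-to-origin idea can be sketched in a few lines of Python (all values here are the hypothetical pairs from above):

```python
import math

# Hypothetical (p_A, p_B) pairs: we want p_A small (significant)
# and p_B large (not significant).
pairs = [(0.049, 0.999), (0.025, 0.7)]

def dist_from_threshold(p_a, p_b, alpha=0.05):
    """Euclidean distance from the (alpha, alpha) 'origin'."""
    return math.hypot(p_a - alpha, p_b - alpha)

# The clearly worse pair (0.049, 0.999) gets the LARGER distance,
# so maximising this distance picks the wrong point.
for p_a, p_b in pairs:
    print((p_a, p_b), round(dist_from_threshold(p_a, p_b), 3))
```

The distance is dominated by the p_B coordinate, so a barely significant p_A = 0.049 is hardly penalised at all.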

Graph is attached.
Thanks
Original post by jsmith6131

I want to find an algorithm or equation that will give me the point on my graph that is the most significant p-value on one axis and the least significant on the other for all the data points


As stated, for general sets of pairs of p-values, this problem doesn't have a solution. That is, there need not be such a point. The set {(0.01, 0.99), (0.001, 0.8)} is a counterexample: the second point has the "most significant" first p-value, the first point has the "least significant" second p-value; there is no point for which both conditions are simultaneously true.

Perhaps you meant something different?
Original post by Gregorius
As stated, for general sets of pairs of p-values, this problem doesn't have a solution. That is, there need not be such a point. The set {(0.01, 0.99), (0.001, 0.8)} is a counterexample: the second point has the "most significant" first p-value, the first point has the "least significant" second p-value; there is no point for which both conditions are simultaneously true.

Perhaps you meant something different?


I want to find the solution that gives the best combination of high p-value for one and small p-value for the other.
Does that make it clearer?
Original post by jsmith6131
I want to find the solution that gives the best combination of high p-value for one and small p-value for the other.
Does that make it clearer?


The problem is that "best combination" is not defined. Presumably what is considered to be so will come from the context - care to elaborate?
Original post by Gregorius
The problem is that "best combination" is not defined. Presumably what is considered to be so will come from the context - care to elaborate?


Well, I am measuring how a number of different parameters change in two population groups. I am trying to work out which parameters best describe the difference between the two populations.

I have p-values for each parameter in the two populations.

I want to find the parameters where the p-values are most significant in one group and least significant in the other, at the same time.

Does that make sense?
Original post by jsmith6131
Well I am measuring how a number of different parameters change in two population groups. I am trying to work out which parameters best describe the difference in the two populations.

I have p-values for each parameter in the two populations

I want to find the parameters where the p-values are most significant in one group and least significant in the other, at the same time.

Does that make sense?


So is the situation something like the following?

You have two populations X and Y from which you have drawn random samples A and B respectively. A set of n parameters is measured in both A and B; some sort of intervention is applied to both A and B, and the n parameters are measured again in A and B. This results in measurements $x_{1,1}^A, x_{2,1}^A, \cdots, x_{n,1}^A$ and $x_{1,2}^A, x_{2,2}^A, \cdots, x_{n,2}^A$ made in A before and after the intervention, and $x_{1,1}^B, x_{2,1}^B, \cdots, x_{n,1}^B$ and $x_{1,2}^B, x_{2,2}^B, \cdots, x_{n,2}^B$ made in B before and after the intervention. The effect of the intervention is measured in A and B separately by calculating p-values for the differences $x_{i,2}^A - x_{i,1}^A$ and $x_{i,2}^B - x_{i,1}^B$. This gives you p-values $p_i^A$ and $p_i^B$. You now wish to find which of the parameters $x_1, x_2, \cdots, x_n$ is "most significantly" changed in A whilst being "least significantly" changed in B, and vice versa.

Am I close?
Original post by Gregorius
So is the situation something like the following?

You have two populations X and Y from which you have drawn random samples A and B respectively. A set of n parameters is measured in both A and B; some sort of intervention is applied to both A and B, and the n parameters are measured again in A and B. This results in measurements $x_{1,1}^A, x_{2,1}^A, \cdots, x_{n,1}^A$ and $x_{1,2}^A, x_{2,2}^A, \cdots, x_{n,2}^A$ made in A before and after the intervention, and $x_{1,1}^B, x_{2,1}^B, \cdots, x_{n,1}^B$ and $x_{1,2}^B, x_{2,2}^B, \cdots, x_{n,2}^B$ made in B before and after the intervention. The effect of the intervention is measured in A and B separately by calculating p-values for the differences $x_{i,2}^A - x_{i,1}^A$ and $x_{i,2}^B - x_{i,1}^B$. This gives you p-values $p_i^A$ and $p_i^B$. You now wish to find which of the parameters $x_1, x_2, \cdots, x_n$ is "most significantly" changed in A whilst being "least significantly" changed in B, and vice versa.

Am I close?


Thank you.
Yes, that's basically what is going on.
I tried to find a statistical test that would perform this analysis, but I don't think there is one that compares a list of n unrelated parameters in populations A and B.
Original post by jsmith6131
Thank you.
Yes, that's basically what is going on.
I tried to find a statistical test that would perform this analysis, but I don't think there is one that compares a list of n unrelated parameters in populations A and B.


OK, I'll try and get back to you this evening after I've had a think.
Original post by jsmith6131
Thank you.
Yes, that's basically what is going on.
I tried to find a statistical test that would perform this analysis, but I don't think there is one that compares a list of n unrelated parameters in populations A and B.


Yes, this is a tricky one. The first thing that comes to my mind is some sort of MANOVA or MANCOVA, but I can't immediately see how to bend it into the shape that you want. I'll carry on thinking along these lines, but for the moment...

Perhaps a reasonable approach is to go back, not to p-values, but to normalized effect sizes. There is a one-to-one correspondence between them, but the scale of effect size would seem to make more sense than the non-linear transform that you go through to get a p-value.

So, consider one of the parameters $x_i$ that you're interested in and consider its values before and after the intervention in samples A and B. In each of these samples calculate the effect size (after − before) and divide it by its standard deviation to get $\theta_i^A$ and $\theta_i^B$. Then to get a comparison between $\theta_i^A$ and $\theta_i^B$, either subtract one from the other, or take a ratio, or something like that. Then order the differences (or the ratios).

This approach has the advantage of simplicity, but it has the disadvantage that there is no obvious analytic way of telling whether one parameter really differentiates the effect in A and B more than another does. It can be done using a computationally intensive technique called the bootstrap, but this would require a bit of non-trivial statistical programming.
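A rough sketch of this effect-size-plus-bootstrap idea, using made-up paired data for a single parameter (the data and sample sizes are entirely hypothetical): compute the standardized effect size in each sample, take their difference, and bootstrap that difference to get a percentile interval.

```python
import random
import statistics

random.seed(1)

# Hypothetical before/after measurements of ONE parameter in samples A and B.
before_A = [random.gauss(10, 1) for _ in range(30)]
after_A = [b + random.gauss(2.0, 1) for b in before_A]   # clear shift in A
before_B = [random.gauss(10, 1) for _ in range(30)]
after_B = [b + random.gauss(0.0, 1) for b in before_B]   # little shift in B

def std_effect(before, after):
    """Standardized effect size: mean(after - before) / sd(after - before)."""
    diffs = [a - b for a, b in zip(after, before)]
    return statistics.mean(diffs) / statistics.stdev(diffs)

theta_A = std_effect(before_A, after_A)
theta_B = std_effect(before_B, after_B)
contrast = theta_A - theta_B  # rank the parameters by this contrast

def bootstrap_ci(n_boot=2000):
    """Percentile bootstrap interval for theta_A - theta_B."""
    stats = []
    n, m = len(before_A), len(before_B)
    for _ in range(n_boot):
        idx_A = [random.randrange(n) for _ in range(n)]
        idx_B = [random.randrange(m) for _ in range(m)]
        t_A = std_effect([before_A[i] for i in idx_A],
                         [after_A[i] for i in idx_A])
        t_B = std_effect([before_B[i] for i in idx_B],
                         [after_B[i] for i in idx_B])
        stats.append(t_A - t_B)
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot)]

low, high = bootstrap_ci()
```

In practice you would compute `contrast` for each of the n parameters and rank them; the bootstrap interval gives a rough sense of whether one parameter's contrast is genuinely larger than another's.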
Original post by Gregorius
Yes, this is a tricky one. The first thing that comes to my mind is some sort of MANOVA or MANCOVA, but I can't immediately see how to bend it into the shape that you want. I'll carry on thinking along these lines, but for the moment...

Perhaps a reasonable approach is to go back, not to p-values, but to normalized effect sizes. There is a one-to-one correspondence between them, but the scale of effect size would seem to make more sense than the non-linear transform that you go through to get a p-value.

So, consider one of the parameters $x_i$ that you're interested in and consider its values before and after the intervention in samples A and B. In each of these samples calculate the effect size (after − before) and divide it by its standard deviation to get $\theta_i^A$ and $\theta_i^B$. Then to get a comparison between $\theta_i^A$ and $\theta_i^B$, either subtract one from the other, or take a ratio, or something like that. Then order the differences (or the ratios).

This approach has the advantage of simplicity, but it has the disadvantage that there is no obvious analytic way of telling whether one parameter really differentiates the effect in A and B more than another does. It can be done using a computationally intensive technique called the bootstrap, but this would require a bit of non-trivial statistical programming.


Thank you for this. As statistical significance is important, though, I don't think I can use this method.

I have tried measuring a number of different ratios between the p-values (e.g. p(A)/p(B), (p(A)-p(B))/(p(A)+p(B)), ...), and thankfully my dataset is small enough that I can tell my approaches are obviously wrong.

If you think there is no "best" solution, then I suppose I can present the "best" of the possible solutions I have come up with, as at least I will have something!

But thanks for taking the time to help me with my issue. I really appreciate it.
Original post by Gregorius
Yes, this is a tricky one. The first thing that comes to my mind is some sort of MANOVA or MANCOVA, but I can't immediately see how to bend it into the shape that you want. I'll carry on thinking along these lines, but for the moment...

Perhaps a reasonable approach is to go back, not to p-values, but to normalized effect sizes. There is a one-to-one correspondence between them, but the scale of effect size would seem to make more sense than the non-linear transform that you go through to get a p-value.

So, consider one of the parameters $x_i$ that you're interested in and consider its values before and after the intervention in samples A and B. In each of these samples calculate the effect size (after − before) and divide it by its standard deviation to get $\theta_i^A$ and $\theta_i^B$. Then to get a comparison between $\theta_i^A$ and $\theta_i^B$, either subtract one from the other, or take a ratio, or something like that. Then order the differences (or the ratios).

This approach has the advantage of simplicity, but it has the disadvantage that there is no obvious analytic way of telling whether one parameter really differentiates the effect in A and B more than another does. It can be done using a computationally intensive technique called the bootstrap, but this would require a bit of non-trivial statistical programming.



I came up with a method actually and was wondering if you agree.
Let's assume A = p<0.05 ALWAYS
and B = p>0.05 ALWAYS

If we say
Score = (1/A) * B
that seems to produce a very good overall score.

Do you agree?
Thanks
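For what it's worth, here is a tiny sketch of that scoring rule on made-up (p_A, p_B) pairs (all parameter names and values are hypothetical). Since Score = (1/A) * B is just B/A, parameters that are very significant in one group and very non-significant in the other rise to the top.

```python
# Hypothetical (p_A, p_B) per parameter, with p_A < 0.05 and p_B > 0.05
# as assumed above.
pvals = {
    "param1": (0.001, 0.90),
    "param2": (0.049, 0.06),
    "param3": (0.010, 0.50),
}

def score(p_a, p_b):
    """Score = (1/p_A) * p_B, i.e. the ratio p_B / p_A."""
    return (1.0 / p_a) * p_b

# Rank parameters from best to worst by this score.
ranked = sorted(pvals, key=lambda name: score(*pvals[name]), reverse=True)
# ranked -> ['param1', 'param3', 'param2']
```

Note the score blows up as p_A approaches 0, so it heavily rewards significance in group A relative to non-significance in group B.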
Original post by jsmith6131
I came up with a method actually and was wondering if you agree.
Let's assume A = p<0.05 ALWAYS
and B = p>0.05 ALWAYS

If we say
Score = (1/A) * B
that seems to produce a very good overall score.

Do you agree?
Thanks


I think that this corresponds to the suggestion I made about taking the ratio of the effect sizes - but doing it on the scale of the p-values.
Original post by Gregorius
I think that this corresponds to the suggestion I made about taking the ratio of the effect sizes - but doing it on the scale of the p-values.


Yes, that is how I came up with the idea :smile: thanks.

But I felt the method had to be done on the p-values, not the original values, to ensure only parameters that demonstrated a significant change in one of the two populations (A or B) were included. If I took values that significantly changed in both populations, then that parameter would not actually be capable of distinguishing the two groups.

But thanks for confirming :smile:
