# Estimating the Error in a Least-Squares Numerical Fit


**Problem**

I'd like to add a disclaimer that this work isn't technically graded or anything else along those lines and is purely optional/vocational.

I have a set of data at positions (x, y): the observed distribution of points, which is inherently noisy. Both x and y are observables with errors in observation, and for each point the errors dx and dy are known.

I have a set of theoretical curves that should match the data; call this the model, a function of a variable z: model(z).

The goal is to find the theoretical curve that matches the data most effectively. The curves are independent of one another, nonlinear, and have no functional form: each curve is itself just a set of (x, y) points, and they are not single-valued functions of x or y.

I also wish to get an estimate of how good the fit is, i.e. an estimate of the error in z.

**Progress**

My code can currently produce an estimate of z, and the estimate matches up well with what might be expected. However, I'm at a loss as to how to calculate the error in z.

The code is an attempt at least-squares fitting. For each value of z, the theoretical curve's point set ("data") is taken. For each of my observed datapoints r_i, I calculate the perpendicular distance to the theoretical curve, min_j ‖r_i − data_j‖.

I then get the value of LS, which is given by

LS = (1/σ²) Σ_i w_i [ min_j ‖r_i − data_j‖ ]²

where min_j ‖r_i − data_j‖ is the geometric distance between the datapoint and the theoretical curve, w_i is an arbitrary weighting factor unique to i, picked to make the fit improve based on the geometry, and σ² is the variance of the (unweighted) geometric distances.
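As a sketch of the procedure described above (the `model` function, weights, noise level, and z grid here are made-up stand-ins for illustration; the real curves come from the simulations), a grid scan over candidate z values might look like:

```python
import numpy as np

def geometric_ls(points, curve, weights, sigma2):
    """Weighted sum of squared point-to-curve geometric distances.

    points:  (N, 2) observed (x, y) datapoints
    curve:   (M, 2) sampled points of one theoretical curve
    weights: (N,) per-point weights w_i
    sigma2:  variance of the unweighted geometric distances
    """
    # distance from every datapoint to every sampled curve point
    d = np.linalg.norm(points[:, None, :] - curve[None, :, :], axis=-1)
    dmin = d.min(axis=1)                 # min_j ||r_i - data_j||
    return np.sum(weights * dmin**2) / sigma2

# --- toy example: candidate curves indexed by z (hypothetical) ---
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)

def model(z):                            # stand-in for the simulated curves
    return np.column_stack([t, z * t**2])

true_z = 2.0
pts = model(true_z)[::10] + rng.normal(scale=0.01, size=(20, 2))
w = np.ones(len(pts))

# scan LS over a grid of z and pick the minimum
zs = np.linspace(1.0, 3.0, 81)
ls = np.array([geometric_ls(pts, model(z), w, sigma2=1.0) for z in zs])
z_best = zs[ls.argmin()]
```

Since the curves have no functional form, the minimum over sampled curve points stands in for the true perpendicular distance; sampling the curve densely enough keeps that approximation small.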

**Question**

Would anyone here happen to know how I could estimate the error in the variable z? Or perhaps an idea of how I could reframe the problem so that an error estimate comes out?


#2

(Original post by **Callicious**)

When the noise is additive and iid on the output, you can get the confidence intervals on the parameters (z?) without too much problem. When there is noise on the input, it's less of a straight regression scenario. What is the form of the model (how does z influence the output), and how much noise do you have on the input? Are the z's parameters, some form of smoothing parameters, or something else?

Last edited by mqb2766; 1 month ago


#3

(Original post by **mqb2766**)

It's safe to assume the noise in (x, y) is additive (it lies within a ±, pretty much, or that assumption can safely be taken, I suppose).

The model itself is the result of numerical simulations computed for each unique value of z; there isn't a discrete functional form for it.

Here's an example. The red is the data, the green is something that can be disregarded for this, and the blue is the model.



#4

(Original post by **Callicious**)

I guess it helps to know how the response depends on z in order to talk about an error in its value. If not, you could always do some form of random sampling of z to assess the sensitivity and get some idea of the uncertainty.

I don't understand the graph or the form of the model; should the sum index be i or n?

Last edited by mqb2766; 1 month ago


#5

(Original post by **mqb2766**)

The sum index should be i (i being the i'th datapoint, from 0 to N; I've done it pythonically, sorry about that).

The blue line is the model that I'm geometrically fitting to the red data. The green is just some other catalogue that can be ignored.

I'm going to approach the problem with a different fitter (this type of fitting isn't well documented, and there isn't much going for it yet), whereas the traditional method has a lot of context and sample code available. Thanks for the insight, though; I'm going to have a go at estimating the sensitivity of the model to changes in z.
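One way to sketch that sensitivity estimate, assuming the LS cost curve is roughly parabolic near its minimum (the z grid and cost values below are illustrative made-up numbers; in practice they would come from evaluating the fit at a grid of z):

```python
import numpy as np

# Illustrative LS-versus-z samples near the minimum (made-up numbers);
# in practice these come from the geometric least-squares scan.
zs = np.linspace(1.8, 2.2, 41)
ls = 50.0 * (zs - 2.0) ** 2 + 3.0       # roughly parabolic cost curve

# fit a parabola ls ~ a*z^2 + b*z + c around the minimum
a, b, c = np.polyfit(zs, ls, 2)
z_hat = -b / (2.0 * a)                   # location of the minimum

# if LS behaves like a chi-square statistic, the 1-sigma interval is
# where LS rises by 1 above its minimum: a*dz^2 = 1  =>  dz = 1/sqrt(a)
dz = 1.0 / np.sqrt(a)
```

The Δχ² = 1 rule only applies cleanly if LS is a properly normalised chi-square; with arbitrary weights w_i and an empirical σ², the curvature still measures sensitivity, but the absolute scale of dz should be treated with caution.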



#6

(Original post by **Callicious**)

I've no real trouble understanding the cost function, but I'm not really much wiser about the parameter(s) z and the model. As above, I'd probably go down some form of Monte Carlo approach if I could make few assumptions about the model. An OK overview is

http://www-personal.umd.umich.edu/~w...arloHOWTO.html

but this is from a very quick Google search. It looks like a readable intro, though that is obviously subjective and not carefully reviewed.

Last edited by mqb2766; 1 month ago


#7

(Original post by **mqb2766**)

I'll have a crack at using Monte Carlo once I've finished up the other approach I've been given to try.
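A minimal Monte Carlo sketch of that idea, under the assumption of independent Gaussian noise of size dx, dy on each point (the `model` curve, errors, and grid-search fitter here are hypothetical placeholders): resample the data within its known errors, refit z each time, and take the spread of the refitted values as the error estimate.

```python
import numpy as np

rng = np.random.default_rng(1)

# toy model curve indexed by z (placeholder for the simulated curves)
t = np.linspace(0.0, 1.0, 200)

def model(z):
    return np.column_stack([t, z * t**2])

def fit_z(points, zs=np.linspace(1.0, 3.0, 81)):
    """Grid-search z minimising the summed squared point-to-curve distance."""
    def cost(z):
        d = np.linalg.norm(points[:, None, :] - model(z)[None, :, :], axis=-1)
        return np.sum(d.min(axis=1) ** 2)
    return zs[np.argmin([cost(z) for z in zs])]

pts = model(2.0)[::10]                   # 20 datapoints (noiseless toy data)
dx = dy = 0.01                           # known per-point errors

# resample the data within its errors and refit each time
z_samples = [
    fit_z(pts + rng.normal(scale=[dx, dy], size=pts.shape))
    for _ in range(100)
]
z_err = np.std(z_samples)                # Monte Carlo error estimate on z
```

Because the fitter is a grid search, the resolvable spread in z is limited by the grid spacing; a finer grid (or a continuous optimiser) around the minimum would sharpen the estimate.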
