# Creating 3d model from 2d photo

1. This isn't as complicated as the title probably seems.

If we take a photo of an (inflexible) piece of paper, and assuming we know the fields of view of the camera and the dimensions of the paper, it should be possible to figure out exactly what orientation in 3D relative to the camera the paper was in, right?

So there's the photo of a piece of paper, with its distance from the "camera" (i.e. the theoretical camera being where all the lines are coming from) adjusted so that it takes up the entire field of view of the camera. The three lines drawn in 3d go from the camera to three of the corners of the piece of paper- it's safe to assume that the corners of the sheet of paper must lie on these lines, right? Otherwise they wouldn't be in those positions in the photo.
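As a rough sketch of how those lines could be built: under a simple pinhole-camera assumption (no lens distortion), a pixel position plus the two fields of view give a unit vector from the camera through that pixel. The function name `pixel_to_ray`, the pixel convention (origin at the top-left, y pointing down) and the axis convention (camera looking along +z) are assumptions made for illustration; they are not from the original post.

```python
import math

def pixel_to_ray(px, py, width, height, fov_x_deg, fov_y_deg):
    """Unit vector from the camera through pixel (px, py), pinhole model assumed."""
    # Normalised image coordinates in [-1, 1], origin at the image centre, y up.
    nx = 2.0 * px / width - 1.0
    ny = 1.0 - 2.0 * py / height
    # On an imagined plane 1 unit in front of the camera, the half-extent
    # of the image is tan(fov / 2) in each direction.
    x = nx * math.tan(math.radians(fov_x_deg) / 2.0)
    y = ny * math.tan(math.radians(fov_y_deg) / 2.0)
    z = 1.0
    n = math.sqrt(x * x + y * y + z * z)
    return (x / n, y / n, z / n)

# The centre pixel looks straight down the viewing axis.
centre = pixel_to_ray(400, 300, 800, 600, 90.0, 60.0)
# A pixel on the right-hand edge is half the horizontal field of view off-axis.
right_edge = pixel_to_ray(800, 300, 800, 600, 90.0, 60.0)
```

Any point that shows up at that pixel must sit at some positive multiple of the returned vector, which is the "corners lie on these lines" idea.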

Am I right in thinking there is only one possible way the piece of paper can be in to produce a photo where the corners lie on these particular lines? And, as such, the exact positions of the corners are therefore calculable? I've "done the math" as they say, and thought I had this sorted but my program isn't having any of it, and refuses to give a solution that looks correct. I'm wanting to check my logic is correct before I start figuring out what's wrong with my maths because I've a suspicion it's the logic that's wrong.

Thanks for any help!
2. assuming the shape of the piece of paper is known, yes.

indeed, we can do this for any body of dimension greater than 1 (and of course in real life, the only bodies that exist are those of three dimensions..)
3. You couldn't make a 3D model like the title suggests; you'd need a 3D model already, or you can only say that it exists somewhere on the lines, and whether it's a few cm or a few miles away you wouldn't know. You could use the model to find exact points though, yeah.
4. (Original post by Icy_Mikki)
This isn't as complicated as the title probably seems.

If we take a photo of an (inflexible) piece of paper, and assuming we know the fields of view of the camera and the dimensions of the paper, it should be possible to figure out exactly what orientation in 3D relative to the camera the paper was in, right?

So there's the photo of a piece of paper, with its distance from the "camera" (i.e. the theoretical camera being where all the lines are coming from) adjusted so that it takes up the entire field of view of the camera. The three lines drawn in 3d go from the camera to three of the corners of the piece of paper- it's safe to assume that the corners of the sheet of paper must lie on these lines, right? Otherwise they wouldn't be in those positions in the photo.

Am I right in thinking there is only one possible way the piece of paper can be in to produce a photo where the corners lie on these particular lines? And, as such, the exact positions of the corners are therefore calculable? I've "done the math" as they say, and thought I had this sorted but my program isn't having any of it, and refuses to give a solution that looks correct. I'm wanting to check my logic is correct before I start figuring out what's wrong with my maths because I've a suspicion it's the logic that's wrong.

Thanks for any help!
Now, if only I could remember the web address of a program that uses an interesting concept, which I found on the web (it had a video demonstration). Basically, you have a camera, define a couple of points, and then 'put' computer-generated 3D objects onto those places. These objects will not change their output location when you move the camera around and such. So you could put your 3D characters on a book and see them walk around, while moving the camera at the same time. Did that make sense?

You'll definitely have to play around with vectors and matrices as well as matrix inverses!

Too bad I don't know enough vectors and matrices to give a good idea.
5. If you got the software to read the shadow, and the image format uses vectors (not bitmap, but I'm not sure about this), then I think you can... but I think one angle is not enough. As in, if you've got that paper in the picture, yes, you can make a 3D model out of it with a predefined shape; however, you won't be able to put what's written on the back of the paper onto the model, because it doesn't show in the picture.
6. (Original post by Charlybob)
You couldn't make a 3D model like the title suggests; you'd need a 3D model already, or you can only say that it exists somewhere on the lines, and whether it's a few cm or a few miles away you wouldn't know. You could use the model to find exact points though, yeah.
I'm 99% sure you don't. You just need to know the size of the piece of paper, and, with this and the fields of view, you can surely figure out exactly how far each of the points are from the camera. My method of solving for it just... doesn't work for me atm and I have no idea why.

(Original post by Vjyrik)
You'll definitely have to play around with vectors and matrices as well as matrix inverses!
What would you use the matrices for?
7. I'm not great at 3D projective transforms, but I know 2D pretty well. What is bothering me here is that I know you can do an essentially arbitrary "quad-to-quad" 2D 'perspective'(*) transform. Which has me thinking that I'm not at all sure 3 points is sufficient (because there are infinite possibilities for the 4th point).

(*) that is, applying a homogeneous 3x3 matrix on (x,y,w).

My intuition says that the 2D case implies the same holds for a more normal perspective transform in 3D. I'm slightly struggling to see whether or not knowing the size of the paper / field of view gives you enough information to get round that.
8. (Original post by DFranklin)
I'm not great at 3D projective transforms, but I know 2D pretty well. What is bothering me here is that I know you can do an essentially arbitrary "quad-to-quad" 2D 'perspective'(*) transform. Which has me thinking that I'm not at all sure 3 points is sufficient (because there are infinite possibilities for the 4th point).

(*) that is, applying a homogeneous 3x3 matrix on (x,y,w).

My intuition says that the 2D case implies the same holds for a more normal perspective transform in 3D. I'm slightly struggling to see whether or not knowing the size of the paper / field of view gives you enough information to get round that.
Now, I don't know much about these matters, but surely, there are four corners on the piece of paper...? Which makes me believe that you can determine the orientation of the paper, up to scaling/distance from the camera/angle of view of the camera.

Edit: To clarify, I'm assuming the paper is flat, which basically turns the problem into a 2D perspective transform one. But if we're supposed to work out the shape of the paper as well, then I agree with DFranklin.
9. (Original post by DFranklin)
I'm not great at 3D projective transforms, but I know 2D pretty well. What is bothering me here is that I know you can do an essentially arbitrary "quad-to-quad" 2D 'perspective'(*) transform. Which has me thinking that I'm not at all sure 3 points is sufficient (because there are infinite possibilities for the 4th point).
I was thinking about this earlier too. I always assumed 3 points was enough, but maybe it isn't.

We could modify the situation to be figuring out where two dots are in 2D given a 1D photo of them (and given how far apart the dots are)- an example being drawing two dots on a piece of paper, then looking at a thin side of the paper and trying to figure out where the dots are on the large side. I *think* there are at least two possible positions the dots could be in which create the same 1D photo, so it isn't possible to figure out exactly where the dots are, right? So, logically, that'd suggest three dots isn't enough to find out where the 3 dots are in 3D given a 2D photo of them.

Anyway, here's my way of trying to find the points if I only use three corners. If anyone is willing to check over it and see whether it's the logic that's causing my program to go wrong, I'll be extremely grateful!

#Create unit vectors which point from the camera to each of the three dots on the photo; call them X, Y and Z.

#Since each corner of the paper lies on one of the lines whose unit vectors are X, Y and Z, its position in 3D space relative to the camera must be a multiple of the unit vector along that line; call these multiples a, b and c.

#So the 3D positions of the corners of the paper relative to the camera are aX, bY and cZ.

#The distance between aX and bY must be the length of the piece of paper.

#The distance between bY and cZ must be the other dimension of the piece of paper.

#The angle between the vectors (aX-bY) and (cZ-bY) must be equal to 90 degrees (since the corner of a piece of paper is a right angle).

#We now have three simultaneous equations which can be solved..? I thought?
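The steps above can be sketched in code, purely as an illustration and not as the program I actually wrote. For each trial value of b, the two distance conditions are quadratics in a and c (two roots each), so there are four sign combinations; the sketch scans b for each combination and bisects wherever the right-angle condition changes sign. The names `solve_three_corners`, `root` and `residual`, the scan resolution `steps`, and the example pose are all assumptions made up for this sketch, and it assumes the three rays are distinct.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def unit(v):
    n = math.sqrt(dot(v, v))
    return tuple(vi / n for vi in v)

def solve_three_corners(X, Y, Z, len_ab, len_bc, steps=4000):
    """Find (a, b, c) with |aX-bY| = len_ab, |cZ-bY| = len_bc and
    (aX-bY).(cZ-bY) = 0 for unit vectors X, Y, Z. May return several."""
    xy, yz, xz = dot(X, Y), dot(Y, Z), dot(X, Z)

    def root(b, cos_t, length, sign):
        # One root of t^2 - 2*b*cos_t*t + b^2 - length^2 = 0 (distance condition).
        disc = length ** 2 - b * b * (1.0 - cos_t ** 2)
        return None if disc < 0 else b * cos_t + sign * math.sqrt(disc)

    def residual(b, sa, sc):
        a, c = root(b, xy, len_ab, sa), root(b, yz, len_bc, sc)
        if a is None or c is None:
            return None
        # (aX - bY).(cZ - bY), expanded using |Y| = 1.
        return a * c * xz - a * b * xy - b * c * yz + b * b

    # Beyond b_max one of the distance conditions has no real solution.
    b_max = min(len_ab / math.sqrt(1.0 - xy ** 2),
                len_bc / math.sqrt(1.0 - yz ** 2))
    solutions = []
    for sa in (1, -1):
        for sc in (1, -1):
            prev_b = prev_r = None
            for i in range(1, steps + 1):
                b = b_max * i / steps
                r = residual(b, sa, sc)
                if (r is not None and prev_r is not None
                        and prev_r != 0 and prev_r * r <= 0):
                    lo, hi = prev_b, b
                    for _ in range(60):  # bisection on the bracketed sign change
                        mid = 0.5 * (lo + hi)
                        if residual(lo, sa, sc) * residual(mid, sa, sc) <= 0:
                            hi = mid
                        else:
                            lo = mid
                    b_sol = 0.5 * (lo + hi)
                    a_sol = root(b_sol, xy, len_ab, sa)
                    c_sol = root(b_sol, yz, len_bc, sc)
                    if a_sol > 0 and c_sol > 0:
                        solutions.append((a_sol, b_sol, c_sol))
                prev_b, prev_r = b, r
    return solutions

# Example: an A4 sheet (0.297 m x 0.210 m) placed at a made-up pose, then recovered.
B = (0.1, -0.2, 3.0)
u, v = (0.8, 0.0, 0.6), (-0.36, 0.8, 0.48)  # orthonormal edge directions
A = tuple(p + 0.297 * d for p, d in zip(B, u))
C = tuple(p + 0.210 * d for p, d in zip(B, v))
for a, b, c in solve_three_corners(unit(A), unit(B), unit(C), 0.297, 0.210):
    print(round(a, 4), round(b, 4), round(c, 4))
```

If this prints more than one line, then three corners alone do not pin down a unique position, and something extra (such as the fourth corner) would be needed to pick between the candidates.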

Where does my method break down? Thanks for the responses so far guys!
10. That approach does seem to make sense, but I think those equations are not quite enough to determine the camera position; in particular the first two are effectively quadratic equations and so you will typically get 4 solutions.

You may well be accounting for that; what would concern me more is the conditioning of your equations. My feeling is that they will be very poorly conditioned and so very small onscreen errors will result in large errors in camera position. This may be what is causing your problem. Normally people use a lot more points and do some kind of least squares fit for improved robustness.
11. (Original post by Icy_Mikki)
#The angle between the vectors (aX-bY) and (cZ-bY) must be equal to 90 degrees (since a piece of paper is a right angle)
I'm not too sure about that one.

I haven't done much to do with vectors, so I could very easily be wrong, but might as well say.

Grab a sheet of paper by two opposite corners. Hold the corners horizontally. Then twist the paper so it's ALMOST flat vertically. The angle the camera gets is close to a flat line, it's not a right angle between them from the camera. Won't that be the angle those vectors will give?

If not, then yeah it should work and you can just ignore my stupidity >_>
12. (Original post by ukgea)
Now, I don't know much about these matters, but surely, there are four corners on the piece of paper...? Which makes me believe that you can determine the orientation of the paper, up to scaling/distance from the camera/angle of view of the camera.
I don't know if these attachments clarify what I'm worried about; the two different images are both valid perspective transformations of an (imaginary) planar rectangle, and 3 corners are identically placed in both of them.

(But, having read the other posts, I think the two images are effectively for pieces of paper of "different size", relative to the camera position, which is why you should still be able to distinguish between them).
Attached Images

13. (Original post by DFranklin)
I don't know if these attachments clarify what I'm worried about; the two different images are both valid perspective transformations of an (imaginary) planar rectangle, and 3 corners are identically placed in both of them.

(But, having read the other posts, I think the two images are effectively for pieces of paper of "different size", relative to the camera position, which is why you should still be able to distinguish between them).
Yes, but I was thinking that you should try to use the fourth corner as well. Or, if the fourth corner is outside the photograph, you can always construct its position by intersecting the two sides adjacent to it, which you can (in theory) do as long as you have a piece (of non-zero length) of each of them inside the photo. Then the basic strategy:

Introduce two planes, P1 and P2, the planes of the paper and the photograph, respectively. (Where the plane of the photograph is an imagined plane 1m in front of the camera, perpendicular to the viewing direction of the camera, on which the photograph is thought to be projected).

Then introduce homogeneous coordinates in these two planes, and let A be the 3x3 perspective transformation matrix for transforming between homogeneous coordinates in P1 and P2. Since you know the position of four points in P1, and the position of their images in P2, you get 12 equations to play with.

You then have 9 unknowns in A, and a further 4 unknowns arising from the fact that you don't actually know the scale factor of the homogeneous coordinates of the four corners. This is 13 unknowns.

But scaling the whole of A by a constant non-zero factor doesn't actually change the transformation it represents, so you really only have 8 unknowns in A, which makes 12 unknowns in total to match the 12 equations. So there, now you should be able to work out A (up to a scale factor).
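As a minimal sketch of that last step, here is one common way to set it up (not necessarily the only one): fix the bottom-right entry of A to 1 to remove the overall scale, and divide out each point's unknown scale factor, which leaves two linear equations per corner and so 8 equations for the 8 remaining entries. The function names, the plain Gauss-Jordan solver and the example photo coordinates are assumptions for illustration; fixing that entry to 1 also assumes it is not zero for the true A.

```python
def solve_linear(M, rhs):
    """Solve M x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def homography_from_4_points(src, dst):
    """3x3 matrix A (with A[2][2] = 1) mapping each src (x, y) to its dst (u, v)."""
    M, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (a11*x + a12*y + a13) / (a31*x + a32*y + 1), likewise for v,
        # multiplied through by the denominator to make the equations linear.
        M.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        rhs.append(u)
        M.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        rhs.append(v)
    h = solve_linear(M, rhs)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def apply_homography(A, pt):
    x, y = pt
    w = A[2][0] * x + A[2][1] * y + A[2][2]
    return ((A[0][0] * x + A[0][1] * y + A[0][2]) / w,
            (A[1][0] * x + A[1][1] * y + A[1][2]) / w)

# Corners of an A4 sheet in its own plane P1 (metres), and made-up positions
# of their images in the photograph plane P2.
paper = [(0.0, 0.0), (0.297, 0.0), (0.297, 0.210), (0.0, 0.210)]
photo = [(0.10, 0.05), (0.40, 0.08), (0.37, 0.30), (0.12, 0.26)]
A = homography_from_4_points(paper, photo)
```

Once A is known, `apply_homography(A, p)` predicts where any other point p on the paper should appear in the photo, which is a handy check before trying to turn A into a 3D pose.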

Now, it really feels like, if you know A, then you really know the relative positions, so to speak, in 3D, of the planes P1 and P2, up to some scaling, because the perspective transformation matrix more or less represents a perspective transformation (duh) which should translate nicely into a central projection in 3D.

So, someone want to try?
14. ukgea: Yes, I agree that it makes more sense to use the fourth corner, but unfortunately the main discussion has been about what you can deduce from only 3.

Updated: August 22, 2008