I'm a bit late to reply, but the formula is easier to understand if you have a grasp of some relatively easy linear algebra. Any 3 linearly independent vectors in R^3 form a basis for it. If we assume the direction vectors of line a and line b are NOT parallel (and neither is zero), then they are linearly independent. Their cross product is perpendicular to both, so it is a 3rd linearly independent vector, and the three together form a basis in which you can express any other vector in 3D space. Great.
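As a quick sanity check of the basis claim, here is a minimal sketch in plain Python. The helper names `cross` and `dot` and the sample direction vectors are my own, chosen only for illustration. It checks that the cross product is perpendicular to both directions and that the scalar triple product is non-zero, which is what makes the three vectors a basis.

```python
def cross(u, v):
    """Cross product of two 3D vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two vectors given as tuples."""
    return sum(x * y for x, y in zip(u, v))

# Illustrative (assumed) direction vectors: not parallel, not orthogonal either.
a = (1.0, 0.0, 0.0)
b = (1.0, 1.0, 0.0)

n = cross(a, b)  # normal to both directions

# n is perpendicular to a and to b.
perp_to_a = dot(n, a)
perp_to_b = dot(n, b)

# Scalar triple product a . (b x n); non-zero means a, b, n are
# linearly independent and therefore a basis of R^3.
triple = dot(a, cross(b, n))
```

If a and b were parallel, `n` would be the zero vector and `triple` would be 0, so the argument really does need the non-parallel assumption.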
Now take the two original points associated with line a and line b and draw the vector between them. This vector is some combination of the basis vectors. Because the cross product is perpendicular to both direction vectors, you can find out how much of that vector lies along the normal direction with a dot product. OK, so take the dot product of the vector between the points with the unit vector along the cross product (the cross product divided by its length).
Now consider any other pair of points, one on each line, and the vector between them. It differs from the previous vector only by some scalar multiple of each direction vector. Both direction vectors are perpendicular to the normal, so it is impossible for the component along the normal to change.
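Here is a rough sketch of that in Python, assuming each line is given as a point plus a direction vector. The function name `distance_between_lines` and the sample numbers are mine, not from the question. It takes the normal component of the vector between the points, then slides both points along their lines to show the value does not move.

```python
import math

def cross(u, v):
    """Cross product of two 3D vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two vectors given as tuples."""
    return sum(x * y for x, y in zip(u, v))

def distance_between_lines(p, a, q, b):
    """|(q - p) . n| / |n| with n = a x b. Assumes a and b are not parallel."""
    n = cross(a, b)
    w = tuple(qi - pi for pi, qi in zip(p, q))  # vector between the points
    return abs(dot(w, n)) / math.sqrt(dot(n, n))

# Illustrative (assumed) lines: p + s*a and q + t*b.
p, a = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
q, b = (0.0, 5.0, 3.0), (0.0, 1.0, 0.0)

d_original = distance_between_lines(p, a, q, b)

# Slide each point along its own line by an arbitrary amount.
p_moved = tuple(pi + 2.0 * ai for pi, ai in zip(p, a))
q_moved = tuple(qi - 7.0 * bi for qi, bi in zip(q, b))

d_moved = distance_between_lines(p_moved, a, q_moved, b)
```

Both `d_original` and `d_moved` come out as 3.0 here, since sliding along the lines only adds multiples of a and b, and those have zero dot product with the normal.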
OK, now add or take away a scalar multiple of each direction vector (that is, slide the two points along their lines) until the vector between the two points has zero component along both line directions. The other component (the one along the cross product, normal to both) cannot be increased or decreased by moving along the lines. Since the normal is perpendicular to both direction vectors, the squared length of the vector is the squared normal component plus the squared length of the part lying in the line directions, so the length is smallest when that part is zero. Thus the component in the normal direction has to be the minimum distance between the two lines. Every other vector between the lines is just this one stretched away by the other components.
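To make the "slide until the line components vanish" step concrete, here is one possible sketch. It is my own formulation rather than anything from the original question: it solves the two conditions "connecting vector is perpendicular to a" and "connecting vector is perpendicular to b" for the parameters s and t. The name `closest_points` is hypothetical.

```python
import math

def dot(u, v):
    """Dot product of two vectors given as tuples."""
    return sum(x * y for x, y in zip(u, v))

def closest_points(p, a, q, b):
    """Return (s, t, point_on_a, point_on_b) minimising |(q + t*b) - (p + s*a)|.

    Solves d.a = 0 and d.b = 0 for d = (q - p) + t*b - s*a.
    Assumes a and b are not parallel, so the denominator is non-zero.
    """
    w = tuple(qi - pi for pi, qi in zip(p, q))
    aa, bb, ab = dot(a, a), dot(b, b), dot(a, b)
    wa, wb = dot(w, a), dot(w, b)
    den = aa * bb - ab * ab  # equals |a x b|^2
    s = (wa * bb - ab * wb) / den
    t = (ab * wa - aa * wb) / den
    point_a = tuple(pi + s * ai for pi, ai in zip(p, a))
    point_b = tuple(qi + t * bi for qi, bi in zip(q, b))
    return s, t, point_a, point_b

# Illustrative (assumed) lines: p + s*a and q + t*b.
p, a = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
q, b = (0.0, 5.0, 3.0), (0.0, 1.0, 0.0)

s, t, point_a, point_b = closest_points(p, a, q, b)
d = tuple(y - x for x, y in zip(point_a, point_b))  # shortest connecting vector
length = math.sqrt(dot(d, d))
```

For these sample lines the connecting vector ends up purely along the normal, and its length matches the normal component from the dot product argument.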