Rendering requires the identification of those surface points that are visible through the pixels of the virtual camera. Ray tracing solves this visibility problem for each pixel independently, thus it does not reuse visibility information gathered at other pixels. The algorithms of this section, however, exploit such information using the following simple techniques:
They simultaneously attack the visibility problem for all pixels, and handle larger parts of the scene at once.
Where feasible, they exploit the incremental principle which is based on the recognition that the visibility problem becomes simpler to solve if the solution at the neighbouring pixel is taken into account.
They solve each task in that coordinate system which makes the solution easier. The scene is transformed from one coordinate system to the other by homogeneous linear transformations.
They minimize unnecessary computations, therefore remove those objects by clipping in an early stage of rendering which cannot be projected onto the window of the camera.
Homogeneous linear transformations and clipping may change the type of the surface except for points, line segments and polygons.
Footnote. Although Bézier and B-Spline curves and surfaces are invariant to affine transformations, and NURBS is invariant even to homogeneous linear transformations, but clipping changes these object types as well.
Therefore, before rendering is started, each shape is approximated by points, line segments, and meshes (Subsection 22.3).
Figure 22.37. Steps of incremental rendering. (a) Modelling defines objects in their reference state. (b) Shapes are tessellated to prepare for further processing. (c) Modelling transformation places the object in the world coordinate system. (d) Camera transformation translates and rotates the scene to get the eye to be at the origin and to look parallel with axis . (e) Perspective transformation converts projection lines meeting at the origin to parallel lines, that is, it maps the eye position onto an ideal point. (f) Clipping removes those shapes and shape parts, which cannot be projected onto the window. (g) Hidden surface elimination removes those surface parts that are occluded by other shapes. (h) Finally, the visible polygons are projected and their projections are filled with their visible colours.
Steps of incremental rendering are shown in Figure 22.37. Objects are defined in their reference state, approximated by meshes, and are transformed to the virtual world. The time dependence of this transformation is responsible for object animation. The image is taken from the camera about the virtual world, which requires the identification of those surface points that are visible from the camera, and their projection onto the window plane. The visibility and projection problems could be solved in the virtual world as happens in ray tracing, but this would require the intersection calculations of general lines and polygons. Visibility and projection algorithms can be simplified if the scene is transformed to a coordinate system, where the coordinates of a point equal to the coordinates of that pixel onto which this point is projected, and the coordinate can be used to decide which point is closer if more than one surfaces are projected onto the same pixel. Such coordinate system is called the screen coordinate system. In screen coordinates the units of axes and are equal to the pixel size. Since it is usually not worth computing the image on higher accuracy than the pixel size, coordinates are integers. Because of performance reasons, coordinate is also often integer. Screen coordinates are denoted by capital letters.
The transformation taking to the screen coordinate system is defined by a sequence of transformations, and the elements of this sequence are discussed separately. However, this transformation is executed as a single multiplication with a transformation matrix obtained as the product of elementary transformation matrices.
Rendering is expected to generate an image from a camera defined by eye position () (the focal point of the camera), looking target () where the camera looks at, and by vertical direction (Figure 22.38).
Figure 22.38. Parameters of the virtual camera: eye position , target , and vertical direction , from which camera basis vectors are obtained, front and back clipping planes, and vertical field of view (the horizontal field of view is computed from aspect ratio ).
Camera parameter defines the vertical field of view, is the ratio of the width and the height of the window, and are the distances of the front and back clipping planes from the eye, respectively. These clipping planes allow to remove those objects that are behind, too close to, or too far from the eye.
We assign a coordinate system, i.e. three orthogonal unit basis vectors to the camera. Horizontal basis vector , vertical basis vector , and basis vector pointing to the looking direction are obtained as follows:
The camera transformation translates and rotates the space of the virtual world in order to get the camera to move to the origin, to look at direction axis , and to have vertical direction parallel to axis , that is, this transformation maps unit vectors to the basis vectors of the coordinate system. Transformation matrix can be expressed as the product of a matrix translating the eye to the origin and a matrix rotating basis vectors of the camera to the basis vectors of the coordinate system:
Let us note that the columns of the rotation matrix are vectors . Since these vectors are orthogonal, it is easy to see that this rotation maps them to coordinate axes . For example, the rotation of vector is:
In the next step the viewing pyramid containing those points which can be projected onto the window is normalized making the field of view equal to 90 degrees (Figure 22.39).
Normalization is a simple scaling transformation:
The main reason of this transformation is to simplify the formulae of the next transformation step, called perspective transformation.
The perspective transformation distorts the virtual world to allow the replacement of the perspective projection by parallel projection during rendering.
After the normalizing transformation, the potentially visible points are inside a symmetrical finite frustum of pyramid of 90 degree apex angle (Figure 22.39). The perspective transformation maps this frustum onto a cube, converting projection lines crossing the origin to lines parallel to axis (Figure 22.40).
Figure 22.40. The perspective transformation maps the finite frustum of pyramid defined by the front and back clipping planes, and the edges of the window onto an axis aligned, origin centred cube of edge size 2.
Perspective transformation is expected to map point to point, line to line, but to map the eye position to infinity. It means that perspective transformation cannot be a linear transformation of Cartesian coordinates. Fortunately, homogeneous linear transforms also map point to point, line to line, and are able to handle points at infinity with finite coordinates. Let us thus try to find the perspective transformation in the form of a homogeneous linear transformation defined by a matrix:
Figure 22.40 shows a line (projection ray) and its transform. Let and be the and the slopes of the line, respectively. This line is defined by equation in the normalized camera space. The perspective transformation maps this line to a “horizontal” line crossing point and being parallel to axis . Let us examine the intersection points of this line with the front and back clipping planes, that is, let us substitute and into parameter of the line equation. The transformation should map these points to and , respectively.
The perspective transformation of the point on the first clipping plane is:
where is an arbitrary, non-zero scalar since the point defined by homogeneous coordinates does not change if the homogeneous coordinates are simultaneously multiplied by a non-zero scalar. Setting to , we get:
Note that the first coordinate of the transformed point equals to the first coordinate of the original point on the clipping plane for arbitrary , , and values. This is possible only if the first column of matrix is . Using the same argument for the second coordinate, we can conclude that the second column of the matrix is . Furthermore, in equation (22.32) the third and the fourth homogeneous coordinates of the transformed point are not affected by the first and the second coordinates of the original point, requiring . The conditions on the third and the fourth homogeneous coordinates can be formalized by the following equations:
Applying the same procedure for the intersection point of the projection line and the back clipping plane, we can obtain other two equations:
Solving this system of linear equations, the matrix of the perspective transformation can be expressed as:
Since perspective transformation is not affine, the fourth homogeneous coordinate of the transformed point is usually not 1. If we wish to express the coordinates of the transformed point in Cartesian coordinates, the first three homogeneous coordinates should be divided by the fourth coordinate. Homogeneous linear transforms map line segment to line segment and triangle to triangle, but it may happen that the resulting line segment or triangle contains ideal points (Subsection 22.5.2). The intuition behind the homogeneous division is a traveling from the projective space to the Euclidean space, which converts a line segment containing an ideal point to two half lines. If just the two endpoints of the line segment is transformed, then it is not unambiguous whether the two transformed points need to be connected by a line segment or the complement of this line segment should be considered as the result of the transformation. This ambiguity is called the wrap around problem.
The wrap around problem does not occur if we can somehow make sure that the original shape does not contain points that might be mapped onto ideal points. Examining the matrix of the perspective transformation we can conclude that the fourth homogeneous coordinate of the transformed point will be equal to the coordinate of the original point. Ideal points having zero fourth homogeneous coordinate () may thus be obtained transforming the points of plane , i.e. the plane crossing the origin and parallel to the window. However, if the shapes are clipped onto a first clipping plane being in front of the eye, then these points are removed. Thus the solution of the wrap around problem is the execution of the clipping step before the homogeneous division.
The purpose of clipping is to remove all shapes that either cannot be projected onto the window or are not between the front and back clipping planes. To solve the wrap around problem, clipping should be executed before the homogeneous division. The clipping boundaries in homogeneous coordinates can be obtained by transforming the screen coordinate AABB back to homogeneous coordinates. In screen coordinates, i.e. after homogeneous division, the points to be preserved by clipping meet the following inequalities:
On the other hand, points that are in front of the eye after camera transformation have negative coordinates, and the perspective transformation makes the fourth homogeneous coordinate equal to in normalized camera space. Thus the fourth homogeneous coordinate of points in front of the eye is always positive. Let us thus add condition to the set of conditions of inequalities (22.33). If is positive, then inequalities (22.33) can be multiplied by , resulting in the definition of the clipping region in homogeneous coordinates:
Points can be clipped easily, since we should only test whether or not the conditions of inequalities (22.34) are met. Clipping line segments and polygons, on the other hand, requires the computation of the intersection points with the faces of the clipping boundary, and only those parts should be preserved which meet inequalities (22.34).
Clipping algorithms using Cartesian coordinates were discussed in Subsection 22.4.3. Those methods can also be applied in homogeneous coordinates with two exceptions. Firstly, for homogeneous coordinates, inequalities (22.34) define whether a point is in or out. Secondly, intersections should be computed using the homogeneous coordinate equations of the line segments and the planes.
Let us consider a line segment with endpoints and . This line segment can be an independent shape or an edge of a polygon. Here we discuss the clipping on half space of equation (clipping methods on other half spaces are very similar). Three cases need to be distinquished:
If both endpoints of the line segment are inside, that is and , then the complete line segment is in, thus is preserved.
If both endpoints are outside, that is and , then all points of the line segment are out, thus it is completely eliminated by clipping.
If one endpoint is outside, while the other is in, then the intersection of the line segment and the clipping plane should be obtained. Then the endpoint being out is replaced by the intersection point. Since the points of a line segment satisfy equation (22.18), while the points of the clipping plane satisfy equation , parameter of the intersection point is computed as:
Substituting parameter into the equation of the line segment, homogeneous coordinates of the intersection point are obtained.
Clipping may introduce new vertices. When the vertices have some additional features, for example, the surface colour or normal vector at these vertices, then these additional features should be calculated for the new vertices as well. We can use linear interpolation. If the values of a feature at the two endpoints are and , then the feature value at new vertex generated by clipping is .
Having executed the perspective transformation, the Cartesian coordinates of the visible points are in . These normalized device coordinates should be further scaled and translated according to the resolution of the screen and the position of the viewport where the image is expected. Denoting the left-bottom corner pixel of the screen viewport by , the right-top corner by , and coordinates expressing the distance from the eye are expected in , the matrix of the viewport transformation is:
Coordinate systems after the perspective transformation are left handed, unlike the coordinate systems of the virtual world and the camera, which are right handed. Left handed coordinate systems seem to be unusual, but they meet our natural expectation that the screen coordinate grows from left to right, the coordinate from bottom to top and, the coordinate grows in the direction of the camera target.
After clipping, homogeneous division, and viewport transformation, shapes are in the screen coordinate system where a point of coordinates can be assigned to a pixel by extracting the first two Cartesian coordinates .
Rasterization works in the screen coordinate system and identifies those pixels which have to be coloured to approximate the projected shape. Since even simple shapes can cover many pixels, rasterization algorithms should be very fast, and should be appropriate for hardware implementation.
Let the endpoints of a line segment be and in screen coordinates. Let us further assume that while we are going from the first endpoint towards the second, both coordinates are growing, and is the faster changing coordinate, that is,
In this case the line segment is moderately ascending. We discuss only this case, other cases can be handled by exchanging the coordinates and replacing additions by substractions.
Line drawing algorithms are expected to find pixels that approximate a line in a way that there are no holes and the approximation is not fatter than necessary. In case of moderately ascending line segments this means that in each pixel column exactly one pixel should be filled with the colour of the line. This coloured pixel is the one closest to the line in this column. Using the following equation of the line
in pixel column of coordinate , the pixel closest to the line has coordinate that is equal to the rounding of . Unfortunately, the determination of requires a floating point multiplication, addition, and a rounding operation, which are too slow.
In order to speed up line drawing, we apply a fundamental trick of computer graphics, the incremental principle. The incremental principle is based on the recognition that it is usually simpler to evaluate a function using value than computing it from . Since during line drawing the columns are visited one by one, when column is processed, value is already available. In case of a line segment we can write:
Note that the evaluation of this formula requires just a single floating point addition ( is less than 1). This fact is exploited in digital differential analyzator algorithms (DDA-algorithms). The DDA line drawing algorithm is then:
1 2 3
Further speedups can be obtained using fixed point number representation. This means that the product of the number and is stored in an integer variable, where is the number of fractional bits. The number of fractional bits should be set to exclude cases when the rounding errors accumulate to an incorrect result during long iteration sequences. If the longest line segment covers columns, then the minimum number of fractional bits guaranteeing that the accumulated error is less than 1 is . Thanks to clipping only lines fitting to the screen are rasterized, thus is equal to the maximum screen resolution.
The performance and simplicity of the DDA line drawing algorithm can still be improved. On the one hand, the software implementation of the DDA algorithm requires shift operations to realize truncation and rounding operations. On the other hand – once for every line segment – the computation of slope involves a division which is computationally expensive. Both problems are solved in the Bresenham line drawing algorithm.
Figure 22.41. Notations of the Bresenham algorithm: is the signed distance between the closest pixel centre and the line segment along axis , which is positive if the line segment is above the pixel centre. is the distance along axis between the pixel centre just above the closest pixel and the line segment.
Let us denote the vertical, signed distance of the line segment and the closest pixel centre by , and the vertical distance of the line segment and the pixel centre just above the closest pixel by (Figure 22.41). As the algorithm steps onto the next pixel column, values and change and should be recomputed. While the new and values satisfy inequality , that is, while the lower pixel is still closer to the line segment, the shaded pixel of the next column is in the same row as in the previous column. Introducing error variable , the row of the shaded pixel remains the same until this error variable is negative (). As the pixel column is incremented, variables are updated using the incremental formulae (, ):
These formulae are valid if the closest pixel in column is in the same row as in column . If stepping to the next column, the upper pixel gets closer to the line segment (error variable becomes positive), then variables should be recomputed for the new closest row and for the pixel just above it. The formulae describing this case are as follows:
Note that is a signed distance which is negative if the line segment is below the closest pixel centre, and positive otherwise. We can assume that the line starts at a pixel centre, thus the initial values of the control variables are:
This algorithm keeps updating error variable and steps onto the next pixel row when the error variable becomes positive. In this case, the error variable is decreased to have a negative value again. The update of the error variable requires a non-integer addition and the computation of its increment involves a division, similarly to the DDA algorithm. It seems that this approach is not better than the DDA.
Let us note, however, that the sign changes of the error variable can also be recognized if we examine the product of the error variable and a positive number. Multiplying the error variable by we obtain decision variable . In case of moderately ascending lines the decision and error variables change their sign simultaneously. The incremental update formulae of the decision variable can be obtained by multiplying the update formulae of error variable by :
The initial value of the decision variable is .
The decision variable starts at an integer value and is incremented by integers in each step, thus it remains to be an integer and does not require fractional numbers at all. The computation of the increments need only integer additions or subtractions and multiplications by 2.
The complete Bresenham line drawing algorithm is:
1 2 3 4 5 6
THENThe line stays in the current pixel row. 9
ELSEThe line steps onto the next pixel row. 10 11
The fundamental idea of the Bresenham algorithm was the replacement of the fractional error variable by an integer decision variable in a way that the conditions used by the algorithm remained equivalent. This approach is also called the method of invariants, which is useful in many rasterization algorithms.
The input of an algorithm filling single connected polygons is the array of vertices (this array is usually the output of the polygon clipping algorithm). Edge of the polygon connects vertices and . The last vertex needs not be treated in a special way if the first vertex is put again after the last vertex in the array. Multiply connected polygons are defined by more than one closed polylines, thus are specified by more than one vertex arrays.
The filling is executed by processing a horizontal pixel row called scan line at a time. For a single scan line, the pixels belonging to the interior of the polygon can be found by the following steps. First the intersections of the polygon edges and the scan line are calculated. Then the intersection points are sorted in the ascending order of their coordinates. Finally, pixels between the first and the second intersection points, and between the third and the fourth intersection points, or generally between the th and the th intersection points are set to the colour of the polygon (Figure 22.42). This algorithm fills those pixels which can be reached from infinity by crossing the polygon boundary odd number of times.
The computation of the intersections between scan lines and polygon edges can be speeded up using the following observations:
An edge and a scan line can have intersection only if coordinate of the scan line is between the minimum and maximum coordinates of the edge. Such edges are the active edges. When implementing this idea, an active edge table (AET for short) is needed which stores the currently active edges.
The computation of the intersection point of a line segment and the scan line requires floating point multiplication, division, and addition, thus it is time consuming. Applying the incremental principle, however, we can also obtain the intersection point of the edge and a scan line from the intersection point with the previous scan line using a single, fixed-point addition (Figure 22.43).
Figure 22.43. Incremental computation of the intersections between the scan lines and the edges. Coordinate always increases with the reciprocal of the slope of the line.
When the incremental principle is exploited, we realize that coordinate of the intersection with an edge always increases by the same amount when scan line is incremented. If the edge endpoint having the larger coordinate is and the endpoint having the smaller coordinate is , then the increment of the coordinate of the intersection is , where and . This increment is usually not an integer, hence increment and intersection coordinate should be stored in non-integer, preferably fixed-point variables. An active edge is thus represented by a fixed-point increment , the fixed-point coordinate value of intersection , and the maximum vertical coordinate of the edge (). The maximum vertical coordinate is needed to recognize when the edge becomes inactive.
Scan lines are processed one after the other. First, the algorithm determines which edges become active for this scan line, that is, which edges have minimum coordinate being equal to the scan line coordinate. These edges are inserted into the active edge table. The active edge table is also traversed and those edges whose maximum coordinate equals to the scan line coordinate are removed (note that this way the lower end of an edge is supposed to belong to the edge, but the upper edge is not). Then the active edge table is sorted according to the coordinates of the edges, and the pixels between each pair of edges are filled. Finally, the coordinates of the intersections in the edges of the active edge table are prepared for the next scan line by incrementing them by the reciprocal of the slope .
FOReach of Put activated edges into the AET. 3
FOReach of the AET Remove deactivated edges from the AET. 6
Sort-AETSort according to . 9
FOReach pair of edges of the AET 10
FOReach in the AET Incremental principle. 13
The algorithm works scan line by scan line and first puts the activated edges to the active edge table. The active edge table is maintained by three operations. Operation
computes variables of an edge and inserts this structure into the table. Operation
removes an item from the table when the edge is not active any more . Operation
sorts the table in the ascending order of the value of the items. Having sorted the lists, every two consecutive items form a pair, and the pixels between the endpoints of each of these pairs are filled. Finally, the coordinates of the items are updated according to the incremental principle.
The three-dimensional visibility problem is solved in the screen coordinate system. We can assume that the surfaces are given as triangle meshes.
The z-buffer algorithm finds that surface for each pixel, where the coordinate of the visible point is minimal. For each pixel we allocate a memory to store the minimum coordinate of those surfaces which have been processed so far. This memory is called the z-buffer or the depth-buffer.
When a triangle of the surface is rendered, all those pixels are identified which fall into the interior of the projection of the triangle by a triangle filling algorithm. As the filling algorithm processes a pixel, the coordinate of the triangle point visible in this pixel is obtained. If this value is larger than the value already stored in the z-buffer, then there exists an already processed triangle that is closer than the current triangle in this given pixel. Thus the current triangle is obscured in this pixel and its colour should not be written into the raster memory. However, if the new value is smaller than the value stored in the z-buffer, then the current triangle is the closest so far, and its colour and coordinate should be written into the pixel and the z-buffer, respectively.
The z-buffer algorithm is then:
FOReach pixel Clear screen. 2
)3 maximum value after clipping 4
FOReach triangle Rendering. 5
FOReach pixel of triangle 6
DOcoordinate of that point which projects onto pixel 7
, colour of triangle
in this point)9
When the triangle is filled, the general polygon filling algorithm of the previous section could be used. However, it is worth exploiting the special features of the triangle. Let us sort the triangle vertices according to their coordinates and assign index 1 to the vertex of the smallest coordinate and index 3 to the vertex of the largest coordinate. The third vertex gets index 2. Then let us cut the triangle into two pieces with scan line . After cutting we obtain a “lower” triangle and an “upper” triangle. Let us realize that in such triangles the first (left) and the second (right) intersections of the scan lines are always on the same edges, thus the administration of the polygon filling algorithm can be significantly simplified. In fact, the active edge table management is not needed anymore, only the incremental intersection calculation should be implemented. The classification of left and right intersections depend on whether is on the right or on the left side of the oriented line segment from to . If is on the left side, the projected triangle is called left oriented, and right oriented otherwise.
When the details of the algorithm is introduced, we assume that the already re-indexed triangle vertices are
The rasterization algorithm is expected to fill the projection of this triangle and also to compute the coordinate of the triangle in every pixel (Figure 22.45).
Figure 22.45. A triangle in the screen coordinate system. Pixels inside the projection of the triangle on plane need to be found. The coordinates of the triangle in these pixels are computed using the equation of the plane of the triangle.
The coordinate of the triangle point visible in pixel is computed using the equation of the plane of the triangle (equation (22.1)):
where and . Whether the triangle is left oriented or right oriented depends on the sign of the coordinate of the normal vector of the plane. If is negative, then the triangle is left oriented. If it is negative, then the triangle is right oriented. Finally, when is zero, then the projection maps the triangle onto a line segment, which can be ignored during filling.
Using the equation of the plane, function expressing the coordinate corresponding to pixel is:
According to the incremental principle, the evaluation the coordinate can take advantage of the value of the previous pixel:
Since increment is constant for the whole triangle, it needs to be computed only once. Thus the calculation of the coordinate in a scan line requires just a single addition per pixel. The coordinate values along the edges can also be obtained incrementally from the respective values at the previous scan line (Figure 22.46). The complete incremental algorithm which renders a lower left oriented triangle is as follows (the other cases are very similar):
1 Normal vector. 2 increment. 3 4 5
)One scan line. 8
IFVisibility test. 9
)10 11 12 Next scan line.
This algorithm simultaneously identifies the pixels to be filled and computes the coordinates with linear interpolation. Linear interpolation requires just a single addition when a pixel is processed. This idea can also be used for other features as well. For example, if the colour of the triangle vertices are available, the colour of the internal points can be set to provide smooth transitions applying linear interpolation. Note also that the addition to compute the feature value can also be implemented by a special purpose hardware. Graphics cards have a great number of such interpolation units.
If a pixel of the image corresponds to a given object, then its neighbours usually correspond to the same object, that is, visible parts of objects appear as connected territories on the screen. This is a consequence of object coherence and is called image coherence.
Figure 22.47. Polygon-window relations: (a) distinct; (b) surrounding ; (c) intersecting; (d) contained.
If the situation is so fortunate—from a labor saving point of view—that a polygon in the object scene obscures all the others and its projection onto the image plane covers the image window completely, then we have to do no more than simply fill the image with the colour of the polygon. If no polygon edge falls into the window, then either there is no visible polygon, or some polygon covers it completely (Figure 22.47). The window is filled with the background colour in the first case, and with the colour of the closest polygon in the second case. If at least one polygon edge falls into the window, then the solution is not so simple. In this case, using a divide-and-conquer approach, the window is subdivided into four quarters, and each subwindow is searched recursively for a simple solution.
The basic form of the algorithm called Warnock-algorithm rendering a rectangular window with screen coordinates (lower left corner) and (upper right corner) is this:
IFor Is the window larger than a pixel? 2
IFat least one edge projects onto the window 3
THENNon-trivial case: Subdivision and recursion. 4
ELSETrivial case: window is homogeneous. 9 the polygon visible in pixel 10
IFno visible polygon 11
THENfill rectangle with the background colour 12 tab/≥
ELSEfill rectangle with the colour of
Note that the algorithm can handle non-intersecting polygons only. The algorithm can be accelerated by filtering out those distinct polygons which can definitely not be seen in a given subwindow at a given step. Furthermore, if a surrounding polygon appears at a given stage, then all the others behind it can be discarded, that is all those which fall onto the opposite side of it from the eye. Finally, if there is only one contained or intersecting polygon, then the window does not have to be subdivided further, but the polygon (or rather the clipped part of it) is simply drawn. The price of saving further recurrence is the use of a scan-conversion algorithm to fill the polygon.
If we simply scan convert polygons into pixels and draw the pixels onto the screen without any examination of distances from the eye, then each pixel will contain the colour of the last polygon falling onto that pixel. If the polygons were ordered by their distance from the eye, and we took the farthest one first and the closest one last, then the final picture would be correct. Closer polygons would obscure farther ones — just as if they were painted an opaque colour. This method is known as the painter's algorithm.
The only problem is that the order of the polygons necessary for performing the painter's algorithm is not always simple to compute. We say that a polygon does not obscure another polygon if none of the points of is obscured by . To have this relation, one of the following conditions should hold
Polygons and do not overlap in range, and the minimum coordinate of polygon is greater than the maximum coordinate of polygon .
The bounding rectangle of on the plane does not overlap with that of .
Each vertex of is farther from the viewpoint than the plane containing .
Each vertex of is closer to the viewpoint than the plane containing .
The projections of and do not overlap on the plane.
All these conditions are sufficient. The difficulty of their test increases, thus it is worth testing the conditions in the above order until one of them proves to be true. The first step is the calculation of an initial depth order. This is done by sorting the polygons according to their maximal value into a list. Let us first take the polygon which is the last item on the resulting list. If the range of does not overlap with any of the preceding polygons, then is correctly positioned, and the polygon preceding can be taken instead of for a similar examination. Otherwise overlaps a set of polygons. The next step is to try to check whether does not obscure any of the polygons in , that is, that is at its right position despite the overlapping. If it turns out that obscures for a polygon in the set , then has to be moved behind in the list, and the algorithm continues stepping back to . Unfortunately, this algorithm can run into an infinite loop in case of cyclic overlapping. Cycles can be resolved by cutting. In order to accomplish this, whenever a polygon is moved to another position in the list, we mark it. If a marked polygon is about to be moved again, then — assuming that is a part of a cycle — is cut into two pieces by the plane of , so that does not obscure and does not obscure , and only is moved behind .
Binary space partitioning divides first the space into two halfspaces, the second plane divides the first halfspace, the third plane divides the second halfspace, further planes split the resulting volumes, etc. The subdivision can well be represented by a binary tree, the so-called BSP-tree illustrated in Figure 22.48. The kd-tree discussed in Subsection 22.6.2 is also a special version of BSP-trees where the splitting planes are parallel with the coordinate planes. The BSP-tree of this subsection, however, uses general planes.
The first splitting plane is associated with the root node of the BSP-tree, the second and third planes are associated with the two children of the root, etc. For our application, not so much the planes, but rather the polygons defining them, will be assigned to the nodes of the tree, and the set of polygons contained by the volume is also necessarily associated with each node. Each leaf node will then contain either no polygon or one polygon in the associated set.
algorithm for creating the BSP-tree for a set of polygons uses the following notations. A node of the binary tree is denoted by , the polygon associated with the node by , and the two child nodes by and , respectively. Let us consider a splitting plane of normal and place vector . Point belongs to the positive (right) subspace of this plane if the sign of scalar product is positive, otherwise it is in the negative (left) subspace. The BSP construction algorithm is:
1 Create a new 2
IFis empty or contains just a single polygon 3
THEN4 5 6
ELSEone polygon from list 7 Remove polygon from list 8 polygons of which overlap with the positive subspace of 9 polygons of which overlap with the negative subspace of 10
The size of the BSP-tree, i.e. the number of polygons stored in it, is on the one hand highly dependent on the nature of the object scene, and on the other hand on the “choice strategy” used when one polygon from list is selected.
Having constructed the BSP-tree the visibility problem can be solved by traversing the tree in the order that if a polygon obscures another than it is processed later. During such a traversal, we determine whether the eye is at the left or right subspace at each node, and continue the traversal in the child not containing the eye. Having processed the child not containing the eye, the polygon of the node is drawn and finally the child containing the eye is traversed recursively.
22.7-1 Implement the complete Bresenham algorithm that can handle not only moderately ascending but arbitrary line segments.
22.7-2 The presented polygon filling algorithm tests each edges at a scan line whether it becomes active here. Modify the algorithm in a way that such tests are not executed at each scan line, but only once.
22.7-3 Implement the complete z-buffer algorithm that renders left/righ oriented, upper/lower triangles.
22.7-4 Improve the presented Warnock-algorithm and eliminate further recursions when only one edge is projected onto the subwindow.
22.7-5 Apply the BSP-tree for discrete time collision detection.
22.7-6 Apply the BSP-tree as a space partitioning structure for ray tracing.
Ray tracing renderer
Implement a rendering system applying the ray tracing algorithm. Objects are defined by triangle meshes and quadratic surfaces, and are associated with diffuse reflectivities. The virtual world also contains point light sources. The visible colour of a point is proportional to the diffuse reflectivity, the intensity of the light source, the cosine of the angle between the surface normal and the illumination direction (Lambert's law), and inversely proportional with the distance of the point and the light source. To detect whether or not a light source is not occluded from a point, use the ray tracing algorithm as well.
Continuous time collision detection with ray tracing
Using ray tracing develop a continuous time collision detection algorithm which computes the time of collision between a moving and rotating polyhedron and a still half space. Approximate the motion of a polygon vertex by a uniform, constant velocity motion in small intervals .
Incremental rendering system
Implement a three-dimensional renderer based on incremental rendering. The modelling and camera transforms can be set by the user. The objects are given as triangle meshes, where each vertex has colour information as well. Having transformed and clipped the objects, the z-buffer algorithm should be used for hidden surface removal. The colour at the internal points is obtained by linear interpolation from the vertex colours.
The elements of Euclidean, analytic and projective geometry are discussed in the books of Maxwell [ 238 ], [ 237 ] and Coxeter [ 76 ]. The application of projective geometry in computer graphics is presented in Herman's dissertation [ 163 ] and Krammer's paper [ 205 ]. Curve and surface modelling is the main focus of computer aided geometric design (CAD, CAGD), which is discussed by Gerald Farin [ 103 ], and Rogers and Adams [ 289 ]. Geometric models can also be obtained measuring real objects, as proposed by reverse engineering methods [ 335 ]. Implicit surfaces can be studied by reading Bloomenthal's work [ 45 ]. Solid modelling with implicit equations is also booming thanks to the emergence of functional representation methods (F-Rep), which are surveyed at http://cis.k.hosei.ac.jp/~F-rep. Blobs have been first proposed by Blinn [ 44 ]. Later the exponential influence function has been replaced by polynomials [ 347 ], which are more appropriate when roots have to be found in ray tracing.
Geometric algorithms give solutions to geometric problems such as the creation of convex hulls, clipping, containment test, tessellation, point location, etc. This field is discussed in the books of Preparata and Shamos [ 278 ] and of Marc de Berg [ 81 ], [ 82 ]. The triangulation of general polygons is still a difficult topic despite to a lot of research efforts. Practical triangulation algorithms run in [ 82 ], [ 298 ], [ 355 ], but Chazelle [ 62 ] proposed an optimal algorithm having linear time complexity. The presented proof of the two ears theorem has originally been given by Joseph O'Rourke [ 260 ]. Subdivision surfaces have been proposed and discussed by Catmull and Clark [ 59 ], Warren and Weimer [ 338 ], and by Brian Sharp [ 301 ], [ 300 ]. The butterfly subdivision approach has been published by Dyn et al. [ 95 ]. The Sutherland–Hodgeman polygon clipping algorithm is taken from [ 309 ].
Collision detection is one of the most critical problems in computer games since it prevents objects to fly through walls and it is used to decide whether a bullet hits an enemy or not. Collision detection algorithms are reviewed by Jiménez, Thomas and Torras [ 183 ].
Glassner's book [ 133 ] presents many aspects of ray tracing algorithms. The 3D DDA algorithm has been proposed by Fujimoto et al. [ 122 ]. Many papers examined the complexity of ray tracing algorithms. It has been proven that for objects, ray tracing can be solved in time [ 81 ], [ 312 ], but this is theoretical rather than practical result since it requires memory and preprocessing time, which is practically unacceptable. In practice, the discussed heuristic schemes are preferred, which are better than the naive approach only in the average case. Heuristic methods have been analyzed by probabilistic tools by Márton [ 312 ], who also proposed the probabilistic scene model used in this chapter as well. We can read about heuristic algorithms, especially about the efficient implementation of the kd-tree based ray tracing in Havran's dissertation [ 158 ]. A particularly efficient solution is given in Szécsi's paper [ 310 ].
The probabilistic tools, such as the Poisson point process can be found in the books of Karlin and Taylor [ 189 ] and Lamperti [ 211 ]. The cited fundamental law of integral geometry can be found in the book of Santaló [ 295 ].
The geoinformatics application of quadtrees and octrees are also discussed in chapter 16 of this book.
The algorithms of incremental image synthesis are discussed in many computer graphics textbooks [ 114 ]. Visibility algorithms have been compared in [ 309 ], [ 311 ]. The painter's algorithm has been proposed by Newell et al. [ 255 ]. Fuchs examined the construction of minimal depth BSP-trees [ 121 ]. The source of the Bresenham algorithm is [ 48 ].
Graphics cards implement the algorithms of incremental image synthesis, including transformations, clipping, z-buffer algorithm, which are accessible through graphics libraries (OpenGL, DirectX). Current graphics hardware includes two programmable processors, which enables the user to modify the basic rendering pipeline. Furthermore, this flexibility allows non graphics problems to be solved on the graphics hardware. The reason of using the graphics hardware for non graphics problems is that graphics cards have much higher computational power than CPUs. We can read about such algorithms in the ShaderX or in the GPU Gems [ 106 ] series or visiting the http://www.gpgpu.org web page.