
Tutorial 0 - Theory Behind OpenGL Coordinate System, Camera, Local and World Coordinates, Far and Near Clipping Planes.

Written by staff. Contact us to submit your article for review.
Sep 13, 2016

This tutorial looks longer than the Curious Case of Benjamin Button movie script. Don't get discouraged though. This is fundamental knowledge. Before I started writing tutorials about the OpenGL API, I decided to take some time off and write an additional pre-production (if you will) tutorial about 3D graphics fundamentals.

In other words this tutorial puts the pieces of the 3D graphics puzzle together. This could be really useful for those who are starting out to learn about computer graphics and how it all works.


Not many people would enjoy listings upon listings of source code presented without any context or general explanation of what the HECK is going on. We'll make things clear in this tutorial, but if you already have a good grasp of what perspective projection is and what back-face culling is, you would probably want to skip this part and go directly to Download my OpenGL Base Code page.

I do, however, still suggest reading it, because it not only contains 3D fundamentals but also explains how this series of OpenGL tutorials is structured and why it is here. More importantly, some really basic OpenGL-related information (like the general naming convention that OpenGL uses for functions and variables) is explained, and if you don't understand it, things will get tough for you.

I also want to add that writing tutorials makes me learn something as well, since I'm self-taught and not really the formal-education kind of person. Other than that, a lot of this information is available from the numerous OpenGL books. If you notice any mistakes or errors in the code or anywhere else throughout the tutorial, please let me know. This is going to be a really quick overview of 3D terms and techniques, but I will try to add things to it as I listen to readers' suggestions.

The state of OpenGL tutorials

OpenGL graphics programming is an extensive subject. Not many programmers learn even 50% of the whole story. There are already plenty of tutorials written on it. So why bother writing another one? Well, there may be thousands of other tutorials about OpenGL API, but let's ask ourselves, how many OpenGL tutorial pages can fit on the first top 10 results page in Google? That's right, only 10! And these 10 are considered the top 10 by the OpenGL programming community.

One of my goals is to one day appear in the top 10 results in Google. Call me Ishmael. Okay, okay... The real reason is simply I felt the lack of detailed information in those tutorials, the kind you would see in a book, only this time you don't have to buy anything (except of course, some of the great OpenGL books I advertise here on this very website, which I actually do recommend getting -- half of this stuff I learned from those books). It seems like people just want to write tutorials just for the sake of having a tutorial section on their site.

Writing tutorials (or any other documentation or books) is hard work and it is time-consuming. Not everyone has the opportunity and patience to write a few solid tutorials, even me. But I try. I also feel that there's a great deal of demand for 3D-tutorials (be it D3D or GL) with the recent advances in computer graphics hardware (What's your weapon of mass destruction? Nvidia, ATI, anyone?)

I didn't pick OpenGL because it was "better" or more portable than D3D (and is it really anymore?) but because this is what I know about the most. In the future I will try to document D3D as well but only if there's enough demand (send me an e-mail to let me know you're interested in D3D, if you are, whoever you are). Well, I think it's time to actually start writing something useful.

3D Basics Everyone Should Know Before Touching OpenGL

In this part I will cover 3D graphics in general, and most of the following topics are not constrained to OpenGL alone. So what exactly is 3D, and how can it be represented to the viewer on the computer screen? To describe the idea behind rendering 3D objects on the screen, it's best for me to use a 3D object. Let's examine the following image of a wire-framed 3D cube.

Wireframe cube in 3D space with OpenGL/gl

You see, for your brain 3D objects are so common that by looking at this picture you will instantly recognize a 3D shape even though it's nothing more than a collection of 12 2D lines connected to each other with specific angles between them. And yet it's hard to think of this image as being "flat". 3D graphics on the visual level is (mostly) all about rendering objects to the screen.

The question is what are the main requirements to render an object so that you will be able to correctly recognize it as a 3D object and not just a collection of lines or perhaps polygons? Obviously, the idea is to render objects to the screen the way you would see them in real life. And how do you see objects in real life? This is where the meaning of perspective comes from.

In the pre-computer ages artists had used the same techniques for painting their masterpieces that today's 3D software is using for creating 3D images. The point behind perspective is that all objects farther away from the viewer look smaller than objects closer to the viewer, and ultimately they disappear into the vanishing point.

The concept of a vanishing point in art is also relevant in rendering 3D computer graphics

This is true for most 3D graphics applications. Now let's take a look at the OpenGL coordinate system we will be using. It is the so-called 3D Cartesian coordinate system. As you can see, in addition to the x and y-axis known from 2D graphics, we have the z-axis, which extends into negative space from the center of the screen away from the viewer and into positive space from the center of the screen towards the viewer. This image visually mimics this principle.

The 3D Cartesian coordinate system used by OpenGL.

Perspective and Orthographic Projections

As we take little steps towards the end of this tutorial, I think it's the right time to explain projection. There are actually two types of projection: Perspective Projection and Orthographic Projection (described shortly). First I want to talk about Perspective Projection because I've already explained perspective. Objects that you're going to render will actually be what we might call "projected" to the screen.

What I mean by projection is the actual conversion from 3D coordinates (usually vertices of objects) to the 2D flat surface of the screen. Since the computer screen has only two dimensions, we somehow have to display the 3D objects on the 2D screen. And that's precisely what projection does for us. Perspective projection works as follows. I will take a single point as an example. Imagine we have a point with coordinates (5, -3, 2) on the x, y and z-axis respectively, and we want to project it to the screen. We do it with the following formula. Assume we have a structure POINT3D containing the coordinates of the point, initialized with the mentioned values for this example.

    // initialize point
    POINT3D point = { 5, -3, 2 };

    // find the right position on the screen in 2D coordinates
    int x2d = HALFWIDTH + point.x * ViewingDistance / point.z;
    int y2d = HALFHEIGHT + point.y * ViewingDistance / point.z;

    // project the 3D point to the screen
    Pixel(x2d, y2d);

Let's take the formula apart. As you already know, usually in 2D all coordinates are based on the 4th quadrant in 2D Cartesian Coordinate system. That means that (0, 0) is at the upper left corner of the screen. In 3D graphics, we want our view, or the camera to be exact, (camera is explained a little further into this tutorial) to be located as in the following image, so that we're always looking straight down the negative space of the z-axis.

3D camera view and its relationship to the 2D Cartesian coordinate system, with an example of the viewing frustum and clipping volume. Note that by default the camera is looking down the negative z-axis.

As you can see, if we had a 3D point at (0, 0, -16) it would be exactly in the center of the screen. A little modification is required here. Take a look at the projection formula again. There we're adding halves of the screen resolution first to center all results. We're in fact translating the point from (0, 0) to (halfwidth, halfheight) on the screen.

If we're in 640x480 resolution we would be translating the point to (320, 240). Take the constant ViewingDistance out of the equation for a second. And you will realize that the second part of the formula is just the relationship between "X and Z" for x2d and "Y and Z" for y2d. This is the most important idea behind perspective-projected objects.

As you recall, objects that appear farther from the viewer are smaller, and this is the exact relationship between the 2D points and the perspective. It is achieved by dividing both the horizontal and vertical coordinates by how far away the object is. However, there is a problem. By merely dividing the x and y coordinates by depth (the z coordinate) we only get the ratio between the depth and the vertical/horizontal position of the point. What we need is how they are actually related to the Viewing Distance and Viewing Volume. These two terms are explained below.

The Viewing Volume is the space between the near clipping plane (or the viewing plane) and the far clipping plane as seen on the second picture below. So, back to our equation for a second, we simply multiply x and y by ViewingDistance to get the right relationship between the Viewing Volume and the X and Y coordinates. Simple as that.

Viewing Distance is closely related to the Viewing Volume. The longer the viewing distance, the narrower the line of sight and therefore the smaller the viewing volume. The good news is that we don't have to worry about all of this in OpenGL, since everything is done behind the scenes. However, you still need to understand these terms to understand why images appear the way they do on the screen, and I just wanted to explain the basics of perspective projection. The above formula could be used in a software 3D renderer, but we're not interested in that at this moment.
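To make the formula a little more concrete, here is a minimal sketch of it as a C function. The names POINT2D and project_point, the 640x480 screen size and the viewing distance used below are assumptions made for illustration only. The sketch follows the formula's own convention, where z is treated as the depth in front of the viewer.

```c
/* Hypothetical types and constants, named after the ones used in the text. */
typedef struct { double x, y, z; } POINT3D;
typedef struct { int x, y; } POINT2D;

#define HALFWIDTH  320   /* half of an assumed 640-pixel-wide screen  */
#define HALFHEIGHT 240   /* half of an assumed 480-pixel-high screen */

/* Project a 3D point to 2D screen coordinates using the formula above.
   Returns 0 and leaves *out untouched when z is zero, because the
   division would be undefined. Returns 1 on success. */
int project_point(POINT3D p, double viewing_distance, POINT2D *out)
{
    if (p.z == 0.0)
        return 0;

    out->x = (int)(HALFWIDTH  + p.x * viewing_distance / p.z);
    out->y = (int)(HALFHEIGHT + p.y * viewing_distance / p.z);
    return 1;
}
```

With an assumed viewing distance of 256, the point (100, 50, 256) lands at (420, 290) on the screen. The same point pushed back to z = 512 lands at (370, 265), which is closer to the screen center at (320, 240). That is the "farther means smaller" effect in numbers.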

In conclusion, here's how a whole object (as opposed to the pixel in previous example) would be projected onto the screen in theory. At the upper right corner of this image there is a real object (cube) in space. I tried to make the projected version of the cube as it appears on the screen as close as possible to what it would be like, but I'm sure this is wrong. Just keep in mind that the whole object is projected on the flat screen pixel by pixel (and polygon by polygon on a higher scale).

Projection is a principle of converting a 3-dimensional object in world space into the screen space (flat 2D) coordinates, for displaying on the computer display.

I talked about Viewing Volume and how it is related to the perspective projection equation. But what is Viewing Volume? The Viewing Volume is also known as the Clipping volume or the Frustum. Here's the visual representation of the viewing volume.

Viewing volume represented by the near clipping plane and the far clipping plane. The near plane also forms the viewing plane (what's drawn on the screen).

There are two planes: the viewing plane and the far clipping plane. The viewing plane is actually the screen, and the far plane indicates how far you can "see"; whatever is behind the far clipping plane will not be visible. The viewing volume is the space between those two planes. The viewing volume is sometimes called the clipping volume because you usually want to clip your polygons against it.
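As a rough sketch of what "being inside the viewing volume" means, the function below tests a single point against a simple symmetric volume. The ViewVolume structure and all of its field names are hypothetical, and depth is measured as a positive distance in front of the camera. OpenGL does its own clipping internally, so this is only meant to illustrate the idea.

```c
typedef struct { double x, y, z; } Point3;

/* Hypothetical description of a viewing volume (not an OpenGL type). */
typedef struct {
    double near_dist;    /* distance from the camera to the viewing plane */
    double far_dist;     /* distance from the camera to the far plane     */
    double half_width;   /* half the width of the viewing plane           */
    double half_height;  /* half the height of the viewing plane          */
} ViewVolume;

/* Returns 1 when the point lies inside the viewing volume, 0 otherwise. */
int point_in_view_volume(Point3 p, const ViewVolume *v)
{
    double scale;

    /* Closer than the viewing plane or beyond the far clipping plane. */
    if (p.z < v->near_dist || p.z > v->far_dist)
        return 0;

    /* The side walls of the volume widen linearly with depth. */
    scale = p.z / v->near_dist;
    if (p.x < -v->half_width * scale || p.x > v->half_width * scale)
        return 0;
    if (p.y < -v->half_height * scale || p.y > v->half_height * scale)
        return 0;

    return 1;
}
```

For a volume with the viewing plane at distance 1, the far plane at distance 100 and a viewing plane half-width of 1, a point at depth 50 is visible as long as its x coordinate stays between -50 and 50. A point at depth 150 is rejected no matter where it is.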

Orthographic Projection

As I mentioned before, there is another type of projection, which is the Orthographic Projection. This type of projection is a poor fit for the kind of 3D games and real-time applications we are after, since it ignores the z-axis coordinate when sizing objects. In other words, if you draw a bunch of trees close to and far away from the view, they will all appear the same size. Orthographic projection is used in technical design software, and OpenGL supports it as well. In this series of OpenGL tutorials we will always be using the perspective projection.
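The difference between the two projections can be sketched with two tiny C functions that compute how tall an object appears on the screen. Both function names and the scale parameter are made up for this illustration.

```c
/* Apparent on-screen height under perspective projection:
   the real height is scaled by viewing_distance / depth. */
double perspective_size(double height, double depth, double viewing_distance)
{
    return height * viewing_distance / depth;
}

/* Apparent on-screen height under orthographic projection:
   depth is accepted only to make the point that it is ignored. */
double orthographic_size(double height, double depth, double scale)
{
    (void)depth;  /* deliberately unused */
    return height * scale;
}
```

A tree of height 10 at depth 100 with a viewing distance of 200 appears 20 units tall under perspective projection, and only 5 units tall at depth 400. Under orthographic projection with a scale of 2 it appears 20 units tall at both depths.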

The 3D Camera

At this point I should explain what the camera is. The camera is always located at the origin of the virtual "view". Note, however, that it is NOT NECESSARILY located at the origin of the COORDINATE SYSTEM, since you can move the camera around and transform it to anywhere in the world.

The camera and the view are basically the same thing. The camera is only mentioned to represent a virtual viewing point; there is actually no physical camera anywhere around. I already talked about it, but it is important to understand that there is some space between the origin of the camera and the viewing plane, as you saw in the previous image. That space is the VIEWING DISTANCE.

If you look straight ahead, for example, you are considered to be looking down the camera's z-axis into the negative z space, in 3D terms. Camera rotation is possible around all 3 axes, as you would expect, and is made even easier for you by OpenGL. Camera rotation is responsible for moving the view, and it's what happens when you move your virtual head around with the mouse or arrow keys in a 3D first-person shooter. Let's examine the camera a little closer.

The camera, like any other object in space, has 2 coordinate systems. The two are the Local Coordinate System and the World Coordinate System. The local coordinates are the camera's rotation in degrees around each of its LOCAL x, y and z-axes and its actual displacement within the local coordinate system.

The world coordinates specify the camera's position in the world. For example, when you walk around in a 3D FPS-shooter kind of game you are actually moving the camera's world coordinates and when you look around you change the camera's local coordinates.

It is also possible to move the camera using its local coordinates, by translating them to the new location, but only AFTER rotation has been performed. Rotation is also done in local coordinates around (0,0,0), so if you move the camera to, say, (0, 5, 0) before rotating, it will not rotate correctly: its center is displaced, and that displacement is taken into account during the rotation.

Remember this rule: always rotate around the local center (0,0,0) first, and only then translate the model to some world coordinate. This is the proper order of doing basic 3D transforms. There are cases when you need to translate first and then rotate, but they are not as common and are usually reserved for more complex rotations. If this sounds confusing, don't worry. It will all settle down the more you study and actually code a few basic rotations and transforms in OpenGL, if you haven't already. Here's how the camera's coordinates are transformed.
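Why the order matters can be sketched in a few lines of C. To keep the example free of trigonometry, the rotation below is a fixed quarter turn (90 degrees) around the z-axis. All type and function names here are hypothetical and are not part of OpenGL.

```c
typedef struct { double x, y, z; } Vec3;

/* Rotate 90 degrees counter-clockwise around the z-axis, about (0,0,0). */
Vec3 rotate_z_quarter_turn(Vec3 v)
{
    Vec3 r;
    r.x = -v.y;
    r.y = v.x;
    r.z = v.z;
    return r;
}

/* Move a point by the given offset. */
Vec3 translate(Vec3 v, Vec3 offset)
{
    Vec3 r;
    r.x = v.x + offset.x;
    r.y = v.y + offset.y;
    r.z = v.z + offset.z;
    return r;
}

/* The usual order: rotate around the local center, then place in the world. */
Vec3 rotate_then_translate(Vec3 v, Vec3 world_pos)
{
    return translate(rotate_z_quarter_turn(v), world_pos);
}

/* The reverse order: the displacement itself gets rotated as well. */
Vec3 translate_then_rotate(Vec3 v, Vec3 world_pos)
{
    return rotate_z_quarter_turn(translate(v, world_pos));
}
```

Take a vertex at (1, 0, 0) and a world position of (0, 5, 0). Rotating first gives (0, 1, 0), and translating then gives (0, 6, 0), so the vertex has turned in place around the object's own center. Translating first gives (1, 5, 0), and rotating that gives (-5, 1, 0), so the whole object has swung around the world origin instead.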

The camera, local coordinates of the camera and world position of the camera.

If you understand this so far, that's good. Now, let's move on to object rotation basics. This is exactly the same as demonstrated in the camera rotation part of the above image. The only difference is that we're not viewing the world FROM that object, but are in fact OBSERVING that object from the current camera position. This is the way an object is rotated around all of the 3 possible axes. When we get down to actually doing it in the following tutorials, I will make it more clear, so don't worry if you don't get something at this moment.

Just the same way it is with the camera, the objects also have two coordinate systems and as you might have guessed already, the objects are positioned according to the LOCAL and WORLD coordinate systems. The local coordinates are usually used for rotating the object and the world coordinates are used for positioning the object in the world or, say, in a 3D level.

As you add objects and static polygons (e.g. walls, terrain, etc.) to your 3D world, you want to clip all of the polygons that are not located in the camera's viewing volume. You also want to clip off the parts of polygons that cross the edge of the view volume, against the bounding box of the screen. OpenGL clips polygons against the viewing volume for us. Another issue associated with drawing polygons is that you don't want to draw polygons whose back faces (or sides) are turned towards the camera.

Imagine a textured polygon which is rotated by 180 degrees so its "back" is facing us. Let's also assume that that polygon is a part of a bigger structure, a wall for example. Usually you will never want to see what's "behind" the wall. Have you ever wanted to see what's behind your room's wallpaper? I surely hope not. So the point is, if you rotate a textured polygon, its coordinates are reversed judged against the camera view and you never want to see that anyway and that space is usually covered with another side of the wall, so why bother drawing it? That's right, there is no reason to and a technique called Back-face Culling comes to our help.

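Back-face culling works this way: it calculates the normal of the polygon (a normal is a vector pointing straight out of the polygon at a 90-degree angle to its surface, and is very common in 3D graphics), and if the normal is pointing in the same direction as the camera is looking, the surface of that polygon is not rendered, as illustrated in this image. In OpenGL itself, which side counts as the front is determined by the order (winding) of the polygon's vertices as they appear on the screen, but the effect is the same.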

Polygon normals and their use for back face culling (or front face culling.)
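The normal-based test described above can be sketched in C as follows. This is an illustration under stated assumptions, not how OpenGL implements culling: the function names are invented, triangles are assumed to list their vertices counter-clockwise when seen from the front, and the check uses the vector from the camera to the triangle rather than a single fixed view direction.

```c
typedef struct { double x, y, z; } Vec3;

static Vec3 vec_sub(Vec3 a, Vec3 b)
{
    Vec3 r;
    r.x = a.x - b.x;
    r.y = a.y - b.y;
    r.z = a.z - b.z;
    return r;
}

static Vec3 vec_cross(Vec3 a, Vec3 b)
{
    Vec3 r;
    r.x = a.y * b.z - a.z * b.y;
    r.y = a.z * b.x - a.x * b.z;
    r.z = a.x * b.y - a.y * b.x;
    return r;
}

static double vec_dot(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Normal of a triangle with counter-clockwise vertices (not normalized). */
Vec3 triangle_normal(Vec3 a, Vec3 b, Vec3 c)
{
    return vec_cross(vec_sub(b, a), vec_sub(c, a));
}

/* Returns 1 when the triangle shows its back to the camera, i.e. its
   normal points the same general way as the line of sight towards it. */
int is_back_face(Vec3 a, Vec3 b, Vec3 c, Vec3 camera_pos)
{
    Vec3 line_of_sight = vec_sub(a, camera_pos);
    return vec_dot(triangle_normal(a, b, c), line_of_sight) >= 0.0;
}
```

With the camera at the origin looking down the negative z-axis, the triangle (-1, -1, -5), (1, -1, -5), (0, 1, -5) has the normal (0, 0, 4), which points back at the camera, so it is a front face and gets drawn. Swapping the last two vertices flips the normal to (0, 0, -4), and the triangle is culled.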

This technique was so common among the older 3D engines that the developers of OpenGL decided to take it into consideration and do all the dirty work for us in hardware to speed up the pipeline, which is in fact the next topic of this tutorial.

3D Graphics Pipeline

In case you're all wondering what's up with all these pipelines everyone is talking about, a pipeline is actually nothing more than an order of relatively distinctive operations. At this stage it is early to talk about what the operations are. Depending on what kind of program you're writing, be it a 3D FPS engine or a flight simulator, the pipeline might actually change into different forms that will work the best for a given task. And therefore I'm not going to describe it here in detail, but I will as soon as we get some tasks to do in further tutorials.

OpenGL Variable and Function Naming Conventions

In conclusion I want to say a few words on this topic. OpenGL was made for use with various environments, not just Windows. You can always find more information in the numerous OpenGL books that are reasonably affordable, for a technical book, considering the amount of knowledge you would have gained by the time you finished a book. In this section I explain naming conventions for both OpenGL functions and variables.

Although you don't have to use OpenGL-defined types, I still feel obligated to describe them here so that anyone who wants their software to be platform-independent understands what this all means. Well, let's see. OpenGL has a number of predefined types. If you never plan on being platform-independent, it might be simplest to use plain C types such as int, float and double. However, if that's not the case, OpenGL has definitions that will work on the current system, whatever the system is. All you have to do is add GL in front of the standard C types. For example, if you want to use a floating-point type, use GLfloat instead of C's float, and if you want to use an int, use GLint. That works for the rest of the normal C types as well.

If you want to use an unsigned value, just add a "u" between GL and the type, like so: GLuint is an unsigned integer. There is also GLboolean, which holds GL_TRUE or GL_FALSE and plays the role of a boolean (it is typically a small unsigned integer type rather than C's own bool). GLbitfield is used to define binary fields. A little less obvious are the clamp types, GLclampf and GLclampd, for float and double values respectively. The name means that the value is clamped to the range from 0.0 to 1.0, and these types are used for things like color components. There are no types for pointers. Pointers are defined the usual way. For instance, this is an array of pointers to int: GLint *i[16];

Each OpenGL function has a neat naming convention and its format is:

<library><function name><number of arguments><type of arguments>

To demonstrate this on a real name function I will use the glVertex3f function.

    glVertex3f(0.0f, 0.0f, 0.0f);
    | |     ||
    | |     ||
    | |     |+- f means all parameters are floats
    | |     |
    | |     +- 3 is the number of parameters
    | |
    | +- Vertex is the name of the function that renders a 3D point (or a vertex)
    |
    +- gl specifies the OpenGL library
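The same breakdown can be sketched as a small C function that splits a function name into its parts. This is purely an illustration of the convention: GLNameParts and split_gl_name are invented names, only the "gl" prefix is handled, and only a single digit followed by one of the type letters b, s, i, f or d is treated as an argument suffix.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical holder for the pieces of an OpenGL function name. */
typedef struct {
    char library[4];    /* always "gl" in this sketch              */
    char function[32];  /* e.g. "Vertex"                           */
    int  arg_count;     /* 0 when the name has no argument suffix  */
    char arg_type;      /* '\0' when the name has no type suffix   */
} GLNameParts;

/* Splits a name such as "glVertex3f". Returns 1 on success, or 0 when
   the name lacks the "gl" prefix or its function part does not fit. */
int split_gl_name(const char *name, GLNameParts *out)
{
    size_t len = strlen(name);
    size_t end = len;   /* one past the last character of the function part */
    int count = 0;
    char type = '\0';

    if (len < 3 || strncmp(name, "gl", 2) != 0)
        return 0;

    /* A trailing "<digit><type letter>" is taken as the argument suffix. */
    if (len >= 5 && strchr("bsifd", name[len - 1]) != NULL
                 && isdigit((unsigned char)name[len - 2])) {
        type = name[len - 1];
        count = name[len - 2] - '0';
        end = len - 2;
    }

    if (end - 2 >= sizeof out->function)
        return 0;

    strcpy(out->library, "gl");
    memcpy(out->function, name + 2, end - 2);
    out->function[end - 2] = '\0';
    out->arg_count = count;
    out->arg_type = type;
    return 1;
}
```

Passing "glVertex3f" yields the library "gl", the function "Vertex", an argument count of 3 and the type 'f'. Passing "glClear" yields the function "Clear" with no argument suffix at all.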

The last two parts of the name (the number and the type of arguments) are mostly encountered in the functions that are responsible for drawing primitives. Many other functions are usually used in this form:

<library><function name>

Final words

Well, what can I say, this has been a long read, but this isn't even close to the full picture. I tried, however, to cover the most general topics that came to my mind. This should definitely make it easier for beginners to read the rest of the tutorials.

Hope the illustrations helped you in some way to understand the described topics better. Now, sit tight and wait for the next tutorials which will actually put what's been said in here to action! Feedback and suggestions are welcome.
