libogc/GX: Difference between revisions

From devkitPro


'''THIS ARTICLE IS A WORK IN PROGRESS AND IS CURRENTLY NOT TO BE CONSIDERED AN AUTHORITATIVE SOURCE OF INFORMATION. YOU HAVE BEEN WARNED.'''


==Preface==


Some concepts in this article require prerequisite knowledge of 3D programming in general, the GX API specifically, or both. The NeHe OpenGL lessons are an excellent supplementary resource for learning the specifics of OpenGL and can work hand-in-hand with the OpenGL textbooks. libogc also has lessons 1 through 10 converted to the GX and gu APIs, so cross-referencing the source code for those lessons can help you understand what is going on.

You'll ''definitely'' need to have a handle on how to program in C, especially concerning things such as floating-point numbers and pointers, both of which are used heavily. The functions, compiler directives and other parts of the API are viewable in the Doxygen pages. Some functions have real documentation with them (the most often-used ones usually), but almost all of the GX Doxygen listings have no accompanying description, and you may find yourself having to wing it while learning and writing GX. If you get completely hung up on how a particular function or block of functions affects the system, remember that Google is your friend; sometimes you can find the answer on the devkitpro forums (use site:forums.devkitpro.org in Google to search there); sometimes it can be found elsewhere (the wiibrew forums are sometimes useful, but avoid their wiki!), and sometimes you won't get any useful results at all. Hopefully, the Doxygen documentation will be expanded in time, but until then, you'll be put through a trial-by-fire.

This article was initially written by me, ccfreak2k. shagkur originally wrote most of the GX backend code in libogc, and as such could be considered the authority on the subject, but he has not been around for a while. As such, this article includes what I believe to be correct information, and is all the information I have compiled on the subject thus far. I have done my best to make sure that the information is as accurate as possible (since wrong information is worse than no information!). The examples provided with libogc are known to be correct implementations of GX, so defer to those if there's a conflict on how something works.

Most important of all, however, is that you '''understand how your code works'''. Becoming a [http://en.wikipedia.org/wiki/Cargo_cult_programming cargo cult programmer] is a bad idea, so make the effort to understand what you're writing and why you're writing it.


==Introduction==


Among the GameCube's many subsystems lies probably one of the biggest: '''GX'''. GX is the name of the API used to draw graphics using the famous Flipper chip.


==GX Setup and Particulars==


GX shares some similarities with OpenGL, but differs greatly in many ways as well. OpenGL, by design, masks a lot of the nitty-gritty hardware specifics, leaving their implementation to hardware vendors, whereas the GX API is very close to the metal and many functions have little, if any, processing performed by the CPU. What this means is that, if you write smart code and know how the hardware works under the hood, you can bring out the best performance of the machine, but you'll also be working with an API that is altogether more complex than a higher-level API such as OpenGL.

Before you start initializing GX, you'll want to make sure you have the VIDEO subsystem set up. Almost every GameCube application that displays ''anything'' will do this. This includes allocating framebuffers and acquiring the TV screen attributes. Right after you have VIDEO set up is when you'll generally initialize GX.

To start, you'll need to allocate a "GP FIFO". The GP FIFO, or "graphics processor FIFO", is a portion of memory reserved for uploading commands to the GP. A FIFO is a type of pipe, but that's not necessary to know for now. To initialize the GX subsystem, you'll need to make some room for the FIFO, which you'll do like this:


<code>
</code>
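As a rough host-side sketch of the allocation step (this is not the article's original listing): the snippet below uses the standard C11 <tt>aligned_alloc()</tt> as a stand-in for <tt>memalign()</tt> so that it can be built off-console. The 256 KiB value given for <tt>DEFAULT_FIFO_SIZE</tt> and the helper names <tt>alloc_gp_fifo()</tt> and <tt>is_32_byte_aligned()</tt> are assumptions made for illustration. On real hardware the resulting pointer and size would then be handed to GX's initialization call.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed value for illustration only; programs typically define this themselves. */
#define DEFAULT_FIFO_SIZE (256 * 1024)

/* Hypothetical helper: returns a zeroed, 32-byte-aligned buffer, or NULL on failure. */
static void *alloc_gp_fifo(size_t size)
{
    /* aligned_alloc() expects the size to be a multiple of the alignment. */
    if (size == 0 || size % 32 != 0)
        return NULL;

    void *fifo = aligned_alloc(32, size);
    if (fifo == NULL)
        return NULL;

    /* Freshly allocated memory is uninitialized, so clear it. */
    memset(fifo, 0, size);
    return fifo;
}

/* True when the pointer sits on a 32-byte boundary. */
static int is_32_byte_aligned(const void *p)
{
    return ((uintptr_t)p % 32) == 0;
}
```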


The FIFO must be 32-byte aligned, which is what <tt>memalign()</tt> does. It's like <tt>malloc()</tt>, except it gives us a block of memory aligned to whatever alignment we specify. We also give <tt>DEFAULT_FIFO_SIZE</tt> as the size of memory that we want. The size of the FIFO required generally depends on how many commands you're dispatching per unit of time, but the default size is adequate in many cases. <tt>memset()</tt> clears the FIFO memory to 0 because allocated memory is uninitialized, and we don't want the GP to mistake garbage for commands. With that out of the way, it's time to switch on GX:


<code>
</code>


Here we give it a pointer to the FIFO and the size of the FIFO. From this point on, you probably won't be dealing with the FIFO anymore, as you'll be interfacing with the GP using the GX API now. What manner of initialization happens after this depends greatly on how you're using GX, but one function you'll generally use is this one:


<code>
</code>


This tells the GP to clear the screen to the specified background color at the beginning of every new frame, which will eliminate the "hall of mirrors effect" that would happen otherwise.


What happens after this will generally mirror initialization in OpenGL with some big exceptions, which we'll discuss below.


==Performance Optimization==


There are many ways to optimize GX performance, usually by removing redundant calls or by skipping calls when data doesn't change between frames (for example, you can save the view matrix after transformations and only create a new one if the viewpoint changes). One way you can optimize is by creating a separate thread for GX drawing operations. This can help if your other threads are almost always busy, for example because they're keeping track of the game state or managing AI. As soon as any thread is done with its work, it can put itself to sleep until new work arrives, which gives your other threads more CPU time to finish theirs. How to use threading in libogc is not discussed here; however, here are some things to keep in mind if you decide to move your rendering code to a separate thread:

* Use message passing to communicate render state changes. For example, if a new object needs to be rendered, use the message-passing interface of libogc to pass the pointer to the new object and have your rendering thread dereference it and read in the data, passing it to its drawing routines. Additionally, you can also use message passing to tell the render thread when an object should NOT be rendered anymore.

* Use scheduling, sleep/wakeup and callbacks to your advantage. If your rendering thread is done with its drawing, it should sleep until the next frame. <tt>VIDEO_WaitVSync()</tt> will put the calling thread in a wait state until the next vertical interrupt occurs, at which point it will be woken up.
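The message-passing idea above can be sketched in plain C as follows. This is a single-threaded, host-side stand-in: the ring buffer, the <tt>render_msg</tt> type and every function name here are hypothetical and are not libogc's message-queue interface. A real render thread would block on the library's queue instead of polling.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical message kinds exchanged with the render thread. */
enum render_msg_kind { RENDER_ADD, RENDER_REMOVE };

struct render_msg {
    enum render_msg_kind kind;
    const void *object;   /* pointer to the object data; the data is never copied */
};

#define QUEUE_CAPACITY 16

/* Fixed-size FIFO of messages; head is the read index, count the fill level. */
struct render_queue {
    struct render_msg slots[QUEUE_CAPACITY];
    size_t head;
    size_t count;
};

/* Game side: enqueue a message. Returns false when the queue is full. */
static bool render_queue_send(struct render_queue *q, struct render_msg msg)
{
    if (q->count == QUEUE_CAPACITY)
        return false;
    q->slots[(q->head + q->count) % QUEUE_CAPACITY] = msg;
    q->count++;
    return true;
}

/* Render side: dequeue the oldest message. Returns false when the queue is empty. */
static bool render_queue_receive(struct render_queue *q, struct render_msg *out)
{
    if (q->count == 0)
        return false;
    *out = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    return true;
}
```

Only the pointer travels through the queue, which matches the advice above: the render thread dereferences it when it is ready to draw.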


Even if you're not using a separate rendering thread, there are other ways to optimize:


* Compress your textures. This doesn't necessarily increase performance, but if a texture is going to be, say, applied to a wall, you can decrease its in-memory size by setting colfmt to 14 in the SCF file, which converts the texture into a compressed format, much like DXT1. The cost of decompressing is tiny, and the loss of quality isn't too dramatic, so it's definitely worth it if you're running short on memory.

* Only call functions when you need to. For example, setting up a texture to be painted only needs to occur once per texture. Uploading a texture (using <tt>GX_LoadTexObj()</tt>) also only needs to occur once unless you need to upload a new one or a new version of an already-uploaded texture (in which case you need to invalidate it from the texture cache too).
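A minimal sketch of the "only call functions when you need to" tip, written as portable C so it can be tried off-console. The <tt>load_texture_fn</tt> hook stands in for the real upload call, and the cache structure and all names are hypothetical; the idea is only that the last uploaded texture is remembered and the upload is skipped when nothing has changed.

```c
#include <stddef.h>

/* Hypothetical upload hook; on hardware this would wrap the real texture-load call. */
typedef void (*load_texture_fn)(const void *texobj);

/* Remembers which texture object was uploaded last. */
struct tex_state_cache {
    const void *bound;       /* NULL until the first upload */
    load_texture_fn load;
    unsigned uploads;        /* how many real uploads were issued */
};

/* Upload only when the requested texture differs from the one already bound. */
static void bind_texture(struct tex_state_cache *cache, const void *texobj)
{
    if (cache->bound == texobj)
        return;              /* redundant call: skip it */
    cache->load(texobj);
    cache->bound = texobj;
    cache->uploads++;
}

/* Call after changing a texture's contents so the next bind uploads it again. */
static void invalidate_texture(struct tex_state_cache *cache, const void *texobj)
{
    if (cache->bound == texobj)
        cache->bound = NULL;
}

/* Example hook that does nothing; useful for dry runs on a PC. */
static void noop_load(const void *texobj)
{
    (void)texobj;
}
```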


==Attribute Slots==


Many things in GX that can take different formats, such as vertex attributes or texture coordinate matrices, can be stored in slots. This lets you define all of your formats ahead of time and simply switch slots when required to load the stored formats. You're basically required to use at least one slot, even if you redefine the formats every time you change. You'll also find examples of slot usage peppered through this document and through example code, although I have yet to see any code that uses more than one slot. The biggest example involves the TEV, where you might have different settings for different textures, of which you can have eight textures per pass.


==Vertex Initialization==


One aspect of the close-to-the-hardware nature of GX is in vertex attributes. Before you start drawing triangles, you need to tell GX how they should be drawn. You'll generally do it like this:


<code>
</code>


<tt>GX_InvVtxCache()</tt> tells the GP to invalidate the vertex cache. You'll do this at least once as part of initialization, but it might also be useful if you, for example, want to change the scene, which may involve having completely different objects to be drawn.

The call to <tt>GX_ClearVtxDesc()</tt> clears the vertex attribute table, which is then defined in the lines following it.

The <tt>GX_SetVtxDesc()</tt> calls tell the GP how we'll be providing the vertex data. We're specifying the format for vertex positions, normals and texture coordinates, respectively. <tt>GX_DIRECT</tt> in this context means we'll be specifying them using calls like <tt>GX_Position3f()</tt> (similar to <tt>glVertex3f()</tt>).

<tt>GX_SetVtxAttrFmt()</tt> tells the GP how we're going to specify the information for vertices. Specifically, it covers the same three attributes as before (vertex position, vertex normal and texture coordinates), except we can specify different formats for different vertex format slots. For example, if you were drawing a HUD, you could store its vertex format in a different slot, then call that slot up when drawing the HUD instead of having to reload the vertex attributes yourself every time. You give the slot you want to use in an operation in the call to <tt>GX_Begin()</tt>, which is detailed below.


==Drawing Triangles==


The heart of drawing 3D shapes on the screen is more or less identical to OpenGL with a few changes. These steps are almost exactly the same for the other primitives that GX supports, except you'd replace <tt>GX_TRIANGLES</tt> with the appropriate type.

After drawing is set up (textures loaded, matrices applied, etc.), the drawing commands can be dispatched. You lead off with this:


<code>
</code>


The first argument tells the GP what we want to draw. In this case, we're drawing triangles, i.e. every three vertices will be a single triangle on the screen. Other options for this are points, lines, line strip, triangle strip, triangle fan and quads. Which you use depends on what you're drawing and can have huge performance implications with high triangle counts.

The second argument is the vertex format to load. If you read through [[#Vertex Initialization|Vertex Initialization]], you'd know that there are eight vertex format slots that you can use, and which one you want to use for that draw operation is specified here.

The last argument is the vertex count. This count ''must'' match the number of vertices that you are drawing. If this number is larger than the actual number of vertices that you specify, then the thread will hang as soon as <tt>GX_DrawDone()</tt> is called.
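Since the count has to match exactly, it can help to compute it rather than type it in. The helper below is a hypothetical, portable C sketch (the enum and function names are made up for illustration and are not GX identifiers). It assumes the usual primitive rules: 1 vertex per point, 2 per line, 3 per independent triangle, 4 per quad, n + 2 for a strip or fan of n triangles, and n + 1 for a strip of n line segments.

```c
/* Hypothetical primitive kinds mirroring the options listed above. */
enum prim_kind {
    PRIM_POINTS,
    PRIM_LINES,
    PRIM_LINE_STRIP,
    PRIM_TRIANGLES,
    PRIM_TRIANGLE_STRIP,
    PRIM_TRIANGLE_FAN,
    PRIM_QUADS
};

/* Number of vertices needed to draw n primitives of the given kind. */
static unsigned vertex_count(enum prim_kind kind, unsigned n)
{
    if (n == 0)
        return 0;

    switch (kind) {
    case PRIM_POINTS:         return n;
    case PRIM_LINES:          return 2 * n;
    case PRIM_LINE_STRIP:     return n + 1;
    case PRIM_TRIANGLES:      return 3 * n;
    case PRIM_TRIANGLE_STRIP: return n + 2;
    case PRIM_TRIANGLE_FAN:   return n + 2;
    case PRIM_QUADS:          return 4 * n;
    }
    return 0;
}
```

Twelve independent triangles need 36 vertices, while twelve triangles in a single strip need only 14, which is where the performance difference between primitive types comes from.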


Once this call is made, you can call functions that give the specific information on each vertex. You need to give a position for each, which you can do like this:


<code>
</code>


If you followed [[#Vertex Initialization|Vertex Initialization]], you saw that the vertex attribute example gave <tt>GX_POS_XYZ</tt> for the position vertex attribute, which means the vertices for this VA slot would be coordinates with three members, and here we're specifying the position coordinates for one of them. We would need to make two more of these calls to complete a triangle and to satisfy the vertex count we gave earlier. If you wanted to specify a normal for each vertex, you would make this call following the previous one:


<code>
</code>


This works much like <tt>glNormal</tt>, except you need one of these for ''every'' vertex, as opposed to just once as in OpenGL. The same goes for calls to <tt>GX_TexCoord*()</tt>.

<!-- A wiibrew article says the correct order per vertex is: position, normal, color, bi-normals, tex coords. -->


After you have made the appropriate draw calls, you close it with this:
This puts a token in the FIFO that tells the GP that you're finished giving commands for this drawing operation.


In your first GX programs, you might only have a very basic draw block that might only draw a single triangle with three vertices, and you might loop that block for every triangle you want to draw. In these cases, the above lines are sufficient. Obviously, the coordinate values would probably be read from a variable rather than declared statically if you're loading objects at run-time.


==Display Lists==


Feeding in direct values works for your basic projects, but it can quickly grow out of control. A more intelligent way to handle draw operations, especially on large, complex meshes, is to use a display list. A display list is, basically, a series of commands for the GP to execute. It's exactly the same as what you did for drawing before, except now you're telling libogc to pack it into a concise list.

First you'll need to allocate a section of RAM to hold the list. This section, once again, needs to be 32-byte aligned. It also needs to be a multiple of 32 bytes in size:


<code>
</code>


The exact value of the second argument to <tt>memalign()</tt> depends on which commands you're dispatching and how many of them you're dispatching. Each GX command takes a certain number of bytes in the FIFO. For example, <tt>GX_Begin()</tt> costs three bytes. If you're certain about the size that you'll need, you can use a constant here; otherwise you'll either have to pick a big number or attempt to calculate it. This size ''must'' be a multiple of 32 AND be ''larger'' than the actual size of the list rounded up to 32 (for example, if your list is 131 bytes, allocating 160 bytes won't work). You'll also want to call <tt>memset()</tt> to zero out the memory.
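The sizing rule can be turned into a small calculation. This is a sketch under the rule as stated above (round the command size up to a multiple of 32, then go one 32-byte block further). The function names are hypothetical, and the per-vertex byte counts assume every attribute is sent directly as 32-bit floats; only the 3-byte cost of <tt>GX_Begin()</tt> comes from the text.

```c
#include <stddef.h>

/* Round n up to the next multiple of 32. */
static size_t round_up_32(size_t n)
{
    return (n + 31) / 32 * 32;
}

/* Buffer size to allocate for a list whose commands take list_bytes bytes:
 * a multiple of 32 that is strictly larger than the rounded-up list size. */
static size_t displist_alloc_size(size_t list_bytes)
{
    return round_up_32(list_bytes) + 32;
}

/* Rough command size of one draw block: 3 bytes for the begin command plus
 * the per-vertex payload. Assumes 32-bit floats: 12 bytes for a position,
 * 12 for a normal and 8 for one texture coordinate pair. */
static size_t draw_block_bytes(size_t vertices, int has_normal, int has_texcoord)
{
    size_t per_vertex = 12;
    if (has_normal)
        per_vertex += 12;
    if (has_texcoord)
        per_vertex += 8;
    return 3 + vertices * per_vertex;
}
```

With the 131-byte example, <tt>displist_alloc_size(131)</tt> yields 192 rather than 160.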


You'll also need to flush out the CPU's data cache for this allocation, as the GP will need to be able to access the list:


<code>
</code>


The first call flushes the data from the cache, which causes all of it to get written to memory immediately. The second call makes any subsequent writes to this memory range go straight to memory instead of being simply cached. Both of these take the same arguments: pointer to the memory range, and the size of the range. Once you do that, you'll need to begin writing to the display list like this:


<code>
</code>


The first argument is a pointer to memory you allocated earlier. The second argument is the size of the display list, padded to 32 bytes (this should probably be the same size as your memory allocation unless you allocated one big block). After you do this, subsequent commands will be routed into the display list instead of being painted immediately. This behaviour is known as "retained mode," as opposed to "immediate mode" which is what you were doing before. From here, you'll need to call <tt>GX_Begin()</tt> and the like to start "drawing" what you want to keep in the list, such as a box or a race car. When you're done, do this:


<code>
</code>


<tt>GX_EndDispList()</tt>'s return value is important. A return value of 0 indicates that the display list size (given by the second argument to <tt>GX_BeginDispList()</tt>) is insufficient for the number of commands that you've given it, so a larger list is required. A value greater than 0 represents the effective size of the display list. Hold on to this number, because you'll need it to play the list back. If you don't allocate enough space, it's also possible for the return value to be very large (67108864 in my testing).
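The size rules above can be checked on the host with a small sketch in plain C. It does not call libogc; <tt>pad32()</tt> and <tt>displist_size_ok()</tt> are hypothetical helper names, and the second one simply treats both failure cases described above (a 0 return and an implausibly large return) as "list too small".

```c
#include <assert.h>
#include <stdint.h>

/* Round a byte count up to the next multiple of 32. */
static uint32_t pad32(uint32_t n)
{
    return (n + 31u) & ~31u;
}

/* Decide whether a size reported after recording is usable.
   capacity is the padded size that was passed when recording began.
   Returns 1 if the list fits, 0 if a larger buffer is needed. */
static int displist_size_ok(uint32_t reported, uint32_t capacity)
{
    assert(capacity % 32 == 0);   /* the capacity itself must be padded */
    return reported > 0 && reported <= capacity;
}
```

A caller would pad its estimate with <tt>pad32()</tt> before allocating, then grow the buffer and record again whenever <tt>displist_size_ok()</tt> reports 0.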


When you're ready to use the display list, you call a single function to "play back" the commands in the list:

<code>
</code>


The first argument is a pointer to the display list, just like when you created the list. The second argument is the size of the display list, which <tt>GX_EndDispList()</tt> conveniently gave to you. This call is basically equivalent to all the draw commands you gave earlier, except it's faster and cleaner. If you need to change anything about the object (like a vertex color), you'll need to rebuild the list. An alternative to this is to build multiple similar lists if you need to change only a few parameters.


The neheGX lesson 12 is an example application that uses display lists.

==Textures==


Textures are normally provided in so-called TPL format. These are converted at compile-time to the appropriate format and linked into the final application in binary format. The build system has the necessary configuration already present to handle this. neheGX lessons with textures (such as lesson 5) have a makefile with the necessary lines present to handle conversion.

TPL files are accompanied by SCF files. These are XML files which contain a list of textures. So far, I have only specified one texture per SCF file, like this:

<code>
</code>


''filepath'' is the path, relative to the texture directory, to the image file to be converted. ''id'' is the ID to use to identify the texture in code when loading it into GX. ''colfmt'' tells gxtexconv what format the resulting texture should be (for example, format 4 is RGB565).

<!-- A wiibrew article says colfmt determines the resulting texture format; says colfmt 14 is DXT1 and 4 is RGB565. -->
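To get a feel for what the choice of ''colfmt'' costs in memory, here is a rough host-side sketch in plain C. It assumes 16 bits per texel for RGB565 and 4 bits per texel for a DXT1-style compressed format, and it ignores mipmaps and any padding of the width and height; <tt>tex_bytes()</tt> and the two constants are hypothetical names, not part of libogc or gxtexconv.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed bits per texel for the two formats mentioned in the text. */
enum { BPP_RGB565 = 16, BPP_CMPR = 4 };

/* Approximate size in bytes of a single-level texture
   (no mipmaps, no padding of width or height). */
static size_t tex_bytes(size_t width, size_t height, unsigned bits_per_texel)
{
    assert(bits_per_texel % 4 == 0);
    return width * height * bits_per_texel / 8;
}
```

Under these assumptions a 256x256 texture takes 131072 bytes as RGB565 and 32768 bytes in the compressed format, a factor of four.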


If you are going to use textures the whole time (i.e. you're not intending to unload any textures), then you can use this method of texture loading. First, you'll need to include the headers for each texture. In my example, Mud.bmp, I need to include these:

<code>
</code>


When performing texture loading (usually some time after you've set up GX and the viewport), you'll use this structure to specify the texture in memory. After that, you'll need to "open" the TPL in memory:

<code>
</code>


You know that <tt>mudTPL</tt> has already been declared, but unless you were to peek at the headers included earlier, you won't know where the last two arguments came from. They were generated by gxtexconv and are the binary array and array size, respectively. Now it's time to actually put the texture into a struct to pass on to the GX subsystem. First you'll need to declare the variable somewhere:

<code>
</code>


<tt>mudTPL</tt> is the TPLFile struct from earlier, <tt>mud</tt> is the ID that we gave in the SCF file and <tt>texture</tt> is the new GXTexObj struct. From now on, you won't be dealing with the TPL struct.

It is possible to make a texture loader to load images in from the storage media, but WinterMute recommends against this because it can cause severe memory fragmentation during the conversion process. If you decide to do this, the steps are different from the ones above, but the end result will be the same: a GXTexObj with the texture in it.

Now you'll want to make sure that the texture projection (its appearance on the surface) is correct, which we'll do this way:

<code>
</code>


<tt>GX_SetTexCoordGen()</tt> tells the graphics hardware how the texture coordinates should be generated; in other words, it tells the hardware how the textures should appear on the surface. In this example, we're telling it that texture 0 (remember that the hardware can handle eight textures per pass) uses a 3x4 identity matrix, which means that no special transformations should be performed while painting the textures. <tt>GX_IDENTITY</tt> can alternatively be <tt>GX_TEXMTX0</tt> through <tt>GX_TEXMTX9</tt> if you want to apply a matrix to a texture.

The calls to <tt>guMtxTrans()</tt> and <tt>guMtxConcat()</tt> transform the texture matrix to match our viewpoint. They're subsequently loaded using <tt>GX_LoadTexMtxImm()</tt> into the first texture matrix slot.

Finally, <tt>GX_InvalidateTexAll()</tt> tells the graphics hardware to invalidate the textures in its texture cache, which will cause it to reload textures from system memory. You need to call this every time you make a change to a texture.


==Matrix Math==

Matrices are a big part of 3D graphics, as their versatility is well-suited to the types of operations performed when transforming, such as moving the camera around. libogc includes hand-written assembly routines for matrix math that utilize the Gekko CPU's paired-single instructions, making such operations blazing fast as well. If you need a primer on how matrix math works, take a look at the article here: http://www.gamedev.net/reference/articles/article877.asp

If you are experienced with OpenGL, you'll probably be familiar with functions like <tt>glRotate</tt> and <tt>glTranslate</tt>. These types of functions are not present in libogc; you'll need to construct the appropriate chain of matrix functions to do the same thing. Fear not, however, as the matrix functions you'll use instead are very intuitive and aren't too much more difficult.

In your applications, or at least your early ones, matrices transform things such as your viewpoint in two ways: translation and rotation. Translation is the "panning" of something and "rotation" is the spin of it. For example, let's say we want to rotate the camera horizontally to represent the player looking left and right in a level. You might start with this code:

<code>
</code>


<tt>axis</tt> is a vector (a struct with <tt>x</tt>, <tt>y</tt> and <tt>z</tt> members) which will represent the axis that we want to rotate on, and <tt>m</tt>, <tt>v</tt> and <tt>mv</tt> are the matrices that represent the rotation, our view, and the combination of both. Specifically, <tt>m</tt> represents the ''new'' matrix that we're going to get the rotation from, and <tt>v</tt> is our ''current view matrix'' (i.e. it's the matrix that represents where our "camera" is pointing in the world). <tt>v</tt> is usually already declared and assigned earlier as our view matrix, but here I declared it for the purpose of demonstrating its presence. As per the linked article above, we need to start with a "neutral" matrix, which is the "identity". You'll use this to set it up:

<code>
</code>


This sets the given matrix to an identity matrix, and is basically equivalent to initializing it before using it. Next we need to apply the appropriate transformation; in this case, we're going to rotate, say, 90 degrees laterally:

<code>
</code>


The first argument, <tt>m</tt>, is our matrix-to-be-rotated. The second argument, <tt>axis</tt>, is the axis to rotate around. The last argument, 90 in this case, is the number of degrees to rotate. The axis appears to be normalized inside the function, so only its direction should matter, not its length; setting <tt>axis.x</tt> to 0.5 instead of 1 should not change the amount of rotation. In reality, this is a wrapper that converts degrees to radians and then applies the rotation, but this happens behind the scenes. Finally, we need to actually apply this rotation to our view matrix so that the changes are visible. We'll use this function to do that:

<code>
</code>


This function concatenates the matrices <tt>m</tt> and <tt>v</tt> together and gives the output in <tt>mv</tt>. Since <tt>mv</tt> is the result, we should use it when applying the transformation to our scene instead of <tt>v</tt>.
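To make the effect of concatenation concrete, here is a minimal host-side sketch in plain C. It does not use libogc: the <tt>M34</tt> type and the <tt>m34_*</tt> helpers are hypothetical stand-ins that assume a 3x4 matrix with an implicit bottom row of 0 0 0 1, and the rotation takes a precomputed sine and cosine so that the sketch needs no math library.

```c
#include <assert.h>
#include <string.h>

typedef float M34[3][4];   /* hypothetical 3x4 matrix; last column is the translation */

/* Set m to the identity ("neutral") matrix. */
static void m34_identity(M34 m)
{
    memset(m, 0, sizeof(M34));
    m[0][0] = m[1][1] = m[2][2] = 1.0f;
}

/* Build a rotation about the Y axis from the sine (s) and cosine (c) of the angle. */
static void m34_rot_y(M34 m, float s, float c)
{
    assert(m != NULL);
    m34_identity(m);
    m[0][0] = c;  m[0][2] = s;
    m[2][0] = -s; m[2][2] = c;
}

/* ab = a * b. A temporary is used so that ab may be the same matrix as a or b. */
static void m34_concat(M34 a, M34 b, M34 ab)
{
    M34 tmp;
    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 4; j++)
            tmp[i][j] = a[i][0] * b[0][j] + a[i][1] * b[1][j] + a[i][2] * b[2][j];
        tmp[i][3] += a[i][3];   /* contribution of b's implicit bottom row 0 0 0 1 */
    }
    memcpy(ab, tmp, sizeof(M34));
}
```

In this sketch, concatenating in the other order (<tt>v</tt> first, then <tt>m</tt>) generally gives a different matrix, so the argument order is worth double-checking whenever a transformation looks wrong.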
 
Translation would work the same, but there's a shortcut we can use. If we want to translate <tt>mv</tt>, we can do this:

<code>
</code>


<tt>xtrans</tt>, <tt>ytrans</tt> and <tt>ztrans</tt> are floats that represent the amount that we want to translate in each direction. The trick here is that the source and destination matrix are the same, so we reduce the complexity and number of operations. If <tt>mv</tt> is our view matrix, we'll need to actually apply it, which we'll do here:

<code>
</code>


This tells GX to load the matrix into the first position matrix slot, of which we have ten. For any position operations done from this point forward, this matrix will be applied to the vertices. Remember that this matrix needs to be updated any time the view changes; otherwise the view will stay the same.
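The in-place translation shortcut described earlier can also be sketched on the host in plain C. This is an illustration only: <tt>M34</tt> and <tt>m34_trans_apply()</tt> are hypothetical names, and the sketch assumes that applying a translation simply adds the offsets to the last column of the matrix, which is why the source and destination can be the same.

```c
#include <assert.h>
#include <stddef.h>

typedef float M34[3][4];   /* hypothetical 3x4 matrix; last column is the translation */

/* Add a translation to src and store the result in dst.
   src and dst may be the same matrix, in which case nothing is copied. */
static void m34_trans_apply(M34 src, M34 dst, float x, float y, float z)
{
    assert(src != NULL && dst != NULL);
    if (src != dst) {
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 4; j++)
                dst[i][j] = src[i][j];
    }
    dst[0][3] = src[0][3] + x;
    dst[1][3] = src[1][3] + y;
    dst[2][3] = src[2][3] + z;
}
```

When the two matrices are the same, only three additions are performed, compared with a full matrix multiply for a separate translation matrix followed by a concatenation.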


These operations are largely the same for any other matrix transformation, such as for textures or lighting.

Much of the information in this section is taken from gl2gx's project wiki, and some of it is derived from code.


Lighting in GX is different from OpenGL. While OpenGL has diffuse, ambient and specular colors, as well as a global ambient color, GX only has diffuse light colors. Additionally, OpenGL has diffuse, ambient, specular and emission material colors, whereas GX has diffuse and ambient.

Revision as of 18:46, 13 March 2010


THIS ARTICLE IS A WORK IN PROGRESS AND IS CURRENTLY NOT TO BE CONSIDERED AN AUTHORITATIVE SOURCE OF

INFORMATION. YOU HAVE BEEN WARNED.

Preface

Some concepts in this article require some prerequisite knowledge about 3D programming in general, the GX API

specifically or both. The NeHe OpenGL lessons are an excellent supplementary resource for learning the specifics

of OpenGL and can work hand-in-hand with the OpenGL textbooks. libogc also has lessons 1 through 10 converted to

the GX and gu APIs, so cross-referencing the source code for those lessons can help you understand what is going

on.

You'll definitely need to have a handle on how to program in C, especially concerning things such as

floating-point numbers and pointers, both of which are used heavily. The functions, compiler directives and

other parts of the API are viewable in the Doxygen pages. Some functions have real documentation with them (the

most often-used ones usually), but almost all of the GX Doxygen listings have no accompanying description, and

you may find yourself having to wing it while learning and writing GX. If you get completely hung-up on how a

particular function or block of functions affects the system, remember that Google is your friend; sometimes you

can find the answer on the devkitpro forums (use site:forums.devkitpro.org in Google to search there); sometimes

it can be found elsewhere (the wiibrew forums are sometimes useful, but avoid their wiki!), and sometimes you

won't get any useful results at all. Hopefully, the Doxygen documentation will be expanded in time, but until

then, you'll be put through a trial-by-fire.

This article was initially written by me, ccfreak2k. shagkur originally wrote most of the GX backend code in

libogc, and as such could be considered the authority on the subject, but he has not been around for a while. As

such, this article includes what I believe to be correct information, and is all the information I have compiled

on the subject thus far. I have done my best to make sure that the information is as accurate as possible (since

wrong information is worse than no information!). The examples provided with libogc are known to be correct

implementations of GX, so defer to those if there's a conflict on how something works.

Most important of all, however, is that you understand how your code works. Becoming a

cargo cult programmer is a bad idea, so make the effort to

understand what you're writing and why you're writing it.

Introduction

Among the GameCube's many subsystems lies probably one of the biggest: GX. GX is the name of the API used

to draw graphics using the famous Flipper chip.

GX Setup and Particulars

GX shares some similarities with OpenGL, and differs greatly in many ways as well. OpenGL, by design, masks a

lot of the nitty-gritty hardware specifics, leaving implementation of it to hardware vendors, whereas the GX API

is very close to the metal and many functions have little, if any, processing performed by the CPU. What this

means is that, if you write smart code and know how the hardware works under the hood, you can bring out the

best performance of the machine, but you'll also be working with an API that is altogether more complex than

writing in a higher-level API such as OpenGL.

Before you start initializing GX, you'll want to make sure you have the VIDEO subsystem set up. Almost every

GameCube application that displays anything will do this. This includes allocating framebuffers and acquiring

the TV screen attributes. Right after you have VIDEO set up is when you'll generally initialize GX.

To start, you'll need to allocate a "GP FIFO". The GP FIFO, or "graphics processor FIFO" is a portion of memory

reserved for uploading commands to the GP. A FIFO is a type of pipe, but that's not necessary to know for now.

To initialize the GX subsystem, you'll need to make some room for the FIFO, which you'll do like this:

void *gp_fifo = NULL;
gp_fifo = memalign(32,DEFAULT_FIFO_SIZE);
memset(gp_fifo,0,DEFAULT_FIFO_SIZE);

The FIFO must be 32-byte aligned, which is what memalign() does. It's like malloc(), except it

gives us a block of memory aligned to whatever alignment we specify. We also give DEFAULT_FIFO_SIZE as

the size of memory that we want. The size of the FIFO required generally depends on how many commands you're

dispatching per unit of time, but the default size is adequate in many cases. memset() clears the FIFO

memory to 0 because allocated memory is uninitialized, and we don't want the GP to mistake garbage for commands.
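The allocate-then-clear pattern can be tried on a desktop machine with standard C11, as a rough sketch. <tt>aligned_alloc()</tt> stands in for <tt>memalign()</tt> here, <tt>alloc_fifo()</tt> is a hypothetical helper, and <tt>SKETCH_FIFO_SIZE</tt> is an assumed stand-in for <tt>DEFAULT_FIFO_SIZE</tt> (taken to be 256 KiB for this illustration).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed stand-in for DEFAULT_FIFO_SIZE: 256 KiB. */
#define SKETCH_FIFO_SIZE (256u * 1024u)

/* Allocate a zeroed buffer aligned to 32 bytes. Returns NULL on failure. */
static void *alloc_fifo(size_t size)
{
    assert(size % 32 == 0);               /* aligned_alloc wants a multiple of the alignment */
    void *fifo = aligned_alloc(32, size);
    if (fifo != NULL)
        memset(fifo, 0, size);            /* clear garbage so it can't be read as commands */
    return fifo;
}
```

Checking the result for <tt>NULL</tt> before handing it on is cheap insurance, since a failed allocation would otherwise only show up later as a crash or hang.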

With that out of the way, it's time to switch on GX:

GX_Init(gp_fifo,DEFAULT_FIFO_SIZE);

Here we give it a pointer to the FIFO and the size of the FIFO. From this point on, you probably won't be

dealing with the FIFO anymore, as you'll be interfacing with the GP using the GX API now. What manner of

initialization happens after this depends greatly on how you're using GX, but one function you'll generally use

is this one:

GXColor background = {0,0,0,0xff};
GX_SetCopyClear(background, 0x00ffffff);

This tells the GP to clear the screen to the specified background color at the beginning of every new frame,

which will eliminate the "hall of mirrors effect" that would happen otherwise.

What happens after this will generally mirror initialization in OpenGL with some big exceptions, which we'll

discuss below.

Performance Optimization

There are many ways to optimize GX performance, usually dealing with redundant calls or eliminating calls if data

between frames doesn't change (for example, you can save the view matrix after transformations and only create a

new one if the viewpoint changes). One way you can optimize is by creating a separate thread for GX drawing

operations. This can help if your other threads are almost always busy, such as if, for example, they're keeping

track of the game state or managing AI. As soon as any thread is done with any work, it can put itself to sleep

until new work can be done, which allows your other threads more CPU time to finish their work. How to use

threading in libogc is not discussed here; however, here are some things to keep in mind if you decide to move

your rendering code to a separate thread:

  • Use message passing to communicate render state changes. For example, if a new object needs to be rendered,

use the message-passing interface of libogc to pass the pointer to the new object and have your rendering thread

dereference it and read in the data, passing it to its drawing routines. Additionally, you can also use message

passing to tell the render thread when an object should NOT be rendered anymore.

  • Use scheduling, sleep/wakeup and callbacks to your advantage. If your rendering thread is done with its

drawing, it should sleep until the next frame. VIDEO_WaitVSync() will put the calling thread in a wait state

until the next vertical interrupt occurs, where it will then be woken up.

Even if you're not using a separate rendering thread, there are other ways to optimize:

  • Compress your textures. This doesn't necessarily increase performance, but if a texture is going to be, say,

applied to a wall, you can decrease its in-memory size by setting colfmt to 14 in the SCF file, which converts

the texture into a compressed format, much like DXT1. The cost for decompressing is tiny, and the loss of

quality isn't too dramatic, so it's definitely worth it if you're up against the wall.

  • Only call functions when you need to. For example, setting up a texture to be painted only needs to occur once

per texture. Uploading a texture (using GX_LoadTexObj()) also only needs to occur once unless you need

to upload a new one or a new version of an already-uploaded texture (in which case you need to invalidate it

from texture cache too).
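The view-matrix tip at the start of this section (only build a new matrix when the viewpoint changes) can be sketched with a "dirty" flag. This is a plain C illustration that does not call GX; <tt>ViewCache</tt> and its functions are hypothetical names, and the "build" step is reduced to a simple translation to keep the sketch short.

```c
#include <assert.h>
#include <string.h>

typedef struct {
    float view[3][4];  /* last view matrix that was built */
    int   dirty;       /* nonzero when the camera moved since the last build */
    int   rebuilds;    /* how many times the matrix was actually rebuilt */
} ViewCache;

/* Mark the cached matrix stale; call this whenever the camera moves. */
static void view_cache_invalidate(ViewCache *c)
{
    c->dirty = 1;
}

/* Rebuild the view matrix only if it is stale.
   Returns 1 if it was rebuilt, 0 if the cached matrix was reused. */
static int view_cache_update(ViewCache *c, float x, float y, float z)
{
    assert(c != NULL);
    if (!c->dirty)
        return 0;                          /* nothing changed: keep the old matrix */
    memset(c->view, 0, sizeof c->view);
    c->view[0][0] = c->view[1][1] = c->view[2][2] = 1.0f;
    c->view[0][3] = -x;                    /* the world moves opposite to the camera */
    c->view[1][3] = -y;
    c->view[2][3] = -z;
    c->dirty = 0;
    c->rebuilds++;
    return 1;
}
```

The return value tells the caller whether the matrix needs to be uploaded again, so the upload can be skipped on frames where the camera did not move.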

Attribute Slots

Many things in GX that can take different formats, such as vertex attributes or texture coordinate matrices, can

be stored in slots. This lets you define all of your formats ahead of time and simply switch slots when required

to load the stored formats. You're basically required to use at least one slot, even if you redefine the formats

every time you change. You'll also find examples of slot usage peppered through this document and through

example code, although I have yet to see any code that uses more than one slot. The biggest example involves the

TEV, where you might have different settings for different textures, of which you can have eight textures per

pass.

Vertex Initialization

One aspect of the close-to-the-hardware nature of GX is in vertex attributes. Before you start drawing

triangles, you need to tell GX how they should be drawn. You'll generally do it like this:

GX_InvVtxCache();
GX_ClearVtxDesc();

GX_SetVtxDesc(GX_VA_POS, GX_DIRECT);
GX_SetVtxDesc(GX_VA_NRM, GX_DIRECT);
GX_SetVtxDesc(GX_VA_TEX0, GX_DIRECT);

GX_SetVtxAttrFmt(GX_VTXFMT0, GX_VA_POS, GX_POS_XYZ, GX_F32, 0);
GX_SetVtxAttrFmt(GX_VTXFMT0, GX_VA_NRM, GX_NRM_XYZ, GX_F32, 0);
GX_SetVtxAttrFmt(GX_VTXFMT0, GX_VA_TEX0, GX_TEX_ST, GX_F32, 0);

GX_InvVtxCache() tells the GP to invalidate the vertex cache. You'll do this at least once as part of

initialization, but it might also be useful if you, for example, want to change the scene, which may involve

having completely different objects to be drawn.

The call to GX_ClearVtxDesc() clears the vertex attribute table, which is what is defined in the lines

following that.

The GX_SetVtxDesc() calls tell the GP how we'll be providing the vertex data. We're specifying the

format for vertex positions, normals and texture coordinates, respectively. GX_DIRECT in this context

means we'll be specifying them using calls like GX_Position3f32() (similar to glVertex3f()).

GX_SetVtxAttrFmt() tells the GP how we're going to specify the information for vertices. Specifically

it's the same as the previous three (vertex position, vertex normal and texture coordinates), except we can

specify different formats for different vertex format slots. For example, if you were drawing a HUD, you could

store its vertex format in a different slot, then call that slot up when drawing the HUD instead of having to

reload the vertex attributes yourself every time. You give the slot you want to use in an operation in the call

to GX_Begin(), which is detailed below.

Drawing Triangles

The heart of drawing 3D shapes on the screen is more or less identical to OpenGL with a few changes. These steps

are almost exactly the same for the other primitives that GX supports, except you'd replace GX_TRIANGLES with the

appropriate type.

After drawing is set up (textures loaded, matrices applied, etc), the drawing commands can be dispatched. You

lead off with this:

GX_Begin(GX_TRIANGLES,GX_VTXFMT0,3);

The first argument tells the GP what we want to draw. In this case, we're drawing triangles, i.e. every three

vertices will be a single triangle on the screen. Other options for this are points, lines, line strip, triangle

strip, triangle fan and quads. Which you use depends on what you're drawing and can have huge performance 

implications with high triangle counts.

The second argument is the vertex format to load. If you read through Vertex

Initialization, you'd know that there are eight vertex format slots that you can use, and which one you want

to use for that draw operation is specified here.

The last argument is the vertex count. This count must match the number of vertices that you are drawing. If

this number is larger than the actual number of vertices that you specify, then the thread will hang as soon as

GX_DrawDone() is called.
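One way to catch a mismatched vertex count before it hangs the thread is to count submissions in a debug build. The sketch below is plain C and does not call GX; <tt>VtxCounter</tt> and its functions are hypothetical names that you would call alongside the real begin and position calls.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned declared;   /* vertex count given when the batch was started */
    unsigned submitted;  /* positions submitted so far */
} VtxCounter;

/* Record the count declared for a new batch. */
static void counter_begin(VtxCounter *c, unsigned count)
{
    assert(c != NULL);
    c->declared = count;
    c->submitted = 0;
}

/* Call once for every vertex position that is submitted. */
static void counter_position(VtxCounter *c)
{
    c->submitted++;
}

/* Returns 1 when exactly the declared number of vertices was submitted. */
static int counter_complete(const VtxCounter *c)
{
    return c->submitted == c->declared;
}
```

Asserting <tt>counter_complete()</tt> at the end of each batch turns a silent hang into an immediate, easy-to-locate failure.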

Once this call is made, you can call functions that give the specific information on each vertex. You need to

have position for each, which you can call like this:

GX_Position3f32(1.0f,0,0);

If you followed Vertex Initialization, you saw that the vertex attribute example gave

GX_POS_XYZ for the position vertex attribute, which means the vertices for this VA slot would be

coordinates with three members, and here we're specifying the position coordinates for one of them. We would

need to make two more of these calls so that we complete a triangle in addition to satisfying the vertex count we

gave earlier. If you wanted to specify a normal for each vertex, you would make this call following the previous

one:

GX_Normal3f32(0,0,1.0f);

This works much like glNormal, except you need one of these for every vertex, as opposed to just

once as in OpenGL. The same goes for calls to GX_TexCoord*().


If you are going to use textures the whole time (i.e. you're not intending to unload any textures), then you can

use this method of texture loading. First, you'll need to include the headers for each texture. In my example,

Mud.bmp, I need to include these:

#include "mud_tpl.h"
#include "mud.h"

Next, you'll need to declare the texture's in-memory representation struct:

TPLFile mudTPL;

When performing texture loading (usually some time after you've set up GX and the viewport), you'll use this

structure to specify the texture in memory. After that, you'll need to "open" the TPL in memory:

TPL_OpenTPLFromMemory(&mudTPL, (void *)mud_tpl,mud_tpl_size);

You know that mudTPL has already been declared, but unless you were to peek at the headers included

earlier, you won't know where the last two arguments came from. They were generated by gxtexconv and are the

binary array and array size, respectively. Now it's time to actually put the texture into a struct to pass on to

the GX subsystem. First you'll need to declare the variable somewhere:

GXTexObj texture;

Then you can load it in:

TPL_GetTexture(&mudTPL,mud,&texture);

mudTPL is the TPLFile struct from earlier, mud is the ID that we gave in the SCF file and

texture is the new GXTexObj struct. From now on, you won't be dealing with the TPL struct.

It is possible to make a texture loader to load images in from the storage media, but WinterMute recommends

against this because it can cause severe memory fragmentation during the conversion process. If you decide to do

this, the steps are different from the ones above, but the end result will be the same: a GXTexObj with the

texture in it.

Now you'll want to make sure that the texture projection (its appearance on the surface) is correct, which we'll

do this way:

Mtx mv,mr;
f32 w = rmode->viWidth;
f32 h = rmode->viHeight;
GX_SetTexCoordGen(GX_TEXCOORD0, GX_TG_MTX3x4, GX_TG_TEX0, GX_IDENTITY);
guLightPerspective(mv, 45, (f32)w/h, 1.05f, 1.0f, 0.0f, 0.0f);
guMtxTrans(mr, 0.0f, 0.0f, -1.0f);
guMtxConcat(mv, mr, mv);
GX_LoadTexMtxImm(mv, GX_TEXMTX0, GX_MTX3x4);
GX_InvalidateTexAll();

GX_SetTexCoordGen() tells the graphics hardware how the texture coordinates should be generated; in

other words, it tells the hardware how the textures should appear on the surface. In this example, we're telling

it that texture 0 (remember that the hardware can handle eight textures per pass) uses a 3x4 identity matrix,

which means that no special transformations should be performed while painting the textures.

GX_IDENTITY can alternatively be GX_TEXMTX0 through GX_TEXMTX9 if you want to apply a

matrix to a texture.

guLightPerspective() builds the projection matrix for the texture, and the calls to guMtxTrans() and guMtxConcat() then transform it to match our

viewpoint. The resulting matrix is loaded using GX_LoadTexMtxImm() into the first texture matrix slot.

Finally, GX_InvalidateTexAll() tells the graphics hardware to invalidate the textures in its texture

cache, which will cause it to reload textures from system memory. You need to call this every time you make a

change to a texture.

Matrix Math

Matrices are a big part of 3D graphics, as their versatility is well-suited to the types of operations performed

when transforming, such as moving the camera around. libogc includes hand-written assembly routines for matrix

math that utilize the Gekko CPU's paired-single instructions, making such operations blazing fast as well. If

you need a primer on how matrix math works, take a look at the article here:

http://www.gamedev.net/reference/articles/article877.asp

If you are experienced with OpenGL, you'll probably be familiar with functions like glRotate and

glTranslate. These types of functions are not present in libogc; you'll need to construct the

appropriate chain of matrix functions to do the same thing. Fear not, however, as the matrix functions you'll

use instead are very intuitive and aren't too much more difficult.

In your applications, or at least your early ones, matrices transform things such as your viewpoint in two ways:

translation and rotation. Translation is the "panning" of something and rotation is the "spinning" of it. For

example, let's say we want to rotate the camera horizontally to represent the player looking left and right in a

level. You might start with this code:

guVector axis;
Mtx m,v,mv;

axis is a vector (a struct with x, y and z members) which will represent the axis that we want to rotate

on, and m, v and mv are the matrices that represent the rotation, our view, and the

combination of both. Specifically, m represents the new matrix that we're going to get the rotation

from, and v is our current view matrix (i.e. it's the matrix that represents where our "camera" is

pointing in the world). v is usually already declared and assigned earlier as our view matrix, but here

I declared it for the purpose of demonstrating its presence. As per the linked article above, we need to start

with a "neutral" matrix, which is the "identity". You'll use this to set it up:

guMtxIdentity(m);

This sets the given matrix to an identity matrix, and is basically equivalent to initializing it before using

it. Next we need to apply the appropriate transformation; in this case, we're going to rotate, say, 90 degrees

laterally:

axis.x = 0;
axis.y = 1.0f; /* vertical axis: rotating about it turns the view left/right */
axis.z = 0;
guMtxRotAxisDeg(m, &axis, 90);

The first argument, m, is the matrix that receives the rotation. The second argument, axis, gives the

direction of the axis to rotate about. The last argument, 90 in this case, is the number of degrees to

rotate. Only the direction of axis matters: libogc normalizes it, so scaling the vector does not change the amount of rotation. In reality, this is

actually a thin wrapper that converts degrees to radians and calls guMtxRotAxisRad(), but this happens behind the

scenes. Finally, we need to actually apply this rotation to our view matrix so that the changes are visible.

We'll use this function to do that:

guMtxConcat(m,v,mv);

This function concatenates the matrices m and v together and gives the output in mv.

Since mv is the result, we should use it when applying the transformation to our scene instead of

v.

Translation would work the same, but there's a shortcut we can use. If we want to translate mv, we can

do this:

f32 xtrans = 1.0f;
f32 ytrans = 0;
f32 ztrans = 0;
guMtxTransApply(mv, mv, xtrans, ytrans, ztrans);

xtrans, ytrans and ztrans are floats that represent the amount that we want to

translate in each direction. Because the source and destination matrices are the same, the translation is applied in

place, with no temporary matrix or extra guMtxConcat() call. If mv is our view matrix, we'll need to actually apply it,

which we'll do here:

GX_LoadPosMtxImm(mv, GX_PNMTX0);

This tells GX to load the matrix into the first position matrix slot, of which we have ten. For any position

operations done from this point forward, this matrix will be applied to the vertices. Remember that this matrix

needs to be updated any time the view changes; otherwise the view will stay the same.
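To make "applied to the vertices" concrete, here is a minimal sketch in plain C of an in-place translation followed by transforming one point. Mat34, Vec3, mat_trans_apply() and mat_mul_point() are hypothetical names for this example, not libogc functions; the sketch assumes the translation is simply added to the fourth column of the matrix.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical stand-ins for libogc's Mtx and guVector. */
typedef float Mat34[3][4];
typedef struct { float x, y, z; } Vec3;

/* dst = src with (tx, ty, tz) added to the translation column.
   src and dst may be the same matrix. */
static void mat_trans_apply(Mat34 src, Mat34 dst, float tx, float ty, float tz)
{
    const float t[3] = { tx, ty, tz };

    assert(src && dst);
    for (int r = 0; r < 3; r++) {
        for (int c = 0; c < 3; c++)
            dst[r][c] = src[r][c];      /* keep the rotation/scale part */
        dst[r][3] = src[r][3] + t[r];   /* shift the translation part */
    }
}

/* Transform a point: multiply by the 3x3 part, then add column 3. */
static Vec3 mat_mul_point(Mat34 m, Vec3 p)
{
    Vec3 out = {
        m[0][0]*p.x + m[0][1]*p.y + m[0][2]*p.z + m[0][3],
        m[1][0]*p.x + m[1][1]*p.y + m[1][2]*p.z + m[1][3],
        m[2][0]*p.x + m[2][1]*p.y + m[2][2]*p.z + m[2][3]
    };
    return out;
}
```

Starting from an identity matrix and translating by (1, 0, 0), the vertex at (1, 0, 0) ends up at (2, 0, 0). This is the kind of per-vertex arithmetic the hardware performs with whatever matrix you last loaded into the position matrix slot.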

These operations are largely the same for any other matrix transformation, such as for textures or lighting.

Lighting

Much of the information in this section is taken from gl2gx's project wiki, and some of it is derived from code.

Lighting in GX is different from OpenGL. While OpenGL has diffuse, ambient and specular colors, as well as a

global ambient color, GX only has diffuse light colors. Additionally, OpenGL has diffuse, ambient, specular and

emission material colors, whereas GX has diffuse and ambient.