Archive for Equinox

I Had a Leak!

An image generated with multi-threading

Equinox, my little renderer, is slowly making progress. I had reached a point where I could render a simple shape (a sphere) which had its own transformation matrix. Having a working renderer that supports a pinhole orthographic camera and a sphere, I figured this was the perfect time to try something that I had never done before: multi-threaded programming! I will discuss the multithreading portion of Equinox in another post. For now I would like to cover a nasty little side effect that popped out of nowhere (as usual) and bit me square in the ass!

Equinox was already rendering an array of spheres. Granted, the render times were higher than I would have liked, but at this stage that was expected. I usually concentrate on getting my code to work before I concentrate on optimization. This is a common practice in software development. I was implementing the multi-threading when I saw that using 8 cores did not give the render any significant speedup. I opened the Activity Monitor window to track the processor usage and, to my surprise, the eight worker threads would only pick up the first 8 buckets; afterwards only two or three threads would be performing calculations. No wonder I was not seeing much of a speedup. What strange behavior, I thought. But even more surprising was the memory usage.

As I watched the threads act all weird, I also saw the memory usage skyrocket to 1.6 GB! Something was severely wrong, as there is no way a single EXR image and 20 spheres would use so much memory. It seemed that I had just run into what is known as a memory leak. This is something that I had never really had to deal with in the past. All the scripting languages that I had ever used handle dynamic memory allocation and deallocation for you. Languages such as MaxScript, MEL, HScript, RSL, PHP and Python all do automatic memory management. Even in C++, I had relied on constructors and destructors, which take care of much of the memory management.

As you might have read, Equinox is being developed in C, which means the programmer (that would be me :D ) is 100% in charge of all memory management. So I put on my “CSI” hat and I began to dive into the code, hoping to find where I was leaking memory. After a little bit of digging I was able to track the issue to this function:

void EiProcessBucket(EtBucket bucket, Rgba *px) {
    extern EtWorld world;

    int x,y;
    for (int y = bucket.pos.y; y < bucket.pos.y + bucket.height; y++) {
        for (int x = bucket.pos.x; x < bucket.pos.x + bucket.width; x++){
            EtCameraInput camIn;
            camIn.x = x;
            camIn.y=y;
            camIn.xres=world.film.xres;
            camIn.yres=world.film.yres;
            EtCameraOutput *camOut = EiCameraOutput;
            EtCameraMethods *cammtds = (EtCameraMethods*)world.cameras->mtds;
            cammtds->createRay(world.cameras,camIn,camOut);

            float r,g,b;
            int j;
            g = y / (float)camIn.yres;
            r = x / (float)camIn.xres;
            b = 1;
            for (j = 0; j < buf_len(world.shapes);j++) {
                EtNode *shape = &world.shapes[j];
                EtShapeMethods *mtds = (EtShapeMethods*)shape->mtds;
                if (mtds->intersectP(shape,camOut->I)){
                    r = g = b = 1;
                }
            }
            // (y * camIn.xres) + x makes sure the pixel
            // values are stored in scan lines
            Rgba *p = &px[(y * camIn.xres)+x];
            p->r = r;
            p->g = g;
            p->b = b;
            p->a = 1;
        }
    }
}

I started by commenting out most of the lines in the inner loop. I left only this code inside the loop:

   EtCameraInput camIn;
   camIn.x = x; camIn.y=y;
   camIn.xres=world.film.xres; camIn.yres=world.film.yres;
   EtCameraOutput *camOut = EiCameraOutput;
   EtCameraMethods *cammtds = (EtCameraMethods*)world.cameras->mtds;
   cammtds->createRay(world.cameras,camIn,camOut);

   float r,g,b;
   int j;
   g = y / (float)camIn.yres;
   r = x / (float)camIn.xres;
   b = 1;

I ran the code again and saw that the memory usage had dropped from 1.6 GB to 100 MB. A huge improvement, but still not what it should be. I saw that I was not manually allocating any memory on the heap and that all my variables were on the stack. This seemed to tell me that I didn't really have a memory leak, but that I was consuming way too much memory. I analyzed the code a little longer and spotted what I thought was the problem. For every pixel of the image I was creating a new variable of type EtCameraInput and a new pointer to an EtCameraOutput. This looked bad, so I moved the declarations of those variables outside of the double “for” loop. This greatly reduced memory consumption; however, it still did not feel like the right answer to the problem.

The stack should deallocate “camIn”, “camOut” and “cammtds” once they go out of scope, so this can't be the issue. I looked at the code again and found this line to be quite interesting:

//
//
EtCameraOutput *camOut = EiCameraOutput;
//
//

In an effort to build a framework that would let me write code faster, I had written this preprocessor macro:

//
//
#define EiCameraOutput (EtCameraOutput*)malloc(sizeof(EtCameraOutput))
//
//

So I was allocating memory on the heap for every single pixel and never deallocating it. Once again, I seem to conspire against myself by trying to be a little too smart. Maybe using such shorthand macros is not such a good idea. I rearranged the code into what is listed below, and the memory usage dropped to 3 MB.

void EiProcessBucket(EtBucket bucket, Rgba *px) {
        extern EtWorld world;

        int x,y;
        float r,g,b;
        int j;
        Rgba *p;
        EtCameraOutput *camOut = EiCameraOutput;
        EtCameraMethods *cammtds = (EtCameraMethods*)world.cameras->mtds;
        for (int y = bucket.pos.y; y < bucket.pos.y + bucket.height; y++)
        {
            for (int x = bucket.pos.x; x < bucket.pos.x + bucket.width; x++)
            {
                EtCameraInput camIn;
                camIn.xres=world.film.xres;
                camIn.yres=world.film.yres;
                camIn.x = x; camIn.y=y; 

                g = y / (float)camIn.yres;
                r = x / (float)camIn.xres;
                b = 1;
                ......
                ......
            }
        }
        free(camOut);
}

Next I uncommented the lower part of the loop, the part where the intersections are actually performed, and vroom, memory usage shot up again to 1.5 GB! There was a very serious memory leak in this block of code. After a good amount of digging I found that the code responsible for the leak was the intersectP method of the sphere shape. Inside this function I perform several operations to apply the transformation to the shape. Here is the code that handles the transformation:

    EtPoint pos = EiNodeGetPnt(node,"center");
    EtMatrix m = EiNodeGetMtrx(node,"matrix");
    EtMatrix mi;
    EiMatrixInvert(&m,&mi);
    // create a transform
    EtTransform xf = EiTransform(m);
    // create a transform for the center
    EtTransform xpoint = EiTranslate(pos);
    // Multiply the object transform by the pos
    EiTransformMult(&xf,&xpoint);

I commented out lines of code one by one and realized that the issue was in EiTransformMult. I opened the transform module and found two variables that were dynamically allocated and never deallocated. Here is the old code:

void EiTransformMult(EtTransform *aa, const EtTransform *bb)
{
    EtMatrix *mat = malloc(sizeof(EtMatrix));
    float *mm = (float*)mat;
    float m[16];
    EtTransform *tmp=malloc(sizeof(EtTransform));
    float *a = (float*)&aa->m;
    float *b = (float*)&bb->m;

    ......

    memcpy(&tmp->m,&m,sizeof(float) * 16);
    EtMatrix mi;
    EiMatrixInvert(&tmp->m,&mi);
    memcpy(&tmp->mInv,&mi,sizeof(float)*16);
    memcpy(aa,tmp,sizeof(EtTransform));
}

First of all, one of the allocated pointers (mat) is not even used anymore, so I deleted it. The other pointer, tmp, is never deallocated. I added a free(tmp) at the end of the function and voilà! The renderer's memory usage now stays at a mere 3 MB per render and, as a side effect, using 8 threads now greatly improves the render times. A scene that takes 71 seconds to render on 1 thread takes about 20 seconds with 8 threads. Here is an image rendered with the latest version of Equinox.

So keep an eye open for those malloc()s while programming in C. Remember to always deallocate whatever you allocated, or evil leprechauns will spawn and consume as much memory as they can get their hands on. Oh, and like a good friend told me once, don't try to be too smart or overcomplicate things with C. It is already super simple, which makes it super powerful.

Multiple Objects (and an API)

Moving forward. I made a little more progress on Equinox. One of the things that I disliked the most about renderran was the fact that it was kind of hard to create new scenes, especially for those not familiar with the code. The way the code was written, you could only create objects by making instances of them, knowing where to place them in the code and how to access their parameters. I talked with some more experienced programmers about how I could make scene creation easier. I was thinking about creating a file format that I could parse. My friends recommended that I first create an API and, since I was writing the code in C, I could then create some Python bindings with ctypes. I thought these were great ideas, since I had never written a rendering API and never written Python bindings for C. Even more learning to do!

I have seen in several rendering APIs that there is usually a “begin” statement. RenderMan has RiBegin() and Arnold has AiBegin(), so naturally I wanted Equinox to use EiBegin(). This put me in a spot that I had not been in before. I knew that EiBegin would do a lot of initialization of the defaults necessary for a scene to be rendered, things such as the default resolution and image output. I also knew that I needed a “world” structure: a structure that would hold the data of my rendering world, such as arrays of lights, materials, textures, cameras and objects. I gave other APIs a look and saw that none of them passed a world structure around to every function. This being the case, I realized that I needed the world to be a global variable.

I am sure that at some point I will need other global variables, so I needed to figure out a way to create global variables whenever someone uses the Equinox API. The goal was to be able to do this:

#include "equinox.h"

int main() {
    // Initialize the Equinox rendering environment
    EiBegin();
    ....
}

Getting the global variables to work was a bit tricky. I wanted the global variables to be created automatically, so I created a Globals.h file. This created some issues, mainly because I would get redefined symbols if I included the globals file in any of the source files that would end up in my Equinox library. I did some reading and consulting and figured out that I needed to make sure that Globals.h was included only once, and only in the main file. Everywhere else, all references to my global variables would just need to use the “extern” keyword. The commit for these changes can be found on GitHub. Here is an image generated with the very early API.

And here is the code that generated the image:

#include "equinox.h"

int main() {

  EiBegin();
  // Create the camera
  EtNode cam = EiNode("ortho_camera");

  int i;
  for (i = -250; i <= 250; i+=50) {
    EtNode sph = EiNode("sphere");
    EiNodeSetFlt(&sph,"radius",25.f);
    EiNodeSetFlt(&sph,"zmin",-25.f);
    EiNodeSetFlt(&sph,"zmax",25.f);
    EiNodeSetPnt(&sph,"center",i,0,0);
  }
  // start the renderer
  EiRender();
  return 0;
}

Introducing Equinox

Some time ago I began to write a little ray tracer based on the book “Ray Tracing from the Ground Up”. The book proved to be very easy to follow, and before I knew it I had a little program that could generate images. I continued to push the rendering engine until I was able to get several interesting images. The project was hosted on Google Code and a blog for the raytracer was hosted on Blogspot. Here are some of the images I was able to generate with it.

Eventually I wanted to implement more features in the renderer, but as I went through the code I realized that it was very messy and hard to extend, even though I had used objects (C++) to write the code. I started to read other rendering books and decided to have another go at the renderer. This time I would start from scratch, but I would build a plugin-based system. I would concentrate on building a core rendering engine, a rendering API for describing scenes and a plugin API to extend the renderer as needed.

I began to read on how to implement a plugin architecture and a lot of the information I found mentioned that it is better to implement the plugin part in C and combine the code with C++. I began to talk with several co-workers about how to approach this project. Slavomir Kaslev, a very talented developer at Blizzard Cinematics, convinced me to write the whole system in C. I analyzed his proposal and even though I knew that C++ and objects would make things a bit easier, learning C is something that has been on my list for a long time.

So straight C it is! Boy, what a decision it turned out to be. I was immediately amazed by how much “magic” languages such as C++ and Python perform behind the scenes to make the developer's life easier. I decided to merge different concepts from the different renderers and books I have read. I liked the Arnold C API and its node-based approach. Since my renderer would also be written in C, Arnold proved a good inspiration for the API. Since I don't have source code access to Arnold, and even if I did I would not want to use it (that work belongs to Marcos and Solid Angle), I decided to use PBRT as a base for most of the calculations, but at some point I would try to get away from the “physical” aspect of it and try to develop different integration and rendering methods.

It took me a lot of work, a lot of errors and a good deal of consultation with my C mentors to get to a point where I could generate an image. At the moment I am still working on the plugin API, trying to decide what goes where. In the meantime, here is the first image generated with Equinox.