skip to Main Content

This is a code that outputs in a .ppm file the mandelbrot fractal. How can I optimize this?

#include<bits/stdc++.h>
using namespace std;

int findMandelbrot(double cr, double ci, int max_iterations)
{
    int i = 0;
    double zr = 0.0, zi = 0.0;
    while (i < max_iterations && zr * zr + zi * zi < 4.0)
    {
        double temp = zr * zr - zi * zi + cr;
        zi = 2.0 * zr * zi + ci;
        zr = temp;
        ++i;
    }
    return i;
}

double mapToReal(int x, int imageWidth, double minR, double maxR)
{
    double range = maxR - minR;
    return x * (range / imageWidth) + minR;
}

double mapToImaginary(int y, int imageHeight, double minI, double maxI)
{
    double range = maxI - minI;
    return y * (range / imageHeight) + minI;
}

int main()
{
    ifstream f("input.txt");
    int imageWidth, imageHeight, maxN;
    double minR, maxR, minI, maxI;

    if (!f)
    {
        cout << "Could not open file!" << endl;
        return 1;
    }

    f >> imageWidth >> imageHeight >> maxN;
    f >> minR >> maxR >> minI >> maxI;

    ofstream g("output_image.ppm");
    g << "P3" << endl;
    g << imageWidth << " " << imageHeight << endl;
    g << "255" << endl;


    double start = clock();

    for (int i = 0; i < imageHeight; i++)
    {
        for (int j = 0; j < imageWidth; j++)
        {
            double cr = mapToReal(j, imageWidth, minR, maxR);
            double ci = mapToImaginary(i, imageHeight, minI, maxI);

            int n = findMandelbrot(cr, ci, maxN);

            int r = ((int)sqrt(n) % 256);
            int gr = (2*n % 256);
            int b = (n % 256);

            g << r << " " << gr << " " << b << " ";
        }
        g << endl;

        if(i == imageHeight / 2) break;
    }

    cout << "Finished!" << endl;

    double stop = clock();

    cout << (stop-start)/CLOCKS_PER_SEC;
    return 0;
}

I go until imageHeight / 2 because in a photoshop I can just copy the other half.
I was thinking about loghartimic power but tried something and only works with integers…

3

Answers


  1. So this is the hot loop:

    int i = 0;
    double zr = 0.0, zi = 0.0;
    while (i < max_iterations && zr * zr + zi * zi < 4.0)
    {
        double temp = zr * zr - zi * zi + cr;
        zi = 2.0 * zr * zi + ci;
        zr = temp;
        ++i;
    }
    return i;
    

    I know how to implement non-integer power in fast CPU instructions but it won’t get you out of a bind as it doesn’t work for complex numbers at all. Nor will using std::complex help at all. You won’t gain anything to pay for the non-inlining and certainly can’t apply optimizations as you find them. So the best I can do is this:

    int i = max_iterations;
    double zr = 0.0, zi = 0.0;
    do {
        double temp = zr * zr - zi * zi + cr;
        zi = 2.0 * zr * zi + ci;
        zr = temp;
    } while (--i && zr * zr + zi * zi < 4.0)
    return max_iterations - i;
    

    Yes I know taking one integer test out of the loop didn’t buy much over all that. I only found one other optimizer and you’ll have to check if its really faster:

    int i = max_iterations;
    double zr = 0.0, zi = 0.0;
    do {
        double tempr = zr * zr - zi * zi + cr;
        double tempi = zr * zi;
        zi = tempi + tempi + ci;
        zr = tempr;
    } while (--i && zr * zr + zi * zi < 4.0);
    return max_iterations - i;
    

    That’s all there is folks.

    Login or Signup to reply.
  2. In findMandelbrot you use the expressions zr * zr and zi * zi in the loop test but then recalculate the same two expressions if the test succeeds. So one obvious improvement might be to cache those with something like…

    int findMandelbrot (double cr, double ci, int max_iterations)
    {
      int i = 0;
      double zr = 0.0, zi = 0.0;
      double zr2 = 0.0, zi2 = 0.0;
      while (i < max_iterations && zr2 + zi2 < 4.0) {
        double temp = zr2 - zi2 + cr;
        zi = 2.0 * zr * zi + ci;
        zr = temp;
        zr2 = zr * zr;
        zi2 = zi * zi;
        ++i;
      }
      return(i - 1);
    }
    
    Login or Signup to reply.
  3. There are many ways to optimize the mandelbrot fractal.

    One way is to optimize the code for your CPU or even your GPU. An impressive speedup is shown in: Mandelbrot with SSE, AVX and OpenCL. That optimizes the inner loop by nearly a factor of 1000. That’s 3 orders of magnitude faster.

    But there are other ways to optimize. The first you have already mentioned: The mandelbrot set is mirrored at y=0. So you only need half of it. There are some more trivial ways to avoid running the inner loop. If you scroll way down on Wikipedia on Mandelbrot to Optimizations you see “Cardioid / bulb checking”. That is a simple check for points in the apple shaped main part or the circle directly left of it. For a poster that can cover a lot of points.

    Another speedup I’ve seen uses the distance estimation (also on Wikipedia) method when generating previews or just the outline of the mandelbrot set in black&white. Random points in the image are computed and if outside the mandelbrot set a circle is drawn with radius given by the distance estimation method. Anything in that circle is not part of the mandelbrot set. This can quickly cover lots of points outside the mandelbrot set.

    Then there are methods of approximating results that might not result in a perfect image but usually are good enough. For example:

    • boundary tracing

    Calculate points following the boundary where pixels take N and N+1 iterations to escape. Same for N+1 and N+2. The everything between those two boundaries takes N iterations.

    • divide and conquer boxes

    Calculate the border of a rectangle (start with the whole image). If the border all takes N iterations then the inside of the rectangle takes N iterations and is filled in. Otherwise split the rectangle in 4 and repeat for each. The result might be improved by calculating a 2 or 3 pixel wide border but then less is saved.

    • guessing

    Calculate the image at a low resolution and then double the resolution keeping the calculated points. Go over the image and if 5×5 of the original points have the same N then fill a rectangle around the inner 3×3 points (works with 3×3, 7×7, 9×9, … points too). Points not filled in you calculate. Then repeat the whole thing till you git the final resolution.

    • single orbital iterations

    This is the hardest to get right and I’ve only ever seen one implementation of it. The idea is that points close together behave the same in the iteration. If you iterate for a grid of 3×3 points (covering the whole image to start with) you can interpolate the values for points in between using newton interpolation. That works quite well — until it doesn’t.

    So besides the 3×3 points grid you also iterate 4 test points, the middle of each of the grid cells. Do 8 iteration for those 13 points and then interpolate the 4 test points from the grid points. If the calculates and interpolated results drift apart too much (this is the hard part) you throw away the last 8 iterations and subdivide the grid in 4 half sized grids. The missing points you interpolate. Repeat till you reach the final resolution.

    The potential benefits are extreme even if it only works for a few iterations. Say you want your 40000 x 40000 pixel image. If SOI works for 10 loops (80 iterations) before the first subdivision then you saves 80 * 40000 * 40000 = 128_000_000_000 iteration by calculating 1040 and some interpolations and checks. Or a speedup of 123_076_923, 8 orders of magnitudes. But only for the first 80 iterations. As the grid is divided more and more the speedup becomes less. But every bit saved adds up.

    The great thing about this method is that is works for smooth/continious coloring or mapping to heights. The other methods only work with mapping iterations to color bands.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search