Friday, January 30, 2015

Euler Zero: Problem 1

***SPOILER ALERT***
Solution to a Project Euler problem follows:
If you have not already done so, read this post before continuing. If you are working through the Project Euler problems and wish to solve them on your own (in C++), you might want to stop reading now if you have any self-respect. You've been given fair notice.
[euler zero answer format]

Problem 001
The first problem is simple: calculate the sum of all multiples of 3 or 5 less than 1000. Time for some template metaprogramming, as this can be solved in a fairly straightforward way with simple template recursion. The first thing to do is calculate the correct answer, for verification of the minimal runtime solution. For a problem this basic, it's simply:

euler001-poc.cpp:

#include <iostream>

// using a const int here instead of hard-coding in the for loop
// so you can change it to 10 and verify against the example on
// the project page (or change it to whatever if Project Euler 
// changes the parameters of the problem).
const int limit = 1000; 

int main()
{
    int sum = 0;
    for (int i = 3; i < limit; i++){
        if ( ((i % 3) == 0) || ((i % 5) == 0)){
            sum += i;
        }
    }
    std::cout << "Sum = " << sum << "\n";

    return 0;
}

Not surprisingly, the template-based solution is also fairly simple:

euler001-opt.cpp:

#include <iostream>

const int term  = 2;     // counting backwards from limit-1 to 3, inclusive
const int limit = 1000;  // same reason as proof-of-correctness version

template<int i>
struct PrevMultiple {
    enum { sum = (((i % 3) == 0) ||  ((i % 5) == 0) ? i : 0) 
               + (PrevMultiple<i - 1>::sum)
         };
};

template<>
struct PrevMultiple<term> {
    enum { sum = 0 };
};

int main()
{
    std::cout << "Sum = " << PrevMultiple<limit-1>::sum << "\n";
    return 0;
}

Instead of using a loop, we count backwards by recursing until we reach the explicit specialization for 2 (which does not itself recurse, thereby ending the cycle). Alternatively, we could have counted up to 1000 instead.

One interesting thing that came up in this problem was that Visual C++ can not compile it because it has a hard-coded limit to the depth of template recursion. You have to use gcc to compile it, with:

  gcc -std=c++03 -Wall -pedantic -ftemplate-depth=1000 -lstdc++ -o euler001-opt euler001-opt.cpp

Note that it doesn't even require C++11, although even using gcc the -ftemplate-depth=1000 is the key to getting this to compile without hitting a default limit on template recursion (VC++ has no such option). To see the actual runtime cost, you can look at the generated assembly code with:

  gcc -std=c++03 -Wall -pedantic -ftemplate-depth=1000 -S -masm=intel euler001-opt.cpp

This shows us that main() looks like this (the assembly source will be in euler001-opt.s):

    main:
    .LFB1196:
     .cfi_startproc
     push rbp
     .cfi_def_cfa_offset 16
     .cfi_offset 6, -16
     mov rbp, rsp
     .cfi_def_cfa_register 6
     mov esi, OFFSET FLAT:.LC0
     mov edi, OFFSET FLAT:_ZSt4cout
     call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
     mov esi, 233168
     mov rdi, rax
     call _ZNSolsEi
     mov esi, OFFSET FLAT:.LC1
     mov rdi, rax
     call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
     mov eax, 0
     pop rbp
     .cfi_def_cfa 7, 8
     ret
     .cfi_endproc
As you can see from the line in blue, the template recursion results in an integer literal (a constant) that the compiler has been forced to calculate for us at compile time. The rest of the assembly code is fluff related to using std::cout to display the result.

Total runtime cost: zero