Saturday, January 31, 2015

GCC 5 and template recursion limits

I found this part of the changes slated for the GCC 5 release series mildly interesting:
Excessive template instantiation depth is now a fatal error. This prevents excessive diagnostics that usually do not help to identify the problem.
Although as of earlier this month GCC 5 feature development is now over, I have not yet used or even built it yet so I'm curious about what this means as it seemed to be a fatal error already. For example, compiling a source file with a known template recursion depth of 1000 without the -ftemplate-depth option results in:
 $ gcc -std=c++11 -Wall -pedantic -lstdc++ deep.cpp
 deep.cpp:14:10: error: template instantiation depth exceeds maximum of 900 (use -ftemplate-depth= to increase the maximum) instantiating ‘struct PrevMultiple<99>’
 deep.cpp:14:10:   recursively required from ‘struct PrevMultiple<998>’
 deep.cpp:14:10:   required from ‘struct PrevMultiple<999>’
 deep.cpp:26:51:   required from here

 deep.cpp:14:10: error: incomplete type ‘PrevMultiple<99>’ used in nested name specifier
Note that the error occurred (and compilation stopped) while there were still 100 recursions remaining. No attempt was made to descend into the remaining instantiations (as VC++ will try to do, complaining the entire time), so how is that not a fatal error already? And what is "excessive" about those diagnostics? They not only tell you exactly what the problem is, they also tell you how to fix it if it's not an accidental runaway recursive expansion.

I can't imagine that '-ftemplate-depth' is going away, so what does that change even mean? A cryptic error message more in line with other gcc messages will replace the perfectly clear and helpful diagnostics currently displayed?

Friday, January 30, 2015

Euler Zero: Problem 1

***SPOILER ALERT***
Solution to a Project Euler problem follows:
If you have not already done so, read this post before continuing. If you are working through the Project Euler problems and wish to solve them on your own (in C++), you might want to stop reading now if you have any self-respect. You've been given fair notice.
[euler zero answer format]

Problem 001
The first problem is simple: calculate the sum of all multiples of 3 or 5 less than 1000. Time for some template metaprogramming, as this can be solved in a fairly straightforward way with simple template recursion. The first thing to do is calculate the correct answer, for verification of the minimal runtime solution. For a problem this basic, it's simply:

euler001-poc.cpp:

#include <iostream>

// using a const int here instead of hard-coding in the for loop
// so you can change it to 10 and verify against the example on
// the project page (or change it to whatever if Project Euler 
// changes the parameters of the problem).
const int limit = 1000; 

int main()
{
    int sum = 0;
    for (int i = 3; i < limit; i++){
        if ( ((i % 3) == 0) || ((i % 5) == 0)){
            sum += i;
        }
    }
    std::cout << "Sum = " << sum << "\n";

    return 0;
}

Not surprisingly, the template-based solution is also fairly simple:

euler001-opt.cpp:

#include <iostream>

const int term  = 2;     // counting backwards from limit-1 to 3, inclusive
const int limit = 1000;  // same reason as proof-of-correctness version

template<int i>
struct PrevMultiple {
    enum { sum = (((i % 3) == 0) ||  ((i % 5) == 0) ? i : 0) 
               + (PrevMultiple<i - 1>::sum)
         };
};

template<>
struct PrevMultiple<term> {
    enum { sum = 0 };
};

int main()
{
    std::cout << "Sum = " << PrevMultiple<limit-1>::sum << "\n";
    return 0;
}

Instead of using a loop, we count backwards by recursing until we reach the explicit specialization for 2 (which does not itself recurse, thereby ending the cycle). Alternatively, we could have counted up to 1000 instead.

One interesting thing that came up in this problem was that Visual C++ can not compile it because it has a hard-coded limit to the depth of template recursion. You have to use gcc to compile it, with:

  gcc -std=c++03 -Wall -pedantic -ftemplate-depth=1000 -lstdc++ -o euler001-opt euler001-opt.cpp

Note that it doesn't even require C++11, although even using gcc the -ftemplate-depth=1000 is the key to getting this to compile without hitting a default limit on template recursion (VC++ has no such option). To see the actual runtime cost, you can look at the generated assembly code with:

  gcc -std=c++03 -Wall -pedantic -ftemplate-depth=1000 -S -masm=intel euler001-opt.cpp

This shows us that main() looks like this (the assembly source will be in euler001-opt.s):

    main:
    .LFB1196:
     .cfi_startproc
     push rbp
     .cfi_def_cfa_offset 16
     .cfi_offset 6, -16
     mov rbp, rsp
     .cfi_def_cfa_register 6
     mov esi, OFFSET FLAT:.LC0
     mov edi, OFFSET FLAT:_ZSt4cout
     call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
     mov esi, 233168
     mov rdi, rax
     call _ZNSolsEi
     mov esi, OFFSET FLAT:.LC1
     mov rdi, rax
     call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
     mov eax, 0
     pop rbp
     .cfi_def_cfa 7, 8
     ret
     .cfi_endproc
As you can see from the line in blue, the template recursion results in an integer literal (a constant) that the compiler has been forced to calculate for us at compile time. The rest of the assembly code is fluff related to using std::cout to display the result.

Total runtime cost: zero

Euler Zero: format of answers

If you didn't already read it, see Project Euler, Part II for background, as this post is basically just a continuation that summarizes what my approach will be for each problem.

Each solution will have two variants - a proof-of-correctness version ("POC"), and the zero-runtime solution ("OPT") we can check against it. I do not plan on spending any time whatsoever trying to optimize the proof version. I do not care if the POC version is slow - the goal is simply to obtain a correct answer we can use as a check on the result of the optimal version, so I am not going to spend any time trying for a faster or more "elegant" solution. I'd rather aim for as much simplicity as possible, as I anticipate a zero-runtime solution for some of the later problems could get a bit... complex. If the zero-runtime solution uses a somewhat suboptimal method to generate the result, it doesn't even matter because everything is calculated before the program is ever run.

Thursday, January 29, 2015

Project Euler, Part II

As I said in my first Project Euler post, my goal would be not to just develop a correct solution to each problem, but one with the absolute minimum possible run-time cost. Well, the absolute minimum time for anything is zero.

But before I get to that:

After doing the first problem, I had a dilemma. The Project Euler web site asks that people not post the correct answer (or their solution), so that others may enjoy the challenge (paraphrasing). While on the one hand I understand (and respect) their request, on the other hand there are entire sites dedicated to publishing solutions and answers for Project Euler problems.

In the end, I decided the cat was already out of the bag, so I'll be posting my solutions. If someone genuinely wants to solve a given Project Euler problem on their own, they won't just google for the answer anyway. I will be prefacing each post that contains a solution to a Project Euler problem with a spoiler alert notice, so anyone actually working through the problems on their own shouldn't have anything ruined for them.

All that being said, I'll reiterate that the theoretical minimum run-time for anything is zero. So that's the goal: a zero-run-time solution to each problem from Project Euler.

Welcome to Euler Zero. Posts will be labeled accordingly.

Code should never be complicated.

Solutions or designs should never be complicated, either. Problems can be complicated. Designs, code and solutions might certainly be complex, but they should never be complicated. The hard part is making the complex simple.

Monday, January 26, 2015

Sunday, January 25, 2015

VC++ template recursion limits

I ran into the Visual C++ template recursion limit for nested templates yesterday. In VS2012 it's apparently capped at 500, with no way to adjust it (a least none that I can find). With gcc, at least the -ftemplate-depth option is available. Why, Microsoft, why? If you're going to put a cap on it, especially a hard cap that can't be adjusted, couldn't you at least conform to the recommended minimum (1024) from Annex B of the Standard?

Thursday, January 22, 2015

Project Euler

I just looked at Project Euler. I've noticed the occasional question on StackOverflow with "Project Euler" in the subject, and always kind of blew them off because so many seemed to have the same questions. I just assumed (yeah, yeah, never a good idea) they were more of the annoying "help me with my homework" or "what is teh codez to fix" types of questions that sometimes flood Stack Overflow, so I pretty much ignored them. I never looked at the project-euler tag-wiki for PE, so I thought it was just some crappy development toolkit or project that attracted a lot of stupid people that all had to solve the same problem. Again. And again.

So anyway, I was talking to a former colleague the other day and he mentioned PE because he knows that meaningless programming puzzles are like Sudoku to me. It turns out that although many of the problems are pretty easy, just as many could be fairly interesting (and some seem quite difficult). Especially if your goal is for more than just writing a correct solution: what is the absolute minimum run-time cost for a correct solution to each problem?

This could be fun.

Sunday, January 11, 2015

When life gives you feces... make feces-ade.

This Ingenious Machine Turns Feces Into Drinking Water

You can give Bill Gates all the crap you want to (literally, in this case), but the Bill & Melinda Gates Foundation does a lot of really good things.

#include <noextension>

Once again I am annoyed by the C++ Standard committee's decision to omit file extensions for headers in the Standard Library. With all due respect to the committee members, in the real world people occasionally have to search for files, perform operations on groups of files, or associate files with tools.