This is a thought exercise, not a particular problem, but I'd like to hear your opinion. Suppose I have some matrix expression DSL using templates (Eigen, ublas, etc.).
Now suppose I have some matrices which are constants, for example:
Matrix2 sigma1 = {{0,1}, {1,0}};
Matrix2 sigma2 = {{0,i}, {-i,0}};
... etc
... and I have some operations with those matrices involving runtime values:
a*sigma1 + b*sigma2; // a,b runtime
What ideas do you have to implement constant matrices such that compiler can optimize the expressions the most? In particular, how do I resolve (i,j)
operator to a constant?
a
or b
from your code are Nan
or inf
. Gcc has an option (-ffinite-math-only) which (I think) can solve this issue - sbabbi 2012-04-12 00:35
From what I understand of the problem space: given a domain-specific language, we want to determine the minimal transformation such that some operator over the data (for example, (i,j)
), results in a lookup of a constant instead of the computation of mathematical formulae (for example, a*sigma1 + b*sigma2
).
Let's explore some of the possibilities here:
Performing the mathematical operations directly
The 0th-level implementation. Given no explicit optimization by your compiler, what happens if we execute instructions directly?
The answer is it depends. But, on most processors, your code execution will fall onto the CPU cache, where your assembly and branch executions will be optimized to the best of your cores' abilities. The guts of that process are actually really cool, but we'll assume you want to move beyond these capabilities and address the constantification of your operation directly in your code.
Bound and capture the space using a compiler-compiler
The first-order optimization would be to bound and capture the possible input and output space using a compiler-compiler. While this will address your input range effectively, limiting it only to the set of inputs and outputs that you desire, it does nothing for performance otherwise. So, we must move on.
Stringification and Macro Expansion
The second-order optimization would be to perform string or macro expansion of your value space directly. While this is fraught with corner-cases and surprising implementation-level tar pits, it can be done directly by your compiler if the operation is desired. (See also: loop unwinding)
Manual derivation and stack-bound satisfiability of the closed-form expression
(using, for example, a lookup table)
Finally, our third-order optimization would be to bound your space directly. This requires you to have a well-defined closed-form relation with a bounded input and output space to work effectively. If this relation cannot be determined or carries no bounds, you're out of luck and should consider keeping your current implementation if a better one is not known to exist.
Of these optimization techniques, the most applicable to linear algebraic operations are the latter two, given the problem bounds that you've described. Because most operations, such as a matrix translation, rotation, and scale operation are deterministic in nature, you can indeed optimize and bound your space effectively.
For a more theoretical answer, I recommend consulting http://cs.stackexchange.com, http://cstheory.stackexchange.com, and http://math.stackexchange.com. Both have many, many threads devoted to the decidability and proof of closed-form, polynomial solutions for entire classes of equations.
Ok, well this is hideous, but related to my comment on the original post.
Using this structure it should be possible to define the relevant operations you need, but that will be a lot of work to write all the appropriate specializations. You may also prefer your rows/columns to be swapped.
Defining the matrices at the end certainly isn't as elegant as in your original post, but perhaps this could be improved, especially with the use of 'auto' in C++11.
//-----------------------------------------------------------------------------
struct Unused {};
struct Imaginary {
Imaginary() {}
Imaginary(Unused const& unused) {}
};
struct MinusImaginary {
MinusImaginary() {}
MinusImaginary(Unused const& unused) {}
};
//-----------------------------------------------------------------------------
template <int I, int F = 0>
struct Fixed {
Fixed() {}
Fixed(Unused const& unused) {}
};
//-----------------------------------------------------------------------------
struct Float
{
Float(float value) : value_(value) {}
const float value_;
};
//-----------------------------------------------------------------------------
template <typename COL0, typename COL1>
struct Vector2
{
typedef COL0 col0_t;
typedef COL1 col1_t;
template <typename T0, typename T1>
Vector2(T0 const& t0, T1 const& t1)
: col0_(t0)
, col1_(t1)
{}
COL0 col0_;
COL1 col1_;
};
//-----------------------------------------------------------------------------
template <typename ROW0, typename ROW1>
struct Matrix2
{
typedef ROW0 row0_t;
typedef ROW1 row1_t;
Matrix2()
: row0_(Unused(), Unused())
, row1_(Unused(), Unused())
{
}
template <typename M00, typename M01, typename M10, typename M11>
Matrix2(M00 const& m00, M01 const& m01, M10 const& m10, M11 const& m11)
: row0_(m00, m01)
, row1_(m10, m11)
{
}
ROW0 row0_;
ROW1 row1_;
};
//-----------------------------------------------------------------------------
Matrix2<
Vector2< Fixed<0>, Fixed<1> >,
Vector2< Fixed<1>, Fixed<0> >
> sigma1;
const float f = 0.1f;
//-----------------------------------------------------------------------------
Matrix2<
Vector2< Fixed<0>, Imaginary >,
Vector2< MinusImaginary, Fixed<0> >
> sigma2;
//-----------------------------------------------------------------------------
Matrix2<
Vector2< Fixed<0>, Float >,
Vector2< Float, Fixed<0> >
> m3(Unused(), 0.2f,
0.8f, Unused());
// EDIT: Nicer initialization syntax in c++11
//-----------------------------------------------------------------------------
template <typename M00, typename M01, typename M10, typename M11>
Matrix2< Vector2<M00, M01>, Vector2<M10, M11> >
MakeMatrix(M00 const& m00, M01 const& m01, M10 const& m10, M11 const& m11)
{
return Matrix2< Vector2<M00, M01>, Vector2<M10, M11> >(m00,m01,m10,m11);
}
auto m4 = MakeMatrix(Fixed<0>(), Float(0.2f),
Float(0.8f), Fixed<0>() );