FastMath - Fast Math Library for Delphi
FastMath is a Delphi math library that is optimized for fast performance (sometimes at the cost of not performing error checking or losing a little accuracy). It uses hand-optimized assembly code to achieve much better performance than the equivalent functions provided by the Delphi RTL.
This makes FastMath ideal for high-performance, math-intensive applications such as multimedia applications and games. For even better performance, the library provides a variety of "approximate" functions (which all start with a Fast prefix). These can be very fast, but you will lose some (sometimes surprisingly little) accuracy. For gaming and animation, this loss in accuracy is usually perfectly acceptable and outweighed by the increase in speed. Don't use them for scientific calculations, though.
You may want to call DisableFloatingPointExceptions at application startup to suppress any floating-point exceptions. Operations that cannot be performed will then return extreme values (like NaN or Infinity) instead of raising an exception. If you use FastMath in multiple threads, you should also call DisableFloatingPointExceptions in the Execute method of each of those threads.
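A minimal sketch of where these calls might go. It assumes the library's unit is named Neslib.FastMath (the unit name is not stated in this document), and TWorkerThread is a hypothetical thread class used only for illustration.

```delphi
program FloatExceptionsDemo;

{$APPTYPE CONSOLE}

uses
  System.Classes,
  Neslib.FastMath; // assumed unit name; adjust to match your installation

type
  // Hypothetical worker thread, for illustration only
  TWorkerThread = class(TThread)
  protected
    procedure Execute; override;
  end;

procedure TWorkerThread.Execute;
begin
  // Each thread that uses FastMath makes this call itself
  DisableFloatingPointExceptions;
  // ... math-intensive work goes here ...
end;

var
  Worker: TWorkerThread;
begin
  // Once at application startup, for the main thread
  DisableFloatingPointExceptions;

  Worker := TWorkerThread.Create(False); // False = start running immediately
  try
    Worker.WaitFor;
  finally
    Worker.Free;
  end;
end.
```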
Superior Performance
Most operations can be performed on both single values (scalars) and vectors (consisting of 2, 3 or 4 values). SIMD-optimized assembly code is used to calculate multiple outputs at the same time. For example, adding two 4-value vectors together is almost as fast as adding two single values together, resulting in a nearly 4-fold speed increase. Many functions are written in such a way that the gain is even larger. You will find a lot of functions that are 10 or more times faster than their Delphi counterparts.
On 32-bit and 64-bit desktop platforms (Windows and OS X), this performance is achieved by using the SSE2 instruction set. This means that the computer must support SSE2. However, since SSE2 was introduced back in 2001, the vast majority of computers in use today will support it. All 64-bit desktop computers have SSE2 support by default. However, you can always compile this library with the FM_NOSIMD define to disable SIMD optimization and use plain Pascal versions. This can also be useful to compare the speed of the Pascal versions with the SIMD optimized versions.
On 32-bit mobile platforms (iOS and Android), the NEON instruction set is used for SIMD optimization. This means that your device needs to support NEON. But since Delphi already requires this, this poses no further restrictions.
On 64-bit mobile platforms (iOS), the Arm64/AArch64 SIMD instruction set is used.
Architecture and Design Decisions
FastMath operates on single-precision floating-point values only. Double-precision floating-point arithmetic is (currently) unsupported.
Most functions operate on single values (of type Single) and 2-, 3- and 4-dimensional vectors (of types TVector2, TVector3 and TVector4 respectively). Vectors are not only used to represent points or directions in space, but can also be regarded as arrays of 2, 3 or 4 values that can be used to perform calculations in parallel. In addition to floating-point vectors, there are also vectors that operate on integer values (TIVector2, TIVector3 and TIVector4).
There is also support for 2x2, 3x3 and 4x4 matrices (called TMatrix2, TMatrix3 and TMatrix4). By default, matrices are stored in row-major order, like those in the RTL's System.Math.Vectors unit. However, you can change this layout with the FM_COLUMN_MAJOR define. This will store matrices in column-major order instead, which is useful for OpenGL applications (which work best with this layout). In addition, this define will also clip the depth of camera matrices to -1..1 instead of the default 0..1. Again, this is more in line with the default for OpenGL applications.
For representing rotations in 3D space, there is also a TQuaternion, which is similar to the RTL's TQuaternion3D type.
The operation of the library is somewhat inspired by shader languages (such as GLSL and HLSL). In those languages you can also treat single values and vectors similarly. For example, you can use the Sin function to calculate a single sine value, but you can also use it with a TVector4 type to calculate 4 sine values in one call. When combined with the "approximate" Fast* functions, this can result in an enormous performance boost. For example, using FastSin(TVector4) to calculate 4 sine values in parallel is up to 40 times faster than 4 separate calls to Sin(Single). On the extreme end, calling FastExp2(TVector4) is up to 300 times faster than 4 separate Exp2(Single) calls.
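The following sketch shows what calling Sin and FastSin on a TVector4 might look like. The unit name Neslib.FastMath and the Vector4 helper function are assumptions, since neither is spelled out in this document. No expected output is shown because the accuracy of FastSin is not specified here.

```delphi
program VectorSinDemo;

{$APPTYPE CONSOLE}

uses
  Neslib.FastMath; // assumed unit name

var
  Angles, Exact, Approx: TVector4;
begin
  DisableFloatingPointExceptions;

  // Four angles (in radians) packed into one vector.
  // Vector4 is assumed to be one of the vector helper functions.
  Angles := Vector4(0.0, 0.5, 1.0, 1.5);

  // One call computes all four sines
  Exact := Sin(Angles);

  // Approximate variant: faster, at the cost of some accuracy
  Approx := FastSin(Angles);

  WriteLn('Sin:     ', Exact.X:0:6, ' ', Exact.Y:0:6, ' ',
    Exact.Z:0:6, ' ', Exact.W:0:6);
  WriteLn('FastSin: ', Approx.X:0:6, ' ', Approx.Y:0:6, ' ',
    Approx.Z:0:6, ' ', Approx.W:0:6);
end.
```

Printing both results side by side is a quick way to judge whether the approximation is accurate enough for a given use.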
Overloaded Operators
All vector and matrix types support overloaded operators which allow you to negate, add, subtract, multiply and divide scalars, vectors and matrices. There are also overloaded operators that compare vectors and matrices for equality. These operators check for "exact" matches (like Delphi's "=" operator). They don't allow for very small variations (like Delphi's SameValue functions).
The arithmetic operators "+", "-", "*" and "/" usually work component-wise when applied to vectors. For example, if A and B are of type TVector4, then C := A * B will set C to (A.X * B.X, A.Y * B.Y, A.Z * B.Z, A.W * B.W). It will not perform a dot or cross product (you can use the Dot and Cross functions to compute those).
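To make the difference concrete, here is a small sketch that contrasts the component-wise "*" operator with a dot product. The unit name Neslib.FastMath and the Vector4 helper are assumptions, as is the choice to call Dot as a method on the vector. The values in the comments follow from the arithmetic described above.

```delphi
program ComponentWiseDemo;

{$APPTYPE CONSOLE}

uses
  Neslib.FastMath; // assumed unit name

var
  A, B, C: TVector4;
  D: Single;
begin
  A := Vector4(1, 2, 3, 4); // Vector4: assumed helper function
  B := Vector4(5, 6, 7, 8);

  // Component-wise product: (1*5, 2*6, 3*7, 4*8) = (5, 12, 21, 32)
  C := A * B;
  WriteLn(C.X:0:1, ' ', C.Y:0:1, ' ', C.Z:0:1, ' ', C.W:0:1);

  // Dot product: 1*5 + 2*6 + 3*7 + 4*8 = 70
  // (assumed here to be available as a method on the vector)
  D := A.Dot(B);
  WriteLn(D:0:1);
end.
```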
For matrices, the "+" and "-" operators also operate component-wise. However, when you multiply (or divide) matrices by vectors or other matrices, the usual linear algebraic multiplication (or division) is used. For example:
M := M1 * M2 performs a linear algebraic matrix multiplication
V := M1 * V1 performs a matrix * row vector linear algebraic multiplication
V := V1 * M1 performs a column vector * matrix linear algebraic multiplication
To multiply matrices component-wise, you can use the CompMult method.
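The sketch below contrasts the two kinds of matrix product. CompMult comes from the text above. The unit name Neslib.FastMath, the Vector4 helper, and the InitScaling and InitTranslation initializers are assumptions made for illustration, so check them against the library's reference before relying on them. No expected output is given because the result depends on the matrix layout in use.

```delphi
program MatrixProductDemo;

{$APPTYPE CONSOLE}

uses
  Neslib.FastMath; // assumed unit name

var
  Scale, Move, Product, PerComponent: TMatrix4;
  V: TVector4;
begin
  // Assumed initializers: a uniform scale and a translation
  Scale.InitScaling(2);
  Move.InitTranslation(1, 2, 3);

  // Linear algebraic matrix multiplication
  Product := Scale * Move;

  // Component-wise multiplication: each element multiplied by its counterpart
  PerComponent := Scale.CompMult(Move);

  // Matrix * vector uses the linear algebraic product as well
  V := Product * Vector4(1, 1, 1, 1);
  WriteLn(V.X:0:3, ' ', V.Y:0:3, ' ', V.Z:0:3, ' ', V.W:0:3);
end.
```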
Interoperability with the Delphi RTL
FastMath provides its own vector and matrix types for superior performance. Most of them are equivalent in functionality and data storage to the Delphi RTL types. You can typecast between them or implicitly convert from the FastMath type to the RTL type or vice versa (e.g. MyVector2 := MyPointF). The following table shows the mapping:
Functions
Below you will find a categorized list of the global functions supported by FastMath:
Helper functions for creating vectors and matrices
Angle and Trigonometry Functions
Radians: converts degrees to radians
Degrees: converts radians to degrees
Sin: calculates a sine of an angle
Cos: calculates a cosine of an angle
SinCos: calculates a sine/cosine pair
Tan: calculates the tangent of an angle
ArcSin: calculates an arc sine
ArcCos: calculates an arc cosine
ArcTan: calculates an arc tangent
ArcTan2: calculates an arctangent angle and quadrant
Sinh: calculates a hyperbolic sine
Cosh: calculates a hyperbolic cosine
Tanh: calculates a hyperbolic tangent
ArcSinh: calculates an inverse hyperbolic sine
ArcCosh: calculates an inverse hyperbolic cosine
ArcTanh: calculates an inverse hyperbolic tangent
Exponential Functions
Power: raises a base to a power
Exp: calculates a natural exponentiation (that is, e raised to a given power)
Ln: calculates a natural logarithm
Exp2: calculates 2 raised to a power
Log2: calculates a base 2 logarithm
Sqrt: calculates a square root
InverseSqrt: calculates an inverse square root
Fast Approximate Functions
Common Functions
Abs: calculates an absolute value
Sign: calculates the sign of a value
Floor: rounds a value towards negative infinity
Trunc: rounds a value towards 0
Round: rounds a value towards its nearest integer
Ceil: rounds a value towards positive infinity
Frac: returns the fractional part of a number
FMod: calculates the remainder of a floating-point division
ModF: splits a floating-point value into its integer and fractional parts
Min: calculates the minimum of two values
Max: calculates the maximum of two values
EnsureRange: clamps a given value into a range
Mix: calculates a linear blend between two values, using a progress value
Step: step function
SmoothStep: performs smooth Hermite interpolation between 0 and 1
FMA: Fused Multiply and Add
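For reference, the blending functions in this list can be sketched in plain Pascal as follows. These are not FastMath's implementations: the Ref* names are hypothetical, and the formulas and parameter order follow the GLSL definitions of mix, step and smoothstep, on the assumption that FastMath mirrors them.

```delphi
program BlendReference;

{$APPTYPE CONSOLE}

// Linear blend: returns A when T = 0 and B when T = 1
function RefMix(const A, B, T: Single): Single;
begin
  Result := A + T * (B - A);
end;

// Step function: 0 below the edge, 1 at or above it
function RefStep(const Edge, X: Single): Single;
begin
  if X < Edge then
    Result := 0
  else
    Result := 1;
end;

// Smooth Hermite interpolation between 0 and 1 as X moves from Edge0 to Edge1
function RefSmoothStep(const Edge0, Edge1, X: Single): Single;
var
  T: Single;
begin
  // Normalize X to the 0..1 range and clamp it
  T := (X - Edge0) / (Edge1 - Edge0);
  if T < 0 then
    T := 0
  else if T > 1 then
    T := 1;
  // Hermite polynomial 3t^2 - 2t^3
  Result := T * T * (3 - 2 * T);
end;

begin
  WriteLn(RefMix(10, 20, 0.25):0:3);     // 12.500
  WriteLn(RefStep(0.5, 0.75):0:3);       // 1.000
  WriteLn(RefSmoothStep(0, 1, 0.5):0:3); // 0.500
end.
```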
Matrix Functions
Configuration Functions