FastMath - Fast Math Library for Delphi

FastMath is a Delphi math library that is optimized for fast performance (sometimes at the cost of not performing error checking or losing a little accuracy). It uses hand-optimized assembly code to achieve much better performance then the equivalent functions provided by the Delphi RTL.

This makes FastMath ideal for high-performance math-intensive applications such as multi-media applications and games. For even better performance, the library provides a variety of "approximate" functions (which all start with a Fast-prefix). These can be very fast, but you will lose some (sometimes surprisingly little) accuracy. For gaming and animation, this loss in accuracy is usually perfectly acceptable and outweighed by the increase in speed. Don't use them for scientific calculations though...

You may want to call DisableFloatingPointExceptions at application startup to suppress any floating-point exceptions. Instead, it will return extreme values (like Nan or Infinity) when an operation cannot be performed. If you use FastMath in multiple threads, you should call DisableFloatingPointExceptions in the Execute block of those threads.

Superior Performance

Most operations can be performed on both singular values (scalars) as well as vectors (consisting of 2, 3 or 4 values). SIMD optimized assembly code is used to calculate multiple outputs at the same time. For example, adding two 4-value vectors together is almost as fast as adding two single values together, resulting in a 4-fold speed increase. Many functions are written in such a way that the performance is even better. You will find a lot of functions that are 10 or more times faster then their Delphi counterparts.

On 32-bit and 64-bit desktop platforms (Windows and OS X), this performance is achieved by using the SSE2 instruction set. This means that the computer must support SSE2. However, since SSE2 was introduced back in 2001, the vast majority of computers in use today will support it. All 64-bit desktop computers have SSE2 support by default. However, you can always compile this library with the FM_NOSIMD define to disable SIMD optimization and use plain Pascal versions. This can also be useful to compare the speed of the Pascal versions with the SIMD optimized versions.

On 32-bit mobile platforms (iOS and Android), the NEON instruction set is used for SIMD optimization. This means that your device needs to support NEON. But since Delphi already requires this, this poses no further restrictions.

On 64-bit mobile platforms (iOS), the Arm64/AArch64 SIMD instruction set is used.

Architecture and Design Decisions

FastMath operations on single-precision floating-point values only. Double-precision floating-point arithmetic is (currently) unsupported.

Most functions operate on single values (of type Single) and 2-, 3- and 4-dimensional vectors (of types TVector2, TVector3 and TVector4 respectively). Vectors are not only used to represent points or directions in space, but can also be regarded as arrays of 2, 3 or 4 values that can be used to perform calculations in parallel. In addition to floating-point vectors, there are also vectors that operator on integer values (TIVector2, TIVector3 and TIVector4).

There is also support for 2x2, 3x3 and 4x4 matrices (called TMatrix2, TMatrix3 and TMatrix4). By default, matrices are stored in row-major order, like those in the RTL's System.Math.Vectors unit. However, you can change this layout with the FM_COLUMN_MAJOR define. This will store matrices in column-major order instead, which is useful for OpenGL applications (which work best with this layout). In addition, this define will also clip the depth of camera matrices to -1..1 instead of the default 0..1. Again, this is more in line with the default for OpenGL applications.

For representing rotations in 3D space, there is also a TQuaternion, which is similar to the RTL's TQuaternion3D type.

The operation of the library is somewhat inspired by shader languages (such as GLSL and HLSL). In those languages you can also treat single values and vectors similarly. For example, you can use the Sin function to calculate a single sine value, but you can also use it with a TVector4 type to calculate 4 sine values in one call. When combined with the "approximate" Fast* functions, this can result in an enormous performance boost. For example, using FastSin(TVector4) to calculate 4 sine values in parallel is up to 40 times faster than 4 separate calls to Sin(Single). On the extreme end, calling FastExp2(TVector4) is up to 300 times faster than 4 separate Exp2(Single) calls.

Overloaded Operators

All vector and matrix types support overloaded operators which allow you to negate, add, subtract, multiply and divide scalars, vectors and matrices. There are also overloaded operators that compare vectors and matrices for equality. These operators check for "exact" matches (like Delphi's "=" operator). They don't allow for very small variations (like Delphi's SameValue functions).

The arithmetic operators "+", "-", "*" and "/" usually work component-wise when applied to vectors. For example if A and B are of type TVector4, then C := A * B will set C to (A.X * B.X, A.Y * B.Y, A.Z * B.Z, A.W * B.W). It will not perform a dot or cross product (you can use the Dot and Cross functions to compute those).

For matrices, the "+" and "-" operators also operate component-wise. However, when multiplying (or dividing) matrices with vectors or other matrices, then the usual linear algebraic multiplication (or division) is used. For example:

  • M := M1 * M2 performs a linear algebraic matrix multiplication

  • V := M1 * V1 performs a matrix * row vector linear algebraic multiplication

  • V := V1 * M1 performs a column vector * matrix linear algebraic multiplication

To multiply matrices component-wise, you can use the CompMult method.

Interoperability with the Delphi RTL

FastMath provides its own vector and matrix types for superior performance. Most of them are equivalent in functionality and data storage to the Delphi RTL types. You can typecast between them or implicitly convert from the FastMath type to the RTL type or vice versa (eg. MyVector2 := MyPointF). The following table shows the mapping:

Purpose

FastMath

Delphi RTL

2D point/vector

TVector2

TPointF

3D point/vector

TVector3

TPoint3D

4D point/vector

TVector4

TVector3D

2x2 matrix

TMatrix2

N/A

3x3 matrix

TMatrix3

TMatrix

4x4 matrix

TMatrix4

TMatrix3D

quaternion

TQuaternion

TQuaternion3D

Functions

Below you will find a categorized list of the global functions supported by FastMath:

Helper functions for creating vectors and matrices

Angle and Trigonometry Functions

  • Radians: converts degrees to radians

  • Degrees: converts radians to degrees

  • Sin: calculates a sine of an angle

  • Cos: calculates a cosine of an angle

  • SinCos: calculates a sine/cosine pair

  • Tan: calculates the tangent of an angle

  • ArcSin: calculates an arc sine

  • ArcCos: calculates an arc cosine

  • ArcTan: calculates an arc tangent

  • ArcTan2: calculates an arctangent angle and quadrant

  • Sinh: calculates a hyperbolic sine

  • Cosh: calculates a hyperbolic cosine

  • Tanh: calculates a hyperbolic tangent

  • ArcSinh: calculates an inverse hyperbolic sine

  • ArcCosh: calculates an inverse hyperbolic cosine

  • ArcTanh: calculates an inverse hyperbolic tangent

Exponential Functions

  • Power: raises a base to a power

  • Exp: calculates a natural exponentiation (that is, e raised to a given power)

  • Ln: calculates a natural logarithm

  • Exp2: calculates 2 raised to a power

  • Log2: calculates a base 2 logarithm

  • Sqrt: calculates a square root

  • InverseSqrt: calculates an inverse square root

Fast Approximate Functions

Common Functions

  • Abs: calculates an absolute value

  • Sign: calculates the sign of a value

  • Floor: rounds a value towards negative infinity

  • Trunc: rounds a value towards 0

  • Round: rounds a value towards its nearest integer

  • Ceil: rounds a value towards positive infinity

  • Frac: returns the fractional part of a number

  • FMod: calculates the remainder of a floating-point division

  • ModF: splits a floating-point value into its integer and fractional parts

  • Min: calculates the minimum of two values

  • Max: calculates the maximum of two values

  • EnsureRange: clamps a given value into a range

  • Mix: calculates a linear blend between two values, using on a progress value

  • Step: step function

  • SmoothStep: performs smooth Hermite interpolation between 0 and 1

  • FMA: Fused Multiply and Add

Matrix Functions

  • OuterProduct: multiplies a column vector with a row vector

Configuration Functions


Generated by PasDocEx, based on PasDoc 0.14.0.