Future of Numerics and AI
.NET provides a broad range of support for various development domains, ranging from the creation of performance-oriented framework code to the rapid development of cloud native services, and beyond. In recent years, especially with the rise of AI and machine learning, there has been a prevalent push towards improving numerical support and allowing developers to more easily write and consume general purpose and reusable algorithms that work with a range of types and scenarios. While .NET's support for scalar algorithms and fixed-sized vectors/matrices is strong and continues to grow, its built-in library support for other concepts, such as tensors and arbitrary length vectors/matrices, is behind the times. Developers writing .NET applications, services, and libraries currently need to seek external dependencies in order to utilize functionality that is considered core or built-in to other ecosystems. In particular, for developers incorporating AI and copilots into their existing .NET applications and services, we strive to ensure that the core numerics support necessary to be successful is available and efficient, and that .NET developers are not forced to seek out non-.NET solutions in order for their .NET projects to be successful.
.NET Framework Compatible APIs
While there are many reasons for developers to migrate to modern .NET, and it is the go-to target for many new libraries and applications, there remain many large repos that have been around for years where migration can be non-trivial. And while there are many new language and runtime features that can never work on .NET Framework, there remains a subset of APIs that would be beneficial to provide and that can help those existing codebases bridge the gap until they are able to complete migration.
It is therefore proposed that an out-of-band System.Numerics.Tensors package be provided, offering a core set of APIs that are compatible with and appropriate for use on .NET Standard 2.0. There are also two types that would be beneficial to polyfill downlevel as part of this process, System.Half and System.MathF, which would significantly improve the usability of the libraries for common scenarios.
Provide the Tensor class
Note
The following is an extreme simplification meant to give the general premise behind the APIs being exposed. It is intentionally skimming over several concepts and deeper domain specific terminology that is not critical to cover for the design proposal to be reviewed.
At a very high level, you have scalars, vectors, and matrices. A scalar is largely just a single value T, a vector is a set of scalars, and a matrix is a set of vectors. All of these can be loosely thought of as types of tensors; they have various relationships and can be used to build up and represent higher types. That is, you could consider that T is a scalar, or a vector of length 1, or a 1x1 matrix. Likewise, you can then consider that T[] is a vector of unknown length and that T[,] represents a 2-dimensional matrix, and so on. All of these can be defined as types of tensors. As a note, this terminology is partially where the name for std::vector<T> in C++ (and similar in other languages) comes from. This is also where the general considerations of "vectorization" when writing SIMD or SIMT arise, although those are typically over fixed-length, rather than arbitrary-length, data.
From the mathematical perspective, many of the operations you can do on scalars also apply to the higher types. There can sometimes be limitations or other considerations that specialize or restrict these operations, but in principle they tend to exist and work mostly the same way. It is therefore generally desirable to support many such operations more broadly. This is particularly true of "element-wise" operations, which can be thought of as effectively doing array.Select((x) => M(x)), and where an explicit API can often provide significant performance improvements over the naive approach.
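To make the contrast concrete, the following is a minimal sketch and not part of the proposed surface area. The class name ElementWiseSketch and both method names are hypothetical, and it assumes a modern .NET target where System.MathF and Span<T> are available in-box. It compares the naive LINQ form with an explicit span-based loop for an element-wise Exp.

```csharp
using System;
using System.Linq;

// Hypothetical illustration only; not the proposed API.
public static class ElementWiseSketch
{
    // Naive approach: allocates a new array and invokes a delegate per element.
    public static float[] ExpNaive(float[] values) =>
        values.Select(x => MathF.Exp(x)).ToArray();

    // Explicit approach: writes into a caller-supplied buffer with a plain loop,
    // avoiding the allocation and the per-element delegate call.
    public static void ExpExplicit(ReadOnlySpan<float> values, Span<float> destination)
    {
        if (destination.Length < values.Length)
            throw new ArgumentException("Destination is too short.", nameof(destination));

        for (int i = 0; i < values.Length; i++)
            destination[i] = MathF.Exp(values[i]);
    }
}
```

A plain loop like this is also the shape a library implementation could later vectorize internally without changing the caller's code, which is one reason an explicit API tends to outperform the LINQ form.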
The core set of APIs described below covers the main arithmetic and conversion operators provided for T as well as element-wise operations for the functionality exposed by System.Math/System.MathF. Some design notes include:
Where the scalar API Method would return a bool, the vector API has two variants, MethodAny and MethodAll.
APIs will require that developers use Span<T> to access them; overloads taking T[] are not provided.
APIs will support the destination buffer and one of the source buffers being the same.
APIs taking more than one operand will support the other operands being a scalar, indicating that the same value is used for every element.
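The design notes above can be sketched as follows. This is an illustration under assumptions rather than the proposed implementation: the class name DesignNotesSketch and the IsNaNAny/IsNaNAll pair are hypothetical examples of the MethodAny/MethodAll pattern, and the simple loops stand in for whatever optimized code a real library would use.

```csharp
using System;

// Hypothetical illustration only; not the proposed API.
public static class DesignNotesSketch
{
    // Scalar operand: the same value of 'right' is applied to every element of 'left'.
    // Each element is read before the same index is written, so 'destination'
    // may be the same buffer as 'left'.
    public static Span<float> Add(ReadOnlySpan<float> left, float right, Span<float> destination)
    {
        if (destination.Length < left.Length)
            throw new ArgumentException("Destination is too short.", nameof(destination));

        for (int i = 0; i < left.Length; i++)
            destination[i] = left[i] + right;

        return destination.Slice(0, left.Length);
    }

    // A bool-returning scalar method (float.IsNaN) becomes an Any/All pair.
    public static bool IsNaNAny(ReadOnlySpan<float> values)
    {
        foreach (float v in values)
            if (float.IsNaN(v)) return true;
        return false;
    }

    // Returns true for an empty span (vacuously true) in this sketch.
    public static bool IsNaNAll(ReadOnlySpan<float> values)
    {
        foreach (float v in values)
            if (!float.IsNaN(v)) return false;
        return true;
    }
}
```

With this sketch, a caller holding float[] data can write DesignNotesSketch.Add(data, 10f, data) to update the buffer in place, because arrays convert implicitly to both ReadOnlySpan<float> and Span<float>.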
In order to make usage of the APIs less "clunky", particularly given these are mathematical APIs where building up complex expressions may be more prevalent, it is proposed that a slight deviation from the normal API signature be taken. Rather than simply returning the number of elements written to the destination span, it is proposed that the APIs return a Span<T> instead. This would simply be destination sliced to the appropriate length. This still provides the required information on the number of elements written, but gives the additional advantage that the result can be immediately passed into the next user of the algorithm without requiring the user to slice or do length checks themselves.
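As a rough sketch of why returning the sliced destination helps, consider the following. The class name ChainingSketch and the helper SumOfPairwiseSums are hypothetical, and the loop bodies are placeholders for a real implementation. Because Add returns destination sliced to the number of elements written, its result can be passed straight into Sum even when the scratch buffer is larger than the inputs.

```csharp
using System;

// Hypothetical illustration only; not the proposed API.
public static class ChainingSketch
{
    public static Span<float> Add(ReadOnlySpan<float> left, ReadOnlySpan<float> right, Span<float> destination)
    {
        if (left.Length != right.Length)
            throw new ArgumentException("Input lengths must match.");
        if (destination.Length < left.Length)
            throw new ArgumentException("Destination is too short.", nameof(destination));

        for (int i = 0; i < left.Length; i++)
            destination[i] = left[i] + right[i];

        // Return only the portion of the destination that was written.
        return destination.Slice(0, left.Length);
    }

    public static float Sum(ReadOnlySpan<float> values)
    {
        float sum = 0f;
        foreach (float v in values)
            sum += v;
        return sum;
    }

    // The returned span feeds directly into the next operation;
    // the caller never slices 'scratch' or tracks a written count.
    public static float SumOfPairwiseSums(ReadOnlySpan<float> a, ReadOnlySpan<float> b, Span<float> scratch) =>
        Sum(Add(a, b, scratch));
}
```

Had Add returned an int count instead, the caller would need an intermediate variable and an explicit scratch.Slice(0, count) before calling Sum.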
For targeting modern .NET, there will be a separate future proposal detailing a Tensor<T> type. This then matches a similar split we have for other generic types such as Vector<T> and Vector. The non-generic Tensor class holds extension methods, APIs that are only meant to support a particular set of T, and static APIs, while Tensor<T> will hold operators, instance methods, and core properties. The APIs defined for use on .NET Framework are effectively the "workhorse" APIs that Tensor<T> would then delegate to. They more closely resemble the signatures from the BLAS and LAPACK libraries, which are the industry standard baseline for linear algebra operations, and allow tensor-like functionality to be supported for arbitrary memory. Modern .NET can then provide a type-safe and friendlier way to work with such functionality that can simultaneously take advantage of newer language/runtime features, such as static virtuals in interfaces or generic math.
```csharp
namespace System.Numerics.Tensors;

public static partial class Tensor
{
    // Element-wise Arithmetic

    public static Span<float> Add(ReadOnlySpan<float> left, ReadOnlySpan<float> right, Span<float> destination);
    public static Span<float> Add(ReadOnlySpan<float> left, float right, Span<float> destination);

    public static Span<float> Subtract(ReadOnlySpan<float> left, ReadOnlySpan<float> right, Span<float> destination);
    public static Span<float> Subtract(ReadOnlySpan<float> left, float right, Span<float> destination);

    public static Span<float> Multiply(ReadOnlySpan<float> left, ReadOnlySpan<float> right, Span<float> destination);
    public static Span<float> Multiply(ReadOnlySpan<float> left, float right, Span<float> destination); // BLAS1: scal

    public static Span<float> Divide(ReadOnlySpan<float> left, ReadOnlySpan<float> right, Span<float> destination);
    public static Span<float> Divide(ReadOnlySpan<float> left, float right, Span<float> destination);

    public static Span<float> Negate(ReadOnlySpan<float> values, Span<float> destination);

    public static Span<float> AddMultiply(ReadOnlySpan<float> left, ReadOnlySpan<float> right, ReadOnlySpan<float> multiplier, Span<float> destination);
    public static Span<float> AddMultiply(ReadOnlySpan<float> left, ReadOnlySpan<float> right, float multiplier, Span<float> destination);
    public static Span<float> AddMultiply(ReadOnlySpan<float> left, float right, ReadOnlySpan<float> multiplier, Span<float> destination);

    public static Span<float> MultiplyAdd(ReadOnlySpan<float> left, ReadOnlySpan<float> right, ReadOnlySpan<float> addend, Span<float> destination);
    public static Span<float> MultiplyAdd(ReadOnlySpan<float> left, ReadOnlySpan<float> right, float addend, Span<float> destination); // BLAS1: axpy
    public static Span<float> MultiplyAdd(ReadOnlySpan<float> left, float right, ReadOnlySpan<float> addend, Span<float> destination);

    public static Span<float> Exp(ReadOnlySpan<float> x, Span<float> destination);
    public static Span<float> Log(ReadOnlySpan<float> x, Span<float> destination);

    public static Span<float> Cosh(ReadOnlySpan<float> x, Span<float> destination);
    public static Span<float> Sinh(ReadOnlySpan<float> x, Span<float> destination);
    public static Span<float> Tanh(ReadOnlySpan<float> x, Span<float> destination);

    // Vector Arithmetic

    // A measure of similarity between two non-zero vectors of an inner product space.
    // It is widely used in natural language processing and information retrieval.
    public static float CosineSimilarity(ReadOnlySpan<float> x, ReadOnlySpan<float> y);

    // A measure of distance between two points in a Euclidean space.
    // It is widely used in mathematics, engineering, and machine learning.
    public static float Distance(ReadOnlySpan<float> x, ReadOnlySpan<float> y);

    // A mathematical operation that takes two vectors and returns a scalar.
    // It is widely used in linear algebra and machine learning.
    public static float Dot(ReadOnlySpan<float> x, ReadOnlySpan<float> y); // BLAS1: dot

    // A mathematical operation that takes a vector and returns a unit vector in the same direction.
    // It is widely used in linear algebra and machine learning.
    public static float Normalize(ReadOnlySpan<float> x); // BLAS1: nrm2

    // A function that takes a collection of real numbers and returns a probability distribution.
    // It is widely used in machine learning and deep learning.
    public static float SoftMax(ReadOnlySpan<float> x);

    // A function that takes a real number and returns a value between 0 and 1.
    // It is widely used in machine learning and deep learning.
    public static Span<float> Sigmoid(ReadOnlySpan<float> x, Span<float> destination);

    public static float Max(ReadOnlySpan<float> value);
    public static float Min(ReadOnlySpan<float> value);

    public static int IndexOfMaxMagnitude(ReadOnlySpan<float> value); // BLAS1: iamax
    public static int IndexOfMinMagnitude(ReadOnlySpan<float> value);

    public static float Sum(ReadOnlySpan<float> value);
    public static float SumOfSquares(ReadOnlySpan<float> value);
    public static float SumOfMagnitudes(ReadOnlySpan<float> value); // BLAS1: asum

    public static float Product(ReadOnlySpan<float> value);
    public static float ProductOfSums(ReadOnlySpan<float> left, ReadOnlySpan<float> right);
    public static float ProductOfDifferences(ReadOnlySpan<float> left, ReadOnlySpan<float> right);

    // Vector Conversions

    public static Span<Half> ConvertToHalf(ReadOnlySpan<float> value, Span<Half> destination);
    public static Span<float> ConvertToSingle(ReadOnlySpan<Half> value, Span<float> destination);
}
```
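For readers less familiar with the vector arithmetic entries, here is a straightforward scalar reference sketch of Dot, CosineSimilarity, and Distance using their textbook definitions. The class name VectorArithmeticSketch is hypothetical, the code assumes System.MathF is available, and a real implementation would likely be vectorized and may treat edge cases (such as zero-length or zero-magnitude inputs) differently.

```csharp
using System;

// Hypothetical reference sketch; not the proposed implementation.
public static class VectorArithmeticSketch
{
    // Dot product: sum of x[i] * y[i].
    public static float Dot(ReadOnlySpan<float> x, ReadOnlySpan<float> y)
    {
        if (x.Length != y.Length)
            throw new ArgumentException("Input lengths must match.");

        float sum = 0f;
        for (int i = 0; i < x.Length; i++)
            sum += x[i] * y[i];
        return sum;
    }

    // Cosine similarity: dot(x, y) / (|x| * |y|), defined for non-zero vectors.
    public static float CosineSimilarity(ReadOnlySpan<float> x, ReadOnlySpan<float> y) =>
        Dot(x, y) / (MathF.Sqrt(Dot(x, x)) * MathF.Sqrt(Dot(y, y)));

    // Euclidean distance: square root of the sum of squared differences.
    public static float Distance(ReadOnlySpan<float> x, ReadOnlySpan<float> y)
    {
        if (x.Length != y.Length)
            throw new ArgumentException("Input lengths must match.");

        float sum = 0f;
        for (int i = 0; i < x.Length; i++)
        {
            float d = x[i] - y[i];
            sum += d * d;
        }
        return MathF.Sqrt(sum);
    }
}
```

Orthogonal vectors such as (1, 0) and (0, 1) have a cosine similarity of 0, while any non-zero vector compared with itself yields 1.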
Polyfill System.Half in Microsoft.Bcl.Half
System.Half is a core interchange type for AI scenarios, often being used to reduce the storage impact for the tens of thousands to millions of data points that need to be stored. It is not, however, as frequently used for computation.
We initially exposed this type in .NET 5 purely as an interchange type and it is therefore lacking the arithmetic operators. These operators were later added in .NET 6/7 as a part of the generic math initiative. For .NET Framework, the interchange surface area should be sufficient and will follow the general guidance required for polyfills that they meet the initial shape we shipped, even if the version it shipped on is no longer in support (that is, the .NET Standard 2.0 surface area needs to remain compatible with .NET 5, even though .NET 5 is out of support).
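The interchange role can be sketched as below. This is a minimal illustration, assuming a target where System.Half and its explicit conversions to and from float are available (.NET 5 or later); the class name HalfInterchangeSketch is hypothetical, and the loops are not the proposed implementation. Values are narrowed to Half for storage and widened back to float before any computation, so no Half arithmetic operators are needed.

```csharp
using System;

// Hypothetical illustration only; not the proposed implementation.
public static class HalfInterchangeSketch
{
    // Narrow float values to 16-bit Half for storage (halves the memory footprint).
    public static Span<Half> ConvertToHalf(ReadOnlySpan<float> value, Span<Half> destination)
    {
        if (destination.Length < value.Length)
            throw new ArgumentException("Destination is too short.", nameof(destination));

        for (int i = 0; i < value.Length; i++)
            destination[i] = (Half)value[i];

        return destination.Slice(0, value.Length);
    }

    // Widen stored Half values back to float before computing with them.
    public static Span<float> ConvertToSingle(ReadOnlySpan<Half> value, Span<float> destination)
    {
        if (destination.Length < value.Length)
            throw new ArgumentException("Destination is too short.", nameof(destination));

        for (int i = 0; i < value.Length; i++)
            destination[i] = (float)value[i];

        return destination.Slice(0, value.Length);
    }
}
```

Narrowing is lossy in general: only values exactly representable in 16 bits survive a round trip unchanged.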
Polyfill System.MathF in Microsoft.Bcl.MathF
System.MathF was added in .NET Core 2.0 to provide float support that had parity with the double support in System.Math. Given that float is the core computational type used in AI scenarios, many downlevel libraries currently provide their own internal wrappers around System.Math. .NET ships several such shims for its own scenarios and the proposed System.Numerics library would be no exception. As such, it would be beneficial to simply provide this functionality officially and allow such targets to remove their shims. This simplifies their experience and may give additional performance or correctness over the naive approach.
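The kind of internal wrapper described above typically looks something like the following sketch. The class name MathFShim is hypothetical and only three members are shown. Each float method widens to double, calls System.Math, and narrows the result, which is the "naive approach" an official polyfill would let libraries delete.

```csharp
using System;

// Hypothetical example of a downlevel shim; not the proposed polyfill.
public static class MathFShim
{
    // Widen to double, compute with System.Math, then narrow back to float.
    public static float Sqrt(float x) => (float)Math.Sqrt(x);

    public static float Exp(float x) => (float)Math.Exp(x);

    // System.Math already has a float overload of Abs, so no cast is needed.
    public static float Abs(float x) => Math.Abs(x);
}
```

Routing through double costs extra conversions, and the narrowed result is not guaranteed to match a native single-precision implementation bit for bit in every case.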