-
Notifications
You must be signed in to change notification settings - Fork 96
Allow to enable half and bfloat16 at the same time #1827
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: add_bfloat16
Are you sure you want to change the base?
Conversation
2bfb6ac
to
307e9e4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some first notes, I still have to finish the rest of the PR. However, I think these are already the most relevant comments.
include/ginkgo/core/base/math.hpp
Outdated
|
||
template <typename T> | ||
struct next_precision_impl {}; | ||
struct next_precision_impl { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe combine this with the move version as
template<typename T, size_t step = 1>
struct next_precision_impl{...};
Then we don't need to use both next_precision
and next_precision_move
. As an example see: https://godbolt.org/z/dPK7ccxoM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good to know tuple_element_t
. I will change it to yours.
using NextNextDense = matrix::Dense<next_precision_move<ValueType, 2>>; | ||
using NextNextNextDense = matrix::Dense<next_precision_move<ValueType, 3>>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
using NextNextDense = matrix::Dense<next_precision_move<ValueType, 2>>; | |
using NextNextNextDense = matrix::Dense<next_precision_move<ValueType, 3>>; | |
using Next2Dense = matrix::Dense<next_precision_move<ValueType, 2>>; | |
using Next3Dense = matrix::Dense<next_precision_move<ValueType, 3>>; |
#if GINKGO_ENABLE_HALF || GINKGO_ENABLE_BFLOAT16 | ||
public ConvertibleTo< | ||
MultiVector<next_precision<next_precision<ValueType>>>>, | ||
public ConvertibleTo<MultiVector<next_precision_move<ValueType, 2>>>, | ||
#endif | ||
#if GINKGO_ENABLE_HALF && GINKGO_ENABLE_BFLOAT16 | ||
public ConvertibleTo<MultiVector<next_precision_move<ValueType, 3>>>, | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we perhaps always have these ConvertibleTo
base classes? Right now, we always provide the gko::half
and gko::bfloat16
classes, so the ConvertibleTo
variants are valid. It only requires making the gko::array
fp16 conversion functions available, which is a very manageable cost compared to the full fp16 build.
From the user perspective, using (any) fp16 only breaks when they try to instantiate one of our classes. So they will not be able to use the conversion function with fp16 anyway.
I should note that the error will be a linker error, which is most likely very painful for the users to decipher.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ConvertibleTo
that class needs to instantiate the class then the member function will require the function all available.
We do not go the same way to deal with the dpcpp single (throw not_implment) because the current macro design to change that with the three conditions (single/half/bfloat) will be a disaster. also, I guess it will exceed the symbol limit in windows
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might have not communicated exactly what I meant, so I will create a PR to illustrate. In short, I think the #if
guards for the ConvertibleTo
are unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks mostly good. My comment on the ConvertibleTo
stuff can be ignored for this PR.
core/preconditioner/jacobi.cpp
Outdated
temporary_conversion<matrix::Diagonal<ValueType>>::template create< | ||
matrix::Diagonal<previous_precision<ValueType>>, | ||
matrix::Diagonal<previous_precision_move<ValueType, 2>>, | ||
matrix::Diagonal<previous_precision_move<ValueType, 3>>>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs to be guarded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It only gives some redundant check with the fixed *_move
template.
The code using the redundant part means nothing matched here, so it will throw the error in the end.
include/ginkgo/core/base/math.hpp
Outdated
* Move U until next_precision_move_impl<T, move> == U | ||
*/ | ||
template <typename T, unsigned move, | ||
typename U = typename next_precision_impl<T>::type> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't it be
typename U = typename next_precision_impl<T>::type> | |
typename U = typename next_precision_impl<T, move>::type> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, it is just to start the search from the next one.
The stop criterion of recursion is when the searching U is the same type as T.
It might jump to much for example <half, 2>
in <half, float, double>
, the search will only check double
, half
(stop) without considering float
.
307e9e4
to
9318cec
Compare
Co-authored-by: Marcel Koch <[email protected]>
9318cec
to
4f0036b
Compare
This PR allow to enable
bfloat16
andhalf
precision at the same time.Doing the operation between
bfloat16
andhalf
precision returnsfloat
type.It will revert the
gko::float16
alias changes in #1825 .float16
will behalf
precision again.Moreover, it adds the
next_precision_move<type, move>
to represent thenext_precision<next_precision<...
. same forprevious_precision_move
likely in another PR:
private:
and convert_to always call the templated functionrun<>
to acceptstype_list
and create a list for that