Add bayesian nn #357
|
Hi @zelioluca, and sorry for the late reply! We agreed with @msmiyels that he will first work on the CNN and, if he has time, on the BNN after that. The plan is that Micha will finish the development of these two tools, so you might not need to do programming work on these anymore :). We'll ask some questions and help if the need arises! |
|
Hello Niko and Micha, how are you? OK, OK, anything you need, I am here! 😁
(In reply to Niko Aarnio's comment of Thu, 4 Apr 2024, 11:24.)
|
|
Hi @nmaarnio, @zelioluca, sorry for the long wait ⌛, and esp. to @zelioluca for building this up. As mentioned last week in the WP3 meeting, I got my hands on the BNN code for review. Besides some minor EIS-specific style guidance and toolkit things, there are a few points that I came across. I'm not a Bayesian expert either, so take them with a grain of salt.
Overview:
Tests: I ran a slightly modified version of the code on a test bench using real-world data from two mineral assessments for which we have clear expectations of what the results should look like (both for an ANN and a BNN). Technically, the code runs, but I wasn't able to reproduce the expected outputs (not even close). I also put some effort into changing the behavior of the BNN by addressing some of the points listed above, but that did not work out either. I once started with BNNs the same way, but it seems that the "classical" approach, or what we find in the Keras and TF docs, just does not work very well for this particular problem. I guess the regression case is easier to solve than binary classification.
How to proceed: What I offered in the meeting was to integrate a Bayesian NN we used in another project and refactor it to achieve EIS conformity by the end of September for review. @nmaarnio: do we want to close this PR and open a new branch for the other version? I would be happy to merge the two code bases in the future (say, the basic idea of this code, but as a working version based on the Bayesian NN we use), but since all of my efforts to do so ended in a dead end, it seems easier to substitute it than to spend any more time on fixing it for now. |
|
Hello there, how are you? Yes, in my opinion we should use the one that is
working. I'm not a Bayesian guy either... I took the code from another guy and
dropped it into the plugin.
On Mon, 16 Sept 2024, 11:02 Michael Steffen wrote:
*Overview*:
1. Train size must be the number of samples instead of the number of attributes when calculating the Kullback-Leibler weights
2. Predictions and results are structured a bit weirdly: they hold each statistic per sample (pixel) in a nested structure, which is not what we aim for
3. Activation must have other options (like relu or linear, the latter being the default if None is provided) or even *no specification* at all
4. Use of last_activation is misleading since the selected activation is *only* applied to the *hidden layers*, but None to the output layer(s)
5. I do not understand why a *non*-Bayesian layer is used for the output layer(s) definition
6. Unfortunately, the choice of loss and distributions depends on the goal, so I'm not sure that the “simple” negative log likelihood is the best choice for everything, especially binary classification problems
7. I would stick to the approach that has been implemented in the other tools as well, i.e., using arrays instead of TF generator objects for data input (which makes the generate_predictions_with_tensor_api functionality obsolete)
8. The way of defining the input layer technically works but looks *overcomplicated*: I have never used a dictionary with names etc. to get the number of attributes. Usually, the input_shape is just one parameter that is provided to the network or function
9. Is it intended that the batch normalization is done right after the input layer definition? This way, it will *not affect* the batches during training between the hidden layers
10. *None* of the test runs provided expected or usable results for the binary classification problem, which is the main case people are going to work with
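To illustrate point 1, here is a minimal sketch, assuming the usual variational setup in which the KL term is scaled by 1 / (number of training samples). The helper names (kl_weight, total_loss) and the toy data are hypothetical and not taken from the PR; plain Python is used so the idea does not depend on a particular TF version.

```python
def kl_weight(train_rows):
    """Return 1 / number of samples (rows), not 1 / number of attributes (columns)."""
    n_samples = len(train_rows)
    if n_samples == 0:
        raise ValueError("training set is empty")
    return 1.0 / n_samples


def total_loss(nll, kl_divergence, train_rows):
    """Negative log likelihood plus the KL term scaled by the per-sample weight."""
    return nll + kl_weight(train_rows) * kl_divergence


# Toy training set (hypothetical): 4 samples (pixels) with 3 attributes each.
X_train = [
    [0.1, 0.5, 0.9],
    [0.2, 0.4, 0.8],
    [0.3, 0.3, 0.7],
    [0.4, 0.2, 0.6],
]

weight = kl_weight(X_train)           # based on the 4 rows
wrong_weight = 1.0 / len(X_train[0])  # the mistake: based on the 3 columns
loss = total_loss(nll=2.0, kl_divergence=8.0, train_rows=X_train)
print(weight, wrong_weight, loss)
```

With real data the number of samples is usually far larger than the number of attributes, so using the attribute count would weight the KL term much too heavily.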
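For point 2, a rough sketch of the flatter output layout, assuming the statistics come from repeated stochastic forward passes. The stand-in model (stochastic_forward) and all names are made up for illustration; the real network call would replace it. Each statistic is returned as one flat list over all samples (pixels) instead of a nested structure per sample.

```python
import random
import statistics


def stochastic_forward(x, rng):
    """Stand-in for one BNN forward pass: a noisy score clipped to [0, 1]."""
    score = sum(x) / len(x) + rng.gauss(0.0, 0.05)
    return min(1.0, max(0.0, score))


def predict_with_uncertainty(samples, n_passes=50, seed=0):
    """Run n_passes stochastic passes per sample; return one flat list per statistic."""
    rng = random.Random(seed)
    draws = [[stochastic_forward(x, rng) for _ in range(n_passes)] for x in samples]
    return {
        "mean": [statistics.fmean(d) for d in draws],
        "std": [statistics.stdev(d) for d in draws],
    }


# Two hypothetical pixels with three attributes each.
pixels = [[0.1, 0.2, 0.3], [0.6, 0.7, 0.8]]
result = predict_with_uncertainty(pixels)
print(result["mean"])
print(result["std"])
```

Each list has one entry per pixel, so it can be reshaped directly into an output raster per statistic.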
|
|
Hi, yes, I think it is best to close this PR now, and you can start with a new branch, @msmiyels. We don't need to delete this branch, so if there is time and will in the future, we can of course return to this implementation if we see a need. |
|
Closing now |
Hi! Let's start checking the code together! I'm not an expert in Bayesian NNs, but I studied the original documentation for it, which is here:
One important thing: I had trouble with the package version to install. With the current EIS env, I suggest installing version 0.22; the newest one crashed. This PR still contains some commits from old stuff, and I do not get why! On Monday I will copy and paste some other documentation there to explain what this Bayesian NN truly does :-)
I put two functions there:
Both of them do exactly the same thing; the only difference is how the input is presented to the NN.
PS: I am starting on the last PR today!