Add bayesian nn#357

Closed
zelioluca wants to merge 14 commits into master from add_bayesian_nn

Conversation

@zelioluca
Collaborator

@zelioluca zelioluca commented Mar 23, 2024

Hi! Let's start checking the code together. I'm not an expert in Bayesian NNs, but I studied the original documentation, which is here:

One important thing: I had trouble with the package version to install. With the current eis env, I suggest installing version 0.22; the newest one crashed. This PR still contains some commits from old stuff, and I don't understand why! On Monday I will copy and paste some other documentation there to explain what this Bayesian network actually does :-)

I put two functions there:

  • generate_prediction_using_traditional_arrays: I think this is the version to use for the toolkit, because it takes an np array as input, like my CNN.
  • generate_predictions_with_tensor_api: this one uses the TF API to feed the Bayesian network.
    Both of them do exactly the same thing; the only difference is how the input is presented to the NN.

PS: I'm starting on the last PR today!

@zelioluca zelioluca requested a review from nmaarnio March 23, 2024 05:46
@nmaarnio
Collaborator

nmaarnio commented Apr 4, 2024

Hi @zelioluca, and sorry for the late reply! We agreed with @msmiyels that he will first work on the CNN and after that on the BNN if he has time. The plan is that Micha will finish the development of these 2 tools, so you might not need to do programming work on these anymore :). We'll ask some questions and help if the need arises!

@zelioluca
Collaborator Author

zelioluca commented Apr 4, 2024 via email

@nmaarnio nmaarnio mentioned this pull request Apr 8, 2024
@msmiyels
Collaborator

Hi @nmaarnio , @zelioluca ,

sorry for the long wait ⌛, and thanks especially to @zelioluca for building this up. As mentioned last week in the WP3 meeting, I got my hands on the BNN code for review. Besides some minor EIS-specific style guidance and toolkit things, there are a few points that I came across. I'm not a Bayesian expert either, so take them with a grain of salt.

Overview:

  1. The train size must be the number of samples, not the number of attributes, when calculating the Kullback-Leibler weights.
  2. Predictions and results are structured a bit oddly: each statistic is stored per sample (pixel) in a nested structure, which is not what we are aiming for.
  3. The activation must allow other options (like relu or linear, where the latter is the default if None is provided), or even no specification at all.
  4. The use of last_activation is misleading, since the selected activation is applied only to the hidden layers, while None is applied to the output layer(s).
  5. I do not understand why a non-Bayesian layer is used to define the output layer(s).
  6. Unfortunately, the choice of loss and distributions depends on the goal, so I'm not sure that the "simple" negative log likelihood is the best choice for everything, especially for binary classification problems.
  7. I would stick to the approach that has been implemented in the other tools as well, i.e., using arrays instead of TF generator objects for data input (which makes the generate_predictions_with_tensor_api functionality obsolete).
  8. The way the input layer is defined technically works but looks overcomplicated:
    I have never used a dictionary with names etc. to get the number of attributes. Usually, the input_shape is just one parameter that is provided to the network or function.
  9. Is it intended that the batch normalization is done right after the input layer definition? This way, it will not affect the batches between the hidden layers during training.
  10. None of the test runs produced expected or usable results for the binary classification problem, which is the main case people are going to work with.
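
Points 1 and 8 can be illustrated with a small sketch. This is not the code from this PR: the helper names `kl_weight_from_shape` and `input_shape_from_shape` are hypothetical, the shape tuple stands in for `X_train.shape` of a `(n_samples, n_attributes)` array, and scaling the KL term by 1 / number of training samples is the convention assumed here.

```python
def kl_weight_from_shape(train_shape):
    """Return the KL scaling factor 1 / n_samples.

    train_shape is assumed to be (n_samples, n_attributes),
    i.e. what X_train.shape would give for a 2D training array.
    """
    n_samples, _n_attributes = train_shape
    if n_samples <= 0:
        raise ValueError("training set must contain at least one sample")
    # The weight scales with the number of samples (axis 0),
    # not with the number of attributes (axis 1).
    return 1.0 / n_samples


def input_shape_from_shape(train_shape):
    """Return the per-sample input shape as a single tuple parameter."""
    _n_samples, n_attributes = train_shape
    return (n_attributes,)


# Hypothetical training set: 1000 pixels (samples) with 12 attributes each.
train_shape = (1000, 12)
kl_weight = kl_weight_from_shape(train_shape)
input_shape = input_shape_from_shape(train_shape)
```

Deriving both values from the array shape keeps the input definition to one parameter and avoids mixing up the two axes.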

Tests:

I ran a slightly modified version of the code on a test bench using real-world data from two mineral assessments for which we have clear expectations of what the results should look like (both for an ANN and a BNN). Technically, the code runs, but I wasn't able to reproduce the expected outputs (or even come close).

I also put some effort into changing the behavior of the BNN by addressing some of the points listed above, but that did not work out either. I once started with BNNs the same way, but it seems that the "classical" approach, or what we find in the Keras and TF docs, just does not work very well for this particular problem. I guess the regression case is easier to solve than binary classification.
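
To make the loss question from point 6 concrete, here is a minimal pure-Python sketch of how the negative log likelihood depends on the assumed output distribution. It is an illustration only and not taken from this PR or from the other project's BNN: the function names are hypothetical, and a Bernoulli likelihood for binary targets and a fixed-variance Gaussian for regression targets are assumptions. Whether switching the distribution would fix the results reported above has not been tested.

```python
import math


def bernoulli_nll(y_true, p_pred, eps=1e-7):
    """Mean negative log likelihood of binary labels under Bernoulli(p).

    This is the same quantity as binary cross-entropy.
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        # Clip probabilities to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def gaussian_nll(y_true, mu_pred, sigma=1.0):
    """Mean negative log likelihood of targets under Normal(mu, sigma)."""
    total = 0.0
    for y, mu in zip(y_true, mu_pred):
        total += 0.5 * math.log(2.0 * math.pi * sigma ** 2)
        total += (y - mu) ** 2 / (2.0 * sigma ** 2)
    return total / len(y_true)


# Hypothetical binary labels and predicted probabilities.
labels = [1, 0, 1, 1]
probs = [0.9, 0.2, 0.7, 0.6]
loss_binary = bernoulli_nll(labels, probs)
# Treating the same labels as regression targets gives a different loss.
loss_regression = gaussian_nll(labels, probs)
```

The two functions penalize the same predictions differently, which is why a single "simple" negative log likelihood may not suit both regression and binary classification.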

How to proceed:

So what I offered in the meeting was to integrate a Bayesian NN we used in another project and refactor it for EIS conformity, ready for review by the end of September. @nmaarnio: do we want to close this PR and open a new branch for the other version?

I would be happy to merge the two code bases in the future (say, the basic idea of this code, but as a working version based on the Bayesian NN we use). However, since all of my efforts to do so ended in a dead end, it seems easier to substitute it than to spend any more time on fixing it for now.

@zelioluca
Collaborator Author

zelioluca commented Sep 16, 2024 via email

@nmaarnio
Collaborator

Hi, yes I think it is best to close this PR now and you can start with a new branch @msmiyels . We don't need to delete this branch, so if there is time and will in the future, we can of course return to this implementation if we see a need.

@nmaarnio
Collaborator

Closing now

@nmaarnio nmaarnio closed this Sep 16, 2024
6 participants