[MRG] Add CAN Method #251
Conversation
skada/deep/losses.py
Outdated
```python
if mask.sum() > 0:
    class_features = features_s[mask]
    normalized_features = F.normalize(class_features, p=2, dim=1)
    centroid = normalized_features.mean(dim=0)
```
In the paper it seems to be only a sum, no?
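For what it's worth, if the centroid is only ever used through cosine similarity (as in spherical k-means), the sum and the mean of the normalized features differ only by a positive scalar, so they give the same direction after re-normalization. A quick check with hypothetical toy tensors:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
features_s = torch.randn(8, 4)  # hypothetical source features
mask = torch.tensor([1, 0, 1, 1, 0, 1, 0, 0], dtype=torch.bool)

normalized = F.normalize(features_s[mask], p=2, dim=1)
centroid_sum = normalized.sum(dim=0)    # sum, as in the paper
centroid_mean = normalized.mean(dim=0)  # mean, as in the PR

# After re-normalization the two centroids coincide: they differ
# only by the positive factor 1 / mask.sum().
same_direction = torch.allclose(
    F.normalize(centroid_sum, dim=0), F.normalize(centroid_mean, dim=0)
)
print(same_direction)
```

So the choice only matters if the un-normalized centroid magnitude is used somewhere.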
```python
# Discard ambiguous classes
class_counts = torch.bincount(cluster_labels_t, minlength=n_classes)
valid_classes = class_counts >= class_threshold
mask_t = valid_classes[cluster_labels_t]
```
I don't see what this line is doing.
`class_counts = torch.bincount(cluster_labels_t, minlength=n_classes)` counts how many samples are in each cluster. `valid_classes = class_counts >= class_threshold` creates a boolean tensor where `True` indicates classes that have at least `class_threshold` samples. `mask_t = valid_classes[cluster_labels_t]` uses the cluster labels as indices into the `valid_classes` tensor. This creates a boolean `mask_t`, where `True` indicates samples that belong to classes with enough representation.
This part of the code corresponds to the "Filter the ambiguous classes" step of the paper's pseudo-algorithm.
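The label-indexing trick above can be seen on a tiny hypothetical example (toy labels and threshold, not values from the PR):

```python
import torch

# Six target samples clustered into 3 classes; class 1 has a single sample.
cluster_labels_t = torch.tensor([0, 0, 0, 1, 2, 2])
n_classes, class_threshold = 3, 2

class_counts = torch.bincount(cluster_labels_t, minlength=n_classes)
valid_classes = class_counts >= class_threshold
# Indexing a boolean tensor with the per-sample labels broadcasts the
# per-class decision to a per-sample mask.
mask_t = valid_classes[cluster_labels_t]
print(mask_t)  # only the lone sample of class 1 is masked out
```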
```python
features_t = features_t[mask_t]
cluster_labels_t = cluster_labels_t[mask_t]
```
```python
# Define sigmas
```
Can you not use the MMD distance from DAN?
The formula is not exactly the same as for the MMD, since before computing each mean we apply a class-specific mask.
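To make the difference concrete, here is a minimal sketch of an MMD term restricted to masked subsets. The helper names (`rbf_kernel`, `masked_mmd`) and the single-bandwidth kernel are assumptions for illustration, not the PR's actual code; the CDD of the paper sums such masked terms over class pairs, which is why DAN's unmasked MMD cannot be reused as-is:

```python
import torch

def rbf_kernel(x, y, sigma=1.0):
    # Gaussian kernel matrix between two batches of features.
    d2 = torch.cdist(x, y) ** 2
    return torch.exp(-d2 / (2 * sigma ** 2))

def masked_mmd(feat_s, feat_t, mask_s, mask_t, sigma=1.0):
    # Plain (biased) MMD^2, but the kernel means are taken only over the
    # samples selected by the class masks.
    xs, xt = feat_s[mask_s], feat_t[mask_t]
    return (rbf_kernel(xs, xs, sigma).mean()
            + rbf_kernel(xt, xt, sigma).mean()
            - 2 * rbf_kernel(xs, xt, sigma).mean())

torch.manual_seed(0)
feat_s, feat_t = torch.randn(6, 4), torch.randn(5, 4)
mask_s = torch.tensor([True, True, False, True, False, True])
mask_t = torch.tensor([True, False, True, True, True])
cdd_term = masked_mmd(feat_s, feat_t, mask_s, mask_t)
```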
```python
for n_iter in range(self.max_iter):
    # Assign samples to closest centroids
    dissimilarities = self._compute_dissimilarities(X, centroids)
```
Is there a difference here with torch's `cosine_similarity` function?
In the paper, the cosine dissimilarity is `0.5 * (1 - cosine_similarity)`.


Paper: https://arxiv.org/pdf/1901.00976
Mostly Eq. 3, 4, and 5, plus paragraph 3.4.
New Features:
- `CANLoss` class to `skada/deep/_divergence.py` to implement the contrastive domain discrepancy (CDD) loss.
- `CAN` function to `skada/deep/_divergence.py` to implement the CAN domain adaptation method.

New Utilities:
- `SphericalKMeans` class to `skada/deep/utils.py` for clustering using cosine similarity.

Testing:
- Tests added to `test_deep_divergence.py` to ensure the new method works as expected.

Still needs to be done:
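For reviewers unfamiliar with the clustering step, here is a minimal standalone sketch of the spherical k-means idea (k-means with cosine dissimilarity on the unit sphere). This is a hypothetical illustration, not the PR's `SphericalKMeans` API; in particular the naive first-k initialization is a placeholder for a proper random or k-means++-style init:

```python
import torch
import torch.nn.functional as F

def spherical_kmeans(X, n_clusters, max_iter=50):
    # Project samples onto the unit sphere; for unit vectors u, v the
    # cosine dissimilarity is 0.5 * (1 - u @ v).
    X = F.normalize(X, p=2, dim=1)
    # Naive deterministic init: first k points (placeholder choice).
    centroids = X[:n_clusters].clone()
    labels = torch.zeros(X.shape[0], dtype=torch.long)
    for _ in range(max_iter):
        # Assign each sample to the centroid with smallest dissimilarity.
        dissim = 0.5 * (1 - X @ centroids.T)
        labels = dissim.argmin(dim=1)
        # Update: sum the members and re-normalize (sum and mean give the
        # same direction); keep the old centroid for empty clusters.
        new_centroids = torch.stack([
            F.normalize(X[labels == k].sum(dim=0), dim=0)
            if (labels == k).any() else centroids[k]
            for k in range(n_clusters)
        ])
        if torch.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Two tight directional groups along +x and -x.
X = torch.tensor([[1.0, 0.01], [-1.0, 0.01], [1.0, -0.01], [-1.0, -0.01]])
labels, centroids = spherical_kmeans(X, 2)
```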