lab-cosmo · rosecers · Nov 24, 2020 · Nov 17, 2020 · Nov 18, 2020 · Nov 20, 2020
diff --git a/notebooks/3_KernelMethods.ipynb b/notebooks/3_KernelMethods.ipynb
@@ -8,7 +8,7 @@
     "\n",
     "In this notebook, we introduce kernel-based versions of the models covered in the [Linear Methods](1_LinearMethods.ipynb) and [PCovR](2_PrincipalCovariatesRegression.ipynb) notebooks.\n",
     "\n",
-    "As always, for each model, we first go step-by-step through the derivation, with equations, embedded links, and citations supplied where useful. Second, we will employ a \"Utility Class\" for the model, which can be found in the utilities folder and contains all necessary functions.\n",
+    "As always, for each model, we first go step-by-step through the derivation, with equations, embedded links, and citations supplied where useful. Second, we will employ a \"Sklearn/Skcosmo Class\" for the models, which can be found in the skcosmo/sklearn mudule and contains all necessary functions.\n",
     "\n",
     "**Note**: To display values of variables in the markdown cells, you should [enable the jupyter contrib nbextension](https://stackoverflow.com/questions/52812231/print-variable-in-jupyter-notebook-markdown-cell-python)."
    ]
@@ -35,9 +35,13 @@
     "from utilities.plotting import (\n",
     "    plot_projection, plot_regression, check_mirrors, get_cmaps, table_from_dict\n",
     ")\n",
-    "from utilities.kernels import linear_kernel, gaussian_kernel, center_kernel\n",
-    "from utilities.classes import PCA, LR, KPCA, KRR, KPCovR\n",
-    "\n",
+    "from utilities.kernels import center_kernel # to be replaced the function from skcosmo\n",
+    "from sklearn.metrics.pairwise import linear_kernel, rbf_kernel \n",
+    "from utilities.classes import KPCovR # to be replaced with the class from skcosmo\n",
+    "from sklearn.decomposition import PCA, KernelPCA\n",
+    "from sklearn.linear_model import Ridge\n",
+    "from sklearn.kernel_ridge import KernelRidge\n",
+    "from functools import partial\n",
     "cmaps = get_cmaps()\n",
     "plt.style.use('../utilities/kernel_pcovr.mplstyle')\n",
     "dbl_fig=(2*plt.rcParams['figure.figsize'][0], plt.rcParams['figure.figsize'][1])"
@@ -83,7 +87,7 @@
     "    k(\\mathbf{x}, \\mathbf{x}^{\\prime}) = \\exp{\\left(-\\gamma\\lVert \\mathbf{x} - \\mathbf{x}^{\\prime}\\rVert^2\\right)}.\n",
     "\\end{equation}\n",
     "\n",
-    "In this notebook we use the gaussian kernel (```gaussian_kernel(XA, XB, gamma=1.0)```), however, we have also included the linear kernel function (```linear_kernel(XA, XB)```) in ```utilities/kernels.py```. You can check rather easily that if you use a linear kernel all of the kernel methods reduce explicitly to linear regression or principal component analysis in the original feature space."
+    "In this notebook we use the gaussian kernel (```rbf_kernel(XA, XB, gamma=1.0)```), however, we have also included the linear kernel function (```linear_kernel(XA, XB)```) in ```sklearn```. You can check rather easily that if you use a linear kernel all of the kernel methods reduce explicitly to linear regression or principal component analysis in the original feature space."
    ]
   },
   {
@@ -95,7 +99,8 @@
     "# Changing the kernels in this cell will change them throughout the notebook tutorial.\n",
     "# The function variable designates the kernel function to use throughout the notebook.\n",
     "# The string variable is used to pass to the utility classes. \n",
-    "kernel_func = gaussian_kernel\n",
+    "kernel_params = {\"kernel\": \"precomputed\"}\n",
+    "kernel_func = partial(rbf_kernel, gamma=1.0)\n",
     "kernel_type = 'gaussian'"
    ]
   },
@@ -590,10 +595,10 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "lr = LR(regularization=regularization)\n",
+    "lr = Ridge(alpha=regularization)\n",
     "\n",
     "lr.fit(X_train, Y_train)\n",
-    "Y_lr = lr.transform(X_test)\n",
+    "Y_lr = lr.predict(X_test)\n",
     "\n",
     "table_from_dict([get_stats(x=X_test, \n",
     "                           yp=Y_lr,\n",
@@ -623,11 +628,11 @@
    "outputs": [],
    "source": [
     "regularizations = np.linspace(-10, -2, 24)\n",
-    "krrs = [KRR(regularization=10**i, kernel_type=kernel_type) \n",
+    "krrs = [KernelRidge(alpha=10**i, **kernel_params) \n",
     "        for i in regularizations]\n",
     "\n",
     "for k in krrs:\n",
-    "    k.fit(X=X_train, K=K_train, Y=Y_train)"
+    "    k.fit(X=K_train, y = Y_train)"
    ]
   },
   {
@@ -636,10 +641,8 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "coeff_deter = np.array([k.statistics(X=X_test, \n",
-    "                                     Y=Y_test, \n",
-    "                                     K=K_test)['Coefficient of Determination<br>($R^2$)'] for k in krrs])\n",
-    "best_regularization = 10**regularizations[coeff_deter.argmax()]"
+    "ell = np.array([np.linalg.norm(Y_test - k.predict(K_test)) ** 2.0 / np.linalg.norm(Y_test) **2.0 for k in krrs])\n",
+    "best_regularization = 10**regularizations[ell.argmin()]"
    ]
   },
   {
@@ -655,15 +658,15 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "plt.semilogx(10**regularizations, coeff_deter, marker='o', zorder=-1)\n",
+    "plt.semilogx(10**regularizations, ell, marker='o', zorder=-1)\n",
     "plt.scatter(best_regularization,\n",
-    "            max(coeff_deter), marker='o', color='r',\n",
+    "            min(ell), marker='o', color='r',\n",
     "           )\n",
     "\n",
     "plt.xlabel(r\"$\\lambda$\")\n",
-    "plt.ylabel(r\"$R^2$\")\n",
+    "plt.ylabel(r\"$\\ell$\")\n",
     "\n",
-    "plt.title(r\"Effect of $\\lambda$ on $R^2$ for Kernel Ridge Regression\")\n",
+    "plt.title(r\"Effect of $\\lambda$ on $\\ell$ for Kernel Ridge Regression\")\n",
     "\n",
     "plt.show()"
    ]
@@ -717,11 +720,10 @@
     "\n",
     "regularization = 1e-12\n",
     "\n",
-    "krr = KRR(regularization=regularization, kernel_type=kernel_type)\n",
-    "krr.fit(X_train, Y_train)\n",
-    "Yhat_train = krr.transform(X_train).reshape(-1,Y.shape[-1])\n",
-    "PKY = krr.PKY\n",
-    "\n",
+    "krr = KernelRidge(alpha=regularization, **kernel_params)\n",
+    "krr.fit(K_train, Y_train)\n",
+    "Yhat_train = krr.predict(K_train).reshape(-1,Y.shape[-1])\n",
+    "PKY = krr.dual_coef_\n",
     "K_pca = K_train/(np.trace(K_train)/n_train)\n",
     "K_lr  = np.matmul(Yhat_train, Yhat_train.T)\n",
     "    \n",
@@ -829,13 +831,13 @@
    "source": [
     "fig, axes = plt.subplots(1,2, figsize=dbl_fig)\n",
     "\n",
-    "ref = KPCA(n_PC=2, kernel_type=kernel_type)\n",
-    "ref.fit(X_train)\n",
-    "xref = ref.transform(X_test)\n",
+    "ref = KernelPCA(n_components=2, **kernel_params)\n",
+    "ref.fit(K_train)\n",
+    "T_kpca = ref.transform(K_test)\n",
     "\n",
     "plot_projection(Y_test, T_kpcovr_test, fig=fig, ax=axes[0], \n",
     "                title = r\"KPCovR ($\\alpha={}$)\".format(alpha), **cmaps)\n",
-    "plot_projection(Y_test, xref, fig=fig, ax=axes[1], title = \"KPCA\", **cmaps)\n",
+    "plot_projection(Y_test, T_kpca, fig=fig, ax=axes[1], title = \"KPCA\", **cmaps)\n",
     "\n",
     "fig.suptitle(r\"These will become increasingly different as $\\alpha \\to 0.0.$\", \n",
     "             y=0.0, fontsize=plt.rcParams['font.size']+6)\n",
@@ -889,9 +891,9 @@
    "source": [
     "fig, axes = plt.subplots(1,2, figsize=dbl_fig)\n",
     "\n",
-    "ref = KRR(regularization=regularization, kernel_type=kernel_type)\n",
-    "ref.fit(X_train, Y_train)\n",
-    "yref = ref.transform(X_test)\n",
+    "ref = KernelRidge(alpha=regularization, **kernel_params)\n",
+    "ref.fit(K_train, Y_train)\n",
+    "yref = ref.predict(K_test)\n",
     "\n",
     "errors = np.concatenate((np.abs(Ypred-Y_test),np.abs(yref-Y_test)))\n",
     "\n",
@@ -1121,9 +1123,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# The Utility Classes\n",
+    "# Implementation in `scikit-learn`\n",
     "\n",
-    "Classes from the utility module enable computing KPCA, KRR, and KPCovR with a scikit.learn-like syntax. "
+    "Classes from `sklearn` enable computing KPCA and KRR. Here we'll review the basics of their usage; readers are encouraged to refer to sklearn documentation for further detail."
    ]
   },
   {
@@ -1132,16 +1134,8 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "from utilities.classes import KPCA, KRR, KPCovR"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "**Important Note**: In all kernel classes, the functions `fit`, `transform`, and `statistics` will either compute the designated kernel for the supplied $\\mathbf{X}$ data or use the provided precomputed kernel. The kernel is **always** a keyword argument (e.g. `model.fit(K=K)`), and the first argument, positionally, is $\\mathbf{X}$.\n",
-    "\n",
-    "In each demonstration, we will show the function signature using X, but we will also supply our precomputed kernel for computational efficiency."
+    "from sklearn.decomposition import KernelPCA\n",
+    "from sklearn.kernel_ridge import KernelRidge"
    ]
   },
   {
@@ -1157,14 +1151,15 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "kpca = KPCA(n_PC=2, kernel_type=kernel_type)"
+    "# We set kernel = 'precomputed' and supply kpca.fit(X) with X = K_train to improve computational efficiency.\n",
+    "kpca = KernelPCA(n_components=2, **kernel_params)"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Calling `kpca.fit(X)` will generate the kernel for $\\mathbf{X}$ and compute/internally store the eigenvectors/values."
+    "Calling `kpca.fit(K_train)` will generate the kernel for $\\mathbf{X}$ and compute/internally store the eigenvectors/values."
    ]
   },
   {
@@ -1173,14 +1168,14 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "kpca.fit(X_train, K=K_train)"
+    "kpca.fit(K_train)"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Calling `kpca.transform(X)` will compute and return the KPCA projection $\\mathbf{T}$."
+    "Calling `kpca.transform(K)` will compute and return the KPCA projection $\\mathbf{T}$."
    ]
   },
   {
@@ -1189,8 +1184,8 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "T_KPCA_train = kpca.transform(X_train, K=K_train)\n",
-    "T_KPCA_test= kpca.transform(X_test, K=K_test)"
+    "T_KPCA_train = kpca.transform(K_train)\n",
+    "T_KPCA_test= kpca.transform(K_test)"
    ]
   },
   {
@@ -1208,21 +1203,35 @@
     "plt.show()"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Calling `kpca.statistics(X, Y)` will output the statistics of the projection of the kernel created from $\\mathbf{X}$ and the regression onto $\\mathbf{Y}$."
-   ]
-  },
   {
    "cell_type": "code",
    "execution_count": null,
    "metadata": {},
    "outputs": [],
    "source": [
-    "table_from_dict([kpca.statistics(X_train, Y_train, K=K_train), \n",
-    "                 kpca.statistics(X_test, Y_test, K=K_test)], \n",
+    "PTX = np.matmul(np.diagflat(1/((v_K[:n_PC]))) ,            \n",
+    "    np.matmul(T_KPCA_train.T, X_train))\n",
+    "\n",
+    "Xr_train = np.matmul(T_KPCA_train, PTX)\n",
+    "Xr_test = np.matmul(T_KPCA_test, PTX)\n",
+    "\n",
+    "K_approx_train = np.matmul(T_KPCA_train,T_KPCA_train.T)\n",
+    "K_test_test = kernel_func(X_test, X_test)\n",
+    "K_test_test = center_kernel(K_test_test)\n",
+    "K_approx_test = np.matmul(T_KPCA_test,T_KPCA_test.T)\n",
+    "\n",
+    "table_from_dict([get_stats(x=X_train, \n",
+    "                           xr=Xr_train,\n",
+    "                           y=Y_train, \n",
+    "                           t=T_KPCA_train, \n",
+    "                           k=K_train, \n",
+    "                           kapprox=K_approx_train), \n",
+    "                 get_stats(x=X_test, \n",
+    "                           xr=Xr_test,\n",
+    "                           y=Y_test, \n",
+    "                           t=T_KPCA_test, \n",
+    "                           k=K_test_test, \n",
+    "                           kapprox=K_approx_test)], \n",
     "                 headers = [\"Training\", \"Testing\"], \n",
     "                 title=\"KPCA\")"
    ]
@@ -1240,7 +1249,8 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "krr = KRR(regularization=regularization, kernel_type=kernel_type)"
+    "# We set kernel = 'precomputed' and supply krr.fit(X,y) with X = K_train to improve computational efficiency.\n",
+    "krr = KernelRidge(alpha=regularization, **kernel_params)"
    ]
   },
   {
@@ -1256,14 +1266,14 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "krr.fit(X_train, Y_train, K=K_train)"
+    "krr.fit(K_train, Y_train)"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Calling `krr.transform(X)` will generate the desired kernel from input $\\mathbf{X}$ and compute and return the predicted $\\mathbf{Y}$ values from $\\hat{\\mathbf{Y}}_{KRR} = \\mathbf{K}\\mathbf{P}_{KY}$."
+    "Calling `krr.predict(X)` will generate the desired kernel from input $\\mathbf{X}$ and compute and return the predicted $\\mathbf{Y}$ values from $\\hat{\\mathbf{Y}}_{KRR} = \\mathbf{K}\\mathbf{P}_{KY}$."
    ]
   },
   {
@@ -1272,8 +1282,8 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "Y_krr_train = krr.transform(X_train, K=K_train)\n",
-    "Y_krr_test = krr.transform(X_test, K=K_test)"
+    "Y_krr_train = krr.predict(K_train)\n",
+    "Y_krr_test = krr.predict(K_test)"
    ]
   },
   {
@@ -1291,11 +1301,25 @@
     "plt.show()"
    ]
   },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "table_from_dict([get_stats(x=X_train, yp=Y_krr_train,y=Y_train, k=K_train), \n",
+    "                 get_stats(x=X_test,  yp=Y_krr_test, y=Y_test, k=K_test)], \n",
+    "                 headers = [\"Training\", \"Testing\"], \n",
+    "                 title=\"Kernel Ridge Regression\")"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Calling `krr.statistics(X,Y)` will output the statistics of the regression of $\\mathbf{X}$ and $\\mathbf{Y}$."
+    "# Implementation in `skcosmo`\n",
+    "\n",
+    "Classes from `skcosmo` enable computing KPCovR. Here we'll review the basics of their usage; readers are encouraged to refer to skcosmo documentation for further detail."
    ]
   },
   {
@@ -1304,10 +1328,16 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "table_from_dict([krr.statistics(X=X_train, Y=Y_train, K=K_train), \n",
-    "                 krr.statistics(X=X_test, Y=Y_test, K=K_test)], \n",
-    "                 headers = [\"Training\", \"Testing\"], \n",
-    "                 title=\"Kernel Ridge Regression\")"
+    "from utilities.classes import KPCovR # to be replaced with the class from skcosmo"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**Important Note**: In the class, the functions `fit`, `transform`, and `statistics` will either compute the designated kernel for the supplied $\\mathbf{X}$ data or use the provided precomputed kernel. The kernel is **always** a keyword argument (e.g. `model.fit(K=K)`), and the first argument, positionally, is $\\mathbf{X}$.\n",
+    "\n",
+    "We will show the function signature using X, but we will also supply our precomputed kernel for computational efficiency."
    ]
   },
   {
@@ -1330,7 +1360,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Calling `kpcovr.fit(X,Y)` will compute the weights $\\mathbf{P}_{KY}$ and internally store them."
+    "Calling `kpcovr.fit(X, Y, K)` will compute the weights $\\mathbf{P}_{KY}$ and internally store them."
    ]
   },
   {
@@ -1346,7 +1376,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Calling `kpcovr.transform(X)` will return the latent space transformation $\\mathbf{T}$, the predicted properties, and the reconstructed input data."
+    "Calling `kpcovr.transform(X, K)` will return the latent space transformation $\\mathbf{T}$, the predicted properties, and the reconstructed input data."
    ]
   },
   {
@@ -1394,13 +1424,6 @@
     "                ], \\\n",
     "                headers = [r\"KPCovR ($\\alpha={}$)\".format(kp.alpha), \"KRR\", \"KPCA\"])"
    ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": []
   }
  ],
  "metadata": {