Commit 6ccfb9e

Scaled features
1 parent 57af448 commit 6ccfb9e

1 file changed

+6
-1
lines changed

04 - Clustering.ipynb

Lines changed: 6 additions & 1 deletion
@@ -10,7 +10,7 @@
 "\n",
 "For example, let's take a look at the Palmer Islands penguin dataset, which contains measurements of penguins.\n",
 "\n",
-"Let's start by examining a dataset that contains observations of multiple classes. We'll use a dataset that contains observations of three different species pf penguin.\n",
+"Let's start by examining a dataset that contains observations of multiple classes. We'll use a dataset that contains observations of three different species of penguin.\n",
 "\n",
 "> **Citation**: The penguins dataset used in the this exercise is a subset of data collected and made available by [Dr. Kristen\n",
 "Gorman](https://www.uaf.edu/cfos/people/faculty/detail/kristen-gorman.php)\n",
@@ -52,8 +52,13 @@
 },
 "cell_type": "code",
 "source": [
+"from sklearn.preprocessing import MinMaxScaler\n",
 "from sklearn.decomposition import PCA\n",
 "\n",
+"# Normalize the numeric features so they're on the same scale\n",
+"penguin_features[penguins.columns[0:4]] = MinMaxScaler().fit_transform(penguin_features[penguins.columns[0:4]])\n",
+"\n",
+"# Get two principal components\n",
 "pca = PCA(n_components=2).fit(penguin_features.values)\n",
 "penguins_2d = pca.transform(penguin_features.values)\n",
 "penguins_2d[0:10]"

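The change above scales the numeric features before running PCA. A minimal sketch of why that ordering matters is given below. It is not the notebook's code: the data is a synthetic stand-in, and the names `features`, `unscaled_weight`, `scaled`, and `scaled_2d` are hypothetical. It only illustrates that, without scaling, the column with the largest values tends to dominate the first principal component.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in data (NOT the penguins dataset):
# four numeric columns on very different scales.
rng = np.random.default_rng(0)
features = np.column_stack([
    rng.normal(45.0, 5.0, 100),
    rng.normal(17.0, 2.0, 100),
    rng.normal(200.0, 14.0, 100),
    rng.normal(4200.0, 800.0, 100),  # large-valued column, largest variance
])

# PCA on the raw values: measure how much of the first component
# points along the large-valued column (index 3).
unscaled_pca = PCA(n_components=2).fit(features)
unscaled_weight = abs(unscaled_pca.components_[0, 3])

# Min-max scaling maps each column to the range [0, 1], so no single
# column dominates purely because of its units.
scaled = MinMaxScaler().fit_transform(features)

# Project the scaled features onto two principal components.
scaled_2d = PCA(n_components=2).fit_transform(scaled)

print(unscaled_weight)
print(scaled_2d.shape)
```

In this sketch the unscaled first component is almost entirely the large-valued column, which is the effect the commit's `MinMaxScaler` step is presumably meant to avoid. Other scalers (for example `StandardScaler`) would be a reasonable alternative; the commit itself uses `MinMaxScaler`.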