project2-push
henryem committed Mar 28, 2016
1 parent 118528d commit 4ed9f45
Showing 1 changed file with 32 additions and 27 deletions.
59 changes: 32 additions & 27 deletions labs/project2/project2.ipynb
@@ -16,7 +16,7 @@
},
{
"cell_type": "code",
-"execution_count": 2,
+"execution_count": 3,
"metadata": {
"collapsed": false
},
@@ -57,7 +57,7 @@
},
{
"cell_type": "code",
-"execution_count": 3,
+"execution_count": 4,
"metadata": {
"collapsed": false
},
@@ -79,7 +79,7 @@
},
{
"cell_type": "code",
-"execution_count": 4,
+"execution_count": 5,
"metadata": {
"collapsed": false
},
@@ -99,7 +99,7 @@
},
{
"cell_type": "code",
-"execution_count": 5,
+"execution_count": 6,
"metadata": {
"collapsed": false
},
@@ -111,7 +111,7 @@
},
{
"cell_type": "code",
-"execution_count": 6,
+"execution_count": 7,
"metadata": {
"collapsed": false
},
@@ -129,7 +129,7 @@
},
{
"cell_type": "code",
-"execution_count": 7,
+"execution_count": 8,
"metadata": {
"collapsed": false
},
@@ -165,7 +165,7 @@
},
{
"cell_type": "code",
-"execution_count": 8,
+"execution_count": 9,
"metadata": {
"collapsed": false,
"scrolled": true
@@ -188,7 +188,7 @@
},
{
"cell_type": "code",
-"execution_count": 9,
+"execution_count": 10,
"metadata": {
"collapsed": false
},
@@ -202,12 +202,12 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"**Question 1.1.2:** Assign `stemmed_message` to the stemmed version of the word \"message\"?"
+"**Question 1.1.2:** Assign `stemmed_message` to the stemmed version of the word \"message\"."
]
},
{
"cell_type": "code",
-"execution_count": 10,
+"execution_count": 11,
"metadata": {
"collapsed": false
},
@@ -221,7 +221,7 @@
},
{
"cell_type": "code",
-"execution_count": 11,
+"execution_count": 12,
"metadata": {
"collapsed": false
},
@@ -239,7 +239,7 @@
},
{
"cell_type": "code",
-"execution_count": 12,
+"execution_count": 13,
"metadata": {
"collapsed": false
},
@@ -253,7 +253,7 @@
},
{
"cell_type": "code",
-"execution_count": 13,
+"execution_count": 14,
"metadata": {
"collapsed": false
},
@@ -271,20 +271,24 @@
},
{
"cell_type": "code",
-"execution_count": 61,
+"execution_count": 15,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
+"# In our solution, we found it useful to first make an array\n",
+"# called shortened containing the number of words that was\n",
+"# chopped off of each word in vocab_table, but you don't have\n",
+"# to do that.\n",
"shortened = ...\n",
"most_shortened = ...\n",
"vocab_table.where('Word', most_shortened)"
]
},
{
"cell_type": "code",
-"execution_count": 15,
+"execution_count": 16,
"metadata": {
"collapsed": false
},
@@ -307,16 +311,17 @@
},
{
"cell_type": "code",
-"execution_count": 16,
+"execution_count": 17,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Here we have defined the proportion of our data\n",
"# that we want to designate for training as 11/16ths\n",
-"# of our total dataset, and the amount reserved for\n",
-"# validation is 2/16ths. \n",
+"# of our total dataset. 2/16ths of the data is\n",
+"# reserved for validation. The remaining 3/16ths\n",
+"# will be used for testing.\n",
"\n",
"training_proportion = 11/16\n",
"validation_proportion = 2/16\n",
@@ -344,7 +349,7 @@
},
{
"cell_type": "code",
-"execution_count": 62,
+"execution_count": 18,
"metadata": {
"collapsed": false
},
@@ -376,13 +381,13 @@
},
{
"cell_type": "code",
-"execution_count": 18,
+"execution_count": 19,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
-"# Just run this cell\n",
+"# Just run this cell to define genre_color.\n",
"\n",
"def genre_color(genre):\n",
" \"\"\"Assign a color to each genre.\"\"\"\n",
@@ -415,7 +420,7 @@
},
{
"cell_type": "code",
-"execution_count": 19,
+"execution_count": 20,
"metadata": {
"collapsed": false
},
@@ -497,7 +502,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"**Question 2.1.2.** Complete the function `distance` that computes the Euclidean distance between any two songs, using two features. The last two lines call the `distance` function to show that *Lookin' for Love* is closer to *In Your Eyes* than *Insane In The Brain*. "
+"**Question 2.1.2.** Complete the function `distance_two_features` that computes the Euclidean distance between any two songs, using two features. The last two lines call the `distance_two_features` function to show that *Lookin' for Love* is closer to *In Your Eyes* than *Insane In The Brain*. "
]
},
{
@@ -543,7 +548,7 @@
"source": [
"**Question 2.1.3.** Define the higher-order function `distance_from` that takes a single song title and two features. It returns a function `for_song` that takes a second song title and computes the distance between the first and second songs.\n",
"\n",
-"*Hint: Call `distance` in your solution rather than re-implementing its computation.*"
+"*Hint: Call `distance_two_features` in your solution rather than re-implementing its computation.*"
]
},
{
@@ -665,7 +670,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"**Question 3.1.** Write a function to compute the Euclidean distance between two *arrays* of features of *arbitrary* (but equal) length. Use it to compute the distance between the first song in the training set and the first song in the test set, *using all of the features*."
+"**Question 3.1.** Write a function to compute the Euclidean distance between two *arrays* of features of *arbitrary* (but equal) length. Use it to compute the distance between the first song in the training set and the first song in the test set, *using all of the features*. (Remember that the title, artist, and genre of the songs are not features.)"
]
},
{
@@ -851,7 +856,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"**Question 3.1.4.** Now compute the 5-nearest neighbors classification of the first song in the test set. That is, decide on its genre by finding the most common genre among its 5 nearest neighbors, according to the distances you've calculated. Then check whether your classifier chose the right genre. "
+"**Question 3.1.4.** Now compute the 5-nearest neighbors classification of the first song in the test set. That is, decide on its genre by finding the most common genre among its 5 nearest neighbors, according to the distances you've calculated. Then check whether your classifier chose the right genre. (Depending on the features you chose, your classifier might not get this song right, and that's okay.)"
]
},
{
@@ -1214,7 +1219,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"An *ablation* study involves attempting to determine which features matter most for classification accuracy by removing each of them individually.\n",
+"An *ablation* study involves attempting to determine which features matter most for classification accuracy by removing (\"ablating\") each of them individually.\n",
"\n",
"**Question 4.1.3.** Create a two-column table `ablation_accuracies` that shows the accuracy on the validation set of a 5-NN classifier that has all `staff_features` except one. Include a row for every feature in `staff_features` that you leave out. (*Hint*: Lists have a `.remove` method that takes the element to be removed.)"
]