Separated the logic to group by market cap in the notebook (quantopian#294)

New groupby logic
quantopiancal authored Nov 21, 2018
1 parent e9e209b commit 908df78
Showing 2 changed files with 41 additions and 69 deletions.
80 changes: 35 additions & 45 deletions notebooks/tutorials/3_factset_alphalens_lesson_4/notebook.ipynb
@@ -13,11 +13,16 @@
"1. Grouping assets by market cap, then analyzing each cap type individually.\n",
"2. Writing group neutral strategies.\n",
"3. Determining an alpha factor's decay rate.\n",
"4. Dealing with a common Alphalens error named MaxLossExceededError.\n",
"\n",
"**All sections of this lesson will use the data produced by the Pipeline created in the following cell. Please run it.**\n",
"4. Dealing with a common Alphalens error named MaxLossExceededError."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Grouping By Market Cap\n",
"\n",
"**Important note**: Until this lesson, we passed the output of `run_pipeline()` to `get_clean_factor_and_forward_returns()` without any changes. This was possible because the previous lessons' Pipelines only returned one column. This lesson's Pipeline returns two columns, which means we need to *specify the column* we're passing as factor data. Look for commented code near `get_clean_factor_and_forward_returns()` in the following cell to see how to do this."
"The following code defines a universe and creates an alpha factor within a pipeline. It also returns a classifier by using the quantiles() function. This function is useful for grouping your assets by an arbitrary column of data. In this example, we will group our assets by their market cap, and analyze how effective our alpha factor is among the different cap types (small, medium, and large cap)."
]
},
{
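As a rough illustration of what the commit switches to: `quantiles(bins=3)` labels each asset 0, 1, or 2 according to which third of the masked universe its market cap falls into on each day. The following standalone pandas sketch mirrors that idea outside of Pipeline; the tickers and market-cap values are hypothetical, not part of the commit.

import pandas as pd

# Hypothetical market caps for a handful of assets (USD millions).
market_cap = pd.Series(
    {'AAPL': 900000, 'XYZ': 450, 'ABC': 3200, 'DEF': 95000, 'GHI': 1100}
)

# pd.qcut with 3 bins mirrors the tercile bucketing that quantiles(bins=3)
# performs inside the Pipeline: labels 0, 1, 2 from smallest to largest cap.
cap_type = pd.qcut(market_cap, q=3, labels=[0, 1, 2])

# Map the integer buckets to readable names, as the notebook does later.
cap_type = cap_type.astype(int).replace(
    {0: 'small_cap', 1: 'mid_cap', 2: 'large_cap'}
)
print(cap_type)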
@@ -54,19 +59,17 @@
"\n",
" factor_to_analyze = (ciwc_processed + spwc_processed).zscore()\n",
"\n",
" # The following columns will help us group assets by market cap. This will allow us to analyze\n",
" # whether our alpha factor's predictiveness varies among assets with different market caps.\n",
" market_cap = factset.Fundamentals.mkt_val.latest\n",
" is_small_cap = market_cap.percentile_between(0, 100)\n",
" is_mid_cap = market_cap.percentile_between(50, 100)\n",
" is_large_cap = market_cap.percentile_between(90, 100)\n",
" \n",
" # .quantiles(), when supplied with bins=3, tells you which third that the assets value places in.\n",
" # for example, in 2018, Apple is in the third bin, because it has a large market cap.\n",
" # A different asset with a smaller market cap would probably be in the first or second bin.\n",
" cap_type = market_cap.quantiles(bins=3, mask=base_universe)\n",
"\n",
" return Pipeline(\n",
" columns = {\n",
" 'factor_to_analyze': factor_to_analyze, \n",
" 'small_cap_filter': is_small_cap,\n",
" 'mid_cap_filter': is_mid_cap,\n",
" 'large_cap_filter': is_large_cap,\n",
" 'cap_type': cap_type\n",
" },\n",
" screen = (\n",
" base_universe\n",
@@ -75,29 +78,14 @@
" )\n",
" )\n",
"\n",
"\n",
"# Create the pipeline data\n",
"pipeline_output = run_pipeline(make_pipeline(), '2013-1-1', '2014-1-1')\n",
"pricing_data = get_pricing(pipeline_output.index.levels[1], '2013-1-1', '2014-3-1', fields='open_price')\n",
"\n",
"# To group by market cap, we will follow the following steps.\n",
"# Replace the quantile values in the cap_type column for added clarity\n",
"pipeline_output['cap_type'].replace([0, 1, 2], ['small_cap', 'mid_cap', 'large_cap'], inplace=True)\n",
"\n",
"# Convert the \"True\" values to ones, so they can be added together\n",
"pipeline_output[['small_cap_filter', 'mid_cap_filter', 'large_cap_filter']] *= 1\n",
"\n",
"# If a stock passed the large_cap filter, it also passed the mid_cap and small_cap filters.\n",
"# This means we can add the three columns, and stocks that are large_cap will get a value of 3,\n",
"# stocks that are mid cap will get a value of 2, and stocks that are small cap will get 1.\n",
"pipeline_output['cap_type'] = (\n",
" pipeline_output['small_cap_filter'] + pipeline_output['mid_cap_filter'] + pipeline_output['large_cap_filter']\n",
")\n",
"\n",
"# drop the old columns, we don't need them anymore\n",
"pipeline_output.drop(['small_cap_filter', 'mid_cap_filter', 'large_cap_filter'], axis=1, inplace=True)\n",
"\n",
"# rename the 1's, 2's and 3's for clarity\n",
"pipeline_output['cap_type'].replace([1, 2, 3], ['small_cap', 'mid_cap', 'large_cap'], inplace=True)\n",
"pricing_data = get_pricing(pipeline_output.index.levels[1], '2013-1-1', '2014-3-1', fields='open_price')\n",
"\n",
"# the final product\n",
"pipeline_output.head(5)"
]
},
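A quick sanity check on the grouping built above, as a sketch that assumes the `pipeline_output` DataFrame produced by the previous cell: with `quantiles(bins=3)`, the three labels should each cover roughly a third of the universe on any given date.

# Count how many assets land in each cap bucket on each date; the three
# groups should be roughly equal in size because the bins are terciles.
group_sizes = pipeline_output.groupby(level=0)['cap_type'].value_counts()
print(group_sizes.head(9))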
@@ -111,7 +99,9 @@
"\n",
"You can group assets by any classifier, but sector and market cap are most common. The Pipeline in the first cell of this lesson returns a column named `cap_type`, whose values represent the assets market capitalization. All we have to do now is pass that column to the `groupby` argument of `get_clean_factor_and_forward_returns()`\n",
"\n",
"**Run the following cell, and notice the charts at the bottom of the tear sheet showing how our factor performs among different cap types.**"
"**Run the following cell, and notice the charts at the bottom of the tear sheet showing how our factor performs among different cap types.**\n",
"\n",
"**Important note**: Until this lesson, we passed the output of `run_pipeline()` to `get_clean_factor_and_forward_returns()` without any changes. This was possible because the previous lessons' Pipelines only returned one column. This lesson's Pipeline returns two columns, which means we need to *specify the column* we're passing as factor data. Look for commented code near `get_clean_factor_and_forward_returns()` in the following cell to see how to do this."
]
},
{
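Once `cap_type` is wired into the `groupby` argument, the per-group charts mentioned above can be requested explicitly. A minimal sketch, assuming the `factor_data` built in the following cell; `by_group` is an optional flag of alphalens' `create_returns_tear_sheet`.

from alphalens.tears import create_returns_tear_sheet

# by_group=True adds per-group (small/mid/large cap) return plots to the
# tear sheet; group_neutral=False keeps the analysis across the whole universe.
create_returns_tear_sheet(factor_data, by_group=True, group_neutral=False)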
@@ -125,7 +115,7 @@
"from alphalens.tears import create_returns_tear_sheet\n",
"\n",
"factor_data = get_clean_factor_and_forward_returns(\n",
" factor=pipeline_output['factor_to_analyze'],\n",
" factor=pipeline_output['factor_to_analyze'], # This is how you pass a single column from pipeline_output\n",
" prices=pricing_data,\n",
" groupby=pipeline_output['cap_type'],\n",
")\n",
@@ -247,26 +237,26 @@
"metadata": {},
"outputs": [],
"source": [
"new_pipeline_output = run_pipeline(\n",
"pipeline_output = run_pipeline(\n",
" make_pipeline(),\n",
" start_date='2013-1-1', \n",
" end_date='2014-1-1' # *** NOTE *** Our factor data ends in 2014\n",
")\n",
"\n",
"new_pricing_data = get_pricing(\n",
"pricing_data = get_pricing(\n",
" pipeline_output.index.levels[1], \n",
" start_date='2013-1-1',\n",
" end_date='2015-2-1', # *** NOTE *** Our pricing data ends in 2015\n",
" fields='open_price'\n",
")\n",
"\n",
"new_factor_data = get_clean_factor_and_forward_returns(\n",
" new_pipeline_output['factor_to_analyze'], \n",
" new_pricing_data,\n",
"factor_data = get_clean_factor_and_forward_returns(\n",
" pipeline_output['factor_to_analyze'], \n",
" pricing_data,\n",
" periods=range(1,252,20) # Change the step to 10 or more for long look forward periods to save time\n",
")\n",
"\n",
"mean_information_coefficient(new_factor_data).plot()"
"mean_information_coefficient(factor_data).plot()"
]
},
{
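The lesson overview also names `MaxLossExceededError`. `get_clean_factor_and_forward_returns()` drops rows whose forward returns cannot be computed, for example when the pricing data does not extend far enough past the factor data, and it raises this error when the dropped fraction exceeds its `max_loss` threshold (0.35 by default). A hedged sketch of the two usual remedies, reusing the `pipeline_output` and `pricing_data` names from the cell above:

from alphalens.utils import get_clean_factor_and_forward_returns

# Remedy 1: extend the pricing data past the end of the factor data so that
# long forward-return windows can actually be computed (done in the cell above).
# Remedy 2: if some loss is acceptable, raise the max_loss threshold
# (default 0.35, i.e. 35% of rows) instead of letting the error stop the run.
factor_data = get_clean_factor_and_forward_returns(
    pipeline_output['factor_to_analyze'],
    pricing_data,
    periods=range(1, 252, 20),
    max_loss=0.6,  # tolerate dropping up to 60% of the factor values
)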
@@ -293,23 +283,23 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 2",
"language": "python",
"name": "python3"
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.0"
"pygments_lexer": "ipython2",
"version": "2.7.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
}
30 changes: 6 additions & 24 deletions notebooks/tutorials/3_factset_alphalens_lesson_5/notebook.ipynb
@@ -60,16 +60,13 @@
" # The following columns will help us group assets by market cap. This will allow us to analyze\n",
" # whether our alpha factor's predictiveness varies among assets with different market caps.\n",
" market_cap = factset.Fundamentals.mkt_val.latest\n",
" is_small_cap = market_cap.percentile_between(0, 100)\n",
" is_mid_cap = market_cap.percentile_between(50, 100)\n",
" is_large_cap = market_cap.percentile_between(90, 100)\n",
" cap_type = market_cap.quantiles(bins=3, mask=base_universe)\n",
"\n",
" return Pipeline(\n",
" columns = {\n",
" 'factor_to_analyze': factor_to_analyze, \n",
" 'small_cap_filter': is_small_cap,\n",
" 'mid_cap_filter': is_mid_cap,\n",
" 'large_cap_filter': is_large_cap,\n",
" 'factor_to_analyze': factor_to_analyze,\n",
" 'cap_type': cap_type\n",
" \n",
" },\n",
" screen = (\n",
" base_universe\n",
@@ -78,26 +75,11 @@
" )\n",
" )\n",
"\n",
"# To group by market cap, we will follow the following steps.\n",
"\n",
"# Convert the \"True\" values to ones, so they can be added together\n",
"pipeline_output[['small_cap_filter', 'mid_cap_filter', 'large_cap_filter']] *= 1\n",
"\n",
"# If a stock passed the large_cap filter, it also passed the mid_cap and small_cap filters.\n",
"# This means we can add the three columns, and stocks that are large_cap will get a value of 3,\n",
"# stocks that are mid cap will get a value of 2, and stocks that are small cap will get 1.\n",
"pipeline_output['cap_type'] = (\n",
" pipeline_output['small_cap_filter'] + pipeline_output['mid_cap_filter'] + pipeline_output['large_cap_filter']\n",
")\n",
"\n",
"# drop the old columns, we don't need them anymore\n",
"pipeline_output.drop(['small_cap_filter', 'mid_cap_filter', 'large_cap_filter'], axis=1, inplace=True)\n",
"\n",
"pipeline_output = run_pipeline(make_pipeline(), '2015-1-1', '2016-1-1')\n",
"# rename the 1's, 2's and 3's for clarity\n",
"pipeline_output['cap_type'].replace([1, 2, 3], ['small_cap', 'mid_cap', 'large_cap'], inplace=True)\n",
"\n",
"# the final product\n",
"pipeline_output.head(5)"
"pricing_data = get_pricing(pipeline_output.index.levels[1], '2015-1-1', '2016-6-1', fields='open_price')"
]
},
{
