ml-finance-python
python scripts for finance machine learning
git clone https://9o.is/git/ml-finance-python.git
notebook.ipynb
1 {
2 "cells": [
3 {
4 "cell_type": "markdown",
5 "metadata": {},
6 "source": [
7 "# Exercises: Instability of Parameter Estimates\n",
8 "\n",
9 "## Lecture Link\n",
10 "\n",
11 "This exercise notebook accompanies the lecture linked below. Refer to the lecture for explanations and sample code.\n",
12 "\n",
13 "https://www.quantopian.com/lectures#Instability-of-Estimates\n",
14 "\n",
15 "Part of the Quantopian Lecture Series:\n",
16 "\n",
17 "* [www.quantopian.com/lectures](https://www.quantopian.com/lectures)\n",
18 "* [github.com/quantopian/research_public](https://github.com/quantopian/research_public)"
19 ]
20 },
21 {
22 "cell_type": "code",
23 "execution_count": null,
24 "metadata": {},
25 "outputs": [],
26 "source": [
27 "import numpy as np\n",
28 "import matplotlib.pyplot as plt\n",
29 "import pandas as pd\n",
30 "from statsmodels.stats.stattools import jarque_bera\n",
31 "\n",
32 "np.random.seed(123)  # Fix the seed so reruns reuse the same random numbers (the value 123 is arbitrary)\n"
33 ]
34 },
35 {
36 "cell_type": "markdown",
37 "metadata": {},
38 "source": [
39 "# Exercise 1: Sample Size vs. Standard Deviation\n",
40 "\n",
41 "Using the normal distribution defined below, with mean 100 and standard deviation 25, find the means and standard deviations of samples of size 5, 25, 100, and 500."
42 ]
43 },
44 {
45 "cell_type": "code",
46 "execution_count": null,
47 "metadata": {
48 "scrolled": true
49 },
50 "outputs": [],
51 "source": [
52 "POPULATION_MU = 100\n",
53 "POPULATION_SIGMA = 25\n",
54 "sample_sizes = [5, 25, 100, 500]\n",
55 "\n",
56 "#Your code goes here\n"
57 ]
58 },
59 {
60 "cell_type": "markdown",
61 "metadata": {},
62 "source": [
63 "# Exercise 2: Instability of Predictions on Mean Alone\n",
64 "\n",
65 "## a. Finding Means\n",
66 "\n",
67 "Find the means of the following three data sets $X$, $Y$, and $Z$."
68 ]
69 },
70 {
71 "cell_type": "code",
72 "execution_count": null,
73 "metadata": {
74 "scrolled": true
75 },
76 "outputs": [],
77 "source": [
78 "X = [ 31., 6., 21., 32., 41., 4., 48., 38., 43., 36., 50., 20., 46., 33., 8., 27., 17., 44., 16., 39., 3., 37.,\n",
79 " 35., 13., 49., 2., 18., 42., 22., 25., 15., 24., 11., 19., 5., 40., 12., 10., 1., 45., 26., 29., 7., 30.,\n",
80 " 14., 23., 28., 0., 34., 9., 47.]\n",
81 "Y = [ 15., 41., 33., 29., 3., 28., 28., 8., 15., 22., 39., 38., 22., 10., 39., 40., 24., 15., 21., 25., 17., 33.,\n",
82 " 40., 32., 42., 5., 39., 8., 15., 25., 37., 33., 14., 25., 1., 31., 45., 5., 6., 19., 13., 39., 18., 49.,\n",
83 " 13., 38., 8., 25., 32., 40., 17.]\n",
84 "Z = [ 38., 23., 16., 35., 48., 18., 48., 38., 24., 27., 24., 35., 37., 28., 11., 12., 31., -1., 9., 19., 20., 0.,\n",
85 " 23., 33., 34., 24., 14., 28., 12., 25., 53., 19., 42., 21., 15., 36., 47., 20., 26., 41., 33., 50., 26., 22.,\n",
86 " -1., 35., 10., 25., 23., 24., 6.]\n",
87 "\n",
88 "#Your code goes here\n"
89 ]
90 },
91 {
92 "cell_type": "markdown",
93 "metadata": {},
94 "source": [
95 "## b. Checking for Normality\n",
96 "\n",
97 "Use the `jarque_bera` function to conduct a Jarque-Bera test on $X$, $Y$, and $Z$ to determine whether their distributions are normal. "
98 ]
99 },
100 {
101 "cell_type": "code",
102 "execution_count": null,
103 "metadata": {
104 "scrolled": true
105 },
106 "outputs": [],
107 "source": [
108 "#Your code goes here\n"
109 ]
110 },
111 {
112 "cell_type": "markdown",
113 "metadata": {},
114 "source": [
115 "## c. Instability of Estimates\n",
116 "\n",
117 "Plot a histogram of each sample $X$, $Y$, and $Z$, and mark each sample mean (the best estimate based on that sample) on its histogram."
118 ]
119 },
120 {
121 "cell_type": "code",
122 "execution_count": null,
123 "metadata": {
124 "scrolled": false
125 },
126 "outputs": [],
127 "source": [
128 "#Your code goes here\n"
129 ]
130 },
131 {
132 "cell_type": "markdown",
133 "metadata": {},
134 "source": [
135 "# Exercise 3: Sharpe Ratio Window Adjustment\n",
136 "\n",
137 "## a. Effect on Variability\n",
138 "\n",
139 "Just as in the lecture, find the mean and standard deviation of the running Sharpe ratio for THO, this time testing multiple window lengths: 300, 150, and 50. Compute the mean and standard deviation using only the data up to 200 days before the end of the series."
140 ]
141 },
142 {
143 "cell_type": "code",
144 "execution_count": null,
145 "metadata": {},
146 "outputs": [],
147 "source": [
148 "def sharpe_ratio(asset, riskfree):\n",
149 " return np.mean(asset - riskfree)/np.std(asset - riskfree)\n",
150 "\n",
151 "start = '2010-01-01'\n",
152 "end = '2015-01-01'\n",
153 "# get_pricing is provided by the Quantopian research environment; it is not importable elsewhere\n",
154 "treasury_ret = get_pricing('BIL', fields='price', start_date=start, end_date=end).pct_change()[1:]\n",
155 "pricing = get_pricing('THO', fields='price', start_date=start, end_date=end)\n",
156 "returns = pricing.pct_change()[1:]\n",
157 "\n",
158 "#Your code goes here\n"
159 ]
160 },
161 {
162 "cell_type": "markdown",
163 "metadata": {},
164 "source": [
165 "## b. Out-of-Sample Instability\n",
166 "\n",
167 "Plot the running Sharpe ratio for all three window lengths, along with their in-sample means and standard deviation bars."
168 ]
169 },
170 {
171 "cell_type": "code",
172 "execution_count": null,
173 "metadata": {
174 "scrolled": false
175 },
176 "outputs": [],
177 "source": [
178 "#Your code goes here\n"
179 ]
180 },
181 {
182 "cell_type": "markdown",
183 "metadata": {},
184 "source": [
185 "# Exercise 4: Weather\n",
186 "\n",
187 "## a. Temperature in Boston\n",
188 "\n",
189 "Find the mean and standard deviation of Boston's weekly average temperatures for 2015, stored in `b15_df`."
190 ]
191 },
192 {
193 "cell_type": "code",
194 "execution_count": null,
195 "metadata": {
196 "scrolled": true
197 },
198 "outputs": [],
199 "source": [
200 "b15_df = pd.DataFrame([ 29., 22., 19., 17., 19., 19., 15., 16., 18., 25., 21.,\n",
201 " 25., 29., 27., 36., 38., 40., 44., 49., 50., 58., 61.,\n",
202 " 67., 69., 74., 72., 76., 81., 81., 80., 83., 82., 80.,\n",
203 " 79., 79., 80., 74., 72., 68., 68., 65., 61., 57., 50.,\n",
204 " 46., 42., 41., 35., 30., 27., 28., 28.],\n",
205 " columns = ['Weekly Avg Temp'],\n",
206 " index = pd.date_range('1/1/2015', periods=52, freq='W'))\n",
207 "\n",
208 "#Your code goes here\n"
209 ]
210 },
211 {
212 "cell_type": "markdown",
213 "metadata": {},
214 "source": [
215 "## b. Temperature in Palo Alto\n",
216 "\n",
217 "Find the mean and standard deviation of Palo Alto's weekly average temperatures for 2015, stored in `p15_df`."
218 ]
219 },
220 {
221 "cell_type": "code",
222 "execution_count": null,
223 "metadata": {
224 "scrolled": true
225 },
226 "outputs": [],
227 "source": [
228 "p15_df = pd.DataFrame([ 49., 53., 51., 47., 50., 46., 49., 51., 49., 45., 52.,\n",
229 " 54., 54., 55., 55., 57., 56., 56., 57., 63., 63., 65.,\n",
230 " 65., 69., 67., 70., 67., 67., 68., 68., 70., 72., 72.,\n",
231 " 70., 72., 70., 66., 66., 68., 68., 65., 66., 62., 61.,\n",
232 " 63., 57., 55., 55., 55., 55., 55., 48.],\n",
233 " columns = ['Weekly Avg Temp'],\n",
234 " index = pd.date_range('1/1/2015', periods=52, freq='W'))\n",
235 "\n",
236 "#Your code goes here\n"
237 ]
238 },
239 {
240 "cell_type": "markdown",
241 "metadata": {},
242 "source": [
243 "## c. Predicting 2016 Temperatures\n",
244 "\n",
245 "Use the means you found in parts a and b to attempt to predict 2016 temperatures for both cities. Create one histogram for each of the 2016 data sets in `b16_df` and `p16_df`, and draw a vertical line on each at the corresponding 2015 mean to represent your prediction."
246 ]
247 },
248 {
249 "cell_type": "code",
250 "execution_count": null,
251 "metadata": {},
252 "outputs": [],
253 "source": [
254 "b16_df = pd.DataFrame([ 26., 22., 20., 19., 18., 19., 17., 17., 19., 20., 23., 22., 28., 28., 35., 38., 42., 47., 49., 56., 59., 61.,\n",
255 " 61., 70., 73., 73., 73., 77., 78., 82., 80., 80., 81., 78., 82., 78., 76., 71., 69., 66., 60., 63., 56., 50.,\n",
256 " 44., 43., 34., 33., 31., 28., 27., 20.],\n",
257 " columns = ['Weekly Avg Temp'],\n",
258 " index = pd.date_range('1/1/2016', periods=52, freq='W'))\n",
259 "\n",
260 "p16_df = pd.DataFrame([ 50., 50., 51., 48., 48., 49., 50., 45., 52., 50., 51., 52., 50., 56., 58., 55., 61., 56., 61., 62., 62., 64.,\n",
261 " 64., 69., 71., 66., 69., 70., 68., 71., 70., 69., 72., 71., 66., 69., 70., 70., 66., 67., 64., 64., 65., 61.,\n",
262 " 61., 59., 56., 53., 55., 52., 52., 51.],\n",
263 " columns = ['Weekly Avg Temp'],\n",
264 " index = pd.date_range('1/1/2016', periods=52, freq='W'))\n",
265 "\n",
266 "#Your code goes here\n"
267 ]
268 },
269 {
270 "cell_type": "markdown",
271 "metadata": {},
272 "source": [
273 "---\n",
274 "\n",
275 "Congratulations on completing the instability of parameter estimates exercises!\n",
276 "\n",
277 "As you learn more about writing trading models and the Quantopian platform, enter a daily [Quantopian Contest](https://www.quantopian.com/contest). Your strategy will be evaluated for a cash prize every day.\n",
278 "\n",
279 "Start by going through the [Writing a Contest Algorithm](https://www.quantopian.com/tutorials/contest) tutorial."
280 ]
281 },
282 {
283 "cell_type": "markdown",
284 "metadata": {},
285 "source": [
286 "*This presentation is for informational purposes only and does not constitute an offer to sell, a solicitation\n",
287 "to buy, or a recommendation for any security; nor does it constitute an offer to provide investment advisory or other services by Quantopian, Inc. (\"Quantopian\"). Nothing contained herein constitutes investment advice or offers any opinion with respect to the suitability of any security, and any views expressed herein should not be taken as advice to buy, sell, or hold any security or as an endorsement of any security or company. In preparing the information contained herein, Quantopian, Inc. has not taken into account the investment needs, objectives, and financial circumstances of any particular investor. Any views expressed and data illustrated herein were prepared based upon information, believed to be reliable, available to Quantopian, Inc. at the time of publication. Quantopian makes no guarantees as to their accuracy or completeness. All information is subject to change and may quickly become unreliable for various reasons, including changes in market conditions or economic circumstances.*"
288 ]
289 }
290 ],
291 "metadata": {
292 "kernelspec": {
293 "display_name": "Python 2",
294 "language": "python",
295 "name": "python2"
296 },
297 "language_info": {
298 "codemirror_mode": {
299 "name": "ipython",
300 "version": 2
301 },
302 "file_extension": ".py",
303 "mimetype": "text/x-python",
304 "name": "python",
305 "nbconvert_exporter": "python",
306 "pygments_lexer": "ipython2",
307 "version": "2.7.10"
308 }
309 },
310 "nbformat": 4,
311 "nbformat_minor": 1
312 }
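A possible starting point for Exercise 1, kept outside the notebook JSON so the file itself stays intact. This is a minimal sketch, not the lecture's solution: it assumes Python 3 and uses only the standard library so it can run outside the Quantopian environment. The helper name `sample_stats` and the seed value 123 are hypothetical choices made for illustration. The constants are copied from the Exercise 1 code cell. Inside the notebook, `np.random.normal`, `np.mean`, and `np.std` would be the natural equivalents.

```python
import random
import statistics

# Constants copied from the Exercise 1 code cell.
POPULATION_MU = 100
POPULATION_SIGMA = 25
sample_sizes = [5, 25, 100, 500]


def sample_stats(sizes, mu, sigma, seed=123):
    """Draw one normal sample per size and return {size: (mean, std)}.

    Hypothetical helper for illustration; the seed is an arbitrary choice.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    results = {}
    for n in sizes:
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        # pstdev divides by n, which matches the default behaviour of np.std
        results[n] = (statistics.mean(sample), statistics.pstdev(sample))
    return results


stats = sample_stats(sample_sizes, POPULATION_MU, POPULATION_SIGMA)
for n in sample_sizes:
    mean, std = stats[n]
    print("n=%3d  mean=%7.2f  std=%6.2f" % (n, mean, std))
```

Rerunning with several different seeds should show the estimates from the smaller samples moving around far more than those from the sample of 500, which is the instability the exercise is meant to expose.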