ml-finance-python
python scripts for finance machine learning
git clone https://9o.is/git/ml-finance-python.git
notebook.ipynb
(29126B)
1 {
2 "cells": [
3 {
4 "cell_type": "markdown",
5 "metadata": {
6 "collapsed": true
7 },
8 "source": [
9 "# EventVestor: Stock Splits\n",
10 "\n",
11 "In this notebook, we'll take a look at EventVestor's *Stock Splits* dataset, available on the [Quantopian Store](https://www.quantopian.com/store). This dataset spans January 01, 2007 through the current day, and documents stock splits and reverse stock splits.\n",
12 "\n",
13 "### Blaze\n",
14 "Before we dig into the data, a word on how you generally access Quantopian Store datasets. These datasets are available through an API service known as [Blaze](http://blaze.pydata.org). Blaze provides the Quantopian user with a convenient interface to access very large datasets.\n",
15 "\n",
16 "Blaze provides an important function for accessing these datasets. Some of them contain many millions of records, so bringing that data directly into Quantopian Research is just not viable. Blaze lets us provide a simple querying interface and shift the burden over to the server side.\n",
17 "\n",
18 "A common workflow is to use Blaze to reduce your dataset in size, convert it to Pandas, and then use Pandas for further computation, manipulation, and visualization.\n",
19 "\n",
20 "Helpful links:\n",
21 "* [Query building for Blaze](http://blaze.pydata.org/en/latest/queries.html)\n",
22 "* [Pandas-to-Blaze dictionary](http://blaze.pydata.org/en/latest/rosetta-pandas.html)\n",
23 "* [SQL-to-Blaze dictionary](http://blaze.pydata.org/en/latest/rosetta-sql.html)\n",
24 "\n",
25 "Once you've limited the size of your Blaze object, you can convert it to a Pandas DataFrame using:\n",
26 "> `from odo import odo` \n",
27 "> `odo(expr, pandas.DataFrame)`\n",
28 "\n",
29 "### Free samples and limits\n",
30 "One other key caveat: we limit the number of results returned from any given expression to 10,000 to protect against runaway memory usage. To be clear, you still have access to all the data on the server side; the limit applies only to the size of the responses returned by Blaze.\n",
31 "\n",
32 "There is a *free* version of this dataset as well as a paid one. The free one includes about three years of historical data, though not up to the current day.\n",
33 "\n",
34 "With the preamble in place, let's get started:"
35 ]
36 },
37 {
38 "cell_type": "code",
39 "execution_count": 3,
40 "metadata": {
41 "collapsed": false
42 },
43 "outputs": [],
44 "source": [
45 "# import the dataset\n",
46 "from quantopian.interactive.data.eventvestor import stock_splits\n",
47 "# or if you want to import the free dataset, use:\n",
48 "# from quantopian.interactive.data.eventvestor import stock_splits_free\n",
49 "\n",
50 "# import data operations\n",
51 "from odo import odo\n",
52 "# import other libraries we will use\n",
53 "import pandas as pd"
54 ]
55 },
56 {
57 "cell_type": "code",
58 "execution_count": 4,
59 "metadata": {
60 "collapsed": false
61 },
62 "outputs": [
63 {
64 "data": {
65 "text/plain": [
66 "dshape(\"\"\"var * {\n",
67 " event_id: ?float64,\n",
68 " asof_date: datetime,\n",
69 " trade_date: ?datetime,\n",
70 " symbol: ?string,\n",
71 " event_type: ?string,\n",
72 " event_headline: ?string,\n",
73 " split_type: ?string,\n",
74 " split_factor: ?string,\n",
75 " new_shares: ?float64,\n",
76 " old_shares: ?float64,\n",
77 " effective_date: ?datetime,\n",
78 " event_rating: ?float64,\n",
79 " timestamp: datetime,\n",
80 " sid: ?int64\n",
81 " }\"\"\")"
82 ]
83 },
84 "execution_count": 4,
85 "metadata": {},
86 "output_type": "execute_result"
87 }
88 ],
89 "source": [
90 "# Let's inspect the dataset's columns and types using Blaze's dshape attribute\n",
91 "stock_splits.dshape"
92 ]
93 },
94 {
95 "cell_type": "code",
96 "execution_count": 5,
97 "metadata": {
98 "collapsed": false
99 },
100 "outputs": [
101 {
102 "data": {
103 "text/html": [
104 "1062"
105 ],
106 "text/plain": [
107 "1062"
108 ]
109 },
110 "execution_count": 5,
111 "metadata": {},
112 "output_type": "execute_result"
113 }
114 ],
115 "source": [
116 "# And how many rows are there?\n",
117 "# N.B. we're using a Blaze function to do this, not len()\n",
118 "stock_splits.count()"
119 ]
120 },
121 {
122 "cell_type": "code",
123 "execution_count": 6,
124 "metadata": {
125 "collapsed": false
126 },
127 "outputs": [
128 {
129 "data": {
130 "text/html": [
131 "<table border=\"1\" class=\"dataframe\">\n",
132 " <thead>\n",
133 " <tr style=\"text-align: right;\">\n",
134 " <th></th>\n",
135 " <th>event_id</th>\n",
136 " <th>asof_date</th>\n",
137 " <th>trade_date</th>\n",
138 " <th>symbol</th>\n",
139 " <th>event_type</th>\n",
140 " <th>event_headline</th>\n",
141 " <th>split_type</th>\n",
142 " <th>split_factor</th>\n",
143 " <th>new_shares</th>\n",
144 " <th>old_shares</th>\n",
145 " <th>effective_date</th>\n",
146 " <th>event_rating</th>\n",
147 " <th>timestamp</th>\n",
148 " <th>sid</th>\n",
149 " </tr>\n",
150 " </thead>\n",
151 " <tbody>\n",
152 " <tr>\n",
153 " <th>0</th>\n",
154 " <td>61191</td>\n",
155 " <td>2007-01-09</td>\n",
156 " <td>2007-01-09</td>\n",
157 " <td>MDCI</td>\n",
158 " <td>Stock Split</td>\n",
159 " <td>Medical Action announces 3-for-2 stock split, ...</td>\n",
160 " <td>Split</td>\n",
161 " <td>3-for-2</td>\n",
162 " <td>3</td>\n",
163 " <td>2</td>\n",
164 " <td>NaT</td>\n",
165 " <td>1</td>\n",
166 " <td>2007-01-10</td>\n",
167 " <td>4737</td>\n",
168 " </tr>\n",
169 " <tr>\n",
170 " <th>1</th>\n",
171 " <td>61190</td>\n",
172 " <td>2007-01-09</td>\n",
173 " <td>2007-01-09</td>\n",
174 " <td>SSI</td>\n",
175 " <td>Stock Split</td>\n",
176 " <td>Stage Stores announces 3-for-2 stock split, pa...</td>\n",
177 " <td>Split</td>\n",
178 " <td>3-for-2</td>\n",
179 " <td>3</td>\n",
180 " <td>2</td>\n",
181 " <td>NaT</td>\n",
182 " <td>1</td>\n",
183 " <td>2007-01-10</td>\n",
184 " <td>23395</td>\n",
185 " </tr>\n",
186 " <tr>\n",
187 " <th>2</th>\n",
188 " <td>61189</td>\n",
189 " <td>2007-01-17</td>\n",
190 " <td>2007-01-17</td>\n",
191 " <td>APH</td>\n",
192 " <td>Stock Split</td>\n",
193 " <td>Amphenol announces 2-for-1 stock split, payabl...</td>\n",
194 " <td>Split</td>\n",
195 " <td>2-for-1</td>\n",
196 " <td>2</td>\n",
197 " <td>1</td>\n",
198 " <td>NaT</td>\n",
199 " <td>1</td>\n",
200 " <td>2007-01-18</td>\n",
201 " <td>465</td>\n",
202 " </tr>\n",
203 " </tbody>\n",
204 "</table>"
205 ],
206 "text/plain": [
207 " event_id asof_date trade_date symbol event_type \\\n",
208 "0 61191 2007-01-09 2007-01-09 MDCI Stock Split \n",
209 "1 61190 2007-01-09 2007-01-09 SSI Stock Split \n",
210 "2 61189 2007-01-17 2007-01-17 APH Stock Split \n",
211 "\n",
212 " event_headline split_type split_factor \\\n",
213 "0 Medical Action announces 3-for-2 stock split, ... Split 3-for-2 \n",
214 "1 Stage Stores announces 3-for-2 stock split, pa... Split 3-for-2 \n",
215 "2 Amphenol announces 2-for-1 stock split, payabl... Split 2-for-1 \n",
216 "\n",
217 " new_shares old_shares effective_date event_rating timestamp sid \n",
218 "0 3 2 NaT 1 2007-01-10 4737 \n",
219 "1 3 2 NaT 1 2007-01-10 23395 \n",
220 "2 2 1 NaT 1 2007-01-18 465 "
221 ]
222 },
223 "execution_count": 6,
224 "metadata": {},
225 "output_type": "execute_result"
226 }
227 ],
228 "source": [
229 "# Let's see what the data looks like. We'll grab the first three rows.\n",
230 "stock_splits[:3]"
231 ]
232 },
233 {
234 "cell_type": "markdown",
235 "metadata": {},
236 "source": [
237 "Let's go over the columns:\n",
238 "- **event_id**: the unique identifier for this event.\n",
239 "- **asof_date**: EventVestor's timestamp of event capture.\n",
240 "- **trade_date**: for event announcements made before trading ends, `trade_date` is the same as the event date. For announcements issued after market close, `trade_date` is the next market open day.\n",
241 "- **symbol**: stock ticker symbol of the affected company.\n",
242 "- **event_type**: this should always be *Stock Split*.\n",
243 "- **event_headline**: a brief description of the event.\n",
244 "- **split_type**: either `Split` or `Reverse Split`.\n",
245 "- **split_factor**: the `x-for-y` split factor. The same information is expressed by `new_shares` (`x`) and `old_shares` (`y`).\n",
246 "- **new_shares**: the number of new shares (`x`) issued for every `y` old shares.\n",
247 "- **old_shares**: the number of old shares (`y`) exchanged for `x` new shares.\n",
248 "- **effective_date**: effective date of stock split.\n",
249 "- **event_rating**: this is always 1. Its meaning is unclear.\n",
250 "- **timestamp**: this is our timestamp on when we registered the data.\n",
251 "- **sid**: the equity's unique identifier. Use this instead of the symbol."
252 ]
253 },
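{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the relationship between `split_factor`, `new_shares`, and `old_shares` concrete, here is a minimal sketch in plain Python (nothing Quantopian-specific). The helper name `parse_split_factor` is hypothetical and not part of the dataset or of Blaze, and treating `0-for-0` as a placeholder with no usable ratio is our assumption about rows like those that show up in the reverse-split output further down.\n",
"\n",
"```python\n",
"def parse_split_factor(split_factor):\n",
"    # Hypothetical helper: split an 'x-for-y' string into its two counts.\n",
"    new_str, old_str = split_factor.strip().split('-for-')\n",
"    new_shares, old_shares = int(new_str), int(old_str)\n",
"    # Assumption: '0-for-0' is a placeholder, so no ratio is reported for it.\n",
"    ratio = float(new_shares) / old_shares if old_shares else None\n",
"    return new_shares, old_shares, ratio\n",
"\n",
"for factor in ['3-for-2', '1-for-10', '0-for-0']:\n",
"    print('%s -> %s' % (factor, parse_split_factor(factor)))\n",
"```\n",
"\n",
"A ratio above 1 (such as 1.5 for `3-for-2`) corresponds to a regular split, while a ratio below 1 corresponds to a reverse split."
]
},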
254 {
255 "cell_type": "markdown",
256 "metadata": {},
257 "source": [
258 "We've done much of the data processing for you. Fields like `timestamp` and `sid` are standardized across all our Store Datasets, so the datasets are easy to combine. In particular, the `sid` is standardized across all our equity databases.\n",
259 "\n",
260 "We can select columns and rows with ease. Below, we'll fetch Nike's stock splits."
261 ]
262 },
263 {
264 "cell_type": "code",
265 "execution_count": 7,
266 "metadata": {
267 "collapsed": false,
268 "scrolled": true
269 },
270 "outputs": [
271 {
272 "data": {
273 "text/html": [
274 "<table border=\"1\" class=\"dataframe\">\n",
275 " <thead>\n",
276 " <tr style=\"text-align: right;\">\n",
277 " <th></th>\n",
278 " <th>event_id</th>\n",
279 " <th>asof_date</th>\n",
280 " <th>trade_date</th>\n",
281 " <th>symbol</th>\n",
282 " <th>event_type</th>\n",
283 " <th>event_headline</th>\n",
284 " <th>split_type</th>\n",
285 " <th>split_factor</th>\n",
286 " <th>new_shares</th>\n",
287 " <th>old_shares</th>\n",
288 " <th>effective_date</th>\n",
289 " <th>event_rating</th>\n",
290 " <th>timestamp</th>\n",
291 " <th>sid</th>\n",
292 " </tr>\n",
293 " </thead>\n",
294 " <tbody>\n",
295 " <tr>\n",
296 " <th>0</th>\n",
297 " <td>61171</td>\n",
298 " <td>2007-02-15</td>\n",
299 " <td>2007-02-15</td>\n",
300 " <td>NKE</td>\n",
301 " <td>Stock Split</td>\n",
302 " <td>Nike announces 2-for-1 stock split</td>\n",
303 " <td>Split</td>\n",
304 " <td>2-for-1</td>\n",
305 " <td>2</td>\n",
306 " <td>1</td>\n",
307 " <td>NaT</td>\n",
308 " <td>1</td>\n",
309 " <td>2007-02-16</td>\n",
310 " <td>5328</td>\n",
311 " </tr>\n",
312 " <tr>\n",
313 " <th>1</th>\n",
314 " <td>1509519</td>\n",
315 " <td>2012-11-15</td>\n",
316 " <td>2012-11-16</td>\n",
317 " <td>NKE</td>\n",
318 " <td>Stock Split</td>\n",
319 " <td>Nike Announces Two-For-One Stock Split</td>\n",
320 " <td>Split</td>\n",
321 " <td>2-for-1</td>\n",
322 " <td>2</td>\n",
323 " <td>1</td>\n",
324 " <td>NaT</td>\n",
325 " <td>1</td>\n",
326 " <td>2012-11-16</td>\n",
327 " <td>5328</td>\n",
328 " </tr>\n",
329 " </tbody>\n",
330 "</table>"
331 ],
332 "text/plain": [
333 " event_id asof_date trade_date symbol event_type \\\n",
334 "0 61171 2007-02-15 2007-02-15 NKE Stock Split \n",
335 "1 1509519 2012-11-15 2012-11-16 NKE Stock Split \n",
336 "\n",
337 " event_headline split_type split_factor new_shares \\\n",
338 "0 Nike announces 2-for-1 stock split Split 2-for-1 2 \n",
339 "1 Nike Announces Two-For-One Stock Split Split 2-for-1 2 \n",
340 "\n",
341 " old_shares effective_date event_rating timestamp sid \n",
342 "0 1 NaT 1 2007-02-16 5328 \n",
343 "1 1 NaT 1 2012-11-16 5328 "
344 ]
345 },
346 "execution_count": 7,
347 "metadata": {},
348 "output_type": "execute_result"
349 }
350 ],
351 "source": [
352 "# get Nike's sid first\n",
353 "nike_sid = symbols('NKE').sid\n",
354 "splits = stock_splits[(stock_splits.sid == nike_sid)]\n",
355 "# When displaying a Blaze Data Object, the printout is automatically truncated to ten rows.\n",
356 "splits.sort('asof_date')"
357 ]
358 },
359 {
360 "cell_type": "markdown",
361 "metadata": {},
362 "source": [
363 "Now suppose we want a DataFrame of `stock_splits`, but limited to reverse splits only. Of those, we then want to display the `asof_date`, `split_factor`, and `sid`."
364 ]
365 },
366 {
367 "cell_type": "code",
368 "execution_count": 8,
369 "metadata": {
370 "collapsed": false
371 },
372 "outputs": [
373 {
374 "data": {
375 "text/html": [
376 "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
377 "<table border=\"1\" class=\"dataframe\">\n",
378 " <thead>\n",
379 " <tr style=\"text-align: right;\">\n",
380 " <th></th>\n",
381 " <th>asof_date</th>\n",
382 " <th>split_factor</th>\n",
383 " <th>sid</th>\n",
384 " </tr>\n",
385 " </thead>\n",
386 " <tbody>\n",
387 " <tr>\n",
388 " <th>0</th>\n",
389 " <td>2007-02-20</td>\n",
390 " <td>1-for-3</td>\n",
391 " <td>21120</td>\n",
392 " </tr>\n",
393 " <tr>\n",
394 " <th>3</th>\n",
395 " <td>2007-03-29</td>\n",
396 " <td>1-for-4</td>\n",
397 " <td>16607</td>\n",
398 " </tr>\n",
399 " <tr>\n",
400 " <th>4</th>\n",
401 " <td>2007-07-17</td>\n",
402 " <td>8-for-9</td>\n",
403 " <td>12626</td>\n",
404 " </tr>\n",
405 " <tr>\n",
406 " <th>5</th>\n",
407 " <td>2007-08-01</td>\n",
408 " <td>1-for-6</td>\n",
409 " <td>17914</td>\n",
410 " </tr>\n",
411 " <tr>\n",
412 " <th>7</th>\n",
413 " <td>2007-08-07</td>\n",
414 " <td>1-for-10</td>\n",
415 " <td>6276</td>\n",
416 " </tr>\n",
417 " <tr>\n",
418 " <th>8</th>\n",
419 " <td>2007-08-20</td>\n",
420 " <td>1-for-20</td>\n",
421 " <td>17504</td>\n",
422 " </tr>\n",
423 " <tr>\n",
424 " <th>10</th>\n",
425 " <td>2007-11-01</td>\n",
426 " <td>1-for-10</td>\n",
427 " <td>10583</td>\n",
428 " </tr>\n",
429 " <tr>\n",
430 " <th>11</th>\n",
431 " <td>2007-11-14</td>\n",
432 " <td>1-for-4</td>\n",
433 " <td>17799</td>\n",
434 " </tr>\n",
435 " <tr>\n",
436 " <th>12</th>\n",
437 " <td>2008-01-31</td>\n",
438 " <td>0-for-0</td>\n",
439 " <td>26837</td>\n",
440 " </tr>\n",
441 " <tr>\n",
442 " <th>13</th>\n",
443 " <td>2008-02-22</td>\n",
444 " <td>1-for-5</td>\n",
445 " <td>24074</td>\n",
446 " </tr>\n",
447 " <tr>\n",
448 " <th>14</th>\n",
449 " <td>2008-03-07</td>\n",
450 " <td>1-for-12</td>\n",
451 " <td>19187</td>\n",
452 " </tr>\n",
453 " <tr>\n",
454 " <th>17</th>\n",
455 " <td>2008-04-11</td>\n",
456 " <td>1-for-10</td>\n",
457 " <td>14420</td>\n",
458 " </tr>\n",
459 " <tr>\n",
460 " <th>18</th>\n",
461 " <td>2008-04-23</td>\n",
462 " <td>1-for-4</td>\n",
463 " <td>19635</td>\n",
464 " </tr>\n",
465 " <tr>\n",
466 " <th>20</th>\n",
467 " <td>2008-04-25</td>\n",
468 " <td>1-for-10</td>\n",
469 " <td>1365</td>\n",
470 " </tr>\n",
471 " <tr>\n",
472 " <th>21</th>\n",
473 " <td>2008-05-07</td>\n",
474 " <td>0-for-0</td>\n",
475 " <td>6804</td>\n",
476 " </tr>\n",
477 " <tr>\n",
478 " <th>22</th>\n",
479 " <td>2008-05-09</td>\n",
480 " <td>1-for-3</td>\n",
481 " <td>7121</td>\n",
482 " </tr>\n",
483 " <tr>\n",
484 " <th>23</th>\n",
485 " <td>2008-05-30</td>\n",
486 " <td>1-for-5</td>\n",
487 " <td>24074</td>\n",
488 " </tr>\n",
489 " <tr>\n",
490 " <th>24</th>\n",
491 " <td>2008-06-02</td>\n",
492 " <td>1-for-10</td>\n",
493 " <td>19682</td>\n",
494 " </tr>\n",
495 " <tr>\n",
496 " <th>25</th>\n",
497 " <td>2008-06-02</td>\n",
498 " <td>1-for-5</td>\n",
499 " <td>25206</td>\n",
500 " </tr>\n",
501 " <tr>\n",
502 " <th>26</th>\n",
503 " <td>2008-06-16</td>\n",
504 " <td>1-for-5</td>\n",
505 " <td>15881</td>\n",
506 " </tr>\n",
507 " <tr>\n",
508 " <th>27</th>\n",
509 " <td>2008-07-01</td>\n",
510 " <td>1-for-5</td>\n",
511 " <td>25206</td>\n",
512 " </tr>\n",
513 " <tr>\n",
514 " <th>28</th>\n",
515 " <td>2008-07-03</td>\n",
516 " <td>1-for-20</td>\n",
517 " <td>21291</td>\n",
518 " </tr>\n",
519 " <tr>\n",
520 " <th>29</th>\n",
521 " <td>2008-07-09</td>\n",
522 " <td>1-for-10</td>\n",
523 " <td>1365</td>\n",
524 " </tr>\n",
525 " <tr>\n",
526 " <th>30</th>\n",
527 " <td>2008-07-11</td>\n",
528 " <td>1-for-10</td>\n",
529 " <td>21113</td>\n",
530 " </tr>\n",
531 " <tr>\n",
532 " <th>31</th>\n",
533 " <td>2008-08-12</td>\n",
534 " <td>1-for-5</td>\n",
535 " <td>14469</td>\n",
536 " </tr>\n",
537 " <tr>\n",
538 " <th>32</th>\n",
539 " <td>2008-08-21</td>\n",
540 " <td>1-for-2</td>\n",
541 " <td>26470</td>\n",
542 " </tr>\n",
543 " <tr>\n",
544 " <th>33</th>\n",
545 " <td>2008-08-29</td>\n",
546 " <td>1-for-10</td>\n",
547 " <td>16607</td>\n",
548 " </tr>\n",
549 " <tr>\n",
550 " <th>34</th>\n",
551 " <td>2008-09-05</td>\n",
552 " <td>1-for-10</td>\n",
553 " <td>23635</td>\n",
554 " </tr>\n",
555 " <tr>\n",
556 " <th>35</th>\n",
557 " <td>2008-09-11</td>\n",
558 " <td>1-for-20</td>\n",
559 " <td>9774</td>\n",
560 " </tr>\n",
561 " <tr>\n",
562 " <th>36</th>\n",
563 " <td>2008-09-16</td>\n",
564 " <td>1-for-10</td>\n",
565 " <td>14420</td>\n",
566 " </tr>\n",
567 " <tr>\n",
568 " <th>...</th>\n",
569 " <td>...</td>\n",
570 " <td>...</td>\n",
571 " <td>...</td>\n",
572 " </tr>\n",
573 " <tr>\n",
574 " <th>391</th>\n",
575 " <td>2015-05-26</td>\n",
576 " <td>1-for-6</td>\n",
577 " <td>32867</td>\n",
578 " </tr>\n",
579 " <tr>\n",
580 " <th>392</th>\n",
581 " <td>2015-05-28</td>\n",
582 " <td>1-for-10</td>\n",
583 " <td>12765</td>\n",
584 " </tr>\n",
585 " <tr>\n",
586 " <th>393</th>\n",
587 " <td>2015-06-01</td>\n",
588 " <td>1-for-4</td>\n",
589 " <td>19709</td>\n",
590 " </tr>\n",
591 " <tr>\n",
592 " <th>394</th>\n",
593 " <td>2015-06-18</td>\n",
594 " <td>1-for-8</td>\n",
595 " <td>35162</td>\n",
596 " </tr>\n",
597 " <tr>\n",
598 " <th>395</th>\n",
599 " <td>2015-06-18</td>\n",
600 " <td>0-for-0</td>\n",
601 " <td>39627</td>\n",
602 " </tr>\n",
603 " <tr>\n",
604 " <th>396</th>\n",
605 " <td>2015-06-18</td>\n",
606 " <td>1-for-8</td>\n",
607 " <td>40461</td>\n",
608 " </tr>\n",
609 " <tr>\n",
610 " <th>397</th>\n",
611 " <td>2015-06-23</td>\n",
612 " <td>1-for-4</td>\n",
613 " <td>19709</td>\n",
614 " </tr>\n",
615 " <tr>\n",
616 " <th>398</th>\n",
617 " <td>2015-06-25</td>\n",
618 " <td>1-for-7</td>\n",
619 " <td>39627</td>\n",
620 " </tr>\n",
621 " <tr>\n",
622 " <th>400</th>\n",
623 " <td>2015-06-26</td>\n",
624 " <td>1-for-5</td>\n",
625 " <td>28718</td>\n",
626 " </tr>\n",
627 " <tr>\n",
628 " <th>401</th>\n",
629 " <td>2015-06-29</td>\n",
630 " <td>1-for-10</td>\n",
631 " <td>12765</td>\n",
632 " </tr>\n",
633 " <tr>\n",
634 " <th>402</th>\n",
635 " <td>2015-07-08</td>\n",
636 " <td>1-for-4</td>\n",
637 " <td>40822</td>\n",
638 " </tr>\n",
639 " <tr>\n",
640 " <th>403</th>\n",
641 " <td>2015-07-10</td>\n",
642 " <td>1-for-8</td>\n",
643 " <td>4982</td>\n",
644 " </tr>\n",
645 " <tr>\n",
646 " <th>404</th>\n",
647 " <td>2015-07-13</td>\n",
648 " <td>1-for-8</td>\n",
649 " <td>44063</td>\n",
650 " </tr>\n",
651 " <tr>\n",
652 " <th>405</th>\n",
653 " <td>2015-07-15</td>\n",
654 " <td>1-for-10</td>\n",
655 " <td>42820</td>\n",
656 " </tr>\n",
657 " <tr>\n",
658 " <th>406</th>\n",
659 " <td>2015-07-20</td>\n",
660 " <td>1-for-10</td>\n",
661 " <td>88</td>\n",
662 " </tr>\n",
663 " <tr>\n",
664 " <th>407</th>\n",
665 " <td>2015-07-20</td>\n",
666 " <td>1-for-10</td>\n",
667 " <td>41717</td>\n",
668 " </tr>\n",
669 " <tr>\n",
670 " <th>408</th>\n",
671 " <td>2015-07-21</td>\n",
672 " <td>1-for-10</td>\n",
673 " <td>40531</td>\n",
674 " </tr>\n",
675 " <tr>\n",
676 " <th>409</th>\n",
677 " <td>2015-07-27</td>\n",
678 " <td>1-for-10</td>\n",
679 " <td>88</td>\n",
680 " </tr>\n",
681 " <tr>\n",
682 " <th>410</th>\n",
683 " <td>2015-07-31</td>\n",
684 " <td>0-for-0</td>\n",
685 " <td>35162</td>\n",
686 " </tr>\n",
687 " <tr>\n",
688 " <th>411</th>\n",
689 " <td>2015-08-03</td>\n",
690 " <td>1-for-10</td>\n",
691 " <td>42820</td>\n",
692 " </tr>\n",
693 " <tr>\n",
694 " <th>412</th>\n",
695 " <td>2015-08-04</td>\n",
696 " <td>1-for-4</td>\n",
697 " <td>28076</td>\n",
698 " </tr>\n",
699 " <tr>\n",
700 " <th>413</th>\n",
701 " <td>2015-08-04</td>\n",
702 " <td>1-for-10</td>\n",
703 " <td>6523</td>\n",
704 " </tr>\n",
705 " <tr>\n",
706 " <th>414</th>\n",
707 " <td>2015-08-06</td>\n",
708 " <td>1-for-10</td>\n",
709 " <td>19074</td>\n",
710 " </tr>\n",
711 " <tr>\n",
712 " <th>415</th>\n",
713 " <td>2015-08-18</td>\n",
714 " <td>1-for-3</td>\n",
715 " <td>19350</td>\n",
716 " </tr>\n",
717 " <tr>\n",
718 " <th>416</th>\n",
719 " <td>2015-08-24</td>\n",
720 " <td>1-for-5</td>\n",
721 " <td>28416</td>\n",
722 " </tr>\n",
723 " <tr>\n",
724 " <th>417</th>\n",
725 " <td>2015-08-25</td>\n",
726 " <td>1-for-7</td>\n",
727 " <td>39627</td>\n",
728 " </tr>\n",
729 " <tr>\n",
730 " <th>418</th>\n",
731 " <td>2015-08-31</td>\n",
732 " <td>1-for-4</td>\n",
733 " <td>28076</td>\n",
734 " </tr>\n",
735 " <tr>\n",
736 " <th>419</th>\n",
737 " <td>2015-09-03</td>\n",
738 " <td>1-for-7</td>\n",
739 " <td>24624</td>\n",
740 " </tr>\n",
741 " <tr>\n",
742 " <th>420</th>\n",
743 " <td>2015-09-09</td>\n",
744 " <td>0-for-0</td>\n",
745 " <td>22253</td>\n",
746 " </tr>\n",
747 " <tr>\n",
748 " <th>421</th>\n",
749 " <td>2015-09-16</td>\n",
750 " <td>1-for-15</td>\n",
751 " <td>22660</td>\n",
752 " </tr>\n",
753 " </tbody>\n",
754 "</table>\n",
755 "<p>361 rows × 3 columns</p>\n",
756 "</div>"
757 ],
758 "text/plain": [
759 " asof_date split_factor sid\n",
760 "0 2007-02-20 1-for-3 21120\n",
761 "3 2007-03-29 1-for-4 16607\n",
762 "4 2007-07-17 8-for-9 12626\n",
763 "5 2007-08-01 1-for-6 17914\n",
764 "7 2007-08-07 1-for-10 6276\n",
765 "8 2007-08-20 1-for-20 17504\n",
766 "10 2007-11-01 1-for-10 10583\n",
767 "11 2007-11-14 1-for-4 17799\n",
768 "12 2008-01-31 0-for-0 26837\n",
769 "13 2008-02-22 1-for-5 24074\n",
770 "14 2008-03-07 1-for-12 19187\n",
771 "17 2008-04-11 1-for-10 14420\n",
772 "18 2008-04-23 1-for-4 19635\n",
773 "20 2008-04-25 1-for-10 1365\n",
774 "21 2008-05-07 0-for-0 6804\n",
775 "22 2008-05-09 1-for-3 7121\n",
776 "23 2008-05-30 1-for-5 24074\n",
777 "24 2008-06-02 1-for-10 19682\n",
778 "25 2008-06-02 1-for-5 25206\n",
779 "26 2008-06-16 1-for-5 15881\n",
780 "27 2008-07-01 1-for-5 25206\n",
781 "28 2008-07-03 1-for-20 21291\n",
782 "29 2008-07-09 1-for-10 1365\n",
783 "30 2008-07-11 1-for-10 21113\n",
784 "31 2008-08-12 1-for-5 14469\n",
785 "32 2008-08-21 1-for-2 26470\n",
786 "33 2008-08-29 1-for-10 16607\n",
787 "34 2008-09-05 1-for-10 23635\n",
788 "35 2008-09-11 1-for-20 9774\n",
789 "36 2008-09-16 1-for-10 14420\n",
790 ".. ... ... ...\n",
791 "391 2015-05-26 1-for-6 32867\n",
792 "392 2015-05-28 1-for-10 12765\n",
793 "393 2015-06-01 1-for-4 19709\n",
794 "394 2015-06-18 1-for-8 35162\n",
795 "395 2015-06-18 0-for-0 39627\n",
796 "396 2015-06-18 1-for-8 40461\n",
797 "397 2015-06-23 1-for-4 19709\n",
798 "398 2015-06-25 1-for-7 39627\n",
799 "400 2015-06-26 1-for-5 28718\n",
800 "401 2015-06-29 1-for-10 12765\n",
801 "402 2015-07-08 1-for-4 40822\n",
802 "403 2015-07-10 1-for-8 4982\n",
803 "404 2015-07-13 1-for-8 44063\n",
804 "405 2015-07-15 1-for-10 42820\n",
805 "406 2015-07-20 1-for-10 88\n",
806 "407 2015-07-20 1-for-10 41717\n",
807 "408 2015-07-21 1-for-10 40531\n",
808 "409 2015-07-27 1-for-10 88\n",
809 "410 2015-07-31 0-for-0 35162\n",
810 "411 2015-08-03 1-for-10 42820\n",
811 "412 2015-08-04 1-for-4 28076\n",
812 "413 2015-08-04 1-for-10 6523\n",
813 "414 2015-08-06 1-for-10 19074\n",
814 "415 2015-08-18 1-for-3 19350\n",
815 "416 2015-08-24 1-for-5 28416\n",
816 "417 2015-08-25 1-for-7 39627\n",
817 "418 2015-08-31 1-for-4 28076\n",
818 "419 2015-09-03 1-for-7 24624\n",
819 "420 2015-09-09 0-for-0 22253\n",
820 "421 2015-09-16 1-for-15 22660\n",
821 "\n",
822 "[361 rows x 3 columns]"
823 ]
824 },
825 "execution_count": 8,
826 "metadata": {},
827 "output_type": "execute_result"
828 }
829 ],
830 "source": [
831 "reverse = stock_splits[stock_splits.split_type == \"Reverse Split\"]\n",
832 "df = odo(reverse, pd.DataFrame)\n",
833 "df = df[['asof_date','split_factor','sid']]\n",
834 "df = df[df.sid.notnull()]\n",
835 "# When printing a pandas DataFrame, the first 30 and last 30 rows are displayed. The middle is truncated.\n",
836 "df"
837 ]
838 },
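{
"cell_type": "markdown",
"metadata": {},
"source": [
"The gaps in the index above (0, 3, 4, ...) come from the `notnull()` filter, which drops rows without relabeling the rows that remain. Here is a small sketch of the same two pandas steps on made-up rows. The frame `sample` and its values are hypothetical stand-ins for the result of `odo(reverse, pd.DataFrame)`, not dataset records, and the final `reset_index` call is an optional extra that this notebook does not use.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Hypothetical rows standing in for the converted Blaze result.\n",
"sample = pd.DataFrame({\n",
"    'asof_date': pd.to_datetime(['2015-01-05', '2015-01-06', '2015-01-07']),\n",
"    'split_factor': ['1-for-5', '1-for-4', '1-for-10'],\n",
"    'sid': [101.0, None, 102.0],\n",
"    'symbol': ['AAA', 'BBB', 'CCC'],\n",
"})\n",
"\n",
"# Keep three columns, then drop the rows that have no sid.\n",
"subset = sample[['asof_date', 'split_factor', 'sid']]\n",
"subset = subset[subset.sid.notnull()]\n",
"print(list(subset.index))  # the original labels are kept, so there is a gap\n",
"\n",
"# Optional: relabel the remaining rows from zero.\n",
"renumbered = subset.reset_index(drop=True)\n",
"print(list(renumbered.index))\n",
"```\n",
"\n",
"The first `print` shows the labels 0 and 2, and the second shows 0 and 1. Whether to renumber is a matter of taste: keeping the original labels makes it easy to trace a row back to the unfiltered frame."
]
},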
839 {
840 "cell_type": "code",
841 "execution_count": null,
842 "metadata": {
843 "collapsed": true
844 },
845 "outputs": [],
846 "source": []
847 }
848 ],
849 "metadata": {
850 "kernelspec": {
851 "display_name": "Python 2",
852 "language": "python",
853 "name": "python2"
854 },
855 "language_info": {
856 "codemirror_mode": {
857 "name": "ipython",
858 "version": 2
859 },
860 "file_extension": ".py",
861 "mimetype": "text/x-python",
862 "name": "python",
863 "nbconvert_exporter": "python",
864 "pygments_lexer": "ipython2",
865 "version": "2.7.12"
866 }
867 },
868 "nbformat": 4,
869 "nbformat_minor": 0
870 }