ml-finance-python

python scripts for finance machine learning

git clone https://9o.is/git/ml-finance-python.git

notebook.ipynb

(34376B)


      1 {
      2  "cells": [
      3   {
      4    "cell_type": "markdown",
      5    "metadata": {
      6     "collapsed": true
      7    },
      8    "source": [
      9     "# EventVestor: Impairments and Charges\n",
     10     "\n",
     11     "In this notebook, we'll take a look at EventVestor's *Impairments and Charges* dataset, available on the [Quantopian Store](https://www.quantopian.com/store). This dataset spans January 01, 2007 through the current day, and documents goodwill impairments and other one-time charges reported by companies.\n",
     12     "\n",
     13     "### Blaze\n",
     14     "Before we dig into the data, we want to tell you about how you generally access Quantopian Store datasets. These datasets are available through an API service known as [Blaze](http://blaze.pydata.org). Blaze provides the Quantopian user with a convenient interface to access very large datasets.\n",
     15     "\n",
     16     "Blaze plays an important role in accessing these datasets. Some of them contain many millions of records, so bringing that data directly into Quantopian Research is not viable. Instead, Blaze allows us to provide a simple querying interface and shift the burden over to the server side.\n",
     17     "\n",
     18     "It is common to use Blaze to reduce your dataset in size, convert it to Pandas, and then use Pandas for further computation, manipulation, and visualization.\n",
     19     "\n",
     20     "Helpful links:\n",
     21     "* [Query building for Blaze](http://blaze.pydata.org/en/latest/queries.html)\n",
     22     "* [Pandas-to-Blaze dictionary](http://blaze.pydata.org/en/latest/rosetta-pandas.html)\n",
     23     "* [SQL-to-Blaze dictionary](http://blaze.pydata.org/en/latest/rosetta-sql.html)\n",
     24     "\n",
     25     "Once you've limited the size of your Blaze object, you can convert it to a Pandas DataFrame using:\n",
     26     "> `from odo import odo`  \n",
     27     "> `odo(expr, pandas.DataFrame)`\n",
     28     "\n",
     29     "### Free samples and limits\n",
     30     "One other key caveat: we limit the number of results returned from any given expression to 10,000 to protect against runaway memory usage. To be clear, you have access to all the data server side. We are limiting the size of the responses back from Blaze.\n",
     31     "\n",
     32     "There is a *free* version of this dataset as well as a paid one. The free one includes about three years of historical data, though not up to the current day.\n",
     33     "\n",
     34     "With the preamble in place, let's get started:"
     35    ]
     36   },
     37   {
     38    "cell_type": "code",
     39    "execution_count": 1,
     40    "metadata": {
     41     "collapsed": false
     42    },
     43    "outputs": [],
     44    "source": [
     45     "# import the dataset\n",
     46     "from quantopian.interactive.data.eventvestor import impairments_and_charges\n",
     47     "# or if you want to import the free dataset, use:\n",
     48     "# from quantopian.interactive.data.eventvestor import impairments_and_charges_free\n",
     49     "\n",
     50     "# import data operations\n",
     51     "from odo import odo\n",
     52     "# import other libraries we will use\n",
     53     "import pandas as pd"
     54    ]
     55   },
     56   {
     57    "cell_type": "code",
     58    "execution_count": 2,
     59    "metadata": {
     60     "collapsed": false
     61    },
     62    "outputs": [
     63     {
     64      "data": {
     65       "text/plain": [
     66        "dshape(\"\"\"var * {\n",
     67        "  event_id: ?float64,\n",
     68        "  asof_date: datetime,\n",
     69        "  trade_date: ?datetime,\n",
     70        "  symbol: ?string,\n",
     71        "  event_type: ?string,\n",
     72        "  event_headline: ?string,\n",
     73        "  charge_amount: ?float64,\n",
     74        "  amount_units: ?string,\n",
     75        "  event_rating: ?float64,\n",
     76        "  timestamp: datetime,\n",
     77        "  sid: ?int64\n",
     78        "  }\"\"\")"
     79       ]
     80      },
     81      "execution_count": 2,
     82      "metadata": {},
     83      "output_type": "execute_result"
     84     }
     85    ],
     86    "source": [
     87     "# Let's use Blaze's dshape to understand the structure of the data\n",
     88     "impairments_and_charges.dshape"
     89    ]
     90   },
     91   {
     92    "cell_type": "code",
     93    "execution_count": 3,
     94    "metadata": {
     95     "collapsed": false
     96    },
     97    "outputs": [
     98     {
     99      "data": {
    100       "text/html": [
    101        "3991"
    102       ],
    103       "text/plain": [
    104        "3991"
    105       ]
    106      },
    107      "execution_count": 3,
    108      "metadata": {},
    109      "output_type": "execute_result"
    110     }
    111    ],
    112    "source": [
    113     "# And how many rows are there?\n",
    114     "# N.B. we're using a Blaze function to do this, not len()\n",
    115     "impairments_and_charges.count()"
    116    ]
    117   },
    118   {
    119    "cell_type": "code",
    120    "execution_count": 4,
    121    "metadata": {
    122     "collapsed": false
    123    },
    124    "outputs": [
    125     {
    126      "data": {
    127       "text/html": [
    128        "<table border=\"1\" class=\"dataframe\">\n",
    129        "  <thead>\n",
    130        "    <tr style=\"text-align: right;\">\n",
    131        "      <th></th>\n",
    132        "      <th>event_id</th>\n",
    133        "      <th>asof_date</th>\n",
    134        "      <th>trade_date</th>\n",
    135        "      <th>symbol</th>\n",
    136        "      <th>event_type</th>\n",
    137        "      <th>event_headline</th>\n",
    138        "      <th>charge_amount</th>\n",
    139        "      <th>amount_units</th>\n",
    140        "      <th>event_rating</th>\n",
    141        "      <th>timestamp</th>\n",
    142        "      <th>sid</th>\n",
    143        "    </tr>\n",
    144        "  </thead>\n",
    145        "  <tbody>\n",
    146        "    <tr>\n",
    147        "      <th>0</th>\n",
    148        "      <td>131321</td>\n",
    149        "      <td>2007-01-05</td>\n",
    150        "      <td>2007-01-08</td>\n",
    151        "      <td>GT</td>\n",
    152        "      <td>Impairments/Charges</td>\n",
    153        "      <td>Goodyear To Record $155M To $160M Charges in 1...</td>\n",
    154        "      <td>160</td>\n",
    155        "      <td>$M</td>\n",
    156        "      <td>1</td>\n",
    157        "      <td>2007-01-06</td>\n",
    158        "      <td>3384</td>\n",
    159        "    </tr>\n",
    160        "    <tr>\n",
    161        "      <th>1</th>\n",
    162        "      <td>110962</td>\n",
    163        "      <td>2007-01-08</td>\n",
    164        "      <td>2007-01-09</td>\n",
    165        "      <td>MO</td>\n",
    166        "      <td>Impairments/Charges</td>\n",
    167        "      <td>Altria Group Subsidiary To Record $245M Asset ...</td>\n",
    168        "      <td>245</td>\n",
    169        "      <td>$M</td>\n",
    170        "      <td>1</td>\n",
    171        "      <td>2007-01-09</td>\n",
    172        "      <td>4954</td>\n",
    173        "    </tr>\n",
    174        "    <tr>\n",
    175        "      <th>2</th>\n",
    176        "      <td>1182869</td>\n",
    177        "      <td>2007-01-16</td>\n",
    178        "      <td>2007-01-16</td>\n",
    179        "      <td>FRX</td>\n",
    180        "      <td>Impairments/Charges</td>\n",
    181        "      <td>Forest Labs to Record $494M Charge in 4Q 07</td>\n",
    182        "      <td>494</td>\n",
    183        "      <td>$M</td>\n",
    184        "      <td>1</td>\n",
    185        "      <td>2007-01-17</td>\n",
    186        "      <td>3014</td>\n",
    187        "    </tr>\n",
    188        "  </tbody>\n",
    189        "</table>"
    190       ],
    191       "text/plain": [
    192        "   event_id  asof_date trade_date symbol           event_type  \\\n",
    193        "0    131321 2007-01-05 2007-01-08     GT  Impairments/Charges   \n",
    194        "1    110962 2007-01-08 2007-01-09     MO  Impairments/Charges   \n",
    195        "2   1182869 2007-01-16 2007-01-16    FRX  Impairments/Charges   \n",
    196        "\n",
    197        "                                      event_headline  charge_amount  \\\n",
    198        "0  Goodyear To Record $155M To $160M Charges in 1...            160   \n",
    199        "1  Altria Group Subsidiary To Record $245M Asset ...            245   \n",
    200        "2        Forest Labs to Record $494M Charge in 4Q 07            494   \n",
    201        "\n",
    202        "  amount_units  event_rating  timestamp   sid  \n",
    203        "0           $M             1 2007-01-06  3384  \n",
    204        "1           $M             1 2007-01-09  4954  \n",
    205        "2           $M             1 2007-01-17  3014  "
    206       ]
    207      },
    208      "execution_count": 4,
    209      "metadata": {},
    210      "output_type": "execute_result"
    211     }
    212    ],
    213    "source": [
    214     "# Let's see what the data looks like. We'll grab the first three rows.\n",
    215     "impairments_and_charges[:3]"
    216    ]
    217   },
    218   {
    219    "cell_type": "markdown",
    220    "metadata": {},
    221    "source": [
    222     "Let's go over the columns:\n",
    223     "- **event_id**: the unique identifier for this event.\n",
    224     "- **asof_date**: EventVestor's timestamp of event capture.\n",
    225     "- **trade_date**: for event announcements made before trading ends, trade_date is the same as event_date. For announcements issued after market close, trade_date is the next market open day.\n",
    226     "- **symbol**: stock ticker symbol of the affected company.\n",
    227     "- **event_type**: this should always be *Impairments/Charges*.\n",
    228     "- **event_headline**: a brief description of the event.\n",
    229     "- **charge_amount**: the amount charged, in `amount_units`.\n",
    230     "- **amount_units**: units of the amount charged, most commonly millions of dollars.\n",
    231     "- **event_rating**: this is always 1. The meaning of this is uncertain.\n",
    232     "- **timestamp**: our timestamp of when we registered the data.\n",
    233     "- **sid**: the equity's unique identifier. Use this instead of the symbol."
    234    ]
    235   },
    236   {
    237    "cell_type": "markdown",
    238    "metadata": {},
    239    "source": [
    240     "We've done much of the data processing for you. Fields like `timestamp` and `sid` are standardized across all our Store Datasets, so the datasets are easy to combine. We have standardized the `sid` across all our equity databases.\n",
    241     "\n",
    242     "We can select columns and rows with ease. Below, we'll fetch all 2012 charges greater than $200M."
    243    ]
    244   },
    245   {
    246    "cell_type": "code",
    247    "execution_count": 5,
    248    "metadata": {
    249     "collapsed": false,
    250     "scrolled": true
    251    },
    252    "outputs": [
    253     {
    254      "data": {
    255       "text/html": [
    256        "<table border=\"1\" class=\"dataframe\">\n",
    257        "  <thead>\n",
    258        "    <tr style=\"text-align: right;\">\n",
    259        "      <th></th>\n",
    260        "      <th>event_id</th>\n",
    261        "      <th>asof_date</th>\n",
    262        "      <th>trade_date</th>\n",
    263        "      <th>symbol</th>\n",
    264        "      <th>event_type</th>\n",
    265        "      <th>event_headline</th>\n",
    266        "      <th>charge_amount</th>\n",
    267        "      <th>amount_units</th>\n",
    268        "      <th>event_rating</th>\n",
    269        "      <th>timestamp</th>\n",
    270        "      <th>sid</th>\n",
    271        "    </tr>\n",
    272        "  </thead>\n",
    273        "  <tbody>\n",
    274        "    <tr>\n",
    275        "      <th>0</th>\n",
    276        "      <td>1382496</td>\n",
    277        "      <td>2012-01-11</td>\n",
    278        "      <td>2012-01-12</td>\n",
    279        "      <td>XL</td>\n",
    280        "      <td>Impairments/Charges</td>\n",
    281        "      <td>XL Group to Record Upto $220M Charges</td>\n",
    282        "      <td>220</td>\n",
    283        "      <td>$M</td>\n",
    284        "      <td>1</td>\n",
    285        "      <td>2012-01-12</td>\n",
    286        "      <td>8340</td>\n",
    287        "    </tr>\n",
    288        "    <tr>\n",
    289        "      <th>1</th>\n",
    290        "      <td>1382455</td>\n",
    291        "      <td>2012-01-11</td>\n",
    292        "      <td>2012-01-12</td>\n",
    293        "      <td>RF</td>\n",
    294        "      <td>Impairments/Charges</td>\n",
    295        "      <td>Regions Financial to Record Upto $745M Impairm...</td>\n",
    296        "      <td>745</td>\n",
    297        "      <td>$M</td>\n",
    298        "      <td>1</td>\n",
    299        "      <td>2012-01-12</td>\n",
    300        "      <td>34913</td>\n",
    301        "    </tr>\n",
    302        "    <tr>\n",
    303        "      <th>2</th>\n",
    304        "      <td>1383159</td>\n",
    305        "      <td>2012-01-13</td>\n",
    306        "      <td>2012-01-16</td>\n",
    307        "      <td>ADM</td>\n",
    308        "      <td>Impairments/Charges</td>\n",
    309        "      <td>Archer Daniels to Record Upto $360M Charge in ...</td>\n",
    310        "      <td>360</td>\n",
    311        "      <td>$M</td>\n",
    312        "      <td>1</td>\n",
    313        "      <td>2012-01-14</td>\n",
    314        "      <td>128</td>\n",
    315        "    </tr>\n",
    316        "    <tr>\n",
    317        "      <th>3</th>\n",
    318        "      <td>1383004</td>\n",
    319        "      <td>2012-01-13</td>\n",
    320        "      <td>2012-01-13</td>\n",
    321        "      <td>NVS</td>\n",
    322        "      <td>Impairments/Charges</td>\n",
    323        "      <td>Novartis to Record $1.22B Charges</td>\n",
    324        "      <td>1220</td>\n",
    325        "      <td>$M</td>\n",
    326        "      <td>1</td>\n",
    327        "      <td>2012-01-14</td>\n",
    328        "      <td>21536</td>\n",
    329        "    </tr>\n",
    330        "    <tr>\n",
    331        "      <th>4</th>\n",
    332        "      <td>1383880</td>\n",
    333        "      <td>2012-01-18</td>\n",
    334        "      <td>2012-01-18</td>\n",
    335        "      <td>HES</td>\n",
    336        "      <td>Impairments/Charges</td>\n",
    337        "      <td>Hess Corp to Record $525M Charge in 4Q 11 on R...</td>\n",
    338        "      <td>525</td>\n",
    339        "      <td>$M</td>\n",
    340        "      <td>1</td>\n",
    341        "      <td>2012-01-19</td>\n",
    342        "      <td>216</td>\n",
    343        "    </tr>\n",
    344        "    <tr>\n",
    345        "      <th>5</th>\n",
    346        "      <td>1384387</td>\n",
    347        "      <td>2012-01-19</td>\n",
    348        "      <td>2012-01-19</td>\n",
    349        "      <td>ECL</td>\n",
    350        "      <td>Impairments/Charges</td>\n",
    351        "      <td>Ecolab to Record $480M Charges by FY 13</td>\n",
    352        "      <td>480</td>\n",
    353        "      <td>$M</td>\n",
    354        "      <td>1</td>\n",
    355        "      <td>2012-01-20</td>\n",
    356        "      <td>2427</td>\n",
    357        "    </tr>\n",
    358        "    <tr>\n",
    359        "      <th>6</th>\n",
    360        "      <td>1385496</td>\n",
    361        "      <td>2012-01-23</td>\n",
    362        "      <td>2012-01-24</td>\n",
    363        "      <td>MUR</td>\n",
    364        "      <td>Impairments/Charges</td>\n",
    365        "      <td>Murphy Oil Unit to Record $370M Asset Impairme...</td>\n",
    366        "      <td>370</td>\n",
    367        "      <td>$M</td>\n",
    368        "      <td>1</td>\n",
    369        "      <td>2012-01-24</td>\n",
    370        "      <td>5126</td>\n",
    371        "    </tr>\n",
    372        "    <tr>\n",
    373        "      <th>7</th>\n",
    374        "      <td>1386032</td>\n",
    375        "      <td>2012-01-24</td>\n",
    376        "      <td>2012-01-25</td>\n",
    377        "      <td>BBOX</td>\n",
    378        "      <td>Impairments/Charges</td>\n",
    379        "      <td>Black Box to Record $320M Charges in 3Q 12</td>\n",
    380        "      <td>320</td>\n",
    381        "      <td>$M</td>\n",
    382        "      <td>1</td>\n",
    383        "      <td>2012-01-25</td>\n",
    384        "      <td>11732</td>\n",
    385        "    </tr>\n",
    386        "    <tr>\n",
    387        "      <th>8</th>\n",
    388        "      <td>1385962</td>\n",
    389        "      <td>2012-01-24</td>\n",
    390        "      <td>2012-01-25</td>\n",
    391        "      <td>RE</td>\n",
    392        "      <td>Impairments/Charges</td>\n",
    393        "      <td>Everest Re Group to Record $245M Catastrophe L...</td>\n",
    394        "      <td>245</td>\n",
    395        "      <td>$M</td>\n",
    396        "      <td>1</td>\n",
    397        "      <td>2012-01-25</td>\n",
    398        "      <td>13720</td>\n",
    399        "    </tr>\n",
    400        "    <tr>\n",
    401        "      <th>9</th>\n",
    402        "      <td>1388133</td>\n",
    403        "      <td>2012-01-30</td>\n",
    404        "      <td>2012-01-30</td>\n",
    405        "      <td>X</td>\n",
    406        "      <td>Impairments/Charges</td>\n",
    407        "      <td>United States Steel to Record Upto $450M Charg...</td>\n",
    408        "      <td>450</td>\n",
    409        "      <td>$M</td>\n",
    410        "      <td>1</td>\n",
    411        "      <td>2012-01-31</td>\n",
    412        "      <td>8329</td>\n",
    413        "    </tr>\n",
    414        "    <tr>\n",
    415        "      <th>10</th>\n",
    416        "      <td>1388719</td>\n",
    417        "      <td>2012-01-31</td>\n",
    418        "      <td>2012-01-31</td>\n",
    419        "      <td>X</td>\n",
    420        "      <td>Impairments/Charges</td>\n",
    421        "      <td>United States Steel to Record Up to $450M Char...</td>\n",
    422        "      <td>450</td>\n",
    423        "      <td>$M</td>\n",
    424        "      <td>1</td>\n",
    425        "      <td>2012-02-01</td>\n",
    426        "      <td>8329</td>\n",
    427        "    </tr>\n",
    428        "  </tbody>\n",
    429        "</table>"
    430       ],
    431       "text/plain": [
    432        "    event_id  asof_date trade_date symbol           event_type  \\\n",
    433        "0    1382496 2012-01-11 2012-01-12     XL  Impairments/Charges   \n",
    434        "1    1382455 2012-01-11 2012-01-12     RF  Impairments/Charges   \n",
    435        "2    1383159 2012-01-13 2012-01-16    ADM  Impairments/Charges   \n",
    436        "3    1383004 2012-01-13 2012-01-13    NVS  Impairments/Charges   \n",
    437        "4    1383880 2012-01-18 2012-01-18    HES  Impairments/Charges   \n",
    438        "5    1384387 2012-01-19 2012-01-19    ECL  Impairments/Charges   \n",
    439        "6    1385496 2012-01-23 2012-01-24    MUR  Impairments/Charges   \n",
    440        "7    1386032 2012-01-24 2012-01-25   BBOX  Impairments/Charges   \n",
    441        "8    1385962 2012-01-24 2012-01-25     RE  Impairments/Charges   \n",
    442        "9    1388133 2012-01-30 2012-01-30      X  Impairments/Charges   \n",
    443        "10   1388719 2012-01-31 2012-01-31      X  Impairments/Charges   \n",
    444        "\n",
    445        "                                       event_headline  charge_amount  \\\n",
    446        "0               XL Group to Record Upto $220M Charges            220   \n",
    447        "1   Regions Financial to Record Upto $745M Impairm...            745   \n",
    448        "2   Archer Daniels to Record Upto $360M Charge in ...            360   \n",
    449        "3                   Novartis to Record $1.22B Charges           1220   \n",
    450        "4   Hess Corp to Record $525M Charge in 4Q 11 on R...            525   \n",
    451        "5             Ecolab to Record $480M Charges by FY 13            480   \n",
    452        "6   Murphy Oil Unit to Record $370M Asset Impairme...            370   \n",
    453        "7          Black Box to Record $320M Charges in 3Q 12            320   \n",
    454        "8   Everest Re Group to Record $245M Catastrophe L...            245   \n",
    455        "9   United States Steel to Record Upto $450M Charg...            450   \n",
    456        "10  United States Steel to Record Up to $450M Char...            450   \n",
    457        "\n",
    458        "   amount_units  event_rating  timestamp    sid  \n",
    459        "0            $M             1 2012-01-12   8340  \n",
    460        "1            $M             1 2012-01-12  34913  \n",
    461        "2            $M             1 2012-01-14    128  \n",
    462        "3            $M             1 2012-01-14  21536  \n",
    463        "4            $M             1 2012-01-19    216  \n",
    464        "5            $M             1 2012-01-20   2427  \n",
    465        "6            $M             1 2012-01-24   5126  \n",
    466        "7            $M             1 2012-01-25  11732  \n",
    467        "8            $M             1 2012-01-25  13720  \n",
    468        "9            $M             1 2012-01-31   8329  \n",
    469        "..."
    470       ]
    471      },
    472      "execution_count": 5,
    473      "metadata": {},
    474      "output_type": "execute_result"
    475     }
    476    ],
    477    "source": [
    478     "twohundreds = impairments_and_charges[('2011-12-31' < impairments_and_charges['asof_date']) &\n",
    479     "                                      (impairments_and_charges['asof_date'] < '2013-01-01') &\n",
    480     "                                      (impairments_and_charges.charge_amount > 200) &\n",
    481     "                                      (impairments_and_charges.amount_units == \"$M\")]\n",
    482     "# When displaying a Blaze Data Object, the printout is automatically truncated to ten rows.\n",
    483     "twohundreds.sort('asof_date')"
    484    ]
    485   },
    486   {
    487    "cell_type": "markdown",
    488    "metadata": {},
    489    "source": [
    490     "Now suppose we want a DataFrame of the Blaze Data Object above, containing only the `sid`, `charge_amount`, and `asof_date` columns."
    491    ]
    492   },
    493   {
    494    "cell_type": "code",
    495    "execution_count": 6,
    496    "metadata": {
    497     "collapsed": false
    498    },
    499    "outputs": [
    500     {
    501      "data": {
    502       "text/html": [
    503        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
    504        "<table border=\"1\" class=\"dataframe\">\n",
    505        "  <thead>\n",
    506        "    <tr style=\"text-align: right;\">\n",
    507        "      <th></th>\n",
    508        "      <th>sid</th>\n",
    509        "      <th>asof_date</th>\n",
    510        "      <th>charge_amount</th>\n",
    511        "    </tr>\n",
    512        "  </thead>\n",
    513        "  <tbody>\n",
    514        "    <tr>\n",
    515        "      <th>0</th>\n",
    516        "      <td>34913</td>\n",
    517        "      <td>2012-01-11</td>\n",
    518        "      <td>745.0</td>\n",
    519        "    </tr>\n",
    520        "    <tr>\n",
    521        "      <th>1</th>\n",
    522        "      <td>8340</td>\n",
    523        "      <td>2012-01-11</td>\n",
    524        "      <td>220.0</td>\n",
    525        "    </tr>\n",
    526        "    <tr>\n",
    527        "      <th>2</th>\n",
    528        "      <td>128</td>\n",
    529        "      <td>2012-01-13</td>\n",
    530        "      <td>360.0</td>\n",
    531        "    </tr>\n",
    532        "    <tr>\n",
    533        "      <th>3</th>\n",
    534        "      <td>21536</td>\n",
    535        "      <td>2012-01-13</td>\n",
    536        "      <td>1220.0</td>\n",
    537        "    </tr>\n",
    538        "    <tr>\n",
    539        "      <th>4</th>\n",
    540        "      <td>216</td>\n",
    541        "      <td>2012-01-18</td>\n",
    542        "      <td>525.0</td>\n",
    543        "    </tr>\n",
    544        "    <tr>\n",
    545        "      <th>5</th>\n",
    546        "      <td>2427</td>\n",
    547        "      <td>2012-01-19</td>\n",
    548        "      <td>480.0</td>\n",
    549        "    </tr>\n",
    550        "    <tr>\n",
    551        "      <th>6</th>\n",
    552        "      <td>5126</td>\n",
    553        "      <td>2012-01-23</td>\n",
    554        "      <td>370.0</td>\n",
    555        "    </tr>\n",
    556        "    <tr>\n",
    557        "      <th>7</th>\n",
    558        "      <td>11732</td>\n",
    559        "      <td>2012-01-24</td>\n",
    560        "      <td>320.0</td>\n",
    561        "    </tr>\n",
    562        "    <tr>\n",
    563        "      <th>8</th>\n",
    564        "      <td>13720</td>\n",
    565        "      <td>2012-01-24</td>\n",
    566        "      <td>245.0</td>\n",
    567        "    </tr>\n",
    568        "    <tr>\n",
    569        "      <th>9</th>\n",
    570        "      <td>8329</td>\n",
    571        "      <td>2012-01-30</td>\n",
    572        "      <td>450.0</td>\n",
    573        "    </tr>\n",
    574        "    <tr>\n",
    575        "      <th>10</th>\n",
    576        "      <td>8329</td>\n",
    577        "      <td>2012-01-31</td>\n",
    578        "      <td>450.0</td>\n",
    579        "    </tr>\n",
    580        "    <tr>\n",
    581        "      <th>11</th>\n",
    582        "      <td>351</td>\n",
    583        "      <td>2012-03-05</td>\n",
    584        "      <td>703.0</td>\n",
    585        "    </tr>\n",
    586        "    <tr>\n",
    587        "      <th>12</th>\n",
    588        "      <td>7334</td>\n",
    589        "      <td>2012-03-13</td>\n",
    590        "      <td>293.0</td>\n",
    591        "    </tr>\n",
    592        "    <tr>\n",
    593        "      <th>13</th>\n",
    594        "      <td>1335</td>\n",
    595        "      <td>2012-03-24</td>\n",
    596        "      <td>700.0</td>\n",
    597        "    </tr>\n",
    598        "    <tr>\n",
    599        "      <th>14</th>\n",
    600        "      <td>2263</td>\n",
    601        "      <td>2012-04-02</td>\n",
    602        "      <td>350.0</td>\n",
    603        "    </tr>\n",
    604        "    <tr>\n",
    605        "      <th>15</th>\n",
    606        "      <td>6116</td>\n",
    607        "      <td>2012-04-05</td>\n",
    608        "      <td>372.0</td>\n",
    609        "    </tr>\n",
    610        "    <tr>\n",
    611        "      <th>16</th>\n",
    612        "      <td>23112</td>\n",
    613        "      <td>2012-04-10</td>\n",
    614        "      <td>400.0</td>\n",
    615        "    </tr>\n",
    616        "    <tr>\n",
    617        "      <th>17</th>\n",
    618        "      <td>32902</td>\n",
    619        "      <td>2012-04-17</td>\n",
    620        "      <td>370.0</td>\n",
    621        "    </tr>\n",
    622        "    <tr>\n",
    623        "      <th>18</th>\n",
    624        "      <td>24838</td>\n",
    625        "      <td>2012-04-19</td>\n",
    626        "      <td>260.0</td>\n",
    627        "    </tr>\n",
    628        "    <tr>\n",
    629        "      <th>19</th>\n",
    630        "      <td>2351</td>\n",
    631        "      <td>2012-04-30</td>\n",
    632        "      <td>420.0</td>\n",
    633        "    </tr>\n",
    634        "    <tr>\n",
    635        "      <th>20</th>\n",
    636        "      <td>24838</td>\n",
    637        "      <td>2012-05-17</td>\n",
    638        "      <td>280.0</td>\n",
    639        "    </tr>\n",
    640        "    <tr>\n",
    641        "      <th>21</th>\n",
    642        "      <td>754</td>\n",
    643        "      <td>2012-05-22</td>\n",
    644        "      <td>350.0</td>\n",
    645        "    </tr>\n",
    646        "    <tr>\n",
    647        "      <th>22</th>\n",
    648        "      <td>3735</td>\n",
    649        "      <td>2012-05-23</td>\n",
    650        "      <td>1700.0</td>\n",
    651        "    </tr>\n",
    652        "    <tr>\n",
    653        "      <th>23</th>\n",
    654        "      <td>14388</td>\n",
    655        "      <td>2012-06-05</td>\n",
    656        "      <td>425.0</td>\n",
    657        "    </tr>\n",
    658        "    <tr>\n",
    659        "      <th>24</th>\n",
    660        "      <td>4151</td>\n",
    661        "      <td>2012-06-08</td>\n",
    662        "      <td>600.0</td>\n",
    663        "    </tr>\n",
    664        "    <tr>\n",
    665        "      <th>25</th>\n",
    666        "      <td>11673</td>\n",
    667        "      <td>2012-06-14</td>\n",
    668        "      <td>1000.0</td>\n",
    669        "    </tr>\n",
    670        "    <tr>\n",
    671        "      <th>26</th>\n",
    672        "      <td>88</td>\n",
    673        "      <td>2012-06-21</td>\n",
    674        "      <td>439.0</td>\n",
    675        "    </tr>\n",
    676        "    <tr>\n",
    677        "      <th>27</th>\n",
    678        "      <td>26204</td>\n",
    679        "      <td>2012-06-25</td>\n",
    680        "      <td>272.0</td>\n",
    681        "    </tr>\n",
    682        "    <tr>\n",
    683        "      <th>29</th>\n",
    684        "      <td>5061</td>\n",
    685        "      <td>2012-07-02</td>\n",
    686        "      <td>6200.0</td>\n",
    687        "    </tr>\n",
    688        "    <tr>\n",
    689        "      <th>30</th>\n",
    690        "      <td>903</td>\n",
    691        "      <td>2012-07-06</td>\n",
    692        "      <td>210.0</td>\n",
    693        "    </tr>\n",
    694        "    <tr>\n",
    695        "      <th>...</th>\n",
    696        "      <td>...</td>\n",
    697        "      <td>...</td>\n",
    698        "      <td>...</td>\n",
    699        "    </tr>\n",
    700        "    <tr>\n",
    701        "      <th>61</th>\n",
    702        "      <td>5520</td>\n",
    703        "      <td>2012-10-26</td>\n",
    704        "      <td>275.0</td>\n",
    705        "    </tr>\n",
    706        "    <tr>\n",
    707        "      <th>62</th>\n",
    708        "      <td>166</td>\n",
    709        "      <td>2012-11-01</td>\n",
    710        "      <td>2000.0</td>\n",
    711        "    </tr>\n",
    712        "    <tr>\n",
    713        "      <th>63</th>\n",
    714        "      <td>42173</td>\n",
    715        "      <td>2012-11-01</td>\n",
    716        "      <td>250.0</td>\n",
    717        "    </tr>\n",
    718        "    <tr>\n",
    719        "      <th>65</th>\n",
    720        "      <td>26169</td>\n",
    721        "      <td>2012-11-15</td>\n",
    722        "      <td>400.0</td>\n",
    723        "    </tr>\n",
    724        "    <tr>\n",
    725        "      <th>66</th>\n",
    726        "      <td>7671</td>\n",
    727        "      <td>2012-11-15</td>\n",
    728        "      <td>325.0</td>\n",
    729        "    </tr>\n",
    730        "    <tr>\n",
    731        "      <th>67</th>\n",
    732        "      <td>24833</td>\n",
    733        "      <td>2012-11-17</td>\n",
    734        "      <td>400.0</td>\n",
    735        "    </tr>\n",
    736        "    <tr>\n",
    737        "      <th>68</th>\n",
    738        "      <td>161</td>\n",
    739        "      <td>2012-11-20</td>\n",
    740        "      <td>290.0</td>\n",
    741        "    </tr>\n",
    742        "    <tr>\n",
    743        "      <th>69</th>\n",
    744        "      <td>23998</td>\n",
    745        "      <td>2012-11-26</td>\n",
    746        "      <td>400.0</td>\n",
    747        "    </tr>\n",
    748        "    <tr>\n",
    749        "      <th>71</th>\n",
    750        "      <td>24838</td>\n",
    751        "      <td>2012-11-28</td>\n",
    752        "      <td>1075.0</td>\n",
    753        "    </tr>\n",
    754        "    <tr>\n",
    755        "      <th>72</th>\n",
    756        "      <td>5092</td>\n",
    757        "      <td>2012-11-30</td>\n",
    758        "      <td>267.5</td>\n",
    759        "    </tr>\n",
    760        "    <tr>\n",
    761        "      <th>73</th>\n",
    762        "      <td>5862</td>\n",
    763        "      <td>2012-12-04</td>\n",
    764        "      <td>300.0</td>\n",
    765        "    </tr>\n",
    766        "    <tr>\n",
    767        "      <th>74</th>\n",
    768        "      <td>1335</td>\n",
    769        "      <td>2012-12-05</td>\n",
    770        "      <td>1000.0</td>\n",
    771        "    </tr>\n",
    772        "    <tr>\n",
    773        "      <th>75</th>\n",
    774        "      <td>7041</td>\n",
    775        "      <td>2012-12-05</td>\n",
    776        "      <td>650.0</td>\n",
    777        "    </tr>\n",
    778        "    <tr>\n",
    779        "      <th>76</th>\n",
    780        "      <td>239</td>\n",
    781        "      <td>2012-12-07</td>\n",
    782        "      <td>2000.0</td>\n",
    783        "    </tr>\n",
    784        "    <tr>\n",
    785        "      <th>77</th>\n",
    786        "      <td>8580</td>\n",
    787        "      <td>2012-12-10</td>\n",
    788        "      <td>380.0</td>\n",
    789        "    </tr>\n",
    790        "    <tr>\n",
    791        "      <th>78</th>\n",
    792        "      <td>1274</td>\n",
    793        "      <td>2012-12-11</td>\n",
    794        "      <td>880.0</td>\n",
    795        "    </tr>\n",
    796        "    <tr>\n",
    797        "      <th>79</th>\n",
    798        "      <td>14064</td>\n",
    799        "      <td>2012-12-11</td>\n",
    800        "      <td>370.0</td>\n",
    801        "    </tr>\n",
    802        "    <tr>\n",
    803        "      <th>80</th>\n",
    804        "      <td>8340</td>\n",
    805        "      <td>2012-12-12</td>\n",
    806        "      <td>350.0</td>\n",
    807        "    </tr>\n",
    808        "    <tr>\n",
    809        "      <th>81</th>\n",
    810        "      <td>4488</td>\n",
    811        "      <td>2012-12-13</td>\n",
    812        "      <td>750.0</td>\n",
    813        "    </tr>\n",
    814        "    <tr>\n",
    815        "      <th>82</th>\n",
    816        "      <td>25305</td>\n",
    817        "      <td>2012-12-17</td>\n",
    818        "      <td>300.0</td>\n",
    819        "    </tr>\n",
    820        "    <tr>\n",
    821        "      <th>83</th>\n",
    822        "      <td>25955</td>\n",
    823        "      <td>2012-12-18</td>\n",
    824        "      <td>220.0</td>\n",
    825        "    </tr>\n",
    826        "    <tr>\n",
    827        "      <th>84</th>\n",
    828        "      <td>8369</td>\n",
    829        "      <td>2012-12-18</td>\n",
    830        "      <td>288.0</td>\n",
    831        "    </tr>\n",
    832        "    <tr>\n",
    833        "      <th>85</th>\n",
    834        "      <td>21462</td>\n",
    835        "      <td>2012-12-19</td>\n",
    836        "      <td>240.0</td>\n",
    837        "    </tr>\n",
    838        "    <tr>\n",
    839        "      <th>86</th>\n",
    840        "      <td>40430</td>\n",
    841        "      <td>2012-12-19</td>\n",
    842        "      <td>400.0</td>\n",
    843        "    </tr>\n",
    844        "    <tr>\n",
    845        "      <th>87</th>\n",
    846        "      <td>10025</td>\n",
    847        "      <td>2012-12-19</td>\n",
    848        "      <td>240.0</td>\n",
    849        "    </tr>\n",
    850        "    <tr>\n",
    851        "      <th>88</th>\n",
    852        "      <td>24783</td>\n",
    853        "      <td>2012-12-20</td>\n",
    854        "      <td>2000.0</td>\n",
    855        "    </tr>\n",
    856        "    <tr>\n",
    857        "      <th>89</th>\n",
    858        "      <td>13720</td>\n",
    859        "      <td>2012-12-20</td>\n",
    860        "      <td>220.0</td>\n",
    861        "    </tr>\n",
    862        "    <tr>\n",
    863        "      <th>90</th>\n",
    864        "      <td>17395</td>\n",
    865        "      <td>2012-12-21</td>\n",
    866        "      <td>4300.0</td>\n",
    867        "    </tr>\n",
    868        "    <tr>\n",
    869        "      <th>91</th>\n",
    870        "      <td>34334</td>\n",
    871        "      <td>2012-12-21</td>\n",
    872        "      <td>333.1</td>\n",
    873        "    </tr>\n",
    874        "    <tr>\n",
    875        "      <th>92</th>\n",
    876        "      <td>7543</td>\n",
    877        "      <td>2012-12-26</td>\n",
    878        "      <td>1100.0</td>\n",
    879        "    </tr>\n",
    880        "  </tbody>\n",
    881        "</table>\n",
    882        "<p>89 rows × 3 columns</p>\n",
    883        "</div>"
    884       ],
    885       "text/plain": [
    886        "      sid  asof_date  charge_amount\n",
    887        "0   34913 2012-01-11          745.0\n",
    888        "1    8340 2012-01-11          220.0\n",
    889        "2     128 2012-01-13          360.0\n",
    890        "3   21536 2012-01-13         1220.0\n",
    891        "4     216 2012-01-18          525.0\n",
    892        "5    2427 2012-01-19          480.0\n",
    893        "6    5126 2012-01-23          370.0\n",
    894        "7   11732 2012-01-24          320.0\n",
    895        "8   13720 2012-01-24          245.0\n",
    896        "9    8329 2012-01-30          450.0\n",
    897        "10   8329 2012-01-31          450.0\n",
    898        "11    351 2012-03-05          703.0\n",
    899        "12   7334 2012-03-13          293.0\n",
    900        "13   1335 2012-03-24          700.0\n",
    901        "14   2263 2012-04-02          350.0\n",
    902        "15   6116 2012-04-05          372.0\n",
    903        "16  23112 2012-04-10          400.0\n",
    904        "17  32902 2012-04-17          370.0\n",
    905        "18  24838 2012-04-19          260.0\n",
    906        "19   2351 2012-04-30          420.0\n",
    907        "20  24838 2012-05-17          280.0\n",
    908        "21    754 2012-05-22          350.0\n",
    909        "22   3735 2012-05-23         1700.0\n",
    910        "23  14388 2012-06-05          425.0\n",
    911        "24   4151 2012-06-08          600.0\n",
    912        "25  11673 2012-06-14         1000.0\n",
    913        "26     88 2012-06-21          439.0\n",
    914        "27  26204 2012-06-25          272.0\n",
    915        "29   5061 2012-07-02         6200.0\n",
    916        "30    903 2012-07-06          210.0\n",
    917        "..    ...        ...            ...\n",
    918        "61   5520 2012-10-26          275.0\n",
    919        "62    166 2012-11-01         2000.0\n",
    920        "63  42173 2012-11-01          250.0\n",
    921        "65  26169 2012-11-15          400.0\n",
    922        "66   7671 2012-11-15          325.0\n",
    923        "67  24833 2012-11-17          400.0\n",
    924        "68    161 2012-11-20          290.0\n",
    925        "69  23998 2012-11-26          400.0\n",
    926        "71  24838 2012-11-28         1075.0\n",
    927        "72   5092 2012-11-30          267.5\n",
    928        "73   5862 2012-12-04          300.0\n",
    929        "74   1335 2012-12-05         1000.0\n",
    930        "75   7041 2012-12-05          650.0\n",
    931        "76    239 2012-12-07         2000.0\n",
    932        "77   8580 2012-12-10          380.0\n",
    933        "78   1274 2012-12-11          880.0\n",
    934        "79  14064 2012-12-11          370.0\n",
    935        "80   8340 2012-12-12          350.0\n",
    936        "81   4488 2012-12-13          750.0\n",
    937        "82  25305 2012-12-17          300.0\n",
    938        "83  25955 2012-12-18          220.0\n",
    939        "84   8369 2012-12-18          288.0\n",
    940        "85  21462 2012-12-19          240.0\n",
    941        "86  40430 2012-12-19          400.0\n",
    942        "87  10025 2012-12-19          240.0\n",
    943        "88  24783 2012-12-20         2000.0\n",
    944        "89  13720 2012-12-20          220.0\n",
    945        "90  17395 2012-12-21         4300.0\n",
    946        "91  34334 2012-12-21          333.1\n",
    947        "92   7543 2012-12-26         1100.0\n",
    948        "\n",
    949        "[89 rows x 3 columns]"
    950       ]
    951      },
    952      "execution_count": 6,
    953      "metadata": {},
    954      "output_type": "execute_result"
    955     }
    956    ],
    957    "source": [
    958     "df = odo(twohundreds, pd.DataFrame)\n",
    959     "df = df[['sid', 'asof_date', 'charge_amount']].dropna()\n",
    960     "# This DataFrame exceeds pandas' display row limit, so only the first 30 and last 30 rows are shown; the middle is truncated.\n",
    961     "df"
    962    ]
    963   }
    964  ],
    965  "metadata": {
    966   "kernelspec": {
    967    "display_name": "Python 2",
    968    "language": "python",
    969    "name": "python2"
    970   },
    971   "language_info": {
    972    "codemirror_mode": {
    973     "name": "ipython",
    974     "version": 2
    975    },
    976    "file_extension": ".py",
    977    "mimetype": "text/x-python",
    978    "name": "python",
    979    "nbconvert_exporter": "python",
    980    "pygments_lexer": "ipython2",
    981    "version": "2.7.10"
    982   }
    983  },
    984  "nbformat": 4,
    985  "nbformat_minor": 0
    986 }