ml-finance-python

python scripts for finance machine learning

git clone https://9o.is/git/ml-finance-python.git

notebook.ipynb

(12147B)


      1 {
      2  "cells": [
      3   {
      4    "cell_type": "markdown",
      5    "metadata": {
      6     "deletable": true,
      7     "editable": true
      8    },
      9    "source": [
     10     "# Exercises: Mean Reversion on Futures\n",
     11     "By Chris Fenaroli, Delaney Mackenzie, and Maxwell Margenot\n",
     12     "\n",
     13     "## Lecture Link\n",
     14     "https://www.quantopian.com/lectures/introduction-to-pairs-trading\n",
     15     "\n",
     16     "https://www.quantopian.com/lectures/mean-reversion-on-futures\n",
     17     "\n",
     18     "### IMPORTANT NOTE:\n",
     19     "These exercises correspond to the Mean Reversion on Futures lecture, which is part of the Quantopian Lecture Series. They expect you to rely heavily on the code presented in that lecture. Copy and paste from the lecture freely when starting on the problems, as doing them from scratch will likely be too difficult.\n",
     20     "\n",
     21     "Part of the Quantopian Lecture Series:\n",
     22     "\n",
     23     "* [www.quantopian.com/lectures](https://www.quantopian.com/lectures)\n",
     24     "* [github.com/quantopian/research_public](https://github.com/quantopian/research_public)\n",
     25     "\n",
     26     "\n",
     27     "----"
     28    ]
     29   },
     30   {
     31    "cell_type": "markdown",
     32    "metadata": {
     33     "deletable": true,
     34     "editable": true
     35    },
     36    "source": [
     37     "## Key concepts"
     38    ]
     39   },
     40   {
     41    "cell_type": "code",
     42    "execution_count": null,
     43    "metadata": {
     44     "collapsed": true,
     45     "deletable": true,
     46     "editable": true
     47    },
     48    "outputs": [],
     49    "source": [
     50     "# Useful Functions\n",
     51     "def find_cointegrated_pairs(data):\n",
     52     "    n = data.shape[1]\n",
     53     "    score_matrix = np.zeros((n, n))\n",
     54     "    pvalue_matrix = np.ones((n, n))\n",
     55     "    keys = data.keys()\n",
     56     "    pairs = []\n",
     57     "    for i in range(n):\n",
     58     "        for j in range(i+1, n):\n",
     59     "            S1 = data[keys[i]]\n",
     60     "            S2 = data[keys[j]]\n",
     61     "            result = coint(S1, S2)\n",
     62     "            score = result[0]\n",
     63     "            pvalue = result[1]\n",
     64     "            score_matrix[i, j] = score\n",
     65     "            pvalue_matrix[i, j] = pvalue\n",
     66     "            if pvalue < 0.05:\n",
     67     "                pairs.append((keys[i], keys[j]))\n",
     68     "    return score_matrix, pvalue_matrix, pairs"
     69    ]
     70   },
     71   {
     72    "cell_type": "code",
     73    "execution_count": null,
     74    "metadata": {
     75     "collapsed": true,
     76     "deletable": true,
     77     "editable": true
     78    },
     79    "outputs": [],
     80    "source": [
     81     "# Useful Libraries\n",
     82     "import numpy as np\n",
     83     "import pandas as pd\n",
     84     "\n",
     85     "import statsmodels\n",
     86     "import statsmodels.api as sm\n",
     87     "from statsmodels.tsa.stattools import coint, adfuller\n",
     88     "from quantopian.research.experimental import history, continuous_future\n",
     89     "# just set the seed for the random number generator\n",
     90     "np.random.seed(107)\n",
     91     "\n",
     92     "import matplotlib.pyplot as plt"
     93    ]
     94   },
     95   {
     96    "cell_type": "markdown",
     97    "metadata": {
     98     "deletable": true,
     99     "editable": true
    100    },
    101    "source": [
    102     "----"
    103    ]
    104   },
    105   {
    106    "cell_type": "markdown",
    107    "metadata": {
    108     "deletable": true,
    109     "editable": true
    110    },
    111    "source": [
    112     "# Exercise 1: Testing Artificial Examples\n",
    113     "\n",
    114     "We'll use some artificially generated series first, as they are much cleaner and easier to work with. In general, when learning or developing a new technique, use simulated data to provide a clean environment. Simulated data also lets you control the noise level and the difficulty of the problem for your model.\n",
    115     "\n",
    116     "## a. Cointegration Test I\n",
    117     "\n",
    118     "Determine whether the following two artificial series $A$ and $B$ are cointegrated using the `coint()` function and a reasonable confidence level."
    119    ]
    120   },
    121   {
    122    "cell_type": "code",
    123    "execution_count": null,
    124    "metadata": {
    125     "collapsed": false,
    126     "deletable": true,
    127     "editable": true
    128    },
    129    "outputs": [],
    130    "source": [
    131     "A_returns = np.random.normal(0, 1, 100)\n",
    132     "A = pd.Series(np.cumsum(A_returns), name='X') + 50\n",
    133     "\n",
    134     "some_noise = np.random.exponential(1, 100)\n",
    135     "\n",
    136     "B = A - 7 + some_noise\n",
    137     "\n",
    138     "#Your code goes here"
    139    ]
    140   },
    141   {
    142    "cell_type": "markdown",
    143    "metadata": {
    144     "deletable": true,
    145     "editable": true
    146    },
    147    "source": [
    148     "## b. Cointegration Test II\n",
    149     "\n",
    150     "Determine whether the following two artificial series $C$ and $D$ are cointegrated using the `coint()` function and a reasonable confidence level."
    151    ]
    152   },
    153   {
    154    "cell_type": "code",
    155    "execution_count": null,
    156    "metadata": {
    157     "collapsed": false,
    158     "deletable": true,
    159     "editable": true
    160    },
    161    "outputs": [],
    162    "source": [
    163     "C_returns = np.random.normal(1, 1, 100) \n",
    164     "C = pd.Series(np.cumsum(C_returns), name='X') + 100\n",
    165     "\n",
    166     "D_returns = np.random.normal(2, 1, 100)\n",
    167     "D = pd.Series(np.cumsum(D_returns), name='X') + 100\n",
    168     "\n",
    169     "#Your code goes here"
    170    ]
    171   },
    172   {
    173    "cell_type": "markdown",
    174    "metadata": {
    175     "deletable": true,
    176     "editable": true
    177    },
    178    "source": [
    179     "----"
    180    ]
    181   },
    182   {
    183    "cell_type": "markdown",
    184    "metadata": {
    185     "deletable": true,
    186     "editable": true
    187    },
    188    "source": [
    189     "# Exercise 2: Testing Real Examples\n",
    190     "\n",
    191     "## a. Real Cointegration Test I\n",
    192     "\n",
    193     "Determine whether the following two assets `CN` and `SB` were cointegrated during 2015 using the `coint()` function and a reasonable confidence level."
    194    ]
    195   },
    196   {
    197    "cell_type": "code",
    198    "execution_count": null,
    199    "metadata": {
    200     "collapsed": false,
    201     "deletable": true,
    202     "editable": true
    203    },
    204    "outputs": [],
    205    "source": [
    206     "cn = continuous_future('CN', offset = 0, roll = 'calendar', adjustment = 'mul')\n",
    207     "sb = continuous_future('SB', offset = 0, roll = 'calendar', adjustment = 'mul')\n",
    208     "\n",
    209     "cn_price = history(cn, 'price', '2015-01-01', '2016-01-01', 'daily')\n",
    210     "sb_price = history(sb, 'price', '2015-01-01', '2016-01-01', 'daily')\n",
    211     "\n",
    212     "#Your code goes here"
    213    ]
    214   },
    215   {
    216    "cell_type": "markdown",
    217    "metadata": {
    218     "deletable": true,
    219     "editable": true
    220    },
    221    "source": [
    222     "## b. Real Cointegration Test II\n",
    223     "\n",
    224     "Determine whether the following two underlyings `CL` and `HO` were cointegrated during 2015 using the `coint()` function and a reasonable confidence level."
    225    ]
    226   },
    227   {
    228    "cell_type": "code",
    229    "execution_count": null,
    230    "metadata": {
    231     "collapsed": false,
    232     "deletable": true,
    233     "editable": true
    234    },
    235    "outputs": [],
    236    "source": [
    237     "cl = continuous_future('CL', offset = 0, roll = 'calendar', adjustment = 'mul')\n",
    238     "ho = continuous_future('HO', offset = 0, roll = 'calendar', adjustment = 'mul')\n",
    239     "\n",
    240     "cl_price = history(cl, 'price', '2015-01-01', '2016-01-01', 'daily')\n",
    241     "ho_price = history(ho, 'price', '2015-01-01', '2016-01-01', 'daily')\n",
    242     "\n",
    243     "#Your code goes here"
    244    ]
    245   },
    246   {
    247    "cell_type": "markdown",
    248    "metadata": {
    249     "deletable": true,
    250     "editable": true
    251    },
    252    "source": [
    253     "----"
    254    ]
    255   },
    256   {
    257    "cell_type": "markdown",
    258    "metadata": {
    259     "deletable": true,
    260     "editable": true
    261    },
    262    "source": [
    263     "# Exercise 3: Out of Sample Validation\n",
    264     "\n",
    265     "## a. Calculating the Spread\n",
    266     "\n",
    267     "Using pricing data from 2015, construct a linear regression to find a coefficient for the linear combination of `CL` and `HO` that makes their spread stationary."
    268    ]
    269   },
    270   {
    271    "cell_type": "code",
    272    "execution_count": null,
    273    "metadata": {
    274     "collapsed": true,
    275     "deletable": true,
    276     "editable": true
    277    },
    278    "outputs": [],
    279    "source": [
    280     "\n",
    281     "#Your code goes here"
    282    ]
    283   },
    284   {
    285    "cell_type": "markdown",
    286    "metadata": {
    287     "deletable": true,
    288     "editable": true
    289    },
    290    "source": [
    291     "## b. Testing the Coefficient\n",
    292     "\n",
    293     "Use your coefficient from part a to plot the weighted spread using prices from the first half of 2016, and check whether the result is still stationary."
    294    ]
    295   },
    296   {
    297    "cell_type": "code",
    298    "execution_count": null,
    299    "metadata": {
    300     "collapsed": false,
    301     "deletable": true,
    302     "editable": true
    303    },
    304    "outputs": [],
    305    "source": [
    306     "cl_out = history(cl, 'price',\n",
    307     "                 '2016-01-01', '2016-07-01', 'daily')\n",
    308     "ho_out = history(ho, 'price',\n",
    309     "                 '2016-01-01', '2016-07-01', 'daily')\n",
    310     "\n",
    311     "#Your code goes here"
    312    ]
    313   },
    314   {
    315    "cell_type": "markdown",
    316    "metadata": {
    317     "deletable": true,
    318     "editable": true
    319    },
    320    "source": [
    321     "----"
    322    ]
    323   },
    324   {
    325    "cell_type": "markdown",
    326    "metadata": {
    327     "deletable": true,
    328     "editable": true
    329    },
    330    "source": [
    331     "# Extra Credit Exercise: Hurst Exponent\n",
    332     "\n",
    333     "This exercise is more difficult and we will not provide initial structure.\n",
    334     "\n",
    335     "The Hurst exponent is a statistic between 0 and 1 that provides information about how much a time series is trending or mean reverting. A value below 0.5 suggests mean reversion, a value near 0.5 a random walk, and a value above 0.5 trending behavior. We want our spread time series to be mean reverting, so we can use the Hurst exponent to monitor whether our pair is going out of cointegration. In effect, it serves as a means of process control, telling us when our pair is no longer good to trade.\n",
    336     "\n",
    337     "Either find an existing Python library that computes the Hurst exponent or compute it yourself. Then plot it over time for the spread on the above pair of futures.\n",
    338     "\n",
    339     "These links may be helpful:\n",
    340     "\n",
    341     "* https://en.wikipedia.org/wiki/Hurst_exponent\n",
    342     "* https://www.quantopian.com/posts/pair-trade-with-cointegration-and-mean-reversion-tests"
    343    ]
    344   },
    345   {
    346    "cell_type": "code",
    347    "execution_count": null,
    348    "metadata": {
    349     "collapsed": true,
    350     "deletable": true,
    351     "editable": true
    352    },
    353    "outputs": [],
    354    "source": [
    355     "# Your code goes here"
    356    ]
    357   },
    358   {
    359    "cell_type": "markdown",
    360    "metadata": {},
    361    "source": [
    362     "---\n",
    363     "\n",
    364     "Congratulations on completing the Mean Reversion on Futures exercises!\n",
    365     "\n",
    366     "As you learn more about writing trading models and the Quantopian platform, enter the daily [Quantopian Contest](https://www.quantopian.com/contest). Your strategy will be evaluated for a cash prize every day.\n",
    367     "\n",
    368     "Start by going through the [Writing a Contest Algorithm](https://www.quantopian.com/tutorials/contest) tutorial."
    369    ]
    370   },
    371   {
    372    "cell_type": "markdown",
    373    "metadata": {
    374     "deletable": true,
    375     "editable": true
    376    },
    377    "source": [
    378     "*This presentation is for informational purposes only and does not constitute an offer to sell, a\n",
    379     "solicitation to buy, or a recommendation for any security; nor does it constitute an offer to provide investment advisory or other services by Quantopian, Inc. (\"Quantopian\"). Nothing contained herein constitutes investment advice or offers any opinion with respect to the suitability of any security, and any views expressed herein should not be taken as advice to buy, sell, or hold any security or as an endorsement of any security or company. In preparing the information contained herein, Quantopian, Inc. has not taken into account the investment needs, objectives, and financial circumstances of any particular investor. Any views expressed and data illustrated herein were prepared based upon information, believed to be reliable, available to Quantopian, Inc. at the time of publication. Quantopian makes no guarantees as to their accuracy or completeness. All information is subject to change and may quickly become unreliable for various reasons, including changes in market conditions or economic circumstances.*"
    380    ]
    381   }
    382  ],
    383  "metadata": {
    384   "kernelspec": {
    385    "display_name": "Python 2",
    386    "language": "python",
    387    "name": "python2"
    388   },
    389   "language_info": {
    390    "codemirror_mode": {
    391     "name": "ipython",
    392     "version": 2
    393    },
    394    "file_extension": ".py",
    395    "mimetype": "text/x-python",
    396    "name": "python",
    397    "nbconvert_exporter": "python",
    398    "pygments_lexer": "ipython2",
    399    "version": "2.7.12"
    400   }
    401  },
    402  "nbformat": 4,
    403  "nbformat_minor": 0
    404 }
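
The notebook leaves the spread construction (Exercise 3) and the Hurst exponent (extra credit) unimplemented. The following is a minimal sketch of both ideas, not the lecture's solution. It runs on synthetic series rather than `CL` and `HO`, so it needs only NumPy and no Quantopian imports. The helper names `hedge_ratio` and `hurst_exponent` are hypothetical. The Hurst estimate here uses the scaling of lagged-difference standard deviations, which is only one of several possible estimators.

```python
import numpy as np


def hedge_ratio(x, y):
    """OLS slope and intercept of y regressed on x (y ~ beta * x + alpha)."""
    beta, alpha = np.polyfit(x, y, 1)
    return beta, alpha


def hurst_exponent(series, max_lag=50):
    """Estimate H from the scaling std(x[t + lag] - x[t]) ~ lag ** H."""
    series = np.asarray(series, dtype=float)
    lags = np.arange(2, max_lag)
    # Standard deviation of the lagged differences, one value per lag.
    tau = np.array([np.std(series[lag:] - series[:-lag]) for lag in lags])
    # The slope of the log-log fit is the Hurst exponent estimate.
    slope, _ = np.polyfit(np.log(lags), np.log(tau), 1)
    return slope


# Synthetic stand-ins for two price series (assumption: not market data).
rng = np.random.RandomState(107)
x = np.cumsum(rng.normal(0, 1, 2000)) + 50    # random walk
y = 2.0 * x + 5.0 + rng.normal(0, 1, 2000)    # tied to x by construction

beta, alpha = hedge_ratio(x, y)
spread = y - beta * x  # weighted spread using the fitted coefficient

print("hedge ratio: %.3f" % beta)
print("Hurst of random walk x: %.3f" % hurst_exponent(x))
print("Hurst of spread: %.3f" % hurst_exponent(spread))
```

On data like this, the random walk should give an estimate near 0.5 and the spread one well below it. To monitor a real pair over time, the same `hurst_exponent` could be applied to a rolling window of the spread and plotted. The window length and `max_lag` are tuning choices that this sketch does not settle.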