ml-finance-python
python scripts for finance machine learning
git clone https://9o.is/git/ml-finance-python.git
06_transfer_learning.ipynb
1 {
2 "cells": [
3 {
4 "cell_type": "markdown",
5 "metadata": {},
6 "source": [
7 "# How to further train a pre-trained model"
8 ]
9 },
10 {
11 "cell_type": "markdown",
12 "metadata": {},
13 "source": [
14 "We will demonstrate how to freeze some or all of the layers of a pre-trained model and continue training using a new fully-connected set of layers and data with a different format."
15 ]
16 },
17 {
18 "cell_type": "markdown",
19 "metadata": {},
20 "source": [
21 "## Imports & Settings"
22 ]
23 },
24 {
25 "cell_type": "code",
26 "execution_count": 116,
27 "metadata": {},
28 "outputs": [],
29 "source": [
32 "import numpy as np\n",
33 "from pathlib import Path\n",
34 "\n",
35 "from keras.datasets import cifar10\n",
36 "from keras.utils import to_categorical\n",
37 "from keras.preprocessing.image import ImageDataGenerator\n",
38 "from keras.applications.vgg16 import VGG16\n",
39 "from keras.layers import Dense, Flatten, Dropout\n",
40 "from keras.models import Sequential, Model \n",
41 "from keras.callbacks import ModelCheckpoint, TensorBoard\n",
42 "import matplotlib.pyplot as plt\n",
43 "%matplotlib inline"
44 ]
45 },
46 {
47 "cell_type": "markdown",
48 "metadata": {},
49 "source": [
50 "## Load CIFAR-10 Dataset"
51 ]
52 },
53 {
54 "cell_type": "markdown",
55 "metadata": {},
56 "source": [
57 "CIFAR-10 ships with Keras: the first call to cifar10.load_data() downloads the dataset, which contains 50,000 training and 10,000 test images (32 x 32 pixels, RGB) in 10 classes."
58 ]
59 },
60 {
61 "cell_type": "code",
62 "execution_count": 68,
63 "metadata": {},
64 "outputs": [],
65 "source": [
66 "(X_train, y_train), (X_test, y_test) = cifar10.load_data()"
67 ]
68 },
69 {
70 "cell_type": "code",
71 "execution_count": 69,
72 "metadata": {},
73 "outputs": [],
74 "source": [
75 "cifar10_labels = {0: 'airplane',\n",
76 " 1: 'automobile',\n",
77 " 2: 'bird',\n",
78 " 3: 'cat',\n",
79 " 4: 'deer',\n",
80 " 5: 'dog',\n",
81 " 6: 'frog',\n",
82 " 7: 'horse',\n",
83 " 8: 'ship',\n",
84 " 9: 'truck'}"
85 ]
86 },
87 {
88 "cell_type": "code",
89 "execution_count": 70,
90 "metadata": {},
91 "outputs": [],
92 "source": [
93 "num_classes = len(cifar10_labels)"
94 ]
95 },
96 {
97 "cell_type": "code",
98 "execution_count": 71,
99 "metadata": {},
100 "outputs": [],
101 "source": [
102 "y_train = to_categorical(y_train, num_classes)\n",
103 "y_test = to_categorical(y_test, num_classes)"
104 ]
105 },
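{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the one-hot step above concrete, here is a minimal NumPy sketch of what to_categorical produces for integer class labels. The one_hot helper and the sample labels are hypothetical and only serve as illustration; they are not part of the notebook's pipeline.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"def one_hot(labels, num_classes):\n",
"    # Hypothetical helper: row i of the identity matrix encodes class i\n",
"    return np.eye(num_classes, dtype='float32')[np.asarray(labels).ravel()]\n",
"\n",
"# CIFAR-10 labels arrive with shape (n, 1); three made-up labels here\n",
"sample = np.array([[3], [5], [9]])\n",
"encoded = one_hot(sample, 10)\n",
"print(encoded.shape)           # (3, 10)\n",
"print(encoded.argmax(axis=1))  # [3 5 9]\n",
"```\n",
"\n",
"Taking the argmax of each row recovers the original label, which is how predictions are compared with y_test at the end of the notebook."
]
},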
106 {
107 "cell_type": "code",
108 "execution_count": 72,
109 "metadata": {},
110 "outputs": [],
111 "source": [
112 "# X_train, X_valid = X_train[5000:], X_train[:5000]\n",
113 "# y_train, y_valid = y_train[5000:], y_train[:5000]"
114 ]
115 },
116 {
117 "cell_type": "markdown",
118 "metadata": {},
119 "source": [
120 "## Load the Pre-trained VGG16 Convolutional Base"
121 ]
122 },
123 {
124 "cell_type": "markdown",
125 "metadata": {},
126 "source": [
127 "We apply the VGG16 weights, pre-trained on ImageNet, to the much smaller 32 x 32 CIFAR10 images. Note that we indicate the new input size upon import and drop the fully-connected top layers with include_top=False. As the summary shows, all layers are still trainable at this point:"
128 ]
129 },
130 {
131 "cell_type": "code",
132 "execution_count": 118,
133 "metadata": {},
134 "outputs": [
135 {
136 "name": "stdout",
137 "output_type": "stream",
138 "text": [
139 "_________________________________________________________________\n",
140 "Layer (type) Output Shape Param # \n",
141 "=================================================================\n",
142 "input_7 (InputLayer) (None, 32, 32, 3) 0 \n",
143 "_________________________________________________________________\n",
144 "block1_conv1 (Conv2D) (None, 32, 32, 64) 1792 \n",
145 "_________________________________________________________________\n",
146 "block1_conv2 (Conv2D) (None, 32, 32, 64) 36928 \n",
147 "_________________________________________________________________\n",
148 "block1_pool (MaxPooling2D) (None, 16, 16, 64) 0 \n",
149 "_________________________________________________________________\n",
150 "block2_conv1 (Conv2D) (None, 16, 16, 128) 73856 \n",
151 "_________________________________________________________________\n",
152 "block2_conv2 (Conv2D) (None, 16, 16, 128) 147584 \n",
153 "_________________________________________________________________\n",
154 "block2_pool (MaxPooling2D) (None, 8, 8, 128) 0 \n",
155 "_________________________________________________________________\n",
156 "block3_conv1 (Conv2D) (None, 8, 8, 256) 295168 \n",
157 "_________________________________________________________________\n",
158 "block3_conv2 (Conv2D) (None, 8, 8, 256) 590080 \n",
159 "_________________________________________________________________\n",
160 "block3_conv3 (Conv2D) (None, 8, 8, 256) 590080 \n",
161 "_________________________________________________________________\n",
162 "block3_pool (MaxPooling2D) (None, 4, 4, 256) 0 \n",
163 "_________________________________________________________________\n",
164 "block4_conv1 (Conv2D) (None, 4, 4, 512) 1180160 \n",
165 "_________________________________________________________________\n",
166 "block4_conv2 (Conv2D) (None, 4, 4, 512) 2359808 \n",
167 "_________________________________________________________________\n",
168 "block4_conv3 (Conv2D) (None, 4, 4, 512) 2359808 \n",
169 "_________________________________________________________________\n",
170 "block4_pool (MaxPooling2D) (None, 2, 2, 512) 0 \n",
171 "_________________________________________________________________\n",
172 "block5_conv1 (Conv2D) (None, 2, 2, 512) 2359808 \n",
173 "_________________________________________________________________\n",
174 "block5_conv2 (Conv2D) (None, 2, 2, 512) 2359808 \n",
175 "_________________________________________________________________\n",
176 "block5_conv3 (Conv2D) (None, 2, 2, 512) 2359808 \n",
177 "_________________________________________________________________\n",
178 "block5_pool (MaxPooling2D) (None, 1, 1, 512) 0 \n",
179 "=================================================================\n",
180 "Total params: 14,714,688\n",
181 "Trainable params: 14,714,688\n",
182 "Non-trainable params: 0\n",
183 "_________________________________________________________________\n"
184 ]
185 }
186 ],
187 "source": [
188 "vgg16 = VGG16(include_top=False, input_shape =X_train.shape[1:])\n",
189 "vgg16.summary()"
190 ]
191 },
192 {
193 "cell_type": "markdown",
194 "metadata": {},
195 "source": [
196 "## Freeze model layers"
197 ]
198 },
199 {
200 "cell_type": "markdown",
201 "metadata": {},
202 "source": [
203 "### Freeze all pre-trained layers"
204 ]
205 },
206 {
207 "cell_type": "code",
208 "execution_count": 120,
209 "metadata": {},
210 "outputs": [],
211 "source": [
212 "for layer in vgg16.layers:\n",
213 "    layer.trainable = False"
214 ]
215 },
216 {
217 "cell_type": "code",
218 "execution_count": 98,
219 "metadata": {},
220 "outputs": [
221 {
222 "name": "stdout",
223 "output_type": "stream",
224 "text": [
225 "_________________________________________________________________\n",
226 "Layer (type) Output Shape Param # \n",
227 "=================================================================\n",
228 "input_6 (InputLayer) (None, 32, 32, 3) 0 \n",
229 "_________________________________________________________________\n",
230 "block1_conv1 (Conv2D) (None, 32, 32, 64) 1792 \n",
231 "_________________________________________________________________\n",
232 "block1_conv2 (Conv2D) (None, 32, 32, 64) 36928 \n",
233 "_________________________________________________________________\n",
234 "block1_pool (MaxPooling2D) (None, 16, 16, 64) 0 \n",
235 "_________________________________________________________________\n",
236 "block2_conv1 (Conv2D) (None, 16, 16, 128) 73856 \n",
237 "_________________________________________________________________\n",
238 "block2_conv2 (Conv2D) (None, 16, 16, 128) 147584 \n",
239 "_________________________________________________________________\n",
240 "block2_pool (MaxPooling2D) (None, 8, 8, 128) 0 \n",
241 "_________________________________________________________________\n",
242 "block3_conv1 (Conv2D) (None, 8, 8, 256) 295168 \n",
243 "_________________________________________________________________\n",
244 "block3_conv2 (Conv2D) (None, 8, 8, 256) 590080 \n",
245 "_________________________________________________________________\n",
246 "block3_conv3 (Conv2D) (None, 8, 8, 256) 590080 \n",
247 "_________________________________________________________________\n",
248 "block3_pool (MaxPooling2D) (None, 4, 4, 256) 0 \n",
249 "_________________________________________________________________\n",
250 "block4_conv1 (Conv2D) (None, 4, 4, 512) 1180160 \n",
251 "_________________________________________________________________\n",
252 "block4_conv2 (Conv2D) (None, 4, 4, 512) 2359808 \n",
253 "_________________________________________________________________\n",
254 "block4_conv3 (Conv2D) (None, 4, 4, 512) 2359808 \n",
255 "_________________________________________________________________\n",
256 "block4_pool (MaxPooling2D) (None, 2, 2, 512) 0 \n",
257 "_________________________________________________________________\n",
258 "block5_conv1 (Conv2D) (None, 2, 2, 512) 2359808 \n",
259 "_________________________________________________________________\n",
260 "block5_conv2 (Conv2D) (None, 2, 2, 512) 2359808 \n",
261 "_________________________________________________________________\n",
262 "block5_conv3 (Conv2D) (None, 2, 2, 512) 2359808 \n",
263 "_________________________________________________________________\n",
264 "block5_pool (MaxPooling2D) (None, 1, 1, 512) 0 \n",
265 "=================================================================\n",
266 "Total params: 14,714,688\n",
267 "Trainable params: 0\n",
268 "Non-trainable params: 14,714,688\n",
269 "_________________________________________________________________\n"
270 ]
271 }
272 ],
273 "source": [
274 "vgg16.summary()"
275 ]
276 },
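{
"cell_type": "markdown",
"metadata": {},
"source": [
"The loop above freezes every layer. To freeze only some of them, one option is to set the trainable flag by layer name. The sketch below is an assumption for illustration, not something this notebook runs: it keeps the last convolutional block (block5) trainable, and it passes weights=None only to avoid downloading the ImageNet weights while inspecting the flags.\n",
"\n",
"```python\n",
"from keras.applications.vgg16 import VGG16\n",
"\n",
"# weights=None skips the ImageNet download; this sketch only inspects the trainable flags\n",
"base = VGG16(include_top=False, weights=None, input_shape=(32, 32, 3))\n",
"\n",
"for layer in base.layers:\n",
"    # Keep the last convolutional block trainable, freeze everything else\n",
"    layer.trainable = layer.name.startswith('block5')\n",
"\n",
"# List the layers that would still be updated during training\n",
"print([layer.name for layer in base.layers if layer.trainable])\n",
"```\n",
"\n",
"Changes to the trainable flag take effect when the model is compiled, so the flags should be set before calling compile."
]
},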
277 {
278 "cell_type": "markdown",
279 "metadata": {},
280 "source": [
281 "### Add new layers to model"
282 ]
283 },
284 {
285 "cell_type": "markdown",
286 "metadata": {},
287 "source": [
288 "We use Keras’ functional API to define the vgg16 output as input into a new set of fully-connected layers like so:"
289 ]
290 },
291 {
292 "cell_type": "code",
293 "execution_count": 99,
294 "metadata": {},
295 "outputs": [],
296 "source": [
297 "# Add custom fully-connected layers on top of the VGG16 output\n",
298 "x = vgg16.output\n",
299 "x = Flatten()(x)\n",
300 "x = Dense(512, activation=\"relu\")(x)\n",
301 "x = Dropout(0.5)(x)\n",
302 "x = Dense(256, activation=\"relu\")(x)\n",
303 "predictions = Dense(num_classes, activation=\"softmax\")(x)"
304 ]
305 },
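{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough check on the size of the new head, the arithmetic below counts its parameters. It assumes the (None, 1, 1, 512) block5_pool output shown in the summary above and 10 output classes; dense_params is a hypothetical helper, and Dropout and Flatten add no parameters.\n",
"\n",
"```python\n",
"def dense_params(n_in, n_out):\n",
"    # Each output unit has n_in weights plus one bias\n",
"    return n_in * n_out + n_out\n",
"\n",
"flat = 1 * 1 * 512  # block5_pool output (1, 1, 512) after Flatten\n",
"head = [dense_params(flat, 512), dense_params(512, 256), dense_params(256, 10)]\n",
"print(head)       # [262656, 131328, 2570]\n",
"print(sum(head))  # 396554\n",
"```\n",
"\n",
"With the VGG16 base frozen, these roughly 0.4 million parameters are the only ones updated during training, compared with about 14.7 million in the base."
]
},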
306 {
307 "cell_type": "markdown",
308 "metadata": {},
309 "source": [
310 "We define a new model in terms of inputs and output, and proceed from there on as before:"
311 ]
312 },
313 {
314 "cell_type": "code",
315 "execution_count": 100,
316 "metadata": {},
317 "outputs": [],
318 "source": [
319 "transfer_model = Model(inputs = vgg16.input, \n",
320 " outputs = predictions)"
321 ]
322 },
323 {
324 "cell_type": "code",
325 "execution_count": 101,
326 "metadata": {},
327 "outputs": [],
328 "source": [
329 "transfer_model.compile(loss = 'categorical_crossentropy', \n",
330 " optimizer = 'Adam', \n",
331 " metrics=[\"accuracy\"])"
332 ]
333 },
334 {
335 "cell_type": "code",
336 "execution_count": 102,
337 "metadata": {},
338 "outputs": [],
339 "source": [
340 "validation_split = .1"
341 ]
342 },
343 {
344 "cell_type": "markdown",
345 "metadata": {},
346 "source": [
347 "We use a more elaborate ImageDataGenerator that also defines a validation_split:"
348 ]
349 },
350 {
351 "cell_type": "code",
352 "execution_count": 103,
353 "metadata": {},
354 "outputs": [],
355 "source": [
356 "datagen = ImageDataGenerator(\n",
357 " rescale=1. / 255,\n",
358 " horizontal_flip=True,\n",
359 " fill_mode='nearest',\n",
360 " zoom_range=0.1,\n",
361 " width_shift_range=0.1,\n",
362 " height_shift_range=0.1,\n",
363 " rotation_range=30,\n",
364 " validation_split=validation_split)"
365 ]
366 },
367 {
368 "cell_type": "code",
369 "execution_count": 104,
370 "metadata": {},
371 "outputs": [],
372 "source": [
373 "batch_size = 32\n",
374 "epochs = 10"
375 ]
376 },
377 {
378 "cell_type": "markdown",
379 "metadata": {},
380 "source": [
381 "We define both training and validation generators for the fit_generator method:"
382 ]
383 },
384 {
385 "cell_type": "code",
386 "execution_count": 105,
387 "metadata": {},
388 "outputs": [],
389 "source": [
390 "train_generator = datagen.flow(X_train, y_train,\n",
391 "                               batch_size=batch_size,\n",
392 "                               subset='training')\n",
393 "val_generator = datagen.flow(X_train, y_train,\n",
394 "                             batch_size=batch_size,\n",
395 "                             subset='validation')"
396 ]
397 },
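{
"cell_type": "markdown",
"metadata": {},
"source": [
"For orientation, the sketch below works out how many images and full batches each subset holds for the settings used here. It is plain arithmetic and assumes the 50,000-image CIFAR-10 training set; the variable names simply mirror those in the notebook.\n",
"\n",
"```python\n",
"n_samples = 50000        # size of the CIFAR-10 training set\n",
"validation_split = 0.1\n",
"batch_size = 32\n",
"\n",
"n_val = int(n_samples * validation_split)\n",
"n_train = n_samples - n_val\n",
"\n",
"print(n_train, n_val)         # 45000 5000\n",
"print(n_train // batch_size)  # 1406 full batches per pass over the training subset\n",
"print(n_val // batch_size)    # 156 full batches per pass over the validation subset\n",
"```\n",
"\n",
"Generators created with flow loop indefinitely, so a step count above these values simply starts another pass over the same subset."
]
},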
398 {
399 "cell_type": "code",
400 "execution_count": 108,
401 "metadata": {},
402 "outputs": [],
403 "source": [
404 "vgg16_path = 'models/cifar10.transfer.vgg16.weights.best.hdf5'\n",
405 "Path(vgg16_path).parent.mkdir(parents=True, exist_ok=True)  # make sure the target directory exists\n",
406 "checkpointer = ModelCheckpoint(filepath=vgg16_path, verbose=1,\n",
407 "                               save_best_only=True)"
408 ]
409 },
410 {
411 "cell_type": "markdown",
412 "metadata": {},
413 "source": [
414 "And now we proceed to train the model:"
415 ]
416 },
417 {
418 "cell_type": "code",
419 "execution_count": 109,
420 "metadata": {},
421 "outputs": [
422 {
423 "name": "stdout",
424 "output_type": "stream",
425 "text": [
426 "Epoch 1/10\n",
427 "1562/1562 [==============================] - 16s 10ms/step - loss: 1.5325 - acc: 0.4553 - val_loss: 1.3096 - val_acc: 0.5438\n",
428 "\n",
429 "Epoch 00001: val_loss improved from inf to 1.30961, saving model to models/cifar10.transfer.vgg16.weights.best.hdf5\n",
430 "Epoch 2/10\n",
431 "1562/1562 [==============================] - 15s 10ms/step - loss: 1.3717 - acc: 0.5138 - val_loss: 1.2726 - val_acc: 0.5532\n",
432 "\n",
433 "Epoch 00002: val_loss improved from 1.30961 to 1.27260, saving model to models/cifar10.transfer.vgg16.weights.best.hdf5\n",
434 "Epoch 3/10\n",
435 "1562/1562 [==============================] - 15s 10ms/step - loss: 1.3253 - acc: 0.5339 - val_loss: 1.2515 - val_acc: 0.5591\n",
436 "\n",
437 "Epoch 00003: val_loss improved from 1.27260 to 1.25149, saving model to models/cifar10.transfer.vgg16.weights.best.hdf5\n",
438 "Epoch 4/10\n",
439 "1562/1562 [==============================] - 15s 10ms/step - loss: 1.3060 - acc: 0.5410 - val_loss: 1.2249 - val_acc: 0.5715\n",
440 "\n",
441 "Epoch 00004: val_loss improved from 1.25149 to 1.22492, saving model to models/cifar10.transfer.vgg16.weights.best.hdf5\n",
442 "Epoch 5/10\n",
443 "1562/1562 [==============================] - 15s 10ms/step - loss: 1.2715 - acc: 0.5509 - val_loss: 1.2011 - val_acc: 0.5766\n",
444 "\n",
445 "Epoch 00005: val_loss improved from 1.22492 to 1.20108, saving model to models/cifar10.transfer.vgg16.weights.best.hdf5\n",
446 "Epoch 6/10\n",
447 "1562/1562 [==============================] - 15s 10ms/step - loss: 1.2526 - acc: 0.5595 - val_loss: 1.1950 - val_acc: 0.5868\n",
448 "\n",
449 "Epoch 00006: val_loss improved from 1.20108 to 1.19496, saving model to models/cifar10.transfer.vgg16.weights.best.hdf5\n",
450 "Epoch 7/10\n",
451 "1562/1562 [==============================] - 15s 10ms/step - loss: 1.2387 - acc: 0.5635 - val_loss: 1.1926 - val_acc: 0.5783\n",
452 "\n",
453 "Epoch 00007: val_loss improved from 1.19496 to 1.19262, saving model to models/cifar10.transfer.vgg16.weights.best.hdf5\n",
454 "Epoch 8/10\n",
455 "1562/1562 [==============================] - 15s 10ms/step - loss: 1.2249 - acc: 0.5678 - val_loss: 1.1779 - val_acc: 0.5948\n",
456 "\n",
457 "Epoch 00008: val_loss improved from 1.19262 to 1.17794, saving model to models/cifar10.transfer.vgg16.weights.best.hdf5\n",
458 "Epoch 9/10\n",
459 "1562/1562 [==============================] - 15s 10ms/step - loss: 1.2177 - acc: 0.5700 - val_loss: 1.1687 - val_acc: 0.5927\n",
460 "\n",
461 "Epoch 00009: val_loss improved from 1.17794 to 1.16873, saving model to models/cifar10.transfer.vgg16.weights.best.hdf5\n",
462 "Epoch 10/10\n",
463 "1562/1562 [==============================] - 15s 10ms/step - loss: 1.2042 - acc: 0.5739 - val_loss: 1.1826 - val_acc: 0.5862\n",
464 "\n",
465 "Epoch 00010: val_loss did not improve from 1.16873\n"
466 ]
467 }
468 ],
469 "source": [
470 "transfer_model.fit_generator(train_generator,\n",
471 " steps_per_epoch=X_train.shape[0] // batch_size,\n",
472 " epochs=epochs,\n",
473 " validation_data=val_generator,\n",
474 " validation_steps=int(X_train.shape[0] * validation_split) // batch_size,\n",
475 " callbacks=[checkpointer],\n",
476 " verbose=1)"
477 ]
478 },
479 {
480 "cell_type": "code",
481 "execution_count": 111,
482 "metadata": {},
483 "outputs": [],
484 "source": [
485 "# load the weights with the lowest validation loss (ModelCheckpoint monitors val_loss by default)\n",
486 "transfer_model.load_weights(vgg16_path)"
487 ]
488 },
489 {
490 "cell_type": "code",
491 "execution_count": 112,
492 "metadata": {},
493 "outputs": [
494 {
495 "name": "stdout",
496 "output_type": "stream",
497 "text": [
498 "10000/10000 [==============================] - 2s 236us/step\n"
499 ]
500 },
501 {
502 "data": {
503 "text/plain": [
504 "0.3587"
505 ]
506 },
507 "execution_count": 112,
508 "metadata": {},
509 "output_type": "execute_result"
510 }
511 ],
512 "source": [
513 "transfer_model.evaluate(X_test, y_test)[1]"
514 ]
515 },
516 {
517 "cell_type": "markdown",
518 "metadata": {},
519 "source": [
520 "### Test Set Classification Accuracy"
521 ]
522 },
523 {
524 "cell_type": "markdown",
525 "metadata": {},
526 "source": [
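527 "Ten epochs yield a mediocre test accuracy of 35.87%, well below the validation accuracy of about 59%. One likely contributor is that X_test is evaluated on raw pixel values, whereas the generators rescale training and validation images by 1/255. In addition, the assumption that ImageNet features translate to much smaller images is questionable. The example still serves to illustrate the workflow."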
528 ]
529 },
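{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since the training generator divides pixel values by 255, inputs at evaluation time should be scaled the same way. The sketch below uses a hypothetical rescale helper and random stand-in images to show the preprocessing; it was not run against the model here, so no accuracy figure is claimed.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"def rescale(images):\n",
"    # Hypothetical helper mirroring rescale=1. / 255 in the generator\n",
"    return images.astype('float32') / 255.\n",
"\n",
"# Made-up batch of two 32 x 32 RGB images with raw pixel values 0..255\n",
"raw = np.random.randint(0, 256, size=(2, 32, 32, 3), dtype=np.uint8)\n",
"scaled = rescale(raw)\n",
"print(scaled.dtype, scaled.min() >= 0.0, scaled.max() <= 1.0)  # float32 True True\n",
"\n",
"# In this notebook, the same idea would read:\n",
"# transfer_model.evaluate(rescale(X_test), y_test)\n",
"```\n",
"\n",
"Keeping the preprocessing in one helper makes it harder for training and evaluation inputs to drift apart."
]
},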
530 {
531 "cell_type": "code",
532 "execution_count": 114,
533 "metadata": {},
534 "outputs": [],
535 "source": [
536 "# get index of predicted class for each image in test set\n",
537 "vgg16_predictions = np.argmax(transfer_model.predict(X_test), axis=1)"
538 ]
539 },
540 {
541 "cell_type": "code",
542 "execution_count": 115,
543 "metadata": {},
544 "outputs": [
545 {
546 "name": "stdout",
547 "output_type": "stream",
548 "text": [
549 "\n",
550 "Test accuracy: 35.87%\n"
551 ]
552 }
553 ],
554 "source": [
555 "test_accuracy = np.mean(vgg16_predictions == np.argmax(y_test, axis=1))\n",
556 "print('\\nTest accuracy: %.2f%%' % (100 * test_accuracy))"
557 ]
558 }
559 ],
560 "metadata": {
561 "kernelspec": {
562 "display_name": "Python 3",
563 "language": "python",
564 "name": "python3"
565 },
566 "language_info": {
567 "codemirror_mode": {
568 "name": "ipython",
569 "version": 3
570 },
571 "file_extension": ".py",
572 "mimetype": "text/x-python",
573 "name": "python",
574 "nbconvert_exporter": "python",
575 "pygments_lexer": "ipython3",
576 "version": "3.6.8"
577 },
578 "toc": {
579 "base_numbering": 1,
580 "nav_menu": {},
581 "number_sections": true,
582 "sideBar": true,
583 "skip_h1_title": true,
584 "title_cell": "Table of Contents",
585 "title_sidebar": "Contents",
586 "toc_cell": false,
587 "toc_position": {
588 "height": "calc(100% - 180px)",
589 "left": "10px",
590 "top": "150px",
591 "width": "318.55px"
592 },
593 "toc_section_display": true,
594 "toc_window_display": true
595 }
596 },
597 "nbformat": 4,
598 "nbformat_minor": 2
599 }