update samples from Release-3 as a part of 1.2.0 SDK stable release

2020-03-23 23:11:53 +00:00
parent 0401128638
commit 2218af619f
18 changed files with 450 additions and 240 deletions
--- a/tutorials/image-classification-mnist-data/img-classification-part2-deploy.ipynb
+++ b/tutorials/image-classification-mnist-data/img-classification-part2-deploy.ipynb
@@ -39,7 +39,11 @@
    {
      "cell_type": "code",
      "execution_count": null,
-      "metadata": {},
+      "metadata": {
+        "tags": [
+          "register model from file"
+        ]
+      },
      "outputs": [],
      "source": [
        "# If you did NOT complete the tutorial, you can instead run this cell \n",
@@ -58,19 +62,7 @@
        "                        model_name=model_name,\n",
        "                        tags={\"data\": \"mnist\", \"model\": \"classification\"},\n",
        "                        description=\"Mnist handwriting recognition\",\n",
-        "                        workspace=ws)\n",
-        "\n",
-        "from azureml.core.environment import Environment\n",
-        "from azureml.core.conda_dependencies import CondaDependencies\n",
-        "\n",
-        "# to install required packages\n",
-        "env = Environment('tutorial-env')\n",
-        "cd = CondaDependencies.create(pip_packages=['azureml-dataprep[pandas,fuse]>=1.1.14', 'azureml-defaults'], conda_packages = ['scikit-learn==0.22.1'])\n",
-        "\n",
-        "env.python.conda_dependencies = cd\n",
-        "\n",
-        "# Register environment to re-use later\n",
-        "env.register(workspace = ws)"
+        "                        workspace=ws)"
      ]
    },
    {
@@ -106,16 +98,190 @@
        "print(\"Azure ML SDK Version: \", azureml.core.VERSION)"
      ]
    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "### Retrieve the model\n",
+        "\n",
+        "You registered a model in your workspace in the previous tutorial. Now, load this workspace and download the model to your local directory."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "tags": [
+          "load workspace",
+          "download model"
+        ]
+      },
+      "outputs": [],
+      "source": [
+        "from azureml.core import Workspace\n",
+        "from azureml.core.model import Model\n",
+        "import os\n",
+        "ws = Workspace.from_config()\n",
+        "model = Model(ws, 'sklearn_mnist')\n",
+        "\n",
+        "model.download(target_dir=os.getcwd(), exist_ok=True)\n",
+        "\n",
+        "# verify the downloaded model file\n",
+        "file_path = os.path.join(os.getcwd(), \"sklearn_mnist_model.pkl\")\n",
+        "\n",
+        "os.stat(file_path)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Test model locally\n",
+        "\n",
+        "Before deploying, make sure your model is working locally by:\n",
+        "* Downloading the test data if you haven't already\n",
+        "* Loading test data\n",
+        "* Predicting test data\n",
+        "* Examining the confusion matrix"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "### Download test data\n",
+        "If you haven't already, download the test data to the **./data/** directory."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "from azureml.core import Dataset\n",
+        "from azureml.opendatasets import MNIST\n",
+        "\n",
+        "data_folder = os.path.join(os.getcwd(), 'data')\n",
+        "os.makedirs(data_folder, exist_ok=True)\n",
+        "\n",
+        "mnist_file_dataset = MNIST.get_file_dataset()\n",
+        "mnist_file_dataset.download(data_folder, overwrite=True)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "### Load test data\n",
+        "\n",
+        "Load the test data from the **./data/** directory created during the training tutorial."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "from utils import load_data\n",
+        "import os\n",
+        "\n",
+        "data_folder = os.path.join(os.getcwd(), 'data')\n",
+        "# note we also shrink the intensity values (X) from 0-255 to 0-1. This helps the model converge faster\n",
+        "X_test = load_data(os.path.join(data_folder, 't10k-images-idx3-ubyte.gz'), False) / 255.0\n",
+        "y_test = load_data(os.path.join(data_folder, 't10k-labels-idx1-ubyte.gz'), True).reshape(-1)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "### Predict test data\n",
+        "\n",
+        "Feed the test dataset to the model to get predictions."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "import pickle\n",
+        "import joblib\n",
+        "\n",
+        "clf = joblib.load(os.path.join(os.getcwd(), 'sklearn_mnist_model.pkl'))\n",
+        "y_hat = clf.predict(X_test)\n",
+        "print(y_hat)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "### Examine the confusion matrix\n",
+        "\n",
+        "Generate a confusion matrix to see how many samples from the test set are classified correctly. Notice the misclassified values for the incorrect predictions."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "from sklearn.metrics import confusion_matrix\n",
+        "\n",
+        "conf_mx = confusion_matrix(y_test, y_hat)\n",
+        "print(conf_mx)\n",
+        "print('Overall accuracy:', np.average(y_hat == y_test))"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "Use `matplotlib` to display the confusion matrix as a graph. In this graph, the X axis represents the predicted values, and the Y axis represents the actual values. The color of each cell represents the error rate: the lighter the color, the higher the error rate. For example, many 5's are misclassified as 3's, so you see a bright cell at (5,3)."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "# normalize the diagonal cells so that they don't overpower the rest of the cells when visualized\n",
+        "row_sums = conf_mx.sum(axis=1, keepdims=True)\n",
+        "norm_conf_mx = conf_mx / row_sums\n",
+        "np.fill_diagonal(norm_conf_mx, 0)\n",
+        "\n",
+        "fig = plt.figure(figsize=(8,5))\n",
+        "ax = fig.add_subplot(111)\n",
+        "cax = ax.matshow(norm_conf_mx, cmap=plt.cm.bone)\n",
+        "ticks = np.arange(0, 10, 1)\n",
+        "ax.set_xticks(ticks)\n",
+        "ax.set_yticks(ticks)\n",
+        "ax.set_xticklabels(ticks)\n",
+        "ax.set_yticklabels(ticks)\n",
+        "fig.colorbar(cax)\n",
+        "plt.ylabel('true labels', fontsize=14)\n",
+        "plt.xlabel('predicted values', fontsize=14)\n",
+        "plt.savefig('conf.png')\n",
+        "plt.show()"
+      ]
+    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Deploy as web service\n",
        "\n",
-        "Deploy the model as a web service hosted in ACI. \n",
+        "Once you've tested the model and are satisfied with the results, deploy the model as a web service hosted in ACI. \n",
        "\n",
        "To build the correct environment for ACI, provide the following:\n",
        "* A scoring script to show how to use the model\n",
+        "* An environment file to show what packages need to be installed\n",
        "* A configuration file to build the ACI\n",
        "* The model you trained before\n",
        "\n",
@@ -158,6 +324,52 @@
        "    return y_hat.tolist()"
      ]
    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "### Create environment file\n",
+        "\n",
+        "Next, create an environment file, called `myenv.yml`, that specifies all of the script's package dependencies. This file is used to ensure that all of those dependencies are installed in the Docker image. This model needs `scikit-learn` and `azureml-defaults`."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "tags": [
+          "set conda dependencies"
+        ]
+      },
+      "outputs": [],
+      "source": [
+        "from azureml.core.conda_dependencies import CondaDependencies \n",
+        "\n",
+        "myenv = CondaDependencies()\n",
+        "myenv.add_conda_package(\"scikit-learn==0.22.1\")\n",
+        "myenv.add_pip_package(\"azureml-defaults\")\n",
+        "\n",
+        "with open(\"myenv.yml\",\"w\") as f:\n",
+        "    f.write(myenv.serialize_to_string())"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "Review the content of the `myenv.yml` file."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "with open(\"myenv.yml\",\"r\") as f:\n",
+        "    print(f.read())"
+      ]
+    },
    {
      "cell_type": "markdown",
      "metadata": {},
@@ -220,11 +432,6 @@
        "from azureml.core.webservice import Webservice\n",
        "from azureml.core.model import InferenceConfig\n",
        "from azureml.core.environment import Environment\n",
-        "from azureml.core import Workspace\n",
-        "from azureml.core.model import Model\n",
-        "\n",
-        "ws = Workspace.from_config()\n",
-        "model = Model(ws, 'sklearn_mnist')\n",
        "\n",
        "\n",
        "myenv = Environment.get(workspace=ws, name=\"tutorial-env\", version=\"1\")\n",
@@ -263,148 +470,14 @@
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-        "## Test the model\n"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "### Download test data\n",
-        "Download the test data to the **./data/** directory"
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "import os\n",
-        "from azureml.core import Dataset\n",
-        "from azureml.opendatasets import MNIST\n",
-        "\n",
-        "data_folder = os.path.join(os.getcwd(), 'data')\n",
-        "os.makedirs(data_folder, exist_ok=True)\n",
-        "\n",
-        "mnist_file_dataset = MNIST.get_file_dataset()\n",
-        "mnist_file_dataset.download(data_folder, overwrite=True)"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "### Load test data\n",
-        "\n",
-        "Load the test data from the **./data/** directory created during the training tutorial."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "from utils import load_data\n",
-        "import os\n",
-        "\n",
-        "data_folder = os.path.join(os.getcwd(), 'data')\n",
-        "# note we also shrink the intensity values (X) from 0-255 to 0-1. This helps the neural network converge faster\n",
-        "X_test = load_data(os.path.join(data_folder, 't10k-images-idx3-ubyte.gz'), False) / 255.0\n",
-        "y_test = load_data(os.path.join(data_folder, 't10k-labels-idx1-ubyte.gz'), True).reshape(-1)"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "### Predict test data\n",
-        "\n",
-        "Feed the test dataset to the model to get predictions.\n",
+        "## Test deployed service\n",
        "\n",
+        "Earlier you scored all the test data with the local version of the model. Now, you can test the deployed model with a random sample of 30 images from the test data.  \n",
        "\n",
        "The following code goes through these steps:\n",
        "1. Send the data as a JSON array to the web service hosted in ACI. \n",
        "\n",
-        "1. Use the SDK's `run` API to invoke the service. You can also make raw calls using any HTTP tool such as curl."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "import json\n",
-        "test = json.dumps({\"data\": X_test.tolist()})\n",
-        "test = bytes(test, encoding='utf8')\n",
-        "y_hat = service.run(input_data=test)"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "###  Examine the confusion matrix\n",
-        "\n",
-        "Generate a confusion matrix to see how many samples from the test set are classified correctly. Notice the mis-classified value for the incorrect predictions."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "from sklearn.metrics import confusion_matrix\n",
-        "\n",
-        "conf_mx = confusion_matrix(y_test, y_hat)\n",
-        "print(conf_mx)\n",
-        "print('Overall accuracy:', np.average(y_hat == y_test))"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "Use `matplotlib` to display the confusion matrix as a graph. In this graph, the X axis represents the actual values, and the Y axis represents the predicted values. The color in each grid represents the error rate. The lighter the color, the higher the error rate is. For example, many 5's are mis-classified as 3's. Hence you see a bright grid at (5,3)."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "# normalize the diagonal cells so that they don't overpower the rest of the cells when visualized\n",
-        "row_sums = conf_mx.sum(axis=1, keepdims=True)\n",
-        "norm_conf_mx = conf_mx / row_sums\n",
-        "np.fill_diagonal(norm_conf_mx, 0)\n",
-        "\n",
-        "fig = plt.figure(figsize=(8,5))\n",
-        "ax = fig.add_subplot(111)\n",
-        "cax = ax.matshow(norm_conf_mx, cmap=plt.cm.bone)\n",
-        "ticks = np.arange(0, 10, 1)\n",
-        "ax.set_xticks(ticks)\n",
-        "ax.set_yticks(ticks)\n",
-        "ax.set_xticklabels(ticks)\n",
-        "ax.set_yticklabels(ticks)\n",
-        "fig.colorbar(cax)\n",
-        "plt.ylabel('true labels', fontsize=14)\n",
-        "plt.xlabel('predicted values', fontsize=14)\n",
-        "plt.savefig('conf.png')\n",
-        "plt.show()"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "## Show predictions\n",
-        "\n",
-        "Test the deployed model with a random sample of 30 images from the test data.  \n",
-        "\n",
+        "1. Use the SDK's `run` API to invoke the service. You can also make raw calls using any HTTP tool such as curl.\n",
        "\n",
        "1. Print the returned predictions and plot them along with the input images. Red font and inverse image (white on black) is used to highlight the misclassified samples. \n",
        "\n",
@@ -562,7 +635,7 @@
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
-      "version": "3.6.6"
+      "version": "3.7.6"
    },
    "msauthor": "sgilley"
  },