From 7047f7629940584ab8647290739163cd30da35ae Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Shan=C3=A9=20Winner?=
 <43390034+swinner95@users.noreply.github.com>
Date: Wed, 21 Aug 2019 10:10:56 -0700
Subject: [PATCH] Delete custom-python-transforms.ipynb

---
 .../custom-python-transforms.ipynb            | 231 ------------------
 1 file changed, 231 deletions(-)
 delete mode 100644 work-with-data/dataprep/how-to-guides/custom-python-transforms.ipynb

diff --git a/work-with-data/dataprep/how-to-guides/custom-python-transforms.ipynb b/work-with-data/dataprep/how-to-guides/custom-python-transforms.ipynb
deleted file mode 100644
index 43a3ca62..00000000
--- a/work-with-data/dataprep/how-to-guides/custom-python-transforms.ipynb
+++ /dev/null
@@ -1,231 +0,0 @@
-{
-  "cells": [
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/work-with-data/dataprep/how-to-guides/custom-python-transforms.png)"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "# Custom Python Transforms\n"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "There will be scenarios when the easiest thing for you to do is just to write some Python code. This SDK provides three extension points that you can use.\n",
-        "\n",
-        "1. New Script Column\n",
-        "2. New Script Filter\n",
-        "3. Transform Partition\n",
-        "\n",
-        "Each of these are supported in both the scale-up and the scale-out runtime. A key advantage of using these extension points is that you don't need to pull all of the data in order to create a dataframe. Your custom python code will be run just like other transforms, at scale, by partition, and typically in parallel."
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "## Initial data prep"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "We start by loading crime data."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "import azureml.dataprep as dprep\n",
-        "col = dprep.col\n",
-        "\n",
-        "dflow = dprep.read_csv(path='../data/crime-spring.csv')\n",
-        "dflow.head(5)"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "We trim the dataset down and keep only the columns we are interested in. "
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "dflow = dflow.keep_columns(['Case Number','Primary Type', 'Description', 'Latitude', 'Longitude'])\n",
-        "dflow = dflow.replace_na(columns=['Latitude', 'Longitude'], custom_na_list='')\n",
-        "dflow.head(5)"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "We look for null values using a filter. We found some, so now we'll look at a way to fill these missing values."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "dflow.filter(col('Latitude').is_null()).head(5)"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "## Transform Partition"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "We want to replace all null values with a 0, so we decide to use a handy pandas function. This code will be run by partition, not on all of the dataset at a time. This means that on a large dataset, this code may run in parallel as the runtime processes the data partition by partition."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "pt_dflow = dflow\n",
-        "dflow = pt_dflow.transform_partition(\"\"\"\n",
-        "def transform(df, index):\n",
-        "    df['Latitude'].fillna('0',inplace=True)\n",
-        "    df['Longitude'].fillna('0',inplace=True)\n",
-        "    return df\n",
-        "\"\"\")\n",
-        "dflow.head(5)"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "### Transform Partition With File"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "Being able to use any python code to manipulate your data as a pandas DataFrame is extremely useful for complex and specific data operations that DataPrep doesn't handle natively. Though the code isn't very testable unfortunately, it's just sitting inside a string.\n",
-        "So to improve code testability and ease of script writing there is another transform_partiton interface that takes the path to a python script which must contain a function matching the 'transform' signature defined above.\n",
-        "\n",
-        "The `script_path` argument should be a relative path to ensure Dataflow portability. Here `map_func.py` contains the same code as in the previous example."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "dflow = pt_dflow.transform_partition_with_file('../data/map_func.py')\n",
-        "dflow.head(5)"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "## New Script Column"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "We want to create a new column that has both the latitude and longitude. We can achieve it easily using [Data Prep expression](./add-column-using-expression.ipynb), which is faster in execution. Alternatively, We can do this using Python code by using the `new_script_column()` method on the dataflow. Note that we use custom Python code here for demo purpose only. In practise, you should always use Data Prep native functions as a preferred method, and use custom Python code when the functionality is not available in Data Prep. "
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "dflow = dflow.new_script_column(new_column_name='coordinates', insert_after='Longitude', script=\"\"\"\n",
-        "def newvalue(row):\n",
-        "    return '(' + row['Latitude'] + ', ' + row['Longitude'] + ')'\n",
-        "\"\"\")\n",
-        "dflow.head(5)"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "## New Script Filter"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "Now we want to filter the dataset down to only the crimes that incurred over $300 in loss. We can build a Python expression that returns True if we want to keep the row, and False to drop the row."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "dflow = dflow.new_script_filter(\"\"\"\n",
-        "def includerow(row):\n",
-        "    val = row['Description']\n",
-        "    return 'OVER $ 300' in val\n",
-        "\"\"\")\n",
-        "dflow.head(5)"
-      ]
-    }
-  ],
-  "metadata": {
-    "authors": [
-      {
-        "name": "sihhu"
-      }
-    ],
-    "kernelspec": {
-      "display_name": "Python 3.6",
-      "language": "python",
-      "name": "python36"
-    },
-    "language_info": {
-      "codemirror_mode": {
-        "name": "ipython",
-        "version": 3
-      },
-      "file_extension": ".py",
-      "mimetype": "text/x-python",
-      "name": "python",
-      "nbconvert_exporter": "python",
-      "pygments_lexer": "ipython3",
-      "version": "3.6.4"
-    },
-    "notice": "Copyright (c) Microsoft Corporation. All rights reserved. Licensed under the MIT License."
-  },
-  "nbformat": 4,
-  "nbformat_minor": 2
-}
\ No newline at end of file