From 7d552effb0fa6d2b599b1dead76d7a1023f86c04 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Shan=C3=A9=20Winner?=
 <43390034+swinner95@users.noreply.github.com>
Date: Wed, 21 Aug 2019 10:17:01 -0700
Subject: [PATCH] Delete split-column-by-example.ipynb

---
 .../split-column-by-example.ipynb             | 220 ------------------
 1 file changed, 220 deletions(-)
 delete mode 100644 work-with-data/dataprep/how-to-guides/split-column-by-example.ipynb

diff --git a/work-with-data/dataprep/how-to-guides/split-column-by-example.ipynb b/work-with-data/dataprep/how-to-guides/split-column-by-example.ipynb
deleted file mode 100644
index 02c74746..00000000
--- a/work-with-data/dataprep/how-to-guides/split-column-by-example.ipynb
+++ /dev/null
@@ -1,220 +0,0 @@
-{
-  "cells": [
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/work-with-data/dataprep/how-to-guides/split-column-by-example.png)"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "# Split column by example\n"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "DataPrep also offers you a way to easily split a column into multiple columns.\n",
-        "The SplitColumnByExampleBuilder class lets you generate a split program that works even when the cases are not trivial, as in the example below."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "import azureml.dataprep as dprep"
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "dflow = dprep.read_lines(path='../data/crime.txt')\n",
-        "df = dflow.head(10)"
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "df['Line'].iloc[0]"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "As you can see above, you can't split this particular file on the space character, because that would create too many columns.\n",
-        "That's where split_column_by_example could be quite useful."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "builder = dflow.builders.split_column_by_example('Line', keep_delimiters=True)"
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "builder.preview()"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "A couple of things to note here. No examples were given, and yet DataPrep was able to generate a quite reasonable split program.\n",
-        "We have passed keep_delimiters=True so we can see all the data split into columns. In practice, though, delimiters are rarely useful, so let's exclude them."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "builder.keep_delimiters = False\n",
-        "builder.preview()"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "This looks pretty good already, except that one case number is split into 2 columns. Taking the first row as an example, we want to keep the case number as \"HY329907\" instead of \"HY\" and \"329907\" separately.  \n",
-        "If we request generation of suggested examples we will get a list of examples that require input."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "suggestions = builder.generate_suggested_examples()\n",
-        "suggestions"
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "suggestions.iloc[0]['Line']"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "Having retrieved the source value, we can now provide an example of the desired split.\n",
-        "Notice that we chose not to split date and time but rather keep them together in one column."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "builder.add_example(example=(suggestions['Line'].iloc[0], ['10140490','HY329907','7/5/2015  23:50','050XX N NEWLAND AVE','820','THEFT']))"
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "builder.preview()"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "As we can see from the preview, some of the crime types (`Line_6`) do not show up as expected. Let's try to add one more example. "
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "builder.add_example(example=(df['Line'].iloc[1],['10139776','HY329265','7/5/2015  23:30','011XX W MORSE AVE','460','BATTERY']))\n",
-        "builder.preview()"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "This looks just like what we need. Let's get a dataflow with the split columns and drop the original column."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "dflow = builder.to_dataflow()\n",
-        "dflow = dflow.drop_columns(['Line'])\n",
-        "dflow.head(5)"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {},
-      "source": [
-        "Now we have successfully split the data into useful columns through examples."
-      ]
-    }
-  ],
-  "metadata": {
-    "authors": [
-      {
-        "name": "sihhu"
-      }
-    ],
-    "kernelspec": {
-      "display_name": "Python 3.6",
-      "language": "python",
-      "name": "python36"
-    },
-    "language_info": {
-      "codemirror_mode": {
-        "name": "ipython",
-        "version": 3
-      },
-      "file_extension": ".py",
-      "mimetype": "text/x-python",
-      "name": "python",
-      "nbconvert_exporter": "python",
-      "pygments_lexer": "ipython3",
-      "version": "3.6.8"
-    },
-    "notice": "Copyright (c) Microsoft Corporation. All rights reserved. Licensed under the MIT License."
-  },
-  "nbformat": 4,
-  "nbformat_minor": 2
-}
\ No newline at end of file