Mirror of https://github.com/Azure/MachineLearningNotebooks.git (synced 2025-12-20 17:45:10 -05:00)
Update notebooks
@@ -15,7 +15,7 @@
 "source": [
 "# AutoML 08: Remote Execution with DataStore\n",
 "\n",
-"In this sample accesses a data file on a remote DSVM through DataStore. Advantagets of using data store\n",
+"This sample accesses a data file on a remote DSVM through DataStore. Advantages of using data store are:\n",
 "1. DataStore secures the access details.\n",
 "2. DataStore supports read, write to blob and file store\n",
 "3. AutoML natively supports copying data from DataStore to DSVM\n",
@@ -23,8 +23,8 @@
 "Make sure you have executed the [00.configuration](00.configuration.ipynb) before running this notebook.\n",
 "\n",
 "In this notebook you would see\n",
-"1. Configuring the DSVM to allow files to be access directly by the get_data method.\n",
-"2. get_data returning data from a local file.\n",
+"1. Storing data in DataStore.\n",
+"2. get_data returning data from DataStore.\n",
 "\n"
 ]
 },
@@ -285,11 +285,11 @@
 " le = LabelEncoder()\n",
 " le.fit(df[\"Label\"].values)\n",
 " y = le.transform(df[\"Label\"].values)\n",
-" df = df.drop([\"Label\"], axis=1)\n",
+" X = df.drop([\"Label\"], axis=1)\n",
 "\n",
-" df_train, _, y_train, _ = train_test_split(df, y, test_size=0.1, random_state=42)\n",
+" X_train, _, y_train, _ = train_test_split(X, y, test_size=0.1, random_state=42)\n",
 "\n",
-" return { \"X\" : df.values, \"y\" : y }"
+" return { \"X\" : X_train.values, \"y\" : y_train }"
 ]
 },
 {
@@ -300,7 +300,7 @@
 "\n",
 "You can specify automl_settings as **kwargs** as well. Also note that you can use the get_data() symantic for local excutions too. \n",
 "\n",
-"<i>Note: For Remote DSVM and Batch AI you cannot pass Numpy arrays directly to the fit method.</i>\n",
+"<i>Note: For Remote DSVM and Batch AI you cannot pass Numpy arrays directly to AutoMLConfig.</i>\n",
 "\n",
 "|Property|Description|\n",
 "|-|-|\n",
@@ -342,7 +342,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## Training the Model <a class=\"anchor\" id=\"Training-the-model-Remote-DSVM\"></a>\n",
+"## Training the Models <a class=\"anchor\" id=\"Training-the-model-Remote-DSVM\"></a>\n",
 "\n",
 "For remote runs the execution is asynchronous, so you will see the iterations get populated as they complete. You can interact with the widgets/models even when the experiment is running to retreive the best model up to that point. Once you are satisfied with the model you can cancel a particular iteration or the whole run."
 ]
@@ -410,7 +410,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## Canceling runs\n",
+"## Canceling Runs\n",
 "You can cancel ongoing remote runs using the *cancel()* and *cancel_iteration()* functions"
 ]
 },
@@ -433,7 +433,7 @@
 "source": [
 "### Retrieve the Best Model\n",
 "\n",
-"Below we select the best pipeline from our iterations. The *get_output* method on automl_classifier returns the best run and the fitted model for the last *fit* invocation. There are overloads on *get_output* that allow you to retrieve the best run and fitted model for *any* logged metric or a particular *iteration*."
+"Below we select the best pipeline from our iterations. The *get_output* method returns the best run and the fitted model. There are overloads on *get_output* that allow you to retrieve the best run and fitted model for *any* logged metric or a particular *iteration*."
 ]
 },
 {
@@ -483,26 +483,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Register fitted model for deployment"
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": [
-"#description = 'AutoML Model'\n",
-"#tags = None\n",
-"#remote_run.register_model(description=description, tags=tags)\n",
-"#remote_run.model_id # Use this id to deploy the model as a web service in Azure"
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"### Testing the Fitted Model <a class=\"anchor\" id=\"Testing-the-Fitted-Model-Remote-DSVM\"></a>\n"
+"### Testing the Best Fitted Model <a class=\"anchor\" id=\"Testing-the-Fitted-Model-Remote-DSVM\"></a>\n"
 ]
 },
 {
@@ -523,11 +504,11 @@
 "le = LabelEncoder()\n",
 "le.fit(df[\"Label\"].values)\n",
 "y = le.transform(df[\"Label\"].values)\n",
-"df = df.drop([\"Label\"], axis=1)\n",
+"X = df.drop([\"Label\"], axis=1)\n",
 "\n",
-"_, df_test, _, y_test = train_test_split(df, y, test_size=0.1, random_state=42)\n",
+"_, X_test, _, y_test = train_test_split(X, y, test_size=0.1, random_state=42)\n",
 "\n",
-"ypred = fitted_model.predict(df_test.values)\n",
+"ypred = fitted_model.predict(X_test.values)\n",
 "\n",
 "ypred_strings = le.inverse_transform(ypred)\n",
 "ytest_strings = le.inverse_transform(y_test)\n",
@@ -541,6 +522,11 @@
 }
 ],
 "metadata": {
+"authors": [
+{
+"name": "savitam"
+}
+],
 "kernelspec": {
 "display_name": "Python 3.6",
 "language": "python",
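Note on the hunks at `-285,11` and `-523,11`: the old `get_data` computed a train split but still returned the full `df.values` and `y`, so the held-out rows used later for testing were also part of the training data. The sketch below reproduces the corrected pattern outside the notebook. It is a minimal illustration, not the notebook's code: `make_frame` and `get_test_split` are hypothetical helpers, the small DataFrame is made-up stand-in data for the file read from the DataStore, and `get_data` takes the frame as an argument here only to keep the example self-contained.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder


def make_frame():
    # Hypothetical stand-in for the data file the notebook reads from the DataStore.
    return pd.DataFrame({
        "f1": range(20),
        "f2": [v * 0.5 for v in range(20)],
        "Label": ["a", "b"] * 10,
    })


def encode_and_separate(df):
    # Encode string labels as integers and keep the features in their own frame.
    le = LabelEncoder()
    y = le.fit_transform(df["Label"].values)
    X = df.drop(["Label"], axis=1)
    return X, y


def get_data(df):
    # Return only the training portion, as the new side of the hunk does.
    X, y = encode_and_separate(df)
    X_train, _, y_train, _ = train_test_split(X, y, test_size=0.1, random_state=42)
    return {"X": X_train.values, "y": y_train}


def get_test_split(df):
    # Same test_size and random_state, so this is the complement of the training rows.
    X, y = encode_and_separate(df)
    _, X_test, _, y_test = train_test_split(X, y, test_size=0.1, random_state=42)
    return X_test.values, y_test


df = make_frame()
train = get_data(df)
X_test, y_test = get_test_split(df)
print(train["X"].shape, X_test.shape)
```

Because both calls pass the same `test_size` and `random_state`, the training rows returned by `get_data` and the rows in the test split do not overlap, which is what makes the later `fitted_model.predict(X_test.values)` check meaningful.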