Welcome to the watson-machine-learning-client (V4) documentation!
watson-machine-learning-client-V4 is a Python library for working with Watson Machine Learning services.
Train, store, and deploy your models, score them through the APIs, and integrate them into your applications.
NOTE: DEPRECATED!! This Watson Machine Learning client V4 Beta version is deprecated as of Sep 1st, 2020 and will be
discontinued at the end of the migration period. Migrate to the new V4 GA release of the IBM Watson Machine Learning Python client.
Refer to the documentation for the migration process in order to access new features.
NOTE: DEPRECATED!! The Python 3.6 framework is deprecated and will be removed on Jan 20th, 2021. It will be in read-only
mode starting Nov 20th, 2020; that is, you won't be able to create new assets using this client. For cloud, switch to using
the IBM Watson Machine Learning Python client with Python 3.7.
NOTE: DEPRECATED!! The Spark 2.3 framework is deprecated and will be removed on Dec 1st, 2020. For cloud, switch to using
the IBM Watson Machine Learning Python client with Spark 2.4.
Installation¶
No separate installation is required to use watson-machine-learning-client-V4 in IBM Cloud Pak™ for Data, because the package is pre-installed in the Jupyter notebook environment.
The package is available on PyPI. Use the command below to install it for use with IBM Cloud services.
$ pip install watson-machine-learning-client-V4
Note
This package is being replaced by the new package ibm-watson-machine-learning in the V4 GA release of the IBM Watson Machine Learning Python client. Refer to the new V4 GA release documentation for more details.
Requirements (Applicable only for IBM Cloud)¶
To create a Watson Machine Learning service instance, you can use link.
Supported machine learning frameworks¶
For the list of supported machine learning frameworks (models) on IBM Cloud, refer to the Watson Machine Learning Documentation.
For the list of supported machine learning frameworks (models) on IBM Cloud Pak™ for Data version 2.5, refer to the Watson Machine Learning Documentation.
For the list of supported machine learning frameworks (models) on IBM Cloud Pak™ for Data version 3.0.0, refer to the Watson Machine Learning Documentation.
API¶
To use the Watson Machine Learning APIs, you must create an instance of WatsonMachineLearningAPIClient with authentication details.
Authentication¶
Authentication for IBM Cloud
IBM Cloud users can create an instance of the Watson Machine Learning Python client by providing an IAM token or an apikey.
Example of creating the client using an apikey:
from watson_machine_learning_client import WatsonMachineLearningAPIClient
wml_credentials = {
"url": "https://us-south.ml.cloud.ibm.com",
"apikey":"***********",
"instance_id": "*****"
}
client = WatsonMachineLearningAPIClient(wml_credentials)
Example of creating the client using a token:
from watson_machine_learning_client import WatsonMachineLearningAPIClient
wml_credentials = {
"url": "https://us-south.ml.cloud.ibm.com",
"token":"***********",
"instance_id": "*****"
}
client = WatsonMachineLearningAPIClient(wml_credentials)
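The two credential dictionaries above differ only in whether they carry an apikey or a token. As a minimal sketch, the helper below assembles either form and rejects ambiguous input. The name build_cloud_credentials is hypothetical and not part of the client library; the field names are the ones shown in the examples above.

```python
from typing import Dict, Optional


def build_cloud_credentials(url: str, instance_id: str,
                            apikey: Optional[str] = None,
                            token: Optional[str] = None) -> Dict[str, str]:
    """Return a wml_credentials dict for IBM Cloud (hypothetical helper)."""
    # Exactly one of apikey / token must be supplied.
    if (apikey is None) == (token is None):
        raise ValueError("provide exactly one of apikey or token")
    credentials = {"url": url, "instance_id": instance_id}
    if apikey is not None:
        credentials["apikey"] = apikey
    else:
        credentials["token"] = token
    return credentials


wml_credentials = build_cloud_credentials(
    url="https://us-south.ml.cloud.ibm.com",
    instance_id="*****",
    apikey="***********",
)
# The result would then be passed to WatsonMachineLearningAPIClient(wml_credentials).
```

In practice the secret values would usually come from the environment or a secrets store rather than from literals in the notebook.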
Authentication for IBM Cloud Pak™ for Data (CP4D)
*Authentication for IBM Cloud Pak for Data 2.5*
IBM Cloud Pak for Data version 2.5.0 users can create the Watson Machine Learning Python client by providing the credentials shown below:
from watson_machine_learning_client import WatsonMachineLearningAPIClient
wml_credentials = {
"url": "<URL>",
"username": "<USERNAME>",
"password" : "<PASSWORD>",
"instance_id": "wml_local",
"version" : "2.5.0"
}
client = WatsonMachineLearningAPIClient(wml_credentials)
Note
Note the additional version field in wml_credentials; its value must be “2.5.0”.
Setting a default space id is mandatory for CP4D. Refer to the client.set.default_space() API in this document.
*Authentication for CP4D 3.0*
IBM Cloud Pak for Data version 3.0.0 or 3.0.1 users can create the Watson Machine Learning Python client by providing the credentials shown below:
Example of creating the client using user credentials:
from watson_machine_learning_client import WatsonMachineLearningAPIClient
wml_credentials = {
"url": "<URL>",
"username": "<USERNAME>",
"password" : "<PASSWORD>",
"instance_id": "wml_local",
"version" : "3.0.0"
}
client = WatsonMachineLearningAPIClient(wml_credentials)
Example of creating the client using token:
In IBM Cloud Pak™ for Data version 3.0.0 or 3.0.1, users can authenticate with the token set in the notebook environment.
import os
from watson_machine_learning_client import WatsonMachineLearningAPIClient
# USER_ACCESS_TOKEN is set in the notebook environment
access_token = os.environ['USER_ACCESS_TOKEN']
wml_credentials = {
    "url": "https://us-south.ml.cloud.ibm.com",
    "token": access_token,
    "instance_id": "wml_local",
    "version" : "3.0.0"
}
client = WatsonMachineLearningAPIClient(wml_credentials)
Note
The version value should be set to “3.0.0” for IBM Cloud Pak™ for Data version 3.0.0 users.
The version value should be set to “3.0.1” for IBM Cloud Pak™ for Data version 3.0.1 users.
Setting a default space id or project id is mandatory. Refer to the client.set.default_space() and client.set.default_project() APIs in this document for more examples.
Authentication for WML Server
In IBM Watson Machine Learning Server, users can authenticate with the token set in the notebook environment or with user credentials.
Example of creating the client using user credentials:
from watson_machine_learning_client import WatsonMachineLearningAPIClient
wml_credentials = {
"url": "<URL>",
"username": "<USERNAME>",
"password" : "<PASSWORD>",
"instance_id": "wml_local",
"version" : "1.1"
}
client = WatsonMachineLearningAPIClient(wml_credentials)
Example of creating the client using token:
import os
from watson_machine_learning_client import WatsonMachineLearningAPIClient
# USER_ACCESS_TOKEN is set in the notebook environment
access_token = os.environ['USER_ACCESS_TOKEN']
wml_credentials = {
    "url": "https://us-south.ml.cloud.ibm.com",
    "token": access_token,
    "instance_id": "wml_local",
    "version" : "1.1"
}
client = WatsonMachineLearningAPIClient(wml_credentials)
Note
The version value should be set to the corresponding WML Server product version (for example, “1.1” or “2.0”).
Setting a default space id or project id is mandatory. Refer to the client.set.default_space() and client.set.default_project() APIs in this document for more examples.
The “url” field value must include the port number. For example, the value can be “https://wmlserver.xxx.com:31843”.
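The CP4D and WML Server credential dictionaries share the same fields, and the notes above add two easy-to-miss requirements: a version value and, for WML Server, a port in the URL. The following sketch checks both before the dictionary is built. The helper build_local_credentials and its require_port flag are illustrative assumptions and not part of the client library.

```python
from typing import Dict
from urllib.parse import urlparse


def build_local_credentials(url: str, username: str, password: str,
                            version: str, require_port: bool = False) -> Dict[str, str]:
    """Return a wml_credentials dict for CP4D or WML Server (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError("url must be a full http(s) URL")
    # WML Server expects the port number to be part of the url value.
    if require_port and parsed.port is None:
        raise ValueError("url must include a port number")
    if not version:
        raise ValueError("version is required")
    return {
        "url": url,
        "username": username,
        "password": password,
        "instance_id": "wml_local",
        "version": version,
    }


server_credentials = build_local_credentials(
    "https://wmlserver.xxx.com:31843", "<USERNAME>", "<PASSWORD>",
    version="1.1", require_port=True,
)
# server_credentials would then be passed to WatsonMachineLearningAPIClient.
```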
data_assets (Applicable only for IBM Cloud Pak™ for Data)¶
-
class
client.Assets(client)[source]¶ Store and manage your data assets.
-
ConfigurationMetaNames= <watson_machine_learning_client.metanames.AssetsMetaNames object>¶ MetaNames for Data Assets creation.
-
create(name, file_path)[source]¶ Creates a data asset and uploads content to it.
Parameters
Important
name: Name to be given to the data asset
type: str
file_path: Path to the content file to be uploaded
type: str
Output
Important
returns: metadata of the stored data asset
return type: dict
- Example
>>> asset_details = client.data_assets.create(name="sample_asset",file_path="/path/to/file")
-
delete(asset_uid)[source]¶ Delete a stored data asset.
Parameters
Important
asset_uid: Unique Id of data asset
type: str
Output
Important
returns: status (“SUCCESS” or “FAILED”)
return type: str
Example
>>> client.data_assets.delete(asset_uid)
-
download(asset_uid, filename)[source]¶ Download the content of a data asset.
Parameters
Important
asset_uid: The Unique Id of the data asset to be downloaded
type: str
filename: filename to be used for the downloaded file
type: str
Output
returns: Path to the downloaded asset content
return type: str
Example
>>> client.data_assets.download(asset_uid,"sample_asset.csv")
-
get_details(asset_uid)[source]¶ Get data asset details.
Parameters
Important
asset_uid: Unique Id of the data asset
type: str
Output
Important
returns: metadata of the stored data asset
return type: dict
Example
>>> asset_details = client.data_assets.get_details(asset_uid)
-
static
get_href(asset_details)[source]¶ Get url of stored data asset.
Parameters
Important
asset_details: stored data asset details
type: dict
Output
Important
returns: href of stored data asset
return type: str
Example
>>> asset_details = client.data_assets.get_details(asset_uid)
>>> asset_href = client.data_assets.get_href(asset_details)
-
static
get_uid(asset_details)[source]¶ Get Unique Id of stored data asset.
Parameters
Important
asset_details: Metadata of the stored data asset
type: dict
Output
Important
returns: Unique Id of stored asset
return type: str
Example
>>> asset_uid = client.data_assets.get_uid(asset_details)
-
list(limit=None)[source]¶ List stored data assets. If limit is set to None, only the first 50 records are shown.
Parameters
Important
limit: limit number of fetched records
type: int
Output
Important
This method only prints the list of all data assets in a table format.
return type: None
Example
>>> client.data_assets.list()
-
store(meta_props)[source]¶ Creates a data asset and uploads content to it.
Parameters
Important
meta_props: metadata of the data asset configuration. To see available meta names use:
>>> client.data_assets.ConfigurationMetaNames.get()
type: dict
- Example
Example of data asset creation for files:
>>> metadata = {
...     client.data_assets.ConfigurationMetaNames.NAME: 'my data assets',
...     client.data_assets.ConfigurationMetaNames.DESCRIPTION: 'sample description',
...     client.data_assets.ConfigurationMetaNames.DATA_CONTENT_NAME: 'sample.csv'
... }
>>> asset_details = client.data_assets.store(meta_props=metadata)
Example of data asset creation using a connection:
>>> metadata = {
...     client.data_assets.ConfigurationMetaNames.NAME: 'my data assets',
...     client.data_assets.ConfigurationMetaNames.DESCRIPTION: 'sample description',
...     client.data_assets.ConfigurationMetaNames.CONNECTION_ID: '39eaa1ee-9aa4-4651-b8fe-95d3ddae',
...     client.data_assets.ConfigurationMetaNames.DATA_CONTENT_NAME: 't1/sample.csv'
... }
>>> asset_details = client.data_assets.store(meta_props=metadata)
Example of data asset creation with a database source type connection:
>>> metadata = {
...     client.data_assets.ConfigurationMetaNames.NAME: 'my data assets',
...     client.data_assets.ConfigurationMetaNames.DESCRIPTION: 'sample description',
...     client.data_assets.ConfigurationMetaNames.CONNECTION_ID: '23eaf1ee-96a4-4651-b8fe-95d3dadfe',
...     client.data_assets.ConfigurationMetaNames.DATA_CONTENT_NAME: 't1'
... }
>>> asset_details = client.data_assets.store(meta_props=metadata)
-
deployments¶
-
class
client.Deployments(client)[source]¶ Deploy and score published artifacts (models and functions).
-
create(artifact_uid=None, meta_props=None, rev_id=None, **kwargs)[source]¶ Create a deployment from an artifact. An artifact is a model or function that can be deployed.
Parameters
Important
artifact_uid: Published artifact UID (model or function uid)
type: str
meta_props: metaprops. To see the available list of metanames use:
>>> client.deployments.ConfigurationMetaNames.get()
type: dict
Output
Important
returns: metadata of the created deployment
return type: dict
- Example
>>> meta_props = {
...     client.deployments.ConfigurationMetaNames.NAME: "SAMPLE DEPLOYMENT NAME",
...     client.deployments.ConfigurationMetaNames.ONLINE: {}
... }
>>> deployment_details = client.deployments.create(artifact_uid, meta_props)
-
create_job(deployment_id, meta_props)[source]¶ Create an asynchronous deployment job.
Parameters
Important
deployment_id: Unique Id of Deployment
type: str
meta_props: metaprops. To see the available list of metanames use:
>>> client.deployments.ScoringMetaNames.get()
>>> client.deployments.DecisionOptimizationMetaNames.get()
type: dict
Output
Important
returns: metadata of the created async deployment job
return type: dict
Note
Valid payloads for scoring input are a list of values, a pandas DataFrame, or a numpy array.
Example
>>> scoring_payload = {
...     client.deployments.ScoringMetaNames.INPUT_DATA: [{
...         'fields': ['GENDER', 'AGE', 'MARITAL_STATUS', 'PROFESSION'],
...         'values': [['M', 23, 'Single', 'Student'], ['M', 55, 'Single', 'Executive']]
...     }]
... }
>>> async_job = client.deployments.create_job(deployment_id, scoring_payload)
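The INPUT_DATA value in the example above is a list of blocks, each pairing a 'fields' list with row-ordered 'values'. When records arrive as dictionaries, a small converter can produce that shape. This is a sketch only; to_fields_values is a hypothetical helper name and not part of the client library.

```python
from typing import Any, Dict, List, Sequence


def to_fields_values(records: Sequence[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Convert row dicts into {'fields': [...], 'values': [[...], ...]}."""
    if not records:
        raise ValueError("at least one record is required")
    # Column order is taken from the first record.
    fields = list(records[0].keys())
    values = []
    for record in records:
        if set(record) != set(fields):
            raise ValueError("all records must have the same keys")
        values.append([record[name] for name in fields])
    return {"fields": fields, "values": values}


rows = [
    {"GENDER": "M", "AGE": 23, "MARITAL_STATUS": "Single", "PROFESSION": "Student"},
    {"GENDER": "M", "AGE": 55, "MARITAL_STATUS": "Single", "PROFESSION": "Executive"},
]
input_data = [to_fields_values(rows)]
# With a client instance, the payload would be built as
# {client.deployments.ScoringMetaNames.INPUT_DATA: input_data}.
```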
-
delete(deployment_uid)[source]¶ Delete deployment.
Parameters
Important
deployment_uid: Unique Id of Deployment
type: str
Output
Important
returns: status (“SUCCESS” or “FAILED”)
return type: str
Example
>>> client.deployments.delete(deployment_uid)
-
delete_job(job_uid, hard_delete=False)[source]¶ Cancels a deployment job that is currently running. This method can also be used to delete the metadata of completed or canceled jobs when the hard_delete parameter is set to True.
Parameters
Important
job_uid: Unique Id of deployment job which should be canceled
type: str
- hard_delete: specify True or False.
True - To delete the completed or canceled job. False - To cancel the currently running deployment job. Default value is False.
type: Boolean
Output
Important
returns: status (“SUCCESS” or “FAILED”)
return type: str
Example
>>> client.deployments.delete_job(job_uid)
-
download(virtual_deployment_uid, filename=None)[source]¶ Downloads the file deployment for the specified deployment Id. The currently supported format is Core ML.
- Parameters
virtual_deployment_uid (str) – Unique Id of virtual deployment
filename (str) – filename of downloaded archive (optional)
- Returns
path to downloaded file
- Return type
str
-
get_details(deployment_uid=None, limit=None)[source]¶ Get information about your deployment(s). If deployment_uid is not passed, all deployment details are fetched.
Parameters
Important
deployment_uid: Unique Id of Deployment (optional)
type: str
limit: limit number of fetched records (optional)
type: int
Output
Important
returns: metadata of deployment(s)
return type: dict
dict (if deployment_uid is not None) or {“resources”: [dict]} (if deployment_uid is None)
Note
If deployment_uid is not specified, metadata of all deployments is fetched.
Example
>>> deployment_details = client.deployments.get_details(deployment_uid)
>>> deployment_details = client.deployments.get_details(deployment_uid=deployment_uid)
>>> deployments_details = client.deployments.get_details()
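Because get_details returns either a single dict or a {"resources": [dict]} wrapper, calling code often has to handle both shapes. A minimal sketch of a normalizer follows. The helper name as_detail_list is hypothetical, and the sample dictionaries are placeholders rather than real service responses.

```python
from typing import Any, Dict, List


def as_detail_list(details: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return a list of detail dicts for either return shape."""
    # A "resources" key marks the multi-record shape described above.
    if "resources" in details:
        return list(details["resources"])
    return [details]


# Placeholder inputs standing in for get_details() results.
single = {"placeholder": "one deployment"}
many = {"resources": [{"placeholder": "first"}, {"placeholder": "second"}]}

for item in as_detail_list(single) + as_detail_list(many):
    print(item)
```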
-
static
get_download_url(deployment_details)[source]¶ Get deployment_download_url from deployment details.
- Parameters
deployment_details (dict) – Created deployment details
- Returns
deployment download URL that is used to get file deployment (for example: Core ML)
- Return type
str
Example
>>> deployment_url = client.deployments.get_download_url(deployment)
-
static
get_href(deployment_details)[source]¶ Get deployment_href from deployment details.
Parameters
Important
deployment_details: Metadata of the deployment
type: dict
Output
Important
returns: deployment href that is used to manage the deployment
return type: str
Example
>>> deployment_href = client.deployments.get_href(deployment)
-
get_job_details(job_uid=None, limit=None)[source]¶ Get information about your deployment job(s). If deployment job_uid is not passed, all deployment jobs details are fetched.
Parameters
Important
job_uid: Unique Job ID (optional)
type: str
limit: limit number of fetched records (optional)
type: int
Output
Important
returns: metadata of deployment job(s)
return type: dict
dict (if job_uid is not None) or {“resources”: [dict]} (if job_uid is None)
Note
If job_uid is not specified, metadata of all deployment jobs associated with the deployment Id is fetched.
Example
>>> jobs_details = client.deployments.get_job_details()
>>> job_details = client.deployments.get_job_details(job_uid=job_uid)
-
get_job_href(job_details)[source]¶ Get the href of the deployment job.
Parameters
Important
job_details: metadata of the deployment job
type: dict
Output
Important
returns: href of the deployment job
return type: str
Example
>>> job_details = client.deployments.get_job_details(job_uid=job_uid)
>>> job_href = client.deployments.get_job_href(job_details)
-
get_job_status(job_id)[source]¶ Get the status of the deployment job.
Parameters
Important
job_id: Unique Id of the deployment job
type: str
Output
Important
returns: status of the deployment job
return type: dict
Example
>>> job_status = client.deployments.get_job_status(job_uid)
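Since create_job is asynchronous, a caller typically polls the job status until it settles. The sketch below shows one way to do that with an injected status getter, so it runs without a service connection. The function wait_for_job, the terminal state names, and the 'state' key mentioned in the usage comment are all assumptions and are not confirmed by this document.

```python
import time
from typing import Callable, Iterable


def wait_for_job(get_state: Callable[[], str],
                 terminal_states: Iterable[str] = ("completed", "failed", "canceled"),
                 interval_seconds: float = 5.0,
                 max_checks: int = 120) -> str:
    """Poll get_state() until it returns a terminal state (hypothetical helper)."""
    terminal = set(terminal_states)
    for attempt in range(max_checks):
        state = get_state()
        if state in terminal:
            return state
        # Sleep between checks, but not after the final one.
        if attempt < max_checks - 1:
            time.sleep(interval_seconds)
    raise TimeoutError("job did not reach a terminal state")


# Possible usage with a client instance (assumes the status dict has a 'state' key):
# final_state = wait_for_job(lambda: client.deployments.get_job_status(job_uid)["state"])
```

Passing the getter as a callable keeps the polling logic independent of the exact shape of the status dictionary.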
-
get_job_uid(job_details)[source]¶ Get the Unique Id of the deployment job.
Parameters
Important
job_details: metadata of the deployment job
type: dict
Output
Important
returns: Unique Id of the deployment job
return type: str
Example
>>> job_details = client.deployments.get_job_details(job_uid=job_uid)
>>> job_uid = client.deployments.get_job_uid(job_details)
-
static
get_scoring_href(deployment_details)[source]¶ Get scoring url from deployment details.
Parameters
Important
deployment_details: Metadata of the deployment
type: dict
Output
Important
returns: scoring endpoint url that is used for making scoring requests
return type: str
Example
>>> scoring_href = client.deployments.get_scoring_href(deployment)
-
static
get_uid(deployment_details)[source]¶ Get deployment_uid from deployment details.
Parameters
Important
deployment_details: Metadata of the deployment
type: dict
Output
Important
returns: deployment UID that is used to manage the deployment
return type: str
Example
>>> deployment_uid = client.deployments.get_uid(deployment)
-
list(limit=None)[source]¶ List deployments. If limit is set to None, only the first 50 records are shown.
Parameters
Important
limit: limit number of fetched records
type: int
Output
Important
This method only prints the list of all deployments in a table format.
return type: None
Example
>>> client.deployments.list()
-
list_jobs(limit=None)[source]¶ List the async deployment jobs. If limit is set to None, only the first 50 records are shown.
Parameters
Important
limit: limit number of fetched records
type: int
Output
Important
This method only prints the list of all async jobs in a table format.
return type: None
Note
This method lists only async deployment jobs created for a WML deployment.
Example
>>> client.deployments.list_jobs()
-
score(deployment_id, meta_props)[source]¶ Make scoring requests against deployed artifact.
Parameters
Important
deployment_id: Unique Id of the deployment to be scored
type: str
meta_props: Meta props for scoring. To see the available list of ScoringMetaNames use:
>>> client.deployments.ScoringMetaNames.show()
type: dict
transaction_id: transaction id to be passed with records during payload logging (optional)
type: str
Output
Important
returns: scoring result containing prediction and probability
return type: dict
Note
client.deployments.ScoringMetaNames.INPUT_DATA is the only metaname valid for sync scoring.
Valid payloads for scoring input are a list of values, a pandas DataFrame, or a numpy array.
Example
>>> scoring_payload = {
...     client.deployments.ScoringMetaNames.INPUT_DATA: [{
...         'fields': ['GENDER', 'AGE', 'MARITAL_STATUS', 'PROFESSION'],
...         'values': [
...             ['M', 23, 'Single', 'Student'],
...             ['M', 55, 'Single', 'Executive']
...         ]
...     }]
... }
>>> predictions = client.deployments.score(deployment_id, scoring_payload)
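The scoring result is returned as a dict. As a sketch, assuming the response mirrors the request layout with a "predictions" list of 'fields'/'values' blocks (an assumption, since the response schema is not shown in this document), the rows can be flattened into dictionaries as follows. The helper predictions_to_rows and the sample response are illustrative only.

```python
from typing import Any, Dict, List


def predictions_to_rows(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten an assumed {'predictions': [{'fields': ..., 'values': ...}]} result."""
    rows = []
    for block in response.get("predictions", []):
        fields = block.get("fields", [])
        for values in block.get("values", []):
            # Pair each column name with the value in the same position.
            rows.append(dict(zip(fields, values)))
    return rows


# Illustrative response; real field names depend on the deployed model.
sample_response = {
    "predictions": [{
        "fields": ["prediction", "probability"],
        "values": [["A", [0.9, 0.1]], ["B", [0.2, 0.8]]],
    }]
}
scored_rows = predictions_to_rows(sample_response)
```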
-
update(deployment_uid, changes)[source]¶ Updates existing deployment metadata. If ASSET is patched, the ‘id’ field is mandatory and a deployment is started with the provided asset id/rev. The deployment id remains the same.
Parameters
Important
deployment_uid: Unique Id of the deployment which should be updated
type: str
changes: elements which should be changed, where keys are ConfigurationMetaNames
type: dict
Output
Important
returns: metadata of updated deployment
return type: dict
Example
>>> metadata = {
...     client.deployments.ConfigurationMetaNames.NAME: "updated_Deployment",
...     client.deployments.ConfigurationMetaNames.ASSET: {"id": "ca0cd864-4582-4732-b365-3165598dc945", "rev": "2"}
... }
>>> deployment_details = client.deployments.update(deployment_uid, changes=metadata)
-
-
class
metanames.DeploymentMetaNames[source]¶ Set of MetaNames for Deployments Specs.
Available MetaNames:
MetaName | Type | Required | Example value | Schema
NAME | str | N | my_deployment |
TAGS | list | N | [{'value': 'dsx-project.<project-guid>', 'description': 'DSX project guid'}] | [{'value(required)': 'string', 'description(optional)': 'string'}]
DESCRIPTION | str | N | my_deployment |
CUSTOM | dict | N | {} |
AUTO_REDEPLOY | bool | N | False |
SPACE_UID | str | N | 3c1ce536-20dc-426e-aac7-7284cf3befc6 |
COMPUTE | dict | N | None |
ONLINE | dict | N | {} |
BATCH | dict | N | {} |
VIRTUAL | dict | N | {} |
ASSET | dict | N | {} |
R_SHINY | dict | N | {} |
HYBRID_PIPELINE_HARDWARE_SPECS | list | N | [{'id': '3342-1ce536-20dc-4444-aac7-7284cf3befc'}] |
HARDWARE_SPEC | dict | N | {'id': '3342-1ce536-20dc-4444-aac7-7284cf3befc'} |
-
class
metanames.ScoringMetaNames[source]¶ Set of MetaNames for Scoring.
Available MetaNames:
MetaName | Type | Required | Example value | Schema
INPUT_DATA | list | N | [{'fields': ['name', 'age', 'occupation'], 'values': [['john', 23, 'student']]}] | [{'name(optional)': 'string', 'id(optional)': 'string', 'fields(optional)': 'array[string]', 'values': 'array[array[string]]'}]
INPUT_DATA_REFERENCES | list | N | | [{'id(optional)': 'string', 'name(optional)': 'string', 'type(required)': 'string', 'connection(required)': {'endpoint_url(required)': 'string', 'access_key_id(required)': 'string', 'secret_access_key(required)': 'string'}, 'location(required)': {'bucket': 'string', 'path': 'string'}, 'schema(optional)': {'id(required)': 'string', 'fields(required)': [{'name(required)': 'string', 'type(required)': 'string', 'nullable(optional)': 'string'}]}}]
OUTPUT_DATA_REFERENCE | dict | N | | {'name(optional)': 'string', 'type(required)': 'string', 'connection(required)': {'endpoint_url(required)': 'string', 'access_key_id(required)': 'string', 'secret_access_key(required)': 'string'}, 'location(required)': {'bucket': 'string', 'path': 'string'}, 'schema(optional)': {'id(required)': 'string', 'fields(required)': [{'name(required)': 'string', 'type(required)': 'string', 'nullable(optional)': 'string'}]}}
EVALUATIONS_SPEC | list | N | [{'id': 'string', 'input_target': 'string', 'metrics_names': ['auroc', 'accuracy']}] | [{'id(optional)': 'string', 'input_target(optional)': 'string', 'metrics_names(optional)': 'array[string]'}]
ENVIRONMENT_VARIABLES | dict | N | {'my_env_var1': 'env_var_value1', 'my_env_var2': 'env_var_value2'} |
-
class
metanames.DecisionOptimizationMetaNames[source]¶ Set of MetaNames for Decision Optimization.
Available MetaNames:
MetaName | Type | Required | Example value | Schema
INPUT_DATA | list | N | [{'fields': ['name', 'age', 'occupation'], 'values': [['john', 23, 'student']]}] | [{'name(optional)': 'string', 'id(optional)': 'string', 'fields(optional)': 'array[string]', 'values': 'array[array[string]]'}]
INPUT_DATA_REFERENCES | list | N | [{'fields': ['name', 'age', 'occupation'], 'values': [['john', 23, 'student']]}] | [{'name(optional)': 'string', 'id(optional)': 'string', 'fields(optional)': 'array[string]', 'values': 'array[array[string]]'}]
OUTPUT_DATA | list | N | | [{'name(optional)': 'string'}]
OUTPUT_DATA_REFERENCES | list | N | | {'name(optional)': 'string', 'type(required)': 'string', 'connection(required)': {'endpoint_url(required)': 'string', 'access_key_id(required)': 'string', 'secret_access_key(required)': 'string'}, 'location(required)': {'bucket': 'string', 'path': 'string'}, 'schema(optional)': {'id(required)': 'string', 'fields(required)': [{'name(required)': 'string', 'type(required)': 'string', 'nullable(optional)': 'string'}]}}
SOLVE_PARAMETERS | dict | N | |
experiments¶
-
class
client.Experiments(client)[source]¶ Run new experiment.
-
ConfigurationMetaNames= <watson_machine_learning_client.metanames.ExperimentMetaNames object>¶ MetaNames for experiments creation.
-
clone(experiment_uid, space_id=None, action='copy', rev_id=None)[source]¶ Creates a new experiment identical to the given experiment, either in the same space or in a new space. All dependent assets are cloned too.
Parameters
Important
experiment_uid: Guid of the experiment to be cloned
type: str
space_id: Guid of the space to which the experiment needs to be cloned. (optional)
type: str
action: Action specifying “copy” or “move”. (optional)
type: str
rev_id: Revision ID of the experiment. (optional)
type: str
Output
Important
returns: Metadata of the experiment cloned.
return type: dict
- Example
>>> client.experiments.clone(experiment_uid=artifact_id,space_id=space_uid,action="copy")
Note
If revision id is not specified, all revisions of the artifact are cloned
Default value of the parameter action is copy
Space guid is mandatory for move action
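The notes above imply two argument rules for clone: action is either "copy" or "move", and a space guid is required for "move". A small pre-flight check can make those rules explicit before the service is called. The function check_clone_arguments is a hypothetical sketch and not part of the client library.

```python
from typing import Dict, Optional


def check_clone_arguments(action: str = "copy",
                          space_id: Optional[str] = None) -> Dict[str, Optional[str]]:
    """Validate clone arguments against the rules stated in the notes."""
    if action not in ("copy", "move"):
        raise ValueError('action must be "copy" or "move"')
    # A target space is mandatory when the artifact is moved.
    if action == "move" and space_id is None:
        raise ValueError("space_id is mandatory for the move action")
    return {"action": action, "space_id": space_id}


clone_kwargs = check_clone_arguments(action="copy")
# With a client instance:
# client.experiments.clone(experiment_uid=artifact_id, **clone_kwargs)
```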
-
create_revision(experiment_id)[source]¶ Creates a new experiment revision.
Parameters
experiment_id
Output
returns: details of the new revision of the stored experiment
Example
>>> experiment_revision_artifact = client.experiments.create_revision(experiment_id)
-
delete(experiment_uid)[source]¶ Delete a stored experiment.
Parameters
Important
experiment_uid: Unique Id of the stored experiment
type: str
Output
Important
returns: status (“SUCCESS” or “FAILED”)
return type: str
Example
>>> client.experiments.delete(experiment_uid)
-
get_details(experiment_uid=None, limit=None)[source]¶ Get metadata of experiment(s). If no experiment UID is specified all experiments metadata is returned.
Parameters
Important
experiment_uid: UID of experiment (optional)
type: str
limit: limit number of fetched records (optional)
type: int
Output
Important
returns: experiment(s) metadata
return type: dict
dict (if UID is not None) or {“resources”: [dict]} (if UID is None)
Note
If UID is not specified, metadata of all experiments is fetched.
Example
>>> experiment_details = client.experiments.get_details(experiment_uid)
>>> experiments_details = client.experiments.get_details()
-
static
get_href(experiment_details)[source]¶ Get href of stored experiment.
Parameters
Important
experiment_details: Metadata of the stored experiment
type: dict
Output
Important
returns: href of stored experiment
return type: str
Example
>>> experiment_details = client.experiments.get_details(experiment_uid)
>>> experiment_href = client.experiments.get_href(experiment_details)
-
get_revision_details(experiment_uid, rev_uid)[source]¶ Get metadata of a stored experiment revision.
- Parameters
experiment_uid (str) – stored experiment UID
rev_uid (int) – revision number of the experiment
- Returns
stored experiment revision metadata
- Return type
dict
Example:
>>> experiment_details = client.experiments.get_revision_details(experiment_uid, rev_uid)
-
static
get_uid(experiment_details)[source]¶ Get Unique Id of stored experiment.
Parameters
Important
experiment_details: Metadata of the stored experiment
type: dict
Output
Important
returns: Unique Id of stored experiment
return type: str
Example
>>> experiment_details = client.experiments.get_details(experiment_uid)
>>> experiment_uid = client.experiments.get_uid(experiment_details)
-
list(limit=None)[source]¶ List stored experiments. If limit is set to None, only the first 50 records are shown.
Parameters
Important
limit: limit number of fetched records
type: int
Output
Important
This method only prints the list of all experiments in a table format.
return type: None
Example
>>> client.experiments.list()
-
list_revisions(experiment_uid, limit=None)[source]¶ List all revisions for the given experiment uid.
- Parameters
experiment_uid (str) – Unique id of stored experiment.
limit (int) – limit number of fetched records (optional)
- Returns
stored experiment revisions details
- Return type
table
Example:
>>> client.experiments.list_revisions(experiment_uid)
-
store(meta_props)[source]¶ Create an experiment.
Parameters
Important
meta_props: meta data of the experiment configuration. To see available meta names use:
>>> client.experiments.ConfigurationMetaNames.get()
type: dict
Output
Important
returns: stored experiment metadata
return type: dict
Example
>>> metadata = {
...     client.experiments.ConfigurationMetaNames.NAME: 'my_experiment',
...     client.experiments.ConfigurationMetaNames.EVALUATION_METRICS: ['accuracy'],
...     client.experiments.ConfigurationMetaNames.TRAINING_REFERENCES: [
...         {'pipeline': {'href': pipeline_href_1}},
...         {'pipeline': {'href': pipeline_href_2}},
...     ]
... }
>>> experiment_details = client.experiments.store(meta_props=metadata)
>>> experiment_href = client.experiments.get_href(experiment_details)
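The TRAINING_REFERENCES value in the example above is a list with one {'pipeline': {'href': ...}} entry per pipeline. When the hrefs are already collected in a list, the structure can be generated rather than typed out. The helper build_training_references is a hypothetical sketch, and the second href below is a made-up placeholder.

```python
from typing import Dict, List, Sequence


def build_training_references(pipeline_hrefs: Sequence[str]) -> List[Dict[str, Dict[str, str]]]:
    """Build the TRAINING_REFERENCES list from pipeline hrefs."""
    if not pipeline_hrefs:
        raise ValueError("at least one pipeline href is required")
    # One entry per pipeline, in the order given.
    return [{"pipeline": {"href": href}} for href in pipeline_hrefs]


training_references = build_training_references([
    "/v4/pipelines/6d758251-bb01-4aa5-a7a3-72339e2ff4d8",
    "/v4/pipelines/<second-pipeline-guid>",  # placeholder href
])
# With a client instance, this list would be the value for
# client.experiments.ConfigurationMetaNames.TRAINING_REFERENCES.
```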
-
update(experiment_uid, changes)[source]¶ Updates existing experiment metadata.
Parameters
Important
experiment_uid: UID of the experiment whose definition should be updated
type: str
changes: elements which should be changed, where keys are ConfigurationMetaNames
type: dict
Output
Important
returns: metadata of updated experiment
return type: dict
Example
>>> metadata = {
...     client.experiments.ConfigurationMetaNames.NAME: "updated_exp"
... }
>>> exp_details = client.experiments.update(experiment_uid, changes=metadata)
-
-
class
metanames.ExperimentMetaNames[source]¶ Set of MetaNames for experiments.
Available MetaNames:
MetaName | Type | Required | Example value | Schema
NAME | str | Y | Hand-written Digit Recognition |
DESCRIPTION | str | N | Hand-written Digit Recognition training |
TAGS | list | N | [{'value': 'dsx-project.<project-guid>', 'description': 'DSX project guid'}] | [{'value(required)': 'string', 'description(optional)': 'string'}]
EVALUATION_METHOD | str | N | multiclass |
EVALUATION_METRICS | list | N | [{'name': 'accuracy', 'maximize': False}] | [{'name(required)': 'string', 'maximize(optional)': 'boolean'}]
TRAINING_REFERENCES | list | Y | [{'pipeline': {'href': '/v4/pipelines/6d758251-bb01-4aa5-a7a3-72339e2ff4d8'}}] | [{'pipeline(optional)': {'href(required)': 'string', 'data_bindings(optional)': [{'data_reference(required)': 'string', 'node_id(required)': 'string'}], 'nodes_parameters(optional)': [{'node_id(required)': 'string', 'parameters(required)': 'dict'}]}, 'training_lib(optional)': {'href(required)': 'string', 'compute(optional)': {'name(required)': 'string', 'nodes(optional)': 'number'}, 'runtime(optional)': {'href(required)': 'string'}, 'command(optional)': 'string', 'parameters(optional)': 'dict'}}]
SPACE_UID | str | N | 3c1ce536-20dc-426e-aac7-7284cf3befc6 |
LABEL_COLUMN | str | N | label |
CUSTOM | dict | N | {'field1': 'value1'} |
model_definitions (Applicable only for IBM Cloud Pak for Data)¶
-
class
client.ModelDefinition(client)[source]¶ Store and manage your model_definitions.
-
ConfigurationMetaNames= <watson_machine_learning_client.metanames.ModelDefinitionMetaNames object>¶ MetaNames for model_definition creation.
-
delete(model_definition_uid)[source]¶ Delete a stored model_definition.
Parameters
Important
model_definition_uid: Unique Id of stored Model definition
type: str
Output
Important
returns: status (“SUCCESS” or “FAILED”)
return type: str
Example
>>> client.model_definitions.delete(model_definition_uid)
-
get_details(model_definition_uid)[source]¶ Get metadata of stored model_definition.
Parameters
Important
model_definition_uid: Unique Id of model_definition
type: str
Output
Important
returns: metadata of model definition
return type: dict
Example
>>> model_definition_details = client.model_definitions.get_details(model_definition_uid)
-
get_href(model_definition_details)[source]¶ Get href of stored model_definition.
- Parameters
model_definition_details (dict) – stored model_definition details
- Returns
href of stored model_definition
- Return type
str
Example:
>>> model_definition_href = client.model_definitions.get_href(model_definition_details)
-
get_uid(model_definition_details)[source]¶ Get uid of stored model_definition.
- Parameters
model_definition_details (dict) – stored model_definition details
- Returns
uid of stored model_definition
- Return type
str
Example:
>>> model_definition_uid = client.model_definitions.get_uid(model_definition_details)
-
list(limit=None)[source]¶ List stored model_definition assets. If limit is set to None, only the first 50 records are shown.
Parameters
Important
limit: limit number of fetched records
type: int
Output
Important
This method only prints the list of all model_definition assets in a table format.
return type: None
Example
>>> client.model_definitions.list()
-
list_revisions(model_definition_uid, limit=None)[source]¶ List revisions of a stored model_definition asset. If limit is None, only the first 50 records are shown.
Parameters
Important
model_definition_uid: Unique id of model_definition
type: str
limit: limit number of fetched records
type: int
Output
Important
This method only prints the list of all model_definition revisions in a table format.
return type: None
Example
>>> client.model_definitions.list_revisions(model_definition_uid)
-
store(model_definition, meta_props)[source]¶ Create a model_definition.
Parameters
Important
meta_props: meta data of the model_definition configuration. To see available meta names use:
>>> client.model_definitions.ConfigurationMetaNames.get()
type: dict
Output
Important
returns: Metadata of the model_definition created
return type: dict
Example
>>> client.model_definitions.store(model_definition, meta_props)
-
-
class
metanames.ModelDefinitionMetaNames[source]¶ Set of MetaNames for Model Definition.
Available MetaNames:
MetaName
Type
Required
Example value
Schema
NAME
str
Y
my_model_definition
DESCRIPTION
str
N
my model_definition
PLATFORM
dict
Y
{'name': 'python', 'versions': ['3.7']}
{'name(required)': 'string', 'version(required)': 'version'}
VERSION
str
Y
1.0
COMMAND
str
N
python3 convolutional_network.py
CUSTOM
dict
N
{'field1': 'value1'}
SPACE_UID
str
N
3c1ce536-20dc-426e-aac7-7284cf3befc6
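The Required column of this table can be checked locally before calling store. The following is a minimal sketch, not part of the client: `REQUIRED_MODEL_DEFINITION_META` and `missing_required` are hypothetical names, and the meta_props keys are plain strings for illustration only (with the client you would use the ConfigurationMetaNames attributes as keys).

```python
# Hypothetical helper, not part of watson-machine-learning-client.
# The required flags are copied from the ModelDefinitionMetaNames table.
REQUIRED_MODEL_DEFINITION_META = {
    "NAME": True,
    "DESCRIPTION": False,
    "PLATFORM": True,
    "VERSION": True,
    "COMMAND": False,
    "CUSTOM": False,
    "SPACE_UID": False,
}


def missing_required(meta_props):
    """Return the required MetaNames absent from meta_props, in table order."""
    return [
        name
        for name, required in REQUIRED_MODEL_DEFINITION_META.items()
        if required and name not in meta_props
    ]


# Assumption: plain string keys stand in for the ConfigurationMetaNames attributes.
meta_props = {
    "NAME": "my_model_definition",
    "PLATFORM": {"name": "python", "versions": ["3.7"]},
}
print(missing_required(meta_props))  # VERSION has not been supplied yet
```

Running a check like this first gives a clear local message instead of a rejected request.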
pipelines¶
-
class
client.Pipelines(client)[source]¶ Store and manage your pipelines.
-
ConfigurationMetaNames= <watson_machine_learning_client.metanames.PipelineMetanames object>¶ MetaNames for pipelines creation.
-
clone(pipeline_uid, space_id=None, action='copy', rev_id=None)[source]¶ Creates a new pipeline identical to the given pipeline, either in the same space or in a new space. All dependent assets are cloned too.
Parameters
Important
pipeline_uid: Guid of the pipeline to be cloned
type: str
space_id: Guid of the space to which the pipeline needs to be cloned. (optional)
type: str
action: Action specifying “copy” or “move”. (optional)
type: str
rev_id: Revision ID of the pipeline. (optional)
type: str
Output
Important
returns: Metadata of the pipeline cloned.
return type: dict
- Example
>>> client.pipelines.clone(pipeline_uid=artifact_id,space_id=space_uid,action="copy")
Note
If revision id is not specified, all revisions of the artifact are cloned
Default value of the parameter action is copy
Space guid is mandatory for move action
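The rules in the note above can be checked before calling clone. This is a sketch under stated assumptions: `check_clone_args` is a hypothetical helper, not a client method, and it only encodes what the note says (action is "copy" or "move", the default is "copy", and a space guid is mandatory for "move").

```python
def check_clone_args(space_id=None, action="copy"):
    """Validate clone() arguments against the rules stated in the note."""
    if action not in ("copy", "move"):
        raise ValueError("action must be 'copy' or 'move'")
    if action == "move" and space_id is None:
        raise ValueError("space_id is mandatory for the move action")
    return {"space_id": space_id, "action": action}


print(check_clone_args())  # defaults: copy, no target space given
try:
    check_clone_args(action="move")  # move without a target space
except ValueError as error:
    print("rejected:", error)
```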
-
create_revision(pipeline_uid)[source]¶ Create a new pipeline revision.
- Parameters
pipeline_uid (str) – Unique pipeline ID
Example:
>>> client.pipelines.create_revision(pipeline_uid)
-
delete(pipeline_uid)[source]¶ Delete a stored pipeline.
Parameters
Important
pipeline_uid: Unique Id of Pipeline
type: str
Output
Important
returns: status (“SUCCESS” or “FAILED”)
return type: str
Example
>>> client.pipelines.delete(pipeline_uid)
-
get_details(pipeline_uid=None, limit=None)[source]¶ Get metadata of stored pipeline(s). If pipeline UID is not specified returns all pipelines metadata.
Parameters
Important
pipeline_uid: Pipeline UID (optional)
type: str
limit: limit number of fetched records (optional)
type: int
Output
Important
returns: metadata of pipeline(s)
return type: dict (if UID is not None) or {“resources”: [dict]} (if UID is None)
Note
If UID is not specified, all pipelines metadata is fetched
Example
>>> pipeline_details = client.pipelines.get_details(pipeline_uid)
>>> pipeline_details = client.pipelines.get_details()
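Because get_details returns either a single dict or {"resources": [dict]}, calling code often wants one uniform shape. A minimal sketch, assuming a single-asset dict never carries a top-level "resources" list; `as_resource_list` is a hypothetical helper, and the sample dicts are placeholders rather than real metadata:

```python
def as_resource_list(details):
    """Return a list of metadata dicts for either documented return shape."""
    if isinstance(details, dict) and isinstance(details.get("resources"), list):
        return details["resources"]  # the "all assets" shape
    return [details]  # the single-asset shape


# Placeholder dicts standing in for real get_details() output.
single = {"name": "pipeline-a"}
many = {"resources": [{"name": "pipeline-a"}, {"name": "pipeline-b"}]}

for details in (single, many):
    print(len(as_resource_list(details)))
```

The same helper applies to the other get_*_details methods that document this pair of return shapes.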
-
static
get_href(pipeline_details)[source]¶ Get href from pipeline details.
Parameters
Important
pipeline_details: Metadata of the stored pipeline
type: dict
Output
Important
returns: pipeline href
return type: str
Example
>>> pipeline_details = client.pipelines.get_details(pipeline_uid)
>>> pipeline_href = client.pipelines.get_href(pipeline_details)
-
get_revision_details(pipeline_uid, rev_uid)[source]¶ Get metadata of pipeline revision.
- Parameters
pipeline_uid (str) – stored pipeline UID
rev_uid (str) – stored pipeline revision ID
- Returns
stored pipeline revision metadata
- Return type
dict
Example:
>>> pipeline_details = client.pipelines.get_revision_details(pipeline_uid, rev_uid)
-
static
get_uid(pipeline_details)[source]¶ Get pipeline_uid from pipeline details.
Parameters
Important
pipeline_details: Metadata of the stored pipeline
type: dict
Output
Important
returns: Unique Id of pipeline
return type: str
- Example
>>> pipeline_uid = client.pipelines.get_uid(pipeline_details)
-
list(limit=None)[source]¶ List stored pipelines. If limit is None, only the first 50 records are shown.
Parameters
Important
limit: limit number of fetched records
type: int
Output
Important
This method only prints the list of all pipelines in a table format.
return type: None
Example
>>> client.pipelines.list()
-
list_revisions(pipeline_uid, limit=None)[source]¶ List all revisions for the given pipeline uid.
- Parameters
pipeline_uid (str) – Unique id of stored pipeline.
limit (int) – limit number of fetched records (optional)
- Returns
stored pipeline revisions details
- Return type
table
- Example
>>> pipeline_revision_details = client.pipelines.list_revisions(pipeline_uid)
-
store(meta_props)[source]¶ Create a pipeline.
Parameters
Important
meta_props: meta data of the pipeline configuration. To see available meta names use:
>>> client.pipelines.ConfigurationMetaNames.get()
type: dict
Output
Important
returns: stored pipeline metadata
return type: dict
Example
>>> metadata = {
>>>  client.pipelines.ConfigurationMetaNames.NAME: 'my_pipeline',
>>>  client.pipelines.ConfigurationMetaNames.DESCRIPTION: 'sample description'
>>> }
>>> pipeline_details = client.pipelines.store(meta_props=metadata)
>>> pipeline_url = client.pipelines.get_href(pipeline_details)
-
update(pipeline_uid, changes)[source]¶ Updates existing pipeline metadata.
Parameters
Important
pipeline_uid: Unique Id of the pipeline whose definition should be updated
type: str
changes: elements which should be changed, where keys are ConfigurationMetaNames
type: dict
Output
Important
returns: metadata of updated pipeline
return type: dict
Example
>>> metadata = {
>>>  client.pipelines.ConfigurationMetaNames.NAME: "updated_pipeline"
>>> }
>>> pipeline_details = client.pipelines.update(pipeline_uid, changes=metadata)
-
-
class
metanames.PipelineMetanames[source]¶ Set of MetaNames for pipelines.
Available MetaNames:
MetaName
Type
Required
Example value
Schema
NAME
str
Y
Hand-written Digit Recognition
DESCRIPTION
str
N
Hand-written Digit Recognition training
SPACE_UID
str
N
3c1ce536-20dc-426e-aac7-7284cf3befc6
TAGS
list
N
[{'value': 'dsx-project.<project-guid>', 'description': 'DSX project guid'}]
[{'value(required)': 'string', 'description(optional)': 'string'}]
DOCUMENT
dict
N
{'doc_type': 'pipeline', 'version': '2.0', 'primary_pipeline': 'dlaas_only', 'pipelines': [{'id': 'dlaas_only', 'runtime_ref': 'hybrid', 'nodes': [{'id': 'training', 'type': 'model_node', 'op': 'dl_train', 'runtime_ref': 'DL', 'inputs': [], 'outputs': [], 'parameters': {'name': 'tf-mnist', 'description': 'Simple MNIST model implemented in TF', 'command': 'python3 convolutional_network.py --trainImagesFile ${DATA_DIR}/train-images-idx3-ubyte.gz --trainLabelsFile ${DATA_DIR}/train-labels-idx1-ubyte.gz --testImagesFile ${DATA_DIR}/t10k-images-idx3-ubyte.gz --testLabelsFile ${DATA_DIR}/t10k-labels-idx1-ubyte.gz --learningRate 0.001 --trainingIters 6000', 'compute': {'name': 'k80', 'nodes': 1}, 'training_lib_href': '/v4/libraries/64758251-bt01-4aa5-a7ay-72639e2ff4d2/content'}, 'target_bucket': 'wml-dev-results'}]}]}
{'doc_type(required)': 'string', 'version(required)': 'string', 'primary_pipeline(required)': 'string', 'pipelines(required)': [{'id(required)': 'string', 'runtime_ref(required)': 'string', 'nodes(required)': [{'id': 'string', 'type': 'string', 'inputs': 'list', 'outputs': 'list', 'parameters': {'training_lib_href': 'string'}}]}]}
CUSTOM
dict
N
{'field1': 'value1'}
IMPORT
dict
N
{'connection': {'endpoint_url': 'https://s3-api.us-geo.objectstorage.softlayer.net', 'access_key_id': '***', 'secret_access_key': '***'}, 'location': {'bucket': 'train-data', 'path': 'training_path'}, 'type': 's3'}
{'name(optional)': 'string', 'type(required)': 'string', 'connection(required)': {'endpoint_url(required)': 'string', 'access_key_id(required)': 'string', 'secret_access_key(required)': 'string'}, 'location(required)': {'bucket': 'string', 'path': 'string'}}
RUNTIMES
list
N
[{'id': 'id', 'name': 'tensorflow', 'version': '1.13-py3'}]
COMMAND
str
N
convolutional_network.py --trainImagesFile train-images-idx3-ubyte.gz --trainLabelsFile train-labels-idx1-ubyte.gz --testImagesFile t10k-images-idx3-ubyte.gz --testLabelsFile t10k-labels-idx1-ubyte.gz --learningRate 0.001 --trainingIters 6000
LIBRARY_UID
str
N
fb9752c9-301a-415d-814f-cf658d7b856a
COMPUTE
dict
N
{'name': 'k80', 'nodes': 1}
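The DOCUMENT schema marks several keys as required, which is easy to overlook in a large nested dict. A minimal sketch of a local check, assuming only the keys tagged (required) in the schema matter; `document_problems` and the two constants are hypothetical names, not part of the client:

```python
# Keys tagged "(required)" in the DOCUMENT schema of the table above.
REQUIRED_DOCUMENT_KEYS = ("doc_type", "version", "primary_pipeline", "pipelines")
REQUIRED_PIPELINE_KEYS = ("id", "runtime_ref", "nodes")


def document_problems(document):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = [
        f"missing '{key}'" for key in REQUIRED_DOCUMENT_KEYS if key not in document
    ]
    for index, pipeline in enumerate(document.get("pipelines", [])):
        for key in REQUIRED_PIPELINE_KEYS:
            if key not in pipeline:
                problems.append(f"pipelines[{index}] missing '{key}'")
    return problems


# Trimmed version of the example value from the table.
document = {
    "doc_type": "pipeline",
    "version": "2.0",
    "primary_pipeline": "dlaas_only",
    "pipelines": [{"id": "dlaas_only", "runtime_ref": "hybrid", "nodes": []}],
}
print(document_problems(document))
print(document_problems({"doc_type": "pipeline", "pipelines": [{"id": "x"}]}))
```

This only checks for the presence of keys; it does not validate node contents or value types.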
repository¶
-
class
client.Repository(client)[source]¶ Store and manage your models, functions, spaces, pipelines and experiments using Watson Machine Learning Repository.
Important
To view ModelMetaNames, use:
>>> client.repository.ModelMetaNames.show()
To view ExperimentMetaNames, use:
>>> client.repository.ExperimentMetaNames.show()
To view FunctionMetaNames, use:
>>> client.repository.FunctionMetaNames.show()
To view PipelineMetaNames, use:
>>> client.repository.PipelineMetaNames.show()
To view SpacesMetaNames, use:
>>> client.repository.SpacesMetaNames.show()
To view MemberMetaNames, use:
>>> client.repository.MemberMetaNames.show()
-
clone(artifact_id, space_id=None, action='copy', rev_id=None)[source]¶ Creates a new resource (model, runtime, library, experiment, function or pipeline) identical to the given artifact, either in the same space or in a new space. All dependent assets are cloned too.
Parameters
Important
artifact_id: Guid of the artifact to be cloned
type: str
space_id: Guid of the space to which the artifact needs to be cloned. (optional)
type: str
action: Action specifying “copy” or “move”. (optional)
type: str
rev_id: Revision ID of the artifact. (optional)
type: str
Output
Important
returns: Metadata of the cloned artifact.
return type: dict
- Example
>>> client.repository.clone(artifact_id=artifact_id,space_id=space_uid,action="copy")
Note
If revision id is not specified, all revisions of the artifact are cloned
Default value of the parameter action is copy
Space guid is mandatory for move action
-
create_experiment_revision(experiment_uid)[source]¶ Create a new version for an experiment.
- Parameters
experiment_uid (str) – Unique ID of the experiment.
- Returns
experiment version details
- Return type
dict
Example:
>>> stored_experiment_revision_details = client.repository.create_experiment_revision(experiment_uid)
-
create_function_revision(function_uid)[source]¶ Create a new version for a function.
- Parameters
function_uid (str) – Unique ID of the function.
- Returns
function version details
- Return type
dict
Example:
>>> stored_function_revision_details = client.repository.create_function_revision(function_uid)
-
create_member(space_uid, meta_props)[source]¶ Create a member within a space.
Parameters
Important
meta_props: meta data of the member configuration. To see available meta names use:
>>> client.spaces.ConfigurationMetaNames.get()
type: dict
Output
Important
returns: metadata of the stored member
return type: dict
Note
client.spaces.MemberMetaNames.ROLE can be any one of the following “viewer, editor, admin”
client.spaces.MemberMetaNames.IDENTITY_TYPE can be any one of the following “user,service”
client.spaces.MemberMetaNames.IDENTITY can be either service-ID or IAM-userID
Example
>>> metadata = {
>>>  client.spaces.MemberMetaNames.ROLE: "Admin",
>>>  client.spaces.MemberMetaNames.IDENTITY: "iam-ServiceId-5a216e59-6592-43b9-8669-625d341aca71",
>>>  client.spaces.MemberMetaNames.IDENTITY_TYPE: "service"
>>> }
>>> members_details = client.repository.create_member(space_uid=space_id, meta_props=metadata)
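The allowed values listed in the note can be checked before the member is created. This is a sketch only: `member_meta_problems` is a hypothetical helper, the keys are plain strings standing in for the MemberMetaNames attributes, and comparing case-insensitively is an assumption made because the note lists lowercase values while the example passes "Admin".

```python
ALLOWED_ROLES = ("viewer", "editor", "admin")
ALLOWED_IDENTITY_TYPES = ("user", "service")


def member_meta_problems(meta_props):
    """Return a list of problems found in member meta_props (empty if none)."""
    problems = []
    # Assumption: values are compared case-insensitively.
    role = str(meta_props.get("ROLE", "")).lower()
    if role not in ALLOWED_ROLES:
        problems.append("ROLE must be one of: " + ", ".join(ALLOWED_ROLES))
    identity_type = str(meta_props.get("IDENTITY_TYPE", "")).lower()
    if identity_type not in ALLOWED_IDENTITY_TYPES:
        problems.append(
            "IDENTITY_TYPE must be one of: " + ", ".join(ALLOWED_IDENTITY_TYPES)
        )
    return problems


member_meta = {
    "ROLE": "Admin",
    "IDENTITY": "iam-ServiceId-5a216e59-6592-43b9-8669-625d341aca71",
    "IDENTITY_TYPE": "service",
}
print(member_meta_problems(member_meta))
print(member_meta_problems({"ROLE": "owner", "IDENTITY_TYPE": "service"}))
```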
-
create_model_revision(model_uid)[source]¶ Create a new version for a model.
- Parameters
model_uid (str) – model ID
- Returns
model version details
- Return type
dict
Example:
>>> stored_model_revision_details = client.repository.create_model_revision(model_uid="MODELID")
-
create_pipeline_revision(pipeline_uid)[source]¶ Create a new version for a pipeline.
- Parameters
pipeline_uid (str) – Unique ID of the Pipeline
- Returns
pipeline version details
- Return type
dict
Example:
>>> stored_pipeline_revision_details = client.repository.create_pipeline_revision(pipeline_uid)
-
create_revision(artifact_uid)[source]¶ Create revision for passed artifact_uid.
- Parameters
artifact_uid (str) – unique id of stored model, experiment, function or pipeline
- Returns
artifact new revision metadata
- Return type
dict
Example:
>>> details = client.repository.create_revision(artifact_uid)
-
delete(artifact_uid)[source]¶ Delete model, experiment, pipeline, space, runtime, library or function from repository.
Parameters
Important
artifact_uid: Unique id of stored model, experiment, function, pipeline, space, library or runtime
type: str
Output
Important
returns: status (“SUCCESS” or “FAILED”)
return type: str
Example
>>> client.repository.delete(artifact_uid)
-
download(artifact_uid, filename='downloaded_artifact.tar.gz', rev_uid=None)[source]¶ Downloads configuration file for artifact with specified uid.
Parameters
Important
artifact_uid: Unique Id of model, function, runtime or library
type: str
filename: Name of the file to which the artifact content has to be downloaded
default value: downloaded_artifact.tar.gz
type: str
Output
Important
returns: Path to the downloaded artifact content
return type: str
Note
If filename is not specified, the default filename is “downloaded_artifact.tar.gz”.
Example
>>> client.repository.download(model_uid, 'my_model.tar.gz')
-
get_details(artifact_uid=None)[source]¶ Get metadata of stored artifacts. If artifact_uid is not specified returns all models, experiments, functions, pipelines, spaces, libraries and runtimes metadata.
Parameters
Important
artifact_uid: Unique Id of stored model, experiment, function, pipeline, space, library or runtime (optional)
type: str
Output
Important
returns: stored artifact(s) metadata
return type: dict
dict (if artifact_uid is not None) or {“resources”: [dict]} (if artifact_uid is None)
Note
If artifact_uid is not specified, all models, experiments, functions, pipelines, spaces, libraries and runtimes metadata is fetched
Example
>>> details = client.repository.get_details(artifact_uid)
>>> details = client.repository.get_details()
-
get_experiment_details(experiment_uid=None, limit=None)[source]¶ Get metadata of experiment. If no experiment_uid is specified all experiments metadata is returned.
Parameters
Important
experiment_uid: Unique Id of experiment (optional)
type: str
limit: limit number of fetched records (optional)
type: int
Output
Important
returns: experiment(s) metadata
return type: dict
dict (if experiment_uid is not None) or {“resources”: [dict]} (if experiment_uid is None)
Note
If experiment_uid is not specified, all experiments metadata is fetched
Example
>>> experiment_details = client.repository.get_experiment_details(experiment_uid)
-
static
get_experiment_href(experiment_details)[source]¶ Get href of stored experiment.
Parameters
Important
experiment_details: Metadata of the stored experiment
type: dict
Output
Important
returns: href of stored experiment
return type: str
- Example
>>> experiment_details = client.repository.get_experiment_details(experiment_uid)
>>> experiment_href = client.repository.get_experiment_href(experiment_details)
-
get_experiment_revision_details(experiment_uid, rev_id)[source]¶ Get metadata of experiment revision.
Parameters
Important
experiment_uid: Unique Id of experiment
type: str
rev_id: Unique id of experiment revision
type: str
Output
Important
returns: experiment revision metadata
return type: dict
- Example
>>> experiment_rev_details = client.repository.get_experiment_revision_details(experiment_uid, rev_id)
-
static
get_experiment_uid(experiment_details)[source]¶ Get Unique Id of stored experiment.
Parameters
Important
experiment_details: Metadata of the stored experiment
type: dict
Output
Important
returns: Unique Id of stored experiment
return type: str
- Example
>>> experiment_details = client.repository.get_experiment_details(experiment_uid)
>>> experiment_uid = client.repository.get_experiment_uid(experiment_details)
-
get_function_details(function_uid=None, limit=None)[source]¶ Get metadata of function. If no function_uid is specified all functions metadata is returned.
Parameters
Important
function_uid: Unique Id of function (optional)
type: str
limit: limit number of fetched records (optional)
type: int
Output
Important
returns: function(s) metadata
return type: dict
dict (if function_uid is not None) or {“resources”: [dict]} (if function_uid is None)
Note
If function_uid is not specified, all functions metadata is fetched
Example
>>> function_details = client.repository.get_function_details(function_uid)
>>> function_details = client.repository.get_function_details()
-
static
get_function_href(function_details)[source]¶ Get href of stored function.
Parameters
Important
function_details: Metadata of the stored function
type: dict
Output
Important
returns: href of stored function
return type: str
Example
>>> function_details = client.repository.get_function_details(function_uid)
>>> function_url = client.repository.get_function_href(function_details)
-
get_function_revision_details(function_uid, rev_id)[source]¶ Get metadata of function revision.
Parameters
Important
function_uid: Unique Id of function
type: str
rev_id: Unique Id of function revision
type: str
Output
Important
returns: function revision metadata
return type: dict
Example
>>> function_rev_details = client.repository.get_function_revision_details(function_uid, rev_id)
-
static
get_function_uid(function_details)[source]¶ Get Unique Id of stored function.
Parameters
Important
function_details: Metadata of the stored function
type: dict
Output
Important
returns: Unique Id of stored function
return type: str
Example
>>> function_details = client.repository.get_function_details(function_uid)
>>> function_uid = client.repository.get_function_uid(function_details)
-
static
get_member_href(member_details)[source]¶ Get member_href from member details.
Parameters
Important
member_details: Metadata of the stored member
type: dict
Output
Important
returns: member href
return type: str
Example
>>> member_details = client.repository.get_members_details(space_uid, member_id)
>>> member_href = client.repository.get_member_href(member_details)
-
static
get_member_uid(member_details)[source]¶ Get member_uid from member details.
Parameters
Important
member_details: Metadata of the created member
type: dict
Output
Important
returns: unique id of member
return type: str
Example
>>> member_details = client.repository.get_members_details(space_uid, member_id)
>>> member_id = client.repository.get_member_uid(member_details)
-
get_members_details(space_uid, member_id=None, limit=None)[source]¶ Get metadata of members associated with a space. If member_id is not specified, metadata of all members is returned.
Parameters
Important
space_uid: Unique id of space
type: str
member_id: Unique id of member (optional)
type: str
limit: limit number of fetched records (optional)
type: int
Output
Important
returns: metadata of member(s) of a space
return type: dict (if member_id is not None) or {“resources”: [dict]} (if member_id is None)
Note
If member id is not specified, all members metadata is fetched
Example
>>> member_details = client.repository.get_members_details(space_uid, member_id)
-
get_model_details(model_uid=None, limit=None)[source]¶ Get metadata of stored model. If model_uid is not specified returns all models metadata.
Parameters
Important
model_uid: Unique Id of Model (optional)
type: str
limit: limit number of fetched records (optional)
type: int
Output
Important
returns: metadata of model(s)
return type: dict (if model_uid is not None) or {“resources”: [dict]} (if model_uid is None)
Note
If model_uid is not specified, all models metadata is fetched
Example
>>> model_details = client.repository.get_model_details(model_uid)
>>> models_details = client.repository.get_model_details()
-
static
get_model_href(model_details)[source]¶ Get href of stored model.
Parameters
Important
model_details: Metadata of the stored model
type: dict
Output
Important
returns: href of stored model
return type: str
Example
>>> model_details = client.repository.get_model_details(model_uid)
>>> model_href = client.repository.get_model_href(model_details)
-
get_model_revision_details(model_uid, rev_uid)[source]¶ Get metadata of model revision.
Parameters
Important
model_uid: Unique Id of model
type: str
rev_uid: Unique id of model revision
type: str
Output
Important
returns: model revision metadata
return type: dict
- Example
>>> model_rev_details = client.repository.get_model_revision_details(model_uid, rev_uid)
-
static
get_model_uid(model_details)[source]¶ Get Unique Id of stored model.
Parameters
Important
model_details: Metadata of the stored model
type: dict
Output
Important
returns: Unique Id of stored model
return type: str
Example
>>> model_details = client.repository.get_model_details(model_uid)
>>> model_uid = client.repository.get_model_uid(model_details)
-
get_pipeline_details(pipeline_uid=None, limit=None)[source]¶ Get metadata of stored pipelines. If pipeline_uid is not specified returns all pipelines metadata.
Parameters
Important
pipeline_uid: Unique id of Pipeline (optional)
type: str
limit: limit number of fetched records (optional)
type: int
Output
Important
returns: metadata of pipeline(s)
return type: dict (if pipeline_uid is not None) or {“resources”: [dict]} (if pipeline_uid is None)
Note
If pipeline_uid is not specified, all pipelines metadata is fetched
Example
>>> pipeline_details = client.repository.get_pipeline_details(pipeline_uid)
>>> pipeline_details = client.repository.get_pipeline_details()
-
static
get_pipeline_href(pipeline_details)[source]¶ Get pipeline_href from pipeline details.
Parameters
Important
pipeline_details: Metadata of the stored pipeline
type: dict
Output
Important
returns: pipeline href
return type: str
Example
>>> pipeline_details = client.repository.get_pipeline_details(pipeline_uid)
>>> pipeline_href = client.repository.get_pipeline_href(pipeline_details)
-
get_pipeline_revision_details(pipeline_uid, rev_id)[source]¶ Get metadata of stored pipeline revision.
Parameters
Important
pipeline_uid: Unique id of Pipeline
type: str
rev_id: Unique id of Pipeline revision
type: str
Output
Important
returns: metadata of the pipeline revision
return type: dict
Example
>>> pipeline_rev_details = client.repository.get_pipeline_revision_details(pipeline_uid, rev_id)
-
static
get_pipeline_uid(pipeline_details)[source]¶ Get pipeline_uid from pipeline details.
Parameters
Important
pipeline_details: Metadata of the stored pipeline
type: dict
Output
Important
returns: Unique Id of pipeline
return type: str
Example
>>> pipeline_details = client.repository.get_pipeline_details(pipeline_uid)
>>> pipeline_uid = client.repository.get_pipeline_uid(pipeline_details)
-
get_space_details(space_uid=None, limit=None)[source]¶ Get metadata of stored space. If space_uid is not specified returns all spaces metadata.
Parameters
Important
space_uid: Unique id of Space (optional)
type: str
limit: limit number of fetched records (optional)
type: int
Output
Important
returns: metadata of space(s)
return type: dict (if space_uid is not None) or {“resources”: [dict]} (if space_uid is None)
Note
If space_uid is not specified, all spaces metadata is fetched
Example
>>> space_details = client.repository.get_space_details(space_uid)
>>> space_details = client.repository.get_space_details()
-
static
get_space_href(space_details)[source]¶ Get space_href from space details.
Parameters
Important
space_details: Metadata of the stored space
type: dict
Output
Important
returns: space href
return type: str
Example
>>> space_details = client.repository.get_space_details(space_uid)
>>> space_href = client.repository.get_space_href(space_details)
-
static
get_space_uid(space_details)[source]¶ Get space_uid from space details.
Parameters
Important
space_details: Metadata of the stored space
type: dict
Output
Important
returns: Unique Id of space
return type: str
Example
>>> space_details = client.repository.get_space_details(space_uid)
>>> space_uid = client.repository.get_space_uid(space_details)
-
list()[source]¶ List stored models, pipelines, runtimes, libraries, functions, spaces and experiments. If limit is None, only the first 50 records are shown.
Parameters
Important
limit: limit number of fetched records
type: int
Output
Important
This method only prints the list of all models, pipelines, runtimes, libraries, functions, spaces and experiments in a table format.
return type: None
Example
>>> client.repository.list()
-
list_experiments(limit=None)[source]¶ List stored experiments. If limit is None, only the first 50 records are shown.
Parameters
Important
limit: limit number of fetched records
type: int
Output
Important
This method only prints the list of all experiments in a table format.
return type: None
Example
>>> client.repository.list_experiments()
-
list_experiments_revisions(experiment_uid, limit=None)[source]¶ List stored experiment revisions. If limit is None, only the first 50 records are shown.
Parameters
Important
experiment_uid: Unique Id of the experiment
type: str
Important
limit: limit number of fetched records
type: int
Output
Important
This method only prints the list of all revisions of given experiment ID in a table format.
return type: None
Example
>>> client.repository.list_experiments_revisions(experiment_uid)
-
list_functions(limit=None)[source]¶ List stored functions. If limit is None, only the first 50 records are shown.
Parameters
Important
limit: limit number of fetched records
type: int
Output
Important
This method only prints the list of all functions in a table format.
return type: None
Example
>>> client.repository.list_functions()
-
list_functions_revisions(function_uid, limit=None)[source]¶ List stored function revisions. If limit is None, only the first 50 records are shown.
Parameters
Important
function_uid: Unique Id of the function
type: str
Important
limit: limit number of fetched records
type: int
Output
Important
This method only prints the list of all revisions of given function ID in a table format.
return type: None
Example
>>> client.repository.list_functions_revisions(function_uid)
-
list_members(space_uid, limit=None)[source]¶ List stored members of a space. If limit is None, only the first 50 records are shown.
Parameters
Important
limit: limit number of fetched records
type: int
Output
Important
This method only prints the list of all members associated with a space in a table format.
return type: None
Example
>>> client.repository.list_members(space_uid)
-
list_models(limit=None)[source]¶ List stored models. If limit is None, only the first 50 records are shown.
Parameters
Important
limit: limit number of fetched records
type: int
Output
Important
This method only prints the list of all models in a table format.
return type: None
Example
>>> client.repository.list_models()
-
list_models_revisions(model_uid, limit=None)[source]¶ List stored model revisions. If limit is None, only the first 50 records are shown.
Parameters
Important
model_uid: Unique Id of the model
type: str
Important
limit: limit number of fetched records
type: int
Output
Important
This method only prints the list of all revisions of given model ID in a table format.
return type: None
Example
>>> client.repository.list_models_revisions(model_uid)
-
list_pipelines(limit=None)[source]¶ List stored pipelines. If limit is None, only the first 50 records are shown.
Parameters
Important
limit: limit number of fetched records
type: int
Output
Important
This method only prints the list of all pipelines in a table format.
return type: None
Example
>>> client.repository.list_pipelines()
-
list_pipelines_revisions(pipeline_uid, limit=None)[source]¶ List stored pipeline revisions. If limit is None, only the first 50 records are shown.
Parameters
Important
pipeline_uid: Unique Id of the pipeline
type: str
Important
limit: limit number of fetched records
type: int
Output
Important
This method only prints the list of all revisions of given pipeline ID in a table format.
return type: None
Example
>>> client.repository.list_pipelines_revisions(pipeline_uid)
-
list_spaces(limit=None)[source]¶ List stored spaces. If limit is None, only the first 50 records are shown.
Parameters
Important
limit: limit number of fetched records
type: int
Output
Important
This method only prints the list of all spaces in a table format.
return type: None
Example
>>> client.repository.list_spaces()
-
load(artifact_uid)[source]¶ Load model from repository to object in local environment.
Parameters
Important
artifact_uid: Unique Id of model
type: str
Output
Important
returns: model object
return type: object
Example
>>> model_obj = client.repository.load(model_uid)
-
store_experiment(meta_props)[source]¶ Create an experiment.
Parameters
Important
meta_props: meta data of the experiment configuration. To see available meta names use:
>>> client.experiments.ConfigurationMetaNames.get()
type: dict
Output
Important
returns: Metadata of the experiment created
return type: dict
Example
>>> metadata = {
>>>  client.experiments.ConfigurationMetaNames.NAME: 'my_experiment',
>>>  client.experiments.ConfigurationMetaNames.EVALUATION_METRICS: ['accuracy'],
>>>  client.experiments.ConfigurationMetaNames.TRAINING_REFERENCES: [
>>>      {
>>>        'pipeline': {'href': pipeline_href_1}
>>>      },
>>>      {
>>>        'pipeline': {'href': pipeline_href_2}
>>>      },
>>>   ]
>>> }
>>> experiment_details = client.repository.store_experiment(meta_props=metadata)
>>> experiment_href = client.repository.get_experiment_href(experiment_details)
-
store_function(function, meta_props)[source]¶ Create a function.
Parameters
Important
meta_props: meta data or name of the function. To see available meta names use:
>>> client.functions.ConfigurationMetaNames.get()
type: dict
function: path to file with archived function content or function (as described below)
One of the following may be used as the ‘function’:
filepath to gz file
‘score’ function reference, where the function is the function which will be deployed
generator function, which takes no argument or arguments which all have primitive python default values and as result return ‘score’ function
type: str or function
Output
Important
returns: Metadata of the function created.
return type: dict
Example
The simplest use is (using score function):
>>> def score(payload):
>>>     values = [[row[0]*row[1]] for row in payload['values']]
>>>     return {'fields': ['multiplication'], 'values': values}
>>> stored_function_details = client.functions.store(score, name)
Another, more interesting example uses a generator function. In this situation it is possible to pass some variables:
>>> wml_creds = {...}
>>> def gen_function(wml_credentials=wml_creds, x=2):
        def f(payload):
            values = [[row[0]*row[1]*x] for row in payload['values']]
            return {'fields': ['multiplication'], 'values': values}
        return f
>>> stored_function_details = client.functions.store(gen_function, name)
In more complicated cases you should create proper metadata, similar to this one:
>>> metadata = { >>> client.repository.FunctionMetaNames.NAME: "function", >>> client.repository.FunctionMetaNames.DESCRIPTION: "This is ai function", >>> client.repository.FunctionMetaNames.RUNTIME_UID: "53dc4cf1-252f-424b-b52d-5cdd9814987f", >>> client.repository.FunctionMetaNames.INPUT_DATA_SCHEMAS: [{"fields": [{"metadata": {}, "type": "string", "name": "GENDER", "nullable": True}]}], >>> client.repository.FunctionMetaNames.OUTPUT_DATA_SCHEMAS:[{"fields": [{"metadata": {}, "type": "string", "name": "GENDER", "nullable": True}]}], >>> client.repository.FunctionMetaNames.TAGS: [{"value": "ProjectA", "description": "Functions created for ProjectA"}] >>> } >>> stored_function_details = client.repository.store_function(score, metadata)
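Before storing a function, it can help to call it locally with a payload of the expected shape. The following is a minimal sketch that assumes the payload layout used in the examples above (a dict with a 'values' list of rows). It does not contact the service, and make_scorer is an illustrative name, not part of the client.

```python
def score(payload):
    """Multiply the first two entries of every row in the payload."""
    values = [[row[0] * row[1]] for row in payload['values']]
    return {'fields': ['multiplication'], 'values': values}


def make_scorer(x=2):
    """Generator function: returns a score function closed over x.

    x has a primitive default value, as required for generator functions.
    """
    def f(payload):
        values = [[row[0] * row[1] * x] for row in payload['values']]
        return {'fields': ['multiplication'], 'values': values}
    return f


# Hypothetical sample payload used only for this local check.
sample_payload = {'values': [[2, 3], [4, 5]]}
plain_result = score(sample_payload)
scaled_result = make_scorer(x=10)(sample_payload)
print(plain_result)
print(scaled_result)
```

Checking both variants locally makes it easier to tell a bug in the function body apart from a problem with storing or deploying it.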
-
store_model(model, meta_props=None, training_data=None, training_target=None, pipeline=None, feature_names=None, label_column_names=None, subtrainingId=None)[source]¶ Create a model.
Parameters
Important
model:
Can be one of the following:
The trained model object:
scikit-learn
xgboost
spark (PipelineModel)
path to saved model in format:
keras (.tgz)
pmml (.xml)
scikit-learn (.tar.gz)
tensorflow (.tar.gz)
spss (.str)
directory containing model file(s):
scikit-learn
xgboost
tensorflow
unique id of trained model
training_data: Spark DataFrame supported for spark models. Pandas dataframe, numpy.ndarray or array supported for scikit-learn models
type: spark dataframe, pandas dataframe, numpy.ndarray or array
meta_props: meta data of the models configuration. To see available meta names use:
>>> client.repository.ModelMetaNames.get()
type: dict
training_target: array with labels required for scikit-learn models
type: array
pipeline: pipeline required for spark mllib models
type: object
feature_names: Feature names for the training data in case of Scikit-Learn/XGBoost models. This is applicable only in the case where the training data is not of type - pandas.DataFrame.
type: numpy.ndarray or list
label_column_names: Label column names of the trained Scikit-Learn/XGBoost models.
type: numpy.ndarray or list
subtrainingId: The subtraining ID for a training created via an experiment.
type: str
Output
Important
returns: Metadata of the model created
return type: dict
Note
For a keras model, model content is expected to contain a .h5 file and an archived version of it.
For deploying a keras model, it is mandatory to pass the FRAMEWORK_LIBRARIES along with other metaprops.
>>> client.repository.ModelMetaNames.FRAMEWORK_LIBRARIES: [{'name':'keras', 'version': '2.1.3'}]
feature_names is an optional argument containing the feature names for the training data in case of Scikit-Learn/XGBoost models. Valid types are numpy.ndarray and list. This is applicable only in the case where the training data is not of type - pandas.DataFrame.
If the training data is of type pandas.DataFrame and feature_names are provided, feature_names are ignored.
The input data schema value can be a single dictionary (deprecated; use a list even for a single schema) or a list. To provide multiple schemas, pass them as dictionaries inside a list.
Example
>>> stored_model_details = client.repository.store_model(model, name)
In more complicated cases you should create proper metadata, similar to this one:
>>> metadata = {
>>>     client.repository.ModelMetaNames.NAME: 'customer satisfaction prediction model',
>>>     client.repository.ModelMetaNames.TYPE: 'tensorflow_1.5',
>>>     client.repository.ModelMetaNames.RUNTIME_UID: 'tensorflow_1.5-py3'
>>> }
If you want to provide the input data schema of the model, you can pass it as part of the meta props:
>>> metadata = {
>>>     client.repository.ModelMetaNames.NAME: 'customer satisfaction prediction model',
>>>     client.repository.ModelMetaNames.RUNTIME_UID: 'spss-modeler_18.1',
>>>     client.repository.ModelMetaNames.TYPE: 'spss-modeler_18.1',
>>>     client.repository.ModelMetaNames.INPUT_DATA_SCHEMA: [
>>>         {'id': 'test',
>>>          'type': 'list',
>>>          'fields': [{'name': 'age', 'type': 'float'},
>>>                     {'name': 'sex', 'type': 'float'},
>>>                     {'name': 'fbs', 'type': 'float'},
>>>                     {'name': 'restbp', 'type': 'float'}]},
>>>         {'id': 'test2',
>>>          'type': 'list',
>>>          'fields': [{'name': 'age', 'type': 'float'},
>>>                     {'name': 'sex', 'type': 'float'},
>>>                     {'name': 'fbs', 'type': 'float'},
>>>                     {'name': 'restbp', 'type': 'float'}]}
>>>     ]
>>> }
Example with a local tar.gz file containing the model:
>>> stored_model_details = client.repository.store_model(path_to_tar_gz, meta_props=metadata, training_data=None)
Example with a local directory containing the model file(s):
>>> stored_model_details = client.repository.store_model(path_to_model_directory, meta_props=metadata, training_data=None)
Example with a trained model guid:
>>> stored_model_details = client.repository.store_model(trained_model_guid, meta_props=metadata, training_data=None)
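Because a single-dictionary input data schema is described above as deprecated in favour of a list, a small helper can normalise whatever you have into list form before it goes into the meta props. This is only a sketch: as_schema_list and make_schema are hypothetical helper names, and the 'id', 'type' and 'fields' keys are taken from the example above.

```python
def make_schema(schema_id, field_types):
    """Build one schema dict from a {field name: type name} mapping."""
    fields = [{'name': name, 'type': type_name}
              for name, type_name in field_types.items()]
    return {'id': schema_id, 'type': 'list', 'fields': fields}


def as_schema_list(schema):
    """Return the schema(s) as a list of dicts.

    A single dict is wrapped in a list; a list or tuple is copied into a list.
    """
    if isinstance(schema, dict):
        return [schema]
    if isinstance(schema, (list, tuple)):
        return list(schema)
    raise TypeError("schema must be a dict or a list of dicts")


single = make_schema('test', {'age': 'float', 'sex': 'float'})
schemas = as_schema_list(single)
print(schemas)
```

The resulting list is what would be assigned to the INPUT_DATA_SCHEMA meta name.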
-
store_pipeline(meta_props)[source]¶ Create a pipeline.
Parameters
Important
meta_props: meta data of the pipeline configuration. To see available meta names use:
>>> client.pipelines.ConfigurationMetaNames.get()
type: dict
Output
Important
returns: Metadata of the pipeline created
return type: dict
Example
>>> metadata = {
>>>     client.pipelines.ConfigurationMetaNames.NAME: 'my_training_definition',
>>>     client.pipelines.ConfigurationMetaNames.DOCUMENT: {"doc_type":"pipeline","version": "2.0","primary_pipeline": "dlaas_only","pipelines": [{"id": "dlaas_only","runtime_ref": "hybrid","nodes": [{"id": "training","type": "model_node","op": "dl_train","runtime_ref": "DL","inputs": [],"outputs": [],"parameters": {"name": "tf-mnist","description": "Simple MNIST model implemented in TF","command": "python3 convolutional_network.py --trainImagesFile ${DATA_DIR}/train-images-idx3-ubyte.gz --trainLabelsFile ${DATA_DIR}/train-labels-idx1-ubyte.gz --testImagesFile ${DATA_DIR}/t10k-images-idx3-ubyte.gz --testLabelsFile ${DATA_DIR}/t10k-labels-idx1-ubyte.gz --learningRate 0.001 --trainingIters 6000","compute": {"name": "k80","nodes": 1},"training_lib_href":"/v4/libraries/64758251-bt01-4aa5-a7ay-72639e2ff4d2/content"},"target_bucket": "wml-dev-results"}]}]}
>>> }
>>> pipeline_details = client.repository.store_pipeline(meta_props=metadata)
>>> pipeline_href = client.repository.get_pipeline_href(pipeline_details)
-
store_space(meta_props)[source]¶ Create a space.
Parameters
Important
meta_props: meta data of the space configuration. To see available meta names use:
>>> client.spaces.ConfigurationMetaNames.get()
type: dict
Output
Important
returns: Metadata of the space created
return type: dict
Example
>>> metadata = {
>>>     client.spaces.ConfigurationMetaNames.NAME: 'my_space'
>>> }
>>> space_details = client.repository.store_space(meta_props=metadata)
>>> space_href = client.repository.get_space_href(space_details)
-
update_experiment(experiment_uid, changes)[source]¶ Updates existing experiment metadata.
Parameters
Important
experiment_uid: Unique Id of the experiment whose definition should be updated
type: str
changes: elements which should be changed, where keys are ConfigurationMetaNames
type: dict
Output
Important
returns: metadata of updated experiment
return type: dict
Example
>>> metadata = {
>>>     client.repository.ExperimentMetaNames.NAME: "updated_exp"
>>> }
>>> exp_details = client.repository.update_experiment(experiment_uid, changes=metadata)
-
update_function(function_uid, changes, update_function=None)[source]¶ Updates existing function metadata.
Parameters
Important
function_uid: Unique Id of the function whose definition should be updated
type: str
changes: elements which should be changed, where keys are ConfigurationMetaNames
type: dict
update_function: path to a file with archived function content, or a function, that should replace the content stored for the given function_uid. This parameter is valid only for CP4D 3.0.0.
type: str or function
Output
Important
returns: metadata of updated function
return type: dict
Example
>>> metadata = {
>>>     client.repository.FunctionMetaNames.NAME: "updated_function"
>>> }
>>> function_details = client.repository.update_function(function_uid, changes=metadata)
-
update_model(model_uid, updated_meta_props=None, update_model=None)[source]¶ Updates existing model metadata.
Parameters
Important
model_uid: Unique Id of the model whose definition should be updated
type: str
updated_meta_props: elements which should be changed, where keys are ConfigurationMetaNames
type: dict
update_model: archived model content file, or path to a directory containing the archived model file, that should replace the content stored for the given model_uid. This parameter is valid only for CP4D 3.0.0.
type: object or archived model content file
Output
Important
returns: metadata of updated model
return type: dict
Example 1
>>> metadata = {
>>>     client.repository.ModelMetaNames.NAME: "updated_model"
>>> }
>>> model_details = client.repository.update_model(model_uid, updated_meta_props=metadata)
Example 2
>>> metadata = {
>>>     client.repository.ModelMetaNames.NAME: "updated_model"
>>> }
>>> model_details = client.repository.update_model(model_uid, updated_meta_props=metadata, update_model="newmodel_content.tar.gz")
-
update_pipeline(pipeline_uid, changes)[source]¶ Updates existing pipeline metadata.
Parameters
Important
pipeline_uid: Unique Id of the pipeline whose definition should be updated
type: str
changes: elements which should be changed, where keys are ConfigurationMetaNames
type: dict
Output
Important
returns: metadata of updated pipeline
return type: dict
Example
>>> metadata = {
>>>     client.repository.PipelineMetanames.NAME: "updated_pipeline"
>>> }
>>> pipeline_details = client.repository.update_pipeline(pipeline_uid, changes=metadata)
-
update_space(space_uid, changes)[source]¶ Updates existing space metadata.
Parameters
Important
space_uid: Unique Id of the space whose definition should be updated
type: str
changes: elements which should be changed, where keys are ConfigurationMetaNames
type: dict
Output
Important
returns: metadata of updated space
return type: dict
Example
>>> metadata = {
>>>     client.repository.SpacesMetaNames.NAME: "updated_space"
>>> }
>>> space_details = client.repository.update_space(space_uid, changes=metadata)
-
class
metanames.ModelMetaNames[source]¶ Set of MetaNames for models.
Available MetaNames:
| MetaName | Type | Required | Example value | Schema |
|---|---|---|---|---|
| NAME | str | Y | my_model | |
| DESCRIPTION | str | N | my_description | |
| INPUT_DATA_SCHEMA | list | N | {'id': '1', 'type': 'struct', 'fields': [{'name': 'x', 'type': 'double', 'nullable': False, 'metadata': {}}, {'name': 'y', 'type': 'double', 'nullable': False, 'metadata': {}}]} | {'id(required)': 'string', 'fields(required)': [{'name(required)': 'string', 'type(required)': 'string', 'nullable(optional)': 'string'}]} |
| TRAINING_DATA_REFERENCES | list | N | [] | [{'name(optional)': 'string', 'type(required)': 'string', 'connection(required)': {'endpoint_url(required)': 'string', 'access_key_id(required)': 'string', 'secret_access_key(required)': 'string'}, 'location(required)': {'bucket': 'string', 'path': 'string'}, 'schema(optional)': {'id(required)': 'string', 'fields(required)': [{'name(required)': 'string', 'type(required)': 'string', 'nullable(optional)': 'string'}]}}] |
| OUTPUT_DATA_SCHEMA | dict | N | {'id': '1', 'type': 'struct', 'fields': [{'name': 'x', 'type': 'double', 'nullable': False, 'metadata': {}}, {'name': 'y', 'type': 'double', 'nullable': False, 'metadata': {}}]} | {'id(required)': 'string', 'fields(required)': [{'name(required)': 'string', 'type(required)': 'string', 'nullable(optional)': 'string'}]} |
| LABEL_FIELD | str | N | PRODUCT_LINE | |
| TRANSFORMED_LABEL_FIELD | str | N | PRODUCT_LINE_IX | |
| TAGS | list | N | [{'value': 'string', 'description': 'string'}] | [{'value(required)': 'string', 'description(optional)': 'string'}] |
| SIZE | dict | N | {'in_memory': 0, 'content': 0} | {'in_memory(optional)': 'string', 'content(optional)': 'string'} |
| SPACE_UID | str | N | 53628d69-ced9-4f43-a8cd-9954344039a8 | |
| PIPELINE_UID | str | N | 53628d69-ced9-4f43-a8cd-9954344039a8 | |
| RUNTIME_UID | str | N | 53628d69-ced9-4f43-a8cd-9954344039a8 | |
| TYPE | str | Y | mllib_2.1 | |
| CUSTOM | dict | N | {} | |
| DOMAIN | str | N | Watson Machine Learning | |
| HYPER_PARAMETERS | dict | N | | |
| METRICS | list | N | | |
| IMPORT | dict | N | {'connection': {'endpoint_url': 'https://s3-api.us-geo.objectstorage.softlayer.net', 'access_key_id': '***', 'secret_access_key': '***'}, 'location': {'bucket': 'train-data', 'path': 'training_path'}, 'type': 's3'} | {'name(optional)': 'string', 'type(required)': 'string', 'connection(required)': {'endpoint_url(required)': 'string', 'access_key_id(required)': 'string', 'secret_access_key(required)': 'string'}, 'location(required)': {'bucket': 'string', 'path': 'string'}} |
| TRAINING_LIB_UID | str | N | 53628d69-ced9-4f43-a8cd-9954344039a8 | |
| MODEL_DEFINITION_UID | str | N | 53628d6_cdee13-35d3-s8989343 | |
| SOFTWARE_SPEC_UID | str | N | 53628d69-ced9-4f43-a8cd-9954344039a8 | |
| TF_MODEL_PARAMS | dict | N | {'save_format': 'None', 'signatures': 'struct', 'options': 'None', 'custom_objects': 'string'} | |
-
class
metanames.ExperimentMetaNames[source]¶ Set of MetaNames for experiments.
Available MetaNames:
| MetaName | Type | Required | Example value | Schema |
|---|---|---|---|---|
| NAME | str | Y | Hand-written Digit Recognition | |
| DESCRIPTION | str | N | Hand-written Digit Recognition training | |
| TAGS | list | N | [{'value': 'dsx-project.<project-guid>', 'description': 'DSX project guid'}] | [{'value(required)': 'string', 'description(optional)': 'string'}] |
| EVALUATION_METHOD | str | N | multiclass | |
| EVALUATION_METRICS | list | N | [{'name': 'accuracy', 'maximize': False}] | [{'name(required)': 'string', 'maximize(optional)': 'boolean'}] |
| TRAINING_REFERENCES | list | Y | [{'pipeline': {'href': '/v4/pipelines/6d758251-bb01-4aa5-a7a3-72339e2ff4d8'}}] | [{'pipeline(optional)': {'href(required)': 'string', 'data_bindings(optional)': [{'data_reference(required)': 'string', 'node_id(required)': 'string'}], 'nodes_parameters(optional)': [{'node_id(required)': 'string', 'parameters(required)': 'dict'}]}, 'training_lib(optional)': {'href(required)': 'string', 'compute(optional)': {'name(required)': 'string', 'nodes(optional)': 'number'}, 'runtime(optional)': {'href(required)': 'string'}, 'command(optional)': 'string', 'parameters(optional)': 'dict'}}] |
| SPACE_UID | str | N | 3c1ce536-20dc-426e-aac7-7284cf3befc6 | |
| LABEL_COLUMN | str | N | label | |
| CUSTOM | dict | N | {'field1': 'value1'} | |
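The TRAINING_REFERENCES example value above is a list with one dict per pipeline, each holding the pipeline href. As a sketch under that assumption, a list of hrefs can be turned into that shape with a small helper; training_references is a hypothetical name and is not part of the client.

```python
def training_references(pipeline_hrefs):
    """Build one {'pipeline': {'href': ...}} entry per pipeline href."""
    refs = []
    for href in pipeline_hrefs:
        if not href:
            # 'href' is marked as required in the schema, so reject empty values.
            raise ValueError("pipeline href must be a non-empty string")
        refs.append({'pipeline': {'href': href}})
    return refs


refs = training_references(['/v4/pipelines/6d758251-bb01-4aa5-a7a3-72339e2ff4d8'])
print(refs)
```

The returned list is what would be assigned to the TRAINING_REFERENCES meta name when storing an experiment.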
-
class
metanames.FunctionMetaNames[source]¶ Set of MetaNames for AI functions.
Available MetaNames:
| MetaName | Type | Required | Example value | Schema |
|---|---|---|---|---|
| NAME | str | Y | ai_function | |
| DESCRIPTION | str | N | This is ai function | |
| RUNTIME_UID | str | N | 53628d69-ced9-4f43-a8cd-9954344039a8 | |
| SOFTWARE_SPEC_UID | str | N | 53628d69-ced9-4f43-a8cd-9954344039a8 | |
| INPUT_DATA_SCHEMAS | list | N | [{'id': '1', 'type': 'struct', 'fields': [{'name': 'x', 'type': 'double', 'nullable': False, 'metadata': {}}, {'name': 'y', 'type': 'double', 'nullable': False, 'metadata': {}}]}] | [{'id(required)': 'string', 'fields(required)': [{'name(required)': 'string', 'type(required)': 'string', 'nullable(optional)': 'string'}]}] |
| OUTPUT_DATA_SCHEMAS | list | N | [{'id': '1', 'type': 'struct', 'fields': [{'name': 'multiplication', 'type': 'double', 'nullable': False, 'metadata': {}}]}] | [{'id(required)': 'string', 'fields(required)': [{'name(required)': 'string', 'type(required)': 'string', 'nullable(optional)': 'string'}]}] |
| TAGS | list | N | [{'value': 'ProjectA', 'description': 'Functions created for ProjectA'}] | [{'value(required)': 'string', 'description(optional)': 'string'}] |
| TYPE | str | N | python | |
| CUSTOM | dict | N | {} | |
| SAMPLE_SCORING_INPUT | list | N | {'input_data': [{'fields': ['name', 'age', 'occupation'], 'values': [['john', 23, 'student'], ['paul', 33, 'engineer']]}]} | {'id(optional)': 'string', 'fields(optional)': 'array', 'values(optional)': 'array'} |
| SPACE_UID | str | N | 3628d69-ced9-4f43-a8cd-9954344039a8 | |
-
class
metanames.PipelineMetanames[source]¶ Set of MetaNames for pipelines.
Available MetaNames:
| MetaName | Type | Required | Example value | Schema |
|---|---|---|---|---|
| NAME | str | Y | Hand-written Digit Recognition | |
| DESCRIPTION | str | N | Hand-written Digit Recognition training | |
| SPACE_UID | str | N | 3c1ce536-20dc-426e-aac7-7284cf3befc6 | |
| TAGS | list | N | [{'value': 'dsx-project.<project-guid>', 'description': 'DSX project guid'}] | [{'value(required)': 'string', 'description(optional)': 'string'}] |
| DOCUMENT | dict | N | {'doc_type': 'pipeline', 'version': '2.0', 'primary_pipeline': 'dlaas_only', 'pipelines': [{'id': 'dlaas_only', 'runtime_ref': 'hybrid', 'nodes': [{'id': 'training', 'type': 'model_node', 'op': 'dl_train', 'runtime_ref': 'DL', 'inputs': [], 'outputs': [], 'parameters': {'name': 'tf-mnist', 'description': 'Simple MNIST model implemented in TF', 'command': 'python3 convolutional_network.py --trainImagesFile ${DATA_DIR}/train-images-idx3-ubyte.gz --trainLabelsFile ${DATA_DIR}/train-labels-idx1-ubyte.gz --testImagesFile ${DATA_DIR}/t10k-images-idx3-ubyte.gz --testLabelsFile ${DATA_DIR}/t10k-labels-idx1-ubyte.gz --learningRate 0.001 --trainingIters 6000', 'compute': {'name': 'k80', 'nodes': 1}, 'training_lib_href': '/v4/libraries/64758251-bt01-4aa5-a7ay-72639e2ff4d2/content'}, 'target_bucket': 'wml-dev-results'}]}]} | {'doc_type(required)': 'string', 'version(required)': 'string', 'primary_pipeline(required)': 'string', 'pipelines(required)': [{'id(required)': 'string', 'runtime_ref(required)': 'string', 'nodes(required)': [{'id': 'string', 'type': 'string', 'inputs': 'list', 'outputs': 'list', 'parameters': {'training_lib_href': 'string'}}]}]} |
| CUSTOM | dict | N | {'field1': 'value1'} | |
| IMPORT | dict | N | {'connection': {'endpoint_url': 'https://s3-api.us-geo.objectstorage.softlayer.net', 'access_key_id': '***', 'secret_access_key': '***'}, 'location': {'bucket': 'train-data', 'path': 'training_path'}, 'type': 's3'} | {'name(optional)': 'string', 'type(required)': 'string', 'connection(required)': {'endpoint_url(required)': 'string', 'access_key_id(required)': 'string', 'secret_access_key(required)': 'string'}, 'location(required)': {'bucket': 'string', 'path': 'string'}} |
| RUNTIMES | list | N | [{'id': 'id', 'name': 'tensorflow', 'version': '1.13-py3'}] | |
| COMMAND | str | N | convolutional_network.py --trainImagesFile train-images-idx3-ubyte.gz --trainLabelsFile train-labels-idx1-ubyte.gz --testImagesFile t10k-images-idx3-ubyte.gz --testLabelsFile t10k-labels-idx1-ubyte.gz --learningRate 0.001 --trainingIters 6000 | |
| LIBRARY_UID | str | N | fb9752c9-301a-415d-814f-cf658d7b856a | |
| COMPUTE | dict | N | {'name': 'k80', 'nodes': 1} | |
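The DOCUMENT schema above marks doc_type, version, primary_pipeline and pipelines as required, and id, runtime_ref and nodes as required for each pipeline entry. The following sketch checks a document against just those required keys before it is stored. missing_document_keys is a hypothetical helper, and the check covers only key presence, not value types.

```python
REQUIRED_TOP_LEVEL = ('doc_type', 'version', 'primary_pipeline', 'pipelines')
REQUIRED_PER_PIPELINE = ('id', 'runtime_ref', 'nodes')


def missing_document_keys(document):
    """Return the required keys that are absent from a pipeline document."""
    missing = [key for key in REQUIRED_TOP_LEVEL if key not in document]
    for index, pipeline in enumerate(document.get('pipelines', [])):
        for key in REQUIRED_PER_PIPELINE:
            if key not in pipeline:
                missing.append('pipelines[%d].%s' % (index, key))
    return missing


# Shortened, hypothetical documents used only to exercise the check.
complete = {
    'doc_type': 'pipeline',
    'version': '2.0',
    'primary_pipeline': 'dlaas_only',
    'pipelines': [{'id': 'dlaas_only', 'runtime_ref': 'hybrid', 'nodes': []}],
}
incomplete = {
    'doc_type': 'pipeline',
    'primary_pipeline': 'dlaas_only',
    'pipelines': [{'id': 'dlaas_only', 'runtime_ref': 'hybrid'}],
}
complete_result = missing_document_keys(complete)
incomplete_result = missing_document_keys(incomplete)
print(complete_result)
print(incomplete_result)
```

Returning a list of missing keys rather than raising on the first one lets all problems in a hand-written document be reported at once.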
-
class
metanames.SpacesMetaNames[source]¶ Set of MetaNames for Spaces Specs.
Available MetaNames:
| MetaName | Type | Required | Example value | Schema |
|---|---|---|---|---|
| NAME | str | Y | my_space | |
| TAGS | list | N | [{'value': 'dsx-project.<project-guid>', 'description': 'DSX project guid'}] | [{'value(required)': 'string', 'description(optional)': 'string'}] |
| CUSTOM | dict | N | {"field1":"value1"} | |
| DESCRIPTION | str | N | my_description | |
| ONLINE_DEPLOYMENTS | list | N | [{}] | [{'name(optional)': 'string', 'description(optional)': 'string', 'guid(optional)': 'string', 'compute(optional)': {'name(required)': 'string', 'nodes(optional)': 'number'}}] |
| SCHEDULES | list | N | [{}] | [{'cron(optional)': 'string', 'assets(optional)': [{'name(optional)': 'string', 'description(optional)': 'string', 'guid(optional)': 'string', 'compute(optional)': {'name(required)': 'string', 'nodes(optional)': 'number'}}]}] |
runtimes¶
-
class
client.Runtimes(client)[source]¶ Creates Runtime Specs and associated Custom Libraries.
Note
A list of pre-defined runtimes is available. To see it, use:
>>> client.runtimes.list(pre_defined=True)
-
clone_library(library_uid, space_id=None, action='copy', rev_id=None)[source]¶ Creates a new library identical to the given library, either in the same space or in a new space. All dependent assets will be cloned too.
Parameters
Important
library_uid: Guid of the library to be cloned
type: str
space_id: Guid of the space to which the library needs to be cloned. (optional)
type: str
action: Action specifying “copy” or “move”. (optional)
type: str
rev_id: Revision ID of the library. (optional)
type: str
Output
Important
returns: Metadata of the library cloned.
return type: dict
Example
>>> client.runtimes.clone_library(library_uid=artifact_id, space_id=space_uid, action="copy")
Note
If revision id is not specified, all revisions of the artifact are cloned
Default value of the parameter action is copy
Space guid is mandatory for move action
-
clone_runtime(runtime_uid, space_id=None, action='copy', rev_id=None)[source]¶ Creates a new runtime identical to the given runtime, either in the same space or in a new space. All dependent assets will be cloned too.
Parameters
Important
runtime_uid: Guid of the runtime to be cloned
type: str
space_id: Guid of the space to which the runtime needs to be cloned. (optional)
type: str
action: Action specifying “copy” or “move”. (optional)
type: str
rev_id: Revision ID of the runtime. (optional)
type: str
Output
Important
returns: Metadata of the runtime cloned.
return type: dict
Example
>>> client.runtimes.clone_runtime(runtime_uid=artifact_id, space_id=space_uid, action="copy")
Note
If revision id is not specified, all revisions of the artifact are cloned
Default value of the parameter action is copy
Space guid is mandatory for move action
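The notes for clone_library and clone_runtime describe three rules: the action defaults to copy, a space guid is mandatory for move, and the revision id is optional. The sketch below encodes those rules as a local pre-check. clone_arguments is a hypothetical helper that only builds the keyword arguments; it does not call the service.

```python
def clone_arguments(space_id=None, action='copy', rev_id=None):
    """Validate clone options and return them as keyword arguments."""
    if action not in ('copy', 'move'):
        raise ValueError("action must be 'copy' or 'move'")
    if action == 'move' and space_id is None:
        raise ValueError("space_id is mandatory for the move action")
    kwargs = {'action': action}
    if space_id is not None:
        kwargs['space_id'] = space_id
    if rev_id is not None:
        # Omitting rev_id means all revisions of the artifact are cloned.
        kwargs['rev_id'] = rev_id
    return kwargs


default_kwargs = clone_arguments()
move_kwargs = clone_arguments(space_id='my-space-guid', action='move')
print(default_kwargs)
print(move_kwargs)
```

The returned dict could then be unpacked into a call such as client.runtimes.clone_runtime(runtime_uid=..., **kwargs).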
-
delete(runtime_uid, with_libraries=False)[source]¶ Delete a runtime.
Parameters
Important
runtime_uid: Runtime UID
type: str
with_libraries: Boolean value indicating an option to delete the libraries associated with the runtime
type: bool
Output
Important
returns: status (“SUCCESS” or “FAILED”)
return type: str
Example
>>> client.runtimes.delete(runtime_uid)
-
delete_library(library_uid)[source]¶ Delete a library.
Parameters
Important
library_uid: Library UID
type: str
Output
Important
returns: status (“SUCCESS” or “FAILED”)
return type: str
Example
>>> client.runtimes.delete_library(library_uid)
-
download_configuration(runtime_uid, filename=None)[source]¶ Downloads configuration file for runtime with specified uid.
Parameters
Important
runtime_uid: UID of runtime
type: str
filename: filename of downloaded archive (optional)
default value: runtime_configuration.yaml
type: str
Output
Important
returns: Path to the downloaded runtime configuration
return type: str
Note
If filename is not specified, the default filename is “runtime_configuration.yaml”.
Example
>>> filename = "runtime.yml"
>>> client.runtimes.download_configuration(runtime_uid, filename=filename)
-
download_library(library_uid, filename=None)[source]¶ Downloads library content with specified uid.
Parameters
Important
library_uid: UID of library
type: str
filename: filename of downloaded archive (optional)
default value: <LIBRARY-NAME>-<LIBRARY-VERSION>.zip
type: str
Output
Important
returns: Path to the downloaded library content
return type: str
Note
If filename is not specified, the default filename is “<LIBRARY-NAME>-<LIBRARY-VERSION>.zip”.
Example
>>> filename = "library.tgz"
>>> client.runtimes.download_library(library_uid, filename=filename)
-
get_details(runtime_uid=None, pre_defined=False, limit=None)[source]¶ Get metadata of stored runtime(s). If runtime UID is not specified returns all runtimes metadata.
Parameters
Important
runtime_uid: runtime UID (optional)
type: str
pre_defined: Boolean indicating to display predefined runtimes only. Default value is set to ‘False’
type: bool
limit: limit number of fetched records (optional)
type: int
Output
Important
returns: metadata of runtime(s)
return type: dict
The output can be {“resources”: [dict]} or a dict
Note
If UID is not specified, all runtimes metadata is fetched
Example
>>> runtime_details = client.runtimes.get_details(runtime_uid)
>>> runtime_details = client.runtimes.get_details(runtime_uid=runtime_uid)
>>> runtime_details = client.runtimes.get_details()
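Because the output can be either {"resources": [dict]} or a single dict, code that consumes it may want one uniform shape. This is a minimal sketch that assumes a single-item result has no top-level "resources" key; as_resource_list is a hypothetical helper and the sample dicts are made up for illustration.

```python
def as_resource_list(details):
    """Return details as a list of metadata dicts.

    Accepts either {'resources': [dict, ...]} or a single metadata dict.
    """
    if isinstance(details, dict) and 'resources' in details:
        return list(details['resources'])
    return [details]


# Made-up stand-ins for the two output shapes.
many = {'resources': [{'name': 'runtime_a'}, {'name': 'runtime_b'}]}
one = {'name': 'runtime_a'}
names_many = [item['name'] for item in as_resource_list(many)]
names_one = [item['name'] for item in as_resource_list(one)]
print(names_many)
print(names_one)
```

The same normalisation applies to any call whose output is documented with these two shapes.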
-
static
get_href(details)[source]¶ Get runtime_href from runtime details.
Parameters
Important
details: Metadata of the runtime
type: dict
Output
Important
returns: runtime href
return type: str
Example
>>> runtime_details = client.runtimes.get_details(runtime_uid)
>>> runtime_href = client.runtimes.get_href(runtime_details)
-
get_library_details(library_uid=None, limit=None)[source]¶ Get metadata of stored libraries. If library UID is not specified, returns metadata of all libraries.
Parameters
Important
library_uid: library UID (optional)
type: str
limit: limit number of fetched records (optional)
type: int
Output
Important
returns: metadata of library(s)
return type: dict
The output can be {“resources”: [dict]} or a dict
Note
If UID is not specified, all libraries metadata is fetched
Example
>>> library_details = client.runtimes.get_library_details(library_uid)
>>> library_details = client.runtimes.get_library_details(library_uid=library_uid)
>>> library_details = client.runtimes.get_library_details()
-
static
get_library_href(library_details)[source]¶ Get library_href from library details.
Parameters
Important
library_details: Metadata of the library
type: dict
Output
Important
returns: library href
return type: str
Example
>>> library_details = client.runtimes.get_library_details(library_uid)
>>> library_url = client.runtimes.get_library_href(library_details)
-
static
get_library_uid(library_details)[source]¶ Get library_uid from library details.
Parameters
Important
library_details: Metadata of the library
type: dict
Output
Important
returns: library UID
return type: str
Example
>>> library_details = client.runtimes.get_library_details(library_uid)
>>> library_uid = client.runtimes.get_library_uid(library_details)
-
static
get_uid(details)[source]¶ Get runtime_uid from runtime details.
Parameters
Important
details: Metadata of the runtime
type: dict
Output
Important
returns: runtime UID
return type: str
Example
>>> runtime_details = client.runtimes.get_details(runtime_uid)
>>> runtime_uid = client.runtimes.get_uid(runtime_details)
-
list(limit=None, pre_defined=False)[source]¶ List stored runtimes. If limit is set to None, only the first 50 records are shown.
Parameters
Important
limit: limit number of fetched records
type: int
pre_defined: Boolean indicating to display predefined runtimes only. Default value is set to ‘False’
type: bool
Output
Important
This method only prints the list of runtimes in a table format.
return type: None
Example
>>> client.runtimes.list()
>>> client.runtimes.list(pre_defined=True)
-
list_libraries(runtime_uid=None, limit=None)[source]¶ List stored libraries. If runtime UID is not provided, all libraries are listed; otherwise, only the libraries associated with the given runtime are listed. If limit is set to None, only the first 50 records are shown.
Parameters
Important
runtime_uid: runtime UID (optional)
type: str
limit: limit number of fetched records
type: int
Output
Important
This method only prints the list of libraries in a table format.
return type: None
Example
>>> client.runtimes.list_libraries()
>>> client.runtimes.list_libraries(runtime_uid)
-
store(meta_props)[source]¶ Create a runtime.
Parameters
Important
meta_props: meta data of the runtime configuration. To see available meta names use:
>>> client.runtimes.ConfigurationMetaNames.get()
type: dict
Output
Important
returns: Metadata of the runtime created
return type: dict
Example
Creating a library
>>> lib_meta = {
>>>     client.runtimes.LibraryMetaNames.NAME: "libraries_custom",
>>>     client.runtimes.LibraryMetaNames.DESCRIPTION: "custom libraries for scoring",
>>>     client.runtimes.LibraryMetaNames.FILEPATH: "/home/user/my_lib.zip",
>>>     client.runtimes.LibraryMetaNames.VERSION: "1.0",
>>>     client.runtimes.LibraryMetaNames.PLATFORM: {"name": "python", "versions": ["3.7"]}
>>> }
>>> custom_library_details = client.runtimes.store_library(lib_meta)
>>> custom_library_uid = client.runtimes.get_library_uid(custom_library_details)
Creating a runtime
>>> runtime_meta = {
>>>     client.runtimes.ConfigurationMetaNames.NAME: "runtime_spec_python_3.7",
>>>     client.runtimes.ConfigurationMetaNames.DESCRIPTION: "test",
>>>     client.runtimes.ConfigurationMetaNames.PLATFORM: {
>>>         "name": "python",
>>>         "version": "3.7"
>>>     },
>>>     client.runtimes.ConfigurationMetaNames.LIBRARIES_UIDS: [custom_library_uid]  # already existing lib is linked here
>>> }
>>> runtime_details = client.runtimes.store(runtime_meta)
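The two steps above (store a library, then link its UID into a new runtime) can be wrapped in one function. This is a sketch only: store_runtime_with_library is a hypothetical helper, client is assumed to be an already-authenticated client object, and the only client calls used are the ones shown in the examples above.

```python
def store_runtime_with_library(client, library_path, runtime_name,
                               python_version="3.7"):
    """Store a custom library, then a runtime that links to it.

    Returns the runtime details produced by client.runtimes.store().
    """
    lib_names = client.runtimes.LibraryMetaNames
    lib_meta = {
        lib_names.NAME: "libraries_custom",
        lib_names.DESCRIPTION: "custom libraries for scoring",
        lib_names.FILEPATH: library_path,
        lib_names.VERSION: "1.0",
        # The library platform takes a list of versions.
        lib_names.PLATFORM: {"name": "python", "versions": [python_version]},
    }
    library_details = client.runtimes.store_library(lib_meta)
    library_uid = client.runtimes.get_library_uid(library_details)

    rt_names = client.runtimes.ConfigurationMetaNames
    runtime_meta = {
        rt_names.NAME: runtime_name,
        # The runtime platform takes a single version string.
        rt_names.PLATFORM: {"name": "python", "version": python_version},
        rt_names.LIBRARIES_UIDS: [library_uid],
    }
    return client.runtimes.store(runtime_meta)
```

Passing the client in as an argument keeps the helper free of credentials and makes it possible to exercise it with a stand-in object.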
-
store_library(meta_props)[source]¶ Create a library.
Parameters
Important
meta_props: meta data of the libraries configuration. To see available meta names use:
>>> client.runtimes.LibraryMetaNames.get()
type: dict
Output
Important
returns: Metadata of the library created.
return type: dict
Example
>>> library_details = client.runtimes.store_library({
>>>     client.runtimes.LibraryMetaNames.NAME: "libraries_custom",
>>>     client.runtimes.LibraryMetaNames.DESCRIPTION: "custom libraries for scoring",
>>>     client.runtimes.LibraryMetaNames.FILEPATH: custom_library_path,
>>>     client.runtimes.LibraryMetaNames.VERSION: "1.0",
>>>     client.runtimes.LibraryMetaNames.PLATFORM: {"name": "python", "versions": ["3.7"]}
>>> })
-
update_library(library_uid, changes)[source]¶ Updates existing library metadata.
Parameters
Important
library_uid: UID of the library whose definition should be updated
type: str
changes: elements which should be changed, where keys are ConfigurationMetaNames
type: dict
Output
Important
returns: metadata of updated library
return type: dict
Example
>>> metadata = {
>>>     client.runtimes.LibraryMetaNames.NAME: "updated_lib"
>>> }
>>> library_details = client.runtimes.update_library(library_uid, changes=metadata)
-
update_runtime(runtime_uid, changes)[source]¶ Updates existing runtime metadata.
Parameters
Important
runtime_uid: UID of the runtime whose definition should be updated
type: str
changes: elements which should be changed, where keys are ConfigurationMetaNames
type: dict
Output
Important
returns: metadata of updated runtime
return type: dict
Example
>>> metadata = {
>>>     client.runtimes.ConfigurationMetaNames.NAME: "updated_runtime"
>>> }
>>> runtime_details = client.runtimes.update_runtime(runtime_uid, changes=metadata)
-
-
class
metanames.RuntimeMetaNames[source]¶ Set of MetaNames for Runtime Specs.
Available MetaNames:
| MetaName | Type | Required | Example value | Schema |
|---|---|---|---|---|
| NAME | str | Y | runtime_spec_python_3.7 | |
| DESCRIPTION | str | N | sample runtime | |
| PLATFORM | dict | Y | {"name": "python", "version": "3.7"} | {'name(required)': 'string', 'version(required)': 'version'} |
| LIBRARIES_UIDS | list | N | ['46dc9cf1-252f-424b-b52d-5cdd9814987f'] | |
| CONFIGURATION_FILEPATH | str | N | /home/env_config.yaml | |
| TAGS | list | N | [{'value': 'dsx-project.<project-guid>', 'description': 'DSX project guid'}] | [{'value(required)': 'string', 'description(optional)': 'string'}] |
| CUSTOM | dict | N | {"field1": "value1"} | |
| SPACE_UID | str | N | 46dc9cf1-252f-424b-b52d-5cdd9814987f | |
| COMPUTE | dict | N | {'name': 'name1', 'nodes': 1} | {'name(required)': 'string', 'nodes(optional)': 'string'} |
-
class
metanames.LibraryMetaNames[source]¶ Set of MetaNames for Custom Libraries.
Available MetaNames:
| MetaName | Type | Required | Example value | Schema |
|---|---|---|---|---|
| NAME | str | Y | my_lib | |
| DESCRIPTION | str | N | my lib | |
| PLATFORM | dict | Y | {'name': 'python', 'versions': ['3.7']} | {'name(required)': 'string', 'version(required)': 'version'} |
| VERSION | str | Y | 1.0 | |
| FILEPATH | str | Y | /home/user/my_lib_1_0.zip | |
| TAGS | dict | N | [{'value': 'dsx-project.<project-guid>', 'description': 'DSX project guid'}] | [{'value(required)': 'string', 'description(optional)': 'string'}] |
| SPACE_UID | str | N | 3c1ce536-20dc-426e-aac7-7284cf3befc6 | |
| MODEL_DEFINITION | bool | N | False | |
| COMMAND | str | N | command | |
| CUSTOM | dict | N | {'field1': 'value1'} | |
service_instance (Applicable only for IBM Cloud)¶
-
class
client.ServiceInstance(client)[source]¶ Connect, get details and check usage of your Watson Machine Learning service instance.
-
get_api_key()[source]¶ Get api_key of Watson Machine Learning service.
Output
Important
returns: api_key
return type: str
Example
>>> api_key = client.service_instance.get_api_key()
-
get_details()[source]¶ Get information about your Watson Machine Learning instance.
Output
Important
returns: metadata of service instance
return type: dict
Example
>>> instance_details = client.service_instance.get_details()
-
get_instance_id()[source]¶ Get instance id of your Watson Machine Learning service.
Output
Important
returns: instance id
return type: str
Example
>>> instance_details = client.service_instance.get_instance_id()
-
get_password()[source]¶ Get password for your Watson Machine Learning service.
Output
Important
returns: password
return type: str
Example
>>> instance_details = client.service_instance.get_password()
-
set (Applicable only for IBM Cloud Pak™ for Data)¶
-
class
client.Set(client)[source]¶ Set a space_id/project_id to be used in the subsequent actions.
spaces (Applicable only for IBM Cloud Pak™ for Data)¶
-
class
client.Spaces(client)[source]¶ Store and manage your spaces. This is applicable only for IBM Cloud Pak™ for Data
-
ExportMetaNames= <watson_machine_learning_client.metanames.ExportMetaNames object>¶ MetaNames for spaces creation.
-
create_member(space_uid, meta_props)[source]¶ Create a member within a space.
Parameters
Important
meta_props: meta data of the member configuration. To see available meta names use:
>>> client.spaces.MemberMetaNames.get()
type: dict
Output
Important
returns: metadata of the stored member
return type: dict
Note
client.spaces.MemberMetaNames.ROLE can be any one of the following “viewer, editor, admin”
client.spaces.MemberMetaNames.IDENTITY_TYPE can be any one of the following “user,service”
client.spaces.MemberMetaNames.IDENTITY can be either service-ID or IAM-userID
Example
>>> metadata = {
>>>  client.spaces.MemberMetaNames.ROLE:"admin",
>>>  client.spaces.MemberMetaNames.IDENTITY:"iam-ServiceId-5a216e59-6592-43b9-8669-625d341aca71",
>>>  client.spaces.MemberMetaNames.IDENTITY_TYPE:"service"
>>> }
>>> members_details = client.spaces.create_member(space_uid=space_id, meta_props=metadata)
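The allowed values listed in the note can be enforced before calling create_member. This is a minimal sketch under stated assumptions: the string keys stand in for client.spaces.MemberMetaNames constants, the helper build_member_props is hypothetical, and it assumes the service expects the lowercase spellings shown in the note.

```python
ALLOWED_ROLES = {"viewer", "editor", "admin"}
ALLOWED_IDENTITY_TYPES = {"user", "service"}


def build_member_props(identity, identity_type, role):
    """Validate member settings and return a meta_props-like dict."""
    role = role.lower()
    identity_type = identity_type.lower()
    if role not in ALLOWED_ROLES:
        raise ValueError(f"role must be one of {sorted(ALLOWED_ROLES)}, got {role!r}")
    if identity_type not in ALLOWED_IDENTITY_TYPES:
        raise ValueError(
            f"identity_type must be one of {sorted(ALLOWED_IDENTITY_TYPES)}, got {identity_type!r}"
        )
    # String keys are placeholders for the MemberMetaNames constants.
    return {"ROLE": role, "IDENTITY": identity, "IDENTITY_TYPE": identity_type}


props = build_member_props(
    "iam-ServiceId-5a216e59-6592-43b9-8669-625d341aca71", "service", "Admin"
)
print(props["ROLE"])
```

With the real client, the same values would be placed under the MemberMetaNames keys and passed as meta_props.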
-
delete(space_uid)[source]¶ Delete a stored space.
Parameters
Important
space_uid: space UID
type: str
Output
Important
returns: status (“SUCCESS” or “FAILED”)
return type: str
Example
>>> client.spaces.delete(space_uid)
-
delete_members(space_uid, member_id)[source]¶ Delete a member associated with a space.
Parameters
Important
space_uid: space UID
type: str
member_id: member UID
type: str
Output
Important
returns: status (“SUCCESS” or “FAILED”)
return type: str
Example
>>> client.spaces.delete_members(space_uid, member_id)
-
download(space_uid, space_exports_uid, filename=None)[source]¶ Downloads the zip file of the space export with the specified UID.
- Parameters
space_uid (str) – UID of the space
space_exports_uid (str) – UID of the space export
filename (str) – filename of downloaded archive (optional)
- Returns
path to downloaded file
- Return type
str
-
exports(space_uid, meta_props)[source]¶ Updates existing space metadata. Exports assets from a space into a zip file.
Parameters
Important
meta_props: meta data of the space configuration. To see available meta names use:
>>> client.spaces.ExportMetaNames.get()
type: dict
Output
Important
returns: metadata of export space
return type: str
Note
Space exports are unsupported on CPD 2.5.
- Example
>>> meta_props = {
>>>  client.spaces.ExportMetaNames.NAME: "sample",
>>>  client.spaces.ExportMetaNames.DESCRIPTION : "test description",
>>>  client.spaces.ExportMetaNames.ASSETS : {"data_assets": [], "wml_model":[]}
>>> }
>>> space_details = client.spaces.exports(space_uid, meta_props=meta_props)
-
get_details(space_uid=None, limit=None)[source]¶ Get metadata of stored space(s). If space UID is not specified, it returns all the spaces metadata.
Parameters
Important
space_uid: Space UID (optional)
type: str
limit: limit number of fetched records (optional)
type: int
Output
Important
returns: metadata of stored space(s)
return type: dict (if UID is not None) or {“resources”: [dict]} (if UID is None)
Note
If UID is not specified, all spaces metadata is fetched
Example
>>> space_details = client.spaces.get_details(space_uid)
>>> space_details = client.spaces.get_details()
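Because get_details returns either a single dict or a {"resources": [dict]} wrapper, calling code often needs to handle both shapes. The sketch below normalizes them into a list; the helper name as_resource_list and the sample payload fields are hypothetical and used only for illustration.

```python
def as_resource_list(details):
    """Return a list of resource dicts for either documented return shape."""
    if isinstance(details, dict) and isinstance(details.get("resources"), list):
        return details["resources"]
    return [details]


# Hypothetical payloads; the real metadata carries more fields.
single = {"metadata": {"guid": "space-1"}}
many = {"resources": [{"metadata": {"guid": "space-1"}},
                      {"metadata": {"guid": "space-2"}}]}

for payload in (single, many):
    print(len(as_resource_list(payload)))
```

The same helper applies to the other get_*_details methods in this class that document the same two return shapes.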
-
get_exports_details(space_uid, exports_id=None, limit=None)[source]¶ Get details of exports for a space. If exports UID is not specified, it returns metadata of all exports of the space.
Parameters
Important
space_uid: Space UID
type: str
exports_id: exports UID (optional)
type: str
limit: limit number of fetched records (optional)
type: int
Output
Important
returns: metadata of space export(s)
return type: dict (if exports UID is not None) or {“resources”: [dict]} (if exports UID is None)
Note
If exports UID is not specified, metadata of all exports is fetched
Example
>>> exports_details = client.spaces.get_exports_details(space_uid)
>>> exports_details = client.spaces.get_exports_details(space_uid, exports_id)
-
static
get_exports_uid(exports_space_details)[source]¶ Get exports_uid from exports space details.
Parameters
Important
exports_space_details: Metadata of the created space export
type: dict
Output
Important
returns: exports space UID
return type: str
Example
>>> exports_space_details = client.spaces.get_exports_details(space_uid, exports_id)
>>> exports_id = client.spaces.get_exports_uid(exports_space_details)
-
static
get_href(spaces_details)[source]¶ Get space_href from space details.
Parameters
Important
space_details: Metadata of the stored space
type: dict
Output
Important
returns: space href
return type: str
Example
>>> space_details = client.spaces.get_details(space_uid)
>>> space_href = client.spaces.get_href(space_details)
-
get_imports_details(space_uid, imports_id=None, limit=None)[source]¶ Get details of imports for a space. If imports UID is not specified, it returns metadata of all imports of the space.
Parameters
Important
space_uid: Space UID
type: str
imports_id: imports UID (optional)
type: str
limit: limit number of fetched records (optional)
type: int
Output
Important
returns: metadata of space import(s)
return type: dict (if imports UID is not None) or {“resources”: [dict]} (if imports UID is None)
Note
If imports UID is not specified, metadata of all imports is fetched
Example
>>> imports_details = client.spaces.get_imports_details(space_uid)
>>> imports_details = client.spaces.get_imports_details(space_uid, imports_id)
-
static
get_imports_uid(imports_space_details)[source]¶ Get imports_uid from imports space details.
Parameters
Important
imports_space_details: Metadata of the created space import
type: dict
Output
Important
returns: imports space UID
return type: str
Example
>>> imports_space_details = client.spaces.get_imports_details(space_uid, imports_id)
>>> imports_id = client.spaces.get_imports_uid(imports_space_details)
-
static
get_member_href(member_details)[source]¶ Get member_href from member details.
Parameters
Important
member_details: Metadata of the stored member
type: dict
Output
Important
returns: member href
return type: str
Example
>>> member_details = client.spaces.get_members_details(space_uid, member_id)
>>> member_href = client.spaces.get_member_href(member_details)
-
static
get_member_uid(member_details)[source]¶ Get member_uid from member details.
Parameters
Important
member_details: Metadata of the created member
type: dict
Output
Important
returns: member UID
return type: str
Example
>>> member_details = client.spaces.get_members_details(space_uid, member_id)
>>> member_id = client.spaces.get_member_uid(member_details)
-
get_members_details(space_uid, member_id=None, limit=None)[source]¶ Get metadata of members associated with a space. If member UID is not specified, it returns all the members metadata.
Parameters
Important
space_uid: space UID
type: str
member_id: member UID (optional)
type: str
limit: limit number of fetched records (optional)
type: int
Output
Important
returns: metadata of member(s) of a space
return type: dict (if member UID is not None) or {“resources”: [dict]} (if member UID is None)
Note
If member id is not specified, all members metadata is fetched
Example
>>> member_details = client.spaces.get_members_details(space_uid,member_id)
-
static
get_uid(spaces_details)[source]¶ Get space_uid from space details.
Parameters
Important
space_details: Metadata of the stored space
type: dict
Output
Important
returns: space UID
return type: str
Example
>>> space_details = client.spaces.get_details(space_uid)
>>> space_uid = client.spaces.get_uid(space_details)
-
imports(space_uid, file_path)[source]¶ Updates existing space metadata. Imports assets in the zip file to a space
Parameters
Important
space_uid: UID of space which definition should be updated
type: str
file_path: Path to the content file to be imported
type: str
Output
Important
returns: metadata of import space
return type: str
Example
>>> space_details = client.spaces.imports(space_uid, file_path="/tmp/spaces.zip")
-
list(limit=None)[source]¶ List stored spaces. If limit is set to None there will be only first 50 records shown.
Parameters
Important
limit: limit number of fetched records
type: int
Output
Important
This method only prints the list of all spaces in a table format.
return type: None
Example
>>> client.spaces.list()
-
list_members(space_uid, limit=None)[source]¶ List stored members of a space. If limit is set to None there will be only first 50 records shown.
Parameters
Important
limit: limit number of fetched records
type: int
Output
Important
This method only prints the list of all members associated with a space in a table format.
return type: None
Example
>>> client.spaces.list_members(space_uid)
-
store(meta_props)[source]¶ Create a space.
Parameters
Important
meta_props: meta data of the space configuration. To see available meta names use:
>>> client.spaces.ConfigurationMetaNames.get()
type: dict
Output
Important
returns: metadata of the stored space
return type: dict
Example
>>> metadata = {
>>>  client.spaces.ConfigurationMetaNames.NAME: 'my_space',
>>>  client.spaces.ConfigurationMetaNames.DESCRIPTION: 'spaces',
>>> }
>>> spaces_details = client.spaces.store(meta_props=metadata)
>>> spaces_href = client.spaces.get_href(spaces_details)
-
update(space_uid, changes)[source]¶ Updates existing space metadata.
Parameters
Important
space_uid: UID of space which definition should be updated
type: str
changes: elements which should be changed, where keys are ConfigurationMetaNames
type: dict
Output
Important
returns: metadata of updated space
return type: dict
Example
>>> metadata = {
>>>  client.spaces.ConfigurationMetaNames.NAME:"updated_space"
>>> }
>>> space_details = client.spaces.update(space_uid, changes=metadata)
-
update_member(space_uid, member_id, changes)[source]¶ Updates existing member metadata.
Parameters
Important
space_uid: UID of space
type: str
member_id: UID of member that needs to be updated
type: str
changes: elements which should be changed, where keys are MemberMetaNames
type: dict
Output
Important
returns: metadata of updated member
return type: dict
Example
>>> metadata = {
>>>  client.spaces.MemberMetaNames.ROLE:"viewer"
>>> }
>>> member_details = client.spaces.update_member(space_uid, member_id, changes=metadata)
-
-
class
metanames.SpacesMetaNames[source]¶ Set of MetaNames for Spaces Specs.
Available MetaNames:
MetaName
Type
Required
Example value
Schema
NAME
str
Y
my_space
TAGS
list
N
[{'value': 'dsx-project.<project-guid>', 'description': 'DSX project guid'}]
[{'value(required)': 'string', 'description(optional)': 'string'}]
CUSTOM
dict
N
{"field1":"value1"}
DESCRIPTION
str
N
my_description
ONLINE_DEPLOYMENTS
list
N
[{}]
[{'name(optional)': 'string', 'description(optional)': 'string', 'guid(optional)': 'string', 'compute(optional)': {'name(required)': 'string', 'nodes(optional)': 'number'}}]
SCHEDULES
list
N
[{}]
[{'cron(optional)': 'string', 'assets(optional)': [{'name(optional)': 'string', 'description(optional)': 'string', 'guid(optional)': 'string', 'compute(optional)': {'name(required)': 'string', 'nodes(optional)': 'number'}}]}]
training¶
-
class
client.Training(client)[source]¶ Train new models.
-
cancel(training_uid, hard_delete=False)[source]¶ Cancel a training which is currently running and remove it. This method is also used to delete metadata details of a completed or canceled training run when the hard_delete parameter is set to True.
Parameters
Important
training_uid: Training UID
type: str
- hard_delete: specify True or False.
True - To delete the completed or canceled training runs. False - To cancel the currently running training run. Default value is False.
type: Boolean
Output
Important
returns: status (“SUCCESS” or “FAILED”)
return type: str
Example
>>> client.training.cancel(training_uid)
-
get_details(training_uid=None, limit=None)[source]¶ Get metadata of training(s). If training_uid is not specified, returns metadata of all trainings.
Parameters
Important
training_uid: Unique Id of Training (optional)
type: str
limit: limit number of fetched records (optional)
type: int
Output
Important
returns: metadata of training(s)
return type: dict
The output can be {“resources”: [dict]} or a dict
Note
If training_uid is not specified, all trainings metadata is fetched
Example
>>> training_run_details = client.training.get_details(training_uid)
>>> training_runs_details = client.training.get_details()
-
static
get_href(training_details)[source]¶ Get training_href from training details.
Parameters
Important
training_details: Metadata of the training created
type: dict
Output
Important
returns: training href
return type: str
Example
>>> training_details = client.training.get_details(training_uid)
>>> run_url = client.training.get_href(training_details)
-
get_metrics(training_uid)[source]¶ Get metrics.
Parameters
Important
training_uid: training UID
type: str
Output
Important
returns: Metrics of a training run
return type: list of dict
Example
>>> training_metrics = client.training.get_metrics(training_uid)
-
get_status(training_uid)[source]¶ Get the status of a training created.
Parameters
Important
training_uid: training UID
type: str
Output
Important
returns: training_status
return type: dict
Example
>>> training_status = client.training.get_status(training_uid)
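A common use of get_status is to poll until the run finishes. The following is a minimal sketch under stated assumptions: it assumes the returned dict carries a "state" key (as in the run details shown elsewhere in this documentation), the set of terminal states beyond "completed" and "canceled" is a guess, and the helper wait_for_training and the fake status function are hypothetical.

```python
import time

# "completed" and "canceled" appear in this documentation; "failed" is assumed.
TERMINAL_STATES = {"completed", "failed", "canceled"}


def wait_for_training(get_status, training_uid, poll_seconds=5.0, max_polls=120):
    """Poll get_status(training_uid) until the run reaches a terminal state."""
    for _ in range(max_polls):
        status = get_status(training_uid)
        if status.get("state") in TERMINAL_STATES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"training {training_uid} did not finish after {max_polls} polls")


# Stand-in for client.training.get_status so the sketch runs without a service.
_fake_states = iter(["pending", "running", "completed"])


def fake_get_status(training_uid):
    return {"state": next(_fake_states)}


final = wait_for_training(fake_get_status, "training-123", poll_seconds=0)
print(final["state"])
```

With a real client, client.training.get_status would be passed in place of fake_get_status, with a poll interval of a few seconds.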
-
static
get_uid(training_details)[source]¶ Get training_uid from training details.
Parameters
Important
training_details: Metadata of the training created
type: dict
Output
Important
returns: Unique id of training
return type: str
Example
>>> training_details = client.training.get_details(training_uid)
>>> training_uid = client.training.get_uid(training_details)
-
list(limit=None)[source]¶ List stored trainings. If limit is set to None there will be only first 50 records shown.
Parameters
Important
limit: limit number of fetched records
type: int
Output
Important
This method only prints the list of all trainings in a table format.
return type: None
Example
>>> client.training.list()
-
list_intermediate_models(training_uid)[source]¶ List the intermediate_models.
Parameters
Important
training_uid: Training GUID
type: str
Output
Important
This method only prints the list of all intermediate_models associated with an AUTOAI training in a table format.
return type: None
Note
This method is not supported for IBM Cloud Pak for Data.
Example
>>> client.training.list_intermediate_models(training_uid)
-
list_subtrainings(training_uid)[source]¶ List the sub-trainings.
Parameters
Important
training_uid: Training GUID
type: str
Output
Important
This method only prints the list of all sub-trainings associated with a training in a table format.
return type: None
Example
>>> client.training.list_subtrainings(training_uid)
-
monitor_logs(training_uid)[source]¶ Monitor the logs of a training created.
Parameters
Important
training_uid: Training UID
type: str
Output
Important
returns: None
return type: None
Note
This method prints the training logs. This method is not supported for IBM Cloud Pak for Data.
Example
>>> client.training.monitor_logs(training_uid)
-
monitor_metrics(training_uid)[source]¶ Monitor the metrics of a training created.
Parameters
Important
training_uid: Training UID
type: str
Output
Important
returns: None
return type: None
Note
This method prints the training metrics. This method is not supported for IBM Cloud Pak for Data.
Example
>>> client.training.monitor_metrics(training_uid)
-
run(meta_props, asynchronous=True)[source]¶ Create a new Machine Learning training.
Parameters
Important
meta_props: meta data of the training configuration. To see available meta names use:
>>> client.training.ConfigurationMetaNames.show()
type: dict
- asynchronous:
True - training job is submitted and progress can be checked later.
False - method will wait till job completion and print training stats.
type: bool
Output
Important
returns: Metadata of the training created
return type: dict
- Examples
Example meta_props for Training run creation in Cloud Pak for Data version 3.1.0 or above:
>>> metadata = {
>>>  client.training.ConfigurationMetaNames.NAME: 'Hand-written Digit Recognition',
>>>  client.training.ConfigurationMetaNames.DESCRIPTION: 'Hand-written Digit Recognition Training',
>>>  client.training.ConfigurationMetaNames.PIPELINE: {
>>>    "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab",
>>>    "rev": "12",
>>>    "model_type": "string",
>>>    "data_bindings": [
>>>      {
>>>        "data_reference_name": "string",
>>>        "node_id": "string"
>>>      }
>>>    ],
>>>    "nodes_parameters": [
>>>      {
>>>        "node_id": "string",
>>>        "parameters": {}
>>>      }
>>>    ],
>>>    "hardware_spec": {
>>>      "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab",
>>>      "rev": "12",
>>>      "name": "string",
>>>      "num_nodes": "2"
>>>    }
>>>  },
>>>  client.training.ConfigurationMetaNames.TRAINING_DATA_REFERENCES: [{
>>>    'type': 's3',
>>>    'connection': {},
>>>    'location': {
>>>      'href': 'v2/assets/asset1233456',
>>>    },
>>>    'schema': '{ "id": "t1", "name": "Tasks", "fields": [ { "name": "duration", "type": "number" } ]}'
>>>  }],
>>>  client.training.ConfigurationMetaNames.TRAINING_RESULTS_REFERENCE: {
>>>    'id' : 'string',
>>>    'connection': {
>>>      'endpoint_url': 'https://s3-api.us-geo.objectstorage.service.networklayer.com',
>>>      'access_key_id': '***',
>>>      'secret_access_key': '***'
>>>    },
>>>    'location': {
>>>      'bucket': 'wml-dev-results',
>>>      'path' : "path"
>>>    },
>>>    'type': 's3'
>>>  }
>>> }
- NOTE: Only one of the following values can be provided for training:
client.training.ConfigurationMetaNames.EXPERIMENT
client.training.ConfigurationMetaNames.PIPELINE
client.training.ConfigurationMetaNames.MODEL_DEFINITION
Example meta_prop values for training run creation in other versions:
>>> metadata = {
>>>  client.training.ConfigurationMetaNames.NAME: 'Hand-written Digit Recognition',
>>>  client.training.ConfigurationMetaNames.TRAINING_DATA_REFERENCES: [{
>>>    'connection': {
>>>      'endpoint_url': 'https://s3-api.us-geo.objectstorage.service.networklayer.com',
>>>      'access_key_id': '***',
>>>      'secret_access_key': '***'
>>>    },
>>>    'source': {
>>>      'bucket': 'wml-dev',
>>>    },
>>>    'type': 's3'
>>>  }],
>>>  client.training.ConfigurationMetaNames.TRAINING_RESULTS_REFERENCE: {
>>>    'connection': {
>>>      'endpoint_url': 'https://s3-api.us-geo.objectstorage.service.networklayer.com',
>>>      'access_key_id': '***',
>>>      'secret_access_key': '***'
>>>    },
>>>    'target': {
>>>      'bucket': 'wml-dev-results',
>>>    },
>>>    'type': 's3'
>>>  },
>>>  client.training.ConfigurationMetaNames.PIPELINE_UID : "/v4/pipelines/<PIPELINE-ID>"
>>> }
>>> training_details = client.training.run(meta_props=metadata)
>>> training_uid = client.training.get_uid(training_details)
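The note in this section says that EXPERIMENT, PIPELINE and MODEL_DEFINITION are alternatives. A small local check can catch a meta_props dict that sets none or several of them. This is only a sketch: the string keys stand in for the client.training.ConfigurationMetaNames constants, the helper selected_training_source is hypothetical, and whether the service enforces this rule in exactly this way is an assumption.

```python
# String stand-ins for the three mutually exclusive MetaNames (assumption).
EXCLUSIVE_KEYS = ("EXPERIMENT", "PIPELINE", "MODEL_DEFINITION")


def selected_training_source(meta_props):
    """Return the single training source set in meta_props, or raise ValueError."""
    present = [key for key in EXCLUSIVE_KEYS if meta_props.get(key) is not None]
    if len(present) != 1:
        raise ValueError(
            f"exactly one of {EXCLUSIVE_KEYS} must be provided, found {present}"
        )
    return present[0]


meta = {
    "NAME": "Hand-written Digit Recognition",
    "PIPELINE": {"id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab", "rev": "12"},
}
print(selected_training_source(meta))
```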
-
-
class
metanames.TrainingConfigurationMetaNames[source]¶ Set of MetaNames for trainings.
Available MetaNames:
MetaName
Type
Required
Example value
Schema
TRAINING_DATA_REFERENCES
list
Y
[{'connection': {'endpoint_url': 'https://s3-api.us-geo.objectstorage.softlayer.net', 'access_key_id': '***', 'secret_access_key': '***'}, 'location': {'bucket': 'train-data', 'path': 'training_path'}, 'type': 's3', 'schema': {'id': '1', 'fields': [{'name': 'x', 'type': 'double', 'nullable': 'False'}]}}]
[{'name(optional)': 'string', 'type(required)': 'string', 'connection(required)': {'endpoint_url(required)': 'string', 'access_key_id(required)': 'string', 'secret_access_key(required)': 'string'}, 'location(required)': {'bucket': 'string', 'path': 'string'}, 'schema(optional)': {'id(required)': 'string', 'fields(required)': [{'name(required)': 'string', 'type(required)': 'string', 'nullable(optional)': 'string'}]}}]
TRAINING_RESULTS_REFERENCE
dict
Y
{'connection': {'endpoint_url': 'https://s3-api.us-geo.objectstorage.softlayer.net', 'access_key_id': '***', 'secret_access_key': '***'}, 'location': {'bucket': 'test-results', 'path': 'training_path'}, 'type': 's3'}
{'name(optional)': 'string', 'type(required)': 'string', 'connection(required)': {'endpoint_url(required)': 'string', 'access_key_id(required)': 'string', 'secret_access_key(required)': 'string'}, 'location(required)': {'bucket': 'string', 'path': 'string'}}
TAGS
list
N
[{'value': 'string', 'description': 'string'}]
[{'value(required)': 'string', 'description(optional)': 'string'}]
PIPELINE_UID
str
N
3c1ce536-20dc-426e-aac7-7284cf3befc6
EXPERIMENT_UID
str
N
3c1ce536-20dc-426e-aac7-7284cf3befc6
PIPELINE_DATA_BINDINGS
list
N
[{'data_reference_name': 'string', 'node_id': 'string'}]
[{'data_reference_name(required)': 'string', 'node_id(required)': 'string'}]
PIPELINE_NODE_PARAMETERS
list
N
[{'node_id': 'string', 'parameters': {}}]
[{'node_id(required)': 'string', 'parameters(required)': 'dict'}]
SPACE_UID
str
N
3c1ce536-20dc-426e-aac7-7284cf3befc6
TRAINING_LIB
dict
N
{'href': '/v4/libraries/3c1ce536-20dc-426e-aac7-7284cf3befc6', 'compute': {'name': 'k80', 'nodes': 0}, 'runtime': {'href': '/v4/runtimes/3c1ce536-20dc-426e-aac7-7284cf3befc6'}, 'command': 'python3 convolutional_network.py', 'parameters': {}}
{'href(required)': 'string', 'type(required)': 'string', 'runtime(optional)': {'href': 'string'}, 'command(optional)': 'string', 'parameters(optional)': 'dict'}
TRAINING_LIB_UID
str
N
3c1ce536-20dc-426e-aac7-7284cf3befc6
TRAINING_LIB_MODEL_TYPE
str
N
3c1ce536-20dc-426e-aac7-7284cf3befc6
TRAINING_LIB_RUNTIME_UID
str
N
3c1ce536-20dc-426e-aac7-7284cf3befc6
TRAINING_LIB_PARAMETERS
dict
N
3c1ce536-20dc-426e-aac7-7284cf3befc6
COMMAND
str
N
3c1ce536-20dc-426e-aac7-7284cf3befc6
COMPUTE
dict
N
3c1ce536-20dc-426e-aac7-7284cf3befc6
PIPELINE_MODEL_TYPE
str
N
tensorflow_1.1.3-py3
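The TRAINING_DATA_REFERENCES and TRAINING_RESULTS_REFERENCE schemas above share the same connection and location layout, so a small builder can keep the two references consistent. This is a sketch only; the helper name s3_reference is hypothetical and the bucket names are the example values from the table.

```python
def s3_reference(endpoint_url, access_key_id, secret_access_key, bucket, path):
    """Build a reference dict with the required keys from the schema above."""
    return {
        "type": "s3",
        "connection": {
            "endpoint_url": endpoint_url,
            "access_key_id": access_key_id,
            "secret_access_key": secret_access_key,
        },
        "location": {"bucket": bucket, "path": path},
    }


# Placeholder credentials, masked in the same way as the table examples.
credentials = {
    "endpoint_url": "https://s3-api.us-geo.objectstorage.softlayer.net",
    "access_key_id": "***",
    "secret_access_key": "***",
}

training_data_references = [s3_reference(bucket="train-data", path="training_path", **credentials)]
training_results_reference = s3_reference(bucket="test-results", path="training_path", **credentials)

print(training_results_reference["location"])
```

TRAINING_DATA_REFERENCES is a list of such dicts, while TRAINING_RESULTS_REFERENCE is a single dict, which matches the Type column above.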
AutoAI (BETA, IBM Cloud only)¶
This version of WML V4 Python SDK introduces support for AutoAI Experiments. Note that AutoAI SDK functionality is currently only available as a beta for IBM Cloud.
Working with DataConnection BETA¶
DataConnection is the base class for working with the data storage that the AutoAI backend uses
to fetch training data and store all of the results.
There are several ways you can use the DataConnection object. This is one basic scenario.
To start an AutoAI experiment, you must first specify where your training dataset is located.
Currently, WML AutoAI supports only Cloud Object Storage (COS) on Cloud, and file path on Cloud Pak for Data.
Cloud DataConnection Initialization¶
To upload your experiment dataset, you must initialize DataConnection with your COS credentials.
from watson_machine_learning_client.autoai.helpers import S3Connection, S3Location, DataConnection

# note: this DataConnection will be used as a reference where to find your training dataset
training_data_connection = DataConnection(
    connection=S3Connection(endpoint_url='url of the COS endpoint',
                            access_key_id='COS access key id',
                            secret_access_key='COS secret access key'
                            ),
    location=S3Location(bucket='bucket_name',   # note: COS bucket name where training dataset is located
                        path='my_path'  # note: path within bucket where your training dataset is located
                        )
)

# note: this DataConnection will be used as a reference where to save all of the AutoAI experiment results
results_connection = DataConnection(
    connection=S3Connection(endpoint_url='url of the COS endpoint',
                            access_key_id='COS access key id',
                            secret_access_key='COS secret access key'
                            ),
    # note: bucket name and path could be different or the same as specified in the training_data_connection
    location=S3Location(bucket='bucket_name',
                        path='my_path'
                        )
)
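Hard-coding COS credentials in a notebook is easy to leak. One option is to read them from environment variables and pass the result to S3Connection. The sketch below is not part of the client: the environment variable names and the helper cos_settings_from_env are hypothetical, and only the three S3Connection argument names come from the example above.

```python
import os

# Hypothetical environment variable names; choose whatever fits your setup.
ENV_NAMES = {
    "endpoint_url": "COS_ENDPOINT_URL",
    "access_key_id": "COS_ACCESS_KEY_ID",
    "secret_access_key": "COS_SECRET_ACCESS_KEY",
}


def cos_settings_from_env(env=None):
    """Collect the three S3Connection arguments from environment variables."""
    env = os.environ if env is None else env
    missing = [var for var in ENV_NAMES.values() if not env.get(var)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {arg: env[var] for arg, var in ENV_NAMES.items()}


# Demonstration with an explicit mapping instead of the real environment.
demo_env = {
    "COS_ENDPOINT_URL": "url of the COS endpoint",
    "COS_ACCESS_KEY_ID": "COS access key id",
    "COS_SECRET_ACCESS_KEY": "COS secret access key",
}
settings = cos_settings_from_env(demo_env)
print(sorted(settings))
```

The resulting dict could then be unpacked as S3Connection(**settings) when building the DataConnection objects.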
Upload your training dataset¶
An AutoAI experiment must be able to access your training data.
If your training dataset is not stored yet,
you can upload it by invoking the write() method of the DataConnection.
training_data_connection.write(data='local_path_to_the_dataset', remote_name='training_dataset.csv')
Download your training dataset¶
To download stored dataset, use the read() method of DataConnection.
dataset = training_data_connection.read() # note: returns a pandas DataFrame
Download your training dataset with AutoAI holdout split recreation¶
This feature of DataConnection is only available after AutoPipeline usage.
DataConnection object must be returned from the AutoPipelines.get_data_connections() method.
train_data, holdout_data = training_data_connection.read(with_holdout_split=True) # note: returns pandas DataFrames
Working with AutoAI class and optimizer (BETA)¶
The AutoAI experiment class is responsible for creating experiments and scheduling training. All experiment results are stored automatically in the user-specified Cloud Object Storage (COS). Then, the AutoAI SDK can fetch the results and provide them directly to the user for further usage.
If an AutoAI object is initialized with passed WML credentials, this is an indicator for a hybrid scenario. Otherwise, local scenario will be enabled (Still under development…)
from watson_machine_learning_client.experiment import AutoAI

experiment = AutoAI(
    {
        "instance_id": "...",
        "url": "...",
        "username": "...",
        "password": "...",
        "version": "..."
    },
    project_id='1c4353e0-0b32-4a0e-9152-3d50c8552ddb',
    space_id='76g53e0-0b32-4a0e-9152-3d50324855ddb'
)

pipeline_optimizer = experiment.optimizer(
    name='test name',
    desc='test description',
    prediction_type=AutoAI.PredictionType.CLASSIFICATION,
    prediction_column='y',
    scoring=AutoAI.Metrics.ROC_AUC_SCORE,
    test_size=0.1,
    max_num_daub_ensembles=1,
    train_sample_rows_test_size=1.,
    daub_include_only_estimators=[
        AutoAI.ClassificationAlgorithms.XGB,
        AutoAI.ClassificationAlgorithms.LGBM
    ]
)
Get configuration parameters¶
To see what configuration parameters are used, call the get_params() method.
config_parameters = pipeline_optimizer.get_params()
print(config_parameters)
{
'name': 'test name',
'desc': 'test description',
'prediction_type': 'classification',
'prediction_column': 'y',
'scoring': 'roc_auc',
'test_size': 0.1,
'max_num_daub_ensembles': 1
}
Fit AutoAI experiment¶
To schedule an AutoAI experiment, call the fit() method.
This triggers a training and an optimization process on the WML side.
The fit() method can be synchronous (background_mode=False) or asynchronous (background_mode=True).
If you do not want to wait until the fit ends, invoke the asynchronous version,
which immediately returns only the fit/run details.
Otherwise, in the synchronous version, you will see a progress bar with information
about the learning/optimization process.
fit_details = pipeline_optimizer.fit(
training_data_reference=[training_data_connection],
training_results_reference=results_connection,
background_mode=True)
# OR
pipeline_optimizer = pipeline_optimizer.fit(
training_data_reference=[training_data_connection],
training_results_reference=results_connection,
background_mode=False)
Get run status, get run details¶
If you chose the asynchronous option when calling fit(), you can monitor the run/fit
details and status using the following two methods:
status = pipeline_optimizer.get_run_status()
print(status)
'running'
# OR
'completed'
run_details = pipeline_optimizer.get_run_details()
print(run_details)
{'entity': {'pipeline': {'href': '/v4/pipelines/5bfeb4c5-90df-48b8-9e03-ba232d8c0838'},
'results_reference': {'connection': {'access_key_id': '...',
'endpoint_url': '...',
'secret_access_key': '...'},
'location': {'bucket': '...',
'logs': '53c8cb7b-c8b5-44aa-8b52-6fde3c588462',
'model': '53c8cb7b-c8b5-44aa-8b52-6fde3c588462/model',
'path': '.',
'pipeline': './33825fa2-5fca-471a-ab1a-c84820b3e34e/pipeline.json',
'training': './33825fa2-5fca-471a-ab1a-c84820b3e34e',
'training_status': './33825fa2-5fca-471a-ab1a-c84820b3e34e/training-status.json'},
'type': 's3'},
'space': {'href': '/v4/spaces/71ab11ea-bb77-4ae6-b98a-a77f30ade09d'},
'status': {'completed_at': '2020-02-17T10:46:32.962Z',
'message': {'level': 'info',
'text': 'Training job '
'33825fa2-5fca-471a-ab1a-c84820b3e34e '
'completed'},
'state': 'completed'},
'training_data_references': [{'connection': {'access_key_id': '...',
'endpoint_url': '...',
'secret_access_key': '...'},
'location': {'bucket': '...',
'path': '...'},
'type': 's3'}]},
'metadata': {'created_at': '2020-02-17T10:44:22.532Z',
'guid': '33825fa2-5fca-471a-ab1a-c84820b3e34e',
'href': '/v4/trainings/33825fa2-5fca-471a-ab1a-c84820b3e34e',
'id': '33825fa2-5fca-471a-ab1a-c84820b3e34e',
'modified_at': '2020-02-17T10:46:32.987Z'}}
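The timestamps in the run details can be used to work out how long a run took. The following is a minimal sketch that relies only on the field names and the timestamp format visible in the sample output above; the helper name run_duration_seconds is hypothetical, and other timestamp formats are not handled.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"  # matches e.g. 2020-02-17T10:44:22.532Z


def run_duration_seconds(run_details):
    """Return seconds between creation and completion, or None if still running."""
    created = run_details["metadata"]["created_at"]
    completed = run_details["entity"]["status"].get("completed_at")
    if completed is None:
        return None
    start = datetime.strptime(created, TIMESTAMP_FORMAT)
    end = datetime.strptime(completed, TIMESTAMP_FORMAT)
    return (end - start).total_seconds()


# Trimmed copy of the sample run details shown above.
sample_run_details = {
    "entity": {"status": {"completed_at": "2020-02-17T10:46:32.962Z",
                          "state": "completed"}},
    "metadata": {"created_at": "2020-02-17T10:44:22.532Z"},
}

print(run_duration_seconds(sample_run_details))
```

For the sample above, the run took a little over two minutes.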
Get data connections¶
To recreate a holdout split from AutoAI, use the following method to fetch appropriate DataConnection objects and invoke read(with_holdout_split=True) method.
See: Download your training dataset with AutoAI holdout split recreation section.
data_connections = pipeline_optimizer.get_data_connections()
# note: data_connections is a list with all training_connections that you referenced during fit() call
Summary¶
You can get a ranking of all computed pipeline models sorted based on a scorer metric supplied at the beginning.
The output is a pandas.DataFrame with pipeline names, computation timestamps,
machine learning metrics and the number of enhancements implemented in each pipeline.
results = pipeline_optimizer.results()
print(results)
Number of enhancements ... training_f1
Pipeline Name ...
Pipeline_4 3 ... 0.555556
Pipeline_3 2 ... 0.554978
Pipeline_2 1 ... 0.503175
Pipeline_1 0 ... 0.529928
Get pipeline details¶
To see pipeline composition steps and nodes, use get_pipeline_details().
Empty pipeline_name returns the best computed pipeline details.
pipeline_params = pipeline_optimizer.get_pipeline_details(pipeline_name='Pipeline_1')
print(pipeline_params)
{
'composition_steps': [
'TrainingDataset_full_199_16', 'Split_TrainingHoldout',
'TrainingDataset_full_179_16', 'Preprocessor_default', 'DAUB'
],
'pipeline_nodes': [
'PreprocessingTransformer', 'LogisticRegressionEstimator'
]
}
Get pipeline¶
Use this method to load a specific pipeline. By default, get_pipeline() returns a Lale (https://github.com/ibm/lale) pipeline.
pipeline = pipeline_optimizer.get_pipeline(pipeline_name='Pipeline_4')
print(type(pipeline))
'lale.operators.TrainablePipeline'
There is an additional option to load a pipeline as a scikit pipeline model type.
sklearn_pipeline = pipeline_optimizer.get_pipeline(pipeline_name='Pipeline_4', astype=AutoAI.PipelineTypes.SKLEARN)
print(type(sklearn_pipeline))
<class 'sklearn.pipeline.Pipeline'>
Working with deployments (BETA)¶
The following classes enable you to work with Watson Machine Learning deployments.
Web Service¶
Web Service is an online type of deployment. It allows you to upload and deploy your model to be able to score it via online web service.
from watson_machine_learning_client.deployment import WebService

# note: only AutoAI deployment is possible now
service = WebService(
    {
        "instance_id": "...",
        "url": "...",
        "username": "...",
        "password": "...",
        "version": "..."
    },
    project_id='1c4353e0-0b32-4a0e-9152-3d50c8552ddb',
    space_id='76g53e0-0b32-4a0e-9152-3d50324855ddb'
)

service.create(
    experiment_run_id="...",
    model=model,
    deployment_name='My new deployment'
)

# OR
# note: in the future you will be able to deploy other models as follows
service.create(
    model=model,
    metadata={
        service.wml_client.repository.ModelMetaNames.NAME: "Bank credit model - best",
        service.wml_client.repository.ModelMetaNames.TYPE: "wml-hybrid_0.1",
        service.wml_client.repository.ModelMetaNames.RUNTIME_UID: "hybrid_0.1",
    },
    deployment_name='My new deployment',
    training_data=some_training_data_object,
    training_target=some_training_target_object
)
AutoAI Modules (BETA for IBM Cloud)¶
AutoAI¶
-
class
watson_machine_learning_client.experiment.autoai.autoai.AutoAI(wml_credentials: Optional[Union[dict, watson_machine_learning_client.workspace.workspace.WorkSpace]] = None, project_id: Optional[str] = None, space_id: Optional[str] = None)[source]¶ Bases:
watson_machine_learning_client.experiment.base_experiment.base_experiment.BaseExperiment
AutoAI class for pipeline models optimization automation.
- wml_credentials: dictionary, required
Credentials to Watson Machine Learning instance.
- project_id: str, optional
ID of the Watson Studio project.
- space_id: str, optional
ID of the Watson Studio Space.
>>> from watson_machine_learning_client.experiment import AutoAI
>>> # Remote version of AutoAI
>>> experiment = AutoAI(
>>>     wml_credentials={
>>>         "apikey": "...",
>>>         "iam_apikey_description": "...",
>>>         "iam_apikey_name": "...",
>>>         "iam_role_crn": "...",
>>>         "iam_serviceid_crn": "...",
>>>         "instance_id": "...",
>>>         "url": "https://us-south.ml.cloud.ibm.com"
>>>     },
>>>     project_id="...",
>>>     space_id="...")
>>>
>>> # Local version of AutoAI
>>> experiment = AutoAI()
-
class
ClassificationAlgorithms(value)¶ Bases:
enum.Enum
Classification algorithms that AutoAI could use.
-
DT= 'DecisionTreeClassifierEstimator'¶
-
EX_TREES= 'ExtraTreesClassifierEstimator'¶
-
GB= 'GradientBoostingClassifierEstimator'¶
-
LGBM= 'LGBMClassifierEstimator'¶
-
LR= 'LogisticRegressionEstimator'¶
-
RF= 'RandomForestClassifierEstimator'¶
-
XGB= 'XGBClassifierEstimator'¶
-
-
class
DataConnectionTypes¶ Bases:
object
Supported types of DataConnection. OneOf: [s3, FS]
-
DS= 'data_asset'¶
-
FS= 'fs'¶
-
S3= 's3'¶
-
-
class
Metrics¶ Bases:
object
Supported types of classification and regression metrics in AutoAI.
-
ACCURACY_SCORE= 'accuracy'¶
-
AVERAGE_PRECISION_SCORE= 'average_precision'¶
-
EXPLAINED_VARIANCE_SCORE= 'explained_variance'¶
-
F1_SCORE= 'f1'¶
-
F1_SCORE_MACRO= 'f1_macro'¶
-
F1_SCORE_MICRO= 'f1_micro'¶
-
F1_SCORE_WEIGHTED= 'f1_weighted'¶
-
LOG_LOSS= 'neg_log_loss'¶
-
MEAN_ABSOLUTE_ERROR= 'neg_mean_absolute_error'¶
-
MEAN_SQUARED_ERROR= 'neg_mean_squared_error'¶
-
MEAN_SQUARED_LOG_ERROR= 'neg_mean_squared_log_error'¶
-
MEDIAN_ABSOLUTE_ERROR= 'neg_median_absolute_error'¶
-
PRECISION_SCORE= 'precision'¶
-
PRECISION_SCORE_MACRO= 'precision_macro'¶
-
PRECISION_SCORE_MICRO= 'precision_micro'¶
-
PRECISION_SCORE_WEIGHTED= 'precision_weighted'¶
-
R2_SCORE= 'r2'¶
-
RECALL_SCORE= 'recall'¶
-
RECALL_SCORE_MACRO= 'recall_macro'¶
-
RECALL_SCORE_MICRO= 'recall_micro'¶
-
RECALL_SCORE_WEIGHTED= 'recall_weighted'¶
-
ROC_AUC_SCORE= 'roc_auc'¶
-
ROOT_MEAN_SQUARED_ERROR= 'neg_root_mean_squared_error'¶
-
ROOT_MEAN_SQUARED_LOG_ERROR= 'neg_root_mean_squared_log_error'¶
-
-
class
PipelineTypes¶ Bases:
object
Supported types of Pipelines.
-
LALE= 'lale'¶
-
SKLEARN= 'sklearn'¶
-
-
class
PredictionType¶ Bases:
object
Supported types of learning. OneOf: [BINARY, MULTICLASS, REGRESSION]
-
BINARY= 'binary'¶
-
MULTICLASS= 'multiclass'¶
-
REGRESSION= 'regression'¶
-
-
class
RegressionAlgorithms(value)¶ Bases:
enum.Enum
Regression algorithms that AutoAI could use.
-
DT= 'DecisionTreeRegressorEstimator'¶
-
EX_TREES= 'ExtraTreesRegressorEstimator'¶
-
GB= 'GradientBoostingRegressorEstimator'¶
-
LGBM= 'LGBMRegressorEstimator'¶
-
LR= 'LinearRegressionEstimator'¶
-
RF= 'RandomForestRegressorEstimator'¶
-
RIDGE= 'RidgeEstimator'¶
-
XGB= 'XGBRegressorEstimator'¶
-
-
class
TShirtSize¶ Bases:
object
Possible sizes of the AutoAI POD. Depending on the POD size, AutoAI can support different data set sizes.
S - Small (2 vCPUs and 8 GB of RAM)
M - Medium (4 vCPUs and 16 GB of RAM)
L - Large (8 vCPUs and 32 GB of RAM)
XL - Extra Large (16 vCPUs and 64 GB of RAM)
-
L= 'l'¶
-
M= 'm'¶
-
ML= 'ml'¶
-
S= 's'¶
-
XL= 'xl'¶
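The documented sizes can be tabulated for a quick lookup. The sketch below is only an illustration: it uses a plain dictionary instead of the client's TShirtSize class, the helper name `smallest_size_for` is hypothetical, and the `ml` size is left out because its resources are not stated in the description above.

```python
# (vCPUs, GB of RAM) per documented size, copied from the TShirtSize description.
# 'ml' is omitted on purpose: its resources are not documented here.
POD_RESOURCES = {
    "s": (2, 8),
    "m": (4, 16),
    "l": (8, 32),
    "xl": (16, 64),
}


def smallest_size_for(min_ram_gb):
    """Return the smallest documented size with at least min_ram_gb GB of RAM.

    Hypothetical helper, not part of the client.
    """
    by_ram = sorted(POD_RESOURCES.items(), key=lambda item: item[1][1])
    for size, (_vcpus, ram_gb) in by_ram:
        if ram_gb >= min_ram_gb:
            return size
    raise ValueError(f"no documented size offers {min_ram_gb} GB of RAM")
```

The returned string matches the enum values listed above, so it could be compared against `AutoAI.TShirtSize.L` and the other members.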
-
-
optimizer(name: str, *, prediction_type: watson_machine_learning_client.utils.autoai.enums.PredictionType, prediction_column: str, scoring: watson_machine_learning_client.utils.autoai.enums.Metrics, desc: Optional[str] = None, test_size: float = 0.1, max_number_of_estimators: int = 1, train_sample_rows_test_size: Optional[float] = None, daub_include_only_estimators: Optional[List[Union[watson_machine_learning_client.utils.autoai.enums.ClassificationAlgorithms, watson_machine_learning_client.utils.autoai.enums.RegressionAlgorithms]]] = None, data_join_graph: Optional[watson_machine_learning_client.preprocessing.multiple_files_preprocessor.DataJoinGraph] = None, csv_separator: Union[List[str], str] = ',', excel_sheet: Union[str, int] = 0, positive_label: Optional[str] = None, data_join_only: bool = False, **kwargs) → Union[watson_machine_learning_client.experiment.autoai.optimizers.remote_auto_pipelines.RemoteAutoPipelines, watson_machine_learning_client.experiment.autoai.optimizers.local_auto_pipelines.LocalAutoPipelines][source]¶ Initialize an AutoAi optimizer.
- name: str, required
Name for the AutoPipelines
- prediction_type: PredictionType, required
Type of the prediction.
- prediction_column: str, required
name of the target/label column
- scoring: Metrics, required
Type of the metric to optimize with.
- desc: str, optional
Description
- test_size: float, optional
Fraction of the entire dataset to leave as a holdout (for example, 0.1 keeps 10% aside). Default 0.1
- max_number_of_estimators: int, optional
Maximum number (top-K ranked by DAUB model selection) of the selected algorithm or estimator types, for example LGBMClassifierEstimator, XGBClassifierEstimator, or LogisticRegressionEstimator, to use in pipeline composition. The default is 1, where only the algorithm type ranked highest by model selection is used. (min 1, max 4)
- train_sample_rows_test_size: float, optional
Training data sampling percentage
- daub_include_only_estimators: List[Union[‘ClassificationAlgorithms’, ‘RegressionAlgorithms’]], optional
List of estimators to include in computation process. See: AutoAI.ClassificationAlgorithms or AutoAI.RegressionAlgorithms
- csv_separator: Union[List[str], str], optional
The separator, or list of separators to try for separating columns in a CSV file. Not used if the file_name is not a CSV file. Default is ‘,’.
- excel_sheet: Union[str, int], optional
Name or number of the Excel sheet to use. Used only when the input is an xlsx file. Default is 0.
- positive_label: str, optional
The positive class to report for binary classification. Ignored for multiclass and regression.
- t_shirt_size: TShirtSize, optional
The size of the remote AutoAI POD instance (computing resources). Only applicable to a remote scenario. See: AutoAI.TShirtSize
- data_join_graph: DataJoinGraph, optional
A graph object with definition of join structure for multiple input data sources. Data preprocess step for multiple files.
- data_join_only: bool, optional
If True only preprocessing will be executed.
RemoteAutoPipelines or LocalAutoPipelines, depending on how you initialize the AutoAI object.
>>> from watson_machine_learning_client.experiment import AutoAI
>>> experiment = AutoAI(...)
>>>
>>> optimizer = experiment.optimizer(
>>>     name="name of the optimizer.",
>>>     prediction_type=AutoAI.PredictionType.BINARY,
>>>     prediction_column="y",
>>>     scoring=AutoAI.Metrics.ROC_AUC_SCORE,
>>>     desc="Some description.",
>>>     test_size=0.1,
>>>     max_number_of_estimators=1,
>>>     train_sample_rows_test_size=1,
>>>     daub_include_only_estimators=[AutoAI.ClassificationAlgorithms.LGBM, AutoAI.ClassificationAlgorithms.XGB],
>>>     t_shirt_size=AutoAI.TShirtSize.L
>>> )
>>>
>>> optimizer = experiment.optimizer(
>>>     name="name of the optimizer.",
>>>     prediction_type=AutoAI.PredictionType.MULTICLASS,
>>>     prediction_column="y",
>>>     scoring=AutoAI.Metrics.ROC_AUC_SCORE,
>>>     desc="Some description.",
>>> )
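The documented bounds on the optimizer arguments can be checked before a run is submitted. This is a minimal sketch under stated assumptions: `check_optimizer_args` is a hypothetical helper, the 1 to 4 range comes from the max_number_of_estimators description, and treating test_size as a fraction strictly between 0 and 1 is an assumption based on the default of 0.1. Whether the client performs the same checks is not stated here.

```python
def check_optimizer_args(max_number_of_estimators=1, test_size=0.1):
    """Validate two optimizer arguments before calling experiment.optimizer().

    Hypothetical pre-flight check, not part of the client.
    """
    # Documented range: min 1, max 4.
    if not 1 <= max_number_of_estimators <= 4:
        raise ValueError("max_number_of_estimators must be between 1 and 4")
    # Assumption: test_size is a fraction of the dataset, so 0 < test_size < 1.
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be a fraction between 0 and 1")
    return {
        "max_number_of_estimators": max_number_of_estimators,
        "test_size": test_size,
    }
```

The returned dictionary could then be unpacked into the call, for example `experiment.optimizer(name=..., **checked)`.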
-
runs(*, filter: str) → Union[watson_machine_learning_client.experiment.autoai.runs.auto_pipelines_runs.AutoPipelinesRuns, watson_machine_learning_client.experiment.autoai.runs.local_auto_pipelines_runs.LocalAutoPipelinesRuns][source]¶ Get the historical runs but with WML Pipeline name filter (for remote scenario). Get the historical runs but with experiment name filter (for local scenario).
- filter: str, required
WML Pipeline name to filter the historical runs (remote scenario), or experiment name to filter the local historical runs (local scenario).
AutoPipelinesRuns or LocalAutoPipelinesRuns
>>> from watson_machine_learning_client.experiment import AutoAI
>>> experiment = AutoAI(...)
>>>
>>> experiment.runs(filter='Test').list()
RemoteAutoPipelines¶
-
class
watson_machine_learning_client.experiment.autoai.optimizers.remote_auto_pipelines.RemoteAutoPipelines(name: str, prediction_type: PredictionType, prediction_column: str, scoring: Metrics, engine: WMLEngine, desc: str = None, test_size: float = 0.1, max_num_daub_ensembles: int = 1, t_shirt_size: TShirtSize = 'm', train_sample_rows_test_size: float = None, daub_include_only_estimators: List[Union[ClassificationAlgorithms, RegressionAlgorithms]] = None, data_join_graph: DataJoinGraph = None, csv_separator: Union[List[str], str] = ',', excel_sheet: Union[str, int] = 0, positive_label: str = None, data_join_only: bool = False, notebooks=False, autoai_pod_version=None, obm_pod_version=None)[source]¶ Bases:
watson_machine_learning_client.experiment.autoai.optimizers.base_auto_pipelines.BaseAutoPipelines
RemoteAutoPipelines class for pipeline operation automation on WML.
- name: str, required
Name for the AutoPipelines
- prediction_type: PredictionType, required
Type of the prediction.
- prediction_column: str, required
name of the target/label column
- scoring: Metrics, required
Type of the metric to optimize with.
- desc: str, optional
Description
- test_size: float, optional
Fraction of the entire dataset to leave as a holdout (for example, 0.1 keeps 10% aside). Default 0.1
- max_num_daub_ensembles: int, optional
Maximum number (top-K ranked by DAUB model selection) of the selected algorithm or estimator types, for example LGBMClassifierEstimator, XGBClassifierEstimator, or LogisticRegressionEstimator, to use in pipeline composition. The default is 1, where only the algorithm type ranked highest by model selection is used.
- train_sample_rows_test_size: float, optional
Training data sampling percentage
- daub_include_only_estimators: List[Union[‘ClassificationAlgorithms’, ‘RegressionAlgorithms’]], optional
List of estimators to include in computation process.
- csv_separator: Union[List[str], str], optional
The separator, or list of separators to try for separating columns in a CSV file. Not used if the file_name is not a CSV file. Default is ‘,’.
- excel_sheet: Union[str, int], optional
Name or number of the Excel sheet to use. Used only when the input is an xlsx file. Default is 0.
- positive_label: str, optional
The positive class to report for binary classification. Ignored for multiclass and regression.
- t_shirt_size: TShirtSize, optional
The size of the remote AutoAI POD instance (computing resources). Only applicable to a remote scenario.
- engine: WMLEngine, required
Engine for remote work on WML.
- data_join_graph: DataJoinGraph, optional
A graph object with definition of join structure for multiple input data sources. Data preprocess step for multiple files.
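The csv_separator parameter accepts either one separator or a list of separators to try. How the service chooses among them is not described here; the sketch below shows one plausible interpretation using only the standard library. The function name `pick_separator` and the rule "first candidate that splits every row into the same number of columns, more than one" are assumptions made for illustration.

```python
import csv
import io


def pick_separator(text, candidates=","):
    """Return the first candidate separator that yields a consistent table.

    Hypothetical helper: a candidate is accepted when every non-empty row
    splits into the same number of columns and that number is above one.
    """
    if isinstance(candidates, str):
        candidates = [candidates]
    for sep in candidates:
        rows = list(csv.reader(io.StringIO(text), delimiter=sep))
        widths = {len(row) for row in rows if row}
        if len(widths) == 1 and next(iter(widths)) > 1:
            return sep
    raise ValueError("none of the candidate separators fits the data")
```

With a semicolon-delimited sample, `pick_separator(sample, [",", ";"])` rejects the comma because every row collapses into a single column, then accepts the semicolon.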
-
fit(train_data: Optional[pandas.core.frame.DataFrame] = None, *, training_data_reference: List[watson_machine_learning_client.helpers.connections.connections.DataConnection], training_results_reference: Optional[watson_machine_learning_client.helpers.connections.connections.DataConnection] = None, background_mode=False) → dict[source]¶ Run a training process on WML of autoai on top of the training data referenced by DataConnection.
- training_data_reference: List[DataConnection], required
Data storage connection details to inform where training data is stored.
- training_results_reference: DataConnection, optional
Data storage connection details to store pipeline training results. Not applicable on CP4D.
- background_mode: bool, optional
Indicates whether the fit() method runs in the background (asynchronously, True) or blocks until training finishes (synchronously, False).
Dictionary with run details.
>>> from watson_machine_learning_client.experiment import AutoAI
>>> from watson_machine_learning_client.helpers import DataConnection, S3Connection, S3Location
>>>
>>> experiment = AutoAI(credentials, ...)
>>> remote_optimizer = experiment.optimizer(...)
>>>
>>> remote_optimizer.fit(
>>>     training_data_reference=[DataConnection(
>>>         connection=S3Connection(
>>>             endpoint_url="https://s3.us.cloud-object-storage.appdomain.cloud",
>>>             access_key_id="9c92n0scodfa",
>>>             secret_access_key="0ch827gf9oiwdn0c90n20nc0oms29j"),
>>>         location=S3Location(
>>>             bucket='automl',
>>>             path='german_credit_data_biased_training.csv')
>>>     )],
>>>     training_results_reference=DataConnection(
>>>         connection=S3Connection(
>>>             endpoint_url="https://s3.us.cloud-object-storage.appdomain.cloud",
>>>             access_key_id="9c92n0scodfa",
>>>             secret_access_key="0ch827gf9oiwdn0c90n20nc0oms29j"),
>>>         location=S3Location(
>>>             bucket='automl',
>>>             path='')
>>>     ),
>>>     background_mode=False)
-
get_data_connections() → List[watson_machine_learning_client.helpers.connections.connections.DataConnection][source]¶ Create DataConnection objects for further use (e.g. to handle the data storage connection or to recreate the AutoAI holdout split).
List[‘DataConnection’] with populated optimizer parameters
-
get_params() → dict[source]¶ Get configuration parameters of AutoPipelines.
Dictionary with AutoPipelines parameters.
>>> from watson_machine_learning_client.experiment import AutoAI
>>> experiment = AutoAI(credentials, ...)
>>> remote_optimizer = experiment.optimizer(...)
>>>
>>> remote_optimizer.get_params()
    {
        'name': 'test name',
        'desc': 'test description',
        'prediction_type': 'classification',
        'prediction_column': 'y',
        'scoring': 'roc_auc',
        'test_size': 0.1,
        'max_num_daub_ensembles': 1,
        't_shirt_size': 'm',
        'train_sample_rows_test_size': 0.8,
        'daub_include_only_estimators': ["ExtraTreesClassifierEstimator",
                                         "GradientBoostingClassifierEstimator",
                                         "LGBMClassifierEstimator",
                                         "LogisticRegressionEstimator",
                                         "RandomForestClassifierEstimator",
                                         "XGBClassifierEstimator"]
    }
-
get_pipeline(pipeline_name: str = None, astype: PipelineTypes = 'lale', persist: bool = False) → Union[Pipeline, TrainablePipeline][source]¶ Download specified pipeline from WML.
- pipeline_name: str, optional
Pipeline name, if you want to see the pipelines names, please use summary() method. If this parameter is None, the best pipeline will be fetched.
- astype: PipelineTypes, optional
Type of returned pipeline model. If not specified, lale type is chosen.
- persist: bool, optional
Indicates if selected pipeline should be stored locally.
Scikit-Learn pipeline or Lale TrainablePipeline.
See also: RemoteAutoPipelines.summary()
>>> from watson_machine_learning_client.experiment import AutoAI
>>> experiment = AutoAI(credentials, ...)
>>> remote_optimizer = experiment.optimizer(...)
>>>
>>> pipeline_1 = remote_optimizer.get_pipeline(pipeline_name='Pipeline_1')
>>> pipeline_2 = remote_optimizer.get_pipeline(pipeline_name='Pipeline_1', astype=AutoAI.PipelineTypes.LALE)
>>> pipeline_3 = remote_optimizer.get_pipeline(pipeline_name='Pipeline_1', astype=AutoAI.PipelineTypes.SKLEARN)
>>> type(pipeline_3)
    <class 'sklearn.pipeline.Pipeline'>
>>> pipeline_4 = remote_optimizer.get_pipeline(pipeline_name='Pipeline_1', persist=True)
    Selected pipeline stored under: "absolute_local_path_to_model/model.pickle"
-
get_pipeline_details(pipeline_name: Optional[str] = None) → dict[source]¶ Fetch specific pipeline details, eg. steps etc.
- pipeline_name: str, optional
Pipeline name eg. Pipeline_1, if not specified, best pipeline parameters will be fetched
Dictionary with pipeline parameters.
>>> from watson_machine_learning_client.experiment import AutoAI
>>> experiment = AutoAI(credentials, ...)
>>> remote_optimizer = experiment.optimizer(...)
>>>
>>> remote_optimizer.get_pipeline_details()
>>> remote_optimizer.get_pipeline_details(pipeline_name='Pipeline_4')
    {
        'composition_steps': ['TrainingDataset_full_4521_16',
                              'Split_TrainingHoldout',
                              'TrainingDataset_full_4068_16',
                              'Preprocessor_default',
                              'DAUB'],
        'pipeline_nodes': ['PreprocessingTransformer',
                           'GradientBoostingClassifierEstimator']
    }
-
get_preprocessed_data_connection() → watson_machine_learning_client.helpers.connections.connections.DataConnection[source]¶ Create a DataConnection object (with OBM output) for further use (e.g. to handle the data storage connection or to recreate the AutoAI holdout split).
DataConnection with populated optimizer parameters
-
get_preprocessing_pipeline() → watson_machine_learning_client.preprocessing.data_join_pipeline.DataJoinPipeline[source]¶ Returns the preprocessing pipeline object for further use (e.g. to visualize the preprocessing pipeline as a graph).
DataJoinPipeline
-
get_run_details() → dict[source]¶ Get fit/run details.
Dictionary with AutoPipelineOptimizer fit/run details.
>>> from watson_machine_learning_client.experiment import AutoAI
>>> experiment = AutoAI(credentials, ...)
>>> remote_optimizer = experiment.optimizer(...)
>>>
>>> remote_optimizer.get_run_details()
-
get_run_status() → str[source]¶ Check status/state of initialized AutoPipelines run if ran in background mode
String with the run status.
>>> from watson_machine_learning_client.experiment import AutoAI
>>> experiment = AutoAI(credentials, ...)
>>> remote_optimizer = experiment.optimizer(...)
>>>
>>> remote_optimizer.get_run_status()
    'completed'
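When fit() is started with background_mode=True, the status has to be polled until the run ends. The loop below is a sketch, not part of the client: `wait_for_run` and its poll_seconds and timeout_seconds parameters are hypothetical, and the terminal states are the two values listed under RunStateTypes ('completed' and 'failed'). It works with any object that exposes get_run_status().

```python
import time

# Terminal states as listed in the RunStateTypes enum of this reference.
TERMINAL_STATES = {"completed", "failed"}


def wait_for_run(optimizer, poll_seconds=30.0, timeout_seconds=3600.0):
    """Poll optimizer.get_run_status() until a terminal state or a timeout.

    Hypothetical helper; returns the final status string.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = optimizer.get_run_status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still in state {status!r} after {timeout_seconds} s")
        time.sleep(poll_seconds)
```

A caller would typically check the returned value and only fetch pipelines with summary() or get_pipeline() when it equals 'completed'.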
-
predict(X: Union[pandas.core.frame.DataFrame, numpy.ndarray]) → numpy.ndarray[source]¶ Predict method called on top of the best fetched pipeline.
- X: numpy.ndarray or pandas.DataFrame, required
Test data for prediction
Numpy ndarray with model predictions.
-
summary() → pandas.core.frame.DataFrame[source]¶ Prints AutoPipelineOptimizer Pipelines details (autoai trained pipelines).
Pandas DataFrame with computed pipelines and ML metrics.
>>> from watson_machine_learning_client.experiment import AutoAI
>>> experiment = AutoAI(credentials, ...)
>>> remote_optimizer = experiment.optimizer(...)
>>>
>>> remote_optimizer.summary()
                   training_normalized_gini_coefficient  ...  training_f1
    Pipeline Name                                        ...
    Pipeline_3                                 0.359173  ...     0.449197
    Pipeline_4                                 0.359173  ...     0.449197
    Pipeline_1                                 0.358124  ...     0.449057
    Pipeline_2                                 0.358124  ...     0.449057
LocalAutoPipelines¶
-
class
watson_machine_learning_client.experiment.autoai.optimizers.local_auto_pipelines.LocalAutoPipelines(name: str, prediction_type: PredictionType, prediction_column: str, scoring: Metrics, desc: str = None, test_size: float = 0.1, max_num_daub_ensembles: int = 1, train_sample_rows_test_size: float = 1.0, daub_include_only_estimators: List[Union[ClassificationAlgorithms, RegressionAlgorithms]] = None, positive_label: str = None, _data_clients: List[Tuple[DataConnection, resource]] = None, _result_client: Tuple[DataConnection, resource] = None, _force_local_scenario: bool = False)[source]¶ Bases:
watson_machine_learning_client.experiment.autoai.optimizers.base_auto_pipelines.BaseAutoPipelines
LocalAutoPipelines class for pipeline operation automation.
- name: str, required
Name for the AutoPipelines
- prediction_type: PredictionType, required
Type of the prediction.
- prediction_column: str, required
name of the target/label column
- scoring: Metrics, required
Type of the metric to optimize with.
- desc: str, optional
Description
- test_size: float, optional
Fraction of the entire dataset to leave as a holdout (for example, 0.1 keeps 10% aside). Default 0.1
- max_num_daub_ensembles: int, optional
Maximum number (top-K ranked by DAUB model selection) of the selected algorithm or estimator types, for example LGBMClassifierEstimator, XGBClassifierEstimator, or LogisticRegressionEstimator, to use in pipeline composition. The default is 1, where only the algorithm type ranked highest by model selection is used.
- train_sample_rows_test_size: float, optional
Training data sampling percentage
- daub_include_only_estimators: List[Union[‘ClassificationAlgorithms’, ‘RegressionAlgorithms’]], optional
List of estimators to include in computation process.
- _data_clients: List[Union[‘client’, ‘resource’]], optional
Internal argument to auto-gen notebooks.
- _result_client: Union[‘client’, ‘resource’], optional
Internal argument to auto-gen notebooks.
- _force_local_scenario: bool, optional
Internal argument to force local scenario enablement.
-
fit(X: DataFrame, y: Series) → Pipeline[source]¶ Run a training process of AutoAI locally.
- X: pandas.DataFrame, required
Training dataset.
- y: pandas.Series, required
Target values.
Pipeline model (best found)
>>> from watson_machine_learning_client.experiment import AutoAI
>>> experiment = AutoAI()
>>> local_optimizer = experiment.optimizer()
>>>
>>> fitted_best_model = local_optimizer.fit(X=test_data_x, y=test_data_y)
-
get_data_connections() → List[DataConnection][source]¶ Provides list of DataConnections with training data that user specified.
List[‘DataConnection’] with populated optimizer parameters
-
get_holdout_data() → Tuple[pandas.core.frame.DataFrame, numpy.ndarray][source]¶ Provide holdout part of the training dataset (X and y) to the user.
X: DataFrame, y: ndarray
>>> from watson_machine_learning_client.experiment import AutoAI
>>> experiment = AutoAI()
>>> local_optimizer = experiment.optimizer()
>>>
>>> holdout_data = local_optimizer.get_holdout_data()
-
get_params() → dict[source]¶ Get configuration parameters of AutoPipelines.
Dictionary with AutoPipelines parameters.
>>> from watson_machine_learning_client.experiment import AutoAI
>>> experiment = AutoAI()
>>> local_optimizer = experiment.optimizer()
>>>
>>> local_optimizer.get_params()
    {
        'name': 'test name',
        'desc': 'test description',
        'prediction_type': 'classification',
        'prediction_column': 'y',
        'scoring': 'roc_auc',
        'test_size': 0.1,
        'max_num_daub_ensembles': 1,
        'train_sample_rows_test_size': 0.8,
        'daub_include_only_estimators': ["ExtraTreesClassifierEstimator",
                                         "GradientBoostingClassifierEstimator",
                                         "LGBMClassifierEstimator",
                                         "LogisticRegressionEstimator",
                                         "RandomForestClassifierEstimator",
                                         "XGBClassifierEstimator"]
    }
-
get_pipeline(pipeline_name: str = None, astype: PipelineTypes = 'lale', persist: bool = False) → Union[Pipeline, TrainablePipeline][source]¶ Get specified computed pipeline.
- pipeline_name: str, optional
Pipeline name, if you want to see the pipelines names, please use summary() method. If this parameter is None, the best pipeline will be fetched.
- astype: PipelineTypes, optional
Type of returned pipeline model. If not specified, lale type is chosen.
- persist: bool, optional
Indicates if selected pipeline should be stored locally.
Scikit-Learn pipeline or Lale TrainablePipeline.
See also: LocalAutoPipelines.summary()
>>> from watson_machine_learning_client.experiment import AutoAI
>>> experiment = AutoAI()
>>> local_optimizer = experiment.optimizer()
>>>
>>> pipeline_1 = local_optimizer.get_pipeline(pipeline_name='Pipeline_1')
>>> pipeline_2 = local_optimizer.get_pipeline(pipeline_name='Pipeline_1', astype=AutoAI.PipelineTypes.LALE)
>>> pipeline_3 = local_optimizer.get_pipeline(pipeline_name='Pipeline_1', astype=AutoAI.PipelineTypes.SKLEARN)
>>> type(pipeline_3)
    <class 'sklearn.pipeline.Pipeline'>
-
get_pipeline_details(pipeline_name: Optional[str] = None) → dict[source]¶ Fetch specific pipeline details, eg. steps etc.
- pipeline_name: str, optional
Pipeline name eg. Pipeline_1, if not specified, best pipeline parameters will be fetched
Dictionary with pipeline parameters.
>>> from watson_machine_learning_client.experiment import AutoAI
>>> experiment = AutoAI()
>>> local_optimizer = experiment.optimizer()
>>>
>>> pipeline_details = local_optimizer.get_pipeline_details(pipeline_name="Pipeline_1")
-
get_preprocessed_data_connection() → DataConnection[source]¶ Provides DataConnection with preprocessed training data.
DataConnection with populated optimizer parameters
-
get_preprocessing_pipeline() → watson_machine_learning_client.preprocessing.data_join_pipeline.DataJoinPipeline[source]¶ Returns the preprocessing pipeline object for further use (e.g. to visualize the preprocessing pipeline as a graph).
DataJoinPipeline
-
predict(X: Union[pandas.core.frame.DataFrame, numpy.ndarray]) → numpy.ndarray[source]¶ Predict method called on top of the best computed pipeline.
- X: numpy.ndarray or pandas.DataFrame, required
Test data for prediction.
Numpy ndarray with model predictions.
>>> from watson_machine_learning_client.experiment import AutoAI
>>> experiment = AutoAI()
>>> local_optimizer = experiment.optimizer()
>>>
>>> predictions = local_optimizer.predict(X=test_data)
-
summary() → pandas.core.frame.DataFrame[source]¶ Prints AutoPipelineOptimizer Pipelines details (autoai trained pipelines).
Pandas DataFrame with computed pipelines and ML metrics.
>>> from watson_machine_learning_client.experiment import AutoAI
>>> experiment = AutoAI()
>>> local_optimizer = experiment.optimizer()
>>>
>>> local_optimizer.summary()
                   training_normalized_gini_coefficient  ...  training_f1
    Pipeline Name                                        ...
    Pipeline_3                                 0.359173  ...     0.449197
    Pipeline_4                                 0.359173  ...     0.449197
    Pipeline_1                                 0.358124  ...     0.449057
    Pipeline_2                                 0.358124  ...     0.449057
AutoAI Helpers (BETA for IBM Cloud)¶
DataConnection¶
-
class
watson_machine_learning_client.helpers.connections.connections.DataConnection(location: Union[watson_machine_learning_client.helpers.connections.connections.S3Location, watson_machine_learning_client.helpers.connections.connections.FSLocation, watson_machine_learning_client.helpers.connections.connections.CP4DAssetLocation, watson_machine_learning_client.helpers.connections.connections.WMLSAssetLocation, watson_machine_learning_client.helpers.connections.connections.WSDAssetLocation, watson_machine_learning_client.helpers.connections.connections.DeploymentOutputAssetLocation], connection: Optional[watson_machine_learning_client.helpers.connections.connections.S3Connection] = None, data_join_node_name: Optional[str] = None)[source]¶ Bases:
watson_machine_learning_client.helpers.connections.base_data_connection.BaseDataConnection
Data Storage Connection class needed for WML training metadata (input data).
- connection: S3Connection, optional
connection parameters of specific type
- location: Union[S3Location, FSLocation, CP4DAssetLocation, WMLSAssetLocation, WSDAssetLocation, DeploymentOutputAssetLocation], required
location parameters of specific type
-
classmethod
from_studio(path: str) → List[watson_machine_learning_client.helpers.connections.connections.DataConnection][source]¶ Create DataConnections from the credentials stored (connected) in Watson Studio. Only for COS.
- path: str, required
Path in COS bucket to the training dataset.
List with DataConnection objects.
>>> data_connections = DataConnection.from_studio(path='iris_dataset.csv')
-
read(with_holdout_split: bool = False, csv_separator: str = ',', excel_sheet: Union[str, int] = 0) → Union[pandas.core.frame.DataFrame, Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame]][source]¶ Download dataset stored in remote data storage.
- with_holdout_split: bool, optional
If True, data will be split into training and holdout datasets the same way AutoAI split it.
- csv_separator: str, optional
Separator / delimiter for CSV file, default is ‘,’
- excel_sheet: Union[str, int], optional
Excel file sheet name to use, default is 0.
pandas.DataFrame containing the dataset from remote data storage, or Tuple[pandas.DataFrame, pandas.DataFrame] containing training data and holdout data from remote storage (only if with_holdout_split == True and auto_pipeline_params was passed)
S3Connection¶
-
class
watson_machine_learning_client.helpers.connections.connections.S3Connection(endpoint_url: str, access_key_id: Optional[str] = None, secret_access_key: Optional[str] = None, api_key: Optional[str] = None, service_name: Optional[str] = None, auth_endpoint: Optional[str] = None)[source]¶ Bases:
watson_machine_learning_client.helpers.connections.base_connection.BaseConnection
Connection class to COS data storage in S3 format.
- endpoint_url: str, required
S3 data storage url (COS)
- access_key_id: str, optional
access_key_id of the S3 connection (COS)
- secret_access_key: str, optional
secret_access_key of the S3 connection (COS)
- api_key: str, optional
API key of the S3 connection (COS)
- service_name: str, optional
Service name of the S3 connection (COS)
- auth_endpoint: str, optional
Authentication endpoint url of the S3 connection (COS)
S3Location¶
-
class
watson_machine_learning_client.helpers.connections.connections.S3Location(bucket: str, path: str, **kwargs)[source]¶ Bases:
watson_machine_learning_client.helpers.connections.base_location.BaseLocation
Connection class to COS data storage in S3 format.
- bucket: str, required
COS bucket name
- path: str, required
COS data path in the bucket
- model_location: str, optional
Path to the pipeline model in the COS.
- training_status: str, optional
Path to the training status JSON in COS.
get_credentials_from_config¶
-
class
watson_machine_learning_client.helpers.helpers.get_credentials_from_config(env_name, credentials_name, config_path='./config.ini')[source]¶ Load credentials from config file.
[DEV_LC]
wml_credentials = { }
cos_credentials = { }
- Parameters
env_name (str) – the name of [ENV] defined in config file
credentials_name (str) – name of credentials
config_path (str) – path to the config file
- Returns
dict
>>> get_credentials_from_config(env_name='DEV_LC', credentials_name='wml_credentials')
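The config layout shown above (an [ENV] section with one JSON-like value per credentials name) can be read with the standard library alone. The sketch below is an assumption about how such a loader could work, not the client's implementation: `load_credentials` is a hypothetical name, and it assumes each value is valid JSON written on a single line.

```python
import configparser
import json


def load_credentials(env_name, credentials_name, config_path="./config.ini"):
    """Read one credentials dictionary from an INI file.

    Hypothetical stand-in for get_credentials_from_config; assumes the value
    stored under credentials_name is a JSON object.
    """
    # interpolation=None keeps '%' characters in secrets from being expanded.
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(config_path):
        raise FileNotFoundError(config_path)
    return json.loads(parser[env_name][credentials_name])
```

Called as `load_credentials('DEV_LC', 'wml_credentials')`, it mirrors the argument order of the documented helper and returns a dict.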
Enums¶
-
class
watson_machine_learning_client.utils.autoai.enums.ClassificationAlgorithms(value)[source]¶ Bases:
enum.Enum
Classification algorithms that AutoAI could use.
-
DT= 'DecisionTreeClassifierEstimator'¶
-
EX_TREES= 'ExtraTreesClassifierEstimator'¶
-
GB= 'GradientBoostingClassifierEstimator'¶
-
LGBM= 'LGBMClassifierEstimator'¶
-
LR= 'LogisticRegressionEstimator'¶
-
RF= 'RandomForestClassifierEstimator'¶
-
XGB= 'XGBClassifierEstimator'¶
-
-
class
watson_machine_learning_client.utils.autoai.enums.DataConnectionTypes[source]¶ Bases:
object
Supported types of DataConnection. OneOf: [s3, FS]
-
DS= 'data_asset'¶
-
FS= 'fs'¶
-
S3= 's3'¶
-
-
class
watson_machine_learning_client.utils.autoai.enums.Directions[source]¶ Bases:
object
Possible metrics directions
-
ASCENDING= 'ascending'¶
-
DESCENDING= 'descending'¶
-
-
class
watson_machine_learning_client.utils.autoai.enums.Metrics[source]¶ Bases:
object
Supported types of classification and regression metrics in AutoAI.
-
ACCURACY_SCORE= 'accuracy'¶
-
AVERAGE_PRECISION_SCORE= 'average_precision'¶
-
EXPLAINED_VARIANCE_SCORE= 'explained_variance'¶
-
F1_SCORE= 'f1'¶
-
F1_SCORE_MACRO= 'f1_macro'¶
-
F1_SCORE_MICRO= 'f1_micro'¶
-
F1_SCORE_WEIGHTED= 'f1_weighted'¶
-
LOG_LOSS= 'neg_log_loss'¶
-
MEAN_ABSOLUTE_ERROR= 'neg_mean_absolute_error'¶
-
MEAN_SQUARED_ERROR= 'neg_mean_squared_error'¶
-
MEAN_SQUARED_LOG_ERROR= 'neg_mean_squared_log_error'¶
-
MEDIAN_ABSOLUTE_ERROR= 'neg_median_absolute_error'¶
-
PRECISION_SCORE= 'precision'¶
-
PRECISION_SCORE_MACRO= 'precision_macro'¶
-
PRECISION_SCORE_MICRO= 'precision_micro'¶
-
PRECISION_SCORE_WEIGHTED= 'precision_weighted'¶
-
R2_SCORE= 'r2'¶
-
RECALL_SCORE= 'recall'¶
-
RECALL_SCORE_MACRO= 'recall_macro'¶
-
RECALL_SCORE_MICRO= 'recall_micro'¶
-
RECALL_SCORE_WEIGHTED= 'recall_weighted'¶
-
ROC_AUC_SCORE= 'roc_auc'¶
-
ROOT_MEAN_SQUARED_ERROR= 'neg_root_mean_squared_error'¶
-
ROOT_MEAN_SQUARED_LOG_ERROR= 'neg_root_mean_squared_log_error'¶
-
-
class
watson_machine_learning_client.utils.autoai.enums.MetricsToDirections(value)[source]¶ Bases:
enum.Enum
Map of metrics directions.
-
ACCURACY= 'ascending'¶
-
AVERAGE_PRECISION= 'ascending'¶
-
EXPLAINED_VARIANCE= 'ascending'¶
-
F1= 'ascending'¶
-
F1_MACRO= 'ascending'¶
-
F1_MICRO= 'ascending'¶
-
F1_WEIGHTED= 'ascending'¶
-
NEG_LOG_LOSS= 'descending'¶
-
NEG_MEAN_ABSOLUTE_ERROR= 'descending'¶
-
NEG_MEAN_SQUARED_ERROR= 'descending'¶
-
NEG_MEAN_SQUARED_LOG_ERROR= 'descending'¶
-
NEG_MEDIAN_ABSOLUTE_ERROR= 'descending'¶
-
NEG_ROOT_MEAN_SQUARED_ERROR= 'descending'¶
-
NEG_ROOT_MEAN_SQUARED_LOG_ERROR= 'descending'¶
-
NORMALIZED_GINI_COEFFICIENT= 'ascending'¶
-
PRECISION= 'ascending'¶
-
PRECISION_MACRO= 'ascending'¶
-
PRECISION_MICRO= 'ascending'¶
-
PRECISION_WEIGHTED= 'ascending'¶
-
R2= 'ascending'¶
-
RECALL= 'ascending'¶
-
RECALL_MACRO= 'ascending'¶
-
RECALL_MICRO= 'ascending'¶
-
RECALL_WEIGHTED= 'ascending'¶
-
ROC_AUC= 'ascending'¶
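The member names of MetricsToDirections are the upper-cased values of the Metrics constants (for example 'roc_auc' and ROC_AUC), so a direction can be looked up by name. The sketch below uses a local stand-in enum with only a few of the entries listed above; the helper name `direction_for` is hypothetical.

```python
from enum import Enum


class MetricsToDirections(Enum):
    """Local stand-in holding a subset of the documented entries."""
    ACCURACY = "ascending"
    ROC_AUC = "ascending"
    NEG_LOG_LOSS = "descending"
    NEG_MEAN_SQUARED_ERROR = "descending"


def direction_for(metric_value):
    """Map a Metrics value such as 'roc_auc' to its documented direction."""
    # Members sharing a value become aliases in Python enums, but lookup
    # by name still works for every one of them.
    return MetricsToDirections[metric_value.upper()].value
```

A metric value that has no matching member raises KeyError, which makes unsupported names easy to spot.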
-
-
class
watson_machine_learning_client.utils.autoai.enums.PipelineTypes[source]¶ Bases:
object
Supported types of Pipelines.
-
LALE= 'lale'¶
-
SKLEARN= 'sklearn'¶
-
-
class
watson_machine_learning_client.utils.autoai.enums.PositiveLabelClass[source]¶ Bases:
object
Metrics that need positive label definition for binary classification.
-
AVERAGE_PRECISION_SCORE= 'average_precision'¶
-
F1_SCORE= 'f1'¶
-
F1_SCORE_MACRO= 'f1_macro'¶
-
F1_SCORE_MICRO= 'f1_micro'¶
-
F1_SCORE_WEIGHTED= 'f1_weighted'¶
-
PRECISION_SCORE= 'precision'¶
-
PRECISION_SCORE_MACRO= 'precision_macro'¶
-
PRECISION_SCORE_MICRO= 'precision_micro'¶
-
PRECISION_SCORE_WEIGHTED= 'precision_weighted'¶
-
RECALL_SCORE= 'recall'¶
-
RECALL_SCORE_MACRO= 'recall_macro'¶
-
RECALL_SCORE_MICRO= 'recall_micro'¶
-
RECALL_SCORE_WEIGHTED= 'recall_weighted'¶
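The list above can be used to decide whether positive_label should be supplied for a given scoring metric. This is a sketch only: `resolve_positive_label` is a hypothetical helper, the set is copied from the PositiveLabelClass values, and raising an error for a missing label is an illustrative choice. Whether the client itself enforces this is not stated here.

```python
# Metric values copied from the PositiveLabelClass listing.
NEEDS_POSITIVE_LABEL = frozenset({
    "average_precision",
    "f1", "f1_macro", "f1_micro", "f1_weighted",
    "precision", "precision_macro", "precision_micro", "precision_weighted",
    "recall", "recall_macro", "recall_micro", "recall_weighted",
})


def resolve_positive_label(prediction_type, scoring, positive_label=None):
    """Return the label to pass on, or None when it would be ignored.

    Hypothetical helper, not part of the client.
    """
    if prediction_type != "binary":
        # Documented behaviour: ignored for multiclass and regression.
        return None
    if scoring in NEEDS_POSITIVE_LABEL and positive_label is None:
        raise ValueError(f"scoring {scoring!r} needs a positive_label for binary classification")
    return positive_label
```

For example, a binary experiment scored with 'roc_auc' passes through without a label, while one scored with 'f1' does not.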
-
-
class
watson_machine_learning_client.utils.autoai.enums.PredictionType[source]¶ Bases:
object
Supported types of learning. OneOf: [BINARY, MULTICLASS, REGRESSION]
-
BINARY= 'binary'¶
-
MULTICLASS= 'multiclass'¶
-
REGRESSION= 'regression'¶
-
-
class
watson_machine_learning_client.utils.autoai.enums.RegressionAlgorithms(value)[source]¶ Bases:
enum.Enum
Regression algorithms that AutoAI could use.
-
DT= 'DecisionTreeRegressorEstimator'¶
-
EX_TREES= 'ExtraTreesRegressorEstimator'¶
-
GB= 'GradientBoostingRegressorEstimator'¶
-
LGBM= 'LGBMRegressorEstimator'¶
-
LR= 'LinearRegressionEstimator'¶
-
RF= 'RandomForestRegressorEstimator'¶
-
RIDGE= 'RidgeEstimator'¶
-
XGB= 'XGBRegressorEstimator'¶
-
-
class
watson_machine_learning_client.utils.autoai.enums.RunStateTypes[source]¶ Bases:
object
Supported types of AutoAI fit/run.
-
COMPLETED= 'completed'¶
-
FAILED= 'failed'¶
-
-
class
watson_machine_learning_client.utils.autoai.enums.TShirtSize[source]¶ Bases:
object
Possible sizes of the AutoAI POD. Depending on the POD size, AutoAI can support different data set sizes.
S - Small (2 vCPUs and 8 GB of RAM)
M - Medium (4 vCPUs and 16 GB of RAM)
L - Large (8 vCPUs and 32 GB of RAM)
XL - Extra Large (16 vCPUs and 64 GB of RAM)
-
L= 'l'¶
-
M= 'm'¶
-
ML= 'ml'¶
-
S= 's'¶
-
XL= 'xl'¶
-