projit package
Submodules
projit.ascii_plot module
- projit.ascii_plot.ascii_plot(ydata, xdata=None, logscale=False, pch='o', xlabel='X', ylabel='Y', width=72, height=50)[source]
- Parameters:
ydata – list of values to be plotted
xdata (None) – x coordinates corresponding to ydata. If None, x ranges from 1 to the length of ydata.
logscale (False) – display data with logarithmic Y axis
pch ('o') – character used to draw the points (for example +, =, -, *)
title ('plot') – string for title of the plot
xlabel ('X') – label for the X axis
ylabel ('Y') – label for the Y axis
width (72) – width of the plot in characters
height (50) – height of the plot in characters
- Returns:
string corresponding to plot
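As a rough illustration of how the parameters above interact, here is a minimal sketch of a character-grid scatter plot. The function `sketch_ascii_plot` is a hypothetical stand-in written for this example, not the library's implementation: it omits axes and labels, and it assumes `pch` is a single character.

```python
import math


def sketch_ascii_plot(ydata, xdata=None, logscale=False, pch="o", width=72, height=50):
    """Hypothetical stand-in: scatter ydata onto a width x height character grid."""
    if xdata is None:
        # Default x coordinates run from 1 to len(ydata), as documented above.
        xdata = list(range(1, len(ydata) + 1))
    # A logarithmic Y axis is emulated by plotting log10 of the values.
    ys = [math.log10(y) for y in ydata] if logscale else list(ydata)
    xmin, xmax = min(xdata), max(xdata)
    ymin, ymax = min(ys), max(ys)
    xspan = (xmax - xmin) or 1  # avoid division by zero for constant data
    yspan = (ymax - ymin) or 1
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xdata, ys):
        col = round((x - xmin) / xspan * (width - 1))
        row = round((y - ymin) / yspan * (height - 1))
        grid[height - 1 - row][col] = pch  # grid row 0 is the top of the plot
    return "\n".join("".join(line) for line in grid)


plot = sketch_ascii_plot([1, 4, 9, 16, 25], pch="*", width=20, height=5)
print(plot)
```

The returned value is a single string, so it can be printed, logged, or written to a text report.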
projit.cli module
- projit.cli.print_results_latex(title, df)[source]
- LaTeX output. This is kept in a central function in case the functionality or format changes in the future.
- projit.cli.task_add(project, asset, name, path)[source]
Add elements to a project from the command line
- projit.cli.task_compare(project, datasets, metric, format, precision)[source]
Compare results across multiple datasets. This command loads the results for each dataset and extracts only the records for the specified metric, then compiles them into the comparison dataset to display.
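The filtering step described for task_compare can be sketched as follows. The helper `compare_metric`, the row layout, and the numbers are all invented for illustration; the real command works on the project's stored results and its own output formats.

```python
def compare_metric(results_by_dataset, metric):
    """Hypothetical helper: build {experiment: {dataset: value}} for one metric."""
    comparison = {}
    for dataset, rows in results_by_dataset.items():
        for row in rows:
            if row["metric"] != metric:
                continue  # keep only the records for the requested metric
            comparison.setdefault(row["experiment"], {})[dataset] = row["value"]
    return comparison


# Made-up example data: one list of result records per dataset.
results_by_dataset = {
    "train": [
        {"experiment": "baseline", "metric": "rmse", "value": 0.42},
        {"experiment": "baseline", "metric": "mae", "value": 0.31},
    ],
    "test": [
        {"experiment": "baseline", "metric": "rmse", "value": 0.47},
    ],
}

table = compare_metric(results_by_dataset, "rmse")
print(table)
```

Each experiment ends up with one column per dataset, which is the shape a side-by-side comparison table needs.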
- projit.cli.task_init(name, template='')[source]
Initialise a project from the command line. This function requests a description and therefore runs in interactive mode.
- projit.cli.task_list(subcmd, project, dataset, format, precision, tags)[source]
List the contents of a project from the command line
- projit.cli.task_rm(project, asset, name)[source]
Remove elements from a project from the command line
projit.config module
projit.pdf module
projit.projit module
- class projit.projit.Projit(path, name, desc='', experiments=[], datasets={}, results={}, params={}, hyperparams={}, dataresults={}, executions={}, tags={})[source]
Bases: object
Projit Class. This is a data structure to contain the core elements of a data science project. It will permit loose coupling between processes and experiments but provide a simple overarching structure for communication and documentation.
- add_dataset(name, path)[source]
Add a named dataset to the project.
- Parameters:
name (string, required) – The dataset name
path (string, required) – The path to the data set (either local path, URL or S3 Bucket)
- Returns:
None
- Return type:
None
- add_experiment(name, path)[source]
Add information of a new experiment to the project. Then save the project configuration. This function will overwrite an experiment of the same name and delete any previous results.
- Parameters:
name (string, required) – The experiment name
path (string, required) – The path to the experiment.
- Returns:
None
- Return type:
None
- add_hyperparam(name, value)[source]
Add a set of hyperparameters to the project.
- Parameters:
name (string, required) – The experiment name
value (Dictionary) – The Dictionary of hyperparameters
- Returns:
None
- Return type:
None
- add_param(name, value)[source]
Add a parameter to the project.
- Parameters:
name (string, required) – The parameter name
value (Any) – The value taken by that parameter
- Returns:
None
- Return type:
None
- add_result(experiment, metric, value, dataset=None)[source]
Add results from an experiment to the project.
They can be overall project results, or associated with a specific dataset
- Parameters:
experiment (string, required) – The experiment name
metric (string, required) – The name of the metric we are adding.
value (float, required) – The value of the metric to add.
dataset (string, optional) – The dataset against which the results are generated
- Returns:
None
- Return type:
None
- clean_experimental_results(name)[source]
Remove all results for a given experiment
- Parameters:
name (string, required) – The experiment name
- Returns:
None
- Return type:
None
- dataset_exists(name)[source]
Check if a given dataset is in the data structure
- Parameters:
name (string, required) – The dataset name
- Returns:
exists
- Return type:
Boolean
- end_experiment(name, id, hyperparams={})[source]
End an experiment execution. This function requires both the experiment name and the hash ID of the previously started execution.
- Parameters:
name (string, required) – The experiment name (Unique Identifier)
id (string, required) – The execution hash ID returned by the function: start_experiment
hyperparams – Optional dictionary of hyperparameters used in the experiment execution
- Returns:
None
- Return type:
None
- experiment_exists(name)[source]
Check if a given experiment is in the data structure
- Parameters:
name (string, required) – The experiment name
- Returns:
exists
- Return type:
Boolean
- get_dataset(name)[source]
Retrieve the dataset by name.
- Parameters:
name (string, required) – The dataset to retrieve
- Returns:
Path to dataset
- Return type:
String
- get_experiment_execution_stats(name)[source]
Given an experiment name, return the execution statistics.
- get_results(dataset=None)[source]
Retrieve the experimental results as a DataFrame.
They can be overall project results, or associated with a specific dataset
- Parameters:
dataset (string, optional) – The dataset against which the results are generated
- Returns:
DataFrame of results
- Return type:
pandas.DataFrame
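To make the distinction between project-level and dataset-specific results concrete, here is a small in-memory sketch covering the add_result and get_results entries above. The class `ResultStore` and its nested-dictionary layout are assumptions made for this example. The attribute names `results` and `dataresults` mirror the constructor arguments, but how Projit actually stores them is not documented here. The sketch returns plain rows rather than a pandas.DataFrame so that it has no dependencies.

```python
class ResultStore:
    """Hypothetical in-memory stand-in, not the Projit implementation."""

    def __init__(self):
        self.results = {}      # assumed layout: experiment -> metric -> value
        self.dataresults = {}  # assumed layout: dataset -> experiment -> metric -> value

    def add_result(self, experiment, metric, value, dataset=None):
        # Without a dataset the result is recorded at project level.
        if dataset is None:
            target = self.results
        else:
            target = self.dataresults.setdefault(dataset, {})
        target.setdefault(experiment, {})[metric] = value

    def get_rows(self, dataset=None):
        # One row per experiment, one key per metric.
        source = self.results if dataset is None else self.dataresults.get(dataset, {})
        return [{"experiment": exp, **metrics} for exp, metrics in sorted(source.items())]


store = ResultStore()
store.add_result("baseline", "rmse", 0.42)
store.add_result("baseline", "rmse", 0.47, dataset="test")
store.add_result("baseline", "mae", 0.31, dataset="test")

project_rows = store.get_rows()
test_rows = store.get_rows("test")
print(project_rows)
print(test_rows)
```

A list of row dictionaries like this can be passed to `pandas.DataFrame(rows)` if a DataFrame is wanted.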
- initiate_lock()[source]
Lock files are used during processes that modify the project so that we get consistent state across parallel executions.
- release_lock()[source]
Lock files are used during processes that modify the project so that we get consistent state across parallel executions. This method releases the lock by deleting the lock file.
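The lock-file pattern described for initiate_lock and release_lock can be sketched with the standard library alone. This is a generic illustration under stated assumptions, not Projit's code: the function names, the `project.lock` file name, and the timeout and polling behaviour are all invented for the example.

```python
import os
import tempfile
import time


def acquire_lock(lock_path, timeout=5.0, poll=0.05):
    """Create the lock file atomically, waiting while another process holds it."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if the file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            return True
        except FileExistsError:
            if time.monotonic() >= deadline:
                return False  # give up; the caller decides what to do next
            time.sleep(poll)


def release_lock(lock_path):
    """Release the lock by deleting the lock file."""
    if os.path.exists(lock_path):
        os.remove(lock_path)


with tempfile.TemporaryDirectory() as tmp:
    lock = os.path.join(tmp, "project.lock")  # hypothetical file name
    first = acquire_lock(lock)
    second = acquire_lock(lock, timeout=0.1)  # still held, so this times out
    release_lock(lock)
    third = acquire_lock(lock)
    release_lock(lock)

print(first, second, third)
```

Creating the file with an exclusive flag is what makes acquisition safe when several processes try at once: only one creation succeeds.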
- reload()[source]
Reload the project from disk. This is necessary when multiple processes are running experiments in the same project.
- rm_dataset(name)[source]
Remove a named dataset from the project.
- Parameters:
name – The dataset name (or ‘.’ for all datasets)
- Returns:
None
- Return type:
None
- rm_experiment(name)[source]
Remove a named experiment from the project.
- Parameters:
name – The experiment name (or ‘.’ for all experiments)
- Returns:
None
- Return type:
None
- start_experiment(name, path, params={}, tags={})[source]
Start an experiment execution. This function will create a new experiment if this is the first execution; otherwise it will simply add a new execution record.
It returns an identifier for the execution (needed to end the execution).
- Parameters:
name (string, required) – The experiment name (Unique Identifier)
path (string, required) – The path to the experiment script being executed
params (Dictionary, optional) – Optional dictionary of parameters used in the experiment execution
tags (Dictionary, optional) – Optional dictionary of tags to describe the experiment
- Returns:
id : The Execution ID
- Return type:
String
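The start_experiment and end_experiment pairing can be pictured as bookkeeping keyed by an execution ID. The sketch below is a hypothetical stand-in: the class `ExecutionLog`, the record fields, and the way the ID is derived (a truncated SHA-1 of the name, start time, and run count) are assumptions for illustration only.

```python
import hashlib
import time


class ExecutionLog:
    """Hypothetical stand-in for start/end bookkeeping, not the Projit class."""

    def __init__(self):
        self.executions = {}  # experiment name -> {execution id: record}

    def start_experiment(self, name, path, params=None, tags=None):
        started = time.time()
        # The first execution creates the experiment entry; later ones add records.
        runs = self.executions.setdefault(name, {})
        seed = f"{name}:{started}:{len(runs)}".encode()
        exec_id = hashlib.sha1(seed).hexdigest()[:12]
        runs[exec_id] = {
            "path": path,
            "start": started,
            "end": None,
            "params": dict(params or {}),
            "tags": dict(tags or {}),
        }
        return exec_id

    def end_experiment(self, name, exec_id, hyperparams=None):
        # Both the experiment name and the ID from start_experiment are needed.
        record = self.executions[name][exec_id]
        record["end"] = time.time()
        record["hyperparams"] = dict(hyperparams or {})


log = ExecutionLog()
run_id = log.start_experiment("baseline", "experiments/baseline.py", params={"seed": 1})
log.end_experiment("baseline", run_id, hyperparams={"alpha": 0.1})
other_id = log.start_experiment("baseline", "experiments/baseline.py")
record = log.executions["baseline"][run_id]
print(run_id, record["end"] - record["start"])
```

The sketch uses `None` defaults and copies the dictionaries it receives, so separate executions never share a mutable object.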
- projit.projit.init(template, name, desc='')[source]
Initialise a new projit project. Create the config directory and write the project config there.
- Parameters:
name (string, required) – The name of the project
desc (string, optional) – The project description (defaults to an empty string)
- Returns:
Projit Object
- Return type:
Projit
- projit.projit.load(config_path)[source]
This function allows you to instantiate a Projit project from an existing config_path. The config path must contain the config file with the required fields.
Note: This function will always overwrite the path variable in the object so the instance is aware of where it is relative to the config directory.
- Parameters:
config_path (string, required) – The path to the projit configuration
- Returns:
Projit Object
- Return type:
Projit
projit.template module
projit.utils module
- projit.utils.locate_projit_config()[source]
Find a path to a projit project config, or return empty string.
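One common way to implement this kind of lookup is to walk upward from a starting directory until a config directory is found. The sketch below assumes that strategy and a config directory named `.projit`; both are guesses made for illustration, since the description above states only that the function returns a path or an empty string. The function name `locate_config` is also hypothetical.

```python
import os
import tempfile


def locate_config(start_dir, config_dir_name=".projit"):
    """Hypothetical upward search: return the config path, or "" if none is found."""
    current = os.path.abspath(start_dir)
    while True:
        candidate = os.path.join(current, config_dir_name)
        if os.path.isdir(candidate):
            return candidate
        parent = os.path.dirname(current)
        if parent == current:
            return ""  # reached the filesystem root without finding a config
        current = parent


with tempfile.TemporaryDirectory() as tmp:
    root = os.path.abspath(tmp)
    os.makedirs(os.path.join(root, ".projit"))
    nested = os.path.join(root, "a", "b")
    os.makedirs(nested)
    found = locate_config(nested)
    expected = os.path.join(root, ".projit")
    missing = locate_config(nested, config_dir_name=".no_such_config_dir_xyz")

print(found == expected, repr(missing))
```

Returning an empty string instead of raising keeps the caller's check simple: a falsy result means no project was found.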