projit package

Submodules

projit.ascii_plot module

projit.ascii_plot.arange(beg, end, step)[source]
projit.ascii_plot.ascii_plot(ydata, xdata=None, logscale=False, pch='o', xlabel='X', ylabel='Y', width=72, height=50)[source]
Parameters:
  • ydata – list of values to be plotted

  • xdata (None) – x coordinates corresponding to ydata. If None, x ranges from 1 to the length of ydata.

  • logscale (False) – display data with logarithmic Y axis

  • pch ('o') – character used to draw points (for example +, =, -, *)

  • title ('plot') – string for title of the plot (documented here, but absent from the signature above)

  • xlabel ('X') – label for the X axis

  • ylabel ('Y') – label for the Y axis

  • width (72) – width in characters

  • height (50) – height in characters

Returns:

string corresponding to plot
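
To illustrate how the parameters above could fit together, here is a minimal standalone sketch of a character-grid scatter plot. It is not projit's implementation: the function name sketch_plot and its scaling logic are assumptions made for illustration, and it mirrors only the documented behaviour of xdata, pch, width, and height.

```python
def sketch_plot(ydata, xdata=None, pch="o", width=72, height=50):
    """Hypothetical stand-in, not projit's ascii_plot: draw points on a character grid."""
    if not ydata:
        return ""
    # Documented default: x runs from 1 to len(ydata) when xdata is None.
    if xdata is None:
        xdata = list(range(1, len(ydata) + 1))

    xmin, xmax = min(xdata), max(xdata)
    ymin, ymax = min(ydata), max(ydata)
    xspan = (xmax - xmin) or 1  # avoid division by zero for constant data
    yspan = (ymax - ymin) or 1

    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xdata, ydata):
        col = round((x - xmin) / xspan * (width - 1))
        row = round((y - ymin) / yspan * (height - 1))
        grid[height - 1 - row][col] = pch  # row 0 is the top of the plot

    return "\n".join("".join(line) for line in grid)


plot = sketch_plot([1, 4, 9, 16], width=20, height=8)
print(plot)
```

The sketch assumes pch is a single character; a longer string would make rows wider than width.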

projit.cli module

projit.cli.cli_main()[source]
projit.cli.extract_max_tags_lengths(project, asset, tags)[source]
projit.cli.filler(current, max_len, content=' ')[source]
projit.cli.main()[source]
projit.cli.print_header(header)[source]
projit.cli.print_results_latex(title, df)[source]
LaTeX output. Putting this in a central function in case we change the functionality or format in the future.

projit.cli.print_results_markdown(title, df)[source]
projit.cli.print_usage(prog)[source]

Command line application usage instructions.

projit.cli.task_add(project, asset, name, path)[source]

Add elements to a project from the command line

projit.cli.task_compare(project, datasets, metric, format, precision)[source]

Compare results across multiple datasets. This command loads the results for each dataset and extracts only the records for the specified metric to compile the comparison dataset to display.
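
The comparison step described above could be sketched in plain Python as follows. The record layout (dicts with experiment, metric, and value keys), the helper name compare_metric, and the sample values are all assumptions for illustration; projit itself works with pandas DataFrames whose columns are not documented here.

```python
def compare_metric(results_by_dataset, metric):
    """Hypothetical helper: build {experiment: {dataset: value}} for one metric.

    results_by_dataset maps a dataset name to a list of result records.
    The record keys used here are assumed, not taken from projit.
    """
    table = {}
    for dataset, records in results_by_dataset.items():
        for rec in records:
            if rec["metric"] != metric:
                continue  # keep only the requested metric
            table.setdefault(rec["experiment"], {})[dataset] = rec["value"]
    return table


# Made-up records, purely for illustration.
results = {
    "train": [
        {"experiment": "baseline", "metric": "rmse", "value": 0.52},
        {"experiment": "baseline", "metric": "mae", "value": 0.40},
    ],
    "test": [
        {"experiment": "baseline", "metric": "rmse", "value": 0.61},
    ],
}
comparison = compare_metric(results, "rmse")
```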

projit.cli.task_init(name, template='')[source]

Initialise a project from the command line. This function requests a description, and thus runs in interactive mode.

projit.cli.task_list(subcmd, project, dataset, format, precision, tags)[source]

List content of a project from the command line

projit.cli.task_plot(project, experiment, property, metric)[source]
projit.cli.task_render(project, path)[source]

Generate a PDF and write it to the provided path.

projit.cli.task_rm(project, asset, name)[source]

Remove elements from a project from the command line

projit.cli.task_status(project)[source]
projit.cli.task_tag(project, asset, name, values)[source]

Add tags to an asset in the project from the command line

projit.cli.task_update(project)[source]

Update a project from the command line

projit.config module

projit.pdf module

class projit.pdf.PDF(orientation='P', unit='mm', format='A4')[source]

Bases: FPDF

add_description(description)[source]
add_title(title)[source]
setup()[source]

projit.projit module

class projit.projit.Projit(path, name, desc='', experiments=[], datasets={}, results={}, params={}, hyperparams={}, dataresults={}, executions={}, tags={})[source]

Bases: object

Projit Class. This is a data structure to contain the core elements of a data science project. It will permit loose coupling between processes and experiments but provide a simple overarching structure for communication and documentation.

add_dataset(name, path)[source]

Add a named dataset to the project.

Parameters:
  • name (string, required) – The dataset name

  • path (string, required) – The path to the data set (either local path, URL or S3 Bucket)

Returns:

None

Return type:

None

add_experiment(name, path)[source]

Add information of a new experiment to the project. Then save the project configuration. This function will overwrite an experiment of the same name and delete any previous results.

Parameters:
  • name (string, required) – The experiment name

  • path (string, required) – The path to the experiment.

Returns:

None

Return type:

None

add_hyperparam(name, value)[source]

Add a set of hyperparameters to the project.

Parameters:
  • name (string, required) – The experiment name

  • value (Dictionary) – The Dictionary of hyperparameters

Returns:

None

Return type:

None

add_param(name, value)[source]

Add a parameter to the project.

Parameters:
  • name (string, required) – The parameter name

  • value (Any) – The value taken by that parameter

Returns:

None

Return type:

None

add_result(experiment, metric, value, dataset=None)[source]

Add results from an experiment to the project.

They can be overall project results, or associated with a specific dataset

Parameters:
  • experiment (string, required) – The experiment name

  • metric (string, required) – The name of the metric we are adding.

  • value (float, required) – The value of the metric to add.

  • dataset (string, optional) – The dataset against which the results are generated

Returns:

None

Return type:

None

add_tags(asset, name, tags)[source]
clean_experimental_results(name)[source]

Remove all results for a given experiment

Parameters:

name (string, required) – The experiment name

Returns:

None

Return type:

None

create_local_path(ds)[source]
dataset_exists(name)[source]

Check if a given dataset is in the data structure

Parameters:

name (string, required) – The dataset name

Returns:

exists

Return type:

Boolean

end_experiment(name, id, hyperparams={})[source]

End an experiment execution. This function requires both the experiment name and the hash ID of the previously started execution.

Parameters:
  • name (string, required) – The experiment name (Unique Identifier)

  • id (string, required) – The execution hash ID returned by the function: start_experiment

  • hyperparams (Dictionary, optional) – Optional dictionary of hyperparameters used in the experiment execution

Returns:

None

Return type:

None

experiment_exists(name)[source]

Check if a given experiment is in the data structure

Parameters:

name (string, required) – The experiment name

Returns:

exists

Return type:

Boolean

get_dataset(name)[source]

Retrieve the dataset by name.

Parameters:

name (string, required) – The dataset to retrieve

Returns:

Path to dataset

Return type:

String

get_execution_times(name)[source]
get_experiment_execution_stats(name)[source]

Given an experiment name, return the execution statistics.

get_hyperparam(name)[source]
get_mean_execution_time(name)[source]
get_param(name)[source]
get_path_to_dataset(name)[source]
get_results(dataset=None)[source]

Retrieve the experimental results as a DataFrame.

They can be overall project results, or associated with a specific dataset

Parameters:

dataset (string, optional) – The dataset against which the results are generated

Returns:

DataFrame of results

Return type:

pandas.DataFrame

get_root_path()[source]

Get the path to where the project folder is located

get_tags(asset, name, tags)[source]
initiate_lock()[source]

Lock files are used during processes that modify the project so that we get consistent state across parallel executions.

is_complete_path(path)[source]
release_lock()[source]

Lock files are used during processes that modify the project so that we get consistent state across parallel executions. Release the lock by deleting the lock file
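
A lock-file protocol of this kind is commonly built on atomic, exclusive file creation. The following is a minimal sketch under that assumption; the helper names, the lock file name, and the timeout and polling values are hypothetical, and projit's actual lock location and retry policy are not documented here.

```python
import os
import tempfile
import time


def acquire_lock(lock_path, timeout=5.0, poll=0.05):
    """Hypothetical helper: create lock_path atomically, waiting up to timeout seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only one
            # process can succeed in creating the lock.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            return True
        except FileExistsError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll)


def release_lock(lock_path):
    """Release the lock by deleting the lock file."""
    try:
        os.remove(lock_path)
    except FileNotFoundError:
        pass  # already released


with tempfile.TemporaryDirectory() as tmp:
    lock = os.path.join(tmp, "project.lock")  # hypothetical file name
    first = acquire_lock(lock)                # lock is free, so this succeeds
    second = acquire_lock(lock, timeout=0.1)  # lock is held, so this times out
    release_lock(lock)
    third = acquire_lock(lock)                # free again after release
    release_lock(lock)
```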

reload()[source]

Sometimes we reload the project from disk. Necessary when multiple processes are running experiments in the same project.

render(path)[source]
rm_dataset(name)[source]

Remove a named dataset from the project.

Parameters:

name – The dataset name (or ‘.’ for all datasets)

Returns:

None

Return type:

None

rm_experiment(name)[source]

Remove a named experiment from the project.

Parameters:

name – The experiment name (or ‘.’ for all experiments)

Returns:

None

Return type:

None

save()[source]

Save your projit project into config files within the projit config dir

start_experiment(name, path, params={}, tags={})[source]

Start an experiment execution. This function will create a new experiment if this is the first execution; otherwise it will simply add a new execution record.

It returns an identifier for the execution (needed to end the execution).

Parameters:
  • name (string, required) – The experiment name (Unique Identifier)

  • path (string, required) – The path to the experiment script being executed

  • params (Dictionary, optional) – Optional dictionary of parameters used in the experiment execution

  • tags (Dictionary, optional) – Optional dictionary of tags to describe the experiment

Returns:

id : The Execution ID

Return type:

String
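
The execution lifecycle (start, run, record results, end) might be wrapped as shown below. This is a sketch: run_tracked and train_fn are hypothetical names, project is assumed to be a Projit instance (for example one returned by projit.projit.load), and only the documented signatures of start_experiment, add_result, and end_experiment are used.

```python
def run_tracked(project, name, path, train_fn, params=None):
    """Hypothetical wrapper: record one execution of an experiment.

    train_fn is assumed to return a dict mapping metric names to float values.
    """
    params = params or {}
    # start_experiment returns the execution ID needed to close the execution.
    exec_id = project.start_experiment(name, path, params=params)
    metrics = train_fn(**params)
    for metric, value in metrics.items():
        project.add_result(name, metric, value)
    project.end_experiment(name, exec_id)
    return metrics
```

In this sketch an exception raised by train_fn leaves the execution open; whether to close it in that case is a design choice left to the caller.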

update_name_description(name, descrip)[source]

Update the core values name and description

validate_asset(asset, name)[source]
projit.projit.init(template, name, desc='')[source]

Initialise a new projit project. Create the config directory and write the project config there.

Parameters:
  • name (string, required) – The name of the project

  • desc (string, optional) – The project description (defaults to an empty string)

Returns:

Projit Object

Return type:

Projit

projit.projit.init_template(template)[source]
projit.projit.load(config_path)[source]

This function allows you to instantiate a Projit project from an existing config_path. The config path must contain the config file with the required fields.

Note: This function will always overwrite the path variable in the object so the instance is aware of where it is relative to the config directory.

Parameters:

config_path (string, required) – The path to the projit configuration

Returns:

Projit Object

Return type:

Projit

projit.projit.projit_load()[source]

projit.template module

projit.template.end_profile(proc_name)[source]
projit.template.eprint(*args, **kwargs)[source]
projit.template.initialise_profile()[source]
projit.template.load_template(filename)[source]

Utility function to load a project template

projit.template.padded(k, padto=20)[source]
projit.template.print_profiles()[source]
projit.template.start_profile(proc_name)[source]

projit.utils module

projit.utils.create_properties(project_name, descrip)[source]
projit.utils.get_data_config(pathway)[source]
projit.utils.get_experiments(pathway)[source]
projit.utils.get_properties(pathway)[source]
projit.utils.initialise_project(name, descrip)[source]
projit.utils.locate_projit_config()[source]

Find a path to a projit project config, or return empty string.

projit.utils.open_config(filename)[source]
projit.utils.walk_up(bottom)[source]

Mimic os.walk, but walk ‘up’ instead of down the directory tree.
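
An upward walk of this kind could look like the sketch below. It is not projit's code: the name walk_up_sketch and the choice to yield empty listings for unreadable directories are assumptions; the yielded tuples follow the (dirpath, dirnames, filenames) shape of os.walk.

```python
import os
import tempfile


def walk_up_sketch(bottom):
    """Hypothetical generator: yield (dirpath, dirnames, filenames) from bottom up to the root."""
    current = os.path.realpath(bottom)
    while True:
        dirnames, filenames = [], []
        try:
            for entry in os.scandir(current):
                (dirnames if entry.is_dir() else filenames).append(entry.name)
        except OSError:
            pass  # unreadable directory: report it with empty listings
        yield current, dirnames, filenames
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return
        current = parent


with tempfile.TemporaryDirectory() as tmp:
    nested = os.path.join(tmp, "a", "b")
    os.makedirs(nested)
    steps = walk_up_sketch(nested)
    first_dir, _, _ = next(steps)                # the starting directory
    second_dir, second_subdirs, _ = next(steps)  # its parent
```

A config lookup can then stop at the first directory whose listing contains the expected config entry, instead of walking all the way to the root.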

projit.utils.write_config(config, filename)[source]
projit.utils.write_properties(pathway, props)[source]

Module contents