virelay.model

Contains the data model abstraction.

Classes

Analysis

Represents an analysis of multiple attributions.

AnalysisCategory

Represents a single category in an analysis.

AnalysisDatabase

Represents a single analysis database, which contains the analysis of attributions.

Attribution

Represents a single attribution from an attribution database.

AttributionDatabase

Represents a single attribution database, which contains the attributions for the dataset samples.

Hdf5Dataset

Represents a dataset that is stored in an HDF5 database.

ImageDirectoryDataset

Represents an image dataset in which the image files are stored in a directory hierarchy where the names of the directories represent the labels of the images.

Label

Represents a label of the dataset.

LabelMap

Represents a map between output neuron indices and their respective human-readable label name.

Project

Represents a single project, which can be loaded from a YAML file.

Sample

Represents a sample in a dataset.

Workspace

Represents a workspace, which may consist of multiple projects.

class virelay.model.Analysis(category_name, human_readable_category_name, clustering_name, clustering, embedding_name, embedding, attribution_indices, eigenvalues)[source]

Bases: object

Represents an analysis of multiple attributions.

class virelay.model.AnalysisCategory(name, human_readable_name)[source]

Bases: object

Represents a single category in an analysis. One category can contain many analyses for different attributions. The category name is usually an umbrella term for all the attributions contained in the analysis. This is most likely a class name.

class virelay.model.AnalysisDatabase(analysis_path, label_map)[source]

Bases: object

Represents a single analysis database, which contains the analysis of attributions.

close()[source]

Closes the analysis database.

get_analysis(category_name, clustering_name, embedding_name)[source]

Gets the analysis for the specified category name, clustering name, and embedding name (category names can be, for example, classes for which the analysis was performed).

Parameters
  • category_name (str) – The name of the category for which the analysis was performed. Each analysis was performed for a certain subset of the attributions; in most cases, this subset will be defined by the label of the dataset samples of the attributions. So the category name is the umbrella term for all the attributions that comprise the analysis, which will, in most cases, be the name of the label.

  • clustering_name (str) – On top of the embedding a clustering is performed. This clustering name is the name of the clustering that is to be retrieved (because usually the analysis contains multiple different clusterings, which are most likely k-means with different k’s).

  • embedding_name (str) – The name of the embedding that is to be retrieved. This is the name of the method that was used to create the embedding. Most likely this will be “spectral” for spectral embeddings and “tsne” for a t-SNE embedding.

Raises
  • ValueError – If the analysis database has already been closed, then a ValueError is raised.

  • LookupError – When the analysis for the specified category name, clustering name, and embedding name could not be found, then a LookupError is raised.

Returns

Returns the analysis for the specified name.

Return type

Analysis

get_categories()[source]

Retrieves the names of all the categories that are contained in this analysis database. The category names are umbrella terms for the attributions for which the analysis was performed. In most cases this will be the name/index/WordNet ID of the label of the dataset samples of the attributions.

Raises

ValueError – If the analysis database has already been closed, then a ValueError is raised.

Returns

Returns a list containing all the categories that are contained in this analysis database.

Return type

list of AnalysisCategory

get_clustering_names()[source]

Retrieves the names of all the clusterings that are contained in this analysis database. The clustering names are usually the name of the method with which the clustering was generated. Most likely this will be k-means with a specific value for k, e.g. ‘kmeans-10’.

Raises

ValueError – If the analysis database has already been closed, then a ValueError is raised.

Returns

Returns a list containing the names of all the clusterings that are contained in this analysis database.

Return type

list of str

get_embedding_names()[source]

Retrieves the names of all the embeddings that are contained in the analysis database. The embedding names are the names of the methods with which the embedding was generated. This will most likely be “spectral” for spectral embeddings and “tsne” for a t-SNE embedding.

Raises

ValueError – If the analysis database has already been closed, then a ValueError is raised.

Returns

Returns a list containing the names of all the embeddings that are contained in this analysis database.

Return type

list of str

has_analysis(category_name, clustering_name, embedding_name)[source]

Determines whether the analysis database contains the analysis with the specified category name (categories can, for example, be classes for which the analysis was performed).

Parameters
  • category_name (str) – The name of the category for which the analysis was performed. Each analysis was performed for a certain subset of the attributions; in most cases, this subset will be defined by the label of the dataset samples of the attributions. So the category name is the umbrella term for all the attributions that comprise the analysis, which will, in most cases, be the name/index/WordNet ID of the label.

  • clustering_name (str) – On top of the embedding a clustering is performed. This clustering name is the name of the clustering that is to be retrieved (because usually the analysis contains multiple different clusterings, which are most likely k-means with different k’s).

  • embedding_name (str) – The name of the embedding that is to be retrieved. This is the name of the method that was used to create the embedding. Most likely this will be “spectral” for spectral embeddings and “tsne” for a t-SNE embedding.

Raises

ValueError – If the analysis database has already been closed, then a ValueError is raised.

Returns

Returns True if the database contains the analysis with the specified name and False otherwise.

Return type

bool
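The documented pairing of has_analysis and get_analysis suggests a check-then-fetch pattern. The following is a minimal sketch of that pattern, not ViRelAy code: StandInAnalysisDatabase is a hypothetical stand-in that only mimics the documented return values and exceptions, get_analysis_or_none is a hypothetical helper, and the category name 'class-0' is made up for illustration.

```python
class StandInAnalysisDatabase:
    """Hypothetical stand-in that mimics the documented behavior; not ViRelAy code."""

    def __init__(self, analyses):
        # Maps (category_name, clustering_name, embedding_name) to an analysis object.
        self._analyses = dict(analyses)
        self._is_closed = False

    def close(self):
        self._is_closed = True

    def has_analysis(self, category_name, clustering_name, embedding_name):
        if self._is_closed:
            raise ValueError('The analysis database has already been closed.')
        return (category_name, clustering_name, embedding_name) in self._analyses

    def get_analysis(self, category_name, clustering_name, embedding_name):
        if not self.has_analysis(category_name, clustering_name, embedding_name):
            raise LookupError('No analysis found for the specified names.')
        return self._analyses[(category_name, clustering_name, embedding_name)]


def get_analysis_or_none(database, category_name, clustering_name, embedding_name):
    """Returns the analysis if it exists and None otherwise (hypothetical helper)."""
    if database.has_analysis(category_name, clustering_name, embedding_name):
        return database.get_analysis(category_name, clustering_name, embedding_name)
    return None


# The placeholder string stands in for an Analysis object.
database = StandInAnalysisDatabase({('class-0', 'kmeans-10', 'tsne'): 'analysis placeholder'})
found = get_analysis_or_none(database, 'class-0', 'kmeans-10', 'tsne')
missing = get_analysis_or_none(database, 'class-0', 'kmeans-10', 'spectral')
database.close()
```

Because the helper only calls has_analysis and get_analysis, it should work with any object that follows the interface described above. A ValueError from a closed database is deliberately not caught, since it signals a usage error rather than a missing analysis.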

class virelay.model.Attribution(index, data, labels, prediction)[source]

Bases: object

Represents a single attribution from an attribution database.

render_heatmap(color_map, superimpose=None)[source]

Takes the raw attribution data and converts it so that the data can be visualized as a heatmap.

Parameters
  • color_map (str) – The name of the color map that is to be used to render the heatmap.

  • superimpose (numpy.ndarray) – An optional image onto which the heatmap should be superimposed.

Raises

ValueError – If the specified color map is unknown, then a ValueError is raised.
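To illustrate the general idea of turning raw attribution values into a heatmap, here is a rough sketch in plain Python. It is not the ViRelAy implementation: the color map names 'gray' and 'blue-white-red', the symmetric normalization around zero, the alpha parameter, and the blending rule are all assumptions made for this example. Only the ValueError for an unknown color map mirrors the documentation above.

```python
# Hypothetical color maps; each maps a value t in [0, 1] to an (r, g, b) tuple in [0, 1].
COLOR_MAPS = {
    'gray': lambda t: (t, t, t),
    'blue-white-red': lambda t: (min(1.0, 2.0 * t), 1.0 - abs(2.0 * t - 1.0), min(1.0, 2.0 - 2.0 * t)),
}


def render_heatmap(attribution, color_map, superimpose=None, alpha=0.5):
    """Converts a 2D list of attribution values into a 2D list of RGB tuples."""
    if color_map not in COLOR_MAPS:
        raise ValueError(f'Unknown color map: {color_map!r}')
    to_color = COLOR_MAPS[color_map]

    # Normalize symmetrically around zero so that 0 maps to the middle of the color map.
    peak = max((abs(value) for row in attribution for value in row), default=0.0) or 1.0
    heatmap = [[to_color((value / peak + 1.0) / 2.0) for value in row] for row in attribution]
    if superimpose is None:
        return heatmap

    # Alpha-blend the heatmap onto an RGB image of the same shape.
    return [
        [
            tuple(alpha * h + (1.0 - alpha) * s for h, s in zip(heat_pixel, image_pixel))
            for heat_pixel, image_pixel in zip(heat_row, image_row)
        ]
        for heat_row, image_row in zip(heatmap, superimpose)
    ]


gray_heatmap = render_heatmap([[1.0, -1.0]], 'gray')
blended = render_heatmap([[1.0]], 'gray', superimpose=[[(0.0, 0.0, 0.0)]])
```

In this sketch, the strongest positive attribution maps to the top of the color map and the strongest negative attribution to the bottom. Whether the real method normalizes in the same way is not stated in the documentation.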

class virelay.model.AttributionDatabase(attribution_path, label_map)[source]

Bases: object

Represents a single attribution database, which contains the attributions for the dataset samples.

close()[source]

Closes the attribution database.

get_attribution(index)[source]

Gets the attribution with the specified index.

Parameters

index (int) – The index of the attribution that is to be retrieved.

Raises
  • ValueError – If the attribution database has already been closed, then a ValueError is raised.

  • LookupError – When no attribution with the specified index exists, then a LookupError is raised.

Returns

Returns the attribution with the specified index.

Return type

Attribution

has_attribution(index)[source]

Determines whether the attribution database contains the attribution with the specified index.

Parameters

index (int) – The index that is to be checked.

Raises

ValueError – If the attribution database has already been closed, then a ValueError is raised.

Returns

Returns True if the database contains the attribution with the specified index and False otherwise.

Return type

bool
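Since the database exposes a close() method and raises a ValueError once it has been closed, it can be useful to tie its lifetime to a with block. The documentation does not say whether the class supports the with statement directly, so this sketch uses contextlib.closing from the standard library, which only requires a close() method. StandInAttributionDatabase is a hypothetical stand-in that mimics the documented behavior; it is not ViRelAy code.

```python
from contextlib import closing


class StandInAttributionDatabase:
    """Hypothetical stand-in that mimics the documented behavior; not ViRelAy code."""

    def __init__(self, attributions):
        # Maps an attribution index to an attribution object.
        self._attributions = dict(attributions)
        self._is_closed = False

    def close(self):
        self._is_closed = True

    def has_attribution(self, index):
        if self._is_closed:
            raise ValueError('The attribution database has already been closed.')
        return index in self._attributions

    def get_attribution(self, index):
        if not self.has_attribution(index):
            raise LookupError(f'No attribution with the index {index} exists.')
        return self._attributions[index]


# The placeholder string stands in for an Attribution object.
database = StandInAttributionDatabase({0: 'attribution placeholder'})

# closing() calls database.close() when the block exits, even if an exception is raised.
with closing(database) as open_database:
    first_attribution = open_database.get_attribution(0)
```

After the with block, any further call such as has_attribution(0) on the stand-in raises a ValueError, which matches the documented behavior of a closed database.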

class virelay.model.Hdf5Dataset(name, path, label_map)[source]

Bases: object

Represents a dataset that is stored in an HDF5 database.

close()[source]

Closes the dataset.

get_sample(index)[source]

Gets the sample at the specified index.

Parameters

index (int | str) – The index or key of the sample that is to be retrieved.

Raises
  • ValueError – When the dataset has already been closed, then a ValueError is raised.

  • LookupError – When the specified index is out of range, a LookupError is raised.

Returns

Returns the sample at the specified index.

Return type

Sample

class virelay.model.ImageDirectoryDataset(name, path, label_index_regex, label_word_net_id_regex, input_width, input_height, down_sampling_method, up_sampling_method, label_map)[source]

Bases: object

Represents an image dataset in which the image files are stored in a directory hierarchy where the names of the directories represent the labels of the images.

close()[source]

Closes the dataset.

get_sample(index)[source]

Gets the sample at the specified index.

Parameters

index (int) – The index of the sample that is to be retrieved.

Raises
  • ValueError – If the dataset has already been closed, then a ValueError is raised.

  • LookupError – When the specified index is out of range, a LookupError is raised. If the label for the retrieved sample could not be determined from the label lookup, then a LookupError is raised.

Returns

Returns the sample at the specified index.

Return type

Sample

re_sample_image(image)[source]

Re-samples an image based on the specified up-sampling and down-sampling methods.

Parameters

image (numpy.ndarray) – The image that is to be re-sampled.

Returns

Returns the re-sampled image.

Return type

numpy.ndarray

class virelay.model.Label(index, word_net_id, name)[source]

Bases: object

Represents a label of the dataset.

class virelay.model.LabelMap(path)[source]

Bases: object

Represents a map between output neuron indices and their respective human-readable label name.

get_label_from_index(index)[source]

Retrieves the human-readable name of the label with the specified index.

Parameters

index (int) – The index of the label.

Raises

LookupError – If the specified index does not exist, then a LookupError is raised.

Returns

Returns the human-readable name of the label.

Return type

str

get_label_from_word_net_id(word_net_id)[source]

Retrieves the human-readable name of the label with the specified WordNet ID.

Parameters

word_net_id (str) – The WordNet ID of the label.

Raises

LookupError – If the specified WordNet ID does not exist, then a LookupError is raised.

Returns

Returns the human-readable name of the label.

Return type

str

get_labels(reference)[source]

Retrieves the human-readable names of the labels that match the specified reference. The reference may be an index, an n-hot encoded vector, or a WordNet ID; the method determines which it is and retrieves the correct labels.

Parameters

reference (int or str or numpy.ndarray) – The reference for which all matching labels are to be retrieved. This can be an index, an n-hot encoded vector, or a WordNet ID.

Raises

LookupError – When no label could be found for the specified reference (or, in the case of an n-hot vector, when one or more labels could not be found), then a LookupError is raised.

Returns

Returns the human-readable name or a list of all the human-readable names of the labels that matched the specified reference.

Return type

str or list of str
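The dispatch on the type of the reference can be sketched as follows. This is an illustration under stated assumptions, not the ViRelAy implementation: StandInLabelMap is a hypothetical stand-in, the WordNet IDs and label names are made up, and plain Python lists are used in place of numpy.ndarray so that the example has no dependencies.

```python
class StandInLabelMap:
    """Hypothetical stand-in that mimics the documented behavior; not ViRelAy code."""

    def __init__(self, labels):
        # labels is a list of (index, word_net_id, name) tuples.
        self._names_by_index = {index: name for index, _, name in labels}
        self._names_by_word_net_id = {word_net_id: name for _, word_net_id, name in labels}

    @staticmethod
    def _lookup(table, key):
        if key not in table:
            raise LookupError(f'No label found for {key!r}.')
        return table[key]

    def get_labels(self, reference):
        # A string is treated as a WordNet ID and yields a single name.
        if isinstance(reference, str):
            return self._lookup(self._names_by_word_net_id, reference)
        # An integer is treated as a label index and yields a single name.
        if isinstance(reference, int):
            return self._lookup(self._names_by_index, reference)
        # Anything else is treated as an n-hot vector and yields a list of names.
        return [
            self._lookup(self._names_by_index, index)
            for index, is_present in enumerate(reference)
            if is_present
        ]


label_map = StandInLabelMap([
    (0, 'n00000001', 'cat'),
    (1, 'n00000002', 'dog'),
    (2, 'n00000003', 'bird'),
])
by_index = label_map.get_labels(1)
by_word_net_id = label_map.get_labels('n00000003')
by_n_hot_vector = label_map.get_labels([1, 0, 1])
```

Note that the return type depends on the kind of reference: a single name for an index or a WordNet ID, and a list of names for an n-hot vector, which matches the documented return type of str or list of str.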

get_labels_from_n_hot_vector(n_hot_vector)[source]

Retrieves the human-readable names of the labels that are specified by the n-hot encoded vector.

Parameters

n_hot_vector (numpy.ndarray) – An n-hot encoded vector, where the indices are the label indices and the values are True/1 when the label is present and False/0 when the label is not present.

Raises

LookupError – If the length of the n-hot encoded vector is greater than the number of labels (that is, there are indices for which there are no labels), then a LookupError is raised.

Returns

Returns a list of all the labels that are specified by the n-hot encoded vector.

Return type

list of str
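As a small worked example of the n-hot encoding described above, the following sketch decodes a vector into label names and applies the documented length check. The function name labels_from_n_hot_vector and the label names are hypothetical, and a plain list is used in place of numpy.ndarray.

```python
def labels_from_n_hot_vector(n_hot_vector, label_names):
    """Returns the names whose position in the n-hot vector is set (hypothetical helper)."""
    # A vector that is longer than the label list has indices without a label.
    if len(n_hot_vector) > len(label_names):
        raise LookupError('The n-hot vector has more entries than there are labels.')
    return [name for is_present, name in zip(n_hot_vector, label_names) if is_present]


label_names = ['cat', 'dog', 'bird']

# Positions 1 and 2 are set, so the sample carries the labels at indices 1 and 2.
decoded = labels_from_n_hot_vector([0, 1, 1], label_names)

# Boolean entries work the same way as 0/1 entries.
decoded_from_booleans = labels_from_n_hot_vector([True, False, False], label_names)
```

A one-hot vector is the special case in which exactly one position is set, so the returned list then contains a single name.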

class virelay.model.Project(path)[source]

Bases: object

Represents a single project, which can be loaded from a YAML file.

close()[source]

Closes the project, its dataset, and all of its sources.

get_analysis(analysis_method, category_name, clustering_name, embedding_name)[source]

Retrieves a complete analysis.

Parameters
  • analysis_method (str) – The name of the analysis method from which the analysis is to be retrieved.

  • category_name (str) – The name of the category for which the analysis was performed. Each analysis was performed for a certain subset of the attributions; in most cases, this subset will be defined by the label of the dataset samples of the attributions. So the category name is the umbrella term for all the attributions that comprise the analysis, which will, in most cases, be the name of the label.

  • clustering_name (str) – On top of the embedding a clustering is performed. This clustering name is the name of the clustering that is to be retrieved (because usually the analysis contains multiple different clusterings, which are most likely k-means with different k’s).

  • embedding_name (str) – The name of the embedding that is to be retrieved. This is the name of the method that was used to create the embedding. Most likely this will be “spectral” for spectral embeddings and “tsne” for a t-SNE embedding.

Raises
  • ValueError – If the project has already been closed, then a ValueError is raised.

  • LookupError – When the analysis for the specified analysis method, category name, clustering name, and embedding name could not be found, then a LookupError is raised.

Returns

Returns the analysis for the specified name.

Return type

Analysis

get_analysis_categories(analysis_method)[source]

Retrieves the names of the categories that are in the analyses of the specified analysis method.

Parameters

analysis_method (str) – The name of the analysis method for which the categories are to be retrieved.

Raises
  • ValueError – If the project has already been closed, then a ValueError is raised.

  • LookupError – If the specified analysis method does not exist, then a LookupError is raised.

Returns

Returns a list of the names of the categories.

Return type

list of str

get_analysis_clustering_names(analysis_method)[source]

Retrieves the names of the clustering methods that are in the analyses of the specified analysis method.

Parameters

analysis_method (str) – The name of the analysis method for which the clusterings are to be retrieved.

Raises
  • ValueError – If the project has already been closed, then a ValueError is raised.

  • LookupError – If the specified analysis method does not exist, then a LookupError is raised.

Returns

Returns a list of the names of the clusterings.

Return type

list of str

get_analysis_embedding_names(analysis_method)[source]

Retrieves the names of the embedding methods that are in the analyses of the specified analysis method.

Parameters

analysis_method (str) – The name of the analysis method for which the embeddings are to be retrieved.

Raises
  • ValueError – If the project has already been closed, then a ValueError is raised.

  • LookupError – If the specified analysis method does not exist, then a LookupError is raised.

Returns

Returns a list of the names of the embeddings.

Return type

list of str

get_analysis_methods()[source]

Retrieves the names of all the analysis methods that are in this project.

Raises

ValueError – If the project has already been closed, then a ValueError is raised.

Returns

Returns a list of the names of all the analysis methods in this project.

Return type

list of str
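The four listing methods of a project can be combined to enumerate candidate arguments for get_analysis. The sketch below shows one way to do this with itertools.product. It is not ViRelAy code: StandInProject is a hypothetical stand-in that returns plain lists of strings as documented, iter_analysis_keys is a hypothetical helper, and the method and category names are made up.

```python
from itertools import product


class StandInProject:
    """Hypothetical stand-in that mimics the documented listing methods; not ViRelAy code."""

    def __init__(self, methods):
        # Maps an analysis method name to its category, clustering, and embedding names.
        self._methods = methods

    def _get(self, analysis_method, key):
        if analysis_method not in self._methods:
            raise LookupError(f'The analysis method {analysis_method!r} does not exist.')
        return self._methods[analysis_method][key]

    def get_analysis_methods(self):
        return list(self._methods)

    def get_analysis_categories(self, analysis_method):
        return self._get(analysis_method, 'categories')

    def get_analysis_clustering_names(self, analysis_method):
        return self._get(analysis_method, 'clusterings')

    def get_analysis_embedding_names(self, analysis_method):
        return self._get(analysis_method, 'embeddings')


def iter_analysis_keys(project):
    """Yields (analysis_method, category, clustering, embedding) tuples (hypothetical helper)."""
    for analysis_method in project.get_analysis_methods():
        categories = project.get_analysis_categories(analysis_method)
        clusterings = project.get_analysis_clustering_names(analysis_method)
        embeddings = project.get_analysis_embedding_names(analysis_method)
        for category, clustering, embedding in product(categories, clusterings, embeddings):
            yield analysis_method, category, clustering, embedding


project = StandInProject({
    'method-a': {
        'categories': ['class-0', 'class-1'],
        'clusterings': ['kmeans-10'],
        'embeddings': ['spectral', 'tsne'],
    },
})
analysis_keys = list(iter_analysis_keys(project))
```

Each yielded tuple matches the parameter order of get_analysis. The documentation does not guarantee that every combination exists, so a caller of the real method should still be prepared for a LookupError.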

get_attribution(index)[source]

Retrieves the attribution for the specified index.

Parameters

index (int) – The index of the attribution.

Raises
  • ValueError – If the project has already been closed, then a ValueError is raised.

  • LookupError – If no attribution with the specified index could be found, then a LookupError is raised.

Returns

Returns the attribution with the specified index.

Return type

Attribution

get_sample(index)[source]

Retrieves the sample from the dataset with the specified index.

Parameters

index (int) – The index of the dataset sample.

Raises
  • ValueError – If the project has already been closed, then a ValueError is raised. If the project does not contain a dataset, then a ValueError is raised.

  • LookupError – If no dataset sample with the specified index could be found, then a LookupError is raised.

Returns

Returns the dataset sample with the specified index.

Return type

Sample

class virelay.model.Sample(index, data, labels)[source]

Bases: object

Represents a sample in a dataset.

class virelay.model.Workspace[source]

Bases: object

Represents a workspace, which may consist of multiple projects.

add_project(path)[source]

Adds a new project to the workspace.

Parameters

path (str) – The path to the project YAML file.

Raises

ValueError – If the workspace is already closed, a ValueError is raised.

close()[source]

Closes the workspace and all projects within it.

get_project(name)[source]

Retrieves the project with the specified name.

Parameters

name (str) – The name of the project that is to be retrieved.

Raises
  • ValueError – If the workspace is already closed, a ValueError is raised.

  • LookupError – If the project with the specified name could not be found, then a LookupError is raised.

Returns

Returns the project with the specified name.

Return type

Project

get_project_names()[source]

Retrieves the names of all the loaded projects.

Raises

ValueError – If the workspace is already closed, a ValueError is raised.

Returns

Returns a list of the names of all loaded projects.

Return type

list