virelay.model
Contains the data model abstraction.
Classes
Analysis | Represents an analysis of multiple attributions.
AnalysisCategory | Represents a single category in an analysis.
AnalysisDatabase | Represents a single analysis database, which contains the analysis of attributions.
Attribution | Represents a single attribution from an attribution database.
AttributionDatabase | Represents a single attribution database, which contains the attributions for the dataset samples.
Hdf5Dataset | Represents a dataset that is stored in an HDF5 database.
ImageDirectoryDataset | Represents an image dataset, where the image files are in a directory hierarchy where the names of the directories represent the labels of the images.
Label | Represents a label of the dataset.
LabelMap | Represents a map between output neuron indices and their respective human-readable label name.
Project | Represents a single project, which can be loaded from a YAML file.
Sample | Represents a sample in a dataset.
Workspace | Represents a workspace, which may consist of multiple projects.
- class virelay.model.Analysis(category_name, human_readable_category_name, clustering_name, clustering, embedding_name, embedding, attribution_indices, eigenvalues)[source]
Bases:
object
Represents an analysis of multiple attributions.
- class virelay.model.AnalysisCategory(name, human_readable_name)[source]
Bases:
object
Represents a single category in an analysis. One category can contain many analyses for different attributions. The category name is usually an umbrella term for all the attributions contained in the analysis. This is most likely a class name.
- class virelay.model.AnalysisDatabase(analysis_path, label_map)[source]
Bases:
object
Represents a single analysis database, which contains the analysis of attributions.
- get_analysis(category_name, clustering_name, embedding_name)[source]
Gets the analysis for the specified name (names can be, for example, classes for which the analysis was performed).
- Parameters
category_name (str) – The name of the category for which the analysis was performed. Each analysis was performed for a certain subset of the attributions; in most cases, this subset is defined by the label of the dataset samples of the attributions. The category name is therefore the umbrella term for all the attributions that comprise the analysis, which will, in most cases, be the name of the label.
clustering_name (str) – The name of the clustering that is to be retrieved. A clustering is performed on top of the embedding, and an analysis usually contains multiple different clusterings, most likely k-means with different values of k.
embedding_name (str) – The name of the embedding that is to be retrieved. This is the name of the method that was used to create the embedding. Most likely this will be “spectral” for spectral embeddings and “tsne” for a t-SNE embedding.
- Raises
ValueError – If the analysis database has already been closed, then a ValueError is raised.
LookupError – When the analysis for the specified category name, clustering name, and embedding name could not be found, then a LookupError is raised.
- Returns
Returns the analysis for the specified name.
- Return type
Analysis
- get_categories()[source]
Retrieves the names of all the categories that are contained in this analysis database. The category names are umbrella terms for the attributions for which the analysis was performed. In most cases this will be the name/index/WordNet ID of the label of the dataset samples of the attributions.
- Raises
ValueError – If the analysis database has already been closed, then a ValueError is raised.
- Returns
Returns a list containing all the categories that are contained in this analysis database.
- Return type
list of AnalysisCategory
- get_clustering_names()[source]
Retrieves the names of all the clusterings that are contained in this analysis database. The clustering names are usually the name of the method with which the clustering was generated. Most likely this will be k-means with a specific value for k, e.g. ‘kmeans-10’.
- Raises
ValueError – If the analysis database has already been closed, then a ValueError is raised.
- Returns
Returns a list containing the names of all the clusterings that are contained in this analysis database.
- Return type
list of str
- get_embedding_names()[source]
Retrieves the names of all the embeddings that are contained in the analysis database. The embedding names are the names of the methods with which the embeddings were generated. This will most likely be “spectral” for spectral embeddings and “tsne” for a t-SNE embedding.
- Raises
ValueError – If the analysis database has already been closed, then a ValueError is raised.
- Returns
Returns a list containing the names of all the embeddings that are contained in this analysis database.
- Return type
list of str
- has_analysis(category_name, clustering_name, embedding_name)[source]
Determines whether the analysis database contains the analysis with the specified category name (categories can, for example, be classes for which the analysis was performed).
- Parameters
category_name (str) – The name of the category for which the analysis was performed. Each analysis was performed for a certain subset of the attributions; in most cases, this subset is defined by the label of the dataset samples of the attributions. The category name is therefore the umbrella term for all the attributions that comprise the analysis, which will, in most cases, be the name/index/WordNet ID of the label.
clustering_name (str) – The name of the clustering that is to be checked. A clustering is performed on top of the embedding, and an analysis usually contains multiple different clusterings, most likely k-means with different values of k.
embedding_name (str) – The name of the embedding that is to be checked. This is the name of the method that was used to create the embedding. Most likely this will be “spectral” for spectral embeddings and “tsne” for a t-SNE embedding.
- Raises
ValueError – If the analysis database has already been closed, then a ValueError is raised.
- Returns
Returns True if the database contains the analysis with the specified name and False otherwise.
- Return type
bool
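The lookup contract described for has_analysis and get_analysis can be sketched with a small in-memory stand-in. This is not ViRelAy's implementation: the class InMemoryAnalysisDatabase, its dictionary storage, and the close() method name are assumptions made for illustration. Only the method names, the parameters, and the exception types are taken from the descriptions above.

```python
class InMemoryAnalysisDatabase:
    """Hypothetical stand-in that mimics the documented lookup contract."""

    def __init__(self, analyses):
        # Maps (category_name, clustering_name, embedding_name) to an arbitrary payload
        self._analyses = dict(analyses)
        self._is_closed = False

    def close(self):
        # Assumed method name; the documentation only states that a database can be closed
        self._is_closed = True

    def _ensure_open(self):
        if self._is_closed:
            raise ValueError('The analysis database has already been closed.')

    def has_analysis(self, category_name, clustering_name, embedding_name):
        self._ensure_open()
        return (category_name, clustering_name, embedding_name) in self._analyses

    def get_analysis(self, category_name, clustering_name, embedding_name):
        self._ensure_open()
        key = (category_name, clustering_name, embedding_name)
        if key not in self._analyses:
            raise LookupError(f'No analysis found for {key}.')
        return self._analyses[key]


database = InMemoryAnalysisDatabase({('tench', 'kmeans-10', 'tsne'): 'analysis payload'})
found = database.has_analysis('tench', 'kmeans-10', 'tsne')
missing = database.has_analysis('tench', 'kmeans-10', 'spectral')
payload = database.get_analysis('tench', 'kmeans-10', 'tsne')
```

Calling has_analysis before get_analysis avoids handling a LookupError for combinations that were never computed.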
- class virelay.model.Attribution(index, data, labels, prediction)[source]
Bases:
object
Represents a single attribution from an attribution database.
- render_heatmap(color_map, superimpose=None)[source]
Takes the raw attribution data and converts it so that the data can be visualized as a heatmap.
- Parameters
color_map (str) – The name of the color map that is to be used to render the heatmap.
superimpose (numpy.ndarray) – An optional image onto which the heatmap should be superimposed.
- Raises
ValueError – If the specified color map is unknown, then a ValueError is raised.
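One common way to turn raw attribution values into a heatmap is to normalize them symmetrically around zero and pass them through a color map. The sketch below illustrates that idea with plain Python lists instead of numpy.ndarray; it is not the library's implementation. The function name render_heatmap_sketch, the COLOR_MAPS table, the single 'blue-white-red' color map, and the equal-weight blending used for superimpose are all assumptions.

```python
def _blue_white_red(value):
    """Maps a value in [-1, 1] to RGB: blue for -1, white for 0, red for +1."""
    if value >= 0:
        fade = round(255 * (1 - value))
        return (255, fade, fade)
    fade = round(255 * (1 + value))
    return (fade, fade, 255)


# Hypothetical registry of color maps, keyed by name
COLOR_MAPS = {'blue-white-red': _blue_white_red}


def render_heatmap_sketch(attribution, color_map, superimpose=None):
    """Converts a 2D list of attribution values into a 2D list of RGB triples."""
    if color_map not in COLOR_MAPS:
        raise ValueError(f'Unknown color map: {color_map}')
    to_rgb = COLOR_MAPS[color_map]

    # Symmetric normalization keeps zero attribution at the center of the color map
    peak = max(abs(value) for row in attribution for value in row) or 1.0
    heatmap = [[to_rgb(value / peak) for value in row] for row in attribution]

    if superimpose is not None:
        # Blend each heatmap pixel with the image pixel using equal weights
        heatmap = [
            [
                tuple((heat + image) // 2 for heat, image in zip(heat_pixel, image_pixel))
                for heat_pixel, image_pixel in zip(heat_row, image_row)
            ]
            for heat_row, image_row in zip(heatmap, superimpose)
        ]
    return heatmap


heatmap = render_heatmap_sketch([[2.0, 0.0], [0.0, -2.0]], 'blue-white-red')
```

Normalizing by the largest absolute value, rather than by the minimum and maximum separately, keeps positive and negative attributions visually comparable.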
- class virelay.model.AttributionDatabase(attribution_path, label_map)[source]
Bases:
object
Represents a single attribution database, which contains the attributions for the dataset samples.
- get_attribution(index)[source]
Gets the attribution with the specified index.
- Parameters
index (int) – The index of the attribution that is to be retrieved.
- Raises
ValueError – If the attribution database has already been closed, then a ValueError is raised.
LookupError – When no attribution with the specified index exists, then a LookupError is raised.
- Returns
Returns the attribution with the specified index.
- Return type
Attribution
- has_attribution(index)[source]
Determines whether the attribution database contains the attribution with the specified index.
- Parameters
index (int) – The index that is to be checked.
- Raises
ValueError – If the attribution database has already been closed, then a ValueError is raised.
- Returns
Returns True if the database contains the attribution with the specified index and False otherwise.
- Return type
bool
- class virelay.model.Hdf5Dataset(name, path, label_map)[source]
Bases:
object
Represents a dataset that is stored in an HDF5 database.
- get_sample(index)[source]
Gets the sample at the specified index.
- Parameters
index (int | str) – The index or key of the sample that is to be retrieved.
- Raises
ValueError – When the dataset has already been closed, then a ValueError is raised.
LookupError – When the specified index is out of range, a LookupError is raised.
- Returns
Returns the sample at the specified index.
- Return type
Sample
- class virelay.model.ImageDirectoryDataset(name, path, label_index_regex, label_word_net_id_regex, input_width, input_height, down_sampling_method, up_sampling_method, label_map)[source]
Bases:
object
Represents an image dataset, where the image files are in a directory hierarchy where the names of the directories represent the labels of the images.
- get_sample(index)[source]
Gets the sample at the specified index.
- Parameters
index (int) – The index of the sample that is to be retrieved.
- Raises
ValueError – If the dataset has already been closed, then a ValueError is raised.
LookupError – When the specified index is out of range, a LookupError is raised. If the label for the retrieved sample could not be determined from the label lookup, then a LookupError is raised as well.
- Returns
Returns the sample at the specified index.
- Return type
Sample
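To illustrate how labels can be derived from the names of the directories in such a hierarchy, the sketch below walks a dataset directory and applies a regular expression to each directory name. The function name collect_samples, the convention that the first capture group of the regular expression holds the WordNet ID, and the example directory names are assumptions; how ViRelAy actually interprets label_index_regex and label_word_net_id_regex may differ.

```python
import re
import tempfile
from pathlib import Path


def collect_samples(dataset_path, label_word_net_id_regex):
    """Returns (relative file path, WordNet ID) pairs for all files in the label directories."""
    pattern = re.compile(label_word_net_id_regex)
    samples = []
    for directory in sorted(Path(dataset_path).iterdir()):
        if not directory.is_dir():
            continue
        # The label is read from the directory name, not from the file itself
        match = pattern.match(directory.name)
        if match is None:
            raise LookupError(f'Cannot determine the label of directory {directory.name}.')
        for image_path in sorted(directory.iterdir()):
            relative_path = image_path.relative_to(dataset_path).as_posix()
            samples.append((relative_path, match.group(1)))
    return samples


# Build a tiny directory hierarchy with empty placeholder files for demonstration
with tempfile.TemporaryDirectory() as root:
    for directory_name, file_name in [('n01440764_tench', 'a.jpg'), ('n01443537_goldfish', 'b.jpg')]:
        (Path(root) / directory_name).mkdir()
        (Path(root) / directory_name / file_name).touch()
    samples = collect_samples(root, r'^(n[0-9]{8})')
```

Sorting the directories and files gives each sample a stable position, so that an integer index refers to the same file across runs.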
- class virelay.model.Label(index, word_net_id, name)[source]
Bases:
object
Represents a label of the dataset.
- class virelay.model.LabelMap(path)[source]
Bases:
object
Represents a map between output neuron indices and their respective human-readable label name.
- get_label_from_index(index)[source]
Retrieves the human-readable name of the label with the specified index.
- Parameters
index (int) – The index of the label.
- Raises
LookupError – If the specified index does not exist, then a LookupError is raised.
- Returns
Returns the human-readable name of the label.
- Return type
str
- get_label_from_word_net_id(word_net_id)[source]
Retrieves the human-readable name of the label with the specified WordNet ID.
- Parameters
word_net_id (str) – The WordNet ID of the label.
- Raises
LookupError – If the specified WordNet ID does not exist, then a LookupError is raised.
- Returns
Returns the human-readable name of the label.
- Return type
str
- get_labels(reference)[source]
Retrieves the human-readable names of the labels that match the specified reference. The reference may be an index, an n-hot encoded vector, or a WordNet ID; the method determines which one it is and retrieves the corresponding labels.
- Parameters
reference (int or str or numpy.ndarray) – The reference for which all matching labels are to be retrieved. This can be an index, an n-hot encoded vector, or a WordNet ID.
- Raises
LookupError – If no label could be found for the specified reference (or, in the case of an n-hot vector, if one or more labels could not be found), then a LookupError is raised.
- Returns
Returns the human-readable name or a list of all the human-readable names of the labels that matched the specified reference.
- Return type
str or list of str
- get_labels_from_n_hot_vector(n_hot_vector)[source]
Retrieves the human-readable names of the labels that are specified by the n-hot encoded vector.
- Parameters
n_hot_vector (numpy.ndarray) – An n-hot encoded vector, where the indices are the label indices and the values are True/1 when the label is present and False/0 when the label is not present.
- Raises
LookupError – If the length of the n-hot encoded vector is greater than the number of labels (that is, there are indices for which there are no labels), then a LookupError is raised.
- Returns
Returns a list of all the labels that are specified by the n-hot encoded vector.
- Return type
list of str
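The lookups described for LabelMap can be sketched as follows. This is a hypothetical stand-in, not the library's code: the class name LabelMapSketch, its constructor taking (index, WordNet ID, name) triples instead of a file path, and the type-based dispatch in get_labels are assumptions. Plain lists are used in place of numpy.ndarray to keep the example free of dependencies.

```python
class LabelMapSketch:
    """Hypothetical stand-in built from (index, word_net_id, name) triples."""

    def __init__(self, labels):
        self._by_index = {index: name for index, _, name in labels}
        self._by_word_net_id = {word_net_id: name for _, word_net_id, name in labels}

    def get_label_from_index(self, index):
        if index not in self._by_index:
            raise LookupError(f'No label with the index {index} exists.')
        return self._by_index[index]

    def get_label_from_word_net_id(self, word_net_id):
        if word_net_id not in self._by_word_net_id:
            raise LookupError(f'No label with the WordNet ID {word_net_id} exists.')
        return self._by_word_net_id[word_net_id]

    def get_labels_from_n_hot_vector(self, n_hot_vector):
        # A vector longer than the label list refers to indices without labels
        if len(n_hot_vector) > len(self._by_index):
            raise LookupError('The n-hot vector has more entries than there are labels.')
        return [
            self.get_label_from_index(index)
            for index, is_present in enumerate(n_hot_vector)
            if is_present
        ]

    def get_labels(self, reference):
        # Assumed dispatch: int means index, str means WordNet ID, anything else is a vector
        if isinstance(reference, int):
            return self.get_label_from_index(reference)
        if isinstance(reference, str):
            return self.get_label_from_word_net_id(reference)
        return self.get_labels_from_n_hot_vector(reference)


label_map = LabelMapSketch([
    (0, 'n01440764', 'tench'),
    (1, 'n01443537', 'goldfish'),
    (2, 'n01484850', 'great white shark'),
])
by_index = label_map.get_labels(1)
by_word_net_id = label_map.get_labels('n01440764')
by_vector = label_map.get_labels([1, 0, 1])
```

An index or a WordNet ID yields a single name, whereas an n-hot vector yields a list, which matches the documented return type of str or list of str.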
- class virelay.model.Project(path)[source]
Bases:
object
Represents a single project, which can be loaded from a YAML file.
- get_analysis(analysis_method, category_name, clustering_name, embedding_name)[source]
Retrieves a complete analysis.
- Parameters
analysis_method (str) – The name of the analysis method from which the analysis is to be retrieved.
category_name (str) – The name of the category for which the analysis was performed. Each analysis was performed for a certain subset of the attributions; in most cases, this subset is defined by the label of the dataset samples of the attributions. The category name is therefore the umbrella term for all the attributions that comprise the analysis, which will, in most cases, be the name of the label.
clustering_name (str) – The name of the clustering that is to be retrieved. A clustering is performed on top of the embedding, and an analysis usually contains multiple different clusterings, most likely k-means with different values of k.
embedding_name (str) – The name of the embedding that is to be retrieved. This is the name of the method that was used to create the embedding. Most likely this will be “spectral” for spectral embeddings and “tsne” for a t-SNE embedding.
- Raises
ValueError – If the project has already been closed, then a ValueError is raised.
LookupError – When the analysis for the specified analysis method, category name, clustering name, and embedding name could not be found, then a LookupError is raised.
- Returns
Returns the analysis for the specified name.
- Return type
Analysis
- get_analysis_categories(analysis_method)[source]
Retrieves the names of the categories that are in the analyses of the specified analysis method.
- Parameters
analysis_method (str) – The name of the analysis method for which the categories are to be retrieved.
- Raises
ValueError – If the project has already been closed, then a ValueError is raised.
LookupError – If the specified analysis method does not exist, then a LookupError is raised.
- Returns
Returns a list of the names of the categories.
- Return type
list of str
- get_analysis_clustering_names(analysis_method)[source]
Retrieves the names of the clustering methods that are in the analyses of the specified analysis method.
- Parameters
analysis_method (str) – The name of the analysis method for which the clusterings are to be retrieved.
- Raises
ValueError – If the project has already been closed, then a ValueError is raised.
LookupError – If the specified analysis method does not exist, then a LookupError is raised.
- Returns
Returns a list of the names of the clusterings.
- Return type
list of str
- get_analysis_embedding_names(analysis_method)[source]
Retrieves the names of the embedding methods that are in the analyses of the specified analysis method.
- Parameters
analysis_method (str) – The name of the analysis method for which the embeddings are to be retrieved.
- Raises
ValueError – If the project has already been closed, then a ValueError is raised.
LookupError – If the specified analysis method does not exist, then a LookupError is raised.
- Returns
Returns a list of the names of the embeddings.
- Return type
list of str
- get_analysis_methods()[source]
Retrieves the names of all the analysis methods that are in this project.
- Raises
ValueError – If the project has already been closed, then a ValueError is raised.
- Returns
Returns a list of the names of all the analysis methods in this project.
- Return type
list of str
- get_attribution(index)[source]
Retrieves the attribution for the specified index.
- Parameters
index (int) – The index of the attribution.
- Raises
ValueError – If the project has already been closed, then a ValueError is raised.
LookupError – If no attribution with the specified index could be found, then a LookupError is raised.
- Returns
Returns the attribution with the specified index.
- Return type
Attribution
- get_sample(index)[source]
Retrieves the sample from the dataset with the specified index.
- Parameters
index (int) – The index of the dataset sample.
- Raises
ValueError – If the project has already been closed, then a ValueError is raised. If the project does not contain a dataset, then a ValueError is raised.
LookupError – If no dataset sample with the specified index could be found, then a LookupError is raised.
- Returns
Returns the dataset sample with the specified index.
- Return type
Sample
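The enumeration methods of Project can be combined to list every analysis that could be requested via get_analysis. The helper list_available_analyses below calls only the methods documented above; the helper itself and the StubProject class, which stands in for a project loaded from a YAML file, are hypothetical, as are the example names the stub returns.

```python
from itertools import product


def list_available_analyses(project):
    """Yields every (method, category, clustering, embedding) combination of a project."""
    for analysis_method in project.get_analysis_methods():
        categories = project.get_analysis_categories(analysis_method)
        clusterings = project.get_analysis_clustering_names(analysis_method)
        embeddings = project.get_analysis_embedding_names(analysis_method)
        for category, clustering, embedding in product(categories, clusterings, embeddings):
            yield analysis_method, category, clustering, embedding


class StubProject:
    """Hypothetical stand-in so that the sketch runs without a project file."""

    def get_analysis_methods(self):
        return ['spectral_analysis']

    def get_analysis_categories(self, analysis_method):
        return ['tench', 'goldfish']

    def get_analysis_clustering_names(self, analysis_method):
        return ['kmeans-2', 'kmeans-10']

    def get_analysis_embedding_names(self, analysis_method):
        return ['spectral', 'tsne']


combinations = list(list_available_analyses(StubProject()))
```

Whether every such combination actually exists in a real project is not guaranteed, so a caller of get_analysis should still be prepared for a LookupError.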
- class virelay.model.Sample(index, data, labels)[source]
Bases:
object
Represents a sample in a dataset.
- class virelay.model.Workspace[source]
Bases:
object
Represents a workspace, which may consist of multiple projects.
- add_project(path)[source]
Adds a new project to the workspace.
- Parameters
path (str) – The path to the project YAML file.
- Raises
ValueError – If the workspace is already closed, a ValueError is raised.
- get_project(name)[source]
Retrieves the project with the specified name.
- Parameters
name (str) – The name of the project that is to be retrieved.
- Raises
ValueError – If the workspace is already closed, a ValueError is raised.
LookupError – If the project with the specified name could not be found, then a LookupError is raised.
- Returns
Returns the project with the specified name.
- Return type
Project