Treant aggregation

These are the API components of datreant for working with multiple Treants at once, and treating them in aggregate.

Bundle

The class datreant.Bundle functions as an ordered set of Treants. It allows common operations on Treants to be performed in aggregate, but also includes mechanisms for filtering and grouping based on Treant attributes, such as tags and categories.

Bundles can be created from all Treants found in a directory tree with datreant.discover():

datreant.discover(dirpath='.', depth=None, treantdepth=None)

Find all Treants within given directory, recursively.

Parameters:
  • dirpath (string, Tree) – Directory within which to search for Treants. May also be an existing Tree.
  • depth (int) – Maximum directory depth to tolerate while traversing in search of Treants. None indicates no depth limit.
  • treantdepth (int) – Maximum depth of Treants to tolerate while traversing in search of Treants. None indicates no Treant depth limit.
Returns:

found – Bundle of found Treants.

Return type:

Bundle

They can also be created directly from any number of Treants:

class datreant.Bundle(*treants)

An ordered set of Treants.

Parameters:treants (Treant, list) – Treants to be added, which may be nested lists of Treants. Treants can be given as either objects or paths to directories that contain Treant statefiles. Glob patterns are also allowed, and all found Treants will be added to the collection.
abspaths

Return a list of absolute member directory paths.

Returns:
abspaths

list giving the absolute directory path of each member, in order

children(hidden=False)

Return a View of all files and directories within the member Trees.

Parameters:hidden (bool) – If True, include hidden files and directories.
Returns:A View giving the files and directories in the member Trees.
Return type:View
draw(depth=None, hidden=False)

Print an ASCII-fied visual of all member Trees.

Parameters:
  • depth (int) – Maximum directory depth to display. None indicates no limit.
  • hidden (bool) – If False, do not show hidden files; hidden directories are still shown if they contain non-hidden files or directories.
get(*tags, **categories)

Filter to only Treants which match the defined tags and categories.

If no arguments are given, the full Bundle is returned. This method acts as a filter: each additional tag or category value narrows the result to only those Treants that match all of them.

Parameters:
  • *tags – Tags to match.
  • **categories – Category key, value pairs to match.
Returns:

All matched Treants.

Return type:

Bundle

Examples

Doing a get with:

>>> b.get('this')  

is equivalent to:

>>> b.tags.filter('this')  

Finally, doing:

>>> b.get('this', length=5)  

is equivalent to:

>>> b_n = b.tags.filter('this')  
>>> b_n.categories.groupby('length')[5.0]  
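
The filtering semantics of get can be modelled without datreant. The sketch below is an illustration only, not datreant's implementation: get_matching and members are hypothetical names, and each member is a plain dict rather than a Treant. It assumes a member matches only if it carries every requested tag and every requested category value.

```python
def get_matching(members, *tags, **categories):
    """Return the members that carry all given tags and category values."""
    matched = []
    for member in members:
        has_tags = all(tag in member["tags"] for tag in tags)
        has_cats = all(
            key in member["categories"] and member["categories"][key] == value
            for key, value in categories.items()
        )
        if has_tags and has_cats:
            matched.append(member)
    return matched


# Hypothetical stand-ins for Treants: plain dicts with tags and categories.
members = [
    {"name": "a", "tags": {"this"}, "categories": {"length": 5}},
    {"name": "b", "tags": {"this"}, "categories": {"length": 7}},
    {"name": "c", "tags": {"that"}, "categories": {"length": 5}},
]

only_this = get_matching(members, "this")             # members a and b
this_and_5 = get_matching(members, "this", length=5)  # member a only
everything = get_matching(members)                    # no arguments: all members
```
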
glob(pattern)

Return a View of all child Leaves and Trees of members matching given globbing pattern.

Parameters:pattern (string) – globbing pattern to match files and directories with
globfilter(pattern)

Return a Bundle of members that match by name the given globbing pattern.

Parameters:pattern (string) – globbing pattern to match member names with
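
Name-based glob filtering can be approximated with the standard-library fnmatch module. This is a sketch of the matching behavior on plain name strings, under the assumption that shell-style patterns are meant; filter_names and names are hypothetical, and datreant's own matching may differ in detail.

```python
import fnmatch


def filter_names(names, pattern):
    """Keep names matching a shell-style pattern, preserving order."""
    return [name for name in names if fnmatch.fnmatchcase(name, pattern)]


# Hypothetical member names.
names = ["run_01", "run_02", "analysis", "run_final"]
matched = filter_names(names, "run_0*")
```
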
leafloc

Get a View giving Leaf at path relative to each Tree in collection.

Use with getitem syntax, e.g. .leafloc['some name']

Allowed inputs are:
  • A single name
  • A list or array of names

If the given path resolves to an existing directory for any Tree, then a ValueError will be raised.

leaves(hidden=False)

Return a View of the files within the member Trees.

Parameters:hidden (bool) – If True, include hidden files.
Returns:A View giving the files in the member Trees.
Return type:View
loc

Get a View giving Tree/Leaf at path relative to each Tree in collection.

Use with getitem syntax, e.g. .loc['some name']

Allowed inputs are:
  • A single name
  • A list or array of names

If directory/file does not exist at the given path, then whether a Tree or Leaf is given is determined by the path semantics, i.e. a trailing separator (“/”).
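
The path rule described above can be sketched with the standard library alone. The helper kind_at is hypothetical and only classifies a path. It does not build a Tree or Leaf, and it is not datreant's code.

```python
import os
import tempfile


def kind_at(path):
    """Classify a path as 'tree' (directory) or 'leaf' (file)."""
    if os.path.isdir(path):
        return "tree"
    if os.path.isfile(path):
        return "leaf"
    # Nothing exists yet: fall back to the trailing-separator convention.
    return "tree" if path.endswith("/") else "leaf"


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "data"))
    existing_dir = kind_at(os.path.join(root, "data"))       # exists as a directory
    new_tree = kind_at(os.path.join(root, "results") + "/")  # trailing "/": tree
    new_leaf = kind_at(os.path.join(root, "notes.txt"))      # no trailing "/": leaf
```
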

map(function, processes=1, **kwargs)

Apply a function to each member, perhaps in parallel.

A pool of processes is created for processes > 1; for example, with 40 members and ‘processes=4’, 4 processes will be created, each working on a single member at any given time. When each process completes work on a member, it grabs another, until no members remain.

Any kwargs are passed to the given function each time it is applied to a member.

Arguments:
function

function to apply to each member; must take only a single treant instance as input, but may take any number of keyword arguments

Keywords:
processes

how many processes to use; if 1, applies function to each member in member order

Returns:
results

list giving the result of the function for each member, in member order; if the function returns None for each member, then only None is returned instead of a list
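
The behavior described for map can be approximated with the standard library. The sketch below is a stand-in under stated assumptions, not datreant's implementation: map_members, name_length, and touch are hypothetical names, and members are plain strings rather than Treants.

```python
from functools import partial
from multiprocessing import Pool


def map_members(function, members, processes=1, **kwargs):
    """Apply function to each member, optionally with a process pool."""
    call = partial(function, **kwargs)
    if processes > 1:
        # chunksize=1 hands each worker one member at a time.
        with Pool(processes=processes) as pool:
            results = pool.map(call, members, chunksize=1)
    else:
        # Sequential path: members are processed in member order.
        results = [call(member) for member in members]
    # Collapse to a bare None when every call returned None.
    if all(result is None for result in results):
        return None
    return results


def name_length(member, offset=0):
    return len(member) + offset


def touch(member):
    return None


lengths = map_members(name_length, ["a", "bb", "ccc"], offset=1)
nothing = map_members(touch, ["a", "bb"])
```

In this sketch the processes > 1 branch needs a function defined at module level so that it can be pickled, and on platforms that spawn worker processes the calls should sit under an if __name__ == "__main__": guard.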

names

Return a list of member names.

Returns:
names

list giving the name of each member, in order

parents()

Return a View of the parent directories for each member.

Because a View functions as an ordered set, and some members of this collection may share a parent, the View of parents may contain fewer elements than this collection.
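
The ordered-set effect on parents can be illustrated with plain path strings. This is a sketch only: unique_parents and member_paths are hypothetical, and the result is a list rather than a View.

```python
import posixpath


def unique_parents(paths):
    """Parent directory of each path, deduplicated in first-seen order."""
    # dict.fromkeys keeps insertion order and drops repeats, like an ordered set.
    return list(dict.fromkeys(posixpath.dirname(path) for path in paths))


# Three hypothetical members, two of which share a parent directory.
member_paths = ["sims/run_01", "sims/run_02", "analysis/run_03"]
parents = unique_parents(member_paths)
```
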

relpaths

Return a list of relative member directory paths.

Returns:
relpaths

list giving the relative directory path of each member, in order

treeloc

Get a View giving Tree at path relative to each Tree in collection.

Use with getitem syntax, e.g. .treeloc['some name']

Allowed inputs are:
  • A single name
  • A list or array of names

If the given path resolves to an existing file for any Tree, then a ValueError will be raised.

trees(hidden=False)

Return a View of the directories within the member Trees.

Parameters:hidden (bool) – If True, include hidden directories.
Returns:A View giving the directories in the member Trees.
Return type:View

AggTags

The class datreant.metadata.AggTags is the interface used by Bundles to access their members’ tags.

class datreant.metadata.AggTags(collection)

Interface to aggregated tags.

add(*tags)

Add any number of tags to each Treant in collection.

Arguments:
tags

Tags to add. Must be strings or lists of strings.

all

Set of tags present among all Treants in collection.

any

Set of tags present among at least one Treant in collection.
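
The difference between all and any is that of set intersection versus set union. A minimal sketch on plain Python sets, with hypothetical names tags_all, tags_any, and member_tags:

```python
def tags_all(tag_sets):
    """Tags that every member has (set intersection)."""
    return set.intersection(*tag_sets) if tag_sets else set()


def tags_any(tag_sets):
    """Tags that at least one member has (set union)."""
    return set().union(*tag_sets)


# One hypothetical tag set per member.
member_tags = [{"protein", "md"}, {"protein", "qm"}, {"protein"}]
common = tags_all(member_tags)
combined = tags_any(member_tags)
```
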

clear()

Remove all tags from each Treant in collection.

filter(tag)

Filter Treants matching the given tag expression from a Bundle.

Parameters:tag (str or list) – Tag or tags to filter Treants.
Returns:Bundle of Treants matching the given tag expression.
Return type:Bundle
fuzzy(tag, threshold=80, scope='all')

Get a tuple of existing tags that fuzzily match a given one.

Parameters:
  • tag (str or list) – Tag or tags to get fuzzy matches for.
  • threshold (int) – Lowest match score to return. Setting to 0 will return every tag, while setting to 100 will return only exact matches.
  • scope ({'all', 'any'}) – Tags to use. ‘all’ will use only tags found within all Treants in collection, while ‘any’ will use tags found within at least one Treant in collection.
Returns:

matches – Tuple of tags that match.

Return type:

tuple
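
The role of threshold can be illustrated with a stand-in scorer. The sketch below uses difflib from the standard library, scaled to 0-100. This is an assumption made for illustration, since the scorer datreant uses is not specified here and its scores will likely differ. The names fuzzy_matches and existing are hypothetical.

```python
from difflib import SequenceMatcher


def fuzzy_matches(tag, candidates, threshold=80):
    """Return candidates whose similarity score to tag meets the threshold."""
    matches = []
    for candidate in candidates:
        # Stand-in score on a 0-100 scale; not datreant's scorer.
        score = 100 * SequenceMatcher(None, tag, candidate).ratio()
        if score >= threshold:
            matches.append(candidate)
    return tuple(matches)


# Hypothetical pool of existing tags.
existing = ["protein", "proteins", "lipid"]
exact_only = fuzzy_matches("protein", existing, threshold=100)
every_tag = fuzzy_matches("protein", existing, threshold=0)
```
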

remove(*tags)

Remove tags from each Treant in collection.

Any number of tags can be given as arguments, and these will be deleted.

Arguments:
tags

Tags to delete.

AggCategories

The class datreant.metadata.AggCategories is the interface used by Bundles to access their members’ categories.

class datreant.metadata.AggCategories(collection)

Interface to aggregated categories.

add(categorydict=None, **categories)

Add any number of categories to each Treant in collection.

Categories are key-value pairs that serve to differentiate Treants from one another. They are sometimes preferable to tags.

If a given category already exists (same key), the value given will replace the value for that category.

Keys must be strings.

Values may be ints, floats, strings, or bools. None as a value will not change the existing value for the key, if present.

Parameters:
  • categorydict (dict) – Dict of categories to add; keys used as keys, values used as values.
  • categories – Categories to add. Keyword used as key, value used as value.
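
The rules for keys and values can be modelled on a plain dict. This sketch is not datreant's code: add_categories and cats are hypothetical names, and it assumes that a value of None leaves any existing value for that key untouched.

```python
def add_categories(existing, categorydict=None, **categories):
    """Merge new categories into existing (a plain dict) in place."""
    incoming = dict(categorydict or {})
    incoming.update(categories)
    for key, value in incoming.items():
        if not isinstance(key, str):
            raise TypeError("Keys must be strings.")
        if value is None:
            continue  # assumption: None leaves any existing value untouched
        if not isinstance(value, (int, float, str, bool)):
            raise TypeError("Values must be ints, floats, strings, or bools.")
        existing[key] = value  # an existing key has its value replaced
    return existing


cats = {"length": 5, "solvent": "water"}
add_categories(cats, {"length": 10}, solvent=None, converged=True)
```
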
all

Get categories common to all Treants in collection.

Returns:Categories common to all members.
Return type:dict
any

Get categories present among at least one Treant in collection.

Returns:All unique Categories among members.
Return type:dict
clear()

Remove all categories from all Treants in collection.

groupby(keys)

Return groupings of Treants based on values of Categories.

If a single category is specified by keys (keys is neither a list nor a set of category names), returns a dict of Bundles whose (new) keys are the values of the category specified by keys; the corresponding Bundles are groupings of members in the collection having the same category values (for the category specified by keys).

If keys is a list of keys, returns a dict of Bundles whose (new) keys are tuples of category values. The corresponding Bundles contain the members in the collection that have the same set of category values (for the categories specified by keys); members in each Bundle will have all of the category values specified by the tuple for that Bundle’s key.

Parameters:keys (str, list) – Valid key(s) of categories in this collection.
Returns:Bundles of members by category values.
Return type:dict
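
The two grouping modes can be sketched on plain dicts. This is an illustration, not datreant's implementation: group_by and members are hypothetical, groups hold member names instead of Bundles, and members lacking any requested key are assumed to be left out of the result.

```python
def group_by(members, keys):
    """Group member dicts by the value(s) of one key or a list of keys."""
    single = isinstance(keys, str)
    wanted = [keys] if single else list(keys)
    groups = {}
    for member in members:
        cats = member["categories"]
        if not all(key in cats for key in wanted):
            continue  # assumption: members lacking a key are left out
        values = tuple(cats[key] for key in wanted)
        # Single key: group by the bare value; list of keys: group by a tuple.
        group_key = values[0] if single else values
        groups.setdefault(group_key, []).append(member["name"])
    return groups


members = [
    {"name": "a", "categories": {"length": 5, "solvent": "water"}},
    {"name": "b", "categories": {"length": 5, "solvent": "octanol"}},
    {"name": "c", "categories": {"length": 7, "solvent": "water"}},
    {"name": "d", "categories": {"solvent": "water"}},
]
by_length = group_by(members, "length")
by_both = group_by(members, ["length", "solvent"])
```
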
keys(scope='all')

Get the keys present among Treants in collection.

Parameters:scope ({'all', 'any'}) – Keys to return. ‘all’ will return only keys found within all Treants in the collection, while ‘any’ will return keys found within at least one Treant in the collection.
Returns:keys – Present keys.
Return type:list
remove(*categories)

Remove categories from each Treant in collection.

Any number of categories (keys) can be given as arguments, and these keys (with their values) will be deleted.

Parameters:categories (str) – Categories to delete.
values(scope='all')

Get the category values for all Treants in collection.

Parameters:scope ({'all', 'any'}) – Keys to return values for. ‘all’ will use only keys found within all Treants in the collection, while ‘any’ will use keys found within at least one Treant in the collection.
Returns:values – A list of values for each Treant in the collection is returned for each key within the given scope. The value lists are given in the same order as the keys from AggCategories.keys.
Return type:list
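
The relationship between keys and values can be sketched on plain dicts. This is a model under stated assumptions, not datreant's code: agg_keys, agg_values, and member_cats are hypothetical names, keys are sorted only to give the sketch a deterministic order, and a member lacking a key is assumed to contribute None.

```python
def agg_keys(member_cats, scope="all"):
    """Category keys shared by all members ('all') or held by any ('any')."""
    key_sets = [set(cats) for cats in member_cats]
    if not key_sets:
        return []
    if scope == "all":
        keys = set.intersection(*key_sets)
    elif scope == "any":
        keys = set.union(*key_sets)
    else:
        raise ValueError("scope must be 'all' or 'any'")
    return sorted(keys)  # assumption: sorted for a deterministic order


def agg_values(member_cats, scope="all"):
    """One list of per-member values for each key, in agg_keys order."""
    # assumption: a member lacking a key contributes None
    return [[cats.get(key) for cats in member_cats]
            for key in agg_keys(member_cats, scope)]


# One hypothetical category dict per member.
member_cats = [{"length": 5, "solvent": "water"}, {"length": 7}]
shared_keys = agg_keys(member_cats)
shared_values = agg_values(member_cats)
any_values = agg_values(member_cats, scope="any")
```
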