Treant aggregation
These are the API components of datreant for working with multiple Treants at once, and treating them in aggregate.
Bundle
The class datreant.Bundle functions as an ordered set of Treants. It allows common operations on Treants to be performed in aggregate, and also includes mechanisms for filtering and grouping based on Treant attributes, such as tags and categories.
Bundles can be created from all Treants found in a directory tree with datreant.discover():
datreant.discover(dirpath='.', depth=None, treantdepth=None)
Find all Treants within a given directory, recursively.
Parameters:
- dirpath (string, Tree) – Directory within which to search for Treants. May also be an existing Tree.
- depth (int) – Maximum directory depth to tolerate while traversing in search of Treants. None indicates no depth limit.
- treantdepth (int) – Maximum depth of Treants to tolerate while traversing in search of Treants. None indicates no Treant depth limit.
Returns: found – Bundle of found Treants.
Return type: Bundle
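The effect of a depth limit on a recursive search can be illustrated with the standard library alone. The following is a minimal sketch, not datreant's implementation: the marker file name, the function find_marked_dirs, and the convention that depth counts levels below dirpath are all assumptions made for illustration, and treantdepth is not modeled.

```python
import os
import tempfile

MARKER = ".treant-marker"  # hypothetical marker name, not datreant's real statefile


def find_marked_dirs(dirpath=".", depth=None):
    """Return sorted paths of directories under dirpath that contain MARKER.

    depth counts directory levels below dirpath (an assumed convention);
    None means no limit.
    """
    root = os.path.abspath(dirpath)
    found = []
    for current, dirs, files in os.walk(root):
        if current == root:
            level = 0
        else:
            level = os.path.relpath(current, root).count(os.sep) + 1
        if MARKER in files:
            found.append(current)
        if depth is not None and level >= depth:
            dirs[:] = []  # prune in place so os.walk does not descend further
    return sorted(found)


# Build a small throwaway tree to exercise the sketch.
with tempfile.TemporaryDirectory() as tmp:
    for rel in ("a", os.path.join("a", "b"), os.path.join("a", "b", "c")):
        os.makedirs(os.path.join(tmp, rel))
        open(os.path.join(tmp, rel, MARKER), "w").close()
    all_found = [os.path.relpath(p, tmp) for p in find_marked_dirs(tmp)]
    shallow = [os.path.relpath(p, tmp) for p in find_marked_dirs(tmp, depth=1)]
```

With no limit all three nested directories are reported; with depth=1 only the top-level one is.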
They can also be created directly from any number of Treants:
class datreant.Bundle(*treants)
An ordered set of Treants.
Parameters: treants (Treant, list) – Treants to be added, which may be nested lists of Treants. Treants can be given as either objects or paths to directories that contain Treant statefiles. Glob patterns are also allowed, and all found Treants will be added to the collection.
abspaths
Return a list of absolute member directory paths.
Returns: abspaths – list giving the absolute directory path of each member, in order
children(hidden=False)
Return a View of all files and directories within the member Trees.
Parameters: hidden (bool) – If True, include hidden files and directories.
Returns: A View giving the files and directories in the member Trees.
Return type: View
draw(depth=None, hidden=False)
Print an ASCII-fied visual of all member Trees.
Parameters:
get(*tags, **categories)
Filter to only Treants which match the defined tags and categories.
If no arguments are given, the full Bundle is returned. This method should be thought of as a filter: the more values specified, the fewer Treants match.
Parameters:
- *tags – Tags to match.
- **categories – Category key, value pairs to match.
Returns: All matched Treants.
Return type: Bundle
Examples
Doing a get with:
>>> b.get('this')
is equivalent to:
>>> b.tags.filter('this')
Finally, doing:
>>> b.get('this', length=5)
is equivalent to:
>>> b_n = b.tags.filter('this')
>>> b_n.categories.groupby('length')[5.0]
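The narrowing behaviour of get can be sketched with plain dictionaries standing in for Treants. This is an illustration only, assuming that every given tag and every given category pair must match; the member records and the standalone get function are hypothetical and do not use datreant.

```python
def get(members, *tags, **categories):
    """Keep members that carry every given tag and match every category pair."""
    return [
        m for m in members
        if set(tags) <= m["tags"]
        and all(k in m["categories"] and m["categories"][k] == v
                for k, v in categories.items())
    ]


# Hypothetical stand-ins for Treants: a name, a set of tags, a category dict.
members = [
    {"name": "run1", "tags": {"this", "that"}, "categories": {"length": 5}},
    {"name": "run2", "tags": {"this"}, "categories": {"length": 7}},
    {"name": "run3", "tags": {"other"}, "categories": {"length": 5}},
]

everything = [m["name"] for m in get(members)]              # no filters at all
tagged = [m["name"] for m in get(members, "this")]          # one tag
narrowed = [m["name"] for m in get(members, "this", length=5)]  # tag and category
```

Each extra argument can only shrink the result. Note also that in Python 5 == 5.0 and the two hash alike, so a dict keyed by the integer 5 can be indexed with 5.0.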
glob(pattern)
Return a View of all child Leaves and Trees of members matching given globbing pattern.
Parameters: pattern (string) – globbing pattern to match files and directories with
globfilter(pattern)
Return a Bundle of members that match by name the given globbing pattern.
Parameters: pattern (string) – globbing pattern to match member names with
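Matching names against a globbing pattern can be illustrated with the standard fnmatch module. This is a sketch under the assumption that shell-style wildcards are meant; whether datreant uses fnmatch internally is not stated here, and the member names are hypothetical.

```python
from fnmatch import fnmatchcase


def globfilter(names, pattern):
    """Return the names that match a shell-style pattern, keeping their order."""
    return [name for name in names if fnmatchcase(name, pattern)]


names = ["run_01", "run_02", "analysis", "run_final"]
numbered = globfilter(names, "run_0*")
```

fnmatchcase is used rather than fnmatch so that matching is case-sensitive on every platform.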
leafloc
Get a View giving Leaf at path relative to each Tree in collection.
Use with getitem syntax, e.g. .leafloc['some name']
Allowed inputs are:
- A single name
- A list or array of names
If the given path resolves to an existing directory for any Tree, then a ValueError will be raised.
leaves(hidden=False)
Return a View of the files within the member Trees.
Parameters: hidden (bool) – If True, include hidden files.
Returns: A View giving the files in the member Trees.
Return type: View
loc
Get a View giving Tree/Leaf at path relative to each Tree in collection.
Use with getitem syntax, e.g. .loc['some name']
Allowed inputs are:
- A single name
- A list or array of names
If no directory or file exists at the given path, then whether a Tree or Leaf is given is determined by the path semantics, i.e. a trailing separator (“/”) indicates a Tree.
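The path-semantics rule can be sketched as a small classifier. This is a minimal illustration, assuming that an existing directory or file always wins and that the trailing separator is consulted only when nothing exists at the path; resolve_kind is a hypothetical helper, not part of datreant.

```python
import os
import tempfile


def resolve_kind(base, relpath):
    """Classify base/relpath as 'tree' (directory) or 'leaf' (file)."""
    full = os.path.join(base, relpath)
    if os.path.isdir(full):
        return "tree"
    if os.path.isfile(full):
        return "leaf"
    # Nothing exists there yet: fall back on the path semantics.
    return "tree" if relpath.endswith(("/", os.sep)) else "leaf"


with tempfile.TemporaryDirectory() as base:
    os.mkdir(os.path.join(base, "data"))
    kinds = {
        "data": resolve_kind(base, "data"),            # existing directory
        "new/": resolve_kind(base, "new/"),            # missing, trailing separator
        "notes.txt": resolve_kind(base, "notes.txt"),  # missing, no separator
    }
```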
map(function, processes=1, **kwargs)
Apply a function to each member, perhaps in parallel.
A pool of processes is created for processes > 1; for example, with 40 members and processes=4, 4 processes will be created, each working on a single member at any given time. When each process completes work on a member, it grabs another, until no members remain.
kwargs are passed to the given function when applied to each member.
Arguments: function – function to apply to each member; must take only a single treant instance as input, but may take any number of keyword arguments
Keywords: processes – how many processes to use; if 1, applies function to each member in member order
Returns: results – list giving the result of the function for each member, in member order; if the function returns None for each member, then only None is returned instead of a list
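The contract described for map (keyword arguments forwarded, results in member order, a bare None when every call returns None) can be sketched as follows. This is not datreant's code: map_members and name_length are hypothetical, strings stand in for Treants, and a thread pool is swapped in for the process pool so the example stays portable.

```python
from multiprocessing.pool import ThreadPool


def map_members(members, function, processes=1, **kwargs):
    """Apply function to each member and return results in member order."""
    if processes > 1:
        # ThreadPool stands in for a process pool to keep the sketch portable.
        with ThreadPool(processes) as pool:
            pending = [pool.apply_async(function, (m,), kwargs) for m in members]
            results = [p.get() for p in pending]  # collected in submission order
    else:
        results = [function(m, **kwargs) for m in members]
    # Collapse to a bare None when no call returned anything.
    if all(r is None for r in results):
        return None
    return results


def name_length(member, offset=0):
    """Toy worker: one positional member plus an optional keyword argument."""
    return len(member) + offset


serial = map_members(["alpha", "be", "gam"], name_length)
parallel = map_members(["alpha", "be", "gam"], name_length, processes=2, offset=1)
nothing = map_members(["alpha", "be"], lambda member: None)
```

Collecting results from the pending handles in submission order is what keeps the output aligned with member order, regardless of which worker finishes first.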
names
Return a list of member names.
Returns: names – list giving the name of each member, in order
parents()
Return a View of the parent directories for each member.
Because a View functions as an ordered set, and some members of this collection may share a parent, the View of parents may contain fewer elements than this collection.
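Why the parents can be fewer than the members is easy to see with an ordered de-duplication. The sketch below uses hypothetical member paths and a plain list in place of a View.

```python
from pathlib import PurePosixPath


def unique_parents(member_paths):
    """Return parent directories in first-seen order, without duplicates."""
    parents = (str(PurePosixPath(p).parent) for p in member_paths)
    # dict.fromkeys keeps insertion order and drops repeats, like an ordered set.
    return list(dict.fromkeys(parents))


members = ["/data/runs/a", "/data/runs/b", "/data/archive/c"]
parents = unique_parents(members)
```

Three members yield two parents here, because the first two share /data/runs.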
relpaths
Return a list of relative member directory paths.
Returns: relpaths – list giving the relative directory path of each member, in order
treeloc
Get a View giving Tree at path relative to each Tree in collection.
Use with getitem syntax, e.g. .treeloc['some name']
Allowed inputs are:
- A single name
- A list or array of names
If the given path resolves to an existing file for any Tree, then a ValueError will be raised.
AggTags
The class datreant.metadata.AggTags is the interface used by Bundles to access their members’ tags.
class datreant.metadata.AggTags(collection)
Interface to aggregated tags.
add(*tags)
Add any number of tags to each Treant in collection.
Arguments: tags – Tags to add. Must be strings or lists of strings.
all
Set of tags present in every Treant in collection.
any
Set of tags present in at least one Treant in collection.
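In set terms, all corresponds to an intersection over the members' tags and any to a union. The tag values below are hypothetical and plain sets stand in for each member's tags.

```python
# One set of tags per member (hypothetical values).
member_tags = [
    {"protein", "membrane", "2019"},
    {"protein", "soluble"},
    {"protein", "membrane"},
]

tags_all = set.intersection(*member_tags)  # tags every member carries
tags_any = set.union(*member_tags)         # tags at least one member carries
```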
clear()
Remove all tags from each Treant in collection.
filter(tag)
Filter Treants matching the given tag expression from a Bundle.
Parameters: tag (str or list) – Tag or tags to filter Treants.
Returns: Bundle of Treants matching the given tag expression.
Return type: Bundle
fuzzy(tag, threshold=80, scope='all')
Get a tuple of existing tags that fuzzily match a given one.
Parameters:
- tag (str or list) – Tag or tags to get fuzzy matches for.
- threshold (int) – Lowest match score to return. Setting to 0 will return every tag, while setting to 100 will return only exact matches.
- scope ({'all', 'any'}) – Tags to use. ‘all’ will use only tags found within all Treants in collection, while ‘any’ will use tags found within at least one Treant in collection.
Returns: matches – Tuple of tags that match.
Return type: tuple
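How a 0 to 100 threshold behaves can be sketched with the standard difflib module. This is an illustration only: difflib stands in for whatever scoring datreant actually uses, so real scores may differ, and the standalone fuzzy function and tag values are hypothetical.

```python
from difflib import SequenceMatcher


def fuzzy(tag, candidates, threshold=80):
    """Return candidates whose similarity score (0-100) meets the threshold."""
    def score(a, b):
        # ratio() is 1.0 only for identical strings, so 100 means an exact match.
        return 100 * SequenceMatcher(None, a, b).ratio()
    return tuple(c for c in sorted(candidates) if score(tag, c) >= threshold)


existing = {"membrane", "membranes", "soluble"}
close = fuzzy("membrane", existing)                    # default threshold of 80
exact = fuzzy("membrane", existing, threshold=100)     # exact matches only
everything = fuzzy("membrane", existing, threshold=0)  # every tag passes
```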
remove(*tags)
Remove tags from each Treant in collection.
Any number of tags can be given as arguments, and these will be deleted.
Arguments: tags – Tags to delete.
AggCategories
The class datreant.metadata.AggCategories is the interface used by Bundles to access their members’ categories.
class datreant.metadata.AggCategories(collection)
Interface to categories.
add(categorydict=None, **categories)
Add any number of categories to each Treant in collection.
Categories are key-value pairs that serve to differentiate Treants from one another. Sometimes preferable to tags.
If a given category already exists (same key), the value given will replace the value for that category.
Keys must be strings.
Values may be ints, floats, strings, or bools. None as a value will not replace the existing value for the key, if present.
Parameters:
- categorydict (dict) – Dict of categories to add; keys used as keys, values used as values.
- categories – Categories to add. Keyword used as key, value used as value.
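The rules for adding categories can be sketched for a single member's category dict. This is a minimal illustration, not datreant's code: add_categories is hypothetical, and it assumes that a None value leaves any existing value untouched and that keyword arguments take precedence over categorydict when both supply the same key.

```python
def add_categories(existing, categorydict=None, **categories):
    """Merge new categories into a member's category dict in place."""
    incoming = dict(categorydict or {})
    incoming.update(categories)  # assumed precedence: keywords win over the dict
    for key, value in incoming.items():
        if not isinstance(key, str):
            raise TypeError("keys must be strings")
        if value is None:
            continue  # assumed behaviour: None leaves any existing value alone
        if not isinstance(value, (int, float, str, bool)):
            raise TypeError("values must be ints, floats, strings, or bools")
        existing[key] = value  # same key: the new value replaces the old one
    return existing


cats = {"length": 5, "solvent": "water"}
add_categories(cats, {"length": 7}, solvent=None, replica=True)
```

After the call, length has been replaced, solvent is unchanged, and replica has been added.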
all
Get categories common to all Treants in collection.
Returns: Categories common to all members.
Return type: dict
any
Get categories present among at least one Treant in collection.
Returns: All unique Categories among members.
Return type: dict
clear()
Remove all categories from all Treants in collection.
groupby(keys)
Return groupings of Treants based on values of Categories.
If a single category is specified by keys (keys is neither a list nor a set of category names), returns a dict of Bundles whose (new) keys are the values of the category specified by keys; the corresponding Bundles are groupings of members in the collection having the same category values (for the category specified by keys).
If keys is a list of keys, returns a dict of Bundles whose (new) keys are tuples of category values. The corresponding Bundles contain the members in the collection that have the same set of category values (for the categories specified by keys); members in each Bundle will have all of the category values specified by the tuple for that Bundle’s key.
Parameters: keys (str, list) – Valid key(s) of categories in this collection.
Returns: Bundles of members by category values.
Return type: dict
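The two return shapes of groupby (values as keys for a single category, tuples of values for a list of categories) can be sketched with plain dictionaries. This is an illustration only: the member records are hypothetical, lists of names stand in for Bundles, and leaving out members that lack a requested key is an assumption.

```python
def groupby(members, keys):
    """Group members by one category key, or by a list of keys."""
    single = isinstance(keys, str)
    wanted = [keys] if single else list(keys)
    groups = {}
    for m in members:
        cats = m["categories"]
        if not all(k in cats for k in wanted):
            continue  # assumption: members lacking a requested key are left out
        values = tuple(cats[k] for k in wanted)
        # A single key groups by the bare value; a list groups by the tuple.
        groups.setdefault(values[0] if single else values, []).append(m["name"])
    return groups


members = [
    {"name": "run1", "categories": {"length": 5, "solvent": "water"}},
    {"name": "run2", "categories": {"length": 5, "solvent": "ethanol"}},
    {"name": "run3", "categories": {"length": 7, "solvent": "water"}},
    {"name": "run4", "categories": {"solvent": "water"}},
]

by_length = groupby(members, "length")
by_both = groupby(members, ["length", "solvent"])
```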
keys(scope='all')
Get the keys present among Treants in collection.
Parameters: scope ({'all', 'any'}) – Keys to return. ‘all’ will return only keys found within all Treants in the collection, while ‘any’ will return keys found within at least one Treant in the collection.
Returns: keys – Present keys.
Return type: list
remove(*categories)
Remove categories from each Treant in collection.
Any number of categories (keys) can be given as arguments, and these keys (with their values) will be deleted.
Parameters: categories (str) – Categories to delete.
values(scope='all')
Get the category values for all Treants in collection.
Parameters: scope ({'all', 'any'}) – Keys to return. ‘all’ will return only keys found within all Treants in the collection, while ‘any’ will return keys found within at least one Treant in the collection.
Returns: values – A list of values for each Treant in the collection is returned for each key within the given scope. The value lists are given in the same order as the keys from AggCategories.keys.
Return type: list