An implementation of Apache Ant globs.
The main entry points for this module are:
Bases: formic.formic.Matcher
A Matcher for matching the constant passed in the constructor.
This is used to more efficiently match path and file elements that do not have a wild-card, eg __init__.py
Returns True if the argument matches the constant.
Bases: formic.formic.Matcher
A Matcher that matches simple file/directory wildcards as per DOS or Unix.
FNMatcher internally uses fnmatch.fnmatch() to implement Matcher.match()
Returns True if the pattern matches the string
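As a quick illustration of the wildcard semantics that fnmatch.fnmatch() provides, the snippet below calls the standard library function directly. It does not use formic, and the file names are made up for the example.

```python
from fnmatch import fnmatch

# "*" matches any run of characters; "?" matches exactly one character.
names = ["setup.py", "test_a.py", "test_10.py", "README.txt"]

python_files = [name for name in names if fnmatch(name, "*.py")]
single_char_tests = [name for name in names if fnmatch(name, "test_?.py")]
```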
Bases: object
An implementation of the Ant FileSet class.
Arguments to the constructor:
Usage
First, construct a FileSet:
from formic import FileSet
fileset = FileSet(directory="/some/where/interesting",
                  include="*.py",
                  exclude=["**/*test*/**", "test*"]
                  )
There are three APIs for retrieving matches:
FileSet is itself an iterator and returns absolute file names:
for filename in fileset:
    print(filename)
For more control, use fileset.qualified_files(). The following prints filenames relative to the directory:
for filename in fileset.qualified_files(absolute=False):
    print(filename)
For absolute control, use the fileset.files() method and handle the returned tuple yourself:
import sys
from os import path

prefix = fileset.get_directory()
for directory, file_name in fileset.files():
    sys.stdout.write(prefix)
    # Files in the starting directory have an empty relative directory
    if directory:
        sys.stdout.write(path.sep)
        sys.stdout.write(directory)
    sys.stdout.write(path.sep)
    sys.stdout.write(file_name)
    sys.stdout.write("\n")
Implementation notes:
FileSet is lazy: the files in the FileSet are resolved only when the iterator is looped over. Setting up a FileSet is therefore very fast, and any computational cost is paid only when results are obtained.
You can iterate over the same FileSet instance as many times as you want. Because the results are computed as you iterate over the object, each separate iteration can return different results, eg if the file system has changed.
include and exclude arguments to the constructor can be given in several ways:
In addition to Apache Ant’s default excludes, FileSet excludes:
You can modify the DEFAULT_EXCLUDES class member (it is a list of Pattern instances). Doing so will modify the behaviour of all instances of FileSet using default excludes.
You can provide an alternate function to os.walk() that, for example, heavily truncates the files and directories being searched or returns files and directories that don’t even exist on the file system. This can be useful for testing or even for passing the results of one FileSet as the search path of a second. See formic.walk_from_list():
files = ["CVS/error.py", "silly/silly1.txt", "1/2/3.py", "silly/silly3.txt", "1/2/4.py", "silly/silly3.txt"]
fileset = FileSet(include="*.py", walk=walk_from_list(files))
for directory, file_name in fileset.files():
    print(directory, file_name)
This lists 1/2/3.py and 1/2/4.py no matter what the contents of the
current directory are. CVS/error.py is not listed because of the default
excludes.
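The laziness and the replaceable walk function described in the notes above can be illustrated together. This is a minimal sketch and not formic's implementation: LazyFiles and fake_walk are hypothetical names, and the stand-in matches file names with fnmatch only. Nothing is scanned at construction time, and each iteration re-runs the walk, so a change between two loops shows up in the second.

```python
import fnmatch
import os


class LazyFiles:
    """Hypothetical stand-in for a lazily evaluated file set."""

    def __init__(self, pattern, directory=".", walk=os.walk):
        # Only the configuration is stored; nothing is scanned yet.
        self.pattern = pattern
        self.directory = directory
        self.walk = walk

    def __iter__(self):
        # The walk runs afresh on every iteration.
        for dirpath, _dirnames, filenames in self.walk(self.directory):
            for name in fnmatch.filter(filenames, self.pattern):
                yield os.path.join(dirpath, name)


# An in-memory "file system" in the shape os.walk() yields.
listing = [("root", [], ["a.py", "b.txt"])]


def fake_walk(directory):
    # Mimics os.walk() without touching the disk.
    return iter(listing)


files = LazyFiles("*.py", directory="root", walk=fake_walk)
first = list(files)
listing[0][2].append("c.py")  # the "file system" changes between iterations
second = list(files)
```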
Default excludes shared by all instances. The member is a list of Pattern instances. You may modify this member at run time to modify the behaviour of all instances.
A generator function for iterating over the individual files of the FileSet.
The generator yields a tuple of (rel_dir_name, file_name):
Returns the directory in which the FileSet will be run.
If the directory was set with None in the constructor, get_directory() will return the current working directory.
The returned result is normalized so it never contains a trailing path separator
An alternative generator that yields files rather than directory/file tuples.
If absolute is false, paths relative to the starting directory are returned, otherwise files are fully qualified.
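The relationship between the (rel_dir_name, file_name) tuples and the qualified paths can be sketched as follows. The helper name qualify is hypothetical, and the sketch assumes that files in the starting directory carry an empty relative directory; it is not formic's own code.

```python
import os


def qualify(start_dir, pairs, absolute=True):
    """Yield one path per (rel_dir_name, file_name) pair."""
    for rel_dir, file_name in pairs:
        relative = os.path.join(rel_dir, file_name) if rel_dir else file_name
        yield os.path.join(start_dir, relative) if absolute else relative


pairs = [("", "setup.py"), ("pkg", "core.py")]
relative_paths = list(qualify("/project", pairs, absolute=False))
absolute_paths = list(qualify("/project", pairs))
```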
Bases: object
FileSetState is an object encapsulating the FileSet in a particular directory, caching inheritable Pattern matches.
This is an internal implementation class and not meant for reuse or to be accessed directly
Implementation notes:
As the FileSet traverses the directories using, by default, os.walk(), it builds two graphs of FileSetState instances mirroring the graph of directories - one graph of FileSetState instances is for the include globs and the other graph of FileSetState instances for the exclude.
FileSetState embodies logic to decide whether to prune whole directories from the search, either by detecting the include patterns cannot match any file within, or by detecting that an exclude matches all files in this directory and sub-directories.
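Pruning whole directories during a walk can be shown with the standard library alone. The sketch below is not FileSetState: it uses a fixed set of directory names to skip (an assumption made for brevity) and relies on the documented os.walk() behaviour that editing dirnames in place stops the walk from descending into the removed entries.

```python
import os
import tempfile


def walk_pruned(top, skip_dirs):
    """Yield file paths under top, never descending into skip_dirs."""
    for dirpath, dirnames, filenames in os.walk(top):
        # Editing dirnames in place tells os.walk() not to visit those subtrees.
        dirnames[:] = sorted(d for d in dirnames if d not in skip_dirs)
        for name in sorted(filenames):
            yield os.path.relpath(os.path.join(dirpath, name), top)


with tempfile.TemporaryDirectory() as top:
    # Build a tiny tree with a CVS directory at two levels.
    for rel in ("src/app.py", "CVS/error.py", "src/CVS/old.py"):
        path = os.path.join(top, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()
    found = list(walk_pruned(top, skip_dirs={"CVS"}))
```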
The constructor has the following arguments:
During the construction of the instance, the instance will evaluate the directory patterns in PatternSet self.unmatched and, for each Pattern, perform one of the following actions:
1. If a pattern matches, it will be moved into one of the ‘matched’ PatternSet instances:
- self.matched_inherit: the directory pattern matches all subdirectories as well, eg /test/**
- self.matched_and_subdir: the directory matches this directory and may match subdirectories as well, eg /test/**/more/**
- self.matched_no_subdir: the directory matches this directory, but cannot match any subdirectory, eg /test/*. This pattern will thus not be evaluated in any subdirectory.
Given a set of files in this directory, returns all the files that match the Pattern instances which match this directory.
Returns True if there is a pattern that:
This acts as a terminator for FileSetState instances in the excludes graph.
Returns True if there are no possible matches for any subdirectories of this FileSetState.
When this FileSetState is used for an ‘include’, a return of True means we can exclude all subdirectories.
Bases: exceptions.Exception
Formic errors, such as misconfigured arguments and internal exceptions
Bases: object
An enumeration of different match/non-match types to optimize the search algorithm.
There are two special considerations in match results that derive from the fact that Ant globs can be ‘bound’ to the start of the path being evaluated (eg bound start: /Documents/**).
The various match possibilities are bitfields using the members starting BIT_.
Bases: object
An abstract class that holds some pattern to be matched; matcher.match(string) returns a boolean indicating whether the string matches the pattern.
The Matcher.create() method is a Factory that creates instances of various subclasses.
Factory for Matcher instances; returns a Matcher suitable for matching the supplied pattern
Matcher is an abstract class - this will raise a FormicError
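One plausible shape for such a factory is sketched below. The class and function names are hypothetical, and the rule for choosing a subclass (any of the characters *, ? or [ means a wildcard) is an assumption, not necessarily the test that Matcher.create() applies.

```python
from fnmatch import fnmatch


class ConstantMatcher:
    """Matches one exact string; no wildcard handling needed."""

    def __init__(self, pattern):
        self.pattern = pattern

    def match(self, string):
        return string == self.pattern


class WildcardMatcher:
    """Delegates to fnmatch for patterns containing wildcards."""

    def __init__(self, pattern):
        self.pattern = pattern

    def match(self, string):
        return fnmatch(string, self.pattern)


def create_matcher(pattern):
    """Pick the cheapest matcher that can handle the pattern."""
    if any(char in pattern for char in "*?["):
        return WildcardMatcher(pattern)
    return ConstantMatcher(pattern)


constant = create_matcher("__init__.py")
wildcard = create_matcher("*.py")
```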
Bases: object
Represents a single Ant Glob.
The Pattern object compiles the pattern into several components:
Pattern also normalises the glob, removing redundant path elements (eg **/**/test/* resolves to **/test/*) and normalises the case of the path elements (resolving difficulties with case insensitive file systems)
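The removal of redundant path elements can be sketched as below. This covers only the collapsing of repeated ** elements, not the case normalisation, and collapse_double_stars is a hypothetical helper rather than part of formic.

```python
def collapse_double_stars(glob):
    """Remove path elements that repeat an immediately preceding '**'."""
    collapsed = []
    for element in glob.split("/"):
        if element == "**" and collapsed and collapsed[-1] == "**":
            continue  # '**/**' matches exactly what '**' matches
        collapsed.append(element)
    return "/".join(collapsed)


normalized = collapse_double_stars("**/**/test/*")
```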
Returns True if the Pattern matches all files (in a matched directory).
The file pattern at the end of the glob was / or /*
Returns a MatchType describing how the directory, expressed as a list of path elements, matches the Pattern.
If self.bound_start is True, the first Section must match from the first directory element.
If self.bound_end is True, the last Section must match the last contiguous elements of path_elements.
Moves all matching files from the set unmatched to the set matched.
Both matched and unmatched are sets of strings, the strings being unqualified file names
Bases: object
PatternSet contains a number of implementation optimizations and is an integral part of various optimizations in FileSet.
This class is not an implementation of Apache Ant PatternSet
Returns True if there is any Pattern in the PatternSet that matches all files (see Pattern.all_files())
Note that this method is implemented using lazy evaluation so direct access to the member _all_files is very likely to result in errors
Adds a Pattern to the PatternSet
Returns True if the PatternSet is empty
Extend a PatternSet with additional patterns
patterns can either be:
An iteration generator that allows the loop to modify the PatternSet during the loop
Apply the include and exclude filters to those files in unmatched, moving those that are included, but not excluded, into the matched set.
Both matched and unmatched are sets of unqualified file names.
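The set-moving behaviour can be sketched with plain globs. The function name move_matches is hypothetical, and the include and exclude arguments here are simple lists of fnmatch-style strings rather than PatternSet instances.

```python
from fnmatch import fnmatch


def move_matches(include, exclude, matched, unmatched):
    """Move names matching include but not exclude from unmatched to matched."""
    for name in list(unmatched):  # copy: the set is modified inside the loop
        included = any(fnmatch(name, glob) for glob in include)
        excluded = any(fnmatch(name, glob) for glob in exclude)
        if included and not excluded:
            unmatched.remove(name)
            matched.add(name)


matched = set()
unmatched = {"core.py", "test_core.py", "notes.txt"}
move_matches(["*.py"], ["test*"], matched, unmatched)
```

Mutating the two sets in place, instead of returning a new one, lets a caller pass the leftover unmatched set on to further patterns.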
Remove a Pattern from the PatternSet
Bases: object
A minimal object that holds fragments of a Pattern path.
Each Section holds a list of pattern fragments matching some contiguous portion of a full path, separated by /**/ from other Section instances.
For example, the Pattern /top/second/**/sub/**/end/* is stored as a list of three Section objects:
A generator that searches over path_elements (starting from the index start_at), yielding for each match.
Each value yielded is the index into path_elements to the first element after each match. In other words, the returned index has already consumed the matching path elements of this Section.
Matches work by finding a contiguous group of path elements that match the list of Matcher objects in this Section as they are naturally paired.
This method includes an implementation optimization that simplifies the search for Section instances containing a single path element. This produces significant performance improvements.
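The search described above can be sketched as a sliding window over the path elements. This is an illustration only: it pairs elements with fnmatch-style fragment strings instead of Matcher objects, omits the single-element optimization, and the signature is an assumption.

```python
from fnmatch import fnmatchcase


def match_iter(fragments, path_elements, start_at=0):
    """Yield the index just past each contiguous run matching fragments."""
    width = len(fragments)
    for start in range(start_at, len(path_elements) - width + 1):
        window = path_elements[start:start + width]
        # Fragments and path elements are compared pairwise, in order.
        if all(fnmatchcase(elem, frag) for elem, frag in zip(window, fragments)):
            yield start + width


path = ["top", "sub", "a", "sub", "b"]
ends = list(match_iter(["sub", "?"], path))
later_ends = list(match_iter(["sub", "?"], path, start_at=2))
```

Yielding the index after the match means the next Section can resume its own search from that position.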
Returns the default excludes as a list of Patterns.
This will be the initial value of FileSet.DEFAULT_EXCLUDES. It is defined in the Ant documentation. Formic adds **/__pycache__/**, with the resulting list being:
- **/__pycache__/**/*
- **/*~
- **/#*#
- **/.#*
- **/%*%
- **/._*
- **/CVS
- **/CVS/**/*
- **/.cvsignore
- **/SCCS
- **/SCCS/**/*
- **/vssver.scc
- **/.svn
- **/.svn/**/*
- **/.DS_Store
- **/.git
- **/.git/**/*
- **/.gitattributes
- **/.gitignore
- **/.gitmodules
- **/.hg
- **/.hg/**/*
- **/.hgignore
- **/.hgsub
- **/.hgsubstate
- **/.hgtags
- **/.bzr
- **/.bzr/**/*
- **/.bzrignore
Breaks a path to a directory into a (drive, list-of-folders) tuple
Parameters: directory
Returns: a tuple consisting of the drive (if any) and an ordered list of folder names
Returns the version of formic.
This method retrieves the version from VERSION.txt, and it should be exactly the same as the version retrieved from the package manager
Returns True if the directory is root (eg / on UNIX or c:\ on Windows)
Converts a list of filenames into a directory tree structure.
Reverts a tuple from get_path_components into a path.
Returns: a path comprising the drive and list of folder names. The path terminates with os.path.sep only if it is a root directory
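A round trip between a path and a (drive, list-of-folders) tuple can be sketched with the standard library. The function names are hypothetical, the sketch assumes absolute paths, and it takes the path module as a parameter so that both UNIX and Windows behaviour can be shown on any platform.

```python
import ntpath
import posixpath


def split_components(directory, pathmod=posixpath):
    """Split an absolute directory path into (drive, list of folder names)."""
    drive, rest = pathmod.splitdrive(pathmod.normpath(directory))
    return drive, [part for part in rest.split(pathmod.sep) if part]


def join_components(drive, folders, pathmod=posixpath):
    """Rebuild an absolute path; only a root ends with a separator."""
    return drive + pathmod.sep + pathmod.sep.join(folders)


unix = split_components("/usr/local/lib/")
windows = split_components("C:\\Users\\me", pathmod=ntpath)
unix_again = join_components(*unix)
windows_root = join_components("C:", [], pathmod=ntpath)
```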
Walks a tree returned by list_to_tree returning a list of 3-tuples as if from os.walk().
A function that mimics os.walk() by simulating a directory with the list of files passed as an argument.
Parameters: files – A list of file paths
Returns: A function that mimics os.walk() walking a directory containing only the files listed in the argument
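The idea of simulating os.walk() from a list of files can be sketched as follows. This is not formic's walk_from_list(): the name walk_from_paths is hypothetical, the sketch assumes relative paths separated by /, and unlike os.walk() it does not honour in-place pruning of the directory list.

```python
import posixpath


def walk_from_paths(files):
    """Return a function yielding os.walk()-style tuples for the given files."""
    tree = {"": (set(), set())}  # directory -> (subdirectory names, file names)
    for file_path in files:
        directory, name = posixpath.split(file_path)
        parent = ""
        for part in (directory.split("/") if directory else []):
            tree[parent][0].add(part)  # the parent lists this child directory
            parent = posixpath.join(parent, part)
            tree.setdefault(parent, (set(), set()))
        tree[parent][1].add(name)

    def walk(top="", **_ignored):
        # Sorted order visits every parent before its children.
        for directory in sorted(tree):
            subdirs, names = tree[directory]
            yield posixpath.join(top, directory), sorted(subdirs), sorted(names)

    return walk


walk = walk_from_paths(["1/2/3.py", "silly/silly1.txt", "1/2/4.py"])
entries = list(walk())
```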
The command-line glue-code for formic. Call main() with the command-line arguments.
Full usage of the command is:
usage: formic [-i [INCLUDE [INCLUDE ...]]] [-e [EXCLUDE [EXCLUDE ...]]]
              [--no-default-excludes] [--no-symlinks] [-r] [-h] [--usage]
              [--version]
              [directory]
Search the file system using Apache Ant globs
Directory:
  directory             The directory from which to start the search (defaults
                        to current working directory)
Globs:
  -i [INCLUDE [INCLUDE ...]], --include [INCLUDE [INCLUDE ...]]
                        One or more Ant-like globs in include in the search.If
                        not specified, then all files are implied
  -e [EXCLUDE [EXCLUDE ...]], --exclude [EXCLUDE [EXCLUDE ...]]
                        One or more Ant-like globs in include in the search
  --no-default-excludes
                        Do not include the default excludes
  --no-symlinks         Do not include symlinks
Output:
  -r, --relative        Print file paths relative to directory.
Information:
  -h, --help            Prints this help and exits
  --usage               Prints additional help on globs and exits
  --version             Prints the version of formic and exits
Creates and returns the command line parser, an argparse.ArgumentParser instance.
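A parser accepting a subset of the options in the usage text could look roughly like the sketch below. It is reconstructed from the usage message only, so the destination names and defaults are assumptions, and --usage and --version are left out for brevity.

```python
import argparse


def build_parser():
    """Build a parser mirroring part of the documented formic usage."""
    parser = argparse.ArgumentParser(
        prog="formic",
        description="Search the file system using Apache Ant globs",
    )
    parser.add_argument("directory", nargs="?", default=None)
    parser.add_argument("-i", "--include", nargs="*", default=[])
    parser.add_argument("-e", "--exclude", nargs="*", default=[])
    parser.add_argument("--no-default-excludes", action="store_true")
    parser.add_argument("--no-symlinks", action="store_true")
    parser.add_argument("-r", "--relative", action="store_true")
    return parser


args = build_parser().parse_args(["-i", "**/*.py", "--no-default-excludes"])
```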
Entry point for command line; calls main() and then sys.exit() with the return value.
Command line entry point; arguments must match those defined in create_parser(); returns 0 for success, else 1.
Example:
command.main("-i", "**/*.py", "--no-default-excludes")
Runs formic printing out all .py files in the current working directory and its children to sys.stdout.
If kw is None, main() will use sys.argv.