Package utils

Library of utility code for LIGO Light Weight XML applications.


Version: git id 8cbd1b7187ce3ed9a825d6ed11cc432f3cfde9a5

Date: 2017-12-05 15:29:36 +0000

Author: Kipp Cannon <kipp.cannon@ligo.org>

Classes
  MD5File
  RewindableInputFile
  SignalsTrap
  tildefile
Functions
 
load_filename(filename, verbose=False, **kwargs)
Parse the contents of the file identified by filename, and return the contents as a LIGO Light Weight document tree.
 
load_fileobj(fileobj, gz=None, xmldoc=None, contenthandler=None)
Parse the contents of the file object fileobj, and return the contents as a LIGO Light Weight document tree.
 
load_url(url, verbose=False, **kwargs)
Parse the contents of the file at the given URL and return the contents as a LIGO Light Weight document tree.
 
local_path_from_url(url)
For URLs that point to locations in the local filesystem, extract and return the filesystem path of the object to which they point.
 
sort_files_by_size(filenames, verbose=False, reverse=False)
Return a list of the filenames sorted in order from smallest file to largest file (or largest to smallest if reverse is set to True).
 
write_filename(xmldoc, filename, verbose=False, gz=False, with_mv=True, trap_signals=(15, 20), **kwargs)
Writes the LIGO Light Weight document tree rooted at xmldoc to the file name filename.
 
write_fileobj(xmldoc, fileobj, gz=False, compresslevel=3, **kwargs)
Writes the LIGO Light Weight document tree rooted at xmldoc to the given file object.
 
write_url(xmldoc, url, **kwargs)
Writes the LIGO Light Weight document tree rooted at xmldoc to the URL name url.
Variables
  __package__ = 'glue.ligolw.utils'
Function Details

load_filename(filename, verbose=False, **kwargs)

Parse the contents of the file identified by filename, and return the contents as a LIGO Light Weight document tree. stdin is parsed if filename is None. Helpful verbosity messages are printed to stderr if verbose is True. All other keyword arguments are passed to load_fileobj(); see that function for more information. In particular, note that a content handler must be specified.

Example:

>>> from glue.ligolw import ligolw
>>> xmldoc = load_filename("demo.xml", contenthandler = ligolw.LIGOLWContentHandler, verbose = True)

load_fileobj(fileobj, gz=None, xmldoc=None, contenthandler=None)

Parse the contents of the file object fileobj, and return the contents as a LIGO Light Weight document tree. The file object does not need to be seekable.

If the gz parameter is None (the default), gzip-compressed data is detected and decompressed automatically; otherwise decompression can be forced on or off by setting gz to True or False, respectively.
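
The automatic detection can be pictured as a peek at the first two bytes of the stream, which are 0x1f 0x8b for gzip data. The following is a minimal stand-alone Python 3 sketch using only the standard library; it is not the code used by load_fileobj(), and the name open_maybe_gzip is hypothetical.

```python
import gzip
import io

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def open_maybe_gzip(fileobj, gz=None):
    """Return a readable binary stream, decompressing if requested or detected.

    Hypothetical helper for illustration; not part of glue.ligolw.utils.
    """
    # BufferedReader.peek() looks at upcoming bytes without consuming them,
    # so the underlying stream does not need to be seekable.
    buffered = io.BufferedReader(fileobj)
    if gz is None:
        # gz=None means "auto-detect"; True/False force the decision.
        gz = buffered.peek(2)[:2] == GZIP_MAGIC
    return gzip.GzipFile(fileobj=buffered, mode="rb") if gz else buffered
```

Passing gz=True or gz=False skips the peek entirely, which mirrors the "forced on or off" behaviour described above.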

If the optional xmldoc argument is provided and is not None, the parsed XML tree is appended to that document; otherwise a new document is created. The return value is a tuple: the first element is the XML document and the second is a string containing the MD5 digest, in hex digits, of the bytestream that was parsed.

Example:

>>> from glue.ligolw import ligolw
>>> import StringIO
>>> f = StringIO.StringIO('<?xml version="1.0" encoding="utf-8" ?><!DOCTYPE LIGO_LW SYSTEM "http://ldas-sw.ligo.caltech.edu/doc/ligolwAPI/html/ligolw_dtd.txt"><LIGO_LW><Table Name="demo:table"><Column Name="name" Type="lstring"/><Column Name="value" Type="real8"/><Stream Name="demo:table" Type="Local" Delimiter=",">"mass",0.5,"velocity",34</Stream></Table></LIGO_LW>')
>>> xmldoc, digest = load_fileobj(f, contenthandler = ligolw.LIGOLWContentHandler)
>>> digest
'6bdcc4726b892aad913531684024ed8e'
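
One way to obtain a digest of exactly the bytes a parser consumed is to wrap the input in an object that updates a hash on every read. The sketch below illustrates that general technique in stand-alone Python 3; it is an assumption about the approach, not the package's MD5File class, and DigestingReader is a hypothetical name.

```python
import hashlib
import io


class DigestingReader(object):
    """Wrap a binary file object and hash every byte that is read from it.

    Hypothetical illustration; not part of glue.ligolw.utils.
    """

    def __init__(self, fileobj):
        self.fileobj = fileobj
        self.md5 = hashlib.md5()

    def read(self, size=-1):
        data = self.fileobj.read(size)
        # Only bytes actually handed to the caller contribute to the digest.
        self.md5.update(data)
        return data

    def hexdigest(self):
        return self.md5.hexdigest()
```

Because the hash is updated inside read(), the digest is available as soon as the consumer reaches end-of-file, with no second pass over the data.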

The contenthandler argument specifies the SAX content handler to use when parsing the document. The contenthandler is a required argument. See the glue.ligolw package documentation for a typical parsing scenario involving a custom content handler. See glue.ligolw.ligolw.PartialLIGOLWContentHandler and glue.ligolw.ligolw.FilteringLIGOLWContentHandler for examples of custom content handlers used to load subsets of documents into memory.
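
For readers unfamiliar with SAX, the sketch below shows what a content handler is, using only Python 3's standard xml.sax module. It merely records the Name attribute of each Table element and does not build a document tree, so it is not a substitute for the glue.ligolw handlers; TableNameHandler and table_names are hypothetical names.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TableNameHandler(ContentHandler):
    """Collect the Name attribute of every <Table> element seen by the parser."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        # The parser calls this once for each opening tag it encounters.
        if name == "Table":
            self.names.append(attrs.get("Name"))


def table_names(xml_bytes):
    """Parse xml_bytes and return the list of table names, in document order."""
    handler = TableNameHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.names
```

The parser drives the handler through callbacks, which is why a handler can choose to ignore most of a document and keep only the parts of interest.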

load_url(url, verbose=False, **kwargs)

Parse the contents of the file at the given URL and return the contents as a LIGO Light Weight document tree. Any source from which Python's urllib library can read data is acceptable. stdin is parsed if url is None. Helpful verbosity messages are printed to stderr if verbose is True. All other keyword arguments are passed to load_fileobj(); see that function for more information. In particular, note that a content handler must be specified.

Example:

>>> from os import getcwd
>>> from glue.ligolw import ligolw
>>> xmldoc = load_url("file://localhost/%s/demo.xml" % getcwd(), contenthandler = ligolw.LIGOLWContentHandler, verbose = True)

local_path_from_url(url)

For URLs that point to locations in the local filesystem, extract and return the filesystem path of the object to which they point. As a special case pass-through, if the URL is None, the return value is None. Raises ValueError if the URL is not None and does not point to a local file.

Example:

>>> print(local_path_from_url(None))
None
>>> local_path_from_url("file:///home/me/somefile.xml.gz")
'/home/me/somefile.xml.gz'
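
The documented behaviour can be approximated with the standard library's URL parser. The following Python 3 sketch is an illustration under stated assumptions, not the library's implementation: it assumes a URL is "local" when its scheme is empty or "file" and its host is empty or "localhost", and the function name is hypothetical.

```python
from urllib.parse import urlparse


def local_path_from_url_sketch(url):
    """Return the filesystem path named by a local URL, or None for None.

    Hypothetical re-implementation for illustration only.
    """
    if url is None:
        # Special-case pass-through, as described above.
        return None
    parts = urlparse(url)
    # Assumption: these schemes and hosts are what counts as "local".
    if parts.scheme.lower() not in ("", "file") or parts.netloc.lower() not in ("", "localhost"):
        raise ValueError("%r does not point to a local file" % url)
    return parts.path
```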

sort_files_by_size(filenames, verbose=False, reverse=False)

Return a list of the filenames sorted in order from smallest file to largest file (or largest to smallest if reverse is set to True). If a filename in the list is None (used by many glue.ligolw based codes to indicate stdin), its size is treated as 0. The filenames may be any iterable, including a generator expression.
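
The ordering rule can be sketched in a few lines of stand-alone Python 3. This is an illustration of the described behaviour, assuming file sizes are taken from os.stat(); it omits the verbose option, and the function name is hypothetical.

```python
import os


def sort_files_by_size_sketch(filenames, reverse=False):
    """Return the filenames as a list ordered by on-disk size.

    Hypothetical illustration; a filename of None (stdin) counts as size 0.
    """

    def size(name):
        return 0 if name is None else os.stat(name).st_size

    # sorted() accepts any iterable, so generator expressions work too.
    return sorted(filenames, key=size, reverse=reverse)
```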

write_filename(xmldoc, filename, verbose=False, gz=False, with_mv=True, trap_signals=(15, 20), **kwargs)

Writes the LIGO Light Weight document tree rooted at xmldoc to the file name filename. If filename is None the file is written to stdout, otherwise it is written to the named file. Friendly verbosity messages are printed while writing the file if verbose is True. The output data is gzip compressed on the fly if gz is True. If with_mv is True and filename is not None, the document is first written to a temporary file whose name is filename with "~" appended, and that file is moved to the requested name once the write has completed successfully.
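
The write-then-move pattern means a reader never sees a half-written file under the final name. The Python 3 sketch below illustrates the pattern only; it assumes the move can be done with os.replace(), which may differ from what write_filename() actually uses, and the function name is hypothetical.

```python
import os


def write_with_mv_sketch(data, filename):
    """Write bytes to filename + "~", then move the result into place.

    Hypothetical illustration of the with_mv behaviour described above.
    """
    tmpname = filename + "~"
    with open(tmpname, "wb") as f:
        f.write(data)
    # Only reached if the write succeeded; a failure above leaves any
    # existing file at `filename` untouched.
    os.replace(tmpname, filename)
```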

Internally, write_fileobj() is used to perform the write. All additional keyword arguments are passed to write_fileobj().

This function traps the signals in the trap_signals iterable during the write (see SignalsTrap for the default signals) by temporarily installing its own signal handlers in place of the current handlers. This is done to prevent Condor eviction during the write. When the write concludes, the original signal handlers are restored and any signals that were trapped are resent to the current process in the order in which they were received. Python's signal.signal() can only be called from the main thread, so trap_signals must be set to None or an empty sequence if this function is used from any other thread.
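
The trap-and-resend idea can be sketched as a context manager. The following POSIX-only Python 3 code is an illustration of the mechanism described above, not the package's SignalsTrap class, whose interface is not documented here. It assumes the documented default (15, 20) corresponds to SIGTERM and SIGTSTP, as on Linux, and SignalsTrapSketch is a hypothetical name.

```python
import os
import signal


class SignalsTrapSketch(object):
    """Defer delivery of selected signals until the with-block exits.

    Hypothetical illustration; must be used from the main thread.
    """

    # Assumption: 15 and 20 are SIGTERM and SIGTSTP, as on Linux.
    def __init__(self, trap_signals=(signal.SIGTERM, signal.SIGTSTP)):
        self.trap_signals = tuple(trap_signals or ())
        self.received = []
        self.saved = {}

    def _handler(self, signum, frame):
        # Remember the signal instead of acting on it.
        self.received.append(signum)

    def __enter__(self):
        for sig in self.trap_signals:
            self.saved[sig] = signal.signal(sig, self._handler)
        return self

    def __exit__(self, *exc_info):
        # Restore the original handlers first. signal.signal() returns None
        # for handlers not installed from Python; fall back to the default.
        for sig, old in self.saved.items():
            signal.signal(sig, old if old is not None else signal.SIG_DFL)
        self.saved.clear()
        # Resend the trapped signals in the order they arrived.
        pending, self.received = self.received, []
        for sig in pending:
            os.kill(os.getpid(), sig)
        return False
```

With trap_signals set to None or an empty sequence, no handlers are installed, so the block can be entered from any thread.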

Example:

>>> write_filename(xmldoc, "demo.xml")  # doctest: +SKIP
>>> write_filename(xmldoc, "demo.xml.gz", gz = True)    # doctest: +SKIP

write_fileobj(xmldoc, fileobj, gz=False, compresslevel=3, **kwargs)

Writes the LIGO Light Weight document tree rooted at xmldoc to the given file object. Internally, the .write() method of the xmldoc object is invoked and any additional keyword arguments are passed to that method. The file object need not be seekable. The output data is gzip compressed on the fly if gz is True, and in that case the compresslevel parameter sets the gzip compression level (the default is 3). The return value is a string containing the hex digits of the MD5 digest of the output bytestream.

Example:

>>> import sys
>>> from glue.ligolw import ligolw
>>> xmldoc = load_filename("demo.xml", contenthandler = ligolw.LIGOLWContentHandler)
>>> digest = write_fileobj(xmldoc, sys.stdout)  # doctest: +NORMALIZE_WHITESPACE
<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE LIGO_LW SYSTEM "http://ldas-sw.ligo.caltech.edu/doc/ligolwAPI/html/ligolw_dtd.txt">
<LIGO_LW>
        <Table Name="demo:table">
                <Column Type="lstring" Name="name"/>
                <Column Type="real8" Name="value"/>
                <Stream Delimiter="," Type="Local" Name="demo:table">
"mass",0.5,"velocity",34
                </Stream>
        </Table>
</LIGO_LW>
>>> digest
'37044d979a79409b3d782da126636f53'
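
Compressing on the fly while also producing a digest can be done by layering writers. The Python 3 sketch below is an illustration under an assumption: it takes "output bytestream" to mean the bytes that reach the file object (compressed bytes when gz is True). It writes a plain bytes payload rather than an XML tree, and DigestingWriter and write_bytes_sketch are hypothetical names.

```python
import gzip
import hashlib
import io


class DigestingWriter(object):
    """Pass writes through to fileobj while hashing every byte written."""

    def __init__(self, fileobj):
        self.fileobj = fileobj
        self.md5 = hashlib.md5()

    def write(self, data):
        self.md5.update(data)
        return self.fileobj.write(data)

    def flush(self):
        self.fileobj.flush()


def write_bytes_sketch(payload, fileobj, gz=False, compresslevel=3):
    """Write payload to fileobj, optionally gzipped; return the MD5 hex digest."""
    out = DigestingWriter(fileobj)
    if gz:
        # GzipFile only calls write() and flush() on its target here, so the
        # target does not need to be seekable.
        with gzip.GzipFile(mode="wb", fileobj=out, compresslevel=compresslevel) as gzfile:
            gzfile.write(payload)
    else:
        out.write(payload)
    return out.md5.hexdigest()
```

Note that a gzip header contains a timestamp, so under this assumption the digest of compressed output generally differs from one run to the next.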

write_url(xmldoc, url, **kwargs)

Writes the LIGO Light Weight document tree rooted at xmldoc to the URL name url.

NOTE: only URLs that point to local files can be written to at this time. Internally, write_filename() is used to perform the write. All additional keyword arguments are passed to that function. The implementation might change in the future, especially if support for other types of URLs is ever added.

Example:

>>> write_url(xmldoc, "file:///data.xml")       # doctest: +SKIP
>>> write_url(xmldoc, "file:///data.xml.gz", gz = True) # doctest: +SKIP