"""
The ilwd:char type is used to store ID strings for objects within LIGO
Light-Weight XML files.  This module and its associated C extension module
_ilwd provide a class for memory-efficient storage of ilwd:char strings.

LIGO Light Weight XML "ilwd:char" IDs are strings of the form
"table:column:integer", for example "process:process_id:10".  Large complex
documents can have many millions of these strings, and their storage
represents a significant RAM burden.  However, while there can be millions
of ID strings in a document, there might be only a small number (e.g., 10
or fewer) of unique ID prefixes in a document (the table name and column
name part).  The amount of RAM required to load a document can be
significantly reduced if the few unique string prefixes are stored
separately and reused.  This module provides the machinery used to do this.

The ilwdchar class in this module converts a string or unicode object
containing an ilwd:char ID into a more memory-efficient representation.

Example:

>>> x = ilwdchar("process:process_id:10")
>>> print(x)
process:process_id:10

Like strings, the object resulting from this is immutable.  It provides two
read-only attributes, "table_name" and "column_name", that can be used to
access the table and column parts of the original ID string.  The integer
suffix can be retrieved by converting the object to an integer.

Example:

>>> x.table_name
u'process'
>>> int(x)
10

The object also provides the read-only attribute "index_offset", giving the
length of the string preceding the integer suffix.

Example:

>>> x.index_offset
19

The objects support some arithmetic operations.

Example:

>>> y = x + 5
>>> str(y)
'process:process_id:15'
>>> int(y - x)
5

The objects are picklable.

Example:

>>> import pickle
>>> x == pickle.loads(pickle.dumps(x))
True

To simplify interaction with documents that do not contain fully-populated
columns, None is allowed as an input value and is not converted.

Example:

>>> print(ilwdchar(None))
None


Implementation details
======================

Memory is reduced by storing the table_name, column_name, and index_offset
values as class attributes, so only one copy is present in memory and is
shared across all instances of the class.  This means that each unique
table_name and column_name pair requires its own class.  These classes are
created on the fly as new IDs are processed, and get added to this module's
name space.  They are all subclasses of _ilwd.ilwdchar, which implements
the low-level machinery.  After a new class is created it can be accessed
as a symbol in this module, but none of those symbols exists until at least
one corresponding ID string has been processed.

Example:

>>> from glue.ligolw import ilwd
>>> "foo_bar_class" in ilwd.__dict__
False
>>> x = ilwd.ilwdchar("foo:bar:0")
>>> type(x)
<class 'glue.ligolw.ilwd.foo_bar_class'>
>>> "foo_bar_class" in ilwd.__dict__
True
>>> print(ilwd.foo_bar_class(10))
foo:bar:10

The ilwdchar class itself is never instantiated.  Its .__new__() method
parses the ID string parameter and creates an instance of the appropriate
subclass of _ilwd.ilwdchar, creating a new subclass before doing so if
necessary.
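
The scheme described above can be illustrated with a small pure-Python
sketch.  It is not the implementation used here (that lives in the _ilwd C
extension), and the names _IDBase, make_id_class, parse_id and _class_cache
are hypothetical, chosen only for this illustration.  It shows the prefix
being held once per class while each instance stores only its integer.

```python
# Illustrative sketch only: a pure-Python stand-in for the C-backed
# machinery.  _IDBase, make_id_class, parse_id and _class_cache are
# hypothetical names, not part of this module.

_class_cache = {}


class _IDBase(object):
    # Each instance stores only the integer suffix (no per-instance dict).
    __slots__ = ("i",)

    def __init__(self, i):
        self.i = int(i)

    def __int__(self):
        return self.i

    def __str__(self):
        # table_name and column_name are looked up on the subclass.
        return "%s:%s:%d" % (self.table_name, self.column_name, self.i)


def make_id_class(table_name, column_name):
    # Return the cached subclass for this prefix, creating it on first use.
    key = (table_name, column_name)
    try:
        return _class_cache[key]
    except KeyError:
        pass
    # The prefix lives in class attributes, shared by every instance.
    new_class = type(
        str("%s_%s_class" % key),
        (_IDBase,),
        {
            "__slots__": (),
            "table_name": table_name,
            "column_name": column_name,
            "index_offset": len("%s:%s:" % key),
        },
    )
    _class_cache[key] = new_class
    return new_class


def parse_id(s):
    # Split "table:column:integer" and build an instance of the subclass.
    table_name, column_name, i = s.strip().split(":")
    return make_id_class(table_name, column_name)(i)


a = parse_id("process:process_id:10")
b = parse_id("process:process_id:11")
print(str(a), int(b), type(a) is type(b), a.index_offset)
```

In this sketch both IDs are instances of the same generated class, so the
"process:process_id:" prefix is stored once no matter how many IDs are
created.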
"""


import six
import six.moves.copyreg


from glue import git_version
from . import _ilwd


__author__ = "Kipp Cannon <kipp.cannon@ligo.org>"
__version__ = "git id %s" % git_version.id
__date__ = git_version.date


def get_ilwdchar_class(tbl_name, col_name, namespace = globals()):
        """
        Searches this module's namespace for a subclass of _ilwd.ilwdchar
        whose table_name and column_name attributes match those provided.
        If a matching subclass is found it is returned; otherwise a new
        class is defined, added to this module's namespace, and returned.

        Example:

        >>> process_id = get_ilwdchar_class("process", "process_id")
        >>> x = process_id(10)
        >>> str(type(x))
        "<class 'glue.ligolw.ilwd.process_process_id_class'>"
        >>> str(x)
        'process:process_id:10'

        Retrieving and storing the class provides a convenient mechanism
        for quickly constructing new ID objects.

        Example:

        >>> for i in range(10):
        ...     print(str(process_id(i)))
        ...
        process:process_id:0
        process:process_id:1
        process:process_id:2
        process:process_id:3
        process:process_id:4
        process:process_id:5
        process:process_id:6
        process:process_id:7
        process:process_id:8
        process:process_id:9
        """
        # if the class already exists, retrieve and return it
        key = six.text_type(tbl_name), six.text_type(col_name)
        cls_name = str("%s_%s_class" % key)
        assert cls_name != "get_ilwdchar_class"
        try:
                return namespace[cls_name]
        except KeyError:
                pass

        # otherwise define a new class and add it to the namespace
        class new_class(_ilwd.ilwdchar):
                __slots__ = ()
                table_name, column_name = key
                index_offset = len(u"%s:%s:" % key)

        new_class.__name__ = cls_name

        namespace[cls_name] = new_class

        # pickle support:  reduce to a call to ilwdchar() on the ID string
        six.moves.copyreg.pickle(new_class, lambda x: (ilwdchar, (six.text_type(x),)))

        # return the new class
        return new_class


class ilwdchar(object):
        """
        Factory-style wrapper for the glue.ligolw._ilwd.ilwdchar class.
        Instantiating this class constructs and returns an instance of a
        subclass of glue.ligolw._ilwd.ilwdchar.
        """
        def __new__(cls, s):
                """
                Convert an ilwd:char-formatted string into an instance of
                the matching subclass of _ilwd.ilwdchar.  If the input is
                None then the return value is None.

                Example:

                >>> x = ilwdchar(u"process:process_id:10")
                >>> str(x)
                'process:process_id:10'
                >>> x.table_name
                u'process'
                >>> x.column_name
                u'process_id'
                >>> int(x)
                10
                >>> x.index_offset
                19
                >>> str(x)[x.index_offset:]
                '10'
                >>> print(ilwdchar(None))
                None
                """
                # None is passed through unconverted
                if s is None:
                        return None

                # try parsing the string as "table:column:integer"
                try:
                        table_name, column_name, i = s.strip().split(u":")
                except (ValueError, AttributeError):
                        raise ValueError("invalid ilwd:char '%s'" % repr(s))

                # retrieve the matching class, creating it if needed, and
                # return an instance initialized to the integer suffix
                return get_ilwdchar_class(table_name, column_name)(int(i))