1.1 - Flat Bags: much more than dictionaries and lists
First of all, a flat Bag can be thought of as a dictionary whose items are ordered and whose keys may be duplicated.
1.1.1 - How to instantiate a Bag
To instantiate a bag, pass the constructor either a list of (label, value) tuples or a dict. If you don't pass any argument you obtain an empty bag.
from gnr.core.gnrbag import Bag
mybag= Bag()
mybag = Bag([('a',1),('b',2)])
mybag = Bag({'a': 1, 'b':2})
1.1.2 - Setting and Getting items
You can read and write a bag's items using the methods getItem(path) and setItem(path, value). For flat bags, the path is called a label and represents the item's identifier among the bag's children.
mybag = Bag()
mybag.setItem('a',1)
first= mybag.getItem('a')
A more compact way to access a bag's items is the square-brackets notation, which is a typical feature of dictionaries.
mybag['b'] = 2
second = mybag['b']
1.1.3 - Duplicated labels
It is evident that there are several analogies between a bag's label and dictionary key, but there are also some fundamental differences.
- A bag's label must be a string: numbers or complex types are not valid labels.
- Unlike dictionaries, whose keys must be unique, bags can have different items tagged with the same label.
- If you try to get an item that is not present in the bag, Python doesn't raise an exception; the method getItem returns None.
IMPORTANT - It is possible to insert different values with the same label, but in order to do this you have to use the method addItem(label,value) because setItem(label,value) would set a new value on the old item.
beatles= Bag()
beatles.addItem('member', 'John')  # you could also use setItem to add the first item to the bag
beatles.addItem('member', 'Paul')
beatles.addItem('member', 'George')
beatles.addItem('member', 'Ringo')
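The difference between the two methods can be pictured with a plain list of (label, value) pairs. This is only a minimal stand-in written for illustration, not Genropy's implementation: the class name PairBag and its snake_case methods are hypothetical.

```python
class PairBag(object):
    """Hypothetical stand-in: an ordered list of (label, value) pairs."""

    def __init__(self):
        self.pairs = []

    def add_item(self, label, value):
        # Always appends, so a label may appear more than once.
        self.pairs.append((label, value))

    def set_item(self, label, value):
        # Overwrites the first pair with this label, or appends a new one.
        for i, (existing, _) in enumerate(self.pairs):
            if existing == label:
                self.pairs[i] = (label, value)
                return
        self.pairs.append((label, value))

    def get_item(self, label):
        # Returns the first match, or None when the label is missing.
        for existing, value in self.pairs:
            if existing == label:
                return value
        return None


beatles = PairBag()
for name in ('John', 'Paul', 'George', 'Ringo'):
    beatles.add_item('member', name)
beatles.set_item('member', 'Pete')  # replaces 'John', adds nothing
```

After the last call the container still holds four pairs, which is why a second value under the same label needs the "add" operation rather than the "set" one.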
1.1.4 - Accessing items by index
A bag is an ordered container: it remembers the order in which its children were inserted. This makes a Bag similar to a list, since items can be retrieved with a numeric index that represents the element's position, counting from 0. To access data by position, use a special label composed of # followed by the item's index.
first = mybag.getItem('#0')
second = mybag['#1']
This feature is very useful when a bag has several items with the same label, because the method getItem(label) returns only the first item tagged with the argument label. This means that the only way to access items with a duplicated label is by index.
lennon= beatles.getItem('member')
lennon= beatles.getItem('#0')
mccartney=beatles.getItem('#1')
harrison=beatles['#2']
If you need to know the ordinal position of an item, you can use the method index(label). Remember that, unlike a list's index method, it looks up the element by its label and not by its value. A bag's label can be duplicated; in that case index(label) returns the position of the first occurrence of the label.
n= beatles.index('member')
lennon = beatles['#%i' % n]
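As a rough illustration of how a label and a '#n' index can address the same ordered data, here is a small sketch over a list of (label, value) pairs. The function find_position is hypothetical and is not part of the Bag API; returning -1 for a missing label is an assumption of this sketch.

```python
def find_position(pairs, label):
    """Return the position addressed by a label or by a '#n' index."""
    if label.startswith('#'):
        return int(label[1:])
    for position, (existing, _) in enumerate(pairs):
        if existing == label:
            return position  # the first occurrence wins
    return -1  # assumption: -1 signals "not found" in this sketch


pairs = [('member', 'John'), ('member', 'Paul'), ('member', 'George')]
first = pairs[find_position(pairs, 'member')][1]
third = pairs[find_position(pairs, '#2')][1]
```

Looking up 'member' always lands on the first pair, so the later pairs are only reachable through their numeric position.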
1.1.5 - Setting an item's position
It is possible to set a new item at a particular position among its siblings, using the optional argument _position of the method setItem(label, value). By default setItem appends the new item as the last element, but the _position argument provides a compact syntax to insert an item at the desired place. _position must be a string in one of the following forms:
| '<' | set as first item |
| '<label' | set before the element with label |
| '<#index ' | set before the element with index |
| '>' | set as last item |
| '>label' | set after the element with label |
| '>#index ' | set after the element with index |
| '#index' | set at position |
mybag=Bag({'a':1,'b':2,'c':3,'d':4})
mybag.setItem('e',5, _position= '<')
mybag.setItem('f',6, _position= '<c')
mybag.setItem('g',7, _position= '<#3')
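One way to read the table above is as a translation from a position string to a list index. The sketch below applies that reading to a plain list of labels; insertion_index is a hypothetical helper, and the exact behaviour of the real _position argument may differ in edge cases.

```python
def insertion_index(labels, position):
    """Translate a _position-style string into a list index."""
    if not position:
        return len(labels)  # no position given: append at the end
    if position.startswith('#'):
        return int(position[1:])
    side, target = position[0], position[1:]
    if side not in ('<', '>'):
        raise ValueError('unsupported position: %r' % position)
    if not target:
        return 0 if side == '<' else len(labels)
    if target.startswith('#'):
        index = int(target[1:])
    else:
        index = labels.index(target)
    # '<' inserts before the reference element, '>' right after it.
    return index if side == '<' else index + 1


labels = ['a', 'b', 'c', 'd']
labels.insert(insertion_index(labels, '<'), 'e')
labels.insert(insertion_index(labels, '<c'), 'f')
labels.insert(insertion_index(labels, '>#0'), 'g')
```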
1.1.6 - How to display a bag
If you want to display a bag in your Python shell, you can use the print statement. If you need a bag's representation as a string, use the method asString().
>>> print mybag
0 - (int) e: 5
1 - (int) a: 1
2 - (int) g: 7
3 - (int) f: 6
4 - (int) c: 3
5 - (int) b: 2
6 - (int) d: 4
>>> mystring = mybag.asString()
>>> print mystring
0 - (int) e: 5
1 - (int) a: 1
2 - (int) g: 7
3 - (int) f: 6
4 - (int) c: 3
5 - (int) b: 2
6 - (int) d: 4
The representation has one line for each item. Each line is structured as follows:
| item's index | item's type | label | value |
1.1.7 - Other similarities between bags, dictionaries and lists
1.1.7.1 - Dictionary methods
- bag.keys()
- bag.items()
- bag.values()
- bag.has_key()
1.1.7.2 - List methods
- index()
- pop()
1.1.7.3 - The operator 'in'
Bag also supports the operator in exactly like a dictionary or list.
>>> 'a' in mybag
True
1.1.7.4 - Transform a flat bag into a dictionary
A bag can be transformed into a dict with the method asDict().
d = mybag.asDict()
If you transform a hierarchical bag into a dictionary, no error is raised, but the resulting dictionary contains only the first level of the bag.
1.2 - Hierarchical bags
If a bag contains other bags, the outer one is a Hierarchical Bag.
In the previous chapter we saw how a bag works in breadth. Now we'll see how they can be used to store data "in depth".
Bags aren't just another traversable tree structure. In fact a Bag supports direct access to any value contained in any of the nested bags, using a complex path. This means that a bag contains not only its children but also its descendants.
We call path a concatenation of nested bags' labels that always ends with the innermost item's label. The separator character of a path is the dot. If a label itself contains a dot that must not be interpreted as a path separator, escape the dot with a backslash.
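The escaping rule can be sketched with a regular expression that splits only on dots not preceded by a backslash. The helper split_path is hypothetical and only illustrates the rule described above; it is not how the Bag class parses paths internally.

```python
import re


def split_path(path):
    """Split a path on dots that are not preceded by a backslash."""
    parts = re.split(r'(?<!\\)\.', path)
    # Drop the escaping backslash from each resulting label.
    return [part.replace('\\.', '.') for part in parts]


plain = split_path('friends.johnny.phone.mobile')
escaped = split_path(r'files.report\.pdf')
```

In the second call the escaped dot stays inside the label, so the path has two segments instead of three.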
1 new_card= Bag()
2 new_card['name']='John'
3 new_card.setItem('surname','Doe')
4 new_card['phone']= Bag()
5 new_card['phone'].setItem('office',555450210)
6 new_card.setItem('phone.home',555345670)
7 new_card.setItem('phone.mobile', 555230450)
8 address_book=Bag()
9 address_book.setItem('friends.johnny', new_card)
10 john_mobile= address_book.getItem('friends.johnny.phone.mobile')
>>> print john_mobile
555230450
Let's examine the address_book example. We instantiate a bag called new_card and set three items: name, surname and phone. As the example shows, an item can be set with two different syntaxes: (a) the square-brackets notation, or (b) the setItem method. The item phone is a bag, and we fill it with three values: office, home and mobile. There is a formal difference between line 5 and line 6. In line 5 we set office as a child of phone by calling setItem on the instance labelled phone. In line 6 we instead set home directly from the bag new_card, as its grandchild, using the path 'phone.home'.
Even if the instance which sets the item is different, the result is identical. Both items are set at the same level and we can consider them either as children of "phone" or as nested content of new_card.
A hierarchical bag such as new_card can be nested within a larger one. In line 9 we set it into the bag friends, which is inside the bag address_book. You might object that the bag friends was never instantiated or set into address_book. When the method setItem receives the path 'friends.johnny', any intermediate bags that don't exist yet are created as well.
This feature is very useful to quickly create many nested bags with just a single command.
mybag=Bag()
mybag.setItem('a.b.c.d.e.f.g', 7)
>>> print mybag['a.b.c.d.e.f.g']
7
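The automatic creation of intermediate levels can be imitated with nested dictionaries. The following is a minimal sketch under that assumption: set_path and get_path are hypothetical helpers, and plain dicts stand in for nested bags.

```python
def set_path(tree, path, value):
    """Store value at a dotted path, creating missing levels on the way."""
    labels = path.split('.')
    for label in labels[:-1]:
        # setdefault creates the intermediate container only if absent.
        tree = tree.setdefault(label, {})
    tree[labels[-1]] = value


def get_path(tree, path):
    """Follow a dotted path; return None when any label is missing."""
    for label in path.split('.'):
        if not isinstance(tree, dict) or label not in tree:
            return None
        tree = tree[label]
    return tree


book = {}
set_path(book, 'friends.johnny.phone.mobile', 555230450)
set_path(book, 'friends.johnny.phone.home', 555345670)
```

The second call reuses the levels created by the first one, so both numbers end up under the same phone container.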
The print statement displays nested bags as indented blocks.
>>> print address_book
0 - (Bag) friends:
    0 - (Bag) johnny:
        0 - (str) name: John
        1 - (str) surname: Doe
        2 - (Bag) phone:
            0 - (int) office: 555450210
            1 - (int) home: 555345670
            2 - (int) mobile: 555230450
In the previous chapter we saw that we can access an item using a numeric label #index. A bag can be traversed using a path that includes either common labels or a numeric label.
>>> print address_book['friends.johnny.#2.office']
555450210
1.3 - Bag's attributes
You can attach metadata to any item of a Bag. These metadata are called attributes. Each attribute has a name and a value. Attributes are stored in a dictionary.
1.3.1 - Setting attributes
1.3.1.1 - with setItem
You can set attributes while you set an item, passing them as **kwargs of the method setItem.
b=Bag()
b.setItem('documents.letters.letter_to_mark', 'file0', createdOn='10-7-2003', createdBy= 'Jack')
b.setItem('documents.letters.letter_to_john', 'file1', createdOn='11-5-2003', createdBy='Mark', lastModify= '11-9-2003')
b.setItem('documents.letters.letter_to_sheila', 'file2')
1.3.1.2 - with setAttr
You can set attributes and change their value with the method setAttr(path, attributes). Attributes are passed as **kwargs.
b.setAttr('documents.letters.letter_to_sheila', createdOn='12-4-2003', createdBy='Walter', lastModify= '12-9-2003')
b.setAttr('documents.letters.letter_to_sheila', fileOwner='Steve')
>>> print b
0 - (Bag) documents:
    0 - (Bag) letters:
        0 - (str) letter_to_mark: file0 <createdOn='10-7-2003' createdBy='Jack'>
        1 - (str) letter_to_john: file1 <lastModify='11-9-2003' createdOn='11-5-2003' createdBy='Mark'>
        2 - (str) letter_to_sheila: file2 <lastModify='12-9-2003' createdOn='12-4-2003' createdBy='Walter' fileOwner='Steve'>
1.3.2 - Getting attributes
To get a single attribute of an item, use the method getAttr(path, attr).
>>> print b.getAttr('documents.letters.letter_to_sheila', 'fileOwner')
Steve
There is also a compact square-brackets notation for getAttr(path, attr): append the special character '?' followed by 'a:' and the attribute's name to the path. The previous example becomes:
>>> print b['documents.letters.letter_to_sheila?a:fileOwner']
Steve
1.3.3 - Deleting attributes
You can delete an attribute by setting None as its value.
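The "None means delete" convention can be illustrated on a plain attribute dictionary. This is a sketch of the convention only: set_attrs is a hypothetical helper, not the library's setAttr.

```python
def set_attrs(attributes, **changes):
    """Update an attribute dict; a value of None removes the attribute."""
    for name, value in changes.items():
        if value is None:
            attributes.pop(name, None)  # ignore names that are not set
        else:
            attributes[name] = value
    return attributes


attrs = {'createdOn': '12-4-2003', 'lastModify': '12-9-2003'}
set_attrs(attrs, lastModify=None, createdBy='Walter')
```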
1.3.4 - Digest method
A Bag implements a very useful method called digest that returns a list of tuples, one for each of the bag's items. These tuples contain the columns requested by the parameter what, which is a comma-separated string of special keys.
| #k | show the label of each item |
| #v | show the value of each item |
| #v.path | show inner values of each item |
| #a | show attributes of each item |
| #a.attrname | show the attribute called 'attrname' for each item |
>>> print b['documents.letters'].digest('#k,#a.createdOn,#a.createdBy')
[('letter_to_mark', '10-7-2003', 'Jack'), ('letter_to_john', '11-5-2003', 'Mark'), ('letter_to_sheila', '12-4-2003', 'Walter')]
In this example we requested the label and the attributes createdOn and createdBy. There is also a square-brackets notation for the method digest: it uses the special character "?" followed by "d:" and then the parameter what.
>>> print b['documents.letters.?d:#k,#a.createdOn,#a.createdBy']
[('letter_to_mark', '10-7-2003', 'Jack'), ('letter_to_john', '11-5-2003', 'Mark'), ('letter_to_sheila', '12-4-2003', 'Walter')]
>>> print b['documents.letters.?d:#v, #a.createdOn']
[('file0', '10-7-2003'), ('file1', '11-5-2003'), ('file2', '12-4-2003')]
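To make the role of the what parameter concrete, here is a small sketch that builds the same kind of tuples from a list of (label, value, attributes) triples. The function below is a hypothetical re-implementation for illustration: it supports only the #k, #v, #a and #a.attrname keys and leaves out #v.path.

```python
def digest(nodes, what):
    """Build one tuple per node from a comma-separated list of keys."""
    rows = []
    for label, value, attributes in nodes:
        row = []
        for key in what.split(','):
            key = key.strip()
            if key == '#k':
                row.append(label)
            elif key == '#v':
                row.append(value)
            elif key == '#a':
                row.append(attributes)
            elif key.startswith('#a.'):
                # A missing attribute yields None in this sketch.
                row.append(attributes.get(key[3:]))
            else:
                raise ValueError('unsupported key: %r' % key)
        rows.append(tuple(row))
    return rows


letters = [
    ('letter_to_mark', 'file0', {'createdOn': '10-7-2003', 'createdBy': 'Jack'}),
    ('letter_to_john', 'file1', {'createdOn': '11-5-2003', 'createdBy': 'Mark'}),
]
summary = digest(letters, '#k,#a.createdBy')
```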
1.3.5 - Attributes in path
We said that a path can be formed from labels or #index references. There is a third way to identify a bag item: a condition on one of its attributes, provided the attribute's value is a string. For example, the item labelled letter_to_mark can be identified by the condition "the file created by Jack". So, instead of a label or a numeric position, a path segment may contain a condition on an attribute. The syntax for testing a condition on an attribute within a path is:
#attribute_name=value
If the tested attribute is called id, the attribute's name can be omitted. Remember that this syntax works only if the tested attribute has a string value.
bookcase = Bag()
mybook=Bag()
mybook.setItem('part1', Bag(),title='The fellowship of the ring', pages=213)
mybook.setItem('part2', Bag(), title='The two towers', pages=221)
mybook.setItem('part3', Bag(), title='The return of the king', pages=242)
bookcase.setItem('genres.fantasy.LRDRNGS', mybook , title='The lord of the rings',id='f123', author='Tolkien')
>>> print bookcase.getItem('genres.fantasy.#author=Tolkien')
0 - (Bag) part1: <pages='213' title='The fellowship of the ring'>
1 - (Bag) part2: <pages='221' title='The two towers'>
2 - (Bag) part3: <pages='242' title='The return of the king'>
>>> print bookcase.getAttr('genres.fantasy.#=f123', 'title')
The lord of the rings
This example shows two uses of a path that includes a condition on an item's attributes:
- getItem('genres.fantasy.#author=Tolkien')
- getAttr('genres.fantasy.#=f123', 'title')
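The matching rule for such a path segment can be sketched as follows. The helper find_by_condition and the sample data are hypothetical; the comparison is done on strings, which mirrors the restriction that only string attributes can be tested.

```python
def find_by_condition(nodes, segment):
    """Return the first node whose attribute matches '#name=value'."""
    name, _, wanted = segment.lstrip('#').partition('=')
    name = name or 'id'  # an omitted name means the 'id' attribute
    for label, value, attributes in nodes:
        if attributes.get(name) == wanted:
            return (label, value, attributes)
    return None


# Illustrative data only; the second entry is made up for this sketch.
nodes = [
    ('LRDRNGS', 'book-1', {'id': 'f123', 'author': 'Tolkien', 'pages': 676}),
    ('OTHER', 'book-2', {'id': 'x999', 'author': 'Someone'}),
]
by_author = find_by_condition(nodes, '#author=Tolkien')
by_id = find_by_condition(nodes, '#=f123')
by_pages = find_by_condition(nodes, '#pages=676')  # int attribute: no match
```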
1.4 - Bag nodes
We discovered in the previous chapter that we can associate a set of attributes to each item. We will now discuss a more advanced concept about a Bag, where we introduce the BagNode. A Bag is a collection of 'nodes'.
A 'BagNode' is an object composed of three things:
- label
- attributes
- value (or item)
In order to avoid confusion between the terms item and node, what we used to call an 'item' we will now call a value.
If you need to work with nodes, you may get them with the methods:
| getNode(path) | returns a node |
| getNodes() | returns a list of nodes |
| getNodeByAttr(attribute, attr_value) | returns the node that has the passed attribute-value pair |
mybag = Bag({'paper':1, 'scissors':2})
papernode = mybag.getNode('paper')
mybag.setItem('rock', 3 , color='grey')
rocknode=mybag.getNodeByAttr('color','grey')
nodes=mybag.getNodes()
The method getNodes() implements the bag's property nodes.
>>> mybag.getNodes() == mybag.nodes
True
If you have a node instance you may use one of the following methods:
| hasAttr(attribute) | returns true if the node has a value for the passed attribute |
| setAttr(attribute=value) | sets one or more attributes on the node, passed as kwargs |
| getAttr(attribute) | returns the attribute's value |
| replaceAttr(attribute=value) | replaces the value of one or more attributes passed as kwargs |
| delAttr(attribute) | deletes the attribute with the passed name |
| getLabel() | returns the node's label |
| setLabel(label) | sets the node's label |
| getValue() | returns the node's value |
| setValue() | sets the node's value |
>>> print papernode.hasAttr('color')
False
>>> papernode.setAttr(color='white')
>>> print papernode.getAttr('color')
white
>>> papernode.replaceAttr(color='yellow')
>>> papernode.delAttr('color')
>>> papernode.setLabel('sheet')
>>> print papernode.getLabel()
sheet
>>> papernode.setValue(8)
>>> papernode.getValue()
8
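The three parts of a node can be pictured with a very small class. This is a hypothetical stand-in and not the real BagNode: the names Node, has_attr and node_by_attr are invented for the sketch.

```python
class Node(object):
    """Hypothetical stand-in for a node: label, value and attributes."""

    def __init__(self, label, value, **attributes):
        self.label = label
        self.value = value
        self.attributes = dict(attributes)

    def has_attr(self, name):
        return name in self.attributes


def node_by_attr(nodes, name, wanted):
    """Return the first node whose attribute `name` equals `wanted`."""
    for node in nodes:
        if node.attributes.get(name) == wanted:
            return node
    return None


nodes = [Node('paper', 1), Node('scissors', 2), Node('rock', 3, color='grey')]
rock = node_by_attr(nodes, 'color', 'grey')
```

Seen this way, a bag is the ordered list, and everything said so far about labels, values and attributes is stored on the individual nodes.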
1.5 - Parent reference and backwards paths
We said that each item is wrapped in a bag node and that it can be contained by several bags in different places. This means that a Bag knows its children but does not know its father; in fact, it may have many fathers.
We could make stricter assumptions about the structure of a bag, bringing it closer to a tree-leaf model; this happens when a bag holds a back reference to the bag that contains it. This feature is implemented by the method setBackRef(). If we call setBackRef() on a bag instance, that bag becomes the root of a tree structure in which each leaf (BagNode) knows its father. We can then traverse a bag backwards using the property parent of the bag's nodes.
family = Bag()
family['grandpa'] = Bag()
family['grandpa'].setBackRef()
family['grandpa.father.son.nephew'] = Bag()
nephew = family['grandpa.father.son.nephew']
son = family['grandpa.father.son']
father = family['grandpa.father']
>>> son.parent == father
True
>>> nephew.parent.parent == father
True
>>> nephew.parent == son
True
A bag with back reference can be traversed with special back-paths that use a new syntax. The symbol '../' in a path is equivalent to the property parent.
>>> nephew['../../'] == father
True
When the backreference is set, it is possible to get from the bag its own BagNode:
>>> father.node
BagNode : father at 432464
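The idea of a back reference and of a '../' step can be sketched with a container that records its parent when it is created. The class Tree and its methods are hypothetical and only mimic the behaviour described above; resolve handles leading '../' steps and nothing else.

```python
class Tree(object):
    """Hypothetical container that remembers which container holds it."""

    def __init__(self, parent=None):
        self.parent = parent
        self.children = {}

    def child(self, label):
        # Create the child on first access and record the back reference.
        if label not in self.children:
            self.children[label] = Tree(parent=self)
        return self.children[label]

    def resolve(self, path):
        # Each leading '../' climbs one level towards the root.
        current = self
        while path.startswith('../'):
            current = current.parent
            path = path[3:]
        return current


grandpa = Tree()
father = grandpa.child('father')
son = father.child('son')
nephew = son.child('nephew')
```

Without the stored parent, the climb in resolve would be impossible, which is why back-paths require the back reference to be set first.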


