In this lab, we take a look at sets in Python. A set is an unordered collection of unique objects. You can write a set using curly brackets {}. Python removes duplicate items:
set1={"pop", "rock", "soul", "hard rock", "rock", "R&B", "rock", "disco"}
set1
{'R&B', 'disco', 'hard rock', 'pop', 'rock', 'soul'}
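One caveat with the curly-bracket notation: empty curly brackets create a dictionary, not a set. The short sketch below shows the difference; the variable names are illustrative and not part of the lab.

```python
# {} on its own is an empty dictionary, so use set() for an empty set
empty_dict = {}
empty_set = set()

print(type(empty_dict))  # <class 'dict'>
print(type(empty_set))   # <class 'set'>

# Curly brackets with at least one element do create a set
genres = {"pop", "rock", "rock"}
print(len(genres))       # 2, the duplicate "rock" is dropped
```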
The process of mapping is illustrated in the figure:
You can also create a set from a list as follows:
album_list = ["Michael Jackson", "Thriller", 1982, "00:42:19",
              "Pop, Rock, R&B", 46.0, 65, "30-Nov-82", None, 10.0]
album_set = set(album_list)
album_set
{65,
None,
'Pop, Rock, R&B',
10.0,
46.0,
'Michael Jackson',
'30-Nov-82',
'Thriller',
'00:42:19',
1982}
Now let us create a set of genres:
music_genres = set(["pop", "pop", "rock", "folk rock", "hard rock", "soul",
                    "progressive rock", "soft rock", "R&B", "disco"])
music_genres
{'R&B',
'disco',
'folk rock',
'hard rock',
'pop',
'progressive rock',
'rock',
'soft rock',
'soul'}
set(['rap','house','electronic music', 'rap'])
{'electronic music', 'house', 'rap'}
Notice that the duplicates are removed. The output above happens to be displayed in sorted order, but a set itself is unordered, so do not rely on that order.
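Sets do not guarantee any particular order, so when a predictable order is needed, one option is to pass the set to the built-in sorted() function, which returns a new list. A minimal sketch using the genres from the previous cell (the variable names are illustrative):

```python
genres = set(["rap", "house", "electronic music", "rap"])

# sorted() returns a new list in ascending order; the set itself is unchanged
ordered_genres = sorted(genres)
print(ordered_genres)  # ['electronic music', 'house', 'rap']
```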
Because a set drops duplicates, the sum of a list can differ from the sum of the set built from it:
A=[1,2,2,1]
B=set([1,2,2,1])
sum(A)==sum(B)
False
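To see why the two sums differ, it can help to compare the lengths and totals side by side. The sketch below reuses the values of A and B from the cell above; has_duplicates is an illustrative name.

```python
A = [1, 2, 2, 1]
B = set(A)

print(len(A), sum(A))  # 4 6
print(len(B), sum(B))  # 2 3

# Comparing lengths is a quick way to test whether a list has duplicates
has_duplicates = len(A) != len(B)
print(has_duplicates)  # True
```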
Let us go over set operations, which can be used to change a set. Consider the set A:
A = set(["Thriller","Back in Black", "AC/DC"] )
A
{'AC/DC', 'Back in Black', 'Thriller'}
We can add an element to a set using the add() method:
A.add("NSYNC")
A
{'AC/DC', 'Back in Black', 'NSYNC', 'Thriller'}
If we add the same element twice, nothing will happen as there can be no duplicates in a set:
A.add("NSYNC")
A
{'AC/DC', 'Back in Black', 'NSYNC', 'Thriller'}
We can remove an item from a set using the remove method:
A.remove("NSYNC")
A
{'AC/DC', 'Back in Black', 'Thriller'}
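Note that remove() raises a KeyError if the element is not in the set. When the element may or may not be present, discard() is a gentler alternative. A small sketch, using the same set A:

```python
A = {"Thriller", "Back in Black", "AC/DC"}

# remove() raises KeyError when the element is absent
try:
    A.remove("NSYNC")
except KeyError:
    print("NSYNC is not in the set")

# discard() removes the element if present and does nothing otherwise
A.discard("NSYNC")
A.discard("AC/DC")
print(sorted(A))  # ['Back in Black', 'Thriller']
```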
We can check whether an element is in the set using the in operator:
"AC/DC" in A
True
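The opposite check is written with not in. Membership tests are also handy for filtering another collection against a set. In the sketch below, the candidates list is a hypothetical example and not data from the lab.

```python
A = {"Thriller", "Back in Black", "AC/DC"}

print("NSYNC" in A)      # False
print("NSYNC" not in A)  # True

# Hypothetical list of names to look up in the set
candidates = ["Thriller", "NSYNC", "AC/DC"]
found = [name for name in candidates if name in A]
missing = [name for name in candidates if name not in A]
print(found)    # ['Thriller', 'AC/DC']
print(missing)  # ['NSYNC']
```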
Remember that with sets you can check the difference between sets, as well as the symmetric difference, intersection, and union:
Consider the following two sets:
album_set1 = set(["Thriller",'AC/DC', 'Back in Black'] )
album_set2 = set([ "AC/DC","Back in Black", "The Dark Side of the Moon"] )
album_set1, album_set2
({'AC/DC', 'Back in Black', 'Thriller'},
{'AC/DC', 'Back in Black', 'The Dark Side of the Moon'})
Because both sets contain 'AC/DC' and 'Back in Black', we can represent these common elements as the intersection of two circles.
We can find the common elements of the sets as follows:
album_set_3=album_set1 & album_set2
album_set_3
{'AC/DC', 'Back in Black'}
We can find all the elements that are only contained in album_set1 using the difference method:
album_set1.difference(album_set2)
{'Thriller'}
The result keeps only elements of album_set1; every element of album_set2, including those in the intersection, is excluded.
The difference between album_set2 and album_set1 is given by:
album_set2.difference(album_set1)
{'The Dark Side of the Moon'}
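The difference can also be written with the - operator. The symmetric difference mentioned earlier, meaning the elements that belong to exactly one of the two sets, is available through symmetric_difference() or the ^ operator. A brief sketch with the same two sets (only_one is an illustrative name):

```python
album_set1 = {"Thriller", "AC/DC", "Back in Black"}
album_set2 = {"AC/DC", "Back in Black", "The Dark Side of the Moon"}

# The - operator is equivalent to difference()
print(album_set1 - album_set2)  # {'Thriller'}

# Symmetric difference: elements found in exactly one of the two sets
only_one = album_set1.symmetric_difference(album_set2)
print(sorted(only_one))  # ['The Dark Side of the Moon', 'Thriller']

# The ^ operator gives the same result
print(only_one == album_set1 ^ album_set2)  # True
```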
We can also find the intersection, i.e. the elements in both album_set1 and album_set2, using the intersection() method:
album_set1.intersection(album_set2)
{'AC/DC', 'Back in Black'}
This corresponds to the intersection of the two circles:
The union corresponds to all the elements that appear in either set, which is represented by colouring both circles:
The union is given by:
album_set1.union(album_set2)
{'AC/DC', 'Back in Black', 'The Dark Side of the Moon', 'Thriller'}
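The | operator is a shorthand for union(). Since shared elements appear only once in the result, the size of the union equals the two sizes added together minus the size of the intersection. A quick sketch to check this with the same two sets (union_set and common are illustrative names):

```python
album_set1 = {"Thriller", "AC/DC", "Back in Black"}
album_set2 = {"AC/DC", "Back in Black", "The Dark Side of the Moon"}

# | is the operator form of union()
union_set = album_set1 | album_set2
print(union_set == album_set1.union(album_set2))  # True

# Shared elements are counted once, so
# len(union) == len(set1) + len(set2) - len(intersection)
common = album_set1 & album_set2
print(len(union_set))                                   # 4
print(len(album_set1) + len(album_set2) - len(common))  # 4
```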
And you can check if a set is a superset or subset of another set, respectively, like this:
album_set1.issuperset(album_set2)
False
album_set2.issubset(album_set1)
False
False
Here is an example where issubset() and issuperset() are both true:
{"Back in Black", "AC/DC"}.issubset(album_set1)
True
album_set1.issuperset({"Back in Black", "AC/DC"})
True
album_set3 = album_set1.union(album_set2)
album_set3
{'AC/DC', 'Back in Black', 'The Dark Side of the Moon', 'Thriller'}
album_set1.issubset(album_set3)
True
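The same checks can be written with comparison operators: <= for subset and >= for superset. The strict forms < and > additionally require the two sets to be different. A small sketch, where small_set is an illustrative name:

```python
album_set1 = {"Thriller", "AC/DC", "Back in Black"}
small_set = {"Back in Black", "AC/DC"}

print(small_set <= album_set1)   # True, same as small_set.issubset(album_set1)
print(album_set1 >= small_set)   # True, same as album_set1.issuperset(small_set)

# Every set is a subset of itself, but not a proper (strict) subset
print(album_set1 <= album_set1)  # True
print(album_set1 < album_set1)   # False
```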
</div>
# About the Authors:
[Joseph Santarcangelo](https://www.linkedin.com/in/joseph-s-50398b136/) has a PhD in Electrical Engineering. His research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. Joseph has been working for IBM since he completed his PhD.
<hr>
Copyright © 2017 [cognitiveclass.ai](cognitiveclass.ai?utm_source=bducopyrightlink&utm_medium=dswb&utm_campaign=bdu). This notebook and its source code are released under the terms of the [MIT License](https://bigdatauniversity.com/mit-license/).