Data Structures#

Technically, data structures in Python are also data types. For the purposes of this course, however, we treat them separately, based on the following definition: data structures are formats for organizing data so that it is easier to access and manipulate.
More precisely, a data structure is a collection of data values, the relationships among them, and the functions or operations that can be applied to the data.

In Python, we are mainly concerned with four built-in data structures: list, tuple, set, and dictionary.
We will go through these one by one.

Lists#

A list is a data structure that holds an ordered collection of items, i.e., you can store a sequence of items in a list.
A list is written as a sequence of comma-separated values (items) between square brackets.
An important property of a list is that its items need not be of the same type.

fruitlist = ["apple", "banana", "orange", "mango"]
print(fruitlist)
['apple', 'banana', 'orange', 'mango']
subject_and_year_list = ["chemistry", "physics", 1997, 2000]
print(subject_and_year_list)
['chemistry', 'physics', 1997, 2000]

Accessing values in a list#

Accessing values in a list is similar to how we access a substring in a string. We use square brackets and indices.

print(f"First fruit: {fruitlist[0]}") # printing the first item in fruit list
First fruit: apple
print(subject_and_year_list[2:5]) # getting multiple values
[1997, 2000]
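
The slice above asks for indices 2 to 5 even though the list has only four items, and Python simply clips the range. The sketch below illustrates this together with negative indices and a step value. The list `colors` is a hypothetical example and is not part of the course data.

```python
# Hypothetical list used only for this illustration.
colors = ["red", "green", "blue", "yellow"]

last = colors[-1]          # negative indices count from the end
first_two = colors[:2]     # an omitted start defaults to 0
clipped = colors[2:10]     # a stop past the end is clipped, no error is raised
every_other = colors[::2]  # the third slice value is the step

print(last)         # yellow
print(first_two)    # ['red', 'green']
print(clipped)      # ['blue', 'yellow']
print(every_other)  # ['red', 'blue']
```

Note that indexing a single position past the end, such as `colors[10]`, does raise an IndexError. Only slices are clipped.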

Updating values in a list#

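You can update one or more values in a list by assigning new values to the given index or slice. When assigning to a slice, the replacement need not have the same length as the slice, so the list can grow or shrink.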

subject_and_year_list
['chemistry', 'physics', 1997, 2000]
# updating a single value
subject_and_year_list[2] = "maths"
subject_and_year_list
['chemistry', 'physics', 'maths', 2000]
fruitlist
['apple', 'banana', 'orange', 'mango']
# updating multiple items
fruitlist[1:3] = ['carrot', 'mango']
fruitlist
['apple', 'carrot', 'mango', 'mango']
fruitlist[1:3] = ['orange']
fruitlist
['apple', 'orange', 'mango']

Deleting list elements#

We can delete a list element by its index (position) using the del statement, or by its value using the remove method.

fruitlist
['apple', 'orange', 'mango']
del fruitlist[0]
fruitlist
['orange', 'mango']
subject_and_year_list
['chemistry', 'physics', 'maths', 2000]
subject_and_year_list.remove(2000)
subject_and_year_list
['chemistry', 'physics', 'maths']
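
A few details of deletion are easy to miss, so here is a small sketch. The list `letters` is a hypothetical example. It shows that remove deletes only the first matching value, that the pop method removes an item and hands it back, and that removing a value that is not present raises a ValueError.

```python
# Hypothetical list used only for this illustration.
letters = ["a", "b", "c", "b"]

letters.remove("b")      # removes only the first matching value
print(letters)           # ['a', 'c', 'b']

popped = letters.pop()   # removes and returns the last item
print(popped, letters)   # b ['a', 'c']

try:
    letters.remove("z")  # "z" is not in the list
except ValueError as err:
    print("ValueError:", err)
```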

Basic list operations#

If we multiply a list by an integer n, we get a new list in which the elements are repeated n times.

list1 = ['hi', 2]
list1
['hi', 2]
list1 * 5
['hi', 2, 'hi', 2, 'hi', 2, 'hi', 2, 'hi', 2]

To check whether an element is present in a list, we can use the in operator.

list2 = [1,2,3]
list2
[1, 2, 3]
2 in list2
True
5 in list2
False

To merge or concatenate two lists, we can add them using the + operator. This creates a new list and leaves the original lists unchanged.

list3 = [1,2,3]
list4 = [4,5,6]
list5 = list3 + list4
list3
[1, 2, 3]
list4
[4, 5, 6]
list5
[1, 2, 3, 4, 5, 6]

We can add an item at the end of a list using the append method.

list4 = [1,2,3]
list4
[1, 2, 3]
list4.append(8)
print(list4)
[1, 2, 3, 8]

We can use the built-in function len to get the length of a list.

len(list4)
4
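
The operations above can be confused with one another, in particular append and +. The following sketch, using a hypothetical list `base`, contrasts them and also shows the extend method, which is not covered above.

```python
# Hypothetical list used only for this illustration.
base = [1, 2, 3]

appended = base.copy()
appended.append([4, 5])    # the whole list [4, 5] becomes one new item

extended = base.copy()
extended.extend([4, 5])    # each item of [4, 5] is added separately

combined = base + [4, 5]   # builds a new list; base itself is unchanged

print(appended)  # [1, 2, 3, [4, 5]]
print(extended)  # [1, 2, 3, 4, 5]
print(combined)  # [1, 2, 3, 4, 5]
print(base)      # [1, 2, 3]
print(len(appended), len(extended))  # 4 5
```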

Tuple#

A tuple is an immutable sequence of Python objects. Tuples are sequences, just like lists. The differences are that a tuple cannot be changed once it is created, and that tuples are written with parentheses instead of square brackets.

tup1 = ('physics', 'chemistry', 1997, 2000)
tup2 = (1,2,3,4,5,6)
tup3 = ('a', 'b', 'c', 'd')
print(tup1, tup2, tup3)
('physics', 'chemistry', 1997, 2000) (1, 2, 3, 4, 5, 6) ('a', 'b', 'c', 'd')

Most of the operations that we carry out on tuples are similar to the ones we carried out for lists. I will just give examples of the operations and skip the discussion for most of these operations.

Accessing values in a tuple#

tup1
('physics', 'chemistry', 1997, 2000)
tup1[2] # accessing one value
1997
tup1[0:3] # accessing multiple values
('physics', 'chemistry', 1997)

Updating values#

Tuples are immutable, thus, we cannot update the values in a given tuple. We encounter an error if we try to do so.

tup1[1] = 'orange' # should give an error
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[34], line 1
----> 1 tup1[1] = 'orange' # should give an error

TypeError: 'tuple' object does not support item assignment
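
Immutability applies to the tuple itself, not necessarily to the objects stored in it. As a minimal sketch, assuming a hypothetical tuple `record` that holds a list, the list inside can still be changed even though the tuple's slots cannot be reassigned.

```python
# Hypothetical tuple used only for this illustration.
record = ("physics", [90, 85])

record[1].append(70)   # allowed: this mutates the list, not the tuple
print(record)          # ('physics', [90, 85, 70])

try:
    record[1] = [0]    # not allowed: reassigning a slot of the tuple
except TypeError as err:
    print("TypeError:", err)
```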

Deleting values#

Removing individual values from a tuple is not possible, but we can delete the complete tuple using the del keyword.

tup3
('a', 'b', 'c', 'd')
del tup3[1] # should give an error
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[36], line 1
----> 1 del tup3[1] # should give an error

TypeError: 'tuple' object doesn't support item deletion
del tup3
tup3
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[38], line 1
----> 1 tup3

NameError: name 'tup3' is not defined

Basic tuple operations#

Multiplying by a number

tup1
('physics', 'chemistry', 1997, 2000)
tup1 * 2
('physics', 'chemistry', 1997, 2000, 'physics', 'chemistry', 1997, 2000)
tup1
('physics', 'chemistry', 1997, 2000)

Check if element is present in tuple

tup3 = (1,2,3)
2 in tup3
True
5 in tup3
False

Concatenate two tuples

tup4 = (1,2,3)
tup5 = (4,5,6)
tup6 = tup4 + tup5
tup6
(1, 2, 3, 4, 5, 6)

Adding items to a tuple is not possible

tup3.append(7)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[46], line 1
----> 1 tup3.append(7)

AttributeError: 'tuple' object has no attribute 'append'

Length of a tuple

len(tup5)
3

Set#

Sets in Python are similar to mathematical sets: they are unordered collections of distinct elements that support operations such as union, intersection, and difference. We can create a set, add or remove elements, and carry out these mathematical operations. Sets are created with the built-in set() function or with curly braces.

variable = set([list of items])

days = set(['mon', 'tue', 'wed', 'thu', 'fri', 'sat', 'sun'])
print(days)
{'mon', 'sat', 'sun', 'wed', 'thu', 'tue', 'fri'}
months = {'jan', 'feb', 'mar'} # sets can also be defined using curly braces
print(months)
{'jan', 'feb', 'mar'}
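
Two consequences of how sets are created are worth a quick check: duplicates are dropped, and empty curly braces create a dictionary, not a set. The names below are hypothetical and used only for this sketch.

```python
# Duplicate values collapse into a single element.
readings = set([3, 1, 3, 2, 1])
print(len(readings))          # 3
print(readings == {1, 2, 3})  # True

# An empty set must be created with set(); {} is an empty dictionary.
empty_set = set()
empty_braces = {}
print(type(empty_set).__name__)     # set
print(type(empty_braces).__name__)  # dict
```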

Accessing values in a set#

We cannot access individual elements in a set using indices, as the items in a set are not linked to a given index.

months[0]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[50], line 1
----> 1 months[0]

TypeError: 'set' object is not subscriptable

Adding items to a set#

We can add elements to a set using the add method. There is no specific index attached to the added element.

days.add("jan")
days
{'fri', 'jan', 'mon', 'sat', 'sun', 'thu', 'tue', 'wed'}

Removing an item from a set#

Similarly, we can remove an element using the discard method. If the element is not in the set, discard does nothing.

days.discard("jan")
days
{'fri', 'mon', 'sat', 'sun', 'thu', 'tue', 'wed'}
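
Sets also have a remove method, which differs from discard only in how it handles a missing element. A short sketch, using a hypothetical set `days_demo`:

```python
# Hypothetical set used only for this illustration.
days_demo = {"mon", "tue"}

days_demo.discard("jan")     # "jan" is absent: no error, set unchanged

try:
    days_demo.remove("jan")  # "jan" is absent: remove raises KeyError
except KeyError as err:
    print("KeyError:", err)  # KeyError: 'jan'

print(days_demo == {"mon", "tue"})  # True
```

Use discard when a missing element is acceptable, and remove when a missing element should be treated as a mistake.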

Union of sets#

The union operation on two sets produces a new set containing all the distinct elements of both sets.

daysA = {'mon', 'tue', 'wed'}
daysB = {'wed', 'thu', 'fri', 'sat', 'sun'}
allDays = daysA | daysB # union operation
allDays
{'fri', 'mon', 'sat', 'sun', 'thu', 'tue', 'wed'}

Intersection of sets#

The intersection operation on two sets produces a new set containing only elements present in both the sets.

daysA & daysB
{'wed'}

Difference of sets#

The difference operation on two sets produces a new set containing only the elements from the first set that are not present in the second set.

daysA - daysB
{'mon', 'tue'}

Compare sets#

We can check whether a given set is a subset or superset of another set. The result is a boolean that depends on the elements present in the two sets.

daysA = {'mon', 'tue', 'wed'}
daysB = {'mon', 'tue', 'wed', 'thu', 'fri'}
daysA <= daysB # checking if daysA is a subset of daysB
True
daysB <= daysA
False
daysA >= daysB # checking if daysA is a superset of daysB
False
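
Each of the set operators used above also has a method form, which some readers find easier to read. The sketch below uses hypothetical sets `a` and `b` and compares results with == because the printed order of set elements is not fixed. It also shows the symmetric difference operator ^ and the strict subset operator <, which are not covered above.

```python
# Hypothetical sets used only for this illustration.
a = {"mon", "tue", "wed"}
b = {"wed", "thu"}

# Operators and their method equivalents give the same result.
print(a.union(b) == (a | b))         # True
print(a.intersection(b) == (a & b))  # True
print(a.difference(b) == (a - b))    # True

# Symmetric difference: elements in exactly one of the two sets.
print((a ^ b) == {"mon", "tue", "thu"})  # True

# <= allows equal sets, < requires a strictly smaller set.
print(a <= a, a < a)       # True False
print(a.issubset(a | b))   # True
```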

Dictionary#

A dictionary is a data structure consisting of key-value pairs. Like a set, a dictionary is defined using curly braces; however, each key is separated from its value by a colon (:). An empty dictionary without any items is written with just two curly braces, like this: {}.

marks_dict = {'maths': 34, 'chemistry': 36, 'physics': 32}
print(marks_dict)
{'maths': 34, 'chemistry': 36, 'physics': 32}

Accessing values in a dictionary#

To access a value in a dictionary, we put its key in square brackets. Values are looked up by key, not by position, so using a key that does not exist (such as 0 below) raises a KeyError.

marks_dict['chemistry']
36
marks_dict[0]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[63], line 1
----> 1 marks_dict[0]

KeyError: 0
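
To avoid a KeyError when a key might be missing, one option is to test for the key with the in operator or to use the get method, which returns a default value instead of raising an error. The dictionary `marks` below is a hypothetical example for this sketch.

```python
# Hypothetical dictionary used only for this illustration.
marks = {"maths": 34, "chemistry": 36}

print("biology" in marks)       # False
print(marks.get("biology"))     # None (the default when no fallback is given)
print(marks.get("biology", 0))  # 0    (explicit fallback value)
print(marks.get("maths", 0))    # 34   (key exists, fallback is ignored)
```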

Updating dictionary#

We can update a dictionary by adding a new entry, or a key-value pair, modifying an existing entry, or deleting an existing entry.

marks_dict
{'maths': 34, 'chemistry': 36, 'physics': 32}
marks_dict['biology'] = 35 # adding new entry
marks_dict
{'maths': 34, 'chemistry': 36, 'physics': 32, 'biology': 35}
marks_dict['chemistry'] = 30 # modifying existing entry
marks_dict
{'maths': 34, 'chemistry': 30, 'physics': 32, 'biology': 35}
del marks_dict['biology'] # deleting an entry
marks_dict
{'maths': 34, 'chemistry': 30, 'physics': 32}

Deleting a dictionary#

marks_dict.clear() # deletes all entries
marks_dict
{}
del marks_dict # deletes the dictionary object
marks_dict
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[69], line 2
      1 del marks_dict # deletes the dictionary object
----> 2 marks_dict

NameError: name 'marks_dict' is not defined

Properties of dictionary#

anime = {'name': 'naruto', 'age': '17', 'class': 'genin'}

The keys() method returns all the keys present in a dictionary

anime.keys()
dict_keys(['name', 'age', 'class'])

The values() method returns all the values present in the dictionary

anime.values()
dict_values(['naruto', '17', 'genin'])

The items() method returns both the keys and values as pairs of a given dictionary

anime.items()
dict_items([('name', 'naruto'), ('age', '17'), ('class', 'genin')])
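
These three methods are most often used in loops. The sketch below, with a hypothetical dictionary `stock`, iterates over items() and also shows that the returned objects are views: they reflect later changes to the dictionary, and can be turned into a plain list with list().

```python
# Hypothetical dictionary used only for this illustration.
stock = {"apples": 3, "pears": 5}

# items() yields (key, value) pairs that can be unpacked in a loop.
for fruit, count in stock.items():
    print(f"{fruit}: {count}")

key_view = stock.keys()       # a live view, not a snapshot
key_list = list(stock.keys()) # a separate list, fixed at this moment

stock["plums"] = 7            # change the dictionary afterwards

print("plums" in key_view)    # True
print("plums" in key_list)    # False
print(sum(stock.values()))    # 15
```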

Copy#

Often we will use complex data structures, such as nested lists (lists within lists). When using such complex data structures, one needs to be careful about items in the data structure and what exactly happens when we try to copy a complex data structure like that.

In the first example, we are not copying the old_list into the new_list. We are basically just assigning two different names to the same memory location.

old_list = [[1,2,3], [4,5,6]]
new_list = old_list

print(old_list)
print(new_list)
[[1, 2, 3], [4, 5, 6]]
[[1, 2, 3], [4, 5, 6]]
old_list.append([7,8,9])
print(old_list)
print(new_list)
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

Both old_list and new_list refer to the same object. Thus, when we make any changes to the old_list, they are reflected in the new_list.
We can verify this using the built-in id function, which returns an integer that identifies the specified object. In CPython this is the object's memory address. Two objects that exist at the same time cannot have the same id.

id(old_list)
3089126272128
id(new_list)
3089126272128

We see that the IDs are the same! Both names refer to the same object.
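
Instead of comparing id values by eye, the same check can be written with the is operator, which tests whether two names refer to the same object. The sketch below uses hypothetical names and contrasts is with ==, which only compares contents.

```python
# Hypothetical lists used only for this illustration.
a = [1, 2]
alias = a      # second name for the same object
twin = [1, 2]  # a different object with equal contents

print(alias is a)  # True:  same object
print(twin == a)   # True:  equal contents
print(twin is a)   # False: different objects

alias.append(3)    # a change made through one name...
print(a)           # [1, 2, 3]  ...is visible through the other
print(twin)        # [1, 2]
```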

Shallow copy#

Python's standard library has a module called “copy” that can be used to copy complex or nested objects like this.
We will now use this module to create a copy of our old_list.

import copy

old_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
new_list = copy.copy(old_list)

print("Old list:", old_list)
print("New list:", new_list)
Old list: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
New list: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
old_list.append([4, 4, 4])

print("Old list:", old_list)
print("New list:", new_list)
Old list: [[1, 2, 3], [4, 5, 6], [7, 8, 9], [4, 4, 4]]
New list: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

This seems to work now! We have added a new item to the old_list, but the new_list is intact! Let’s go a step further and try something like modifying an existing element just to be sure.

old_list = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
new_list = copy.copy(old_list)
old_list[1]
[2, 2, 2]
old_list[1][2]
2
old_list[1][1] = 'AA'

print("Old list:", old_list)
print("New list:", new_list)
Old list: [[1, 1, 1], [2, 'AA', 2], [3, 3, 3]]
New list: [[1, 1, 1], [2, 'AA', 2], [3, 3, 3]]

Alas! Modifying an existing element of the older list also modifies the corresponding item in the newer list.
How can this be? Let’s investigate using the id function.

id(old_list)
3089126268288
id(new_list)
3089126286464

The IDs of the two lists are different. What next? Let’s go a step further and check the IDs of the lists inside the outer list.

id(old_list[1])
3089126292928
id(new_list[1])
3089126292928

As suspected, the copy.copy() function copied the list only superficially: it created a new outer list, but the inner lists in the newer list are the same objects as those in the older list.
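
The same shallow behaviour applies to the other common ways of copying a list. As a sketch, using a hypothetical nested list `original`, each approach below produces a new outer list that still shares its inner lists with the original.

```python
import copy

# Hypothetical nested list used only for this illustration.
original = [[1, 1], [2, 2]]

copies = {
    "copy.copy": copy.copy(original),
    "list()": list(original),
    "slice [:]": original[:],
    ".copy()": original.copy(),
}

for name, duplicate in copies.items():
    outer_shared = duplicate is original        # False: new outer list
    inner_shared = duplicate[0] is original[0]  # True: same inner list
    print(name, outer_shared, inner_shared)
```

Every line of output ends with `False True`, so none of these approaches protects the inner lists from changes made through the original.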

Deep Copy#

To address this common issue, the “copy” module provides another function called deepcopy. It recursively copies every item in a nested list, creating a new object for each inner list. Let’s try this out.

old_list = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
new_list = copy.deepcopy(old_list)

print("Old list:", old_list)
print("New list:", new_list)
Old list: [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
New list: [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
old_list[1][0] = 'BB'

print("Old list:", old_list)
print("New list:", new_list)
Old list: [[1, 1, 1], ['BB', 2, 2], [3, 3, 3]]
New list: [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
id(old_list)
3089126292800
id(new_list)
3089125844544
id(old_list[1])
3089126281216
id(new_list[1])
3089126273856

As expected, the IDs of both the outer and inner lists are different. The two lists now behave independently, so we don’t have to worry about any changes in the old_list being reflected in the new_list.
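
The same distinction holds for other nested structures, not only lists of lists. The sketch below assumes a hypothetical nested dictionary `config` and compares a shallow copy with a deep copy after the original is modified.

```python
import copy

# Hypothetical nested dictionary used only for this illustration.
config = {"scores": [1, 2], "meta": {"tags": ["a"]}}

shallow = copy.copy(config)   # new outer dict, shared inner objects
deep = copy.deepcopy(config)  # new objects at every level

config["scores"].append(3)
config["meta"]["tags"].append("b")

print(shallow["scores"])        # [1, 2, 3]  (follows the original)
print(deep["scores"])           # [1, 2]     (independent)
print(shallow["meta"]["tags"])  # ['a', 'b']
print(deep["meta"]["tags"])     # ['a']
```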
