Lesson 4: Complex Data Types: list / tuple / set / dictionary

Now let's explore more complex data types that store multiple values, like arrays in PHP.

The two most popular types for ordered data, also known as sequences, are:

  • list - ordered sequences of elements
  • tuple - similar to lists but immutable (elements cannot be modified once defined)

In addition, there are types that are not accessed by position:

  • set - an unordered collection of unique elements
  • dict (dictionary) - a collection of key-value pairs, accessed by key (since Python 3.7 it keeps insertion order)

Looks complex? Here they are in a nutshell.

names_list = ['John', 'James', 'Mandy'] # Python list, same as array in PHP
print(names_list[1]) # Result: James
names_list[0] = 'Jon' # You can assign new values

# --------------------

names_tuple = ('John', 'James', 'Mandy') # Tuple: same as list, but you can't modify values
print(names_tuple[1]) # Result: James
# names_tuple[0] = 'Jon' # You CANNOT assign new values, this line raises a TypeError

# --------------------

names_set = {'John', 'James', 'Mandy'} # Set: without keys or order
# print(names_set[0]) # You can't do that, this line raises a TypeError

# --------------------

names_dict = { # Need to look up values by KEY?
    'CEO': 'John', # Dictionary: same as key-value array in PHP
    'CTO': 'James',
    'Lead Developer': 'Mandy'
}
print(names_dict['CEO']) # Result: John

Of course, it's hard to grasp all of those types at once, but it comes with practice, example by example.

In Machine Learning projects, you will mostly work with just lists, using other types only in some cases.

But still, let's dive a bit deeper into the differences and what operations we may perform with them.


1. Lists

A list is what other languages call an array. The definition looks almost the same as in PHP.

Python

my_list = [1, 2, 3, 4]

PHP

$my_array = [1, 2, 3, 4];

Lists are objects, so you can call their methods with the .method() syntax. A few examples below:

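# append() adds an element to the end of the list
my_list = [1, 2, 3, 4]
my_list.append(7)

print(my_list)
# [1, 2, 3, 4, 7]

# insert(index, value) adds an element before the given index
my_list = [1, 2, 3, 4]
my_list.insert(2, 'a')

print(my_list)
# [1, 2, 'a', 3, 4]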
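
A few more list methods are worth knowing early. The sketch below uses only built-in list methods and the built-in len() function; the variable names `numbers` and `last` are just for illustration.

```python
numbers = [3, 1, 4, 1, 5]

numbers.remove(1)       # removes the first matching value (the first 1)
last = numbers.pop()    # removes and returns the last element
numbers.sort()          # sorts the list in place

print(numbers)          # [1, 3, 4]
print(last)             # 5
print(len(numbers))     # 3
```

Note that these methods change the list itself instead of returning a new one.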

2. Tuples

Tuples are much like lists, but their elements cannot be modified after the tuple is created.

Tuples can be constructed in several ways:

tuple1 = () # empty tuple
tuple2 = ('value1', 'value2', 'value3')
tuple3 = 'value1', 'value2', 'value3' # parentheses are even optional

Another way to create a tuple is the built-in tuple(iterable) function, which accepts any iterable as an argument.

t1 = tuple([1, 2, 3])
# (1, 2, 3)

# Strings are iterable too
t2 = tuple('abcde')
# ('a', 'b', 'c', 'd', 'e')

Having immutable data is an advantage if you want to ensure that it won't get modified by accident. Because they are immutable, tuples can also be slightly faster and lighter on memory than lists.
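
To see immutability in action, here is a minimal sketch. It catches the error with try/except (covered properly later), and the `offices` dictionary is a made-up example showing one practical benefit: a tuple can be used as a dictionary key, while a list cannot.

```python
names = ('John', 'James', 'Mandy')

try:
    names[0] = 'Jon'    # tuples do not support item assignment
except TypeError as error:
    print('Error:', error)

print(names)            # ('John', 'James', 'Mandy'), unchanged

# Tuples of immutable values can be dictionary keys
# (a list in the same place would raise a TypeError).
offices = {('Berlin', 'DE'): 'John', ('Paris', 'FR'): 'Mandy'}
print(offices[('Paris', 'FR')])    # Mandy
```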


3. Sets

A set object is an unordered collection of distinct items and is not considered a sequence, although it can be iterated. Sets do not maintain any order among elements and cannot be accessed using indices.

names_set = {'John', 'Bill'}

Lists and other iterables can be converted to sets using the set() function. One common use is to remove duplicates from a list.

my_list = [8, 8, 1, 2, 2, 3, 4, 5, 5, 5, 6, 7]

my_set = set(my_list)
print(my_set)
# {1, 2, 3, 4, 5, 6, 7, 8} (the printed order of a set is not guaranteed)
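
Sets also support a few operations that lists don't have. The following is a small sketch with two made-up sets, `team_a` and `team_b`; sorted() is used only to get a predictable order when printing.

```python
team_a = {'John', 'Bill', 'Mandy'}
team_b = {'Mandy', 'James'}

team_a.add('Bill')                # already present, so nothing changes
print(len(team_a))                # 3

print(team_a & team_b)            # intersection: {'Mandy'}
print(sorted(team_a | team_b))    # union: ['Bill', 'James', 'John', 'Mandy']
print(sorted(team_a - team_b))    # difference: ['Bill', 'John']
```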

4. Dictionaries

In Python, a dictionary is a collection of key-value pairs. Since Python 3.7, dictionaries preserve insertion order, but values are looked up by key, not by position.

It is known as an associative array in PHP, and as a hash map or hash table in other programming languages.

Dictionaries are defined using curly braces {} and consist of comma-separated key-value pairs.

Python

colors = {
    'red': '#ff0000',
    'green': '#00ff00',
    'blue': '#0000ff'
}

PHP

$colors = [
    'red' => '#ff0000',
    'green' => '#00ff00',
    'blue' => '#0000ff'
];

Accessing values

print(colors['red'])

# #ff0000

Adding and modifying values

colors['yellow'] = '#fde047'
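
The same assignment syntax both adds and replaces, depending on whether the key already exists. Here is a short sketch; the replacement color value is arbitrary, and get() is the built-in dictionary method for reading a key that might be missing.

```python
colors = {'red': '#ff0000', 'green': '#00ff00'}

colors['yellow'] = '#fde047'    # new key: the pair is added
colors['red'] = '#ef4444'       # existing key: the value is replaced

print(colors)
# {'red': '#ef4444', 'green': '#00ff00', 'yellow': '#fde047'}

# colors['purple'] would raise a KeyError; get() returns a default instead
print(colors.get('purple'))               # None
print(colors.get('purple', '#000000'))    # #000000
```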

Removing items

You can remove a key-value pair using the del statement or pop(key) method.

colors = {
    'red': '#ff0000',
    'green': '#00ff00',
    'blue': '#0000ff'
}

del colors['green']
colors.pop('red')

print(colors)
# {'blue': '#0000ff'}

A dictionary also has the methods keys(), values(), and items() to retrieve its keys, its values, and its key-value pairs:

colors = {
    'red': '#ff0000',
    'green': '#00ff00',
    'blue': '#0000ff'
}

print(colors.keys())
# dict_keys(['red', 'green', 'blue'])

print(colors.values())
# dict_values(['#ff0000', '#00ff00', '#0000ff'])

print(colors.items())
# dict_items([('red', '#ff0000'), ('green', '#00ff00'), ('blue', '#0000ff')])
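
These methods return view objects, not lists, so you can't index them directly. If you need a real list, wrap the view in list(). A minimal sketch, with `color_names` and `pairs` as illustrative variable names:

```python
colors = {'red': '#ff0000', 'green': '#00ff00', 'blue': '#0000ff'}

color_names = list(colors.keys())
print(color_names)       # ['red', 'green', 'blue']
print(color_names[0])    # red

pairs = list(colors.items())
print(pairs[-1])         # ('blue', '#0000ff')
print(len(colors))       # 3
```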

Now, let's look at a few operations we can perform on these types. Membership tests work on all of them; slicing works only on sequences (lists, tuples, strings, and ranges).


Membership operators

Membership operators are used to test whether a value is a member of a sequence or collection. There are two such operators: in and not in.

my_list = [1, 2, 3, 4, 5]

print(3 in my_list) # Output: True
print(6 in my_list) # Output: False

my_list = ['John', 'Bill']

print('John' not in my_list) # Output: False
print('George' not in my_list) # Output: True

title = 'laravel daily'
print('d' in title) # Output: True
print('s' not in title) # Output: True
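
The same operators work on tuples, sets, and dictionaries. One detail to keep in mind: on a dictionary, in checks the keys, not the values. A short sketch with sample data:

```python
colors = {'red': '#ff0000', 'green': '#00ff00'}

print('red' in colors)                 # True, 'red' is a key
print('#ff0000' in colors)             # False, values are not checked
print('#ff0000' in colors.values())    # True

names_tuple = ('John', 'James')
names_set = {'John', 'James'}
print('John' in names_tuple)           # True
print('Mandy' not in names_set)        # True
```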

Slicing sequences and indices

Python has a powerful way to slice sequences.

Syntax: my_list[i:j:k] - it means a slice of my_list from i (included) to j (not included) with step k.

You can use negative index values to count from the end of the list. A negative step k reverses the direction in which the list is traversed, so the slice runs from i backwards towards j.

Here are some examples of what you can do with that.

# indices    0       1        2        3        4       5         6         7
my_list = ['John', 'Paul', 'George', 'Ringo', 'Mick', 'Keith', 'Charlie', 'Bill']
#           -8      -7       -6       -5       -4      -3        -2        -1
# negative indices

every_other = my_list[::2]
print(every_other)
# ['John', 'George', 'Mick', 'Charlie']

every_other_reversed = my_list[::-2]
print(every_other_reversed)
# ['Bill', 'Keith', 'Ringo', 'Paul']

every_other_in_range = my_list[2:5:2]
print(every_other_in_range)
# ['George', 'Mick']

two_last = my_list[-2:]
print(two_last)
# ['Charlie', 'Bill']
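
Slicing is not limited to lists: it works the same way on strings and tuples. The sketch below also shows how start and stop behave with a negative step, which is easy to get wrong; the `letters` list is just sample data.

```python
title = 'laravel daily'
print(title[:7])        # laravel
print(title[-5:])       # daily
print(title[::-1])      # yliad levaral

names_tuple = ('John', 'Paul', 'George', 'Ringo')
print(names_tuple[1:3])    # ('Paul', 'George')

letters = ['a', 'b', 'c', 'd', 'e', 'f']
# With a negative step the slice walks backwards,
# so start must be greater than stop.
print(letters[4:1:-1])     # ['e', 'd', 'c']
print(letters[1:4:-1])     # [] (nothing lies "backwards" from 1 to 4)
```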

The built-in slice(start, stop, step) function returns a slice object that you can use in place of the i:j:k notation, which is handy if you want to reuse the same slice.

my_list = ['John', 'Paul', 'George', 'Ringo', 'Mick', 'Keith', 'Charlie', 'Bill']
s = slice(2, 5, 2)

print(my_list[s])
# ['George', 'Mick']

Slicing a list always creates a new list (a shallow copy). Slicing without arguments, my_list[:], copies the whole list. We can verify that:

my_list = ['John', 'Paul', 'George']
new_list = my_list[:]

my_list.append('Brian')

print(my_list)
print(new_list)
# ['John', 'Paul', 'George', 'Brian']
# ['John', 'Paul', 'George']
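
For comparison, plain assignment does not copy anything: it only gives the same list a second name. A minimal sketch of the difference, with `alias` and `copied` as illustrative variable names:

```python
my_list = ['John', 'Paul', 'George']

alias = my_list       # same list object under a second name
copied = my_list[:]   # new list with the same elements

my_list.append('Brian')

print(alias)               # ['John', 'Paul', 'George', 'Brian']
print(copied)              # ['John', 'Paul', 'George']
print(alias is my_list)    # True
print(copied is my_list)   # False
```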

The range() Function

There's one more important function I haven't mentioned: range() helps you generate sequences of numbers quickly.

A range can be constructed using range(stop) or range(start, stop[, step]) and is an immutable, sequence-like object.

r = range(0, 20, 2)

print(r[5]) # Result: 10
print(r[-1]) # Result: 18
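
Because a range behaves like a sequence, the operations from the previous sections apply to it as well. A short sketch using the same range as above:

```python
r = range(0, 20, 2)

print(len(r))          # 10
print(10 in r)         # True
print(11 in r)         # False

# Slicing a range gives another range, not a list
print(r[2:5])          # range(4, 10, 2)
print(list(r[2:5]))    # [4, 6, 8]
```

A range does not store all of its numbers in memory; it only remembers start, stop, and step, so even a very large range is cheap to create.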

It's a valuable tool for loops, which we will talk about in the next lesson.

You can convert a range object into a list:

r = list(range(10))
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

s = list(range(1, 11))
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

t = list(range(0, 10, 3))
# [0, 3, 6, 9]

u = list(range(0, -10, -1))
# [0, -1, -2, -3, -4, -5, -6, -7, -8, -9]

Quite a lot of information, right? No worries: as with every programming language, you will get used to these types and their operations with practice.

