List Comprehension

If you are new to Python, or addicted to loops, it is time to consider something new: something that makes your code run faster and look much better.

Introducing List Comprehension

What is List Comprehension?

List comprehension is a succinct and elegant way to create a new list in which every element is produced by applying a certain rule.

Let's take a look at an example.

Imagine we have a list of numbers and want to create a new list containing the square root of every element in the existing list. Normally, a lot of programmers would do it like this:

existing_list = [4,9,16,25,36,49,64,81,100]

# Create an empty list
sqrt_list = []

# Make a for loop and apply the square root to each element in the existing_list
for i in existing_list:
    # append the newly calculated elements to the empty list
    sqrt_list.append(i**0.5)
sqrt_list
[2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]

Obviously, there is nothing wrong with doing it this way. However, list comprehension can greatly reduce the length of the code above.

[i ** 0.5 for i in existing_list]
[2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]

Why is List Comprehension better than a traditional Loop?

"Good programmers write code that humans can understand." (Martin Fowler)

List comprehension makes our code shorter and, for simple transformations, more readable than a traditional loop. As we can see from the code above, it reduces 3 lines of code to 1. Anyone with a basic understanding of list comprehension can easily understand it.

In addition, list comprehension is usually faster than a traditional loop. One of the reasons is that it doesn't need to load the append attribute of the list and call it as a function at each iteration.
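If you want to check the speed claim on your own machine, the standard-library timeit module is a simple way to do it. The snippet below is only a rough sketch: the function names (with_loop, with_comprehension), the input size and the number of repetitions are made up for illustration, and the actual timings depend on your Python version and hardware.

```python
import timeit

numbers = list(range(1, 10_001))

def with_loop(values):
    # traditional loop: looks up and calls .append on every iteration
    result = []
    for value in values:
        result.append(value ** 0.5)
    return result

def with_comprehension(values):
    # list comprehension: no explicit .append call
    return [value ** 0.5 for value in values]

# Both approaches must build exactly the same list
assert with_loop(numbers) == with_comprehension(numbers)

# Run each version 200 times and report the total time in seconds
loop_time = timeit.timeit(lambda: with_loop(numbers), number=200)
comp_time = timeit.timeit(lambda: with_comprehension(numbers), number=200)
print(f"loop:          {loop_time:.3f} s")
print(f"comprehension: {comp_time:.3f} s")
```

No expected output is shown here on purpose, since the numbers will differ from run to run.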

How do we write a list comprehension?

I have known a lot of people who have a hard time with list comprehension because they are too comfortable with traditional loops and hence less willing to change.

To be honest, creating a list comprehension is even easier than writing a traditional loop, as you do not even need to use .append.

Normally, the format of a list comprehension is: [expression for item in list]

expression is the transformation that you want to apply to each element of your list

item is each element in your list
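To make the format concrete, here is a small sketch that maps each part of the pattern onto real code. The list words and the variable names are made up for illustration.

```python
words = ["list", "comprehension", "python"]

# [expression for item in list]
#   expression -> len(word)
#   item       -> word
#   list       -> words
lengths = [len(word) for word in words]

# The equivalent traditional loop, for comparison
lengths_loop = []
for word in words:
    lengths_loop.append(len(word))

print(lengths)       # [4, 13, 6]
print(lengths_loop)  # [4, 13, 6]
```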

Types of List Comprehension

1. Unconditional List Comprehension

We have a list of names, and we want to capitalize all the letters of every name in the list.

string_name = ["max", "minh", "megan", "cody", "blair", "lucas", "sarah", "vishal"]

# traditional loop
capital_name = []
for i in string_name:
    capital_name.append(i.upper())

capital_name
['MAX', 'MINH', 'MEGAN', 'CODY', 'BLAIR', 'LUCAS', 'SARAH', 'VISHAL']

# list comprehension
[i.upper() for i in string_name]
['MAX', 'MINH', 'MEGAN', 'CODY', 'BLAIR', 'LUCAS', 'SARAH', 'VISHAL']

2. Conditional List Comprehension

Assume we have a string like this: "hello everyone". We want a list of the unique non-vowel letters in this string.

string_hello = "hello everyone"
vowels = ["a","e","i","o","u"]

# traditional loop
non_vowel_list = []

for i in string_hello.replace(" ",""):
    if i not in vowels and i not in non_vowel_list: 
        non_vowel_list.append(i)

non_vowel_list
['h', 'l', 'v', 'r', 'y', 'n']

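import numpy as np

# list comprehension
# note: np.unique removes the duplicates, but it returns a sorted NumPy array
# rather than a list in first-seen order, so the output differs from the loop above
np.unique([i for i in string_hello.replace(" ","") if i not in vowels])
array(['h', 'l', 'n', 'r', 'v', 'y'], dtype='<U1')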
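np.unique returns a sorted NumPy array, so its result differs in both type and order from the traditional loop. If we want exactly the same list as the loop, one possible approach (a sketch, not the only way) is to remove duplicates with dict.fromkeys, which keeps the first occurrence of each key in insertion order on Python 3.7+. The names non_vowels and unique_non_vowels are made up for this example.

```python
string_hello = "hello everyone"
vowels = ["a", "e", "i", "o", "u"]

# Step 1: the list comprehension keeps every non-vowel letter (duplicates included)
non_vowels = [ch for ch in string_hello.replace(" ", "") if ch not in vowels]

# Step 2: dict.fromkeys drops duplicates while preserving first-seen order
unique_non_vowels = list(dict.fromkeys(non_vowels))

print(unique_non_vowels)  # ['h', 'l', 'v', 'r', 'y', 'n']
```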

Let's look at a more complicated case with an if-else statement.

# if a number in the list is a multiple of 3, we change it to 3; otherwise we double it

num_string = [2,5,6,2,1,8,9,12]

# traditional loop
num_list = []
for i in num_string:
    if i % 3 == 0:
        num_list.append(3)
    else:
        num_list.append(i*2)

num_list
[4, 10, 3, 4, 2, 16, 3, 3]

# list comprehension
[3 if (i % 3) == 0 else i*2 for i in num_string]
[4, 10, 3, 4, 2, 16, 3, 3]
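Notice where the condition sits. An if placed after the for clause filters elements out, while an if-else placed before the for clause is a conditional expression that transforms every element. The short sketch below contrasts the two; the variable names only_multiples and replaced are made up for illustration.

```python
num_string = [2, 5, 6, 2, 1, 8, 9, 12]

# Filter: the trailing `if` drops elements, so the result can be shorter
only_multiples = [i for i in num_string if i % 3 == 0]

# Transform: the leading `if ... else ...` keeps every element and picks a value for it
replaced = [3 if i % 3 == 0 else i * 2 for i in num_string]

print(only_multiples)  # [6, 9, 12]
print(replaced)        # [4, 10, 3, 4, 2, 16, 3, 3]
```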

3. Nested List Comprehension

Now that we are a bit better at list comprehension, let's do something more advanced: list comprehension with a nested loop.

# return a list of strings with length > 10

club_string = [["Manchester United", "Liverpool", "Chelsea"],
               ["Real Madrid", "Barcelona", "Valencia"]]

club_list = []

# traditional loop
for league in club_string:
    for club in league:
        if len(club) > 10:
            club_list.append(club)

club_list
['Manchester United', 'Real Madrid']

# list comprehension
[club 
 for league in club_string 
 for club in league
 if len(club) > 10]
['Manchester United', 'Real Madrid']
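The for clauses in a nested comprehension are written in the same order as the equivalent nested loops: outer loop first, inner loop second. The sketch below uses a made-up matrix to show this. It also contrasts it with a comprehension inside a comprehension, which keeps the nested shape instead of flattening it.

```python
matrix = [[1, 2, 3],
          [4, 5, 6]]

# Two for clauses in one comprehension: flattens the nested list
flat = [number for row in matrix for number in row]

# A comprehension inside a comprehension: keeps the list-of-lists shape
doubled = [[number * 2 for number in row] for row in matrix]

print(flat)     # [1, 2, 3, 4, 5, 6]
print(doubled)  # [[2, 4, 6], [8, 10, 12]]
```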

DO NOT OVERUSE LIST COMPREHENSION

One of the purposes of writing a list comprehension is better readability. Cramming everything into a list comprehension will make things even worse than a traditional loop. For example:

# from the inner lists that have fewer than 3 elements, return every number,
# multiplied by 2 if it is odd or by 3 if it is even

number_list = [[1,2,3,4],
               [5,6],
               [7,8,9]]

[number*2 if number % 2 == 1 else number*3 for list_ in number_list if len(list_) < 3 
 for number in list_]
[10, 18]

The first way to fix it is to break the comprehension into several lines:

[number*2 
 if number % 2 == 1 
 else number*3 
 for list_ in number_list 
 if len(list_) < 3 
 for number in list_]
[10, 18]

Another way to make the code easier to read is to create a helper function:

def num_condition(num):
    if num % 2 == 1:
        return num*2
    else:
        return num*3

[num_condition(number) for list_ in number_list if len(list_) < 3 for number in list_]
[10, 18]

Dictionary Comprehension

Knowing how to write a list comprehension is like a "buy 1 get 1" deal, as you will also be able to write a dictionary comprehension.

Similar to list comprehension, dictionary comprehension is a concise and fast way to produce a new dictionary in which every key and value is based on a certain rule.

import pandas as pd
city_df = pd.DataFrame({"city" : ["Toronto","Waterloo", "Toronto", 
                                  "Vancouver", "Waterloo"],
                        "house_price" : [1000000,500000,700000,800000,600000]})
city_df
        city  house_price
0    Toronto      1000000
1   Waterloo       500000
2    Toronto       700000
3  Vancouver       800000
4   Waterloo       600000

For example, we have the dataframe shown above, and we want a dictionary showing the average house price of each city. Writing a dictionary comprehension is very similar to writing a list comprehension. The only difference is that each output entry needs both a key and a value.

city = city_df.city.unique()
{i : (city_df.house_price[city_df.city == i]).mean() for i in city}
{'Toronto': 850000.0, 'Waterloo': 550000.0, 'Vancouver': 800000.0}
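The same idea also works without pandas. As a minimal sketch, assume the data sits in two plain lists (cities and prices are made-up names that mirror the dataframe columns). The general shape is {key: value for item in iterable}.

```python
cities = ["Toronto", "Waterloo", "Toronto", "Vancouver", "Waterloo"]
prices = [1000000, 500000, 700000, 800000, 600000]

# dict.fromkeys(cities) yields each city once, in first-seen order
average_price = {
    city: sum(p for c, p in zip(cities, prices) if c == city) / cities.count(city)
    for city in dict.fromkeys(cities)
}

print(average_price)
# {'Toronto': 850000.0, 'Waterloo': 550000.0, 'Vancouver': 800000.0}
```

This version rescans both lists for every city, so it is fine for a small example but not a replacement for a proper group-by on large data.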