If you are new to Python, or addicted to loops, it's time to consider something new: something that can make your code run faster and look much better.
Introducing List Comprehension
What is List Comprehension?
List comprehension is a succinct and elegant way to create a new list in which every element is produced by applying a certain rule.
Let's take a look at an example.
Imagine we have a list of numbers and want to create a new list containing the square root of every element in the existing list. Normally, a lot of programmers would do it like this:
existing_list = [4,9,16,25,36,49,64,81,100]
# Create an empty list
sqrt_list = []
# Loop over existing_list and apply the square root to each element
for i in existing_list:
    # append the newly calculated element to the empty list
    sqrt_list.append(i ** 0.5)
sqrt_list
[2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
Obviously, there is nothing wrong with doing it this way. However, list comprehension can greatly reduce the length of the code above.
[i ** 0.5 for i in existing_list]
[2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
Why is List Comprehension better than a traditional loop?
"Good programmers write code that humans can understand." (Martin Fowler)
List comprehension makes our code shorter and, for simple transformations, easier to read than a traditional loop. As the code above shows, it reduces three lines of code to one, and anyone with a basic understanding of list comprehension can follow it.
In addition, list comprehension is usually faster than the equivalent traditional loop. One of the reasons is that it does not need to load the append attribute of the list and call it as a function at each iteration.
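One way to check the speed claim on your own machine is the standard-library timeit module. The sketch below is only a rough micro-benchmark: the function names and the input size are made up for illustration, and the actual timings depend on the Python version and hardware.

```python
import timeit

data = list(range(1000))  # hypothetical input, chosen only for the benchmark

def with_loop(values):
    # Looks up and calls result.append on every iteration
    result = []
    for v in values:
        result.append(v ** 0.5)
    return result

def with_comprehension(values):
    # Builds the list directly, with no explicit append call
    return [v ** 0.5 for v in values]

# Both versions must produce the same list
same_result = with_loop(data) == with_comprehension(data)

# Time 200 runs of each version; lower is faster
loop_time = timeit.timeit(lambda: with_loop(data), number=200)
comp_time = timeit.timeit(lambda: with_comprehension(data), number=200)
print(f"loop: {loop_time:.4f}s, comprehension: {comp_time:.4f}s")
```

If the two timings come out close on your setup, readability is the better reason to choose one form over the other.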
How do we write a list comprehension?
I have known a lot of people who had a hard time with list comprehension because they were too comfortable with the traditional loop and, hence, less willing to change.
To be honest, writing a list comprehension is even easier than writing a traditional loop, as you do not even need to call .append.
Normally, the format of a list comprehension is: [expression for item in list]
expression is the transformation you want to apply to each element of the list
item is each element of the list
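To make the template concrete, here is a small sketch that labels each part. The list words is made up for illustration.

```python
words = ["list", "comprehension", "python"]  # hypothetical data

# [expression  for item in list ]
lengths = [len(word) for word in words]
# expression -> len(word)   (what to compute for each element)
# item       -> word        (the current element)
# list       -> words       (the list being looped over)

print(lengths)  # [4, 13, 6]
```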
Types of List Comprehension
1. Unconditional List Comprehension
We have a list of names, and we want to capitalize all the letters of every name in the list.
string_name = ["max", "minh", "megan", "cody", "blair", "lucas", "sarah", "vishal"]
# traditional loop
capital_name = []
for i in string_name:
    capital_name.append(i.upper())
capital_name
['MAX', 'MINH', 'MEGAN', 'CODY', 'BLAIR', 'LUCAS', 'SARAH', 'VISHAL']
# list comprehension
[i.upper() for i in string_name]
['MAX', 'MINH', 'MEGAN', 'CODY', 'BLAIR', 'LUCAS', 'SARAH', 'VISHAL']
2. Conditional List Comprehension
Suppose we have the string "hello everyone" and want a list of the unique non-vowel letters in it.
string_hello = "hello everyone"
vowels = ["a","e","i","o","u"]
# traditional loop
non_vowel_list = []
for i in string_hello.replace(" ", ""):
    if i not in vowels and i not in non_vowel_list:
        non_vowel_list.append(i)
non_vowel_list
['h', 'l', 'v', 'r', 'y', 'n']
import numpy as np
# list comprehension (np.unique returns a sorted NumPy array, so the order differs from the loop above)
np.unique([i for i in string_hello.replace(" ","") if i not in vowels])
array(['h', 'l', 'n', 'r', 'v', 'y'], dtype='<U1')
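Note that np.unique sorts its result and returns a NumPy array rather than a list. If you want the same list, in the same order, as the traditional loop, one possible approach (a sketch, not the only option) is to remove duplicates with dict.fromkeys, which keeps first-seen order. The name consonants is introduced here only for illustration.

```python
string_hello = "hello everyone"
vowels = ["a", "e", "i", "o", "u"]

# Keep non-vowel letters in order; the condition also skips the space
consonants = [ch for ch in string_hello if ch not in vowels and ch != " "]

# dict.fromkeys drops duplicates while preserving first-seen order
non_vowel_list = list(dict.fromkeys(consonants))
print(non_vowel_list)  # ['h', 'l', 'v', 'r', 'y', 'n']
```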
Let's look at a more complicated case with an if-else statement.
# if a number in the list is a multiple of 3, replace it with 3; otherwise double it
num_string = [2,5,6,2,1,8,9,12]
# traditional loop
num_list = []
for i in num_string:
    if i % 3 == 0:
        num_list.append(3)
    else:
        num_list.append(i * 2)
num_list
[4, 10, 3, 4, 2, 16, 3, 3]
# list comprehension
[3 if (i % 3) == 0 else i*2 for i in num_string]
[4, 10, 3, 4, 2, 16, 3, 3]
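The two kinds of if sit in different places, which is easy to mix up. A filtering if goes after the for clause and drops elements, while an if-else expression goes before the for clause and keeps every element. A short sketch using the same list (the result names are made up for illustration):

```python
num_string = [2, 5, 6, 2, 1, 8, 9, 12]

# Filter: `if` after the for clause keeps only the multiples of 3
only_multiples = [i for i in num_string if i % 3 == 0]

# Conditional expression: `if ... else` before the for clause
# transforms every element, so the length is unchanged
replaced = [3 if i % 3 == 0 else i * 2 for i in num_string]

print(only_multiples)  # [6, 9, 12]
print(replaced)        # [4, 10, 3, 4, 2, 16, 3, 3]
```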
3. Nested List Comprehension
Now that we are a bit better at list comprehension, let's do something more advanced: list comprehension with a nested loop.
# return a list of the strings with length > 10
club_string = [["Manchester United", "Liverpool", "Chelsea"],
["Real Marid", "Barcelona", "Valencia"]]
club_list = []
# traditional loop
for league in club_string:
    for club in league:
        if len(club) > 10:
            club_list.append(club)
club_list
['Manchester United']
# list comprehension
[club
 for league in club_string
 for club in league
 if len(club) > 10]
['Manchester United']
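A detail worth spelling out: the for clauses in a nested list comprehension are written in the same order as the nested for statements, outermost first. The sketch below flattens a small nested list both ways; matrix is hypothetical data used only for illustration.

```python
matrix = [[1, 2, 3], [4, 5], [6]]  # hypothetical nested list

# Outer loop first, inner loop second, exactly as in nested for statements
flat = [value for row in matrix for value in row]
print(flat)  # [1, 2, 3, 4, 5, 6]

# Equivalent traditional loops for comparison
flat_loop = []
for row in matrix:
    for value in row:
        flat_loop.append(value)
```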
DO NOT OVERUSE LIST COMPREHENSION
One of the purposes of list comprehension is better readability. Cramming everything into a single list comprehension makes things even worse than a traditional loop. For example:
# from the sublists with fewer than 3 elements, take every number and
# multiply it by 2 if it is odd or by 3 if it is even
number_list = [[1,2,3,4],
[5,6],
[7,8,9]]
[number*2 if number % 2 == 1 else number*3 for list_ in number_list if len(list_) < 3
for number in list_]
[10, 18]
The first way to fix this is to break the expression across several lines:
[number * 2
 if number % 2 == 1
 else number * 3
 for list_ in number_list
 if len(list_) < 3
 for number in list_]
[10, 18]
Another way to make the code easier to read is to move the condition into a separate function:
def num_condition(num):
    # double odd numbers, triple even numbers
    if num % 2 == 1:
        return num * 2
    else:
        return num * 3
[num_condition(number) for list_ in number_list if len(list_) < 3 for number in list_]
[10, 18]
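A further option, offered here only as a sketch, is to split the work into two smaller comprehensions, each doing one job. The intermediate name short_lists is made up for illustration.

```python
number_list = [[1, 2, 3, 4],
               [5, 6],
               [7, 8, 9]]

# Step 1: keep only the sublists with fewer than 3 elements
short_lists = [list_ for list_ in number_list if len(list_) < 3]

# Step 2: flatten and transform (odd -> times 2, even -> times 3)
result = [n * 2 if n % 2 == 1 else n * 3
          for list_ in short_lists
          for n in list_]
print(result)  # [10, 18]
```

The trade-off is an extra intermediate list in exchange for lines that can each be read on their own.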
Dictionary Comprehension
Knowing how to write a list comprehension is like a "buy 1 get 1" deal, as you will also be able to write a dictionary comprehension.
Similar to list comprehension, dictionary comprehension is a concise and fast way to produce a new dictionary in which every key and value is based on a certain rule.
import pandas as pd
city_df = pd.DataFrame({"city": ["Toronto", "Waterloo", "Toronto",
                                 "Vancouver", "Waterloo"],
                        "house_price": [1000000, 500000, 700000, 800000, 600000]})
city_df
| | city | house_price |
---|---|---|
0 | Toronto | 1000000 |
1 | Waterloo | 500000 |
2 | Toronto | 700000 |
3 | Vancouver | 800000 |
4 | Waterloo | 600000 |
For example, we have the dataframe shown above, and we want a dictionary showing the average house price of each city. Writing a dictionary comprehension is very similar to writing a list comprehension. The only difference is that each output entry needs both a key and a value:
city = city_df.city.unique()
{i : (city_df.house_price[city_df.city == i]).mean() for i in city}
{'Toronto': 850000.0, 'Waterloo': 550000.0, 'Vancouver': 800000.0}
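The general shape is {key: value for item in iterable}. As a minimal sketch of the same calculation without pandas, assuming the data is held in two plain lists (cities and prices are names introduced here for illustration):

```python
from statistics import mean

# Same data as the dataframe above, as plain Python lists
cities = ["Toronto", "Waterloo", "Toronto", "Vancouver", "Waterloo"]
prices = [1000000, 500000, 700000, 800000, 600000]

# dict.fromkeys(cities) yields each city once, in order of first appearance
avg_price = {
    city: mean(p for c, p in zip(cities, prices) if c == city)
    for city in dict.fromkeys(cities)
}
print(avg_price)
```

Each city becomes a key, and the value is the mean of the prices paired with that city.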