(basics)=
# Python basics

In [2]:
!echo Last updated: `date +'%Y-%m-%d %H:%M:%S'`

Last updated: 2024-01-04 15:20:19


*****

## Introduction

Now that our working environment is set, and we know how to edit and execute Python code through Jupyter Notebook (see {ref}`setup`), we move on to the Python language itself. In this chapter, we introduce the basic concepts, operators, and data types in Python.

Throughout most of the chapter, we are going to cover *data types*, in terms of their properties and their behavior. These include the elementary "atomic" data types, namely: 

* `int` (see {ref}`numbers-int-float`),
* `float` (see {ref}`numbers-int-float`), 
* `bool` (see {ref}`boolean-values`), and
* `None` (see {ref}`none`),

as well as the more complex "collection" data types, namely: 

* `str` (see {ref}`strings`),
* `list` (see {ref}`lists`),
* `dict` (see {ref}`dict`),
* `tuple` (see {ref}`tuples`), and
* `set` (see {ref}`sets`).

We are going to place more emphasis, and cover more methods, when discussing those data structures which are most useful for our purposes later on in the book, such as `list` (see {ref}`lists`).

In addition to data types, we are going to introduce the basic concepts of the Python language, such as {ref}`variables`, {ref}`functions`, and {ref}`basics-mutability-and-copies`.

(variables)=
## Variables and assignment

The most basic concepts in Python, just like in any other programming language, are the concepts of *variables* and *assignment*. We assign values to variables so that we can keep intermediate results in computer memory, and keep processing them incrementally throughout our script. We will see that variables can hold values of any complexity level, ranging from simple numbers or strings to arrays, tables, vector layers, or rasters.

Assignment in Python is done using the assignment operator `=`: 

* To the left of the `=` operator we specify the variable *name* of our choice
* To the right of the `=` operator we specify the *value* to be assigned

For example, the following expression assigns the numeric value of `3` to a variable named `x`:

In [3]:
x = 3

Variable names can be composed of lowercase letters (`[a-z]`), uppercase letters (`[A-Z]`), digits (`[0-9]`), and the underscore (`_`), but they cannot start with a digit. Also note that variable names are case-sensitive, e.g., `g` and `G` are two different variables.
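As a small illustration of the naming rules above, here is a minimal sketch. The variable names `g`, `G`, and `band_2` are arbitrary choices made for this example only:

```python
g = 1            # a lowercase name
G = 2            # a different variable, since names are case-sensitive
band_2 = 'red'   # letters, an underscore, and a digit (not in first position)

print(g, G, band_2)  # 1 2 red
```

Assigning to `G` leaves `g` unchanged, which confirms that the two names refer to separate variables.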

We can now access the value assigned to `x` in any subsequent expression in our script:

In [4]:
x

3

Note that assigning another value to a pre-defined variable "replaces" its contents. We are going to elaborate on the meaning of this later on (see {ref}`basics-mutability-and-copies`):

In [5]:
x = 5
x

5

Trying to access an undefined variable is a common cause of errors in our code. For example, if we have not defined a variable named `z` anywhere in our script, then the expression `z` raises an error: 

In [6]:
# z  ## Raises error

```{note}
The above expression, like other expressions in this book that raise errors, is "commented out" (see {ref}`code-comments`), so that the notebook can be run uninterrupted. To see the error message, remove the `#` symbol at the beginning of the line (in your own copy of the notebook) and then run the cell.
```

(functions)=
## Functions

Functions are named pieces of code that perform a particular job. We will often be executing:

* Built-in functions
* Functions from the standard library
* Functions from third-party packages (see {ref}`loading-packages`)

Functions in Python are executed by specifying their name, followed by parentheses. Inside the parentheses, there can be zero or more arguments (i.e., function inputs), separated by commas, depending on the function. For example, the built-in function [`abs`](https://docs.python.org/3/library/functions.html#abs) accepts a number and returns its *absolute* value: 

In [7]:
abs(-7)

7
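To illustrate functions that accept more than one argument, here is a brief sketch using two other built-in functions, `max` and `round`. They are chosen here only as examples of comma-separated arguments, and the variable names are hypothetical:

```python
largest = max(3, 7, 5)        # three arguments; returns the largest one
rounded = round(3.14159, 2)   # two arguments: the number, and the number of digits to keep

print(largest)  # 7
print(rounded)  # 3.14
```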

Later on, we will learn how to define our own functions (see {ref}`defining-functions`).

(basic-data-types)=
## Data types

(basic-data-types-overview)=
### Data types—overview

Numeric values, such as `3` or `5` shown above (see {ref}`variables`), are just one of the built-in data types in Python. The most commonly used built-in data types are summarized in {numref}`data-types`.

```{table} Python data types
:name: data-types

| Data type | Meaning | Divisibility | Mutability | Example |
|:---:|:---:|:---:|:---:|:---:|
| `int` | Integer | Atomic | Immutable | `7` | 
| `float` | Float | Atomic | Immutable | `3.2` | 
| `bool` | Boolean | Atomic | Immutable | `True` |
| `None` | None | Atomic | Immutable | `None` |
| `str` | String | Collection | Immutable | `'Hello!'` | 
| `list` | List | Collection | Mutable | `[1,2,3]` |
| `tuple` | Tuple | Collection | Immutable | `(1,2)` |
| `dict` | Dictionary | Collection | Mutable | `{'a':2,'b':7}` |
| `set` | Set | Collection | Mutable | `{'a','b'}` |
```

These data types are the basic building blocks of Python code. Later on, we are going to learn about other, more complex, data structures, defined in third-party packages. For example, we will learn about: 

* `ndarray`---representing arrays (see {ref}`creating-arrays`)
* `DataFrame`---representing tables (see {ref}`pandas-creating-from-scratch`)
* `GeoDataFrame`---representing vector layers (see {ref}`layer-from-scratch`)

Note the distinction between "atomic" and "collection" data types: 

* "atomic" data types, which represent an indivisible value: `int`, `float`, `bool`, and `None`
* "collection" data types, which represent a collection of elements, where each element in the collection is an "internal" data structure, either atomic or a collection (except for `str`, whose elements are characters): `str`, `list`, `tuple`, `dict`, and `set`

Instances of the data types, namely the values, can be expressed as *literal* values or as *variables*. For example, in the expression:

In [8]:
x = 5

`5` is a literal value, while `x` is a variable.

(checking-with-type)=
### Checking with `type`

The [`type`](https://docs.python.org/3/library/functions.html#type) function can be used to identify the data type. Let us see how the various data type names listed in {numref}`data-types` appear in the console:

In [9]:
type(7)

int

In [10]:
type(3.2)

float

In [11]:
type(True)

bool

In [12]:
type(None)

NoneType

In [13]:
type('Hello!')

str

In [14]:
type([1, 2, 3])

list

In [15]:
type((1, 2))

tuple

In [16]:
type({'a': 2, 'b': 7})

dict

In [17]:
type({'a', 'b'})

set

In the following sections (see {ref}`numbers-int-float`--{ref}`sets`) we go over the most important properties and methods for each of these data types.

```{note}
To check if a given object belongs to the specified type *programmatically*, you can use the `isinstance` function. For example, `isinstance(1,int)` returns `True` (because `1` is an `int`), while `isinstance(1.1,int)` returns `False` (because `1.1` is a `float`).
```
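The `isinstance` checks mentioned in the note can be sketched as follows. The variable names are hypothetical, and the third check (on a string) is an extra example not taken from the note:

```python
check_int = isinstance(1, int)           # True, because 1 is an int
check_float = isinstance(1.1, int)       # False, because 1.1 is a float
check_str = isinstance('Hello!', str)    # True, because 'Hello!' is a str

print(check_int, check_float, check_str)  # True False True
```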

(numbers-int-float)=
## Numbers (`int`, `float`)

### Integers and floats

An [`int`](https://docs.python.org/3/library/stdtypes.html#numeric-types-int-float-complex) (integer) represents a numeric value *without* a decimal point, possibly negative (if it starts with `-`). For example, here are two `int` values, `3` and `-78`:

In [18]:
3

3

In [19]:
-78

-78

A [`float`](https://docs.python.org/3/library/stdtypes.html#numeric-types-int-float-complex) represents a numeric value, whether positive or negative, *with* a decimal point. Here are two `float` values, `-3.2` and `3.0`:

In [20]:
-3.2

-3.2

In [21]:
3.0

3.0

Note that the presence of a decimal point in a literal number automatically creates a `float`. Otherwise, we create an `int`.

`int` and `float` can be distinguished based on the way they are printed (with a decimal point, or without it). More systematically, they can be distinguished using the `type` function (see {ref}`checking-with-type`):

In [22]:
type(3)

int

In [23]:
type(3.0)

float

(arithmetic-operators)=
### Arithmetic operators

The most commonly used [arithmetic operators](https://docs.python.org/3/library/stdtypes.html#numeric-types-int-float-complex) in Python are given in {numref}`arithmetic-ops`.

```{table} Arithmetic operators in Python
:name: arithmetic-ops

| Operator | Meaning |
|:---:|:---:|
| `+` | Addition |
| `-` | Subtraction |
| `*` | Multiplication |
| `/` | Division |
| `**` | Exponent |
| `//` | Floor division |
| `%` | Modulus |
```

The arithmetic operators can be used with both `int` and `float` values. Here are a few examples:

In [24]:
1 + 5

6

In [25]:
7 - 3.5

3.5

In [26]:
5.2 * 5

26.0

In [27]:
1 / 2

0.5

Note that the exponent operator in Python is `**`:

In [28]:
10 ** 3

1000

```{note}
Confusingly, Python has a `^` operator for something completely different than exponent (such as defined in R, or in plain language), namely the [Bitwise XOR](https://en.wikipedia.org/wiki/Bitwise_operation#XOR) operator, which is beyond the scope of this book. For example, `10^3` returns `9`.
```

Python operators are associated with *precedence* rules, which agree with the [order of operations](https://en.wikipedia.org/wiki/Order_of_operations) in mathematics. For example, as expected, `*` has precedence over `+`, therefore:

In [29]:
1 + 2 * 3

7

Parentheses can be used to indicate precedence:

In [30]:
(1 + 2) * 3

9

In fact, to make our code clearer, the recommendation is to use parentheses even when they are not required:

In [31]:
1 + (2 * 3)

7

Arithmetic operations return `int` or `float`, as necessary. For example, addition of two `int` values always returns an `int`, because the result is guaranteed to be a whole number:

In [32]:
2 + 3

5

However, *division* of two `int` values always returns a `float`, because the result of a division is not guaranteed to be a whole number and thus cannot be always represented using an `int`:

In [33]:
2 / 2

1.0
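The rule about result types can be checked with a short sketch. The variable names `a`, `b`, and `c` are hypothetical, and the third case (mixing an `int` with a `float`) is an additional example that follows the same logic:

```python
a = 2 + 3      # int + int gives an int
b = 2 / 2      # int / int always gives a float
c = 2 + 3.0    # int + float gives a float

print(a, b, c)  # 5 1.0 5.0
```

The decimal point in the printed values `1.0` and `5.0` indicates that they are `float` values.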

Calculations can be *assigned* to variables (see {ref}`variables`) to keep the intermediate result in memory, in case our calculation requires several steps. Using the assignment operator and arithmetic operators, we already know how to write Python code comprising several expressions. For example:

In [34]:
x = 55
y = 30
z = x - y
z = z * 10
z

250

```{admonition} Exercise 02-a
:class: important
* How many seconds are there in a day? Write an arithmetic expression in Python to find out.
```

```{note}
Floor division (`//`) and modulus (`%`) are less useful for the purposes of this book, and only given in {numref}`arithmetic-ops` for completeness. As an exercise, search online for "python floor division" and "python modulus" to check out what they do, then try them out in the Python command line or notebook.
```
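For readers who would like to compare their findings, here is a minimal sketch of both operators, using hypothetical variable names. Floor division returns the whole number of times the right-hand value "fits" into the left-hand value, and modulus returns what is left over:

```python
quotient = 7 // 2    # 2 fits into 7 three whole times
remainder = 7 % 2    # and 1 is left over

print(quotient)                   # 3
print(remainder)                  # 1
print(quotient * 2 + remainder)   # 7, i.e., the original value
```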

### Increment assignment

Another commonly used Python operator is [*increment assignment*](https://docs.python.org/3/reference/simple_stmts.html#augmented-assignment-statements), `+=`. It is a shortcut to addition combined with assignment, i.e., `x+=y` is a shortcut to `x=x+y`. For example:

In [35]:
x = 10
x += 5
x

15

A common use of increment assignment is to advance a "counter" variable inside a `for` loop (see {ref}`for-loops`).

```{note}
Other than increment assignment (`+=`), Python also has decrement assignment (`-=`), multiply assignment (`*=`), and division assignment (`/=`) operators.
```
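The other assignment operators listed in the note work the same way as `+=`. Here is a short sketch, where each comment shows the value of `x` after the line is executed:

```python
x = 10
x -= 4    # same as x = x - 4, so x is 6
x *= 3    # same as x = x * 3, so x is 18
x /= 2    # same as x = x / 2, so x is 9.0 (division returns a float)

print(x)  # 9.0
```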

### `int` and `float` conversions

We can convert a number to `int` or `float`, using functions of the same name:

* `int`→`float`—a decimal point followed by zero, i.e., `.0` is added
* `float`→`int`—anything after the decimal point is discarded

For example:

In [36]:
float(1)

1.0

In [37]:
int(11.8)

11

We can get the nearest integer using [`round`](https://docs.python.org/3/library/functions.html#round):

In [38]:
round(11.8)

12
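The difference between discarding the decimal part (`int`) and taking the nearest integer (`round`) is also visible with negative numbers. In the following sketch (the variable names are hypothetical), `int` moves the value toward zero, while `round` goes to the closest whole number:

```python
truncated = int(-11.8)    # the part after the decimal point is discarded
nearest = round(-11.8)    # the nearest integer

print(truncated)  # -11
print(nearest)    # -12
```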

(boolean-values)=
## Boolean values (`bool`)

### What are Boolean values?

[Boolean values](https://docs.python.org/3/library/stdtypes.html#boolean-type-bool) represent one of two states, "true" or "false". Accordingly, the boolean data type in Python can have just one of two possible values, `True` and `False`. Boolean values can be created by literally typing `True` and `False`:

In [39]:
True

True

In [40]:
False

False

However, typically boolean values are created as a result of conditional expressions (see {ref}`conditions`).

### Negation

Boolean values can be reversed ("negated") using the `not` operator, followed by a boolean value (or an expression that creates a boolean value). The `not` operator is considered one of the [*logical operators*](https://docs.python.org/3/reference/expressions.html#not), along with `and` and `or`  ({numref}`logical-ops`) which will be introduced next (see {ref}`conditions`).

```{table} Logical operators in Python
:name: logical-ops

| Operator | Meaning |
|:---:|:---:|
| `and` | And |
| `or` | Or |
| `not` | Not |
```

For example:

In [41]:
not True

False

In [42]:
not False

True

In [43]:
not 1 == 1

False

Negation is useful when writing conditionals (see {ref}`conditionals`).

(conditions)=
### Conditions

Most often, boolean values arise as a result of a *condition*, such as:

In [44]:
3 > 2

True

Conditions involve [*conditional operators*](https://docs.python.org/3/reference/expressions.html#value-comparisons), such as `>` (greater than) in the above example. The conditional operators in Python are summarized in {numref}`conditional-ops`.

```{table} Conditional operators in Python
:name: conditional-ops

| Operator | Meaning |
|:---:|:---:|
| `==` | Equal |
| `!=` | Not equal |
| `<` | Less than |
| `<=` | Less than or equal |
| `>` | Greater than |
| `>=` | Greater than or equal |
```

Here are some more examples of conditional operators:

In [45]:
x = 11
x

11

In [46]:
x > 10

True

In [47]:
x <= 10

False

In [48]:
x != 11

False

```{note}
Keep in mind the distinction between the assignment operator `=` (see {ref}`variables`) and the equality conditional operator `==`!
```

Two or more conditional expressions can be combined into one expression, using the logical operators `and` or `or` ({numref}`logical-ops`): 

* When using `and`, the expression is `True` if *both* sides are `True`; otherwise the expression is `False`. 
* When using `or`, the expression is `True` if *at least one* side is `True`; otherwise the expression is `False`. 

For example:

In [49]:
1 == 1 and 2 == 3

False

In [50]:
1 == 1 or 2 == 3 

True

In [51]:
1 == 1 and not 2 == 3

True

```{note}
The above examples work fine without parentheses because conditional operators (such as `==`) have higher [precedence](https://docs.python.org/3/reference/expressions.html#operator-precedence) than logical operators (such as `and`). To make the code clearer and to avoid dealing with precedence rules, you may want to use parentheses nevertheless, as in `(1==1) and (2==3)`.
```

### Boolean to number

Boolean values can be converted to integers using `int`, in which case `False` becomes `0` and `True` becomes `1`:

In [52]:
int(False)

0

In [53]:
int(True)

1

In fact, the conversion takes place automatically when mixing `True` and `False` values with numbers, as part of arithmetic expressions or conditions. For example:

In [54]:
False + 9

9

In [55]:
True / 2

0.5

In [56]:
True == 1

True
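One practical consequence of this automatic conversion is that conditions can be added up to count how many of them are `True`. A minimal sketch, with a hypothetical variable name `count`:

```python
# Each condition evaluates to True (counted as 1) or False (counted as 0)
count = (3 > 2) + (5 > 10) + (1 == 1)

print(count)  # 2
```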

(none)=
## None (`None`)

Python has a special object called [`None`](https://docs.python.org/3/library/constants.html#None), used to represent the absence of a value:

In [57]:
None

The special value `None` has its own class, named `NoneType`:

In [58]:
type(None)

NoneType

A typical use of `None` is to mark missing data in a `list` (see {ref}`lists`). The `None` data type is not very relevant for our purposes, so we are not going to encounter it very often later on, but you should be aware that it exists.
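As a brief sketch of how the absence of a value can be detected, the following example uses the `is` operator, which is not covered in this chapter and is introduced here only for illustration. The variable name `elevation` is hypothetical:

```python
elevation = None            # no value is available yet
missing = elevation is None

print(missing)  # True
```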

(strings)=
## Strings (`str`)

### Creating strings

Strings ([`str`](https://docs.python.org/3/library/stdtypes.html#textseq)) are sequences of characters, including digits, letters, punctuation, whitespace, and special characters such as "newline".

Strings can be created using either single (`'`) or double (`"`) quotes. For example:

In [59]:
'Hello!'

'Hello!'

In [60]:
"Hello!"

'Hello!'

Strings created using single quotes are identical to those created with double quotes. What matters is just the contents inside the quotes. Note that strings are printed with single quotes, but this is just an inconsequential convention.

```{note}
One reason for having two types of quote characters is being able to create strings that contain *internal* quotes. For example, `'He said: "Hi!"'` is a string that contains internal double quotes (`"`), which is possible thanks to the fact that it is defined using single quotes (`'`).
```

Note that a string can be empty:

In [61]:
''

''

```{note}
A string (`str`) is considered a "collection" data type ({numref}`data-types`) because a string is actually a collection of characters, rather than an atomic value. For example, we can subset a string using slicing, as in `x='Hello';x[:2]` (see {ref}`list-subsets`), as if the string was a list of characters. Although for our purposes in this book we are not going to split strings to "parts", therefore practically treating them as atomic values, we still classified them as a "collection" data type in {numref}`data-types` for the sake of accuracy.
```

### String length

The `len` function can be used to count the number of characters in a string. For example:

In [62]:
len('Hello')

5

### Conversion to string

Other data types, such as `int`, `float`, and `bool`, can be converted to string using the `str` function:

In [63]:
str(12)

'12'

In [64]:
str(-5.7)

'-5.7'

In [65]:
str(False)

'False'

### String concatenation

Strings can be concatenated using the `+` operator. That way, the contents of variables can be combined with literal strings to create a new string. For example:

In [66]:
x = 'Hello'
y = 'World'
x + ' ' + y

'Hello World'

Note that when trying to concatenate strings with other data types, the latter are *not* automatically transformed to a string, resulting in an error:

In [67]:
# 'band_' + 1 + '.tif'  ## Raises error!

For the concatenation to work, we must transform all components to strings:

In [68]:
'band_' + str(1) + '.tif'  ## This works

'band_1.tif'

```{note}
Besides the `+` operator, Python has at least three other, more advanced methods to concatenate strings with (numeric) values. Here they are, ordered from oldest to newest:

* ["Old" style](https://docs.python.org/3/tutorial/inputoutput.html#old-string-formatting) string formatting, e.g., `'band_%s.tif' % 1`
* ["New" style](https://docs.python.org/3/tutorial/inputoutput.html#the-string-format-method) string formatting, e.g., `'band_{}.tif'.format(1)`
* ["f-strings"](https://docs.python.org/3/tutorial/inputoutput.html#tut-f-strings), e.g., `x=1; f'band_{x}.tif'`

The `+` operator is perfectly sufficient for the purposes of this book, so we will not elaborate on these methods. However, if you are going to get into text processing using Python, make sure to check them out!
```
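For comparison, here is a short sketch that builds the same file name with the `+` operator and with each of the three methods listed in the note. The variable names `a` through `d` are hypothetical:

```python
x = 1

a = 'band_' + str(x) + '.tif'    # concatenation with +
b = 'band_%s.tif' % x            # "old" style formatting
c = 'band_{}.tif'.format(x)      # "new" style formatting
d = f'band_{x}.tif'              # f-string

print(a)                 # band_1.tif
print(a == b == c == d)  # True
```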

### String to number

Strings can be converted to `int` or `float` using functions of the same name. However, for the conversion to be successful, the string must represent a valid number. Namely, the string must contain only `+` or `-` (or nothing) at the beginning, followed by digits. When converting to `float` (not `int`!), the number may also contain a decimal point. For example:

In [69]:
int('-99')

-99

In [70]:
float('-99.32')

-99.32

In [71]:
float('1')

1.0
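The following sketch shows what happens when the string does not fit the target type. Following the convention used in this chapter, the expressions that raise errors are commented out. The variable name `value` is hypothetical:

```python
# int('3.7')     ## Raises error: a decimal point is not allowed when converting to int
# float('abc')   ## Raises error: the string does not represent a number

# One possible workaround for the first case: convert to float first, then to int
value = int(float('3.7'))

print(value)  # 3
```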

```{note}
Working with strings is less relevant for our purposes in this book. Nevertheless, here are some useful string methods to be aware of:

* [`.strip`](https://docs.python.org/3/library/stdtypes.html#str.strip)---Remove spaces from start and end
* [`.lower`](https://docs.python.org/3/library/stdtypes.html#str.lower)---Convert to lowercase
* [`.upper`](https://docs.python.org/3/library/stdtypes.html#str.upper)---Convert to uppercase
* [`.title`](https://docs.python.org/3/library/stdtypes.html#str.title)---Convert to titlecase
* [`.startswith(pattern)`](https://docs.python.org/3/library/stdtypes.html#str.startswith)---Check if string starts with `pattern`
* [`.endswith(pattern)`](https://docs.python.org/3/library/stdtypes.html#str.endswith)---Check if string ends with `pattern`
* [`.find(pattern)`](https://docs.python.org/3/library/stdtypes.html#str.find)---Find the index of `pattern` within the string
* [`sep.join([str1,str2,...])`](https://docs.python.org/3/library/stdtypes.html#str.join)---Join strings `str1`, `str2`, etc., using the `sep` string as separator 
```
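Here is a minimal sketch of the string methods listed in the note, applied to a sample string. The sample string and all variable names are hypothetical choices for this illustration:

```python
s = '  hello World  '

stripped = s.strip()                    # 'hello World'
lower = stripped.lower()                # 'hello world'
upper = stripped.upper()                # 'HELLO WORLD'
title = stripped.title()                # 'Hello World'
starts = stripped.startswith('hello')   # True
ends = stripped.endswith('.tif')        # False
position = stripped.find('World')       # 6 (index of the first matching character)
joined = '_'.join(['band', '1'])        # 'band_1'

print(stripped, lower, upper, title, starts, ends, position, joined)
```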

(lists)=
## Lists (`list`)

### Creating lists

Lists ([`list`](https://docs.python.org/3/library/stdtypes.html#list))—as well as tuples (see {ref}`tuples`), dictionaries (see {ref}`dict`) and sets (see {ref}`sets`) which we cover next—are data structures that contain collections of items, known as elements. There is no homogeneity restriction, namely a `list` can contain any mixture of elements of any type. Each element may be any data type, including both the "atomic" data types (e.g., `int`, `float`, `bool`) and "collection" data types (for example, we can have a list of lists). A list is an *ordered* collection, meaning that the order of elements matters, and that we can access individual elements using numeric indices (see {ref}`list-elements` and {ref}`list-subsets`).

Lists can be created using square brackets `[`, with elements separated by commas. For example, here is how we can create an empty `list`:

In [72]:
x = []
x

[]

And here is how we can create a `list` with seven elements of type `str`:

In [73]:
days = ['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']
days

['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']

An important property of lists is their *length*, i.e., the number of elements. List length can be calculated using the `len` function:

In [74]:
len(days)

7

Again, keep in mind that there is no restriction on the *type* of `list` elements. We can mix different types in the same `list`, although such "heterogeneous" lists are less useful in practice:

In [75]:
[1, 'A', True, [55,56,57]]

[1, 'A', True, [55, 56, 57]]

(list-elements)=
### Accessing `list` elements

List elements can be extracted using an index inside square brackets `[`. Importantly, indexing in Python starts at *zero*. For example, `days[0]` returns the first element of `days`:

In [76]:
days[0]

'Sun'

`days[1]` returns the second element of `days`:

In [77]:
days[1]

'Mon'

`days[2]` returns the third element of `days`:

In [78]:
days[2]

'Tue'

and so on. We can think of the Python index as an offset; the first element has an offset of zero, the second element has an offset of one, and so on.

```{note}
Trying to access a list item beyond list length raises an error. Try executing an expression such as `days[10]` to see this behavior for yourself.
```
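Because indexing starts at zero, the index of the last element is always the list length minus one. A short sketch, which re-creates `days` so that it can be run on its own (the variable name `last` is hypothetical):

```python
days = ['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']

last = days[len(days) - 1]   # index 6 is the last valid index in a list of length 7
# days[len(days)]            ## Raises error: index 7 is beyond the list length

print(last)  # Sat
```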

(assignment-to-list)=
### Assignment to `list`

We can modify a `list` element by assigning a new value into it. For example:

In [79]:
days[2] = 'ABC'
days

['Sun', 'Mon', 'ABC', 'Wed', 'Thu', 'Fri', 'Sat']

Let us do another assignment to get back the original `days` list:

In [80]:
days[2] = 'Tue'
days

['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']

Note that updating an element (or any other subset) in an existing data structure is considered an "in place" modification, and as such applicable only to *mutable* data types such as a `list`. We elaborate on the meaning and implications of "in place" operations later on (see {ref}`basics-mutability-and-copies`).

(list-subsets)=
### `list` slicing

We can get a subset of a list, containing just some of the elements, using a notation known as *slicing*. Slicing uses an index of the form `start:stop:step`. The meaning of the three components is as follows:

* `start`—where to start, default is `0`
* `stop`—where to stop, default is at the end of the `list`
* `step`—step size, default is `1`

The resulting subset goes from `start` (inclusive) to `stop` (exclusive), and progresses in steps of size `step`:

* When `start` is omitted (e.g., `:stop`), the subset starts from the beginning
* When `stop` is omitted (e.g., `start:`), the subset goes all the way to the last element
* When `step` is omitted (e.g., `start:stop`), the default step of size `1` is used

```{note}
The rationale behind `stop` being exclusive is that a combination (see {ref}`list-operators`) of complementary slices returns the complete list, e.g., `days[:3]+days[3:]` is equal to `days`.
```

For example, `days[0:3]` means *start* at the element with index `0`, *end* before the element with index `3` (i.e., end at index `2` inclusive), using step size `1`. Therefore we get the first three elements—`0`, `1`, and `2`—from `days`:

In [81]:
days[0:3]

['Sun', 'Mon', 'Tue']

The default value of `start` is `0`. Therefore it can be omitted, to get the same result using `days[:3]`:

In [82]:
days[:3]

['Sun', 'Mon', 'Tue']

However, if we need to get one or more elements from the middle of the `list`, we need to use both `start` and `stop`, as follows:

In [83]:
days[1:3]

['Mon', 'Tue']

When `stop` is omitted, the subset includes all elements from `start` till the end of the `list`:

In [84]:
days[1:]

['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']

Note that `list` subsets using the slice notation, even if they are comprised of just one element, are `list` objects of length one:

In [85]:
days[:1]

['Sun']

while individual elements (see {ref}`list-elements`) are returned as standalone objects:

In [86]:
days[0]

'Sun'
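The `step` component of a slice can be illustrated with a short sketch. The example re-creates `days` so that it can be run on its own, and the variable names `every_other`, `middle`, and `complete` are hypothetical:

```python
days = ['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']

every_other = days[::2]              # start and stop omitted, step of 2
middle = days[1:6:2]                 # indices 1, 3, and 5
complete = days[:3] + days[3:]       # complementary slices give back the whole list

print(every_other)       # ['Sun', 'Tue', 'Thu', 'Sat']
print(middle)            # ['Mon', 'Wed', 'Fri']
print(complete == days)  # True
```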

(list-operators)=
### `list` operators

The `+` and `*` arithmetic operators are defined for lists, too, but their meaning is different than with numbers (see {ref}`arithmetic-operators`):

* `+` *appends* two or more lists together
* `*` *replicates* a list

For example, here we use `+` to append the first two weekdays with another `list` of length 3, to get a list of length 5:

In [87]:
days[:2] + [1,2,3]

['Sun', 'Mon', 1, 2, 3]

The `*` operator replicates a `list`. The right-hand value needs to be an `int`, specifying the number of repetitions. For example, here we replicate the first two days of the week three times:

In [88]:
days[:2] * 3

['Sun', 'Mon', 'Sun', 'Mon', 'Sun', 'Mon']

### What are methods?

Let's introduce the concept of *methods*, which we will shortly demonstrate with list methods (see {ref}`list-methods`). A method is similar to a function (see {ref}`functions`), but it is part of a data type (and, more generally, of a *class*) definition, unlike a function which is a standalone object. A method is invoked using an object name, followed by a dot (`.`), then the method name.

For example, as we have already seen, a function named `do_something`, with an argument named `x`, is invoked with:

```python
do_something(x)
```

A method named `do_something`, however, would be invoked with:

```python
y.do_something(x)
```

where `y` is an object of a class that has a `do_something` method. 
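To make the distinction concrete, here is a small sketch that contrasts a function we have already seen, `len`, with a string method, `.upper`. The variable names are hypothetical:

```python
s = 'Hello'

n = len(s)       # function: the object is passed as an argument
u = s.upper()    # method: invoked on the object, using the dot notation

print(n)  # 5
print(u)  # HELLO
```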

(list-methods)=
### `list` methods

Some of the most useful methods for modifying lists are `.append`, `.pop`, `.reverse`, and `.sort`. Let us see what they do through examples.

The [`.append`](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists) method *appends* the specified new element to a list, thus increasing its length by one. The new element is appended at the end of the list. For example, here is how we append the string `'New day'` at the end of `days`:

In [89]:
days.append('New day')
days

['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'New day']

Note that list methods such as `.append`, `.pop`, `.reverse`, and `.sort` (see below) modify the list *itself*, also known as "in place" (see {ref}`basics-mutability-and-copies`). The `.append`, `.reverse`, and `.sort` methods return `None` (while `.pop` returns the removed element), so there is no need to assign the result back to the original variable. For example, doing something like `days=days.append('New day')` is incorrect, as this will just assign `None` to `days` and we will lose the information in the list.
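The pitfall described above can be sketched with a small throwaway list, so that `days` itself is left untouched. The names `letters` and `result` are hypothetical and used only for this illustration:

```python
letters = ['a', 'b']
result = letters.append('c')   # the list is modified in place

print(result)   # None
print(letters)  # ['a', 'b', 'c']
```

Had we written `letters = letters.append('c')`, the variable `letters` would now hold `None` instead of the list.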

The [`.pop`](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists) method does the opposite of `.append`. Namely, `.pop` *removes* the last element of the given list. For example, here is how we can remove the last element (`'New day'`) in `days`, thus returning to the original list:

In [90]:
days.pop()
days

['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']
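As a side note, and unlike `.append`, the `.pop` method also returns the element that it removes, which can be kept in a variable. A minimal sketch, using a hypothetical throwaway list named `letters`:

```python
letters = ['a', 'b', 'c']
removed = letters.pop()   # removes the last element and returns it

print(removed)  # c
print(letters)  # ['a', 'b']
```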

The [`.reverse`](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists) method, as can be expected, reverses the list:

In [91]:
days.reverse()
days

['Sat', 'Fri', 'Thu', 'Wed', 'Tue', 'Mon', 'Sun']

Another useful method is [`.sort`](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists), which sorts the given list. In case the list contents are strings, then they are sorted in alphabetical order:

In [92]:
days.sort()
days

['Fri', 'Mon', 'Sat', 'Sun', 'Thu', 'Tue', 'Wed']

Before moving on, let's re-create the original version of `days`:

In [93]:
days = ['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']
days

['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']

(the-in-operator)=
### The `in` operator

The [`in`](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists) operator is used to check the *presence* of a given value in a list. For example, suppose that we want to check whether a particular string, such as `'Sunday'`, appears in `days`. The straightforward approach is to compare the given value to each element, combining the results with `or` (see {ref}`conditions`):

In [94]:
'Sunday' == days[0] or \
'Sunday' == days[1] or \
'Sunday' == days[2] or \
'Sunday' == days[3] or \
'Sunday' == days[4] or \
'Sunday' == days[5] or \
'Sunday' == days[6]

False

```{note}
In the above example, we use the `\` character to denote that the expression continues on the next line. Otherwise Python detects that the expression ends with an `or` (and nothing after it), which raises an error. 
```

The above expression is straightforward but rather verbose, and specific to the list *length*. Instead, we can use the `in` operator, which is both shorter and more general:

In [95]:
'Sunday' in days

False

Here is another example, where the result is `True` since the value `'Sun'` does occur in `days`:

In [96]:
'Sun' in days

True

```{admonition} Exercise 02-b
:class: important
* What will be the result of `'sun' in days`, and why? Run the expression to check your answer.
```

(tuples)=
## Tuples (`tuple`)

### Creating tuples

Tuples ([`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)) are ordered collections of values of any type, just like lists (see {ref}`lists`). The difference between lists and tuples is that tuples are *immutable*, while lists are mutable. In other words, the `tuple` data type may be considered the immutable version of `list`. We elaborate on the concept of mutability later on (see {ref}`basics-mutability-and-copies`). In short, mutable data types can be modified after creation, e.g., using assignment to subsets (see {ref}`assignment-to-list`), or "in place" methods (see {ref}`list-methods`), while immutable data types cannot be modified.

Tuples can be created using ordinary parentheses `(`, with elements separated by commas:

In [11]:
t = ('one', 'two', 'three')
t

('one', 'two', 'three')

Here is a little inconsistency that is important to be aware of. In case we want to create a `tuple` that contains just one element, we still must include a comma after it:

In [4]:
u = ('one',)
u

('one',)

Otherwise, the parentheses are ignored and the result is the element itself, rather than a `tuple` containing it:

In [24]:
u = ('one')
u

'one'

```{note}
In fact, parentheses are not required to create a `tuple`; commas are sufficient. For example, `1,` or `2,4` are tuples too. (Execute these expressions to see for yourself!)
```
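The two expressions mentioned in the note can be checked with the following sketch, where the variable names `a` and `b` are hypothetical:

```python
a = 1,       # a trailing comma is enough to create a tuple of length one
b = 2, 4     # two values separated by a comma

print(a)  # (1,)
print(b)  # (2, 4)
```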

### Tuple operations

Tuples can be indexed (see {ref}`list-elements`), sliced (see {ref}`list-subsets`), duplicated or combined (see {ref}`list-operators`), or evaluated using `in` (see {ref}`the-in-operator`), just like lists. For example:

In [8]:
t[0]

'one'

In [16]:
'two' in t

True
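Slicing, combining, and duplicating work the same way as with lists, except that each operation returns a new `tuple` and leaves the original unchanged. A short sketch, which re-creates `t` so that it can be run on its own (the names `first_two`, `combined`, and `duplicated` are illustrative):

```python
t = ('one', 'two', 'three')

first_two = t[0:2]        # slicing returns a new tuple
combined = t + ('four',)  # combining two tuples with +
duplicated = t * 2        # duplicating with *

print(first_two)   # ('one', 'two')
print(combined)    # ('one', 'two', 'three', 'four')
print(duplicated)  # ('one', 'two', 'three', 'one', 'two', 'three')
print(t)           # ('one', 'two', 'three'), the original is unchanged
```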

Since tuples are immutable, however, we cannot replace an element of an existing `tuple` via assignment:

In [97]:
# t[0] = 'ten'  ## Raises error!

For the same reason, we also cannot modify a `tuple` "in place" using list methods such as `.append`, `.pop`, `.reverse`, or `.sort` (see {ref}`list-methods`).
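To see the errors without stopping the script, we can wrap the offending expressions in `try`/`except`. Note that `try`/`except` is not introduced in this chapter. It is used here only as a convenience, and the list `errors` is a helper name made up for this sketch:

```python
t = ('one', 'two', 'three')
errors = []  # names of the errors we encounter

# Assignment to an element is not supported by tuples
try:
    t[0] = 'ten'
except TypeError as e:
    errors.append('TypeError')
    print('TypeError:', e)

# Tuples have no "in place" methods such as .append
try:
    t.append('four')
except AttributeError as e:
    errors.append('AttributeError')
    print('AttributeError:', e)

print(t)  # ('one', 'two', 'three'), still unchanged
```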

### Conversion to and from `list`

Tuples can be converted to and from lists, using the `list` and `tuple` functions, respectively. For example:

In [20]:
list(t)

['one', 'two', 'three']

In [22]:
tuple(list(t))

('one', 'two', 'three')
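One practical use of these conversions, sketched below, is to obtain a modified version of a `tuple`: convert to a `list`, modify the list, and convert back. The names `tmp` and `t2` are illustrative. The original `tuple` is not changed; we end up with a new one:

```python
t = ('one', 'two', 'three')

tmp = list(t)        # mutable copy of the tuple contents
tmp.append('four')   # lists can be modified "in place"
t2 = tuple(tmp)      # new tuple with the modified contents

print(t)   # ('one', 'two', 'three')
print(t2)  # ('one', 'two', 'three', 'four')
```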

(dict)=
## Dictionaries (`dict`)

### Creating a `dict`

A dictionary ([`dict`](https://docs.python.org/3/library/stdtypes.html#dict)) is a collection of `key:value` pairs, where the keys and values can be of any Python data type, as long as the `dict` keys are immutable. Typically, the keys are strings. Another important property of the keys is that they must be *unique*, because they are used to access the `dict` values. 

A dictionary can be created using curly brackets, encompassing `key:value` pairs, separated by commas. For example, the following expression creates a dictionary named `person`, containing four `key:value` pairs:

In [12]:
person = {'firstname':'John', 'lastname':'Smith', 'age':50, 'eyecolor':'blue'}
person

{'firstname': 'John', 'lastname': 'Smith', 'age': 50, 'eyecolor': 'blue'}

Note that three of the *values* are strings (`str`) and one is an integer (`int`), while all *keys* are strings.
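The two requirements on keys, immutability and uniqueness, can be illustrated with a small sketch. The dictionaries `d` and `dup` are made-up examples, and `try`/`except` (not covered in this chapter) is used only to let the code continue past the error:

```python
# Keys can be strings, numbers, or tuples (all immutable)
d = {'name': 'John', 7: 'seven', (0, 1): 'a coordinate pair'}
print(d[(0, 1)])  # a coordinate pair

# A list is mutable, so it cannot be used as a key
list_key_failed = False
try:
    bad = {[0, 1]: 'x'}
except TypeError:
    list_key_failed = True
print(list_key_failed)  # True

# Keys are unique: if a key is repeated, the last value is kept
dup = {'a': 1, 'a': 2}
print(dup)  # {'a': 2}
```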

(accessing-dict-values)=
### Accessing `dict` values

Unlike a `list` (see {ref}`lists`) or a `tuple` (see {ref}`tuples`), where elements are accessible through numeric indices, dictionary entries are not identified by their position and therefore cannot be accessed using a numeric index. Instead, dictionary values are only accessible through the keys. In that sense, a Python `dict` is analogous to a real-life dictionary, since both associate, or *translate*, one set of values (keys) with another (values). For example, an English-French dictionary associates English words (keys) with a French translation (values).

Dictionary values are accessed using square brackets (`[`) in an expression such as `d[key]`, where `d` is a `dict` object and `key` is the key. Again, keep in mind that dictionary keys are typically strings, but in general they can be any other immutable data type. For example, here is how we can access each of the four values in `person`:

In [13]:
person['firstname']

'John'

In [103]:
person['lastname']

'Smith'

In [104]:
person['age']

50

In [105]:
person['eyecolor']

'blue'

### Assignment to `dict`

Dictionary values can be modified by assignment, using the respective key, similarly to the way that `list` values can be modified by assignment using a numeric index (see {ref}`assignment-to-list`):

In [14]:
person['firstname'] = 'James'
person

{'firstname': 'James', 'lastname': 'Smith', 'age': 50, 'eyecolor': 'blue'}

We can also create new `key:value` pairs, by assignment to a non-existing key:

In [107]:
person['owns_car'] = True
person

{'firstname': 'James',
 'lastname': 'Smith',
 'age': 50,
 'eyecolor': 'blue',
 'owns_car': True}

### Detecting `dict` keys

The `in` operator, which we used to check if a `list` contains a given element (see {ref}`the-in-operator`), applies to `dict` keys and can be used to check if a dictionary contains a given *key*:

In [108]:
'firstname' in person

True

In [18]:
'address' in person

False
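Checking for a key matters because accessing a non-existing key raises a `KeyError`. The sketch below re-creates a shorter `person` so that it runs on its own. The variable `missing_key_error` is a made-up helper, and `try`/`except` is used only to let the code continue past the error. It also shows that `in` examines the keys, not the values:

```python
person = {'firstname': 'James', 'lastname': 'Smith', 'age': 50}

# Accessing a key that does not exist raises a KeyError
missing_key_error = False
try:
    person['address']
except KeyError:
    missing_key_error = True
print(missing_key_error)  # True

# `in` checks the keys, not the values
print('James' in person)           # False
print('James' in person.values())  # True
```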

```{admonition} Exercise 02-c
:class: important
* Python data types can be combined into more complex data structures. For example, we can create a list of tuples, a dictionary of lists, and so on. 
* Create a dictionary with two keys, `'a'` and `'b'`, and two values which are lists, `[1,2]` and `[3,4]`, respectively. 
* Which expression can be used to access the value `4`?
```

(sets)=
## Sets (`set`)

A [`set`](https://docs.python.org/3/library/stdtypes.html#set) is a collection of values which are guaranteed to be *unique*. A `set` is used to indicate whether a particular value is part of a group or not, without any additional information about that value. In other words, a `set` can be thought of as a `dict` with just the keys (without the values). 

A `set` can be created from scratch, using curly brackets `{`, with elements separated by commas:

In [110]:
x = {'John', 'James', 'Bob'}
x

{'Bob', 'James', 'John'}

In [112]:
type(x)

set

A `set` can also be created from a dictionary, using the `set` function. In that case, the values are discarded:

In [113]:
set(person)

{'age', 'eyecolor', 'firstname', 'lastname', 'owns_car'}

Finally, a set can be created from a `list`, in which case duplicated values are discarded:

In [3]:
set([1, 7, 9, 7])

{1, 7, 9}
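As a short sketch of typical usage, we can remove duplicates from a list and then test for membership with the `in` operator. The names `names` and `unique_names` are illustrative:

```python
names = ['John', 'James', 'Bob', 'John']
unique_names = set(names)  # the duplicated 'John' is kept only once

print(len(names))         # 4
print(len(unique_names))  # 3

# Membership tests work as with lists
print('Bob' in unique_names)    # True
print('Alice' in unique_names)  # False

# Sets are unordered; sorted() returns the elements as a sorted list
print(sorted(unique_names))  # ['Bob', 'James', 'John']
```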

We are not going to use sets later on in this book. However, it is important to be aware that this basic data structure exists, in case you encounter it when working with Python.

(basics-mutability-and-copies)=
## Mutability and copies

### Overview

As mentioned above, data types in Python can be divided into two groups based on the ability to modify them after creation ({numref}`data-types`):

* *Immutable* types, which cannot be changed after creation
* *Mutable* types, which can be changed "in place" after creation

In this section, we elaborate on "in place" modification of mutable variables, and demonstrate the implications we need to be aware of when using it. Before that, we need to know a little more, at least conceptually, about how variables and data structures are stored in computer memory.

It is helpful to think of a data structure, whether mutable or immutable, as a specific location in computer memory that contains particular information. For example, suppose that we define a variable named `a` with a value, such as `2` or `[1,2]`. Now, the label `a` refers to a memory location which stores that particular value. We can schematically illustrate this as follows, where `a` is a label we place on a "box", a memory location holding the information, marked as ×:

```text
a → ⊠
```

When re-assigning a new value into an existing variable, we can think of the label "switching" to point at a new memory location, with new information +.

```text
a ↘ ⊠
    ⊞
```

Additionally, and only with *mutable* values, we may modify the value "in place". For example, in this chapter we learned about five ways to modify `list` values "in place":

* assignment to replace a list element, as in `a[0]=500` (see {ref}`assignment-to-list`), 
* `.append`, 
* `.pop`, 
* `.reverse`, and 
* `.sort` (see {ref}`list-methods`).

When modifying a value "in place", the label still points to the same memory location. It is just that the information in that memory location has changed, e.g., from × to +: 

```text
a → ⊞
```
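One way to check this, sketched below, is the built-in `id` function, which returns an identifier of the object a label points to (in the standard CPython interpreter, this is its memory address). `id` is not used elsewhere in this chapter, and the names `box_before`, `same_box`, and `new_box` are made up for illustration:

```python
a = [3, 1, 2]
box_before = id(a)  # identifier of the "box" that `a` points to

# Five "in place" modifications
a[0] = 500   # [500, 1, 2]
a.append(7)  # [500, 1, 2, 7]
a.pop()      # [500, 1, 2]
a.reverse()  # [2, 1, 500]
a.sort()     # [1, 2, 500]

print(a)                    # [1, 2, 500]
same_box = id(a) == box_before
print(same_box)             # True: still the same "box"

# Re-assignment makes the label point at a different "box"
a = [5]
new_box = id(a) != box_before
print(new_box)              # True
```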

Why does this matter? When writing code, what is the practical difference between modifying a memory location "in place", and switching to a new memory location when re-assigning a new value? The answer is, it matters in situations where we have more than one copy of the same variable.

It is important to understand that, when creating a copy of a variable, as in `b=a`, we are creating a copy of the "label", which points to the same memory location as `a`. As a result, we now have two labels pointing at the same memory location:

```text
a → ⊠ 
b ↗  
```

What happens if we modify one of the variables `a` or `b` "in place"? The answer is that the change is going to be reflected in the other variable too!

```text
a → ⊞ 
b ↗  
```

We can create "real" independent copies, using the `.copy` method, as in `b=a.copy()` instead of `b=a`. In this case, `a` and `b` point at different memory locations, so that any modification of one does not affect the other:

```text
a → ⊠ 
b → ⊠ 
```

The next two sections demonstrate the ideas described here in practice.

### Immutable values

As discussed above, creating a variable named `a` with the immutable value `2` means that we now have a "label" `a`, which "points" at a fixed value of `2`:

In [10]:
a = 2
a

2

Assigning `a` into another variable `b` makes both labels `a` and `b` point to the same immutable value `2`:

In [11]:
b = a
b

2

Making a copy of the "label" does not create another copy of the data, just another pointer to the *same* data. Programmatically, the fact that `a` and `b` are pointers to the same memory location can be detected using the `is` operator:

In [12]:
a is b

True

Variables pointing at immutable values, such as `int`, are basically labels to values that cannot be changed. There are no methods to modify an `int` "in place". We can only re-assign a new value. For example, assigning a new value to `a`, such as `55`, makes the respective label `a` point to another memory location with a different immutable value:

In [14]:
a = 55
a

55

The second label `b` is unaffected, still pointing to the same immutable value `2`:

In [None]:
b

2

Consequently, `a` and `b` are no longer labels for the same memory location:

In [15]:
a is b

False

We can illustrate the old and new situations as follows:

```text
(1) a → 2    (2) a → 55 
    b ↗          b → 2  
```
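The same holds for any operation on immutable values: the result is always a new value, which we can only keep by re-assignment. Here is a brief sketch with an `int` and a `str`. The names `s` and `s2` are illustrative, and `.upper` is used as an example of a string method that returns a new string:

```python
a = 2
b = a
a += 1       # computes a new int and re-points `a` to it
print(a)     # 3
print(b)     # 2, unaffected

s = 'abc'
s2 = s
s = s.upper()  # .upper returns a new string; 'abc' itself is not modified
print(s)       # ABC
print(s2)      # abc, unaffected
```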

The above may seem obvious, but hang on. *Mutable* data types are the ones associated with tricky behavior when modifying them "in place", as shown next.

### Mutable values

Variables pointing at mutable data structures, such as a `list`, behave exactly the same way as immutable ones when assigning a new value, as in `a=[5]`. However, mutable values behave differently (and unexpectedly, in case you are unaware of this behavior) when modifying them *"in place"*. 

For example, suppose we have a list `a`. In other words, we have a label `a` which points to a (mutable) data structure:

In [16]:
a = [1, 2]
a

[1, 2]

Next, we create a copy of `a`, named `b`. The label `b` now points at the same memory location, with the same information, as `a`:

In [17]:
b = a
b

[1, 2]

Now, let us modify `a` through assignment to a subset (see {ref}`assignment-to-list`). Importantly, this operation modifies `a` "in place". We do not make the label `a` point at a new "box", elsewhere in computer memory (as in `a=[5]`). Instead, we make a modification inside the existing "box":

In [121]:
a[0] = 500
a

[500, 2]

Perhaps surprisingly, the modification of `a` is also reflected in `b`:

In [122]:
b

[500, 2]

We can illustrate the old and new situations as follows:

```text
(1) a → [1, 2]    (2) a → [500, 2] 
    b ↗               b ↗  
```

What has happened? Recall that `a` and `b` are labels pointing to the same memory location:

In [123]:
a is b

True

An "in place" modification of `a` (such as assignment to a subset, as shown here, or `.append`, `.pop`, `.reverse`, and `.sort`) modifies the information where it resides, rather than switching the label `a` to a new memory location. Therefore, the change is reflected in `b`, and in any other label pointing to the same memory location. Only operations that "point" `a` to a new, different data structure, such as `a=[5]`, are not reflected in `b`.
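The following sketch contrasts the two cases, using `.append` and `.sort` as the "in place" modifications, followed by a re-assignment. The values are arbitrary:

```python
a = [3, 1, 2]
b = a          # second label for the same list

# "In place" modifications through `a` are visible through `b`
a.append(0)
a.sort()
print(b)       # [0, 1, 2, 3]

# Re-assignment points `a` at a new list; `b` keeps the old one
a = [5]
print(a)       # [5]
print(b)       # [0, 1, 2, 3]
print(a is b)  # False
```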

If we want to create an explicit *copy* of the data, that is, to have the same information in a new independent memory location, we need to use the `.copy` method when creating the copy:

In [7]:
a = [1, 2]
b = a.copy()
b

[1, 2]

Using the `is` operator, we can demonstrate that `a` and `b` are pointers to distinct memory locations:

In [8]:
a is b

False

Now, modifying `a` does not affect `b`, since `a` and `b` are independent copies:

In [3]:
a[0] = 500
a

[500, 2]

In [4]:
b

[1, 2]

Here is an illustration of the old and new situations when `b` is created using `b=a.copy()`:

```text
(1) a → [1, 2]    (2) a → [500, 2] 
    b → [1, 2]        b → [1, 2]
```
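The same behavior applies to other mutable types, such as `dict`, which also has a `.copy` method. In the sketch below, the names `alias` and `independent` are chosen for illustration:

```python
person = {'firstname': 'John', 'age': 50}

alias = person               # second label for the same dictionary
independent = person.copy()  # separate dictionary with the same contents

person['age'] = 51           # "in place" modification

print(alias['age'])          # 51, same "box" as `person`
print(independent['age'])    # 50, unaffected
print(alias is person)       # True
print(independent is person) # False
```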

(exercise-basics)=
## More exercises

```{admonition} Exercise 02-d
:class: important
* Create one object of each type from the ones we learned about in this chapter, except for `None` and `set`:
    * `int` (see {ref}`numbers-int-float`)
    * `float` (see {ref}`numbers-int-float`)
    * `bool` (see {ref}`boolean-values`)
    * `str` (see {ref}`strings`)
    * `list` (see {ref}`lists`)
    * `tuple` (see {ref}`tuples`)
    * `dict` (see {ref}`dict`)
* Use the `type` function to make sure each object belongs to the specified class.
* Write an expression that does something with each of the data structures, such as subsetting, an arithmetic operation, etc.
```

```{admonition} Exercise 02-e
:class: important
* Create a dictionary named `person`, as shown below.
* Write expressions that return:
    * John's eye color
    * John's last name
    * A string combining John's first and last name, separated by a space
    * A boolean value indicating whether one of John's hobbies is `'Drawing'`

```py
person = {
    'name': {'first': 'John', 'last': 'Smith'}, 
    'age': 50, 
    'eyecolor': 'blue', 
    'hobbies': ['Fishing', 'Golf', 'Python programming']
}
```