Overview

Part 1 - Sorting

DAT038
Date: November 24, 2022
Last modified: July 10, 2025
4 min read

My other course for this study period is a course in Data Structures & Algorithms.

Sorting & Searching

We began this course with sorting and why it is an important topic for a computer scientist. Sorting comes up in every field of computer science, from simple low-level array sorting to ordering millions of data points gathered by an AI.

With that motivation we started with simple sorting and searching algorithms, the first being binary search. Binary search is a simple yet powerful algorithm for finding a value in a sorted array.

Pseudo Code for a Binary Search:

func binary_search(array, search_value):
    low = first index of array
    high = last index of array
    # If the low and high pointers ever cross, the value is not in the array
    while low <= high:
        mid = index halfway between low and high; (low + high) / 2
        if search_value is less than the value at mid:
            continue the search in the left half
        else if search_value is greater than the value at mid:
            continue the search in the right half
        else:
            return mid
    return "not in array"

And an actual Python implementation:

def binary_search(arr, val):
    low = 0
    high = len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if val < arr[mid]:
            high = mid - 1
        elif val > arr[mid]:
            low = mid + 1
        else:
            return mid
    return -1
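As a side note, Python's standard library ships binary search in the `bisect` module. `bisect_left` only returns the insertion point, so to get the same "found or -1" behaviour as above we still have to check the value (the wrapper name here is just mine for illustration):

```python
import bisect

def binary_search_bisect(arr, val):
    # bisect_left returns the leftmost index where val could be
    # inserted while keeping arr sorted
    i = bisect.bisect_left(arr, val)
    if i < len(arr) and arr[i] == val:
        return i
    return -1
```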

The next step was Selection and Insertion sort, two simple, naive sorting algorithms.

Selection sort works like this: find the smallest element in the list and swap it into the first index, then repeat for the next-smallest element in the remainder, and so on.

An implementation:

def selection_sort(arr):
    size = len(arr)
    for index in range(size):
        # Find the smallest element in the unsorted part
        min_index = index
        for j in range(index + 1, size):
            if arr[j] < arr[min_index]:
                min_index = j
        # Swap it into place
        arr[index], arr[min_index] = arr[min_index], arr[index]

After that it was time for insertion sort, which builds on the idea of inserting a value into a sorted array. We start by saying that the first element makes up the 'sorted' part. Then we look at each element of the 'unsorted' part and see where it should be inserted into the sorted part. A simple, yet powerful algorithm - and we will see how important the idea of inserting will be later on.

An implementation:

def insertion_sort(arr):
    # The first element is considered sorted;
    # everything after it is the unsorted part
    for i in range(1, len(arr)):
        val = arr[i]
        j = i - 1
        # Shift larger sorted elements one step to the right
        while j >= 0 and val < arr[j]:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = val

Although these algorithms are simple, they unfortunately become really slow when the array grows large. For now we can summarize why like this: the number of operations needed to compare and sort grows quadratically with the array size.
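The quadratic growth is easy to see by counting comparisons. Below is a counting variant of selection sort (a wrapper I'm adding for illustration, not part of the course material); it always performs n(n-1)/2 comparisons, so doubling the input roughly quadruples the work:

```python
def selection_sort_comparisons(arr):
    # Same selection sort as above, but returns the number of
    # element comparisons performed
    arr = arr[:]
    count = 0
    for index in range(len(arr)):
        min_index = index
        for j in range(index + 1, len(arr)):
            count += 1
            if arr[j] < arr[min_index]:
                min_index = j
        arr[index], arr[min_index] = arr[min_index], arr[index]
    return count
```

For 100 elements this gives 4950 comparisons; for 200 elements, 19900 - about four times as many.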

Imagine we have two sorted lists and we want to merge them - what's the algorithm for this? Quite simple, actually: we keep one pointer into each array, both starting at the first index, compare the two values they point at, and insert the smaller one into the new (bigger) array. Then we advance that pointer, and repeat until we reach the end. Reaching the end here means we either hit the end of both arrays simultaneously, or hit the end of one array - in that case there is nothing left to compare against, so we can simply append the rest of the other array.

An implementation of a merge function:

def merge(arr1, arr2):
    # We can cheat a little using Python lists - but the idea is the same
    output = []
    p1 = 0
    p2 = 0
    # Compare the fronts of both arrays and take the smaller value
    while p1 < len(arr1) and p2 < len(arr2):
        if arr1[p1] <= arr2[p2]:
            output.append(arr1[p1])
            p1 += 1
        else:
            output.append(arr2[p2])
            p2 += 1
    # One array is exhausted - append whatever remains of the other
    while p1 < len(arr1):
        output.append(arr1[p1])
        p1 += 1
    while p2 < len(arr2):
        output.append(arr2[p2])
        p2 += 1
    return output
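Incidentally, Python's standard library has the same two-pointer idea, generalized to any number of sorted inputs, in `heapq.merge` - handy as a sanity check:

```python
import heapq

# heapq.merge lazily merges already-sorted iterables, using the
# same compare-the-fronts idea as the merge function above
merged = list(heapq.merge([1, 4, 9], [2, 3, 10]))
```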

Now that we’re familiar with this algorithm, we can use it to divide and conquer an unsorted array, with a few modifications. We divide the unsorted array into a left and a right part and recursively merge_sort() them as well.

In practice, it means that we get a sorted left array and a sorted right array, then we finally merge them for the final result.

An implementation of merge sort:

def merge_sort(arr: list) -> list:
    if len(arr) < 2:
        return arr[:]
    mid = len(arr) // 2
    L = merge_sort(arr[:mid])
    R = merge_sort(arr[mid:])
    return merge(L, R)
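As a small variant of the same recursion (a sketch, not the course implementation), the merge step can be delegated to `heapq.merge` from the standard library:

```python
import heapq

def merge_sort_hq(arr):
    # Same divide-and-conquer recursion; heapq.merge does the merging
    if len(arr) < 2:
        return arr[:]
    mid = len(arr) // 2
    return list(heapq.merge(merge_sort_hq(arr[:mid]),
                            merge_sort_hq(arr[mid:])))
```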

Merge sort is great, but it always allocates new arrays for the halves and for the merged result - even if the list is almost sorted, or already sorted, it does the full amount of copying and uses extra memory. This is because merge sort is not a so-called in-place algorithm. An algorithm which uses the same divide-and-conquer strategy but is in-place is Quicksort!

Quicksort builds on the idea that you partition your array into one part with elements lower than the pivot element and one part with elements higher.

In practice this means you sort it by recursively partitioning the sub-arrays.

So partitioning sounds quite easy, right? Choose a pivot element and just put larger values to its right and lower values to its left. But choosing the pivot is the tricky part: with a good pivot our algorithm becomes much faster. A common (and good) strategy is choosing either a random pivot or the median of three.
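The median-of-three idea can be sketched like this (a helper name I'm making up for illustration; it is not used in the implementation that follows):

```python
def median_of_three(arr, low, high):
    # Pick the median of the first, middle and last elements as the
    # pivot index; this avoids the worst case on already-sorted input
    mid = (low + high) // 2
    candidates = [(arr[low], low), (arr[mid], mid), (arr[high], high)]
    candidates.sort()
    return candidates[1][1]  # index of the median value
```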

In this example I will choose the pivot element as the last element of the list. An implementation of Quicksort:

def partition(arr, low, high):
    def swap(i, j):
        arr[i], arr[j] = arr[j], arr[i]

    pivot = arr[high]
    i = low - 1
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            swap(i, j)
    swap(i + 1, high)
    return i + 1

def quick_sort(arr, low, high):
    if low < high:
        p = partition(arr, low, high)
        quick_sort(arr, low, p - 1)
        quick_sort(arr, p + 1, high)
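The random-pivot strategy mentioned above can be bolted onto the same scheme by swapping a random element into the last position before partitioning. A self-contained sketch (the function name and defaults are mine, not the course implementation):

```python
import random

def quick_sort_random(arr, low=0, high=None):
    # Same last-element partition scheme, but a random element is
    # swapped into the pivot position first
    if high is None:
        high = len(arr) - 1
    if low < high:
        r = random.randint(low, high)
        arr[r], arr[high] = arr[high], arr[r]
        pivot = arr[high]
        i = low - 1
        for j in range(low, high):
            if arr[j] <= pivot:
                i += 1
                arr[i], arr[j] = arr[j], arr[i]
        arr[i + 1], arr[high] = arr[high], arr[i + 1]
        p = i + 1
        quick_sort_random(arr, low, p - 1)
        quick_sort_random(arr, p + 1, high)

data = [5, 2, 8, 1, 9, 2]
quick_sort_random(data)  # sorts the list in place
```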

This concludes this part. Next time I'll write (briefly) about complexity - and after that, the major part of this course: actual data structures, how we use them, and how we implement them.