Most technical interviews with companies will ask
you to whiteboard code some type of recursive function
in your favorite programming language. Although Python
seems to be the dominate king in data science, recursion
can be a powerful tool in R.
What is recursion?
Recursive functions call themselves. That is, they
break down the problem into the smallest possible
components and the function() calls itself within the
original function() on each of the smaller components.
Afterward, the results are put together to solve the
original problem. Let’s take a look at more concrete examples.
Quicksort
A technical interview will usually ask you to implement
some type of sorting function that can be solved using
a recursive algorithm. Let’s try implementing quicksort in R:
Programming Quicksort in R
1234567891011121314151617181920212223
#!/usr/bin/env Rscript# Author: Jason A. FrenchquickSort <-function(vect){# Args:# vect: Numeric Vector# Stop if vector has length of 1if(length(vect)<=1){return(vect)}# Pick an element from the vector element <- vect[1] partition <- vect[-1]# Reorder vector so that integers less than element# come before, and all integers greater come after. v1 <- partition[partition < element] v2 <- partition[partition >= element]# Recursively apply steps to smaller vectors. v1 <- quickSort(v1) v2 <- quickSort(v2)return(c(v1, element, v2))}
A second sorting algorithm that we can implement using recursion is the Merge Sort.
Sorting algorithms are important because they differ in their speed, depending on the
nature of the input data. In the case of merge sort, the worst possible outcome for
the time it takes to sort the data is n * log(n), where n is the length of your list of
numbers.
Like Quicksort, we’re splitting the vector into smaller sub-vectors by halving it until
each vector has just one integer. We then merge the vectors back together, this time in order. Let’s tackle this algorithm in R as well.
I found that Rosetta Code already provides a good example, so below is the their public domain code with my
comments added to the code for explanation.
#!/usr/bin/env Rscript# http://rosettacode.org/wiki/Sorting_algorithms/Merge_sort#Rmergesort <-function(m){ merge_ <-function(left, right)# Recursive function to compare and append values in order{# Create a list to hold the results result <- c()# This is our stop condition. While left and right contain # a value, compare themwhile(length(left)>0&& length(right)>0){# If left is less than or equal to right,# add it to the result listif(left[1]<= right[1]){ result <- c(result, left[1])# Remove the value from the list left <- left[-1]}else{# When right is less than or equal to left,# add it to the result. result <- c(result, right[1])# Remove the appended integer from the list. right <- right[-1]}}# Keep appending the values to the result while left and right# exist.if(length(left)>0) result <- c(result, left)if(length(right)>0) result <- c(result, right) result
}# Below is our stop condition for the mergesort function.# When the length of the vector is 1, just return the integer. len <- length(m)if(len <=1) m else{# Otherwise keep dividing the vector into two halves. middle <- length(m)/2# Add every integer from 1 to the middle to the left left <- m[1:floor(middle)] right <- m[floor(middle+1):len]# Recursively call mergesort() on the left and right halves. left <- mergesort(left) right <- mergesort(right)# Order and combine the results.if(left[length(left)]<= right[1]){ c(left, right)}else{ merge_(left, right)}}}