Leetcode: Largest Divisible Subset

Published in

The Startup

7 min readJun 14, 2020

Background

The Largest Divisible Subset problem is a great problem which embodies many critical concepts to more advanced interviewing problems such as Dynamic Programming and Backtracking. The high-level reason why Dynamic Programming is useful in the solution is that we are asked to find the maximum of a given computation. Many times, when a problem asks to optimize, this is a perfect candidate for Dynamic Programming. The high-level reason why Backtracking is useful in this problem is that we will be optimizing the subset length, while the problem asks us to return the resulting subset whose length is this optimal value. Most times in programming, it is very useful to refrain from doing everything all at once, rather computing parts of the solution in sequence. The backtracking approach is a great way to decouple different steps into manageable chunks.

Problem Statement

The Largest Divisible Subset problem requires that you find the largest subset which upholds a given invariant for each pair in the subset.

I will not be giving the invariant any special mention, as the specific invariant is not actually important.

What is important is to understand is the other elements to the problem:

Subset | Reducing the Search Space

Often times with LeetCode-style questions, the complete search space of the problem is the complete set of all subsets of a given collection. Because the building a subset can either include or exclude a given item, for a collection of N items, the complete number of subsets will be 2^N possible subsets.

This is the first indicator that the complete search is not a viable alternative.

This is a huge indicator that some sort of memoization will need to be done, thus making a tradeoff of increased memory usage to reduce the search space. Usually, this reduced search space will come from storing a maximum answer which is correct locally. After all the local maximums are found, the global maximum can be found from these local results.

Usually, local results are valid for a given i from 0 → i. Therefore, the computation of i in the memoization will take O(n) time, and this will occur from 0-->n. Therefore, the search space is reduced to a total time complexity of O(n^2). This is an exponentially (no pun intended) better algorithm complexity, and one that is reasonable. (Of course there may be better alternatives, but O(n²) is a valid complexity runtime).

Backtracking | Compute Optimal size, Return Corresponding Array

While it is hard to elegantly put this concept into words (I need to further internalize this), it is not required to do everything all at once.

For the problem listed here, we will see that the solution will be divided into a 2-phase approach:

For each position in the array, compute the maximal subset size which maintains a given invariant. Store this in an auxiliary table (dp table).
Run through the dp table in reverse to build up the corresponding maximal subset.

As can be seen, if I were to try to choose a maximal subset while also storing the items that make up that subset, then I will be wasting massive amounts of memory as I hold onto lists of potential subsets which are not optimal.

The Backtracking approach allows me to store the minimum amount of data that is needed to arrive at a globally correct solution.

Conceptual Analysis

We need to find the largest subset of an array where each of its elements are divisible by one another. More formally, for a pair (a,b), they are divisible by one another if a%b==0 or if b%a==0.

For example, the input array of [1,2,4,6] has [1,2,4] as the largest subset of the input where each of its members are mutually divisible with one another.

The Largest Divisible Subset problem requires that you find the largest subset which upholds a given invariant for each pair in the subset.

Therefore, the general approach to the problem is as follows:

Build DP Table: For each element i in the input array nums, compute the maximum subset size whose elements can include nums[0 → i].
Backtrack: Use this computed DP table to generate the desired maximum subset.

Build DP Table

The first step of the problem is to find a recurrence relation which embodies the desired choices. The recurrence relation is as follows:

Recurrence Relation — Subset Problems

While this recurrence relation is quite involved, we will break this down into its many elements.

When it is desired to find an optimal subset from a list of elements where there is a choice to include or exclude an element, it is necessary that for a given subset whose possible elements include those from index 0->i-1, the condition to include such an element is twofold:

Does the inclusion of an element maintain the isDivisible property? If not, then this is definitely not an option.
If the isDivisible property is maintained with the inclusion of an element, does including this element generate the maximum subset size? For example, the element 6 indeed is divisible by the elements [1,2], however, if I include the element 6, then I will not be able to later add the elements [4,8] to the set. This is because while 6 is divisible by 1 and 2, resulting in a subset size of 3, there is another choice of creating a subset with [1,2,4,8], which will result in a subset size of 4. Therefore, we cannot utilize a greedy approach to including elements to the maximum subset.

Now that the condition has been established, it is now necessary to understand the computation space required for this problem.

A given element at index i can be included in the subset of any given element preceding it. For this reason, The recurrence relation has a max constraint, which implies that the calculation of one element in the dp table is computed in linear time. Therefore, the entire search space to build up the dp table is O(n^2). This shows that the search space was reduced from exponential in size to just polynomial time.

Backtrack

The dp table from the previous step now contains the maximum subset size for every element i which contains the element i.

As an example, for an input array of [1,2,4,6,8], the corresponding dp table would be [1,2,3,3,4]. As can be seen, both the inclusion of the 4 and the 6 will result in a max subset size of length 3. However, only the inclusion of the 4 would allow for a greater subset size when we consider the 8.

Therefore, the backtracking solution needs to scan the dp array for the largest maximum subset length. From there, a set of criterion needs to work backwards to linearly build up the correct subset of {1,2,4,8}.

This can be achieved through the following logic:

Find the maximum value in the dp table. Add the corresponding value from the input array.
Decrement this maximum value by 1, and then iterate backwards until this value is found. Once this decremented value is found in the dp array, add the corresponding value from the input array only if this value satisfies the isDivisible invariant. If this does not hold, continue to traverse the dp array.

This second step is very important. In the example shown above, there are two elements in the dp array with the value of 3. However, the first 3 encountered corresponds with the value 6 in the input array. As can be seen, 6 and 8 fail to be divisible into each other. Therefore, we know that 6 cannot be in the maximum subset being built.

A question can be asked in this approach: How can we know that this backtracking method will indeed return the correct answer by only looking into the dp table. Well, the path from the trivial set to the globally maximal set is traced out by a strictly increasing path (increment by 1 when I add an element) in the dp table. Therefore, by performing the two steps outlined above, it is guaranteed that the approach will get the optimal solution.