Searching for RH Counterexamples in Golang - Search Strategies

Intro

Now that we have created a list of superabundant numbers to search across, we can use this approach instead of searching across every number. The simplest way would be to just throw away the existing naive approach and use this one instead, but what if we want to keep both options open?

This idea of enabling different strategies that can be chosen at runtime is called the strategy pattern. In our case, this allows the user to choose either the naive, or the superabundant strategy, or even a third strategy that we might dream up later!

Adding Search DB

We have currently assumed a single database to store the Divisor values, the largest of which implicitly doubles up as the “state” where we stopped. A cleaner abstraction would be to create a separate database where we can store the state at which we stopped the search, which we can continue next time. Each record in this database will correspond to a SearchMetadata interface which store the current state of the search. The interface and the structures for this DB can be defined as follows:

package riemann

import "time"

type SearchState interface {
	Serialize() string
	Value() int64
	GetNextBatch(int64) []SearchState
}

type SearchMetadata struct {
	startTime     time.Time
	endTime       time.Time
	stateType     string
	startingState SearchState
	endingState   SearchState
}

type SearchStateDB interface {
	Initialize()
	LatestSearchState(string) SearchState
	InsertSearchMetadata(SearchMetadata)
	Close()
}

In the above implementation, we have a SearchStateDB interface representing the DB, SearchMetadata interface representing the a record in the DB, and a SearchState interface representing a single state of search.

Abstracting out the Search State

While trying to add this abstraction, we come across a few abstraction leaks, where the function implementation bleeds into the function signature.

func ComputerRiemannDivisorSums(startingN, endingN int64) []RiemannDivisorSum {
	output := []RiemannDivisorSum{}
	for i := startingN; i <= endingN; i++ {
		ds, err := DivisorSum(i)
		if err != nil {
			panic("Divisor Sum cannot be found")
		}
		wv := WitnessValue(i, ds)
		output = append(output, RiemannDivisorSum{N: i, WitnessValue: wv, DivisorSum: ds})
	}
	return output
}

Firstly, we don’t have any function that calculates the Divisor Sum for a single state, but we have a starting and ending number. Secondly, our RiemannDivisorSum accepts a starting and ending integer as an argument, which needs to be changed. This implicitly assumes that each SearchState is going to be a number, which is not necessarily the case. Finally, the search strategy is currently just a simple linear search which we are going to chance soon.

To go with the strategy pattern, we need to generalize the API of our function call which computes the RiemannDivisorSum to accept a generic SearchState object and returns a single RiemannDivisorSum object.

func ComputeRiemannDivisorSum(ss SearchState) RiemannDivisorSum {
	i := ss.Value()
	ds, err := DivisorSum(i)
	if err != nil {
		panic("Divisor Sum cannot be found")
	}
	wv := WitnessValue(i, ds)
	return RiemannDivisorSum{N: i, WitnessValue: wv, DivisorSum: ds}
}

Extracting the Search Strategy

The search strategy can be extracted into a method for the ExhaustiveSearchState type as well.

package riemann

import "fmt"

type ExhaustiveSearchState struct {
	n int64
}

func NewExhaustiveSearchState(n int64) *ExhaustiveSearchState {
	ess := ExhaustiveSearchState{}
	ess.n = n
	return &ess
}

func (ess *ExhaustiveSearchState) Serialize() string {
	return fmt.Sprint(ess.n)
}

func (ess *ExhaustiveSearchState) Value() int64 {
	return ess.n
}

func (ess *ExhaustiveSearchState) GetNextBatch(batchSize int64) []SearchState {
	output := []SearchState{}
	startingVal := ess.Value()
	for i := int64(1); i <= batchSize; i++ {
		output = append(output, SearchState(&ExhaustiveSearchState{startingVal + i}))
	}
	return output
}

All the above changes are combined into a single commit which supports the In-Memory DB here, which includes updating the tests as well to handle the changes. In this commit, we add the SQLite DB implementation as well.

Keeping It Simple and Precalculating Primes

In the next step for creating the superabundant strategy, we need to create the superabundant numbers themselves. As we have identified before, this requires us to list all the first N primes, and then raise it to the partitions identified in the previous post.

To calculate the first N primes, we have a few options:

  1. Find the first N prime for each pass
  2. Precalulcate the first N prime for each program run
  3. Just create a simple list of the first N primes, which is long enough for us to not run out for our purpose!

The first option is definitely inefficient, as we have to calculate the same thing many, many times. I almost implemented the simple, second option before realizing that even this was unnecessary, since we would certainly never run out of primes if we just had the first few hundreds of them pre-populated! In this commit, we create a simple function to precalculate the primes by reading them from a list!

Getting Search Working (Almost)

With that, we have all the parts ready to get search working. In this commit, we add in some of the final code to get search working with the In-Memory database and in this commit, we extend this to support the SQLite DB as well.

Integer Overflow strikes again!

At this point, I was expecting the search to start working for larger and larger values (especially since we had already accounted for BigInt in our code). However, I soon realized that the numbers were not going above some limit. In addition, some of the numbers in the DB were… negative?

21564263898777600|131077474504704000|1.675769
113982537750681600|699079864025088000|1.670911
934656809555589120|5682842765623296000|1.633229
-5049996470079440896|3149971916677185536|1.51724

Turns out the BIGINT type name in SQLITE is just a fancy name for integer and hence we were having integer overflows as soon as the number went above the max 64bit integer value (2^63-1=9223372036854775807).

To work around this, we have two options:

  1. Make SQLite support BigInt
  2. Use a different DB which supports BigInt
  3. Hack the way through by storing the Integer as Text!

I couldn’t find an easy way to make #1 happen in my searches. #2 is certainly doable and is probably a more robust approach to this, adopted by the original post, but the effort involved seemed intimidating. After some deliberation, I decided to go with #3 because it seemed like the easiest approach to Get Things Done. The idea is to convert the integer into a string and store in the DB as text instead. We need to pad the string with as many zeroes as possible to make sure each record has the same length to allow for sorting (to finding the largest number, etc).

In this commit, I do the necessary steps required to implement this, and also standardize some other parts of the codebase (eg:- the search state) which didn’t have the BigInt representation yet.

Optimizations

After this commit, I could see that the number was increasing above the BigInt limit (yay!), but still the rate of increase was very slow. So I did some profiling of the code and fixed a couple of issues:

  • I was accidentally using the non-optimized version of partition function, and I changed it to the Memoized version instead.
  • We were currently pushing all the witnessValues into the DB whether or not they were good or not. I constrained the push to only work for the values above a particular threshold
if candidateResult.WitnessValue > minWitnessValue {
    candidateResults = append(candidateResults, candidateResult)
}

The max size of the integers soon flew past 100 digits and I also had to increase the text size from 100 to 200. All of these changes + some tests were added in this commit.

Now that we have the setup ready, I exposed the configuration to the user as command line variables using the flag package in this commit.

Now we can run the function as follows:

go run main.go -batchSize=2000000

I let the function for 90 minutes and go the following result!

Inserted Metadata in 8.9245ms
Next batch computed in 1m24.98617825s
Computed Sums from 76, 4762673 to 76, 6762672 in 2m0.869268709s
Inserted Metadata in 2.803834ms
^CGot Interrupt. Shutting down...

# Highest Number Analyzed

Number: 301636780994431728010760007158715714553939298600699502749358249775324019594914827660889834777106826124965058502681389429104000, DivisorSum: 2993832424950391227772754571910692260529552449882483136246881023400525704941780554920433140759062609625475973120000000000000000, WitnessValue 1.751674

# Largest Witness Value

Number: 4506098451919302822384982325231044694457514388204548545746925991621844089120853123536321685586363021627833280000,
DivisorSum: 44126661189014029531028195360423443724904929974906708146575753230310185338019315536906138383548416000000000000000, WitnessValue 1.764622
DB Closed!
DB Closed!

Conclusion

With all the changes that we did above, we were able to search all the superabundant numbers containing upto 76 factors, with the largest number analyzed having 135 digits! We found witness values much higher than the original local optimum at 10080 but we have a long way to go to find a Riemann counterexample!

Before I wrap up the code, I did a couple more refactors (#1, #2) to get the repo into a state where I am more happy with the package structure. The final state of the repository is here.

In the original series, Jeremy does quite a few things more, by dockerizing, deploying it in AWS, and doing some amazing analysis on the results, but I’ve decided to end my journey here as it is since the original intent of the series was to learn Golang and the rest of the path deviates away from that goal. Feel free to raise an issue in Github, if you have find anything that I’ve missed or got wrong. Thanks for reading and I hope you have enjoyed the series as much as I did!

Cover Image: (Fork in the Road - Beth Macdonald)