Problem 1
Find the sum of all the multiples of 3 or 5 below 1000.
This is a gentle first problem, but it is designed to illustrate what project Euler is really about.
The Naive
Finding the solution is simple: check if n is divisible by either 3 or 5 and and all the numbers which are. We only need to check natural numbers up to 1000, but with our computers, 3000 operation is, really, nothing. Answer is computed instantaneously. To see an example of brute-force calculations, click here.
Little Gauss
Through Project Euler, I've developed an instinct for identifying bad solutions, and the naive approach is one that stinks in disgrace. The weakest aspect of this naive approach is that as we increase the upper limit to 10000, 100000, or even 100000000, the program would take longer to find the correct solution.
For example, using the time object in python, I recorded the time it takes the program to compute the sum of all multiples of 3 and 5 for following n-values. Observe that the every tenfold increase in n lead to approximately tenfold increase in the program run-time.
n | Time (seconds) |
1000 | 0.0000 (negligible) |
10000 | 0.0080 |
10000 | 0.0860 |
100000 | 0.9260 |
1000000 | 6.67300 |
But program run-time is subjected to multitude of factors, most explicitly the computing power of the hardware - if I use an intel i-5 core processor in lieu of i-3, the increase in time may not have seemed so pronounced. What remains invariant, however, is the number of computation a processor must have undertaken. Therefore, we should assess the efficiency of the program not in terms of time but of number of steps the program demands for a particular input value. (One interested in more matured and complete introduction to efficiency assessment of a program can watch this lecture on MIT OCW or visit this wiki)
Here I outline a very elementary assessment of our naive solution:
1. Think of the worst-case input value for the program. In the naive solution, there isn't a particularly bad input, given that it is an integer.
2. Predict the behavior of the program as the input size increases. In the naive solution, for every increase in the input size (e.g. n + 1), we add at least additional steps: check if it's divisible by 3, by 5, and if true, add to the current sum.
3. We can see that the number of steps of the naive solution increases linearly as the input size increase (slope of three). We denote the complexity of the program, in Big O Notation, as O(n).
Linear complexity is not too bad, but for a simple program as our own, I think embarrassing to have to wait for 6 seconds to get a response for a small input of 1,000,000.
For the first discussion of our Little Gauss solution, it is sensible to borrow from the insights of Little Gauss. Gauss stated that the sum of digits from 1 to n is given by the following formula.
$ Sum = \frac{n(n+1)}{2} $
So, if we want sum of a multiples of 3 below 1000, we use the following formula:
$ Sum = \frac{n(\frac{n}{f}+1)}{2} * f = \frac{1000(\frac{1000}{3}+1)}{2} * 3 = 166833 $
Similarly, we can calculate the sum of multiples of 5 below 5. Important to note however, is that the least common multiple of the two factors, 15, and its multiples would be counted twice if we were to simply add our result to 3 and 5.
My implementation of the algorithm can be found here.
How efficient is this program compared to the naive? I reproduce the table above with the time values replaced with times from Little Gauss program:
n | Time (seconds) |
1000 | 0.0000 (negligible) |
10000 | 0.0000 |
10000 | 0.0000 |
100000 | 0.0000 |
1000000 | 0.0000 |
The number of steps required for this program is independent of the input size, and is therefore constant, O(1). This kind of efficiency (although constant complexity is, I think, nearly impossible to achieve in most cases) is what we will be looking for in all of our Project Euler problems.