東大数学2024問題6 (part 5): エラトステネスの篩のコーディング - 複天一流：どんな手を使ってでも問題解決を図るブログ

前回のあらすじ
A code for the Sieve of Eratosthenes
Test run

前回のあらすじ

ちょっと前のブログ記事の最後に英語を使って「付録」を書いてみたら、海外からのアクセスが来るようになって意外に感じている。東大の入試問題が世界からも注目されているのは、ちょっとした驚きであった。ということで、今回は英語で書く分量を増やしてみて、どんな反応が返ってくるかみてみようと思う。

Summary of the previous article: numerical investigations were made on the largest integer that a given computer language can handle, such as the C language, FORTRAN and Python. The largest integer in FORTRAN (based on the 128-bit expression) was already impressive, but the one handled by Python was even more impressive: it has NO limitation in principle. The largest number that Python can handle can go as far as the hardware of a PC/mac configuration can allow it. Amazing.
Using pointers, one can investigate what type of representation is employed for the integer expression in computers. However, it is widely believed that the concept of the pointer is intrinsic to the C language, so that it seems at first glance that the other languages are not suitable for this approach.
In the last article, I nevertheless managed to find ways to enable this investigation with the pointers even in Fortan95 and Python. A special note is that Python employs representations of integers with 256-bit (32byte) format by default, which is already gigantic. Therefore, storing even a single integer requires 32 bytes, four times greater than an ordinary integer number in C. In this sense, Python is a modern "kid" living in the world where plenty of resources are available for users with reasonable prices.
My original plan was to write a code in Python, which is based on the algorythm called the Sieve of Eratosthnes, in order to solve the Maths problem in the entrance examination 2024 of University of Tokyo. After those playful technical attempts and detours, I finally come to a point that I can show you the code.

A code for the Sieve of Eratosthenes

I try to be as pedantic as possible to the original procedure of the Sieve of Eratosthenes.

Namely, a table containing integer numbers, say, up to 10000. Then, 0 and 1 are slashed in the table because they are not prime number by definition.
Next, the first prime number "2" is chosen and its multipliers (including 2 itself) are all marked in the table with a slash. In this process, all even integers except 2 are excluded from the list of prime numbers.
The smallest integer surviving the previous processes is the second prime number, which turns out to be "3". The multipliers of 3 are again slashed in the table.
The then surviving smallest integer (that is, 5) is the third prime number and its multipliers are .... these processes are repeated until the smallest integer surviving the previous processes exceeds the largest integer originally set in the table.
In the end, we have a list of prime numbers included in the table.

Our attempt is to write a code in our own way based on the original algorithm, but this means that there can be errors in the code. To verify what we obtained with the code are correct, we crosscheck the database provided by the PrimePages, the online database created mainly by mathematicians in the University of Tennessee. The PrimePages provides a table containing the first 1000 small primes, so that we use the data for the verification.

I ran the code several times and learned that it is necessary to set the maximum integer in the initial table needs to exceed Nmax=110000 if we demand the first 10000 smallest prime numbers. So, we begin with writing the code as,

Nmax  = 110000
Npmax = 10000

We define a python list named prime_table as the table of integers up to Nmax.

prime_table = [True for i in range(Nmax)]
prime_table[0:2] = [0,0,1]

The list is for a logic variable and the index of prime_table corresponds to the integers in the table. That is, if prime_table[n]=True, then the integer n is one of the prime numbers. If the list returns a false value, then the number does not belong to the prime. The first two elements of the list are set "False" (0) in the initial setting, because n=0,1 are not prime numbers. On the other hand, prime_table[2]=1 (True) is set because n=2 is a prime number.

In the begining, MaxPime=2 is set and its multipliers MaxPrime * n are to be "slashed", that is, set be "False" in the list prime_table. Due to a commutability of a product for two integers, the multipliers can be larger than and equal to MaxPrime. Pmax records the total number of the prime numbers found so far.

Pmax = 1
MaxPrime = 2
while MaxPrime < Nmax:
    n = MaxPrime
    while n <= Nmax / MaxPrime:
        prime_table[MaxPrime * n] = 0
        n += 1
    while True:
        MaxPrime += 1
        if MaxPrime > Nmax-1:
            print("End of Search at tmp MaxPrime =", MaxPrime)
            break
        if prime_table[MaxPrime] == True:
            Pmax += 1
            break

In the end of the programme, a search for the next prime number, that is, MaxPrime, is attempted by referring to the data_table having the True value.

This is the contents of the Sieve of Eratosthenes.

To show the prime numbers we found with this algorithm, the additional section is placed right after the above code.

for i in range(Nmax):
   if data_table[i] == True:
      print(i)

Test run

As a test, we set a pair of small numbers for Nmax and Npmax and run the code.

Nmax = 120
Npmax = 30

The result is as the following.

$python3 eratos.py 

       2       3       5       7      11      13      17      19      23      29
      31      37      41      43      47      53      59      61      67      71
      73      79      83      89      97     101     103     107     109     113

The first 30 prime numbers are detected for Nmax less than and equal to 120. These prime numbers are verified with the database downloaded from the PrimePages! So, it seems the code works properly. But we need to repeat this check for larger numbers, too. That would be the topic of the next article.