Island Area Survey

Robert Munafo, 2023 Jun 26.

This article summarises the method used to compile a thorough list of the largest islands (see that article for the actual list and links to many of the specific islands). The project consisted mainly of a rigorous pixel counting estimate of the minibrots, once they have been located via a survey also performed via pixel counting; thus the name "area survey".

Overall Result

Found 16585 islands (where a mirror-image pair counts as only one) at "level 12" (defined below) and definitively determined the rankings of the 4848 largest of those. Various symmetry-like coincidences were noticed (see below).

Prospector and Surveyors

The design uses a prospector (main program) and many surveyors (each a spawned thread)

At "level 12", the prospector checked every pixel on a 524288×262144 grid of pixels that covered the top half of a square centered at -0.76 and width 2.56).

Small groups of pixels surrounded by non-member points are suspected possible islands; a preliminary center and size is estimated, and this is re-scanned at a resolution up to 1024×1024 and about 8 times higher dwell limit. If it is determined to actually be an island, its coordinates are added to a work list. Work items are sent to surveyors whenever needed but staying within a maximum thread count limit.

The surveyors look at each island again at the 1024×1024 resolution, with even higher dwell limits to get the area estimate for that island. The estimate has to be good enough to reliably determine the ordering of the island in relation to others that are close to the same area. Each surveyor chooses its iteration limit on its own, by starting with 128×2^level but then doubling over and over until has decreased by less than 1/10 of 1%. After this occurs, a model of asymptotic convergence rate for pixel counting islands (based on data from this project and from my Area of the Mandelbrot Set project) is used to extrapolate to a final area estimate plus an error estimate, together expressing a more accurate answer than just using the value attained from the highest dwell limit.

Other Design Details

Dwell limits must be chosen in proportion to grid sizes; this includes using higher dwell limits when looking at a single island using its own individual grid scan. I used the same solutions (to the fundamental problems of pixel-counting and statistical rigor) that I had developed during my earlier work on the Area of the Mandelbrot Set.

One question to ask is whether to count the island's islands as part of the island. Mathematically one should, but maybe it's more "intuitive" to not do so, because the "sub-island" will eventually appear further down on the list. Practically, there is extra effort to identify isolated pixels and count them out. I went with the "mathematical" and practically easier choice, which is that the "sub-island" areas get counted.

At an early stage the prospector must perform some kind of flood fill algorithm to determine if a pixel is part of a clump that is a candidate island; this is easiest if a grid of pixels is stored in memory, and far easier (and quicker) if you don't have to keep shiting that grid by small amounts as new rows/columns are scanned. It may be hard to store a full 524288×262144 grid (137.4 billion pixels) in memory at once. I did not have the memory on my machine at the time (2019) to store grids at the high resolutions I planned to perform this survey, so I chose to have the prospector scan the grid in pieces, square tiles that overlap their neighbours. Only one is stored in memory, and pixels in the overlap area get re-computed.

That means that many islands will be found multiple times, which creates the need for a matching function to check if two islands are the same: the distance between their centers is less than 1/2 of either island's "diameter", and the ratio between the two areas is "close" to 1. The "center" is defined as the center-of-gravity of the pixels that were counted to estimate an area.

This matching is done before deciding to assign a "surveyor", in order to avoid having to do the computationally expensive (and very precise) surveyor area measurement multiple times. But in order to determine that "the ratio between the two areas is 'close' to 1", a modest-precision pixel count must be done.

While doing all this work, one may as well find the period of each island in the list; this is determined using my method from 1992, which is described in the period article.

Runtime and Related Statistics

An 8-core Nehalem system was used, 2.2 GHz and 16 threads; this was very new hardware for 2009. All results are periodically written to files that are automatically reloaded at startup, to recover from any interruption (e.g. due to power failure). The level-12 survey took about 10.1 days, with a total thread×CPU time 4.835 times as long as the level-11 survey. The runtime ratio between levels increases a bit each time because the Hausdorff dimension of the Mandelbrot set increases with magnification level. Thread efficiency was sufficient to finish the work 14.6× as fast as doing it in a single thread sequentially (as determined by comparing the time(1) utility's "real" to "user" numbers).

Here are the statistics for all levels:

Level Def (Poss) Total Rat ------Parallel------ 3 2 2 8 0m 8.817s 35.61129 4 3 5 19 2.375 0m14.789s 149.8380 4.208 5 5 12 45 2.368 0m42.628s 530.0333 3.537 6 13 33 133 2.956 2m23.885s 1949.943 3.679 7 33 97 351 2.639 8m30.979s 7409.384 3.800 8 104 249 948 2.701 31m59.284s 28793.18 3.886 9 261 674 2513 2.651 138m42.552s 120389.7 4.202 10 708 1770 6560 2.610 665m9.948s 555657.2 4.593 11 1865 4596 16698 2.545 2970m0.116s 2601579. 4.682 12 4849 11737 42437 2.541 14544m0.00s 12578854 4.835

To explain the numbers more fully — in the bottom row from left to right: 12 is the survey level; 4849 islands whose ranking was definitively determined if you count R2 as #1; 11737 islands with a suspected ranking; (25851 more islands identified, this number is not shown in the table); total of 42437 islands were identified; 2.541 is the ratio of 42437 to the number above it; 14558m51.4s is "wall clock time" (about 10.1 days, estimated because a restart occurred); 12578854 is "user" time in seconds (every thread's time is added together to estimate what it would take without multithreading); 4.835 is the ratio of 12578854 to the number above it.

Notably, the top of the rightmost column has the number 4.208. This number is higher than the ones below it. I suspect this irregularity is because of the relatively small number of islands at those smaller scales, and expected fluctuation that occurs with small sample sizes.

Interesting Symmetries

I found some interesting symmetries. For example the rank-195 and rank-196 islands have the same orientation:

R2.2/7F(2/3B1)S (period 28) is rank #195

R2.1/5F(3/4B1)S (period 25) is rank #196

An even more impressive symmetry is found bewteen ranks 192 and 197:

R2F(2/5(2/3B1)B3)S (period 13, rank 192)

R2F(1/2(1/5B1)B1)FS{2}S (period 11, rank 197)

Both have 5-pointed branch points in their filaments, but the 5's are from a different tuning level (2/5 is from the secondary continental mu-atom R2.2/5a), 1/5 is from a tertiary level); one is the centre of its primary filament (no FS operator in its R2-name), the other is the centre of a secondary filament (it has one FS[] operator in its name).

Such pairs seem to be abundant. Rank 184 and 188 resemble each other in a similar way when one is rotated 180 degrees. Rank 201 and 203 match when one is mirror-flipped about a vertical axis.

You cannot match any two islands just by getting them to be pointing the same direction. For example, ranks 231 and 233 (after rotating one by 180 degrees) do not line up at all. The filaments, branch points, satellite islands, etc. do not line up, and one has a much bigger 1/2a. Comparing 115 and 117: each filament is clearly the same length, but the satellite islands within are clearly arranged differently.

(I have not investigated this phenomenon further, but I remember something in an academic article about the Misiurewicz points being aligned with invisible circles/quasi-ellipses around the island; that might be a lot of the explanation.)

Other Findings

The largest islands on the real axis are ranked 1, 3, 5, 35, 37, 42, 53, 61, 115, 117, 190, 325, ... overall. Notice the gap from 5 to 35, and some lesser gaps after that: 5 to 35 is a ratio of seven, but 61 to 115 is less than double.

The R2F(1/3B1)S within R2F(1/2B1)S is the 547^th largest island overall; while the R2F(1/2B1)S within R2F(1/3B1)S is rank 563.

Additional Surveys

I performed a survey of islands on the real axis (not the same as the spike R2F(1/2B) because that includes off-axis islands). I went up to level 20 and found the largest 1049 with definitive ranking, plus 1504 smaller with estimated ranks.

I also did a close-up survey around R2F(1/3B1)S early in 2011 but I don't know of any notes from that period.

Mu-ency main page — index — recent changes — DEMZ

This page was written in the "embarrassingly readable" markup language RHTF, and was last updated on 2024 Dec 11.

s.30