Sequence A020916: A Restricted Class of Permutations
This sequence, Sloane's A020916, counts a certain subclass of permutations. Starting with N=1, sequence A020916 is: 0, 0, 1, 2, 0, 0, 24, 96, 0, 0, 10000, 60736, 0, 0, 20511168, 168661760, 0, 0, 134002359296, 1398597049856, 0, 0, 2146989255011328, 27232259080056832, ...
Linear Molecules
The sequence is based on the idea of "linear" molecules, which are molecules whose atoms are in a single chain, with no branches. Real-world examples include H2 (two hydrogen atoms), H2O (water, two hydrogen atoms linked by an oxygen atom), HNO (nitroxyl, a hydrogen, nitrogen and oxygen atom in that order), and C2H2 (acetylene, which has two hydrogens and two carbons with the atoms arranged HCCH). These molecules are sometimes represented in drawings like this:
H-H H-O-H H-N=O H-C≡C-H
Each atom has a valence, which is the number of bonds it can connect with, and each atom is connected to one or two neighbors with at least one, or possibly multiple, bonds. In the above examples, the H atoms have valence 1, the O atoms have valence 2, the N atom has valence 3, and the C atoms have valence 4. All bonds are single bonds, except the one between N and O which is a "double bond", and the one between the two C atoms, which is a "triple bond".
Abstract Molecules
This sequence does not concern actual molecules, it concerns a more abstract "perfect" molecule that can have atoms with any arbitrarily large (integer) valence. Bonds are available in any number, and we ignore any concerns about chemical stability. Since these aren't actual chemical elements, we just call each atom by a number representing its valence. The above examples then become:
1-1 1-2-1 1-3=2 1-4≡4-1
The question answered by sequence A020916 is: How many ways are there to make a molecule that has exactly one atom of each valence from 1 to N, where N is the number of atoms in the molecule? Of the above examples, 1-3=2 is the only one that qualifies. There are two molecules with 4 atoms:
N=3 1-3=2
N=4 1-2-4≡3 1-3=4=2
In real chemistry, the 4-atom molecules could be H-O-C≡N (cyanic acid) and H-N=C=O (Isocyanic acid). The first of these has not been isolated, but sequence A020916 does not care.
These are the only solutions with N=3 and N=4 (the "reverse" forms 2=3-1, 3≡4-2-1 and 2=4=3-1 are considered equivalent and do not count as separate solutions).
In any solution, the 1 must be one end or the other, because it has only one bond to share with another atom — if it were in the middle there would be no way to hold the molecule together. By convention we'll always put the 1 at the left end.
In any solution, the total of the atoms is twice the total number of bonds. For example, in 1-3=2, the atoms add up to 6, and the bonds add up to 3. In 1-2-4≡3, the atoms add up to 10 and the bonds add up to 5. Notice this means the sum of the atoms has to be an even number.
In a solution with N atoms, the atoms add up to N(N+1)/2, which is the Nth triangular number. However, half of the triangular numbers are odd: namely, when N is equal to 1, 2, 5, 6, 9, 10, 13, 14, ... (any multiple of 4 plus 1 or 2). This means that there are no ways to form a solution for these values of N.
When N<9, the atom-numbers in the molecule, when treated as digits, spell out a multiple of 11. In the above 3 examples, 123=11×12, 1243=11×113, and 1324=11×122. You can also see that the other factor (12, 113 and 122 in the examples) has digits corresponding to each of the bonds in the molecule — for example 113 for two single bonds and a triple bond. This is true for all "molecules", not just solutions for A020916. For example, the molecole 1-4{=}4-1 corresponds to the integer 1441 which is 11×131. It is pretty easy to see why this relationship holds.
For N=7 there are 24 solutions. Here they are, with 4×, 5× and 6× bonds represented by "::", ":':" and "=:=" respectively:
1-2-3=5≡4-7=:=6 1-2-3=5≡6≡7::4 1-2-3=7:':6-5::4
1-2-4≡5=3-7=:=6 1-2-4≡5=6::7≡3 1-2-4≡7::6=5≡3
1-2-6:':7=3-5::4 1-2-6:':7=4=5≡3 1-3=4=5≡7::6=2
1-3=4=6::7≡5=2 1-3=5≡4-2-7=:=6 1-3=5≡4-6:':7=2
1-3=5≡7::6=4=2 1-3=6::7≡5=4=2 1-3=7:':6-2-5::4
1-3=7:':6-4≡5=2 1-4≡5=3-2-7=:=6 1-4≡5=3-6:':7=2
1-4≡5=7:':6-3=2 1-4≡6≡7::5-3=2 1-5::7≡6≡4-3=2
1-6:':7=3-2-5::4 1-6:':7=3-4≡5=2 1-6:':7=5≡4-3=2
For N=8 there are 96 solutions, and for N=11 there are exactly 10000.
Although the first few terms make it appear that all terms are highly composite, this does not hold true in general:
A3 = 1
A4 = 2
A7 = 24 = 23 3
A8 = 96 = 25 3
A11 = 10000 = 24 54
A12 = 60736 = 26 13 73
A15 = 20511168 = 26 3 317 337
A16 = 168661760 = 28 5 17 23 337
A19 = 134002359296 = 210 13 2347 4289
A20 = 1398597049856 = 29 2731634863
A23 = 2146989255011328 = 210 3 7 13 23 619 539447
A24 = 27232259080056832 = 212 15031 442319257
Computing the Sequence
A recursive algorithm with pruning suffices to compute the terms through A16 in a few minutes. I used the following C code:
/* This assumes "int" is 32-bit, and "long long" is 64 bit */ long long a020916_p( int i, // next position to fill (0 = first position) int n, // length of molecule we want to build int bb, // bitmap of atom values that have been used int val, // valence-number of bond from previous atom int rem) // sum of remaining unused atom values { int j; long long rv; if (i >= n-2) { /* There are two atoms left to place, rem is their sum, and val is the valence coming in. his leaves only one possible solution: the next atom must be (rem+val)/2, as illustrated by the examples below. If this "ideal" value is legal (<= n), bigger than the valence coming in, and if that atom hasn't been used yet, we have a solution. rem val solution 5 3 4-1 5 2 impossible by parity 5 1 3=2 17 1 9⠶⠶8 3 5 impossible by valence Note that if this penultimate atom is available, we know the final one is available too because we've been keeping track of the sum. */ int pen; /* penultimate atom */ pen = rem + val; if (pen & 1) { return(0); } pen >>= 1; if ((pen <= n) && (pen > val) && ((bb & (1 << pen)) == 0)) { return(1); } return(0); } /* Recurse on all unused numbers */ rv=0; /* Try all candidates for the next atom. Note this next atom must be at least 1 greater than the incoming valence. */ for(j=val+1; j<=n; j++) { /* Check bitmap to see if this atom is available */ if ((bb & (1 << j)) == 0) { /* Recurse and accumulate */ rv = rv + a020916_p(i+1, n, bb | (1 << j), j - val, rem-j); } } return(rv); } void seq_a020916(void) { int n, sum; long long tot; for(n=2; n<=20; n++) { sum = n * (n+1) / 2; if ((n+3)& 2) { /* We always put the '1' atom first; this eliminates mirror image equivalents and reduces the depth of the search. */ tot = a020916_p(1, n, 1<<1, 1, sum-1); } else { tot = 0; } printf("%d %lld\n", n, tot); } }
A Better Algorithm
The above is a "brute-force" algorithm that essentially searches all combinations explicitly. An improved method is to create a list of "left" and "right" half-molecules. A half-molecule is any combination of atoms that might be part of a solution. For example, when N=7, the following table shows all possibilities for the left 4 atoms in the left column, and the possibilities for the right 3 atoms in the right column:
|
To make a solution, the left half needs to have the same bond as the right half so they can join, and the two together have to have one of each atom. It turns out that if the latter criterion is met, the former follows (see if you can prove this for yourself). So, for example the 1-3=5≡7:: on the left matches up with the ::6=4=2 on the right.
To find matches efficiently, the two lists need to be sorted into an equivalent order. This can be done by tagging the items on the left with a label that tells which atoms are present, and the items on the right with which atoms are absent. For example, the tag for 1-4≡5=3- is 1345 (its atoms in ascending order), and the tag for =6::7≡3 is 1245 (the atoms it is lacking). Then sort both lists by the tags, and they will be ordered in a way that allows all 24 matches to be found with a single scan.
There are more items here (17+19=36) than actual solutions (24) to the N=7 problem. However, it is still a lot fewer than the total number of combinations that have to be checked by the full-search technique shown above. (The full-search method builds 71 different sequences of length 7; without the "pruning" techniques there would be 720 permutations to check). This is not much of an advantage when N=7, but it becomes much greater for higher values of N. For example, when N=15, the simpler method has to check 122747099 patterns of length 15, and the "left and right halves" method needs to generate only 786156+849212=1635368 patterns, which is 1/75 as many. For N=19, the simple method took over a day, while the optimized method took only 3 minutes.
This technique allows the N=23 case to be computed in a couple days, using a balanced binary tree data structure for the lists. N=24 took 17.7 days on a single Xeon processor core.
The next nonzero term is N=27. Extrapolating from known results, I estimate a runtime of about 68 years on a single processor, or about a month if the task can be divided amongst 1000 processors with 75% efficiency.
Some other sequences are discussed here.
This page was written in the "embarrassingly readable" markup language RHTF, and was last updated on 2022 Mar 19. s.27