Minimally Complex Sequences
Quick Links: MCS Source Code License (Creative Commons AttributionNonCommercial 4.0 International)
This is a listing of integer sequences generated by what I call 'classical' definitions. These include virtually all number sequences known by ancient cultures (the notable exception is the prime numbers). There are also many others that have equally simple definitions. Examples include:
 The natural numbers: 0, 1, 2, 3, 4, 5, 6, 7, ...
 The squares: 0, 1, 4, 9, 16, 25, 36, 49, ... (and others that have a "figurate" interpretation)
 The powers of 2: 1, 2, 4, 8, 16, 32, 64, ...
 The Fibonacci numbers: 0, 1, 1, 2, 3, 5, 8, 13, ...
 The factorials: 1, 1, 2, 6, 24, 120, 720, ...
The subject of this page is a broader class of recurrence relations than those described on the 2^{nd}order linear recurrence page, but not broad enough to include the recurrence formulas on my accelerating sequences page.
My comprehensive sequences page is broader still, and has pointers to many other specialized pages on other subcategories of sequences.
Contents
Types of Sequences in my Catalog
Detailed Description of Sequence Numbering System
C Source Code
Ordering Used in the Table
Main Table
There is an index to the main table at the bottom of each page.
Statistics
Footnotes
Types of Sequence Definitions
The following broad categories of sequence definitions are used:
Direct: Each term A_{N} is computed directly from a formula involving N but not depending on other terms in the sequence.
(This category includes the even numbers, if defined as E_{N}=2N.)
FirstOrder Recurrence^{1}: After an initial term A_{0}, subsequent terms A_{N+1} are defined in terms of a formula involving A_{N} (and possibly also involving N).
(This category includes the factorials, if defined as F_{0}=1, F_{N}=N F_{N1}, and the even numbers, if defined as E_{0}=0, E_{N}=2+E_{N1}. Notice that the same sequence can have both a direct definition and a recurrence definition.)
SecondOrder Recurrence^{1}: After two initial terms A_{0} and A_{1}, subsequent terms A_{N+1} are defined in terms of a formula involving A_{N1}, and possibly also A_{N} and/or N.
(This category includes the Fibonacci numbers, F_{0}=0, F_{1}=1, F_{N+1}=F_{N}+F_{N1}. It includes all of the 2^{nd}order linear recurrence sequences described here but also a lot of others such as MCS54400, which is nonlinear because its formula has an NA_{N} term.).
ThirdOrder Recurrence^{1}: After three initial terms A_{0}, A_{1} and A_{2}, subsequent terms A_{N+1} are defined in terms of a formula involving A_{N2}, and possibly also A_{N}, A_{N1}, and/or N.
All formulas are a sum of terms involving a constant coefficient multiplied by some combination of a power of N and zero or more previous terms. This table shows the combinations that are used:

Direct formulas use only the first row; firstorder recurrence formulas use the first two rows, and so on.
Some of the sequences belong to a more narrowlydefined category studied by Horadam and possessing some interesting properties. They are discussed more fully on my linear recurrences page: The Horadam Sequences.
The 2^{nd}order Lucas sequences are another narrowlydefined category of linear recurrence relations. Again, see my linear recurrences page: The Lucas Sequences.
Sequence ID Number
Each sequence definition is given an identification number that uniquely specifies the initial term(s) (if any) and formula. All ID numbers begin with the letters MCS (for Minimally Complex Sequence) followed by one or more decimal digits. The decimal digits are chosen by the following steps:
1. The sequence type, initial value(s) if needed and coefficients
are encoded into fields, either by using the coefficient
directly as a field value, or using a translation table that
ensures the "default" coefficient has a field value of 0.
2. The fields are Huffmanencided into binary strings.
3. The binary strings are concatenated into a single bitstring.
4. The bitstring is modified into a truncated bitstring.
5. The truncated bitstring is treated as a single positive integer;
this is the sequence number, and the decimal representation
of this integer is placed after MCS to make the sequence ID number.
The first field gives the sequence type as defined above: 4 for direct, 3 for 1^{st}order recurrence, 2 for 2^{nd}order recurrence and 1 for 3^{rd}order recurrence. After this there are several more fields for the coefficients, in the order shown here:
For direct definitions: K, C_{N}, C_{N2}, C_{N3}
For 1^{st}order recurrence: K, C_{AN}, C_{N}, C_{N AN}, C_{N2}, C_{N3}, A_{0}
For 2^{nd}order recurrence: K, C_{AN}, C_{N}, C_{AN1}, C_{N AN}, C_{N2}, C_{N3}, C_{N AN1}, D_{1}, A_{0}
For 3^{rd}order recurrence: K, C_{AN}, C_{N}, C_{AN1}, C_{N AN}, C_{N2}, C_{AN2}, C_{N AN1}, C_{N3}, D_{2}, D_{1}, A_{0}
where D_{1}=A_{1}A_{0}1 and D_{2}=A_{2}A_{1}1
All fields are then converted to one or more binary digits according to the following table:
0 → 0
1 → 100
2 → 1010
2 → 10110
4 → 101110
4 → 1011110
5 → 1011111
1 → 110
3 → 1110
5 → 111100
6 → 1111010
6 → 1111011
3 → 111110
7 → 111111000
8 → 111111001
x → 11111101x (for future expansion)
7 → 111111100
8 → 111111101
x → 11111111x (for future expansion)
Then the fields, so encoded, are concatenated to form the bitstring. This table was generated by a normal Huffman statistical division based on the frequency of coefficient values in all sequence definitions with a complexity score (see below) of 9 or less; then the lengths were diminished as much as possible with the goal of reducing the average overall bitstring length; it was then reordered so that the shorter (more common) bitstrings end in 0, and so that larger coefficients map onto larger bitstrings.
Steps 34: Truncated Bitstring, and Sequence Number
To make the truncated bitstring, first an initial 1 is added (because otherwise, initial 0's would be lost when the bitstring is converted to a decimal integer). Then any trailing 0's are removed (there is no loss of information because, as specified below, extra 0 terms at the end are always assumed and allow future expansion). Then a single trailing 1 is removed (again there is no loss of information because at this point we know there is at least one trailing 1). The bitstring is then converted to decimal as if it were a binary integer. "MCS" followed by this decimal number is the sequence ID number. No zeros are added; the sequence numbers have a variable number of digits. In general, sequences with simpler definitions will have smaller sequence numbers.
For future expansion of the sequence generator, sequence types other than 1 through 4 are unused (which means that there are currently no bitstrings starting with 10, corresponding to sequence numbers between 2^{N} and 3×2^{N1} for all N; plus other smaller ranges defined similarly). This provides space to define new sequence types and formulas, without having to resort to long sequence ID numbers. These new sequence types can use any encoding whatsoever, even abandoning the requirements given in the next few paragraphs, so long as the bitstring of all 0's remains undefined (because it results in a 0 or null sequence number).
Additional fields can be added at the end of the existing coefficient lists. For example, an N^{4} term could be added, with all existing sequences having a C_{N4} coefficient of 0. Any new fields and the corresponding formula terms will be defined with this compatibility in mind. (Here is an example: suppose it is decided to allow the coefficient of A_{N} to be a fraction. This necessitates a new "denominator of C_{AN}" term. Its useful values are positive integers; the value of 1 is the one that has no effect. So, a 1 in this coefficient would be encoded as a field value of 0; the higher values 2, 3, 4, 5, ... could be encoded as 1, 1, 2, 3, etc. respectively, in order to take maximum advantage of the huffman field lengths.) Since the Huffman table maps 0→0, and trailing 0's are discarded before converting to a decimal sequence ID, it is easy to see that any number of extra 0 fields can be added, with no effect on the resulting sequence ID number. Even after the hypothetical N^{4} and "denominator of C_{AN}" terms are added, all previouslydefined sequences will still have the same ID numbers.
Furthermore, for any new fields, a different Huffman table could be used, or the expansion could even use a nonHuffman type of encoding, so long as all sequence numbers as currently defined result in the sequences currently defined. (For example, suppose it is decided that the constant, N, N^{2} and N^{3} terms should each have an optional factor of 1^{N}. In this case the best approach is to add 4 fields that are each exactly 1 bit wide (or a single field of 4 bits). For backwards compatibility, bit values of 0 mean there is no factor of 1^{N}; and all sequence numbers would be interpreted as having this extra 4 bits at the end of the bitstring.)
There are by definition no sequence numbers with a leading 0 when expressed in decimal, such as MCS027. These are intended to be used as manuallyassigned sequence numbers, as in the Sloane database (except that there is still no fixed number of digits). Thus, to find out the definition of MCS027 or MCS0143 you would have to look in the database. (No such database exists at present, however it is anticipated that these will be used to attach notes and crossreferences, for example to document the multiple different equivalent definitions of a sequence; an example follows.)
The sequence number is not affected by the scoring system, although the scoring system affects which sequences are listed in the main table. There is, however, a rough correlation between the coefficient value weights in the scoring system and their Huffman symbol lengths.
Some sequences are shown with multiple sequence numbers — an initial sequence number in larger print, followed by "(alias ...)" with additional sequence numbers. The aliases represent different formulas that create the same sequence. Here is an example of two definitions that are considered to be aliases:
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ...
MCS220: A_{N} = N (score: 1)
1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, ...
MCS886: A_{N} = N  1 (score: 2)
These sequences are considered equivalent because they differ only in the initial 1 at the beginning of MCS886. In the main table they are both listed under MCS220. The number MCS220 is chosen not because 220 is less than 886, but because MCS220 has a lower complexity score (1 vs. 2).
For the purposes of matching aliases, any initial 1, 0 or 1 terms are ignored but all other terms are significant, including 1, 0 or 1 terms that occur after the first term of magnitude 2 or greater.
Most of the time, aliases will have quite different sequence numbers, as in the example with MCS220 and MCS886. However, in a few rare cases the sequence numbers are very close (such as MCS860675 and MCS860676, two versions of the Lucas numbers that differ only by an initial "1"). Another curiousity (that has an easy explanation if you look at the encoding scheme) is seen in MCS220 = N, MCS440 = N^{2} and MCS880 = N^{3}. And although it is not listed here, MCS110 is N^{0}.
The Fibonacci numbers have a secondorder recurrence definition: A_{0}=0; A_{1}=1; A_{N+1}=A_{N}+A_{N1}. The coefficients that need to be encoded are: sequence type=2; K=0; C_{AN}=1; C_{N}=0; C_{AN1}=1; C_{N AN}=0; C_{N2}=0; C_{N3}=0; C_{N AN1}=0; A_{1}A_{0}1=0; A_{0}=0. So the fields, before encoding are: 2, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0. By the Huffman table, this becomes the bitstring 1.1010.0.100.0.100.0.0.0.0.0.0 (after adding the initial 1). Then trailing zeros are dropped, and then a single trailing 1 is dropped, making 1.1010.0.100.0 or 1101001000; this is converted to decimal giving the value 840. So, this definition of the Fibonacci numbers is given sequence number MCS456.
Note that some sequences have multiple definitions. For example, the natural numbers may be defined as a direct sequence: A_{N} = N. Using this definition the bitstring is 1.10110.0.100.0, and the sequence number is MCS108. However, the natural numbers can also be defined by a 1^{st}order recurrence A_{0}=0; A_{N+1} = A_{N} + 1. By this definition, the bitstring is 1.1110.100.100.0.0.0.0, and the sequence number is MCS244. As you might expect, the first definition has a lower complexity score, and that is the one shown in the main table.
Complexity Score
Each sequence definition is given a complexity score based on the number of terms and the size of its coefficients. The basic idea is to take the number of terms in the formula and add the absolute value of the initial terms plus any coefficients (a coefficient of 1 does not count). After a while I made the score calculation a bit more complex to better handle 2^{nd} and 3^{rd}order recurrence formulas.
For examples of how it works just browse the following table and search for "score: 3", etc.
The main table includes those sequences with complexity scores up to 5.
Efficient Implementation
I have written an efficient search engine that finds MCS sequences matching a given query, with results ordered by complexity score. The source code is here:
mcsfind.c
rpmtypes.h
int128.c
int128.h
type_s16.h
type_s32.h
type_s64.h
Build Instructions
These instructions are for UNIX, Linux or MacOS (using a Terminal or Shell window).
Save each of the source files on your computer, and remove the ".txt" from the end of each of the source files' names. Put theminto directories that are set up like the following. The name of the directory "include" is important, but you can use different names for "development" and "mcs" if you wish:
development include rpmtypes.h int128.c int128.h type_s16.h type_s32.h type_s64.h mcs mcsfind.cThen from within the development/mcs directory, use the following commands:
g++ c m64 ../include/int128.c o int128.o g++ m64 I../include mcsfind.c lm int128.o o mcsfindIf the compilation is successful, you should be able to perform MCS searches like the following example:
$ time ./mcsfind w1 2 3 5 9 2, 3, 5, 9, 17, 33, 65, 129, 257, 513, 1025, 2049, 4097, 8193, 16385, 32769, 65537, 131073, 262145, 524289, 1048577, 2097153, 4194305, 8388609, 16777217, ... MCS252546 : A[0] = 2; A[N] = 2 A[N1]  1 (score: 4) 1, 2, 3, 5, 9, 15, 25, 43, 73, 123, 209, 355, 601, 1019, 1729, 2931, 4969, 8427, 14289, 24227, 41081, 69659, 118113, 200275, 339593, 575819, 976369, ... MCS802976 : A[0] = 1; A[1] = 2; A[2] = 3; A[N] = A[N1] + 2 A[N3] (score: 5) 1, 1, 2, 3, 5, 9, 18, 39, 89, 209, 498, 1195, 2877, 6937, 16738, 40399, 97521, 235425, 568354, 1372115, 3312565, 7997225, 19306994, 46611191, 112529353, ... MCS6904326 : A[0] = 1; A[1] = 1; A[K+1] = 2 A[K] + A[K1]  K (score: 5) real 0m0.008s user 0m0.006s sys 0m0.002s $
The mcsfind command has many options and includes builtin help. To see the builtin help, use the command ./mcsfind help
Performance
Using the reportall and n options, along with the wc tool in UNIX, one can measure the speed with which mcsfind generates the N "simplest" sequences. For example, the command:
time mcs ls20 reportall n10000  wc
might show that mcsfind took 0.326 seconds to generate 10,000 sequences. In the following table I measure the number of sequences, and time taken, for various values of the ls option, which sets an upper limit on the score of all the generated sequences. A sample command is:
time mcs reportall ls8  wc

All tests were run on a 2.16 GHz Nehalem Xeon workstation, but the workstation was busier when the oddnumbered entries were run, giving those a lower benchmark in the sequences/sec column. I expect that any CPU with a speed of 2 GHz or faster will have no problem meeting or exceeding 40,000 sequences/sec on runs of a few seconds or longer.
Note also that the results for maximum score of 15 are the same as for 16. This results from the scoring algorithm and limits on coefficient magnitudes; there are simply no sequences with a score in the range (14..15].
Ordering Used in the Table
I use the ordering originally established by N.J.A. Sloane [1].
Here the sequences are presented in lexicographic order, as if minus signs and any leading 1, 0 or 1 terms were ignored. For example, "1, 1, 2, 1, 6, 5, 6, ..." is treated as if it starts "2, 1, 6, 5, 6, ...".
Note in particular that any aliases (see above) are considered identical and listed together just as they would have been in Sloane's books ([1] and [2]).
. . . Forward to page 2 . . . Last page (page 30)
Quick Index:
Sequences Beginning 2,0,...
Sequences Beginning 2,1,...
Sequences Beginning 2,2,0,... ; 2,2,1,... ; etc.
Sequences Beginning 2,2,4,... ; 2,2,5,... ; etc.
Sequences Beginning 2,3,0,... ; 2,3,1,... ; etc.
Sequences Beginning 2,3,4,... ; 2,3,5,... ; etc.
Sequences Beginning 2,4,...
Sequences Beginning 2,5,... ; 2,6,... and 2,7,...
Sequences Beginning 2,8,... ; 2,9,... ; etc.
Sequences Beginning 3,0,... ; 3,1,... ; etc.
Sequences Beginning 3,4,... ; 3,5,... ; etc.
Sequences Beginning 3,8,... ; 3,9,... ; etc.
Sequences Beginning 4,0,... ; 4,1,... ; etc.
Sequences Beginning 4,6,... ; 4,7,... ; etc.
Sequences Beginning 5,...
Sequences Beginning 6,... ; 7,... ; etc.
This page was written in the "embarrassingly readable" markup language RHTF, and some sections were last updated on 2016 Jun 09. s.11