I am interested in calculating the nth term of higher order Fibonacci sequence in C++, using matrix exponentiation, while allowing n to be reasonably large (in the range of hundreds).
The sequences I have described are not Fibonacci sequence only but generalizations of it, of which Fibonacci is order 2, in order 3 3 previous terms are used to calculate the next term, i.e. F(n) = F(n – 1) + F(n – 2) + F(n – 3), this is usually called Tribonacci, order 4 is Tetranacci, and order 5 Pentanacci… I wish to calculate series up to 36nacci.
I know Fibonacci like sequences grow extremely fast, so I used boost/multiprecision/cpp_int.hpp to allow arbitrary length integers in C++, to let the function be able to calculate several hundredth term.
Order here means number of previous terms used to calculate the next term, Fibonacci is order 2, Tribonacci is order 3 and Tetranacci order 4, etc. Also, the series starts with n – 1 zeros and then S[n] == 1 for series S of order n, i.e. order 2 starts with [0, 1]
and 3 [0, 0, 1]
and 4 [0, 0, 0, 1]
, this specification is consistent with Wikipedia and OEIS and the logic of the sequences.
I have found the matrices that could be used to generate terms for orders 2, 3 and 4 online, and I have found the patterns of how the matrices are made, and wrote functions to generate matrices for arbitrary high orders.
import numpy as np
def onacci_matrix(n: int) -> np.ndarray:
mat = np.zeros((n, n), dtype=np.uint64)
mat[0, :] = 1
for i in range(1, n):
mat[i, i - 1] = 1
if not n % 2:
mat = mat.T
return mat
The matrices for the first five orders are:
In [336]: onacci_matrix(2)
Out[336]:
array([[1, 1],
[1, 0]], dtype=uint64)
In [337]: onacci_matrix(3)
Out[337]:
array([[1, 1, 1],
[1, 0, 0],
[0, 1, 0]], dtype=uint64)
In [338]: onacci_matrix(4)
Out[338]:
array([[1, 1, 0, 0],
[1, 0, 1, 0],
[1, 0, 0, 1],
[1, 0, 0, 0]], dtype=uint64)
In [339]: onacci_matrix(5)
Out[339]:
array([[1, 1, 1, 1, 1],
[1, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[0, 0, 1, 0, 0],
[0, 0, 0, 1, 0]], dtype=uint64)
In [340]: onacci_matrix(6)
Out[340]:
array([[1, 1, 0, 0, 0, 0],
[1, 0, 1, 0, 0, 0],
[1, 0, 0, 1, 0, 0],
[1, 0, 0, 0, 1, 0],
[1, 0, 0, 0, 0, 1],
[1, 0, 0, 0, 0, 0]], dtype=uint64)
More specifically, using zero-based indexing, the item located at row 0 column 0 of the base matrix raised to ith power is the item with index i + o – 1 of the series of order o.
from typing import List
def onacci_iterative(n: int, o: int) -> List[int]:
series = []
stack = [0] * (o - 1) + [1]
for _ in range(n + 1):
stack.append(sum(stack))
series.append(stack.pop(0))
return series
ONACCI6 = onacci_matrix(6)
In [342]: np.linalg.matrix_power(ONACCI6, 19)
Out[342]:
array([[233904, 117920, 59448, 29970, 15109, 7617],
[230064, 115984, 58472, 29478, 14861, 7492],
[222447, 112144, 56536, 28502, 14369, 7244],
[207338, 104527, 52696, 26566, 13393, 6752],
[177368, 89418, 45079, 22726, 11457, 5776],
[117920, 59448, 29970, 15109, 7617, 3840]], dtype=uint64)
In [343]: onacci_iterative(24, 6)[24]
Out[343]: 233904
Because exponentiation is involved, this can be done using exponentiation by squaring and thus done in logarithmic time.
The matrices are stored as flat vectors to get maximum performance, I have written functions to inspect which elements of a and b are needed for each element of a dot b, and calculated the exact index that for the series of order o has the maximum term representable by unsigned integers of a given width:
def matmul_indices(n: int) -> dict:
indices = {}
for y in range(n):
s = y * n
for x in range(n):
indices[s + x] = [(s + z, z * n + x) for z in range(n)]
return indices
def onacci_limits(o: int, lim: int) -> dict:
limits = []
lim = 1 << lim
for i in range(2, o + 1):
n = 0
while onacci(n, i)[-1] <= lim:
n += 1
limits.append(n - 1)
return limits
So the following is the C++ code I have written so far:
#include <array>
#include <boost/multiprecision/cpp_int.hpp>
#include <chrono>
#include <iostream>
#include <vector>
using boost::uint64_t;
using boost::multiprecision::uint128_t;
using boost::multiprecision::uint256_t;
using boost::multiprecision::uint512_t;
using std::array;
using std::chrono::high_resolution_clock;
using std::vector;
constexpr array<int, 35> ONACCI_64 = {
93, 75, 71, 70, 70,
71, 72, 73, 74, 75,
76, 77, 78, 79, 80,
81, 82, 83, 84, 85,
86, 87, 88, 89, 90,
91, 92, 93, 94, 95,
96, 97, 98, 99, 100
};
constexpr array<int, 35> ONACCI_128 = {
186, 148, 139, 136, 135,
135, 136, 137, 138, 139,
140, 141, 142, 143, 144,
145, 146, 147, 148, 149,
150, 151, 152, 153, 154,
155, 156, 157, 158, 159,
160, 161, 162, 163, 164
};
constexpr array<int, 35> ONACCI_256 = {
370, 293, 274, 267, 265,
264, 264, 265, 266, 267,
268, 269, 270, 271, 272,
273, 274, 275, 276, 277,
278, 279, 280, 281, 282,
283, 284, 285, 286, 287,
288, 289, 290, 291, 292
};
constexpr array<int, 35> ONACCI_512 = {
739, 585, 544, 529, 524,
521, 521, 521, 522, 523,
524, 525, 526, 527, 528,
529, 530, 531, 532, 533,
534, 535, 536, 537, 538,
539, 540, 541, 542, 543,
544, 545, 546, 547, 548
};
uint64_t square_sum(int n) {
double sq, cu;
sq = n * n;
cu = sq * n;
return uint64_t(floor(cu / 3 + sq / 2 + double(n) / 6 - 0.5));
}
vector<int> ONACCI_MATRIX(16205);
void generate_matrix(int n) {
uint64_t start;
start = square_sum(n - 1);
if (n & 1) {
for (int i = 0; i < n * n; i++) {
auto [q, r] = std::div(i, n);
ONACCI_MATRIX[start + i] = not q or q - r == 1;
}
}
else {
for (int i = 0; i < n * n; i++) {
auto [q, r] = std::div(i, n);
ONACCI_MATRIX[start + i] = not r or r - q == 1;
}
}
}
template <typename T>
vector<T> square_multiply(const vector<T>& a, const vector<T>& b, int n) {
int square = n * n;
vector<T> c(square);
T s;
for (int y = 0; y < square; y += n) {
for (int x = 0; x < n; x++) {
s = 0;
for (int z = 0; z < n; z++) {
s += a[y + z] * b[x + z * n];
}
c[y + x] = s;
}
}
return c;
}
template <typename T>
T onacci_matrix(int n, int o) {
if (o < 2 or o > 36) {
throw std::invalid_argument("order must be between 2 and 36");
}
if (n < o - 1) {
return 0;
}
if (n == o - 1) {
return 1;
}
int square = n * n;
vector<T> base(square), p(square);
uint64_t start = square_sum(n - 1);
for (uint64_t i = start; i < start + square; i++) {
base[i] = ONACCI_MATRIX[i];
p[i] = ONACCI_MATRIX[i];
}
n -= o;
while (n) {
if (n & 1) {
p = square_multiply<T>(base, p, o);
}
base = square_multiply<T>(base, base, o);
n >>= 1;
}
return p[0];
}
int main()
{
for (int i = 2; i <= 36; i++) {
generate_matrix(i);
}
uint128_t n = onacci_matrix<uint128_t>(139, 4);
std::cout << n;
}
The code is incomplete, I want to use template functions to test the performance of each type, and I want the program to act like a shell, with the user entering the order of the series and the index of the desired term, the index is then used to determine the smallest bit width that is necessary to make sure there is no overflow. The order must be between 2 and 36, and the term at the index cannot be greater than the range of uint512_t
.
The code doesn’t work, it compiles successfully, but it crashes. I compiled using Visual Studio 2022, with C++20 standard, in Debug x64 mode, and I got error message:
Expression: vector subscript out of range
I am a beginner in C++ and I have no idea what caused the bug, or if there are other bugs, or if my code is completely wrong in the first place.
How to fix the code?
I have clicked retry already, and the program just crashes again. And that line 1886 is in the vector.h file, my program has only 136 lines.
2
Answers
The title ask about a high level algorithm, but at the end your actual question is about a segmentation fault which, I deduce, is localized a logical error in your code.
A quick replacement of
.operator[](i)
by.at(i)
, shows that your code goes out-of-bounds at least once in this loop:https://godbolt.org/z/ExYxzfhq1
The error has a bit more information than the MSVC assert dialog:
For any non-zero value of
start
it seems inevitable that this will happen.I don’t have a fix because I have no idea what you are trying to do.
Perhaps you mean
i < start + n
; I don’t know.IMO, this sounds like the typical logical error when making linear arrays into 2D arrays instead of using a proper 2D array library.
I didn’t know what the code was really trying to accomplish.
So when piecing it together I accidentally rewrote the stuff. Here’s my iterative any-Order fibonacci:
Here’s a simple demo Live Online:
Matrix Approach
I won’t do all the matrix logic myself, instead using Boost Ublas:
First of all, let’s create the exponentation helper like you describe:
Next, we need the
onacci_matrix
generator:To check that everything is correct, I created a benchmark comparison:
And when you run that like
It prints
This obviously matches the value from the question. Note that although at
n=19
the iterative solution is obviously faster, it’s possible to check validity even without scaling the integer type:These all correctly check out, 36x "itval == mval? true":
Multiprecision
To get the actually correct values, let’s use Boost Multiprecision. Instead of printing the entire values (which will be very large) we print the ₁₀log (basically, the number of digits).
I benchmarked
Int = uint64_t: (overflows)
Int = mp::int128_t: (overflows)
Int = mp::int1024_t: (still overflows)
The only types that will likely fit the bill are arbitrary-precision integer types:
Int = mp::cpp_int: (never overflows, no link dependency)
Int = mp::mpz_int: (can never overflow, uses GMP/MPFR)
Here’s the code with the
mpz_int
enabled: Live On ColiruSUMMARY
So while nothing above addresses your code, it may help you validating your intermediates, and perhaps getting ballpark measures on what performance to expect.