TL;DR: for the fastest solution, go to Solution 1.
Solutions:
Input data:
import numpy as np
data = np.array([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16])
Output:
a
array([[ 1,  2,  3,  4],
       [ 9, 10, 11, 12]])
b
array([[ 5,  6,  7,  8],
       [13, 14, 15, 16]])
NumPy:
Solution 1: Fastest and generalized
Reshape the array to shape (-1, m, n), where m is the number of output arrays and n is the chunk size, so that the chunks are already laid out along the last axis. A list comprehension then slices out one output array per position.
reshaped[:, i, :] selects the i-th chunk of every group from the 3D array.
Code below:
reshaped = data.reshape(-1, m, n)
chunks = [reshaped[:, i, :] for i in range(m)]
For 1D output arrays:
reshaped = data.reshape(-1, m, n)
chunks = [reshaped[:, i, :].ravel() for i in range(m)]
Above is the best and cleanest answer!
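As a quick check of Solution 1 against the 16-item input, here is a minimal sketch. The values m = 2 (number of output arrays) and n = 4 (chunk size) are my assumptions taken from the expected output; the snippets above leave them undefined.

```python
import numpy as np

data = np.arange(1, 17)  # same values as the example input
m, n = 2, 4              # assumed: m output arrays, chunks of n items

# Shape (2, 2, 4): group index, position within the group, items in the chunk
reshaped = data.reshape(-1, m, n)

# 2D results: one row per chunk
a, b = [reshaped[:, i, :] for i in range(m)]
print(a)  # [[ 1  2  3  4] [ 9 10 11 12]]
print(b)  # [[ 5  6  7  8] [13 14 15 16]]

# 1D results: the same chunks flattened
a_flat, b_flat = [reshaped[:, i, :].ravel() for i in range(m)]
print(a_flat)  # [ 1  2  3  4  9 10 11 12]
```

Note that the reshape only works when len(data) is a multiple of m * n; otherwise NumPy raises a ValueError.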
Solution 2: Slower, with array_split
np.array_split splits an array into a given number of chunks, which is not quite what we want: we want chunks that each hold a given number of items.
(data.shape[0] + 3) // 4 computes how many chunks of 4 are needed: adding 3 before the floor division rounds the result up, so any leftover items still get a chunk.
[::2] and [1::2] then send every other chunk to each of the two lists.
Code below:
chunks = np.array_split(data, (data.shape[0] + 3) // 4)
a, b = chunks[::2], chunks[1::2]
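A small sketch of what this produces, plus one caveat that I believe is worth knowing: when the length is not a multiple of 4, np.array_split balances the section sizes instead of filling chunks of 4 first. The length-10 array below is only an illustration, not part of the original question.

```python
import numpy as np

data = np.arange(1, 17)
n_sections = (data.shape[0] + 3) // 4  # ceiling of 16 / 4 -> 4
chunks = np.array_split(data, n_sections)
a, b = chunks[::2], chunks[1::2]  # two lists of 1D arrays, not 2D arrays

# Caveat: 10 items -> 3 sections, balanced as 4, 3, 3 rather than 4, 4, 2
uneven = np.array_split(np.arange(10), (10 + 3) // 4)
uneven_sizes = [len(c) for c in uneven]
print(uneven_sizes)  # [4, 3, 3]
```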
Solution 3: Another fast one, with reshape
I reshape the array into a 2D array with rows of 4 items; data.shape[0] // 4 uses floor division to get the number of rows.
[::2] and [1::2] then send every other row to each of the two arrays.
Code below:
chunks = data.reshape((data.shape[0] // 4, 4))
a, b = chunks[::2], chunks[1::2]
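One likely reason the reshape-based solutions are fast is that reshaping a contiguous array and slicing it with a step both return views, so no data is copied. The sketch below checks this with np.shares_memory on the small example; it is an illustration, not a benchmark.

```python
import numpy as np

data = np.arange(1, 17)
chunks = data.reshape((data.shape[0] // 4, 4))  # shape (4, 4)
a, b = chunks[::2], chunks[1::2]

# Both results still point into the buffer of the original array
a_is_view = np.shares_memory(a, data)
b_is_view = np.shares_memory(b, data)
print(a_is_view, b_is_view)  # True True
```

A consequence is that writing into a or b also changes data; call .copy() on the results if independent arrays are needed.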
Solution 4: Reshape and chunking
I use reshape with (-1, 8) to turn the array into rows of 8 items.
If you don't understand what -1 does, check out the post here.
As mentioned in the comments by @BallpointBen:
... When reshaping an array, the new shape must contain the same number of elements as the old shape, meaning the products of the two shapes' dimensions must be equal. When using a -1, the dimension corresponding to the -1 will be the product of the dimensions of the original array divided by the product of the dimensions given to reshape so as to maintain the same number of elements.
I then use NumPy slicing to split each row into its first 4 and its last 4 columns.
Code below:
reshaped = data.reshape(-1, 8)
a = reshaped[:, :4]
b = reshaped[:, 4:]
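To make the -1 concrete for the example input: 16 elements divided by 8 columns means the -1 is inferred as 2 rows. A minimal sketch:

```python
import numpy as np

data = np.arange(1, 17)
reshaped = data.reshape(-1, 8)
print(reshaped.shape)  # (2, 8), the -1 was inferred as 16 // 8 = 2

a = reshaped[:, :4]  # first 4 columns of every row
b = reshaped[:, 4:]  # last 4 columns of every row
print(a)  # [[ 1  2  3  4] [ 9 10 11 12]]
print(b)  # [[ 5  6  7  8] [13 14 15 16]]
```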
List Comprehension
Solution 5: Using modulo for conditional slicing in a list comprehension
I iterate over every index of the array and, at each index divisible by 4, slice out a chunk of 4 items.
[::2] and [1::2] then send every other chunk to each of the two lists.
Code below:
chunks = [data[i:i+4] for i in range(len(data)) if (i % 4) == 0]
a, b = chunks[::2], chunks[1::2]
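As a side note, the modulo filter visits every index and discards three out of four. A stepped range should give the same chunks while skipping the discarded iterations; this variant is my suggestion and was not part of the timings.

```python
import numpy as np

data = np.arange(1, 17)

# Original form: test every index, keep those divisible by 4
filtered = [data[i:i+4] for i in range(len(data)) if (i % 4) == 0]

# Suggested variant: only visit the indices that start a chunk
stepped = [data[i:i+4] for i in range(0, len(data), 4)]

same = len(filtered) == len(stepped) and all(
    np.array_equal(x, y) for x, y in zip(filtered, stepped)
)
print(same)  # True

a, b = stepped[::2], stepped[1::2]
```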
Solution 6: Using a list comprehension and zip
I iterate over every index of the array and, at each index divisible by 8, slice out two consecutive chunks of 4 items.
zip(*chunks) then regroups the pairs, so the first chunk of each pair goes to a and the second to b.
Code below:
chunks = [[data[i:i+4], data[i+4:i+8]] for i in range(len(data)) if (i % 8) == 0]
a, b = zip(*chunks)
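Keep in mind that zip hands back tuples of 1D arrays rather than 2D arrays. If the array form from the expected output is wanted, np.stack can combine them, as in this sketch (the stepped range here is a shortcut of mine, equivalent to the modulo filter above):

```python
import numpy as np

data = np.arange(1, 17)
chunks = [[data[i:i+4], data[i+4:i+8]] for i in range(0, len(data), 8)]
a, b = zip(*chunks)
print(type(a).__name__)  # tuple

# Optional: stack each tuple of chunks into a 2D array
a_arr, b_arr = np.stack(a), np.stack(b)
print(a_arr)  # [[ 1  2  3  4] [ 9 10 11 12]]
print(b_arr)  # [[ 5  6  7  8] [13 14 15 16]]
```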
Timings:
To compare the speed of each solution, I'm testing on an array of exactly 499875840 items, as the OP specified:
data = np.arange(1, 499875841)
Code used to test:
import timeit
import numpy as np
m, n = 2, 4  # assumed for this test: 2 output arrays, chunks of 4 items
def U13_1():
    # Generalized
    data = np.arange(1, 499875841)
    reshaped = data.reshape(-1, m, n)
    chunks = [reshaped[:, i, :] for i in range(m)]
def U13_2():
    data = np.arange(1, 499875841)
    chunks = np.array_split(data, (data.shape[0] + 3) // 4)
    a, b = chunks[::2], chunks[1::2]
def U13_3():
    data = np.arange(1, 499875841)
    chunks = data.reshape((data.shape[0] // 4, 4))
    a, b = chunks[::2], chunks[1::2]
def U13_4():
    data = np.arange(1, 499875841)
    reshaped = data.reshape(-1, 8)
    a = reshaped[:, :4]
    b = reshaped[:, 4:]
def U13_5():
    data = np.arange(1, 499875841)
    chunks = [data[i:i+4] for i in range(len(data)) if (i % 4) == 0]
    a, b = chunks[::2], chunks[1::2]
def U13_6():
    data = np.arange(1, 499875841)
    chunks = [[data[i:i+4], data[i+4:i+8]] for i in range(len(data)) if (i % 8) == 0]
    a, b = zip(*chunks)
def hpaulj():
    # Generalized
    data = np.arange(1, 499875841)
    M, N = 2, 4  # assumed for this test: 2 output arrays, chunks of 4 items
    a, b = data.reshape(-1, M, N).transpose(1, 0, 2).reshape(-1, data.size//M)
def mozway():
    # Generalized
    data = np.arange(1, 499875841)
    out = np.split(np.argsort(np.arange(len(data))%(4*2)//4, kind='stable'), 2)
def pauls():
    data = np.arange(1, 499875841)
    i = np.arange(1, len(data)+1) % 8
    m = (i >= 1) & (i <= 4)
    a = data[m]
    b = data[~m]
U13_1 = timeit.timeit('U13_1()', 'from __main__ import U13_1', number=1)
print('U13_1 Speed:', U13_1)
U13_2 = timeit.timeit('U13_2()', 'from __main__ import U13_2', number=1)
print('U13_2 Speed:', U13_2)
U13_3 = timeit.timeit('U13_3()', 'from __main__ import U13_3', number=1)
print('U13_3 Speed:', U13_3)
U13_4 = timeit.timeit('U13_4()', 'from __main__ import U13_4', number=1)
print('U13_4 Speed:', U13_4)
U13_5 = timeit.timeit('U13_5()', 'from __main__ import U13_5', number=1)
print('U13_5 Speed:', U13_5)
U13_6 = timeit.timeit('U13_6()', 'from __main__ import U13_6', number=1)
print('U13_6 Speed:', U13_6)
hpaulj = timeit.timeit('hpaulj()', 'from __main__ import hpaulj', number=1)
print('hpaulj Speed:', hpaulj)
mozway = timeit.timeit('mozway()', 'from __main__ import mozway', number=1)
print('mozway Speed:', mozway)
pauls = timeit.timeit('pauls()', 'from __main__ import pauls', number=1)
print('pauls Speed:', pauls)
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame({'U13_1': U13_1, 'U13_2': U13_2, 'U13_3': U13_3, 'U13_4': U13_4, 'U13_5': U13_5, 'U13_6': U13_6, 'hpaulj': hpaulj, 'mozway': mozway, 'pauls': pauls}, index=[0])
plt.plot(df.columns, df.iloc[0])
plt.show()
Output:
U13_1 Speed: 0.45211420000032376
U13_2 Speed: 378.43887300000006
U13_3 Speed: 0.6713313000000198
U13_4 Speed: 0.6900013000004037
U13_5 Speed: 298.6191995999998
U13_6 Speed: 601.4342458000001
hpaulj Speed: 1.2215797999997449
mozway Speed: 21.53476950000004
pauls Speed: 4.3681056999994325
Fastest ones:
My solution #1 Generalized
My solution #3 Not Generalized
My solution #4 Not Generalized
Fastest ones (only including generalized):
My solution #1 Generalized
@hpaulj's solution Generalized
@mozway's solution Generalized
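The full-size test needs several gigabytes of memory and takes minutes for the slow solutions. For a quicker sanity check, here is a scaled-down sketch. The array size, the function names, and the choice of three solutions are my assumptions; unlike the test above, the array is built once outside the timed functions, so the absolute numbers will not match the ones reported here.

```python
import timeit
import numpy as np

SIZE = 8 * 100_000  # assumed smaller size, kept divisible by 8
data = np.arange(1, SIZE + 1)

def solution_1(m=2, n=4):
    # Generalized reshape: m output arrays, chunks of n items
    reshaped = data.reshape(-1, m, n)
    return [reshaped[:, i, :] for i in range(m)]

def solution_4():
    # Rows of 8, split into the first and last 4 columns
    reshaped = data.reshape(-1, 8)
    return reshaped[:, :4], reshaped[:, 4:]

def solution_5():
    # Pure-Python chunking with a list comprehension
    chunks = [data[i:i + 4] for i in range(0, len(data), 4)]
    return chunks[::2], chunks[1::2]

# Best of 3 repeats, each repeat calling the function 3 times
timings = {
    f.__name__: min(timeit.repeat(f, number=3, repeat=3))
    for f in (solution_1, solution_4, solution_5)
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.6f} s")
```

Taking the minimum over several repeats reduces noise from other processes, which a single number=1 run cannot do.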
Graph:
