Array slicing


In computer programming, array slicing is an operation that extracts a subset of elements from an array and packages them as another array, possibly in a different dimension from the original.
Common examples of array slicing are extracting a substring from a string of characters, the "ell" in "hello", extracting a row or column from a two-dimensional array, or extracting a vector from a matrix.
Depending on the programming language, an array slice can be made out of non-consecutive elements. Also depending on the language, the elements of the new array may be aliased to those of the original array.

Details

For "one-dimensional" arrays — vectors, sequence, strings etc. — the most common slicing operation is extraction of zero or more consecutive elements. Thus, if we have a vector containing elements, and we want to create an array slice from the 3rd to the 6th items, we get. In programming languages that use a 0-based indexing scheme, the slice would be from index 2 to 5.
Reducing the range of any index to a single value effectively eliminates that index. This feature can be used, for example, to extract one-dimensional slices or two-dimensional slices from a three-dimensional array. However, since the range can be specified at run-time, type-checked languages may require an explicit notation to actually eliminate the trivial indices.
General array slicing can be implemented by referencing every array through a dope vector or descriptor — a record that contains the address of the first array element, and then the range of each index and the corresponding coefficient in the indexing formula. This technique also allows immediate array transposition, index reversal, subsampling, etc. For languages like C, where the indices always start at zero, the dope vector of an array with d indices has at least 1 + 2d parameters. For languages that allow arbitrary lower bounds for indices, like Pascal, the dope vector needs 1 + 3d entries.
If the array abstraction does not support true negative indices, then negative indices for the bounds of the slice for a given dimension are sometimes used to specify an offset from the end of the array in that dimension. In 1-based schemes, -1 generally would indicate the second-to-last item, while in a 0-based system, it would mean the very last item.

History

The concept of slicing was surely known even before the invention of compilers. Slicing as a language feature probably started with FORTRAN, more as a consequence of non-existent type and range checking than by design. The concept was also alluded to in the preliminary report for the IAL in that the syntax allowed one or more indices of an array element to be omitted when used as an actual parameter.
Kenneth Iverson's APL had very flexible multi-dimensional array slicing, which contributed much to the language's expressive power and popularity.
ALGOL 68 introduced comprehensive multi-dimension array slicing and trimming features.
Array slicing facilities have been incorporated in several modern languages, such as Ada 2005, Boo, Cobra, D, Fortran 90, Go, Rust, Matlab, Perl, Python, S-Lang, Windows PowerShell and the mathematical/statistical languages GNU Octave, S and R.

Timeline of slicing in various programming languages

1966: Fortran 66">Fortran#FORTRAN 66">Fortran 66

The Fortran 66 programmers were only able to take advantage of slicing matrices by row, and then only when passing that row to a subroutine:

SUBROUTINE PRINT V
REAL VEC
PRINT *,
END
PROGRAM MAIN
PARAMETER
REAL MATRIX
DATA MATRIX/1, 1, 1, 2, 4, 8, 3, 9, 27/
CALL PRINT V
END

Result:

2.000000 4.000000 8.000000

Note that there is no dope vector in FORTRAN 66 hence the length of the slice must also be passed as an argument - or some other means - to the SUBROUTINE. 1970s Pascal and C had similar restrictions.

1968: [Algol 68]

Algol68 final report contains an early example of slicing, slices are specified in the form:
¢ for computers with extended character sets ¢
or:
# FOR COMPUTERS WITH ONLY 6 BIT CHARACTERS. #
Both bounds are inclusive and can be omitted, in which case they default to the declared array bounds. Neither the stride facility, nor diagonal slice aliases are part of the revised report.
Examples:
real a :=,, ); # declaration of a variable matrix #
real c =,, ); # constant matrix, the size is implied #
refreal row := a; # alias/ref to a row slice #
refreal col2 = a; # permanent alias/ref to second column #
print ); # second column slice #
print ); # last row slice #
print ); # last column slice #
print ); # leading 2-by-2 submatrix "slice" #
+1.000010+0 +4.000010+0 +9.000010+0
+3.000010+0 +9.000010+0 +2.700010+1
+1.000010+0 +8.000010+0 +2.700010+1
+1.000010+0 +1.000010+0 +2.000010+0 +4.000010+0

1970s: [MATLAB]


> A = round % 3x4x5 three-dimensional or cubic array
> A % 3x4 two-dimensional array along first and second dimensions
ans =
8 3 5 7
8 9 1 4
4 4 2 5
> A % 3x2 two-dimensional array along first and second dimensions
ans =
3 5
9 1
4 2
> A % 2x4 two-dimensional array using the 'end' keyword; works with GNU Octave 3.2.4
ans =
6 1 4 6
10 1 3 1
> A % single-dimension array along second dimension
ans =
8 3 5 7
> A % single value
ans = 3

1976: S">S programming language">S/R">R programming language">R

Arrays in S and GNU R are always one-based, thus the indices of a new slice will begin with one for each dimension, regardless of the previous indices. Dimensions with length of one will be dropped. Dimension names will be preserved.

> A <- array # 3x4x5 three-dimensional or cubic array
> A # 3x4 two-dimensional array along first and second dimensions

25 28 31 34
26 29 32 35
27 30 33 36
> A # 3x2x1 cubic array subset
,, 1

28 31
29 32
30 33
> A # single-dimension array along first dimension
28 29 30
> A # single value
28

1977: Fortran 77">Fortran#FORTRAN 77">Fortran 77

The Fortran 77 standard introduced the ability to slice and concatenate strings:

PROGRAM MAIN
PRINT *, 'ABCDE'
END

Produces:
BCD
Such strings could be passed by reference to another subroutine, the length would also be passed transparently to the subroutine as a kind of short dope vector.

SUBROUTINE PRINT S
CHARACTER *STR
PRINT *, STR
END
PROGRAM MAIN
CALL PRINT S
END

Again produces:
BCD

1979: Sinclair_BASIC">Sinclair_BASIC#Sinclair_BASIC">Sinclair_BASIC ZX80/81/Spectrum

The standard ROM of the ZX80/81/Spectrum offers BASIC with the ability to slice and concatenate strings:
in the command part which points out the needed array slice the x and y value can be omitted giving the meaning to use all chained array cells or. With multidimensional arrays, the slicing is only possible with the last level dimension.

10 LET a$="ABCDE"
20 PRINT a$

Produces:
BCD

10 LET a$="ABCDE"
20 LET b$=a$+a$+a$
30 PRINT b$

Produces:
DEBCA

1983: Ada 83">Ada (programming language)">Ada 83 and above

Ada 83 supports slices for all array types. Like Fortran 77 such arrays could be passed by reference to another subroutine, the length would also be passed transparently to the subroutine as a kind of short dope vector.

with Text_IO;
procedure Main is
Text : String := "ABCDE";
begin
Text_IO.Put_Line ;
end Main;

Produces:
BCD
Note: Since in Ada indices are n-based the term Text will result in an Array with the base index of 2.
The definition for Text_IO.Put_Line is:

package Ada.Text_IO is

procedure Put_Line;

The definition for String is:

package Standard is
subtype Positive is Integer range 1.. Integer'Last;
type String is array of Character;
pragma Pack;

As Ada supports true negative indices as in type History_Data_Array is array of History_Data; it places no special meaning on negative indices. In the example above the term Some_History_Data would slice the History_Data from 31 BC to 30 AD.

1987: [Perl]

If we have
@a = ;
as above, then the first 3 elements, middle 3 elements and last 3 elements would be:

@a; #
@a; #
@a; #

Perl supports negative list indices. The -1 index is the last element, -2 the penultimate element, etc.
In addition, Perl supports slicing based on expressions, for example:

@a; # 4th element until the end
@a; # 1st, 4th and 7th element
@a; # every 3rd element

1991: Python">Python (programming language)">Python

If you have the following list:

>>> nums =

Then it is possible to slice by using a notation similar to element retrieval:

>>> nums # no slicing
>>> nums # from index 0 until index 3
>>> nums
>>> nums

Note that Python allows negative list indices. The index -1 represents the last element, -2 the penultimate element, etc.
Python also allows a step property by appending an extra colon and a value. For example:

>>> nums
>>> nums # nums
>>> nums # starting at index 0 and getting every third element
>>> nums # from index 1 until index 5 and getting every second element

The stride syntax was introduced in the second half of the 1990s, as a result of requests put forward by scientific users in the Python "matrix-SIG".
Slice semantics potentially differ per object; new semantics can be introduced when operator overloading the indexing operator. With Python standard lists, every slice is a copy. Slices of NumPy arrays, by contrast, are views onto the same underlying buffer.

1992: Fortran 90">Fortran#Fortran_90">Fortran 90 and above

In Fortran 90, slices are specified in the form

lower_bound:upper_bound

Both bounds are inclusive and can be omitted, in which case they default to the declared
array bounds. Stride defaults to 1. Example:

real, dimension:: a ! declaration of a matrix

print *, a ! second column
print *, a ! last row
print *, a ! leading 10-by-10 submatrix

1994: Analytica">Analytica (software)">Analytica

Each dimension of an array value in Analytica is identified by an Index variable. When slicing or subscripting, the syntax identifies the dimension over which you are slicing or subscripting by naming the dimension. Such as:

Index I := 1..5
Index J :=
Variable X := Array
X -> 20
X -> Array
X -> Array
X

Naming indexes in slicing and subscripting is similar to naming parameters in function calls instead of relying on a fixed sequence of parameters. One advantage of naming indexes in slicing is that the programmer does not have to remember the sequence of Indexes, in a multidimensional array. A deeper advantage is that expressions generalize automatically and safely without requiring a rewrite when the number of dimensions of X changes.

1998: S-Lang">S-Lang (programming language)">S-Lang

Array slicing was introduced in version 1.0. Earlier versions did not
support this feature.
Suppose that A is a 1-d array such as

A = ; % A =

Then an array B of first 5 elements of A may be created using

B = A:4;

Similarly, B may be assigned to an array of the last 5 elements of A via:

B = A-5:;

Other examples of 1-d slicing include:

A % The last element of A
A % All elements of A
A::2 % All even elements of A
A % All odd elements of A
A % All even elements in the reversed order
A, % Elements 0-3 and 10-14

Slicing of higher-dimensional arrays works similarly:

A % The last row of A
A1:5], , is an array of 10 integers. Then
A is equivalent to an array of the first 10 elements
of A. A practical example of this is a sorting
operation such as:

I = array_sort; % Obtain a list of sort indices
B = A; % B is the sorted version of A
C = A; % Same as above but more concise.

1999: D">D programming language">D

Consider the array:

int a = ;

Take a slice out of it:

int b = a;

and the contents of b will be . The first index of the slice is inclusive, the second is exclusive.

auto c = a;

means that the dynamic array c now contains because inside the the $ symbol refers to the length of the array.
D array slices are aliased to the original array, so:

b = 10;

means that a now has the contents . To create a copy of the array data, instead of only an alias, do:

auto b = a.dup;

Unlike Python, D slice bounds don't saturate, so code equivalent to this Python code is an error in D:

>>> d =
>>> d

2004: [SuperCollider]

The programming language SuperCollider implements some concepts from J/APL. Slicing looks as follows:

a = // assign an array to the variable a
a // return the first two elements of a
a // return the first two elements of a: the zero can be omitted
a // return the element 3 till last one
a0, 3 // return the first and the fourth element of a
a0, 3 = // replace the first and the fourth element of a
a = // replace the two last elements of a
// assign a multidimensional array to the variable a
a = 0, 1, 2, 3, 4], , , fish">[Friendly interactive shell">fishArrays in fish are always one-based, thus the indices of a new slice will begin with one, regardless of the previous indices.

> set A # $A is an array with the values 3, 5, 7, 9, 11
> echo $A # Print the first two elements of $A
3 5
> set B $A # $B contains the first and second element of $A, i.e. 3, 5
> set -e A; echo $A # Erase the third and fifth elements of $A, print $A
3 5 9

2006: Cobra">Cobra (programming language)">Cobra

Cobra supports Python-style slicing. If you have a list

nums =

, then the first 3 elements, middle 3 elements, and last 3 elements would be:

nums # equals
nums # equals
nums # equals

Cobra also supports slicing-style syntax for 'numeric for loops':

for i in 2 : 5
print i
  1. prints 2, 3, 4
for j in 3
print j
  1. prints 0, 1, 2

2006: [Windows PowerShell]

Arrays are zero-based in PowerShell and can be defined using the comma operator:

PS > $a = 2, 5, 7, 3, 8, 6, 4, 1
PS > # Print the first two elements of $a:
PS > "$"
2 5
PS > # Take a slice out of it using the range operator:
PS > "$"
7 3 8 6
PS > # Get the last 3 elements:
PS > "$"
6 4 1
PS > # Return the content of the array in reverse order:
PS > "$" # Length is a property of System.Object
1 4 6 8 3 7 5 2

2009: Go">Go (programming language)">Go

Go supports Python-style syntax for slicing. Arrays and slices can be sliced. If you have a slice

nums := int

then the first 3 elements, middle 3 elements, last 3 elements, and a copy of the entire slice would be:

nums // equals int
nums // equals int
nums // equals int
nums // equals int

Slices in Go are reference types, which means that different slices may refer to the same underlying array.

2010: [Cilk Plus]

Cilk Plus supports syntax for array slicing as an extension to C and C++.

array_base // All of vector A
B // Elements 2 to 7 of vector B
C // Column 5 of matrix C
D // Elements 0, 2, 4 of vector D

Cilk Plus's array slicing differs from Fortran's in two ways:
  • the second parameter is the length instead of the upper bound, in order to be consistent with standard C libraries;
  • slicing never produces a temporary, and thus never needs to allocate memory. Assignments are required to be either non-overlapping or perfectly overlapping, otherwise the result is undefined.

    2012: Julia">Julia_(programming language)">Julia

is like that of Matlab, but uses square brackets. Example:

julia> x = rand
4x3 Array:
0.323877 0.186253 0.600605
0.404664 0.894781 0.0955007
0.223562 0.18859 0.120011
0.149316 0.779823 0.0690126
julia> x # get the second column.
4-element Array:
0.186253
0.894781
0.18859
0.779823
julia> x # get the first row.
1x3 Array:
0.323877 0.186253 0.600605
julia> x # get the submatrix spanning rows 1,2 and columns 2,3
2x2 Array:
0.186253 0.600605
0.894781 0.0955007