TIL - Filtering arrays can be tricky
July 27, 2021
So, today I learned that performing an operation on an array can be tricky, especially if said array contains objects with multiple properties.
Here's an example array:
const arr = [1, 2, 3, 4, 5]
Let's say I want to remove all the elements that are higher than 2 but lower than 5.
The first thing that came to my reptile mind was the Array.prototype.splice() method. I remembered that it offers a nice way to remove an element at a specific index (arr.splice(i, 1)). What could be easier than iterating over the array with a for or forEach loop and removing each element that matches our test?
const arr = [1, 2, 3, 4, 5]
arr.forEach((element, index) => {
  if (element > 2 && element < 5) {
    arr.splice(index, 1)
  }
})
// What do we want:
// [ 1, 2, 5 ]
// What we actually get:
// [ 1, 2, 4, 5 ]
So, you probably already know what happened here. When the loop met the first element that matched the condition in the if block (value: 3, index: 2), it removed that element immediately, leaving us with [1, 2, 4, 5]. This happens partly because Array.prototype.splice() works in place: it adds, removes or replaces elements on the existing array instance instead of creating a new one, so every element after the removed one shifts one position to the left.
In the next iteration, index simply increased to 3. What do we have at position 3 in the modified array? Alas, not 4 as we would expect, but 5. Our 4 got skipped, so it was never tested and stayed in the array!
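To see the skip with my own eyes, I find it helps to record what the callback actually visits. This is just a small sketch; the traced and visited names are made up for the illustration.

```javascript
// Record which [index, element] pairs forEach visits while we splice.
const traced = [1, 2, 3, 4, 5]
const visited = []

traced.forEach((element, index) => {
  visited.push([index, element])
  if (element > 2 && element < 5) {
    traced.splice(index, 1) // shifts every later element one slot to the left
  }
})

console.log(visited) // [ [ 0, 1 ], [ 1, 2 ], [ 2, 3 ], [ 3, 5 ] ]
console.log(traced) // [ 1, 2, 4, 5 ]
```

The value 4 never shows up in visited: it slid into index 2 right after that index had been processed.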
Working solutions
The story above speaks for itself: write more than one test for your functions, because sometimes your code does not behave the way you would like it to. After a moment of self-loathing I embarked on a quest for a solution that would meet the requirements. There are a few, as it turns out:
Looping backwards
One of the solutions is to reverse the loop and iterate from the end:
const arr = [1, 2, 3, 4, 5]
for (let i = arr.length - 1; i >= 0; i--) {
  if (arr[i] > 2 && arr[i] < 5) {
    arr.splice(i, 1)
  }
}
// What do we want:
// [ 1, 2, 5 ]
// What we actually get:
// [ 1, 2, 5 ]
This solution works because splice() only shifts the elements located after the removed index, and those have already been checked. The elements in the remaining ("unchecked") part of the array keep their index values.
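If you need this in more than one place, the backwards loop can be wrapped in a small function. The sketch below is one possible shape; removeInPlace and shouldRemove are hypothetical names, not built-ins.

```javascript
// Hypothetical helper: removes, in place, every element for which
// shouldRemove returns true. Iterates from the end so that splice()
// never shifts an element that has not been checked yet.
function removeInPlace(array, shouldRemove) {
  for (let i = array.length - 1; i >= 0; i--) {
    if (shouldRemove(array[i])) {
      array.splice(i, 1)
    }
  }
  return array
}

const numbers = [1, 2, 3, 4, 5]
removeInPlace(numbers, (n) => n > 2 && n < 5)
console.log(numbers) // [ 1, 2, 5 ]
```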
New array
Okay, what about creating a completely new array that collects the elements we want to keep, and then assigning it back to the original variable? Completely doable!
let arr = [1, 2, 3, 4, 5]
let newArr = []
arr.forEach((element) => {
  if (!(element > 2 && element < 5)) {
    newArr.push(element)
  }
})
arr = newArr
// What do we want:
// [ 1, 2, 5 ]
// What we actually get:
// [ 1, 2, 5 ]
In this example we just need to invert the condition (or use an else block) so that the elements we want to remove stay OUT and the rest go IN (and not the opposite).
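One thing worth keeping in mind with this approach: reassigning the variable does not touch the original array object. The sketch below assumes some other part of the code still holds a reference to it (alias is a made-up name for that second reference).

```javascript
let original = [1, 2, 3, 4, 5]
const alias = original // a second reference to the same array object

const kept = []
original.forEach((element) => {
  if (!(element > 2 && element < 5)) {
    kept.push(element)
  }
})
original = kept // rebinds the variable; the old array itself is unchanged

console.log(original) // [ 1, 2, 5 ]
console.log(alias) // [ 1, 2, 3, 4, 5 ]
```

If every holder of the array has to see the change, an in-place approach such as the backwards loop is probably the safer pick.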
Array.prototype.filter()
Array.prototype.filter() is a built-in JavaScript method that creates a new array containing only the elements for which the callback returns a truthy value. Here's what the implementation would look like:
let arr = [1, 2, 3, 4, 5]
let newArr = arr.filter(element => {
  return !(element > 2 && element < 5)
})
arr = newArr
I find this solution the most elegant in terms of syntax. One just needs to remember that the filter() method returns a new array and does not modify the original array in place (unlike splice(), for instance).
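The same pattern carries over to arrays of objects with multiple properties. Here is a minimal sketch; the products data and its name and price fields are invented purely for the example.

```javascript
// Hypothetical data: the `name` and `price` fields are made up for this example.
const products = [
  { name: 'pencil', price: 1 },
  { name: 'pen', price: 2 },
  { name: 'notebook', price: 3 },
  { name: 'marker', price: 4 },
  { name: 'stapler', price: 5 },
]

// Keep everything except the products priced above 2 and below 5.
const remaining = products.filter(({ price }) => !(price > 2 && price < 5))

console.log(remaining.map((product) => product.name)) // [ 'pencil', 'pen', 'stapler' ]
console.log(products.length) // 5, because the source array is left untouched
```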
Summary
So, which approach is the most efficient? For small data sets the answer is: the one that suits you best.
However, performance-wise, a good old for loop is often the fastest. Both filter() and forEach() invoke the callback once per element of the array (whereas a traditional for loop does not), and that overhead can add up, although modern JavaScript engines optimize much of it away. It does not make a noticeable difference for our array containing five elements, but imagine a situation where you need to filter an array containing a few million elements in a time-constrained environment. Keep in mind that calling splice() inside the loop shifts all the following elements on every removal, which gets expensive on big arrays too. A traditional for loop, although a bit harder to read, is worth considering for operations on big arrays, ideally after measuring.
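For big arrays, one way to keep a plain for loop while avoiding repeated splice() calls is to compact the array in a single pass. This is only a sketch under my own assumptions (keepInPlace and shouldKeep are hypothetical names), and I have not benchmarked it, so measure before relying on it.

```javascript
// Hypothetical helper: keeps only the elements for which shouldKeep
// returns true, in place and in a single pass over the array.
function keepInPlace(array, shouldKeep) {
  let writeIndex = 0
  for (let readIndex = 0; readIndex < array.length; readIndex++) {
    const element = array[readIndex]
    if (shouldKeep(element)) {
      array[writeIndex] = element // move the kept element towards the front
      writeIndex++
    }
  }
  array.length = writeIndex // drop the leftover tail
  return array
}

const values = [1, 2, 3, 4, 5]
keepInPlace(values, (n) => !(n > 2 && n < 5))
console.log(values) // [ 1, 2, 5 ]
```

Each element is read once and written at most once, so nothing gets shifted repeatedly the way it does with splice() inside a loop.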
Hope this article cleared things up when it comes to filtering arrays.