Using Higher-Order Methods Everywhere

Filter is one of the higher-order sequence methods in Swift that can be used and abused in surprising ways. Using it for what it’s not intended for can really drag on your efficiency, though. Say you want to find the first element of a sequence that matches some predicate:

someSequence.filter(somePredicate).first

That’s no good. filter has walked the entire sequence, testing every element. Then, it allocated and filled an array. Don’t worry about it, you say. Check out the lazy function:

lazy(someSequence).filter(somePredicate).first

Except that it still walks along the whole sequence. Really: first causes the whole thing to be evaluated.

You grumble. But hey, you know about generators, right? Add another line, and we’re in business:

var g = lazy(someSequence).filter(somePredicate).generate()
g.next()

And you’d be right. The g.next() returns exactly what you’d expect it to: it’s even optional, like first.

But there’s a lot of overhead. And it’s not very nice looking. Compare it to the more imperative version:

for x in someSequence where somePredicate(x) {
  // do thing
}

I think the loop is better. It’s still not the best though. I want to be able to chain my methods: that loop looks fine there, but if it was at the end of a chain of maps and whatnot, it would be ugly as sin. Solution? Extension!

extension SequenceType {
  func match(@noescape pred: Generator.Element -> Bool) -> Generator.Element? {
    for x in self where pred(x) { return x }
    return nil
  }
}

Now our code looks like this:

someSequence.match(somePredicate)

But there’s an extra problem here: no-one knows what match does. It’s a very unspecific word, with several possible meanings. It decreases readability. Luckily again, we can be Swifty, and overload another function: first.

extension SequenceType {
  func first(@noescape pred: Generator.Element -> Bool) -> Generator.Element? {
    for x in self where pred(x) { return x }
    return nil
  }
}
someSequence.first(somePredicate)

This, I think, is very understandable. It fits in with the other standard library functions that have versions with and without closures (sort, for instance). You could do the same with last:

extension CollectionType where Index : BidirectionalIndexType {
  func last(@noescape pred: Generator.Element -> Bool) -> Generator.Element? {
    for x in reverse() where pred(x) { return x }
    return nil
  }
}

Here’s another use of filter that causes problems:

someSequence.filter(somePredicate).count

This isn’t as bad as first, since you really do need to walk the whole sequence, but it’s still allocating and filling an array for no reason. To avoid the array, you can use reduce:

someSequence.reduce(0) { (sum, el) in pred(el) ? sum + 1 : sum }

But again, it introduces a little overhead, and it’s not very clear. Why not overload count?

extension SequenceType {
  func count(@noescape predicate: Generator.Element -> Bool) -> Int {
    var i = 0
    for el in self where predicate(el) { ++i }
    return i
  }
}

I think that’s again, pretty readable. Some of the other sequence functions have obvious cousins: indexOf to indicesOf, etc:

extension CollectionType {
  
  func indicesOf(@noescape isElement: Generator.Element -> Bool) -> [Index] {
    return indices
      .filter { isElement(self[$0]) }
  }
  
  func dropFirst(@noescape drop: Generator.Element -> Bool) -> Self.SubSequence {
    return indexOf { !drop($0) }
      .map { self[$0..<endIndex] } ??
      self[startIndex..<endIndex]
  }
  
  func prefix(@noescape isElement: Generator.Element -> Bool) -> Self.SubSequence {
    return indexOf { !isElement($0) }
      .map { self[startIndex..<$0] } ??
      self[startIndex..<endIndex]
  }
}

extension RangeReplaceableCollectionType {
  mutating func popFirst(@noescape pop: Generator.Element -> Bool) -> Generator.Element? {
    return indexOf(pop)
      .map { self.removeAtIndex($0) }
  }
}

Best practice says we should build our programs out of small, modular functions. That’s great, until someone else tries to understand what “GTMergeBlack2” means. Overload methods that already exist, in the same way that the other standard library functions are overloaded, and everyone knows what’s going on.

Advertisements

One comment

  1. Pingback: Swift Utility Belt Series: Part 1

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s