eprothro
2/26/2015 - 6:06 PM

Enumerators in Ruby | Primer

Enumerators yield for each value in the enumeration

Calling next returns the next value, then raises StopIteration when there are no values left
e = Enumerator.new do |yielder|
  a = 1
  yielder.yield a
end

e.next
=> 1

e.next
StopIteration: iteration reached an end

e.next rescue "that was an exception"
=> "that was an exception"
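Rescuing by hand is rarely needed, because Kernel#loop rescues StopIteration for you. A minimal sketch of external iteration that ends cleanly; drain is a hypothetical helper name, not part of Ruby:

```ruby
# Hypothetical helper (not part of Ruby): collect every value by calling
# next until the enumerator is exhausted.
def drain(enum)
  values = []
  loop do
    # next raises StopIteration at the end; Kernel#loop rescues it and exits
    values << enum.next
  end
  values
end

e = Enumerator.new do |yielder|
  yielder.yield 1
  yielder.yield 2
end

p drain(e)  # prints [1, 2]
```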

An Enumerator also includes the Enumerable module

You get Enumerable methods for free
e = Enumerator.new do |yielder|
  a = 0
  3.times do
    a = a + 1
    yielder << a
  end
end

e.count
=> 3
e.select{|x| x>2}
=> [3]
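A few more Enumerable methods on the same kind of enumerator, as a quick sketch. one_two_three is a hypothetical helper that rebuilds the 1, 2, 3 enumerator from above:

```ruby
# Hypothetical helper that builds the same 1, 2, 3 enumerator as above.
def one_two_three
  Enumerator.new do |yielder|
    a = 0
    3.times do
      a += 1
      yielder << a
    end
  end
end

e = one_two_three
p e.to_a                 # [1, 2, 3]
p e.reduce(:+)           # 6
p e.include?(2)          # true
p e.first(2)             # [1, 2]
p e.each_slice(2).to_a   # [[1, 2], [3]]
```

Each call runs the enumerator's block again from the start, so the calls do not interfere with each other.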

Most Enumerable methods return an Enumerator when called without a block
e = ['foo', 'bar', 'bat'].map
=> #<Enumerator: ["foo", "bar", "bat"]:map>

This means you can chain most Enumerable and Enumerator methods in useful ways
e = Enumerator.new do |yielder|
  a = 0
  3.times do
    a = a + 1
    yielder << a
  end
end

e.map.with_index do |n, i|
  puts "element #{n} @ index[#{i}]"
  n * 1000 + i
end
element 1 @ index[0]
element 2 @ index[1]
element 3 @ index[2]
=> [1000, 2001, 3002]
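The same chaining works on plain arrays, since their Enumerable methods also return enumerators when called without a block. A short sketch; the helper names are hypothetical and only there for illustration:

```ruby
# Hypothetical helpers; the names are for illustration only.
def label_words(words)
  # each_with_index without a block returns an Enumerator, so map can follow
  words.each_with_index.map { |word, i| "#{i}:#{word}" }
end

def number_words(words)
  # with_index accepts an optional starting offset
  words.map.with_index(1) { |word, i| "#{i}. #{word}" }
end

def join_pairs(words)
  # each_slice without a block is an Enumerator too
  words.each_slice(2).map { |pair| pair.join("+") }
end

words = %w[foo bar bat]
p label_words(words)   # ["0:foo", "1:bar", "2:bat"]
p number_words(words)  # ["1. foo", "2. bar", "3. bat"]
p join_pairs(words)    # ["foo+bar", "bat"]
```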

Enumerator.new accepts a size argument, so size can be answered without running the block
class Test
  def foo
    Enumerator.new(cheaper_foo_count) do |y|
      puts "expensive calculation!"

      y.yield 1
      y.yield 2

      puts "fully executed!"
    end
  end

  def cheaper_foo_count
    @cheaper_count ||= begin
      puts "cheaply calculate foo count"
      2
    end
  end
end

e = Test.new.foo
cheaply calculate foo count
=> #<Enumerator: #<Enumerator::Generator:0x007fbb73a16400>:each>

e.size
=> 2

e.count
expensive calculation!
fully executed!
=> 2
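In the example above, cheaper_foo_count runs as soon as foo is called, because its return value is what gets passed to Enumerator.new. Enumerator.new also accepts a callable as the size, which defers even the cheap calculation until size is called. A sketch under that assumption; LazySized and log are hypothetical names:

```ruby
class LazySized
  attr_reader :log

  def initialize
    @log = []
  end

  def foo
    # A lambda (any callable) given as the size is only invoked by #size
    Enumerator.new(-> { cheaper_foo_count }) do |y|
      @log << :expensive_calculation
      y.yield 1
      y.yield 2
    end
  end

  def cheaper_foo_count
    @log << :cheap_count
    2
  end
end

o = LazySized.new
e = o.foo
p o.log    # []
p e.size   # 2
p o.log    # [:cheap_count]
p e.count  # 2
p o.log    # [:cheap_count, :expensive_calculation]
```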

Use enum_for to provide an interface that works with or without a block

enum_for(:my_enumerating_method) is just syntactic sugar that handles the case where your method is called without a block. It also lets you return an enumerator without calling Enumerator.new yourself, so the code can focus on your looping/breaking/yielding logic.

class Test
  def foo
    return enum_for(:foo) unless block_given?

    yield 1
    yield 2
  end
end

e = Test.new.foo
=> #<Enumerator: #<Test:0x007fe1319a0410>:foo>

e.next
=> 1

Test.new.foo do |x|
  puts x
end
1
2
=> nil
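When the enumerating method takes arguments, they have to be forwarded to enum_for as well, otherwise the enumerator calls the method with its defaults. A small sketch; Countdown is a hypothetical class, and __method__ is used only to avoid repeating the method name:

```ruby
class Countdown
  def initialize(from)
    @from = from
  end

  def each(step = 1)
    # Forward the method name and its arguments to the enumerator
    return enum_for(__method__, step) unless block_given?

    @from.step(1, -step) { |n| yield n }
  end
end

p Countdown.new(5).each(2).to_a  # [5, 3, 1]
p Countdown.new(3).each.next     # 3

Countdown.new(2).each { |n| puts n }  # prints 2, then 1
```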

enum_for enumerates lazily and efficiently
class Test
  def foo
    return enum_for(:foo) unless block_given?

    puts "expensive calculation!"

    yield 1
    yield 2

    puts "fully executed!"
  end
end

e = Test.new.foo
=> #<Enumerator: #<Test:0x007fe1319a0410>:foo>

e.next
expensive calculation!
=> 1

e.next
=> 2

e.next
fully executed!
StopIteration: iteration reached an end

Test.new.foo do |x|
  puts x
end

expensive calculation!
1
2
fully executed!
=> nil
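Because the method body only runs as far as the caller asks, the same pattern works for sequences that never end. A sketch with a hypothetical Sequences.fibonacci method:

```ruby
module Sequences
  # Yields Fibonacci numbers forever; safe only because callers stop early
  def self.fibonacci
    return enum_for(:fibonacci) unless block_given?

    a, b = 0, 1
    loop do
      yield a
      a, b = b, a + b
    end
  end
end

p Sequences.fibonacci.first(5)                       # [0, 1, 1, 2, 3]
p Sequences.fibonacci.lazy.select(&:even?).first(3)  # [0, 2, 8]
```

The lazy call matters in the second line: a plain select would try to consume the whole infinite sequence before returning.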

enum_for enumerates in isolation
class Test
  def foo
    return enum_for(:foo) unless block_given?

    @local_variable = 0

    loop do
      raise "broken!, should not print #{@local_variable + 1}" if @local_variable >= 2
      yield @local_variable += 1
    end

    puts "done!"
  end
end

o = Test.new
=> #<Test:0x007ff4141186d0>
e1 = o.foo
=> #<Enumerator: #<Test:0x007ff4141186d0>:foo>
e2 = o.foo
=> #<Enumerator: #<Test:0x007ff4141186d0>:foo>

e1.next
=> 1
e2.next
=> 1
e2.next
=> 2
e1.next
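RuntimeError: broken!, should not print 3

Both enumerators share @local_variable, which is an instance variable despite its name. With a true local variable, each enumerator keeps its own state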
class Test
  def foo
    return enum_for(:foo) unless block_given?

    local_variable = 0

    loop do
      raise "broken!, should not print #{local_variable + 1}" if local_variable >= 2
      yield local_variable += 1
    end

    puts "done!"
  end
end

o = Test.new
=> #<Test:0x007fe9b896aa38>
e1 = o.foo
=> #<Enumerator: #<Test:0x007fe9b896aa38>:foo>
e2 = o.foo
=> #<Enumerator: #<Test:0x007fe9b896aa38>:foo>

e1.next
=> 1
e2.next
=> 1
e2.next
=> 2
e1.next
=> 2

When yielding to the yielder, you can use the << shorthand

e = Enumerator.new do |yielder|
  a = 0
  3.times do
    a = a + 1
    yielder << a
  end
end

e.next
=> 1
e.next
=> 2
e.next
=> 3
e.next
StopIteration: iteration reached an end
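Unlike yield, << returns the yielder itself, so pushes can be chained. A minimal sketch; chained is a hypothetical helper name:

```ruby
# Hypothetical helper: builds an enumerator with chained << calls.
def chained
  Enumerator.new do |yielder|
    # << returns the yielder, so each push can feed the next one
    yielder << 1 << 2 << 3
  end
end

p chained.to_a  # [1, 2, 3]
```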

You can yield multiple arguments as well

Yielding with << requires an array
e = Enumerator.new do |yielder|
  yielder << [1, 2]
end

e.each do |a|
  puts "#{a.class}: #{a}"
end
Array: [1, 2]  # <= here is the difference when using `<<` vs `#yield`

e.each do |a, b|
  puts "#{a.class}: #{a}"
  puts "#{b.class}: #{b}"
end
Fixnum: 1
Fixnum: 2

e.each do |a, b, c|
  puts "#{a.class}: #{a}"
  puts "#{b.class}: #{b}"
  puts "#{c.class}: #{c}"
end

Fixnum: 1
Fixnum: 2
NilClass:

Yielding with yield allows multiple flat parameters
e = Enumerator.new do |yielder|
  yielder.yield 1, 2
end

e.each do |a|
  puts "#{a.class}: #{a}"
end
Fixnum: 1

e.each do |a, b|
  puts "#{a.class}: #{a}"
  puts "#{b.class}: #{b}"
end
Fixnum: 1
Fixnum: 2

e.each do |a, b, c|
  puts "#{a.class}: #{a}"
  puts "#{b.class}: #{b}"
  puts "#{c.class}: #{c}"
end

Fixnum: 1
Fixnum: 2
NilClass:
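With external iteration the two styles are harder to tell apart: next returns the same array for both, and only next_values shows the difference. A sketch, with array_enum and flat_enum as hypothetical helper names:

```ruby
# Hypothetical helpers: one enumerator per yielding style.
def array_enum
  Enumerator.new { |yielder| yielder << [1, 2] }
end

def flat_enum
  Enumerator.new { |yielder| yielder.yield 1, 2 }
end

p array_enum.next         # [1, 2]
p flat_enum.next          # [1, 2]

# next_values always returns an array of the yielded arguments
p array_enum.next_values  # [[1, 2]]
p flat_enum.next_values   # [1, 2]

# to_a packs multiple flat values into one array per iteration
p array_enum.to_a         # [[1, 2]]
p flat_enum.to_a          # [[1, 2]]
```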

Flat arguments are arguably the prettier style, but I recommend the array, since your enumerator has to play well with other code

For example, another class may proxy the block and assume each element is yielded as a single value

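class SomeProxyClass
  def do_stuff_on_a_collection(collection = [], *attributes, &block)
    _map_collection(collection, &block)
  end

  def _map_collection(collection)
    collection.map do |element|
      # Only a single value is forwarded to the caller's block
      _scope { yield element }
    end
  end

  # Placeholder (an assumption) for whatever wrapping the proxy does per call
  def _scope
    yield
  end
end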

proxy_object.do_stuff_on_a_collection my_enum_with_multiple_block_args do |a, b|
  # I'll only have a AND b if they were in an array!
  # If they were passed in a flat list of args, I'll only get the first one!
end
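A runnable sketch of the same problem. EachProxy is a hypothetical, simplified stand-in that forwards elements with each rather than map, so that its behavior matches the single-parameter each examples shown earlier:

```ruby
# Hypothetical proxy: forwards each element to the caller's block,
# assuming one value per element.
class EachProxy
  def collect_from(collection)
    results = []
    collection.each { |element| results << yield(element) }
    results
  end
end

array_enum = Enumerator.new { |yielder| yielder << [1, 2] }
flat_enum  = Enumerator.new { |yielder| yielder.yield 1, 2 }

proxy = EachProxy.new
p proxy.collect_from(array_enum) { |a, b| [a, b] }  # [[1, 2]]
p proxy.collect_from(flat_enum)  { |a, b| [a, b] }  # [[1, nil]]
```

The array survives the extra hop because it travels as one value and is destructured at the end; the flat arguments lose everything after the first as soon as the proxy's single-parameter block receives them.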