This article was originally posted on the globaldev blog; they have kindly allowed me to repost it here. If you’re looking for a Ruby job in London you should check out their jobs page.
Ruby 2.1 is the next significant version of Ruby, released on Christmas Day 2013, just 10 months after 2.0.0. It comes with a whole host of changes and improvements, and this post dives into the details of what’s new.
With 2.1 Ruby moves to a new versioning scheme based on Semantic Versioning.
The scheme is MAJOR.MINOR.TEENY, so with 2.1.0 the major version is 2, the minor version is 1, and the teeny version is 0. The teeny version number takes over from the patchlevel for minor bug and security fixes. The minor version number will be used for new features that are largely backwards compatible, and major for incompatible changes that can’t be released as a minor.
This means that rather than referring to, say, 1.9.3 in general and 1.9.3-p545 specifically, you’ll refer to 2.1 in general and 2.1.1 specifically.
The plan is to release a new minor version every 12 months, so we can expect to see Ruby 2.2 on Christmas Day 2014.
After being introduced in Ruby 2.0.0, keyword arguments get a small improvement in 2.1. Required keyword arguments allow you to omit the default value for a keyword argument in the method definition, and an error will be raised if they are not given when the method is called.
# length is required
def pad(num, length:, char: "0")
num.to_s.rjust(length, char)
end
pad(42, length: 6) #=> "000042"
pad(42) #=> #<ArgumentError: missing keyword: length>
As you can see in the example above there are some cases where keyword arguments can really help disambiguate which argument is which, but there isn’t any sensible default. Now you don’t have to choose.
As strings in Ruby are mutable, any string literals must result in a new string each time they are evaluated, e.g.
def env
"development"
end
# returns new String object on each call
env.object_id #=> 70329318373020
env.object_id #=> 70329318372900
This can be quite wasteful, creating and then garbage collecting a lot of
objects. To allow you to avoid this, calling #freeze
directly on a string
literal is special cased to look up the string in a table of frozen strings.
This means the same string will be reused
def env
"development".freeze
end
# returns the same String object on each call
env.object_id #=> 70365553080120
env.object_id #=> 70365553080120
String literals used as keys in Hash literals are treated the same way, without the need to call #freeze.
a = {"name" => "Arthur"}
b = {"name" => "Ford"}
# same String object used as key in both hashes
a.keys.first.object_id #=> 70253124073040
b.keys.first.object_id #=> 70253124073040
During the development of 2.1 this feature started off as a syntax addition,
with "string"f
resulting in a frozen string. It was decided to switch to the
technique of special casing the #freeze
call on a literal as it allows for
writing code that is backwards and forwards compatible, plus subjectively many
people weren’t fond of the new syntax.
def returns the method name as a Symbol
The result of defining a method is no longer nil; instead it’s a symbol of the method’s name. The canonical example of this is making a single method private.
class Client
def initialize(host, port)
# ...
end
private def do_request(method, path, body, **headers)
# ...
end
def get(path, **headers)
do_request(:get, path, nil, **headers)
end
end
It also makes for a nice way of adding method decorators; here’s an example using Module#prepend to wrap before/after calls around a method.
module Around
def around(method)
prepend(Module.new do
define_method(method) do |*args, &block|
send(:"before_#{method}") if respond_to?(:"before_#{method}", true)
result = super(*args, &block)
send(:"after_#{method}") if respond_to?(:"after_#{method}", true)
result
end
end)
method
end
end
class Example
extend Around
around def call
puts "call"
end
def before_call
puts "before"
end
def after_call
puts "after"
end
end
Example.new.call
outputs
before
call
after
The define_method
and define_singleton_method
methods have also been
updated to return symbols rather than their proc arguments.
Integer (1) and Float (1.0) literals are a given; now we have Rational (1r) and Complex (1i) literals too.
These work really nicely with Ruby’s casting mechanism for mathematical operations, such that a rational number like one third – 1/3 in mathematical notation – can be written 1/3r in Ruby. 3i produces the complex number 0+3i, which means complex numbers can be written in standard mathematical notation: 2+3i produces the complex number 2+3i!
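A quick illustration of how these literals behave (results shown in the usual irb formatting):
# Rational literals keep fractions exact, avoiding floating point rounding
1/3r            #=> (1/3)
1/10r + 2/10r   #=> (3/10)
0.1 + 0.2       #=> 0.30000000000000004
# Complex literals read like standard mathematical notation
2+3i            #=> (2+3i)
(2+3i) * (4+5i) #=> (-7+22i)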
#to_h
The many classes that got a #to_h
method in Ruby 2.0.0 are now joined
by Array and any other class including Enumerable.
[[:id, 42], [:name, "Arthur"]].to_h #=> {:id=>42, :name=>"Arthur"}
require "set"
Set[[:id, 42], [:name, "Arthur"]].to_h #=> {:id=>42, :name=>"Arthur"}
This will come in handy with all those Enumerable methods on Hash that return an Array
headers = {"Content-Length" => 42, "Content-Type" => "text/html"}
headers.map {|k, v| [k.downcase, v]}.to_h
#=> {"content-length" => 42, "content-type" => "text/html"}
Prior to 2.1 Ruby used a global method cache; this would be invalidated for all classes whenever a method was defined, a module included, an object extended with a module, etc., anywhere in your code. This made some classes – such as OpenStruct – and some techniques – such as exception tagging – unusable for performance reasons.
This is no longer an issue: Ruby 2.1 uses a method cache based on the class hierarchy, invalidating the cache for only the class in question and any subclasses.
A method has been added to the RubyVM class to return some debugging information on the status of the method cache.
class Foo
end
RubyVM.stat #=> {:global_method_state=>133, :global_constant_state=>820, :class_serial=>5689}
# setting constant increments :global_constant_state
Foo::Bar = "bar"
RubyVM.stat(:global_constant_state) #=> 821
# defining instance method increments :class_serial
class Foo
def foo
end
end
RubyVM.stat(:class_serial) #=> 5690
# defining global method increments :global_method_state
def foo
end
RubyVM.stat(:global_method_state) #=> 134
Exceptions now have a #cause
method that will return the causing exception.
The causing exception will automatically be set when you rescue one exception
and raise another.
require "socket"
module MyProject
Error = Class.new(StandardError)
NotFoundError = Class.new(Error)
ConnectionError = Class.new(Error)
def self.get(path)
response = do_get(path)
raise NotFoundError, "#{path} not found" if response.code == "404"
response.body
rescue Errno::ECONNREFUSED, SocketError => e
raise ConnectionError
end
end
begin
MyProject.get("/example")
rescue MyProject::Error => e
e #=> #<MyProject::ConnectionError: MyProject::ConnectionError>
e.cause #=> #<Errno::ECONNREFUSED: Connection refused - connect(2) for "example.com" port 80>
end
Currently the causing error isn’t output anywhere, and rescue
won’t pay
attention to the cause, but just having the cause automatically set should be a
great help while debugging.
Exceptions also get the #backtrace_locations
method that was curiously
missing from 2.0.0. This returns Thread::Backtrace::Location objects
rather than strings, giving easier access to the details of the backtrace.
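For instance, something along these lines (the file name and line numbers are hypothetical, assuming the code lives in example.rb):
def faulty
  raise "boom"
end
begin
  faulty
rescue => e
  location = e.backtrace_locations.first
  location.path    #=> "example.rb"
  location.lineno  #=> 2
  location.label   #=> "faulty"
end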
Ruby 2.1 introduces a generational garbage collector, which divides all objects into young and old generations. During the marking phase a regular GC run will only look at the young generation, with the old being marked less frequently. Sweeping is done with the same lazy sweeping system introduced in 1.9.3. An object is promoted to the old generation when it survives a young generation run.
If you have objects in the old generation referring to objects in the young generation, but you’re only looking at the young generation, it may seem like an object doesn’t have any references, and you might incorrectly GC an in-use object. Write barriers prevent this by adding old generation objects to a ‘remember set’ when they are modified to refer to a young generation object (e.g. old_array.push(young_string)). This ‘remember set’ is then taken into account when marking the young generation.
Most generational garbage collectors need these write barriers on all objects, but with the many 3rd party C extensions available for Ruby this isn’t possible, so a workaround was devised whereby objects that aren’t write barrier protected (“shady” objects) won’t ever be promoted to the old generation. This isn’t ideal as you won’t get the full benefit of the generational GC, but it does maximise backwards compatibility.
While the marking phase is now a lot faster, the write barriers do add some overhead, and any performance gains are very dependent on what exactly your code is doing.
The GC.start
method gets two new keyword arguments, full_mark
and
immediate_sweep
. Both of these default to true.
With full_mark set to true both generations are marked; with false only the young generation is marked. With immediate_sweep set to true a full ‘stop the world’ sweep will be performed; with false a lazy sweep is performed, deferred to when it’s required and only sweeping the minimum necessary.
GC.start # trigger a full GC run
GC.start(full_mark: false) # only collect young generation
GC.start(immediate_sweep: false) # mark only
GC.start(full_mark: false, immediate_sweep: false) # minor GC
The GC.stress
debugging option can now be set to an integer flag to control
which part of the garbage collector to stress.
GC.stress = true # full GC at every opportunity
GC.stress = 1 # minor marking at every opportunity
GC.stress = 2 # lazy sweep at every opportunity
The output of GC.stat
has been updated to include some more details, and the
method itself now takes a key argument to return just the value for that key,
rather than building and returning the full hash.
GC.stat #=> {:count=>6, ... }
GC.stat(:major_gc_count) #=> 2
GC.stat(:minor_gc_count) #=> 4
GC also gets a new method latest_gc_info
which returns information about the
most recent garbage collection run.
GC.latest_gc_info #=> {:major_by=>:oldgen, :gc_by=>:newobj, :have_finalizer=>false, :immediate_sweep=>false}
Ruby now pays attention to a whole bunch of new environment variables at startup that can be used to tune the behaviour of the garbage collector.
RUBY_GC_HEAP_INIT_SLOTS
This was available before as RUBY_HEAP_MIN_SLOTS. It sets the initial number of allocated slots, and defaults to 10000.
RUBY_GC_HEAP_FREE_SLOTS
This was also available before, as RUBY_FREE_MIN. It sets the minimum number of slots that should be available after GC; new slots will be allocated if GC hasn’t freed up enough. Defaults to 4096.
RUBY_GC_HEAP_GROWTH_FACTOR
Grows the number of allocated slots by the given factor: (next slots number) = (current slots number) * (this factor). The default is 1.8.
RUBY_GC_HEAP_GROWTH_MAX_SLOTS
The maximum number of slots that will be allocated at one time. The default is 0, which means no maximum.
RUBY_GC_MALLOC_LIMIT
This one isn’t new, but it’s worth covering. It is the amount of memory that can be allocated without triggering garbage collection. It defaults to 16 * 1024 * 1024 (16MB).
RUBY_GC_MALLOC_LIMIT_GROWTH_FACTOR
The rate at which the malloc limit grows; the default is 1.4.
RUBY_GC_MALLOC_LIMIT_MAX
The maximum the malloc limit can reach. Default 32 * 1024 * 1024 (32MB).
RUBY_GC_OLDMALLOC_LIMIT
The amount the old generation can increase by before triggering a full GC. Default is 16 * 1024 * 1024 (16MB).
RUBY_GC_OLDMALLOC_LIMIT_GROWTH_FACTOR
The rate at which the old malloc limit grows. Default 1.2.
RUBY_GC_OLDMALLOC_LIMIT_MAX
The maximum the old malloc limit can reach. Default 128 * 1024 * 1024 (128MB).
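As a rough sketch, these are set in the environment when Ruby is started; the values and script name here are arbitrary:
$ RUBY_GC_HEAP_INIT_SLOTS=600000 RUBY_GC_MALLOC_LIMIT=67108864 ruby app.rb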
Ruby 2.1 adds some more tools to help track down when you’re keeping references to old/large objects and not letting the garbage collector claim them.
We now get a collection of methods to trace object allocations and report on them.
require "objspace"
module Example
class User
def initialize(first_name, last_name)
@first_name, @last_name = first_name, last_name
end
def name
"#{@first_name} #{@last_name}"
end
end
end
ObjectSpace.trace_object_allocations do
obj = Example::User.new("Arthur", "Dent").name
ObjectSpace.allocation_sourcefile(obj) #=> "example.rb"
ObjectSpace.allocation_sourceline(obj) #=> 10
ObjectSpace.allocation_class_path(obj) #=> "Example::User"
ObjectSpace.allocation_method_id(obj) #=> :name
ObjectSpace.allocation_generation(obj) #=> 6
end
The number returned by allocation_generation
is the number of garbage
collections that had been run when the object was created. So if this is a
small number then the object was created early in the lifetime of the
application.
There’s also trace_object_allocations_start
and
trace_object_allocations_stop
as alternatives to trace_object_allocations
with a block, and trace_object_allocations_clear
to clear recorded allocation
data.
Further to this it’s possible to output this information and a little more to a file or string as JSON for further analysis or visualisation.
require "objspace"
ObjectSpace.trace_object_allocations do
puts ObjectSpace.dump(["foo"].freeze)
end
outputs
{
"address": "0x007fd122123f40",
"class": "0x007fd121072098",
"embedded": true,
"file": "example.rb",
"flags": {
"wb_protected": true
},
"frozen": true,
"generation": 6,
"length": 1,
"line": 4,
"references": [
"0x007fd122123f68"
],
"type": "ARRAY"
}
You can also use ObjectSpace.dump_all
to dump the entire heap.
require "objspace"
ObjectSpace.trace_object_allocations_start
# do things ...
ObjectSpace.dump_all(output: File.open("heap.json", "w"))
Both these methods can be used without activating object allocation tracing, but you’ll get less detail in the output.
Finally there’s ObjectSpace.reachable_objects_from_root
which works similarly
to ObjectSpace.reachable_objects_from
but takes no argument and works from
the root instead. There is one slight quirk to this method in that it returns a
hash that has been put into ‘compare by identity’ mode, so you need the exact
same string objects that it uses for keys to get anything out of it.
Fortunately there is a workaround.
require "objspace"
reachable = ObjectSpace.reachable_objects_from_root
reachable = {}.merge(reachable) # workaround compare_by_identity
reachable["symbols"] #=> ["freeze", "inspect", "intern", ...
Refinements are no longer experimental and won’t generate a warning; they also get a couple of small tweaks to make them more usable.
Along with the top level #using
to activate refinements in a file, there is
now a Module#using
method to activate refinements in a module. However, the
effect of ‘using’ a refinement is still lexical; it won’t be active when reopening a module definition.
module NumberQuery
refine String do
def number?
match(/\A(0|-?[1-9][0-9]*)\z/) ? true : false
end
end
end
module Example
using NumberQuery
"42".number? #=> true
end
module Example
"42".number? #=> #<NoMethodError: undefined method `number?' for "42":String>
end
Refinement definitions are now inherited with Module#include, meaning you can group a bunch of refinements defined in separate modules into just one, and activate them all with a single using.
module BlankQuery
refine Object do
def blank?
respond_to?(:empty?) ? empty? : false
end
end
refine String do
def blank?
strip.length == 0
end
end
refine NilClass do
def blank?
true
end
end
end
module NumberQuery
refine Object do
def number?
false
end
end
refine String do
def number?
match(/\A(0|-?[1-9][0-9]*)\z/) ? true : false
end
end
refine Numeric do
def number?
true
end
end
end
module Support
include BlankQuery
include NumberQuery
end
class User
using Support
# ...
def points=(obj)
raise "points can't be blank" if obj.blank?
raise "points must be a number" unless obj.number?
@points = obj
end
end
String#scrub
String#scrub
has been added to Ruby 2.1 to help deal with strings that have
ended up with invalid bytes in them.
# create a string that can't be sensibly printed
# 'latin 1' encoded string with accented character
string = "öops".encode("ISO-8859-1")
# misrepresented as UTF-8
string.force_encoding("UTF-8")
# and mixed with a UTF-8 character
string = "¡#{string}!"
You wouldn’t ever create a string like this deliberately (or at least I hope not), but it’s not uncommon for a string that has been through a number of systems to get mangled like this.
Presented with just the end result it’s pretty much impossible to untangle it all, but we can at least get rid of the characters that are now invalid.
# replace with 'replacement character'
string.scrub #=> "¡�ops!"
# delete
string.scrub("") #=> "¡ops!"
# replace with chosen character
string.scrub("?") #=> "¡?ops!"
# yield to a block for custom replacement
# (in this case the invalid bytes as hex)
string.scrub {|bytes| "<#{bytes.unpack("H*").join}>"} #=> "¡<f6>ops!"
The same result can also be achieved by calling #encode with the current encoding and invalid: :replace as arguments.
string.encode("UTF-8", invalid: :replace) #=> "¡�ops!"
string.encode("UTF-8", invalid: :replace, replace: "?") #=> "¡?ops!"
Bignum and Rational now use the GNU Multiple Precision Arithmetic Library (GMP) to improve performance.
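The improvement mostly shows up with very large numbers; a rough (and unscientific) way to see it is to time some Bignum-heavy work on 2.0.0 and on 2.1 – a sketch with arbitrary values:
require "benchmark"

x = (2 ** 20_000) + 1
# multiplication of large Bignums is where GMP helps most
puts Benchmark.realtime { 1_000.times { x * x } }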
Setting $SAFE = 4
was intended to put Ruby in a ‘sandbox’ type mode and allow
execution of untrusted code. However it wasn’t terribly effective, required a
lot of code scattered all over Ruby, and was almost never used, so it has been
removed.
$SAFE = 4 #=> #<ArgumentError: $SAFE=4 is obsolete>
clock_gettime
Ruby now has access to the system’s clock_gettime() function through Process.clock_gettime, which allows easy access to a number of different time values. It must be called with a clock id as the first argument:
Process.clock_gettime(Process::CLOCK_REALTIME) #=> 1391705719.906066
Supplying Process::CLOCK_REALTIME
will give you a unix timestamp as the
return value. This will match Time.now.to_f
, but as it skips creating a Time
instance it’s a little bit quicker.
Another use for Process.clock_gettime
is to get access to a monotonic clock,
that is, a clock that always moves forwards, regardless of adjustments to the
system clock. This is perfect for critical timing or benchmarking.
However the monotonic clock value only makes sense when compared to another as the starting reference point is arbitrary.
start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
sleep 1
Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time #=> 1.0051147330086678
Another clock useful for benchmarking is CLOCK_PROCESS_CPUTIME_ID. Like the monotonic clock it only makes sense when compared against another reading, but it only advances while the CPU is actually doing work for the process.
start_time = Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID)
sleep 1
Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID) - start_time #=> 0.005225999999999981
These three clocks – realtime, monotonic, and CPU time – should always be available. Depending on your system you may have access to other clocks; check the documentation for the others that might be available.
To check if any of these clocks are supported you can check for the presence of the constant storing its clock id.
Process.const_defined?(:CLOCK_PROCESS_CPUTIME_ID) #=> true
Process.const_defined?(:CLOCK_THREAD_CPUTIME_ID) #=> false
There is also a Process.clock_getres
method available that can be used to
discover the resolution provided by a specific clock.
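A brief sketch of how that looks (the exact resolutions reported will vary between systems):
# resolution is reported in (float) seconds by default
Process.clock_getres(Process::CLOCK_MONOTONIC)              #=> 1.0e-09
# a unit can be requested, here nanoseconds
Process.clock_getres(Process::CLOCK_MONOTONIC, :nanosecond) #=> 1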
The included version of RubyGems has been updated to 2.2. Gemfile.lock support has been added to its basic Gemfile support, working towards the goal of merging all Bundler features into RubyGems.
The --file (or -g) option to gem install no longer requires a file name for the dependency file; it will auto-detect the Gemfile. gem install -g will also generate a Gemfile.lock if one is not present, and respect the versions it specifies if it exists.
$ ls
Gemfile
$ gem install -g
Fetching: going_postal-0.1.4.gem (100%)
Installing going_postal (0.1.4)
$ ls
Gemfile
Gemfile.lock
You can see the full list of changes in the RubyGems History File.
The bundled Rake has been updated to version 10.1.0, which removes a bunch of deprecated features. Older versions of Rake have warned about these features for quite a while, so hopefully you won’t encounter any compatibility problems.
See the full Rake release notes for versions 10.0.3 and 10.1.0 for more details.
The included version of RDoc is now at 4.1, which brings a nice update to the default template with some accessibility improvements. See the RDoc History file for the full set of changes.
A new method Process.setproctitle
has been added to set the process title
without assigning to $0
. A corresponding method Process.argv0
has also been
added to retrieve the original value of $0
even if it has been assigned to.
Say you had some code in a background processing worker that looked like the following
data.each_with_index do |datum, i|
Process.setproctitle("#{Process.argv0} - job #{i} of #{data.length}")
process(datum)
end
you’d see something like the following if you were to run ps
$ ps
PID TTY TIME CMD
339 ttys000 0:00.23 -bash
7321 ttys000 0:00.06 background.rb - job 10 of 30
Symbols now join integers and floating point numbers in being frozen.
:foo.frozen? #=> true
:foo.instance_variable_set(:@bar, "baz") #=> #<RuntimeError: can't modify frozen Symbol>
This change was made to set things up for garbage collection of symbols in a future version of Ruby.
When using private, protected, public, or module_function without arguments in a string evaluated with eval, instance_eval, or module_eval, the method visibility scope would leak out to the calling scope, such that foo in the following example would be private.
class Foo
eval "private"
def foo
"foo"
end
end
This is fixed in 2.1, so foo
would be public in this example.
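To make the difference concrete, continuing the example above (the 2.0.0 message is paraphrased):
Foo.new.foo #=> "foo"
# Under Ruby 2.0.0 the visibility leaked, so the same call raised
# NoMethodError: private method `foo' called for #<Foo>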
#untrusted? is now an alias of #tainted?
Ruby previously had two sets of methods for marking/checking objects as
untrusted, the first set, #tainted?
, #taint
, and #untaint
, and the second
#untrusted?
, #untrust
, and #trust
. These behaved the same, but set
separate flags, so an object could be untrusted, but not tainted.
These methods have been unified to set/get a single flag, with #tainted?
etc
being the preferred names and #untrusted?
etc generating warnings.
string = "foo"
string.untrust
string.tainted? #=> true
generates the warning
example.rb:2: warning: untrust is deprecated and its behavior is same as taint
return in lambda now always returns from lambda
Lambdas differ from Procs/blocks in that using return in a lambda returns from the lambda, not the enclosing method. However there was an exception to this: if a lambda was passed to a method with & and called with yield. This exception has now been removed.
def call_with_yield
yield
end
def test
call_with_yield(&lambda {return "hello from lambda"})
"hello from method"
end
test #=> "hello from method"
The example above would have returned "hello from lambda" under Ruby <= 2.0.0.
It is now possible to get details of the system’s network interfaces with
Socket.getifaddrs
. This returns an array of Socket::Ifaddr objects.
require "socket"
info = Socket.getifaddrs.find do |ifaddr|
(ifaddr.flags & Socket::IFF_BROADCAST).nonzero? &&
ifaddr.addr.afamily == Socket::AF_INET
end
info.addr.ip_address #=> "10.0.1.2"
StringScanner#[]
now accepts symbols as arguments, and will return the
corresponding named capture from the last match.
require "strscan"
def parse_ini(string)
scanner = StringScanner.new(string)
current_section = data = {}
until scanner.eos?
scanner.skip(/\s+/)
if scanner.scan(/;/)
scanner.skip_until(/[\r\n]+/)
elsif scanner.scan(/\[(?<name>[^\]]+)\]/)
current_section = current_section[scanner[:name]] = {}
elsif scanner.scan(/(?<key>[^=]+)=(?<value>.*)/)
current_section[scanner[:key]] = scanner[:value]
end
end
data
end
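Used against a small (made-up) INI document it behaves like this:
config = parse_ini(<<-INI)
; example config
[server]
host=example.com
port=8080
INI

config #=> {"server"=>{"host"=>"example.com", "port"=>"8080"}}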
YAML.safe_load
YAML
(well, Psych
, the underlying yaml implementation) has had a safe_load
method added. By default only the following classes can be deserialised:
TrueClass
, FalseClass
, NilClass
, Numeric
, String
, Array
, and
Hash
. To deserialise other classes that you know will be safe you can pass a
whitelist as an argument.
If a disallowed class is found, Psych::DisallowedClass will be raised; this can also be referenced as YAML::DisallowedClass.
require "yaml"
YAML.safe_load(":foo: 1") #=> #<Psych::DisallowedClass: Tried to load unspecified class: Symbol>
YAML.safe_load(":foo: 1", [Symbol]) #=> {:foo=>1}
Ruby’s Resolv DNS library gets basic support for one-shot multicast DNS lookups. It doesn’t support continuous queries, and can’t do service discovery, but it’s still a pretty neat new feature (check out the dnssd gem for full DNS Service Discovery support).
require "resolv"
resolver = Resolv::MDNS.new
resolver.getaddress("example.local") #=> #<Resolv::IPv4 10.0.1.2>
Combined with the resolv-replace library this allows you to use mDNS names with most Ruby networking libraries.
require "resolv-replace"
require "net/http"
Resolv::DefaultResolver.replace_resolvers([Resolv::Hosts.new, Resolv::MDNS.new])
Net::HTTP.get_response(URI.parse("http://example.local")) #=> #<Net::HTTPOK 200 OK readbody=true>
Resolv also gains the ability to query DNS LOC records.
require "resolv"
dns = Resolv::DNS.new
# find.me.uk has LOC records for all UK postcodes
resource = dns.getresource("W1A1AA.find.me.uk", Resolv::DNS::Resource::IN::LOC)
resource.latitude #=> #<Resolv::LOC::Coord 51 31 6.827 N>
resource.longitude #=> #<Resolv::LOC::Coord 0 8 37.585 W>
And the final change for Resolv: it’s now possible to get back the full DNS message with Resolv::DNS#fetch_resource.
require "resolv"
dns = Resolv::DNS.new
dns.fetch_resource("example.com", Resolv::DNS::Resource::IN::A) do |reply, reply_name|
reply #=> #<Resolv::DNS::Message:0x007f88192e2cc0 @id=55405, @qr=1, @opcode=0, @aa=0, @tc=0, @rd=1, @ra=1, @rcode=0, @question=[[#<Resolv::DNS::Name: example.com.>, Resolv::DNS::Resource::IN::A]], @answer=[[#<Resolv::DNS::Name: example.com.>, 79148, #<Resolv::DNS::Resource::IN::A:0x007f88192e1c80 @address=#<Resolv::IPv4 93.184.216.119>, @ttl=79148>]], @authority=[], @additional=[]>
reply_name #=> #<Resolv::DNS::Name: example.com.>
end
Errors from sockets have been improved to include the socket address in the message.
require "socket"
TCPSocket.new("localhost", 8080) #=> #<Errno::ECONNREFUSED: Connection refused - connect(2) for "localhost" port 8080>
Hash#shift much faster
The performance of Hash#shift has been massively improved and this, coupled with Hash being insertion ordered since Ruby 1.9, makes it practical to implement a simple least recently used cache.
class LRUCache
def initialize(size)
@size, @hash = size, {}
end
def [](key)
@hash[key] = @hash.delete(key)
end
def []=(key, value)
@hash.delete(key)
@hash[key] = value
@hash.shift if @hash.size > @size
end
end
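Used like so (a quick sketch with made-up values):
cache = LRUCache.new(2)
cache[:a] = 1
cache[:b] = 2
cache[:a]     #=> 1, :a becomes the most recently used entry
cache[:c] = 3 # over capacity, so :b (the least recently used) is shifted off
cache[:b]     #=> nil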
Queue, SizedQueue, and ConditionVariable have been sped up by implementing them in C rather than Ruby.
It is no longer possible to rescue the exception used internally by Timeout to abort the block it’s given. This is mostly an internal implementation detail that’s nothing to worry about; the Timeout::Error raised externally when the timeout is reached is unchanged and can be rescued as normal.
require "timeout"
begin
Timeout.timeout(1) do
begin
sleep 2
rescue Exception
# no longer swallows the timeout exception
end
end
rescue StandardError => e
e #=> #<Timeout::Error: execution expired>
end
Set gains #intersect?
and #disjoint?
methods. #intersect?
returns true if
the receiver and the argument have at least one value in common, and false
otherwise. #disjoint?
is the opposite and returns true if the sets have no
elements in common, false otherwise.
require "set"
a = Set[1,2,3]
b = Set[3,4,5]
c = Set[4,5,6]
a.intersect?(b) #=> true
b.intersect?(c) #=> true
a.intersect?(c) #=> false
a.disjoint?(b) #=> false
b.disjoint?(c) #=> false
a.disjoint?(c) #=> true
Another minor change to Set: #to_set called on a set will simply return self, rather than a copy.
require "set"
set = Set["foo", "bar", "baz"]
set.object_id #=> 70286489985620
set.to_set.object_id #=> 70286489985620
The WEBrick HTTP response body can now be set to anything responding to #read
and #readpartial
. Previously it had to be an instance of IO or a String. The
example below implements a class that wraps an enumerator, and then uses this
to stream out a response of the current time every second for 10 seconds.
require "webrick"
class EnumeratorIOAdapter
def initialize(enum)
@enum, @buffer, @more = enum, "", true
end
def read(length=nil, out_buffer="")
return nil unless @more
until (length && @buffer.length >= length) || !fill_buffer; end
if length
part = @buffer.slice!(0, length)
else
part, @buffer = @buffer, ""
end
out_buffer.replace(part)
end
def readpartial(length, out_buffer="")
raise EOFError if @buffer.empty? && !fill_buffer
out_buffer.replace(@buffer.slice!(0, length))
end
private
def fill_buffer
@buffer << @enum.next
rescue StopIteration
@more = false
end
end
server = WEBrick::HTTPServer.new(Port: 8080)
server.mount_proc "/" do |request, response|
enum = Enumerator.new do |yielder|
10.times do
sleep 1
yielder << "#{Time.now}\r\n"
end
end
response.chunked = true
response.body = EnumeratorIOAdapter.new(enum)
end
trap(:INT) {server.shutdown}
server.start
Numeric#step
The #step
method on Numeric can now accept the keyword arguments by:
and
to:
rather than positional arguments. The to:
argument is optional, and if
omitted it will result in an infinite sequence. If using positional arguments
you can pass nil as the first argument to get the same behaviour.
0.step(by: 5, to: 20) do |i|
puts i
end
outputs:
0
5
10
15
20
0.step(by: 3) do |i|
puts i
end
0.step(nil, 3) do |i|
puts i
end
would both output
0
3
6
9
12
... and so on
The IO#seek
method now accepts :CUR
, :END
, and :SET
as symbols, along
with the old flags named by the constants IO::SEEK_CUR, IO::SEEK_END, and
IO::SEEK_SET.
New are IO::SEEK_DATA and IO::SEEK_HOLE (or :DATA and :HOLE) for its second argument. When these are supplied the first argument is the offset to search from, and the file position is set to the start of the next run of data (or the next hole) at or after that offset.
f = File.new("example.txt")
# sets the offset to the start of the next data at or after byte 8
f.seek(8, IO::SEEK_DATA)
# sets the offset to the start of the next hole at or after byte 32
f.seek(32, IO::SEEK_HOLE)
These may not be supported on all platforms; you can check with IO.const_defined?(:SEEK_DATA) and IO.const_defined?(:SEEK_HOLE).
_nonblock without raising exceptions
IO#read_nonblock and IO#write_nonblock now each get an exception keyword argument. When set to false (the default is true) this causes the methods to return a symbol on error, rather than raise exceptions.
require "socket"
io = TCPSocket.new("www.example.com", 80)
message = "GET / HTTP/1.1\r\nHost: www.example.com\r\nConnection: close\r\n\r\n"
loop do
IO.select(nil, [io])
result = io.write_nonblock(message, exception: false)
break unless result == :wait_writable
end
response = ""
loop do
IO.select([io])
result = io.read_nonblock(32, exception: false)
break unless result
next if result == :wait_readable
response << result
end
puts response.lines.first
If you set default internal and external encodings Ruby will transcode from the external encoding to the internal. The exception to this is when the external encoding is set to ASCII-8BIT (aka binary), where no transcoding takes place.
The same exception should be made if the encodings were supplied to an IO method as an argument, but there was a bug, and the transcoding would take place. This has now been fixed.
File.read("example.txt", encoding: "ascii-8bit:utf-8").encoding #=> #<Encoding:ASCII-8BIT>
#include and #prepend now public
Affecting Module and Class, the #include and #prepend methods are now public.
module NumberQuery
def number?
match(/\A(0|-?[1-9][0-9]*)\z/) ? true : false
end
end
String.include(NumberQuery)
"123".number? #=> true
require "bigdecimal"
module FloatingPointFormat
def to_s(format="F")
super
end
end
BigDecimal.prepend(FloatingPointFormat)
decimal = BigDecimal("1.23")
decimal.to_s #=> "1.23" # rather than "0.123E1"
#singleton_class?
Module and Class gain a #singleton_class?
method that, predictably, returns
whether or not the receiver is a singleton class.
class Example
singleton_class? #=> false
class << self
singleton_class? #=> true
end
end
Module#ancestors more consistent
#ancestors called on a singleton class now includes singleton classes in the returned array; this makes the behaviour more consistent between being called on regular classes and singleton classes. It also clears up an irregularity where singleton classes would show up, but only if a module had been prepended (not included) into the singleton class.
Object.ancestors.include?(Object) #=> true
Object.singleton_class.ancestors.include?(Object.singleton_class) #=> true
Object#singleton_method
Similar to #method
and #instance_method
, but will return only singleton
methods.
class Example
def self.test
end
def test2
end
end
# returns class method
Example.singleton_method(:test) #=> #<Method: Example.test>
# doesn't return instance method
Example.singleton_method(:test2) #=> #<NameError: undefined singleton method `test2' for `Example'>
# doesn't return inherited class method
Example.singleton_method(:name) #=> #<NameError: undefined singleton method `name' for `Example'>
example = Object.new
def example.test
end
example.singleton_method(:test) #=> #<Method: #<Object:0x007fc54997a610>.test>
Method#original_name
Method and UnboundMethod gain an #original_name
method to return the
un-aliased name.
class Example
def foo
"foo"
end
alias bar foo
end
example = Example.new
example.method(:foo).original_name #=> :foo
example.method(:bar).original_name #=> :foo
Example.instance_method(:bar).original_name #=> :foo
Mutex#owned?
Mutex#owned?
is no longer experimental, and there’s not much more to
say about that.
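For reference, it reports whether the calling thread currently holds the lock:
mutex = Mutex.new

mutex.lock
mutex.owned?                      #=> true, this thread holds the lock
Thread.new { mutex.owned? }.value #=> false, another thread does not
mutex.unlock
mutex.owned?                      #=> false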
Hash#reject
Calling Hash#reject
on a subclass of Hash will issue a warning. In Ruby 2.2
#reject
called on a subclass of Hash will return a new Hash instance,
rather than an instance of the subclass. So in preparation for that potentially
breaking change there is a warning.
class MyHash < Hash
end
example = MyHash.new
example[:a] = 1
example[:b] = 2
example.reject {|k,v| v > 1}.class #=> MyHash
Generates the following warning.
example.rb:8: warning: copying unguaranteed attributes: {:a=>1, :b=>2}
example.rb:8: warning: following atributes will not be copied in the future version:
example.rb:8: warning: subclass: MyHash
Ruby 2.1.1 accidentally included the full change, returning Hash
in the
example above and not generating a warning. This was reverted in 2.1.2.
Vector#cross_product
The Vector class gains a cross_product
instance method.
require "matrix"
Vector[1, 0, 0].cross_product(Vector[0, 1, 0]) #=> Vector[0, 0, -1]
#bit_length
Calling #bit_length
on an integer will return the number of digits it takes
to represent that number in binary.
128.bit_length #=> 8
32768.bit_length #=> 16
2147483648.bit_length #=> 32
4611686018427387904.bit_length #=> 63
pack/unpack Native Endian long long
Array#pack and String#unpack gain the ability to work with native endian long longs with the Q_/Q! and q_/q! directives.
# output may differ depending on the endianness of your system
unsigned_long_long_max = [2**64 - 1].pack("Q!") #=> "\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF"
signed_long_long_min = [-2**63].pack("q!") #=> "\x00\x00\x00\x00\x00\x00\x00\x80"
signed_long_long_max = [2**63 - 1].pack("q!") #=> "\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x7F"
unsigned_long_long_max.unpack("Q!") #=> 18446744073709551615
signed_long_long_min.unpack("q!") #=> -9223372036854775808
signed_long_long_max.unpack("q!") #=> 9223372036854775807
The HFS Plus filesystem on Mac OS X uses the UTF8-MAC encoding for filenames,
with decomposed characters, e.g. é is represented with e and U+0301, rather
than just U+00E9 (with some exceptions). Dir.glob
and Dir[]
now
normalise this back to UTF8 encoded strings with composed characters.
File.write("composed_e\u{301}xample.txt", "")
File.write("precomposed_\u{e9}xample.txt", "")
puts Dir["*"].map(&:dump)
"composed_\u{e9}xample.txt"
"example.rb"
"precomposed_\u{e9}xample.txt"
Numeric#quo now calls #to_r on the receiver, which should allow for better behaviour when implementing your own Numeric subclasses. It also means TypeError rather than ArgumentError will be raised if the receiver can’t be converted. TypeError isn’t a subclass of ArgumentError (both descend from StandardError), so code that rescues ArgumentError specifically around quo may need updating.
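A rough sketch of what this enables – the Money class here is made up for illustration, and assumes quo converts the receiver via #to_r as described above:
class Money < Numeric
  def initialize(pence)
    @pence = pence
  end

  # quo converts the receiver with #to_r, so providing it is enough
  def to_r
    Rational(@pence, 100)
  end
end

10.quo(4)             #=> (5/2)
Money.new(150).quo(2) #=> (3/4)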
Binding#local_variable_get/_set/_defined?
Binding gets methods to get/set local variables. This can come in handy if you really want to use a keyword argument that clashes with a reserved word.
def primes(begin: 2, end: 1000)
[binding.local_variable_get(:begin), 2].max.upto(binding.local_variable_get(:end)).each_with_object([]) do |i, array|
array << i unless (2...i).any? {|j| (i % j).zero?}
end
end
primes(end: 10) #=> [2, 3, 5, 7]
Or if you want to use a Hash to populate local variables in a Binding, say for evaluating a template
def make_binding(hash)
b = TOPLEVEL_BINDING.dup
hash.each {|k,v| b.local_variable_set(k, v)}
b
end
require "erb"
cover = %Q{<h1><%= title %></h1>\n<h2 class="big friendly"><%= subtitle %></h2>}
locals = {:title => "Hitchhiker's Guide to the Galaxy", :subtitle => "Don't Panic"}
ERB.new(cover).result(make_binding(locals)) #=> "<h1>Hitchhiker's Guide to the Galaxy</h1>\n<h2 class=\"big friendly\">Don't Panic</h2>"
CGI::Util module
CGI has a few handy utility class methods for escaping URL and HTML strings. These have been moved to the CGI::Util module, which can be included into other classes or the main scope for scripts.
require "cgi/util"
CGI.escape("hello world!") #=> "hello+world%21"
include CGI::Util
escape("hello world!") #=> "hello+world%21"
Digest::Class.file passes arguments to initialiser
The various Digest classes have a shortcut method for producing the digest for a given file; this method has been updated to pass any extra arguments past the filename to the implementation’s initialiser. So rather than:
require "digest"
Digest::SHA2.new(512).hexdigest(File.read("example.txt")) #=> "f7fbba..."
It’s possible to do:
require "digest"
Digest::SHA2.file("example.txt", 512).hexdigest #=> "f7fbba..."
Net::SMTP#rset
It is now possible to abort an SMTP transaction by sending the RSET command
with Net::SMTP#rset
.
require "net/smtp"
smtp = Net::SMTP.start("some.smtp.server")
notification = "Hi %s,\n ..."
users.each do |user|
begin
smtp.mailfrom("[email protected]")
smtp.rcptto(user.email)
smtp.data(sprintf(notification, user.name))
rescue
smtp.rset
end
end
smtp.finish
open-uri allows Kernel#open
to open resources with a URI, and will extend the
return value with OpenURI::Meta
. This gains a new #metas
method to return
the header values as arrays, for the case when a header has been used multiple times, e.g. set-cookie.
require "open-uri"
f = open("http://google.com")
f.meta["set-cookie"].class #=> String
f.metas["set-cookie"].class #=> Array
f.metas["set-cookie"].length #=> 2
Pathname gains #write
and #binwrite
methods to write to files.
require "pathname"
path = Pathname.new("test.txt").expand_path(__dir__)
path.write("foo")
path.write("bar", 3) # offset
path.write("baz", mode: "a") # append
Tempfile.create
Tempfile now has a create
method similar to new
but rather than returning a
Tempfile instance that uses a finaliser to clean up the file when the object is
garbage collected, it yields a plain File object to a block and cleans up the
file at the end of the block.
require "tempfile"
path = nil
Tempfile.create("example") do |f|
f #=> #<File:/tmp/example20140428-16851-15kf046>
path = f.path
end
File.exist?(path) #=> false
The Rinda Ring classes are now able to listen on/connect to multicast addresses.
Here’s an example of using Rinda to create an extremely simple service registry listening on the multicast address 239.0.0.1
require "rinda/ring"
require "rinda/tuplespace"
DRb.start_service
tuple_space = Rinda::TupleSpace.new
server = Rinda::RingServer.new(tuple_space, ["239.0.0.1"])
DRb.thread.join
To have a service register itself:
require "rinda/ring"
DRb.start_service
ring_finger = Rinda::RingFinger.new(["239.0.0.1"])
tuple_space = ring_finger.lookup_ring_any
tuple_space.write([:message_service, "localhost", 8080])
# start messaging service on localhost:8080
And discover the address of a service:
require "rinda/ring"
DRb.start_service
ring_finger = Rinda::RingFinger.new(["239.0.0.1"])
tuple_space = ring_finger.lookup_ring_any
_, host, port = tuple_space.read([:message_service, String, Fixnum])
# connect to messaging service
I had some issues with the tuple_space = ring_finger.lookup_ring_any line causing a segfault, and had to use the following in its place:
tuple_space = nil
ring_finger.lookup_ring(0.01) {|x| break tuple_space = x}
XMLRPC::Client#http
returns the Net::HTTP instance being used by the client
to allow minor configuration options that don’t have an accessor on the client
to be set.
require "xmlrpc/client"
client = XMLRPC::Client.new("example.com")
client.http.keep_alive_timeout = 30 # keep connection open for longer
# use client ...
URI.encode_/decode_www_form updated to match WHATWG standard
URI.encode_www_form and URI.decode_www_form have been updated to match the WHATWG standard.
URI.decode_www_form no longer treats ; as a separator – & is the only default separator – but there is a new separator: keyword argument if you need to change it.
require "uri"
URI.decode_www_form("foo=1;bar=2", separator: ";") #=> [["foo", "1"], ["bar", "2"]]
URI.decode_www_form
can also now successfully decode the output of
URI.encode_www_form
when a value is nil.
require "uri"
string = URI.encode_www_form(foo: 1, bar: nil, baz: 3) #=> "foo=1&bar&baz=3"
URI.decode_www_form("foo=1&bar&baz=3") #=> [["foo", "1"], ["bar", ""], ["baz", "3"]]
RbConfig::SIZEOF
RbConfig::SIZEOF
has been added to provide the size of C types.
require "rbconfig/sizeof"
RbConfig::SIZEOF["short"] #=> 2
RbConfig::SIZEOF["int"] #=> 4
RbConfig::SIZEOF["long"] #=> 8
Syslog::Logger, the Logger-compatible interface to Syslog, gets the ability to set the facility.
require "syslog/logger"
facility = Syslog::LOG_LOCAL0
logger = Syslog::Logger.new("MyApp", facility)
logger.debug("test")
CSV.foreach with no block returns working enumerator
CSV.foreach called without a block argument returns an enumerator; however, this has for a long time resulted in an IOError when it was actually used. This has now been fixed.
require "csv"
enum = CSV.foreach("example.csv")
enum.next #=> ["1", "foo"]
enum.next #=> ["2", "bar"]
enum.next #=> ["3", "baz"]
OpenSSL::BN.new
now accepts integers as well as strings.
require "openssl"
OpenSSL::BN.new(4_611_686_018_427_387_904) #=> #<OpenSSL::BN:0x007fce7a0c56e8>
Enumerator.new size argument fixed to accept any callable object
Enumerator.new takes a size argument which can either be an integer or an object responding to #call. Under 2.0.0 only integers and Procs would work, despite what the documentation said. This has now been fixed.
require "thread"
queue = Queue.new
enum = Enumerator.new(queue.method(:size)) do |yielder|
loop {yielder << queue.pop}
end
queue << "foo"
enum.size #=> 1
curses has been removed from the standard library and is now available as a gem.
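If you still need it, installing the gem (it’s published under the same name) keeps existing require "curses" code working:
$ gem install curses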
TSort can be useful for determining an order to complete tasks from a list of
dependencies. However it’s a bit of a hassle to use, having to implement a
class, include TSort, and implement #tsort_each_node
and #tsort_each_child
.
But now TSort is a little easier to use with, say, a hash. The same methods
that are available as instance methods are now available on the module itself,
taking two callable objects, one to take the place of #tsort_each_node
and
the second #tsort_each_child
.
require "tsort"
camping_steps = {
"sleep" => ["food", "tent"],
"tent" => ["camping site", "canvas"],
"canvas" => ["tent poles"],
"tent poles" => ["camping site"],
"food" => ["fish", "fire"],
"fire" => ["firewood", "matches", "camping site"],
"fish" => ["stream", "fishing rod"]
}
all_nodes = camping_steps.to_a.flatten
each_node = all_nodes.method(:each)
each_child = -> step, &b {camping_steps.fetch(step, []).each(&b)}
puts TSort.tsort(each_node, each_child)
Outputs:
stream
fishing rod
fish
firewood
matches
camping site
fire
food
tent poles
canvas
tent
sleep
Ruby 2.1 has added support for TCP Fast Open if it is available on your system. It’s possible to check whether it is available by checking for the existence of the Socket::TCP_FASTOPEN and Socket::MSG_FASTOPEN constants.
Server:
require "socket"
unless Socket.const_defined?(:TCP_FASTOPEN)
abort "TCP Fast Open not supported on this system"
end
server = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM)
server.setsockopt(Socket::SOL_TCP, Socket::TCP_FASTOPEN, 5)
addrinfo = Addrinfo.new(Socket.sockaddr_in(3000, "localhost"))
server.bind(addrinfo)
server.listen(1)
socket = server.accept
socket.write(socket.readline)
Client:
require "socket"
unless Socket.const_defined?(:MSG_FASTOPEN)
abort "TCP Fast Open not supported on this system"
end
socket = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM)
socket.send("foo\n", Socket::MSG_FASTOPEN, Socket.sockaddr_in(3000, "localhost"))
puts socket.readline
socket.close
Please let me know if there’s anything missing or incorrect here.