Exploring the Options: A Comprehensive Ruby Pattern Matching Guide

The latest and greatest feature coming to Ruby 2.7 is pattern matching. For those eager to try it out, Ruby 2.7.0-dev is available for installation, but keep in mind that it’s still under development and the development team welcomes your feedback.

This article aims to provide a comprehensive understanding of pattern matching and its application in Ruby.

Understanding Pattern Matching

Frequently employed in functional programming languages, pattern matching, as described in the Scala documentation, is “a mechanism for checking a value against a pattern. A successful match can also deconstruct a value into its constituent parts.”

It’s important to note that pattern matching is not about strings but rather about the structure of data. My initial encounter with pattern matching was around two years ago while exploring Elixir. I was learning the ropes of Elixir and using it to tackle algorithms. Upon comparing my solutions with others, I discovered their use of pattern matching, which made their code remarkably concise and readable.

The elegance of pattern matching left a lasting impression on me. Here’s a glimpse of pattern matching in Elixir:

[a, b, c] = [:hello, "world", 42]
a #=> :hello
b #=> "world"
c #=> 42

This example might resemble multiple assignment in Ruby, but it goes a step further by verifying if the values align:

[a, b, 42] = [:hello, "world", 42]
a #=> :hello
b #=> "world"

In the above instances, the number 42 on the left isn’t a variable being assigned a value. It acts as a check to ensure the corresponding element at that specific index matches the one on the right side.

[a, b, 88] = [:hello, "world", 42]
** (MatchError) no match of right hand side value

Here, instead of assigning values, a MatchError is triggered because 88 doesn’t match 42.

This concept extends to maps (similar to hashes in Ruby):

%{"name": "Zote", "title": title } = %{"name": "Zote", "title": "The mighty"}
title #=> "The mighty"

This example verifies if the value associated with the key name is Zote and assigns the value of the key title to the variable title.

Pattern matching proves particularly valuable when dealing with intricate data structures. It allows you to assign variables and check values or types all in a single line.

Moreover, it empowers dynamically typed languages like Elixir to implement method overloading:

def process(%{"animal" => animal}) do
  IO.puts("The animal is: #{animal}")
end

def process(%{"plant" => plant}) do
  IO.puts("The plant is: #{plant}")
end

def process(%{"person" => person}) do
  IO.puts("The person is: #{person}")
end

Which clause of process runs is determined by the key present in the map argument.

This hopefully illustrates the potency of pattern matching. Various attempts have been made to integrate pattern matching into Ruby using gems like noaidi, qo, and egison-ruby.

Ruby 2.7 introduces its own implementation, drawing inspiration from these gems. Let’s delve into how it’s currently designed.

Ruby Pattern Matching Syntax

Pattern matching in Ruby is implemented using a case statement, but with the keyword in instead of the familiar when. A pattern can also be followed by an if or unless guard:

case [variable or expression]
in [pattern]
  ...
in [pattern] if [expression]
  ...
else
  ...
end

The value of the variable or expression given to case is matched against each pattern in turn, and the first in clause whose pattern (and guard, if present) succeeds is executed. If nothing matches and there is no else branch, a NoMatchingPatternError is raised. As in a standard case statement, values are compared with ===, which allows you to match ranges and instances of classes. Let’s illustrate this with an example:

Matching Arrays

translation = ['th', 'เต้', 'ja', 'テイ']

case translation
in ['th', orig_text, 'en', trans_text]
  puts "English translation: #{orig_text} => #{trans_text}"
in ['th', orig_text, 'ja', trans_text]
  # this will get executed
  puts "Japanese translation: #{orig_text} => #{trans_text}"
end

In this example, the variable translation is compared against two patterns:

['th', orig_text, 'en', trans_text] and ['th', orig_text, 'ja', trans_text]. The code checks if the values within the pattern correspond to the values in the translation variable at their respective indices. If a match is found, the values from the translation variable are assigned to the variables in the pattern at their corresponding indices.
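The array match above can also be wrapped in a method so that the result is returned instead of printed. The following is only a sketch: the method name describe_translation and the guard condition are assumptions made for illustration. It adds an if guard and an else branch to the same patterns.

```ruby
# Hypothetical helper: describes a [from_lang, text, to_lang, text] array.
def describe_translation(translation)
  case translation
  in ['th', orig_text, 'en', trans_text]
    "English translation: #{orig_text} => #{trans_text}"
  in ['th', _, 'ja', trans_text] if trans_text.empty?
    # The guard runs only after the pattern itself has matched.
    'Japanese translation is missing'
  in ['th', orig_text, 'ja', trans_text]
    "Japanese translation: #{orig_text} => #{trans_text}"
  else
    # An array pattern without a splat must match the length exactly.
    'Unsupported translation'
  end
end

describe_translation(['th', 'เต้', 'ja', 'テイ']) #=> "Japanese translation: เต้ => テイ"
describe_translation(['th', 'เต้', 'ja', ''])     #=> "Japanese translation is missing"
describe_translation(['th', 'เต้'])               #=> "Unsupported translation"
```

Without the else branch, the last call would raise NoMatchingPatternError instead of returning a value.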

Ruby Pattern Matching Animation: Matching Arrays

Matching Hashes

translation = {orig_lang: 'th', trans_lang: 'en', orig_txt: 'เต้', trans_txt: 'tae' }

case translation
in {orig_lang: 'th', trans_lang: 'en', orig_txt: orig_txt, trans_txt: trans_txt}
  puts "#{orig_txt} => #{trans_txt}"
end

Here, the translation variable represents a hash. It’s compared against another hash in the in clause. The case statement verifies if all keys in the pattern exist in the translation variable. It also ensures that the values associated with each key match. If successful, the values are assigned to the corresponding variables in the hash.
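Two details of hash patterns are easy to miss: the hash being matched may contain extra keys, and key: on its own is shorthand for key: key. The sketch below illustrates both; the method name summarize is an assumption made for the example.

```ruby
# Hypothetical helper: builds a short summary from a translation hash.
def summarize(translation)
  case translation
  in {orig_lang: 'th', orig_txt:, trans_txt:}
    # `orig_txt:` binds the value to a local variable named orig_txt.
    "#{orig_txt} => #{trans_txt}"
  else
    'no match'
  end
end

# The extra key trans_lang does not prevent a match.
summarize({orig_lang: 'th', trans_lang: 'en', orig_txt: 'เต้', trans_txt: 'tae'}) #=> "เต้ => tae"
# A key required by the pattern is missing, so the pattern fails.
summarize({orig_lang: 'th', orig_txt: 'เต้'}) #=> "no match"
```

Hash patterns only accept symbol keys, so a hash with string keys would need to be converted before matching.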

Matching Subsets

Values in a pattern are compared using ===, so a pattern can contain a class, a range, or a regular expression and will match any value that falls within it.
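As a rough illustration of === at work, the sketch below uses a range, a class, and a regular expression as patterns. The method name classify and the chosen categories are assumptions made for the example.

```ruby
# Hypothetical helper: classifies a value using === based patterns.
def classify(value)
  case value
  in 0..9               # Range#=== checks whether the value lies in the range
    'single digit'
  in Integer            # Class#=== checks whether the value is an instance
    'other integer'
  in /\A\d+\z/          # Regexp#=== checks whether the string matches
    'numeric string'
  in [Integer, String]  # each element is compared with === as well
    'integer/string pair'
  else
    'something else'
  end
end

classify(5)        #=> "single digit"
classify(42)       #=> "other integer"
classify('123')    #=> "numeric string"
classify([1, 'a']) #=> "integer/string pair"
classify(nil)      #=> "something else"
```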

Multiple Patterns

The | symbol can be used to specify multiple patterns for a single block.

translation = ['th', 'เต้', 'ja', 'テイ']

case translation
in {orig_lang: 'th', trans_lang: 'ja'} | ['th', _, 'ja', _]
  puts 'Thai to Japanese translation'
end

In this instance, the translation variable is checked against both the {orig_lang: 'th', trans_lang: 'ja'} hash pattern and the ['th', _, 'ja', _] array pattern. Note that the released Ruby 2.7 does not allow an alternative pattern to bind variables (other than names starting with an underscore): writing orig_text inside either alternative raises a SyntaxError, so this example only checks the shape of the data.

This proves beneficial when dealing with slightly different data structures representing the same concept, and you want a single code block to handle both.
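The | symbol can also be nested inside an array or hash pattern to list alternative values for a single position. The following sketch assumes a hypothetical helper named supported_target? that accepts either data shape.

```ruby
# Hypothetical helper: true when the target language is Japanese or English,
# whether the translation arrives as an array or as a hash.
def supported_target?(translation)
  case translation
  in [_, _, 'ja' | 'en', _] | {trans_lang: 'ja' | 'en'}
    true
  else
    false
  end
end

supported_target?(['th', 'เต้', 'ja', 'テイ'])  #=> true
supported_target?({trans_lang: 'en'})          #=> true
supported_target?(['th', 'เต้', 'fr', 'thé'])  #=> false
```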

Arrow Assignment

The => symbol can be employed for assigning a matched value to a variable.

case ['I am a string', 10]
in [Integer, Integer] => a
  # not reached
in [String, Integer] => b
  p b #=> ["I am a string", 10]
end

This is particularly helpful when you need to verify values within a data structure while also binding them to a variable.
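The arrow can be applied to individual elements as well as to the whole pattern. In the sketch below, the method name describe_pair and the variable names are assumptions made for illustration.

```ruby
# Hypothetical helper: checks element types and names each part in one pattern.
def describe_pair(pair)
  case pair
  in [String => label, Integer => count] => whole
    # label and count hold the elements, whole holds the entire array.
    [label, count, whole]
  else
    nil
  end
end

describe_pair(['apple', 3]) #=> ["apple", 3, ["apple", 3]]
describe_pair([3, 'apple']) #=> nil
```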

Pin Operator

The pin operator prevents the reassignment of variables.

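case [1,2,2]
in [a,a,a]
  # SyntaxError in released Ruby 2.7: duplicated variable name
end

A bare variable in a pattern always binds a value; it never compares against an earlier binding. In the 2.7.0-dev build, a was bound to 1, then 2, and then 2 again, so this pattern matched any three-element array. Released Ruby versions reject the repeated name with a SyntaxError. Either way, a pattern like this cannot confirm that all values within the array are identical.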

case [1,2,2]
in [a,^a,^a]
  # not reached
in [a,b,^b]
  puts a #=> 1
  puts b #=> 2
end

When the pin operator is used, it evaluates the variable rather than reassigning it. In this scenario, [1,2,2] doesn’t match [a,^a,^a] because in the first index, a is assigned to 1. In the second and third indices, a is evaluated as 1 but is compared against 2.

However, [a,b,^b] matches [1,2,2] because a is assigned 1 in the first index, b is assigned 2 in the second index, and then ^b, which is now 2, is matched against 2 in the third index, resulting in a successful match.

a = 1
case [2,2]
in [^a,^a]
  # not reached
in [b,^b]
  puts b #=> 2
end

Variables defined outside the case statement can also be utilized, as demonstrated in the example above.
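A common use of pinning an outer variable is to pass the expected value in as a method argument. The sketch below is one way this might look; the method name translated_text and its parameters are assumptions made for the example.

```ruby
# Hypothetical helper: returns the translated text only when the hash
# describes a translation into wanted_lang.
def translated_text(translation, wanted_lang)
  case translation
  in {trans_lang: ^wanted_lang, trans_txt:}
    # ^wanted_lang compares against the argument instead of rebinding it.
    trans_txt
  else
    nil
  end
end

translation = {orig_lang: 'th', trans_lang: 'ja', orig_txt: 'เต้', trans_txt: 'テイ'}
translated_text(translation, 'ja') #=> "テイ"
translated_text(translation, 'en') #=> nil
```

Without the pin, trans_lang: wanted_lang would simply rebind wanted_lang to whatever the hash contains, and the pattern would match every language.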

Underscore (`_`) Operator

The underscore (_) serves as a placeholder to disregard values. Let’s look at a couple of examples:

case ['this will be ignored',2]
in [_,a]
  puts a #=> 2
end

case ['a',2]
in [_,a] => b
  puts a #=> 2
  p b #=> ["a", 2]
end

In these examples, any value matching against _ passes. In the second case statement, the => operator captures the disregarded value as well.
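The underscore can appear several times in one pattern, which is handy when only one position matters. A minimal sketch, with the method name third chosen purely for illustration:

```ruby
# Hypothetical helper: returns the third element of a three-element array.
def third(items)
  case items
  in [_, _, last]
    last
  else
    nil
  end
end

third(['ignored', :also_ignored, 42]) #=> 42
third([1, 2])                         #=> nil
```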

Practical Applications of Pattern Matching in Ruby

Imagine working with the following JSON data:

{
  "nickname": "Tae",
  "realname": {"first": "Noppakun", "last": "Wongsrinoppakun"},
  "username": "tae8838"
}

In your Ruby project, you intend to parse this data and display the name based on these criteria:

If a username exists, return the username.
If the nickname, first name, and last name exist, return the nickname followed by the first and last names.
If the nickname is absent but the first and last names are present, return the first and last names.
If none of the conditions are met, return “New User.”

Currently, you might write this in Ruby as follows:

def display_name(name_hash)
  if name_hash[:username]
    name_hash[:username]
  elsif name_hash[:nickname] && name_hash[:realname] && name_hash[:realname][:first] && name_hash[:realname][:last]
    "#{name_hash[:nickname]} #{name_hash[:realname][:first]} #{name_hash[:realname][:last]}"
  elsif name_hash[:realname] && name_hash[:realname][:first] && name_hash[:realname][:last]
    "#{name_hash[:realname][:first]} #{name_hash[:realname][:last]}"
  else
    'New User'
  end
end

Now, let’s see how this translates using pattern matching:

def display_name(name_hash)
  case name_hash
  in {username: username}
    username
  in {nickname: nickname, realname: {first: first, last: last}}
    "#{nickname} #{first} #{last}"
  in {realname: {first: first, last: last}}
    "#{first} #{last}"
  else
    'New User'
  end
end

While syntax preferences can be subjective, the pattern matching approach appears more favorable. It allows us to explicitly define the expected hash structure instead of describing and checking its values. This enhances the readability and understanding of the expected data:

`{nickname: nickname, realname: {first: first, last: last}}` 

Instead of:

`name_hash[:nickname] && name_hash[:realname] && name_hash[:realname][:first] && name_hash[:realname][:last]`.
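To tie this back to the JSON input, here is a sketch of how the parsed data might be fed to a pattern matching display_name. It assumes that the first and last names are nested under realname, as in the sample data, and uses symbolize_names: true from Ruby's standard json library because hash patterns require symbol keys. The JSON strings themselves are made up for the example.

```ruby
require 'json'

# A version of display_name, repeated here so the sketch runs on its own.
def display_name(name_hash)
  case name_hash
  in {username: username}
    username
  in {nickname: nickname, realname: {first: first, last: last}}
    "#{nickname} #{first} #{last}"
  in {realname: {first: first, last: last}}
    "#{first} #{last}"
  else
    'New User'
  end
end

# symbolize_names: true turns the JSON string keys into symbols.
full = JSON.parse(
  '{"nickname": "Tae", "realname": {"first": "Noppakun", "last": "Wongsrinoppakun"}, "username": "tae8838"}',
  symbolize_names: true
)
no_username = JSON.parse(
  '{"nickname": "Tae", "realname": {"first": "Noppakun", "last": "Wongsrinoppakun"}}',
  symbolize_names: true
)
names_only = JSON.parse(
  '{"realname": {"first": "Noppakun", "last": "Wongsrinoppakun"}}',
  symbolize_names: true
)

display_name(full)        #=> "tae8838"
display_name(no_username) #=> "Tae Noppakun Wongsrinoppakun"
display_name(names_only)  #=> "Noppakun Wongsrinoppakun"
display_name({})          #=> "New User"
```

One difference from the if/elsif version is worth keeping in mind: {username: username} matches whenever the key is present, even if its value is nil, whereas name_hash[:username] only passes for a truthy value.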

Deconstruct and Deconstruct_keys

Ruby 2.7 introduces two special methods that a class can define: deconstruct and deconstruct_keys. When an instance of the class is matched against an array pattern or a hash pattern, deconstruct or deconstruct_keys is invoked, respectively.

The results returned by these methods are then used for pattern matching. Here’s an illustration:

class Coordinate
  attr_accessor :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  def deconstruct
    [@x, @y]
  end

  # keys lists the keys the pattern asks for (nil means all of them)
  def deconstruct_keys(keys)
    {x: @x, y: @y}
  end
end

This code snippet defines a class named Coordinate with x and y attributes. It also defines the methods deconstruct and deconstruct_keys.

c = Coordinate.new(32, 50)

case c
in [a,b]
  p a #=> 32
  p b #=> 50
end

Here, an instance of Coordinate is created and pattern matched against an array. In this case, Coordinate#deconstruct is called, and the result is compared against the array [a,b] specified in the pattern.

case c
in {x:, y:}
  p x #=> 32
  p y #=> 50
end

In this example, the same instance of Coordinate is matched against a hash. The result of Coordinate#deconstruct_keys is then used for comparison against the hash {x: x, y: y} in the pattern.
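Patterns can also name the class, written as Coordinate[...] for an array pattern or Coordinate(...) for a hash pattern, so that a match requires both the right type and the right contents. The sketch below repeats the class so it runs on its own; the quadrant helper is an assumption made for illustration.

```ruby
class Coordinate
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Used by array patterns.
  def deconstruct
    [x, y]
  end

  # Used by hash patterns. keys is the array of keys the pattern needs,
  # or nil when the pattern asks for all of them; it is ignored here.
  def deconstruct_keys(keys)
    {x: x, y: y}
  end
end

# Hypothetical helper: describes where a point lies.
def quadrant(point)
  case point
  in Coordinate(x: 0, y: 0)                          # hash pattern, calls deconstruct_keys
    'origin'
  in Coordinate[x, y] if x.positive? && y.positive?  # array pattern, calls deconstruct
    'first quadrant'
  in Coordinate                                      # any other Coordinate
    'elsewhere'
  else
    'not a coordinate'
  end
end

quadrant(Coordinate.new(0, 0))   #=> "origin"
quadrant(Coordinate.new(32, 50)) #=> "first quadrant"
quadrant(Coordinate.new(-1, 5))  #=> "elsewhere"
quadrant([32, 50])               #=> "not a coordinate"
```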

An Exciting Experimental Feature

Having first encountered pattern matching in Elixir, I initially anticipated that this feature might encompass method overloading and a more concise syntax. It offers neither, which is understandable considering Ruby wasn’t designed with pattern matching in mind.

Utilizing a case statement is a relatively lightweight approach to implementation and doesn’t impact existing code (except for the introduction of deconstruct and deconstruct_keys methods). Interestingly, this use of the case statement resembles Scala’s implementation of pattern matching.

In my opinion, pattern matching is a promising addition to Ruby. It has the potential to enhance code clarity and bring a touch of modernity to the language. I’m eager to witness how developers embrace and leverage this feature as it continues to evolve.