Domain-specific languages (DSLs) are a potent tool for simplifying the programming and configuration of intricate systems. As a software engineer, you likely engage with numerous DSLs daily without realizing it.
This article explains what DSLs are, when they are a good fit, and how to build your own DSL in Ruby using advanced metaprogramming techniques. It builds on Nikola Todorovic’s introduction to Ruby metaprogramming, previously featured on the Toptal Blog. We recommend reading that article first if you are new to metaprogramming.
Defining a Domain Specific Language
DSLs are languages tailored to a specific application domain or use case. Their specialized nature limits their applicability to general-purpose software development. DSLs manifest in various forms, including:
- Markup languages like HTML and CSS, designed for structuring, populating, and styling web pages. These languages cannot express arbitrary algorithms, so they fit the definition of a DSL.
- Macro and query languages (e.g., SQL), which sit on top of a particular system or another programming language and are usually limited in what they can do, which is why they count as DSLs.
- DSLs that leverage the syntax of an established programming language in a way that mimics a distinct mini-language.
This last category, referred to as an internal DSL, will be the focus of our upcoming example. But first, let’s examine some prominent examples of internal DSLs. Rails’ route definition syntax exemplifies this concept:
This Ruby code resembles a customized route definition language, thanks to metaprogramming techniques that enable its clean and user-friendly interface. Notice how the DSL’s structure employs Ruby blocks, while method calls such as get and resources function as keywords within this mini-language.
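To make the idea of blocks as structure and method calls as keywords concrete, here is a minimal sketch in plain Ruby. It is not how Rails implements routing; `TinyRouter`, `draw`, and the route format are hypothetical names invented purely for illustration.

```ruby
# TinyRouter is a hypothetical stand-in, not part of Rails.
class TinyRouter
  attr_reader :routes

  def initialize
    @routes = []
  end

  # Evaluate the block with self set to the router, so bare calls such as
  # `get` resolve to the methods below and read like keywords.
  def draw(&block)
    instance_eval(&block)
    self
  end

  def get(path, to:)
    @routes << [:get, path, to]
  end

  def resources(name)
    @routes << [:get, "/#{name}", "#{name}#index"]
    @routes << [:get, "/#{name}/:id", "#{name}#show"]
  end
end

ROUTER = TinyRouter.new.draw do
  get "/about", to: "pages#about"
  resources :articles
end

ROUTER.routes.each { |verb, path, target| puts "#{verb.upcase} #{path} -> #{target}" }
```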
Metaprogramming features even more prominently in the RSpec testing library:
This code snippet also illustrates fluent interfaces, which allow declarations to be interpreted as natural language sentences, enhancing code readability:
Another instance of a fluent interface is ActiveRecord and Arel’s query interface, which utilizes an abstract syntax tree internally for constructing complex SQL queries:
While Ruby’s expressiveness and metaprogramming capabilities make it well-suited for building DSLs, they are not exclusive to Ruby. Here’s a JavaScript test using the Jasmine framework:
Although not as elegant as the Ruby examples, this syntax demonstrates that well-chosen names and creative syntax utilization enable internal DSL creation in almost any language.
The advantage of internal DSLs lies in their avoidance of a separate parser, a component often challenging to implement correctly. Using the syntax of their implementation language also ensures seamless integration with existing codebases.
However, this comes at the cost of syntactic freedom. Internal DSLs must adhere to the syntactic rules of their host language. The degree of compromise depends on the language itself. Verbose, statically typed languages like Java and VB.NET offer less flexibility compared to dynamic, metaprogramming-rich languages like Ruby.
Constructing Our Own: A Ruby DSL for Class Configuration
Our example Ruby DSL will be a reusable configuration engine. It will define configuration attributes for a Ruby class using a simplified syntax. Incorporating configuration capabilities into a class is a common requirement in Ruby, particularly for configuring external gems and API clients. A standard solution involves an interface similar to this:
Let’s implement this interface initially and then refine it iteratively. We’ll add features, improve syntax clarity, and enhance reusability.
For this interface to function, the MyApp class requires a class method called configure. This method takes a block, executes it by yielding to it, and passes in a configuration object. This object, in turn, has accessor methods for reading and writing configuration values:
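A minimal sketch of this first version could look as follows. The attribute names `app_id`, `title`, and `cookie_name` are taken from the rest of the article; the sample values are assumptions chosen for illustration.

```ruby
class MyApp
  # Plain value object holding the configuration attributes.
  class Configuration
    attr_accessor :app_id, :title, :cookie_name
  end

  # Lazily create a single configuration object for the class.
  def self.config
    @config ||= Configuration.new
  end

  # Yield the configuration object to the caller's block.
  def self.configure
    yield config
  end
end

MyApp.configure do |config|
  config.app_id = "my_app"
  config.title = "My App"
  config.cookie_name = "my_app_session"
end

puts MyApp.config.app_id        # prints "my_app"
MyApp.config.title = "Renamed"  # values can still be changed later
puts MyApp.config.title         # prints "Renamed"
```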
After the configuration block executes, we can readily access and modify the values:
While functional, this implementation does not yet feel enough like a custom language to be considered a DSL. We will address this gradually. Our next step is decoupling the configuration functionality from the MyApp class, making it generic and applicable to various use cases.
Achieving Reusability
Currently, replicating similar configuration capabilities in another class would necessitate copying both the Configuration class and its associated setup methods. We’d also need to modify the attr_accessor list to accommodate the new configuration attributes. To circumvent this, let’s relocate the configuration features into a separate module called Configurable. With this change, our MyApp class would appear as follows:
All configuration-related elements now reside within the Configurable module:
The notable addition here is the self.included method. Module inclusion in Ruby only incorporates instance methods. Therefore, our config and configure class methods wouldn’t be added to the host class by default. However, defining a method named included within a module triggers its execution whenever that module is included in a class. This allows us to manually extend the host class with the methods contained in ClassMethods:
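Put together, the module and its `included` hook might be sketched like this. The layout with a nested `ClassMethods` module and `Configuration` class follows the description above; the details are an assumption rather than the only way to arrange it.

```ruby
module Configurable
  # Ruby calls this hook whenever the module is included in a class.
  def self.included(host_class)
    host_class.extend ClassMethods
  end

  # Methods that become class methods of the host class.
  module ClassMethods
    def config
      @config ||= Configuration.new
    end

    def configure
      yield config
    end
  end

  class Configuration
    attr_accessor :app_id, :title, :cookie_name
  end
end

class MyApp
  include Configurable
end

MyApp.configure { |config| config.app_id = "my_app" }
puts MyApp.config.app_id  # prints "my_app"
```

Without the hook, `include Configurable` alone would leave `MyApp.configure` undefined, because inclusion only affects instances of the host class.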
Our work isn’t finished yet. Next, we need to enable the specification of supported attributes within the host class that includes the Configurable module. An ideal solution would resemble this:
Surprisingly, this code is syntactically valid. include is not a keyword but a method expecting a Module object as its parameter. As long as we provide an expression that returns a Module, the inclusion will proceed smoothly. Therefore, instead of directly including Configurable, we need a method called with. This method will generate a new, customized module with the specified attributes:
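One way such a `with` method could be written is sketched below, together with a host class that uses it. The variable names `config_class` and `class_methods` match the ones discussed in this section; treat the rest as an illustrative assumption.

```ruby
module Configurable
  def self.with(*attrs)
    # Anonymous configuration class; the block closes over attrs.
    config_class = Class.new do
      attr_accessor(*attrs)
    end

    # Anonymous module holding the future class methods.
    class_methods = Module.new do
      define_method :config do
        @config ||= config_class.new
      end

      def configure
        yield config
      end
    end

    # The module that is returned and included by the host class.
    Module.new do
      singleton_class.send :define_method, :included do |host_class|
        host_class.extend class_methods
      end
    end
  end
end

class MyApp
  include Configurable.with(:app_id, :title)
end

MyApp.configure do |config|
  config.app_id = "my_app"
  config.title = "My App"
end

puts MyApp.config.title  # prints "My App"
```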
Let’s break down this code. The entire Configurable module now comprises a single with method, with all operations occurring within it. We initiate the process by creating a new anonymous class using Class.new to house our attribute accessor methods. Since Class.new accepts the class definition as a block and blocks have access to external variables, we can seamlessly pass the attrs variable to attr_accessor.
This ability to access external variables is why Ruby blocks are called closures: they “close over” the environment in which they were defined, not the one in which they are eventually executed. This distinction is crucial. Regardless of when or where our define_method blocks eventually execute, they retain access to the config_class and class_methods variables, even after the with method has completed and returned. The following example illustrates this behavior:
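As a small sketch of closure behavior, consider a method that returns a lambda. The names `make_counter` and `COUNTER` are hypothetical and exist only for this demonstration.

```ruby
def make_counter
  count = 0
  # The lambda closes over the local variable count.
  -> { count += 1 }
end

COUNTER = make_counter  # make_counter has already returned at this point
puts COUNTER.call       # prints 1
puts COUNTER.call       # prints 2
```

The local variable `count` outlives the method call that created it because the lambda still refers to it.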
Armed with this understanding of blocks, we can proceed to define an anonymous module and store it in the class_methods variable. This module will hold the class methods that are added to the host class when our generated module is included. We use define_method to define the config method because the method body needs access to the external config_class variable. Defining it with the def keyword wouldn’t grant this access, because a def body starts a new scope and is not a closure. However, define_method accepts a block, enabling this functionality:
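The difference between `def` and `define_method` can be seen in isolation with a short sketch. `build_greeter` and the method names are hypothetical and chosen only to show the scoping rules.

```ruby
def build_greeter(greeting)
  Class.new do
    # def starts a fresh scope, so greeting is not visible inside it.
    def with_def
      greeting
    end

    # define_method takes a block, and the block closes over greeting.
    define_method(:with_define_method) { greeting }
  end
end

GREETER = build_greeter("hello").new
puts GREETER.with_define_method  # prints "hello"

begin
  GREETER.with_def
rescue NameError => e
  puts "with_def failed: #{e.class}"  # prints "with_def failed: NameError"
end
```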
Finally, we invoke Module.new to create the module to be returned. Within this module, we need to define our self.included method. Unfortunately, the def keyword is not an option here, as the method needs access to the external class_methods variable. Consequently, we resort to define_method with a block again. However, this time, we apply it to the singleton class of the module, since we are defining a method on the module instance itself. Moreover, because define_method was a private method before Ruby 2.5, we invoke it through send instead of calling it directly (on Ruby 2.5 and later, a direct call works as well):
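The singleton-class technique can be tried on its own with the following sketch. `build_module`, `GREETING_METHODS`, and `Host` are hypothetical names used only for this demonstration.

```ruby
def build_module(class_methods)
  Module.new do
    # self is the new module here; its singleton class holds methods that
    # are called on the module itself, such as the included hook.
    singleton_class.send :define_method, :included do |host_class|
      host_class.extend class_methods
    end
  end
end

GREETING_METHODS = Module.new do
  def greet
    "hello from #{name}"
  end
end

class Host
  include build_module(GREETING_METHODS)
end

puts Host.greet  # prints "hello from Host"
```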
That was a deep dive into metaprogramming. But was the added complexity justified? The ease of use speaks for itself:
Yet, we can do even better. Our next step is refining the syntax within the configure block to enhance the module’s usability.
Syntax Enhancement
One remaining aspect we can improve is the repetitive use of config on each line within the configuration block. An ideal DSL would implicitly understand that everything within the configure block operates within the context of our configuration object. This would allow us to achieve the same result with a cleaner syntax:
Let’s implement this improvement. We require two key elements: a mechanism to execute the block passed to configure within the configuration object’s context and a modification to the accessor methods. These methods should write a value if an argument is provided and return the value when called without an argument. Here’s a possible implementation:
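A sketch of such an implementation follows. It assumes the `with` method from the previous section as a starting point and uses the `not_provided` marker discussed later in this section; the identity check with `equal?` is one possible choice among several.

```ruby
module Configurable
  def self.with(*attrs)
    # Unique marker meaning "no argument was passed".
    not_provided = Object.new

    config_class = Class.new do
      attrs.each do |attr|
        # Reader and writer in one: with an argument it writes, without it reads.
        define_method attr do |value = not_provided|
          if value.equal?(not_provided)
            instance_variable_get("@#{attr}")
          else
            instance_variable_set("@#{attr}", value)
          end
        end
      end

      # Keep the conventional attr= writers as well.
      attr_writer(*attrs)
    end

    class_methods = Module.new do
      define_method :config do
        @config ||= config_class.new
      end

      # Run the block with self set to the configuration object.
      def configure(&block)
        config.instance_eval(&block)
      end
    end

    Module.new do
      singleton_class.send :define_method, :included do |host_class|
        host_class.extend class_methods
      end
    end
  end
end

class MyApp
  include Configurable.with(:app_id, :title)
end

MyApp.configure do
  app_id "my_app"
  title "My App"
end

puts MyApp.config.app_id  # prints "my_app"
```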
The simpler change involves running the configure block within the context of the configuration object. Utilizing Ruby’s instance_eval method on an object allows the execution of an arbitrary block of code as if it were running within that object. Consequently, when the configuration block calls the app_id method on the first line, that call is directed to our configuration class instance.
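The effect of `instance_eval` can be observed in isolation with a small sketch. The `Settings` class here is a hypothetical stand-in for the generated configuration class.

```ruby
class Settings
  def app_id(value)
    @app_id = value
  end
end

SETTINGS = Settings.new

# Inside the block, self is SETTINGS, so the bare call to app_id and the
# instance variable both belong to that object.
SETTINGS.instance_eval do
  app_id "my_app"
  puts @app_id  # prints "my_app"
end
```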
Modifying the attribute accessor methods in config_class is a bit more involved. To grasp this, we need to understand the behind-the-scenes workings of attr_accessor. Let’s consider the following attr_accessor call:
This is equivalent to defining a reader and writer method for each specified attribute:
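For illustration, the two forms can be placed side by side. The class names `WithAccessor` and `WithExplicitMethods` are hypothetical; both classes end up with the same four methods.

```ruby
class WithAccessor
  attr_accessor :app_id, :title
end

class WithExplicitMethods
  # Reader and writer for app_id.
  def app_id
    @app_id
  end

  def app_id=(value)
    @app_id = value
  end

  # Reader and writer for title.
  def title
    @title
  end

  def title=(value)
    @title = value
  end
end

p WithAccessor.instance_methods(false).sort
p WithExplicitMethods.instance_methods(false).sort
```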
So, when we used attr_accessor *attrs in the original code, Ruby automatically generated the attribute reader and writer methods for us for every attribute in attrs. This resulted in the standard accessor methods: app_id, app_id=, title, title=, and so on.
In our enhanced version, we aim to retain the standard writer methods to ensure the proper functioning of assignments like this:
We achieve this by continuing to auto-generate the writer methods using attr_writer *attrs. However, we can no longer rely on the standard reader methods. They need to be modified to support writing the attribute as well, accommodating this new syntax:
To generate the reader methods ourselves, we iterate through the attrs array. For each attribute, we define a method that either returns the current value of the corresponding instance variable (if no new value is provided) or writes the new value if specified:
Here, we leverage Ruby’s instance_variable_get method to read an instance variable with an arbitrary name, and instance_variable_set to assign a new value to it. It’s important to note that the variable name must be prefixed with an “@” sign in both cases, hence the string interpolation.
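The two reflection methods can be tried on any object, as in this short sketch. `Bag` and `BAG` are hypothetical names; the last part shows what happens when the “@” prefix is omitted.

```ruby
class Bag
end

BAG = Bag.new
attr = :title

# The interpolated string "@title" names the instance variable.
BAG.instance_variable_set("@#{attr}", "My App")
puts BAG.instance_variable_get("@#{attr}")  # prints "My App"

begin
  BAG.instance_variable_get(attr.to_s)  # "title" lacks the "@" prefix
rescue NameError => e
  puts e.class  # prints NameError
end
```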
You might be wondering why we use a blank object as the default value for “not provided” instead of nil. The reason is straightforward: nil is a valid value that might be intentionally set for a configuration attribute. Testing for nil wouldn’t allow us to differentiate between these two scenarios:
The blank object stored in not_provided is designed to be equal only to itself. This ensures that it won’t be inadvertently passed into our method, causing an unintended read instead of a write.
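The difference becomes visible when a nil default and a dedicated marker object are compared directly. Both classes below are hypothetical and exist only for this comparison.

```ruby
class NilDefaultHolder
  # With nil as the default, "title nil" looks exactly like "title".
  def title(value = nil)
    value.nil? ? @title : (@title = value)
  end
end

class SentinelHolder
  NOT_PROVIDED = Object.new

  # The marker object can only arrive here through the default value.
  def title(value = NOT_PROVIDED)
    value.equal?(NOT_PROVIDED) ? @title : (@title = value)
  end
end

naive = NilDefaultHolder.new
naive.title "My App"
naive.title nil   # intended as a write, but treated as a read
p naive.title     # prints "My App"

safe = SentinelHolder.new
safe.title "My App"
safe.title nil    # a genuine write of nil
p safe.title      # prints nil
```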
Incorporating Support for References
Let’s add one more feature to enhance our module’s versatility: the ability to reference a configuration attribute from another:
We’ve introduced a reference from cookie_name to the app_id attribute. The expression containing the reference is enclosed in a block, enabling delayed evaluation of the attribute value. The idea is to evaluate the block later, when the attribute is read, rather than during definition. This prevents issues arising from defining attributes in an “incorrect” order:
Wrapping the expression in a block prevents its immediate evaluation. We can store the block for later execution when retrieving the attribute value:
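Delayed evaluation can be demonstrated without the configuration module at all. In this sketch, `DEFERRED_SETTINGS` and `COOKIE_NAME` are hypothetical names, and a plain hash stands in for the configuration object.

```ruby
DEFERRED_SETTINGS = {}

# The lambda body is stored, not run, so the missing app_id is no problem yet.
COOKIE_NAME = -> { "#{DEFERRED_SETTINGS[:app_id]}_session" }

DEFERRED_SETTINGS[:app_id] = "my_app"

# Only now is the expression evaluated, with app_id already in place.
puts COOKIE_NAME.call  # prints "my_app_session"
```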
Adding support for delayed evaluation using blocks requires minimal changes to the Configurable module. In fact, we only need to adjust the attribute method definition:
When setting an attribute, the block || value expression stores the block if one was provided, or the plain value otherwise. When the attribute is read later, we check whether the stored value is a Proc and, if so, evaluate it using instance_eval. Otherwise, we return it as before.
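The adjusted attribute method might look like the following sketch. To keep it short, the configuration class is built by a hypothetical helper, `build_config_class`, rather than by the full Configurable module.

```ruby
def build_config_class(*attrs)
  not_provided = Object.new

  Class.new do
    attrs.each do |attr|
      define_method attr do |value = not_provided, &block|
        if value.equal?(not_provided) && block.nil?
          # Read: evaluate a stored block lazily, return plain values as-is.
          result = instance_variable_get("@#{attr}")
          result.is_a?(Proc) ? instance_eval(&result) : result
        else
          # Write: prefer the block when one is given.
          instance_variable_set("@#{attr}", block || value)
        end
      end
    end

    attr_writer(*attrs)
  end
end

CONFIG = build_config_class(:app_id, :cookie_name).new
CONFIG.instance_eval do
  cookie_name { "#{app_id}_session" }  # refers to app_id before it is set
  app_id "my_app"
end

puts CONFIG.cookie_name  # prints "my_app_session"
```

Because the block is evaluated on every read, a later change to `app_id` is reflected in `cookie_name` automatically.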
However, supporting references introduces its own set of considerations and edge cases. For instance, consider what might happen if you attempt to read any attribute in this configuration:
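One such edge case, assumed here for illustration, is a pair of attributes that reference each other. The sketch below uses a hypothetical, stripped-down class, `LazyPair`, that only supports block values.

```ruby
# Hypothetical minimal class with two lazily evaluated attributes.
class LazyPair
  %i[foo bar].each do |attr|
    define_method attr do |&block|
      if block
        instance_variable_set("@#{attr}", block)            # store for later
      else
        instance_eval(&instance_variable_get("@#{attr}"))   # evaluate on read
      end
    end
  end
end

PAIR = LazyPair.new
PAIR.instance_eval do
  foo { bar }
  bar { foo }
end

begin
  PAIR.foo
rescue SystemStackError => e
  puts "reading foo failed: #{e.class}"  # prints "reading foo failed: SystemStackError"
end
```

Reading either attribute evaluates the other one, which evaluates the first again, until Ruby runs out of stack space.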
The Final Module
We’ve developed a robust module for making arbitrary classes configurable. It utilizes a clean and straightforward DSL that even supports referencing configuration attributes from one another:
Here’s the complete module implementing our DSL, concisely written in 36 lines of code:
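The listing below is a reconstruction assembled from the steps described in this article, followed by a short usage example. Details such as the identity check with `equal?` and the sample values are assumptions.

```ruby
module Configurable
  def self.with(*attrs)
    not_provided = Object.new # marker for "no argument given"

    config_class = Class.new do
      attrs.each do |attr|
        define_method attr do |value = not_provided, &block|
          if value.equal?(not_provided) && block.nil? # read
            result = instance_variable_get("@#{attr}")
            result.is_a?(Proc) ? instance_eval(&result) : result
          else # write, preferring a block for lazy evaluation
            instance_variable_set("@#{attr}", block || value)
          end
        end
      end

      attr_writer(*attrs)
    end

    class_methods = Module.new do
      define_method :config do
        @config ||= config_class.new
      end

      def configure(&block)
        config.instance_eval(&block)
      end
    end

    Module.new do
      singleton_class.send :define_method, :included do |host_class|
        host_class.extend class_methods
      end
    end
  end
end

class MyApp
  include Configurable.with(:app_id, :title, :cookie_name)
end

MyApp.configure do
  app_id "my_app"
  title "My App"
  cookie_name { "#{app_id}_session" }
end

puts MyApp.config.cookie_name  # prints "my_app_session"
```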
Looking at this intricate Ruby code, which is arguably difficult to decipher and maintain, you might question whether the effort was worthwhile simply to enhance a DSL’s aesthetics. The answer depends on the context, leading us to the final point of this article.
Ruby DSLs: When to Use and When to Avoid Them
As we streamlined the external syntax of our DSL, we increasingly relied on complex metaprogramming tricks internally. This resulted in an implementation that might prove challenging to comprehend and modify in the future. Like many aspects of software development, this involves a trade-off that requires careful evaluation.
For a DSL to justify its implementation and maintenance overhead, it should offer substantial benefits. These benefits often come from reusability across various scenarios, effectively distributing the cost across multiple use cases. Frameworks and libraries frequently incorporate their own DSLs precisely because they cater to a large developer base, with each developer benefiting from the productivity gains of these embedded languages.
Therefore, a general guideline is to consider building DSLs when you, your fellow developers, or your application’s end-users will derive significant value from them. If you choose to create a DSL, prioritize comprehensive test coverage and clear documentation of its syntax. This is crucial, as understanding a DSL’s functionality solely from its implementation can be very difficult. Your future self and fellow developers will appreciate your foresight.