Exploring Different Tech Stack Options - Pros and Cons

When a web application reaches a certain age and size, splitting it into smaller, self-contained parts and extracting services can become necessary. This can be driven by the need to speed up testing, enable independent deployments of different app sections, or establish clear boundaries between subsystems. Service extraction presents software engineers with crucial decisions, including the technology stack for the new service.

This post recounts our experience extracting a new service from our monolithic Toptal Platform. We’ll delve into our chosen tech stack, the rationale behind it, and the challenges we encountered during implementation.

Our Toptal Chronicles service manages all user actions on the Toptal Platform, essentially functioning as a log of user activities. Each action, like publishing a blog post or approving a job, generates a new log entry.

While originating from our Platform, Chronicles is inherently independent and can integrate with any application. This independence is why we’re sharing our process and the hurdles our engineering team overcame during the transition.

Several factors motivated our decision to extract the service and upgrade the stack:

  • We aimed to enable other services to log events viewable and usable elsewhere.
  • The database tables storing history records were growing rapidly and unevenly, leading to high operational costs.
  • We recognized a significant technical debt in the existing implementation.
(Image: the database tables storing action records)

While an alternative tech stack can appear straightforward at first, it often brings unforeseen challenges, and those challenges are what this article explores.

Architecture Overview

Chronicles comprises three relatively independent parts, each running in separate Docker containers (a minimal sketch of the first two parts follows the list):

  • Kafka consumer: A lightweight Karafka-based Kafka consumer that receives entry creation messages and queues them in Sidekiq.
  • Sidekiq worker: Processes Kafka messages and creates corresponding entries in the database table.
  • GraphQL endpoints:
    • Public endpoint: Exposes the entry search API used for various Platform functions (e.g., rendering comment tooltips, displaying job change history).
    • Internal endpoint: Enables tag rule and template creation from data migrations.
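
To make the first two parts concrete, here is a minimal sketch of the consumer-to-worker flow mentioned above. It assumes Karafka 1.3’s params_batch API; the class names and the persistence call are illustrative, not taken from the Chronicles codebase.

require 'karafka'
require 'sidekiq'

# Karafka consumer: stays lightweight by handing each message to Sidekiq.
class EntriesConsumer < Karafka::BaseConsumer
  def consume
    params_batch.each do |message|
      CreateEntryWorker.perform_async(message.payload)
    end
  end
end

# Sidekiq worker: performs the actual database write.
class CreateEntryWorker
  include Sidekiq::Worker

  def perform(payload)
    # Persist the entry (the persistence layer is elided in this sketch).
    Chronicles::Persistence.create_entry(payload)
  end
end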

Initially, Chronicles connected to two databases: its own for storing tag rules and templates, and the Platform database for user actions, tags, and taggings. During extraction, we migrated data from the Platform database and severed the connection.

Initial Plan

We initially opted for Hanami and its default ecosystem (hanami-model with ROM.rb, dry-rb, hanami-newrelic, etc.). This “standard” approach promised minimal friction, rapid implementation, and readily available solutions for potential issues. The hanami ecosystem offered maturity, popularity, and active maintenance by respected Ruby community members.

Since a significant portion of the system, like the GraphQL Entry Search endpoint and CreateEntry operation, was already implemented on the Platform side, we planned to reuse much of that code in Chronicles without modification. This was another reason for not choosing Elixir, as it wouldn’t allow for direct code reuse.

Rails felt excessive for this relatively small project; components like ActiveSupport in particular wouldn’t provide significant benefits for our needs.

When the Plan Goes South

Despite our best efforts, several factors led us to deviate from the initial plan: our limited experience with the chosen stack, genuine issues with the stack itself, and our non-standard two-database setup. Ultimately, we abandoned hanami-model and then Hanami entirely, replacing it with Sinatra.

Sinatra is a well-maintained library with more than a decade of history and widespread popularity, and the team already had ample hands-on experience with it.

Incompatible Dependencies

At the time of Chronicles’ extraction in June 2019, Hanami lacked compatibility with the latest dry-rb gems. Hanami 1.3.1 (the latest version then) only supported dry-validation 0.12, while we needed dry-validation 1.0.0 for its contract feature. Additionally, Karafka 1.2 was incompatible with the new dry-rb gems, forcing us to use the repository version. Currently, we use Karafka 1.3.0.rc1, which supports the latest dry-rb gems.
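
In Gemfile terms, the bind looked roughly like this. This is a hypothetical sketch; the exact Git reference we pinned is not preserved in this post.

# We needed dry-validation 1.0 for its contract feature...
gem 'dry-validation', '~> 1.0'

# ...which ruled out the released Karafka 1.2, so Karafka had to come
# straight from its repository until 1.3 shipped.
gem 'karafka', git: 'https://github.com/karafka/karafka.git'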

Unnecessary Dependencies

The Hanami gem brought along many dependencies we didn’t intend to use, including hanami-cli, hanami-assets, hanami-mailer, hanami-view, and even hanami-controller. The hanami-model readme also revealed that it supported only one database by default, while ROM.rb, the foundation of hanami-model, offered out-of-the-box multi-database configurations.
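
For reference, a two-gateway ROM.rb setup looks roughly like this. The sketch relies on ROM’s documented multi-gateway configuration; the environment variable names and the relation are placeholders rather than our production code.

require 'rom'

# Two SQL gateways: Chronicles' own database plus the Platform database.
config = ROM::Configuration.new(
  default:  [:sql, ENV.fetch('CHRONICLES_DATABASE_URL')],
  platform: [:sql, ENV.fetch('PLATFORM_DATABASE_URL')]
)

# A relation that reads from the :platform gateway instead of the default one.
class Entries < ROM::Relation[:sql]
  gateway :platform
  schema(:performed_actions, infer: true)
end

config.register_relation(Entries)
rom = ROM.container(config)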

Ultimately, Hanami and hanami-model felt like unnecessary layers of abstraction.

Consequently, ten days after the first substantial PR for Chronicles, we replaced Hanami with Sinatra. While pure Rack was also an option given our simple routing needs (four “static” endpoints), we opted for the slightly more structured Sinatra, which proved to be a perfect fit. If you are interested in learning more about this, check out our Sinatra and Sequel tutorial.

Dry-schema and Dry-validation Misunderstandings

Mastering dry-validation’s intricacies and proper usage took time and experimentation.

params do
  required(:url).filled(:string)
end

params do
  required(:url).value(:string)
end

params do
  optional(:url).value(:string?)
end

params do
  optional(:url).filled(Types::String)
end

params do
  optional(:url).filled(Types::Coercible::String)
end

This snippet shows multiple, slightly different ways of defining the url parameter. Some are equivalent, while others are nonsensical. Initially, we struggled to differentiate between these definitions due to our limited understanding, resulting in a somewhat chaotic first version of our contracts. Over time, we gained proficiency in reading and writing DRY contracts, resulting in consistent and elegant code—beautiful, even. We’ve even extended contract validation to our application configuration.
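
To give a flavor of that last point, here is a minimal sketch of a configuration contract in dry-validation 1.x. The field names are illustrative, not from the Chronicles codebase.

require 'dry/validation'

class AppConfigContract < Dry::Validation::Contract
  params do
    required(:database_url).filled(:string)
    required(:kafka_brokers).array(:string)
    optional(:log_level).maybe(:string)
  end

  rule(:database_url) do
    key.failure('must be a postgres:// URL') unless value.start_with?('postgres://')
  end
end

# Validate the configuration at boot and fail fast on bad input.
result = AppConfigContract.new.call(
  database_url: ENV['DATABASE_URL'],
  kafka_brokers: ENV.fetch('KAFKA_BROKERS', '').split(','),
  log_level: ENV['LOG_LEVEL']
)
raise "Invalid configuration: #{result.errors.to_h}" if result.failure?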

Problems with ROM.rb and Sequel

ROM.rb and Sequel’s differences from ActiveRecord were expected. Our initial assumption of easily copying code from the Platform, which heavily relied on ActiveRecord, proved incorrect. We had to rewrite almost everything in ROM/Sequel, with only small, framework-agnostic code snippets being reusable. This process brought its share of frustrating issues and bugs.

Filtering by Subquery

For instance, figuring out subqueries in ROM.rb/Sequel took considerable effort. What would be a simple scope.where(sequence_code: subquery) in Rails turned out to be much more involved in Sequel:

def apply_subquery_filter(base_query, params)
  subquery = as_subquery(build_subquery(params))
  base_query.where { Sequel.lit('sequence_code IN ?', subquery) }
end

# This is a fixed version of https://github.com/rom-rb/rom-sql/blob/6fa344d7022b5cc9ad8e0d026448a32ca5b37f12/lib/rom/sql/relation/reading.rb#L998
# The original version has `unorder` on the subquery.
# The fix was merged: https://github.com/rom-rb/rom-sql/pull/342.
def as_subquery(relation)
  attr = relation.schema.to_a[0]
  subquery = relation.schema.project(attr).call(relation).dataset
  ROM::SQL::Attribute[attr.type].meta(sql_expr: subquery)
end

Instead of a concise base_query.where(sequence_code: build_subquery(params)), we ended up with lengthy code, raw SQL fragments, and multiline comments explaining the reasons behind this verbosity.

Associations with Non-trivial Join Fields

The entry relation (representing the performed_actions table) uses the sequence_code column for joining with *taggings tables, despite having a primary id field. While straightforward in ActiveRecord:

class PerformedAction < ApplicationRecord
  has_many :feed_taggings,
    class_name: 'PerformedActionFeedTagging',
    foreign_key: 'performed_action_sequence_code',
    primary_key: 'sequence_code'
end

class PerformedActionFeedTagging < ApplicationRecord
  db_belongs_to :performed_action,
    foreign_key: 'performed_action_sequence_code',
    primary_key: 'sequence_code'
end

Replicating this in ROM was possible:

class Chronicles::Persistence::Relations::Entries < ROM::Relation[:sql]
  struct_namespace Chronicles::Entities
  auto_struct true

  schema(:performed_actions, as: :entries) do
    attribute :id, ROM::Types::Integer
    attribute :sequence_code, ::Types::UUID
    primary_key :id

    associations do
      has_many :access_taggings,
        foreign_key: :performed_action_sequence_code,
        primary_key: :sequence_code
    end
  end
end

class Chronicles::Persistence::Relations::AccessTaggings < ROM::Relation[:sql]
  struct_namespace Chronicles::Entities
  auto_struct true

  schema(:performed_action_access_taggings, as: :access_taggings, infer: false) do
    attribute :performed_action_sequence_code, ::Types::UUID
    
    associations do
      belongs_to :entry, foreign_key: :performed_action_sequence_code,
                          primary_key: :sequence_code,
                          null: false
    end
  end
end

However, this seemingly correct code loaded fine but failed at runtime:

[4] pry(main)> Chronicles::Persistence.relations[:platform][:entries].join(:access_taggings).limit(1).to_a
E, [2019-09-05T15:54:16.706292 #20153] ERROR -- : PG::UndefinedFunction: ERROR:  operator does not exist: integer = uuid
LINE 1: ...ion_access_taggings" ON ("performed_actions"."id" = "perform...
                                                            ^
HINT:  No operator matches the given name and argument types. You might need to add explicit type casts.: SELECT <..snip..> FROM "performed_actions" INNER JOIN "performed_action_access_taggings" ON ("performed_actions"."id" = "performed_action_access_taggings"."performed_action_sequence_code") ORDER BY "performed_actions"."id" LIMIT 1
Sequel::DatabaseError: PG::UndefinedFunction: ERROR:  operator does not exist: integer = uuid
LINE 1: ...ion_access_taggings" ON ("performed_actions"."id" = "perform...

Fortunately, the differing types of id and sequence_code resulted in a PostgreSQL type error. Had they been the same, debugging would have been significantly more challenging.

Since entries.join(:access_taggings) failed, we tried explicitly specifying the join condition as entries.join(:access_taggings, performed_action_sequence_code: :sequence_code), as suggested in the documentation:

[8] pry(main)> Chronicles::Persistence.relations[:platform][:entries].join(:access_taggings, performed_action_sequence_code: :sequence_code).limit(1).to_a
E, [2019-09-05T16:02:16.952972 #20153] ERROR -- : PG::UndefinedTable: ERROR:  relation "access_taggings" does not exist
LINE 1: ...."updated_at" FROM "performed_actions" INNER JOIN "access_ta...
                                                             ^: SELECT <snip> FROM "performed_actions" INNER JOIN "access_taggings" ON ("access_taggings"."performed_action_sequence_code" = "performed_actions"."sequence_code") ORDER BY "performed_actions"."id" LIMIT 1
Sequel::DatabaseError: PG::UndefinedTable: ERROR:  relation "access_taggings" does not exist

This, however, resulted in :access_taggings being incorrectly interpreted as the table name. Replacing it with the actual table name:

[10] pry(main)> data = Chronicles::Persistence.relations[:platform][:entries].join(:performed_action_access_taggings, performed_action_sequence_code: :sequence_code).limit(1).to_a

=> [#<Chronicles::Entities::Entry id=22 subject_gid="gid://platform/Talent/124383" ... updated_at=2012-05-10 08:46:43 UTC>]

While this finally worked, it introduced a leaky abstraction, as table names ideally shouldn’t leak into application code.
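
One way to contain the leak, shown here as an illustrative sketch rather than code from our repository, is to keep the physical table name inside the relation itself so that callers only see a named query method:

class Entries < ROM::Relation[:sql]
  schema(:performed_actions, as: :entries, infer: true)

  # The raw table name stays here, out of application code.
  def with_access_taggings
    join(:performed_action_access_taggings,
         performed_action_sequence_code: :sequence_code)
  end
end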

SQL Parameter Interpolation

Chronicles’ search feature allows users to search by payload using queries like {operation: :EQ, path: ["flag", "gid"], value: "gid://plat/Flag/1"}, where path is always an array of strings and value can be any valid JSON value.

In ActiveRecord, this translates to:

@scope.where('payload #> :path = :value::jsonb', path: "{#{path.join(',')}}", value: value.to_json)

In Sequel, properly interpolating :path proved difficult, forcing us to resort to this:

base_query.where(Sequel.lit("payload #> '{#{path.join(',')}}' = ?::jsonb", value.to_json))

While path undergoes validation to ensure it contains only alphanumeric characters, this code snippet highlights the awkwardness we encountered.
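
For illustration, a guard in the spirit of that validation might look like the following. This is an assumed sketch; the actual contract is not shown in this post.

require 'dry/validation'

class PayloadFilterContract < Dry::Validation::Contract
  params do
    required(:path).array(:string)
  end

  # Reject any segment that is not strictly alphanumeric before the path
  # is interpolated into the SQL fragment above.
  rule(:path) do
    unless value.all? { |segment| segment.match?(/\A[[:alnum:]]+\z/) }
      key.failure('segments must be alphanumeric')
    end
  end
end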

Silent Magic of ROM-factory

We used the rom-factory gem to streamline model creation in tests. However, we encountered unexpected behavior several times. Can you identify the issue in this test?

action1 = RomFactory[:action, app: 'plat', subject_type: 'Job', action: 'deleted']
action2 = RomFactory[:action, app: 'plat', subject_type: 'Job', action: 'updated']

expect(action1.id).not_to eq(action2.id)

The expectation itself is not the problem.

The issue lies in the second line, which fails due to a unique constraint validation error. This occurs because the Action model doesn’t have an action attribute. The correct attribute is action_name. The correct way to create actions would be:

RomFactory[:action, app: 'plat', subject_type: 'Job', action_name: 'deleted']

The mistyped attribute was silently ignored, defaulting to the factory’s default (action_name { 'created' }). This led to a unique constraint violation when attempting to create two identical actions. This issue recurred multiple times, proving quite cumbersome.
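
For context, the factory presumably looked something like this. The sketch uses rom-factory’s documented API; the attribute list and container accessor are assumptions.

RomFactory = ROM::Factory.configure do |config|
  config.rom = Chronicles::Persistence.container # assumed accessor
end

RomFactory.define(:action) do |f|
  f.app 'plat'
  f.subject_type 'Job'
  f.action_name { 'created' } # the default that silently won over the typo
end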

Fortunately, this was fixed in version 0.9.0. Dependabot automatically submitted a pull request with the library update, which we merged after fixing several mistyped attributes in our tests.

General Ergonomics

This example illustrates the situation effectively:

# ActiveRecord
PerformedAction.count # => 30232445

# ROM
EntryRepository.new.root.count # => 30232445

This difference becomes even more pronounced in more complex scenarios.

The Good Parts

Despite the challenges, our journey also had numerous positive aspects that far outweighed the negatives, making the entire endeavor worthwhile.

Test Speed

Running the entire test suite locally takes a mere 5-10 seconds, as does RuboCop. While CI takes longer (3-4 minutes), the ability to run everything locally minimizes the impact, as CI failures become less frequent.

The guard gem has become usable again. Imagine being able to write code and run tests on every save, receiving instant feedback. This was unimaginable when working with the Platform.
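
A typical setup for that workflow looks something like the following Guardfile, an illustrative example using guard-rspec rather than the one from the Chronicles repository:

# Guardfile: rerun the matching spec on every save.
guard :rspec, cmd: 'bundle exec rspec' do
  watch(%r{^spec/.+_spec\.rb$})
  watch(%r{^lib/(.+)\.rb$}) { |m| "spec/#{m[1]}_spec.rb" }
end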

Deploy Times

Deploying the extracted Chronicles app takes just two minutes. While not exceptionally fast, it’s still an improvement. Frequent deployments amplify the benefits of even small time savings.

Application Performance

Chronicles’ most performance-critical aspect is Entry search. With about 20 places in the Platform back end fetching history entries from Chronicles, its response time contributes to the Platform’s 60-second response time budget.

Despite the massive actions log (over 30 million rows and growing), the average response time is less than 100ms, as illustrated in this chart:

(Image: Chronicles application performance chart)

With 80-90% of the app’s time spent in the database, this demonstrates a healthy performance profile.

While some slow queries can take up to tens of seconds, we have a plan to eliminate them, further improving the extracted app’s performance.

Structure

dry-validation has proven to be a powerful and versatile tool for our requirements. By passing all external input through contracts, we ensure data integrity and well-defined types.

The need for to_s.to_sym.to_i calls in application code is eliminated, as data is cleansed and typecast at the app’s boundaries. This effectively brings the sanity of strong typing to the dynamic world of Ruby. We highly recommend it.

Final Words

Choosing a non-standard stack turned out to be more complex than anticipated. We carefully considered various factors: the monolith’s existing stack, the team’s familiarity with the new stack, and the chosen stack’s maintenance status.

Despite meticulous planning and initially opting for the standard Hanami stack, the project’s unique technical requirements forced us to reconsider. We eventually settled on Sinatra and a DRY-based stack.

Would we choose Hanami again for a similar app extraction? Probably yes. With our current knowledge of the library’s strengths and weaknesses, we could make more informed decisions from the outset. However, a plain Sinatra/DRY.rb app would also be a strong contender.

Ultimately, exploring new frameworks, paradigms, and programming languages provides valuable insights and fresh perspectives on our existing tech stack. Expanding our knowledge of available tools allows us to make better-informed decisions and choose the most appropriate tools for our applications.

Licensed under CC BY-NC-SA 4.0