Enhancing SQL Traceability with Sqlcommenter in Elixir

Enhancing SQL Traceability with Sqlcommenter in Elixir

Table of Contents

This post is outdated - there is a solution now: https://dev.to/dkuku/sql-commenter-with-postgrex-2bfd

For anyone who prefers to read the code - both ways are here: TLDR

In the world of database operations, traceability is a crucial aspect. It’s essential to know which piece of code generated a specific SQL query. This knowledge becomes even more important when working with data mapping frameworks. That’s where Sqlcommenter for Elixir comes into play.

Sqlcommenter is an Elixir library designed for SQL comment management, enabling you to securely attach metadata as comments to SQL statements generated within your application. The metadata undergoes an escape process before being transformed into a comment, ensuring that it remains safeguarded against SQL injection risks. In this blog post, we’ll explore how to use Sqlcommenter to enhance your SQL traceability in Elixir.

Sqlcommenter is built based on the specification provided by https://google.github.io/sqlcommenter/. This specification serves as the foundation for Sqlcommenter’s functionality and ensures its compatibility with industry standards.

Installation

Getting started with Sqlcommenter in your Elixir project is straightforward. You need to add it as a dependency in your mix.exs file:

def deps do
  [{:sqlcommenter, "~> 0.1"}]
end

After adding the dependency, run mix deps.get.

Usage

Once Sqlcommenter is installed, you can start using it to trace your SQL queries. Currently Ecto does not have any way to do this automatically, there is no comment option in ecto. I saw multiple questions about this already and no proper plan to implement it. We need to work around this limitation. I found 2 ways of doing this both with own limitations. Here’s a step-by-step guide on how to do it:

Generate raw sql and attach the trace info

First, you’ll need to modify your MyApp.Repo module by adding a new function called all_traced. This function will be responsible for attaching SQL comments to your SQL queries:

  def all_traced(queryable, opts \\ []) do
    {metadata, opts} = Keyword.pop(opts, :metadata, %{})
    {:current_stacktrace, stacktrace} = Process.info(self(), :current_stacktrace)

    stacktrace =
      stacktrace |> Enum.drop(2) |> Enum.take(1) |> Exception.format_stacktrace() |> String.trim()

    queryable = Ecto.Queryable.to_query(queryable)
    {query, params} = __MODULE__.to_sql(:all, queryable)
    query = Sqlcommenter.append_to_query(query, put_in(metadata, [:stacktrace], stacktrace))

    {:ok, result} =
      __MODULE__.query(query, params, opts)

    struct =
      case queryable.from do
        # maybe other load types
        %{source: {_, struct}} -> struct
      end

    Enum.map(
      result.rows,
      &__MODULE__.load(struct, {result.columns, &1})
    )
  end

The all_traced function is a custom function you can add to your Elixir application when using the Sqlcommenter library. This function serves the purpose of enhancing traceability for SQL queries generated by your application when interacting with a database. Let’s break down what this function does and why it’s useful:

Function Signature: The all_traced function takes two arguments:

  • queryable: This argument represents the ecto query you want to execute
  • opts: This is an optional argument that allows you to pass additional options to the function. In the context of Sqlcommenter, it can include metadata that you want to attach as SQL comments to the generated query.

Metadata: Within the function, the opts argument is processed to extract metadata. Metadata consists of key-value pairs that provide context or information about the query. For example, in the blog post’s code snippet, the request_id is extracted from the metadata. Additionally the metadata will be enhanced by the function name that executed the query. This is extracted from the stacktrace.

SQL Query Generation: The function uses Repo.to_sql to convert the queryable into an SQL query and its parameters. This step is necessary to prepare the query for execution.

Sqlcommenter Integration: The key role of the all_traced function is to integrate Sqlcommenter into the query. It calls Sqlcommenter.append_to_query and passes the original query and metadata params as arguments. Sqlcommenter then appends the metadata as an SQL comment to the query.

Query Execution: Finally, the modified query is passed to Repo.query for execution. This is where the actual database query is executed, but now it includes the SQL comment added by Sqlcommenter.

Loading Data: Repo.query returns raw values from the database, we have to load the values back to Ecto.Schemas - this happens at the end of the function where we extract the struct name and use Repo.load to load the data.

In essence, the all_traced function is a middleware or wrapper around your database queries. It extends the query with SQL comments that contain metadata, making it easier to trace the origin of the query. This traceability is particularly valuable in scenarios where you need to correlate your query with web request id.

By using all_traced and Sqlcommenter, you can associate SQL queries with specific context information, like the request_id in the example, which can be immensely helpful for debugging, monitoring, and auditing your database interactions. You can also load the trace_id, span_id when using OpenTelemetry or Spandex. A disadvantage here is that it has to be modified to use more complicated queries with preloads or schemaless queries

2. Using the all_traced Function

Now that you’ve defined the all_traced function, you can use it for querying your database. Here’s an example:

Schemas.Person
|> Repo.all_traced(metadata: %{request_id: Ecto.UUID.generate()})

In this example, we’re querying the Person schema and attaching a request_id to it. This request_id is added as an SQL comment, making it easier to trace the query later.

SQL Query Traceability

When you execute the above code, you’ll receive your data as usual. However, behind the scenes, Sqlcommenter has added a comment to your SQL query. Here’s what it might look like:

SELECT p0."id", p0."first_name" FROM "person"."person" AS p0 /*request_id='fa2af7b2-d8e1-4e8f-8820-3fd648b73187'*/ []

As you can see, the request_id is embedded as an SQL comment within the query. This simple addition makes it much easier to correlate the SQL statement with the originating code.

Using Ecto.Query.lock

Lock is the last part appended to the query in postgres. Ecto accepts a string as this param, so we can just insert or append our Sqlcommenter data into this field. This change is very simple:

Person
|> where([p], p.id == ^person_id)
|> lock(Sqlcommenter.to_str(metadata))

It could be as simple as that but ecto is checking and makes sure there is no way of SQL injection attacks.

** (Ecto.Query.CompileError) `Sqlcommenter.to_str(a: :b)`
 is not a valid lock. For security reasons,
 a lock must always be a literal string or a fragment

Also ecto is caching queries in ets, This way it does not have to build the query every time is executed, having every time a different lock makes it impossible to cache. That being said, we can get around it by modifying the function and swapping the data in the actual query to skip this validation:

  def all_traced_lock(queryable, opts \\ []) do
    {metadata, opts} = Keyword.pop(opts, :metadata, %{})
    {:current_stacktrace, stacktrace} = Process.info(self(), :current_stacktrace)

    stacktrace =
      stacktrace |> Enum.drop(2) |> Enum.take(1) |> Exception.format_stacktrace() |> String.trim()

    metadata = put_in(metadata, [:stacktrace], stacktrace)

    queryable =
      queryable
      |> Ecto.Queryable.to_query()
      |> Map.update(:lock, nil, fn
        nil -> Sqlcommenter.to_str(metadata)
        lock -> Sqlcommenter.append_to_query(lock, metadata)
      end)

    __MODULE__.all(queryable, opts)
  end

This change either creates the lock entry or appends to it. This makes it easier to load the data, you can also use schemaless queries and preloads.

Conclusion

It would be great when ecto would support this out of the box, sqlcommenter is compatible with opentelemetry which becomes the de facto standard for tracing elixir code. I know that both of my proposal are hacks but any tool in your toolbox that helps with debugging may come in handy at some point.

comments powered by Disqus

Related Posts

Pseudolocalization in Phoenix with gettext_pseudolocalize

Pseudolocalization in Phoenix with gettext_pseudolocalize

Understanding Gettext in Elixir Applications

Gettext is a widely adopted internationalization (i18n) system that helps developers make their applications available in multiple languages. Originally developed for GNU projects, it has become a standard solution across many programming languages and frameworks, including Elixir and Phoenix.

Read More
Reverse Engineer Unknown Data Format Using Elixir

Reverse Engineer Unknown Data Format Using Elixir

I recently worked on porting the csv file converter for AdventureWorks database from ruby to elixir. This ended successful and I have all the data in my local postgresql to play with. By browsing the tables I found spatial_location column in person.address table and I wanted to decode it.

Read More
Enhancements to dbg in Elixir 1.18

Enhancements to dbg in Elixir 1.18

Elixir 1.18 added some interesting features, but one that went under the radar was extended support for dbg. In 1.17 when you had this code:

Read More