Debugging a tricky production error with rubber duck GPT


I was recently working on a bug that only appeared on production. I couldn’t reproduce it locally at all. These types of bugs are challenging.

Although I had some experience with this sort of bug, I remembered that it took me a long time to figure it out the first time. Could GPT help me debug it?

Although the framework in use is Ruby on Rails, I’ve written this post to focus more on the problem-solving process with GPT than on the code. I’ll try to explain anything that may require specific knowledge.

(All function and file names have been changed from the original)

1. What happened

A user reported that they couldn’t create a weekly task. I went on our production site, and checked that the bug did exist. After creating an issue for it (and some time later), I tried to reproduce the bug locally on my development server. I couldn’t.

It’s one of those tricky bugs. Okay, what I can do next is grab the relevant logs from production to see what’s wrong:

ActionController::RoutingError
uninitialized constant Users::Weekly::WeeklyTasksController
Did you mean?  Users::WeeklyTasksController

🤔 That’s strange. Our codebase doesn’t have Users::Weekly::WeeklyTasksController. The one suggested by Did you mean? is the one that actually exists in our codebase: Users::WeeklyTasksController.

Somehow Rails thinks our WeeklyTasksController is nested under /users/weekly/ when it’s nested under /users/ (without the weekly).

(Image: the correct Rails controller location)
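
To make the error message easier to read, here is a minimal sketch of the naming convention involved: Rails turns a route's controller path into a constant name, and expects that constant to live in a matching file under app/controllers. The helper names below (controller_constant, expected_file) are made up for illustration and use plain Ruby string handling rather than Rails' own inflector, so treat this as an approximation.

```ruby
# Plain-Ruby approximation of the Rails naming convention.
# controller_constant and expected_file are hypothetical helpers, not Rails APIs.

# Turns a route-style controller path into the constant Rails would look up.
def controller_constant(controller_path)
  namespaced = controller_path.split("/").map do |segment|
    segment.split("_").map(&:capitalize).join
  end
  namespaced.join("::") + "Controller"
end

# Turns a constant name into the file path where it is expected to be defined.
def expected_file(constant_name)
  parts = constant_name.split("::").map do |part|
    part.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  end
  "app/controllers/#{parts.join('/')}.rb"
end

puts controller_constant("users/weekly/weekly_tasks")
# Users::Weekly::WeeklyTasksController
puts expected_file("Users::Weekly::WeeklyTasksController")
# app/controllers/users/weekly/weekly_tasks_controller.rb
puts expected_file("Users::WeeklyTasksController")
# app/controllers/users/weekly_tasks_controller.rb
```

The constant in the log corresponds to a file under users/weekly/, while the controller that actually exists sits one directory higher, directly under users/.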

Let’s see what GPT thinks.

2. Finding the root cause of the discrepancy between dev and prod

a. Routes

[Me]: You are an expert programmer and debugger for the Ruby on Rails framework. Please help me with this problem. The below comes from the log on our production server. However, it mentions this controller exists but I checked that it does not exist in our codebase: `Users::Weekly::WeeklyTasksController`. The controller that does exist is `Users::WeeklyTasksController`. Why would this routing error occur?

[ChatGPT]: It seems that there is a namespace mismatch in the route. The error message mentions Users::Weekly::WeeklyTasksController but the actual controller that exists in the codebase is Users::WeeklyTasksController. This could happen if there was a recent refactoring of the code that renamed or moved the controller to a different namespace.

To fix this error, you will need to update the routes file to point to the correct controller or rename the controller to match the namespace. It is also possible that there are other places in the codebase that reference the incorrect namespace and would need to be updated as well.

💡 Makes sense, there might be a problem in the routes file.

If we aren’t familiar with Rails, we might not know much about namespaces and route matching. Let’s ask GPT.

[Me]: How do I check which url namespaces point to which controllers?

[ChatGPT]: To check which URL namespaces point to which controllers within a Rails application, you can analyze the config/routes.rb file in your codebase. This file defines the routing table that Rails uses to map incoming URLs to your application’s controllers and actions.

To view a list of all the defined routes in your application and the corresponding controller/action pair, you can run the command rails routes. This displays a table of all the defined routes, their HTTP verb, URL pattern, naming prefix, controller, and action.

You can also search for a specific controller by running the command rails routes | grep <controller_name>. This will display all the routes that are mapped to the specified controller. For example, running rails routes | grep HomeController would display all the routes mapped to the HomeController.

Great. I follow these instructions and find out that the route table actually contains the controller that shouldn’t exist: Users::Weekly::WeeklyTasksController. I’m going to make a mental note to follow this path later to see how the route and controllers are defined.

b. Controller discrepancy

First, I want to figure out why Rails would say this controller exists when it doesn’t.

[Me]: The rails routes command returns one route where the Controller#Action is users/weekly/weekly_tasks_controller#create. However, that does not exist. The actual Controller#Action is users/weekly_tasks_controller#create. How is Ruby on Rails able to still reroute it to the correct controller?

[ChatGPT]: In Rails, routes are matched in the order they are defined. When the request is made, Rails searches through each route in the order they are listed until it finds one that matches the incoming request URL. In this case, since users/weekly/weekly_tasks_controller#create is not found but users/weekly_tasks_controller#create is defined, Rails will match and route the request to the correct controller and action.

❓ I’m still confused here.

[Me]: That doesn’t make sense because users/weekly_tasks_controller#create does not exist in Rails routes.

[ChatGPT]: Rails uses a default convention for mapping URLs to controller actions. For example, a URL ending in /users/weekly_tasks_controller would by default be mapped to the weekly_tasks_controller action inside the UsersController. If we add /create to the end of that URL, Rails will by default map it to the create action of the same controller (UsersController).

So even though the output of rails routes does not show this exact route, because of Rails’ default URL-convention, when we access a URL that matches the pattern /users/weekly_tasks_controller/create, Rails is able to correctly route the request to the create action inside the Users.

💡 From this exchange, I’m beginning to think that there is some dynamic matching that makes the route match to the controller, even though the controller’s path is incorrect. But I’m still confused about the discrepancy between development and production.

c. Configuration mismatch

[Me]: Why would Rails be able to correctly route the request on my local dev server, but not on production?

[ChatGPT]: One possible reason why Rails is able to correctly route the request on your local dev server but not on production is because there might be a mismatch between the deployed code on your production server and the code that you tested locally.

This could happen due to a number of reasons such as using different versions of Ruby, Rails or other gems, differences in the configuration settings, or different versions of third-party dependencies. Therefore, it would be a good idea to check for these discrepancies and ensure that both environments are configured identically.

Additionally, there could be differences in the server configuration or environment variables used in production as compared to the local development environment that could also affect the routing of the request.

💡 That last paragraph seems promising. Let’s dig into prod/dev configuration a bit more.

[Me]: How do I find differences in configuration?

[ChatGPT]: To find differences in configuration between your local development environment and production, you can start by comparing files such as:

  • Gemfile and Gemfile.lock
  • config/database.yml
  • config/application.rb
  • config/environments/production.rb
  • config/environments/development.rb

You can also check for differences in server configurations such as the version of Ruby or Rails, the web server (such as Nginx or Apache) and its configuration file(s), environment variables, and any other server-specific settings.

In addition to manually comparing files and configurations, you could also use tools like Ansible, Chef or Puppet to automate the configuration management process and ensure that both environments are provisioned the same way. You can also use a platform like Heroku that provides a consistent environment for running Ruby on Rails applications, eliminating many of the configuration discrepancies that can arise between different environments.

Yes, let’s check out these two files: production.rb and development.rb.

After some digging, I found a few differences.

💡 One of the most important differences seems to be eager loading. Let’s start with that.

d. Eager loading

[Me]: Is it possible for eager loading differences between development and production to cause a difference in URL mapping? Please explain your reasoning.

[ChatGPT]: Yes, it is possible for eager loading differences between development and production to cause differences in URL mapping.

Eager loading is a way to improve the performance of a Rails application by loading data ahead of time, before it is actually needed, in order to reduce database queries. Rails has a few strategies for eager loading, including eager_load and autoload.

If you have different eager loading strategies between your development and production environments, it’s possible that some models or modules might not be loaded correctly in one environment or the other. This could cause differences in the behaviour of your Rails application, including differences in URL mapping.

For example, if a certain module or class is not loaded correctly in production, a URL that works in development might not match the right controller and action in production, resulting in a routing error.

To avoid issues related to eager loading, it’s recommended to test your application thoroughly in all environments and ensure that all required models, classes and modules are loaded correctly. You can also use tools like the Rails console to check which models or classes are being loaded in each environment, and investigate any discrepancies that you find.

Yep, this sounds like what’s causing the difference between production and development.

🔎 After reading some documentation about eager loading, I’m confident that this configuration is causing the difference in behaviour.
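
For context, the setting in question is config.eager_load. Here is a rough sketch of how the two environment files typically differ, assuming the defaults that the Rails generator produces. The actual files from this codebase are not shown in this post and will contain other settings as well.

```ruby
# config/environments/development.rb (generator default, assumed here)
Rails.application.configure do
  # Application code is loaded lazily, the first time a constant is referenced.
  config.eager_load = false
end

# config/environments/production.rb (generator default, assumed here)
Rails.application.configure do
  # All application code is loaded up front when the app boots.
  config.eager_load = true
end
```

Note that this setting is about loading application code (classes and modules), which is a different thing from eager loading database records to reduce queries.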

🎉 We now understand why there’s a discrepancy between production and development. Next we need to figure out why Rails thinks our controller Users::WeeklyTasksController is nested under Weekly.

3. Digging into the routes

I made a mental note earlier to look into the routes, based on this comment:

[ChatGPT]: To check which URL namespaces point to which controllers within a Rails application, you can analyze the config/routes.rb file in your codebase.

After going into the route file, I find this relevant section:

namespace :users do
  # ...
  resources :weekly_tasks_controller, module: :weekly do
    resources :edit_tasks
  end
end

a. Namespaces and modules

Here we can go directly to the documentation about controller namespaces and routing to understand what’s happening. Basically, the code above says that these two controllers exist:

Users::Weekly::WeeklyTasksController
Users::Weekly::EditTasksController

The crucial part here is module: :weekly, which routes both WeeklyTasksController and EditTasksController under Weekly.

I have a suspicion that this code was edited recently. After checking recent git commits here, I’ve verified that the module bit was indeed added recently.

The problem is that Users::WeeklyTasksController isn’t nested under Weekly. The person who made the change probably just wanted to nest EditTasksController under Weekly, but accidentally included WeeklyTasksController as well.
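
The way those scopes stack up can be sketched in plain Ruby. This is a toy model rather than the Rails router: controller_path is a hypothetical helper, and the resource is assumed to be named weekly_tasks so that the result lines up with the controller named in the production log.

```ruby
# Toy model of nested routing scopes; this is not the Rails router.
# Each enclosing namespace or module option adds one segment in front of the resource.
def controller_path(resource, scopes)
  (scopes + [resource]).join("/")
end

# namespace :users, then module: :weekly on the outer resources line.
# The option applies to the outer resource and to everything nested in its block.
outer_scopes = ["users", "weekly"]

puts controller_path("weekly_tasks", outer_scopes) # users/weekly/weekly_tasks
puts controller_path("edit_tasks", outer_scopes)   # users/weekly/edit_tasks
```

The first path, users/weekly/weekly_tasks, is the one that resolves to Users::Weekly::WeeklyTasksController, the constant from the error.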

b. The fix

Now that I know what the problem is, the fix is simple. I just need to move the module option to the edit_tasks line, since only edit_tasks is nested under weekly.

Here’s our fix:

namespace :users do
  # ...
  resources :weekly_tasks_controller do
    resources :edit_tasks, module: :weekly
  end
end

Testing this fix locally and on staging confirms that it works! 🎉 Our job is done.
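
A check along these lines could catch this class of bug before it reaches production. It is a plain-Ruby sketch under assumptions: the controller paths and the file list are hypothetical stand-ins (in a real app they would come from the route table and from app/controllers), and missing_controllers is an invented helper.

```ruby
# Hypothetical helper: returns routed controller paths that have no matching file.
def missing_controllers(routed_paths, controller_files)
  routed_paths.reject do |path|
    controller_files.include?("app/controllers/#{path}_controller.rb")
  end
end

# Assumed file layout, based on the controllers described in this post.
files = [
  "app/controllers/users/weekly_tasks_controller.rb",
  "app/controllers/users/weekly/edit_tasks_controller.rb",
]

# Controller paths the routes point at, before and after moving the module option.
before_fix = ["users/weekly/weekly_tasks", "users/weekly/edit_tasks"]
after_fix  = ["users/weekly_tasks", "users/weekly/edit_tasks"]

p missing_controllers(before_fix, files) # ["users/weekly/weekly_tasks"]
p missing_controllers(after_fix, files)  # []
```

Because it compares routes against files directly, a check like this does not depend on whether code is loaded lazily or eagerly, so it would behave the same in development and production.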

Closing

This experiment to use ChatGPT as a rubber duck worked well. I was wary of ChatGPT hallucinations if I asked for something too specific. If I needed to look for something specific, I went for the documentation instead. ChatGPT was there to help brainstorm different ways of understanding the problem.

Also, I noticed different prompts would get you different answers, some more useful than others. I’ll go over this in the next post.

