How to tackle complex refactorings in big projects
Sometimes huge refactorings, or refactorings of core concepts of your application, are necessary to meet new requirements or to keep your application maintainable in the long run. Here are some thoughts on how to approach such challenges.
Try to break your refactoring down into separate parts. Get the tests green for each part as soon as possible, and only move on to the next big part once your tests are fixed. It's not a good idea to work for weeks or months and wait for ALL puzzle pieces to come together before you can use the GUI or run tests again. Don't be afraid to temporarily add workarounds to your code to achieve this goal. Of course you should get rid of any workaround code before finishing your refactoring! Instead of using workarounds you can try to apply the.
You have to move large amounts of functionality from one or many models to another model (or split a model into several new models). If this involves a lot of associated records, it may not be wise to try to do it all at once. You could use Ruby's power and temporarily use method_missing to delegate requests to the "old" model(s) while you move one piece of functionality at a time, always making tests green in between.
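As a minimal sketch of this temporary delegation, assume a hypothetical "old" model LegacyAccount and a new model Profile (both names and their methods are made up for illustration). Anything Profile does not implement yet is forwarded to the old model, so tests can stay green while methods are moved over one by one:

```ruby
# Hypothetical "old" model that still holds most of the functionality.
class LegacyAccount
  def display_name
    "Jane Doe"
  end

  def newsletter?
    true
  end
end

# Hypothetical new model that takes over functionality step by step.
class Profile
  def initialize(legacy_account)
    @legacy_account = legacy_account
  end

  # Already moved to the new model.
  def display_name
    "Jane D."
  end

  private

  # Temporary workaround: forward everything not moved yet to the old model.
  # Remove this once the refactoring is finished.
  def method_missing(name, *args, &block)
    if @legacy_account.respond_to?(name)
      @legacy_account.public_send(name, *args, &block)
    else
      super
    end
  end

  # Keeps respond_to? consistent with the delegation above.
  def respond_to_missing?(name, include_private = false)
    @legacy_account.respond_to?(name) || super
  end
end

profile = Profile.new(LegacyAccount.new)
profile.display_name # => "Jane D." (new implementation)
profile.newsletter?  # => true (still answered by the old model)
```

Defining respond_to_missing? alongside method_missing keeps respond_to? honest, which matters if other code checks for a method before calling it.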
This might seem obvious, but in my experience it is an underestimated aspect that is easily forgotten.
Before you start, consider the limitations of your refactoring. Are there things outside your control (e.g. an API to another system that you do not own but receive data from) that are critical to your refactoring? You should address these things first. There's no point in spending a lot of time refactoring your application just to realize at the end that your whole work depends on problems you cannot solve.
You are refactoring concerns of a model that is not completely under your control, e.g. because the data it carries is imported through an API, but you want to change the structure in which the model's data is stored in your application. Instead of first migrating all the records you have already imported and refactoring your application logic, start by refactoring your importer. As a first step, you could build a new (empty) model with the data structure you want and make your importer transform the incoming data into the form the new model needs. If this succeeds, you can, as a second step, refactor your application, migrate your existing records to the new schema and afterwards make the importer use the correct model (now in the new structure) again.
This way you have proof of concept before spending weeks refactoring your application.
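The first step could look roughly like the following sketch. The payload keys, NewCustomer, NewAddress and NewCustomerImporter are all hypothetical; plain Structs stand in for what would be real models in an application:

```ruby
# New (still unused) target structure: the address becomes its own object.
NewAddress = Struct.new(:street, :city, keyword_init: true)
NewCustomer = Struct.new(:first_name, :last_name, :address, keyword_init: true)

# Importer that transforms the external data into the new structure.
# The existing model and the existing records are left untouched for now.
class NewCustomerImporter
  def import(payload)
    first_name, last_name = payload.fetch("name").split(" ", 2)

    NewCustomer.new(
      first_name: first_name,
      last_name: last_name,
      address: NewAddress.new(
        street: payload.fetch("street"),
        city: payload.fetch("city")
      )
    )
  end
end

# Hypothetical record as it might arrive from the external API.
payload = { "name" => "Jane Doe", "street" => "Main St 1", "city" => "Augsburg" }

customer = NewCustomerImporter.new.import(payload)
customer.last_name    # => "Doe"
customer.address.city # => "Augsburg"
```

Using fetch instead of [] makes the importer fail loudly when the external data lacks a field the new structure depends on, which is exactly the kind of limitation you want to discover early.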
So, you started to move some code around and change things, and now you need to update all the occurrences of a model / method. If using your new models / methods does not feel good, or you notice you have to pass stuff around A LOT (e.g. newly referenced models), you should maybe take one step back from your code and ask yourself if you're using the right level of abstraction. Would things be easier if you could operate "one layer above"? If yes, then it may be worth implementing this layer. It may be helpful to think in terms of patterns: Would your code improve by using decorators or presenter models, for example? Would using a facade pattern make things easier?
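To illustrate what "one layer above" can mean, here is a small facade sketch. Order, Item, Customer and Checkout are hypothetical names; the point is only that callers talk to one object instead of passing several models around:

```ruby
# Hypothetical models that callers would otherwise pass around together.
Item = Struct.new(:price_cents, :quantity)
Order = Struct.new(:items)
Customer = Struct.new(:discount_percent)

# Facade: hides how order and customer relate behind one small interface.
class Checkout
  def initialize(order:, customer:)
    @order = order
    @customer = customer
  end

  # Sum of all items minus the customer's discount, in cents.
  def total_cents
    subtotal = @order.items.sum { |item| item.price_cents * item.quantity }
    subtotal - (subtotal * @customer.discount_percent / 100)
  end
end

order = Order.new([Item.new(1000, 2), Item.new(500, 1)])
customer = Customer.new(10)

Checkout.new(order: order, customer: customer).total_cents # => 2250
```

If call sites shrink from juggling several models to a single call like this, that is a hint the extra layer is pulling its weight.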
This point is important when regular development continues during a large (and long-lasting) refactoring.
When doing large refactorings, chances are you will see a lot of code that is not directly related to your changes but could be improved nonetheless. Think twice before you go after each opportunity. On the one hand, a large refactoring is time-consuming as it is and you want to finish at some point. On the other hand, unnecessary changes can make life a lot harder for you and your fellow developers.
If you can't restrain your urge to refactor additional code, at least do it in a separate commit. If this commit can be rebased to BEFORE your main refactoring, it's likely to be "safe" and not cause any trouble.
Consider this scenario: Your refactoring is finished and merged into master, but client approval takes its time, and over time more and more rejects regarding the refactoring come in. Other development has gone on and already changed the code you have refactored. Now the client wants those other things to be deployed to production, but without the refactoring, because there are still rejects. Suddenly you realize that it's not that easy to detangle commits and changes made in the same files! The more files you touched "needlessly", the more complex detangling gets and the harder it will be to merge to or from production.
If you know that your refactoring will take weeks or months and other development has to go on in the meantime, think of a workflow to handle this before you start. If more than one developer is involved in the refactoring, you should have a refactoring-master branch. The ongoing development can still happen in master this way. It may make sense to deploy staging from your refactoring-master instead of from master at some point.
Having an extra branch apart from master makes production deployments easier, because you are constantly forced to keep refactoring and ongoing development apart (see "Stay focused"). You should, however, regularly try to incorporate the changes of the ongoing development into your
While refactoring, you should commit often and prefer small commits over big ones, and maybe you should not name all of them "wip" (this is common sense, I hope :) ). While working in your own feature branch, you can then regularly squash these small commits into meaningful, bigger ones. When merging into master it's better to have only a few (or maybe just one!) commits. Each of them should have a green test suite, otherwise git bisect will not work reliably. Remember: if for some reason you have to detangle functionality / commits (e.g. because of a production deploy with only part of the functionality), it's easier with fewer commits (see "Stay focused").