Rails: REST API post-mortem analysis
This is a personal post-mortem analysis of a project that was mainly built to provide a REST API to mobile clients.
For the API backend we used the following components:
- Active Model Serializer (AMS) to serialize our Active Record models to JSON.
- JSON Schema to test the responses of our server.
- SwaggerUI to document the API.
The concept worked really well. Here are two points that were extraordinary compared to a normal Rails project with many UI components:
- Since the Rails application has no UI components (only the documentation) and therefore no integration tests, the tests are really fast. With ~25 endpoints the test suite finishes within 22 seconds (16 cores).
- Adding new features was very fast, as building a new endpoint was most often just copy & paste. So we had more time to build the underlying core features instead of spending much time on UI components.
But there are a few points you might want to consider before you integrate an API component into your Rails application.
REST API vs. GraphQL
It is worth considering whether you want a REST API or a GraphQL API endpoint (we have not used GraphQL ourselves yet, so take this opinion with a grain of salt).
- In discussions, many developers argue that REST APIs are easier for developers to read if they are completely documented. Even though GraphQL APIs are always completely documented in the GraphQL schema, it can still be hard to figure out how to get the required data (like with the GitHub API).
- In discussions many developers argue that you should use GraphQL for internal APIs and REST APIs for external APIs.
- GraphQL can lead to more (and more complicated) performance issues than a REST API (here is a video that contains a section on performance issues).
- For a REST API it is very hard to figure out how the clients will use the data. You will encounter many feature requests where clients want an association included in some endpoint, because otherwise they need to make N+1 requests. GraphQL solves this problem; you don't need to take care of it.
- From a subjective point of view, GraphQL libraries are currently more actively maintained than REST API libraries (see the subchapters below). Still, many big providers have REST APIs (like Stripe or YouTube) and it works very well.
- For both API standards file uploads are a little bit tricky.
- GraphQL has a built-in pagination mechanism.
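To make the pagination point concrete: GraphQL standardizes Relay-style cursor connections. A conceptual plain-Ruby sketch of that model (this is not graphql-ruby's API, just an illustration of cursors, edges, and `pageInfo`):

```ruby
require "base64"

# Conceptual sketch of Relay-style cursor pagination as standardized
# by GraphQL connections: each record becomes an "edge" with an opaque
# cursor, plus page metadata telling the client whether more data exists.
def paginate(records, first:, after: nil)
  start = after ? Base64.strict_decode64(after).to_i + 1 : 0
  page = records[start, first] || []
  edges = page.each_with_index.map do |record, i|
    { node: record, cursor: Base64.strict_encode64((start + i).to_s) }
  end
  { edges: edges, page_info: { has_next_page: start + first < records.size } }
end

page = paginate(%w[a b c d], first: 2)
page[:edges].map { |e| e[:node] } # => ["a", "b"]

next_page = paginate(%w[a b c d], first: 2, after: page[:edges].last[:cursor])
next_page[:edges].map { |e| e[:node] } # => ["c", "d"]
```

With a REST API, by contrast, every team has to pick and document its own pagination convention (page numbers, offsets, or cursors).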
Models & Controllers
We used Active Model Serializer to serialize our Active Record models to JSON. This worked quite well for most use cases, but it has some pitfalls:
- It is not maintained anymore. Many issues are open and you will run into them. Check out the alternatives first.
- You need to configure includes in the controller, not in the serializer itself. As we wanted to have the same includes for CREATE, SHOW, UPDATE, and DELETE, we stored them in a (sometimes long) list as a controller constant. This requires you to e.g. include the user who created the resource in every controller.
- Active Model Serializer is not very performant for large responses (you should avoid them in general).
- We didn't use JSON:API and just wrapped the params like in Rails applications with forms (`wrap_parameters` will do that for you). But maybe it's a better idea to stick to a standard that client libraries know how to handle.
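A sketch of how those controller conventions might look in practice (class and association names are illustrative; in a real app the controller inherits from ApplicationController and Rails' params wrapping nests the flat JSON body under the resource key):

```ruby
# Illustrative sketch of the controller conventions described above.
# In a real Rails app this class inherits from ApplicationController,
# and `wrap_parameters` nests flat JSON params under the resource key.
class BookingsController # < ApplicationController
  # wrap_parameters :booking, format: [:json]

  # The same (sometimes long) include list, reused for every action.
  INCLUDES = %w[user user.avatar payments payments.invoice].freeze

  # def show
  #   render json: Booking.find(params[:id]), include: INCLUDES
  # end
end
```

Keeping the list in one frozen constant at least makes the duplication visible and greppable across controllers.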
We used JSON Schema to test the responses of our server. This was really nice, and we implemented it in a similar way as described here.
- In the beginning we tried to remove any duplication in the schema files. So when a booking had an associated user, the booking schema referenced the user schema. This approach didn't scale very well, as we got more and more edge cases where the attributes for the same resource were different. E.g. the user avatar URL attribute should only be shown when the user endpoint was requested, but not when the user was included in another response as a reference to the creator of that resource. We started to duplicate the schemas, so a booking had three files: the booking index schema, the booking show schema, and the user schema. As the same serializers were used, it never happened that we forgot to add an attribute somewhere.
- The gem `json-schema` uses draft-v4 as default, whereas the most recent version is draft-08. You might want to use another gem if you need the newest version of JSON Schema.
- We used `'json_matchers', '< 0.8.0'` as we wanted to use the `strict` option for validations, so that all attributes are required and no additional attributes are allowed.
- In the end we had many request specs that looked quite similar (as much copy & paste as possible). Maybe a good refactoring would have been to introduce more custom matchers and shared examples.
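The `strict` option effectively means: every documented attribute must be present and nothing else may appear. A minimal plain-Ruby sketch of that check (illustrative only, not the gem's implementation):

```ruby
require "json"

# Strict matching: the payload must contain exactly the documented keys,
# with no missing attributes and no additional ones.
def strictly_matches?(documented_keys, payload)
  payload.keys.sort == documented_keys.sort
end

booking = JSON.parse('{"id": 1, "user_id": 7}')
strictly_matches?(%w[id user_id], booking)        # => true
strictly_matches?(%w[id user_id price], booking)  # => false ("price" missing)
strictly_matches?(%w[id], booking)                # => false ("user_id" extra)
```

Because the serializer can never emit a key the schema doesn't list (and vice versa), strict matching catches drift between code and schema immediately.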
For the documentation we used Swagger Docs V2.0 with the gem `swagger-blocks`. However, the gem is not maintained very well anymore and does not support V3.0 at the moment. We would not use Swagger Docs anymore.
- Swagger Docs limits the way you can document your endpoints. We needed to add bigger sections of text, e.g. for the workflow of how a user can get their API token or how authorization is managed. This was not possible, and it ended up as a link to a second documentation.
- Swagger Docs makes it hard to define the attributes and associations for a resource, e.g. you need two model definitions for a user. Keywords like `anyOf` are your friends in the beginning, but you will miss many errors and the documentation blows up into an unreadable blob of nodes (we had up to 5 different model definitions for some resources).
- Objects (hashes) are not really supported in Swagger V2.0, so we could not document them without hacks.
We would recommend building or finding an existing solution that uses JSON Schema for documentation. Use the schemas from the documentation in your controllers to validate requests, so the documentation can never lie. With this approach it is possible to create testable example requests and responses like you have in Swagger Docs or tools like Postman.
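A sketch of the idea, with a hand-rolled validator standing in for a gem like `json-schema` (the schema is inlined here; in a real app it would be loaded from the same file the documentation is rendered from):

```ruby
# Minimal request validation against a (simplified) JSON Schema hash.
# Because documentation and validation share the same schema file,
# the documentation can never drift from the actual API behaviour.
def validate_request!(schema, body)
  schema.fetch("required", []).each do |key|
    raise ArgumentError, "missing attribute: #{key}" unless body.key?(key)
  end
  if schema["additionalProperties"] == false
    extra = body.keys - schema.fetch("properties", {}).keys
    raise ArgumentError, "unknown attributes: #{extra.join(', ')}" if extra.any?
  end
  body
end

# In a real app: JSON.parse(File.read("docs/schemas/booking_create.json"))
booking_create_schema = {
  "required"             => ["user_id", "starts_at"],
  "additionalProperties" => false,
  "properties"           => { "user_id" => {}, "starts_at" => {} }
}

validate_request!(booking_create_schema,
                  { "user_id" => 7, "starts_at" => "2020-01-01" }) # passes
```

A `before_action` calling such a validator would reject undocumented requests with a 4xx instead of letting them reach the model layer.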
We did not build any mechanism to manage limits for an API user, and there were customers that brought the servers to their limits when they started a bulk import.
So the recommendation is to implement a rate limit from the beginning and also to include analytics tools like New Relic to figure out the load impact of each endpoint. This will help you understand whether customers misuse certain endpoints (N+1 requests).
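To make that recommendation concrete, here is a minimal fixed-window rate limiter sketch in plain Ruby (illustrative only; in a Rails app you would more likely reach for existing middleware such as Rack::Attack):

```ruby
# Fixed-window rate limiter: each API user gets at most `limit` requests
# per `period` seconds. The counter resets whenever a new window starts.
class RateLimiter
  def initialize(limit:, period:)
    @limit = limit
    @period = period
    @windows = Hash.new { |h, k| h[k] = [0, 0] } # key => [window_start, count]
  end

  # Returns true if the request is still within the limit for this user.
  def allow?(key, now: Time.now.to_i)
    window = now - (now % @period)
    start, count = @windows[key]
    count = 0 if start != window # a new window started, reset the counter
    @windows[key] = [window, count + 1]
    count < @limit
  end
end

limiter = RateLimiter.new(limit: 2, period: 60)
limiter.allow?("customer-1", now: 100) # => true
limiter.allow?("customer-1", now: 100) # => true
limiter.allow?("customer-1", now: 100) # => false (limit reached)
```

A bulk import like the one described above would then receive 429 responses instead of saturating the servers, and the per-user counters double as input for load analysis.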