There are a lot of updates to this repo lately:
Added more code examples:
- Added UnitOfWork
- All Domain Events can now be executed in a single database transaction using a UnitOfWork
- Added Wallet module to show an example of using a UnitOfWork together with Domain Events
- Added BDD tests example
Refactoring:
- Refactored Domain Events and Domain Events Handlers
- Commands are now plain objects
- Moved generic files to /libs directory
- Refactored Entity/Aggregate creation
- More small changes
Updates in readme and code:
- My opinion on some topics evolve over time so readme and code gets updated constantly.
- Added more resources and topics to readme
Main emphasis of this project is to provide recommendations on how to design software applications. In this readme are presented some of the techniques, tools, best practices, architectural patterns and guidelines gathered from different sources.
Everything below should be seen as a recommendation. Keep in mind that different projects have different requirements, so any pattern mentioned in this readme can be replaced or skipped if needed.
Code examples are written using NodeJS, TypeScript, NestJS framework and Typeorm for the database access.
Though patterns and principles presented here are framework/language agnostic, so above technologies can be easily replaced with any alternative. No matter what language or framework is used, any application can benefit from principles described below.
Note: code examples are adapted to TypeScript and mentioned above frameworks so may not fit well for other languages. Also remember that code examples presented here are just examples and must be changed according to project's needs or personal preference.
-
Other recommendations and best practices
- Error Handling
- Testing
- Configuration
- Logging
- Health monitoring
- Folder and File Structure
- File names
- Static Code Analysis
- Code formatting
- Documentation
- Make application easy to setup
- Seeds
- Migrations
- Rate Limiting
- Code Generation
- Custom utility types
- Pre-push/pre-commit hooks
- Prevent massive inheritance chains
- Conventional commits
Mainly based on:
- Domain-Driven Design (DDD)
- Hexagonal (Ports and Adapters) Architecture
- Secure by Design
- Clean Architecture
- Onion Architecture
- SOLID Principles
- Software Design Patterns
And many other sources (more links below in every chapter).
Before we begin, here are the PROS and CONS of using a complete architecture like this:
- Independent of external frameworks, technologies, databases, etc. Frameworks and external resources can be plugged/unplugged with much less effort.
- Easily testable and scalable.
- More secure. Some security principles are baked in design itself.
- The solution can be worked on and maintained by different teams, without stepping on each other's toes.
- Easier to add new features. As the system grows over time, the difficulty in adding new features remains constant and relatively small.
- If the solution is properly broken apart along bounded context lines, it becomes easy to convert pieces of it into microservices if needed.
-
This is a sophisticated architecture which requires a firm understanding of quality software principles, such as SOLID, Clean/Hexagonal Architecture, Domain-Driven Design, etc. Any team implementing such a solution will almost certainly require an expert to drive the solution and keep it from evolving the wrong way and accumulating technical debt.
-
Some of the practices presented here are not recommended for small-medium sized applications with not a lot of business logic. There is added up-front complexity to support all those building blocks and layers, boilerplate code, abstractions, data mapping etc. thus implementing a complete architecture like this is generally ill-suited to simple CRUD applications and could over-complicate such solutions. Some of the described below principles can be used in a smaller sized applications but must be implemented only after analyzing and understanding all pros and cons.
Diagram is mostly based on this one + others found online
In short, data flow looks like this (from left to right):
- Request/CLI command/event is sent to the controller using plain DTO;
- Controller parses this DTO, maps it to a Command/Query object format and passes it to a Application service;
- Application service handles this Command/Query; it executes business logic using domain services and/or entities and uses the infrastructure layer through ports;
- Infrastructure layer uses a mapper to convert data to format that it needs, uses repositories to fetch/persist data and adapters to send events or do other I/O communications, maps data back to domain format and returns it back to Application service;
- After application service finishes doing it's job, it returns data/confirmation back to Controllers;
- Controllers return data back to the user (if application has presenters/views, those are returned instead).
Each layer is in charge of it's own logic and has building blocks that usually should follow a Single-responsibility principle when possible and when it makes sense (for example, using Repositories
only for database access, using Entities
for business logic etc).
Keep in mind that different projects can have more or less steps/layers/building blocks than described here. Add more if application requires it, and skip some if application is not that complex and doesn't need all that abstraction.
General recommendation for any project: analyze how big/complex the application will be, find a compromise and use as many layers/building blocks as needed for the project and skip ones that may over-complicate things.
More in details on each step below.
This project's code examples use separation by modules (also called components). Each module gets its own folder with a dedicated codebase, and each use case inside that module gets it's own folder to store most of the things it needs (this is also called Vertical Slicing).
It is easier to work on things that change together if those things are gathered relatively close to each other. Try not to create dependencies between modules or use cases, move shared logic into a separate files and make both depend on that instead of depending on each other.
Try to make every module independent and keep interactions between modules minimal. Think of each module as a mini application bounded by a single context. Try to avoid direct imports between modules (like importing a service from other domain) since this creates tight coupling. Communication between modules can be done using events, public interfaces or through a port/adapter (more on that topic below).
This approach ensures loose coupling, and, if bounded contexts are defined and designed properly, each module can be easily separated into a microservice if needed without touching any domain logic.
Read more about modular programming benefits:
Each module is separated in layers described below.
This is the core of the system which is built using DDD building blocks:
- Entities
- Aggregates
- Domain Services
- Value Objects
- Application Services
- Commands and Queries
- Ports
More building blocks may be added if needed.
Are also called "Workflow Services", "Use Cases", "Interactors" etc. These services orchestrate the steps required to fulfill the commands imposed by the client.
- Typically used to orchestrate how the outside world interacts with your application and performs tasks required by the end users.
- Contain no domain-specific business logic;
- Operate on scalar types, transforming them into Domain types. A scalar type can be considered any type that's unknown to the Domain Model. This includes primitive types and types that don't belong to the Domain.
- Application services declare dependencies on infrastructural services required to execute domain logic (by using ports).
- Are used in order to fetch domain
Entities
(or anything else) from database/outside world through ports; - Execute other out-of-process communications through
Ports
(like event emits, sending emails etc); - In case of interacting with one Entity/Aggregate, executes its methods directly;
- In case of working with multiple Entities/Aggregates, uses a
Domain Service
to orchestrate them; - Are basically a
Command
/Query
handlers; - Should not depend on other application services since it may cause problems (like cyclic dependencies);
One service per use case is considered a good practice.
What are "Use Cases"?
wiki:
In software and systems engineering, a use case is a list of actions or event steps typically defining the interactions between a role (known in the Unified Modeling Language as an actor) and a system to achieve a goal.
Use cases are, simply said, list of actions required from an application.
Example file: create-user.service.ts
More about services:
Notes: Interfaces for each Use Case and Local DTOs
Some people prefer having an interface for each use case (Driving Port), which Application Service
implements and a Controller
depends on. This is a viable option, but this project doesn't use interfaces for every use case for simplicity: it makes sense using interfaces when there are multiple implementations of a workflow, but use cases are too specific and should not have multiple implementations of the same workflow (one service per use case rule mentioned above). Controllers
naturally depend on a concrete implementation thus making interfaces redundant. More on this topic here.
Another thing that can be seen in some projects is local DTOs. Some people prefer never to use domain objects (like entities) outside of core (in controllers
, for example), and are using DTOs instead. This project doesn't use this technique to avoid extra interfaces and data mapping. Either to use local DTOs or not is a matter of taste.
Here are Martin Fowler's thoughts on local DTOs, in short (quote):
Some people argue for them(DTOs) as part of a Service Layer API because they ensure that service layer clients aren't dependent upon an underlying Domain Model. While that may be handy, I don't think it's worth the cost of all of that data mapping.
This principle is called Command–Query Separation(CQS). When possible, methods should be separated into Commands
(state-changing operations) and Queries
(data-retrieval operations). To make a clear distinction between those two types of operations, input objects can be represented as Commands
and Queries
. Before DTO reaches the domain, it is converted into a Command
/Query
object.
Commands
are used for state-changing actions, like creating new user and saving it to the database. Create, Update and Delete operations are considered as state-changing.
Data retrieval is responsibility of Queries
, so Command
methods should not return business data.
Some CQS purists may say that a Command
shouldn't return anything at all. But you will need at least an ID of a created item to access it later. To achieve that you can let clients generate a UUID (more info here: CQS versus server generated IDs).
Though, violating this rule and returning some metadata, like ID
of a created item, redirect link, confirmation message, status, or other metadata is a more practical approach than following dogmas.
All changes done by Commands
(or by events or anything else) across multiple aggregates should be saved in a single database transaction (if you are using a single database). This means that inside a single process, one command/request to your application usually should execute only one transactional operation to save all changes (or cancel all changes of that command/request in case if something fails). This should be done to maintain consistency. To do that something like Unit of Work or similar patterns can be used. Example: create-user.service.ts - notice how everything is wrapped in a UnitOfWork method UnitOfWork.execute(...)
.
Note: Command
is not the same as Command Pattern, it is just a convenient name to represent that this object executes some state-changing action. Both Commands
and Queries
in this example are just simple objects that carry data between layers.
A Command
describes a single action, it does not perform it.
Example of a command object: create-user.command.ts
Query
is used for retrieving data and should not make any state changes (like writes to the database, files etc).
Queries are usually just a data retrieval operation and have no business logic involved; so, if needed, application and domain layers can be bypassed completely. Though, if some additional non-state changing logic has to be applied before returning a query response (like calculating something), it can be done in a application/domain layer.
Example of a query object: find-users.query.ts
Example of query bypassing application/domain layers completely: find-users.http.controller.ts
By enforcing Command
and Query
separation, the code becomes simpler to understand. One changes something, another just retrieves data.
Also, following CQS from the start will facilitate separating write and read models into different databases (CQRS) if someday in the future the need for it arises.
Note: NestJS provides a package for CQRS that can be used as an alternative to code examples presented in this repo: NestJS CQRS. Basically if offers to use Command Handlers
and instead of passing commands/queries directly to methods this package proposes a command/query bus + an event bus for events. But it is important to keep in mind that this package doesn't provide any consistency for emitted events (neither does Node.js Event Emitter) since there is no way to await
for events to finish. More info in Domain Events and Integration Events sections below.
Read more about CQS and CQRS:
- Command Query Segregation.
- Exposing CQRS Through a RESTful API
- What is the CQRS pattern?
- CQRS and REST: the perfect match
Ports (for Driven Adapters) are interfaces that define contracts which must be implemented by infrastructure adapters in order to execute some action more related to technology details rather than business logic. Ports act like abstractions for technology details that business logic does not care about.
In Application Core dependencies point inwards. Outer layers can depend on inner layers, but inner layers never depend on outer layers. Application Core shouldn't depend on frameworks or access external resources directly. Any external calls to out-of-process resources/retrieval of data from remote processes should be done through ports
(interfaces), with class implementations created somewhere in infrastructure layer and injected into application's core (Dependency Injection and Dependency Inversion). This makes business logic independent of technology, facilitates testing, allows to plug/unplug/swap any external resources easily making application modular and loosely coupled.
- Ports are basically just interfaces that define what has to be done and don't care about how it is done.
- Ports can be created to abstract I/O operations, technology details, invasive libraries, legacy code etc. from the Domain.
- Ports should be created to fit the Domain needs, not simply mimic the tools APIs.
- Mock implementations can be passed to ports while testing. Mocking makes your tests faster and independent from the environment.
- When designing ports, remember about Interface segregation principle. Split large interfaces into a smaller ones when it makes sense, but also keep in mind to not overdo it when not necessary.
- Ports can also help to delay decisions. Domain layer can be implemented before even deciding what technologies (frameworks, database etc) will be used.
Note: since most ports implementations are injected and executed in application service, Application Layer can be a good place to keep those ports. But there are times when Domain Layer's business logic depends on executing some external resource, in that case those ports can be put in a Domain Layer.
Note: creating ports in smaller applications/APIs may overcomplicate such solutions by adding unnecessary abstractions. Using concrete implementations directly instead of ports may be enough in such applications. Consider all pros and cons before using this pattern.
Example files:
This layer contains application's business rules.
Domain should only operate using domain objects, most important ones are described below.
Entities are the core of the domain. They encapsulate Enterprise wide business rules and attributes. An entity can be an object with properties and methods, or it can be a set of data structures and functions.
Entities represent business models and express what properties a particular model has, what it can do, when and at what conditions it can do it. An example of business model can be a User, Product, Booking, Ticket, Wallet etc.
Entities must always protect it's invariant:
Domain entities should always be valid entities. There are a certain number of invariants for an object that should always be true. For example, an order item object always has to have a quantity that must be a positive integer, plus an article name and price. Therefore, invariants enforcement is the responsibility of the domain entities (especially of the aggregate root) and an entity object should not be able to exist without being valid.
Entities:
- Contain Domain business logic. Avoid having business logic in your services when possible, this leads to Anemic Domain Model (domain services are exception for business logic that can't be put in a single entity).
- Have an identity that defines it and makes it distinguishable from others. It's identity is consistent during its life cycle.
- Equality between two entities is determined by comparing their identificators (usually its
id
field). - Can contain other objects, such as other entities or value objects.
- Are responsible for collecting all the understanding of state and how it changes in the same place.
- Responsible for the coordination of operations on the objects it owns.
- Know nothing about upper layers (services, controllers etc).
- Domain entities data should be modelled to accommodate business logic, not some database schema.
- Entities must protect their invariants, try to avoid public setters - update state using methods and execute invariant validation on each update if needed (this can be a simple
validate()
method that checks if business rules are not violated by update). - Must be consistent on creation. Validate Entities and other domain objects on creation and throw an error on first failure. Fail Fast.
- Avoid no-arg (empty) constructors, accept and validate all required properties through a constructor.
- For optional properties that require some complex setting up, Fluent interface and Builder Pattern can be used.
- Make Entities partially immutable. Identify what properties shouldn't change after creation and make them
readonly
(for exampleid
orcreatedAt
).
Note: A lof of people tend to create one module per entity, but this approach is not very good. Each module may have multiple entities. One thing to keep in mind is that putting entities in a single module requires those entities to have related business logic, don't group unrelated entities in one module.
Example files:
Read more:
Aggregate is a cluster of domain objects that can be treated as a single unit. It encapsulates entities and value objects which conceptually belong together. It also contains a set of operations which those domain objects can be operated on.
- Aggregates help to simplify the domain model by gathering multiple domain objects under a single abstraction.
- Aggregates should not be influenced by data model. Associations between domain objects are not the same as database relationships.
- Aggregate root is an entity that contains other entities/value objects and all logic to operate them.
- Aggregate root has global identity. Entities inside the boundary have local identity, unique only within the Aggregate.
- Aggregate root is a gateway to entire aggregate. Any references from outside the aggregate should only go to the aggregate root.
- Any operations on an aggregate must be transactional operations. Either everything gets saved/updated/deleted or nothing.
- Only Aggregate Roots can be obtained directly with database queries. Everything else must be done through traversal.
- Similar to
Entities
, aggregates must protect their invariants through entire lifecycle. When a change to any object within the Aggregate boundary is committed, all invariants of the whole Aggregate must be satisfied. Simply said, all objects in an aggregate must be consistent, meaning that if one object inside an aggregate changes state, this shouldn't conflict with other domain objects inside this aggregate (this is called Consistency Boundary). - Objects within the Aggregate can hold references to other Aggregate roots. Prefer references to external aggregates only by their globally unique identity, not by holding a direct object reference.
- Try to avoid aggregates that are too big, this can lead to performance and maintaining problems.
- Aggregates can publish
Domain Events
(more on that below).
All of this rules just come from the idea of creating a boundary around Aggregates. The boundary simplifies business model, as it forces us to consider each relationship very carefully, and within a well-defined set of rules.
Example files:
- aggregate-root.base.ts - abstract base class.
- user.entity.ts - aggregate implementations are similar to
Entities
, with some additional rules described above.
Read more:
- Understanding Aggregates in Domain-Driven Design
- What Are Aggregates In Domain-Driven Design? <- this is a series of multiple articles, don't forget to click "Next article" at the end.
- Effective Aggregate Design Part I: Modeling a Single Aggregate
- Effective Aggregate Design Part II: Making Aggregates Work Together
Domain event indicates that something happened in a domain that you want other parts of the same domain (in-process) to be aware of. Domain events are just messages pushed to an in-memory domain event dispatcher.
For example, if a user buys something, you may want to:
- Update his shopping cart;
- Withdraw money from his wallet;
- Create a new shipping order;
- Perform other domain operations that are not a concern of an aggregate that executes a "buy" command.
Typical approach that is usually used involves executing all this logic in a service that performs a buy operation. But this creates coupling between different subdomains.
An alternative approach would be publishing a Domain Event
. If executing a command related to one aggregate instance requires additional domain rules to be run on one or more additional aggregates, you can design and implement those side effects to be triggered by Domain Events. Propagation of state changes across multiple aggregates within the same domain model can be performed by subscribing to a concrete Domain Event
and creating as many event handlers as needed. This prevents coupling between aggregates.
Domain Events may be useful for creating an audit log to track all changes to important entities by saving each event to the database. Read more on why audit logs may be useful: Why soft deletes are evil and what to do instead.
All changes done by Domain Events (or by anything else) across multiple aggregates in a single process should be saved in a single database transaction to maintain consistency. Patterns like Unit of Work or similar can help with that. Example: src/modules/wallet/wallet.providers.ts - notice how OnUserCreatedDomainEventHandler
has a factory for creating an instance of this class with included transactional repository.
Note: this project uses custom implementation for publishing Domain Events. Reason for not using Node Event Emitter or packages that offer an event bus (like NestJS CQRS) is that they don't offer an option to await
for all events to finish, which is useful when making all events a part of a transaction. Inside a single process either all changes done by events should be saved, or none of them in case if one of the events fails.
There are multiple ways on implementing an event bus for Domain Events, for example by using ideas from patterns like Mediator or Observer.
Examples:
- domain-events.ts - this class is responsible for providing publish/subscribe functionality for anyone who needs to emit or listen to events. Keep in mind that this is just a proof of concept example and may not be a best solution for a production application.
- user-created.domain-event.ts - simple object that holds data related to published event.
- create-wallet-when-user-is-created.domain-event-handler.ts - this is an example of Domain Event Handler that executes some actions when a domain event is raised (in this case, when user is created it also creates a wallet for that user).
- typeorm.repository.base.ts - repository publishes all domain events for execution when it persists changes to an aggregate.
To have a better understanding on domain events and implementation read this:
Note: keep in mind that while using only events for complex workflows with a lot of steps it will be hard to track everything that is happening across the application. One event may trigger another one, then another one, and so on. To track the entire workflow you'll have to go multiple places and search for an event handler for each step which is hard to maintain. In this cases using a service/orchestrator/mediator might be a preferred approach than only using events since you will have an entire workflow in one place. This might create some coupling, but is easier to maintain. Don't rely on events only, pick the right tool for the job.
Note: keep in mind that if you are using Event Sourcing pattern with a single stream per aggregate you most likely will not be able to save all events across multiple aggregates in a single transaction. In this case saving events across multiple aggregates should be treated differently (for example by using Sagas with compensating events or a Process Manager or something similar).
Out-of-process communications (calling microservices, external apis) are called Integration Events
. If sending a Domain Event to external process is needed then domain event handler should send an Integration Event
.
Integration Events should be published only after all Domain Events finished executing and saving all changes to the database.
To handle integration events in microservices you may need an external message broker / event bus like RabbitMQ or Kafka together with patterns like Transactional outbox, Change Data Capture, Sagas or a Process Manager to maintain eventual consistency.
Read more:
For integration events in distributed systems here are some patterns that may be useful:
Eric Evans, Domain-Driven Design:
Domain services are used for "a significant process or transformation in the domain that is not a natural responsibility of an ENTITY or VALUE OBJECT"
- Domain Service is a specific type of domain layer class that is used to execute domain logic that relies on two or more
Entities
. - Domain Services are used when putting the logic on a particular
Entity
would break encapsulation and require theEntity
to know about things it really shouldn't be concerned with. - Domain services are very granular where as application services are a facade purposed with providing an API.
- Domain services operate only on types belonging to the Domain. They contain meaningful concepts that can be found within the Ubiquitous Language. They hold operations that don't fit well into Value Objects or Entities.
Some Attributes and behaviors can be moved out of the entity itself and put into Value Objects
.
Value Objects:
- Have no identity. Equality is determined through structural property.
- Are immutable.
- Can be used as an attribute of
entities
and othervalue objects
. - Explicitly defines and enforces important constraints (invariants).
Value object shouldn’t be just a convenient grouping of attributes but should form a well-defined concept in the domain model. This is true even if it contains only one attribute. When modeled as a conceptual whole, it carries meaning when passed around, and it can uphold its constraints.
Imagine you have a User
entity which needs to have an address
of a user. Usually an address is simply a complex value that has no identity in the domain and is composed of multiple other values, like country
, street
, postalCode
etc; so it can be modeled and treated as a Value Object
with it's own business logic.
Value object
isn’t just a data structure that holds values. It can also encapsulate logic associated with the concept it represents.
Example files:
Read more about Value Objects:
Most of the code bases operate on primitive types – strings
, numbers
etc. In the Domain Model, this level of abstraction may be too low.
Significant business concepts can be expressed using specific types and classes. Value Objects
can be used instead primitives to avoid primitives obsession.
So, for example, email
of type string
:
email: string;
could be represented as a Value Object
instead:
email: Email;
Now the only way to make an email
is to create a new instance of Email
class first, this ensures it will be validated on creation and a wrong value won't get into Entities
.
Also an important behavior of the domain primitive is encapsulated in one place. By having the domain primitive own and control domain operations, you reduce the risk of bugs caused by lack of detailed domain knowledge of the concepts involved in the operation.
Creating an object for primitive values may be cumbersome, but it somewhat forces a developer to study domain more in details instead of just throwing a primitive type without even thinking what that value represents in domain.
Using Value Objects
for primitive types is also called a domain primitive
. The concept and naming are proposed in the book "Secure by Design".
Using Value Objects
instead of primitives:
- Makes code easier to understand by using ubiquitous language instead of just
string
. - Improves security by ensuring invariants of every property.
- Encapsulates specific business rules associated with a value.
Also an alternative for creating an object may be a type alias just to give this primitive a semantic meaning.
Example files:
Recommended to read:
- Primitive Obsession — A Code Smell that Hurts People the Most
- Value Objects Like a Pro
- Developing the ubiquitous language
Use Value Objects/Domain Primitives and Types system to make illegal states unrepresentable in your program.
Some people recommend using objects for every value:
Quote from John A De Goes:
Making illegal states unrepresentable is all about statically proving that all runtime values (without exception) correspond to valid objects in the business domain. The effect of this technique on eliminating meaningless runtime states is astounding and cannot be overstated.
Lets distinguish two types of protection from illegal states: at compile time and at runtime.
Types give useful semantic information to a developer. Good code should be easy to use correctly, and hard to use incorrectly. Types system can be a good help for that. It can prevent some nasty errors at a compile time, so IDE will show type errors right away.
The simplest example may be using enums instead of constants, and use those enums as input type for something. When passing anything that is not intended IDE will show a type error.
Or, for example, imagine that business logic requires to have contact info of a person by either having email
, or phone
, or both. Both email
and phone
could be represented as optional, for example:
interface ContactInfo {
email?: Email;
phone?: Phone;
}
But what happens if both are not provided by a programmer? Business rule violated. Illegal state allowed.
Solution: this could be presented as a union type
type ContactInfo = Email | Phone | [Email, Phone];
Now only either Email
, or Phone
, or both must be provided. If nothing is provided IDE will show a type error right away. Now business rule validation is moved from runtime to a compile time which makes application more secure and gives a faster feedback when something is not used as intended.
This is called a typestate pattern.
The typestate pattern is an API design pattern that encodes information about an object’s run-time state in its compile-time type.
Read more about typestates:
Things that can't be validated at compile time (like user input) are validated at runtime.
Domain objects have to protect their invariants. Having some validation rules here will protect their state from corruption.
Value Object
can represent a typed value in domain (a domain primitive). The goal here is to encapsulate validations and business logic related only to the represented fields and make it impossible to pass around raw values by forcing a creation of valid Value Objects
first. This object only accepts values which make sense in its context.
If every argument and return value of a method is valid by definition, you’ll have input and output validation in every single method in your codebase without any extra effort. This will make application more resilient to errors and will protect it from a whole class of bugs and security vulnerabilities caused by invalid input data.
Data should not be trusted. There are a lot of cases when invalid data may end up in a domain. For example, if data comes from external API, database, or if it's just a programmer error.
Enforcing self-validation will inform immediately when data is corrupted. Not validating domain objects allows them to be in an incorrect state, this leads to problems.
Without domain primitives, the remaining code needs to take care of validation, formatting, comparing, and lots of other details. Entities represent long-lived objects with a distinguished identity, such as articles in a news feed, rooms in a hotel, and shopping carts in online sales. The functionality in a system often centers around changing the state of these objects: hotel rooms are booked, shopping cart contents are paid for, and so on. Sooner or later the flow of control will be guided to some code representing these entities. And if all the data is transmitted as generic types such as int or String , responsibilities fall on the entity code to validate, compare, and format the data, among other tasks. The entity code will be burdened with a lot of tasks, rather than focusing on the central business flow-of-state changes that it models. Using domain primitives can counteract the tendency for entities to grow overly complex.
Quote from: Secure by design: Chapter 5.3 Standing on the shoulders of domain primitives
Note: Though primitive obsession is a code smell, some people consider making a class/object for every primitive may be an overengineering. For less complex and smaller projects it definitely may be. For bigger projects, there are people who advocate for and against this approach. If creating a class for every primitive is not preferred, create classes just for those primitives that have specific rules or behavior, or just validate only outside of domain using some validation framework. Here are some thoughts on this topic: From Primitive Obsession to Domain Modelling - Over-engineering?.
Recommended to read:
- Making illegal states unrepresentable
- Domain Primitives: what they are and how you can use them to make more secure software
- "Secure by Design" Chapter 5: Domain Primitives (a full chapter of the article above)
For simple validation like checking for nulls, empty arrays, input length etc. a library of guards can be created.
Example file: guard.ts
Read more: Refactoring: Guard Clauses
Another solution would be using an external validation library, but it is not a good practice to tie domain to external libraries and is not usually recommended.
Although exceptions can be made if needed, especially for very specific validation libraries that validate only one thing (like specific IDs, for example bitcoin wallet address). Tying only one or just few Value Objects
to such a specific library won't cause any harm. Unlike general purpose validation libraries which will be tied to domain everywhere and it will be troublesome to change it in every Value Object
in case when old library is no longer maintained, contains critical bugs or is compromised by hackers etc.
Though, it is fine to do full sanity checks using validation framework or library outside of domain (for example class-validator decorators in DTOs
), and do only some basic checks inside of Value Objects
(besides business rules), like checking for null
or undefined
, checking length, matching against simple regexp etc. to check if value makes sense and for extra security.
Note about using regexp
Be careful with custom regexp validations for things like validating email
, only use custom regexp for some very simple rules and, if possible, let validation library do it's job on more difficult ones to avoid problems in case your regexp is not good enough.
Also, keep in mind that custom regexp that does same type of validation that is already done by validation library outside of domain may create conflicts between your regexp and the one used by a validation library.
For example, value can be accepted as valid by a validation library, but Value Object
may throw an error because custom regexp is not good enough (validating email
is more complex than just copy - pasting a regular expression found in google. Though, it can be validated by a simple rule that is true all the time and won't cause any conflicts, like every email
must contain an @
). Try finding and validating only patterns that won't cause conflicts.
Although there are other strategies on how to do validation inside domain, like passing validation schema as a dependency when creating new Value Object
, but this creates extra complexity.
Either to use external library/framework for validation inside domain or not is a tradeoff, analyze all the pros and cons and choose what is more appropriate for current application.
For some projects, especially smaller ones, it might be easier and more appropriate to just use validation library/framework.
Keep in mind that not all validations can be done in a single Value Object
, it should validate only rules shared by all contexts. There are cases when validation may be different depending on a context, or one field may involve another field, or even a different entity. Handle those cases accordingly.
There are some general recommendations for validation order. Cheap operations like checking for null/undefined and checking length of data come early in the list, and more expensive operations that require calling the database come later.
Preferably in this order:
- Origin - Is the data from a legitimate sender? When possible, accept data only from authorized users / whitelisted IPs etc. depending on the situation.
- Existence - are provided data not empty? Further validations make no sense if data is empty. Check for empty values: null/undefined, empty objects and arrays.
- Size - Is it reasonably big? Before any further steps, check length/size of input data, no matter what type it is. This will prevent validating data that is too big which may block a thread entirely (sending data that is too big may be a DoS attack).
- Lexical content - Does it contain the right characters and encoding? For example, if we expect data that only contains digits, we scan it to see if there’s anything else. If we find anything else, we draw the conclusion that the data is either broken by mistake or has been maliciously crafted to fool our system.
- Syntax - Is the format right? Check if data format is right. Sometimes checking syntax is as simple as using a regexp, or it may be more complex like parsing a XML or JSON.
- Semantics - Does the data make sense? Check data in connection with the rest of the system (like database, other processes etc). For example, checking in a database if ID of item exists.
Read more about validation types described above:
Whether or not to use libraries in application layer and especially domain layer is a subject of a lot of debates. In real world, injecting every library instead of importing it directly is not always practical, so exceptions can be made for some single responsibility libraries that help to implement domain logic (like working with numbers).
Main recommendations to keep in mind is that libraries imported in application's core shouldn't expose:
- Functionality to access any out-of-process resources (http calls, database access etc);
- Functionality not relevant to domain (frameworks, technology details like ORMs, Logger etc).
- Functionality that brings randomness (generating random IDs, timestamps etc) since this makes tests unpredictable (though in TypeScript world it is not that big of a deal since this can be mocked by a test library without using DI);
- Frameworks can be a real nuisance because by definition they want to be in control. Isolate them within the adapters and keep our domain model clean of them.
- If a library changes often or has a lot of dependencies of its own it most likely shouldn't be used in domain layer.
To use such libraries consider creating an anti-corruption
layer by using adapter or facade patterns.
We sometimes tolerate libraries in the center: libraries are not in control so they are less intrusive. But be careful with general purpose libraries that may scatter across many domain objects. It will be hard to replace those libraries if needed. Tying only one or just few domain objects to some single-responsibility library should be fine. It is way easier to replace a specific library that is tied to one or few objects than a general purpose library that is everywhere.
Offload as much of irrelevant responsibilities as possible from the core, especially from domain layer. In addition, try to minimize usage of dependencies in general. More dependencies your software has means more potential errors and security holes. One technique for making software more robust is to minimize what your software depends on - the less that can go wrong, the less that will go wrong. On the other hand, removing all dependencies would be counterproductive as replicating that functionality would have been a huge amount of work and less reliable than just using a widely-used dependency. Finding a good balance is important, this skill requires experience.
Read more:
Interface adapters (also called driving/primary adapters) are user-facing interfaces that take input data from the user and repackage it in a form that is convenient for the use cases(services/command handlers) and entities. Then they take the output from those use cases and entities and repackage it in a form that is convenient for displaying it back for the user. User can be either a person using an application or another server.
Contains Controllers
and Request
/Response
DTOs (can also contain Views
, like backend-generated HTML templates, if required).
- Controllers are used for parsing requests, triggering use cases and presenting the result back to the client.
- One controller per use case is considered a good practice.
- In NestJS world controllers may be a good place to use OpenAPI/Swagger decorators for documentation.
One controller per trigger type can be used to have a more clear separation. For example:
- create-user.http.controller.ts for http requests (NestJS Controllers),
- create-user.cli.controller.ts for command line interface access (NestJS Console)
- create-user.message.controller.ts for external messages (NetJS Microservices).
- etc.
Data that comes from external applications should be represented by a special type of classes - Data Transfer Objects (DTO for short). Data Transfer Object is an object that carries data between processes. It defines a contract between your API and clients.
Input data sent by a user.
- Using Request DTOs gives a contract that a client of your API has to follow to make a correct request.
Examples:
Output data returned to a user.
- Using Response DTOs ensures clients only receive data described in DTOs contract, not everything that your model/entity owns (which may result in data leaks).
Examples:
Using DTOs protects your clients from internal data structure changes that may happen in your API. When internal data models change (like renaming variables or splitting tables), they can still be mapped to match a corresponding DTO to maintain compatibility for anyone using your API.
When updating DTO interfaces, a new version of API can be created by prefixing an endpoint with a version number, for example: v2/users
. This will make transition painless by preventing breaking compatibility for users that are slow to update their apps that uses your API.
You may have noticed that our create-user.command.ts contains the same properties as create-user.request.dto.ts. So why do we need DTOs if we already have Command objects that carry properties? Shouldn't we just have one class to avoid duplication?
Because commands and DTOs are different things, they tackle different problems. Commands are serializable method calls - calls of the methods in the domain model. Whereas DTOs are the data contracts. The main reason to introduce this separate layer with data contracts is to provide backward compatibility for the clients of your API. Without the DTOs, the API will have breaking changes with every modification of the domain model.
More info on this subject here: Are CQRS commands part of the domain model? (read "Commands vs DTOs" section).
- DTOs should be data-oriented, not object-oriented. Its properties should be mostly primitives. We are not modeling anything here, just sending flat data around.
- When returning a
Response
prefer whitelisting properties over blacklisting. This ensures that no sensitive data will leak in case if programmer forgets to blacklist newly added properties that shouldn't be returned to the user. - Interfaces for
Request
/Response
objects should be kept somewhere in shared directory instead of module directory since they may be used by a different application (like front-end page, mobile app or microservice). Consider creating git submodule or a separate package for sharing interfaces. Request
/Response
DTO classes may be a good place to use validation and sanitization decorators like class-validator and class-sanitizer (make sure that all validation errors are gathered first and only then return them to the user, this is called Notification pattern. Class-validator does this by default).Request
/Response
DTO classes may also be a good place to use Swagger/OpenAPI library decorators that NestJS provides.- If DTO decorators for validation/documentation are not used, DTO can be just an interface instead of class + interface.
- Data can be transformed to DTO format using a separate mapper or right in the constructor if DTO classes are used.
The Infrastructure is responsible strictly to keep technology. You can find there the implementations of database repositories for business entities, message brokers, I/O components, dependency injection, frameworks and any other thing that represents a detail for the architecture, mostly framework dependent, external dependencies, and so on.
It's the most volatile layer. Since the things in this layer are so likely to change, they are kept as far away as possible from the more stable domain layers. Because they are kept separate, it's relatively easy make changes or swap one component for another.
Infrastructure layer can contain Adapters
, database related files like Repositories
, ORM entities
/Schemas
, framework related files etc.
- Infrastructure adapters (also called driven/secondary adapters) enable a software system to interact with external systems by receiving, storing and providing data when requested (like persistence, message brokers, sending emails or messages, requesting 3rd party APIs etc).
- Adapters also can be used to interact with different domains inside single process to avoid coupling between those domains.
- Adapters are essentially an implementation of ports. They are not supposed to be called directly in any point in code, only through ports(interfaces).
- Adapters can be used as Anti-Corruption Layer (ACL) for legacy code.
Read more on ACL: Anti-Corruption Layer: How to Keep Legacy Support from Breaking New Systems
Adapters should have:
- a
port
somewhere in application/domain layer that it implements; - a mapper that maps data from and to domain (if it's needed);
- a DTO/interface for received data;
- a validator to make sure incoming data is not corrupted (validation can reside in DTO class using decorators, or it can be validated by
Value Objects
).
Repositories centralize common data access functionality. They encapsulate the logic required to access that data. Entities/aggregates can be put into a repository and then retrieved at a later time without domain even knowing where data is saved, in a database, or a file, or some other source.
We use repositories to decouple the infrastructure or technology used to access databases from the domain model layer.
Martin Fowler describes a repository as follows:
A repository performs the tasks of an intermediary between the domain model layers and data mapping, acting in a similar way to a set of domain objects in memory. Client objects declaratively build queries and send them to the repositories for answers. Conceptually, a repository encapsulates a set of objects stored in the database and operations that can be performed on them, providing a way that is closer to the persistence layer. Repositories, also, support the purpose of separating, clearly and in one direction, the dependency between the work domain and the data allocation or mapping.
The data flow here looks something like this: repository receives a domain Entity
from application service, maps it to database schema/ORM format, does required operations and maps it back to domain Entity
and returns it back to service.
Keep in mind that application's core is not allowed to depend on repositories directly, instead it depends on abstractions (ports/interfaces). This makes data retrieval technology-agnostic.
This project contains abstract repository class that allows to make basic CRUD operations: typeorm.repository.base.ts. This base class is then extended by a specific repository, and all specific operations that an entity may need is implemented in that specific repo: user.repository.ts.
Using a single entity for domain logic and database concerns leads to a database-centric architecture. In DDD world domain model and persistance model should be separated.
Since domain Entities
have their data modeled so that it best accommodates domain logic, it may be not in the best shape to save in a database. For that purpose Persistence models
can be created that have a shape that is better represented in a particular database that is used. Domain layer should not know anything about persistance models, and it should not care.
There can be multiple models optimized for different purposes, for example:
- Domain with it's own models -
Entities
,Aggregates
andValue Objects
. - Persistence layer with it's own models -
ORM
for SQL,Schemas
for NoSQL,Read/Write
models if databases are separated into a read and write db (CQRS) etc.
Over time, when the amount of data grows, there may be a need to make some changes in the database like improving performance or data integrity by re-designing some tables or even changing the database entirely. Without an explicit separation between Domain
and Persistance
models any change to the database will lead to change in your domain Entities
or Aggregates
. For example, when performing a database normalization data can spread across multiple tables rather than being in one table, or vice-versa for denormalization. This may force a team to do a complete refactoring of a domain layer which may cause unexpected bugs and challenges. Separating Domain and Persistance models prevents that.
An alternative to using Persistence Models may be raw queries or some sort of a query builder, in this case you may not need to create ORM Entities
or Schemas
.
Note: separating domain and persistance models may be an overkill for smaller applications, consider all pros and cons before making this decision.
Example files:
- user.orm-entity.ts <- Persistence model using ORM.
- user.orm-mapper.ts <- Persistence models should also have a corresponding mapper to map from domain to persistence and back.
Read more:
- Stack Overflow question: DDD - Persistence Model and Domain Model
- Just Stop It! The Domain Model Is Not The Persistence Model
- Secure by Design: Chapter 6.2.2 ORM frameworks and no-arg constructors
- Framework related files;
- Application logger implementation;
- Infrastructure related events (Nest-event)
- Periodic cron jobs or tasks launchers (NestJS Schedule);
- Other technology related files.
Be careful when implementing any complex architecture in small-medium sized projects with not a lot of business logic. Some of the building blocks/patterns/principles may fit well, but others may be an overengineering.
For example:
- Separating code into modules/layers/use-cases, using some building blocks like controllers/services/entities, respecting boundaries and dependency injections etc. may be a good idea for any project.
- But practices like creating an object for every primitive, using
Value Objects
to separate business logic into smaller classes, separatingDomain Models
fromPersistence Models
etc. in projects that are more data-centric and have little or no business logic may only complicate such solutions and add extra boilerplate code, data mapping, maintenance overheads etc. without adding much benefit.
DDD and other practices described here are mostly about creating software with complex business logic. But what would be a better approach for simpler applications?
For applications with not a lot of business logic consider other architectures. The most popular is probably MVC. Model-View-Controller is better suited for CRUD applications with little business logic since it tends to favor designs where software is mostly the view of the database.
Different projects most likely will have different requirements. Some principles/patterns in such projects can be implemented in a simplified form, some can be skipped. Follow YAGNI principle and don't overengineer.
Sometimes complex architecture and principles like SOLID can be incompatible with YAGNI and KISS. A good programmer should be pragmatic and has to be able to combine his skills and knowledge with a common sense to choose the best solution for the problem.
You need some experience with object-oriented software development in real world projects before they are of any use to you. Furthermore, they don’t tell you when you have found a good solution and when you went too far. Going too far means that you are outside the “scope” of a principle and the expected advantages don’t appear.
Principles, Heuristics, ‘laws of engineering’ are like hint signs, they are helpful when you know where they are pointing to and you know when you have gone too far. Applying them requires experience, that is trying things out, failing, analysing, talking to people, failing again, fixing, learning and failing some more. There is no short cut as far as I know.
Before implementing any pattern always analyze if benefit given by using it worth extra code complexity.
Effective design argues that we need to know the price of a pattern is worth paying - that's its own skill.
However, remember:
It's easier to refactor over-design than it is to refactor no design.
Read more:
- Martin Fowler blog: Yagni
- 7 Software Development Principles That Should Be Embraced Daily
- SOLID Principles and the Arts of Finding the Beach
Consider extending Error
object to make custom generic exception types for different situations. For example: DomainException
, ValidationException
etc. This is especially relevant in NodeJS world since there is no exceptions for different situations by default. Also, you can create domain-specific exceptions for different use-cases, like NotEnoughFundsError
or SeatIsAlreadyBookedError
.
Keep in mind that application's core
shouldn't throw HTTP exceptions or statuses since it shouldn't know in what context it is used, since it can be used by anything: HTTP controller, Microservice event handler, Command Line Interface etc. A better approach is to create custom error codes for your app, like code: 'WALLET.NOT_ENOUGH_FUNDS'
etc.
When used in HTTP context, for returning proper status code back to user an instanceof
or a switch/case
check against the custom code can be performed in exception interceptor or in a controller and appropriate HTTP exception can be returned depending on exception type/code.
Exception interceptor example: exception.interceptor.ts - notice how custom exceptions are converted to nest.js exceptions.
Adding a name
or code
string with type name or a custom status code for every exception is a good practice, since when that exception is transferred to another process instanceof
check cannot be performed anymore so a name
/code
string is used instead. name
or code
enum types can be stored in a separate file so they can be shared and reused on a receiving side: exception.types.ts.
When using microservices, exception types/enums/codes can be packed into a library and reused in each microservice for consistency.
Application should be protected not only from operational errors (like incorrect user input), but from a programmer errors as well by throwing exceptions when something is not used as intended.
For example:
- Operational errors can happen when validation error is thrown by validating user input, it means that input body is incorrect and a
400 Bad Request
exception should be returned to the user with details of what fields are incorrect (notification pattern). In this case user can fix the input body and retry the request. - On the other hand, programmer error means something unexpected occurs in the program. For example, when exception happens on a new domain object creation, sometimes it can mean that a class is not used as intended and some rule is violated, for example a programmer did a mistake by assigning an incorrect value to a constructor, or value got mutated at some point and is no longer valid. In this case user cannot do anything to fix this, only a programmer can, so it may be more appropriate to throw a different type of exception that should be logged and then returned to the user as
500 Internal Server Error
, in this case without adding much additional details to the response since it may cause a leak of some sensitive data.
Consider adding optional metadata
object to exceptions (if language doesn't support anything similar by default) and pass some useful technical information about the error when throwing. This will make debugging easier.
Important to keep in mind: never log or add to metadata
any sensitive information (like passwords, emails, phone or credit card numbers etc) since this information may leak into log files, and if log files are not protected properly this information can leak or be seen by developers who have access to log files. Aim adding only technical information to your logs.
- If translations of error messages to other languages is needed, consider storing those error messages in a separate object/class rather than inline string literals. This will make it easier to implement localization by adding conditional getters. Also, it is usually better to store all localization in a single place, for example, having a single file/folder for all messages that need translation, and then import them where needed. It is easier to add new translations when all of your messages are in one place rather then scattered across the app.
- You can use "Problem Details for HTTP APIs" standard for returned exceptions, described in RFC 7807. Read more about this standard: REST API Error Handling - Problem Details Response
- By default in NodeJS Error objects are not serialized properly when sending plain objects to external processes. Consider creating a
toJSON()
method so it can be easily sent to other processes as a plain object. (see example in exception.base.ts). But keep in mind not to return a stack trace when in production.
Example files:
- exception.base.ts - Exception abstract base class
- domain.exception.ts - Domain Exception class example
- Check exceptions folder to see more examples (some of them are exceptions from other languages like C# or Java)
Read more:
There is an alternative approach of not throwing exceptions, but returning some kind of Result object type with a Success or a Failure (an Either
monad from functional languages like Haskell). Unlike throwing exceptions, this approach allows to define types for exceptional outcomes and will force us to handle those cases explicitly instead of using try/catch
. For example:
class User {
// ...
public createUser(): Result<User, EmailInvalidException> {
// ...code for creating a user
if (invalidEmail) {
return Result.err(EmailInvalidException); // <- returning instead of throwing
}
return Result.ok(User);
}
}
This approach has its advantages each and may work nicely in some languages, especially in functional languages which support Either
type natively, but is not widely used in TypeScript/Javascript world.
Advantages:
- Explicitly shows type of each exception that a method can return so you can handle it accordingly.
- Complex domains may have a lot of exceptions that need special handling and are part of a business logic (like seat already booked, choose another one). In those cases explicit error types may be useful.
- Makes error tracing easier.
Downsides:
- If used incorrectly, i.e for technical (connection failed) errors, It may cause some security issues and goes against Fail-fast principle. Instead of terminating a program flow, returning exception continues program execution and allows it to run in an incorrect state, which may lead to more unexpected errors, so it's generally better to throw in those cases rather then returning an error.
- It adds extra complexity. Exception cases returned somewhere deep inside application have to be handled by functions in upper layers until it reaches controllers which may add a lot of extra
if
statements. - More boilerplate code.
In most applications it makes more sense to just throw an exception and notify a user immediately. Use Result
/Either
error types carefully and if you really need it and know what you are doing (unless you're using a language like Rust which has this functionality built-in).
Read more:
- "Secure by Design" Chapter 9.2: Handling failures without exceptions
- Flexible Error Handling w/ the Result Class
Software Testing helps catching bugs early. Properly tested software product ensures reliability, security and high performance which further results in time saving, cost effectiveness and customer satisfaction.
Lets review two types of software testing:
Testing module/use-case internal structures (creating a test for every file/class) is called White Box
testing. White Box testing is widely used technique, but it has disadvantages. It creates coupling to implementation details, so every time you decide to refactor business logic code this may also cause a refactoring of corresponding tests.
Use case requirements may change mid work, your understanding of a problem may evolve or you may start noticing new patterns that emerge during development, in other words, you start noticing a "big picture", which may lead to refactoring. For example: imagine that you defined a White box
test for a class, and while developing this class you start noticing that it does too much and should be separated into two classes. Now you'll also have to refactor your unit test. After some time, while implementing a new feature, you notice that this new feature uses some code from that class you defined before, so you decide to separate that code and make it reusable, creating a third class (which originally was one), which leads to changing your unit tests yet again, every time you refactor. Use case requirements, input, output or behavior never changed, but unit tests had to be changed multiple times. This is inefficient and time consuming.
To solve this and get the most out of your tests, prefer Black Box
testing (Behavioral Testing). This means that tests should focus on testing user-facing behavior users care about (your code's public API), not the implementation details of individual units it has inside. This avoids coupling, protects tests from changes that may happen while refactoring, makes tests easier to understand and maintain thus saving time.
Tests that are independent of implementation details are easier to maintain since they don't need to be changed each time you make a change to the implementation.
Try to avoid White Box testing when possible. However, it's worth mentioning that there are cases when White Box testing may be useful. For instance, we need to go deeper into the implementation when it is required to reduce combinations of testing conditions, for example, a class uses several plug-in strategies, thus it is easier for us to test those strategies one at a time, in this case White Box tests may be appropriate.
Use White Box testing only when it is really needed and as an addition to Black Box testing, not the other way around.
It's all about investing only in the tests that yield the biggest return on your effort.
Behavioral tests can be divided in two parts:
- Fast: Use cases tests in isolation which test only your business logic, with all I/O (external API or database calls, file reads etc.) mocked. This makes tests fast so they can be run all the time (after each change or before every commit). This will inform you when something fails as fast as possible. Finding bugs early is critical and saves a lot of time.
- Slow: Full End to End (e2e) tests which test a use case from end-user standpoint. Instead of injecting I/O mocks those tests should have all infrastructure up and running: like database, API routes etc. Those tests check how everything works together and are slower so can be run only before pushing/deploying. Though e2e tests live in the same project/repository, it is a good practice to have e2e tests independent from project's code. In bigger projects e2e tests are usually written by a separate QA team.
Note: some people try to make e2e tests faster by using in-memory or embedded databases (like sqlite3). This makes tests faster, but reduces the reliability of those tests and should be avoided. Read more: Don't use In-Memory Databases for Tests.
For BDD tests Cucumber with Gherkin syntax can give a structure and meaning to your tests. This way even people not involved in a development can define steps needed for testing. In node.js world jest-cucumber is a nice package to achieve that.
Example files:
- create-user.feature - feature file that contains Gherkin steps
- create-user.e2e-spec.ts - spec file that executes Gherkin steps
Read more:
- Pragmatic unit testing
- Google Blog: Test Behavior, Not Implementation
- Writing BDD Test Scenarios
- Book: Unit Testing Principles, Practices, and Patterns
For projects with a bigger user base you might want to implement some kind of load testing to see how program behaves with a lot of concurrent users.
Load testing is a great way to minimize performance risks, because it ensures an API can handle an expected load. By simulating traffic to an API in development, businesses can identify bottlenecks before they reach production environments. These bottlenecks can be difficult to find in development environments in the absence of a production load.
Automatic load testing tools can simulate that load by making a lot of concurrent requests to an API and measure response times and error rates.
Example tools:
Example files:
- create-user.artillery.yaml - Artillery load testing config file. Also can be useful for seeding database with dummy data.
More info:
- Top 6 Tools for API & Load Testing.
- Getting started with API Load Testing (Stress, Spike, Load, Soak)
Fuzzing or fuzz testing is an automated software testing technique that involves providing invalid, unexpected, or random data as inputs to a computer program.
Fuzzing is a common method hackers use to find vulnerabilities of the system. For example:
- JavaScript injections can be executed if input is not sanitized properly, so a malicious JS code can end up in a database and then gets executed in a browser when somebody reads that data.
- SQL injection attacks can occur if data is not sanitized properly, so hackers can get access to a database (though modern ORM libraries can protect from that kind of attacks when used properly).
- Sending weird unicode characters, emojis etc. can crash your application.
There are a lot of examples of a problems like this, for example sending a certain character could crash and disable access to apps on an iPhone.
Sanitizing and validating input data is very important. But sometimes we make mistakes of not sanitizing/validating data properly, opening application to certain vulnerabilities.
Automated Fuzz testing tools can prevent such vulnerabilities. Those tools contain a list of strings that are usually sent by hackers, like malicious code snippets, SQL queries, unicode symbols etc. (for example: Big List of Naughty Strings), which helps test most common cases of different injection attacks.
Fuzz testing is a nice addition to typical testing methods described above and potentially can find serious security vulnerabilities or defects.
Example tools:
- Artillery Fuzzer is a plugin for Artillery to perform Fuzz testing.
- sqlmap - an open source penetration testing tool that automates the process of detecting and exploiting SQL injection flaws
Read more:
- Store all configurable variables/parameters in config files. Try to avoid using in-line literals/primitives. This will make it easier to find and maintain all configurable parameters when they are in one place.
- Never store sensitive configuration variables (passwords/API keys/secret keys etc) in plain text in a configuration files or source code.
- Store sensitive configuration variables, or variables that change depending on environment, as environment variables (dotenv is a nice package for that) or as a Docker/Kubernetes secrets.
- Create hierarchical config files that are grouped into sections. If possible, create multiple files for different configs (like database config, API config, tasks config etc).
- Application should fail and provide the immediate feedback if the required environment variables are not present at start-up.
- For most projects plain object configs may be enough, but there are other options, for example: NestJS Configuration, rc, nconf or any other package.
Example files:
- ormconfig.ts - this is typeorm database config file. Notice
process.env
- those are environmental variables. - .env.example - this is dotenv example file. This file should only store dummy example secret keys, never store actual development/production secrets in it. This file later is renamed to
.env
and populated with real keys for every environment (local, dev or prod). Don't forget to add.env
to .gitignore file to avoid pushing it to repo and leaking all keys.
- Try to log all meaningful events in a program that can be useful to anybody in your team.
- Use proper log levels:
log
/info
for events that are meaningful during production,debug
for events useful while developing/debugging, andwarn
/error
for unwanted behavior on any stage. - Write meaningful log messages and include metadata that may be useful. Try to avoid cryptic messages that only you understand.
- Never log sensitive data: passwords, emails, credit card numbers etc. since this data will end up in log files. If log files are not stored securely this data can be leaked.
- Avoid default logging tools (like
console.log
). Use mature logger libraries (for example Winston) that support features like enabling/disabling log levels, convenient log formats that are easy to parse (like JSON) etc. - Consider including user id in logs. It will facilitate investigating if user creates an incident ticket.
- In distributed systems a gateway can generate an unique correlation id for each request and pass it to every system that processes this request. Logging this id will make it easier to find related logs across different systems/files.
- Use consistent structure across all logs. Each log line should represent one single event and can contain things like a timestamp, context, unique user id or correlation id and/or id of an entity/aggregate that is being modified, as well as additional metadata if required.
- Use log managements systems. This will allow you to track and analyze logs as they happen in real-time. Here are some short list of log managers: Sentry, Loggly, Logstash, Splunk etc.
- Send notifications of important events that happen in production to a corporate chat like Slack or even by SMS.
- Don't write logs to a file from your program. Write all logs to stdout (to a terminal window) and let other tools handle writing logs to a file (for example docker supports writing logs to a file). Read more: Why should your Node.js application not handle log routing?
- Logs can be visualized by using a tool like Kibana.
Read more:
Additionally to logging tools, when something unexpected happens in production, it's critical to have thorough monitoring in place. As software hardens more and more, unexpected events will get more and more infrequent and reproducing those events will become harder and harder. So when one of those unexpected events happens, there should be as much data available about the event as possible. Software should be designed from the start to be monitored. Monitoring aspects of software are almost as important as the functionality of the software itself, especially in big systems, since unexpected events can lead to money and reputation loss for a company. Monitoring helps fixing and sometimes preventing unexpected behavior like failures, slow response times, errors etc.
Health monitoring tools are a good way to keep track of system performance, identify causes of crashes or downtime, monitor behavior, availability and load.
Some health monitoring tools already include logging management and error tracking, as well as alerts and general performance monitoring.
Here are some basic recommendation on what can be monitored:
- Connectivity – Verify if user can successfully send a request to the API endpoint and get a response with expected HTTP status code. This will confirm if the API endpoint is up and running. This can be achieved by creating some kind of 'heath check' endpoint.
- Performance – Make sure the response time of the API is within acceptable limits. Long response times cause bad user experience.
- Error rate – errors immediately affect your customers, you need to know when errors happen right away and fix them.
- CPU and Memory usage – spikes in CPU and Memory usage can indicate that there are problems in your system, for example bad optimized code, unwanted process running, memory leaks etc. This can result in loss of money for your organization, especially when cloud providers are used.
- Storage usage – servers run out of storage. Monitoring storage usage is essential to avoid data loss.
Choose health monitoring tools depending on your needs, here are some examples:
Read more:
So instead of using typical layered style when an entire application is divided into services, controllers etc, we divide everything by modules. Now, how to structure files inside those modules?
A lot of people tend to do the same thing as before: create one big service/controller for a module and keep all logic for module's use cases there, making those controllers and services hundreds of lines long, which is hard to navigate and makes merge conflicts a nightmare to manage. Or they create a folder for each file type, like interfaces
or services
folder and store all unrelated to each other interfaces/services in there. This is the same approach that makes navigation harder. Every time you need to change something, instead of having all related files in the same place, you have to jump folders to find where the related files are.
It would be more logical to separate every module by components and have all related files close together. For example, check out create-user folder. It has most of the files that it needs inside the same folder: a controller, service, command etc. Now if a use-case changes, most of the changes are usually made in a single component (folder), not everywhere across the module.
And shared files, like domain objects (entities/aggregates), repositories, shared dtos and interfaces etc are stored apart since those are reused by multiple use-cases. Domain layer is isolated, and use-cases which are essentially wrappers around business logic are treated as components. This approach makes navigation and maintaining easier. Check user module for more examples.
This is called The Common Closure Principle (CCP). Folder/file structure in this project uses this principle. Related files that usually change together (and are not used by anything else outside of that component) are stored close together, in a single use-case folder.
The aim here should to be strategic and place classes that we, from experience, know often changes together into the same component.
Keep in mind that this project's folder/file structure is an example and might not work for everyone. Main recommendations here are:
- Separate you application into modules;
- Keep files that change together close to each other (Common Closure Principle);
- Group files by their behavior that changes together, not by a type of functionality that file provides;
- Keep files that are reused by multiple components apart;
- Respect boundaries in your code, keeping files together doesn't mean inner layers can import outer layers;
- Try to avoid a lot of nested folders;
- Move files around until it feels right.
There are different approaches to file/folder structuring, like explicitly separating each layer into a corresponding folder. This defines boundaries more clearly but is harder to navigate. Choose what suits better for the project/personal preference.
Examples:
- Commands folder contains all state changing use cases and each use case inside it contains most of the things that it needs: controller, service, dto, command etc.
- Queries folder is structured in the same way as commands but contains data retrieval use cases.
Read more:
Consider giving a descriptive type names to files after a dot ".
", like *.service.ts
or *.entity.ts
. This makes it easier to differentiate what files does what and makes it easier to find those files using fuzzy search (CTRL+P
for Windows/Linux and ⌘+P
for MacOS in VSCode to try it out).
Read more:
Static code analysis is a method of debugging by examining source code before a program is run.
For JavasScript and TypeScript, Eslint with typescript-eslint plugin and some rules (like airbnb / airbnb-typescript) can be a great tool to enforce writing better code.
Try to make linter rules reasonably strict, this will help greatly to avoid "shooting yourself in a foot". Strict linter rules can prevent bugs and even serious security holes (eslint-plugin-security).
Adopt programming habits that constrain you, to help you to limit mistakes.
For example:
Using explicit any
type is a bad practice. Consider disallowing it (and other things that may cause problems):
// .eslintrc.js file
rules: {
'@typescript-eslint/no-explicit-any': 'error',
// ...
}
Also, enabling strict mode in tsconfig.json
is recommended, this will disallow things like implicit any
types:
"compilerOptions": {
"strict": true,
// ...
}
Example file: .eslintrc.js
Code Spell Checker may be a good addition to eslint.
Read more:
The way code looks adds to our understanding of it. Good style makes reading code a pleasurable and consistent experience.
Consider using code formatters like Prettier to maintain same code styles in the project.
Read more:
Here are some useful tips to help users/other developers to use your program.
Use OpenAPI (Swagger) or GraphQL specifications. Document in details every endpoint. Add description and examples of every request, response, properties and exceptions that endpoints may return or receive as body/parameters. This will help greatly to other developers and users of your API.
Example files:
- user.response.dto.ts - notice
@ApiProperty()
decorators. This is NestJS Swagger module. - create-user.http.controller.ts - notice
@ApiOperation()
and@ApiResponse()
decorators.
Read more:
Create a simple readme file in a git repository that describes basic app functionality, available CLI commands, how to setup a new project etc.
Code can be self-documenting to some degree. One useful trick is to separate complex code to smaller chunks with a descriptive name. For example:
- Separating a big function into a bunch of small ones with descriptive names, each with a single responsibility;
- Moving in-line primitives or hard to read conditionals into a variable with a descriptive name.
This makes code easier to understand and maintain.
Read more:
Writing readable code, using descriptive function/method/variable names and creating tests can document your code well enough. Try to avoid comments when possible and try to make your code legible and tested instead.
Use comments only when it's really needed. Commenting may be a code smell in some cases, like when code gets changed but a developer forgets to update a comment (comments should be maintained, too).
Code never lies, comments sometimes do.
Use comments only in some special cases, like when writing an counter-intuitive "hack" or performance optimization which is hard to read.
For documenting public APIs use code annotations (like JSDoc) instead of comments, this works nicely with code editor intellisense.
Read more:
Types give useful semantic information to a developer and can be useful for documenting code, so prefer static typed languages to dynamic typed (untyped) languages for larger projects (for example by using TypeScript over JavaScript).
Note: For smaller projects/scripts/jobs static typing may not be needed.
There are a lot of projects out there which take effort to configure after downloading it. Everything has to be set up manually: database, all configs etc. If new developer joins the team he has to waste a lot of time just to make application work.
This is a bad practice and should be avoided. Setting up project after downloading it should be as easy as launching one or few commands in terminal. Consider adding scripts to do this automatically:
- package.json scripts
- docker-compose file
- Makefile
- Database seeding and migrations (described below)
- or any other tools.
Example files:
- package.json - notice all added scripts for launching tests, migrations, seeding, docker environment etc.
- docker-compose.yml - after configuring everything in a docker-compose file, running a database and a db admin panel (and any other additional tools) can be done using only one command. This way there is no need to install and configure a database separately.
To avoid manually creating data in the database, seeding is a great solution to populate database with data for development and testing purposes (e2e testing). Wiki description.
This project uses typeorm-seeding package.
Example file: user.seeds.ts
Migrations are used for database table/schema changes:
Database migration refers to the management of incremental, reversible changes and version control to relational database schemas. A schema migration is performed on a database whenever it is necessary to update or revert that database's schema to some newer or older version.
Source: Wiki
Migrations should be generated every time database table schema is changed. When pushed to production it can be launched automatically.
BE CAREFUL not to drop some columns/tables that contain data by accident. Perform data migrations before table schema migrations and always backup database before doing anything.
This project uses Typeorm Migrations which automatically generates sql table schema migrations like this:
Example file: 1611765824842-CreateTables.ts
Seeds and migrations belong to Infrastructure layer.
By default there is no limit on how many request users can make to your API. This may lead to problems, like DoS or brute force attacks, performance issues like high response time etc.
To solve this, implementing Rate Limiting is essential for any API.
- In NodeJS world, express-rate-limit is an option for simple APIs.
- Another alternative is NGINX Rate Limiting.
- Kong has rate limiting plugin.
Read more:
- Everything You Need To Know About API Rate Limiting
- Rate-limiting strategies and techniques
- How to Design a Scalable Rate Limiting Algorithm
Code generation can be important when using complex architectures to avoid typing boilerplate code manually.
Hygen is a great example. This tool can generate building blocks (or entire modules) by using custom templates. Templates can be designed to follow best practices and concepts based on Clean/Hexagonal Architecture, DDD, SOLID etc.
Main advantages of automatic code generation are:
- Avoid manual typing or copy-pasting of boilerplate code.
- No hand-coding means less errors and faster implementations. Simple CRUD module can be generated and used right away in seconds without any manual code writing.
- Using auto-generated code templates ensures that everyone in the team uses the same folder/file structures, name conventions, architectural and code styles.
Note:
- To really understand and work with generated templates you need to understand what is being generated and why, so full understanding of an architecture and patterns used is required.
Consider creating a bunch of shared custom utility types for different situations.
Some examples can be found in types folder.
Consider launching tests/code formatting/linting every time you do git push
or git commit
. This prevents bad code getting in your repo. Husky is a great tool for that.
Read more:
This can be achieved by making class final
.
Note: in TypeScript, unlike other languages, there is no default way to make class final
. But there is a way around it using a custom decorator.
Example file: final.decorator.ts
Read more:
Conventional commits add some useful prefixes to your commit messages, for example:
feat: added ability to delete user's profile
This creates a common language that makes easier communicating the nature of changes to teammates and also may be useful for automatic package versioning and release notes generation.
Read more:
- DDD, Hexagonal, Onion, Clean, CQRS, … How I put it all together
- Hexagonal Architecture
- Clean architecture series
- Clean architecture for the rest of us
- An illustrated guide to 12 Factor Apps
- The Twelve-Factor App
- Refactoring guru - Catalog of Design Patterns
- Microsoft - Cloud Design Patterns
- More Testable Code with the Hexagonal Architecture
- Playlist: Design Patterns Video Tutorial
- Playlist: Design Patterns in Object Oriented Programming
- Herberto Graca - Making architecture explicit
- "Domain-Driven Design: Tackling Complexity in the Heart of Software" by Eric Evans
- "Secure by Design" by Dan Bergh Johnsson, Daniel Deogun, Daniel Sawano
- "Implementing Domain-Driven Design" by Vaughn Vernon
- "Clean Architecture: A Craftsman's Guide to Software Structure and Design" by Robert Martin
- Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann