Your ORM Is Lying to You (And You’re Paying for It)
I need to tell you something that took me a very long time to admit.
For most of my career, I thought that writing raw SQL was a sign of failure. It meant you couldn’t make the ORM do what you wanted. It meant your abstractions had broken down. Senior engineers used ORMs; junior engineers wrote SQL.
I was wrong. Not just wrong. That belief was actively harmful. It led me to write code that looked clean and efficient but was actually a performance disaster.
Can you guess how many SQL queries it took to render a single page in one of our applications? A simple list page with twenty rows.
The answer is 47. Forty-seven round trips to the database and back to show twenty rows of data.
The code that produced that was absolutely clean. Type-safe. Fully tested. With beautiful abstractions.
And that code was quietly destroying our response times. No one had noticed, because the abstraction had hidden the crime from us.
The one thing nobody puts in an ORM’s marketing material is that it is very good at hiding what it is really doing to the database.
## Let’s Talk About What You Actually Have
An ORM brings a lot to the table. And that is part of the problem. Some of what it brings is so good that it can hurt you without you noticing.
ORM brings schema management – versioned migrations, rollbacks, and schema diffing. That part is fantastic, and you should absolutely keep using it.
It also brings a nice API for simple object operations – `findById`, `create`, `update`, and `delete`. That part is fantastic too. The code is clean, the type safety is real, and there’s no meaningful performance cost.
Next, mapping between objects and tables. This is the core of an ORM – and it is also where the problems start. The mapping is perfect when it works and creates a silent problem when it does not.
ORM brings an illusion of database portability. And yes, in some ideal world you can switch from MySQL to PostgreSQL without changing a line of code. In the real world, you never do. You pick a database and you stick with it. Chasing that illusion causes you to avoid the database-specific features that would actually make your application fast.
Every abstraction makes tradeoffs. If you understand them, you will survive. If you don’t, you will end up with forty-seven (or even more) queries to show twenty rows.
## The N+1 Problem Is a Symptom, Not the Disease
Almost every developer who has worked with an ORM has heard about the N+1 query problem: lazy-load a relationship inside a loop, and you trigger one query per iteration. Almost every ORM tutorial uses it as the cautionary example.
But I think focusing just on the N+1 problem misses the bigger picture. The N+1 problem is just the visible tip of the iceberg of ORM issues: ORMs make it too easy to write code that looks efficient but is a disaster under the hood.
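To make that concrete, here is the shape a classic N+1 takes in the query log, using a hypothetical `posts`/`authors` schema purely for illustration:

```sql
-- One query for the list page...
SELECT id, title, author_id
FROM posts
ORDER BY created_at DESC
LIMIT 20;

-- ...then one query per row as the lazy relationship loads:
SELECT id, name FROM authors WHERE id = 1;
SELECT id, name FROM authors WHERE id = 2;
-- ...eighteen more of these...

-- versus the single JOIN you would have written by hand:
SELECT p.id, p.title, a.name
FROM posts p
JOIN authors a ON a.id = p.author_id
ORDER BY p.created_at DESC
LIMIT 20;
```

Twenty-one queries collapse into one, and the difference never shows up in the application code – only in the log.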
My 47 queries were not all N+1 patterns. Some were redundant lookups. In some cases, the ORM was loading full entities when only two fields were needed. Some were automatic COUNT queries sent by the pagination library. None of this is obvious from reading the application code! You can see it only by looking at the query log.
And raw SQL gives you the opposite experience. You know exactly what queries are being sent to the database.
Check this raw SQL:

```sql
SELECT id, name, email
FROM users
WHERE account_id = ?
  AND is_active = 1
ORDER BY created_at DESC
LIMIT 20 OFFSET 0;
```
You know the query. You know the exact columns being fetched. You can think through the indexes it will need. And you can paste it directly into your database console and run `EXPLAIN` on it.
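For instance, you can check the plan and sketch a supporting index straight from the console. The index name and column order below are illustrative – verify against your own `EXPLAIN` output, and note that descending index keys require MySQL 8+:

```sql
-- Substitute a real value for the placeholder when running interactively:
EXPLAIN SELECT id, name, email
FROM users
WHERE account_id = 42
  AND is_active = 1
ORDER BY created_at DESC
LIMIT 20 OFFSET 0;

-- A composite index that would let MySQL filter and sort without a filesort:
CREATE INDEX idx_users_account_active_created
    ON users (account_id, is_active, created_at DESC);
```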
With the ORM, all you know is the intent. The real query is assembled later, at runtime. And the generated queries can change over time as your models change, in ways you might not control.
## Where ORMs Genuinely Cannot Follow You
There is a class of problems where an ORM is not just slow – it simply cannot handle them. And no amount of configuration fixes that.
### Window Functions
Window functions - `RANK()`, `ROW_NUMBER()`, `LAG()`, `LEAD()`, running totals, moving averages - are among the most powerful features of modern SQL. In a single query, they let you do things that would take quite complex logic at the application level.
```sql
SELECT
    user_id,
    amount,
    order_date,
    SUM(amount) OVER (
        PARTITION BY user_id
        ORDER BY order_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total
FROM orders
WHERE account_id = ?;
```
This query produces a running total for the account in a single round trip. It is fast, elegant, and correct.
Try to express the same thing in the ORM's query builder and you will not be able to - not without dropping down to the same raw SQL anyway. That leaves two options: pull all the rows into memory and compute the running total at the application level, or write raw SQL. There is no third option.
### Bulk Operations
Imagine the scenario: you receive 10,000 records from an external API and need to sync them into your database. New records should be inserted, and existing ones updated if the incoming data is newer.
The ORM approach:
```php
foreach ($incomingRecords as $data) {
    $record = $this->repository->findOneBy(['external_id' => $data['id']]);

    if ($record === null) {
        $record = new SyncRecord();
        $record->setExternalId($data['id']);
    }

    $record->setPayload($data['payload']);
    $record->setSyncedAt(new DateTimeImmutable());

    $this->em->persist($record);
}

$this->em->flush();
```
To process 10,000 records, this executes 10,000 SELECT queries followed by up to 10,000 INSERT or UPDATE queries. That takes quite some time.
The SQL approach:
```sql
INSERT INTO sync_records (external_id, payload, synced_at)
VALUES (?, ?, NOW()),
       (?, ?, NOW()),
       ...
ON DUPLICATE KEY UPDATE
    payload = VALUES(payload),
    synced_at = VALUES(synced_at);
```
Batched in chunks of 500-1,000 rows, the total execution time drops to seconds. `ON DUPLICATE KEY UPDATE` is a MySQL feature with no ORM equivalent. The abstraction simply cannot cover this.
### JSON Operations
If you store JSON in your database – and many applications do, now that MySQL 8 and modern PostgreSQL make it genuinely useful – you will eventually run into the wall of what your ORM can express.
Filtering by a JSON path? Indexing on a JSON field value? Using `JSON_TABLE` to convert a JSON array into rows? These are real operations that databases handle well. They are also operations that ORMs handle poorly, partially, or not at all.
```sql
-- Find all products where attributes contain color = 'blue'
SELECT *
FROM products
WHERE JSON_UNQUOTE(JSON_EXTRACT(attributes, '$.color')) = 'blue';

-- This works because we created a functional index:
-- CREATE INDEX idx_color ON products((JSON_UNQUOTE(JSON_EXTRACT(attributes, '$.color'))));
```
The moment you need something like this, you're writing raw SQL anyway. Technically, you can expose the inner JSON field as a generated column on the table, but that is a workaround, not real ORM support.
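For completeness, that trick looks like this in MySQL: a generated column materializes the JSON path so it can be indexed and, in some ORMs, mapped like an ordinary field. The column and index names here are made up for the example:

```sql
ALTER TABLE products
    ADD COLUMN color VARCHAR(32)
        GENERATED ALWAYS AS (JSON_UNQUOTE(JSON_EXTRACT(attributes, '$.color'))) VIRTUAL,
    ADD INDEX idx_products_color (color);

-- Now the filter is a plain indexed lookup:
SELECT * FROM products WHERE color = 'blue';
```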
## Debugging: The Hidden Tax
Here's a scenario you've probably lived through.
There's a slow query in production. Not terribly slow, but slow enough that customers complain about it, or that monitoring starts flagging it.
You open the application code to find the query. It's a method chain: fifteen calls deep, filters applied conditionally based on request parameters, a couple of joins added somewhere in a base repository class, eager loading configured in a model file you haven't opened in a while.
You enable query logging and capture the SQL. It's 60 lines. It has a subquery you don't remember writing. There's an `ORDER BY` on a column that isn't indexed.
Now you need to find out where that subquery came from. Which of the fifteen method chain calls produced it? Is it coming from the base class or the child?
This is the hidden tax of ORM complexity. When something goes wrong, you're debugging two layers simultaneously: the application logic and the query generation logic. With raw SQL, you only debug one.
```php
// What you see in code review
$users = $this->userRepository
    ->createQueryBuilder('u')
    ->leftJoin('u.account', 'a')
    ->andWhere('u.isActive = :active')
    ->andWhere('a.plan = :plan')
    ->setParameter('active', true)
    ->setParameter('plan', $plan)
    ->getQuery()
    ->getResult();
```

```sql
-- What actually runs, and what nobody reads until something breaks
SELECT u.id, u.name, u.email, u.created_at, u.updated_at, u.is_active,
       u.account_id, a.id AS a_id, a.name AS a_name, a.plan AS a_plan,
       a.is_active AS a_is_active, a.created_at AS a_created_at,
       a.updated_at AS a_updated_at, a.meta AS a_meta
FROM users u
LEFT JOIN accounts a ON a.id = u.account_id
WHERE u.is_active = 1
  AND a.plan = 'enterprise';
```
That `SELECT *` behavior - pulling every column from every joined table into memory - is what you're paying for when you use full entity hydration. Often, you only needed three of those twenty columns. The ORM fetched all twenty because it wanted to give you a fully-hydrated object.
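The hand-written alternative fetches exactly what the page needs and nothing else – a sketch against the same (assumed) schema:

```sql
-- Three columns instead of fourteen, same filter:
SELECT u.id, u.name, u.email
FROM users u
JOIN accounts a ON a.id = u.account_id
WHERE u.is_active = 1
  AND a.plan = 'enterprise';
```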
## The Architecture That Actually Works
After ten years of making and watching these mistakes, here is the structure I apply to the projects I work on.
**ORM owns:**
- All schema definitions and migrations
- Entity lifecycle management (create, persist, flush, remove)
- Simple lookups: find by ID, find by single filter, paginated lists of entities
- Relationship management within bounded, small result sets
**Raw SQL owns:**
- Queries destined for reports, dashboards, or exports
- All bulk operations: batch inserts, batch updates, upserts
- Aggregation across multiple tables
- Any query that uses database-specific features
- And quite often, any query that shows up in the slow query log
In Symfony, the implementation of this separation is clean. Entity repositories handle the ORM layer. A separate set of query classes - I call them "read repositories" or "query repositories" - handle the SQL layer using DBAL directly:
```php
// Entity repository – pure ORM, simple operations
class UserRepository extends ServiceEntityRepository
{
    public function findActiveById(int $id): ?User
    {
        return $this->findOneBy(['id' => $id, 'isActive' => true]);
    }

    public function save(User $user): void
    {
        $this->getEntityManager()->persist($user);
        $this->getEntityManager()->flush();
    }
}

// Query repository – raw SQL, complex reads
class UserAnalyticsRepository
{
    public function __construct(private readonly Connection $db) {}

    public function getRetentionBySignupCohort(DateTimeImmutable $from, DateTimeImmutable $to): array
    {
        $sql = <<<SQL
        SELECT
            DATE_FORMAT(u.created_at, '%Y-%m') AS cohort,
            COUNT(DISTINCT u.id) AS total_users,
            COUNT(DISTINCT CASE
                WHEN u.last_active_at >= DATE_SUB(NOW(), INTERVAL 30 DAY)
                THEN u.id END) AS active_last_30d,
            ROUND(
                COUNT(DISTINCT CASE
                    WHEN u.last_active_at >= DATE_SUB(NOW(), INTERVAL 30 DAY)
                    THEN u.id END)
                / COUNT(DISTINCT u.id) * 100, 1
            ) AS retention_pct
        FROM users u
        WHERE u.created_at BETWEEN :from AND :to
        GROUP BY cohort
        ORDER BY cohort ASC
        SQL;

        return $this->db->fetchAllAssociative($sql, [
            'from' => $from->format('Y-m-d'),
            'to' => $to->format('Y-m-d'),
        ]);
    }
}
```
The DBAL gives you everything you need – prepared statements, connection management, transaction support. You’re not giving up safety to write SQL. You’re giving up object hydration, which you didn’t need for this query anyway.
Notice also what the raw SQL version gives you that the ORM version can’t: the query is right there. Any developer on the team can read it, understand it, copy it into a database console, run `EXPLAIN` on it, and tune it. No need to extract it from the code.
## The Skill That Was Never Optional
There’s a belief that has gotten more common as ORMs have improved. Some folks think that SQL is a legacy skill, something you fall back on when modern tools fail you, like knowing how to use a physical map because your phone died.
I think this gets it backwards.
SQL is the language your database speaks. The database is doing more work than any other component in most backend applications. Understanding what you’re asking it to do is not optional expertise. It is foundational.
The developers I’ve most respected over my career share something in common: they’re comfortable at every layer. They use the ORM when it fits and write SQL when it doesn’t. They don’t feel like reaching for SQL is an admission of defeat. They feel like it’s using the right tool. And they know the right tool because they understand both.
The database doesn’t care how the final SQL query was created. It cares whether the query can use an index, how many rows it has to examine, and how much data it has to move. Meet it where it lives.