Monthly Archives: September 2025

September 5, 2025

The definitive answer to the ID vs TABLE_ID debate

This is a generated image showing two tables with surrogate keys, one with the surrogate key named ID and another with the surrogate key named TABLE_ID. The is a box below the tables with the text "ID vs TABLE_ID".

Few debates in database design spark as much disagreement as what to name a surrogate key. There has been a debate over the years about the name of that column, and it generally falls into two choices:

Every table that has a surrogate key should name the column ID.
Every table that has a surrogate key should name the column TABLE_ID.

I have very strong opinions about this debate, born out of over 35 years of building database-backed business applications.

My sincere belief is that every surrogate key should be prefixed with the table name. No, I’m not going to fall for the straw man argument that any of the other columns in a table should be prefixed with the table name. The surrogate key is special.

The ID folks try to make some of the following arguments:

It is short and universal. You don’t have to think as much.
If you are using an ORM (Object-Relational Mapping framework like Hibernate for example), you don’t have to do the incredibly simple thing of updating the ORM to expect table_id instead of just id, because most ORMs default to just ID. Of course, they don’t say it that way; they just say “Our ORM expects ID.”.
We already have tables that just use ID, so we should stick to that standard.

And.. That’s it. Now, sometimes folks will talk about already having to prefix column names in select clauses, and employee.employee_id is redundant, but the reality is that you are almost certainly going to alias the employee table name as either e or maybe emp, you’re never really going to type employee.employee_id.

So, why are the ID folks wrong?

To me, the biggest, most fundamental thing is this:

You have a thing in the database, and you are calling it two different names depending on where it is located

If we have an employee table with an ID column as a surrogate key, whenever we use that surrogate key in another table as a foreign key, we typically put the table name in front of it. So ID in the employee table, and then EMPLOYEE_ID as a foreign key in any table related to the employee table.

If we call this thing EMPLOYEE_ID in the employee table and EMPLOYEE_ID whenever we use it as a foreign key (and yes, I am aware this thing could have another name when used in a different context, for example, MANAGER_ID), then it has the same name everywhere it is used.

It also goes the other way, in that you have hundreds of things named ID, but they are all different things in different tables.

This puts to rest the “It’s short and universal, you don’t have to think as much” argument because you end up thinking more every time you access the table joined to another table.

I strongly believe that the ID to EMPLOYEE_ID naming convention is a cognitive tax that must be paid over and over again.

However, there are a host of other reasons why table_id is the right way to go…

ID as a surrogate key means you are unable to join tables with a USING clause

I totally understand that some people don’t like the USING clause, but I find it very nice.

select ...
  from department d
  join employee e using(department_id);

The above is much cleaner, contains fewer opportunities for mistakes, and is easier to read compared to the below.

select ...
  from department d
  join employee e on d.id = e.department_id;

There are so many ways to make mistakes in the 2nd statement compared to the first statement. Now make this an 8-way table join, and the fun just compounds!

Clarity in queries

Surrogate key names now carry meaning without qualification, which is great in SQL, views, etc.

Self-joins are clearer

where manager_id = employee_id

reads better than

where manager_id = id

Bridge tables read naturally

Associative tables like employee_project(employee_id,project_id, ...) are self-documenting and enable USING clauses in both directions.

Grepability/searching/diagnostics is much easier

Searching logs or code for where a given entity flows is much harder if every table has a column named ID. Sometimes ID means EMPLOYEE_ID, sometimes ID means DEPARTMENT_ID, etc.

REST APIs and JSON payloads are going to read better

Your API contracts and/or JSON payloads are probably going to use EMPLOYEE_ID and DEPARTMENT_ID anyway.

But wait, what about our tables that already use ID?

We’ve already discussed the cognitive tax when using ID, and, given the choice, I’d remove that cognitive tax on all new tables in the database. With old tables that used ID, there are some choices:

Bite the bullet and do the rename of the column, and then update all your applications (APEX makes this pretty easy to find and replace those references) and source code (ALL_SOURCE & ALL_VIEWS). This could be a big project, but the tools are all there. You could do a big bang approach and do everything at once, or do it over time, table by table.
Create a view on top of the legacy tables that does the rename of the ID column to TABLE_ID, and have new applications and code use the view.
Rename the table and the column, and then create a view on top with the old table name that maintains the legacy ID name.

In conclusion

The surrogate key naming decision may seem small at first, but in practice, it touches every query, every log, every API contract, and more. Prefix your surrogate keys (and ONLY your surrogate keys). Your future self, your team, and your database will thank you.

Leave a comment | tags: Data Modeling, Database Design | posted in Code, Oracle Database, SQL

Cars and Code

Monthly Archives: September 2025

The definitive answer to the ID vs TABLE_ID debate

You have a thing in the database, and you are calling it two different names depending on where it is located

ID as a surrogate key means you are unable to join tables with a USING clause

Clarity in queries

Self-joins are clearer

Bridge tables read naturally

Grepability/searching/diagnostics is much easier

REST APIs and JSON payloads are going to read better

But wait, what about our tables that already use ID?

In conclusion

Categories

Archives

Blogroll

Recent Posts

Email Subscription

Top Posts & Pages