Be consistent in naming
„There are only two hard problems in software engineering: Naming and Caching“. Based on that I want to state out the problem about naming on database level I have seen a lot. It starts with the design of the domain model. During that time, at the beginning of a project, mostly the entities written on boards or pages is like „user“, „file“, „role“, „article“ and such. At that time its fine and correct to have a clear perspective on the domain itself. But whenever there is the design phase over which could be after hours, days or weeks it is IMHO strictly necessary to start reviewing the names of all those entities. And this won’t be the first and last time you should review your entity names. I am a fan of Domain Driven Design which says that naming must be consistent across all participants of the project. This means also that whenever your business decides to rename the product to „sujet“ or slt to strictly avoid „translations“ you must rename the entity in the codebase/database too. Because:
What if you have „product“, renamed to „sujet“ and then a new „product“? A discussion in the code about the product which is actually a sujet or the „new“ product is hard. It is even more complicated when it comes to onboarding of new team members, when they fix new bugs or implement new features and business mentions product.
After a long period of time review your domain model again. Example: Let’s imagine you are developing a software to manage vehicles. After six months of development you have a base table „vehicle“ and two sub-tables „truck“, „personal_car“. Now a new entity is necessary which divides the „personal_car“ into e-cars and fuel-cars. Introducing a subtable to „personal_car“ which could be „personal_e_car“ leads to problems:
- e-cars and fuel-cars must be equally treated and a base table and a subtable does not enforce this
- fuel-cars would contain then e-cars which is also wrong
- there are attributes which belong maybe to fuel-cars only and would be available in e-cars also which is also wrong
The correct way is to make two distinct sets. Have a base table „personal_car“ and two subtables „personal_e_car“ and „personal_fuel_car“.
Note: Sometimes it makes sense to have prefixes in table names. In the case here it would be „personal_“. It could also be „personal_car“ which maybe is better. The reason is that whenever you have a „subdomain“ in a big domain and some tables belong there, they would be grouped together in browsers, lists, etc. when they have the same prefixes.
UPPER_CASE or lower_case
In the past when there was not syntax highlighting it made sense at that time when there were language keywords written in UPPER_CASE. Today we have: syntax highlighting, auto completion. There is no need to follow that ugly pattern anymore. Write it lower_case. Also decide on DB level if you want to have snake_case or camelCase. I still recommend to use snake_case. The reason is that there are some common RDBMS like Postgres which are converting all code into lowercase. So writing it upper case would only distort the real code executed. Maybe a small issue but it is. Also and this is my pain point: ALL THESE UPPER CASE CODE PARTS remind me of those spam mails. And I promise ALL UPPER CASE is harder to read (https://en.wikipedia.org/wiki/All_caps).
Find a name of the column
Whenever there is a table created all of the columns are named based on its purpose. A table „article“ has maybe columns „title“, „content“, „language“ and so on. You do not write „column1“, „column2“, …. You name it because you want to show the purpose of the column. It happens that when there is a foreign key introduced the naming of the column is „<name_of_referencing_table>_<primary_key_column_name>“. So maybe the table „article“ has „user_id“ as foreign key. Well, thats the worst thing every invented. Because it does not tell anything about the purpose of the field. It only tells that there is a foreign key but it does NOT tell why its there. Tell the purpose. E.g. for the „article“ table it could be „author_user_id“, „last_editor_user_id“, „reviewer_user_id“ and such. Because the reason is simple why this must be done: Today the requirement is maybe to story the author of the article. This could be „user_id“. But tomorrow there is maybe the requirement: Store the last editor additionally. So then you would have „user_id“ and „last_editor_user_id“. Feels awkward to me. From a mathematical perspective ever last_editor_user is a user and is somehow related to the field user_id. But it is not. Think about it as completely independent. The naming convention is „<purpose>_<target_table_name>_<primary_key_column_name>“. Sure, if the column name gets too long forget about the <target_table_name>.
TL;DR
- use lower_snake_case naming
- name foreign keys „<purpose>_<target_table_name>_<primary_key_column_name>“
- review your entity names and rename in case