Relational DB schema design – How to model 1-to-1 mapping right from an entity’s set of fields

Relational DB schema design – How to model 1-to-1 mapping right from an entity’s set of fields

I am working on a MySQL (via Prisma datamodel) schema for a Customer entity in an e-commerce context. As the number of fields have exploded (on including say, engagement tracking) I have two possible designs here:
type Customer {
email: String! @unique
name: String
birthDate: DateTime
addresses: [Address!]!

productsVisited: [Product!]!
productsShared: [Product!]!
productsSearched: [Product!]!

}

That, or should fields conveying info together should be carved out into their own table, and joined to the previous one by a 1-to-1 relation:
type Customer {
profile: CustomerProfile! @relation(name: “CustomerProfile”, onDelete: CASCADE)
addresses: [Address!]!

productEngagements: ProductEngagement! @relation(name: “CustomerProductEngagements”, onDelete: CASCADE)

}

type CustomerProfile {
customer: Customer! @relation(name: “CustomerProfile”, onDelete: SET_NULL)
email: String! @unique
name: String
birthDate: DateTime
}

type ProductEngagement {
customer: Customer! @relation(name: “CustomerProductEngagements”, onDelete: SET_NULL)
productsVisited: [Product!]!
productsShared: [Product!]!
productsSearched: [Product!]!
}

QUESTIONS:

What is the right way of thinking design here? I am currently driven by my ER diagrams, and intuition. Do I get or lose any execution or flexibility advantage with making a table thin w.r.t. number of columns?
In the second method, do queries have to do extra work for table joins?

Abstract Question:

Is a method definitely better than the other, given evolving schema, or performance criteria? Or, is this just a matter of taste here?

Solutions/Answers:

Solution 1:

In a RDBMS, there is rarely any good reason for splitting a table into two tables in a 1:1 relationship. To JOIN them back together in a SELECT is certainly possible, but adds some overhead, code complexity, and is a slight performance hit.

Related:  Connect Gatsby with Postgres

That being said, I have (rarely) encountered a cases where such a “vertical partitioning” is beneficial.

  • There are “too many” columns in the one table, and splitting avoids hitting some database limit… (This situation is likely to be better handled by some other solution.)
  • Some of the columns are both bulky and rarely used… By moving them into a separate table, the JOIN is usually avoided, and the potential performance hit for “bulky” is usually avoided.
  • You need to add some columns to a huge table, but the ALTER TABLE command is so costly (think, downtime) that you are desperate to find a way to avoid it… Such a separate table is quick, easy, etc. (But, of course, suffers from the inconvenience, etc, of needing the JOIN.)
  • You have a set of columns that are rarely present in the table… Before splitting them out, you intended to populate the columns with NULLs (a valid thing to do). But after splitting them out, you simply don’t have a row in the other table. Then you use a LEFT JOIN, thereby reconstructing the NULLs out of thin air.

Solution 2:

Both approaches definitely will work and it indeed somewhat depends on your personal preferences and the API you want to have when working with the data.

With the second approach you’ve outlined, Prisma will indeed create tables for CustomerProfile, ProductEngagement and all other types (as a general rule, Prisma maps all the type definitions from your datamodel to their own tables). So, as @rick-james pointed out, there might be overhead in the JOINs that need to be performed when retrieving the data.

What is the right way of thinking design here? I am currently driven by my ER diagrams.

This generally is a valid approach, since the Prisma datamodel ultimately is mapped to the database. In that sense, it should be fine to think of it in terms of an ER-diagram.

Related:  How do I return the id property of the user object in Apollo?

Also note that Prisma will soon support embeeded types which allow to define a type in the Prisma datamodel that does not get its own table but where the data is rather stored in a JSON column inside of the table of a non-embedded type. This is currently already supported when using Prisma with MongoDB but not yet with SQL. You can learn more about this topic in this feature request.

References