Foreign keys and mysql

It seems like some people is not aware yet of the fact that with mysql you can have not only transactions but also foreign key constrains, apart from other features which were usually found only on expensive products like Oracle, MSSQL server, etc…

The problem is that as these features weren’t available until some months/years ago, there’s a vast amount of php software which does not use them, and hence php developers [which read/use other developers code] do not know about. So all majority of code examples and real-life applications do not take care of adding constraints or transactions in the SQL data model declaration. Instead, they do all the checkings and validations in the php code. And that’s also what I do.

Is it right? I wouldn’t say yes, but I wouldn’t say no. My workmate is completely obsessed with foreign keys, he can’t stop adding those constraints to every pair of tables that are linked… and he also can’t stop complaining about me not setting those constraints when declaring the tables. Why do I do that?

Mainly for convenience. In this world of undecided clients and blurred targets, we are going to possibly mess up with the database tables lots of times, at least at the beginning of the project when the data model tends to not to be very well defined. Also, for having foreign keys and transactions and all that funky stuff, your tables need to be InnoDB, which are said to be slower than MyIsam (which are the default ones that mysql uses when you do not indicate any table type during table creation), mainly when accecsing data in read only mode, not updating or inserting it – and that’s how websites works, don’t they?

Sure, we all have been taught that a good data model must be consistent and preserve data integrity as one of the main goals (apart from storing data in itself, of course!). One could argue that foreign keys are data in themselves, as they mean something; you can’t simply remove them without losing information somehow. But as I said before, during development I think it’s going to be harder if you are playing with the tables and the data inside them. Lots of integrity constraint failed errors, lots of lost time because of that.

I am thinking of a compromise solution: not using foreign keys during development but adding them when everything is almost definitive (where everything means the data model, etc…). As we also make checks in our business layer (I hate speaking with that terminology!), we can’t lose any functionality. And if there was an error, it will show!
I would love to listen any suggestion/opinion/truth/whatever. Please take into account that I am approaching the foreign key issue from a practical point of view; I would prefer to avoid metaphilosophical discussions about the database designs. Or at least I’ll avoid those which are not human readable. Thanks!

ie6 for Android

Comments

  • Interesting. I’m just having this same argument with a mate of mine, except in the opposite point. (And using SQL Server).
    Arguments for….
    —————–
    1. Clear definition of what the table is linked to. Only costs 3 lines to add a constraint, easy to put NOCHECK on if required.
    2. Validation of data. Data is the most important part, foreign keys give it context, which makes it information.
    3. Easier for new developers to learn the schema.
    4. Picks up coding errors in the app layer early (ie – at the database layer), rather than allowing them to go through to production.

    Arguments against
    —————–
    1. Performance. (I think needs to be profiled before choosing that as a reason to not use them though – and even then, drop and recreate before and after is a "better" solution IMO )
    2. Administrative overhead. (What you were alluding to)
    3. Restoring individual tables

    I think 2 and 3 can be fixed fairly easily with a bit of thought. I’ve written an article about it (having just been through myself trying to decide whether to have them or not) http://dotnetjunkies.com/WebLog/piquet/archive/2005/06/28/regeneratingConstraints.aspx

    Once you get in the habit of dropping and creating, or just disabling them using NOCHECK, I find that it’s pretty quick development. It’s just a different technique of development.

    I think it comes down to how much you value your data. Personally, the data is the centre, and I’ll do everything possible to make sure it’s valid. I hope that my databases will live longer than the time I spend developing them, so they should also aid the "health" of the database. To me, just defensive programming. Each of those "errors" that crop up when playing with data is just a reminder that I haven’t structured the insert or delete statement correctly. Forces you to remember the way to work with the database. Helps new people learning also.

    Good reading though – thanks.

  • Some meditative months after… I found a good addendum to this discussion:

    It’s interesting to know (or remember) that we can have foreign keys on cascade, or not. So we could simply indicate "Hey, table, your column «bla» is associated with the column «id» of that table" and stop here, not adding cascade or anything else to this definition.
    Then a developer can come and see this database design and understand that those two tables are linked.
    No worries about deleting tables and hence deleting more data than you could expect without knowing.

    And the other good idea is to use transactions. Since it can be used with mysql (and of course all of those expensive rdbms like oracle or sqlserver), we can take profit of this feature by starting a transaction at the beginning of all the delete or table manipulation and rolling back the transaction if something goes wrong – for example in the case that a child row does not verify the foreign constraint.

    Definitely, for using foreign keys one should use transactions, specially if you need to alter lots of tables and need that these operations are completely commited or not commited at all.