Example of Idiocy, Bad Normalization, and Stupid Joins

Normalization for Idiots

The last few days I have been working on an Detail-Edit screen and having no architecture (no nothing, really, not even good manners to lean on) on the back-end, I have been rebuilding it best as I can to make it MVC like, using naming conventions, and abstracting it out as much as possible to make it modular, reusable, and resemble object oriented programming. 

The edit detail is to view and edit basic contact info of a possible client and track contact efforts with a basic contact note system.

In addition to adding many new user features and exposing many more fields which were hidden to all except the database admin (how useful), I wanted to let the user insert and update the child contact records correctly with a "time spent" value corresponding to phone time when appropriate.

I go about creating the extra form values, init, display, validation, javascript validation on front the front end, etc., etc. 

Then I add the additional values in the query, add the additional query to insert this "time spent' value, and then start wondering why there are two inserts to redisplay one record? I know we have two tables that can hold the child records, but I had figured there was a reason, and hopefully a good reason. I thought that the phone system connected to one for special reason and the other was the 'native' table. However, after pushing data to both, one, or the other tables, I realized not all the records display with the original query. So it was not a psuedo-union or union, Then I notice the time span I am pushing in the second table doesn't EVER show. So what is joining these tables? I check the data in the tables and I am correctly creating records. So there must be something in the way they are joined... and why are they joined again? Why are there two tables? 

As I start examining the joins on the display and the inserts and the actual fields I need to push values to, it finally dawns on me. When some people go to Database 101 (or 1010 in the new era of semesters across the world), they must be sleeping. Normalization seems to be arcane to so many people I meet, I wonder what business they have using a computer at all. Better they use pen and paper. Check that, pencil and paper. 

Or the person who put these in place was a bona fide idiot. And he slept through his computer classes on top of that. Or whatever.

Here is what I figured. The two tables that display one record are identical except one has an additional field to hold a StopDate in addition to a StartDate. They don't use the same naming convention either, so glancing at them you wouldn't immediately notice they are identical. And when I say naming convention, I really mean absolutely no naming conventions are used except they both use names that are using Roman alphabet letters.

When I look carefully at the join and what it implies I realize what is going. Here  is the query after I cleaned it up and mask the table field names. 

	SELECT
		LMI.CONTACTDATE AS dt_Contact,
		LMI.NOTES AS sz_Note,
		UM.FIRSTNAME AS sz_TMNameFirst,
		UM.LASTNAME AS sz_TMNameLast,
		LS.STATUS AS uc_StatusDesc,
		ISNULL(CT.ENDTIME,LMI.CONTACTDATE) AS dt_End
	FROM TBL_LEAD_CONTACTHISTORY LMI
		LEFT OUTER JOIN TBL_USERS_MAIN UM ON LMI.TM_ID = UM.USER_ID
		LEFT OUTER JOIN TBL_LEAD_STATUS LS ON LMI.STATUSID = LS.STATUSID
		LEFT OUTER JOIN TBL_CALLTIME CT ON LMI.CONTACTDATE = CT.STARTTIME AND LMI.LEADID = CT.LEADID
	WHERE (LMI.LEADID = #local.n_LeadID#)
	ORDER BY LMI.CONTACTDATE

The join on the two identical tables is joined on the idea that the start or contact time is identical. Well, that is excellent! NOT! I never thought of using, effectively, a timestamp as a unique identifier... BRILLIANT! NOT!

For a small system you can probably get away with this, but it is just bad practice. If you get several hundred users, sooner or later you will  get  multiple records that match this join and it will cause issues. Awesome! Plus it is just stupid. 

So, really, two idiotic things are going on.

  1. Someone brilliantly dumb is using a timestamp as a record key... okay....
  2. Someone brilliantly idiotic is using an extra table with the full set of identical information to hold an End Date record to join to the smaller table...

And this is normalization at work. He normalized the EndDate, because, after all, not every contact record needs one. 

Good Form Normalization to the EXTREME!

This reminds me of the time when I saw a system that normalized every single field for every single record for every single 'virtual' table. Although it was very systematic and functional, what you are really doing when you go to that extreme, is creating a database system on top of a database. And then you need to create another layer to let you look at the 'tables' and data your system created...

You know, that is fine, I guess. But I got pretty frustrated when I had to dig in and figure out what went to what. When I had to create a 7 join query to get a basic address book record, I started to get pretty upset with how much work this system would require. Who wants to build views on all those tables every time they are updated? Or Need a different view? It's just a little crazy. Academically it is interesting to take normalization to the extreme, but overkill is overkill.

If you are IBM or SAP and building a middleware layer so you can sit on any database and have the resources to build and maintain this under the covers, fine. If you are John Doe in a one to ten man shop, you should not do this!




CachedSince:{ts '2024-11-21 00:57:34'}