That’s not very Data Warehouse - an argument against elegant SQL code

Alex Antra - Mar 26 '19 - - Dev Community

In my role I build one off reports for clients.

Recently I had my code peer-reviewed (as we do for all code) and it got the green light.

Great! Two thumbs up... it does what it set out to do and won’t lie to the customer.

However, that peer review came with some advice. My methodology wasn’t very ‘data warehouse’ and wasn’t a methodology that gets used in advanced applications.

Now I’m not here to write a scathing (and cowardly) response to that person’s unsolicited advice. I’m here to challenge the culture where that advice came from.

See I have trained four different people SQL in the past few years and every single one of them has naturally, for some unknown reason, gravitated towards the exact same fallacy, that; the more joins, aggregations, functions, CTE’s and sub-queries they can fit into the least amount of code the better a coder they are. They believe elegant code is the best code.

This is what they meant when they critiqued my methodology, it wasn’t elegant, and I think that’s a load of BS.

I’m not sure how everyone ends up in the same mindset, but I know why elegant code is wrong.

Simply put, I don’t write my SQL scripts for me, I write them for my tester and/or for the member of my team who has to read it later (for whatever reason).

To do that I keep it simple.

I write lots of notes and break the script out into bite sized chunks attempting to only do one or two things at a time.

I do this so another person can easily understand what the script is doing, for a tester this makes it easy to follow and the test points obvious. And if another coder has been tasked with enhancing that code or wants to extract out chunks for their own purposes they can do so easily. They don’t have to reverse engineer a nuclear warhead just to understand.

Sometimes I’ve had code so elegant that a simple addition required hours of retooling.

Yes there are scenarios where adding an extra column in step 1 benefits step 7 and I have no issue with that provided its properly noted, however when you can add bits to step 1 that benefit step 3, 8, 9, and 11 this becomes too complicated for a third party to quickly understand, and if your tester struggles to understand your code, the risk of them missing something increases.

The other thing I do is reduce complexity by not doing too many joins, unions or CTE’s. Every time you do something like this you increase or decrease your row count, do too many and you lose the ability to say with 100% certainty that you are loosing or gaining rows with accuracy, especially when dealing with large data sets.

A good tester will try verify that every join is doing exactly what its set out to do, by making six joins to a source dataset means your tester has to do a lot more heavy lifting to make sure its doing what its meant to.

For all you know, a bad join has lost you a hundred thousand rows but that may not be apparent to you by doing a simple count(*) at the end, and the complexity of your code has just decreased the chances your tester will pick up on that.

That is how I code, it benefits my team and my testers and while we will always find ways to improve I’m not going to deviate from the above points just because it doesn’t look elegant or in hindsight it could have been half the length. You can always look back on code and make it better, however just because you can re-write code, doesn’t necessarily mean you should, that’s certainly not a valid point for my team, our goal is get to the results faster, safer.

Not to mention: elegant code isn’t significant.

Code should be accurate and not needlessly slow. Elegance is subjective, human subjective, fundamentally a SQL compiler doesn’t care whether it gets asked to do 12 easy statements in a row or whether you have smushed all 12 steps into an ‘elegant’ mega query. It doesn’t care.

And so that’s what I wanted to challenge. Write code that works for your teams goals, don’t strive to make your code elegant, don’t re-write it to make it smaller if it’s going to make a testers job harder, and don’t write it for you. You already know what your doing and why, but no one else can read your mind, they can only read the script and the ticket, if it doesn’t help them understand then that’s not their failure it’s yours.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Terabox Video Player