SQL for Smarties | SQL Programming Style | Trees and Hierarchies in SQL | SQL Puzzles and Answers | Data and Databases


Friday, May 11, 2007

difficulty with SQL to get view

SQL Apprentice Question
am having difficulty in designing my SQL. :(


In this setup, an Invoice can have multiple Bills (installment
payments). I would like a query that returns invoices with overdue
bills


a bill is overdue if :: NOT B_Paid and Now() > B_DueDate --- how to
put this in the following '???'


SELECT B_Invoice, ??? As Overdue
FROM T_Bill
GROUP BY B_Invoice


CREATE TABLE T_Bill (
RefNo INTEGER IDENTITY PRIMARY KEY,
B_Invoice INTEGER NOT NULL REFERENCES
T_Invoice,
B_BillDate SMALLDATETIME DEFAULT CURRENT_TIMESTAMP
NOT NULL,
B_DueDate SMALLDATETIME DEFAULT (CURRENT_TIMESTAMP
+14) NOT NULL,
B_Paid BIT DEFAULT 0 NOT NULL,
B_PaidDate SMALLDATETIME DEFAULT CURRENT_TIMESTAMP NOT NULL,
B_PaymentInfo VARCHAR(24) DEFAULT '???' NOT NULL
);



Celko Answers
Your table is not designed properly. Your assembly-style bit flag is
redundant when you have a payment date. IDENTITY cannot be a
relational key by definition and it is redundant given the invoice
number. You never put the table as a prefix on a data element names
because that would change the names from table to table. If the due
date is always 14 days after the billing date, then put compute it in
a VIEW or a query instead of wasting space. This also gives you more
control without having to alter the tables.

CREATE TABLE Billings
(invoice_nbr INTEGER NOT NULL
REFERENCES Invoices (invoice_nbr),
billing_date DATETIME NOT NULL,
PRIMARY KEY (invoice_nbr, billing_date),
paid_date DATETIME, -- null is unpaid
payment_note VARCHAR(24) DEFAULT '???' NOT NULL);



>> I would like a query that returns invoices with overdue bills <<


SELECT invoice_nbr, CURRENT_TIMESTAMP
FROM Billings
WHERE paid_date IS NULL
AND DATEADD (D, billing_date, 14) >= CURRENT_TIMESTAMP;

You need to get a book on basic data modeling. And one on RDBMS.
What you have is a badly designed punch card file system, with flag,
sequential numbering, etc. .

1 comment:

Hilarion said...

I agree that this data model is probably not best, but I'd hesitate to force invoice number with billing date to be primary key, because we do not know if there can or can't be more than one bill per invoice with same billing date. If yes, then one needs something more to make them distinct, if not, one should ensure that we do not violate the constraint by entering two bills, which differ only in time portion of billing date field.
If it's about prefixes, then I'm not sure if you understood the author's use correctly - he's probably prefixing all table names with "T_" and all field names with "B_", so it's not making him change names between tables. If he does really use different prefixes for field names, based on table name, in which the field is defined, then yes, it can be a problem sometimes, but not that big. If fields are in different tables, than they have at least slightly different meaning, so they can (and sometimes should) have different names. Using table name based prefixes may not be a good way, but it works and sometimes allows omitting aliasing tables in complex join situations.
Regarding the use of flags (bit fields in this case) I agree that redundancy in data model usually is something one should avoid, but in case of implementations of data models, redundancy may solve some efficiency issues. For example when one uses "some_field IS NULL" expression in WHERE clause in MS SQL Server, then there can be a problem, because MS SQL Server does not always index NULL values. So even if this "some_field" is indexed, use of given expression may still cause full table scan. In this case one may try to rise performance by adding additional flag, that mirrors nullity of some field, and use that flag instead of "NOT NULL". In such case there should be additional constraints on data, that enforce the flag value rules (and maybe triggers that set the flag automatically).
In this case one could solve the performance problem by using complex index (on "billing_date" and "paid_date") and this way force MS SQL Server to also index NULL values in "paid_date".
In general: When performance is important and redundancy may improve performance, I'd consider introducing redundancy, but only if there are no alternatives, and tests really prove that in this case the performance improvement is significant, and when I have means to control that the redundancy does not introduce ways to make data inconsistent.