Wednesday, March 7, 2012

Really need help with this tough query

Hello all. What I want to do has left me stumped. I will try to articulate everything, but please reply if I have omitted some details.

My goal is to effectively choose the next sequencial contract number as follows: there are two tables, each with a contract_number field, whose format is as follows:
ANNNA-NNNN, where A represents a single alphabetic character, and each N represents an integer.
What I need done is to chose the largest value occupying the last four digits of a contract number.

An example would clarify:

table1
C600D-1234
C500E-1231

table2
A400F-1235
G300I-1222

My query should return: A400F-1235 (or just 1235) since the last four digits are greatest in this instance.
I have a feeling that I will use the functions CAST, MAX, and SUBSTRING, but this is just too complicate for me. To make matters worse, there will be some obsolete ill-formatted numbers that should be ignored in case they do not conform to the format.

Even the smallest bit of advice could prove priceless.
Thanks!To make this problem a whole lot more manageable, forget the issue about there being two table; I can take care of that with my server-side code. So basically, I just need the maximum integer value retrieved from a column containing both numbers and letters, as described in the previous post. Again, however, I need a way to cope with (ignore) poorly-formatted contract numbers.

Thanks!

Originally posted by ujohnc00
Hello all. What I want to do has left me stumped. I will try to articulate everything, but please reply if I have omitted some details.

My goal is to effectively choose the next sequencial contract number as follows: there are two tables, each with a contract_number field, whose format is as follows:
ANNNA-NNNN, where A represents a single alphabetic character, and each N represents an integer.
What I need done is to chose the largest value occupying the last four digits of a contract number.

An example would clarify:

table1
C600D-1234
C500E-1231

table2
A400F-1235
G300I-1222

My query should return: A400F-1235 (or just 1235) since the last four digits are greatest in this instance.
I have a feeling that I will use the functions CAST, MAX, and SUBSTRING, but this is just too complicate for me. To make matters worse, there will be some obsolete ill-formatted numbers that should be ignored in case they do not conform to the format.

Even the smallest bit of advice could prove priceless.
Thanks!|||Try this

create table foo (test varchar(30))

insert into foo select 'abcd-1234' union all select 'dfrg-1235' union all select 'adgf-9999'

select test from foo where right(test,4) in (select max (right(test,4)) from foo)

HTH|||I think you would get faster performance using the = operator rather than the "in" operator:

where right(test,4) = (select max (right(test,4)) from foo)

...but unless the query is very large the difference is probably negligable.

Another minor point; you have the option of running your select clause from the union query directly, without using a temporary table if you want.

blindman|||Oops, absolutely right, you want to use the '=' not 'in'

Thanks|||Originally posted by rhigdon

insert into foo select 'abcd-1234' union all select 'dfrg-1235' union all select 'adgf-9999'

HTH

hi, i am new to all these db stuff
but wat does the above statement do

i don't get the
select 'abcd-1234' union all select 'dfrg-1235' union all select 'adgf-9999'
part

thanx|||Just a way to insert rows instead of saying

insert into table select 'blah'
insert into table select 'blah'
insert into table select 'blah'

Using the union all statement allows you to write one insert that accomodates multiple inserts.

HTH|||Originally posted by rhigdon
Just a way to insert rows instead of saying

insert into table select 'blah'
insert into table select 'blah'
insert into table select 'blah'

Using the union all statement allows you to write one insert that accomodates multiple inserts.

HTH

arhh... thanx,

is there any good resources that you gurus out would recommend me to read, either books or online resources

cheers
j|||BOL - http://microsoft.com/sql/techinfo/productdoc/2000/books.asp

Inside SQL Server - Kalen Delaney

Data and databases - Joe Celko

Guru's guide to transact SQL - Ken Henderson

HTH|||thank you all for your replies.

i think your suggestions will work for my situation, with one exception: some of the sequence numbers are poorly-formatted, and in that case, the "max (right(test,4))" may cause an error, or produce undesriable results. basically, if "right(test,4)" is not an integer, i don't want to deal with that record. any ideas of how to accomplish this?

also, blindman wrote:
"Another minor point; you have the option of running your select clause from the union query directly, without using a temporary table if you want."

Can you show the sql code for this (I can't picture what you mean)?

Thank you infinitely!|||You can use the isnumeric function.

Not tested but something like this:

if
isnumeric(right(test,4)) = 1
then|||This illustrates how to use UNION to combine both tables into a subquery, for both your outer select and for determining the largest current value. Whether you use this, a temporary table, or a table variable is a matter of choice and the details of your project.

This also shows how to use the ISNUMERIC function to screen out non-numeric bad data.

select test
from (select * from table1 Union select * from table2) foo
where right(test,4) =
(select max(right(test, 4))
from (select * from table1 Union select * from table2) foo2
where ISNUMERIC(right(test, 4))=1)

blindman|||blindman and rhigdon...

thanks infinitely for helping me out. i believe i now have exactly what i want thanks to you.

one curiosity: when we write SELECT MAX(RIGHT(...)), why doesn't some sort of cast to an integer need to take place (or does sql server do this implicitly)?

take care.|||You can use the Max and Min aggregate functions on non-numeric data, which is then evaluated according the sort-order of the character set. For numeric data stored as text, there is no difference in the result as long as all values have the same number of digits.
Of course, you can't use statistical functions like Avg with non-numeric data!

blindman|||I bet you thought this thread was dead... never!

anyways, thanks for the continual help.

as a slight modfication to my original question, imagine now that the formatting is in one of the following formats (where L's represent letters, and N's represent integers):

L600-NNN-LLLL
LL600-NNN-LLLL
LLL600-NNN-LLLL
LLLL600-NNN-LLLL

the question remains the same: how can I find the largest NNN in a column whose contents usually follow the above formatting? how can i ignore those numbers that don't conform to the above formatting? the reason that this formatting poses a new problem for me is that i can't use the RIGHT or LEFT functions, and therefore don't know how to parse out the NNN values.

thank you in advance for your time.
cheers|||Consider using the LIKE operator, along with wildcards, to filter out bad data. For example:
Where [DataString] like '%-[0-9][0-9][0-9]-[a-z][a-z][a-z][a-z]'
(This is untested, but you get the idea...)

You are going to need to familiarize yourself with the character functions available in TSQL: RIGHT, LEFT, TRIM, SUBSTRING, CHARINDEX, for starters.

To get the substring for your example, we'll nest two functions. The inner-most peels off the leftmost 8 characters, and the outer function grabs the left three from the result:

left(right([DataString], 8), 3)

blindman|||Is this a test?

We are really delving into the Sea of Volitility here...

If you ALWAYS have the hyphens, try this on for size:

SELECT MAX(LEFT(RIGHT(test,LEN(test)-CHARINDEX('-',test)),CHARINDEX(RIGHT(test,LEN(test)-CHARINDEX('-',test))-1)) FROM (SELECT test FROM table1) UNION ALL (SELECT test FROM table2)

I think a better solution would be to scold the person who created the contract numbering scheme and take it away from them.|||blindman and bpdWork,
thanks again for all of your guidance. you are the reason that the internet is such a vast resource.

i ended up following your suggestions, blindman, and things are working great. i will surely look into those functions you mentioned. bpdWork, I may have used your SQL but I couldn't parse it mentally, and that would be a bad thing if I needed to modify it later on.

cheers!|||bpdWork, rest assured that I have scolded those who create the contract numbering scheme, but unfortunately I don't have the power to take it away from them.

I do have yet another question for this thread (that has been so helpful thus far. The question is the same as the first one I posted on this thread, with the only difference being that I can't use the RIGHT function, because now the RIGHT four characters might not be the sequence number. This is because an options "-TO##" suffix may be appended to the contract number, where ## is a two digit integer between 00 and 99.

I expect to use CHARINDEX or maybe the CASE conditional, but I am not sure.

If I am bugging you with seemingly simple questions that aren't very different from eachother, it is only because I don't yet have full comprehension of the tools needed to write these types of queries (but you are helping a lot).

Thanks again!

Originally posted by bpdWork
Is this a test?

We are really delving into the Sea of Volitility here...

If you ALWAYS have the hyphens, try this on for size:

SELECT MAX(LEFT(RIGHT(test,LEN(test)-CHARINDEX('-',test)),CHARINDEX(RIGHT(test,LEN(test)-CHARINDEX('-',test))-1)) FROM (SELECT test FROM table1) UNION ALL (SELECT test FROM table2)

I think a better solution would be to scold the person who created the contract numbering scheme and take it away from them.|||I think you should use user-defined functions for this. I would write two: one to call that returns an integer, and one to parse the contract number into parts (it will be called from the first).

The parsing function is useful in many cases if written well. Here's one that'll accept a huge VARCHAR and a delimiter such as a hyphen (defaults to comma), and returns a numbered table of elements:

CREATE FUNCTION [fn_parse_string] (@.data_string VARCHAR(8000), @.delimiter VARCHAR(3)=',')
RETURNS @.element_table TABLE (element_id INT, element VARCHAR(512)) AS
BEGIN

DECLARE @.temp_table TABLE (element_id INT NOT NULL IDENTITY(1,1), element VARCHAR(512))
DECLARE @.iPos INT, @.element VARCHAR(512)

WHILE (LEN(@.data_string)>0) AND (@.iPos<>0)
BEGIN
SET @.iPos = CHARINDEX(@.delimiter, @.data_string)

IF @.iPos > 0
SET @.element = LEFT(@.data_string, @.iPos - 1)
ELSE
SET @.element = @.data_string

SET @.data_string = SUBSTRING(@.data_string, @.iPos + 1, LEN(@.data_string) - @.iPos)

INSERT @.temp_table (element)
VALUES (SELECT LTRIM(RTRIM(@.element)))

END

INSERT @.element_table
SELECT element_id, element
FROM @.temp_table

RETURN
END

Now you will want the function that will return the integer from a supplied contract code.

CREATE FUNCTION [fn_get_contract_number] (@.contract_code VARCHAR(50))
RETURNS INT AS
BEGIN
DECLARE @.contract_number AS INT
SELECT @.contract_number = CAST(element AS INT)
FROM dbo.fn_parse_string(@.contract_code)
WHERE element_id=2 -- if this is the part that holds the number
RETURN @.contract_number
END

Ok, it could use some error handling, as it will puke if element number two is not in integer, but you can play with that bit. Anyway, you could then use this function to do something like:

SELECT MAX(dbo.fn_get_contract_number(contracts.contract_ code)) FROM contracts

to get the Max number.

-b|||BTW: it will be slow, but you will have more managable code, and you will be able to exercise some more control without dizzying Select statements. Just get bigger processors!|||bpdWork, what a response!

I really appreciate your help. Your solution certainly seems flexible!

I am trying to figure out if this solution can handle these examples correctly:

V600P-1324 (will work)
V600P-1653-TO12 (will work)
V600P(C)-SAS-1653 (won't work as I need the 1653, not the SAS)
V600P(C)-SAS-1651-TO21 (won't work as I need the 1651)

I sincerely apologize for the continual complications on this question. The most simple way to put it is that if the RIGHT four characters or ints, then that is the number I want. If the RIGHT four character are TO##, then the four numbers that come before the TO## are what I want. I am sure that I could modify your code somehow to get this to work.

Thanks.|||Sure, simply modify the fn_get_contract_number to iterate through and return the first element that isnumeric().

If ISNUMERIC(element)...|||hmmm... not sure if I understand.

in all of the above contract number examples, I will need to extract the four digit number out of the whole string. wouldn't "If ISNUMBERIC" incorrectly skip some of the records?

thanks|||CREATE FUNCTION [fn_get_contract_number] (@.contract_code VARCHAR(50))
RETURNS INT AS
BEGIN
DECLARE @.contract_number AS INT
SELECT @.contract_number = CAST(element AS INT)
FROM dbo.fn_parse_string(@.contract_code)
WHERE ISNUMERIC(element)=1 -- only returns numeric elements
RETURN @.contract_number
END|||wow; i actually took 15 minutes to fully read through all of your code, and i fully understand what's going on. I have never used functions in SQL Server, but I can see how they can be useful. Any tips on when to use Sql Server functions?

i have really learned a lot from you and this thread.

thanks!

No comments:

Post a Comment