select what?

this is really silly, but i didn’t know what

select 1 from dbo.sometable

did.  I understood what select * did, and i understood the consequences of no where clause, but select 1 just didn’t seem to parse in my head.  Now, lets extend the idea to

select 'this_string' from dbo.sometable.

But dbo.sometable doesn’t CONTAIN “this_string” so how can i select it?  To be honest, i’m not sure how to explain it, but i know it works as i expect it to – returns “this_string” as many times as there are rows in the table. No data is returned FROM the table, but the row-count (row presence) is used as a psuedo-cursor to drive the select statement.

Jeff Moden expands on the psuedo-cursor idea at SQL Server Central

SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
from dbo.sometable

Now, there IS a row_number, but it is not being returned from dbo.sometable (?) or at least there is not data in the table.  The row_number functions needs an “over” statement, so we make up a select null clause that doesn’t nothing.

the result is a psuedo-cursor which returns a list of numbers – as many as there are rows in sometable

 SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM [master].[sys].[stats] -- st1
CROSS JOIN [master].[sys].[stats] st2;

What this will do is to count the lines of a cartesian join between the two tables.  One from table one, all from table 2, second from table one, all from table w, and so on.  In this case, sys.stats has 299 entries, so cross joining on ourselves gives 299*299 or 89401 rows.  You have to give at least one of the joins an alias.  Convention says do it twice.  

 with myCTE as (
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) as counter
FROM [master].[sys].[stats]
CROSS JOIN [master].[sys].[stats] st2
)
select counter from myCTE where myCTE.counter between 1 and 10

The select_row number stuff is saved into a CTE (common table expression) called myCTE, the column named counter, and then counter is returned – but only the first 10 rows.  If the between was different,  you would get a different start and stop and total count.  For example, the following gets 5,6,7,8,9,10,100

with myCTE as (
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) as counter
FROM [master].[sys].[stats] 
CROSS JOIN [master].[sys].[stats] st2 
)
select counter from myCTE where (myCTE.counter between 5 and 10 or myCTE.counter = 100)

This would let you create a psuedo-cursor for, say, days in several months

Combining the CTE and cross-joins, we have the famous Itzik Ben-Gan auxillary table method. Create a virtual table with two rows in it containing, eg. 1s.  Cross join that to itself.  You have 4 rows.  Cross-join 4×4 and get 16.  16*16 is  256.  Now it gets fun (big): 16X16 is 65,536.  Next stop is 4+ million.  Do a select row_number over … on the last table and you get lots of sequential numbers efficiently

WITH
C0 AS(SELECT 0 AS const UNION ALL SELECT 0),
C1 AS(SELECT 0 AS const FROM C0 AS A cross join C0 AS B),
C2 AS(SELECT 0 AS const FROM C1 AS A cross join C1 AS B),
C3 AS(SELECT 0 AS const FROM C2 AS A cross join C2 AS B),
C4 AS(SELECT 0 AS const FROM C3 AS A cross join C3 AS B),
C5 AS(SELECT 0 AS const FROM C4 AS A cross join C4 AS B),
C6 AS(SELECT 0 AS const FROM C5 AS A, C5 AS B) — shorthand for cross join
SELECT TOP(100) ROW_NUMBER() OVER(ORDER BY const) AS n from C6

 

.  

 

 

 

 

 

remove dups, quick and simple

this is a nice quick method for removing duplicates in a table.  It appears to let me include as many fields as i want to determine what a duplicate is.  

Create a common table expression, and select fields, and a row number over some or all fields.  If there is more than one row for that selection, the row number will be 1 plus 1++ for each duplicate.  remove those with row number greater than 1.  I THINK that the order by might allow us to select which one we want to remove, but right now i’m happy with removing ANY duplicates.

;WITH CTE AS 
(
SELECT *, ROW_NUMBER() OVER(
PARTITION
BY field1, field2 
ORDER BY [field1]

AS Rnum FROM table

DELETE FROM CTE WHERE Rnum <> 1

versatile dynamic code generator

We have columns in the table.  We want to get a sum of the rows in that column so that we can quickly check for changes, such as import/export.  Actually want the sum, and not a check-sum or hashcode solution.  OK, so sum(columnname) right?  Well, we have 60 or so columns, multiple tables with a similar format, and i DON’T want to hand code each one.

Fortunately, they are formful, as in “something_date_somethingelse” so i have a shot at coding it.  Would be fairly easy to use powershell, but alas, i lost it in the move.  That was a joke.

Dynamic SQL, since we are writing SQL code, is a good option.  Even if i don’t want to immediately run the code with exec(@SQLCode), i can generate the query, and store it, or run it manually when i want.  

The column names run from t_200906 to t_201405.  WTF?  i can work with the “t_’ part, but how do write logic to do an entire year (2010, 2011, 2012, 2013) but not all of 2009 and 2014 and how do i put this into the query?  Well, i don’t and i won’t.  I will generate a date temp table, containing the meta dates, with an id field, running from 1 to 72 and do all of the years time all 12 month.  THEN i will deal with the logic of getting out the right dates.  OK, so make the temp table.  

create table #myyears (
[each] [int],
[yearMonth] [varchar](10)
)

Gee, wasn’t THAT easy.  OK, wisenheimer, put the data in it.

Create a couple of local variables, and create a couple of nested while look.  Start at 2009 and increment the year counter until it is more than 2014.  Inside that loop, cycle from 1 to 12.  Add the two numbers…. ah poop.  If i add 2009+12, i get 2021, which is NOT 200912.  So i need to cast or convert int into varchar() so that i can stick them together.  Oh, remember also that we want 200901 and not 20091 for the first month.  The details are below: 

declare @mycounter int
declare @myyear int
declare @mymonth int
set @mymonth = 1
set @mycounter = 1
set @myyear = 2009
while @myyear < 2015
begin
while @mymonth< 13
begin
INSERT INTO #myyears ( each, yearMonth)
select @myCounter,
case when @mymonth < 10
          then cast(@myyear as varchar(10)) + '0' + cast(@mymonth as varchar(10)
else 
          cast(@myyear as varchar(10)) + cast(@mymonth as varchar(10))
end
set @mycounter = @mycounter+1
set @mymonth = @mymonth+1
end
set @myyear = @myyear+1
set @mymonth = 1
end
-- select * from #myYears where each <66 and each > 6

a little secret sauce and a little mea culpa.  You use the temp table by using “each” to index into the table to find the correct yearMonth.  By setting the limits of each correctly, you get just the right collection of dates.  The bad news is that the first time i wrote this, the years were ordered 100% wrong from the excel sheet i was supposed to update.  No sweat, i changed the order of the limits and decremented the internal counter rather than incrementing it.  But i forgot to change one of the GT\LTs and spent at least 10 minutes checking if the local variables were counting up and down correctly (they were) but i wan’t getting anything from the select so i wasn’t inserting any data.

NEXT, since we have the dates, we need to generate code that looks like 

select sum([n_200907]) as ‘200907’ FROM table.filename

but do it 59 times.  OK, 58 MORE selects.  

So we need to loop around a number of times, with the number of loop constrained by the two values of each, selecting the appropriate yearMonth from the temp table based on the value of “each”.  That is straightforward and we just have to keep beating on the loop until we get something that works.  And remember to initialize the @SQLCommand with “select” so we only get a select at the beginning, and append the from statement at the end after we leave the loop.  Looks like this

declare @SQLCommand varchar(5000)
set @SQLCommand = 'select'
declare @columnname varchar(100)
Declare @counter2 [int] = 65
while @counter2 > 6
begin
set @columnname = (select [yearMonth] from #myYears where each = @counter2)
set @SQLCommand = @SQLCommand + '
sum([t_' + @columnname + ']) as '''+ @columnname +''','
set @counter2 = @counter2-1
end
set @SQLCommand = @SQLCommand + ' FROM [Momentum].[dbo].[INS_TRX]'
print @SQLCommand

 

bcp in

In a previous post, i talked about getting text data out of a table with bcp utility.  That same CLI application can be use to push data into a table.  You run the utility, indicate in rather than out or outquery, you provide the path to the data file, and the destination database.

Ah, first problem.  Do we have a table to bcp into?  If we do and it has data, we might want to truncate the table.

select count(*) from MyTable
truncate table MyTable
select count(*) from MyTable

If not, we can quickly create a new blank table using t-SQL and SSMS.  Select the table, the right-click menu for “script table as.., create to…, new query editor window”.  this generates a t-sql script to create the table, you need to change the table name (at least two places, one is in comment) and then after checking the syntax, run the script.  This will create MyTable_test (you DID rename it, right?) and you can bcp into it.  

It’s actually as easy as bcp out.

exec master..xp_cmdshell 'bcp databse..MyTABLE_test in                     "E:\datbase\filename.txt" -c -T -t"|"'

although we have specified character data in our bcp export and here in import, because the database fields are specified, the utility does an implicit conversion and we get the right type of data.  This process can take several (4X?) times longer to run than the export, so plan accordingly.

 

 

replicate and sort

I want final output to contain a number, which should be formatted into 8 characters. That is, 00000001 or 00003345. Output can be a string. What i am asking for is the numbers to be padded to a total of 8 characters. The default solution is to use replicate, like

select REPLICATE('0', 4) + [myNumber] AS 'the number' from table

which will return 00003345, 00003346, but also 00001. Which pads thousands to 8, but single digits to 5. We need to take in the number of digits returned before added the padding.


SELECT REPLICATE('0', 8 - DATALENGTH(myNumber)) + c1 AS 'the number',

it must be noted that the result, with the padding, is now a varchar.

We also get back a varchar if we use format

select format (myNumber,"D8")

or

DECLARE @d int;
set @d = 123;
select format(@d,'D8')

If we want to sort a varchar(), we can’t, Or rather, we can but if the varchar() is a number that is sorted as a varchar() (now who would do that???), we get 1 before 2 before 2000 before 3 and so on.  Sorted on the first character, not as a number.  We need to cast (or convert) the number prior to sorting.  Like

SELECT myNumber FROM table ORDER BY CAST(myNumber as INT)

or

SELECT (cast (myNumber as int)) FROM table ORDER BY myNumber