The Shuffle Riff Solutions
Thank you to everyone who sent in their solutions, similarly to last quarter, a selection are posted below from those we received.
The problem posed was as follows:
Consider yourself the SAS programmer who works in a team that deals with address data. Unfortunately, the system used to capture address details does not have any constraints on how address details are entered.
The data can be reproduced by downloading the SAS program here.
Write a SAS program which shuffle's the addresses such that Address1 is always populated with the first address line given, Address2 is always populated with the second address line given, etc. Your results should appear as shown within the comments of the attached SAS program.
Finally (and nothing to do with SAS) what is the significance of each address?
Most of your solutions used arrays. Indeed, the original problem that inspired this quiz was coded up using arrays too. Some of the solutions we received and an alternative from us at Amadeus is shown below.
Arrays
Adrien Vallee sent the solution below, like many others including Allan Bowe and Tim Armstrong who are worthy mentions for their solutions.
DATA SHUFFLE;
SET SAMPLE;
/* Put address variables in an array */
ARRAY address{*} _CHARACTER_;
/* Initialize the subscripts:
i -> current address
j -> Next address */
i=1; j=1;
/* Loop: Stop when the size of the array is reached */
DO WHILE(i+j<=dim(address));
/* If the current adress is filled:
Move to the next address */
if address(i) ne "" then i=i+1;
/* If the current address is empty and the
next address is filled */
else if address(i) eq "" and address(i+j) ne "" then do;
/* Set the current address to the next address */
address(i)=address(i+j);
address(i+j)=" "; /* Set the next address to empty */
i=i+1; /* Move to the next address */
j=1; /* Set the next address to 1 */
end;
/* If the next address is also empty:
Move to the address after */
else j=j+1;
END;
RUN;
Transpose
Another possible solution is the use of Proc TRANSPOSE. Tim Church send the code below that works by transposing the columns into rows. Tim filters out the empty rows before a second transpose restores the table to its original structure, but now with standardised data.
proc transpose data=sample
out=tran1(where=(COL1 ne ' '));
var address:;
by sku;
run;
proc transpose data=tran1
out=tran2(drop=_NAME_) prefix=address;
var COL1;
by sku;
run;
proc print noobs;
run;
Read Test and Read the Rest
Andrew York provided the following method that tests the value of each column in the input buffer.
By using the trailing @ on the INPUT statement Andrew has left access open to the current input record for further INPUT statements on the same iteration of the data step:
data sample(drop=n i);
length sku 8;
infile datalines dlm=',' dsd missover;
array addy{*} $30 address1-address4;
n=1;
input sku @;
do i=1 to dim(addy);
input addy{n} @;
if addy{n} ne ' ' then n=n+1;
end;
input;
datalines;
0001,9 Church Green,Witney,Oxfordshire,OX28 4AZ
0002,,2 Witney Lane,Leafield,OX29 9PG,
0004,176 Monovale Dr,Los Angeles,CA United States,
0005,,,24800 Pacific Coast Highway,Malibu,
0006,,31663 Broadbeach Way.,Malibu
run;
Our Solution
We put together a solution for publication here too. Arrays are used but we also came up with an application for the little used LEAVE statement:
data sample;
length sku 8 address1-address4 $30;
infile datalines dlm=',' dsd missover;
input sku address1 $ address2 $ address3 $ address4 $ ;
* Loop over the addresses until the first non-missing
column is found, "leaving" the loop at that point
so i contains the arrays element of the first
non-missing value.;
array a{*} address1-address4;
do i = 1 to dim(a);
if ^missing(a{i}) then leave;
end;
* Now start from the first non-missing element and
assign it into the first array element. Move across
all the fields until we have non left.;
do j = 1 to dim(a);
if (i+j-1)<=dim(a) then a{j}=a{(i+j-1)};
else a{j}='';
end;
drop i j;
datalines;
0001,9 Church Green,Witney,Oxfordshire,OX28 4AZ
0002,,2 Witney Lane,Leafield,OX29 9PG,
0004,176 Monovale Dr,Los Angeles,CA United States,
0005,,,24800 Pacific Coast Highway,Malibu,
0006,,31663 Broadbeach Way.,Malibu
run;
proc print;
run;
The Significance of the Addresses?
For something off topic, we asked the significance of each address. The first two are indeed the previous and current head office addresses of Amadeus Software.
1: Amadeus Software's former head office
2: Amadeus Software's current head office
4: Jane Seymour's home address.
5: Pierce Brosnan's home address.
The third caused some controversy! The intention was to list addresses of actors and actresses who have appeared in James Bond films, providing a link to the next Friday Quiz. Unfortunately the source of information used was a perhaps less than accurate.
My apologies.
Jane Seymour and Pierce Brosnan provide a tenuous link to the next Friday Quiz: "A View to a Kill".


