SAS Data Sets Concatenation and Interleaving with PROC APPEND and SET Statement - Prof. Ja, Study notes of Statistics

Virginia Commonwealth University (VCU), Statistics

Prof. James Davenport

How to concatenate and interleave SAS data sets using PROC APPEND and the SET statement. The notes cover the differences in behavior when variables have different attributes, lengths, and types. They also discuss the FORCE option and interleaving with the BY statement and the SORT procedure.

Typology: Study notes

Pre 2010

Uploaded on 02/12/2009

koofers-user-159 🇺🇸


*** CONCATENATING SAS DATA SETS ***

*** WITH PROC APPEND ***

When Variables Have Different Attributes

• If a variable has different attributes in the BASE= data set than it does in the DATA= data set, the attributes in the BASE= data set prevail.

• In the cases of differing formats, informats, and labels, the concatenation succeeds.

• If the length of a variable is longer in the BASE= data set than in the DATA= data set, the concatenation succeeds.

• If the length of the variable is longer in the DATA= data set than in the BASE= data set, or if the same variable is a character variable in one and numeric in the other, PROC APPEND fails to concatenate the files unless you specify the FORCE option.

Using the FORCE option has the following consequences:

• The length specified in the BASE= data set prevails. Therefore, the SAS System truncates values from the DATA= data set to fit them into the length specified in the BASE= data set (or pads them with blanks).

• The type specified in the BASE= data set prevails. The procedure replaces values of the wrong type (all values for the variable in the DATA= data set) with missing values.
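The length rule above can be sketched as follows. This is a minimal illustration, not part of the original notes: the data sets WORK.BASE and WORK.NEW and their values are hypothetical, chosen so that NAME is longer in the DATA= data set than in the BASE= data set.

```sas
/* Hypothetical BASE= data set: NAME has length 5 */
data work.base;
   length name $ 5;
   input name $;
   datalines;
Smith
Jones
;
run;

/* Hypothetical DATA= data set: NAME has length 10 */
data work.new;
   length name $ 10;
   input name $;
   datalines;
Washington
;
run;

/* Without FORCE, this step does not append, because NAME is
   longer in the DATA= data set than in the BASE= data set.  */
proc append base=work.base data=work.new force;
run;
```

With FORCE, the appended value is stored as Washi, truncated to the length of NAME in the BASE= data set, and SAS writes a warning about the length mismatch to the log.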


Choosing between PROC APPEND and the SET Statement

• If two data sets contain the same variables and the variables possess the same attributes, the file that results from concatenating them with PROC APPEND is the same as the file that results from concatenating them with the SET statement. However, PROC APPEND is much faster, especially when the BASE= data set is large, because it does not process the observations already in the BASE= data set.

• When the variables or their attributes do not match, the two methods differ enough that you must consider the differences in behavior before you decide which method to use.
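As a rough sketch of the two approaches, assume two hypothetical data sets, WORK.YEAR1 and WORK.YEAR2, with identical variables and attributes (the names and values are invented for illustration):

```sas
/* Hypothetical data sets with the same variables and attributes */
data work.year1;
   input id sales;
   datalines;
1 100
2 150
;
run;

data work.year2;
   input id sales;
   datalines;
3 120
4 180
;
run;

/* Method 1: the DATA step reads every observation from both
   data sets and writes a new data set, WORK.COMBINED.        */
data work.combined;
   set work.year1 work.year2;
run;

/* Method 2: PROC APPEND adds the observations from YEAR2 to the
   end of YEAR1 without reading the observations already there. */
proc append base=work.year1 data=work.year2;
run;
```

One practical difference to keep in mind: the DATA step creates a new data set and leaves both inputs unchanged, whereas PROC APPEND modifies the BASE= data set in place.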

Different lengths
   SET statement: If the same variable has a different length in two or more data sets, uses the length from the data set you name first in the SET statement.
   PROC APPEND: Requires the FORCE option if the length of a variable is longer in the DATA= data set. Truncates the values of the variable to match the length in the BASE= data set.

Different types
   SET statement: Does not concatenate.
   PROC APPEND: Requires the FORCE option to concatenate. Uses the type from the BASE= data set and assigns missing values to the variable in observations from the DATA= data set.
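The SET statement's length rule can be sketched with two hypothetical data sets, WORK.SHORT and WORK.LONG (names and values invented for illustration). The second DATA step shows a common workaround that is not discussed in the notes above: declaring the length before the SET statement.

```sas
/* Hypothetical data sets: NAME has length 5 in one, 10 in the other */
data work.short;
   length name $ 5;
   name = 'Smith';
run;

data work.long;
   length name $ 10;
   name = 'Washington';
run;

/* NAME takes its length (5) from WORK.SHORT, the first data set
   named in the SET statement, so 'Washington' is truncated.      */
data work.truncated;
   set work.short work.long;
run;

/* Assumption beyond the notes: a LENGTH statement placed before
   SET fixes the length at 10, so the longer values are kept.     */
data work.full;
   length name $ 10;
   set work.short work.long;
run;
```

Listing WORK.LONG first in the SET statement would also preserve the longer length, at the cost of changing the order of the observations.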

*** INTERLEAVING SAS DATA SETS ***

Interleaving combines individually sorted SAS data sets into one sorted data set, using the SET statement and the BY statement. The number of observations in the new data set is the sum of the number of observations in the original data sets.

This section covers:

• How to use the BY statement

• How to sort data sets to prepare for interleaving

• How to use the SET and BY statements together to interleave observations

Using BY-Group Processing

The BY statement specifies the variable or variables by which you want to interleave the data sets. To understand this, we first review BY variables, BY values, and BY groups.

• BY variable: a variable named in the BY statement.

• BY value: the value of a BY variable.

• BY group: all observations with the same value for all BY variables. In discussions of interleaving, BY groups commonly span more than one data set. If you use more than one variable in a BY statement, a BY group is a group of observations with a unique combination of values for those variables.

We have two data sets (from two divisions), each containing the following variables:

• Project: a unique code that identifies the project

• Dept: the name of the department involved in the project

• Manager: the name of the manager of the department

• Headcoun: the number of people working for the manager on the project

(See program SAS_Create_interleav_randd.sas.) Note: Data set randd is already sorted by PROJECT.

(See program SAS_Create_interleav_pubs.sas.) Note: Data set pubs has its variables in a different order and is not sorted by PROJECT.

We want to combine the data sets by PROJECT so that the new data set shows the resources that both divisions are devoting to each project. Both data sets must be sorted by PROJECT before you can interleave them.
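The sorting step can be sketched as follows. This assumes the libref INTLEAVE has already been assigned with a LIBNAME statement and that INTLEAVE.PUBS was created by the program referenced above; neither is shown here.

```sas
/* Assumes the libref INTLEAVE is already assigned.            */
/* Sorts INTLEAVE.PUBS by PROJECT, replacing it with the       */
/* sorted version. Add OUT= to write to a different data set.  */
proc sort data=intleave.pubs;
   by project;
run;
```

Sorting in place keeps the data set name unchanged, so a later SET statement can refer to INTLEAVE.PUBS directly.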

Interleaving the Data Sets

To interleave the data set INTLEAVE.RANDD and the data set INTLEAVE.PUBS, use the SET statement and the BY statement as follows:

   data randdpub;
      set intleave.randd intleave.pubs;
      by project;
   run;

(See program SAS_Interleave_randd-pubs.sas.)

Note: We did not have to sort INTLEAVE.RANDD because it was already sorted by PROJECT; otherwise, we would have to sort it first.
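For readers without the course files, the whole sequence can be tried on a small self-contained example. The rows below are hypothetical sample values in the WORK library, not the course data; only the variable names follow the description above.

```sas
/* Hypothetical rows; already in PROJECT order */
data work.randd;
   input project $ dept $ manager $ headcoun;
   datalines;
P1 Design Lee 4
P3 Test Kim 2
;
run;

/* Hypothetical rows; different variable order, not sorted */
data work.pubs;
   input manager $ dept $ project $ headcoun;
   datalines;
Cook Writing P3 1
Deakins Editing P1 2
;
run;

/* Sort the unsorted data set by the BY variable */
proc sort data=work.pubs;
   by project;
run;

/* Interleave: one observation per input row, ordered by PROJECT */
data work.randdpub;
   set work.randd work.pubs;
   by project;
run;

proc print data=work.randdpub;
run;
```

The result has four observations, the sum of the two inputs. Within each PROJECT value, the observation from WORK.RANDD comes before the one from WORK.PUBS, because that is the order in which the data sets appear in the SET statement.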