Organising Data Set - Econometrics - Lecture Slides, Slides of Econometrics and Mathematical Economics

Organizing a data set, generating new variables, an introduction to missing values, the rename command, the egenerate command, syntax, numeric missing values, and string missing values are the key learning points of this Econometrics lecture.

Typology: Slides

2011/2012

Uploaded on 11/10/2012

uzman 🇮🇳

ORGANISING DATA SET

Objectives

The objectives of this lecture are:

  • To organize the data set.
  • To generate new variables.
  • To introduce missing values.

Organizing a Data Set

Load the data set ‘world.dta’. Type describe to describe the data.

The rename Command

rename command
This command is used to give an existing variable a new name.

Syntax

rename old_varname new_varname

Remember that the contents of the variable do not change. Note that this syntax renames one variable at a time (Stata 12 and later also accept grouped renames such as rename (old1 old2) (new1 new2)).

Example:

rename isocode country
ren yr year
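The rename step can be sketched as follows. This is only an illustration: world.dta is not reproduced here, so the auto data set that ships with Stata stands in for it, and the new variable names are made up for the example.

```stata
* Assumption: the bundled auto data set stands in for world.dta
sysuse auto, clear
describe

* Rename one variable per command; the stored values are untouched
rename rep78 repair_record
ren mpg miles_per_gallon    // ren is an accepted abbreviation of rename

describe repair_record miles_per_gallon
```

Running describe before and after shows that only the names change, not the storage types or the number of observations.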

The keep & drop Command

drop command
This command eliminates variables or observations from the data.

keep command
keep works the same as drop, except that you specify the variables or observations to be kept rather than deleted.

Syntax

drop varlist
drop if exp
drop in range [if exp]
keep varlist
keep if exp
keep in range [if exp]

The keep & drop Command

Example:

keep pop cg cc ci yr

OR

drop xr pi ki kc kg i openc openk csave cgdp cgnp rgdpch rgdpl rgdptt y p pc pg rgdpeqa rgdpwok

NOTE: Both commands above will do the same job. (Why?)

You can also drop or keep observations, such as those before 1960:

keep if year>=

OR

drop if year<
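The equivalence of keep and drop can be checked with a small sketch. It assumes the bundled auto data set in place of world.dta, and the cutoff of 5000 on price is an arbitrary value picked for illustration, not one taken from the lecture.

```stata
* Assumption: auto data stands in for world.dta; 5000 is an arbitrary cutoff

* Route 1: say what to keep
sysuse auto, clear
keep make price mpg foreign     // keep four variables, discard the rest
keep if price >= 5000           // keep observations at or above the cutoff
count
local n_keep = r(N)

* Route 2: say what to drop
sysuse auto, clear
drop rep78-gear_ratio           // drop every variable except the same four
drop if price < 5000            // drop observations below the cutoff
count
local n_drop = r(N)

* Both routes leave the same variables and the same number of observations
display "keep route: `n_keep' obs, drop route: `n_drop' obs"
```

The two routes agree because each condition is the exact complement of the other; which one to type is a matter of which list is shorter.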

The keep & drop Command

You may want to drop observations with specific values, such as missing values (denoted in Stata by a dot):

drop if pop==.

You may want to keep observations for all countries other than Angola and Zimbabwe:

drop if country=="AGO" | country=="ZWE"

Note: With string variables, you must enclose the value in double quotes. Otherwise, Stata will claim not to be able to find what you are referring to.
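A minimal sketch of both cases, assuming the bundled auto data set instead of world.dta. The variable rep78 plays the role of pop, and the two make values are illustrative stand-ins for the country codes.

```stata
* Assumption: auto data stands in for world.dta
sysuse auto, clear

* Numeric missing values: missing() also catches the extended
* missing values .a to .z, which a test against . alone would not
drop if missing(rep78)

* String values must be wrapped in double quotes
drop if make == "AMC Concord" | make == "Volvo 260"

count
```

Leaving the quotes off makes Stata read the text as a variable name, which is the source of the "not found" complaint mentioned above.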

The sort Command

sort command
This command arranges the observations of the current data in ascending order.

Syntax

sort varlist [in] [, stable]

Missing values are treated as being larger than any other number, so they are placed last. For the stable option, type

help sort

Example:

sort country
sort pop
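The placement of missing values can be seen in a short sketch, again assuming the bundled auto data set as a stand-in, since its rep78 variable contains a few missing values.

```stata
* Assumption: auto data stands in for world.dta
sysuse auto, clear

sort rep78                   // ascending; missing values end up last
list make rep78 in -5/l      // show the last five observations

* stable keeps the existing relative order among tied observations
sort foreign, stable
```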

The gsort Command

gsort command
It sorts observations in ascending or descending order of the specified variables.

Syntax

gsort [+|-] varname [[+|-] varname ...] [, generate(newvar) mfirst]

gsort differs from the sort command, which produces only ascending order. The observations are placed in ascending order of varname if + or nothing is typed in front of the name, and in descending order if - is typed.

Options:

  • generate(newvar) creates newvar containing 1, 2, 3, … for each group denoted by the ordered data. This is useful when using the ordering in a subsequent by operation.
  • mfirst specifies that missing values be placed first in descending orderings rather than last.

The gsort Command

Example:

gsort country (same as sort country)
list in 1/10
gsort +country (same as sort country)
list in 1/10
gsort -country (reverse sort)
list in 1/10
gsort country year (ascending country, asc. year)
list in 1/
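The descending sort and the two options can be tried out with the sketch below. It assumes the bundled auto data set in place of world.dta, and grp is a hypothetical name for the generated group variable.

```stata
* Assumption: auto data stands in for world.dta
sysuse auto, clear

* Remember the largest price so the descending sort can be checked
summarize price, meanonly
local pmax = r(max)

gsort -price                    // descending order of price
list make price in 1/5

* generate() numbers the groups 1, 2, ... in sorted order
gsort foreign, generate(grp)
tabulate foreign grp

* mfirst moves missing values to the top of a descending sort
gsort -rep78, mfirst
list make rep78 in 1/5
```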

The gsort Command

The result of gsort country, gen(test) is given below:

The recode Command

recode command
It changes the values of numeric variables according to the rules specified.

Syntax:

recode varlist (rule) [(rule) ...] [, generate(newvar)]

The recode Command

Load the data set ‘personal info.dta’.

Examples:

summarize age

to get the minimum and maximum.

recode age (min/12=1) (13/19=2) (20/39=3), gen(agecat)

This groups the values of age into categories and saves them in a new variable named ‘agecat’. Use the list command to see its impact:

list age agecat

The recode Command

You can provide value labels as well.

recode age (min/12=1 "childhood") (13/19=2 "teenagers") (20/39=3 "young"), gen(agecat2)
list age agecat agecat2

To get more on this topic, type:

help recode
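The recode rules can be tried without the lecture's data file. In the sketch below, the ages are typed in by hand as a hypothetical stand-in for ‘personal info.dta’, which is not reproduced here.

```stata
* Assumption: a small hand-typed age variable replaces personal info.dta
clear
input age
5
12
13
19
20
39
45
end

* Same rules as in the lecture, with value labels attached
recode age (min/12=1 "childhood") (13/19=2 "teenagers") ///
    (20/39=3 "young"), generate(agecat)

list age agecat
```

The value 45 matches none of the rules, so it is copied to agecat unchanged. A further rule would be needed to place it in a category.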