How to identify and remove unique and duplicate values?

Question

How to identify and remove unique and duplicate values?

1 Answer

john ganales · Answer 1 · 2024-08-20T15:19:18+0000

1. Use PROC SORT with NODUPKEY and NODUP Options.

2. Use FIRST. and LAST. variables.

The detailed explanation is shown below:

SAMPLE DATA SET

ID  Name    Score
1   David   45
1   David   74
2   Sam     45
2   Ram     54
3   Bane    87
3   Mary    92
3   Bane    87
4   Dane    23
5   Jenny   87
5   Ken     87
6   Simran  63
8   Priya   72

Create this data set in SAS

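/* The semicolon that ends the data sits on its own line, */
/* so the last record (8 Priya 72) is still read as data. */
data readin;
input ID Name $ Score;
cards;
1 David 45
1 David 74
2 Sam 45
2 Ram 54
3 Bane 87
3 Mary 92
3 Bane 87
4 Dane 23
5 Jenny 87
5 Ken 87
6 Simran 63
8 Priya 72
;
run;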

There are several ways to identify and remove unique and duplicate values:

PROC SORT

In PROC SORT, there are two options by which we can remove duplicates.

1. NODUPKEY option

2. NODUP option

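The NODUPKEY option removes observations whose BY-variable values repeat, keeping the first observation in each BY group. The NODUP option removes observations only when the values of all variables are repeated (identical observations). NODUP compares each observation with the one immediately before it in the sorted output, so identical observations must be adjacent to be removed.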

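/* OUT= writes each result to a new data set, so readin is not overwritten */
/* and the second step still sees the original observations.               */
PROC SORT DATA=readin OUT=readin_nodupkey NODUPKEY;
BY ID;
RUN;

PROC SORT DATA=readin OUT=readin_nodup NODUP;
BY ID;
RUN;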
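PROC SORT can also write the discarded observations to a separate data set, which helps to identify duplicates rather than just drop them. The following is a minimal sketch, assuming the readin data set created above; the output data set names (key_unique, key_dups, rec_unique, rec_dups) are illustrative and not part of the original answer. It uses DUPOUT= to capture the removed observations and OUT= to leave readin unchanged. The second step sorts BY _ALL_ so that identical observations end up next to each other, which NODUP needs in order to detect them.

```sas
/* Sketch only: data set names other than readin are illustrative. */

/* Keep one observation per ID; the removed observations go to key_dups. */
proc sort data=readin out=key_unique dupout=key_dups nodupkey;
  by ID;
run;

/* Sorting by every variable places identical observations side by side, */
/* so NODUP can remove them; the removed copies go to rec_dups.          */
proc sort data=readin out=rec_unique dupout=rec_dups nodup;
  by _all_;
run;
```

With the sample data, the second step should move one copy of the repeated observation (3 Bane 87) into rec_dups.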

The output of the two PROC SORT steps is shown below:

[Figure: SAS: NODUPKEY vs NODUP]

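NODUPKEY deleted 5 observations with duplicate ID values, whereas NODUP did not delete any: in the data sorted by ID, the two identical observations (3 Bane 87) are separated by 3 Mary 92, so they are never compared with each other.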
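The second approach listed in the answer, FIRST. and LAST. variables, can be sketched as follows. This is a minimal example under the assumption that "unique" means an ID that occurs exactly once; the data set names sorted, unique_ids and dup_ids are illustrative and not part of the original answer.

```sas
/* Sketch only: split readin into IDs that occur once and IDs that repeat. */

/* BY-group processing requires the data to be sorted by the BY variable. */
proc sort data=readin out=sorted;
  by ID;
run;

data unique_ids dup_ids;
  set sorted;
  by ID;
  /* An observation that is both first and last in its BY group */
  /* is the only observation with that ID.                      */
  if first.ID and last.ID then output unique_ids;
  else output dup_ids;
run;
```

For the sample data listed above, IDs 4, 6 and 8 occur once and would land in unique_ids, and all other observations would land in dup_ids. Changing the condition to first.ID alone keeps the first observation of every ID instead, which is similar to what NODUPKEY retains.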
