1. Use PROC SORT with the NODUPKEY and NODUP options.
2. Use FIRST. and LAST. variables.
A detailed explanation follows.
SAMPLE DATA SET
ID Name Score
1 David 45
1 David 74
2 Sam 45
2 Ram 54
3 Bane 87
3 Mary 92
3 Bane 87
4 Dane 23
5 Jenny 87
5 Ken 87
6 Simran 63
8 Priya 72
Create this data set in SAS
data readin;
input ID Name $ Score;
cards;
1 David 45
1 David 74
2 Sam 45
2 Ram 54
3 Bane 87
3 Mary 92
3 Bane 87
4 Dane 23
5 Jenny 87
5 Ken 87
6 Simran 63
8 Priya 72
;
run;
There are several ways to identify and remove duplicate values:
PROC SORT
In PROC SORT, two options remove duplicates:
1. NODUPKEY option
2. NODUP option
The NODUPKEY option removes observations that repeat the values of the variables listed in the BY statement, keeping the first observation in each BY group. The NODUP option removes an observation only when the values of all its variables match the previous observation written to the output, that is, when the whole record is identical.
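/* OUT= keeps readin intact so that both steps start from the same 12 observations. */
/* The output data set names are illustrative. */
PROC SORT DATA=readin OUT=readin_nodupkey NODUPKEY;
BY ID;
RUN;
PROC SORT DATA=readin OUT=readin_nodup NODUP;
BY ID;
RUN;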
The output is shown below:
SAS: NODUPKEY vs NODUP
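NODUPKEY deleted 5 observations with duplicate ID values, leaving one observation for each of the 7 distinct IDs, whereas NODUP did not delete any observations. The two identical records (3 Bane 87) are not next to each other after sorting by ID alone, because 3 Mary 92 sits between them, and NODUP compares each observation only with the one before it.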
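To see which observations were dropped, and to make NODUP catch identical records that are not adjacent, the steps above can be extended as in the sketch below. It is a minimal sketch, not the only way to do this: it assumes readin still holds all 12 sample observations, and the output data set names (unique_ids, removed_ids, unique_rows) are made up for illustration. DUPOUT= and BY _ALL_ are standard PROC SORT features.

```sas
/* Sketch only: output data set names are hypothetical.                 */
/* Assumes readin still has all 12 observations; re-run the DATA step   */
/* first if readin was already sorted in place with NODUPKEY.           */

/* NODUPKEY with DUPOUT=: keep the first observation per ID and         */
/* write the removed observations to a separate data set.               */
PROC SORT DATA=readin OUT=unique_ids DUPOUT=removed_ids NODUPKEY;
BY ID;
RUN;

/* NODUP with BY _ALL_: sorting on every variable places identical      */
/* records next to each other, so NODUP can remove the repeat.          */
PROC SORT DATA=readin OUT=unique_rows NODUP;
BY _ALL_;
RUN;
```

With the sample data, unique_ids should contain 7 observations and removed_ids the 5 dropped ones, while unique_rows should contain 11 observations, because only the repeated 3 Bane 87 record is an exact duplicate.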