Introduction to R

Create a variable, data types, data structure, basic data manipulation, import data

R is a language and environment for statistical computing, data analysis, visualisation and graphics, among other things. It is free and open source software, released under the terms of the GNU General Public License.

R runs on a wide variety of platforms, including Windows, Linux and macOS.

Getting help
  • Use Google well! There are a lot of discussions in forums such as StackOverflow
  • Cheatsheets made by Posit and community
  • Type ?functionName in the console
  • Package vignettes (longer format of documentation)

Create variables

Create a numeric variable

To create a variable, you type variable_name <- variable_value in the console.

# create a numeric variable a
a <- 3
a
[1] 3

You can carry out mathematical calculations on numeric variables, such as exponentiation, addition, division and many more.

# assign values to variables a, b, c
a <- 3
b <- 4
c <- 7

# calculate the average of a,b,c
# output directly
(a+b+c)/3
[1] 4.666667
# or, save into a new variable d
d <- (a+b+c)/3
d
[1] 4.666667
# e to the power of a (e is approximately 2.718)
exp(a)
[1] 20.08554
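
Beyond exp(), R has the usual arithmetic operators built in. The following is a minimal sketch of a few of them; the variable names x and y and their values are made up for illustration and are not part of the course material.

```r
# illustrative values (not from the course data)
x <- 7
y <- 2

x + y     # addition: 9
x - y     # subtraction: 5
x * y     # multiplication: 14
x / y     # division: 3.5
x ^ y     # exponentiation, x to the power of y: 49
x %% y    # remainder after dividing x by y: 1
sqrt(16)  # square root: 4
```

As with the average above, any of these results can be saved into a new variable with <- for later use.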

Data types

In R, there are a few types of variables. The ones you will interact with are:

  • numeric (real numbers): 1.2, -5
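  • integer: 1L, 2L, 2000L (the L suffix tells R to store a whole number as an integer; without it, 1 is stored as numeric)
  • character (strings): “male”, “female”
  • logical (binary, 1/0): TRUE or FALSE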
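
One detail that is easy to miss: a whole number typed without a suffix is stored as numeric, and R only treats it as an integer when it carries the L suffix. The sketch below checks this with class(), which reports the type of a value. The names whole_number and int_number are made up for this example.

```r
whole_number <- 2000   # no suffix: stored as numeric
int_number <- 2000L    # L suffix: stored as integer

class(whole_number)    # "numeric"
class(int_number)      # "integer"
class("male")          # "character"
class(TRUE)            # "logical"
```

For most data analysis tasks the difference between numeric and integer does not matter, so this is mainly useful for reading output from other people's code.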

Note that lines (or parts of lines) that start with # are comments, and are not evaluated.

# create a numeric variable number_1
number_1 <- 1.2

# a character variable student
student <- 'hadley'

# a logical variable true_or_false
true_or_false <- T

To evaluate (or return) the variable you have created, you can either type the name of the variable, or use print() with the variable name inside the parentheses.

number_1
[1] 1.2
print(number_1)
[1] 1.2

You can check the variable type using class(variable_name):

class(number_1)
[1] "numeric"
class(student)
[1] "character"
class(true_or_false)
[1] "logical"

Name your variable

It is good practice to give your variable a name that is both easy to understand, and also valid.

  • Names are case sensitive, VariableA is not the same as variablea
  • A number cannot be a variable name by itself. Combining numbers and letters is allowed, but the name should start with a letter: variable3 is valid, but 22variable is NOT
  • You can use underscores (“snake_case” naming style). It helps readability, so it is my personal favourite.

Avoid the following:

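  • Other special characters, such as the dot and the dollar sign: in var$A the $ extracts the component A from var, and dots are used by R when naming methods (for example print.data.frame), so var.A can be misleading even though it is a valid name.
  • Function names and reserved words such as function, list and so on. If you really can’t think of a better name, use something like my_function or list_1 to avoid ambiguity.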
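
The naming rules above can be checked in the console. This is a small sketch, not part of the original material: it uses the base R function make.names(), which turns a string into a syntactically valid name, so if the result differs from the input, the input was not a valid name.

```r
# names are case sensitive: these are two different variables
VariableA <- 1
variablea <- 2
VariableA == variablea     # FALSE

# make.names() returns a valid version of each candidate name
make.names("variable3")    # "variable3": already valid
make.names("my_variable")  # "my_variable": underscores are fine
make.names("22variable")   # "X22variable": cannot start with a number
```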

Data structure

Vectors

A vector is an ordered collection of values of the same type; the values can be numeric, character or logical.

To create a vector, use function c().

# numeric
num_vector <- c(1, 2, 3, 4, 5)
num_vector
[1] 1 2 3 4 5
# character
char_vector <- c('student_a', 'student_b', 'student_c')
char_vector
[1] "student_a" "student_b" "student_c"
# logical 
logical_vector <- c(T, F, T, F)
logical_vector
[1]  TRUE FALSE  TRUE FALSE

There are some shortcuts to create a sequence of values; not required to learn, but very useful.

# numeric
# num_vector <- c(1, 2, 3, 4, 5)
num_vector <- 1:5 # from 1 to 5
seq(from = 1, to = 11, by = 2) # from 1 to 11, with 2 between each
[1]  1  3  5  7  9 11
rep(1, 5) # repeat 1 for 5 times
[1] 1 1 1 1 1
# character
# char_vector <- c('student_a', 'student_b', 'student_c')
char_vector <- paste0('student_', c('a', 'b', 'c'))
char_vector
[1] "student_a" "student_b" "student_c"

Types of elements in a vector

In a vector, types of the elements must be the same. If you try to combine multiple types of variables in the same vector, such as a numeric number and a character, R will try to convert them into the same type.

Try to combine the following values into a vector, and see what happens.

  • 1.52, “student_a”
  • 1.52, TRUE (logical)
  • TRUE, “student_a”
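
If you want to check your answers, here is one way to do it: combine the values with c() and ask for the class of the result. The names mixed_1, mixed_2 and mixed_3 are made up for this sketch.

```r
mixed_1 <- c(1.52, "student_a")
mixed_2 <- c(1.52, TRUE)
mixed_3 <- c(TRUE, "student_a")

class(mixed_1)  # "character": 1.52 becomes the string "1.52"
class(mixed_2)  # "numeric": TRUE becomes 1
class(mixed_3)  # "character": TRUE becomes the string "TRUE"
```

In each case R picks the type that can represent every element: logical values can be turned into numbers, and anything can be turned into a string.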

Combine multiple vectors

You can combine multiple vectors using c(). For example, if vec1 has 3 elements and vec2 has 2 elements (assuming they are of the same type), combining them gives a vector of 5 elements.

vec1 <- c(1, 3, 5)
vec2 <- c(100, 101)
c(vec1, vec2)
[1]   1   3   5 100 101
# you can also save it into a new variable, 
# so that you can access it in the future
vec_combined <- c(vec1, vec2)
vec_combined
[1]   1   3   5 100 101

Matrix

A matrix can be thought of as a stack of vectors. When you collect data from \(n\) patients (or subjects), you measure a few aspects on each patient such as age, sex, height and smoking. Let’s say you have measured \(p\) aspects. This forms a matrix of size \(n \times p\).

You might not need to create a matrix from scratch in R (because the focus of this course is data analysis); but it is helpful to understand some basic data manipulation commands.

You can create a matrix using matrix(), with some parameters:

matrix_1 <- matrix(data = c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = T)
matrix_1
     [,1] [,2]
[1,]    1    2
[2,]    3    4
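
The byrow parameter decides the order in which the values are placed. The sketch below builds the same four values both ways so the difference is visible; the names values, by_row and by_col are made up for illustration.

```r
values <- c(1, 2, 3, 4)

# byrow = TRUE: values fill the first row, then the second
by_row <- matrix(data = values, nrow = 2, ncol = 2, byrow = TRUE)
by_row

# byrow = FALSE (the default): values fill the first column, then the second
by_col <- matrix(data = values, nrow = 2, ncol = 2, byrow = FALSE)
by_col

# the two results are transposes of each other
identical(by_col, t(by_row))  # TRUE
```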

You can also create a matrix by combining two vectors of the same size, using cbind() or rbind(), which stand for “column bind” and “row bind”.

vec1 <- c(1, 2)
vec2 <- c(3, 4)

# bind by column
matrix_c <- cbind(vec1, vec2)
matrix_c
     vec1 vec2
[1,]    1    3
[2,]    2    4
# bind by row
matrix_r <- rbind(vec1, vec2)
matrix_r
     [,1] [,2]
vec1    1    2
vec2    3    4

Dataframe

A dataframe (data.frame in R) is a data format commonly used in data analysis with R and Python. It can be thought of as a matrix that allows a mixture of data types, such as numeric and categorical measurements (age and sex).

In this course, you will mostly be working with dataframes.

We create a small dataframe of 3 subjects:

  • Subject 1 is a 20-year-old male who has covid
  • Subject 2 is a 50-year-old female who has covid
  • Subject 3 is a 32-year-old male who does not have covid

This is how you can present the dataframe, where each column has a different data type.

mini_data <- data.frame(
  age = c(20, 50, 32), 
  sex = c('male', 'female', 'male'), 
  has_covid = c(T, T, F)
)
mini_data
  age    sex has_covid
1  20   male      TRUE
2  50 female      TRUE
3  32   male     FALSE
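
To confirm that each column keeps its own data type, you can ask for the class of every column. This is a minimal sketch that rebuilds the same small dataframe; stringsAsFactors = FALSE is written out here as an assumption, so that sex stays a character column on older versions of R as well.

```r
mini_data <- data.frame(
  age = c(20, 50, 32),
  sex = c('male', 'female', 'male'),
  has_covid = c(TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE  # keep sex as character (the default since R 4.0)
)

# class of each column: numeric, character, logical
sapply(mini_data, class)

# compact overview: dimensions, column names, types and first values
str(mini_data)
```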

Basic data manipulation

Dimension of your data

You can find the size of a vector with length().

For a matrix or dataframe, you can use dim(). It returns two numbers: the number of rows, then the number of columns.

vec1 <- c(1, 2)
length(vec1)
[1] 2
# matrix
mat <- matrix(data = c(1, 2, 3, 4), nrow = 2, byrow = TRUE)
dim(mat)
[1] 2 2
# dataframe
dim(mini_data)
[1] 3 3

dim() or length()

If you use dim() on a vector, it returns NULL. Given that a vector can be thought of as a matrix with 1 row (or column), this may seem counterintuitive.

Nonetheless, dim() works on matrix objects: if you convert the vector into a matrix with nrow = 1 or ncol = 1, dim() will work.

If you use length() on a matrix, it will return the total number of elements, i.e. ncol times nrow.

You can also use nrow(), ncol() to get the number of rows and columns explicitly.

# ncol, nrow
# dim(mini_data)
nrow(mini_data)
[1] 3
ncol(mini_data)
[1] 3
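
The behaviour of dim() and length() on vectors and matrices can be verified with a short sketch. The object names vec, vec_as_row and mat_2by2 are made up for this example.

```r
vec <- c(1, 2, 3, 4)
dim(vec)          # NULL: a plain vector has no dimensions
length(vec)       # 4

# convert the vector into a matrix with a single row
vec_as_row <- matrix(vec, nrow = 1)
dim(vec_as_row)   # 1 4

# length() on a matrix counts all the elements
mat_2by2 <- matrix(vec, nrow = 2)
length(mat_2by2)  # 4, i.e. 2 rows times 2 columns
```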

Accessing elements in your data

For a vector, you can access

  • an element at a given position
  • multiple elements at given positions
  • elements after, or before, a certain position

Sometimes you might need to combine previous knowledge to get what you want (e.g. to know how many elements in total there are).

letters <- c('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h')

# 3rd letter
letters[3]
[1] "c"
# 3rd, and 5th
letters[c(3, 5)]
[1] "c" "e"
# letters after the 4th
letters[5:8] # or, letters[5:length(letters)]
[1] "e" "f" "g" "h"
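
There are a few other ways to get the elements before or after a position without counting them yourself. This sketch uses the base R functions head() and tail(), plus negative indices, which drop elements instead of selecting them. The vector scores and its values are made up for illustration.

```r
scores <- c(10, 20, 30, 40, 50, 60)  # made-up values

head(scores, 2)   # first 2 elements: 10 20
tail(scores, 2)   # last 2 elements: 50 60
scores[-1]        # everything except the 1st: 20 30 40 50 60
scores[-(1:4)]    # everything after the 4th: 50 60
```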

For a matrix,

  • matrix[r, c] to get the element on \(r\)-th row, \(c\)-th column.
  • matrix[r, ], matrix[, c] to get all the elements on \(r\)-th row or \(c\)-th column

mat_3by3 <- matrix(data = 1:9, nrow = 3, byrow = T)
mat_3by3
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
# element (2,3)
mat_3by3[2, 3]
[1] 6
# first row
mat_3by3[1,]
[1] 1 2 3

For a dataframe,

  • you can use indices (row, col) in the same way as matrices above;
  • use data$column_name, or data['column_name'] to access the entire column

Conventionally, each row is a subject, and each column is a variable (or aspect of measurement, feature, characteristic, risk factor etc).

mini_data
  age    sex has_covid
1  20   male      TRUE
2  50 female      TRUE
3  32   male     FALSE
# first row 
mini_data[1, ]
  age  sex has_covid
1  20 male      TRUE
# second col
mini_data[, 2]
[1] "male"   "female" "male"  
# via column name using $
mini_data$age
[1] 20 50 32
# alternatively, 
mini_data['age']
  age
1  20
2  50
3  32
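
The two outputs above look different because $ returns the column as a vector, while single square brackets return a dataframe with one column. The sketch below makes the difference explicit; it also uses double square brackets, which are not covered above but behave like $. The names age_vector, age_column and age_again are made up for this example.

```r
mini_data <- data.frame(
  age = c(20, 50, 32),
  sex = c('male', 'female', 'male'),
  has_covid = c(TRUE, TRUE, FALSE)
)

age_vector <- mini_data$age       # a numeric vector
age_column <- mini_data['age']    # a dataframe with one column
age_again  <- mini_data[['age']]  # double brackets: a vector again

class(age_vector)                 # "numeric"
class(age_column)                 # "data.frame"
identical(age_vector, age_again)  # TRUE

# functions that expect a vector of numbers work directly on age_vector
mean(age_vector)                  # 34
```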
Filter data based on criteria

You might have a task where you need to filter elements based on another variable: for example, select the age based on sex. This task is done in 2 steps:

  • create a logical (binary, true or false) variable on sex, call it sex_indicator
  • select the elements in the age vector for which sex_indicator is TRUE. (The operator == checks, element by element, whether the criterion is met)

The following example illustrates this process. You will use this a few times in the course, for example to select the height measured for men and women.

age <- c(55, 60, 65)
sex <- c('Male', 'Female', 'Male')

# select age for females only
# first create a variable indicating 'sex == Female'
# i.e. if the element is Female, returns T; otherwise, F

sex_indicator <- sex == 'Female'
sex_indicator
[1] FALSE  TRUE FALSE
# next combine age with sex_indicator, this only selects the 2nd element
age[sex_indicator] 
[1] 60
# you can skip the middle step:
age[sex == 'Female']
[1] 60
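
The same idea extends to counting, to combining criteria, and to selecting whole rows of a dataframe. This is a minimal sketch using the age and sex vectors from above; the dataframe name subjects is made up for illustration.

```r
age <- c(55, 60, 65)
sex <- c('Male', 'Female', 'Male')

# count how many subjects meet the criterion: each TRUE counts as 1
sum(sex == 'Male')             # 2

# combine two criteria with & (and)
age[sex == 'Male' & age > 60]  # 65

# select rows of a dataframe: the logical vector goes before the comma
subjects <- data.frame(age = age, sex = sex)
subjects[subjects$sex == 'Male', ]
```

The last line returns a dataframe with the two male subjects; leaving the part after the comma empty keeps all columns.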

Modify existing data (optional)

Keep your original data safe!

Modifying existing data is easy, but you should be aware of the risks. In this class we only modify data we created in the class so there is little risk, but you might have your own datasets to analyse in the future.

You should keep your original data in a safe place, and work on copies of it.

Version control is a good skill to learn.

# vector
# make e into E 
letters[5] <- 'E'
letters
[1] "a" "b" "c" "d" "E" "f" "g" "h"
# matrix
# make (1, 1) 20
mat_3by3[1, 1] # originally was 1
[1] 1
mat_3by3[1, 1] <- 20
mat_3by3
     [,1] [,2] [,3]
[1,]   20    2    3
[2,]    4    5    6
[3,]    7    8    9
# dataframe
# change subject 2 so that they do not have covid
mini_data
  age    sex has_covid
1  20   male      TRUE
2  50 female      TRUE
3  32   male     FALSE
mini_data$has_covid[2] <- F
mini_data
  age    sex has_covid
1  20   male      TRUE
2  50 female     FALSE
3  32   male     FALSE
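
Within an R session, one simple way to work on a copy is to assign the data to a new name before modifying it. The sketch below shows that changing the copy leaves the original untouched; the names original and working_copy are made up for this example. This protects the object in memory only, so keeping a backup of the data file itself is still a separate step.

```r
original <- data.frame(
  age = c(20, 50, 32),
  has_covid = c(TRUE, TRUE, FALSE)
)

# assigning to a new name gives an independent copy
working_copy <- original
working_copy$has_covid[2] <- FALSE

original$has_covid      # TRUE TRUE FALSE (unchanged)
working_copy$has_covid  # TRUE FALSE FALSE
```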

Import data

Before importing a dataset, you need to know where it is, and how to tell R to find it in your file system.

Working directory, R project

You can think of the working directory as the folder where R looks for (and saves) files by default, such as your scripts and data.

You can check where your working directory is by running the following command.

getwd()
[1] "/Users/chizhang/Documents/GitHub/teaching_mf9130e/lab"

You can manually set this to a folder of your choosing by setwd(path).

It is recommended to use an R project. It ties a folder to the task you are working on, so that you do not need to set the working directory every time you open RStudio. Read more about how to create an R project.
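
Before reading a file, it can help to check that R finds it where you expect. The following sketch builds a path from the working directory and tests whether the file exists. The folder name data and the file name birth.csv mirror the example used in this section and are assumptions: they may not exist on your machine, in which case file.exists() simply returns FALSE.

```r
# current working directory
current_dir <- getwd()

# build a path inside it; 'data' and 'birth.csv' are assumed names
csv_path <- file.path(current_dir, 'data', 'birth.csv')
csv_path

# TRUE if the file is there, FALSE otherwise
file.exists(csv_path)

# list the files and folders in the working directory
list.files(current_dir)
```

Using file.path() rather than pasting strings together avoids mistakes with the separators between folder names.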

Import data

Data exist in different formats:

  • csv is one of the most commonly used formats for tabular data. If possible, it is a good idea to use this format, as it is readable by many different languages and software packages
  • xlsx is also good for storing tabular data; however it is slightly more complicated than csv.
  • rda can be used to store R data (such as lists, higher dimensional arrays);
  • Some formats are created by other software (such as dta created by Stata), and they require specific R packages to load.

It is difficult to summarise all the data formats here, so you should check the documentation on how to import and write (save) data of different types.
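
As an illustration of writing and reading, here is a sketch of a round trip for the csv and rda formats using base R only. The dataframe example_data is made up, and the files are written to temporary locations created by tempfile() so that nothing in your project folder is touched.

```r
example_data <- data.frame(
  age = c(20, 50, 32),
  sex = c('male', 'female', 'male')
)

# csv: write to a temporary file, then read it back
csv_file <- tempfile(fileext = '.csv')
write.csv(example_data, csv_file, row.names = FALSE)
from_csv <- read.csv(csv_file)
head(from_csv, 2)  # show the first 2 rows only

# rda: save() stores R objects under their names,
# load() restores them and returns the names it loaded
rda_file <- tempfile(fileext = '.rda')
save(example_data, file = rda_file)
loaded_names <- load(rda_file)
loaded_names       # "example_data"
```

For a large dataset, head() is a convenient way to look at the first few rows instead of printing everything.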

# read a csv file
birth <- read.csv('data/birth.csv', sep = ',')

birth
     id         low age lwt   eth       smk ptl  ht  ui fvt ttv  bwt
1     4 bwt <= 2500  28 120 other    smoker   1  no yes   0   0  709
2    10 bwt <= 2500  29 130 white nonsmoker   0  no yes   2   0 1021
3    11 bwt <= 2500  34 187 black    smoker   0 yes  no   0   0 1135
4    13 bwt <= 2500  25 105 other nonsmoker   1 yes  no   0   0 1330
5    15 bwt <= 2500  25  85 other nonsmoker   0  no yes   0   4 1474
6    16 bwt <= 2500  27 150 other nonsmoker   0  no  no   0   5 1588
7    17 bwt <= 2500  23  97 other nonsmoker   0  no yes   1   5 1588
8    18 bwt <= 2500  24 128 black nonsmoker   1  no  no   1   2 1701
9    19 bwt <= 2500  24 132 other nonsmoker   0 yes  no   0   5 1729
10   20 bwt <= 2500  21 165 white    smoker   0 yes  no   1   4 1790
11   22 bwt <= 2500  32 105 white    smoker   0  no  no   0   0 1818
12   23 bwt <= 2500  19  91 white    smoker   2  no yes   0  12 1885
13   24 bwt <= 2500  25 115 other nonsmoker   0  no  no   0   3 1893
14   25 bwt <= 2500  16 130 other nonsmoker   0  no  no   1   4 1899
15   26 bwt <= 2500  25  92 white    smoker   0  no  no   0   4 1928
16   27 bwt <= 2500  20 150 white    smoker   0  no  no   2   5 1928
17   28 bwt <= 2500  21 200 black nonsmoker   0  no yes   2   4 1928
18   29 bwt <= 2500  24 155 white    smoker   1  no  no   0   6 1936
19   30 bwt <= 2500  21 103 other nonsmoker   0  no  no   0   5 1970
20   31 bwt <= 2500  20 125 other nonsmoker   0  no yes   0   2 2055
21   32 bwt <= 2500  25  89 other nonsmoker   2  no  no   1   4 2055
22   33 bwt <= 2500  19 102 white nonsmoker   0  no  no   2   3 2082
23   34 bwt <= 2500  19 112 white    smoker   0  no yes   0   4 2084
24   35 bwt <= 2500  26 117 white    smoker   1  no  no   0   7 2084
25   36 bwt <= 2500  24 138 white nonsmoker   0  no  no   0   1 2100
26   37 bwt <= 2500  17 130 other    smoker   1  no yes   0   9 2125
27   40 bwt <= 2500  20 120 black    smoker   0  no  no   3   6 2126
28   42 bwt <= 2500  22 130 white    smoker   1  no yes   1   4 2187
29   43 bwt <= 2500  27 130 black nonsmoker   0  no yes   0   6 2187
30   44 bwt <= 2500  20  80 other    smoker   0  no yes   0   6 2211
31   45 bwt <= 2500  17 110 white    smoker   0  no  no   0   5 2225
32   46 bwt <= 2500  25 105 other nonsmoker   1  no  no   1   5 2240
33   47 bwt <= 2500  20 109 other nonsmoker   0  no  no   0   5 2240
34   49 bwt <= 2500  18 148 other nonsmoker   0  no  no   0   3 2282
35   50 bwt <= 2500  18 110 black    smoker   1  no  no   0   4 2296
36   51 bwt <= 2500  20 121 white    smoker   1  no yes   0   4 2296
37   52 bwt <= 2500  21 100 other nonsmoker   1  no  no   4   0 2301
38   54 bwt <= 2500  26  96 other nonsmoker   0  no  no   0   6 2325
39   56 bwt <= 2500  31 102 white    smoker   1  no  no   1   5 2353
40   57 bwt <= 2500  15 110 white nonsmoker   0  no  no   0   3 2353
41   59 bwt <= 2500  23 187 black    smoker   0  no  no   1   5 2367
42   60 bwt <= 2500  20 122 black    smoker   0  no  no   0   4 2381
43   61 bwt <= 2500  24 105 black    smoker   0  no  no   0   3 2381
44   62 bwt <= 2500  15 115 other nonsmoker   0  no yes   0   4 2381
45   63 bwt <= 2500  23 120 other nonsmoker   0  no  no   0   2 2395
46   65 bwt <= 2500  30 142 white    smoker   1  no  no   0   4 2410
47   67 bwt <= 2500  22 130 white    smoker   0  no  no   1   2 2410
48   68 bwt <= 2500  17 120 white    smoker   0  no  no   3   6 2414
49   69 bwt <= 2500  23 110 white    smoker   1  no  no   0   9 2424
50   71 bwt <= 2500  17 120 black nonsmoker   0  no  no   2   6 2438
51   75 bwt <= 2500  26 154 other nonsmoker   1 yes  no   1  10 2442
52   76 bwt <= 2500  20 105 other nonsmoker   0  no  no   3   6 2450
53   77 bwt <= 2500  26 190 white    smoker   0  no  no   0   4 2466
54   78 bwt <= 2500  14 101 other    smoker   1  no  no   0   7 2466
55   79 bwt <= 2500  28  95 white    smoker   0  no  no   2   7 2466
56   81 bwt <= 2500  14 100 other nonsmoker   0  no  no   2   6 2495
57   82 bwt <= 2500  23  94 other    smoker   0  no  no   0   4 2495
58   83 bwt <= 2500  17 142 black nonsmoker   0 yes  no   0   2 2495
59   84 bwt <= 2500  21 130 white    smoker   0 yes  no   3   4 2495
60   85  bwt > 2500  19 182 black nonsmoker   0  no yes   0   4 2523
61   86  bwt > 2500  33 155 other nonsmoker   0  no  no   3   6 2551
62   87  bwt > 2500  20 105 white    smoker   0  no  no   1  10 2557
63   88  bwt > 2500  21 108 white    smoker   0  no yes   2  10 2594
64   89  bwt > 2500  18 107 white    smoker   0  no yes   0   2 2600
65   91  bwt > 2500  21 124 other nonsmoker   0  no  no   0   5 2622
66   92  bwt > 2500  22 118 white nonsmoker   0  no  no   1   1 2637
67   93  bwt > 2500  17 103 other nonsmoker   0  no  no   1   7 2637
68   94  bwt > 2500  29 123 white    smoker   0  no  no   1   4 2663
69   95  bwt > 2500  26 113 white    smoker   0  no  no   0   2 2665
70   96  bwt > 2500  19  95 other nonsmoker   0  no  no   0   4 2722
71   97  bwt > 2500  19 150 other nonsmoker   0  no  no   1   9 2733
72   98  bwt > 2500  22  95 other nonsmoker   0 yes  no   0  10 2750
73   99  bwt > 2500  30 107 other nonsmoker   1  no yes   2  17 2750
74  100  bwt > 2500  18 100 white    smoker   0  no  no   0   0 2769
75  101  bwt > 2500  18 100 white    smoker   0  no  no   0   0 2769
76  102  bwt > 2500  15  98 black nonsmoker   0  no  no   0   7 2778
77  103  bwt > 2500  25 118 white    smoker   0  no  no   3   7 2782
78  104  bwt > 2500  20 120 other nonsmoker   0  no yes   0   4 2807
79  105  bwt > 2500  28 120 white    smoker   0  no  no   1   6 2821
80  106  bwt > 2500  32 121 other nonsmoker   0  no  no   2  10 2835
81  107  bwt > 2500  31 100 white nonsmoker   0  no yes   3   4 2835
82  108  bwt > 2500  36 202 white nonsmoker   0  no  no   1   7 2836
83  109  bwt > 2500  28 120 other nonsmoker   0  no  no   0   8 2863
84  111  bwt > 2500  25 120 other nonsmoker   0  no yes   2  10 2877
85  112  bwt > 2500  28 167 white nonsmoker   0  no  no   0  12 2877
86  113  bwt > 2500  17 122 white    smoker   0  no  no   0   9 2906
87  114  bwt > 2500  29 150 white nonsmoker   0  no  no   2   4 2920
88  115  bwt > 2500  26 168 black    smoker   0  no  no   0   6 2920
89  116  bwt > 2500  17 113 black nonsmoker   0  no  no   1  12 2920
90  117  bwt > 2500  17 113 black nonsmoker   0  no  no   1  12 2920
91  118  bwt > 2500  24  90 white    smoker   1  no  no   1   1 2948
92  119  bwt > 2500  35 121 black    smoker   1  no  no   1  11 2948
93  120  bwt > 2500  25 155 white nonsmoker   0  no  no   1   5 2977
94  121  bwt > 2500  25 125 black nonsmoker   0  no  no   0   4 2977
95  123  bwt > 2500  29 140 white    smoker   0  no  no   2   7 2977
96  124  bwt > 2500  19 138 white    smoker   0  no  no   2   2 2977
97  125  bwt > 2500  27 124 white    smoker   0  no  no   0   3 2992
98  126  bwt > 2500  31 215 white    smoker   0  no  no   2  11 3005
99  127  bwt > 2500  33 109 white    smoker   0  no  no   1   6 3033
100 128  bwt > 2500  21 185 black    smoker   0  no  no   2   8 3042
101 129  bwt > 2500  19 189 white nonsmoker   0  no  no   2   4 3062
102 130  bwt > 2500  23 130 black nonsmoker   0  no  no   1   4 3062
103 131  bwt > 2500  21 160 white nonsmoker   0  no  no   0  11 3062
104 132  bwt > 2500  18  90 white    smoker   0  no yes   0   6 3076
105 133  bwt > 2500  18  90 white    smoker   0  no yes   0   6 3076
106 134  bwt > 2500  32 132 white nonsmoker   0  no  no   4   7 3080
107 135  bwt > 2500  19 132 other nonsmoker   0  no  no   0   3 3090
108 136  bwt > 2500  24 115 white nonsmoker   0  no  no   2   5 3090
109 137  bwt > 2500  22  85 other    smoker   0  no  no   0   5 3090
110 138  bwt > 2500  22 120 white nonsmoker   0 yes  no   1   3 3100
111 139  bwt > 2500  23 128 other nonsmoker   0  no  no   0   8 3104
112 140  bwt > 2500  22 130 white    smoker   0  no  no   0   4 3132
113 141  bwt > 2500  30  95 white    smoker   0  no  no   2   4 3147
114 142  bwt > 2500  19 115 other nonsmoker   0  no  no   0   7 3175
115 143  bwt > 2500  16 110 other nonsmoker   0  no  no   0   3 3175
116 144  bwt > 2500  21 110 other    smoker   0  no yes   0   7 3203
117 145  bwt > 2500  30 153 other nonsmoker   0  no  no   0   6 3203
118 146  bwt > 2500  20 103 other nonsmoker   0  no  no   0   5 3203
119 147  bwt > 2500  17 119 other nonsmoker   0  no  no   0   9 3225
120 148  bwt > 2500  17 119 other nonsmoker   0  no  no   0   9 3225
121 149  bwt > 2500  23 119 other nonsmoker   0  no  no   2   5 3232
122 150  bwt > 2500  24 110 other nonsmoker   0  no  no   0   6 3232
123 151  bwt > 2500  28 140 white nonsmoker   0  no  no   0   4 3234
124 154  bwt > 2500  26 133 other    smoker   2  no  no   0   3 3260
125 155  bwt > 2500  20 169 other nonsmoker   1  no yes   1   8 3274
126 156  bwt > 2500  24 115 other nonsmoker   0  no  no   2  11 3274
127 159  bwt > 2500  28 250 other    smoker   0  no  no   6  13 3303
128 160  bwt > 2500  20 141 white nonsmoker   2  no yes   1   7 3317
129 161  bwt > 2500  22 158 black nonsmoker   1  no  no   2   5 3317
130 162  bwt > 2500  22 112 white    smoker   2  no  no   0   7 3317
131 163  bwt > 2500  31 150 other    smoker   0  no  no   2   7 3321
132 164  bwt > 2500  23 115 other    smoker   0  no  no   1  10 3331
133 166  bwt > 2500  16 112 black nonsmoker   0  no  no   0  11 3374
134 167  bwt > 2500  16 135 white    smoker   0  no  no   0   3 3374
135 168  bwt > 2500  18 229 black nonsmoker   0  no  no   0   6 3402
136 169  bwt > 2500  25 140 white nonsmoker   0  no  no   1   8 3416
137 170  bwt > 2500  32 134 white    smoker   1  no  no   4   7 3430
138 172  bwt > 2500  20 121 black    smoker   0  no  no   0   6 3444
139 173  bwt > 2500  23 190 white nonsmoker   0  no  no   0   3 3459
140 174  bwt > 2500  22 131 white nonsmoker   0  no  no   1   7 3460
141 175  bwt > 2500  32 170 white nonsmoker   0  no  no   0   4 3473
142 176  bwt > 2500  30 110 other nonsmoker   0  no  no   0   8 3475
143 177  bwt > 2500  20 127 other nonsmoker   0  no  no   0   3 3487
144 179  bwt > 2500  23 123 other nonsmoker   0  no  no   0  10 3544
145 180  bwt > 2500  17 120 other    smoker   0  no  no   0   7 3572
146 181  bwt > 2500  19 105 other nonsmoker   0  no  no   0   8 3572
147 182  bwt > 2500  23 130 white nonsmoker   0  no  no   0   4 3586
148 183  bwt > 2500  36 175 white nonsmoker   0  no  no   0  12 3600
149 184  bwt > 2500  22 125 white nonsmoker   0  no  no   1  13 3614
150 185  bwt > 2500  24 133 white nonsmoker   0  no  no   0   7 3614
151 186  bwt > 2500  21 134 other nonsmoker   0  no  no   2   8 3629
152 187  bwt > 2500  19 235 white    smoker   0 yes  no   0   5 3629
153 188  bwt > 2500  25  95 white    smoker   3  no yes   0   8 3637
154 189  bwt > 2500  16 135 white    smoker   0  no  no   0   2 3643
155 190  bwt > 2500  29 135 white nonsmoker   0  no  no   1   4 3651
156 191  bwt > 2500  29 154 white nonsmoker   0  no  no   1   5 3651
157 192  bwt > 2500  19 147 white    smoker   0  no  no   0   4 3651
158 193  bwt > 2500  19 147 white    smoker   0  no  no   0   4 3651
159 195  bwt > 2500  30 137 white nonsmoker   0  no  no   1   5 3699
160 196  bwt > 2500  24 110 white nonsmoker   0  no  no   1   8 3728
161 197  bwt > 2500  19 184 white    smoker   0 yes  no   0   7 3756
162 199  bwt > 2500  24 110 other nonsmoker   1  no  no   0  10 3770
163 200  bwt > 2500  23 110 white nonsmoker   0  no  no   1   4 3770
164 201  bwt > 2500  20 120 other nonsmoker   0  no  no   0   2 3770
165 202  bwt > 2500  25 241 black nonsmoker   0 yes  no   0  10 3790
166 203  bwt > 2500  30 112 white nonsmoker   0  no  no   1   5 3799
167 204  bwt > 2500  22 169 white nonsmoker   0  no  no   0   7 3827
168 205  bwt > 2500  18 120 white    smoker   0  no  no   2   6 3856
169 206  bwt > 2500  16 170 black nonsmoker   0  no  no   4   8 3860
170 207  bwt > 2500  32 186 white nonsmoker   0  no  no   2   6 3860
171 208  bwt > 2500  18 120 other nonsmoker   0  no  no   1  13 3884
172 209  bwt > 2500  29 130 white    smoker   0  no  no   2   8 3884
173 210  bwt > 2500  33 117 white nonsmoker   0  no yes   1   2 3912
174 211  bwt > 2500  20 170 white    smoker   0  no  no   0   4 3940
175 212  bwt > 2500  28 134 other nonsmoker   0  no  no   1   8 3941
176 213  bwt > 2500  14 135 white nonsmoker   0  no  no   0   8 3941
177 214  bwt > 2500  28 130 other nonsmoker   0  no  no   0   8 3969
178 215  bwt > 2500  25 120 white nonsmoker   0  no  no   2   7 3983
179 216  bwt > 2500  16  95 other nonsmoker   0  no  no   1  10 3997
180 217  bwt > 2500  20 158 white nonsmoker   0  no  no   1   6 3997
181 218  bwt > 2500  26 160 other nonsmoker   0  no  no   0   9 4054
182 219  bwt > 2500  21 115 white nonsmoker   0  no  no   1   5 4054
183 220  bwt > 2500  22 129 white nonsmoker   0  no  no   0   4 4111
184 221  bwt > 2500  25 130 white nonsmoker   0  no  no   2   9 4153
185 222  bwt > 2500  31 120 white nonsmoker   0  no  no   2   7 4167
186 223  bwt > 2500  35 170 white nonsmoker   1  no  no   1   6 4174
187 224  bwt > 2500  19 120 white    smoker   0  no  no   0   3 4238
188 225  bwt > 2500  24 116 white nonsmoker   0  no  no   1   7 4593
189 226  bwt > 2500  45 123 white nonsmoker   0  no  no   1   5 4990