- Published on
DS 4100 Day 3
- Authors
- Name
- Jacob Aronoff
DS 4100 Data Collection, Integration, and Analysis
Day 3, we're going over basic control flow:
> for (i in 1:3) {
print(paste("i =",i))
}
[1] "i = 1"
[1] "i = 2"
[1] "i = 3"
> i
[1] 3
cities <- c("Boston", "New York", "San Francisco")
for(city in cities) {
print(city)
}
You can also use for loops to iterate data frames.
# create a data frame
c1<-c("AA","BB","CC")
c2<-c(11,22,33)
df<-data.frame(c1,c2)
# display data frame
df
# len() returns the number of columns
len(df)
# nrow() returns the number of rows
nrow(df)
traverseDF <- function() {
n<-nrow(df)
for (i in 1:n) {
# access column 2 in row i
print(df[i,2])
# or this way
print(df$c2[i])
}
}
Now we're talking a bit more about enviroments which I went over in my most recent Learning R. Now we're on to more column manipulation:
> ap$Total <- cbind(rowSums(ap))
> ap
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 Total
1 1952 171 180 193 181 183 218 230 242 209 191 172 194 4316
2 1953 196 196 236 235 229 243 264 272 237 211 180 201 4653
3 1954 204 188 235 227 234 264 302 293 259 229 203 229 4821
>
Now that's pretty cool, you can add a new column which is just the sum. Just like excel, but with code <3
. Here's some really nice syntax. R redeems itself in regards to it's data management. Now onto R's naming conventions and data types:
- R usually uses lower camel case
- Modes are weird (their sort of like thunks almost)
- Factors are more efficient
And thats it! Next class is going to be more on factors, sadly I won't be there :( but I'll have my friend tell me what happened.