There are many ways to extract columns from a data frame in R. In this article, we will discuss two scenarios, extraction by column name and extraction by column index, with their corresponding methods.
First, let’s create a data frame to work with.
market <- data.frame(
  market_id = c(1, 2, 3, 4),
  market_name = c('M1', 'M2', 'M3', 'M4'),
  market_place = c('India', 'USA', 'India', 'Australia'),
  market_type = c('grocery', 'bar', 'grocery', 'restaurant'),
  market_squarefeet = c(120, 342, 220, 110)
)
#display the market dataframe
print(market)
Result:
You can see the market data frame here:
Let’s discuss the two scenarios one by one.
Scenario 1: Extract Columns From the Data Frame by Column Name
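In this scenario, we will see different methods to extract one or more columns from a data frame using column names. The $ operator returns the values of a single column as a vector, while the other methods, when given several column names, return a data frame containing the selected columns.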
Method 1: $ Operator
The $ operator accesses the data in a single data frame column by name.
Syntax: dataframe_object$column
Where,
- The dataframe_object is the data frame.
- The column is the name of the column to be retrieved.
Example
In this example, we will extract market_name and market_type columns separately.
market <- data.frame(
  market_id = c(1, 2, 3, 4),
  market_name = c('M1', 'M2', 'M3', 'M4'),
  market_place = c('India', 'USA', 'India', 'Australia'),
  market_type = c('grocery', 'bar', 'grocery', 'restaurant'),
  market_squarefeet = c(120, 342, 220, 110)
)
#extract market_name column
print(market$market_name)
#extract market_type column
print(market$market_type)
Result:
We can see that the values present in market_name and market_type were returned.
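As a quick check that $ returns a plain vector rather than a data frame, the following minimal sketch inspects the result with base R functions. The two-column data frame is an abbreviated version of market used only for illustration, and stringsAsFactors = FALSE is set explicitly as an assumption so that the text column stays a character vector on older R versions as well.

```r
# Abbreviated version of the market data frame, for illustration only
market <- data.frame(
  market_id = c(1, 2, 3, 4),
  market_name = c('M1', 'M2', 'M3', 'M4'),
  stringsAsFactors = FALSE
)

# The $ operator returns the column's values as a plain vector
names_vec <- market$market_name

print(is.vector(names_vec))      # TRUE
print(is.data.frame(names_vec))  # FALSE
print(length(names_vec))         # 4, one value per row
```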
Method 2: Specifying Column Names in a Vector
Here, we specify the names of the columns to extract inside a vector and pass it in square brackets after the data frame.
Syntax: dataframe_object[ , c(column, ...)]
Where,
- The dataframe_object is the data frame.
- The column is the name of the column(s) to be retrieved.
Example
In this example, we will extract the “market_id”, “market_squarefeet”, and “market_place” columns at once.
market <- data.frame(
  market_id = c(1, 2, 3, 4),
  market_name = c('M1', 'M2', 'M3', 'M4'),
  market_place = c('India', 'USA', 'India', 'Australia'),
  market_type = c('grocery', 'bar', 'grocery', 'restaurant'),
  market_squarefeet = c(120, 342, 220, 110)
)
#extract columns - "market_id","market_squarefeet" and "market_place"
print(market[ , c("market_id", "market_squarefeet", "market_place")])
Result:
We can see that the columns: “market_id”, “market_squarefeet”, and “market_place” were returned.
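One detail worth checking with the square-bracket syntax: with several names the result is a data frame, but with a single name base R simplifies the result to a vector unless drop = FALSE is passed. A minimal sketch, using an abbreviated data frame for illustration (the variable names multi, single_vec, and single_df are hypothetical):

```r
# Abbreviated version of the market data frame, for illustration only
market <- data.frame(
  market_id = c(1, 2, 3, 4),
  market_place = c('India', 'USA', 'India', 'Australia'),
  market_squarefeet = c(120, 342, 220, 110)
)

# Several names: the result is a data frame
multi <- market[ , c("market_id", "market_squarefeet")]
print(class(multi))       # "data.frame"

# One name: the result is simplified to a vector
single_vec <- market[ , "market_id"]
print(class(single_vec))  # "numeric"

# One name with drop = FALSE: the result stays a data frame
single_df <- market[ , "market_id", drop = FALSE]
print(class(single_df))   # "data.frame"
```

Passing drop = FALSE is a reasonable default when later code expects a data frame regardless of how many columns were requested.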
Method 3: subset() With the select Argument
In this case, we use subset() with its select argument to extract columns from the data frame. The first argument is the data frame object, and the select argument takes a vector of the column names to keep.
Syntax: subset(dataframe_object, select = c(column, ...))
Parameters:
- The dataframe_object is the data frame.
- The column is the name of the column(s) to be retrieved via the select argument.
Example
In this example, we will extract the “market_id”, “market_squarefeet”, and “market_place” columns at once using subset() with the select argument.
market <- data.frame(
  market_id = c(1, 2, 3, 4),
  market_name = c('M1', 'M2', 'M3', 'M4'),
  market_place = c('India', 'USA', 'India', 'Australia'),
  market_type = c('grocery', 'bar', 'grocery', 'restaurant'),
  market_squarefeet = c(120, 342, 220, 110)
)
#extract columns -"market_id","market_squarefeet" and "market_place"
print(subset(market, select = c("market_id", "market_squarefeet", "market_place")))
Result:
We can see that the columns: “market_id”, “market_squarefeet”, and “market_place” were returned.
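The select argument of subset() also accepts unquoted column names, and a minus sign excludes a column. The sketch below illustrates both forms on an abbreviated data frame; the variable names by_name and without_name are hypothetical.

```r
# Abbreviated version of the market data frame, for illustration only
market <- data.frame(
  market_id = c(1, 2, 3, 4),
  market_name = c('M1', 'M2', 'M3', 'M4'),
  market_place = c('India', 'USA', 'India', 'Australia')
)

# Unquoted column names work inside select
by_name <- subset(market, select = c(market_id, market_place))
print(names(by_name))

# A minus sign drops the named column and keeps the rest
without_name <- subset(market, select = -market_name)
print(names(without_name))
```

Both calls keep the market_id and market_place columns here.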
Method 4: select()
The select() function takes the names of the columns to extract, and the data frame is passed to it with the “%>%” operator. The select() function is available in the dplyr library, so we need to load this library first.
Syntax: dataframe_object %>% select(column, ...)
Parameters:
- The dataframe_object is the data frame.
- The column is the name of the column(s) to be retrieved.
Example
In this example, we will extract the “market_id”, “market_squarefeet”, and “market_place” columns at once using the select() function.
#load the dplyr library, which provides select() and the %>% operator
library(dplyr)
#create a dataframe-market that has 4 rows and 5 columns.
market <- data.frame(
  market_id = c(1, 2, 3, 4),
  market_name = c('M1', 'M2', 'M3', 'M4'),
  market_place = c('India', 'USA', 'India', 'Australia'),
  market_type = c('grocery', 'bar', 'grocery', 'restaurant'),
  market_squarefeet = c(120, 342, 220, 110)
)
#extract columns - "market_id","market_squarefeet", and "market_place"
print(market %>% select("market_id", "market_squarefeet", "market_place"))
Result:
We can see that the columns: “market_id”, “market_squarefeet”, and “market_place” were returned.
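When the column names are already stored in a character vector, one possible approach is to wrap that vector in all_of() inside select(). This sketch assumes a dplyr version that exports all_of() (1.0.0 or later) is installed; the variable names wanted and picked are hypothetical.

```r
# Assumes the dplyr package is installed, e.g. via install.packages("dplyr")
library(dplyr)

market <- data.frame(
  market_id = c(1, 2, 3, 4),
  market_name = c('M1', 'M2', 'M3', 'M4'),
  market_place = c('India', 'USA', 'India', 'Australia'),
  market_squarefeet = c(120, 342, 220, 110)
)

# Hypothetical vector holding the names of the wanted columns
wanted <- c("market_id", "market_squarefeet", "market_place")

# all_of() tells select() to treat the character vector as column names
picked <- market %>% select(all_of(wanted))
print(names(picked))
```

The columns come back in the order given in the vector, not in their original order in the data frame.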
Scenario 2: Extract Columns From Data Frame by Column Indices
In this scenario, we will see different methods to extract one or more columns from a data frame using column indices. When several indices are given, each method returns a data frame containing the selected columns. Column indices start at 1.
Method 1: Specifying Column Indices in a Vector
Here, we specify the indices of the columns to extract inside a vector and pass it in square brackets after the data frame.
Syntax: dataframe_object[ , c(index, ...)]
Where,
- The dataframe_object is the data frame.
- The index represents the column/s position to be retrieved.
Example
In this example, we will extract the “market_id”, “market_squarefeet”, and “market_place” columns at once.
market <- data.frame(
  market_id = c(1, 2, 3, 4),
  market_name = c('M1', 'M2', 'M3', 'M4'),
  market_place = c('India', 'USA', 'India', 'Australia'),
  market_type = c('grocery', 'bar', 'grocery', 'restaurant'),
  market_squarefeet = c(120, 342, 220, 110)
)
#extract columns - "market_id","market_squarefeet" and "market_place" using column indices
print(market[ , c(1, 5, 3)])
Result:
We can see that the columns: “market_id”, “market_squarefeet”, and “market_place” were returned.
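Because indices depend on column order, it can help to confirm which names sit at which positions before extracting. The following sketch uses the base R functions names() and match() for that purpose; the variable name idx is hypothetical.

```r
market <- data.frame(
  market_id = c(1, 2, 3, 4),
  market_name = c('M1', 'M2', 'M3', 'M4'),
  market_place = c('India', 'USA', 'India', 'Australia'),
  market_type = c('grocery', 'bar', 'grocery', 'restaurant'),
  market_squarefeet = c(120, 342, 220, 110)
)

# Look up which column names sit at positions 1, 5, and 3
print(names(market)[c(1, 5, 3)])

# Go the other way: find the positions of the given column names
idx <- match(c("market_id", "market_squarefeet", "market_place"), names(market))
print(idx)  # 1 5 3
```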
Method 2: subset() With the select Argument
In this case, we use subset() with its select argument to extract columns from the data frame by column index. The first argument is the data frame object, and the select argument takes a vector of the column indices to keep.
Syntax: subset(dataframe_object, select = c(index, ...))
Parameters:
- The dataframe_object is the data frame.
- The index represents the position of the column(s) to be retrieved.
Example
In this example, we will extract the “market_id”, “market_squarefeet”, and “market_place” columns at once using subset() with the select argument.
market <- data.frame(
  market_id = c(1, 2, 3, 4),
  market_name = c('M1', 'M2', 'M3', 'M4'),
  market_place = c('India', 'USA', 'India', 'Australia'),
  market_type = c('grocery', 'bar', 'grocery', 'restaurant'),
  market_squarefeet = c(120, 342, 220, 110)
)
#extract columns - "market_id","market_squarefeet" and "market_place" using column indices
print(subset(market, select = c(1, 5, 3)))
Result:
We can see that the columns: “market_id”, “market_squarefeet”, and “market_place” were returned.
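Index vectors passed to select can also be ranges or negative values. A short sketch, assuming the same five-column layout as the market data frame above (the variable names first_three and without_second are hypothetical):

```r
market <- data.frame(
  market_id = c(1, 2, 3, 4),
  market_name = c('M1', 'M2', 'M3', 'M4'),
  market_place = c('India', 'USA', 'India', 'Australia'),
  market_type = c('grocery', 'bar', 'grocery', 'restaurant'),
  market_squarefeet = c(120, 342, 220, 110)
)

# A range keeps a run of adjacent columns (positions 1 to 3)
first_three <- subset(market, select = 1:3)
print(names(first_three))

# A negative index drops that position and keeps everything else
without_second <- subset(market, select = -2)
print(names(without_second))
```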
Method 3: select()
The select() function takes the indices of the columns to extract, and the data frame is passed to it with the “%>%” operator. The select() function is available in the dplyr library, so we need to load this library first.
Syntax: dataframe_object %>% select(index, ...)
Parameters:
- The dataframe_object is the data frame.
- The index represents the position of the column(s) to be retrieved.
Example
In this example, we will extract the “market_id”, “market_squarefeet”, and “market_place” columns at once using the select() function.
#load the dplyr library, which provides select() and the %>% operator
library(dplyr)
#create a dataframe-market that has 4 rows and 5 columns.
market <- data.frame(
  market_id = c(1, 2, 3, 4),
  market_name = c('M1', 'M2', 'M3', 'M4'),
  market_place = c('India', 'USA', 'India', 'Australia'),
  market_type = c('grocery', 'bar', 'grocery', 'restaurant'),
  market_squarefeet = c(120, 342, 220, 110)
)
#extract columns - "market_id","market_squarefeet" and "market_place" using column indices
print(market %>% select(1, 5, 3))
Result:
We can see that the columns: “market_id”, “market_squarefeet”, and “market_place” were returned.
Conclusion
This article discussed how to extract columns by column name and by column index using square brackets with a vector, subset() with the select argument, and select() from the dplyr library. To extract a single column by name, simply use the “$” operator.