
Fast Data Manipulation
collapse provides the following functions for fast manipulation of (mostly) data frames.
- fselect is a much faster alternative to dplyr::select to select columns using expressions involving column names. get_vars is a more versatile and programmer-friendly function to efficiently select and replace columns by names, indices, logical vectors, regular expressions, or using functions to identify columns.
- The functions num_vars, cat_vars, char_vars, fact_vars, logi_vars and date_vars are convenience functions to efficiently select and replace columns by data type.
- add_vars efficiently adds new columns at any position within a data frame (default at the end). This can be done via replacement (i.e. add_vars(data) <- newdata) or by returning the appended data (i.e. add_vars(data, newdata1, newdata2, ...)). Because of the latter, add_vars is also a more efficient alternative to cbind.data.frame.
- rowbind efficiently combines data frames / lists row-wise. The implementation is derived from data.table::rbindlist; it is also a fast alternative to rbind.data.frame.
- join provides fast, class-agnostic and verbose table joins.
- pivot efficiently reshapes data, supporting longer, wider and recast pivoting, as well as multi-column pivots and taking along variable labels.
- fsubset is a much faster version of subset to efficiently subset vectors, matrices and data frames. If the non-standard evaluation offered by fsubset is not needed, the function ss is a much faster and also more secure alternative to [.data.frame.
- fsummarise is a much faster version of dplyr::summarise when used together with the Fast Statistical Functions and fgroup_by, with which it also supports super fast weighted aggregation.
- fmutate is a much faster version of dplyr::mutate when used together with the Fast Statistical Functions as well as fast Data Transformation Functions and fgroup_by.
- ftransform is a much faster version of transform, which also supports list input and nested pipelines. settransform does all of that by reference, i.e. it modifies the data frame in the global environment. fcompute is similar to ftransform but only returns modified and computed columns in a new data frame.
- roworder is a fast substitute for dplyr::arrange, but the syntax is inspired by data.table::setorder.
- colorder efficiently reorders columns in a data frame, see also data.table::setcolorder.
- frename is a fast substitute for dplyr::rename, to efficiently rename various objects. setrename renames objects by reference. relabel and setrelabel do the same thing for variable labels (see also vlabels).
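
As a rough illustration of several of the functions described above, the following sketch applies them to the built-in mtcars and iris datasets. The derived column names (hp_per_wt, mean_mpg, wmean_mpg, mpg_centered, miles_per_gallon) are made up for this example, and the calls show only one plausible usage of each function rather than its full interface.

```r
library(collapse)

# --- Selecting columns ---
sel      <- fselect(mtcars, mpg, cyl, hp)            # by expression (non-standard evaluation)
by_name  <- get_vars(mtcars, c("mpg", "wt"))         # by character vector
by_regex <- get_vars(iris, "^Sepal", regex = TRUE)   # by regular expression
nums     <- num_vars(iris)                           # by type: numeric columns
facs     <- fact_vars(iris)                          # by type: factor columns

# --- Subsetting rows ---
sub    <- fsubset(mtcars, cyl == 4 & mpg > 30, mpg, cyl, hp)  # rows by expression, plus columns
sub_ss <- ss(mtcars, 1:5, c("mpg", "cyl"))                    # standard evaluation: rows, columns

# --- Computing columns ---
tr   <- ftransform(mtcars, hp_per_wt = hp / wt)   # full data plus the new column
comp <- fcompute(mtcars, hp_per_wt = hp / wt)     # only the computed column

# --- Adding columns ---
d <- mtcars
add_vars(d) <- comp              # replacement form, appends at the end
d2 <- add_vars(mtcars, comp)     # functional form, comparable to cbind

# --- Grouped aggregation and transformation ---
g   <- fgroup_by(mtcars, cyl)
agg <- fsummarise(g,
                  mean_mpg  = fmean(mpg),            # simple group mean
                  wmean_mpg = fmean(mpg, w = wt))    # mean weighted by wt
dm  <- fungroup(fmutate(g, mpg_centered = fwithin(mpg)))  # group-wise centering

# --- Reordering and renaming ---
ord <- roworder(mtcars, cyl, -mpg)               # ascending cyl, descending mpg
co  <- colorder(mtcars, hp, cyl)                 # move hp and cyl to the front
rn  <- frename(mtcars, mpg = miles_per_gallon)   # old name = new name
```

Note that the weights are passed by name (w = wt), because the second positional argument of the Fast Statistical Functions is the grouping argument g.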
Table of Functions
| Function / S3 Generic | Methods | Description |
| --- | --- | --- |
| fselect(<-) | No methods, for data frames | Fast select or replace columns (non-standard evaluation) |
| get_vars(<-), num_vars(<-), cat_vars(<-), char_vars(<-), fact_vars(<-), logi_vars(<-), date_vars(<-) | No methods, for data frames | Fast select or replace columns |
| add_vars(<-) | No methods, for data frames | Fast add columns |
| rowbind | No methods, for lists of lists/data frames | Fast row-binding lists |
| join | No methods, for data frames | Fast table joins |
| pivot | No methods, for data frames | Fast reshaping |
| fsubset | default, matrix, data.frame, pseries, pdata.frame | Fast subset data (non-standard evaluation) |
| ss | No methods, for data frames | Fast subset data frames |
| fsummarise | No methods, for data frames | Fast data aggregation |
| fmutate, (f/set)transform(<-) | No methods, for data frames | Compute, modify or delete columns (non-standard evaluation) |
| fcompute(v) | No methods, for data frames | Compute or modify columns, returned in a new data frame (non-standard evaluation) |
| roworder(v) | No methods, for data frames incl. pdata.frame | Reorder rows and return data frame (standard and non-standard evaluation) |
| colorder(v) | No methods, for data frames | Reorder columns and return data frame (standard and non-standard evaluation) |
| (f/set)rename, (set)relabel | No methods, for all objects with 'names' attribute | Rename and return object / relabel columns in a data frame |
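
The combining and reshaping entries in the table (rowbind, join, pivot) can be sketched on two small made-up tables. The data frames x and y and their columns are hypothetical, and the argument choices below are one possible usage under the assumption of a recent collapse version that includes these functions.

```r
library(collapse)

# Two small, hypothetical tables sharing an 'id' key
x <- data.frame(id = c(1, 2, 3), a = c(0.5, 1.5, 2.5))
y <- data.frame(id = c(2, 3, 4), b = c(10, 20, 30))

# Row-bind data frames that have the same columns
stacked <- rowbind(x, x)

# Left and inner joins on 'id'; verbose = 0 silences the printed join summary
left  <- join(x, y, on = "id", how = "left",  verbose = 0)
inner <- join(x, y, on = "id", how = "inner", verbose = 0)

# Reshape to long format: one row per id and variable
long <- pivot(inner, ids = "id", values = c("a", "b"), how = "longer")

# Reshape back to wide format, using the default long-format column names
wide <- pivot(long, ids = "id", names = "variable", values = "value",
              how = "wider")
```

In the left join, ids present only in x keep a missing value in b, while the inner join retains only the ids found in both tables.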