
Summary Statistics
collapse provides the following functions to efficiently summarize and examine data:
- qsu, shorthand for quick-summary, is an extremely fast summary command inspired by the (xt)summarize command in the Stata statistical software. It computes a set of 7 statistics (nobs, mean, sd, min, max, skewness and kurtosis) using a numerically stable one-pass method. Statistics can be computed weighted, by groups, and also within and between entities (for multilevel / panel data).
- qtab, shorthand for quick-table, is a faster and more versatile alternative to table. Notably, it also supports tabulations with frequency weights, as well as computing a statistic over combinations of variables. Objects returned by qtab inherit the 'table' class, so 'table' methods can be applied to them directly.
- descr computes a concise and detailed description of a data frame, including (sorted) frequency tables for categorical variables and various statistics and quantiles for numeric variables. It is inspired by Hmisc::describe, but about 10x faster.
- pwcor, pwcov and pwnobs compute (weighted) pairwise correlations, covariances and observation counts on matrices and data frames. Pairwise correlations and covariances can be computed together with observation counts and p-values. The elaborate print method displays all of these statistics in a single correlation table.
- varying very efficiently checks for the presence of any variation in data, optionally within groups (such as panel identifiers). A variable is considered varying if it has at least 2 distinct non-missing data points.
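The summary functions above can be tried on data that ships with R and with collapse. The following is a minimal sketch, not an exhaustive tour: it assumes collapse is installed, uses the built-in `mtcars` and `iris` data and the `wlddev` panel bundled with collapse, and treats `mtcars$wt` as a weight purely for illustration.

```r
library(collapse)

# --- qsu: quick summary ------------------------------------------------
# Default call: one row per column with N, Mean, SD, Min and Max
s <- qsu(mtcars)
print(s)

# higher = TRUE additionally reports skewness and kurtosis
s_hi <- qsu(mtcars, higher = TRUE)

# Grouped and weighted: mpg by number of cylinders
# (wt is used as a weight for illustration only)
qsu(mtcars$mpg, g = mtcars$cyl, w = mtcars$wt)

# Panel decomposition: overall, between-country and within-country
# statistics for GDP per capita in the wlddev data
qsu(wlddev$PCGDP, pid = wlddev$iso3c)

# --- qtab: quick table -------------------------------------------------
# Plain cross tabulation of cylinders by transmission type
tab <- with(mtcars, qtab(cyl, am))
print(tab)

# The result inherits from 'table', so table methods such as
# prop.table() and addmargins() apply directly
prop.table(tab, 1)
addmargins(tab)

# Weighted tabulation: sums the weights in each cell
with(mtcars, qtab(cyl, am, w = wt))

# A statistic over combinations of variables: mean mpg per cell
with(mtcars, qtab(cyl, am, w = mpg, wFUN = fmean))

# --- descr: detailed description ----------------------------------------
# Frequency table for the factor Species, statistics and quantiles
# for the numeric columns
d <- descr(iris)
print(d)
```

Note that skewness and kurtosis are only returned by `qsu` when `higher = TRUE` is set; the default call reports the first five statistics.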
Table of Functions
| Function / S3 Generic | Methods | Description |
| --- | --- | --- |
| qsu | default, matrix, data.frame, grouped_df, pseries, pdata.frame, sf | Fast (grouped, weighted, panel-decomposed) summary statistics |
| qtab | No methods, for data frames or vectors | Fast (weighted) cross tabulation |
| descr | default, grouped_df (default method handles most objects) | Detailed statistical description of data frame |
| pwcor | No methods, for matrices or data frames | Pairwise (weighted) correlations |
| pwcov | No methods, for matrices or data frames | Pairwise (weighted) covariances |
| pwnobs | No methods, for matrices or data frames | Pairwise observation counts |
| varying | default, matrix, data.frame, pseries, pdata.frame, grouped_df | Fast variation check |
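The pairwise functions and `varying` from the table can be exercised on both data frames and matrices. The sketch below assumes collapse is installed; it uses the built-in `airquality` data (which contains missing values) and a small hypothetical data frame `d` made up for this example.

```r
library(collapse)

# --- pwnobs / pwcor / pwcov ---------------------------------------------
# airquality has missing values in Ozone and Solar.R
X <- airquality[, c("Ozone", "Solar.R", "Wind", "Temp")]

# Number of jointly non-missing observations for each pair of columns
n <- pwnobs(X)
print(n)

# Pairwise correlations; N = TRUE and P = TRUE add observation counts
# and p-values, which the print method shows in one table
pwcor(X)
pwcor(X, N = TRUE, P = TRUE)

# Matrices are accepted as well
pwcov(as.matrix(X))

# --- varying --------------------------------------------------------------
# Hypothetical toy panel: 'const' is fixed within each id, 'x' changes
# within id 1, and 'k' never changes at all
d <- data.frame(id    = c(1, 1, 2, 2),
                const = c(5, 5, 7, 7),
                x     = c(1, 2, 3, 3),
                k     = 1)

# Overall variation, one logical value per column
v_all <- varying(d)
print(v_all)

# Variation within groups, using the default (vector) method
v_x     <- varying(d$x, g = d$id)      # TRUE: x varies within id 1
v_const <- varying(d$const, g = d$id)  # FALSE: constant within each id

# Data frame method with a one-sided formula naming the grouping column
v_by <- varying(d, by = ~ id)
print(v_by)
```

With the default `any_group = TRUE`, a variable counts as varying as soon as it varies within at least one group, which is why `x` is reported as varying even though it is constant within id 2.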