
biomaRt - Interface to BioMart databases (i.e. Ensembl)
In recent years a wealth of biological data has become available in public data repositories. Easy access to these valuable data resources and firm integration with data analysis is needed for comprehensive bioinformatics data analysis. biomaRt provides an interface to a growing collection of databases implementing the BioMart software suite (<https://www.ensembl.org/info/data/biomart/index.html>). The package enables retrieval of large amounts of data in a uniform way without the need to know the underlying database schemas or write complex SQL queries. The most prominent examples of BioMart databases are maintained by Ensembl, which provides biomaRt users direct access to a diverse set of data and enables a wide range of powerful online queries from gene annotation to database mining.
annotation, bioconductor, biomart, ensembl
16.50 score 49 stars 214 dependents 17k scripts 41k downloads
rhdf5 - R Interface to HDF5
This package provides an interface between HDF5 and R. HDF5's main features are the ability to store and access very large and/or complex datasets and a wide variety of metadata on mass storage (disk) through a completely portable file format. The rhdf5 package is thus suited for the exchange of large and/or complex datasets between R and other software packages, and for letting R applications work on datasets that are larger than the available RAM.
infrastructure, dataimport, hdf5, rhdf5, curl, openssl, cpp
15.12 score 73 stars 230 dependents 6.3k scripts
scoringutils - Utilities for Scoring and Assessing Predictions
Facilitates the evaluation of forecasts in a convenient framework based on data.table. It allows users to check their forecasts and diagnose issues, to visualise forecasts and missing data, to transform data before scoring, to handle missing forecasts, to aggregate scores, and to visualise the results of the evaluation. The package mostly focuses on the evaluation of probabilistic forecasts and allows evaluating several different forecast types and input formats. Find more information about the package in the Vignettes as well as in the accompanying paper, <doi:10.48550/arXiv.2205.07090>.
forecast-evaluation, forecasting
11.17 score 61 stars 7 dependents 445 scripts 1.1k downloads
socialmixr - Social Mixing Matrices for Infectious Disease Modelling
Methods for sampling contact matrices from diary data for use in infectious disease modelling, as discussed in Mossong et al. (2008) <doi:10.1371/journal.pmed.0050074>.
10.90 score 44 stars 2 dependents 376 scripts 1.1k downloads
Rarr - Read Zarr Files in R
The Zarr specification defines a format for chunked, compressed, N-dimensional arrays. Its design allows efficient access to subsets of the stored array, and supports both local and cloud storage systems. Rarr aims to implement this specification in R with minimal reliance on external tools or libraries.
dataimport, bioconductor, ome-ngff, ome-zarr, on-disk, out-of-memory, zarr, c-blosc, libzstd
10.65 score 53 stars 7 dependents 92 scripts
pavo - Perceptual Analysis, Visualization and Organization of Spectral Colour Data
A cohesive framework for the spectral and spatial analysis of colour described in Maia, Eliason, Bitton, Doucet & Shawkey (2013) <doi:10.1111/2041-210X.12069> and Maia, Gruson, Endler & White (2019) <doi:10.1111/2041-210X.13174>.
10.02 score 81 stars 2 dependents 172 scripts 857 downloads
Rhdf5lib - hdf5 library as an R package
Provides C and C++ hdf5 libraries.
infrastructure, bioconductor, hdf5, hdf5-library
9.84 score 7 stars 346 dependents 29 scripts
epinowcast - A Bayesian Framework for Real-time Infectious Disease Surveillance
A modular Bayesian framework for real-time infectious disease surveillance. Provides tools for nowcasting, reproduction number estimation, delay estimation, and forecasting from data subject to reporting delays, right-truncation, missing data, and incomplete ascertainment. Users can build models suited to their setting using a flexible formula interface supporting fixed effects, random effects, random walks, and time-varying parameters, with options including parametric and non-parametric delay distributions with optional modifiers (via discrete-time hazard models), renewal processes, observation models, missing data imputation, and stratified analyses with partial pooling. By jointly estimating disease dynamics and reporting patterns, our framework enables earlier and more reliable detection of trends. While designed with epidemiological applications in mind, the framework can be applied to any right-truncated time series count data.
cmdstanr, effective-reproduction-number-estimation, epidemiology, infectious-disease-surveillance, nowcasting, outbreak-analysis, pandemic-preparedness, real-time-infectious-disease-modelling, stan
9.28 score 65 stars 98 scripts
Rarr - Read Zarr Files in R
The Zarr specification defines a format for chunked, compressed, N-dimensional arrays. Its design allows efficient access to subsets of the stored array, and supports both local and cloud storage systems. Rarr aims to implement this specification in R with minimal reliance on external tools or libraries.
dataimport, bioconductor, ome-ngff, ome-zarr, on-disk, out-of-memory, zarr, c-blosc, libzstd
9.03 score 53 stars 7 dependents 91 scripts
rhdf5filters - HDF5 Compression Filters
Provides a collection of additional compression filters for HDF5 datasets. The package is intended to provide seamless integration with rhdf5; however, the compiled filters can also be used with external applications.
infrastructure, dataimport, compression, filter-plugin, hdf5
8.49 score 5 stars 231 dependents 9 scripts
linelist - Tagging and Validating Epidemiological Data
Provides tools to help store and handle case line list data. The 'linelist' class adds a tagging system to classical 'data.frame' objects to identify key epidemiological data such as dates of symptom onset, epidemiological case definition, age, gender or disease outcome. Once tagged, these variables can be seamlessly used in downstream analyses, making data pipelines more robust and reliable.
data, data-structures, epidemiology, epiverse, outbreaks, sdg-3, structured-data
8.39 score 11 stars 2 dependents 89 scripts 567 downloads
lightr - Read Spectrometric Data and Metadata
Parse various reflectance/transmittance/absorbance spectra file formats to extract spectral data and metadata, as described in Gruson, White & Maia (2019) <doi:10.21105/joss.01857>. Among other formats, it can import files from 'Avantes' <https://www.avantes.com/>, 'CRAIC' <https://www.microspectra.com/>, and 'OceanOptics'/'OceanInsight' <https://www.oceanoptics.com/> brands.
file-import, reproducibility, reproducible-research, reproducible-science, spectral-data, spectroscopy
8.27 score 15 stars 3 dependents 19 scripts 991 downloads
fundiversity - Easy Computation of Functional Diversity Indices
Computes six functional diversity indices: Functional Divergence (FDiv), Functional Evenness (FEve), Functional Richness (FRic), Functional Richness intersections (FRic_intersect), Functional Dispersion (FDis), and Rao's entropy (Q) (reviewed in Villéger et al. 2008 <doi:10.1890/07-1206.1>). Provides efficient, modular, and parallel functions to compute functional diversity indices (Grenié & Gruson 2023 <doi:10.1111/ecog.06585>).
biodiversity, biodiversity-indicators, biodiversity-informatics, functional-diversity, functional-ecology, functional-trait, functional-traits, trait, trait-based, traits
8.12 score 43 stars 1 dependents 51 scripts 460 downloads
arrayQualityMetrics - Quality metrics report for microarray data sets
This package generates microarray quality metrics reports for data in Bioconductor microarray data containers (ExpressionSet, NChannelSet, AffyBatch). One- and two-color array platforms are supported.
microarray, qualitycontrol, onechannel, twochannel, reportwriting, bioconductor
8.09 score 1 stars 337 scripts 1.0k downloads
DEXSeq - Inference of differential exon usage in RNA-Seq
The package is focused on finding differential exon usage using RNA-seq exon counts between samples with different experimental designs. It provides functions that allow the user to make the necessary statistical tests based on a model that uses the negative binomial distribution to estimate the variance between biological replicates and generalized linear models for testing. The package also provides functions for the visualization and exploration of the results.
immunooncology, sequencing, rnaseq, differentialexpression, alternativesplicing, differentialsplicing, geneexpression, visualization
7.95 score 5 dependents 498 scripts 3.0k downloads
grumpy - Read 'NumPy' '.npy' and '.npz' Files
Lightweight way to read 'NumPy' '.npy' and '.npz' files in R. All data types supported by 'NumPy', with all sizes (converted internally to R native size), both C and 'Fortran' order, and any shape, up to an arbitrary number of dimensions, are supported.
quarto
7.28 score 6 stars 7 dependents 2.0k downloads
ZarrArray - Bring Zarr datasets in R as DelayedArray objects
The ZarrArray package leverages the Rarr package to bring Zarr datasets in R as DelayedArray objects. The main class in the package is the ZarrArray class. A ZarrArray object is an array-like object that represents a Zarr dataset in R. ZarrArray objects are DelayedArray derivatives and therefore support all operations (delayed or block-processed) supported by DelayedArray objects.
infrastructure, datarepresentation, dataimport, bioconductor-package, core-package, u24ca289073
6.62 score 5 stars 4 dependents 7 scripts
contactdata - Social Contact Matrices for 177 Countries
Data package for the supplementary data in Prem et al. (2017) <doi:10.1371/journal.pcbi.1005697> and Prem et al. <doi:10.1371/journal.pcbi.1009098>. Provides easy access to contact data for 177 countries, for use in epidemiological, demographic or social sciences research.
demographics, epidemiology, social-sciences
5.56 score 10 stars 24 scripts 323 downloads
contactdata - Social Contact Matrices for 177 Countries
Data package for the supplementary data in Prem et al. (2017) <doi:10.1371/journal.pcbi.1005697> and Prem et al. <doi:10.1371/journal.pcbi.1009098>. Provides easy access to contact data for 177 countries, for use in epidemiological, demographic or social sciences research.
demographics, epidemiology, social-sciences
5.56 score 10 stars 24 scripts 459 downloads
mcmcensemble - Ensemble Sampler for Affine-Invariant MCMC
Provides ensemble samplers for affine-invariant Markov chain Monte Carlo (MCMC), which allow faster convergence for badly scaled estimation problems. Two samplers are proposed: the 'differential.evolution' sampler from ter Braak and Vrugt (2008) <doi:10.1007/s11222-008-9104-9> and the 'stretch' sampler from Goodman and Weare (2010) <doi:10.2140/camcos.2010.5.65>.
mcmc, mcmc-sampler
5.08 score 2 stars 15 scripts 258 downloads
authoritative - Parse and Deduplicate Author Names
Utilities to parse author fields from DESCRIPTION files, and general-purpose functions to deduplicate names in databases beyond the specific case of R package authors.
data-cleaning, data-extraction
4.70 score 7 scripts 154 downloads
numberize - Convert Words to Numbers in Multiple Languages
Converts written-out numbers into their numeric equivalents. Supports numbers written out in English, French, or Spanish.
r-programming
4.26 score 4 stars 1 dependents 3 scripts 290 downloads