Releases: SomaLogic/SomaDataIO
SomaDataIO 6.1.0
Lifting Code 🚀
- Major restructure of `lift_adat()` functionality (@stufield, #81, #78)
  - `lift_adat()` now takes a `bridge =` argument, replacing the `anno.tbl =` argument (which is now deprecated). Lifting is now performed internally for a better (and safer) user experience, without the need for an external annotations (Excel) file.
  - the majority of this refactoring was internal and the user should not experience a major disruption to the API.
  - much improved lifting/bridging documentation (#82)
- Added a new lifting and bridging vignette (@stufield, #77)
  - in addition to the improved lifting documentation, this new vignette provides additional context, explanation, clear examples, and lifting guidance.
New Functions ✨
- `is_lifted()` is new and returns a boolean according to whether the signal space (RFU) has been previously lifted
- Lifting accessor function for Lin's CCC values (#88)
  - `getSomaScanLiftCCC()` accesses the lifting correlations between SomaScan versions for each analyte
  - returns a `tibble` split by sample matrix (serum or plasma)
- `merge_clin()` is newly exported (#80)
  - a thin wrapper that allows users to merge clinical variables to `soma_adat` objects easily
  - previously users had to either use the CLI merge tool or merge in clinical variables themselves with `dplyr`
- Newly exported ADAT "get*" helpers (#83)
  - functions to access properties of ADATs:
    - `getAdatVersion()`
    - `getSomaScanVersion()`
    - `getSignalSpace()`
    - `checkSomaScanVersion()`
  - `getAdatVersion()` gets a new S3 method (#92)
    - this enables passing of different objects, namely `soma_adat` or `list`, depending on the situation
- Newly exported functions that were previously internal only:
  - `addAttributes()`
  - `addClass()`
  - `cleanNames()`
New Vignettes 🤓
- The package `README` is now simplified (#35)
  - example analysis workflows are now split out into their own vignettes/articles and cross-linked in the `README`
- Reorganization and expansion of statistical vignettes (#35, #47)
  - moved 3 existing statistical examples from `README` into their own vignettes
  - resulting in four new "Statistical Workflow" vignettes/articles:
    - Binary classification via logistic regression
    - Linear regression for continuous variables
    - Two-group comparison via t-test
    - Three-group analysis ANOVA
- Added new general analysis workflow vignettes
  - articles for the pkgdown website have been built out
  - new articles on:
    - safely mapping values among variables
    - safely renaming a data frame
    - loading and wrangling
    - typical train and test data splits
    - beginning the FAQs and/or Coming Soon pages
- Added a new vignette describing how to use the command-line interface merge tool (#45)
  - the new CLI merge tool is used to add new clinical data to an existing ADAT file
Updates and Improvements 🔨
- `collapseAdats()` better combines `HEADER` information (#86)
  - certain information, e.g. `PlateScale` and `Cal*`, is better maintained in the final collapsed ADAT
  - other entries are combined by pasting into a single string
  - should result in less duplication of superfluous entries and retention of more "useful" `HEADER` information in the resulting (collapsed) `soma_adat`
- Update `read_annotations()` with `11k` content (#85)
- Update `transform()` and `scaleAnalytes()`
  - `scaleAnalytes()` (internal) now skips missing references and is much more like a "step" in the `recipes` package
  - `transform()` gets edge case protection with `drop = FALSE` in case a single-analyte `soma_adat` is scaled.
- New `row.names()` S3 method support for `soma_adat` class
  - dispatched on calls to `rownames()`
  - rather than calling `NextMethod()`, which normally would invoke `data.frame`, we now force the `data.frame` method in case there are `tbl_df` or `grouped_df` classes present that would be dispatched. Those are bypassed in favor of the `data.frame` method because `tbl_df` 1) can nuke the attributes, and 2) triggers a warning about adding rownames to a `tibble`.
- New `grouped_df` S3 print support for the grouped `soma_adat`
  - now displays Grouping information from a call to the S3 print method for `soma_adat` class
- New `grouped_df` S3 method support for `soma_adat` class (#66)
  - `grouped_df` data objects were previously unsupported and were interfering with downstream S3 methods for `dplyr` verbs once `NextMethod()` was called
  - this support now ensures that the group methods are maintained, as well as the `soma_adat` class itself (and most importantly, with its attributes intact)
- `tidyr::separate.soma_adat()` S3 method was simplified (#72)
  - now uses `%||%` helper internally
  - expanded error messages inside `stopifnot()` to be more informative
- `is_intact_attr()` is now much quieter, signaling only when called directly (#71)
  - new conditional logic silences signaling messages when called from within another function (indirectly)
  - these previously led to confusing messages when they appeared in wrappers, where `is_intact_attr()` can be nested, sometimes deeply, in the call stack
- Development and improvements to the `pkgdown` website
  - added new links and improved clarity in YAML
  - added new logo at footer
  - restyled side bar for easier hyperlinking and getting help
  - clicking on the SomaLogic logo in the GitHub `README` now links to the `pkgdown` website
  - new "Coming Soon" drop-down section in the website header to let users know about work in active progress (but not yet ready for external publication)
- `SomaDataIO` no longer depends on the `desc` package to generate the `README.md`
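The `drop = FALSE` protection noted above for `transform()` guards against a general base R behavior: selecting a single column from a data frame returns a bare vector unless `drop = FALSE` is given. The following is a minimal base R sketch of that edge case, not the package's implementation; the data frame, the column name, and the `scalars` vector are all hypothetical.

```r
# Hypothetical single-analyte data; names are illustrative only
adat <- data.frame(
  SampleId    = c("s1", "s2", "s3"),
  seq.1234.56 = c(100, 200, 300)
)
analytes <- "seq.1234.56"
scalars  <- c(seq.1234.56 = 0.5)   # assumed per-analyte scale factor

dropped <- adat[, analytes]                 # plain numeric vector, no dim
kept    <- adat[, analytes, drop = FALSE]   # still a 3 x 1 data frame

# Column-wise scaling works the same for 1 column or many
scaled <- sweep(as.matrix(kept), 2, scalars[analytes], `*`)
dim(scaled)
```

Because `kept` retains two dimensions, the same column-wise scaling code handles one analyte or thousands without a special case.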
Internal 🚧
- Internal rowname helpers were upgraded
  - they now use internal cross-functions, as originally intended, to avoid redundancy and to improve efficiency and debugging
- `sysdata.rda` no longer contains non-exported functions (#59)
  - new internal helper functions:
    - `convertColMeta()`
    - `genRowNames()`
    - `parseCheck()`
    - `syncColMeta()`
    - `scaleAnalytes()`
- Bug-fix for corner-case writing a single-analyte ADAT (#51)
  - RFU values are rounded to 1 decimal place when written by `write_adat()`. This was previously done via a call to `apply()`, which expects a 2-dim object when replacing those values.
  - `write_adat()` no longer uses `apply()` and instead converts the entire RFU data frame to a matrix (maintains original dimensions), and uses vectorized format conversion via `sprintf()`
  - in theory this should be faster because `sprintf()` is only called once on a long vector, rather than 1000s of times on shorter vectors (inside `apply()`).
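The single-analyte corner case described above can be illustrated in base R. This is only a sketch of the general failure mode and of the matrix plus `sprintf()` approach, not the actual `write_adat()` code; the data and column name are hypothetical.

```r
# Hypothetical single-analyte RFU data (column name is illustrative only)
rfu <- data.frame(seq.1234.56 = c(1234.5678, 987.6543, 45.04))

# A single column extracted without `drop = FALSE` is a plain vector,
# and apply() refuses input that has no dimensions
vec <- rfu[, 1]
res <- tryCatch(apply(vec, 2, round, 1), error = function(e) "apply() failed")

# Matrix conversion keeps the 3 x 1 shape; sprintf() is called once
m <- as.matrix(rfu)
out <- m
out[] <- sprintf("%0.1f", m)   # `[]` assignment preserves dim and dimnames
dim(out)
```

Assigning into `out[]` rather than overwriting `out` is what keeps the original dimensions, since `sprintf()` itself returns a flat character vector.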
SomaDataIO 6.0.0
We are now on CRAN! 🥳
New changes
- New clinical data merge CLI tool (#25)
  - `Rscript --vanilla merge_clin.R` for merging clinical variables into existing `*.adat` SomaScan data files
  - added 2 new example files, `meta.csv` and `meta2.csv`, to run examples with random data but with valid index keys
  - see `dir(system.file("cli", "merge", package = "SomaDataIO"))`
Package data objects
- general reorganization of package objects, trimming and slimming, renaming, and file structure (#32)
- `example_data.adat` was reduced in size to `n = 10` samples (from 192) to conform to CRAN size requirements (< 5MB)
  - the current file was renamed `example_data10.adat` to reflect this change
  - this likely has far-reaching consequences for users who access this flat file via `system.file()`
  - the `example_data` object itself however remains true to its original file (https://github.com/SomaLogic/SomaLogic-Data/blob/master/example_data.adat)
- the directory location `inst/example/` was renamed `inst/extdata/` to conform to CRAN package standard naming conventions
- the file `single_sample.adat` was removed from package data as it is now redundant (however still used in unit testing)
- `SomaDataObjects` was renamed and is now `SomaScanObjects`
Gradual deprecation
- `read.adat()` is now soft-deprecated; please use `read_adat()` instead
- lifecycle escalated from `warn()` -> `stop()` for functions that have been soft-deprecated since `v5.0.0`:
  - `getSomamers()`
  - `getSomamerData()`
  - `meltExpressionSet()`
S3 Methods
- New S3 print method default
  - `tibble` has a new `max_extra_cols =` argument, which is set to `6` for the `print.soma_adat` method
- New S3 merge method (#31)
  - calling `base::merge()` on a `soma_adat` is strongly discouraged
  - we now redirect users to use `dplyr::*_join()` alternatives, which are designed to preserve `soma_adat` attributes
Internal upgrades
- Code hardening for `prepHeaderMeta()`
  - some ADATs do not have `CreatedDate` and `CreatedBy` in the HEADER entry. This previously broke the writer
  - simplified to make more robust but also refactored to be more convenient (for abnormal ADATs not generated by standard SomaScan processing)
  - `CreatedDateHistory` was removed as an entry from written ADATs
  - `CreatedByHistory` was combined and dated for written ADATs
  - `NULL` behavior remains if keys are missing
  - `CreatedBy` and `CreatedDate` will be generated either as new entries or over-written as appropriate
- Numerous non-user-facing (API) changes for internal package maintenance, efficiency, and structural upgrades were included
SomaDataIO 5.3.1
This bug-fix release primarily includes fixes related to write_adat()
and its functionality:
- fixed bug in `write_adat()` that resulted from adding/removing clinical (non-SomaScan) variables to an ADAT. Export via `write_adat()` resulted in a broken ADAT file (#18)
- `write_adat()` now has much higher fidelity to the original text file (`*.adat`) in full-cycle read-write-read operations, particularly in the presence of bangs (`!`) in the Header section and in floating point decimals in the `?Col.Meta` section (#24)
- `write_adat()` no longer converts commas (`,`) to semi-colons (`;`) in the `?Col.Meta` block
- `write_adat()` no longer concatenates written ADATs when writing to the same file. Data is over-written to file to avoid mangled ADATs resulting from re-writing to the same connection and to match the default behavior of `write.table()`, `write.csv()`, etc.
- `read_adat()` now has a more consistent character type for the `Barcode2` variable in standard ADATs
- Decreased dependency on `magrittr` pipes (`%>%`) in favor of the native R pipe (`|>`)
- Migration of the default branch in GitHub from `master` -> `main` (#19)
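The overwrite behavior that `write_adat()` now matches can be seen with base R's own writers. The sketch below uses only `write.csv()` on a temporary file and does not call `write_adat()`; the data frames are made up for illustration.

```r
# Base R only; shows the default overwrite behavior of write.csv()
f <- tempfile(fileext = ".csv")
first  <- data.frame(id = 1:3)
second <- data.frame(id = 4:5)

write.csv(first,  f, row.names = FALSE)
write.csv(second, f, row.names = FALSE)   # same path: replaces, does not append

back <- read.csv(f)
nrow(back)   # only the 2 rows from `second` remain
unlink(f)
```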
SomaDataIO 5.3.0
Minor version
- updates from internal SomaLogic Operating Co. Inc. code base
- many code simplifications
- package dependencies have been reduced (in favor of base R implementation)
- speed/implementation improvements
- a minor bug-fix for the `dplyr::mutate.soma_adat()` S3 method for `dplyr v1.0.8`, which included a breaking change that stripped `soma_adat` class attributes.
- Enjoy! 🎉
SomaDataIO 5.2.0
Lifting suite
- incorporates SomaScan "lifting" code that allows conversion between SomaScan `v4` <-> `v4.1`
- new functions:
  - `lift_adat()`
  - `read_annotations()`
- minor internal structural-refactoring.
SomaDataIO 5.1.0
Minor release version
- internal structural-refactoring changes
- some exported functions were renamed (with redirects for backward compatibility)
SomaDataIO 5.0.0
Initial Public GitHub Release of SomaDataIO 🎉
- `SomaDataIO` starts at `v5.0.0` because existing versions of the package already exist and have been disseminated via other means
- this release on GitHub allows for a more open development process, quicker feedback from users, and public issue tracking for questions/answers