Datamash is a command-line utility that validates and summarizes text-format data files such as CSV and tab-delimited files. It computes descriptive statistics (for example, mean, median, and standard deviation) and includes statistical tests for whether data were drawn from a normal distribution. It can also cross-tabulate the input to summarize it by category, similar to the pivot table feature in many spreadsheet programs. The Examples sub-section of the Datamash website (found by scrolling to the Documentation and Help section of the Software page) provides several sample files and uses them to demonstrate what the software can do. The examples include an analysis of grades and bioinformatics work on data from the Human Genome Project. The Datamash manual (located on the Docs page) is a comprehensive reference for the software's features. Windows users can locate installers by following the "download section" link (under Downloading Datamash on the page linked above). Linux and BSD users can find Datamash in their system's package repositories. macOS users can install Datamash using MacPorts, Fink, or Nixpkgs.
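As a rough illustration of the validation, summary, and cross-tabulation features described above, the following shell sketch runs Datamash on a small tab-delimited file. The file name scores.tsv and its contents are hypothetical and are not among the sample files on the Datamash website; the commands assume Datamash is already installed and on the PATH.

```shell
# Hypothetical sample data: name, subject, score (tab-separated, with a header line).
printf 'name\tsubject\tscore\n' > scores.tsv
printf 'ann\tmath\t90\nann\tart\t70\nbob\tmath\t80\nbob\tart\t60\n' >> scores.tsv

# Validate the file: check that every line has the same number of fields.
datamash check < scores.tsv

# Mean and median of column 3, grouped by subject (column 2).
# -s sorts the input before grouping; -H treats the first line as a header
# on input and prints a header on output.
datamash -s -H -g 2 mean 3 median 3 < scores.tsv

# Cross-tabulate names (rows) against subjects (columns),
# summing the scores that fall into each cell.
datamash -s --header-in crosstab 1,2 sum 3 < scores.tsv
```

Input is tab-separated by default; for CSV files, adding -t, sets the field separator to a comma.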