Scout Archives


GNU Datamash

Datamash is a command-line utility that can validate and summarize text-format data files such as CSV and tab-delimited files. It can compute descriptive statistics (for example, mean, median, and standard deviation) and includes several statistical tests for determining whether data were drawn from a normal distribution. It can also cross-tabulate the input to summarize it by category, similar to the pivot table feature in many spreadsheet programs. The Examples sub-section of the Datamash website (found by scrolling to the Documentation and Help section of the Software page) provides several sample files and uses them to demonstrate the software's capabilities. The examples include an analysis of grades and bioinformatics analyses of data from the Human Genome Project. The Datamash manual (located on the Docs page) serves as a comprehensive reference on the software's features. Windows users can locate installers by following the "download section" link (under Downloading Datamash on the page linked above). Linux and BSD users can find Datamash in their system's package repositories. macOS users can install Datamash using MacPorts, Fink, or Nixpkgs.
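To make the two operations described above more concrete, here is a minimal Python sketch of a grouped summary (mean, median, sample standard deviation) and a cross-tabulation over a small tab-delimited table. It is not Datamash itself and does not reproduce its command-line syntax; the sample data and the names SAMPLE, load_rows, summarize, and crosstab are hypothetical, chosen only for illustration.

```python
import csv
import io
import statistics
from collections import Counter

# Hypothetical tab-delimited sample; NOT one of the Datamash example files.
SAMPLE = (
    "name\tmajor\tyear\tscore\n"
    "Ava\tArts\t1\t82\n"
    "Ben\tArts\t2\t90\n"
    "Cara\tMath\t1\t75\n"
    "Dev\tMath\t1\t85\n"
    "Eli\tMath\t2\t95\n"
)


def load_rows(text):
    """Parse tab-delimited text with a header line into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def summarize(rows, group_field, value_field):
    """Group rows by one field and compute descriptive statistics per group."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group_field], []).append(float(row[value_field]))
    return {
        key: {
            "mean": statistics.mean(values),
            "median": statistics.median(values),
            # Sample standard deviation; needs at least two values per group.
            "sstdev": statistics.stdev(values),
        }
        for key, values in groups.items()
    }


def crosstab(rows, row_field, col_field):
    """Count rows for each (row_field, col_field) pair, like a pivot table."""
    return Counter((row[row_field], row[col_field]) for row in rows)


if __name__ == "__main__":
    rows = load_rows(SAMPLE)
    for major, stats in summarize(rows, "major", "score").items():
        print(major, stats)
    for (major, year), count in sorted(crosstab(rows, "major", "year").items()):
        print(major, year, count)
```

Datamash performs this kind of grouping and counting directly on files from the shell; the Datamash manual is the authoritative reference for the operation names and options it actually accepts.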
Date of Scout Publication: August 21st, 2020
Date of Record Creation: August 11th, 2020 at 10:49am
Date of Record Release: August 11th, 2020 at 1:46pm