Guide to Unix/Commands/Text Processing

Unix supports multiple text processing commands.

awk

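awk is a pattern-matching text-processing language: it splits each input line into fields and runs actions on the lines that match a regular expression or other condition, providing capabilities beyond #cut and #sed. You can learn more in the AWK and An Awk Primer Wikibooks.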
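
As a minimal sketch of the pattern-action style, the following sums the second field of the lines whose first field starts with "a". The sample data is made up for illustration.

```shell
# Made-up sample input: a name and a count per line.
# The pattern `$1 ~ /^a/` selects lines whose first field starts with "a";
# the action adds the second field to a running total, printed at the end.
total=$(printf 'apple 3\nbanana 5\navocado 4\n' |
  awk '$1 ~ /^a/ { sum += $2 } END { print sum }')
echo "$total"   # prints 7
```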

comm

Compares two sorted files line by line and identifies the lines common to both or unique to either one. Options select which of these three groups are output, e.g. only the common lines.
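
A short sketch with two made-up, already sorted files; the file names and the use of mktemp for a scratch directory are assumptions of this example.

```shell
tmp=$(mktemp -d)   # scratch directory (mktemp is assumed to be available)
printf 'apple\nbanana\ncherry\n' > "$tmp/a.txt"
printf 'banana\ncherry\ndate\n'  > "$tmp/b.txt"
# -1 and -2 suppress the lines unique to the first and second file,
# leaving only the lines common to both.
common=$(comm -12 "$tmp/a.txt" "$tmp/b.txt")
# -2 and -3 suppress the second file's unique lines and the common lines,
# leaving only the lines unique to the first file.
only_a=$(comm -23 "$tmp/a.txt" "$tmp/b.txt")
printf '%s\n' "$common"   # prints banana and cherry, one per line
printf '%s\n' "$only_a"   # prints apple
rm -r "$tmp"
```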

csplit

Splits input into output files. The split points can be given as line numbers or as regex matches.
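
A minimal sketch of a regex-driven split, assuming a made-up input file whose sections start with lines beginning "# ". The file names and the "part" prefix are chosen for this example only.

```shell
tmp=$(mktemp -d)   # scratch directory (mktemp is assumed to be available)
printf 'intro\n# one\na\n# two\nb\n' > "$tmp/in.txt"
# -s silences the per-file byte counts; -f sets the output file name prefix.
# '/^# /' splits before the next line matching the regex;
# '{1}' repeats that split one more time.
csplit -s -f "$tmp/part" "$tmp/in.txt" '/^# /' '{1}'
pieces=$(ls "$tmp" | grep -c '^part')
second=$(cat "$tmp/part01")
echo "$pieces"            # prints 3 (part00, part01, part02)
printf '%s\n' "$second"   # prints "# one" and "a", one per line
rm -r "$tmp"
```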

cut

cut selects columns ("fields") from each line of a text file; the field separator can be specified and defaults to a tab.
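
For illustration, the following picks the first and third colon-separated fields from some made-up, passwd-style lines.

```shell
# -d sets the field separator, -f lists the fields to keep.
picked=$(printf 'root:x:0:0\ndaemon:x:1:1\n' | cut -d: -f1,3)
printf '%s\n' "$picked"
# prints:
# root:0
# daemon:1
```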

expand

Converts tabs to spaces, with tab stops every 8 columns by default. See also #unexpand.
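
A small sketch of the default tab stops and of the -t option, using a made-up one-line input:

```shell
# One tab after "a": with tab stops 8 columns apart, "b" lands in
# column 9, so 7 spaces are inserted.
wide=$(printf 'a\tb\n' | expand)
# -t 4 places tab stops every 4 columns instead, so 3 spaces are inserted.
narrow=$(printf 'a\tb\n' | expand -t 4)
echo "${#wide}"     # prints 9 (length of the expanded line)
echo "${#narrow}"   # prints 5
```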

fmt

Formats text, including reflowing paragraphs to a specific maximum number of characters per line. It does not seem to be covered by POSIX.
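
A hedged sketch of reflowing a made-up sentence to at most 20 characters per line. Since fmt is not standardized, the -w option is an assumption here; it is accepted by the common GNU and BSD implementations, and the exact line breaks may differ between them.

```shell
text='The quick brown fox jumps over the lazy dog and keeps on running.'
# -w 20 asks for lines no longer than 20 characters (assumed option, see above).
wrapped=$(printf '%s\n' "$text" | fmt -w 20)
printf '%s\n' "$wrapped"
```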

fold

Limits the maximum length of a line by breaking long lines at a fixed width; unlike #fmt, it does not join short lines or reflow paragraphs.
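
A minimal sketch with a made-up 10-character line folded at a width of 4:

```shell
# -w sets the maximum line width; breaks are inserted exactly at that width.
folded=$(printf 'abcdefghij\n' | fold -w 4)
printf '%s\n' "$folded"
# prints:
# abcd
# efgh
# ij
```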

iconv

Converts between character encodings.
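
As an illustrative sketch, the following converts the word "café" from UTF-8 to ISO-8859-1 and dumps the resulting bytes as hex. It assumes the local iconv supports both encoding names.

```shell
# In UTF-8 the "é" is the two bytes 0xC3 0xA9 (octal \303\251);
# in ISO-8859-1 it is the single byte 0xE9.
# -f names the source encoding, -t the target encoding.
hex=$(printf 'caf\303\251\n' | iconv -f UTF-8 -t ISO-8859-1 |
  od -An -tx1 | tr -d ' \n')
echo "$hex"   # prints 636166e90a
```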

join

Combines lines from two files based on a common field, assuming both files are sorted on the field used for joining.
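
A short sketch joining two made-up files on their first field; the file names and contents are assumptions of this example.

```shell
tmp=$(mktemp -d)   # scratch directory (mktemp is assumed to be available)
printf '1 alice\n2 bob\n3 carol\n' > "$tmp/names.txt"
printf '1 admin\n3 guest\n'        > "$tmp/roles.txt"
# By default the join field is the first field of each file;
# lines without a partner in the other file (here "2 bob") are dropped.
joined=$(join "$tmp/names.txt" "$tmp/roles.txt")
printf '%s\n' "$joined"
# prints:
# 1 alice admin
# 3 carol guest
rm -r "$tmp"
```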

nl

Adds line numbers.
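
A minimal sketch with made-up input; the options shown only adjust the layout of the numbers.

```shell
# -b a numbers all lines, -w 2 makes the number column 2 characters wide,
# -s ': ' sets the separator between number and text.
numbered=$(printf 'alpha\nbeta\n' | nl -b a -w 2 -s ': ')
printf '%s\n' "$numbered"
# prints:
#  1: alpha
#  2: beta
```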

paste

Merges corresponding lines of multiple files, as if each file were a column of a table and each line a row of the table. Columns are separated by tabs by default.
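
For illustration, two made-up single-column files pasted side by side; the file names are assumptions of this example.

```shell
tmp=$(mktemp -d)   # scratch directory (mktemp is assumed to be available)
printf 'alice\nbob\n' > "$tmp/names.txt"
printf '30\n25\n'     > "$tmp/ages.txt"
# Default: columns separated by a tab.
table=$(paste "$tmp/names.txt" "$tmp/ages.txt")
# -d sets a different separator, here a comma.
csv=$(paste -d, "$tmp/names.txt" "$tmp/ages.txt")
printf '%s\n' "$csv"
# prints:
# alice,30
# bob,25
rm -r "$tmp"
```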

pr

Formats input for printing, including pagination with header and footer.

sed

sed, a stream editor, is noted for its text replacement capability with regular expression support, but can do more. You can learn more in the Sed Wikibook.
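
A minimal sketch of a replacement and of one of the other things sed can do, printing a selected line. The input is made up.

```shell
# s/old/new/ replaces the first match of the regex on each line.
replaced=$(printf 'hello world\nhello there\n' | sed 's/hello/goodbye/')
# -n suppresses the default output; 2p prints only the second line.
second=$(printf 'one\ntwo\nthree\n' | sed -n '2p')
printf '%s\n' "$replaced"   # prints "goodbye world" and "goodbye there"
printf '%s\n' "$second"     # prints two
```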

sort

Sorts lines in files.
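
A short sketch of two common sort keys, using made-up lines that consist of a number and a word:

```shell
data='10 pears
9 apples
100 figs'
# -n compares the leading numbers numerically rather than as text.
by_number=$(printf '%s\n' "$data" | sort -n)
# -k2,2 sorts on the second whitespace-separated field only.
by_name=$(printf '%s\n' "$data" | sort -k2,2)
printf '%s\n' "$by_number"   # 9 apples, 10 pears, 100 figs
printf '%s\n' "$by_name"     # 9 apples, 100 figs, 10 pears
```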

spell

Performs spell checking. Seems absent from POSIX.

tr

Performs a character-by-character mapping or "translation", and more: it can also delete characters and squeeze runs of a repeated character into one.
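
A minimal sketch of mapping, squeezing and deleting characters, with made-up input:

```shell
# Map each lowercase letter to its uppercase counterpart.
upper=$(printf 'Hello World\n' | tr '[:lower:]' '[:upper:]')
# -s squeezes each run of the listed character into a single one.
squeezed=$(printf 'a   b  c\n' | tr -s ' ')
# -d deletes every listed character.
letters=$(printf 'a1b22c\n' | tr -d '[:digit:]')
echo "$upper"      # prints HELLO WORLD
echo "$squeezed"   # prints a b c
echo "$letters"    # prints abc
```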

unexpand

Converts spaces to tabs, with tab stops every 8 columns by default. Only leading blanks are converted unless the -a option is given.
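
As a small sketch, a made-up line with eight leading spaces is converted so that it starts with a single tab:

```shell
# '%8s' with an empty argument produces exactly eight spaces.
# They reach the first default tab stop, so they become one tab.
tabbed=$(printf '%8sx\n' '' | unexpand)
echo "${#tabbed}"   # prints 2 (a tab followed by "x")
```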

uniq

Collapses each block of adjacent identical lines into a single line, and more. Because only adjacent lines are compared, it is ideally used with the input sorted.
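
A minimal sketch, with made-up input, of why sorting first matters:

```shell
# Unsorted input: only the two adjacent "a" lines at the end collapse.
raw=$(printf 'a\nb\na\na\n' | uniq)
# Sorted input: all identical lines are adjacent, so each appears once.
deduped=$(printf 'a\nb\na\na\n' | sort | uniq)
# -d prints only the lines that occur more than once.
repeated=$(printf 'a\nb\na\na\n' | sort | uniq -d)
printf '%s\n' "$raw"        # prints a, b, a
printf '%s\n' "$deduped"    # prints a, b
printf '%s\n' "$repeated"   # prints a
```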

This article is issued from Wikibooks. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.