Plain Text Information

In computing, plain text is the contents of an ordinary sequential file readable as textual material without much processing, usually opposed to formatted text.

The encoding has traditionally been either ASCII, one of its many derivatives such as ISO/IEC 646, or sometimes EBCDIC.

Unicode is today gradually replacing the older ASCII derivatives, which are limited to 7- or 8-bit codes.

Usage

The main purpose of using plain text today is "lowest common denominator" independence from programs that require their own special encoding or formatting, at the cost of some sacrifices and limitations. Plain text files can be opened, read, and edited with most text editors. Examples include Notepad (Windows), edit (DOS), ed, emacs, vi, vim, Gedit or nano (Unix, Linux), SimpleText (Mac OS), and TextEdit (Mac OS X). Other computer programs are also capable of reading and importing plain text. It can be handled by simple tools that print text line by line, such as type (DOS and Windows) and cat (Unix), and also by more complex programs such as web browsers, e.g. Lynx and the Line Mode Browser.

Plain text files are almost universal in programming; a source code file containing instructions in a programming language is almost always a plain text file. Plain text is also commonly used for configuration files, which are read for saved settings at the startup of a program.

Plain text is the original and still popular method of conveying e-mail. HTML-formatted e-mail messages often include an automatically generated plain text copy as well, for compatibility reasons.
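As a rough illustration of an e-mail that carries both a plain text copy and an HTML copy, the sketch below uses Python's standard email package. The function name build_message and the sample subject and bodies are hypothetical, and the plain text copy here is supplied by hand rather than generated automatically.

```python
from email.message import EmailMessage


def build_message(plain: str, html: str) -> EmailMessage:
    """Build a message with a plain text body and an HTML alternative."""
    msg = EmailMessage()
    msg["Subject"] = "Example"  # hypothetical subject line
    # The first body set becomes the text/plain part.
    msg.set_content(plain)
    # Adding an alternative converts the message to multipart/alternative
    # and appends a text/html part after the plain text part.
    msg.add_alternative(html, subtype="html")
    return msg


msg = build_message("Hello, world.", "<p>Hello, <b>world</b>.</p>")
for part in msg.iter_parts():
    print(part.get_content_type())
```

A mail reader that cannot render HTML can fall back to the first part, which is the compatibility benefit described above.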

Encoding

Character encodings

Main article: Character encoding

Text was once commonly encoded in ASCII, with each letter or other character occupying 8 bits: 7 bits carried the character code, allowing 128 values, and the 8th was used as a parity (checksum) bit when transferring a file. This allowed only the ordinary Latin alphabet, transfer control codes, parentheses and punctuation, which annoyed computer users whose languages need additional letters, especially Portuguese and Swedish[citation needed] users.
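The use of the 8th bit as a check on a 7-bit code can be sketched as follows. This is a minimal illustration that assumes even parity; the text does not say which parity convention was used, and the function names are hypothetical.

```python
def add_even_parity(code: int) -> int:
    """Return the 7-bit code with an even-parity bit placed in bit 7."""
    if not 0 <= code < 128:
        raise ValueError("not a 7-bit ASCII code")
    # The parity bit is 1 when the code has an odd number of 1 bits,
    # so that the full 8-bit value always has an even number of them.
    parity = bin(code).count("1") % 2
    return code | (parity << 7)


def check_even_parity(byte: int) -> bool:
    """Return True if the 8-bit value has an even number of 1 bits."""
    return bin(byte).count("1") % 2 == 0


def strip_parity(byte: int) -> int:
    """Drop bit 7 and return the original 7-bit code."""
    return byte & 0x7F


sent = add_even_parity(ord("C"))   # 'C' is 0x43, which has three 1 bits
print(hex(sent))                   # 0xc3
print(check_even_parity(sent))     # True
print(check_even_parity(sent ^ 0x04))  # False: one flipped bit is detected
```

A single parity bit detects any odd number of flipped bits but cannot say which bit changed, and it misses an even number of flips.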

When data transfer became more reliable, the 8th bit was no longer needed as a parity bit and was used instead to extend the character set by another 128 characters. These additional characters were not standardized: they were encoded differently in different countries, in a way that made multilingual texts impossible to encode. For instance, a browser may display a sequence such as ¬A in place of a single accented letter if it interprets text in one character set as another.
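The effect of reading text in the wrong 8-bit character set can be reproduced in a few lines. The sketch below assumes two particular character sets, ISO 8859-1 ("latin-1") and IBM code page 437 ("cp437"), purely for illustration; the helper name reinterpret is hypothetical.

```python
def reinterpret(text: str, written_as: str, read_as: str) -> str:
    """Encode text in one character set, then decode the bytes as another."""
    return text.encode(written_as).decode(read_as)


original = "é"
raw = original.encode("latin-1")
print(raw)  # b'\xe9': one byte with the 8th bit set

# The same byte value means a different character in another code page.
print(reinterpret(original, "latin-1", "cp437"))  # Θ

# Characters in the 7-bit ASCII range survive, since both sets share them.
print(reinterpret("plain", "latin-1", "cp437"))   # plain
```

Only byte values of 128 and above are affected, which is why such errors tend to show up on accented letters while the rest of the text looks correct.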

Finally, Unicode was defined. It currently allows for 1,114,112 code values (0 through 10FFFFH), is intended to be universal, and covers modern writing systems as well as many extinct ones. For example, Unicode encodes characters for Chinese, Hebrew, and Cyrillic as well as Latin. Text in some of these writing systems may be quite complicated to process correctly, but it still contains no structural data, such as bold start and end markers, and is therefore plain text.
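The figure of 1,114,112 code values and the coverage of several scripts can be checked with a short sketch. The choice of UTF-8 as the byte encoding is an assumption made here for illustration (the text does not name one), and the helper describe and the sample characters are hypothetical.

```python
import sys

# Python strings index Unicode code values directly; the highest is 0x10FFFF.
print(sys.maxunicode + 1)  # 1114112


def describe(ch: str) -> tuple[str, int]:
    """Return the code value in U+XXXX form and the UTF-8 length in bytes."""
    return f"U+{ord(ch):04X}", len(ch.encode("utf-8"))


samples = {"Latin": "A", "Cyrillic": "Ж", "Hebrew": "א", "Chinese": "中"}
for script, ch in samples.items():
    code, size = describe(ch)
    print(script, code, size)
```

All four characters sit in the same string type with no markers around them, which matches the point that multilingual Unicode text is still plain text.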

Control codes

Main article: Newline

The ASCII codes before SPACE (= 32 = 20H) are not intended as displayable characters, but instead as control characters. Their meaning depends on the program interpreting them. For example, the code NUL (= 0, sometimes denoted Ctrl-@) is used as the string end marker in the programming language C and its successors. The most troublesome of these are the codes LF (= LINE FEED = 10 = 0AH) and CR (= CARRIAGE RETURN = 13 = 0DH). Windows and OS/2 require the sequence CR,LF to represent a newline, while Unix and its relatives use just LF, and Classic Mac OS (but not Mac OS X) uses just CR. This was once a slight problem when transferring files between Windows and Unix systems, but today most computer programs handle it seamlessly.
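Converting between the three newline conventions is a matter of byte replacement. The following is a minimal sketch; the function names are hypothetical, and real tools may also need to consider the file's character encoding.

```python
CR = b"\r"  # CARRIAGE RETURN, 13, 0DH
LF = b"\n"  # LINE FEED, 10, 0AH


def to_unix_newlines(data: bytes) -> bytes:
    """Convert CR,LF (Windows) and bare CR (Classic Mac OS) to LF (Unix)."""
    # Replace the two-byte sequence first so that its CR is not
    # turned into a second newline by the bare-CR replacement.
    return data.replace(CR + LF, LF).replace(CR, LF)


def to_windows_newlines(data: bytes) -> bytes:
    """Convert any of the three conventions to CR,LF."""
    return to_unix_newlines(data).replace(LF, CR + LF)


mixed = b"one\r\ntwo\rthree\n"
print(to_unix_newlines(mixed))     # b'one\ntwo\nthree\n'
print(to_windows_newlines(mixed))  # b'one\r\ntwo\r\nthree\r\n'
```

The order of the replacements matters: handling bare CR before CR,LF would turn each Windows line ending into two line breaks.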



The above information uses material from Wikipedia and is licensed under the GNU Free Documentation License.
This page was last archived by our server on Wed Dec 14 13:25:50 2011.