TextToSC, a text encoding converter for a number of formats and languages.
The format used in the Russian version of StarCraft is also supported :-)
Transliteration is possible.
Line feed styles: <LF>, <CR><LF>
FreeWare (see the bottom of this file)

Version 0.8.3 from Nov 30 2014

LANGUAGES & ENCODINGS SUPPORTED:
  East European (Win-1251, 866, 855, ISO-8859-5, ISO-IR-111, KOI8-R,
                 Mac-Cyr, Mac-U, KOI8-U, KOI7);
  Turkish       (Win-1254, 857, Mac-Turk, ISO-8859-9);
  West European (Win-1252, 850, ISO-8859-1, ISO-8859-15, Mac-Roman).

INTERFACE LANGUAGES: English, Russian.

System requirements
  OS:  Win9x/ME, Win NT, 2000, XP or higher.
  CPU: 80386 or above.

Interface

Syntax of a program call:

  texttosc [_switches_] [_file1_ _file2_]

  _switches_  one or several switches (READ BELOW)
  _file1_     name of the source file;
  _file2_     name of the file to write the result to.

Termination codes:
  0 - the file was converted;
  1 - the file was not converted.

EXAMPLES:
  texttosc -koi8 -towin letter.eml "my letter.txt"
  texttosc -1254 -tomacturk page.txt archive.txt
  texttosc -iso8859_9 -to857 -src unix.c dos.c
  texttosc -noenc -lf mywin.pl mybsd.pl
  texttosc -1251 -tokoi8r -html myhome.htm myserver.htm

WARNING: file masks (e.g. *.txt) are NOT allowed

List of command line switches (by category)

-----Input codings (by LANGUAGE)---

=EAST-EUROPEAN (CYRILLICS)=
  -dos or -866                 the source file is in the DOS 866 codepage;
  -w or -win                   Windows-1251;
  -koi8 or -koi8r              KOI8-R;
  -855                         DOS 855;
  -koi7                        KOI7;
  -mac                         Mac-Cyr;
  -iso or -iso8859_5           ISO 8859-5;
  -iso_ir_111 or -isoir111     ISO-IR-111;
  -koi8u                       KOI8-U;
  -macu or -mac_u              Mac-U;
  -sc                          StarCraft mission editor format;
  -u                           same as -koi8 (for compatibility with
                               v 0.5 and earlier); the switch is obsolete
                               and its use is not recommended;
=TURKISH=
  -857                         DOS-857 codepage;
  -1254                        Windows-1254;
  -iso8859_9                   ISO-8859-9;
  -macturk                     Mac-Turk;
=WEST-EUROPEAN=
  -850                         DOS-850 codepage;
  -1252                        Windows-1252;
  -iso8859_1                   ISO-8859-1;
  -iso8859_15                  ISO-8859-15;
  -macroman                    Mac-Roman;

By default the source file is assumed to be in the DOS-866 encoding.
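
Since file masks are not accepted and success is reported through the
termination code (see the call syntax above), converting several files
means one call per file. Below is a minimal Python sketch of such a
wrapper. It is not part of TextToSC: the helper names build_command and
convert_many, the ".out" suffix for result files, and the assumption that
texttosc can be found on the PATH are all hypothetical.

```python
import glob
import subprocess


def build_command(switches, src, dst):
    """Argument list for one texttosc call: switches, source file, destination file."""
    return ["texttosc", *switches, src, dst]


def convert_many(pattern, switches, suffix=".out", run=subprocess.call):
    """Expand a file mask ourselves and convert every match with its own call.

    Returns a dict mapping each source file name to True (termination
    code 0, converted) or False (any other code, not converted).
    """
    results = {}
    for src in sorted(glob.glob(pattern)):
        dst = src + suffix  # hypothetical naming scheme for the result file
        code = run(build_command(switches, src, dst))
        results[src] = (code == 0)
    return results
```

For example, convert_many("*.txt", ["-koi8", "-towin"]) would run
"texttosc -koi8 -towin NAME NAME.out" once for every matching file and
report which of them were converted.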

-----Output codings (by LANGUAGE)---

=EAST-EUROPEAN (CYRILLICS)=
  -todos or -to866             convert into the 866 (DOS) format;
  -towin or -tow               Windows-1251;
  -tokoi8 or -tokoi8r          KOI8-R;
  -to855                       DOS 855;
  -tokoi7                      KOI7;
  -tomac                       Mac-Cyr;
  -toiso or -toiso8859_5       ISO 8859-5;
  -toiso_ir_111 or -toisoir111 ISO-IR-111;
  -tokoi8u                     KOI8-U;
  -tomacu or -tomac_u          Mac-U;
  -tosc                        StarCraft mission editor format;
  -translit or -totl           into Latin (text transliteration);
  -tou                         same as -tokoi8 (for compatibility with
                               v 0.5 and earlier); the switch is obsolete
                               and its use is not recommended;
=TURKISH=
  -to857                       DOS-857 codepage;
  -to1254                      Windows-1254;
  -toiso8859_9                 ISO-8859-9;
  -tomacturk                   Mac-Turk;
=WEST-EUROPEAN=
  -to850                       DOS-850 codepage;
  -to1252                      Windows-1252;
  -toiso8859_1                 ISO-8859-1;
  -toiso8859_15                ISO-8859-15;
  -tomacroman                  Mac-Roman;

By default the program converts into the Russian StarCraft format.

---Line feed styles---
  -lf       UNIX <LF>;
  -crlf     Windows <CR><LF>;
  -noaltlf  do not alter the line feeds of the original file;

By default the line feed sequence is chosen automatically, depending on
the encoding of the destination file.

---Modes--
  -src      program source conversion mode; prevents some symbols from
            being transformed into double or single quotes (which
            programming languages use to mark strings/characters); this
            switch should also be used if You convert a text that will
            be inserted into a program;
  -html     HTML-compatible conversion mode; prevents the loss of quotes
            and the generation of extra symbols (brackets etc.) that
            would corrupt the structure of an HTML document; use it when
            converting HTML files, Java/VBScripts, PHP, server CGI
            scripts and so on;
  -dhcs     deny HTML character substitution; DISABLES replacing a
            character with an HTML named entity (e.g. &copy; for the
            copyright sign) when the character is not present in the
            destination encoding; can be used with the -html switch;
  -nta      disable the text analyzer; the text analyzer is a feature
            that changes symbols of the input text depending not only on
            their codes but also, e.g., on the preceding symbols; at
            present it has only one capability: replacing plain
            quotation marks with better-looking opening and closing
            ones, working out by itself which one goes where; with this
            switch all capabilities of the analyzer are disabled;
            preferable for text files; for program sources the -src
            switch is better; for HTML or scripts the -html switch is
            better (see above);
  -noenc    do not convert the encoding; this switch is to be used ONLY
            if you run the program simply to adjust a file's line feed
            style (e.g. for a CGI program; see the -lf and -crlf
            switches) and don't need an encoding translation;
  -32       do not use DOS special symbols (with codes < 32); the switch
            forbids replacing some "extended" symbols (such as the
            paragraph sign) with their DOS look-alikes among the special
            symbols, and likewise in the reverse conversion; works ONLY
            with the DOS-866 encoding; with other encodings symbols < 32
            are not altered;
  -nstr     do not translate symbols into strings; the switch forbids
            replacing symbols that do not exist in the destination
            codepage with combinations of other symbols; use it if You
            do not want the output file to be larger than the input one;
            IF YOU CONVERT TEXTS IN SOUTH SLAVIC LANGUAGES, DO NOT USE
            THIS SWITCH, because it can cause the loss of some symbols;
            doesn't work when doing transliteration;
  -nqt      do not transliterate the Cyrillic hard and soft signs into
            apostrophes; for use only with transliteration; these
            symbols will be transliterated into a Latin q of the
            corresponding case;
  -pdt      preserve the original file's date/time; the output file's
            date/time will be set to the input file's date/time (or the
            original date/time will not be modified if the same file is
            specified as both input and output);

-----Options----
  -en       use English program messages; this switch is no longer
            needed because program messages are English by default :-)
  -rus      use Russian program messages;
  -ni       do not show version information;
  -? or /?  help;

________What's new in version 0.8.3_______
30.11.2014 (11/30/2014)
1. Help screens and help files have been corrected. There is no longer
   any reason to use short filenames; You may use long ones (since
   version 0.8.2).

________What's new in version 0.8.2_______
23.11.2014 (11/23/2014)
1. I remade the readme.
2. Executable recompiled for Windows. DOS is no longer supported.

________What's new in version 0.8.1_______
18.02.2003 (02/18/2003)
1. Added the -pdt switch to preserve the original file's date/time.

________What's new in version 0.8_________
06.12.2002 (12/06/2002)
1. BUGS with some Russian symbols in KOI8 and Mac-Cyrillic encodings
   fixed.
2. Automatic line feed sequence adjustment depending on the output
   encoding when neither the -lf nor the -crlf switch is specified. In
   that case the program sets <CRLF> for OEM and Win encodings and <LF>
   for others. To turn this feature off use the -noaltlf switch (it
   preserves all CR=13 and LF=10 symbols of the original text, as
   previous TextToSC versions did). The feature doesn't work
   automatically with the transliteration function (-translit).
3. Added an HTML-compatible mode for web document translation, hence
   the new -html and -dhcs switches.

________What's new in version 0.7_________
07.11.2002 (11/07/2002)
1. It is now possible to "adjust" the line feed style of the output
   file, <LF> or <CRLF>, independently of the input file's one. The
   switches -lf and -crlf were added for this purpose. By default (with
   none of these switches) the program doesn't alter LF/CRLF and handles
   line feeds as the previous version did.
2. Switch -noenc added for those who only convert the LF/CRLF style but
   not the encoding. Useful for CGI script authors.
3. Fixed a FOOLISH bug of v 0.6: a lot of Cyrillic symbols were not
   being translated correctly. The bug was in v 0.6 only, not in 0.5.
4. Western European encodings support added.
5. Corrected a bug of v 0.6 and earlier: when transcoding from the
   Russian StarCraft format into another encoding some Russian symbols
   were not converted correctly.

________What's new in version 0.6_________
31.07.2002 (07/31/2002)
1. ISO-IR-111, KOI8-U, Mac-U, KOI7 support added.
   Non-Russian Cyrillic alphabets are now supported as fully as
   possible. The program is moving towards supporting A LOT of different
   languages; the -nec switch is now obsolete.
2. "New" switches -iso8859_5 and -toiso8859_5 for ISO. The old -u and
   -tou for KOI8-R are kept only for compatibility with v 0.5 and
   earlier.
3. New feature: transliteration (Cyrillic into Latin).
4. The program now correctly converts all symbols that are present in
   the destination encoding, e.g. pseudographics in KOI8, 855 and 866,
   extended symbols in Mac and ISO, and so on.
5. The interface is now English by default.
6. Added support for Turkish encodings.

________What's new in version 0.5_________
29.07.2001 (07/29/2001)
1. The program can now convert into any of the formats it can read.
2. Support for codepages 855 (old DOS), MAC and ISO added.
3. Multi-language interface (Russian and English so far).
4. Text analyzer and neater output when writing in 1251.
5. Pseudographic symbols from 866 are replaced in other encodings with
   visually close ones (only + - and | so far).

________What's new in version 0.4_________
14.07.2001 (07/14/2001)
1. Supports not only Russian but other Cyrillic symbols as well (when
   converting from the Windows encoding).
2. Support for converting additional symbols from the Windows encoding.
3. The conversion can now change symbols into groups (strings) if the
   target encoding lacks the necessary symbol, so the output file can be
   larger than the input was (the feature may be disabled with the -nstr
   switch).
4. New modes and command line switches. The -? command now displays only
   a short reference; the rest is in the README file (you're reading it
   now).

________What's new in version 0.3_________
1. Now converts not only into the StarCraft format but into DOS as well
   (for those who don't like StarCraft).
2. Now understands the KOI8-R encoding.
3. It is now possible to specify the same file as both source and
   destination, so instead of using the wintosc.bat file everything can
   simply be given on the command line (e.g., instead of the command
       wintosc.bat sample.txt
   it is possible to execute
       texttosc.com -w sample.txt sample.txt
   ). For those who do not trust the program or want to keep backup
   files I have left the wintosc.bat file (as an archaism) and have made
   the wintodos.bat and koitodos.bat files.

______Versions 0.1 and 0.2____________
The program was born when I was writing missions for StarCraft in
Russian and wanted to automate transcoding the texts for them. Well, I
discovered that the Russian version of the game uses some perverted
"clone" of the 866 encoding for Cyrillic.

__!!!___Bugs___!!!__

+--------------------------------+-----------------------+------------------+
| What a bug (description)       | Who found it and when | What's done      |
+--------------------------------+-----------------------+------------------+
| In v 0.7 and earlier a few     | Author, Dec 6 2002    | Fixed in v 0.8   |
| Russian characters were        |                       |                  |
| processed incorrectly in KOI8  |                       |                  |
| and Mac-Cyrillic encodings     |                       |                  |
+--------------------------------+-----------------------+------------------+
| In v 0.6 and earlier, when     | Author, Nov 6 2002    | Fixed in v 0.7   |
| transcoding from the Russian   |                       |                  |
| StarCraft format into another  |                       |                  |
| encoding, some Russian symbols |                       |                  |
| were not converted correctly   |                       |                  |
+--------------------------------+-----------------------+------------------+
| In v 0.6 there was a HUGE bug  | Alexander Bakhuta,    | Fixed in v 0.7   |
| causing incorrect translation  | Oct 8 2002            |                  |
| of many Cyrillic symbols (with |                       |                  |
| Turkish all was ok)            |                       |                  |
+--------------------------------+-----------------------+------------------+
| In v 0.5 and earlier some      | Author, July 31 2002  | Fixed in v 0.6   |
| non-Russian Cyrillic symbols   |                       |                  |
| were translated incorrectly    |                       |                  |
| into encodings other than      |                       |                  |
| DOS-866                        |                       |                  |
+--------------------------------+-----------------------+------------------+
| In v 0.3 and 0.4 the Russian   | Author, July 20 2001  | Fixed in v 0.5   |
| small 'io' was not converted   |                       |                  |
| from the KOI8-R encoding       |                       |                  |
+--------------------------------+-----------------------+------------------+
| Don't know whether to consider | Author, June or July  | Fixed in v 0.4;  |
| this a bug, but until v 0.4    | 2001 (don't remember) | finally fixed in |
| the program didn't process a   |                       | v 0.6            |
| lot of 1251 codepage extended  |                       |                  |
| symbols (such as alternative   |                       |                  |
| quotes)                        |                       |                  |
+--------------------------------+-----------------------+------------------+

BUGS ARE LISTED ONLY FOR PUBLICLY DISTRIBUTED VERSIONS

--------------------------------------------------------------------------
An UPDATED LIST of bug reports can be found at
http://super-m.savesoul.ru/support/ (in Russian so far)
Or simply visit my site (see below). New versions of the program can be
found there too.
--------------------------------------------------------------------------

If You have some version of the program and You have found a bug in it,
please send me its description (of course, nobody is asking You to run
any experiments).

  Reports on "gluks" (aka BUGS) are received by the author
  with gratitude (aka THANKS).
  Reviews are good things too.

  New, fresh and interesting ideas will be met by me with gladness!
  And I'm ready to quietly make them into reality.

  ======= THIS PROGRAM IS FULLY FREEWARE =======
  You may freely distribute it and use it for any purpose.

The Author
Mikhail Shmelev.
E-mail: super-m@ya.ru
Multilanguage Web-site: http://super-m.savesoul.ru/
Program source (free too): http://super-m.savesoul.ru/source/texttosc.zip