OpenExtensions for z/VM

Porting: ASCII to EBCDIC conversion

When porting a program to OpenExtensions, keep an eye out for these areas where the ASCII to EBCDIC conversion may cause problems:

  • Hard-coded ASCII characters in C code as well as shell scripts

    Avoid hard-coded character values, and avoid depending on the numeric values of characters at all. For example, a program might use '\012' (octal) instead of '\n'. A program might also use characters as indices into arrays that were populated using the ASCII values as indices (for example, hash_table['a'] is not the same as hash_table[0x61] in EBCDIC, where 'a' is 0x81).

  • Using the high-order bit of a character for some special purpose

    You can do this in ASCII because only 7 bits are necessary for all the printable characters, but that is not true in EBCDIC, where letters and digits have the high-order bit set (for example, 'a' is 0x81 and '0' is 0xF0).

  • Assuming the alphabet ('a'...'z') is contiguous

    This is true in ASCII, but not in EBCDIC, where the letters fall into three noncontiguous groups (a-i, j-r, and s-z). Even seemingly harmless code like the following probably needs to be changed, because the loop also steps through the non-letter code points between the groups: char c; for (c='a'; c<='z'; c++) { ... }

  • Using code generated by lex or yacc

    Often, packages contain C code that was generated by the lex or yacc utilities. This code will probably contain ASCII dependencies and won't work on VM. The code needs to be generated on VM by re-running the utilities. Note that this may introduce EBCDIC dependencies, making the code less portable to other systems, but at least it will work on VM.

    For example, y.tab.c is typically generated by yacc and there should be commands in the package's makefile instructing make how to invoke yacc to rebuild y.tab.c. There should also be a comment in y.tab.c that specifies the source file that yacc processed to generate y.tab.c.
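
The character-handling points in the list above can be illustrated with a small C sketch. It is an illustration only, not code from any package: the names LOWERCASE, letter_index, count_letters, and is_newline are hypothetical. The idea is to list the letters explicitly and look characters up in that list, so the code never relies on 'a' through 'z' being contiguous or on any particular numeric value.

```c
#include <stddef.h>
#include <string.h>

/* Letters spelled out explicitly; the compiler encodes them in the
 * native code set, so this works for both ASCII and EBCDIC builds. */
static const char LOWERCASE[] = "abcdefghijklmnopqrstuvwxyz";

/* Hypothetical helper: position of c in the alphabet (0..25), or -1 if c
 * is not a lowercase letter. Replaces the ASCII-only idiom (c - 'a'). */
int letter_index(char c)
{
    if (c == '\0')
        return -1;              /* strchr would match the terminator */
    const char *p = strchr(LOWERCASE, c);
    return p ? (int)(p - LOWERCASE) : -1;
}

/* Hypothetical helper: count occurrences of each lowercase letter in s.
 * The table is indexed by alphabet position, not by character value. */
void count_letters(const char *s, size_t counts[26])
{
    for (size_t i = 0; i < 26; i++)
        counts[i] = 0;
    for (; *s != '\0'; s++) {
        int idx = letter_index(*s);
        if (idx >= 0)
            counts[idx]++;
    }
}

/* Compare against the symbolic escape '\n', never '\012' or 0x0A. */
int is_newline(char c)
{
    return c == '\n';
}
```

To visit every letter, a loop can walk the LOWERCASE string itself rather than incrementing a char from 'a' to 'z'.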

Handling conversion of text files in archives

You can set an environment variable in your .profile to handle conversion from ASCII to EBCDIC for text files contained in archives. Here is an example showing how to set an environment variable called A2E and then use it:

 
  $ export A2E='-o from=ISO8859-1,to=IBM-1047'
  .
  .
  .
  $ pax $A2E -rzf foobar.tar.Z

The -o option is not pretty to look at, but once you hide it in a variable, it is easy to use: pax converts the files from ISO8859-1 to IBM-1047 as it reads the archive.

Commands and Functions for Handling Conversion

There are shell commands and CMS commands that handle ASCII to EBCDIC conversion.

Two shell commands that are useful are:

  • iconv. For example, the command:
    iconv -f IBM-1047 -t ISO8859-1 words.txt >converted.txt
    
    converts the file words.txt from the IBM-1047 standard code set to the ISO 8859-1 standard code set and stores it in the file named converted.txt.

  • pax. For example, the command:
    pax -wf testpgm.pax -o from=IBM-1047,to=ISO8859-1 /tmp/posix/testpgm
    
    backs up the /tmp/posix/testpgm directory, which is in the character set IBM-1047, into an archive file that is targeted to an ASCII character set (ISO8859-1).

The CMS commands OPENVM PUTbfs and OPENVM GETbfs let you convert files between ASCII and EBCDIC using the TRANSLATE option.
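
A program can also do the conversion itself, in the same spirit as the iconv command shown earlier. The following is a minimal sketch, assuming the POSIX functions iconv_open(), iconv(), and iconv_close() are available in your C library and that it accepts the same code set names as the command; check both on your system. The helper name convert_buffer is hypothetical.

```c
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper: convert in_len bytes at `in` from code set `from`
 * to code set `to`, writing at most out_cap bytes to `out`.
 * Returns the number of bytes written, or (size_t)-1 on any error
 * (unknown code set, unconvertible input, or output buffer too small).
 */
size_t convert_buffer(const char *from, const char *to,
                      const char *in, size_t in_len,
                      char *out, size_t out_cap)
{
    /* Note the argument order: the target code set comes first. */
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *src = (char *)in;     /* iconv() takes a non-const pointer */
    char *dst = out;
    size_t src_left = in_len;
    size_t dst_left = out_cap;

    /* One call is enough for a complete buffer in a single-byte,
     * stateless code set; stateful encodings would also need a final
     * flush call with a NULL input pointer. */
    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;
    return out_cap - dst_left;
}
```

Under those assumptions, calling convert_buffer("ISO8859-1", "IBM-1047", ...) would mirror the direction used by the A2E variable, and swapping the first two arguments would mirror the iconv command example.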