Lev Walkin <vlm@lionet.info>
This chapter defines some basic ASN.1 concepts and describes several of the most widely used types. It is by no means an authoritative or complete reference. For a more complete description of ASN.1, please refer to Olivier Dubuisson's book [Dub00] or the ASN.1 body of standards itself [ITU-T/ASN.1].
The Abstract Syntax Notation One is used to formally describe the semantics of data transmitted across the network. Two communicating parties may have different formats of their native data types (e.g. the number of bits in the integer type), so it is important to have a way to describe the data in a manner which is independent of any particular machine's representation. ASN.1 specifications are used to achieve one or more of the following:
Rectangle ::= SEQUENCE {
    height  INTEGER,
    width   INTEGER
}
The complete specification must be wrapped in a module, which looks like this:
UsageExampleModule1
    { iso org(3) dod(6) internet(1) private(4)
      enterprise(1) spelio(9363) software(1)
      asn1c(5) docs(2) usage(1) 1 }
    DEFINITIONS AUTOMATIC TAGS ::=
BEGIN

-- This is a comment which describes nothing.
Rectangle ::= SEQUENCE {
    height  INTEGER,    -- Height of the rectangle
    width   INTEGER     -- Width of the rectangle
}

END
The BOOLEAN type models the simple binary TRUE/FALSE, YES/NO, ON/OFF or a similar kind of two-way choice.
The INTEGER type is a signed integer type without any restrictions on its size. If automatic checking of INTEGER value bounds is necessary, subtype constraints must be used.
SimpleInteger ::= INTEGER

-- An integer with a very limited range
SmallInt ::= INTEGER (0..127)

-- Integer, non-positive (zero is included)
NegativeInt ::= INTEGER (MIN..0)
The ENUMERATED type is semantically equivalent to the INTEGER type with some integer values explicitly named.
FruitId ::= ENUMERATED { apple(1), orange(2) }

-- The numbers in braces are optional,
-- the enumeration may be performed
-- automatically by the compiler
ComputerOSType ::= ENUMERATED {
    FreeBSD,    -- will be 0
    Windows,    -- will be 1
    Solaris(5), -- will remain 5
    Linux,      -- will be 6
    MacOS       -- will be 7
}
The OCTET STRING type models a sequence of 8-bit bytes. It may be used to transmit opaque data or data serialized by other types of encoders (e.g. a video file or a photo).
The OBJECT IDENTIFIER is used to represent the unique identifier of any object, starting from the very root of the registration tree. If your organization needs to uniquely identify something (a router, a room, a person, a standard, or whatever), you are encouraged to get your own identification subtree at http://www.iana.org/protocols/forms.htm.
For example, the very first ASN.1 module in this document has the following OBJECT IDENTIFIER: 1 3 6 1 4 1 9363 1 5 2 1 1.
ExampleOID ::= OBJECT IDENTIFIER

usageExampleModule1-oid ExampleOID
    ::= { 1 3 6 1 4 1 9363 1 5 2 1 1 }

-- An identifier of the Internet.
internet-id OBJECT IDENTIFIER
    ::= { iso(1) identified-organization(3) dod(6) internet(1) }
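To make the identifier above more concrete, here is a minimal sketch of how the content octets of an OBJECT IDENTIFIER are formed under BER/DER: the first two arcs X.Y are merged into the single value 40*X + Y, and every value is then written in base 128 with a continuation bit on all bytes but the last. The function names are hypothetical and are not part of the asn1c API; the tag and length octets are not produced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Split one value into 7-bit groups, most significant first. Every byte
 * except the last carries a continuation bit (0x80).
 * tmp must hold 5 bytes; returns the number of bytes produced. */
static size_t arc_to_base128(uint32_t arc, uint8_t tmp[5])
{
    uint8_t rev[5];     /* 32 bits need at most 5 groups of 7 bits */
    size_t n = 0, i;

    do {
        rev[n++] = (uint8_t)(arc & 0x7F);
        arc >>= 7;
    } while (arc != 0);

    for (i = 0; i < n; i++)
        tmp[i] = (uint8_t)(rev[n - 1 - i] | (i + 1 < n ? 0x80 : 0x00));
    return n;
}

/* Hypothetical helper: encode the content octets of an OBJECT IDENTIFIER.
 * Returns the number of bytes written, or 0 on invalid input or when
 * the output buffer is too small. */
size_t oid_encode_contents(const uint32_t *arcs, size_t n_arcs,
                           uint8_t *out, size_t out_size)
{
    size_t used = 0, i;

    assert(arcs != NULL && out != NULL);
    if (n_arcs < 2 || arcs[0] > 2)
        return 0;               /* the root arc is 0, 1 or 2 */
    if (arcs[0] < 2 && arcs[1] > 39)
        return 0;               /* under roots 0 and 1, Y is at most 39 */
    if (arcs[1] > UINT32_MAX - 80)
        return 0;               /* 40 * X + Y would overflow */

    for (i = 1; i < n_arcs; i++) {
        uint8_t tmp[5];
        /* The first two arcs X.Y share one subidentifier: 40 * X + Y. */
        uint32_t v = (i == 1) ? arcs[0] * 40 + arcs[1] : arcs[i];
        size_t n = arc_to_base128(v, tmp);

        if (n > out_size - used)
            return 0;
        memcpy(out + used, tmp, n);
        used += n;
    }
    return used;
}
```

For the arcs 1 3 6 1 4 1 9363 1 5 2 1 1 this yields twelve content octets: 2B 06 01 04 01 C9 13 01 05 02 01 01.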
The RELATIVE-OID type has the semantics of a subtree of an OBJECT IDENTIFIER. There may be no need to repeat the whole sequence of numbers from the root of the registration tree when only some part of that sequence is of interest.
this-document RELATIVE-OID ::= { docs(2) usage(1) }

this-example RELATIVE-OID ::= {
    this-document assorted-examples(0) this-example(1) }
The IA5String type is essentially ASCII, with 128 character codes available (the 7 lower bits of an 8-bit byte).
The UTF8String type is a character string which encodes the full Unicode range (4 bytes) using multibyte character sequences.
The NumericString type represents a character string with an alphabet consisting of digits ("0" to "9") and a space.
The PrintableString type is a character string with the following alphabet: space, "'" (single quote), "(", ")", "+", "," (comma), "-", ".", "/", digits ("0" to "9"), ":", "=", "?", upper-case and lower-case letters ("A" to "Z" and "a" to "z").
The VisibleString type is a character string with an alphabet which is more or less the subset of ASCII between space and "~" (tilde). Alternatively, the alphabet may be described as the PrintableString alphabet given earlier, plus the following characters: "!", '"' (double quote), "#", "$", "%", "&", "*", ";", "<", ">", "@", "[", "\", "]", "^", "_", "`" (single left quote), "{", "|", "}", "~".
The SEQUENCE type is an ordered collection of other simple or constructed types. It resembles the C "struct" construct.
Address ::= SEQUENCE {
    -- The apartment number may be omitted
    apartmentNumber  NumericString OPTIONAL,
    streetName       PrintableString,
    cityName         PrintableString,
    stateName        PrintableString,
    -- This one may be omitted too
    zipNo            NumericString OPTIONAL
}
The SET type is a collection of other simple or constructed types in which ordering is not important: the data may be encoded, and may therefore arrive, in an order different from the order of specification.
The CHOICE type is a choice between the alternatives specified in it. A CHOICE value contains only one of these alternatives at a time, and it is always known which alternative is being decoded or encoded. It resembles the C "union" construct.
The following type defines a response code, which may be either an integer code or a boolean "true"/"false" code.
ResponseCode ::= CHOICE {
    intCode   INTEGER,
    boolCode  BOOLEAN
}
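The resemblance to a C union can be sketched as follows. Because a bare union does not record which member is active, the sketch pairs it with a discriminator field. All type and function names here are hypothetical illustrations, not the code that the compiler actually emits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Which alternative of the CHOICE is currently held. */
typedef enum {
    RESPONSE_NOTHING,       /* not yet filled in */
    RESPONSE_INT_CODE,
    RESPONSE_BOOL_CODE
} response_present_e;

/* Hypothetical C counterpart of ResponseCode: a tagged union. */
typedef struct {
    response_present_e present;
    union {
        long int_code;
        bool bool_code;
    } u;
} response_code_t;

response_code_t response_from_int(long value)
{
    response_code_t rc;
    rc.present = RESPONSE_INT_CODE;
    rc.u.int_code = value;
    return rc;
}

response_code_t response_from_bool(bool value)
{
    response_code_t rc;
    rc.present = RESPONSE_BOOL_CODE;
    rc.u.bool_code = value;
    return rc;
}

/* Render the active alternative as text. Returns the snprintf() result,
 * or -1 when no alternative is selected. */
int response_format(const response_code_t *rc, char *buf, size_t size)
{
    assert(rc != NULL && buf != NULL);
    switch (rc->present) {
    case RESPONSE_INT_CODE:
        return snprintf(buf, size, "intCode=%ld", rc->u.int_code);
    case RESPONSE_BOOL_CODE:
        return snprintf(buf, size, "boolCode=%s",
                        rc->u.bool_code ? "TRUE" : "FALSE");
    default:
        return -1;
    }
}
```

Only the member selected by the discriminator is read, which is the C analogue of always knowing which alternative is being encoded or decoded.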
The SEQUENCE OF type is a list (array) of simple or constructed types:
-- Example 1
ManyIntegers ::= SEQUENCE OF INTEGER

-- Example 2
ManyRectangles ::= SEQUENCE OF Rectangle

-- More complex example:
-- an array of structures defined in place.
ManyCircles ::= SEQUENCE OF SEQUENCE {
    radius INTEGER
}
The SET OF type models a bag of structures. It resembles the SEQUENCE OF type, but the order is not important: the elements may arrive in an order which is not necessarily the same as their in-memory order on the remote machine.
-- A set of structures defined elsewhere
SetOfApples ::= SET OF Apple

-- Set of integers encoding the kind of a fruit
FruitBag ::= SET OF ENUMERATED { apple, orange }
The purpose of the ASN.1 compiler, of which this document is part, is to convert ASN.1 specifications into some other target language (currently, only C is supported). The compiler reads the specification and emits a series of target language structures and surrounding maintenance code. For example, the C structure which the compiler may create to represent the simple Rectangle specification defined earlier in this document may look like this:
typedef struct Rectangle_s {
    int height;
    int width;
} Rectangle_t;
After building and installing the compiler, the asn1c command may be used to compile the ASN.1 specification:
asn1c <spec.asn1>
asn1c <spec1.asn1> <spec2.asn1> ...
asn1c -EF <spec-to-test.asn1>
After compiling, the following entities will be created in your current directory:
In other words, after compiling the Rectangle module, you have the following set of files: { Makefile.am.sample, Rectangle.c, Rectangle.h, ... }, where "..." stands for the set of additional "helper" files created by the compiler. If you add a simple file containing an int main() routine, everything can even be compiled with a single command:
cc -o rectangle *.c # It could be that simple
First of all, you would want to include one or more header files into your application. For the Rectangle module, including the Rectangle.h file is enough:
#include <Rectangle.h>
Rectangle_t *rect = ...;

asn1_DEF_Rectangle.free_struct(&asn1_DEF_Rectangle, rect, 0);
There are several generic functions available:
Here is how the buffer can be deserialized into the structure:
Rectangle_t *
simple_deserializer(void *buffer, size_t buf_size) {
    Rectangle_t *rect = 0;  /* Note this 0! */
    ber_dec_rval_t rval;

    rval = asn1_DEF_Rectangle.ber_decoder(
            &asn1_DEF_Rectangle,
            (void **)&rect,
            buffer, buf_size,
            0);

    if(rval.code == RC_OK) {
        return rect;    /* Decoding succeeded */
    } else {
        /* Release whatever has been partially decoded */
        asn1_DEF_Rectangle.free_struct(
            &asn1_DEF_Rectangle, rect, 0);
        return 0;
    }
}
Restartable decoding is a little bit trickier: you need to provide the old target structure pointer (which might already be half-decoded) and react to the RC_WMORE return code. This is explained later, in the section on decoding BER.
The Basic Encoding Rules describe the basic way in which a structure can be encoded and decoded. Several other encoding rules (CER, DER) define more restrictive versions of BER, so a generic BER parser is also capable of decoding data produced by CER and DER encoders. The opposite is not true.
The ASN.1 compiler provides the generic BER decoder which is implicitly capable of decoding BER, CER and DER encoded data.
The decoder is restartable (stream-oriented), which means that if the buffer holds less data than expected, the decoder will process whatever is available and ask for more data to be provided. Please note that the decoder may actually process less data than it is given in the buffer, which means that you should be able to make the next buffer contain the unprocessed part of the previous buffer.
Suppose you have two buffers of encoded data: 100 bytes and 200 bytes.
There are two ways to invoke a BER decoder. The first one is a direct reference to the type-specific decoder, as shown in the earlier simple_deserializer example. The second way is to invoke the ber_decode function, which is just a simple wrapper around the former approach with a less wordy notation:
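The buffer handling that a restartable decoder requires can be sketched as follows. To keep the example self-contained, toy_decode() is a hypothetical stand-in for the real BER decoder: it reads three 32-bit big-endian numbers, consumes only whole 4-byte units, and reports how many bytes it used together with an OK / "want more" code. The feeder keeps the unprocessed tail and places it at the front of the next buffer. None of these names belong to asn1c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum toy_code { TOY_OK, TOY_WMORE, TOY_FAIL };
struct toy_rval { enum toy_code code; size_t consumed; };

/* Toy target structure: three numbers, filled in as data arrives. */
struct toy_target { uint32_t value[3]; size_t filled; };

/* Stand-in for a restartable decoder: consumes whole 4-byte units only,
 * so it may leave up to 3 bytes of the buffer unprocessed. */
struct toy_rval toy_decode(struct toy_target *t,
                           const uint8_t *buf, size_t size)
{
    struct toy_rval rv = { TOY_WMORE, 0 };

    while (t->filled < 3 && size - rv.consumed >= 4) {
        const uint8_t *p = buf + rv.consumed;
        t->value[t->filled++] = ((uint32_t)p[0] << 24)
                              | ((uint32_t)p[1] << 16)
                              | ((uint32_t)p[2] << 8)
                              |  (uint32_t)p[3];
        rv.consumed += 4;
    }
    if (t->filled == 3)
        rv.code = TOY_OK;
    return rv;
}

/* Decoding state plus the bytes the decoder has not consumed yet. */
struct feeder {
    struct toy_target target;
    uint8_t pending[64];
    size_t pending_len;
};

void feeder_init(struct feeder *f)
{
    assert(f != NULL);
    memset(f, 0, sizeof *f);
}

/* Append a newly arrived chunk, run the decoder over everything that is
 * pending, then move the unprocessed tail to the front of the buffer. */
enum toy_code feed_chunk(struct feeder *f, const uint8_t *chunk, size_t len)
{
    struct toy_rval rv;

    if (len > sizeof f->pending - f->pending_len)
        return TOY_FAIL;        /* this sketch uses a fixed-size buffer */
    memcpy(f->pending + f->pending_len, chunk, len);
    f->pending_len += len;

    rv = toy_decode(&f->target, f->pending, f->pending_len);

    memmove(f->pending, f->pending + rv.consumed,
            f->pending_len - rv.consumed);
    f->pending_len -= rv.consumed;
    return rv.code;
}
```

Calling feed_chunk() once per arriving buffer is enough: while it returns TOY_WMORE the same feeder is reused, so the half-filled target and the leftover bytes carry over to the next call.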
rval = ber_decode(&asn1_DEF_Rectangle, (void **)&rect, buffer, buf_size);
These two ways of invocation are fully equivalent.
The BER decoder may fail because (the following RC_... codes are defined in ber_decoder.h):
Please look into ber_decoder.h for the precise definition of ber_decode() and related types.
The Distinguished Encoding Rules are a variant of BER oriented toward representing structures whose length is known beforehand. This is probably exactly how you want to encode: whether the target structure was produced by BER decoding or filled in manually, the size of the data it contains is known before encoding starts. DER is used, for example, to encode X.509 certificates.
As with BER decoder, the DER encoder may be invoked either directly from the ASN.1 type descriptor (asn1_DEF_Rectangle) or from the stand-alone function, which is somewhat simpler:
#include <errno.h>      /* errno */
#include <stdio.h>      /* FILE, fwrite(), fprintf() */
#include <string.h>     /* strerror() */
#include <sys/types.h>  /* ssize_t */
#include <Rectangle.h>

/*
 * This is a custom function which writes the
 * encoded output into some FILE stream.
 */
int
_write_stream(void *buffer, size_t size, void *app_key) {
    FILE *ostream = app_key;
    size_t wrote;

    wrote = fwrite(buffer, 1, size, ostream);

    return (wrote == size) ? 0 : -1;
}

/*
 * This is the serializer itself,
 * it supplies the DER encoder with the
 * pointer to the custom output function.
 */
ssize_t
simple_serializer(FILE *ostream, Rectangle_t *rect) {
    der_enc_rval_t rval;    /* Return value */

    rval = der_encode(&asn1_DEF_Rectangle, rect,
            _write_stream, ostream);
    if(rval.encoded == -1) {
        /*
         * Failure to encode the rectangle data.
         */
        fprintf(stderr, "Cannot encode %s: %s\n",
            rval.failed_type->name,
            strerror(errno));
        return -1;
    } else {
        /* Return the number of bytes */
        return rval.encoded;
    }
}
If the custom write function is not given (passed as 0), the DER encoder will do essentially the same thing (i.e., encode the data), but no callbacks will be invoked, so the data goes nowhere. This may prove useful for determining the size of the structure's encoding before actually doing the encoding.
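The size-probing idea can be sketched with a self-contained toy encoder. Everything below is hypothetical and only mirrors the callback pattern described above: toy_encode_rect() stands in for the real encoder and writes each number as one length byte followed by its minimal big-endian bytes, which is a made-up format rather than DER. A first pass with a null callback yields the size, and a second pass writes into a buffer of exactly that size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Output callback: returns 0 on success, -1 on failure. */
typedef int (*toy_consume_f)(const void *data, size_t size, void *key);

/* Callback target: a fixed-size memory buffer. */
struct mem_sink { uint8_t *buf; size_t size; size_t used; };

static int mem_sink_write(const void *data, size_t size, void *key)
{
    struct mem_sink *s = key;

    if (size > s->size - s->used)
        return -1;
    memcpy(s->buf + s->used, data, size);
    s->used += size;
    return 0;
}

/* Toy format: one length byte, then the value in minimal big-endian form.
 * With a null callback the bytes are only counted, never delivered. */
static long toy_encode_uint(uint32_t v, toy_consume_f cb, void *key)
{
    uint8_t out[5];
    size_t n = 1, i;
    uint32_t t;

    for (t = v; t > 0xFF; t >>= 8)
        n++;
    out[0] = (uint8_t)n;
    for (i = 0; i < n; i++)
        out[1 + i] = (uint8_t)(v >> (8 * (n - 1 - i)));

    if (cb != NULL && cb(out, n + 1, key) < 0)
        return -1;
    return (long)(n + 1);
}

/* Returns the number of encoded bytes, or -1 on failure. */
long toy_encode_rect(uint32_t height, uint32_t width,
                     toy_consume_f cb, void *key)
{
    long a = toy_encode_uint(height, cb, key);
    long b = (a < 0) ? -1 : toy_encode_uint(width, cb, key);

    return (a < 0 || b < 0) ? -1 : a + b;
}

/* Two passes: measure with a null callback, then encode for real. */
uint8_t *toy_encode_rect_alloc(uint32_t height, uint32_t width, size_t *len)
{
    struct mem_sink sink;
    long need;

    assert(len != NULL);
    need = toy_encode_rect(height, width, NULL, NULL);
    if (need < 0)
        return NULL;

    sink.buf = malloc((size_t)need);
    if (sink.buf == NULL)
        return NULL;
    sink.size = (size_t)need;
    sink.used = 0;

    if (toy_encode_rect(height, width, mem_sink_write, &sink) != need) {
        free(sink.buf);
        return NULL;
    }
    *len = sink.used;
    return sink.buf;
}
```

The cost of this approach is that the structure is traversed twice; in exchange, the output buffer never has to be grown or over-allocated.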
Please look into der_encoder.h for the precise definition of der_encode() and related types.
Sometimes the target structure needs to be validated. For example, if the structure was created by the application (as opposed to being decoded from some external source), some important information required by the ASN.1 specification might be missing. On the other hand, the successful decoding of the data from some external source does not necessarily mean that the data is fully valid either. It might well be the case that the specification describes some subtype constraints that were not taken into account during decoding, and it would actually be useful to perform the last check when the data is ready to be encoded or when the data has just been decoded to ensure its validity according to some stricter rules.
The asn_check_constraints() function checks the type for various implicit and explicit constraints. It is recommended to call asn_check_constraints() after each decoding and before each encoding.
Please look into constraints.h for the precise definition of asn_check_constraints() and related types.
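As an illustration of the kind of check involved, the sketch below validates a value against the SmallInt ::= INTEGER (0..127) subtype constraint shown earlier in this document and fills in an error message on failure. The function check_small_int() is a hypothetical stand-in written for this example; it is not asn_check_constraints() and does not reproduce its signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical constraint check for SmallInt ::= INTEGER (0..127).
 * Returns 0 when the value satisfies the constraint, -1 otherwise.
 * On failure a human-readable reason is written into errbuf. */
int check_small_int(long value, char *errbuf, size_t errlen)
{
    assert(errbuf != NULL || errlen == 0);

    if (value >= 0 && value <= 127)
        return 0;

    if (errlen > 0)
        snprintf(errbuf, errlen,
                 "SmallInt: value %ld out of range (0..127)", value);
    return -1;
}
```

A decoder that stores the number in a plain C integer would accept 128 without complaint, which is why a separate validation pass after decoding and before encoding is useful.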
There are two ways to print the target structure: either invoke the print_struct member of the ASN.1 type descriptor, or use the asn_fprint() function, which is a simpler wrapper around the former:
asn_fprint(stdout, &asn1_DEF_Rectangle, rect);
Freeing the structure is slightly more complex than it may seem. When the ASN.1 structure is freed, all the members of the structure and their submembers are recursively freed too. But it might not be feasible to free the structure itself. Consider the following case:
struct my_figure {      /* The custom structure */
    int flags;          /* <some custom member> */
    /* The type is generated by the ASN.1 compiler */
    Rectangle_t rect;
    /* other members of the structure */
};
In this example, rect is embedded directly in struct my_figure rather than allocated on its own, so the memory holding it must not be passed to free(). To solve this problem, the free_struct routine takes an additional argument (besides the type descriptor and the target structure pointer): a flag specifying whether the outer pointer itself must be freed (0, the default) or left intact (non-zero value).
/* Rectangle_t is defined within my_figure */
struct my_figure *mf = ...;
/*
 * Freeing the contents of the Rectangle_t
 * without freeing the mf->rect memory itself
 */
asn1_DEF_Rectangle.free_struct(
    &asn1_DEF_Rectangle, &mf->rect, 1 /* !free */);

/* Rectangle_t is a stand-alone pointer */
Rectangle_t *rect = ...;
/*
 * Freeing the Rectangle_t
 * and freeing the rect pointer
 */
asn1_DEF_Rectangle.free_struct(
    &asn1_DEF_Rectangle, rect, 0 /* free the pointer too */);
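The effect of that flag can be reproduced in a small self-contained sketch. The types and functions below are hypothetical stand-ins (a toy rectangle that owns one heap-allocated label), not generated asn1c code. They show why an embedded structure needs a "contents only" mode while a separately allocated one does not.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a generated type that owns heap memory. */
struct toy_rect {
    char *label;        /* heap-allocated, owned by the structure */
    int height;
    int width;
};

/* Custom structure embedding the toy type in place. */
struct toy_figure {
    int flags;
    struct toy_rect rect;
};

/* Portable replacement for strdup(); returns NULL on failure. */
char *toy_strdup(const char *s)
{
    size_t n;
    char *copy;

    assert(s != NULL);
    n = strlen(s) + 1;
    copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Free the members of *r. When contents_only is zero, r itself is
 * released too, so r must then come from malloc(). */
void toy_rect_free(struct toy_rect *r, int contents_only)
{
    if (r == NULL)
        return;
    free(r->label);
    r->label = NULL;
    if (!contents_only)
        free(r);
}
```

For an embedded member the call is toy_rect_free(&fig.rect, 1); for a structure obtained from malloc() it is toy_rect_free(rect, 0). Passing 0 for an embedded member would hand free() an address that malloc() never returned.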