Programming Language Critiques: Pascal, C, C++, and C-Linda

Jim Basney

May 1995



The essential purpose of a programming language is to allow a programmer to give precise instructions to the computer. Programming languages do this with varying degrees of effectiveness. High-level programming languages have many benefits: they provide for portability; they abstract away machine details to simplify the task of programming; and some of them use type systems to provide program-correctness checking. New programming languages are being developed each year, and one would hope that they are getting better at filling the needs of programmers. This process cannot proceed without a dialogue of criticism on current programming languages and on what programmers would like to see in new languages. In this paper I will survey some criticisms of the Pascal, C, and C++ languages, and I will put forward some of my own criticisms of the C-Linda programming language.

The Pascal programming language has a relatively strong type system, partly because it was originally intended to be a language for instruction, and type checking can help to catch many of the errors of beginning programmers. Pascal allows recursion, an improvement over many earlier programming languages. Its type system includes enumerated types and boolean variables, both of which provide additional typing over C's scheme of treating enumerated types and booleans as integers. Additionally, Pascal provides some run-time type checking, the most notable of which is bounds checking on array references. Rather than a segmentation fault, an error C programmers are very familiar with, Pascal can give the programmer a meaningful error message when an array is accessed outside its declared bounds. However, Pascal's treatment of arrays is cumbersome in other ways, especially the fact that array size is part of the type of the declared array. This makes it impossible to define utility functions and procedures that operate on arrays of any length, which is a severe limitation, particularly in string handling. Also, although this is not a type concern, arrays are passed by value by default. This forces the programmer to use the choice between val and var parameters to signal two things at once: whether a parameter is updated inside a procedure, and whether it is efficient to create a local copy of the parameter. This second decision is one that could be made by the compiler and is, therefore, an unnecessary programming detail. Additionally, Pascal's requirement that a non-basic type be given a name before it can be used for a formal procedure parameter, rather than being written literally in the parameter list, is restrictive. "The discipline of inventing type names is helpful for types that are used often, but it is a distraction for things used only once" (Kernighan 5). A final type critique that can be made is that Pascal does not allow type casting. This is controversial, though, because type casting is essentially used to defeat the type system. If one sees Pascal's strong typing as a plus, then there is no need for type casting.

Pascal also has significant problems with control flow and source code organization. There is "no guaranteed order of evaluation of the logical operators and and or" (Kernighan 7). This rules out tests like "while (i <= XMAX) and (x[i] > 0) do ..." because the programmer can't be assured that the left test will be evaluated before the right test. There is also no break statement for exiting loops and no return statement for functions. This is a result of the one-in, one-out design of Pascal, which can be a useful restriction in terms of source code analysis, but it forces the programmer to write unnecessarily confusing code in some cases. The fact that there is no default clause in case statements makes the lack of a break more cumbersome, and generally makes the case construct unusable. Another effect of the one-in, one-out design is the necessity for an unnatural declaration order. "In particular, procedures and functions must be declared (body and all) before they are used" (Kernighan 5). Also, the fact that declarations must be in a strict order (labels, then constants, then types, then variables, then procedures, then main body) forces the programmer to group declarations by the kind of object being declared, rather than by their logical relation to the program. Brian Kernighan sums up his critique of the Pascal programming language with the following: "I feel that it is a mistake to use Pascal for anything much beyond its original target. In its pure form, Pascal is a toy language, suitable for teaching but not for real programming" (Kernighan 13).
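For contrast, C and C++ guarantee that the && operator evaluates its left operand first and skips the right operand when the left one is false, so the combined bounds-and-value test that Kernighan describes can be written directly. The following is a minimal sketch; the function name count_leading_positive and its parameters are hypothetical, chosen only for illustration.

```cpp
#include <cstddef>

// Hypothetical helper: counts how many elements at the front of x are positive.
// The bounds test (i < n) is guaranteed to run before x[i] is read, so the
// loop never reads past the end of the array.
std::size_t count_leading_positive(const int* x, std::size_t n) {
    std::size_t i = 0;
    while (i < n && x[i] > 0) {
        ++i;
    }
    return i;
}
```

Without a guaranteed evaluation order, the same loop would need a nested test or an extra flag variable to avoid reading x[i] out of bounds.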

Turning now to the C programming language, we will see that much of language critique depends upon a programmer's needs and preferences. Some of the things that Kernighan criticizes Pascal for are fixed in C, and these fixes are lamented by Moylan and Joyner. Looking first at the type system of C, Joyner writes, "Type casting undermines the purpose of strongly typed systems" (Joyner 26). Thus, while Kernighan wanted type casting for Pascal, Joyner would like to see it removed from C to make the typing system more secure. In one way, I see these criticisms as missing the point because type conversion functions can easily be written in Pascal and the type conversion operators of C can always be avoided by the programmer. On the other hand, having type conversion primitives is more convenient. I agree that if a program is well-typed, meaning that each variable and function has a specific type that it operates on, then type casting is unnecessary and even dangerous, allowing poorly typed programs to execute. I think the most dangerous cast is an implicit cast, i.e., a cast that is performed to fit operands to an operator, or function arguments to a function definition, and that is not explicitly declared by the programmer. While in many cases this yields predictable results for compilers strictly following the standard, often the programmer is not aware of how the cast will be performed (for example, by not knowing the correct type declaration of an operand or library procedure). Interestingly, casts provide a limited type of polymorphism, by allowing a function defined for one type to actually be used on parameters of another type. However, the polymorphism here happens at the time of the call, not at the function declaration. This can cause confusion and type inconsistency.
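The two kinds of implicit cast mentioned above can be illustrated with a small sketch. The functions half and average_of_two are hypothetical names invented for this example; the conversions themselves follow the standard rules of C and C++.

```cpp
// Hypothetical helper declared to take an int. A caller who passes a double
// gets a silent conversion that discards the fractional part.
int half(int n) { return n / 2; }

// Hypothetical helper whose author may have expected a fractional result.
// The division is done on ints first; only the already truncated result is
// converted to double to fit the return type.
double average_of_two(int a, int b) { return (a + b) / 2; }
```

Calling half with the value 7.9 compiles and yields 3, and average_of_two(1, 2) yields 1.0 rather than 1.5. Neither call contains a cast that the programmer wrote.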

There are numerous type criticisms of C's handling of arrays. One is that arrays are not primitive objects, i.e., one cannot assign a constant to an array, or perform operations on an array as a whole. This brings up another issue with C that runs through the two critiques being examined here. Dennis Ritchie writes, "At the time of publication of K&R, C was thought of mainly as the system programming language of Unix" (Ritchie 1993). C is therefore an "intermediate-level machine-oriented" (Moylan 2) language that is missing a lot of useful high-level functionality that has since been developed in newer languages. Moylan writes, "We have learnt some new things about language design in the last 20 years, and we do know that some of the things that seemed like a good idea at the time are in fact not such good ideas" (Moylan 4). Thus, the whole issue of whether C (or C++ for that matter) should be the most popular programming language for writing production code is raised. I choose to leave this argument for other authors and continue with the type critique. Another complaint with arrays in C is that there is no run-time bounds checking, like there is in Pascal. Also, in the C implementation of strings (as arrays of characters), there is confusion between "data and metadata" (Joyner 24), because both the data of the string and metadata characters (like the end of string character) are stored in the array. As a type critique, it can be said that data and metadata are two different types of data, and should not be lumped together into the same programming language type.
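The mixing of data and metadata in C strings can be seen in a short sketch. The function name length_seen_by_strlen is hypothetical; std::strlen is the standard library routine.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical demonstration: the buffer holds the bytes 'a', 'b', 0, 'c', 'd',
// but the zero byte doubles as the end-of-string marker, so the library
// treats everything after it as if it were not part of the string.
std::size_t length_seen_by_strlen() {
    char buf[8] = "ab\0cd";
    return std::strlen(buf);  // stops at the first zero byte
}
```

The function returns 2 even though the array has room for 8 characters and was initialized with 5 payload bytes, because the length is not stored separately from the contents.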

Pointers in C are also a source of criticism. Joyner writes, "C pointers are a low-level mechanism that should not be the concern of the programmer" (Joyner 21). Pointers are used to specify a number of programming constructions: val vs. var parameters, recursive structures, and memory allocated at run-time, to name a few. However, as was described above, val vs. var parameters are unnecessary as ways of specifying how a parameter should be passed, as this is a decision that the compiler can make. Also, recursive structures can be defined with a higher-level syntax, as can allocated memory, making pointers unnecessary. Pointers seem to be a major source of complication and confusion in C programs. Abstracting them away would have significant benefits, with seemingly minimal losses. In the same vein, there is no need to have two operators, "." and "->", for specifying structure members, depending on whether the structure is named as an actual variable or through a pointer. This is something that the compiler can decide and is therefore another unnecessary programming detail. Additionally, pointers to void are used specifically to undermine the type system of C.
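A small sketch of the two member operators and of pointers standing in for var parameters follows. The type Point and the functions bump_copy and bump_in_place are hypothetical names used only for illustration.

```cpp
// Hypothetical structure used for the illustration.
struct Point {
    int x;
    int y;
};

// Passed by value: the function changes its own copy (a Pascal "val"
// parameter), and the member is reached with ".".
int bump_copy(Point p) {
    p.x += 1;
    return p.x;
}

// Passed by pointer: the function changes the caller's object (a Pascal
// "var" parameter), and the same member must now be reached with "->".
int bump_in_place(Point* p) {
    p->x += 1;
    return p->x;
}
```

The two function bodies do the same thing to the same member; only the parameter-passing mechanism differs, yet the programmer has to change both the declaration and every member access.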

The C++ programming language, while susceptible to the above criticisms of the C language, also has some specific type problems. C++ provides a syntax for overloading functions (a weak polymorphism). When an overloaded function is called, C++ tries to find a function declaration whose parameter types exactly match the types of the arguments in the call. However, if this is not possible, C++ tries various type casts to fit the call to one of the definitions for the overloaded function. Combining these two methods for overloading is confusing. While Stroustrup specifies the algorithm used for this matching in his C++ manual, and C++ compilers will catch any ambiguities that are not resolved by the algorithm, the algorithm is complicated and programmers are generally left to run their code to see which version of the function is called. In addition to overloading functions, C++ allows the programmer to give functions default parameters, so that if the function is called with fewer than the defined number of parameters, the default parameters are used for the missing parameters. However, "optional parameters mean that C++ is not type safe, and that the compiler can not check that the parameters in the call exactly match the function signature" (Joyner 11).
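The following sketch shows overload matching through conversions and a default parameter side by side. The functions kind and scale are hypothetical names invented for this example.

```cpp
#include <string>

// Two hypothetical overloads. Neither takes a char or a float, so calls with
// those argument types are resolved through promotions, not exact matches.
std::string kind(int) { return "int"; }
std::string kind(double) { return "double"; }

// Hypothetical function with a default parameter: scale(5) is compiled as
// scale(5, 2).
int scale(int value, int factor = 2) { return value * factor; }
```

Here kind('a') selects the int version and kind(2.5f) selects the double version, while a call such as kind(1L) is rejected by the compiler as ambiguous because a long converts equally well to either parameter type. A reader must know the promotion rules to predict any of these outcomes from the call site alone.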

Additionally, there is criticism of the way C++ implements object-oriented programming. One of my main complaints with C++, which Joyner also makes in his paper, is the fact that private class members must be included in the header file that is exported to the user of the class. This seems to go against the object-oriented paradigm, where the user of the class should not know how the class is implemented, but is only told how to use the interface to that class. There is also no way to declare a method for a class that cannot be overridden by a child class. Lastly, virtual function declarations in classes are unnecessary bookkeeping that could be done by the compiler.

"A compilation system can detect polymorphism, and generate the underlying virtual code, where and only where necessary. Having to specify virtual burdens the programmer with another bookkeeping task. This is the main reason why C++ is a weak object-oriented language as the programmer must constantly be concerned with low-level details" (Joyner 5).

Thus, C++ follows the C model of intermediate- or low-level programming, even when providing facilities for an object-oriented paradigm. It does not provide the level of abstraction that one would expect from a relatively new programming language.
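The bookkeeping that Joyner objects to can be sketched as follows. The classes Base and Derived and their member functions are hypothetical; the only difference between the two member functions is the virtual keyword in the base class.

```cpp
#include <string>

// Hypothetical base class with one non-virtual and one virtual member.
struct Base {
    std::string plain() const { return "Base"; }            // not virtual
    virtual std::string dynamic() const { return "Base"; }  // virtual
    virtual ~Base() {}
};

// Hypothetical child class that redefines both members.
struct Derived : Base {
    std::string plain() const { return "Derived"; }
    std::string dynamic() const { return "Derived"; }
};
```

When a Derived object is used through a reference of type Base, dynamic() returns "Derived" but plain() returns "Base". The child class redefined both, and the result depends entirely on whether the author of the base class remembered to write virtual.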

The C-Linda programming language, as described by the "C-Linda Reference Manual" (Carriero), is an impressively small, yet powerful, language for concurrent programming. In my work with C-Linda, however, I have come across some issues which I feel are left unanswered by the original language specification. Linda (the concurrent part of C-Linda) uses type information for matching queries to shared memory with data in the shared memory space. This requires the C-Linda system to have a significant amount of type information about the program at run-time. For example, arrays of different sizes are treated as different types, so each array must know its type and size at run-time. In fact, the size of an array returned from the shared memory space can be returned to a user variable. While each program can be optimized to only keep track of type information for objects eventually used in a Linda expression, this is still a considerable amount of overhead. However, I think the benefit of run-time type checking outweighs the overhead, because it provides for type-safe shared memory handling. Another type issue with Linda has to do with struct and union type matching. Linda matches queries that have the same name for a struct or union type. What if two structs share the same name, in separate program modules? Linda's answer is that the match occurs only if the name and the storage size of the struct or union type match. This, however, is not reliable across multiple architectures. The size of integers, for example, could be different on a 16-bit machine compared to a 32-bit machine. If they communicate with each other through a distributed shared memory, then struct and union matching is not reliable. Another question that is not addressed in the manual is whether Linda expressions can contain members with user-defined types (declared in C typedef statements). How would matching occur in this case?
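The cross-architecture problem can be modeled with a short sketch. This is not C-Linda code and does not reflect how any Linda implementation stores type information: TypeTag and tags_match are hypothetical names that only model the name-plus-size rule described above, and the fixed-width types int16_t and int32_t stand in for the int of a 16-bit and a 32-bit machine.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical descriptor: the two facts the matching rule compares.
struct TypeTag {
    std::string name;
    std::size_t size;
};

// Hypothetical model of the rule: a match needs the same name and same size.
bool tags_match(const TypeTag& a, const TypeTag& b) {
    return a.name == b.name && a.size == b.size;
}

// The same logical struct as it might be laid out on two machines
// (an assumption made for illustration).
struct Reading16 { std::int16_t value; };  // int on a 16-bit machine
struct Reading32 { std::int32_t value; };  // int on a 32-bit machine
```

Under this model, a tag built from Reading16 and a tag built from Reading32 carry the same name but different sizes, so a struct written by one machine would never match a query issued by the other.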

One final criticism, although not related to type concerns, is with the Linda eval() function. Per the reference manual, this function creates a process for each parameter to this function, to evaluate any expressions in the parameter (function calls, for example) in parallel. However, in many cases only a small percentage of the eval() parameters are actually spawning useful concurrent processes. Other processes are just evaluating actual values (like "foo" or 23.4), which require no processing. Because of the overhead of spawning a process, this makes Linda inefficient. It would be useful to be able to specify within the language which parameters in the eval() call are meant to be evaluated concurrently, and which ones can just be evaluated before the call is made in the calling process.

One thing that these critiques have made clear to me is that there is a need for industry and academia to catch up with work being done in programming languages. As Moylan points out, there is no reason why we should be using outdated programming paradigms to construct our new programming languages. Adding object-orientation to C is an example of this. He writes, "Adding object-orientation to C is like adding air conditioning to a bicycle" (Moylan 10). As programming projects become more complex, and programming teams become larger, there is a growing need for programming languages to provide higher abstractions, more facilities for cooperation between programmers (better modularization), and advanced program correctness checking (for example, strong type systems). Compiler technology has advanced to the point where it is counter-productive to have the programmer make low-level decisions. Instead, the compiler for each machine should be given a chance to optimize the code for that specific processor. I look forward to exploring the innovative programming tools currently available and to working with the next generation of programming languages.


Bibliography

Carriero, Nicholas and David Gelernter. How to Write Parallel Programs: A First Course. Cambridge, Massachusetts: The MIT Press, 1990. Appendix: C-Linda Reference Manual.

Joyner, Ian. "C++?? A Critique of C++: 2nd Edition." Australia: Unisys, 1992.

Kernighan, Brian W. "Why Pascal is Not My Favorite Programming Language." Murray Hill, New Jersey: AT&T Bell Laboratories, 1981.

Kernighan, Brian W. and Dennis M. Ritchie. The C Programming Language: Second Edition. Englewood Cliffs, New Jersey: Prentice Hall, 1988.

Moylan, P. J. "The Case Against C." Technical Report EE9240. Australia: Department of Electrical and Computer Engineering, University of Newcastle, July 1992.

Ritchie, Dennis. "The Development of the C Language." Murray Hill, New Jersey: AT&T Bell Laboratories, 1993.

Stroustrup, Bjarne. The C++ Programming Language: Second Edition. Reading, Massachusetts: Addison-Wesley Publishing, 1991.


Jim Basney / Oberlin College / jbasney@cs.oberlin.edu
Last Modified: May 17, 1995