Wednesday, October 23, 2013

The language of languages

It's interesting that most widely used programming languages are in English - Java, C, C++, HTML, CSS, JavaScript, and Perl all use English keywords. More interesting is that this isn't because anything else is impossible - non-English programming languages do exist, but the standard that has evolved is to use English for the keywords. I've found a few reasons for this standard from some minimal research. Back when compilers for these languages were being developed, it was decided that supporting other languages would probably be too complicated. One article I read stated that a committee reviewing a compiler that had actually implemented support for multiple languages thought the demo must have been fake, because they simply didn't understand how much could be done with computers. Eventually, it became so accepted to develop new programming languages in English that even developers whose native language wasn't English wrote their programming languages in English. It's possible that the first developers were primarily English speakers, but it seems that once other countries adopted that standard, it was unlikely to change.

The unfortunate thing about this standard is the difficulty it creates for international students, particularly students with no prior programming experience. I know a few Chinese business students who struggle in their CS courses for a combination of reasons. It's already hard to grasp the concepts presented in a language such as Java. Classes, objects, variables, and packages are all terms in Java that have very specific definitions. However, to someone whose first language is not English, a class is something they pay tuition for, an object is a thing sitting on their desk, a variable is a math term, and a package is what they get in the mail. The average English speaker, on the other hand, understands that they have to redefine these terms, or can use their intimate knowledge of the English language to help them understand them - a "class" is a group of things with similar qualities, which we know from learning about biological classification in high school biology. The definition is similar in Java, and that's one of the reasons it is easier for a native English speaker to understand the keywords and terms used. The problem with most CS classes, however, is that they are geared towards native English speakers. In essence, the international students with no prior experience barely learn anything during class time and are left further and further behind in the course.

I've noticed a difference, however, when I have a Chinese-born teacher in Computer Science - the classes seem much more transparent most of the time. I believe this is because the teachers themselves have experienced the difficulty of learning the material in a language not native to them and, as a result, know how to explain it in the simplest possible way. They explain it so that an international student can understand, and that makes the class much easier for the native English speakers too. It really is too bad that there aren't more of those teachers teaching introductory programming courses - when international students get left behind early, it really hurts them in later programming courses. They have to deal with both a language barrier and a lack of mastery of seemingly basic programming skills.

I'm not suggesting that the standard be changed - there are plenty of programming languages that aren't English based, but they're for the interested: an international business student would have little to no drive to study them on top of an already difficult workload. However, I do think it would be great if teachers were trained to explain the terms simply enough for non-native English speakers to understand. At the very least, it would be good if the international students were offered a class with a teacher better suited to their needs. Maybe put the teachers who explain things well in all the introductory classes, so that these students can form a good base before being thrown into a Data Structures course with no clue how to program in Java.

At Stevens, I don't think CAL101, 103 and 105 are enough. A grade of "C+" in those courses is supposed to certify a good English background, but I don't think it certifies a good TECHNICAL English background. The things these business students learn in those classes - reading the newspaper, freewriting, making presentations, and writing persuasive essays - simply don't prepare them properly for what they need to survive in some of their courses at a tech school. I realize these international students who suffer because of the language barrier are unlikely to get special treatment. I'm just saying it'd be nice.

2 comments:

  1. It's a perfectly reasonable issue that should be taken into account. One potential reason why programming languages were initially in English is that early character standards tended to fit in a single byte, with values from 0 to 255. Having a programming language use an alternative character set required a better understanding of encoding, something that even now usually comes down to "use UTF-8 and be done with it".

    Many localized programs keep a table of translations so their text can be shown in many different languages. It's not hard to imagine something like GCC having a table of languages that can be used for error messages and possibly keyword translations.

    From a technical side, the biggest hurdle will be interop between localized code. If someone writes something in Chinese, how does an English program work with it? Do you require an English function name (if it's public) or something else?

    From an education side, it's like flipping a coin. To use your example, a Chinese student in a primarily English class would probably struggle to understand the basics. But an English student in a primarily Chinese class tends to feel that the class is too "broken" (I guess would be the term). Would the ability to localize a language make it easier, or would separating the class into different languages make it easier?

  2. Good blog post, this is an interesting subject. English has become the de facto language for programming, and I always wondered what challenges that presented to non-native English speakers. I remember reading an anecdote once about a guy who was working with foreign programmers, who would use programming keywords like "continue" and "object" much more frequently in speech, which I thought was interesting. I think the problem you are talking about lies, as you mentioned, in the fact that people who are learning the language may have trouble telling whether we mean the everyday sense of a word like "object" or the computer-science sense. If the professor said something like "let us continue," someone who is not familiar with English might be unsure whether he was literally talking about continuing with the lecture, or saying "write the continue statement in this loop". I could be wrong, and I am a native speaker so I can't speak for those who are not, but I imagine this would be a problem. Another issue is that things considered good practice, like writing descriptive variable and function names, become more challenging when your vocabulary is limited. I agree that it would be good if professors of introductory courses were aware of this problem, so that it could be addressed as early as possible in a student's career.

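The keyword-table idea from the first comment can be tried out on a small scale. Below is a minimal sketch in Python, not how GCC or any real compiler works: the table `KEYWORDS_ZH`, the function `translate_keywords`, and the Chinese spellings chosen for each keyword are all assumptions made up for illustration. It uses the standard `tokenize` module so that only keyword-like tokens are swapped, while string literals and comments are left alone.

```python
import io
import tokenize

# Hypothetical keyword table: these Chinese spellings are illustrative
# assumptions, not taken from any real compiler or language standard.
KEYWORDS_ZH = {
    "定义": "def",
    "如果": "if",
    "否则": "else",
    "返回": "return",
}

# A small program written with the localized keywords above.
SAMPLE = (
    "定义 加倍(x):\n"
    "    如果 x > 0:\n"
    "        返回 x * 2\n"
    "    否则:\n"
    "        返回 0\n"
)


def translate_keywords(source, table):
    """Return `source` with localized keywords replaced by English ones.

    Only NAME tokens are rewritten, so a string literal or comment that
    happens to contain a localized keyword is not changed.
    """
    pairs = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        text = tok.string
        if tok.type == tokenize.NAME and text in table:
            text = table[text]
        # Passing (type, text) pairs makes untokenize rebuild the code
        # without relying on the original column positions; the spacing
        # of the result may differ, but it stays valid Python.
        pairs.append((tok.type, text))
    return tokenize.untokenize(pairs)


if __name__ == "__main__":
    namespace = {}
    exec(translate_keywords(SAMPLE, KEYWORDS_ZH), namespace)
    print(namespace["加倍"](21))
```

Note that the function name `加倍` is not translated, only the keywords are. That is exactly the interop question raised in the same comment: an English-speaking caller would still have to type the Chinese name, so a keyword table alone does not settle how localized code should expose its public names.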