# Python or Java – What is better as first language?

Whenever I have the pleasure to teach the beginner course in programming I start with the same question: Python or Java? Since colleagues have good success with Python and since Python is a useful language I am tempted ever again. However, I decide to go with Java every time. Let me explain the reasons.

This year, I even started a discussion on a Python forum to push me towards Python and tell me why I should prefer it. The discussion was an interesting read, but nothing that convinced me.

So I did my own research on Python sites designed for beginners and see how they handle the teaching. As expected, they dive into Python and use its data collections almost from the start, at least shortly after using the Python command line for interactive computations. Some even go quickly into real world usage with Python packages for graphics, artificial intelligence or numerical mathematics.

If Python is that useful and if there are so many libraries available and also quite nice IDEs (my favorite is Spyder) why do I shy back to use it as first language? Here are some reasons.

• I am teaching mathematicians. Mathematicians are meant to program, to test and to improve basic algorithms. They may also use high level libraries for research, but they should at least be able to start on the roots of the code, and often that is the only way to go. Python is not designed for this use. First, it is a lot slower than Java (which is on the level of C code, by the way). The difference can be a factor of 10 to 50. Then, it hides the basic data types (such byte, short, int, long, float, double) and their problems from the user and encourages to use its infinite arithmetic. I feel that mathematicians should understand the bits and bytes, and get full, speedy control of their code.
• Python extends its language to compactify the handling of data sets at the cost of extending the language with almost cryptic syntax. E.g., Python can automatically and in one expression create a list of integers satisfying a certain condition or select from another list under a condition. I am convinced that this kind of behind-the-scene operation is not suited to teach beginners how things work and how to write good and clean code. The situation is a bit like Matlab where everything is vectored and loops are costly. You learn how to use the high-level constructs of your language, but not the basics. If you only have a hammer everything looks like a nail.
• Python is as multi-platform or open-source as Java. Both need an interpreter and cannot be translated to native code to ease implementation of bigger programs on the user systems. Java is even a bit better in this respect. Both languages cannot easily use the native GUI library or other more hardware bound libraries. Python is a bit better here. But this is nothing to worry about as a first language anyway.
• I also need to answer the question: Is it easier to go from Java to Python or the other way? And, for me, the answer is clearly that Java provides the more basic knowledge on programming. Python is on a higher level. It is more a scripting language than a basic programming language. One may argue that this is good for beginners since they get a mighty system right at the start. After all, you learn to drive a car without knowing how the motor works. But I have seen enough programs in Python to know that it spoils the programmer and drives the thoughts away from a fresh, more efficient solution.

This said, I would very much welcome a second course building on the basics that uses Python and its advanced libraries. Many students are taught R and Matlab in the syllabus. I think Python would be a better choice than Matlab for numerical programming and data visualization. And, in contrast to Matlab, it is open and free.

# The speed of C, Java, Python versus EMT

I recently tested Python to see if it is good as a first programming language. The answer is a limited „yes“. Python has certainly a lot to offer. It is a mainstream language and thus there is a lot of support in the net, and there are tons of libraries for all sorts of applications. A modern version like Anaconda even installs all these tools effortlessly.

However, I keep asking myself if it isn’t more important to learn the basics before flying high. Or to put it another way: Should we really start to teach the usage of something like a machine learning toolbox before the student has understood data types and simple loops? This question is important because most high-level libraries take away the tedious work. If you know the right commands you can start problem-solving without knowing much of the underlying programming language.

I cannot decide for all teachers. It depends on the type of students and the type of class. But I can give you an idea of how much a high-level language like Python is holding you back if you want to implement some algorithm yourself. You have to rely on a well-implemented solution being available, or you will be lost.

So here is some test code for the following simple problem. We want to count how many pairs (i,j) of numbers exist such that i^2+j^2 is prime. For the count, we restrict i and j to 1000.

#include
#include

int isprime (int n)
{
if (n == 1 || n == 2) return 1;
if (n % 2 == 0) return 0;
int i = 3;
while (i * i <= n)
{
if (n % i == 0) return 0;
i = i + 2;
}
return 1;
}

int count(int n)
{
int c = 0;
for (int i = 1; i <= n; i++)
for (int j = 1; j <= n; j++)
if (isprime(i * i + j * j)) c = c + 1;
return c;
}

int main()
{
clock_t t = clock();
printf("%d\n",count(1000));
printf("%g", (clock() - t) / 1000.0);
}

/*
Result was:
98023
0.132
*/


So this takes 0.132 seconds. That is okay for one million prime-number checks.

Below is a direct translation in Java. Surprisingly, this is even faster than C, taking 0.127 seconds. The reason is not clear. But I might not have switched on every optimization of C.

public class Test {

static boolean isprime (int n)
{
if (n == 1 || n == 2) return true;
if (n % 2 == 0) return false;
int i = 3;
while (i * i <= n)
{
if (n % i == 0) return false;
i = i + 2;
}
return true;
}

static int count(int n)
{
int c = 0;
for (int i = 1; i <= n; i++)
for (int j = 1; j <= n; j++)
if (isprime(i * i + j * j)) c = c + 1;
return c;
}

static double clock ()
{
return System.currentTimeMillis();
}

public static void main (String args[])
{
double t = clock();
System.out.println(count(1000));
System.out.println((clock() - t) / 1000.0);
}

/*
Result was:
98023
0.127
*/

}



Below is the same code in Python. It is more than 30 times slower. This is unexpected even to me. I have to admit that I have no idea if some compilation trick can speed that up. But in any case, it is the behavior that an innocent student will see. Python should be seen as an interactive interpreter language, not a basic language to implement fast algorithms.

def isprime (n):
if n==2 or n==1:
return True
if n%2==0:
return False
i=3
while i*i<=n:
if n%i==0:
return False
i=i+2
return True

def count (n):
c=0
for k in range(1,n+1):
for l in range(1,n+1):
if isprime(k*k+l*l):
## print(k*k+l*l)
c=c+1
return c

import time
sec=time.time()
print(count(1000))
print(time.time()-sec)

## Result was
## 98023
## 4.791218519210815


I have a version in Euler Math Toolbox too. It uses matrix tricks and is thus very short, using built-in loops in C. But it still cannot compete with the versions above. It is about 3 times slower than Python and 100 times slower than C. The reason is that the isprime() function is not implemented in C but in the EMT language.

>n=1000; k=1:n; l=k'; tic; totalsum(isprime(k^2+l^2)), toc;
98023
Used 13.352 seconds


Below is a version where I use TinyC to check for primes. The function isprimc() can only handle real numbers, no vectors. Thus I vectorize it with another function isprimecv(). This takes a bit of performance.

>function tinyc isprimec (x) ...
$ARG_DOUBLE(x);$  int n = (int)x;
$int res = 1;$  if (n > 2)
${$     if (n%2 == 0) res=0;
$int i=3;$     while (i*i<=n)
${$         if (n%i==0) { res=0; break; }
$i = i+2;$     }
$}$  new_real(res);
\$  endfunction
>function map isprimecv (x) := isprimec(x);
>n=1000; k=1:n; l=k'; tic; totalsum(isprimecv(k^2+l^2)), toc;
98023
Used 0.849 seconds