Pages

Saturday, April 12, 2008

Java

Java is a programming language originally developed by Sun Microsystems and released in 1995 as a core component of Sun Microsystem's Java platform. The language derives much of its syntax from C and C++ but has a simpler object model and fewer low-level facilities. Java applications are typically compiled to bytecode that can run on any Java virtual machine (JVM) regardless of computer architecture.

The original and reference implementation Java compilers, virtual machines, and class libraries were developed by Sun from 1995. As of May 2007, in compliance with the specifications of the Java Community Process, Sun made available most of their Java technologies as free software under the GNU General Public License. Others have also developed alternative implementations of these Sun technologies, such as the GNU Compiler for Java and GNU Classpath.

Philosophy

Primary goals

There were five primary goals in the creation of the Java language:

  1. It should use the object-oriented programming methodology.
  2. It should allow the same program to be executed on multiple operating systems.
  3. It should contain built-in support for using computer networks.
  4. It should be designed to execute code from remote sources securely.
  5. It should be easy to use by selecting what were considered the good parts of other object-oriented languages.

Platform independence

One characteristic, platform independence, means that programs written in the Java language must run similarly on any supported hardware/operating-system platform. One should be able to write a program once, compile it once, and run it anywhere.

This is achieved by most Java compilers by compiling the Java language code halfway (to Java bytecode) – simplified machine instructions specific to the Java platform. The code is then run on a virtual machine (VM), a program written in native code on the host hardware that interprets and executes generic Java bytecode. (In some JVM versions, bytecode can also be compiled to native code, either before or during program execution, resulting in faster execution.) Further, standardized libraries are provided to allow access to features of the host machines (such as graphics, threading and networking) in unified ways. Note that, although there is an explicit compiling stage, at some point, the Java bytecode is interpreted or converted to native machine code by the JIT compiler.

The first implementations of the language used an interpreted virtual machine to achieve portability. These implementations produced programs that ran slower than programs compiled to native executables, for instance written in C or C++, so the language suffered a reputation for poor performance. More recent JVM implementations produce programs that run significantly faster than before, using multiple techniques.

One technique, known as just-in-time compilation (JIT), translates the Java bytecode into native code at the time that the program is run, which results in a program that executes faster than interpreted code but also incurs compilation overhead during execution. More sophisticated VMs use dynamic recompilation, in which the VM can analyze the behavior of the running program and selectively recompile and optimize critical parts of the program. Dynamic recompilation can achieve optimizations superior to static compilation because the dynamic compiler can base optimizations on knowledge about the runtime environment and the set of loaded classes, and can identify the hot spots (parts of the program, often inner loops, that take up the most execution time). JIT compilation and dynamic recompilation allow Java programs to take advantage of the speed of native code without losing portability.

Another technique, commonly known as static compilation, is to compile directly into native code like a more traditional compiler. Static Java compilers, such as GCJ, translate the Java language code to native object code, removing the intermediate bytecode stage. This achieves good performance compared to interpretation, but at the expense of portability; the output of these compilers can only be run on a single architecture. Some see avoiding the VM in this manner as defeating the point of developing in Java; however it can be useful to provide both a generic bytecode version, as well as an optimised native code version of an application.

Implementations

Sun Microsystems officially licenses the Java Standard Edition platform for Microsoft Windows, Linux, and Solaris. Through a network of third-party vendors and licensees, alternative Java environments are available for these and other platforms. To qualify as a certified Java licensee, an implementation on any particular platform must pass a rigorous suite of validation and compatibility tests. This method enables a guaranteed level of compliance and platform through a trusted set of commercial and non-commercial partners.

Sun's trademark license for usage of the Java brand insists that all implementations be "compatible". This resulted in a legal dispute with Microsoft after Sun claimed that the Microsoft implementation did not support the RMI and JNI interfaces and had added platform-specific features of their own. Sun sued and won both damages in 1997 (some $20 million) and a court order enforcing the terms of the license from Sun. As a result, Microsoft no longer ships Java with Windows, and in recent versions of Windows, Internet Explorer cannot support Java applets without a third-party plugin. However, Sun and others have made available Java run-time systems at no cost for those and other versions of Windows.

Platform-independent Java is essential to the Java Enterprise Edition strategy, and an even more rigorous validation is required to certify an implementation. This environment enables portable server-side applications, such as Web services, servlets, and Enterprise JavaBeans, as well as with Embedded systems based on OSGi, using Embedded Java environments. Through the new GlassFish project, Sun is working to create a fully functional, unified open-source implementation of the Java EE technologies.

Automatic memory management

See also: Garbage collection (computer science)

One of the ideas behind Java's automatic memory management model is that programmers be spared the burden of having to perform manual memory management. In some languages the programmer allocates memory for the creation of objects stored on the heap and the responsibility of later deallocating that memory also resides with the programmer. If the programmer forgets to deallocate memory or writes code that fails to do so, a memory leak occurs and the program can consume an arbitrarily large amount of memory. Additionally, if the program attempts to deallocate the region of memory more than once, the result is undefined and the program may become unstable and may crash. Finally, in non garbage collected environments, there is a certain degree of overhead and complexity of user-code to track and finalize allocations. Often developers may box themselves into certain designs to provide reasonable assurances that memory leaks will not occur.

In Java, this potential problem is avoided by automatic garbage collection. The programmer determines when objects are created, and the Java runtime is responsible for managing the object's lifecycle. The program or other objects can reference an object by holding a reference to it (which, from a low-level point of view, is its address on the heap). When no references to an object remain, the unreachable object is eligible for release by the Java garbage collector - it may be freed automatically by the garbage collector at any time. Memory leaks may still occur if a programmer's code holds a reference to an object that is no longer needed—in other words, they can still occur but at higher conceptual levels.

The use of garbage collection in a language can also affect programming paradigms. If, for example, the developer assumes that the cost of memory allocation/recollection is low, they may choose to more freely construct objects instead of pre-initializing, holding and reusing them. With the small cost of potential performance penalties (inner-loop construction of large/complex objects), this facilitates thread-isolation (no need to synchronize as different threads work on different object instances) and data-hiding. The use of transient immutable value-objects minimizes side-effect programming.

Comparing Java and C++, it is possible in C++ to implement similar functionality (for example, a memory management model for specific classes can be designed in C++ to improve speed and lower memory fragmentation considerably), with the possible cost of adding comparable runtime overhead to that of Java's garbage collector, and of added development time and application complexity if one favors manual implementation over using an existing third-party library. In Java, garbage collection is built-in and virtually invisible to the developer. That is, developers may have no notion of when garbage collection will take place as it may not necessarily correlate with any actions being explicitly performed by the code they write. Depending on intended application, this can be beneficial or disadvantageous: the programmer is freed from performing low-level tasks, but at the same time loses the option of writing lower level code. Additionally, the garbage collection capability demands some attention to tuning the JVM, as large heaps will cause apparently random stalls in performance.

Java does not support pointer arithmetic as is supported in, for example, C++. This is because the garbage collector may relocate referenced objects, invalidating such pointers. Another reason that Java forbids this is that type safety and security can no longer be guaranteed if arbitrary manipulation of pointers is allowed.

Syntax

The syntax of Java is largely derived from C++. Unlike C++, which combines the syntax for structured, generic, and object-oriented programming, Java was built exclusively as an object oriented language. As a result, almost everything is an object and all code is written inside a class. The exceptions are the intrinsic data types (ordinal and real numbers, boolean values, and characters), which are not classes for performance reasons.

Hello, world program

This is a minimal Hello world program in Java with syntax highlighting:

// Hello.java
public class Hello {
public static void main(String[] args) {
System.out.println("Hello, World");
}
}

To execute a Java program, the code is saved as a file named Hello.java. It must first be compiled into bytecode using a Java compiler, which produces a file named Hello.class. This class is then launched.

The above example merits a bit of explanation.

  • All executable statements in Java are written inside a class, including stand-alone programs.
  • Source files are by convention named the same as the class they contain, appending the mandatory suffix .java. A class that is declared public is required to follow this convention. (In this case, the class Hello is public, therefore the source must be stored in a file called Hello.java).
  • The compiler will generate a class file for each class defined in the source file. The name of the class file is the name of the class, with .class appended. For class file generation, anonymous classes are treated as if their name was the concatenation of the name of their enclosing class, a $, and an integer.
  • The keyword public denotes that a method can be called from code in other classes, or that a class may be used by classes outside the class hierarchy.
  • The keyword static indicates that the method is a static method, associated with the class rather than object instances.
  • The keyword void indicates that the main method does not return any value to the caller.
  • The method name "main" is not a keyword in the Java language. It is simply the name of the method the Java launcher calls to pass control to the program. Java classes that run in managed environments such as applets and Enterprise Java Beans do not use or need a main() method.
  • The main method must accept an array of String objects. By convention, it is referenced as args although any other legal identifier name can be used. Since Java 5, the main method can also use variable arguments, in the form of public static void main(String... args), allowing the main method to be invoked with an arbitrary number of String arguments. The effect of this alternate declaration is semantically identical (the args parameter is still an array of String objects), but allows an alternate syntax for creating and passing the array.
  • The Java launcher launches Java by loading a given class (specified on the command line) and starting its public static void main(String[]) method. Stand-alone programs must declare this method explicitly. The String[] args parameter is an array of String objects containing any arguments passed to the class. The parameters to main are often passed by means of a command line.
  • The printing facility is part of the Java standard library: The System class defines a public static field called out. The out object is an instance of the PrintStream class and provides the method println(String) for displaying data to the screen while creating a new line (standard out).

No comments: