It's no secret that at the moment Java is one of the most popular programming languages in the world. The official release date for Java is May 23, 1995.
This article covers the basics: it outlines the core features of the language and its platform, which should be useful for beginner Java developers and can help experienced ones refresh their knowledge.
* The article was prepared on the basis of a report by Eugene Freiman - Java developer of IntexSoft.
The article contains links to external materials.
1. JDK, JRE, JVM
The Java Development Kit (JDK) is a kit for developing Java applications. It includes the Java development tools and the Java Runtime Environment (JRE).
Java development tools include about 40 different tools: javac (compiler), java (application launcher), javap (java class file disassembler), jdb (java debugger), etc.
The JRE is a package of everything needed to run a compiled Java program. It includes the Java Virtual Machine (JVM) and the Java Class Library.
The JVM is a program designed to execute bytecode. Its first advantage is the principle of “Write once, run anywhere”: an application written in Java works the same on all platforms. This is a big advantage of the JVM and of Java itself.
Before the advent of Java, many computer programs were written for specific computer systems, and the preference was given to manual memory management, as more efficient and predictable. Since the second half of the 1990s, after the advent of Java, automatic memory management has become a common practice.
There are many JVM implementations, both commercial and open source. One of the goals of creating a new JVM is to increase performance on a specific platform: each JVM is written separately for its platform, so it can be tuned to run faster there. The most common JVM implementation is HotSpot, the JVM of OpenJDK. Other implementations include IBM J9 and Excelsior JET.
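To see which JVM implementation a program is actually running on, the standard system properties can be queried. The sketch below is only an illustration; the class name JvmInfo and the method describe() are made up for this example, while the property keys are the standard ones.

```java
public class JvmInfo {
    /** Builds a one-line description of the JVM running this code. */
    static String describe() {
        // java.vm.name, java.vm.vendor and java.version are standard system properties.
        return System.getProperty("java.vm.name")
                + " (" + System.getProperty("java.vm.vendor") + "), Java "
                + System.getProperty("java.version");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

The exact text printed depends on the installed JVM and its vendor.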
2. JVM code execution
According to the Java SE specification, getting code to run in the JVM takes 3 steps:
- Loading the bytecode and creating an instance of the Class class
Roughly speaking, a class must be loaded before the JVM can use it. Separate classes, the class loaders, do this; we will return to them a little later.
- Linking
After the class is loaded, the linking process begins, during which the bytecode is parsed and checked. Linking, in turn, takes place in 3 steps:
  - verification of the bytecode: that the instructions are correct, that the operand stack cannot overflow in the given piece of code, and that variable types are compatible; the check happens once for each class;
  - preparation: at this stage, in accordance with the specification, memory is allocated for static fields and they are set to their default values;
  - resolution: symbolic references are resolved (inside a .class file, references to classes, fields and methods appear as numeric indexes into the constant pool rather than as direct references).
- Initializing the resulting Class object
At the last stage, the loaded class is initialized (its static initializers run), and the JVM can begin to execute it.
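The difference between loading a class and initializing it can be observed with Class.forName, whose second argument controls whether initialization is performed. The following is a minimal sketch; the names InitializationDemo, Lazy and EVENTS are invented for the illustration.

```java
public class InitializationDemo {
    // Records events so the order of loading and initialization is visible.
    static final StringBuilder EVENTS = new StringBuilder();

    static class Lazy {
        static {
            // Runs only during the initialization step, not during loading.
            EVENTS.append("initialized;");
        }
    }

    static String run() {
        EVENTS.setLength(0);
        String name = InitializationDemo.class.getName() + "$Lazy";
        ClassLoader loader = InitializationDemo.class.getClassLoader();
        try {
            // initialize = false: the class is loaded but not initialized.
            Class.forName(name, false, loader);
            EVENTS.append("loaded;");
            // initialize = true: the static initializer runs if it has not run yet.
            Class.forName(name, true, loader);
            EVENTS.append("done;");
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return EVENTS.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

On the first call within a JVM, "initialized;" appears between "loaded;" and "done;". On later calls it is absent, because a class is initialized only once.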
3. Class loaders and their hierarchy
Back to class loaders: these are special classes that are part of the JVM. They load classes into memory and make them available for execution. Loaders work with all classes, both our own and those that Java itself needs.
Imagine the situation: we wrote an application and, in addition to the standard classes, it contains a lot of our own classes. How does the JVM cope with this? Java implements deferred class loading, in other words lazy loading. This means that a class is not loaded until it is first used by the application.
Class Loader Hierarchy
The first class loader is the Bootstrap classloader. It is written in C++. This is the base loader that loads all system classes from the rt.jar archive (in Java 8 and earlier; since Java 9 the system classes are packaged as modules instead). There is a slight difference between loading classes from rt.jar and loading our own classes: when the JVM loads classes from rt.jar, it does not perform all the verification steps that it performs for any other class file, since the JVM knows from the start that these classes are already validated. Therefore, you should not add any of your own files to this archive.
The next class loader is the Extension classloader. It loads extension classes from the jre/lib/ext folder. Suppose you want a class to be available every time the Java machine starts. To do this, you can place a JAR containing the class in this folder, and the class can then be loaded without being listed on the classpath. (This mechanism applies to Java 8 and earlier; it was removed in Java 9.)
Another class loader is the System classloader. It loads classes from the classpath that we specified when the application started.
Classes are loaded by following the hierarchy:
- First of all, the class is looked up in the cache of the System classloader (this cache contains the classes the loader has already loaded);
- If the class is not found in the cache of the System classloader, the cache of the Extension classloader is checked;
- If the class is not found in the cache of the Extension classloader, the class is requested from the Bootstrap classloader.
If the class is not in the Bootstrap cache either, the Bootstrap classloader tries to load it. If Bootstrap cannot load the class, it hands the job back to the Extension classloader. If the class is loaded at this point, it stays in the cache of the Extension classloader, and class loading is complete. Otherwise the System classloader tries to load it from the classpath, and if that also fails, a ClassNotFoundException is thrown.
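The hierarchy described above can be inspected from code by walking the parent links of a class loader. This is a small sketch; LoaderChain and chain() are hypothetical names. The Bootstrap classloader is not a Java object, so it is represented by null, which the sketch prints as "bootstrap".

```java
import java.util.ArrayList;
import java.util.List;

public class LoaderChain {
    /** Lists the class names of a loader and all of its parents, ending with the bootstrap loader. */
    static List<String> chain(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader current = loader; current != null; current = current.getParent()) {
            names.add(current.getClass().getName());
        }
        // A null parent stands for the native Bootstrap classloader.
        names.add("bootstrap");
        return names;
    }

    public static void main(String[] args) {
        // Loader chain used for application classes from the classpath.
        System.out.println(chain(ClassLoader.getSystemClassLoader()));
        // Core classes such as String are loaded by the bootstrap loader, reported as null.
        System.out.println(String.class.getClassLoader());
    }
}
```

The loader class names that are printed differ between Java versions, for example between Java 8 and Java 9 or later.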
4. Class file structure and boot process
Let us move on to the structure of class files.
A class written in Java is compiled into a single file with the .class extension. If a Java source file contains several classes, it is compiled into several .class files, one bytecode file per class.
All numbers, strings, and references to classes, fields and methods are stored in the constant pool, which at run time resides in the Metaspace memory area. The class description is stored in the same place and contains the name, modifiers, superclass, superinterfaces, fields, methods and attributes. Attributes, in turn, may contain any additional information.
Thus, when a class is loaded:
- the class file is read and its format is validated
- a representation of the class is created in the constant pool (Metaspace)
- its superclasses and superinterfaces are loaded; if they cannot be loaded, the class itself is not loaded either
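The beginning of a class file can be read directly to get a feel for its structure: a 4-byte magic number (0xCAFEBABE), the minor and major format versions, and then the number of constant pool entries. The sketch below reads these fields for a top-level class. ClassFileHeader and readHeader() are names invented for this example, and it assumes the class file is reachable as a resource.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ClassFileHeader {
    /** Returns {magic, minorVersion, majorVersion, constantPoolCount} for a top-level class. */
    static int[] readHeader(Class<?> type) {
        // Assumption: for a top-level class the resource name is SimpleName + ".class".
        String resource = type.getSimpleName() + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("Class file not found: " + resource);
            }
            DataInputStream data = new DataInputStream(in);
            int magic = data.readInt();                       // always 0xCAFEBABE
            int minor = data.readUnsignedShort();             // minor format version
            int major = data.readUnsignedShort();             // major format version (52 = Java 8)
            int constantPoolCount = data.readUnsignedShort(); // number of pool entries plus one
            return new int[] {magic, minor, major, constantPoolCount};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = readHeader(Object.class);
        System.out.println("magic = " + Integer.toHexString(header[0]));
        System.out.println("version = " + header[2] + "." + header[1]);
        System.out.println("constant pool count = " + header[3]);
    }
}
```

The version and constant pool count depend on the JDK in use; only the magic number is the same everywhere.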
5. Bytecode execution on the JVM
First of all, to execute bytecode, the JVM can interpret it. Interpretation is a rather slow process: the interpreter “runs” through the bytecode of the class instruction by instruction and executes each one in turn.
The JVM can also translate it, i.e. compile it into machine code that is executed directly on the CPU.
Code that is executed frequently is not interpreted every time but is compiled to machine code.
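Whether the running JVM has a compiler of this kind, and how much time it has spent compiling, can be checked through the standard management API. This is only a sketch; JitInfo and sumTo() are made-up names, and the loop counts are arbitrary values chosen to make the method "hot".

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

public class JitInfo {
    /** A small method that becomes "hot" when called many times. */
    static long sumTo(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        long total = 0;
        // Call the method often enough that the JVM is likely to compile it.
        for (int i = 0; i < 20_000; i++) {
            total += sumTo(1_000);
        }
        System.out.println("total = " + total);

        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null) {
            System.out.println("This JVM has no compilation system (interpreter only)");
            return;
        }
        System.out.println("JIT compiler: " + jit.getName());
        if (jit.isCompilationTimeMonitoringSupported()) {
            System.out.println("Time spent compiling: " + jit.getTotalCompilationTime() + " ms");
        }
    }
}
```

On HotSpot, running the same program with the -Xint option disables compilation, which makes the cost of pure interpretation visible.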
6. Compilation
A compiler is a program that converts source code written in a high-level programming language into a machine-language program that the computer “understands”.
Compilers are divided into:
- Non-optimizing
- Simple optimizing (HotSpot Client): work quickly, but generate non-optimal code
- Complex optimizing (HotSpot Server): perform complex optimizing transformations before generating machine code
Compilers can also be classified by when compilation takes place:
- Dynamic compilers
They work at the same time as the program, which affects performance. It is important that these compilers are applied to code that is executed often. While the program runs, the JVM tracks which code is executed most often and, so as not to interpret it over and over, translates it into instructions that are executed directly on the processor.
- Static compilers
They take longer to compile but generate optimal code for execution. Among the advantages: they consume no resources while the program runs, and every method is compiled with optimizations.
7. Organization of memory in Java
The stack is a region of memory in Java that works according to the LIFO scheme: “Last In, First Out”.
It stores the data of method calls. Variables on the stack exist only as long as the method in which they were created is executing.
When any method is called in Java, a frame (a memory area) is created on top of the stack for that call. When the method completes, its frame is removed, freeing memory for subsequent calls. If the stack memory is exhausted, Java throws java.lang.StackOverflowError. For example, this can happen with a recursive function that keeps calling itself until there is not enough memory left on the stack.
Key features of the stack:
- The stack is populated and freed as new methods are called and completed.
- Access to this memory area is faster than access to the heap.
- The default stack size depends on the platform and can be changed with the -Xss JVM option.
- It is thread safe, because a separate stack is created for each thread.
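The stack overflow mentioned above is easy to reproduce with unbounded recursion. The sketch below counts how many frames fit on the stack before StackOverflowError is thrown; StackDepthDemo and its methods are hypothetical names.

```java
public class StackDepthDemo {
    private static int depth;

    // Each call adds one frame to the current thread's stack.
    private static void recurse() {
        depth++;
        recurse();
    }

    /** Returns how many nested calls were made before the stack overflowed. */
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // By the time the error is caught here, the frames have been unwound.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames before overflow: " + measureDepth());
    }
}
```

The number printed varies with the platform, the JVM and the stack size; starting the JVM with a larger -Xss value should increase it.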
Another area of memory in Java is the heap. It is used to store objects. New objects are always created on the heap, while references to them are held in local variables on the stack or in fields of other objects. Objects on the heap are globally accessible, that is, they can be reached from anywhere in the application that holds a reference to them.
The heap is broken into several smaller parts called generations:
- Young generation - the area where recently created objects are located
- Old (tenured) generation - the area where “long-lived” objects are stored
- Prior to Java 8, there was another area, Permanent generation, which contained meta-information about classes, methods, and static variables. Starting with Java 8, this information is stored separately, outside the heap, in Metaspace.
Why was Permanent generation abandoned? First of all, because of an error caused by this area overflowing: Perm had a fixed maximum size and could not expand dynamically, so sooner or later the memory ran out, java.lang.OutOfMemoryError was thrown, and the application crashed.
Metaspace has a dynamic size: at runtime it can grow as needed and by default is limited only by the available native memory (a cap can be set with the -XX:MaxMetaspaceSize option).
Key heap features:
- When this memory area is full, Java throws java.lang.OutOfMemoryError
- Heap access is slower than stack access
- The garbage collector reclaims unused objects
- The heap, unlike the stack, is not thread safe, since any thread can access it
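The memory areas described in this section can be listed at runtime through the standard management API. This is a sketch under the assumption that the JVM exposes its memory pools, as HotSpot does; MemoryPools and poolNames() are names invented for the example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

public class MemoryPools {
    /** Returns the names of the memory pools of the given type (HEAP or NON_HEAP). */
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Heap pools correspond to the generations; non-heap pools include Metaspace on Java 8+.
        System.out.println("Heap pools:     " + poolNames(MemoryType.HEAP));
        System.out.println("Non-heap pools: " + poolNames(MemoryType.NON_HEAP));
        // Upper bound of the heap for this JVM instance, in megabytes.
        System.out.println("Max heap (MB):  " + Runtime.getRuntime().maxMemory() / (1024 * 1024));
    }
}
```

The pool names depend on the JVM and on the garbage collector in use, so the output differs from one setup to another.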
Based on the information above, we will consider how memory management is performed using a simple example:
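    public class App {
        public static void main(String[] args) {
            int id = 23;
            String pName = "Jon";
            Person p = null;
            p = new Person(id, pName);
        }
    }

    class Person {
        int pid;
        String name;

        // Constructor and setter written out to match the walkthrough in the text
        Person(int pid, String personName) {
            this.pid = pid;
            setPersonName(personName);
        }

        void setPersonName(String personName) {
            this.name = personName;
        }
    }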
We have an App class whose only method, main, contains:
- a primitive variable id of type int with the value 23
- a reference variable pName of type String with the value Jon
- a reference variable p of type Person
As already mentioned, when a method is called, a memory area is created at the top of the stack to hold the data that the method needs.
In our case, this includes the reference p to the Person object: the object itself is stored on the heap, and the reference is stored on the stack. A reference to the string is also placed on the stack, while the string itself is stored on the heap in the String pool. The primitive is stored directly on the stack.
To call the Person(int, String) constructor from main(), a separate frame is created on the stack, on top of the frame of the main() call. It stores:
- this, a reference to the current object
- the primitive id value
- the reference variable personName, which points to a string in the String pool.
Once the constructor is running, it calls setPersonName(), so a new frame is created on the stack again, holding the same kind of data: the object reference, the string reference, and the variable value.
When the setter finishes, its frame disappears from the stack. Then the constructor finishes and the frame created for it is cleared, after which main() completes its work and is also removed from the stack.
If other methods are called, new frames will also be created for them with the context of these specific methods.
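The role of the String pool in this walkthrough can be checked by comparing references. Equal string literals share a single pooled object, while new String(...) always creates a separate object on the heap. The class StringPoolDemo and the helper sameObject() are invented for this sketch.

```java
public class StringPoolDemo {
    /** Compares references, not contents: true only if both point to the same object. */
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "Jon";           // stored in the String pool
        String sameLiteral = "Jon";       // reuses the pooled object
        String copy = new String("Jon");  // a new object on the heap, outside the pool

        System.out.println(sameObject(literal, sameLiteral));   // true
        System.out.println(sameObject(literal, copy));          // false
        System.out.println(sameObject(literal, copy.intern())); // true: intern() returns the pooled object
    }
}
```

This is also why strings should be compared with equals() rather than ==.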
8. Garbage collector
The garbage collector works on the heap. It is a program running inside the Java virtual machine that gets rid of objects that can no longer be accessed.
Different JVMs may have different garbage collection algorithms; there are also different garbage collectors.
We will talk about the simplest collector, Serial GC. Garbage collection can be requested with System.gc(), although the JVM treats this call only as a suggestion.
As already mentioned above, the heap is divided into 2 areas: New generation and Old generation.
New generation (the young generation) includes 3 regions: Eden, Survivor 0 and Survivor 1.
Old generation includes the Tenured region.
What happens when we create an object in Java?
First of all, the object is placed in Eden. If we have already created many objects and there is no more space in Eden, the garbage collector kicks in and frees up memory. This is the so-called minor garbage collection: on the first pass, it cleans the Eden area and moves the “surviving” objects to the Survivor 0 region. Thus, the Eden region is completely freed.
If the Eden area fills up again, the garbage collector processes both the Eden area and Survivor 0, which is currently occupied. After the cleanup, the surviving objects end up in the other region, Survivor 1, and the other two regions are left empty. On the next garbage collection, Survivor 0 will again be selected as the destination region. That is why it is important that one of the Survivor regions is always empty.
The JVM keeps track of objects that are repeatedly copied from one region to another. To optimize this mechanism, once an object has survived a certain threshold number of collections, the garbage collector moves it to the Tenured region.
When there is not enough space for new objects in Tenured, a full garbage collection takes place: Mark-Sweep-Compact.
During this process, the collector determines which objects are no longer used, clears the region of these objects, and defragments the Tenured memory area, i.e. packs the remaining objects together sequentially.
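The collectors at work in a running JVM, and how often they have run, can be observed through the standard management API. The sketch below creates a lot of short-lived garbage and then prints per-collector statistics. GcStats, churn() and totalCollections() are hypothetical names, and the allocation sizes are arbitrary.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcStats {
    /** Sums the collection counts of all collectors; an undefined count (-1) is treated as 0. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount());
        }
        return total;
    }

    /** Allocates short-lived arrays that become garbage immediately; returns a simple checksum. */
    static long churn(int arrays) {
        long checksum = 0;
        for (int i = 0; i < arrays; i++) {
            byte[] garbage = new byte[64 * 1024];
            garbage[0] = 1;
            checksum += garbage[0];
        }
        return checksum;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        churn(20_000);
        System.gc(); // only a request; the JVM may ignore it
        long after = totalCollections();
        System.out.println("Collections during the run: " + (after - before));
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount() + " collections");
        }
    }
}
```

On HotSpot, the serial collector discussed here can be selected with the -XX:+UseSerialGC option; in that case the young-generation and old-generation collectors are typically reported under the names Copy and MarkSweepCompact.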
Conclusion
In this article, we examined the basic building blocks of the Java platform: the JVM, JRE and JDK, the principle and stages of code execution in the JVM, compilation, memory organization, and how the garbage collector works.