Java Basic Concepts

OOP

Object-oriented design (OOD): the process of planning a program around "objects", each of which encapsulates data and procedures grouped together to represent a type of entity.

Object-oriented programming (OOP): a programming paradigm based on "objects". Developers group data/attributes (data members) and code/methods (member functions) together into a class in order to represent a type of object.

Neither Java nor C++ is a pure object-oriented language.

Java

  1. It has primitive data types (boolean, char, byte, short, int, long, float, double), which are not objects. (A pure object-oriented language would contain only objects and treat all primitive values, such as integers and characters, as objects.)
  2. The "static" keyword in Java allows us to call a class method without creating an object of that class.
  3. Some objects in Java, like String, can be used without calling their methods. For example, we can concatenate two strings using only the + operator.
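The three points above can be seen in a few lines. This is a minimal sketch; the class name NotPureOoDemo and the method square are made up for illustration.

```java
class NotPureOoDemo {
    // A static method belongs to the class, so no instance is needed to call it.
    static int square(int x) { return x * x; }

    public static void main(String[] args) {
        int primitive = 7;            // a primitive value, not an object
        Integer boxed = primitive;    // autoboxing wraps it in an Integer object
        System.out.println(boxed.equals(7));          // true
        System.out.println(NotPureOoDemo.square(4));  // 16
        System.out.println("Hello, " + "Java");       // Hello, Java (operator, not a method call)
    }
}
```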

OOD has five basic concepts: Classes/Objects, Encapsulation/Data Protection, Inheritance, Interface/Abstraction, and Polymorphism.

Encapsulation

Encapsulation binds together data and the methods that manipulate that data, so that the data can only be used in a proper way and is kept safe from outside interference.

  • Advantages: ① Data hiding: callers need no knowledge of the inner implementation; ② Flexibility: the variables of a class can be made read-only or write-only by providing only the get methods or only the set methods; ③ Testing: encapsulated code is easier to unit test; ④ Reusability: parts of the code can be reused or modified for new requirements.
  • Implementation: In Java, encapsulation is achieved with classes and access modifiers/access specifiers.
  • Access modifiers/access specifiers restrict access to classes, methods and variables from other parts of the program. There are 4 access levels: default (no keyword specified), private, protected and public.
                                     Default  Private  Protected  Public
Same Class                           Yes      Yes      Yes        Yes
Subclass in Same Package             Yes      No       Yes        Yes
Non-subclass in Same Package         Yes      No       Yes        Yes
Subclass in Different Package        No       No       Yes        Yes
Non-subclass in Different Package    No       No       No         Yes
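
A minimal sketch of encapsulation with private fields and public accessors. BankAccount and its members are hypothetical names chosen for this example: the owner is read-only (getter only), and the balance can change only through a method that checks its input.

```java
// Hypothetical class used only for illustration.
class BankAccount {
    private long balanceCents;      // hidden state: not reachable from outside the class
    private final String owner;     // read-only: there is a getter but no setter

    BankAccount(String owner) { this.owner = owner; }

    public String getOwner() { return owner; }
    public long getBalanceCents() { return balanceCents; }

    // The only way to change the balance, so the rule is enforced in one place.
    public void deposit(long cents) {
        if (cents <= 0) throw new IllegalArgumentException("deposit must be positive");
        balanceCents += cents;
    }
}

class EncapsulationDemo {
    public static void main(String[] args) {
        BankAccount account = new BankAccount("Alice");
        account.deposit(1500);
        System.out.println(account.getOwner() + ": " + account.getBalanceCents()); // Alice: 1500
        // account.balanceCents = -1;   // would not compile: balanceCents is private
    }
}
```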

Inheritance: allows us to arrange classes in a hierarchy, so that a new class can inherit features from an existing class and reuse its code. (A class that is derived from another class is called a subclass/derived class/child class/extended class, and the class being inherited from is called a superclass/base class/parent class.) The subclass automatically inherits the accessible data members and member functions of its superclass. Constructors are not inherited, and private members of the superclass are not accessible from the subclass. (Java has no destructors, because memory is reclaimed by garbage collection.) When an instance of the subclass is created, the constructor of the superclass runs before the constructor of the subclass (constructor chaining).

  • Implementation: In Java, the keyword "extends" is used for inheritance. When an object of the subclass is created, the object contains memory for all instance variables declared in its superclasses as well as its own; methods are not copied per object. The Object class (java.lang.Object) is the topmost class in Java: every other class is directly or indirectly derived from it, and it is the only class in Java that has no superclass.
  • Method overriding is a feature that allows the subclass to provide a different implementation of a method already provided by its superclass. The implementation in the subclass replaces/overrides the implementation in the superclass that has the same method name and parameters and the same (or a covariant) return type. Note that it is the type of the object being instantiated (not the type of the reference variable) that determines which version of the method is executed.

Polymorphism

Polymorphism is the ability of an object to take on multiple forms. With polymorphism, developers can have functions with the same name but different implementations.

  • Static Polymorphism/Static Binding/Static Dispatch/Early Binding/Compile Time Polymorphism/Static Linkage: multiple methods share the same name but have different signatures (differing in the number, types or order of parameters) and different implementations. The compiler chooses which one to call at compile time. => overloading
  • Dynamic Polymorphism/Dynamic Dispatch/Late Binding/Runtime Polymorphism/Dynamic Linkage: the method to run cannot be chosen by the compiler. Instead, the JVM selects it at runtime based on the actual type of the object. => overriding

overloading example:

class Calculation {
    // Two overloads: same name, different parameter lists.
    void sum(int a, int b) { System.out.println(a + b); }
    void sum(int a, int b, int c) { System.out.println(a + b + c); }

    public static void main(String[] args) {
        Calculation obj = new Calculation();
        obj.sum(10, 10, 10); // 30
        obj.sum(20, 20);     // 40
    }
}

overriding example:

class Animal {    
   public void move(){
      System.out.println("Animals can move");
   }
}

class Dog extends Animal {
   public void move() {
      System.out.println("Dogs can walk and run");
   }
}

public class TestDog {
   public static void main(String args[]) {
      Animal a = new Animal(); // Animal reference and object
      Animal b = new Dog(); // Animal reference but Dog object
      a.move();//output: Animals can move
      b.move();//output:Dogs can walk and run
   }
}

Object Composition means combining simple objects or data types into more complex ones. In this way, objects of one class may contain objects of other classes, which forms a "has-a" relationship.

Composition over Inheritance (the Composite Reuse Principle) states that a class should achieve polymorphism and code reuse by object composition, that is, by containing instances of other classes that implement the desired functionality, rather than by inheriting from a superclass. To apply the principle, we first create interfaces representing the different functionalities that an object should have.
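
A minimal sketch of composition over inheritance. All names here (Flyable, Swimmable, WingFlight, PaddleSwim, Duck) are invented for the example: Duck gets its behaviour by holding objects that implement small interfaces and delegating to them, rather than by extending a superclass.

```java
// Hypothetical capability interfaces, invented for this sketch.
interface Flyable { String fly(); }
interface Swimmable { String swim(); }

class WingFlight implements Flyable {
    public String fly() { return "flying with wings"; }
}

class PaddleSwim implements Swimmable {
    public String swim() { return "paddling"; }
}

// Duck "has-a" Flyable and "has-a" Swimmable instead of inheriting from a superclass.
class Duck implements Flyable, Swimmable {
    private final Flyable flight;
    private final Swimmable swimming;

    Duck(Flyable flight, Swimmable swimming) {
        this.flight = flight;
        this.swimming = swimming;
    }

    public String fly() { return flight.fly(); }       // delegate to the contained object
    public String swim() { return swimming.swim(); }
}

class CompositionDemo {
    public static void main(String[] args) {
        Duck duck = new Duck(new WingFlight(), new PaddleSwim());
        System.out.println(duck.fly());   // flying with wings
        System.out.println(duck.swim());  // paddling
    }
}
```

Swapping in a different Flyable changes the behaviour of Duck without touching its class hierarchy.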

Interface

An interface is like a blueprint of a class: it specifies the behaviors a class must provide without saying how to implement them. In Java, an interface can have variables and methods. Methods declared without a body are implicitly public and abstract, and all variables are implicitly public, static and final.

Starting from Java 8, we can add default implementations for methods in interfaces, using the "default" keyword.

public interface OldInterface {
    void existingMethod();      // implicitly public abstract

    // Implementing classes inherit this body unless they override it.
    default void newDefaultMethod() {
        System.out.println("New default method"
                + " is added in interface");
    }
}

Java 8 also allows static methods in interfaces. A static interface method is called through the interface name, without any object; it is not inherited by implementing classes.

interface in1 {
    final int a = 10;           // interface fields are implicitly public static final
    static void display() {
        System.out.println("hello");
    }
}
class Main implements in1 {
    public static void main(String[] args) {
        in1.display();          // called through the interface name
    }
}

Abstract

An abstract method is a method that is declared but has no implementation. An abstract class is a class declared with the keyword "abstract"; it usually contains one or more abstract methods, and any class that contains an abstract method must itself be declared abstract. An abstract class cannot be instantiated and requires its concrete subclasses to provide implementations for its abstract methods. So an abstract class can only be inherited from, never instantiated directly.

Abstract Class vs Interface: ① Before Java 8, interfaces contained only abstract methods, so interface methods are abstract by default without the keyword "abstract" (since Java 8 they may also have default and static methods). An abstract class can have both abstract and non-abstract methods, so its abstract methods must be declared with the keyword "abstract". ② Variables in interfaces are static and final by default, while in abstract classes they can be static or non-static, final or non-final. ③ A class (abstract or not) can implement one or more interfaces using the keyword "implements", but it can extend only one class (abstract or non-abstract; if none is named, it extends Object). An interface can extend multiple interfaces. ④ Members of an interface are public by default, while members of an abstract class can also be protected or private.
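
A minimal sketch contrasting the two. Describable, Shape and Rectangle are hypothetical names: the interface only declares behaviour, while the abstract class holds state, shares a concrete method, and leaves one method for subclasses to fill in.

```java
// Hypothetical types, for illustration only.
interface Describable {
    String describe();                    // implicitly public abstract
}

abstract class Shape implements Describable {
    private final String name;            // abstract classes may hold instance state

    protected Shape(String name) { this.name = name; }

    abstract double area();               // subclasses must supply this

    public String describe() {            // concrete method shared by all subclasses
        return name + " with area " + area();
    }
}

class Rectangle extends Shape {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
        super("Rectangle");
        this.width = width;
        this.height = height;
    }

    @Override
    double area() { return width * height; }
}

class AbstractDemo {
    public static void main(String[] args) {
        // Shape s = new Shape("x");      // would not compile: Shape is abstract
        Describable d = new Rectangle(2.0, 3.0);
        System.out.println(d.describe()); // Rectangle with area 6.0
    }
}
```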

Garbage Collection

Garbage Collection is a memory management technique in some programming languages that frees developers from deallocating memory by hand: the garbage collector automatically reclaims objects that are no longer in use.
A GC root (garbage collection root) is a special kind of reference in Java that is always considered reachable; any object that can be reached from a GC root is reachable by the program. There are 4 kinds of GC roots in Java: ① Local variables (kept alive by the stack of a thread) of methods that are currently executing; ② Active Java threads; ③ Static variables; ④ JNI references (Java objects that native code has created as part of a JNI call; such objects are treated specially because the JVM doesn't know whether they are still referenced by the native code).
To determine which objects are no longer in use, the JVM intermittently runs a "Mark-and-Sweep Algorithm", which has two steps: ① Mark → Traverse all object references, starting with the GC roots, and mark every object found as alive; ② Sweep → Reclaim all heap memory that was not marked (not occupied by a live object) and set it free. Garbage collection is intended to remove the cause of classic memory leaks: unreachable-but-not-deleted objects in memory. However, this works only for memory leaks in the original sense. It is possible to have unused objects that are still reachable in the program if the developer forgets to drop the references to them. Such objects can't be garbage-collected, and such a logical memory leak can't be detected automatically by any software; even the best analysis software can only highlight suspicious objects. (Source: https://www.dynatrace.com/resources/ebooks/javabook/how-garbage-collection-works/)
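
A minimal sketch of a logical memory leak, using a hypothetical LeakyCache class. The buffers are never used again, but a static list (a GC root) still references them, so the collector has to keep them alive until the references are dropped.

```java
import java.util.ArrayList;
import java.util.List;

class LeakyCache {
    // A static collection is reachable from a GC root, so everything it holds stays alive.
    private static final List<byte[]> CACHE = new ArrayList<>();

    static void handleRequest() {
        byte[] buffer = new byte[1024];   // only needed during this call
        CACHE.add(buffer);                // forgotten reference: the buffer stays reachable
    }

    static int retained() { return CACHE.size(); }

    // Dropping the references makes the buffers eligible for garbage collection.
    static void release() { CACHE.clear(); }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) handleRequest();
        System.out.println(retained());   // 1000
        release();
        System.out.println(retained());   // 0
    }
}
```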

Keywords

The "this" keyword refers to the current object. It can be used in the following ways: ① Use "this()" to invoke another constructor of the current class (the constructor call must be the first statement in the constructor); ② To invoke a method of the current class; ③ To refer to the instance variables of the current class (if a local variable in a method has the same name as an instance variable, the local variable hides the instance variable, and "this" refers to the instance variable explicitly); ④ To return the current class instance; ⑤ To be passed as a method argument.

// 1. Use "this()" to invoke another constructor of the current class
// Output (one per line): Parameterized Constructor, Default Constructor
class Main {
    Main() {
        this(10, 20);   // must be the first statement
        System.out.println("Default Constructor");
    }
    Main(int a, int b) {
        System.out.println("Parameterized Constructor");
    }
    public static void main(String[] args) {
        Main object = new Main();
    }
}

// 2. Invoke a method of the current class
// Output: Inside Display
class Main {
    Main() {
        this.display();
    }
    void display() {
        System.out.println("Inside Display");
    }
    public static void main(String[] args) {
        Main object = new Main();
    }
}

// 3. Refer to an instance variable of the current class
// Output: a=10
class Main {
    int a;
    Main(int a) {
        this.a = a;   // "this.a" is the field, "a" is the parameter
    }
    void display() {
        System.out.println("a=" + this.a);
    }
    public static void main(String[] args) {
        Main object = new Main(10);
        object.display();
    }
}

// 4. Return the current class instance
// Output: a=10 b=20
class Main {
    int a, b;
    Main() {
        a = 10;
        b = 20;
    }
    Main get() {
        return this;
    }
    void display() {
        System.out.println("a=" + a + " b=" + b);
    }
    public static void main(String[] args) {
        Main object = new Main();
        object.get().display();
    }
}

// 5. Pass the current object as a method argument
// Output: a=10 b=20
class Main {
    int a, b;
    Main() {
        a = 10;
        b = 20;
    }
    void get() {          // returns nothing, so the return type is void
        display(this);
    }
    void display(Main obj) {
        System.out.println("a=" + obj.a + " b=" + obj.b);
    }
    public static void main(String[] args) {
        Main object = new Main();
        object.get();
    }
}

The "super" keyword refers to the superclass part of the current object and is mainly used to resolve ambiguity. It can be used in the following ways: ① If both superclass and subclass have members with the same name (either data members or member functions), we can use "super" to refer to the superclass member; ② "super()", with or without arguments, calls the superclass constructor and must be the first statement in the subclass constructor. (If no explicit call is written, the compiler inserts a call to the no-argument superclass constructor, so the superclass constructor always runs before the body of the subclass constructor.) An explicit constructor call must always be the first statement in a constructor.
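
A minimal sketch of using "super" to reach superclass members that the subclass hides or overrides. Vehicle and Car are hypothetical classes made up for this example.

```java
class Vehicle {
    int maxSpeed = 120;
    String describe() { return "vehicle"; }
}

class Car extends Vehicle {
    int maxSpeed = 180;                       // hides Vehicle.maxSpeed

    @Override
    String describe() {
        return "car, which is a " + super.describe();   // call the overridden superclass method
    }

    int parentMaxSpeed() { return super.maxSpeed; }     // read the hidden superclass field
}

class SuperDemo {
    public static void main(String[] args) {
        Car car = new Car();
        System.out.println(car.maxSpeed);          // 180
        System.out.println(car.parentMaxSpeed());  // 120
        System.out.println(car.describe());        // car, which is a vehicle
    }
}
```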

Constructor Chaining is the process of calling one constructor from another constructor. It can be done in two ways: ① Within the same class, use "this()" (with or without arguments); ② A subclass constructor always invokes a constructor of its superclass, either explicitly (using "super()") or implicitly, which forms a chain of constructor calls from the current subclass up to the Object class. The constructor bodies then run in order from Object down to the subclass (a superclass constructor always completes before the subclass constructor body runs).
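
A minimal sketch that records the order in which constructor bodies run. Base, Derived and the LOG list are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.List;

class Base {
    // Records the order in which constructor bodies execute.
    static final List<String> LOG = new ArrayList<>();

    Base() { LOG.add("Base()"); }
}

class Derived extends Base {
    Derived() {
        this(42);                  // chain to another constructor of the same class
        LOG.add("Derived()");
    }

    Derived(int value) {
        super();                   // explicit here; the compiler inserts it if omitted
        LOG.add("Derived(int)");
    }
}

class ChainingDemo {
    public static void main(String[] args) {
        new Derived();
        System.out.println(Base.LOG);   // [Base(), Derived(int), Derived()]
    }
}
```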

The "static" keyword marks members that belong to the class itself rather than to any instance, which also saves memory because a static member exists once per class instead of once per object. In Java, you can declare static blocks, variables, methods and nested classes with the keyword "static". Once a member is declared static, it can be accessed before any instance of its class is created, and without reference to any instance.

// Java program to demonstrate that a static member 
// can be accessed before instantiating a class 
class Test { 
    // static method 
    static void m1() { 
        System.out.println("from m1"); 
    } 

    public static void main(String[] args){ 
          // calling m1 without creating 
          // any object of class Test 
           m1(); 
    } 
}
  • Static Block/Static Clause is executed exactly once, when the class is initialized (for example when you create the first instance of the class or first access a static member of it). Because of this, the static block runs before any constructor of the class.
  • A static variable gets memory only once and is shared by all instances of the class. Static variables behave much like global variables scoped to the class.
  • A static method belongs to the class rather than to objects of that class, so it can be called without creating an instance. It cannot directly invoke a non-static method or use a non-static variable, because it has no "this" reference; it needs an explicit object reference to do so.
  • Static Main: the main method is called by the JVM before any objects are created, so it must be static in order to be invoked directly via the class.
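
A minimal sketch of a static block, a shared static variable and a static method working together. The Counter class and its members are hypothetical names for this example.

```java
class Counter {
    static int instances;              // one copy shared by every Counter object
    static final String LABEL;

    static {                           // runs once, when the class is initialized
        LABEL = "counter";
        instances = 0;
    }

    final int id;                      // per-instance state

    Counter() {
        instances++;
        id = instances;
    }

    static int count() {               // callable without an instance
        // return id;                  // would not compile: no instance in a static context
        return instances;
    }
}

class StaticDemo {
    public static void main(String[] args) {
        new Counter();
        Counter second = new Counter();
        System.out.println(Counter.count());   // 2
        System.out.println(second.id);         // 2
        System.out.println(Counter.LABEL);     // counter
    }
}
```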

The "final" keyword in Java can be applied to variables, methods and classes to prevent specific modifications.

  • A final variable can be assigned only once, so it acts as a constant. It must be initialized before it is used: at its declaration, in an initializer block, or in every constructor.
  • A final method can't be overridden, which means all subclasses share the same implementation and unexpected changes in behavior are prevented.
  • A final class can't be inherited or extended. Since it has no subclasses, none of its methods can be overridden. A class can never be both abstract and final.
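
A minimal sketch of final variables, using a hypothetical FinalDemo class. Note that "final" fixes the reference, not the object: a final list can still have elements added to it.

```java
import java.util.ArrayList;
import java.util.List;

class FinalDemo {
    static final int MAX_ITEMS = 3;                     // constant: cannot be reassigned

    static List<String> build() {
        final List<String> items = new ArrayList<>();   // the reference is fixed...
        items.add("a");                                 // ...but the object can still change
        items.add("b");
        // items = new ArrayList<>();                   // would not compile: items is final
        return items;
    }

    public static void main(String[] args) {
        System.out.println(build());      // [a, b]
        System.out.println(MAX_ITEMS);    // 3
    }
}
```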

Memory Allocation

Stack memory holds local variables, that is, primitive values and references to heap objects, while all objects are dynamically allocated on the heap. (When we declare a variable of a class type, only a reference is created and no memory is allocated for an object. To allocate memory for an object we must use "new", so object memory is always allocated on the heap.)
Usually the JVM reserves heap memory from the OS in advance and manages it while the program is running. Object creation is therefore fast, since no global synchronization with the OS is needed for every single object. An allocation simply claims part of a memory region and moves the offset pointer forward; the next allocation starts from this offset and claims the next part.
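
A minimal sketch of the difference between a reference and the object it points to. Point and ReferenceDemo are hypothetical names: two variables hold the same reference, so a change made through one is visible through the other, while a primitive is simply copied.

```java
class Point {
    int x;
    Point(int x) { this.x = x; }
}

class ReferenceDemo {
    static void bump(Point p, int n) {
        p.x++;      // changes the shared heap object
        n++;        // changes only the local copy of the primitive
    }

    public static void main(String[] args) {
        Point a = new Point(1);   // "new" allocates the object on the heap
        Point b = a;              // copies the reference, not the object
        int n = 1;
        bump(b, n);
        System.out.println(a.x);      // 2
        System.out.println(n);        // 1
        System.out.println(a == b);   // true
    }
}
```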

A collection is a group of individual objects treated as a single unit in Java. The collections framework is built on two root interfaces: the Map interface (java.util.Map) and the Collection interface (java.util.Collection).

Implementation Classes (only partial):
List {ArrayList, LinkedList, Stack, Vector};
Set {HashSet, LinkedHashSet, TreeSet};
SortedSet {TreeSet};
NavigableSet {TreeSet};
Queue {LinkedList, PriorityQueue};
Deque {LinkedList};
Map {HashMap, Hashtable, LinkedHashMap, TreeMap};
SortedMap {TreeMap};
NavigableMap {TreeMap}.

Data Structure

Map

Map represents a mapping from keys to values, stored as key-value pairs. To access a value you must know its key. There can't be duplicate keys, and each key maps to at most one value.

  • SortedMap is an interface that extends the Map interface and keeps its keys in order, based on either their natural ordering or a specific comparator.
  • HashMap is the class that provides the basic implementation of the Map interface. It is implemented with hashing and maintains an array of buckets (the capacity of a HashMap is the number of buckets), where each bucket is a linked list of key-value pairs (since Java 8, a large bucket is converted into a balanced tree). HashMap does not guarantee any iteration order, and the order may change over time. The get, put and containsKey operations are O(1) on average, assuming the hash function spreads the keys well; the worst case is O(n) when many keys collide in one bucket (roughly O(log n) once that bucket has been converted into a tree). Iterating over a HashMap takes time proportional to the sum of its capacity and its number of key-value pairs. HashMap is unsynchronized, which means it is not thread-safe.
  • LinkedHashMap is similar to HashMap but maintains the insertion order of its entries.
  • TreeMap is a class that implements the SortedMap interface (through NavigableMap). It has the functions of a Map and also keeps its keys sorted, based on their natural ordering or a comparator. It is implemented with a Red-Black Tree. Operations like insert, remove and search (containsKey) take O(log n) time, and a sorted traversal takes O(n) time, because the ordering of keys is maintained during insertion and the traversal only needs to visit each key once.
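
A minimal sketch of the ordering differences between the Map classes above. MapOrderDemo and keysAfterInserting are hypothetical names; the same keys are inserted into each map and the iteration order is compared.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MapOrderDemo {
    // Inserts the same entries into the given map and returns its keys in iteration order.
    static List<String> keysAfterInserting(Map<String, Integer> map) {
        map.put("pear", 3);
        map.put("apple", 1);
        map.put("mango", 2);
        map.put("apple", 10);          // duplicate key: replaces the old value
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterInserting(new LinkedHashMap<>()));  // [pear, apple, mango]
        System.out.println(keysAfterInserting(new TreeMap<>()));        // [apple, mango, pear]
        System.out.println(keysAfterInserting(new HashMap<>()).size()); // 3 (order unspecified)
    }
}
```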

List

List is an ordered collection in Java that can store duplicate values. It provides add, remove, get and set operations based on the numerical position of each element, together with search and other operations.

  • ArrayList is a dynamic array whose capacity grows automatically as elements are added. It also allows random access to elements.
  • LinkedList is a linear data structure in which each node holds pointers to both its previous and its next node in the list.
  • Vector (legacy code) is very similar to ArrayList: it can grow or shrink as required and offers random access to elements. However, Vector is synchronized, which makes it thread-safe.
  • Stack is a Java collection with the restriction that elements can be pushed/added only onto the top and popped/removed only from the top, which is known as the LIFO (Last-In-First-Out) property.

Array vs ArrayList: ① An array in Java has a fixed size and can be created by simply declaring and initializing a variable; ArrayList is a class that implements the List interface and manages an internal array that grows dynamically. ② We access elements of an array with square brackets, while we use methods such as get and set to access elements of an ArrayList. ③ Elements of an array can be either primitive values or objects, while an ArrayList can hold only objects (primitives are autoboxed into wrapper objects). ④ An array is usually faster, since an ArrayList adds method-call and boxing overhead and may need to resize during execution.

ArrayList vs LinkedList: ① ArrayList is a dynamic array while LinkedList is a doubly linked list. ② Insertions and removals in the middle of an ArrayList require shifting the following elements (and occasionally resizing and copying the backing array), which is O(n); appending at the end is amortized O(1). In a LinkedList, insertion and removal are O(1) once the node is known (for example at either end, or through an iterator), but reaching a position by index takes O(n). ③ LinkedList has more memory overhead: an ArrayList stores only the data for each index, while each LinkedList node stores the data plus the addresses of its previous and next nodes.
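
A minimal sketch of the differences above, using a hypothetical ListDemo class: a fixed-size primitive array, an ArrayList that grows and boxes its elements, and a LinkedList modified at both ends.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ListDemo {
    // Copies a primitive array into an ArrayList; each int is autoboxed to an Integer.
    static List<Integer> toList(int[] values) {
        List<Integer> list = new ArrayList<>();
        for (int value : values) list.add(value);
        return list;
    }

    public static void main(String[] args) {
        int[] fixed = {3, 1, 2};      // array: fixed length, holds primitives directly
        fixed[0] = 9;                 // element access with square brackets

        List<Integer> dynamic = toList(fixed);
        dynamic.add(7);               // the list grows as needed
        System.out.println(fixed.length + " " + dynamic.size());  // 3 4
        System.out.println(dynamic.get(0));                        // 9

        LinkedList<Integer> linked = new LinkedList<>(dynamic);
        linked.addFirst(0);           // O(1) at either end of a doubly linked list
        linked.removeLast();
        System.out.println(linked);   // [0, 9, 1, 2]
    }
}
```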

Set

Set is a collection in Java that cannot store duplicate values; the Set interface itself does not guarantee any ordering. It provides add, remove and contains operations, among others. The common Set classes (HashSet, LinkedHashSet, TreeSet) are internally backed by a Map.

  • SortedSet is an ordered collection that extends the Set interface and keeps its elements in sorted order. All elements in a SortedSet must be mutually comparable.
  • HashSet is a class that implements the Set interface, so duplicate values are not allowed. Its underlying data structure is a hash table. HashSet stores elements based on their hash code rather than the order in which you insert them. The load factor of a HashSet is a measure of how full it is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the internal data structure is rebuilt (the hash table is rehashed). Implementation: HashSet is internally implemented with a HashMap. The value we insert into a HashSet acts as a key of the HashMap, and Java uses one constant object as the value, so all keys map to the same value. The average time complexity of the add, remove and contains methods in HashSet is O(1).
  • LinkedHashSet is an ordered version of HashSet that maintains a doubly-linked list among all elements, so the order of traversing a LinkedHashSet is predictable (insertion order).
  • TreeSet is a class that implements the SortedSet interface, and duplicate values are not allowed either. It keeps its elements sorted by their natural ordering or by a specific comparator (rather than preserving the insertion order). It is implemented with a self-balancing binary search tree (a Red-Black Tree), so operations like add, remove and contains take O(log n) time. Traversing a TreeSet of n elements in sorted order takes O(n) time, since the ordering is maintained during insertion.
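
A minimal sketch comparing the three Set classes on the same input. SetDemo and INPUT are hypothetical names for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetDemo {
    static final List<String> INPUT =
            Arrays.asList("delta", "alpha", "charlie", "alpha", "bravo");

    public static void main(String[] args) {
        Set<String> hash = new HashSet<>(INPUT);
        Set<String> linked = new LinkedHashSet<>(INPUT);
        Set<String> tree = new TreeSet<>(INPUT);

        System.out.println(hash.size());        // 4 (duplicate "alpha" dropped; order unspecified)
        System.out.println(linked);             // [delta, alpha, charlie, bravo]
        System.out.println(tree);               // [alpha, bravo, charlie, delta]
        System.out.println(hash.add("alpha"));  // false: already present
    }
}
```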

Queue

Queue is an ordered list of objects with the restriction that objects are inserted at the end of the queue and removed from the start, which is the FIFO (First-In-First-Out) principle. The Queue interface is implemented by both the LinkedList and PriorityQueue classes; LinkedList is FIFO, while PriorityQueue removes elements in priority order rather than insertion order.

  • PriorityQueue processes objects in the queue based on priority. The priority may be based on their natural ordering or on a user-defined comparator.
  • Deque is a double-ended queue, so insertion and removal of elements are available at either end. It can be used as both a Queue and a Stack.
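
A minimal sketch of the three behaviours above, using a hypothetical QueueDemo class: FIFO order with LinkedList, priority order with PriorityQueue, and LIFO order when a Deque is used as a stack.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedList;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Queue;

class QueueDemo {
    // Drains a queue with poll() and returns the elements in removal order.
    static List<Integer> drain(Queue<Integer> queue) {
        List<Integer> out = new ArrayList<>();
        while (!queue.isEmpty()) out.add(queue.poll());
        return out;
    }

    public static void main(String[] args) {
        Queue<Integer> fifo = new LinkedList<>();
        Queue<Integer> byPriority = new PriorityQueue<>();   // natural ordering: smallest first
        for (int value : new int[] {5, 1, 4}) {
            fifo.offer(value);
            byPriority.offer(value);
        }
        System.out.println(drain(fifo));        // [5, 1, 4]
        System.out.println(drain(byPriority));  // [1, 4, 5]

        Deque<Integer> stack = new LinkedList<>();
        stack.push(1);
        stack.push(2);
        System.out.println(stack.pop());        // 2 (LIFO)
    }
}
```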

HashMap vs Hashtable: HashMap and Hashtable both store key/value pairs in a hash table, so a value can be fetched by its corresponding key. The hash code of the key is computed and used to derive the index of the bucket in which the entry is stored. Their differences are: ① HashMap is non-synchronized. It is not thread-safe and can't be shared among many threads without proper synchronization code. Hashtable is synchronized, so it is thread-safe and can be shared among many threads; ② HashMap allows one null key and multiple null values, but Hashtable doesn't allow any null key or value; ③ HashMap is generally preferred if thread synchronization is not needed.
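
A minimal sketch of the null-key difference. NullKeyDemo and acceptsNullKey are hypothetical names for this example.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class NullKeyDemo {
    // Returns true if the map accepts a null key, false if it throws NullPointerException.
    static boolean acceptsNullKey(Map<String, String> map) {
        try {
            map.put(null, "value");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptsNullKey(new HashMap<>()));    // true
        System.out.println(acceptsNullKey(new Hashtable<>()));  // false
    }
}
```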

Red–Black Tree is a kind of self-balancing binary search tree in which each node has an extra bit, usually interpreted as the color (red/black) of that node. The color bits are used to ensure the tree remains approximately balanced during insertions and deletions. In addition to maintaining the properties of a binary search tree, a red-black tree has 5 properties: ① Each node is either red or black; ② All leaves (NIL) are black; ③ No two red nodes are adjacent (the parent and the children of a red node must be black); ④ The root node must be black; ⑤ Every path from a given node down to a NIL/leaf node contains the same number of black nodes. The time complexity of the insert, delete and search operations is O(log n), both on average and in the worst case.

Error vs Exception: An error is a fatal problem or abnormal condition that the program should not try to catch (since it is not expected to occur in a normal application). An exception is an unexpected event that occurs while the program is running and disrupts its normal flow. In Java, there is a class called Throwable (java.lang.Throwable), which is derived directly from the Object class. The Throwable class has two subclasses, the Error class (java.lang.Error) and the Exception class (java.lang.Exception), and all exceptions and errors in Java belong to one of them. Errors are typically caused by the running environment, such as a stack overflow (StackOverflowError) or running out of memory (OutOfMemoryError). There is usually no way to recover, so the program normally terminates. Exceptions are typically caused by the program itself; most of them can be predicted, handled and recovered from, so the program can keep running as normal.

Checked vs Unchecked: In Java, all exceptions can be categorized into two types: checked and unchecked. Checked exceptions are checked at compile time. In other words, the compiler notices them and forces the programmer either to handle them with a try/catch block or to declare them with "throws" and let the caller deal with them. All classes derived from the Exception class, except RuntimeException and its subclasses, are checked exceptions. Unchecked exceptions are not checked at compile time: the compiler does not force the programmer to handle or declare them. If such an exception is not handled, the thread in which it occurs terminates at runtime, even though the code compiled without error. RuntimeException and all of its subclasses are unchecked exceptions. All errors in the Error class are also treated as unchecked.

Control Flow in Try-Catch or Try-Catch-Finally: If an exception is raised in the try block, the rest of the try block is not executed and control passes to the matching catch block. If a catch block handles the exception, that catch block is executed, then the finally block (if present) is executed, and the program continues with the remaining code. If no catch block handles the exception, the finally block (if present) is executed, and then the exception propagates to the default handling mechanism, which usually stops execution and reports an error. If no exception occurs in the try block, the finally block (if present) is still executed, followed by the remaining part of the program.

Control Flow in Try-Finally: Whether or not an exception is raised in the try block, the finally block is always executed. The only difference is that if an exception is raised in the try block, the exception is passed to the default handling mechanism after the finally block; if no exception occurs, the finally block is followed by the remaining part of the program.
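
A minimal sketch that traces the control flow described above. FinallyDemo and run are hypothetical names; the method records each block it passes through, with and without an exception.

```java
import java.util.ArrayList;
import java.util.List;

class FinallyDemo {
    // Returns the sequence of blocks that were executed.
    static List<String> run(boolean fail) {
        List<String> trace = new ArrayList<>();
        try {
            trace.add("try");
            if (fail) throw new IllegalStateException("boom");
            trace.add("end of try");      // skipped when the exception is thrown
        } catch (IllegalStateException e) {
            trace.add("catch");
        } finally {
            trace.add("finally");         // runs on both paths
        }
        trace.add("after");
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(run(false));  // [try, end of try, finally, after]
        System.out.println(run(true));   // [try, catch, finally, after]
    }
}
```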

Throw vs Throws: The "throw" keyword is used to explicitly throw an exception from a method or a block. It can throw either checked or unchecked exceptions, and it is often used for custom exceptions. Once an exception is thrown, the nearest enclosing try-catch block checks whether it has a catch clause of a matching type. If no match is found, the next enclosing try-catch block checks, and so on. The "throws" keyword is written after the method's parameter list, followed by a list of exceptions, to indicate that the method may throw one of the listed exceptions and that its caller must handle it in a try-catch block or declare it in turn.
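
A minimal sketch of "throw" and "throws" with one checked and one unchecked exception. InsufficientFundsException, ThrowDemo and withdraw are hypothetical names invented for this example.

```java
// Hypothetical checked exception, defined only for this sketch.
class InsufficientFundsException extends Exception {
    InsufficientFundsException(String message) { super(message); }
}

class ThrowDemo {
    // "throws" declares the checked exception; callers must catch it or declare it themselves.
    static int withdraw(int balance, int amount) throws InsufficientFundsException {
        if (amount < 0) {
            // Unchecked exception: no declaration needed.
            throw new IllegalArgumentException("negative amount");
        }
        if (amount > balance) {
            throw new InsufficientFundsException("need " + (amount - balance) + " more");
        }
        return balance - amount;
    }

    public static void main(String[] args) {
        try {
            System.out.println(withdraw(100, 30));   // 70
            withdraw(100, 130);                      // throws, so control jumps to the catch block
        } catch (InsufficientFundsException e) {
            System.out.println(e.getMessage());      // need 30 more
        }
    }
}
```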


Language

A Compiled Language is translated into the native machine instructions of the target machine, which are then executed directly by the hardware. In other words, a compiled language is implemented by a compiler, which serves as a translator that generates machine code from source code. C++ is a compiled language. (For example, an addition "+" operation in C++ source code could be translated directly to the "ADD" instruction in machine code.)

  • Advantages: ① It is faster, since the program is compiled only once and after that the native code runs directly without referring to the source code; ② Some errors (syntax and type) can be caught at the compilation stage.
  • Disadvantages: ① The compiled program is not platform independent, and the source must be recompiled for each target platform; ② It may need some extra effort in programming.

An Interpreted Language is one whose programs are executed directly from source, without a previous compilation stage. An interpreter performs a step-by-step/line-by-line conversion and execution of the source code. The operating system never executes the program directly; it runs the interpreter, and the interpreter executes the interpreted program. PHP is an interpreted language. (For example, the "+" operation would be recognized by the interpreter at run time, and the interpreter would call its own "add" function and run it directly.)

  • Advantages: ① It is platform independent: the code can be executed elsewhere as long as there is an interpreter for that language; ② It is more flexible in supporting dynamic typing and dynamic scoping.
  • Disadvantages: ① It runs slower than a compiled language; ② The source code is more easily viewed by others and may be exposed to security risks, such as code injection.

Intermediate: There is an intermediate stage between compiled and interpreted languages, where the original program is first compiled into byte code (or some other representation) and then executed by an "interpreter" (for Java it is the Java virtual machine). The interpreter's execution of byte code is quite similar to the hardware's execution of machine code, but the interpreter is still software running on top of the hardware. Examples are Python and Java.

Strengths of Java: ① Java is object-oriented, which makes programs easy to model and understand. ② Java is simpler and easier to use than C++. One main reason is that Java handles memory allocation and deallocation automatically. ③ Java is platform-independent, so it is easy to move a program from one computer system to another, which is especially useful for the Internet. ④ Java was designed with security in mind. For example, Java adds runtime limitations in the JVM, has offered a security manager so that untrusted code can be put into a sandbox (deprecated for removal since Java 17), and provides multiple security-related APIs (such as standard cryptographic algorithms, authentication, and secure communication protocols). ⑤ Java is cross-platform, so Java programs can run on desktops, mobile devices, embedded systems, etc. ⑥ Java has multiple IDEs that provide effective debugging, testing and other features. ⑦ It is easy to write network programs or programs for cloud services in Java.

Drawbacks of Java: ① Java can be slower than natively compiled languages: the source is compiled to byte code once, and at runtime the JVM must interpret that byte code or compile it to machine code with the just-in-time (JIT) compiler. ② Memory management in Java is somewhat expensive, since garbage collection requires additional time and resources (many collectors pause application threads during some phases, and the collector needs additional memory to track which objects can be deallocated). ③ Java has generics instead of C++-style templates. Generics are implemented with type erasure and cannot take primitive types as type arguments, so they are less flexible than templates for code reuse.
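
Java generics (available since Java 5) let one method work with several element types. A minimal sketch, with the hypothetical names GenericsDemo and max; it assumes the list is non-empty.

```java
import java.util.Arrays;
import java.util.List;

class GenericsDemo {
    // One method works for any element type that can be compared with itself.
    // Assumes the list is non-empty.
    static <T extends Comparable<T>> T max(List<T> items) {
        T best = items.get(0);
        for (T item : items) {
            if (item.compareTo(best) > 0) best = item;
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(max(Arrays.asList(3, 9, 4)));           // 9
        System.out.println(max(Arrays.asList("kiwi", "apple")));   // kiwi
    }
}
```

The type argument must be a reference type, so the integers here are boxed as Integer; a List<int> is not allowed.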

JDK (Java Development Kit) is a software development environment used for developing Java applications and applets. It includes the Java Runtime Environment (JRE), an interpreter/loader (java), a compiler (javac), an archiver (jar), a documentation generator (javadoc) and other tools needed in Java development.

  • JRE (Java Runtime Environment) may also be written as "Java RTE". It provides the minimum requirements for executing a Java application and consists of the Java Virtual Machine (JVM), core classes, and supporting files.
  • JVM (Java Virtual Machine) is a virtual machine that enables a computer to run a Java program. The term JVM has 3 meanings: ① A specification, which is a document that formally describes what is required of a JVM implementation. ② An implementation, which is a computer program that meets the requirements of the JVM specification. (The implementation provider is free to choose the algorithms. Implementations have been provided by Sun, now Oracle, and other companies.) ③ A runtime instance, which is an implementation running in a process that executes a program compiled into Java bytecode. Whenever you type the java command at the command prompt to run a Java class, an instance of the JVM is created.
  • JDK=JRE+Development Tools, JRE=JVM+Library Classes

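What is a Blank Final Variable? A final variable in Java can be assigned a value only once, either in its declaration or later. A blank final variable is a final variable that is not initialized in its declaration. A blank final instance field must then be assigned exactly once in every constructor (or in an instance initializer block), otherwise the class does not compile.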

This is useful to create immutable objects:

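import java.awt.Color; // assumption: the notes don't say which Color type is meant

public class Bla {
    // Blank final field: no initializer here, so every constructor must assign it.
    private final Color color;

    public Bla(Color c) {
        this.color = c;
    }
}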

Bla is immutable (once created, it can't change because color is final). But you can still create various Blas by constructing them with various colors.

Can we overload the main() method? Yes. The main method is like any other method and can be overloaded in the same way. However, the JVM only looks for the exact signature "public static void main(String[] args)" to launch the program, and that method acts as the entry point. An overloaded main method is therefore executed only when it is called explicitly, for example from the actual main method.
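A minimal sketch of this behavior, assuming a hypothetical class named OverloadedMainDemo: the JVM starts at main(String[]), and the main(int) overload runs only because the entry point calls it.

```java
// Hypothetical demo class; the names are illustrative only.
class OverloadedMainDemo {
    // Overload: the JVM never calls this directly at launch.
    static String main(int count) {
        return "overloaded main called with " + count;
    }

    // Entry point: the only signature the JVM looks for.
    public static void main(String[] args) {
        // The overload runs only because we call it explicitly here.
        System.out.println(main(3));
    }
}
```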

A Wrapper class is a class whose objects wrap (contain) a value of a primitive data type. An object of a wrapper class has a field in which the primitive value is stored. In other words, we can wrap a primitive value into a wrapper class object.

Primitive data types and their corresponding wrapper classes:

Primitive data type    Wrapper class
char                   Character
byte                   Byte
short                  Short
int                    Integer
long                   Long
float                  Float
double                 Double
boolean                Boolean
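As a small illustration of the table, the sketch below boxes and unboxes an int. The class name WrapperDemo and its sum helper are made up for this example. Collections such as List can only hold objects, which is one common reason wrappers are needed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class WrapperDemo {
    // Adds up boxed Integer values; each element is auto-unboxed to int.
    static int sum(List<Integer> values) {
        int total = 0;
        for (Integer v : values) {
            total += v; // auto-unboxing: Integer -> int
        }
        return total;
    }

    public static void main(String[] args) {
        Integer boxed = Integer.valueOf(42); // explicit boxing
        int primitive = boxed.intValue();    // explicit unboxing
        List<Integer> list = new ArrayList<>();
        list.add(1);         // autoboxing: int -> Integer
        list.add(primitive); // autoboxing again
        System.out.println(sum(list)); // prints 43
    }
}
```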

HashMap Implementation: A HashMap in Java contains an array of buckets that store the key/value pairs, and uses the key's hash code to determine where to place or find a pair. That is, when you pass a key to a HashMap, it calls the key's hashCode() method. The HashMap then derives a bucket index from that hash code (in OpenJDK 8 and later, by XOR-ing the hash code with its own upper 16 bits and masking the result with the table length minus one). In this way, the HashMap can quickly determine which bucket to put the pair into or retrieve it from.

  • Collision: Sometimes the hash codes of several keys map to the same bucket, which causes a collision. The HashMap responds to a collision by chaining all colliding key/value pairs in a linked list under that bucket. On lookup, it walks the entries of that list and compares keys with equals() to find the right pair. Three steps after calling the get() method: ① Compute the key's hash code and derive the bucket index from it; ② Retrieve the list of key/value pairs stored under the bucket with that index; ③ Perform a sequential or tree-based search through the entries until a key equal to the key passed into get() is found. (In fact, starting from Java 8, when the number of entries in one bucket exceeds a certain threshold, the linked list under that bucket is converted into a balanced tree (a red-black tree), which improves the worst-case lookup from O(n) to O(log n).)
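The index computation and a collision can be sketched as follows. This is a simplified model of what OpenJDK's HashMap does, not its actual code. BucketIndexDemo and bucketIndex are hypothetical names, and the table size is assumed to be a power of two. The strings "Aa" and "BB" happen to have the same hash code, so they always collide.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; a simplified model, not the real HashMap source.
class BucketIndexDemo {
    // Assumes tableSize is a power of two, as in OpenJDK's HashMap.
    static int bucketIndex(Object key, int tableSize) {
        int h = (key == null) ? 0 : key.hashCode();
        int spread = h ^ (h >>> 16);     // mix the upper bits into the lower bits
        return (tableSize - 1) & spread; // mask down to a valid array index
    }

    public static void main(String[] args) {
        // "Aa" and "BB" share the hash code 2112, so they land in the same bucket.
        System.out.println(bucketIndex("Aa", 16) == bucketIndex("BB", 16)); // prints true

        // The real HashMap still keeps both entries apart by comparing keys with equals().
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);
        System.out.println(map.get("Aa") + " " + map.get("BB")); // prints 1 2
    }
}
```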

String in Java is a class whose objects contain an immutable sequence of Unicode characters. (In C, a string is simply an array of chars.) Unlike an ordinary Java class, String is special in three ways: ① String is associated with string literals in the form of double-quoted text (such as "Hello World!"). You can either call a constructor to create a String instance (explicit construction), or simply assign a string literal directly to a String variable, just like a primitive data type (implicit construction); ② The '+' operator is overloaded to concatenate two String operands (String is the only class that has the '+' operator overloaded, and '+' is the only operator that is internally overloaded to support string concatenation); ③ String is immutable, that is, its content can't be modified. Methods called on a String object, such as toUpperCase(), construct and return a new String object instead of modifying the original one.

  • Reason: Strings receive special treatment in Java since they are used frequently in programs, and therefore efficiency (in terms of computation and storage) is crucial. (Instead of making everything an object, Java's designers decided to keep primitive types in the language to improve performance. Local primitive variables are stored on the call stack, which requires less storage space and is cheaper to manipulate, while objects are stored on the heap, which requires more complex memory management and more storage space.)
  • String Literal vs String Object: String literals are stored in a common pool. Multiple String variables can share the same storage in the pool as long as they hold the same contents (this sharing is one reason String is immutable). String objects allocated via the "new" keyword are stored on the heap outside the pool, and there is no sharing of storage for the same contents. The equals() method in the String class compares only the contents of two Strings, while "==" (the relational equality operator) compares the references/pointers of the two strings.
  • StringBuffer and StringBuilder are both classes that help to build mutable strings, and are more efficient than String objects if you need to modify the text frequently (String is more efficient if there is no need to modify it). Their main difference is that StringBuffer is synchronized while StringBuilder is not. In a single-threaded program, StringBuilder, without the synchronization overhead, is more efficient.
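The points above can be checked with a short sketch. StringDemo and its repeat helper are hypothetical names chosen for this example.

```java
// Hypothetical demo class; the names are illustrative only.
class StringDemo {
    // Builds a string in a loop with StringBuilder instead of repeated '+'.
    static String repeat(String s, int times) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < times; i++) {
            sb.append(s); // mutates the builder, no new String per step
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String a = "hello";             // literal, stored in the string pool
        String b = "hello";             // same literal, same pooled object
        String c = new String("hello"); // explicitly allocated, separate object

        System.out.println(a == b);          // prints true  (same reference)
        System.out.println(a == c);          // prints false (different references)
        System.out.println(a.equals(c));     // prints true  (same contents)
        System.out.println(a == c.intern()); // prints true  (intern() returns the pooled copy)
        System.out.println(repeat("ab", 3)); // prints ababab
    }
}
```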

Comparable and Comparator are both interfaces in Java used to compare and sort objects. The Comparable interface is in the java.lang package while Comparator is in the java.util package. Both must be implemented by a class, which has to implement the interface's compare method. Differences: ① The compare method in Comparable is "int compareTo(T o)", which takes one parameter and compares it with the current object; the compare method in Comparator is "int compare(T o1, T o2)", which takes two parameters and compares them with each other. ② A class that implements Comparable defines a single (natural) ordering for its objects, while any number of separate Comparator implementations can be written to sort the same objects in different ways.
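A minimal sketch of the difference, assuming a hypothetical Person class with a name and an age: the natural ordering (by age) lives inside the class through Comparable, while an alternative ordering (by name) is supplied from outside as a Comparator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical example class: its natural ordering is by age.
class Person implements Comparable<Person> {
    final String name;
    final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public int compareTo(Person other) {
        return Integer.compare(this.age, other.age);
    }
}

class SortDemo {
    private static List<String> names(List<Person> people) {
        List<String> out = new ArrayList<>();
        for (Person p : people) {
            out.add(p.name);
        }
        return out;
    }

    // Sorts a copy using the natural ordering defined by Person.compareTo.
    static List<String> byAge(List<Person> people) {
        List<Person> copy = new ArrayList<>(people);
        Collections.sort(copy);
        return names(copy);
    }

    // Sorts a copy using an external Comparator, leaving Person untouched.
    static List<String> byName(List<Person> people) {
        List<Person> copy = new ArrayList<>(people);
        copy.sort(Comparator.comparing((Person p) -> p.name));
        return names(copy);
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("Carol", 25), new Person("Alice", 40), new Person("Bob", 31));
        System.out.println(byAge(people));  // prints [Carol, Bob, Alice]
        System.out.println(byName(people)); // prints [Alice, Bob, Carol]
    }
}
```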

CPU Processor Core Thread

Process is an abstraction of a running program, that is, an instance of the program together with its current state (such as the current values of its variables). By switching the CPU rapidly from process to process, the operating system creates an illusion of (pseudo) parallelism.

Thread is like a mini-process: multiple threads can exist within one process, sharing some of that process's resources while running independently. We need threads since: ① Many programs consist of multiple activities that can run simultaneously. Because separate processes don't share memory by default, threads are a better fit here; ② Threads are faster and easier to create and destroy than processes. Per-Process Items: address space, global variables, open files, child processes, pending alarms, signals and signal handlers, accounting information, etc.; Per-Thread Items: program counter, registers, stack, state.

Multithreading is a feature that allows concurrent execution of two or more parts of a program, in order to maximize the utilization of CPU. Each part of such program is called a thread. So, threads are light-weight processes within a process.

In Java, threads can be created by: ① extending the Thread class; ② implementing the Runnable Interface.

Create a new class that extends the Thread (java.lang.Thread) class and overrides its run() method. The run() method holds the code that the new thread executes. Then create an object of this subclass and call its start() method to begin execution; start() in turn causes the JVM to invoke run() on the new thread.

class MultithreadingDemo extends Thread {
    @Override
    public void run() {
        try {
            // Each started thread prints its own id.
            System.out.println("Thread:" + Thread.currentThread().getId() + " is Running");
        } catch (Exception e) {
            System.out.println("Exception is caught");
        }
    }
}

public class Main {
    public static void main(String[] args) {
        for (int i = 0; i < 8; i++) {
            MultithreadingDemo object = new MultithreadingDemo();
            object.start(); // start(), not run(), so a new thread is created
        }
    }
}

Implement the Runnable (java.lang.Runnable) interface and its run() method. Then instantiate a Thread object with the Runnable passed to its constructor, and call start() on that Thread. (The Thread class provides built-in methods like start(), interrupt(), yield(), etc. that supply the basic functionality of a thread; they are not available in the Runnable interface. Implementing Runnable, on the other hand, leaves the class free to extend another class.)

class MultithreadingDemo implements Runnable {
    @Override
    public void run() {
        try {
            // Each started thread prints its own id.
            System.out.println("Thread:" + Thread.currentThread().getId() + " is Running");
        } catch (Exception e) {
            System.out.println("Exception is caught");
        }
    }
}

public class Main {
    public static void main(String[] args) {
        for (int i = 0; i < 8; i++) {
            // Wrap the Runnable in a Thread; the Thread supplies start().
            Thread object = new Thread(new MultithreadingDemo());
            object.start();
        }
    }
}

Notice that a new thread is started only by calling start(), not by calling run() directly: a direct call to run() simply executes the method on the caller's thread, like any ordinary method call. The start() function creates a separate call stack for the thread (stack memory in Java is allocated per thread, while heap memory is allocated for each running JVM process and shared by all threads in that process). Once that call stack exists, the JVM calls run() on it. (What happens when a function is called: ① Arguments are evaluated; ② A new stack frame is pushed onto the call stack; ③ Parameters are initialized; ④ The method body is executed; ⑤ The value is returned and the current stack frame is popped from the call stack.)
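The difference between start() and run() can be observed by recording which thread executes the body. This is a small sketch; StartVsRunDemo and executingThread are hypothetical names, and the worker thread is explicitly named "worker" for readability.

```java
// Hypothetical demo class; the names are illustrative only.
class StartVsRunDemo {
    // Returns the name of the thread that actually executed the Runnable body.
    static String executingThread(boolean useStart) {
        String[] seen = new String[1];
        Thread worker = new Thread(
                () -> seen[0] = Thread.currentThread().getName(), "worker");
        if (useStart) {
            worker.start(); // new call stack: the body runs on "worker"
            try {
                worker.join(); // wait, so that seen[0] is safely visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        } else {
            worker.run(); // plain method call: the body runs on the caller's thread
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(executingThread(true));  // prints worker
        System.out.println(executingThread(false)); // prints main
    }
}
```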

There are six thread states in Java: New, Runnable, Blocked, Waiting, Timed Waiting, Terminated.

Main Thread is the thread that begins running automatically and immediately when a Java program starts up. It is the thread from which the other (child) threads are spawned, and it is often the last thread to finish because it typically performs the shutdown actions, although the JVM itself keeps running until all non-daemon threads have finished. For each program, a main thread is created by the JVM. The main thread first verifies that the main() method exists and then initializes the class. To control the main thread we must obtain a reference to it, which can be done by calling the static method currentThread() of the Thread class from within the main thread. This method returns a reference to the thread in which it is called: "Thread t = Thread.currentThread();"

Singleton class is a class that has only one instance at any time. That is, no matter how many times you ask the singleton class for an instance, there's only one object and every reference variable points to it. Singletons can control access to resources, such as database connections or sockets. To design a singleton class, there are three requirements: ① Make the constructor private, to prevent any additional instantiation from outside the class; ② Provide a public static method whose return type is the class itself, which creates the instance on first use and returns it; ③ Keep a private static variable to hold that instance.

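class Singleton {
    // Holds the single instance; created lazily on first use.
    private static Singleton instance = null;

    // Private constructor prevents instantiation from outside.
    private Singleton() {}

    // Note: this null check is not synchronized, so it is safe only in single-threaded use.
    public static Singleton getInstance() {
        if (instance == null) instance = new Singleton();
        return instance;
    }
}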
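The lazy null check in getInstance() above is not synchronized, so two threads calling it at the same time could each create an instance. One common alternative (a swapped-in technique, not the approach described in these notes) is the initialization-on-demand holder idiom, sketched here with the hypothetical class name SafeSingleton. It relies on the JVM initializing a nested class exactly once, on first use.

```java
// Hypothetical variant of the singleton; thread-safe without explicit locking.
class SafeSingleton {
    private SafeSingleton() {}

    // The nested class is loaded and initialized only when getInstance() is first called.
    private static class Holder {
        static final SafeSingleton INSTANCE = new SafeSingleton();
    }

    static SafeSingleton getInstance() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        // Both calls return the very same object.
        System.out.println(getInstance() == getInstance()); // prints true
    }
}
```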

Synchronization: Java uses synchronized methods or blocks to make sure only one thread can access the resource at a given point of time. Synchronized methods/blocks are marked with the "synchronized" keyword, to be synchronized on some object. All synchronized methods and blocks synchronized on the same object can only have one thread executing inside them at a time. All other threads attempting to enter the synchronized method/block are blocked until the thread inside the synchronized block exits the block. The synchronization is implemented in Java with a concept of "monitors". Only one thread can own a monitor at a given time. When a thread acquires a lock, it is said to have entered the monitor. All other threads attempting to enter the locked monitor will be suspended until the first thread exits the monitor.

Synchronization is only needed when the shared object is mutable. If the shared object is immutable, or all the threads sharing it only read its state without modifying it, then there is no need to synchronize.
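A minimal sketch of a shared mutable object protected by synchronized methods. SyncCounter and runDemo are hypothetical names. Without the synchronized keyword, the count++ updates from different threads could interleave and some increments would be lost.

```java
// Hypothetical demo class; the names are illustrative only.
class SyncCounter {
    private int count = 0;

    // Both methods lock on "this", so only one thread is inside at a time.
    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    // Starts several threads that all increment the same counter.
    static int runDemo(int threads, int incrementsPerThread) {
        SyncCounter counter = new SyncCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // wait for every worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 10000)); // prints 40000
    }
}
```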

yield() is a static method of Thread that hints to the scheduler that the current thread is willing to give up the CPU: the thread is not doing anything particularly important, so if other threads need to run, they can. The scheduler is free to ignore the hint, in which case the current thread continues to run.

sleep() method is another way to pause a thread: it makes the current thread sleep for a specified number of milliseconds. A sleeping thread does not release any locks it holds.

join() method makes the current thread wait until another thread finishes its execution. If join() is called on a thread instance, the currently running thread blocks until that thread instance has finished executing.
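The following sketch combines sleep() and join(). JoinDemo and runDemo are hypothetical names. Because the calling thread joins the worker, "worker done" is always recorded before "main continues", even though the worker sleeps first.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class JoinDemo {
    static List<String> runDemo() {
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(50); // pause the worker for 50 milliseconds
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            events.add("worker done");
        });
        worker.start();
        try {
            worker.join(); // the caller blocks here until the worker has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        events.add("main continues");
        return events;
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // prints [worker done, main continues]
    }
}
```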

Lifecycle of Thread

New: When a thread object has been created but start() has not yet been called on it, it's in the new state. Its code has not started running yet. It moves to the runnable state once the "start()" function is called on it;

Runnable: A thread in the runnable state can be either actually running or ready to run. It's the responsibility of the thread scheduler to give the thread a time slot to run. Usually, with time slicing, the scheduler allocates a small amount of CPU time to each individual thread. Each thread runs for a short while, then pauses and gives up the CPU so that other threads in the runnable state get a chance to run;

Blocked/Waiting: If a thread is temporarily inactive, it may be either blocked or waiting. A thread is in the blocked state if it's trying to enter a protected/synchronized section of code that is currently locked by another thread. When that section is unlocked, the scheduler picks one of the threads blocked on it and moves it to the runnable state. A thread also enters the blocked state when it has been notified after calling "Object.wait()" and must reacquire the monitor lock before it can continue. A thread is in the waiting state if it's waiting for another thread to perform some action. The waiting state is entered by calling "Object.wait()" or "Thread.join()" without a timeout, or "LockSupport.park()". Only when the condition is satisfied (for example, through notify() or the termination of the joined thread) is the waiting thread moved back to the runnable state. If a currently running thread moves to the blocked/waiting state, another thread in the runnable state is scheduled to run. It's the responsibility of the scheduler to determine which thread to run;

Timed Waiting: A thread is in the timed waiting state when it calls a method with a timeout parameter, and it stays in this state until the timeout expires or it is notified. The methods that cause a timed waiting state include "Thread.sleep()", "Object.wait()" with a timeout, "Thread.join()" with a timeout, "LockSupport.parkNanos()" and "LockSupport.parkUntil()";

Terminated: A thread can terminate for two reasons. First, its execution has completed normally, that is, the code of the thread has been entirely executed. Second, an abnormal event occurred, such as an unhandled exception. A terminated thread no longer consumes any CPU cycles.
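Some of these states can be observed with Thread.getState(). The sketch below uses the hypothetical names ThreadStateDemo and observeStates. It parks a worker on a CountDownLatch (which waits through LockSupport.park internally) and samples its state before start, while it waits, and after it ends.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class; the names are illustrative only.
class ThreadStateDemo {
    static Thread.State[] observeStates() {
        CountDownLatch release = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                release.await(); // waits without a timeout until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread.State beforeStart = worker.getState(); // NEW: start() not yet called
        worker.start();

        // Poll until the worker has actually reached the waiting state.
        Thread.State whileWaiting;
        do {
            Thread.yield();
            whileWaiting = worker.getState();
        } while (whileWaiting != Thread.State.WAITING);

        release.countDown(); // let the worker finish
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        Thread.State afterJoin = worker.getState(); // TERMINATED: run() has returned

        return new Thread.State[] { beforeStart, whileWaiting, afterJoin };
    }

    public static void main(String[] args) {
        for (Thread.State s : observeStates()) {
            System.out.println(s); // prints NEW, WAITING, TERMINATED (one per line)
        }
    }
}
```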

Critical Region is a part of a program where a shared resource is accessed. Simultaneous access to a shared resource can lead to unexpected or erroneous behavior, so the parts of the program where the shared resource is accessed must be protected.

Race Condition in programming is a situation in which multiple processes/threads enter the same critical region and the result depends on the order in which they happen to execute, which the program does not control.

Mutual Exclusion is a property that helps to prevent race conditions. It requires that a thread/process never enters the critical region while another concurrent thread/process is running inside it, which ensures that simultaneous updates to the shared resource never occur. On a multi-core system, it must also be ensured that only one processor can access the critical region's memory while the instruction is executed, which may be done with the help of a hardware mechanism that gives each CPU exclusive use of the bus while it accesses the memory. In Java multithreading, mutual exclusion can be achieved by locks/synchronization/monitors, inter-thread communication, semaphores, etc.

Thread Pool:
A thread pool reuses previously created threads to execute current tasks, and therefore offers a solution to the problems of thread life-cycle overhead and resource thrashing. As the thread already exists when the request arrives, the delay introduced by thread creation is eliminated, making the application more responsive.
Implementation: A thread pool in Java can be implemented with the help of the Executor interface, its sub-interface ExecutorService, and the implementing class ThreadPoolExecutor. Using such an executor framework, developers only need to implement Runnable objects and send them to the executor for execution. In a fixed-size thread pool, if all threads are currently busy, pending tasks are placed in a queue and are executed when a thread becomes idle.
Risks: ① Deadlock: While deadlock can occur in any multithreaded program, thread pools introduce another case of deadlock, where all the running threads are waiting for results from tasks that are still waiting in the queue because no thread is free to execute them; ② Thread Leakage: Thread leakage occurs if a thread is removed from the pool to execute a task but is not returned to it when the task completes. For example, if the thread throws an exception and the pool class doesn't catch it, the thread simply exits, reducing the size of the thread pool by one. If this repeats many times, the pool eventually becomes empty and no threads are available to execute other requests; ③ Resource Thrashing: If the thread pool is very large, time is wasted in context switching between threads. Having more threads than the optimal number may also cause starvation, which leads to resource thrashing.
Attention: ① Don't submit tasks that concurrently wait for results from other tasks, as this can lead to a deadlock; ② Be careful when using pool threads for long-lived operations, as a thread may end up waiting forever, which eventually leads to resource leakage; ③ The thread pool has to be shut down explicitly at the end, otherwise the program keeps running and never ends. Call shutdown() on the pool to end the executor. If another task is sent to the executor after shutdown, it throws a RejectedExecutionException.
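A minimal sketch of a fixed-size pool, using the hypothetical names ThreadPoolDemo and sumOfSquares. It submits Callable tasks instead of Runnable objects (an assumption made here so that each task can return a value), collects the results through Future, and shuts the pool down in a finally block.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class; the names are illustrative only.
class ThreadPoolDemo {
    // Computes 1^2 + 2^2 + ... + n^2, one pooled task per term.
    static int sumOfSquares(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // 4 reusable threads
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final int x = i;
                // Tasks beyond the 4 busy threads wait in the pool's queue.
                futures.add(pool.submit(() -> x * x));
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                total += f.get(); // blocks until that task has finished
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown(); // without this, the idle pool threads keep the JVM alive
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10)); // prints 385
    }
}
```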

Deadlock is a state where each process/thread in a group is blocked, waiting for a resource held by another waiting process/thread in the same group. A deadlock may occur in a Java multithreaded program when access to resources is synchronized. It's usually caused when one thread holds the lock on one resource while trying to acquire the lock on another resource that is held by a different thread, which in turn wants the first lock.

Coffman Conditions describe all requirements that must be satisfied simultaneously to cause a deadlock: ① Mutual Exclusion: The resources involved must be unsharable, only one process/thread can use them at any time; ② Hold and Wait: A process/thread must be currently holding at least one resource and requesting additional resources which are being held by other processes/threads; ③ No Preemption: A resource can be released only voluntarily by the process/thread holding it; ④ Circular Wait: Each process/thread must be waiting for a resource which is being held by another process/thread, which in turn is waiting for another process/thread to release the resource.

Prevention of a deadlock can be achieved by preventing one of the four Coffman conditions: ① Remove the mutual exclusion condition so that no process/thread has exclusive access to a resource, which is usually impossible in practice (it may cause race conditions); ② Remove the hold and wait condition by requiring processes/threads to acquire all the resources they need before starting a certain set of operations. However, this is difficult to satisfy and may lead to inefficient use of resources. Another way is to allow a process/thread to request resources only when it holds none. But this is also impractical because resources may be allocated and remain unused for a long time, and a process/thread requiring a popular resource may have to wait indefinitely as that resource may always be allocated to some other process/thread, causing resource starvation; ③ The no preemption condition is difficult to remove, as a process/thread has to be able to hold a resource for a certain time, otherwise the result may be inconsistent or thrashing may occur. However, the inability to enforce preemption may interfere with a priority algorithm; ④ Approaches to avoid circular waits include avoiding nested locks (the main cause of deadlock), avoiding unnecessary locks, using the join() function (in Java), or defining a fixed order on locks so that they are always acquired in the same sequence.
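The lock-ordering approach from ④ can be sketched as follows. Account, TransferDemo and the unique id field are assumptions made for this example. Two threads transfer in opposite directions, which could deadlock if each locked "its own" account first; locking in ascending id order makes a circular wait impossible.

```java
// Hypothetical example class: the id defines a global lock order (ids assumed unique).
class Account {
    final int id;
    int balance;

    Account(int id, int balance) {
        this.id = id;
        this.balance = balance;
    }
}

class TransferDemo {
    static void transfer(Account from, Account to, int amount) {
        // Always lock the account with the smaller id first, whatever the direction.
        Account first = (from.id < to.id) ? from : to;
        Account second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }

    // Runs opposite transfers concurrently and returns both final balances.
    static int[] runDemo() {
        Account a = new Account(1, 1000);
        Account b = new Account(2, 1000);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 1000; i++) transfer(a, b, 1);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < 1000; i++) transfer(b, a, 1);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { a.balance, b.balance };
    }

    public static void main(String[] args) {
        int[] balances = runDemo();
        System.out.println(balances[0] + " " + balances[1]); // prints 1000 1000
    }
}
```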

Atomic operation is an operation that can't be subdivided into smaller parts, so no other thread can interrupt it or observe it half-done. Implementation methods include TSL (test and set lock, which should be implemented in hardware), XCHG (similar to TSL and available on Intel processors), semaphores, etc.
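At the Java level, the classes in java.util.concurrent.atomic expose atomic operations directly. They are not mentioned in the notes above and are used here as an assumption for illustration. In the sketch below (AtomicCounterDemo and runDemo are hypothetical names), incrementAndGet() performs the read, add and write as one indivisible step, so no update is lost even without a synchronized block.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; the names are illustrative only.
class AtomicCounterDemo {
    static int runDemo(int threads, int incrementsPerThread) {
        AtomicInteger counter = new AtomicInteger(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.incrementAndGet(); // atomic read-modify-write
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 10000)); // prints 40000
    }
}
```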

Semaphore is like a counter that indicates how many permits for a shared resource are currently available, in order to control thread access to the resource. Using a semaphore also guarantees that checking the availability of the resource and claiming it happen in a single, atomic operation, and therefore it can safely hand out the resource without race conditions. Generally, a thread that wants to use the shared resource must first be granted a permit by the semaphore. Steps: ① If the semaphore's count is not zero, the thread acquires a permit, and the count is decremented by 1; ② Otherwise, if the count is zero, the thread is blocked until a permit can be acquired; ③ When the thread no longer needs the resource, it releases the permit, and the count of the semaphore is incremented by 1. Then another thread can be granted the permit.
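The three steps can be sketched with java.util.concurrent.Semaphore. SemaphoreDemo and maxConcurrent are hypothetical names. The method reports the largest number of threads that were inside the guarded section at the same time, which can never exceed the number of permits.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; the names are illustrative only.
class SemaphoreDemo {
    static int maxConcurrent(int permits, int threads) {
        Semaphore semaphore = new Semaphore(permits);
        AtomicInteger inside = new AtomicInteger();  // threads currently holding a permit
        AtomicInteger maxSeen = new AtomicInteger(); // highest value of "inside" observed

        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    semaphore.acquire(); // steps 1 and 2: take a permit or block
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                try {
                    int now = inside.incrementAndGet();
                    maxSeen.accumulateAndGet(now, Math::max);
                    Thread.sleep(10); // pretend to use the resource
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inside.decrementAndGet();
                    semaphore.release(); // step 3: give the permit back
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return maxSeen.get();
    }

    public static void main(String[] args) {
        // With 2 permits and 6 threads, the printed value is never larger than 2.
        System.out.println(maxConcurrent(2, 6));
    }
}
```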

Daemon thread is a background thread, usually treated as low priority, that performs tasks such as garbage collection. Properties: ① Daemon threads can't prevent the JVM from exiting when all the user threads finish their execution; ② The JVM terminates itself when all user threads finish their execution. If it finds daemon threads still running, it terminates them and then shuts itself down; ③ The JVM doesn't care whether a daemon thread is running or not.
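A minimal sketch of a daemon thread, using the hypothetical names DaemonDemo and newBackgroundThread. The background loop never ends by itself, yet the program still exits when main() returns because the thread was marked as a daemon before it was started.

```java
// Hypothetical demo class; the names are illustrative only.
class DaemonDemo {
    static Thread newBackgroundThread() {
        Thread t = new Thread(() -> {
            while (true) { // endless background work
                try {
                    Thread.sleep(100);
                } catch (InterruptedException e) {
                    return;
                }
            }
        });
        t.setDaemon(true); // must be called before start()
        return t;
    }

    public static void main(String[] args) {
        Thread background = newBackgroundThread();
        background.start();
        System.out.println(background.isDaemon()); // prints true
        // main() returns here; the JVM exits even though the daemon loop is still running.
    }
}
```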
