Java basics summary: Web frameworks, distributed architecture, data structures and algorithms, and core knowledge (distributed systems, big data, microservices)
Java basics, J2EE, Java Web frameworks, Java multithreading, core Java ideas, RPC, distributed architecture, microservices, big data (Hive), MySQL optimization, distributed locks, load balancing, data structures and algorithms, and more: a P7-level summary of core Java knowledge.
Edited at 2022-05-27 16:06:38
Java basics and core knowledge framework
underlying principles
JVM (Java Virtual Machine)
principle
Way of working
Cross-platform principle
How to achieve platform independence: compile once, run anywhere
compile time
javac compile command
javap disassembly command
Java source code is first compiled into bytecode, which is then interpreted by the JVM on each platform. A Java program does not need to be recompiled to run on a different platform; when the Java virtual machine executes the bytecode, it translates it into machine instructions for that specific platform.
Why doesn’t JVM directly convert the source code into machine code for execution?
Preparation: Various checks are required for each execution
Compatibility: Other languages can also be parsed into bytecode
memory model
Memory model diagram
thread exclusive
program counter
A small memory space, which is the line number indicator of the bytecode executed by the current thread (bytecode instructions, branches, loops, jumps, exception handling and other information)
Each thread must have an independent program counter. This type of memory is also called "thread-private" memory.
Java virtual machine stack
Thread-private; each method execution creates a stack frame data structure.
Mainly used to store the local variable table, operand stack, dynamic links, method exits (return addresses), etc.
When each thread is created, the JVM will create a corresponding virtual machine stack for it.
The size of the stack memory division directly determines how many threads a JVM process can create.
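The per-thread stack described above can be observed directly: each deeper call adds a frame to the calling thread's virtual machine stack until the stack is exhausted. A minimal sketch (the depth actually reached depends on the -Xss stack size and the frame layout):

```java
public class StackDepthDemo {
    static int depth = 0;

    // Each call adds one stack frame; eventually the thread's stack overflows
    static void recurse() {
        depth++;
        recurse();
    }

    public static void main(String[] args) {
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The depth reached varies with stack size and JVM implementation
            System.out.println("StackOverflowError at depth " + depth);
        }
    }
}
```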
Thread sharing
MetaSpace
The difference between MetaSpace and the permanent generation (PermGen, used through JDK 7)
The metaspace uses local memory and the permanent generation uses JVM memory.
Advantages of MetaSpace compared to PermGen
The string constant pool lived in the permanent generation, which was prone to performance problems and memory overflow.
The size of classes and methods is difficult to determine, which makes it difficult to specify the size of the permanent generation.
Permanent generation introduces unnecessary complexity to the GC
Facilitates the integration of HotSpot with other JVMs such as JRockit
Heap
The allocation area of the object instance
The difference between heap and stack (memory allocation strategy)
Java memory allocation strategy
static storage
The runtime space requirements of each data target are determined at compile time.
stack storage
The data area requirements are unknown at compile time and determined before module entry at runtime.
Heap storage
Space requirements cannot be determined at compile time or at module entry; memory must be allocated dynamically.
The difference between heap and stack in Java memory model
Connection: when a variable references an object or array, the variable defined on the stack stores the address of the target object in the heap.
Differences
management style
stack automatically released
Heap requires GC
size of space
The stack is smaller than the heap
Fragment related
The fragmentation generated by the stack is much smaller than that of the heap
Allocation
The stack supports static allocation and dynamic allocation
The heap only supports dynamic allocation
efficiency
Stack is more efficient than heap
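The connection above can be illustrated with a small sketch: the two local variables live on the stack, while the single array object they both point to lives on the heap:

```java
public class HeapStackDemo {
    public static void main(String[] args) {
        int[] a = new int[3]; // the array object is allocated on the heap
        int[] b = a;          // b is a second stack variable holding the same heap address
        b[0] = 42;            // writing through b ...
        System.out.println(a[0]); // ... is visible through a: both point to one object
    }
}
```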
method area
Before JDK 1.8
PermGen (permanent generation, in JVM memory)
JDK 1.8 onward
Metaspace (in local/native memory)
JVM class loading mechanism
class loader
Bootstrap class loader
Extension class loader
Application class loader
Parental delegation model
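The parental delegation chain can be inspected at runtime. A minimal sketch (note: on JDK 9+ the middle loader is the platform class loader rather than the extension class loader):

```java
public class LoaderChainDemo {
    public static void main(String[] args) {
        // Application (system) class loader loads classes from the classpath
        ClassLoader app = ClassLoader.getSystemClassLoader();
        // Its parent is the extension loader (JDK 8) or platform loader (JDK 9+)
        ClassLoader parent = app.getParent();
        System.out.println(app);
        System.out.println(parent);
        // The bootstrap loader is implemented natively and is reported as null
        System.out.println(parent.getParent());
    }
}
```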
The process from class compilation to execution
The compiler compiles the .java source file into a .class bytecode file
ClassLoader converts .class bytecode files into Class<?> objects in the JVM
JVM uses Class<?> object to instantiate it into ? object
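The three steps above can be sketched with the reflection API, using java.util.ArrayList as an arbitrary example class:

```java
public class ClassToObjectDemo {
    public static void main(String[] args) throws Exception {
        // A ClassLoader turns the .class bytes into a Class<?> object ...
        Class<?> clazz = Class.forName("java.util.ArrayList");
        // ... which the JVM can then instantiate into a real object
        Object obj = clazz.getDeclaredConstructor().newInstance();
        System.out.println(clazz.getName());
        System.out.println(obj instanceof java.util.List);
    }
}
```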
What is ClassLoader
ClassLoader plays a very important role in Java. It works mainly in the class-loading phase: its core job is to obtain the binary data stream of a Class file from outside the system. All Class files are loaded by a ClassLoader, which reads the binary data stream into the system and hands it to the Java virtual machine for linking, initialization, and other operations.
ClassLoader types (four types)
BootstrapClassLoader
Written in C++, loads the core libraries (java.*)
ExtClassLoader
Written in Java, load the extension library javax.*
AppClassLoader
Written in Java, loads classes from the application classpath
Custom ClassLoader
Written in Java, customized loading
OSGi (dynamic module system)
Garbage Collection (GC)
GC type
Minor GC/Young GC
Major GC/Full GC
Full GC is slower than Minor GC but executes less frequently
Trigger Full GC condition
Insufficient space in the old generation
Insufficient space in the permanent generation (before JDK7)
CMS GC reports promotion failed or concurrent mode failure
The average size of objects promoted by Minor GC is greater than the remaining space in the old generation.
System.gc() is explicitly called in the program to remind the JVM to recycle the young and old generations.
Heap memory
Young Generation
When an object is created, memory allocation first occurs in the young generation
Most objects stop being used soon after they are created and are quickly cleaned up by young-generation GC.
Old Generation
Large objects can be created directly in the old generation
If an object survives long enough in the young generation without being cleaned up, it is copied to the old generation.
The space in the old generation is generally larger than that in the young generation and can store more objects.
When there is insufficient memory in the old generation, Full GC will be executed.
permanent generation
Refers to the permanent storage area of the memory, which mainly stores Class and Meta (metadata) information.
Class is put into the permanent area when it is loaded, and the GC will not clean up the permanent generation area.
The permanent generation area will fill up as the number of loaded Classes increases, and eventually an OOM (Out of Memory) exception will be thrown.
In Java 8, the permanent generation has been removed and replaced by an area called the "metadata area" (metaspace).
The metaspace is not located in the virtual machine, but uses local memory.
Schematic diagram
logic diagram
Recycling algorithm
Mark-and-sweep algorithm
Mark first, then sweep, without moving objects. Advantage: fast execution. Disadvantage: causes memory fragmentation; when a large object must be allocated and no sufficiently large contiguous block can be found, another GC is triggered.
Replication algorithm
Memory is divided into an object space and a free space
Objects are created in the object space
During collection, surviving objects are copied from the object space to the free space
All objects in the object space are then cleared
Suitable for young generation
Advantages: 1. Solves the fragmentation problem 2. Memory is allocated sequentially, simple and efficient 3. Suitable for scenarios with a low object survival rate
Disadvantage
Memory must be divided in two, and surviving objects must be copied.
Mark-compact algorithm
Avoids memory discontinuity
No need to set aside two memory areas to swap between
Suitable for scenarios with a high object survival rate
Suitable for the old generation
Generational collection algorithm
A combination of garbage collection algorithms
According to the different life cycles of objects, different garbage collection algorithms are used to divide different areas.
Purpose: Improve JVM recycling efficiency
Object survival judgment
reference counting
accessibility analysis
GC Roots
Objects referenced by local variable tables in the virtual machine stack
In the method area
Objects referenced by class static variables
Objects referenced by constants
Objects referenced by JNI in the local method stack
What will happen if it is not reachable?
finalize()
Young generation - collect objects with short life cycles as quickly as possible
Memory space division: one Eden area and two Survivor areas
Use one Eden area and one Survivor area each time
Every time a minor GC is triggered, the age will be increased by 1.
When an object's age reaches the default threshold of 15, it enters the old generation.
The promotion age threshold can be adjusted via -XX:MaxTenuringThreshold
If a newly created object is so large that neither Eden nor a Survivor area can hold it, it goes directly into the old generation.
How to promote an object to the old generation
Objects that survive a certain number of Minor GCs
Objects that the Survivor space cannot hold
When the Eden area cannot hold new objects, a Minor GC is triggered and the Eden area is cleared.
Newly created large objects; the large-object threshold is controlled via -XX:PretenureSizeThreshold
Commonly used tuning parameters
-XX:SurvivorRatio
The ratio of the Eden area to one Survivor area, default 8:1
-XX:NewRatio
Ratio of memory size between old generation and young generation
-XX:MaxTenuringThreshold
The maximum threshold for the number of GC times an object is promoted from the young generation to the old generation.
New generation garbage collector
Stop-the-World
The JVM suspends execution of the program while GC runs.
Happens in any GC algorithm
Most GC optimizations improve program performance by reducing the time that Stop-the-World occurs.
Safepoint Safepoint
Points where object reference relationships will not change during the analysis process
Where safepoints occur: method invocations, loop back-edges, exception jumps, etc.
Safepoints must be chosen appropriately: too few make GC wait too long; too many increase the load on the running program.
JVM operating mode
Server
Heavyweight virtual machine mode; applies more optimizations to the program. Slower to start, faster to run.
Client
Lightweight, fast to start, slow to run
Common garbage collectors in the young generation
The relationship between garbage collectors
A line connecting two garbage collectors indicates that they can be used together.
Serial collector (-XX:+UseSerialGC, copying algorithm)
Single-threaded collection, when garbage collection is performed, all working threads must be suspended.
Simple and efficient, the default young generation collector in Client mode
ParNew collector (-XX:+UseParNewGC, copying algorithm)
Multi-threaded collection; other behavior and characteristics are the same as the Serial collector
Single-core execution efficiency is not as good as Serial; it has an advantage only on multiple cores.
Parallel Scavenge collector (-XX:+UseParallelGC, copying algorithm)
First understand that throughput = time running user code / (time running user code + garbage collection time)
Focuses on overall system throughput rather than user-thread pause time.
It is advantageous to execute under multi-core, the default young generation collector in Server mode
Adaptive sizing policy (-XX:+UseAdaptiveSizePolicy)
Leave the memory tuning task to the virtual machine to complete
Old generation garbage collector
Parallel Old collector (-XX:+UseParallelOldGC, mark-compact algorithm)
Multi-threaded, throughput first
CMS collector (-XX:+UseConcMarkSweepGC, mark-sweep algorithm)
The garbage collection thread can work almost concurrently with the worker threads (almost, not completely), shortening pause time as much as possible
Initial mark: Stop-the-World
Requires a brief pause
Concurrent mark: concurrent reachability tracing; the program does not pause
Concurrent preclean: finds objects promoted from the young generation to the old generation during the concurrent marking phase
Remark: pauses the virtual machine and scans the CMS heap for remaining objects
Requires a brief pause
Concurrent sweep: the garbage collection thread runs concurrently with user threads, cleaning up the marked garbage
Cleans up garbage objects; the program does not pause
Concurrent reset: resets the CMS collector's data structures
CMS collector execution process
three color marking method
If a node and all of its children have been marked, the node is black; if the node has been marked but not all of its children have, the node is gray; unmarked nodes are white.
Missed or incorrect marks may occur.
CMS uses the Incremental Update algorithm
G1 uses SATB (snapshot-at-the-beginning)
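As a rough illustration of the coloring rules (not of the concurrent machinery that Incremental Update and SATB exist to repair), here is a single-threaded marking sketch over a hypothetical object graph, where each index lists the indices it references:

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class TriColorMark {
    enum Color { WHITE, GRAY, BLACK }

    // graph.get(i) = indices of the objects that object i references
    static Color[] mark(List<int[]> graph, int root) {
        Color[] color = new Color[graph.size()];
        Arrays.fill(color, Color.WHITE);        // white = not yet visited
        Deque<Integer> gray = new ArrayDeque<>();
        color[root] = Color.GRAY;               // gray = seen, children pending
        gray.push(root);
        while (!gray.isEmpty()) {
            int node = gray.pop();
            for (int child : graph.get(node)) {
                if (color[child] == Color.WHITE) {
                    color[child] = Color.GRAY;
                    gray.push(child);
                }
            }
            color[node] = Color.BLACK;          // black = all references scanned
        }
        return color;                           // remaining WHITE nodes are garbage
    }

    public static void main(String[] args) {
        // 0 -> 1 -> 2; node 3 is unreachable
        List<int[]> g = Arrays.asList(new int[]{1}, new int[]{2}, new int[]{}, new int[]{});
        System.out.println(Arrays.toString(mark(g, 0)));
    }
}
```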
G1 (Garbage First) collector (-XX:+UseG1GC, copying and mark-compact algorithms)
Features of Garbage First collector
Parallelism and Concurrency
Collection by generation
spatial integration
predictable pauses
Divide the entire Java memory into multiple regions of equal size
The young generation and the old generation are no longer physically separated
Control the pause time very precisely to achieve low-pause garbage collection without sacrificing throughput.
To avoid full-area garbage collection, it divides the heap memory into several independent areas of fixed size and tracks the progress of garbage collection in these areas.
Based on mark-compact algorithm, no memory fragmentation occurs.
# Modify the default garbage collector: export JAVA_OPTS='-XX:+UseG1GC'
JDK11’s new garbage collector Epsilon GC
JDK11’s new garbage collector ZGC
Java Garbage Collector-FAQ
Is the finalize() method of Object the same as a C++ destructor?
No. A C++ destructor's call is deterministic; when (or whether) finalize() is called is undefined.
Place unreferenced objects in the F-Queue queue
Method execution may be terminated at any time
Gives the object one exclusive last chance at resurrection
What are the functions of strong references, soft references, weak references and virtual references in Java?
Strong Reference
The most common kind of reference: Object obj = new Object()
When there is insufficient memory space, an OutOfMemoryError will be thrown to terminate the program, and objects with strong references will not be recycled.
Setting the reference to null weakens it so the object can be collected
Soft Reference
The object is in a useful but unnecessary state
Only when there is insufficient memory space, the GC will reclaim the memory of the referenced object.
Can be used to implement memory-sensitive caches
String str = new String("abc"); // strong reference SoftReference<String> softRef = new SoftReference<String>(str); // soft reference
Weak Reference
Non-essential objects, weaker than soft references
Will be recycled during GC
The probability of being recycled is also low because the GC thread priority is relatively low.
Suitable for referencing objects that are used occasionally and do not affect garbage collection
String str = new String("abc"); // strong reference WeakReference<String> weakRef = new WeakReference<String>(str); // weak reference
PhantomReference
Does not determine the object's life cycle
May be collected by the garbage collector at any time
Track the activity of objects being recycled by the garbage collector, acting as a sentinel
Must be used in conjunction with reference queue ReferenceQueue
String str = new String("abc"); ReferenceQueue queue = new ReferenceQueue(); PhantomReference ref = new PhantomReference(str,queue);
ReferenceQueue
Has no actual storage structure; its storage logic is expressed through the relationships between internal nodes
Holds the associated soft, weak, and phantom references whose referents have been garbage collected
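A minimal sketch of weak-reference behavior; note that System.gc() is only a hint, so the post-GC result is not strictly guaranteed:

```java
import java.lang.ref.WeakReference;

public class WeakRefDemo {
    public static void main(String[] args) {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.out.println(weak.get() != null); // true: still strongly reachable
        strong = null;  // drop the only strong reference
        System.gc();    // a hint to the JVM; collection is not guaranteed
        // Once a GC actually runs, the weakly reachable object is reclaimed
        System.out.println(weak.get()); // usually null after the GC hint
    }
}
```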
JVM lock
Classification of locks
synchronized
Implementation class of Lock interface
module
JMM (Java Memory Model)
Eight major operations of memory interaction
lock: Acts on variables in main memory, marking a variable as thread-exclusive.
unlock: A variable that acts on main memory. It releases a variable that is in a locked state. Only the released variable can be locked by other threads.
read: acts on main memory variables; transfers the value of a variable from main memory to the thread's working memory for use by a subsequent load action.
load: acts on working memory variables; puts the value obtained by read from main memory into the working-memory copy of the variable.
use: acts on working memory variables; passes a working-memory variable to the execution engine, used whenever the virtual machine encounters an instruction that needs the variable's value.
assign: acts on working memory variables; puts a value received from the execution engine into the working-memory copy of the variable.
store: acts on working memory variables; transfers the value of a variable from working memory to main memory for a subsequent write.
write: acts on main memory variables; puts the value obtained by store from working memory into the main memory variable.
Eight rules
1. read and load, and store and write, must appear in pairs; neither operation may appear alone: a read must be followed by a load, and a store by a write.
2. The thread is not allowed to discard its latest assign operation, that is, after the data of the working variable is changed, the main memory must be informed (visible)
3. A thread is not allowed to synchronize unassigned data from working memory back to main memory.
4. A new variable must be created in the main memory, and the working memory is not allowed to directly use an uninitialized variable. That is, before a variable can be used for use or store operations, it must undergo assign and load operations.
5. Only one thread can lock a variable at the same time. After locking multiple times, you must perform the same number of unlocks to unlock.
6. If a lock operation is performed on a variable, the value of this variable in all working memory will be cleared. Before the execution engine uses this variable, it must be reloaded or assigned to initialize the value of the variable.
7. If a variable is not locked, it cannot be unlocked. Nor can you unlock a variable that is locked by another thread.
8. Before unlocking a variable, the variable must be synchronized back to the main memory.
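Loosely illustrating the assign/store/write and read/load rules above: a volatile field guarantees that the writer's update is flushed to main memory and becomes visible to the reader thread. A minimal sketch:

```java
public class VolatileDemo {
    // Without volatile, the reader might never observe the update;
    // volatile forces the store/write back to main memory and a fresh
    // read/load on every access (rules 2 and 8 above, loosely speaking)
    static volatile boolean ready = false;

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!ready) {
                // spin until the write to `ready` becomes visible
            }
            System.out.println("reader observed ready = true");
        });
        reader.start();
        Thread.sleep(50);
        ready = true;       // guaranteed to become visible to the reader
        reader.join(2000);
    }
}
```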
Linux kernel principles
Linux architecture
Kernel, shell, file system and applications. The kernel, shell, and file system together form the basic operating system structure
Linux file system
Linux shell
The shell is the user interface of the system, providing an interface for users to interact with the kernel. It receives commands entered by the user and sends them to the kernel for execution. It is a command interpreter. In addition, the shell programming language has many characteristics of ordinary programming languages. Shell programs written in this programming language have the same effect as other applications.
1. Bourne Shell: Developed by Bell Labs.
2. BASH: It is GNU's Bourne Again Shell. It is the default shell on the GNU operating system. Most Linux distribution packages use this shell.
3. Korn Shell: a development of the Bourne Shell, compatible with the Bourne Shell in most respects.
4. C Shell: the BSD version of Sun's shell.
Linux kernel
The Linux kernel is one of the largest open source projects in the world. The kernel is the lowest level of easily replaceable software that interfaces with computer hardware. It is responsible for connecting all applications running in "user mode" to the physical hardware and allows processes called servers to obtain information from each other using inter-process communication (IPC). The kernel is the core of the operating system and has many of the most basic functions. It is responsible for managing the system's processes, memory, device drivers, files and network systems, and determines the performance and stability of the system. The Linux kernel consists of the following parts: memory management, process management, device drivers, file system and network management
Memory management
Process management
File system
device driver
Network interface (NET)
User mode and kernel mode
Applications cannot directly access hardware resources and need to access hardware resources through the interface provided by the kernel SCI layer.
Network principles
Network protocol
HTTP
HTTP request and response format
Request type
GET
Ways to make a GET request
Request directly from the browser address bar
Requests made by hyperlinks
A form with method="get"
Characteristics of GET requests
The request parameters are spliced directly onto the URL, in the format url?key=value&key=value
Because the parameters are placed after the URL, the amount of data a GET request can carry is limited, generally not exceeding 4 KB.
Because the parameters are placed directly after the URL, GET is relatively insecure.
POST
Ways to make a POST request
A form with method="post"
Characteristics of POST requests
The request parameters are placed in the request body
Since the request parameters are placed in the request body: the amount of data requested is generally not limited.
Since the request parameters are placed in the request body: relatively safe
The difference between GET and POST requests
HTTP message level
The GET request puts the request information in the URL, and the POST puts the request content in the message body.
database level
GET requests are idempotent and safe; POST requests are not.
other aspects
GET can be cached and stored, but POST cannot
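The message-level difference above can be made concrete by writing out the two raw requests; the host and parameters here are illustrative only:

```java
public class HttpMessageDemo {
    public static void main(String[] args) {
        // GET: parameters are spliced onto the URL as url?key=value&key=value
        String getRequest = "GET /search?key=java&page=1 HTTP/1.1\r\n"
                          + "Host: example.com\r\n"
                          + "\r\n";

        // POST: the same parameters travel in the request body instead
        String body = "key=java&page=1";
        String postRequest = "POST /search HTTP/1.1\r\n"
                           + "Host: example.com\r\n"
                           + "Content-Type: application/x-www-form-urlencoded\r\n"
                           + "Content-Length: " + body.length() + "\r\n"
                           + "\r\n"
                           + body;

        System.out.println(getRequest);
        System.out.println(postRequest);
    }
}
```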
http request
request line
Request header
Allow
The request methods the server supports
Content-Length
The length of the request body in bytes
Content-Type
MIME type
Content-Encoding
Set the encoding type used by the data
Expires
The expiration time of the response body, a GMT time, indicating the validity time of the cache
Request body
The composition of http response
Response line (counterpart of the request line): protocol, status code, status message
Response header (counterpart of the request headers): key:value pairs that pass server-side information to the client, such as response format and encoding
Response body (counterpart of the request body): carries the response data, e.g. HTML
HTTP redirection and forwarding
Redirect (page jump)
What is redirection
When the browser first requests the web server, the server returns a 302 status code and a URL (in the Location header) to the browser. On receiving the 302 and the accompanying address, the browser immediately sends a second request to that address, and the server responds again.
response.sendRedirect("address")
Features
It is essentially a behavior on the browser, with two requests and two responses. The address in the address bar will change
The request and response objects of both requests are new.
Redirection technology can not only locate requests within the project, but also locate requests outside the project.
Forward (page jump)
What is forwarding
When the browser makes a request to a servlet of the web server, the servlet hands over the unfinished work to the next servlet for processing.
Get transponder
RequestDispatcher ds = request.getRequestDispatcher("forwarded address")
Forward
ds.forward(request,response)
Features
It is essentially a behavior on the server, with one request and one response. The address in the address bar will not change
Sharing request and response objects
Forwarding can only be done within the project
HTTP caching and proxy servers
Cookie mechanism
The difference between Cookie and Session
Cookie is special information sent by the server to the client, which is stored on the client in the form of text. When the client requests again, the cookie will be sent back. After the server receives it, it will parse the cookie and generate content corresponding to the client.
Session is a server-side mechanism that saves information on the server. The server parses the client request for a session id and saves state information as needed.
Session data lives on the server, while cookies are stored on the client; Session is more secure than Cookie. To reduce server pressure, use cookies.
Digital signature and certification
Signatures and certificates
In HTTPS, the server uses the private key to sign, and then the browser uses the certificate's public key to verify
The certificate needs to be verified by the CA first, and then the public key needs to be used to protect the symmetric key generated by the client.
SSL digital certificate
Three important attributes: organization information, public key, and validity period
Record public key and organization information in X.509 format
OpenSSL
Robust, commercial-grade and full-featured toolkit for Transport Layer Security (TLS) and Secure Sockets Layer (SSL) protocols
It is also a general-purpose cryptographic library and can convert between certificate formats.
Authentication steps
1) Server certificate authentication (the browser reports the symmetric key)
2) Encrypted communication (no certificate required)
HTTPS and SSL/TLS
SSL/TLS
HTTPS uses SSL/TLS technology
Its core is symmetric encryption and asymmetric encryption technology
TLS is built on the SSL 3.0 protocol specification and is a subsequent version of SSL 3.0
SSL technology
A technology to ensure communication security in C/S mode, relying on digital certificate technology
Can ensure tamper resistance, encrypted communication, compressed communication
communication process
1) The client sends a communication request
SSL version number, encryption parameters, session ID
2) Server replies with response
SSL version number, encryption parameters, session ID and other information; the server's public key certificate
3) The client uses the CA's public key to verify the server's public key certificate. If the server's certificate is invalid, an exception is thrown and the session is refused to continue.
4) The client sends the session key (a symmetric key) for this session to the server
5) Both parties use the session key for encrypted communication
Encryption
Symmetric encryption
Both encryption and decryption use the same key
asymmetric encryption
The key used for encryption and the key used for decryption are different
Hash algorithm
Convert data of any length into a fixed-length value, the algorithm is irreversible
digital signature
Prove that a message or document was issued/acknowledged by a certain person
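A minimal JCA sketch of the building blocks named above: symmetric AES encryption and decryption with one shared key, and a fixed-length, irreversible SHA-256 digest (the algorithm choices here are illustrative):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class CryptoDemo {
    public static void main(String[] args) throws Exception {
        // Symmetric encryption: the same AES key encrypts and decrypts
        SecretKey key = KeyGenerator.getInstance("AES").generateKey();
        Cipher cipher = Cipher.getInstance("AES");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        byte[] ciphertext = cipher.doFinal("hello".getBytes(StandardCharsets.UTF_8));
        cipher.init(Cipher.DECRYPT_MODE, key);
        String plaintext = new String(cipher.doFinal(ciphertext), StandardCharsets.UTF_8);
        System.out.println(plaintext);

        // Hash: any-length input to a fixed-length digest; not reversible
        byte[] digest = MessageDigest.getInstance("SHA-256")
                                     .digest("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(digest.length); // SHA-256 always yields 32 bytes
    }
}
```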
HTTPS data transfer process
The browser sends supported encryption algorithm information to the server
The server selects a set of encryption algorithms supported by the browser and sends it back to the browser in the form of a certificate.
The browser verifies the validity of the certificate and sends the encrypted information to the server based on the certificate's public key.
The server uses the private key to decrypt the message, verify the hash, and encrypt the response message back to the browser.
The browser decrypts the response message, verifies the message's authenticity, and then encrypts the interaction data.
The difference between HTTP and HTTPS
HTTPS needs to apply for a certificate from the CA, HTTP does not
HTTPS cipher text transmission, HTTP plain text transmission
The connection methods are different. HTTPS uses port 443 by default, and HTTP uses port 80 by default.
HTTPS = HTTP + encryption + authentication + integrity protection; more secure than HTTP
Is HTTPS really secure?
Browsers fill in http:// by default, so the request must first be redirected to https, leaving a window in which it can be hijacked.
HSTS (HTTP Strict Transport Security) optimization
TCP
TCP protocol and flow control
The TCP/IP model includes hundreds of interrelated protocols such as TCP, IP, UDP, Telnet, FTP, and SMTP.
Among them, TCP and IP are the two most commonly used underlying protocols.
How to ensure the reliability of TCP protocol
The difference between TCP and UDP
Introduction to UDP
UDP message structure
Source Port
Destination Port
Length
Check Sum
UDP features
non-connection oriented
Does not maintain connection state and supports sending the same information to multiple clients
The data header is very short, only 8 bytes, and the additional overhead is small
Throughput is limited only by the data generation rate, the transfer rate, and machine performance
Best effort delivery, but no guarantee of reliable delivery, no need to maintain complex link state tables
Packet-oriented, there is no need to split or merge the packets submitted by the application
Differences between TCP and UDP
connection oriented
TCP connection-oriented
UDP packet-oriented
reliability
TCP establishes reliability through its three-way handshake
UDP does not guarantee data reliability
Orderliness
TCP has a sequence number to ensure the orderliness of data transmission.
UDP does not guarantee
speed
TCP has done a lot of work to ensure the reliability and orderliness of data, etc., so it is slow.
UDP is fast
magnitude
TCP is heavyweight
UDP is lightweight
TCP sliding window
RTT and RTO
RTT
The time from sending a data packet to receiving the other party's ACK
RTO
retransmission interval
Socket
Relationship with TCP and HTTP
Stream-format sockets are transported using the TCP/IP protocols: TCP ensures the correctness of the data, and IP controls how the data travels from source to destination.
The HTTP protocol is based on connection-oriented sockets, because the data must be accurate
The underlying implementation principle
The return value of socket in UNIX/Linux is the file descriptor, and ordinary file operation functions can be used to transmit data.
Windows treats the socket as a network connection and needs to call the data transfer function specially designed for the socket.
type
Internet socket
Stream Sockets (stream format sockets)
meaning
Also called "connection-oriented socket", based on TCP protocol
Is a reliable, two-way communication data flow in which data reaches another computer without error and can be resent if damaged or lost.
feature
Data will not disappear during transfer
Data is transmitted in sequence
The sending and receiving of data is not synchronous
There is a buffer (that is, a character array) inside the streaming format socket, and the data transmitted by the socket will be saved in the buffer.
The receiving end does not necessarily read the data immediately after receiving it. The receiving end may read it all at once after the buffer is filled.
transmission
TCP socket
The buffer exists independently and is automatically generated when the socket is created.
Closing the socket will also continue to transfer data left in the output buffer
Closing the socket will lose the data in the input buffer
The default is blocking mode
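A loopback sketch of the buffering behavior described above: the sender writes (and even closes) before the receiver ever calls read, and the bytes wait in the socket's receive buffer. This is illustrative only, not production socket code:

```java
import java.io.DataInputStream;
import java.net.ServerSocket;
import java.net.Socket;

// The sender writes before the receiver reads; the data sits in the
// receive buffer, and a graceful close still delivers buffered bytes.
public class BufferedSocketDemo {
    public static String demo() throws Exception {
        try (ServerSocket server = new ServerSocket(0)) {       // port 0 = any free port
            Socket client = new Socket("127.0.0.1", server.getLocalPort());
            Socket peer = server.accept();
            client.getOutputStream().write("hello".getBytes("US-ASCII"));
            client.close();                                     // already-sent data is still delivered
            Thread.sleep(50);                                   // receiver reads "later"
            byte[] buf = new byte[5];
            new DataInputStream(peer.getInputStream()).readFully(buf);
            peer.close();
            return new String(buf, "US-ASCII");
        }
    }
}
```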
Datagram Sockets (datagram format sockets)
meaning
Also called "connectionless socket", based on UDP protocol
Just transmit data without data verification. If the data is damaged during transmission or does not reach another computer, there is no way to remedy it.
feature
Emphasis on fast transmission rather than transmission order
Transmitted data may be lost or corrupted
Limit data size per transfer
Data is sent and received synchronously
transmission
Use IP protocol for routing and UDP protocol (User Datagram Protocol) to transmit data
Application: QQ voice video chat, live broadcast
Compare
Connectionless socket transmission is efficient, but unreliable, with the risk of losing packets and corrupting data.
Connected sockets are very reliable and foolproof, but the transmission efficiency is low and consumes a lot of resources.
Unix socket (pathname of local node)
X.25 socket (CCITT X.25 address)
Data transfer process
three handshakes
When using connect() to establish a connection, the client and server will send three packets to each other.
four-way wave
A three-way handshake is required to establish a connection, and a four-way wave is required to disconnect
Broken link
For graceful shutdown, please use shutdown()
Possible problems
Sticky packets (message boundaries lost in the TCP byte stream)
High and low bits (big endian and little endian)
Big Endian: The high-order byte is stored in the low-order address (high-order byte first)
Little Endian: The high-order byte is stored in the high-order address (low-order byte first)
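The two layouts can be seen directly with ByteBuffer (a sketch; 0x12345678 is an arbitrary example value):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Lay out the same int 0x12345678 in big-endian vs little-endian order.
public class EndianDemo {
    public static byte[] bytes(ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(0x12345678).array();
    }
}
```

Big-endian yields 12 34 56 78 (high-order byte first); little-endian yields 78 56 34 12.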
operating system
Classification
According to real-time
time-sharing operating system
not real-time
real-time operating system
Soft real-time
Accept occasional violations of time rules
hard real time
Processing must be completed within strictly specified time limits
other
network operating system
Distributed operating system
personal computer operating system
Windows, macOS, Linux, UNIX
category
Microcontroller
Single-chip microcomputer, also known as microcontroller unit MCU
A microcomputer with a central processing unit, memory, timers/counters, various input and output interfaces, etc. all integrated on an integrated circuit chip
Embedded System
A computer system embedded within a mechanical or electrical system that has specialized functions and real-time computing capabilities
Often used to efficiently control many common devices, the embedded system is usually a complete device containing digital hardware and mechanical components.
Embedded Linux
It is the general name for a type of embedded operating system. This type of operating system is based on the Linux kernel and is designed to be used in embedded devices.
It is essentially the same as the Linux system running on the computer. It mainly uses the task scheduling, memory management, hardware abstraction and other functions in the Linux kernel.
RTOS (real-time operating system)
Runs tasks in a deterministic order, manages system resources, and provides a consistent foundation for developing applications
programming framework
Spring family bucket
Spring Framework
Features
Inversion of Control (IoC)
Dependency Injection (DI)
Loose coupling through dependency injection and interface orientation
Aspect Oriented (AOP)
Declarative programming based on aspects and inertia
Reduce boilerplate code with aspects and templates
Application scenarios
Authority authentication
Automatic caching
Error handling
debug
log
Transactions
Bean Oriented (BOP)
Lightweight and minimally intrusive programming based on POJOs
Spring is lightweight in terms of both size and overhead: the complete Spring framework fits in a single JAR of just over 1 MB
The processing overhead required by Spring is also negligible
Non-intrusive: Typically, objects in Spring applications do not depend on specific Spring classes.
container
high level view
Container interface
BeanFactory
It is understood as a HashMap, Key is BeanName, and Value is Bean instance.
Usually only the two functions of registration (put) and acquisition (get) are provided
Supports both singleton model and prototype model
ApplicationContext
"Application context" represents all functions of the entire large container
A refresh method is defined to refresh the entire container, that is, reload/refresh all beans
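As an illustrative toy (not real Spring code), the "HashMap with put/get" view of BeanFactory above, supporting both singleton and prototype scopes, might be sketched as:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Toy bean factory: register (put) and getBean (get), with two scopes.
public class ToyBeanFactory {
    private final Map<String, Supplier<Object>> definitions = new HashMap<>();
    private final Map<String, Object> singletons = new HashMap<>();

    public void register(String name, Supplier<Object> factory, boolean singleton) {
        if (singleton) singletons.put(name, factory.get());  // eager singleton, for simplicity
        else definitions.put(name, factory);
    }

    public Object getBean(String name) {
        Object bean = singletons.get(name);
        return bean != null ? bean : definitions.get(name).get(); // prototype: fresh instance each call
    }
}
```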
frame
Simple components can be configured and combined into complex applications
In Spring, application objects are composed declaratively, typically in an XML file
Provides many basic functions (transaction management, persistence framework integration, etc.), leaving the development of application logic to developers
Common modules
Spring Core
Provides creation of IOC container objects and processing of dependent object relationships
core
IOC
Common annotations
Class level annotations
@Component
@Controller
@Service
@Repository
@Configuration
@ComponentScan
@Bean
@Scope
Method variable level annotations
@Autowired
@Qualifier
@Resource
@Value
@Cacheable
@CacheEvict
Three injection methods
constructor injection
benefit
Ensure dependencies are immutable (final keyword)
Ensure that dependencies are not empty (saving us from checking them)
Ensure that the code returned to the client (calling) is in a fully initialized state
Avoid circular dependencies
Improved code reusability
Interface injection
setter injection
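A plain-Java sketch (no Spring involved) of why constructor injection earns the benefits listed above: the dependency can be declared final and null-checked once at construction, so the object is never observable half-initialized:

```java
// Repository and Service are illustrative names, not from any framework.
interface Repository { String find(); }

class Service {
    private final Repository repo;            // immutable, guaranteed non-null

    Service(Repository repo) {                // constructor injection
        if (repo == null) throw new IllegalArgumentException("repo must not be null");
        this.repo = repo;
    }

    String handle() { return "handled:" + repo.find(); }
}
```

With setter injection the same field could not be final and every method would have to tolerate it still being null.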
AOP
Implementation principle
dynamic proxy
JDK implementation
Need to implement at least one interface
CGlib
ASM operates bytecode implementation to generate subclasses of the target class
static proxy
compile time weaving
Class loading weaving
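A minimal sketch of the JDK dynamic-proxy mechanism listed above: the target must implement at least one interface, and the invocation handler plays the role of before/after advice:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class ProxyDemo {
    interface Greeter { String greet(String name); }

    // Wrap any Greeter so that "advice" runs around each call.
    public static Greeter withLogging(Greeter target, StringBuilder log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.append("before:").append(method.getName()).append(' '); // "before advice"
            Object result = method.invoke(target, args);
            log.append("after ");                                       // "after advice"
            return result;
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(), new Class<?>[]{Greeter.class}, handler);
    }
}
```

CGLIB takes the other route named above: it generates a bytecode subclass of the target, so no interface is required.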
Keywords
Join Point
Pointcut
Advice
Before advice
After returning advice
After throwing advice
After (finally) advice
Around advice
Introduction
Target Object
AOP proxy
Aspect
Weaving
Method to realize
annotation
@AspectJ
@Pointcut
@Before
@After
@Around
Configuration file
Common applications
Transactions
log
Permissions
Resource abstraction
Data validation and transformation
Spring expression language
core container
BeanFactory
BeanFactory is the infrastructure of the Spring framework, facing Spring itself
ApplicationContext is for developers using the Spring framework
The BeanDefinitionRegistry interface provides a method for manually registering BeanDefinition objects with the container.
The interface of the parent-child cascade IoC container. The child container can access the parent container through the interface method.
AutowireCapableBeanFactory automatic wiring
SingletonBeanRegistry registers a singleton bean during runtime
How IOC Inversion of Control is Implemented
XML configuration method
Annotation method
Automatic assembly method
Spring Web module
Provides support for Struts, Springmvc, and supports WEB development
SpringMVC
Annotation implementation for Servlet 3.0
ServletContainerInitializer container initialization
ServletRegistration registration
FilterRegisteration filter
ServletContext
Performance in practice
Based on Servlet3.0 asynchronous
Callable asynchronous processing
DeferredResult asynchronous processing
Spring Web MVC
Common annotations
Class level annotations
@EnableWebMvc
@SessionAttributes
Method variable level annotations
@RequestBody
@ResponseBody
@RequestMapping
@GetMapping
@PostMapping
@PutMapping
@DeleteMapping
@PatchMapping
@ModelAttribute
@RequestParam
@RequestHeader
@RestController
@PathVariable
@ControllerAdvice
@CookieValue
@CrossOrigin
@Valid
@Validated
@ExceptionHandler
core components
DispatcherServlet
HandlerMapping
HandlerAdapter
ViewResolver
···
Spring Web Flux
Reactor basics
Lambda
Mono
Flux
core
Web MVC annotations
Functional declaration
RouteFunction
Asynchronous non-blocking
Use cases
data access
transaction processing
JDBC template
test
unit test
Integration Testing
Spring Data
JPA
Redis
MongoDB
Couchbase
Cassandra
ElasticSearch
Neo4j
…
Spring Security
OAuth2.0
CAS
WEB security
Authorize
Authentication
encryption
…
Spring AOP
Spring provides aspect-oriented programming, which can provide transaction management for a certain layer, such as adding transaction control to the Service layer.
Spring DAO
Provides a meaningful exception hierarchy for the JDBC DAO abstraction layer
Spring ORM
Plugged into multiple ORM frameworks, including JDO, Hibernate and iBatis SQL Map
Spring JEE
Support for J2EE development specifications, including enterprise services such as JNDI, EJB, email, internationalization, validation, and scheduling
Spring context
Is a configuration file that provides contextual information to the Spring framework
Spring Session
Spring Integration
Spring REST Docs
Spring AMQP
FanoutExchange (publish/subscribe)
Type of Exchange
Fanout
Broadcast, delivering messages to all queues bound to the exchange
Direct
Directed, the message is delivered to the queue that matches the specified routing key.
Topic
Wildcard, hand the message to the queue that matches the routing pattern (routing pattern)
Headers
Use headers to decide which queues to send messages to (this is rarely used)
Data Access
transactions
DAO support
JDBC
ORM
Marshalling XML
Main jar package
beans
Basic implementation of Spring IOC, including accessing configuration files, creating and managing beans, etc.
context
Provides extended services on basic IOC functions, in addition to providing support for many enterprise-level services
core
Spring's core tool package, other packages depend on this package
expression
Spring expression language
Instrument
Spring's proxy interface to the server
orm
Integrate third-party ORM implementations, such as hibernate, ibatis, jdo and spring's jpa implementation
Spring websocket
Provide Socket communication and push function on the web side
Spring test
Simple encapsulation of test frameworks such as JUNIT
Common annotations
bean annotations
@Component component, no clear role
@Service is used in the business logic layer (service layer)
@Repository is used in the data access layer (dao layer)
@Controller is used in the presentation layer, controller declaration (C)
Java configuration class
@Configuration declares the current class as a configuration class, which is equivalent to Spring configuration in xml form (on the class)
The @Bean annotation is on the method, declaring that the return value of the current method is a bean, replacing the method in xml (used on the method)
@ComponentScan is used to scan Component, which is equivalent to (on class) in xml
@WishlyConfiguration is the combined annotation of @Configuration and @ComponentScan, which can replace these two annotations.
Aspect (AOP) related
@Aspect declares an aspect (on the class)
@After executes after the method is executed (on the method)
@Before is executed before the method is executed (on the method)
@Around is executed before and after the method is executed (on the method)
@PointCut declares pointcuts
@Enable annotation
@EnableAspectJAutoProxy
Enable Spring's support for AspectJ proxies (on classes)
@EnableAsync turns on support for asynchronous methods
@EnableScheduling turns on support for scheduled tasks
@EnableWebMvc turns on Web MVC configuration support
@EnableConfigurationProperties enables support for @ConfigurationProperties annotation configuration beans
@EnableJpaRepositories turns on support for SpringData JPA Repository
@EnableTransactionManagement turns on support for annotated transactions
@EnableCaching enables annotation caching support
Spring Panorama
SpringBoot
Contains modules
Single application
embedded container
Dependency management
Convention over configuration
environmental management
Log management
Configuration management
automatic configuration
Management functions
Endpoints
Instrumentation
Monitoring
Developer tools & CLI
Common annotations
Class level annotations
@SpringBootApplication
@RestController
@EnableAutoConfiguration
@EntityScan
Method variable level annotations
Three major characteristics
Automatic assembly of components
webMVC
Supported template engines
FreeMarker
Groovy
Thymeleaf
Mustache
JSP
webFlux
Supported template engines
FreeMarker
Thymeleaf
Mustache
JDBC
···
Embedded web container (no need to deploy War files)
Tomcat
Jetty
Undertow
Production Readiness Features
Provide opinionated starter dependencies to simplify build configuration
Provide operational features
health examination
Indicator information
Externalized configuration
No code generation, no XML configuration required
automatic assembly
Implementation
Activate autowiring-@EnableAutoConfiguration/@SpringBootApplication
Implement automatic assembly-XXXAutoConfiguration
Configure automatic assembly implementation-META-INFO/spring.factories
extension point
SpringApplication
Auto-Configuration
Diagnostics Analyzer
Embedded Container
Factories Loading Mechanism
Configuration Sources (Property Sources)
Endpoints
Monitoring and Management (JMX)
Event/Listener
SpringCloud
Common components
Spring Cloud GateWay
service gateway
spring-cloud-starter-gateway
Route
Predicate
Filter
Spring Cloud Config
Service configuration
Config Server
spring-cloud-config-server
@EnableConfigServer
Client
spring-cloud-starter-config
Spring Cloud Consul
Service registration/service configuration
Spring Cloud Stream
event driven
Source
Sink
Processor
Binders
spring-cloud-binder-rabbit
spring-cloud-binder-kafka
spring-cloud-binder-kafka-streams
(Functions and services) Spring Cloud Function
Function
Consumer
Supplier
Applications
spring-cloud-function-web
spring-cloud-function-stream
Spring Cloud Security
Service security
Spring Cloud Sleuth
Service call chain tracking and visualization with Zipkin
client
spring-cloud-starter-sleuth
spring-cloud-starter-zipkin
Zipkin Server
io.zipkin.java.zipkin-server
@EnableZipkinServer
Spring Cloud OpenFeign
For Restful calls between services (REST client)
spring-cloud-starter-openfeign
@EnableFeignClients
Spring Cloud Netflix
Service governance
Eureka service discovery
Client(Service)
spring-cloud-starter-netflix-eureka-client
@EnableEurekaClient
EurekaServer
spring-cloud-starter-netflix-eureka-server
@EnableEurekaServer
Hystrix fuse
Client
spring-cloud-starter-netflix-hystrix
@EnableCircuitBreaker
Turbine aggregation service
spring-cloud-starter-netflix-turbine
@EnableTurbine
DashBoard backend
spring-cloud-starter-netflix-hystrix-dashboard
@EnableHystrixDashboard
Zuul gateway service
spring-cloud-starter-netflix-zuul
@EnableZuulProxy > @EnableZuulServer
SideCar service
spring-cloud-starter-netflix-sidecar
@EnableSidecar
Ribbon client load balancing
spring-cloud-starter-netflix-ribbon
load rules
Random rule (RandomRule)
Best available rule (BestAvailableRule)
Round-robin rule (RoundRobinRule)
Retry rule (RetryRule)
Client-config-enabled rule
Availability filtering rule (AvailabilityFilteringRule)
Response-time weighted rule (WeightedResponseTimeRule)
Zone avoidance rule (ZoneAvoidanceRule)
(Task framework) Spring Cloud Task
spring-cloud-starter-task
@EnableTask
(Message bus) Spring Cloud Bus
spring-cloud-starter-bus-amqp
spring-cloud-starter-bus-kafka
Spring Cloud Circuit Breaker
Service fault tolerance
(Admin console) Spring Cloud Admin
de.codecentric.spring-boot-admin-starter-server
@EnableAdminServer
Spring Cloud Data Flow
DashBoard
Application
Spring Cloud Stream App Starters
Spring Cloud Task App Starters
Server
Deployer
(Microservice Contract) Spring Cloud Contract
spring-cloud-starter-contract-verifier
spring-cloud-starter-contract-stub-runner
ORM framework
JDBC
Database driver type
JDBC-ODBC bridge
Native API Java driver
JDBC network pure Java driver
Native protocol pure Java driver
MyBatis
definition
Excellent persistence layer framework that supports customized SQL, stored procedures and advanced mapping
Avoids almost all JDBC code and manual setting of parameters and retrieval of result sets
Architecture diagram
use
Two sql configuration methods
XML configuration method
Annotation method
configuration configures each element
properties
setting
typeAliases
typeHandlers
objectFactory
plugins
environments
databaseIdProvider
mapper
Pagination
pageHelper
Batch operations
Union query
Possible pitfalls
Matching of jdbcType and database field type
Hibernate(Nhibernate)
Can be used in Java client programs
Can be used in Servlet/JSP web applications
Hibernate framework can replace CMP in Java EE architecture applying EJB
SpringData
Spring Data JDBC
Spring Data JPA
sql generation
Splice sql by method name
Inquire
find
read
get
query
stream
First
Top
count
exists
Distinct
OrderBy
delete
remove
delete
other
IsBetween/Between
IsNotNull/NotNull
IsNull/Null
IsLessThan/LessThan
IsLessThanEqual/LessThanEqual
IsGreaterThan/GreaterThan
IsGreaterThanEqual/GreaterThanEqual
IsBefore/Before
IsAfter/After
IsNotLike/NotLike
IsLike/Like
IsStartingWith/StartingWith/StartsWith
IsEndingWith/EndingWith/EndsWith
IsNotEmpty/NotEmpty
IsEmpty/Empty
IsNotContaining/NotContaining/NotContains
IsContaining/Containing/Contains
IsNotIn/NotIn
IsIn/In
IsNear/Near
IsWithin/Within
MatchesRegex/Matches/Regex
IsTrue/True
IsFalse/False
IsNot/Not
Is/Equals
@Query
JPQL
Native SQL
Programmatic
JPA comes with commonly used APIs
JpaRepository<T, ID>
findAll
findAllById
saveAll
saveAndFlush
deleteInBatch
deleteAllInBatch
getOne
PagingAndSortingRepository<T, ID>
findAll
CrudRepository<T, ID>
save
saveAll
findAll
findById
existsById
count
deleteById
delete
deleteAll
other
flush
Spring Data Mongodb
Spring Data Redis
Spring Data Elasticsearch
Spring Data Apache Solr
Spring Data Apache Hadoop
other
EclipseLink
iBATIS
The predecessor of MyBatis
JFinal
Morphia
JavaWeb development framework
web framework
Netty
It is a NIO network framework for efficient development of network applications.
Threading model
Architecture diagram
Differences from Tomcat
Netty not only supports HTTP protocol, but also supports multiple application layer protocols such as SSH, TLS/SSL, etc.
Tomcat needs to follow the Servlet specification. Before Servlet 3.0, it used the synchronous blocking model.
Netty has a different focus than Tomcat. It does not need to be constrained by the Servlet specification and maximizes the NIO features.
Mina
It is the underlying NIO framework of the Apache Directory server (Netty is an upgraded version of Mina)
Grizzly
MVC framework
Struts
Struts2
JSF (Java Server Faces)
WebWork
Xwork1
WebWork2
Frame combination
SSM framework
SpringMVC Spring Mybatis
web layer (springmvc), service layer (spring) and DAO layer (mybatis)
SSMM framework
Spring SpringMVC Mybatis MySQL
SSH framework
Struts Spring Hibernate
Database connection pool
C3P0
DBCP
druid
HikariCP
proxool
Tomcat jdbc pool
BoneCP
Tapestry
Other frameworks
caching framework
Ehcache
Provides memory, disk file storage, and distributed storage methods, etc.
Fast, simple, low consumption, low dependency, strong scalability, supports object or serialization cache, supports cache or element invalidation
Structure diagram
Each CacheManager can manage multiple Cache, and each Cache can manage multiple Elements in a hash manner.
Element: used to store real cache content
caching strategy
TTL
LRU
redis
caffeine
Infinispan
Log processing
Log4j
SLF4J
Persistence layer framework
Hibernate
Hibernate makes a lightweight encapsulation of the code for JDBC access to the database, which greatly simplifies the tedious and repetitive code of the data access layer.
Mybatis
security framework
Spring Security
Shiro
computing framework
Storm
Nimbus
Supervisor
Worker
Executor
Task
Topology
Spout
Bolt
Tuple
Stream grouping
Shuffle
Fields
All
Global
None
Direct
Local or shuffle
JStorm
Spark Streaming
Flink
Blink
job framework (scheduled tasks)
Quartz
Common annotations
@DisallowConcurrentExecution
components
JobDetail
Trigger
SimpleTrigger
CronTrigger
Calendar
Schedule
ElasticJob
Spring-Task
Validation frame
Hibernate validator
Oval
Distributed architecture
cache
cache level
caching technology
[Server] Distributed Cache
【CDN】Dynamic caching technology
CSI (Client Side Includes)
Dynamically include the content of another page through iframe, javascript, ajax, etc.
The page can still be turned into a static html page, and where dynamic is required, it can be loaded dynamically through iframe, javascript or ajax.
Relatively simple and does not require changes and configurations on the server side
Not conducive to search engine optimization (iframe method), javascript compatibility issues, and client-side caching issues may cause untimely updates
SSI (Server Side Includes)
Load different modules through comment line SSI commands and build them into html to update the content of the entire website.
The corresponding files of each module are called through SSI and finally assembled into an html page, which requires server module support.
It is not tied to a specific language and is relatively universal; it only needs support from the web or application server, such as Nginx, Apache, IIS, etc.
SSI can only be loaded on the current server and cannot directly include files on other servers (that is, it cannot be included across domains)
ESI (Edge Side Includes)
Use simple markup language to describe content fragments in web pages that can and cannot be accelerated
Can be used to cache entire pages or page fragments, mostly executed on cache servers or proxy servers.
Currently, there are relatively few softwares that support ESI, and official updates are a bit slow, so it is not widely used.
Caching algorithm (page replacement algorithm)
FIFO (First in First out)
first in first out
LFU (Least Frequently Used)
Stores items in an array, uses a hashmap to record each item's position in the array, and tracks an access frequency per item; a hit increments the frequency, and eviction removes the least frequently accessed item
LRU
Each cache element carries a timestamp; when the cache is full and space is needed for new elements, the existing element whose timestamp is furthest from the current time is evicted
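In Java, LRU is often implemented with an access-ordered LinkedHashMap rather than explicit timestamps; a common sketch:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// LRU cache via LinkedHashMap's access-order mode: the entry least
// recently touched is evicted once capacity is exceeded.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true);          // true = iterate in access order, not insertion order
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;        // called after each put; true evicts the eldest entry
    }
}
```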
Use cases
1) Keep it consistent with the data structure in the database and cache it as it is.
2) Caching of list sorting and paging scenarios
3) Count cache
4) Reconstruct the dimension cache
5) Larger detailed content data cache
caching problem
Cache penetration refers to querying data that must not exist in a database
solution
Detect key specifications and intercept malicious attacks
If the object queried from the database is empty, it is also put into the cache. Set a shorter cache expiration time, such as 60 seconds.
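A sketch of that "cache the empty result with a short TTL" idea. The class and field names here are made up for illustration, and the 60s/600s TTLs are arbitrary:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Cache the "not found" result too, with its own short TTL, so repeated
// lookups of a missing key stop hammering the database.
public class NullCachingLoader<K, V> {
    private static class Entry<V> {
        final Optional<V> value; final long expiresAt;
        Entry(Optional<V> value, long expiresAt) { this.value = value; this.expiresAt = expiresAt; }
    }

    private final Map<K, Entry<V>> cache = new HashMap<>();
    private final Function<K, V> db;   // stand-in for the database; returns null when absent
    public int dbHits = 0;             // exposed only so the behavior is observable

    public NullCachingLoader(Function<K, V> db) { this.db = db; }

    public Optional<V> get(K key, long nowMillis) {
        Entry<V> e = cache.get(key);
        if (e == null || nowMillis >= e.expiresAt) {
            dbHits++;
            V v = db.apply(key);
            long ttl = (v == null) ? 60_000 : 600_000;   // missing rows: short 60s TTL
            e = new Entry<>(Optional.ofNullable(v), nowMillis + ttl);
            cache.put(key, e);
        }
        return e.value;
    }
}
```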
Cache avalanche refers to the centralized expiration of caches in a certain period of time.
Different categories of goods have different cache periods. Products in the same category, plus a random factor
Cache breakdown means that a key is very hot and large concurrency is concentrated on accessing this point.
When the key expires, the continuous large concurrency breaks through the cache and directly requests the database.
Allows cache to never expire
Caching solutions in action
Caching concept
SpringCache usage
Cache consistency strategy
Cache Avalanche Scenario
Cache penetration solution
Three major contradictions
Cache real-time and consistency issues
real time strategy
The application first fetches the data from the cache. If it does not get it, it fetches the data from the database. After success, it puts it in the cache.
During the writing process, the data is stored in the database. After success, the cache is invalidated.
Cache penetration problem
Asynchronous strategy
read
1) When it cannot be read during reading, it does not directly access the database and returns a fallback data.
2) Put a data loading event into the message queue, read the database asynchronously and update it to the cache
renew
Update the database first, then update the cache asynchronously
Update the cache first, then update the database asynchronously
Cache high concurrent access to the database
Timing strategy
The application only accesses the cache, not the database
Split a whole piece of data into several parts for caching, and distinguish between frequently updated and infrequently updated parts.
Distributed storage
Traditional network storage
NAS
A network server that provides storage capabilities and a file system
protocol
SMB
NFS
AFS
SAN
Only block storage is provided, and the file system is managed by the client.
protocol
FibreChannel
iSCSI
ATA over Ethernet (AoE)
HyperSCSI
object storage
access form
Access via REST web service object
Handled via HTTP predefined methods
GET
Get a network resource
PUT
Create or replace a network resource
POST
Used to create a resource. If it already exists, an error will be reported.
DELETE
Delete a network resource
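Those verbs map directly onto an HTTP client; a sketch using Java 11's java.net.http (the bucket URL is a made-up placeholder, and no request is actually sent here):

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Build the object-storage verbs as HttpRequest objects (not sent).
public class ObjectStoreRequests {
    // Hypothetical object URL, for illustration only.
    static final URI OBJECT = URI.create("https://storage.example.com/bucket/key");

    public static HttpRequest get()    { return HttpRequest.newBuilder(OBJECT).GET().build(); }
    public static HttpRequest put()    { return HttpRequest.newBuilder(OBJECT)
            .PUT(HttpRequest.BodyPublishers.ofString("content")).build(); }
    public static HttpRequest delete() { return HttpRequest.newBuilder(OBJECT).DELETE().build(); }
}
```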
metadata service
object hash value
Object storage uses the hash value of the object as a globally unique identifier
Use a hash function with a large bit width to compute the hash value, so that data with different content yields different hash values (collisions become negligible)
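A sketch of deriving such a content-based identifier with SHA-256 (a 256-bit digest, so collisions are negligible in practice):

```java
import java.math.BigInteger;
import java.security.MessageDigest;

// Derive a globally unique object id from the object's content.
public class ObjectId {
    public static String of(byte[] content) throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
        return String.format("%064x", new BigInteger(1, digest)); // 256-bit id as 64 hex chars
    }
}
```

Identical content always maps to the same id, which is also what gives object stores deduplication for free.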
service architecture
ElasticSearch
http
Breakpoint upload
Highly available data storage
MySQL high-performance storage practice
Mycat advanced practice
FastDFS distributed file storage practice
File storage practice
File synchronization practice
File query practice
Distributed deployment practice
Distributed transactions
isolation level
The default isolation level of the database is used
MySQL defaults to repeatable read
Oracle defaults to read committed
ACID
Atomicity
Atomicity means that a transaction is an indivisible unit of work: the operations within it either all occur or none occur
Consistency
The integrity of the data before and after the transaction must be consistent
Isolation isolation
When multiple users access the database concurrently, each user's transaction must not be interfered with by the operations of other transactions; concurrent transactions must be isolated from each other
Durability
Once a transaction commits, its changes to the data in the database are permanent; even if the database subsequently fails, the committed data is not affected
Transaction propagation characteristics
Guaranteed to be in the same transaction:
PROPAGATION_REQUIRED: supports the current transaction; if none exists, creates a new one (default)
PROPAGATION_SUPPORTS: supports the current transaction; if none exists, runs non-transactionally
PROPAGATION_MANDATORY: supports the current transaction; if none exists, throws an exception
Guaranteed not to be in the same transaction:
PROPAGATION_REQUIRES_NEW: if a transaction exists, suspends it and creates a new transaction
PROPAGATION_NOT_SUPPORTED: runs non-transactionally; if a transaction exists, suspends it
PROPAGATION_NEVER: runs non-transactionally; throws an exception if a transaction exists
PROPAGATION_NESTED: if a current transaction exists, executes within a nested transaction
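The propagation behaviors above can be summarized as a plain-Java decision function (illustrative only; the names mirror Spring's constants, but this is not Spring code):

```java
// Given a propagation setting and whether a transaction already exists,
// return what Spring would do (as a descriptive string).
public class Propagation {
    public static String resolve(String propagation, boolean txExists) {
        switch (propagation) {
            case "REQUIRED":      return txExists ? "join" : "create";
            case "SUPPORTS":      return txExists ? "join" : "none";
            case "MANDATORY":     if (!txExists) throw new IllegalStateException("no transaction");
                                  return "join";
            case "REQUIRES_NEW":  return txExists ? "suspend+create" : "create";
            case "NOT_SUPPORTED": return txExists ? "suspend+none" : "none";
            case "NEVER":         if (txExists) throw new IllegalStateException("transaction exists");
                                  return "none";
            case "NESTED":        return txExists ? "nested" : "create";
            default: throw new IllegalArgumentException(propagation);
        }
    }
}
```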
Distributed transaction framework
2PC
3PC
JOTM
Atomikos
Distributed cluster
master-slave replication
Read and write separation
load balancing
Distributed lock
Distributed lock characteristics
mutual exclusivity
Only one thread holds the lock at the same time
Reentrancy
The same thread on the same node can acquire the lock again after acquiring the lock.
lock timeout
Like the lock in the JUC package, it supports lock timeout to prevent deadlock.
High performance and high availability
Locking and unlocking must be efficient, while also ensuring high availability to prevent distributed lock failure.
Has blocking and non-blocking features
Able to wake up from blocked state in time
Distributed lock implementation
database based
Check the implementation yourself
Based on redis
Based on zooKeeper
Check the implementation yourself
redis implementation
How to implement locking
Using the setnx + expire commands (wrong approach)
Because setnx and expire are two separate commands, the combination is not atomic
public boolean tryLock(String key, String request, int timeout) {
    Long result = jedis.setnx(key, request);
    // result == 1 means the key was set successfully; otherwise setting failed
    if (result == 1L) {
        return jedis.expire(key, timeout) == 1L;
    } else {
        return false;
    }
}
Use Lua script (including setnx and expire instructions)
public boolean tryLock_with_lua(String key, String uniqueId, int seconds) {
    String luaScript =
        "if redis.call('setnx', KEYS[1], ARGV[1]) == 1 then " +
        "redis.call('expire', KEYS[1], ARGV[2]) return 1 " +
        "else return 0 end";
    List<String> keys = new ArrayList<>();
    List<String> values = new ArrayList<>();
    keys.add(key);
    values.add(uniqueId);
    values.add(String.valueOf(seconds));
    Object result = jedis.eval(luaScript, keys, values);
    // 1 means the lock was acquired
    return result.equals(1L);
}
Use the set key value [EX seconds][PX milliseconds][NX|XX] command (correct approach)
Starting from version 2.6.12, Redis added a series of options to the SET command: SET key value [EX seconds] [PX milliseconds] [NX|XX]
EX seconds: set the expiration time in seconds
PX milliseconds: set the expiration time in milliseconds
NX: set the value only if the key does not exist
XX: set the value only if the key exists
public boolean tryLock_with_set(String key, String uniqueId, int seconds) {
    return "OK".equals(jedis.set(key, uniqueId, "NX", "EX", seconds));
}
The value must be unique; we can use a UUID, setting a random string to guarantee uniqueness. Why must it be unique? If the value were a fixed string rather than a random one, the following could happen:
1. Client 1 acquires the lock successfully
2. Client 1 blocks on some operation for too long
3. The key expires and the lock is automatically released
4. Client 2 acquires the lock for the same resource
5. Client 1 recovers from blocking; because the values are identical, its release operation frees the lock now held by client 2, which causes problems
So generally speaking, when releasing the lock, we need to verify the value
Using the set key value [EX seconds][PX milliseconds][NX|XX] command seems fine, but in redis cluster mode, problems may still occur.
Client A acquires the lock on the master. Before the master has synchronized the key to its slaves, the master goes down and one slave is elected as the new master. Another client can then successfully acquire the same lock, so multiple clients end up holding it simultaneously.
Unlock implementation method
To unlock, we must verify the value; we cannot simply run del key, because then any client could release the lock. On unlock, we check whether the stored value is our own and decide based on that.
public boolean releaseLock_with_lua(String key, String value) {
    String luaScript =
        "if redis.call('get', KEYS[1]) == ARGV[1] then " +
        "return redis.call('del', KEYS[1]) " +
        "else return 0 end";
    return jedis.eval(luaScript,
            Collections.singletonList(key),
            Collections.singletonList(value)).equals(1L);
}
Redisson
redlock
Implementation principle
Get the current Unix time in milliseconds.
Try to acquire the lock on each of the 5 instances in sequence, using the same key and a unique value (such as a UUID). When requesting the lock from a Redis instance, the client should set a connection and response timeout that is much smaller than the lock's expiration time. For example, if the lock's automatic expiration time is 10 seconds, the timeout might be in the 5-50 millisecond range. This avoids the client waiting on a Redis server that has already gone down; if the server does not respond within the timeout, the client should move on to the next Redis instance as soon as possible.
The client uses the current time minus the time when it started to acquire the lock (the time recorded in step 1) to get the time used to acquire the lock.
The lock is considered acquired if and only if it was obtained from a majority of the Redis nodes (N/2 + 1, here 3 of 5) and the total time used is less than the lock's expiration time.
If the lock is acquired, the real valid time of the key is equal to the valid time minus the time used to acquire the lock (the result calculated in step 3).
If for some reason the lock acquisition fails (the lock was not acquired on at least N/2 + 1 Redis instances, or the acquisition time exceeded the effective time), the client should unlock all Redis instances (even those where the lock was not acquired at all; this prevents a node from having granted the lock while the client never received the response, which would leave the lock unacquirable for a period of time).
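The majority-and-validity check in steps 3-5 above can be sketched as follows (illustrative names and structure, not the Redisson implementation):

```java
public class RedlockCheck {
    /**
     * @param n             total number of Redis instances (e.g. 5)
     * @param locksAcquired how many instances granted the lock
     * @param ttlMillis     lock expiration time set on each instance
     * @param elapsedMillis time spent acquiring (now minus start time, step 3)
     * @return remaining validity of the lock in ms, or -1 if acquisition failed
     */
    public static long validityMillis(int n, int locksAcquired,
                                      long ttlMillis, long elapsedMillis) {
        int quorum = n / 2 + 1;                    // majority, e.g. 3 of 5
        long validity = ttlMillis - elapsedMillis; // step 5: real valid time
        if (locksAcquired >= quorum && validity > 0) {
            return validity;                       // lock considered held
        }
        return -1;                                 // caller must unlock all instances
    }
}
```

With a 10 s TTL and 300 ms spent acquiring, 3 of 5 grants yield 9700 ms of real validity; only 2 grants, or a too-long acquisition, fail the check.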
Usage
Introduce pom
<!-- https://mvnrepository.com/artifact/org.redisson/redisson -->
<dependency>
    <groupId>org.redisson</groupId>
    <artifactId>redisson</artifactId>
    <version>3.3.2</version>
</dependency>
Get lock
The lock is acquired with redLock.tryLock() or redLock.tryLock(500, 10000, TimeUnit.MILLISECONDS). Both ultimately run the same core code; the former simply uses the default lease time (leaseTime) of LOCK_EXPIRATION_INTERVAL_SECONDS, i.e. 30s:
Config config = new Config();
config.useSentinelServers()
        .addSentinelAddress("127.0.0.1:6369", "127.0.0.1:6379", "127.0.0.1:6389")
        .setMasterName("masterName")
        .setPassword("password")
        .setDatabase(0);
RedissonClient redissonClient = Redisson.create(config);
// Could also be getFairLock() or getReadWriteLock()
RLock redLock = redissonClient.getLock("REDLOCK_KEY");
boolean isLock;
try {
    isLock = redLock.tryLock();
    // Wait at most 500 ms for the lock; 10000 ms (10 s) is the lock expiration time
    isLock = redLock.tryLock(500, 10000, TimeUnit.MILLISECONDS);
    if (isLock) {
        // TODO: if the lock was acquired, do something
    }
} catch (Exception e) {
} finally {
    // Always unlock in the end, no matter what
    redLock.unlock();
}
KEYS[1] is Collections.singletonList(getName()), the key of the distributed lock, i.e. REDLOCK_KEY; ARGV[1] is internalLockLeaseTime, the lease time of the lock, 30s by default; ARGV[2] is getLockName(threadId), the unique value set when acquiring the lock, i.e. UUID + threadId:
Unique ID
A very important point in implementing a distributed lock is that the value being set must be unique. How does Redisson guarantee the uniqueness of the value? The answer is UUID + threadId.
protected final UUID id = UUID.randomUUID();

String getLockName(long threadId) {
    return id + ":" + threadId;
}
Unlock
The code to release the lock is redLock.unlock()
load balancing
Four-layer load balancing vs seven-layer load balancing
Layer 4 load balancing (destination address and port switching)
F5: Hardware load balancer, very functional, but very expensive.
lvs: heavyweight four-layer load-balancing software.
nginx: lightweight four-layer load software with caching function and flexible regular expressions.
haproxy: simulates layer 4 forwarding and is more flexible.
Seven-layer load balancing (content switching)
haproxy: built-in load balancing technology; fully supports seven-layer proxying, session protection, marking, and route transfer.
nginx: only performs well for the HTTP and mail protocols; its performance is similar to haproxy.
apache: poor functionality
Mysql proxy: The function is acceptable.
Load balancing algorithm/strategy
Round Robin
Weighted Round Robin
Random equilibrium (Random)
Weighted Random Balance
Response speed balance (Response Time detection time)
Least Connection Balance
Processing power balance (CPU, memory)
DNS response balancing (Flash DNS)
Hash algorithm
IP address hashing (to ensure stable client-server correspondence)
URL hash
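Two of the strategies above, round robin and IP-address hashing, can be sketched in a few lines of Java (server addresses are hypothetical):

```java
import java.util.List;

public class Balancer {
    private int next = 0;

    // Round Robin: hand out servers in rotation.
    public String roundRobin(List<String> servers) {
        String chosen = servers.get(next % servers.size());
        next++;
        return chosen;
    }

    // IP-address hashing: the same client IP always maps to the same
    // back-end, giving the stable client-server correspondence noted above.
    public static String ipHash(List<String> servers, String clientIp) {
        return servers.get(Math.floorMod(clientIp.hashCode(), servers.size()));
    }
}
```

Math.floorMod (rather than %) keeps the index non-negative even when hashCode() is negative.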
LVS
LVS principle
LVS NAT mode
①. The client sends the request to the front-end load balancer. The source address of the request packet is the client IP (hereinafter CIP), and the destination address is the VIP (the load balancer's front-end address, hereinafter VIP).
②. When the load balancer receives the packet and finds that the request is for an address defined in its rules, it changes the destination address of the client's request packet to the RIP of a back-end server chosen by the scheduling algorithm, and forwards the packet.
③. When the packet reaches the Real Server, since the destination address is its own, it handles the request and returns the response packet to the LVS.
④. lvs then changes the source address of the response packet to its own and sends it to the client.
Features
1. NAT requires both the request and the response packet to be rewritten by the LB, so when site traffic is high the LB scheduler becomes a serious bottleneck; it generally supports at most 10-20 nodes.
2. Only one public IP address needs to be configured on the LB.
3. The gateway of each internal real server must be set to the intranet address of the LB scheduler.
4. NAT mode supports IP address and port translation, i.e. the port the user requests and the real server's port may differ.
advantage
The physical servers in the cluster can use any operating system that supports TCP/IP, only the load balancer requires a valid IP address
shortcoming
Scalability is limited: when the number of server nodes (ordinary PC servers) grows too large, the load balancer becomes the bottleneck of the whole system, because every request packet and response packet flows through it.
With too many server nodes, large numbers of packets converge at the load balancer and throughput drops.
LVS DR mode (LAN rewrites mac address)
①. The client sends the request to the front-end load balancer. The source address of the request message is CIP and the destination address is VIP.
②. When the load balancer receives the packet and finds that the request is for an address defined in its rules, it changes the source MAC address of the client's request packet to its own DIP's MAC address and the destination MAC to the RIP's MAC address, then sends the packet to the RS.
③. When the RS sees that the destination MAC of the request packet is its own, it accepts the packet. After processing the request, it sends the response out through the eth0 network card via the lo interface, directly to the client.
Features
1. Forwarding is implemented by modifying the destination MAC address of the data packet on the scheduler LB. Note that the source address is still the CIP and the destination address is still the VIP address.
2. The requested message passes through the scheduler, while the RS response processed message does not need to pass through the scheduler LB, so the usage efficiency is very high when the amount of concurrent access is large (compared with NAT mode)
3. Because DR mode implements forwarding through the MAC address rewriting mechanism, all RS nodes and scheduler LB can only be in one LAN.
4. The RS host must bind the VIP to its lo interface (32-bit mask) and configure ARP suppression.
5. The default gateway of the RS node does not need to be configured as an LB, but directly configured as the gateway of the upper-level route, which allows the RS to directly go out of the network.
advantage
Like TUN (tunnel mode), the load balancer only distributes requests, and response packets are returned to the client through a separate routing method. Compared with VS-TUN, VS-DR does not require a tunnel structure, so most operating systems can be used as physical servers.
DR mode is very efficient, but its configuration is a bit more complex, so companies without particularly heavy traffic can use haproxy/nginx instead. Under roughly 10-20 million PV per day, or fewer than 10,000 concurrent requests, haproxy/nginx are worth considering.
shortcoming
All RS nodes and scheduler LB can only be in one LAN
LVS TUN mode (IP encapsulation, cross-network-segment)
① The client sends the request to the front-end load balancer. The source address of the request message is CIP and the destination address is VIP.
②. When the load balancer receives the packet and finds that the request is for an address defined in its rules, it encapsulates the client's request packet in an outer IP header whose source address is the DIP and whose destination address is the RIP, and sends the packet to the RS.
③. After the RS receives the packet, it first strips the outer encapsulation, finds that the destination address of the inner IP header is the VIP bound to its own lo interface, processes the request, and sends the response out through the eth0 network card via the lo interface, directly to the client.
Features
1. TUNNEL mode requires the VIP to be bound on every realserver machine.
2. In TUNNEL mode the vip -> realserver packets travel through an IP tunnel, so they can communicate on either the internal or the external network; the lvs vip and the realservers do not need to be on the same network segment.
3. In TUNNEL mode the realserver sends packets directly to the client, not back through lvs.
4. TUNNEL mode relies on tunneling, which makes it hard to operate and maintain, so it is rarely used in practice.
advantage
The load balancer is only responsible for distributing request packets to backend node servers, while RS sends response packets directly to users.
This removes the heavy data flow from the load balancer, so it is no longer the system bottleneck and can handle enormous request volumes: one load balancer can distribute to many RSs, and by running over the public Internet the RSs can even be distributed across regions.
shortcoming
The RS nodes in tunnel mode need legal IP addresses, and every server must support the "IP Tunneling" (IP Encapsulation) protocol, which may limit the servers to certain Linux systems.
LVS FULLNAT mode
1. When a packet is forwarded from LVS to an RS, the source address is rewritten from the client IP to LVS's intranet IP. Intranet IPs can communicate across VLANs through multiple switches. The destination address is changed from the VIP to the RS IP.
2. When the RS finishes processing and replies, it sets the destination address to the LVS IP and the source address to the RS IP, returning the packet to LVS's intranet IP; this step is not limited by VLAN.
3. When LVS receives the packet, building on NAT mode's source-address rewriting, it changes the destination address of the RS's packet from LVS's intranet IP to the client IP, and the source address to the VIP.
Summarize
1. FULL NAT mode does not require the LB IP and the realserver IP to be on the same network segment.
2. Because the source IP must also be rewritten, full NAT's performance is normally about 10% lower than NAT mode's.
Keepalived
Keepalived was originally designed for LVS, specifically to monitor the status of each LVS service node. The VRRP function was added later, so besides LVS it can also serve as high-availability software for other services (nginx, haproxy).
VRRP is the abbreviation of Virtual Router Redundancy Protocol. VRRP was created to solve the single point of failure in static routing; it keeps the network running stably without interruption.
Nginx reverse proxy load balancing
upstream_module and health detection
proxy_pass request forwarding
HAProxy
Distributed coordination and offloading
Zookeeper distributed environment commander
Getting started with zk
zk development basics
zookeeper application practice
Protocol and algorithm analysis
Nginx high concurrency offloading advanced practice
nginx installation
Forward and reverse proxy
nginx process model
core configuration structure
Log configuration and signature
location rules
Use of rewrite
Static/dynamic content separation
Cross-domain configuration
Cache configuration, Gzip configuration
https configuration
Problems caused by horizontal expansion
LVS
keepalived
distributed consistency
Raft algorithm
Features
Write to database based on Quorum
A write is considered successful once more than half of the nodes have acknowledged it.
The master library writes the log and pushes it to the slave library
Three types of logs: BinLog, RedoLog, UndoLog
Election based on log comparison
Determine whose log is the most up-to-date (by comparing the log index carried in other nodes' vote requests with the local log index)
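The "whose log is the latest" comparison can be sketched as follows; note that full Raft compares the last log term first and falls back to the index only on a tie (a sketch, not any particular implementation):

```java
public class RaftVote {
    // A voter grants its vote only if the candidate's log is at least
    // as up-to-date as its own.
    public static boolean candidateLogUpToDate(long candLastTerm, long candLastIndex,
                                               long myLastTerm, long myLastIndex) {
        if (candLastTerm != myLastTerm) {
            return candLastTerm > myLastTerm;  // higher last term wins
        }
        return candLastIndex >= myLastIndex;   // same term: longer log wins
    }
}
```

So a candidate with a higher last term beats a voter with a longer but older log.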
Role classification
Leader
Candidate (Leader candidate)
Follower
Message type
RequestVote
Request other nodes to vote for themselves (generally issued by Candidate)
AppendEntries
Used for log replication, carrying the entries to append to the log; with zero entries it serves as the heartbeat.
Sent by Leader
Practical implementation of common distributed solutions
transaction concept
Transactions and locks
Background of distributed transactions
X/OpenDTP transaction model
Standard distributed transactions
Distributed transaction solutions
two-phase commit
BASE theory and flexible affairs
TCC scheme
compensatory plan
Asynchrony Guaranteed and Best Effort
Single sign-on solution
Problem background of single sign-on
Page cross-domain issues
Session cross-domain sharing solution
Session extension
Distributed task scheduling solution
How to use Quartz scheduling
Elastic-Job example
Difficulties in distributed scheduling
Quartz cluster customized distributed scheduling
Distributed framework and middleware
Distributed calls
RPC (Remote Call)
Restful
middleware
Caching/persistence
Distributed cache
Redis (Remote Dictionary Server)
object relations
Structure diagram
Memory classification
object memory
buffer memory
Client buffering
AOF buffer
Copy backlog buffer
Mainly used for master-slave synchronization.
own memory
Memory consumption of Redis creation child process when using AOF/RDB
memory fragmentation
Optional allocators are jemalloc, glibc, tcmalloc, the default is jemalloc
High memory fragmentation solution: data alignment, safe restart (high availability/master-slave switching)
Memory reclamation strategy
Lazy deletion
Redis does not actively delete expired key-value pairs; instead, when a client reads a key that has already expired, the key-value object is deleted and empty is returned.
Scheduled deletion (a periodic task removes expired keys)
Pipeline
Redis uses a client-server (CS) model with a TCP request/response protocol. A request typically goes through the following steps:
The client sends a query request to the server and listens for the Socket return, usually in blocking mode, waiting for the server to respond.
The server processes the command and returns the results to the client.
The Redis client and server are connected via TCP. A client can issue multiple request commands over one socket connection; after each request is sent, the client usually blocks and waits for the server to process it, and the server returns the result in a response message. So when executing multiple commands, each one must wait for the previous command to complete.
A pipeline sends multiple commands at once and returns all the results together after execution. It reduces round-trip latency by reducing the number of client-server exchanges. Pipeline is implemented as a queue, and a queue is first-in-first-out, which preserves the order of the data. (In Jedis, the default sync batch size is 53: once 53 commands have accumulated, the batch is submitted.) The client can pack three commands into one TCP message and send them together, and the server can pack the three results into one TCP message and return it.
It should be noted that commands are packaged and sent in pipeline mode, and redis must cache the processing results of all commands before processing them. The more commands you package, the more memory the cache consumes. So it’s not that the more commands you package, the better. The specific amount of appropriateness needs to be tested according to the specific situation.
Since communication has network latency, suppose packet transmission between client and server takes 0.125 seconds. Then the three commands above, as six messages, take at least 0.75 seconds in total. Even if Redis can process 100 commands per second, this client can only issue four commands per second, which clearly fails to exploit Redis's processing power.
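The arithmetic in the example above can be written out as a toy model (not Jedis code; the 0.125 s figure is the example's assumed one-way message time):

```java
public class PipelineLatency {
    // One-way transmission time per message, from the example above.
    static final double PER_MESSAGE_SECONDS = 0.125;

    // Without pipelining: each command is one request + one response message.
    public static double withoutPipeline(int commands) {
        return commands * 2 * PER_MESSAGE_SECONDS;
    }

    // With pipelining: all commands share one request and one response message.
    public static double withPipeline(int commands) {
        return 2 * PER_MESSAGE_SECONDS;
    }
}
```

For 3 commands this gives 0.75 s unpipelined versus 0.25 s pipelined, matching the example.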
Applicable scene
Some systems may have very high reliability requirements. Every operation needs to know immediately whether the operation is successful and whether the data has been written to redis. In this case, this scenario is not suitable.
Other systems may write data to Redis in batches and can tolerate a certain proportion of write failures; pipeline suits that scenario. For example, when inserting 10,000 entries into Redis at once, it may not matter if 2 of them fail, as a compensation mechanism can fix them later. Mass SMS sending is a typical case: sending 10,000 messages in one batch under the plain request/response mode would take so long to respond that a client with a 5-second timeout would certainly see an exception, while the real-time requirements of bulk SMS are not that strict. Pipeline is the best fit here.
Pipelining vs Scripting
Scenarios with large pipelines can often be handled more efficiently with Redis scripts (Redis >= 2.6), which move much of the work to the server side. A major advantage of scripts is that they can read and write data with minimal latency, making read-compute-write sequences very fast (a pipeline cannot help there, because the client would need to read a command's result before issuing the write).
Applications may sometimes send EVAL or EVALSHA commands in the pipeline. Redis explicitly supports this case through the SCRIPT LOAD command (which guarantees that EVALSHA will be called successfully).
Redis cluster solution
twemproxy
redis cluster (redis-cluster)
Redis-Cluster adopts a centerless structure. Each node saves data and the entire cluster status, and each node is connected to all other nodes.
Structural features
All redis nodes are interconnected with each other (PING-PONG mechanism), and a binary protocol is used internally to optimize transmission speed and bandwidth.
A node is marked as failed only when more than half of the nodes in the cluster detect the failure.
The client is directly connected to the redis node, without the need for an intermediate proxy layer. The client does not need to connect to all nodes in the cluster, just connect to any available node in the cluster.
redis-cluster maps all physical nodes onto slots [0-16383] (not necessarily evenly distributed), and the cluster maintains the node <-> slot <-> value mapping (a hash ring of 16384 = 2^14 slots).
The Redis cluster is pre-divided into 16384 buckets. When a key-value needs to be placed in the Redis cluster, the bucket in which a key is placed is determined based on the value of CRC16(key) mod 16384.
redis cluster node allocation
Now we have three main nodes: A, B, and C. They can be three ports on one machine or three different servers. Then, if 16384 slots are allocated using hash slots, the slot intervals assumed by the three nodes are
Node A covers 0-5460;
Node B covers 5461-10922;
Node C covers 10923-16383.
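The CRC16 used by Redis Cluster is the XMODEM variant (polynomial 0x1021, initial value 0). A minimal sketch of the slot computation (hash tags in {braces}, which real Redis also honors, are omitted here):

```java
public class RedisSlot {
    // CRC16-XMODEM: MSB-first, polynomial 0x1021, initial value 0.
    public static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? (crc << 1) ^ 0x1021 : crc << 1;
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    // Slot = CRC16(key) mod 16384, as described above.
    public static int keyslot(String key) {
        return crc16(key.getBytes()) % 16384;
    }
}
```

The standard XMODEM check value, CRC16("123456789") = 0x31C3, can be used to sanity-check the implementation.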
retrieve data
When a value is stored, the redis cluster hash-slot algorithm applies: e.g. CRC16('key') mod 16384 = 6782, so this key is stored on node B. Likewise, when connecting to any node (A, B, or C) and fetching the key 'key', the same algorithm is used and the request is redirected internally to node B to get the data.
Add new master node
Add a new node D. The method of redis cluster is to take a part of the slot from the front of each node and put it on D.
Node A covers 1365-5460
Node B covers 6827-10922
Node C covers 12288-16383
Node D covers 0-1364, 5461-6826, 10923-12287
Delete master node
Deleting a node is similar. You can delete the node after the move is completed.
Redis Cluster master-slave mode
To keep data highly available, redis cluster adds a master-slave mode: each master node has one or more slave nodes. The master serves data access while the slaves pull backups from it; when the master goes down, one of its slaves is elected to take over as master, so the cluster as a whole does not go down.
The master-slave structure of Redis can use one master, multiple slaves or a cascade structure. Redis master-slave replication can be divided into full synchronization and incremental synchronization according to whether it is full volume.
Construction of redis cluster
The cluster should contain an odd number of master nodes, at least three, and each should have at least one backup (slave) node.
Master-slave synchronization
Data can be synchronized from the master server to any number of slave servers, and a slave can itself act as a master to other slaves. Since the publish/subscribe mechanism is fully implemented, a slave database can subscribe to a channel while synchronizing and receive the master's complete record of published messages. Synchronization helps with read scalability and data redundancy.
working principle
Full synchronization
Redis full synchronization generally occurs during the slave initialization phase. At this time, all data on the Master needs to be copied.
1. The slave server connects to the master server and sends the SYNC command;
2. After the master server receives the SYNC command, it starts the BGSAVE command to generate an RDB file, using a buffer to record all write commands executed from then on;
3. After the master server BGSAVE is executed, it sends snapshot files to all slave servers and continues to record the executed write commands during the sending period;
4. After receiving the snapshot file from the server, discard all old data and load the received snapshot;
5. After the master server snapshot is sent, it starts sending the write command in the buffer to the slave server;
6. The slave server completes loading the snapshot, starts receiving command requests, and executes write commands from the master server buffer;
After completing the above steps, all operations of data initialization from the slave server are completed. The slave server can now receive read requests from users.
Incremental synchronization
Redis incremental synchronization refers to the process of synchronizing the write operations that occurred on the master server to the slave server when the redis slave is initialized and starts working normally.
Incremental replication mainly means that every time the master server receives a write command, it will send the same write command to the slave server, and the slave server receives and executes the same write command.
redis master-slave synchronization strategy
When the master-slave connection starts, full synchronization is performed, and after synchronization is completed, incremental synchronization is performed.
If necessary, the slave can initiate full synchronization at any time
The redis strategy is to perform incremental synchronization first, and then perform full synchronization if synchronization fails.
Note: if multiple slaves are disconnected and then restarted at the same time, each automatically sends SYNC to request full synchronization from the master. With many slaves doing this simultaneously, the master's IO surges sharply and the master may go down.
High availability under Redis Sentinel architecture
When the Master goes down, a slave node must be manually promoted to master and the business side notified of the new master address. This kind of manual failover is unacceptable for many application scenarios, so Redis 2.8 introduced the Sentinel architecture to solve the problem.
Implementation principle
Three scheduled monitoring tasks
Every 10 seconds, each sentinel node will send the info command to the master node and slave node to obtain the latest topology structure
Every 2 seconds, each sentinel node publishes its judgment of the master node plus its own information to the _sentinel_:hello channel of the Redis data nodes. Each sentinel also subscribes to this channel to learn about the other sentinel nodes and their judgments of the master.
Every 1 second, sentinel will send a ping command to the master node and slave node for a heartbeat check to confirm whether these nodes are currently reachable.
Subjective offline
Every second, each Sentinel node sends a ping to the master node, the slave nodes, and the other Sentinel nodes as a heartbeat check. When a node fails to respond validly for longer than down-after-milliseconds, the Sentinel judges it as failed; this is called subjective offline.
Objective offline
When the node a Sentinel has subjectively marked offline is the master, that Sentinel asks the other Sentinel nodes for their judgment of the master. When more than <quorum> of them agree, it means a majority of Sentinels have independently concluded that the master is unreachable, so the Sentinel considers the master genuinely faulty and marks it objectively offline.
Leader sentinel node election
Leader election uses the Raft algorithm. Suppose s1 (sentinel-1) completes the objective-offline determination first; it sends commands to the other Sentinel nodes requesting to become leader. A Sentinel that receives the request agrees if it has not already agreed to another Sentinel's request, and otherwise rejects it. If s1 finds its vote count is greater than or equal to the threshold, it becomes the leader.
failover
1. The leader Sentinel node selects a node from the slave nodes as the new master node
2. The selection prefers the slave node whose replication is closest to the master's data (the highest replication offset).
3. The leader Sentinel node allows other slave nodes to become slave nodes of the new master node.
4. The Sentinel set demotes the original master to a slave, keeps watching it, and orders it to replicate the new master once it recovers.
High availability under Redis Cluster (cluster)
Implementation principle
Subjective offline
Each node in the cluster periodically sends ping messages to the other nodes, and the receivers reply with pong messages. If communication keeps failing for longer than cluster-node-timeout, the sending node considers the receiving node faulty and marks it as subjectively offline (pfail).
Objective offline
When a node determines that another node is subjectively offline, the corresponding node status will follow the message and spread within the cluster.
Suppose node a marks node b as subjectively offline. After a while, node a spreads node b's status to other nodes via messages. When another node receives the message and parses b's pfail state from the message body, it adds an entry to b's offline report list;
when some node c receives b's pfail status and finds that more than half of the slot-holding master nodes have marked b as pfail, it marks the faulty node b as objectively offline;
it then broadcasts a fail message to the cluster, notifying all nodes to mark the faulty node b as objectively offline with immediate effect, and triggering the failover process on b's slave nodes.
Recovery
Eligibility check
If the slave node is disconnected from the master node for more than a certain period of time, it will not be eligible.
Preparing for election time
When the slave node is eligible for failover, it will wait for a period of time before starting the election.
Among all slave nodes of the faulty master, the slave with the largest replication offset (the one most consistent with the master's data) starts its election first, then the one with the second-largest offset, and so on; the remaining slaves wait for their own election times before running.
Initiate an election
Only the master nodes holding slots have a vote. When a slave node collects votes from N/2 + 1 slot-holding masters, it may proceed with the operation of replacing the master.
election voting
Replace the master node
When enough votes are collected from the slave node, the replacement of the master node operation is triggered.
The current slave node cancels replication and becomes the master node
Revoke the slots responsible for the failed master and delegate these slots to itself
Broadcast its own pong message to the cluster to notify all nodes in the cluster that the current slave node has become the master node and has taken over the slot information of the failed master node.
asynchronous queue
Use list as queue
RPUSH acts as a producer to produce messages, and LPOP acts as a consumer to consume messages.
Disadvantage: there is no wait mechanism; LPOP returns immediately whether or not a value exists. Workaround: retry LPOP with a sleep at the application layer, or avoid sleeping by using BLPOP key [key ...] timeout, which blocks until a message arrives in the queue or the timeout expires.
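The RPUSH/BLPOP pattern above can be modeled in plain Java with a blocking queue (an in-memory stand-in for illustration, not a Redis client):

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// In-memory stand-in for the RPUSH / BLPOP pattern described above:
// the producer pushes to the tail; the consumer blocks on the head
// until a message arrives or the timeout elapses (like BLPOP key timeout).
public class SimpleQueue {
    private final LinkedBlockingQueue<String> list = new LinkedBlockingQueue<>();

    public void rpush(String msg) {           // producer side
        list.offer(msg);
    }

    public String blpop(long timeoutMillis) { // consumer side
        try {
            // null on timeout, analogous to BLPOP returning nil
            return list.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }
}
```

Unlike the sleep-and-retry workaround, the blocking pop wakes up as soon as a message is available.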
One producer corresponds to one consumer
How to produce once and make it available to multiple consumers
pub/sub: topic subscription mode
The sender (pub) sends the message and the subscriber (sub) receives the message.
Subscribers can subscribe to any number of channels
Disadvantages: Messages are stateless and cannot be guaranteed to be reachable.
How to implement a delay queue
Use a sorted set (zset)
Take timestamp as score
The message content is used as the member when calling zadd to produce the message.
The consumer polls with the zrangebyscore command, fetching the data whose scores (timestamps) are at least N seconds in the past, and processes it.
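A toy in-memory model of this sorted-set delay queue (not a Redis client; it keys by score and assumes distinct timestamps, which a real ZSET does not require):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Models the pattern above: score = delivery timestamp (ZADD), and the
// consumer polls everything whose score is <= now (ZRANGEBYSCORE 0 now),
// removing what it has processed.
public class DelayQueueModel {
    // score (timestamp) -> message, ordered by score like a ZSET
    private final ConcurrentSkipListMap<Long, String> zset = new ConcurrentSkipListMap<>();

    public void zadd(long deliverAtMillis, String message) {
        zset.put(deliverAtMillis, message);
    }

    // Pop one message that is due at or before 'now', or null if none is due.
    public String pollDue(long nowMillis) {
        Map.Entry<Long, String> first = zset.firstEntry();
        if (first == null || first.getKey() > nowMillis) return null;
        zset.remove(first.getKey());
        return first.getValue();
    }
}
```

Messages become visible to the consumer only once the clock passes their score, which is exactly the delay-queue behavior described above.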
Redis persistence
Persistence mode RDB (redis database)
principle
RDB persistence writes a point-in-time snapshot of the in-memory data set to disk at configured intervals. In practice, Redis forks a child process that writes the data set to a temporary file; once the write succeeds, the temporary file replaces the previous one. The file is stored in a compressed binary format.
Advantage
1. With this approach the entire Redis data set is contained in a single file, which is ideal for backups. For example, you might archive the file every hour to keep the last 24 hours of data, and every day to keep the last 30 days; if a catastrophic failure occurs, data can easily be restored from any archive.
2. For disaster recovery, RDB is a very good choice because we can easily compress a separate file and then copy it to other storage media.
3. Maximized performance. To start persistence, the Redis service process only has to fork a child process and leave the rest to that child, which largely spares the main service process from performing I/O operations.
4. Compared with the AOF mechanism, if the data set is large, the RDB startup efficiency will be higher
shortcoming
1. If you need high data durability, i.e., to avoid data loss as much as possible, RDB is not the best choice: if the system goes down between snapshots, any data not yet saved to disk is lost.
2. Because persistence is delegated to a forked child process, forking with a large data set can stall the server for hundreds of milliseconds, or even about a second.
Configuration
Redis will dump the snapshot of the data set into the dump.rdb file. In addition, we can also modify the frequency of Redis server dump snapshots through the configuration file. After opening the 6379.conf file, we search for save and can see the following configuration information:
save 900 1 #After 900 seconds (15 minutes), if at least 1 key changes, dump the memory snapshot.
save 300 10 #After 300 seconds (5 minutes), if at least 10 keys have changed, dump the memory snapshot.
save 60 10000 #After 60 seconds (1 minute), if at least 10000 keys have changed, dump the memory snapshot.
While BGSAVE runs, the server blocks only during the fork of the child process; the SAVE command blocks the server for the entire dump. SAVE has therefore essentially been abandoned and must never be used in a production environment.
Persistence method AOF (append only file)
principle
AOF persistence logs every write and delete operation the server processes; read queries are not recorded. The log is plain text, so you can open the file and inspect the exact operations performed.
Advantage
This mechanism brings higher data safety, i.e., better durability
Three synchronization strategies are provided
Sync every second
Synchronization per second is also an asynchronous operation, which is very efficient. If there is a service outage, only the data of the previous second will be lost.
Sync every change
It can be understood as synchronous persistence and low efficiency.
No sync (fsync disabled)
Because the log file is written in append mode, a crash during writing does not corrupt the content already in the file. Even if the system crashes halfway through writing an entry, the redis-check-aof tool can repair the file and restore consistency before the next Redis start.
If the log is too large, Redis can automatically enable the rewrite mechanism. That is, Redis continuously writes modified data to the old disk file in append mode. At the same time, Redis also creates a new file to record which modification commands were executed during this period. Therefore, data security can be better ensured when performing rewrite switching.
AOF contains a clearly formatted, easy-to-understand log file for recording all modification operations. In fact, we can also complete data reconstruction through this file.
shortcoming
For the same number of data sets, AOF files are usually larger than RDB files. RDB is faster than AOF when restoring large data sets.
Depending on the sync strategy, AOF is often slower than RDB at runtime. In short, the every-second strategy is still fairly efficient, and the no-sync strategy is as efficient as RDB.
Configuration
There are three synchronization methods in the Redis configuration file, which are:
appendfsync always #The AOF file will be written every time a data modification occurs.
appendfsync everysec #Synchronize once every second. This policy is the default policy of AOF.
appendfsync no #Never synchronize. Efficient but data will not be persisted.
Persistence mixed mode
Supported since Redis 4.0
Problem it solves
Redis normally loads the AOF file on restart, which is slow; the RDB file alone cannot be used because its data is incomplete. Enable with aof-use-rdb-preamble yes; when enabled, the AOF rewrite embeds the RDB contents directly.
working process
Implemented via BGREWRITEAOF. With hybrid persistence enabled: (1) the child process writes the in-memory data into the AOF file in RDB format; (2) incremental commands accumulated in the rewrite buffer are appended in AOF format; (3) the new file, containing RDB-format data plus AOF-format increments, replaces the old AOF file. Part of the new AOF file thus comes from an RDB snapshot, and part from commands executed while Redis was running.
advantage
Hybrid persistence combines the advantages of RDB persistence and AOF persistence. Since most of them are in RDB format, the loading speed is fast. At the same time, combined with AOF, incremental data is saved in AOF mode, and less data is lost.
shortcoming
Poor compatibility. Once hybrid persistence is turned on, the AOF file will not be recognized in versions before 4.0. At the same time, because the first part is in RDB format, the readability is poor.
Persistence (RDB and AOF both enabled)
RDB (default): dumps the data set to disk when the configured number of write operations occurs within the configured interval
Suited to large-scale data recovery; weaker data consistency and integrity
AOF: appends write-operation logs to the AOF file every second
AOF files are larger, but data integrity is higher than with RDB
Compare with memcached
All values in memcached are simple strings, and redis supports richer data types.
redis is faster than memcached and can persist data
Applicable scene
Session Cache
Full Page Cache (FPC)
queue
Leaderboard/Counter
publish/subscribe
The difference between Redis and Memcache
Memcached is a high-performance distributed memory object caching system
A single key-value in-memory Cache
Used in dynamic web applications to reduce database load and cache images, videos, etc.
Redis is a data structure in-memory database
Supports data persistence and recovery, tolerating single-point failures
Rich operations can be performed directly on the server side to reduce network IO times and data volume.
Redis Java client
Lettuce
The client used by springboot by default
An event-driven communication layer based on the Netty framework, whose method calls are asynchronous
Thread-safe synchronous, async and reactive usage, support for clusters, Sentinel, pipes and encoders
Redisson
Based on Netty implementation, using non-blocking IO, high performance
Support asynchronous requests
Support connection pooling
Transactions are not supported
Supports read-write separation and read load balancing
Can be integrated with Spring Session to achieve Redis-based session sharing
Jedis
Lightweight, simple, easy to integrate and modify
Support connection pooling
Supports pipelining, transactions, LUA Scripting, Redis Sentinel, Redis Cluster
Reading and writing separation is not supported and needs to be implemented by yourself.
common problem
Why is redis single-threaded?
Because Redis is a memory-based operation, CPU is not the bottleneck of Redis.
Unnecessary context switching and race conditions are avoided, and there is no multi-process or multi-thread switching that consumes CPU.
Uses an I/O multiplexing model with non-blocking I/O
Redis directly built its own VM mechanism
How to ensure the consistency between Redis and database
When updating, delete the cache first and then update the database.
When reading, read the cache first; if not, read the database, put the data into the cache, and return the response.
Set cache expiration time in special circumstances
How to implement distributed lock in redis
SET key value [EX seconds] [PX milliseconds] [NX|XX]
EX seconds: set the key's expiration time in seconds
PX milliseconds: set the key's expiration time in milliseconds
NX: set the key only if it does not already exist
XX: set the key only if it already exists
Returns OK when the SET succeeds, otherwise null (nil)
It is only safe in a single instance scenario
Multiple clients may hold the lock at the same time
Replication is asynchronous: if the master fails before the lock key replicates, another client can acquire the same lock on the promoted slave
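The SET key value NX pattern above can be sketched with an in-memory stand-in (assumption: a `ConcurrentHashMap` plays the role of a single Redis instance; the random token makes unlock safe, mirroring the usual "compare value then DEL" done with a Lua script on real Redis — expiry is omitted for brevity):

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of SET key token NX semantics with an in-memory map.
// A client only deletes the lock it actually owns, thanks to the random token.
public class NxLockSketch {
    private final ConcurrentHashMap<String, String> store = new ConcurrentHashMap<>();

    // SET key token NX -> non-null token if the key was absent (lock acquired)
    public String tryLock(String key) {
        String token = UUID.randomUUID().toString();
        return store.putIfAbsent(key, token) == null ? token : null;
    }

    // Delete only when the stored token matches the caller's token
    public boolean unlock(String key, String token) {
        return store.remove(key, token);
    }

    public static void main(String[] args) {
        NxLockSketch redis = new NxLockSketch();
        String t1 = redis.tryLock("order:42");             // acquired
        String t2 = redis.tryLock("order:42");             // null: already held
        System.out.println((t1 != null) + " " + (t2 == null));
        System.out.println(redis.unlock("order:42", t1));  // released by the owner
    }
}
```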
Redis memory optimization
Optimization of storage encoding
The data stored in Redis is encapsulated using the redisObject structure.
shared object pool
Refers to the integer object pool [0-9999] maintained internally by Redis to save memory.
In addition to integer value objects, other types such as list, hash, set and zset internal elements can also use the integer object pool.
String optimization
Control the number of keys and use hash instead of multiple key values
Shrink key and value objects: the shorter the key and value, the better
Common performance issues with Redis
Avoid taking RDB memory snapshots on the master
Avoid AOF persistence on the master: oversized AOF files slow down recovery when the master restarts
Avoid calling BGREWRITEAOF on the master to rewrite the AOF file, as it causes a brief service pause
Performance issues of redis master-slave replication
For the speed of master-slave replication and the stability of the connection, it is best for the slave and the master to be in the same LAN.
What are the elimination strategies?
volatile-lru: from the data set with an expiration time set
Evict the least recently used data
allkeys-lru: from the entire data set (keys with and without an expiration time)
Evict the least recently used data
volatile-random: from the data set with an expiration time set
Evict a randomly selected key
allkeys-random: from the entire data set (keys with and without an expiration time)
Evict a randomly selected key
volatile-ttl: from the data set with an expiration time set
Evict the keys closest to expiring
noeviction: do not evict any data (Redis still reclaims keys through its normal expiry mechanisms)
If memory is insufficient at this point, write operations return an error directly
How to find a key with a fixed prefix from a large number of keys
keys pattern command
Will return all matching keys
Blocking: querying a large key space stalls the running service, because returning a huge result set in one shot makes the server lag
SCAN cursor [MATCH pattern] [COUNT count] command
SCAN iterates the key space in a non-blocking way, returning only a small batch of elements per call
cursor is the iteration cursor, MATCH pattern filters the returned keys, and COUNT count hints at how many keys to return per call (the count is not strictly enforced)
scan 0 match k1* count 10 // with high probability returns about 10 keys beginning with k1
An iteration starts with cursor 0 and continues until the command returns cursor 0 again, which signals that the full iteration is complete
SCAN does not guarantee a given number of elements per call; a call may even return zero keys. As long as the returned cursor is non-zero, the application keeps iterating with that cursor until it becomes 0. On large data sets a call may return a few dozen keys; on small ones the whole data set may be returned at once
Keys returned across SCAN calls may contain duplicates, so deduplicate on the client side, e.g. with a HashSet
Memcached
Structure diagram
Distributed storage
MongoDB
FastDFS
Elasticsearch
Full cache
Binlog
It is a master-slave data synchronization solution for MySQL and most mainstream databases.
Other caching frameworks/databases
SSDB
RocksDB
Message Queuing (MQ)
Kafka/Jafka
advantage
Time complexity O(1)
High TPS
shortcoming
Does not support scheduled messages
Kafka Streams
ActiveMQ
RabbitMQ
advantage
High concurrency (caused by Erlang language implementation characteristics)
High reliability and high availability
shortcoming
Heavyweight
publish-subscribe model
RabbitMQ’s publish-subscribe model
Exchange
Producers can only send messages to an exchange; the exchange, not the producer, decides which queue receives each message.
temporary queue
queueDeclare() creates a non-durable, exclusive, auto-delete queue with a randomly generated name
Binding
RocketMQ
java language implementation
components
nameserver
broker
producer
consumer
Two consumption models
PULL
DefaultMQPullConsumer
PUSH
DefaultMQPushConsumer
advantage
High data reliability
Supports synchronous flush, asynchronous (near-real-time) flush, synchronous replication, and asynchronous replication
Real-time message delivery
Support message failure retry
High TPS
A single instance on one machine writes about 70,000 messages/second; with 3 brokers deployed on one machine, up to about 120,000 messages/second (10-byte messages)
Strict message order
Support scheduled messages
Supports backtracking messages by time
Supports accumulation of billions of messages
shortcoming
The consumption process must be idempotent (remove duplication)
ZeroMQ
advantage
High TPS
shortcoming
Persistent messages are not supported
Poor reliability and availability
JMS
API
ConnectionFactory
Connection
Session
Destination
MessageProducer/consumer
Message composition
message header
message body
TextMessage
MapMessage
BytesMessage
StreamMessage
ObjectMessage
Message properties
JMS reliable mechanism
The message is considered successfully consumed only after it is confirmed. Message consumption consists of three stages: the client receives the message, the client processes the message, and the message is confirmed
transactional session
Messages are automatically submitted after session.commit
non-transactional session
answer mode
AUTO_ACKNOWLEDGE
automatic confirmation
CLIENT_ACKNOWLEDGE
textMessage.acknowledge()confirm message
DUPS_OK_ACKNOWLEDGE
Delayed confirmation
Peer-to-peer (P2P mode)
Publish and subscribe (Pub/Sub mode)
durable subscription
non-durable subscription
Database middleware
Sub-database and sub-table
ShardingSphere
Architecture diagram
SJDBC (sharding-jdbc)
Mycat
other
Disque
Cassandra
Neo4j
InfoGrid
Distributed framework
Dubbo
Spring Cloud
Nacos
Apollo
Disconf
Distributed architecture
service component
Registration center
Zookeeper
data model
Node type
persistence node
Persistent ordered nodes
Temporary node
temporary ordered node
Order
Create node
create [-s] [-e] path data acl
Get node
get path [watch]
list nodes
ls [path]
Modify node
set path data [version]
Delete node
delete path [version]
Applicable scene
Subscription Publishing/Configuration Center
Watcher mechanism implementation
Implement centralized management of configuration information and dynamic update of data
service discovery
Distributed lock
Implementation of temporary ordered nodes and watcher mechanism
exclusive lock
Temporary node implementation
shared lock
Temporary ordered node implementation
load balancing
Requests/data are spread across multiple computer units
ID generator
distributed queue
Unified naming service
master election
Split-brain problems can be avoided
Limiting
Eureka
consul
File system
NFS
FTP
Ceph
AWS S3
Grid services
Service Mesh
Linkerd
Istio
Envoy
Dynamic service discovery
load balancing
round robin
random
weighted least request
TLS termination
HTTP/2 & gRPC proxy
circuit breaking
Health checks; canary (gray) releases via percentage-based traffic splitting
fault injection
Rich metrics
Mixer
Access control
Use strategy
Data collection
Pilot
service discovery
Resilient (timeout, retry, circuit breaker, etc.) traffic management
Intelligent routing
Citadel
Galley
Envoy
nginmesh
Tool library
Apache Commons
GoogleGuava
lombok
Bytecode manipulation library
ASM
Cglib
Javassist
Byteman
Byte Buddy
bytecode-viewer
json
FastJson
Gson
Jackson
Json-lib
Other frameworks
reactive framework
Vert.x
asynchronous framework
Netty
Tiles
Core algorithm
consensus algorithm
Load balancing algorithm
Current limiting algorithm
Distributed task scheduling
Distributed ID generation
Distributed coordination and synchronization
filter algorithm
Hash algorithm
Application: file verification, digital signature, authentication protocol
type
MD5 (Message-Digest Algorithm 5)
Algorithm used to ensure complete and consistent information transmission and output a fixed length of 128 bits
SHA-1
Commonly used for HTTPS transmission and software signing
SHA-2
SHA-224/SHA-256/SHA-384/SHA-512 are collectively called SHA-2
SHA-3
Previously known as the Keccak algorithm, it is a cryptographic hash algorithm
Problems and methods
Build high-performance read services
The data in the cache is filtered and stored only if it has business meaning and will be queried.
Data in cache can be compressed
Compression algorithms such as Gzip and Snappy
Fields are replaced with alternative identifiers during JSON serialization
Redis uses a Hash structure to store data, and you can also use identifiers instead.
Asynchronous parallelized reading
Build a highly available data writing service
Database sharding/data sharding
Globally unique identifier
Randomly generated using an algorithm
Build an ID generation service based on the database primary key
Sub-library middleware
Mycat
Stateless storage, cut the database at any time
Write randomly according to the weight of available libraries
After the data is successfully written to the random storage, the data is actively written to the cache.
Full synchronization: scan for rows created more than 5 seconds ago (configurable) that have not yet been synchronized
Cache downgrade
Actively downgrade to the database for a complete query and store the queried value in the cache
message queue
Can be read from the cache and written to the database asynchronously
Highly available architecture
Cache multi-machine hot backup to avoid cache loss and other problems
Leveraging in-app precaching
load balancing
HAProxy
Nginx
network model
epoll (multiplexed IO)
downgrade protection
Read and write separation
Separation of dynamic and static traffic
How to ensure heterogeneous data consistency
Multithreading and concurrency
Basic principles
synchronized
Thread safety issues
Shared data exists (also called critical resources)
There are multiple threads working together to operate these shared data
Solution: Only one thread is allowed to operate shared resources at the same time. Other threads must wait for the thread to finish processing before they can operate shared resources.
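The solution above can be shown with a minimal counter: two threads increment a shared field, and guarding the compound read-modify-write with `synchronized` makes the result deterministic (a hypothetical `SafeCounter` class for illustration):

```java
// Two threads increment a shared counter 10,000 times each. Without synchronized
// the read-modify-write would race and the total would usually fall below 20,000;
// guarding the compound operation with the object lock makes it deterministic.
public class SafeCounter {
    private int count = 0;

    public synchronized void increment() { count++; }  // one thread at a time
    public synchronized int get() { return count; }

    public static int demo() throws InterruptedException {
        SafeCounter c = new SafeCounter();
        Runnable task = () -> { for (int i = 0; i < 10_000; i++) c.increment(); };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        return c.get();  // always 20,000 with the lock in place
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());
    }
}
```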
Mutex lock characteristics
mutual exclusivity
Only one thread may hold a given object's lock at a time. This coordination mechanism ensures that only one thread at a time can enter a synchronized code block (a compound operation). Mutual exclusivity is also called atomicity of the operation.
visibility
Before the lock is released, all modifications made to the shared variable must be visible to the next thread that operates on it; that is, acquiring the lock must observe the latest value of the shared variable. Otherwise a thread could keep operating on a stale locally cached copy, causing inconsistency.
What synchronized locks is not the code, but the object.
According to the classification of acquiring locks
Get object lock
synchronized code block
synchronized(this), synchronized(class instance object), the lock is the instance object in ()
Synchronized non-static methods
synchronized method locks the instance of the current object
Get class lock
Synchronized code block: synchronized(Xxx.class) locks the Class object in parentheses
Synchronized static method: a static synchronized method locks the current class's Class object
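The object-lock vs class-lock distinction above can be made observable with `Thread.holdsLock` (a hypothetical `LockTargets` class for illustration): the instance's monitor and the Class object's monitor are different locks, so holding one does not imply holding the other.

```java
import java.util.Arrays;

// synchronized instance methods / synchronized(this) take the instance's monitor;
// static synchronized methods / synchronized(X.class) take the Class object's.
public class LockTargets {

    // Inside an instance-locked block we hold `this`, not the Class object.
    public boolean[] probeInstanceLock() {
        synchronized (this) {
            return new boolean[] {
                Thread.holdsLock(this),              // true: instance monitor held
                Thread.holdsLock(LockTargets.class)  // false: class monitor not taken
            };
        }
    }

    // Inside a class-locked block we hold the Class object, not this instance.
    public boolean[] probeClassLock() {
        synchronized (LockTargets.class) {
            return new boolean[] {
                Thread.holdsLock(LockTargets.class), // true: class monitor held
                Thread.holdsLock(this)               // false: instance monitor not taken
            };
        }
    }

    public static void main(String[] args) {
        LockTargets t = new LockTargets();
        System.out.println(Arrays.toString(t.probeInstanceLock()));
        System.out.println(Arrays.toString(t.probeClassLock()));
    }
}
```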
synchronized The underlying implementation principle
The layout of objects in memory
Object header
Mark Word
By default it stores the object's hashCode, GC generational age, lock type, lock flag bits, and related information
Lock status
Unlocked state
lightweight lock
Heavyweight lock
GC mark
bias lock
Class Metadata Address
The type pointer points to the class metadata of the object, and the JVM uses this pointer to determine which class the object is.
Instance data
Align padding
What is reentrancy
In a mutex design, a thread blocks when it tries to enter a critical section guarded by a lock held by another thread. When a thread requests a critical section guarded by a lock it already holds and is allowed in, the lock is reentrant.
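Reentrancy can be demonstrated in a few lines (a hypothetical `ReentrancyDemo` class): `outer()` already holds the object lock when it calls `inner()`, which asks for the same lock; because `synchronized` is reentrant, the thread enters instead of deadlocking against itself.

```java
// outer() re-acquires the monitor this thread already holds when calling inner().
public class ReentrancyDemo {
    public synchronized int outer() {
        return inner() + 1;   // same lock, same thread: allowed straight in
    }

    public synchronized int inner() {
        return 41;
    }

    public static void main(String[] args) {
        System.out.println(new ReentrancyDemo().outer());  // 42, no deadlock
    }
}
```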
Why was synchronized once considered heavyweight?
In early versions, synchronized was a heavyweight lock and relied on Mutex Lock for implementation.
Switching between threads requires switching from user mode to kernel mode, which is expensive.
Since Java 6, synchronized performance has improved greatly
spin lock
spin lock
In many cases, the locked state of shared data lasts so short that switching threads is not worth it.
By letting the thread execute a busy loop and wait for the lock to be released, the CPU is not given up.
Disadvantages: If the lock is occupied by other threads for a long time, it will bring a lot of performance overhead.
Adaptive spin lock
Spin time is no longer fixed
Determined by the previous spin time on the same lock and the status of the lock owner
lock elimination
More thorough optimization
During JIT compilation, the surrounding context is analyzed and locks that cannot possibly be contended are removed
lock roughening
Avoid repeated locking and unlocking by expanding the scope of the lock
Synchronized lock four states
no lock
bias lock
Reduce the cost of acquiring locks for the same thread. In most cases, there is no multi-thread competition for lock resources, and the same thread always acquires the lock resource multiple times.
Core idea: once a thread acquires the lock, the lock enters biased mode and the Mark Word takes on the biased layout. When the same thread requests the lock again, no synchronization is needed: it only checks that the Mark Word's lock flag indicates biased mode and that the stored ThreadID equals the current thread's ID. This saves a large number of lock-related operations.
lightweight lock
Lightweight locks are upgraded from biased locks. When the second thread joins the lock competition, the biased lock will be upgraded to a lightweight lock.
Suited scenario: threads execute synchronized blocks alternately. If the same lock is contended simultaneously, it inflates to a heavyweight lock.
Heavyweight lock
synchronized and ReenTrantLock the difference
ReentrantLock (reentrant lock) introduction
Located in the j.u.c (java.util.concurrent.locks) package
Implemented based on AQS like CountDownLatch, FutureTask, and Semaphore
Able to achieve finer-grained control than synchronized, such as controlling fairness
After calling lock(), you must call unlock() to unlock
The performance may not be higher than synchronized, and it is also reentrant.
ReentrantLock fairness settings
Create a fair lock: ReentrantLock fairLock = new ReentrantLock(true);
When the parameter is true, the lock is tended to be given to the thread that has been waiting the longest.
fair lock
The order of acquiring locks is based on the order in which the lock method is called (use with caution)
unfair lock
The order of preemption is not necessarily determined, it depends on luck.
Fair lock and unfair lock
synchronized is an unfair lock
ReentrantLock objectifies locks
Determine whether there is a thread, or a specific thread, waiting in queue to acquire the lock
Attempt to acquire lock with timeout
Perceive whether the lock has been successfully acquired
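The "acquire with timeout" feature above is one thing `synchronized` cannot express; a minimal sketch (a hypothetical `TryLockDemo` class): a second thread waits a bounded time for a held lock and then gives up instead of blocking forever, which is how `ReentrantLock` helps avoid deadlock.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// tryLock with a timeout: the contending thread reports failure after 100 ms
// instead of hanging, because the main thread still holds the lock.
public class TryLockDemo {
    public static boolean demo() throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();
        lock.lock();                                       // main thread holds the lock
        final boolean[] acquired = new boolean[1];
        Thread t = new Thread(() -> {
            try {
                acquired[0] = lock.tryLock(100, TimeUnit.MILLISECONDS);
                if (acquired[0]) lock.unlock();            // not reached here
            } catch (InterruptedException ignored) { }
        });
        t.start();
        t.join();
        lock.unlock();
        return acquired[0];                                // false: timed out
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());
    }
}
```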
Can wait/notify/notifyAll be objectified?
java.util.concurrent.locks.Condition
ArrayBlockingQueue is a thread-safe, bounded blocking queue implemented by the underlying array.
Summarize
synchronized is the keyword, ReentrantLock is the class
ReentrantLock can set the waiting time for acquiring the lock to avoid deadlock.
ReentrantLock can obtain information about various locks
ReentrantLock can flexibly implement multiple notifications
Mechanism: sync operates on the Mark Word of the object header, and lock calls the park() method of the Unsafe class.
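The Condition-based "multiple notifications" mentioned above can be sketched as a minimal bounded buffer in the style of ArrayBlockingQueue (a hypothetical `BoundedBuffer` class, an illustration rather than the JDK implementation): one ReentrantLock with two Conditions, so producers and consumers wait on separate queues, which a single wait/notify monitor cannot express.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// One lock, two Conditions: producers wait on notFull, consumers on notEmpty.
public class BoundedBuffer<T> {
    private final Deque<T> items = new ArrayDeque<>();
    private final int capacity;
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notFull = lock.newCondition();
    private final Condition notEmpty = lock.newCondition();

    public BoundedBuffer(int capacity) { this.capacity = capacity; }

    public void put(T item) throws InterruptedException {
        lock.lock();
        try {
            while (items.size() == capacity) notFull.await();  // wait for space
            items.addLast(item);
            notEmpty.signal();                                 // wake one consumer
        } finally {
            lock.unlock();
        }
    }

    public T take() throws InterruptedException {
        lock.lock();
        try {
            while (items.isEmpty()) notEmpty.await();          // wait for data
            T item = items.pollFirst();
            notFull.signal();                                  // wake one producer
            return item;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BoundedBuffer<Integer> buf = new BoundedBuffer<>(2);
        buf.put(1);
        buf.put(2);
        System.out.println(buf.take() + "," + buf.take());
    }
}
```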
jmm memory visibility
What is Java in the memory model happen-before
Java of Memory Model JMM
The Java Memory Model (JMM) is an abstract concept and does not physically exist. It describes a set of rules or specifications that define how each variable in a program (instance fields, static fields, and the elements of array objects) is accessed.
Main memory in JMM
Store Java instance objects
Including member variables, class information, constants, static variables, etc.
Belongs to a data sharing area, which may cause thread safety issues when multi-threaded operations occur concurrently.
JMM working memory
Stores all local variable information of the current method. Local variables are not visible to other threads.
Program counter (bytecode line number indicator) and native method information
It belongs to the thread private data area and does not have thread safety issues.
JMM and Java memory area division are different conceptual levels
JMM describes a set of rules that revolve around atomicity, orderliness, and visibility.
Similarities: There are shared areas and private areas
Summary of data storage types and operation methods of main memory and working memory
The basic data type local variables in the method will be directly stored in the stack frame structure of the working memory.
Local variables of reference type: the reference is stored in working memory and the instance is stored in main memory
Member variables, static variables, and class information will all be stored in main memory.
The main memory sharing method is that each thread copies a copy of the data to the working memory, and refreshes it back to the main memory after the operation is completed.
JMM How to solve memory visibility issue
Conditions that need to be met for instruction reordering
The results of the operation cannot be changed in a single-threaded environment.
Reordering is not allowed if there are data dependencies.
That is, only instructions that cannot be deduced through the happens-before principle can be reordered.
If the result of operation A must be visible to operation B, then A and B have a happens-before relationship.
happen-before in principle
The main basis for judging whether there is competition in the data and whether the thread is safe
program order rules
In a thread, according to the code order, the operations written in the front occur before the operations written in the back.
Locking rules
An unLock operation occurs before a subsequent lock operation on the same lock.
volatile variable rules
A write operation to a variable occurs before a subsequent read operation to the variable.
Delivery rules
If operation A occurs before operation B, and operation B occurs before C, then it can be concluded that operation A occurs before C
Thread startup rules
A Thread object's start() call happens-before every action of that thread.
Thread interruption rule
The call to a thread's interrupt() method happens-before the interrupted thread's code detects the interrupt event.
Thread termination rules
Every operation in a thread happens-before the thread's termination; termination can be detected via Thread.join() returning or Thread.isAlive() returning false.
Object finalization rules
An object's initialization happens-before the start of its finalize() method.
If two operations do not satisfy any of the above happens-before rules, then the order of these two operations is not guaranteed, and the JVM can reorder the two operations;
If operation A happens-before operation B, then operations performed on memory by operation A are visible to operation B.
volatile
Lightweight synchronization mechanism provided by JVM
Ensure that shared variables modified by volatile are always visible to all threads
Disable instruction reordering optimization
Volatile does not guarantee safety in multi-threaded situations
Not thread safe
Thread safety
Volatile variables and why they are immediately visible
When writing a volatile variable, JMM will refresh the shared variable value in the working memory corresponding to the thread to the main memory.
When reading a volatile variable, JMM invalidates the thread's working-memory copy, forcing the read to go to main memory.
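The visibility guarantee above can be seen with the classic stop-flag pattern (a hypothetical `VolatileFlag` class): a reader thread spins on a flag another thread sets. Declaring the flag `volatile` guarantees the write becomes visible, so the loop is guaranteed to exit; without it, the reader may spin forever on a stale cached copy.

```java
// Reader spins on `stop`; the volatile write from the main thread is guaranteed
// to become visible, so the reader terminates.
public class VolatileFlag {
    private volatile boolean stop = false;

    public static boolean demo() throws InterruptedException {
        VolatileFlag f = new VolatileFlag();
        Thread reader = new Thread(() -> {
            while (!f.stop) { /* busy-wait until the write becomes visible */ }
        });
        reader.start();
        Thread.sleep(50);
        f.stop = true;             // volatile write: flushed to main memory
        reader.join(5_000);        // generous bound; exits almost immediately
        return !reader.isAlive();  // true: the reader observed the write
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());
    }
}
```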
How volatile disables reordering optimization
Memory Barrier
Guarantee the execution order of specific operations
Guarantee memory visibility of certain variables
Disable reordering optimizations for instructions before and after a memory barrier by inserting a memory barrier
Forces flushing of cached data for various CPUs, so threads on any CPU can read the latest version of these data
Double detection implementation of singleton
Use volatile to disable reordering optimization
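The double-checked singleton referred to above, in full (the standard pattern): `volatile` forbids reordering of the object's construction with the write to `instance`, so no thread can observe a half-initialized singleton.

```java
// Double-checked locking: the volatile qualifier on `instance` is what makes
// this safe; without it, the reference could be published before construction
// completes due to instruction reordering.
public class Singleton {
    private static volatile Singleton instance;

    private Singleton() { }

    public static Singleton getInstance() {
        if (instance == null) {                 // 1st check: skip locking once initialized
            synchronized (Singleton.class) {
                if (instance == null) {         // 2nd check: only one thread constructs
                    instance = new Singleton();
                }
            }
        }
        return instance;
    }

    public static void main(String[] args) {
        System.out.println(Singleton.getInstance() == Singleton.getInstance());
    }
}
```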
The difference between volatile and synchronized
volatile essentially tells the JVM that the value cached in working memory (registers) may be stale and must be re-read from main memory; synchronized locks the variable so that only the current thread can access it, and other threads block until that thread finishes its operation on the variable.
Volatile can only be used at the variable level, and synchronized can be used at the variable, method, and class levels.
Volatile can only realize the visibility of variable modifications, but cannot guarantee atomicity, while synchronized can guarantee the visibility and atomicity of variable modifications.
volatile will not cause thread blocking; synchronized may cause thread blocking.
Variables marked volatile will not be optimized by the compiler; variables marked synchronized can be optimized by the compiler.
CAS lock-free technology (Compare and Swap)
synchronized is a pessimistic lock, CAS is an optimistic lock design
An efficient way to achieve thread safety
Supports atomic update operations, suitable for counters, sequencers and other scenarios
It is an optimistic locking mechanism, known as lock-free.
Strictly speaking, there is still lock-like (atomic) behavior at the hardware level.
If the CAS operation fails, the developer decides whether to continue trying or perform other operations.
So the thread will not be blocked or suspended
CAS thought
Contains three operands
Memory location (V)
Expected original value (A)
new value (B)
When a CAS operation executes, the value at the memory location is compared with the expected original value; if they are equal, the processor atomically sets the location to the new value, otherwise it does nothing. The memory location's value is the main-memory value.
CAS is transparent to developers in most cases
The atomic package of J.U.C provides commonly used atomic data types as well as related atomic types such as references and arrays, and update operation tools. It is the first choice for many thread-safe programs.
Although the Unsafe class provides CAS services, it is risky because it can read and write arbitrary memory addresses, so do not use it directly.
Since Java 9, the VarHandle API is provided to replace Unsafe.
Disadvantages of CAS
If the spin loop runs for a long time, the CPU overhead is high
Only atomic operations on a shared variable are guaranteed
ABA problem
If the value of a variable is A, but was changed to B by another thread during the period, and then changed back to A, the CAS operation will consider that the value has not been changed at this time.
Solution: AtomicStampedReference attaches a version stamp to the variable to solve the ABA problem of CAS
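The fix can be demonstrated with `AtomicStampedReference`; the A→B→A churn and the class name below are illustrative:

```java
import java.util.concurrent.atomic.AtomicStampedReference;

public class AbaDemo {
    // Returns whether a CAS using a STALE stamp is accepted after an A -> B -> A churn
    public static boolean staleStampAccepted() {
        AtomicStampedReference<String> ref = new AtomicStampedReference<>("A", 0);
        int stamp = ref.getStamp();                         // stamp observed before the churn
        ref.compareAndSet("A", "B", stamp, stamp + 1);      // A -> B
        ref.compareAndSet("B", "A", stamp + 1, stamp + 2);  // B -> A (value is "A" again)
        // A plain CAS on the value alone would succeed here;
        // the stamped CAS fails because the stamp has moved on to 2
        return ref.compareAndSet("A", "C", stamp, stamp + 1);
    }

    public static void main(String[] args) {
        System.out.println(staleStampAccepted()); // false: the stale stamp is rejected
    }
}
```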
process
Multi-process
Features
Memory isolation, an exception in a single process will not cause the entire application to crash, making debugging easier
Inter-process calls, communication and switching are expensive
Often used in scenarios with little interaction between target sub-functions, weak correlation, and scalable to multi-machine distribution (Nginx load balancing) scenarios
Classification of processes
daemon
Daemon Thread (daemon thread)
Provide services for the running of other threads
GC thread
non-daemon process
User Thread(user thread)
How to create a process
Use Runtime's exec(String[] cmdarray) method to create an operating system process
Each virtual machine instance has only one Runtime (the underlying source adopts the singleton pattern)
Create an operating system process using the start() method of ProcessBuilder
The difference between process and thread
Process
is the entity of the program
A program is a description of instructions, data, and their organization
It is a running activity of a program in a computer on a certain data set.
It is the basic unit for resource allocation and scheduling in the system.
It is the basis of the operating system structure.
Is a container for threads
Process characteristics
independence
Dynamic
Concurrency
Launch a computer, text editor, etc. through a process
thread
It exists dependent on the process. Each thread must have a parent process.
The thread has its own stack, program counter and local variables, and the thread shares the system resources of the process with other threads
Processes cannot share memory, whereas threads can easily share memory with each other
Thread basics
Thread parent-child relationship
The creation of a thread must be completed by another thread
The parent thread of a created thread is the thread that created it
Create thread
Implement the Runnable interface run method
Override the run method of the Thread class
daemon thread
thread join
interrupt interrupt function
1. Call interrupt() to notify the thread that it should be interrupted.
If the thread is blocked, it will exit the blocked state immediately and throw an InterruptedException
Current usage
The called thread needs to cooperate with the interrupt
When executing tasks normally, you need to frequently check the interrupt flag bit of this thread. If the interrupt flag is set, stop the thread by itself.
If it is in a normal active state, then set the thread's interrupt flag bit to true, and the set thread will run normally without being affected.
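A minimal sketch of this cooperative style, assuming the worker polls the interrupt flag between work slices:

```java
public class InterruptDemo {
    // Start a worker that cooperates with interruption, interrupt it,
    // and report whether it actually terminated
    public static boolean runAndInterrupt() throws InterruptedException {
        Thread worker = new Thread(() -> {
            // cooperative style: check the flag frequently while doing work
            while (!Thread.currentThread().isInterrupted()) {
                // ... do a slice of work ...
            }
            // flag seen: fall through and let the thread end on its own
        });
        worker.start();
        worker.interrupt();      // only sets the flag; the worker decides when to stop
        worker.join(1000);
        return !worker.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runAndInterrupt()); // true
    }
}
```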
The difference between Thread and Runnable
Runnable is an interface
Thread is a class that implements the Runnable interface
Because classes support only single inheritance, implementing the Runnable interface is recommended
The difference between sleep and wait
sleep()
Thread.sleep()
The sleep method can be used anywhere
Sleep will only give up the CPU and will not cause the lock behavior to change.
wait()
Object.wait()
wait can only be used in synchronized methods or synchronized code blocks
Not only gives up the CPU, but also releases the synchronization resource lock currently occupied.
The difference between notify and notifyAll
Lock pool and wait pool
lock pool
Assume thread A already holds an object's lock, and threads B and C want to call a synchronized method or synchronized block of that object. Since B and C must hold the object's lock to enter the synchronized method (or block), and the lock is currently held by A, threads B and C block and wait in a place where the lock's release is awaited. This place is the object's lock pool.
Wait pool
Assume that thread A calls the wait() method of an object. Thread A will release the object's lock. At this time, A will enter the waiting pool of the object. The threads entering the waiting pool will not compete for the object's lock.
notify
A thread in the waiting pool will be randomly selected to enter the lock pool to compete for the opportunity to acquire the lock.
notifyAll
notifyAll will cause all threads in the waiting pool to enter the lock pool to compete for the opportunity to acquire the lock.
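The wait-pool/lock-pool interplay can be sketched as a one-slot buffer (class name illustrative): `wait()` releases the lock and parks the thread in the wait pool, and `notifyAll()` moves waiters into the lock pool to compete for the lock:

```java
public class OneSlotBuffer {
    private Integer slot; // null = empty

    public synchronized void put(int v) throws InterruptedException {
        while (slot != null) wait();   // enter this object's wait pool, releasing the lock
        slot = v;
        notifyAll();                   // move waiters into the lock pool to compete
    }

    public synchronized int take() throws InterruptedException {
        while (slot == null) wait();
        int v = slot;
        slot = null;
        notifyAll();
        return v;
    }

    public static void main(String[] args) throws InterruptedException {
        OneSlotBuffer b = new OneSlotBuffer();
        b.put(42);
        System.out.println(b.take()); // 42
    }
}
```

The condition is re-checked in a `while` loop, not an `if`, because a thread woken from the wait pool must re-verify the condition after re-acquiring the lock.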
yield assignment function
When the Thread.yield() function is called, it will give the thread scheduler a hint, indicating that the current thread is willing to give up the CPU, but the thread scheduler may ignore this hint.
Thread status
new new
Thread status that has not yet been started after creation
Entry method: after new, before start
runnable can run
Running/Ready
Running: the state of the thread when the thread scheduler has selected it from the runnable pool as the current thread. This is also the only way for a thread to enter the running state.
1. The ready state only means the thread is eligible to run; if the scheduler does not select it, it stays in the ready state.
2. Calling the thread's start() method puts the thread into the ready state.
3. When the current thread's sleep() ends, another thread's join() ends, user input completes, or a thread obtains the object lock, that thread enters the ready state.
4. When the current thread's time slice is used up, or the current thread's yield() method is called, the current thread enters the ready state.
5. After a thread in the lock pool obtains the object lock, it enters the ready state.
blocked blocked
Waiting to acquire exclusive lock
waiting infinite waiting
Will not be allocated CPU execution time and needs to be woken up explicitly
Timed Waiting Timed Waiting
The system will automatically wake up after a period of time.
terminated
Terminated status, the thread has finished running
1. When a thread's run() method finishes, or the main function finishes, the thread is considered terminated. The Thread object may still be alive, but it is no longer a separately executing thread; once terminated, a thread cannot be revived. 2. Calling the start() method on a terminated thread throws a java.lang.IllegalThreadStateException.
Multithreading
The meaning of multithreading
Get the most out of your processor
How to create a thread
Inherit the Thread class
Implement the Runnable interface
Avoid the limitations of multiple inheritance
Can better reflect the concept of sharing
Implement the Callable interface
Start multithreading through thread pool
What is the difference between runnable and callable
Runnable interface run method has no return value
Can only throw runtime exceptions and cannot catch them
Callable interface call method has return value and supports generics
Allows checked exceptions to be thrown; exception details can be retrieved from the Future
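A small sketch contrasting the two: the Callable returns a value and its exceptions surface through `Future.get()` (the pool setup is illustrative):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CallableDemo {
    public static int square(int n) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<Integer> task = () -> n * n;   // returns a value, may throw
            Future<Integer> f = pool.submit(task);
            // get() blocks for the result and rethrows task failures
            // wrapped in ExecutionException
            return f.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(square(7)); // 49
    }
}
```

A Runnable's `run()` returns void, so the only way to get a result or a checked exception out of a task is the Callable/Future pair.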
How to interact with information
void notify()
Randomly wakes up a single thread waiting on this object's wait pool (monitor), entering the lock pool
void notifyAll()
Wake up all threads waiting on this object's wait pool (monitor) and enter the lock pool
void wait()
Causes the current thread to wait until other threads call the notify() method or notifyAll() method of this object
void wait(long timeout)
Same as above or exceeds the specified amount of time
void wait(long timeout, int nanos)
Same as above or some other thread interrupts the current thread
JMM
atomicity
The characteristic that one or more operations are not interrupted while the CPU is executing
Atomic classes starting with Atomic
CAS principle
AtomicLong >> LongAdder
AtomicLong is based on CAS spin update
LongAdder divides the value into several cells
visibility
Modifications to shared variables by one thread can be immediately seen by another thread.
volatile
Orderliness
The program executes in the order written in the code
Happens-Before Rule
program order rules
Within a thread, according to the program control flow sequence, operations written in the front occur before operations written in the back.
Monitor locking rules
An unlock operation occurs before a subsequent lock operation on the same lock.
Volatile variable rules
A write operation to a volatile variable occurs before a subsequent read operation to this variable.
Thread startup rules
The start() method of a Thread object happens-before every action of the started thread.
Thread termination rules
Every operation in a thread happens-before the detection that this thread has terminated.
Thread interruption rules
The call to a thread's interrupt() method happens-before the interrupted thread's code detecting the interrupt event.
Object finalization rules
The completion of an object's initialization (the end of its constructor) happens-before the start of its finalize() method.
ThreadLocal thread local storage
Each thread can access the value in its own internal ThreadLocalMap object
For example: Each thread is assigned a JDBC connection Connection
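A minimal sketch: each thread gets its own copy, so a child thread's write is invisible to the main thread (names illustrative):

```java
public class ThreadLocalDemo {
    // each thread sees its own copy; withInitial supplies the per-thread starting value
    private static final ThreadLocal<StringBuilder> BUF =
            ThreadLocal.withInitial(StringBuilder::new);

    // Append in a child thread, then read the CALLER's copy
    public static String tag(String name) throws InterruptedException {
        Thread t = new Thread(() -> BUF.get().append(name));
        t.start();
        t.join();
        // the child thread appended to its own copy; the caller's buffer is untouched
        return BUF.get().toString();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("[" + tag("child") + "]"); // [] - the caller's copy stays empty
    }
}
```

The JDBC-Connection-per-thread pattern mentioned above works the same way: the map key is the thread, so no synchronization on the connection is needed.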
Thread Pool
Use Executors to create different thread pools to meet the needs of different scenarios
newFixedThreadPool(int nThreads)
Thread pool with specified number of worker threads
newCachedThreadPool()
Thread pool to handle large amounts of short-lived work tasks
Attempts to cache threads and reuse them, when no cached threads are available, new worker threads are created
If a thread remains idle for longer than the threshold, it will be terminated and moved out of the cache.
When the system is idle for a long time, no resources will be consumed.
newSingleThreadExecutor()
Create a unique worker thread to perform the task. If the thread ends abnormally, another thread will replace it.
newSingleThreadScheduledExecutor() and newScheduledThreadPool(int corePoolSize)
Timing or periodic work scheduling, the difference lies in a single worker thread or multiple threads
newWorkStealingPool()
A ForkJoinPool is built internally, using the work-stealing algorithm to process tasks in parallel; processing order is not guaranteed.
Fork/Join framework
Parallel task framework provided by Java7
A framework that divides a large task into several small tasks and executes them in parallel, and finally summarizes the results of each small task to obtain the results of the large task
work-stealing algorithm
A thread steals tasks from other queues for execution
Fork will divide tasks into different queues and create separate threads for each queue to execute.
When a thread completes executing its own task queue, it will steal the tasks in the queue of other threads for execution.
The queues are double-ended (deques): the owner thread always takes tasks from the head of its own deque, while a stealing thread always takes tasks from the tail.
Executor framework
It is a framework that separates task submission and task execution.
J.U.C’s three Executor interfaces
Executor
A simple interface for running new tasks, decoupling task submission from the details of task execution
ExecutorService
It adds methods for managing the executor and task lifecycles, and a more complete task-submission mechanism.
ScheduledExecutorService
Support Future and regularly executed tasks
Constructor of ThreadPoolExecutor
corePoolSize
Number of core threads
maximumPoolSize
The maximum number of threads that can be created when there are not enough threads
workQueue
Task waiting queue
keepAliveTime
In addition to the number of core threads, the survival time of other threads when they are idle
Which idle non-core thread is reclaimed first is not deterministic.
threadFactory
Create new thread
The default is Executors.defaultThreadFactory
handler
saturation strategy
AbortPolicy
Throw an exception directly, this is the default strategy
CallerRunsPolicy
Use the caller's thread to execute the task
DiscardOldestPolicy
Discard the top task in the queue
discardPolicy
Discard the task directly
Implement a custom handler of the RejectedExecutionHandler interface to meet your own business needs
Judgment after new task is submitted to execute
If fewer threads are running than corePoolSize, new threads will be created to handle tasks, even if other threads in the thread pool are idle.
If the number of threads obtained in the thread pool is greater than or equal to corePoolSize and less than maximumPoolSize, a new thread will be created only when the workQueue is full, otherwise, it will be inserted into the workQueue.
If the set corePoolSize and maximumPoolSize are the same, the size of the created thread pool is fixed. At this time, if a new task is submitted and the workQueue is not full, the task will be put into the workQueue and wait for an idle thread to extract the task processing from the workQueue.
If the number of running threads is greater than or equal to maximumPoolSize and the workQueue is full, the task is handled according to the strategy specified by the handler.
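The constructor parameters above can be assembled as follows; all of the values are illustrative, not recommendations:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolDemo {
    public static ThreadPoolExecutor build() {
        return new ThreadPoolExecutor(
                2,                                // corePoolSize
                4,                                // maximumPoolSize
                60, TimeUnit.SECONDS,             // keepAliveTime for non-core idle threads
                new ArrayBlockingQueue<>(10),     // bounded workQueue
                Executors.defaultThreadFactory(), // threadFactory
                new ThreadPoolExecutor.AbortPolicy()); // saturation: throw RejectedExecutionException
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = build();
        pool.execute(() -> System.out.println("task ran"));
        pool.shutdown();
    }
}
```

With this configuration, the 3rd to 12th concurrent submissions queue up, the 13th and 14th spawn extra threads up to 4, and the 15th is rejected by AbortPolicy.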
Thread pool status
RUNNING
Able to accept new tasks and process tasks in workQueue
SHUTDOWN
No longer accepting new tasks, but existing tasks can be processed
STOP
No new tasks are accepted, queued tasks are not processed, and in-progress tasks are interrupted.
TIDYING
All tasks have been terminated
TERMINATED
terminated(), enter this state after the method is executed
Worker thread life cycle
Why use thread pool
Reduce resource consumption
Improve thread manageability
How to choose the size of the thread pool
CPU intensive
Number of threads = number of CPU cores, or cores + 1
I/O intensive
Number of threads = number of CPU cores * (1 + average waiting time / average working time)
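The I/O-bound formula worked through in code; the core count and timings are made-up inputs:

```java
public class PoolSizing {
    // I/O-bound sizing sketch: threads = cores * (1 + waitTime / computeTime)
    public static int ioBoundThreads(int cores, double waitMs, double computeMs) {
        return (int) (cores * (1 + waitMs / computeMs));
    }

    public static void main(String[] args) {
        // e.g. 8 cores, each task waits 50 ms on I/O for every 5 ms of CPU work:
        System.out.println(ioBoundThreads(8, 50, 5)); // 8 * (1 + 10) = 88
    }
}
```

The intuition: while one thread is parked on I/O, roughly waitMs/computeMs other threads can keep the same core busy.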
thread executor executor
state
Executor framework
Executors
newFixedThreadPool
A fixed-length thread pool creates a thread every time a task is submitted until the maximum number of thread pools is reached.
newScheduledThreadPool
Fixed-length thread pool that can perform periodic tasks
Ability to schedule commands to run after a given delay or periodically as needed
newCachedThreadPool
Cacheable thread pool, if the capacity of the thread pool exceeds the number of tasks, idle threads will be automatically recycled
New threads can be added automatically when tasks increase, and the capacity of the thread pool is not limited.
newSingleThreadExecutor
In a single-threaded thread pool, if the thread ends abnormally, a new thread will be created to ensure that tasks are executed in the order they are submitted.
newSingleThreadScheduledExecutor
A single-threaded thread pool that can perform periodic tasks
newWorkStealingPool
Work-stealing thread pool; does not guarantee execution order, suitable for tasks with widely varying durations.
The default created parallel level is the number of CPU cores. When the main thread ends, even if there are tasks in the thread pool, it will stop immediately.
ForkJoinTask
Solve the problem of CPU load imbalance
Get the most out of multi-core CPUs using divide-and-conquer
Task segmentation
Results merged
ThreadPoolExecutor Thread pool base class
corePoolSize
Number of core threads
maximumPoolSize
Maximum number of threads
keepAliveTime
Idle thread survival time
unit
survival time unit
workQueue
Task blocking queue
threadFactory
Factory required when creating a new thread
handler
Deny policy
AbortPolicy
By default, when the queue is full, tasks are discarded and an exception is thrown.
DiscardPolicy
If the queue is full, the task will be discarded without throwing an exception.
DiscardOldestPolicy
Delete the earliest task that enters the queue, and then try to join the queue again
CallerRunPolicy
If adding to the thread pool fails, the main thread will perform the task itself.
Four rejection strategies
ScheduledThreadPoolExecutor
Mainly used to run tasks after a given delay, or tasks that are executed regularly
ExecutorCompletionService
Internally manages a blocking queue of completed tasks
submit()
Submit the task and it will eventually be delegated to the internal executor to execute the task.
Parameters: (Runnable), (Runnable plus a result T), or (Callable)
Returns a Future; calling its get() method lets you catch and handle task exceptions
take()
If there is already a completed task in the blocking queue, return the task result, otherwise block and wait for the task to complete.
poll()
Returns if there is a task in the queue that has been completed, otherwise returns null
poll(long, TimeUnit)
If a completed task is in the queue, return its result; otherwise wait for the specified time, and if still no task has completed, return null
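A sketch of completion-order retrieval; the sleep duration is illustrative and makes the second-submitted task finish first:

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CompletionDemo {
    // take() hands back tasks in COMPLETION order, not submission order
    public static int firstDone() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            CompletionService<Integer> cs = new ExecutorCompletionService<>(pool);
            cs.submit(() -> { Thread.sleep(200); return 1; }); // slow task, submitted first
            cs.submit(() -> 2);                                // fast task, submitted second
            return cs.take().get(); // blocks until the first FINISHED task is available
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstDone()); // 2: the fast task completes first
    }
}
```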
Fork/Join framework
ForkJoinPool
It is a supplement to ExecutorService, not a substitute. It is especially suitable for divide and conquer and recursive calculation algorithms.
RecursiveTask
Tasks that return results
RecursiveAction
Calculations that return no results
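A divide-and-conquer sum as a `RecursiveTask` sketch; the threshold and class name are illustrative:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000;
    private final long[] data;
    private final int lo, hi;

    SumTask(long[] data, int lo, int hi) {
        this.data = data;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo <= THRESHOLD) {          // small enough: compute directly
            long s = 0;
            for (int i = lo; i < hi; i++) s += data[i];
            return s;
        }
        int mid = (lo + hi) >>> 1;           // split; fork left half, compute right here
        SumTask left = new SumTask(data, lo, mid);
        left.fork();
        long right = new SumTask(data, mid, hi).compute();
        return left.join() + right;          // merge the partial results
    }

    public static long sum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }

    public static void main(String[] args) {
        long[] a = new long[10_000];
        for (int i = 0; i < a.length; i++) a[i] = i + 1;
        System.out.println(sum(a)); // 50005000
    }
}
```

Forking one half and computing the other in the current thread avoids an unnecessary task handoff, which is the idiomatic Fork/Join pattern.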
thread framework
JUC
【java.util.concurrent development package】
core class
TimeUnit tool class
ThreadFactory thread factory class
CAS
Is the basis of the java.util.concurrent.atomic package
atomic (atomic variable class)
The atomic class is built on CAS and volatile. CAS is a commonly used implementation of non-blocking algorithms. Compared with blocking algorithms such as synchronized, it has better performance.
Ordinary atomic class
AtomicBoolean
AtomicLong
AtomicInteger
AtomicIntegerArray
AtomicLongArray
Reference atom class
AtomicReference
AtomicReferenceArray
Solving ABA
AtomicMarkableReference
AtomicStampedReference
Enhanced atom class
LongAccumulator
A custom long accumulator: the constructor accepts a binary operator (LongBinaryOperator) that computes a new value from two inputs, plus the accumulator's initial value.
DoubleAccumulator
Custom implemented double type accumulator, same as LongAccumulator
LongAdder
Atomic operations on long; under contention it performs better than AtomicLong and should be preferred. LongAdder is a special case of LongAccumulator.
DoubleAdder
Atomic operations of double type, same as LongAdder
core components
lock mechanism
Lock
ReadWriteLock
AQS (Queue Synchronizer)
It is the basis of the java.util.concurrent.locks package, as well as of common classes such as Semaphore and ReentrantLock.
AbstractOwnableSynchronizer (exclusive lock)
Provides a framework for implementing blocking locks and related synchronizers (semaphores, events, etc.) that rely on first-in-first-out (FIFO) wait queues
AbstractQueuedLongSynchronizer (64-bit synchronizer)
ReentrantLock mutex lock
ReadWriteLock read-write lock
Condition control queue
LockSupport blocking primitive
Semaphore semaphore
CountDownLatch latch
CyclicBarrier fence
Exchanger switch
CompletableFuture thread callback
Concurrent collections
concurrent queue
ArrayBlockingQueue
Bounded blocking queue of array structure
LinkedBlockingDeque
A double-ended blocking queue with a linked-list structure. If the size is not specified, it is unbounded.
LinkedBlockingQueue
A blocking queue with a linked-list structure. If the size is not specified, it is unbounded.
PriorityBlockingQueue
Array structure priority bounded blocking queue, heap sort
DelayQueue
delayed blocking queue
PriorityQueue is used internally to implement delay
SynchronousQueue
Synchronous blocking queue has no capacity and the put operation will always be blocked. The execution cannot continue until there is a take operation.
LinkedTransferQueue
An unbounded blocking queue combining the features of SynchronousQueue and LinkedBlockingQueue
The blocking queue BlockingQueue provides blocking enqueue and dequeue operations. It is mainly used in the producer-consumer pattern under multi-threading: producers add elements at the tail of the queue and consumers take elements from the head, isolating task production from task consumption.
ConcurrentLinkedDeque
Thread-safe bidirectional queue (deque) with a linked-list structure
ConcurrentLinkedQueue
Thread-safe queue with a linked-list structure
non-blocking queue
concurrent collection
ConcurrentHashMap
Thread-safe Map, array, linked list/red-black tree structure
ConcurrentHashMap.newKeySet()
Thread-safe operations
ConcurrentSkipListMap
Thread safety, Map sorted by key, skip list structure
ConcurrentSkipListSet
Thread safety, ordered Set, skip list structure
CopyOnWriteArrayList
Write to copy the List, the scenario of reading more and writing less
CopyOnWriteArraySet
Set copied when writing, scenario of reading more and writing less
Concurrency tools tools
CountDownLatch
A latch allows one or more threads to wait for other threads to complete their work before continuing execution.
CyclicBarrier
Fences enable multiple threads to wait for a condition to be met before continuing execution.
Semaphore
Semaphore, a counting semaphore, often used to limit the number of threads that can access certain resources.
Exchanger
Exchanger, an encapsulated tool class used to exchange data between two worker threads
other
TimeUnit
ThreadLocalRandom
Improved random number generation performance under multi-threading to avoid competition for the same seed
Thread safety and data synchronization
CountDownLatch
Make a thread or threads wait
Semaphore
Auxiliary class for thread synchronization, which can maintain the number of threads currently accessing itself and provide synchronization
CyclicBarrier
Implement a group of threads to wait for each other, and then perform subsequent operations when all threads reach a certain barrier point
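A minimal `CountDownLatch` sketch: the caller waits until every worker has counted down (the worker count is illustrative):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LatchDemo {
    // The caller blocks in await() until the count reaches zero
    public static boolean allFinished(int workers) throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(workers);
        for (int i = 0; i < workers; i++) {
            new Thread(() -> {
                // ... do some work ...
                latch.countDown();   // each worker decrements the count once
            }).start();
        }
        return latch.await(5, TimeUnit.SECONDS); // true once all workers counted down
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(allFinished(3)); // true
    }
}
```

Unlike CyclicBarrier, a latch cannot be reset: once the count hits zero it stays open, which suits one-shot "wait for startup/shutdown" coordination.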
volatile
effect
Can only act on variables
Indicates that the variable is undefined in the CPU's register and must be read from main memory.
Features
Ensure visibility
No guarantee of atomicity
Disable command rearrangement
thread blocking
Lock
Level classification
no lock
biased lock
lightweight lock
Heavyweight lock
cause
Crossed (nested) locks
Insufficient resources
Database locks
Request/response data exchange
infinite loop
deadlock
Four necessary conditions
Mutually exclusive conditions
A certain resource is only occupied by one thread for a period of time, and other threads requesting the resource can only wait.
no deprivation conditions
The resources obtained by a thread cannot be forcibly taken away by other threads before they are completely used.
It can only be released actively by the thread that obtained the resource.
Request and hold conditions
The thread already holds at least one resource but makes a new resource request
The resource is already occupied by other threads, and the requesting thread is blocked at this time.
Hold on to the resources you have acquired
Loop wait condition
There is a circular waiting chain for thread resources
The resources obtained by each thread are simultaneously requested by the next thread in the chain.
Ways to avoid deadlock
Locking order
Threads lock in a certain order
Lock time limit
Add a certain time limit when the thread tries to acquire the lock
If the time limit is exceeded, the request for the lock will be given up and the lock it holds will be released.
Deadlock detection
diagnosis
jstack
jvisualvm
spin lock
The thread repeatedly checks whether the lock variable is available
TicketLock lock mainly solves the issue of fairness
CLH lock is a scalable, high-performance, fair spin lock based on linked list
MCS lock spins on a thread-local queue-node variable
CAS implementation
CAS has 3 operands, the memory value V, the old expected value A, and the new value to be modified B
If and only if the expected value A and the memory value V are the same, modify the memory value V to B, otherwise do nothing
In-depth principles
CAS ticket
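The spin idea above can be sketched as a CAS-based spin lock on an `AtomicBoolean`; this is a teaching sketch, not a production lock (no fairness, no reentrancy):

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SpinLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    public void lock() {
        // spin: retry the CAS (expect false -> set true) until it succeeds
        while (!locked.compareAndSet(false, true)) {
            Thread.onSpinWait(); // JDK 9+ hint that we are busy-waiting
        }
    }

    public void unlock() {
        locked.set(false);       // volatile write: publishes protected state
    }

    static int counter = 0;

    // Demonstrate mutual exclusion: concurrent increments under the spin lock
    public static int raceFree(int threads, int perThread) throws InterruptedException {
        SpinLock lock = new SpinLock();
        counter = 0;
        Thread[] ts = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            ts[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    lock.lock();
                    try { counter++; } finally { lock.unlock(); }
                }
            });
            ts[t].start();
        }
        for (Thread th : ts) th.join();
        return counter;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(raceFree(4, 10_000)); // 40000
    }
}
```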
Lock optimization
JVM optimization
The direction of optimization is to reduce thread blocking
In Java SE 1.6, locks have a total of 4 states
Lock-free, biased lock, lightweight lock, heavyweight lock
The status will gradually escalate with the competition situation.
The lock cannot be downgraded after it is upgraded.
lock elimination
The JIT compiler eliminates locks that are synchronized by some code that does not need to be synchronized.
escape analysis
Observes whether a variable can escape a given scope
lock roughening
If a series of consecutive operations repeatedly lock and unlock the same object
The JVM will extend (coarsen) the scope of lock synchronization to the outside of the entire operation sequence.
Program optimization
Reduce lock holding time
Reduce lock granularity
Use read-write locks to replace exclusive locks
Lock separation
no lock
CAS
synchronized synchronization lock
effect
Modify a code block
Synchronized statement block
Its scope is the code enclosed in curly brackets {}
The object of action is the object that calls this code block.
Modify a method
Modify a static method
Modify a class
Classification
method lock
Object lock synchronized(this)
Class lock synchronized(Demo.Class)
Features
Ensure orderliness, atomicity and visibility among threads
thread blocking
principle
Bytecode plus identification
The synchronization code block obtains the execution right of the thread through the monitorenter and monitorexit instructions.
The synchronization method controls the execution right of the thread by adding the ACC_SYNCHRONIZED flag.
Lock
Code
lock() obtains the lock in a blocking manner, and the blocking state ignores the interrupt method
lockInterruptibly(), unlike lock(), responds to interruption while waiting for the lock
tryLock() obtains the lock in a non-blocking manner and can add time parameters
Lock and synchronized
Similarity: Lock can complete all functions implemented by synchronized
Difference: Lock has more precise thread semantics and better performance than synchronized
Lock locking is implemented through code
The programmer releases it manually and must release it in the finally clause
synchronized is implemented at the JVM level
automatic release lock
The scope of Lock lock is limited, block scope
synchronized can lock blocks, objects, and classes
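The manual-release rule in code: a sketch pairing `lock()` with `unlock()` in a `finally` clause (class name illustrative):

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockDemo {
    private final ReentrantLock lock = new ReentrantLock();
    private int count;

    public void increment() {
        lock.lock();            // unlike synchronized, release is NOT automatic
        try {
            count++;            // critical section
        } finally {
            lock.unlock();      // always release in finally, even if the body throws
        }
    }

    public int get() {
        return count;
    }

    public static void main(String[] args) {
        LockDemo d = new LockDemo();
        d.increment();
        d.increment();
        System.out.println(d.get()); // 2
    }
}
```

Forgetting the `finally` is the classic Lock bug: an exception in the critical section would leave the lock held forever, something synchronized can never do.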
ReentrantLock
Reentrant lock, mutex lock
ReentrantReadWriteLock
Reentrant read-write lock, read-read shared, read-write mutex, write-write mutex
StampedLock
Read-write lock with timestamp, non-reentrant
ReadWriteLock read-write lock
ReentrantReadWriteLock is its main implementation
The difference between synchronized and ReentrantLock
wait and notify
wait
1) Release the current object lock
2) Make the current thread enter the blocking queue
notify
wake up a thread
High concurrency scenario practice
Redis and JVM multi-level cache architecture
Message middleware traffic peak shaving and asynchronous processing
Current limiting strategy implementation
Nginx current limit
counter
sliding time window
Token bucket, leaky bucket algorithm
Sentinel/Hystrix current limiting
Implementation of service peak downgrade
System security anti-brush strategy
Performance tuning
JVM GC tuning
GC
Avoid using the finalize() object finalization method
Tomcat tuning
Nginx tuning
Java algorithm
binary search
public static int biSearch(int[] array, int a) {
    int lo = 0;
    int hi = array.length - 1;
    int mid;
    while (lo <= hi) {
        mid = (lo + hi) / 2;        // middle position
        if (array[mid] == a) {
            return mid + 1;         // found: return the 1-based position
        } else if (array[mid] < a) {
            lo = mid + 1;           // search the right half
        } else {
            hi = mid - 1;           // search the left half
        }
    }
    return -1;                      // not found
}
bubble sort algorithm
public static void bubbleSort1(int[] a, int n) {
    int i, j;
    for (i = 0; i < n; i++) {           // n sorting passes
        for (j = 1; j < n - i; j++) {
            if (a[j - 1] > a[j]) {      // if the previous number is greater, swap
                int temp = a[j - 1];
                a[j - 1] = a[j];
                a[j] = temp;
            }
        }
    }
}
insertion sort algorithm
public void sort(int[] arr) {
    for (int i = 1; i < arr.length; i++) {
        int insertVal = arr[i];          // the number to insert
        int index = i - 1;               // position to compare against
        // while the number to insert is smaller than the compared element
        while (index >= 0 && insertVal < arr[index]) {
            arr[index + 1] = arr[index]; // shift arr[index] backward
            index--;                     // move index forward
        }
        arr[index + 1] = insertVal;      // put the number into its position
    }
}
Scan from back to front in the sorted sequence to find the corresponding position and insert it.
Insertion sort is very similar to playing cards.
quicksort algorithm
The principle of quick sort: select a key value as the baseline value. Values smaller than the baseline are all in the left sequence (generally unordered),
Those larger than the base value are on the right (generally unordered). Generally the first element of the sequence is selected.
First loop: compare from back to front, comparing the base value with each value from the end; if a value is smaller than the base value, swap positions, otherwise move on to the next one, not swapping until the first value smaller than the base value is found. After that swap, comparison continues from front to back: if a value is larger than the base value, swap positions, otherwise move on to the next one, not swapping until the first value larger than the base value is found.
Until the comparison index from front to back > the index from back to front, the first loop ends. At this time, for the base value, the left and right sides are in order.
public void sort(int[] a, int low, int high) {
    int start = low;
    int end = high;
    int key = a[low];
    while (end > start) {
        // compare from back to front
        while (end > start && a[end] >= key)
            end--;                        // skip values not smaller than the key
        if (a[end] <= key) {
            int temp = a[end];
            a[end] = a[start];
            a[start] = temp;
        }
        // compare from front to back
        while (end > start && a[start] <= key)
            start++;                      // skip values not larger than the key
        if (a[start] >= key) {
            int temp = a[start];
            a[start] = a[end];
            a[end] = temp;
        }
        // when this loop ends the key's position is fixed: values to its left are
        // smaller and values to its right are larger, though each side may still
        // be unordered, hence the recursive calls below
    }
    // recursion
    if (start > low) sort(a, low, start - 1);   // left part: first index to key index - 1
    if (end < high) sort(a, end + 1, high);     // right part: key index + 1 to the end
}
Hill sort algorithm
Idea: First divide the entire sequence of records to be sorted into several subsequences for direct insertion sorting. When the records in the entire sequence are "basically in order", then perform direct insertion sorting on all records.
1. Operation method: select an increment sequence t1, t2, ..., tk, where ti > ti+1 and tk = 1;
2. Sort the sequence k times, once for each increment;
3. In each pass, split the sequence to be sorted into groups of elements that are the current increment ti apart, and run direct insertion sort on each group. Only when the increment factor is 1 is the entire sequence treated as one table, whose length is the length of the whole sequence.
private void shellSort(int[] a) {
    int dk = a.length / 2;
    while (dk >= 1) {
        shellInsertSort(a, dk);
        dk = dk / 2;
    }
}

private void shellInsertSort(int[] a, int dk) {
    // Same as insertion sort, except the increment is dk instead of 1
    for (int i = dk; i < a.length; i++) {
        if (a[i] < a[i - dk]) {
            int j;
            int x = a[i];          // x is the element to be inserted
            a[i] = a[i - dk];
            for (j = i - dk; j >= 0 && x < a[j]; j = j - dk) {
                a[j + dk] = a[j];  // shift elements back one gap at a time to find the insertion point
            }
            a[j + dk] = x;         // insert
        }
    }
}
merge sort algorithm
public class MergeSortTest {
    public static void main(String[] args) {
        int[] data = new int[] { 5, 3, 6, 2, 1, 9, 4, 8, 7 };
        print(data);
        mergeSort(data);
        System.out.println("Sorted array:");
        print(data);
    }

    public static void mergeSort(int[] data) {
        sort(data, 0, data.length - 1);
    }

    public static void sort(int[] data, int left, int right) {
        if (left >= right)
            return;
        // Find the middle index
        int center = (left + right) / 2;
        // Recurse on the left half
        sort(data, left, center);
        // Recurse on the right half
        sort(data, center + 1, right);
        // Merge the two sorted halves
        merge(data, left, center, right);
        print(data);
    }
    /**
     * Merge two adjacent sorted ranges; both are sorted before merging and the
     * combined range is still sorted afterwards.
     *
     * @param data   array object
     * @param left   index of the first element of the left range
     * @param center index of the last element of the left range; center + 1 is the
     *               index of the first element of the right range
     * @param right  index of the last element of the right range
     */
    public static void merge(int[] data, int left, int center, int right) {
        // temporary array
        int[] tmpArr = new int[data.length];
        // index of the first element of the right range
        int mid = center + 1;
        // third tracks the write position in the temporary array
        int third = left;
        // cache the index of the first element of the left range
        int tmp = left;
        while (left <= center && mid <= right) {
            // take the smaller head of the two ranges and put it into the temporary array
            if (data[left] <= data[mid]) {
                tmpArr[third++] = data[left++];
            } else {
                tmpArr[third++] = data[mid++];
            }
        }
        // copy any leftovers into the temporary array (only one of these two loops can run)
        while (mid <= right) {
            tmpArr[third++] = data[mid++];
        }
        while (left <= center) {
            tmpArr[third++] = data[left++];
        }
        // copy the merged contents of the temporary array back into the original range
        while (tmp <= right) {
            data[tmp] = tmpArr[tmp];
            tmp++;
        }
    }

    public static void print(int[] data) {
        for (int i = 0; i < data.length; i++) {
            System.out.print(data[i] + "\t");
        }
        System.out.println();
    }
}
bucket sort algorithm
The basic idea of bucket sorting is to divide the array arr into n subranges (buckets) of the same size, sort each subrange separately, and finally merge. Counting sort is a special case of bucket sort. You can think of counting sort as a case where there is only one element in each bucket.
1. Find the maximum value max and minimum value min in the array to be sorted
2. We use the dynamic array ArrayList as the bucket, and the elements placed in a bucket are also stored in an ArrayList. The number of buckets is (max - min) / arr.length + 1
3. Traverse the array arr and compute the bucket each element arr[i] falls into.
4. Sort each bucket separately
public static void bucketSort(int[] arr) {
    int max = Integer.MIN_VALUE;
    int min = Integer.MAX_VALUE;
    for (int i = 0; i < arr.length; i++) {
        max = Math.max(max, arr[i]);
        min = Math.min(min, arr[i]);
    }
    // Create the buckets
    int bucketNum = (max - min) / arr.length + 1;
    ArrayList<ArrayList<Integer>> bucketArr = new ArrayList<>(bucketNum);
    for (int i = 0; i < bucketNum; i++) {
        bucketArr.add(new ArrayList<Integer>());
    }
    // Distribute each element into its bucket
    for (int i = 0; i < arr.length; i++) {
        int num = (arr[i] - min) / arr.length;
        bucketArr.get(num).add(arr[i]);
    }
    // Sort each bucket
    for (int i = 0; i < bucketArr.size(); i++) {
        Collections.sort(bucketArr.get(i));
    }
}
Radix sort algorithm
public class RadixSort {
    int[] a = { 49, 38, 65, 97, 76, 13, 27, 49, 78, 34, 12, 64, 5, 4, 62, 99, 98, 54,
                101, 56, 17, 18, 23, 34, 15, 35, 25, 53, 51 };

    public RadixSort() {
        sort(a);
        for (int i = 0; i < a.length; i++) {
            System.out.println(a[i]);
        }
    }

    public void sort(int[] array) {
        // First determine the number of passes: the digit count of the maximum value
        int max = array[0];
        for (int i = 1; i < array.length; i++) {
            if (array[i] > max) {
                max = array[i];
            }
        }
        int time = 0;
        while (max > 0) {
            max /= 10;
            time++;
        }
        // Create 10 queues, one per digit 0-9
        List<ArrayList<Integer>> queue = new ArrayList<ArrayList<Integer>>();
        for (int i = 0; i < 10; i++) {
            queue.add(new ArrayList<Integer>());
        }
        // Perform `time` rounds of distribution and collection
        for (int i = 0; i < time; i++) {
            // Distribute: bucket each element by its i-th digit (counting from the least significant)
            for (int j = 0; j < array.length; j++) {
                int x = array[j] % (int) Math.pow(10, i + 1) / (int) Math.pow(10, i);
                queue.get(x).add(array[j]);
            }
            // Collect: read the queues back in digit order
            int count = 0; // element counter
            for (int k = 0; k < 10; k++) {
                ArrayList<Integer> q = queue.get(k);
                while (q.size() > 0) {
                    array[count++] = q.remove(0);
                }
            }
        }
    }
}
Pruning algorithm
In the optimization of search algorithms, pruning is to avoid unnecessary traversal processes through certain judgments. To put it figuratively, it means to cut off some "branches" in the search tree, so it is called pruning. The core issue in applying pruning optimization is to design a pruning judgment method, that is, a method to determine which branches should be discarded and which branches should be retained.
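To make the idea concrete, here is a minimal sketch (class and method names `PruneDemo`/`count` are mine, not from the source): counting subsets of positive integers that sum to a target, cutting off any branch whose running sum has already overshot the target.

```java
public class PruneDemo {
    // Count subsets of positive numbers that sum exactly to target.
    static int count(int[] nums, int i, int sum, int target) {
        if (sum == target) return 1;             // found a valid subset
        if (i == nums.length || sum > target)    // prune: overshooting can never recover
            return 0;
        // branch 1: include nums[i]; branch 2: skip it
        return count(nums, i + 1, sum + nums[i], target)
             + count(nums, i + 1, sum, target);
    }

    public static void main(String[] args) {
        // subsets of {1,2,3,4} summing to 5: {1,4} and {2,3}
        System.out.println(count(new int[]{1, 2, 3, 4}, 0, 0, 5)); // 2
    }
}
```

The `sum > target` check is the pruning judgment: without it the search would still visit every branch of the tree before rejecting it at the leaves.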
Backtracking algorithm
The backtracking algorithm is actually a search attempt process similar to enumeration. It mainly searches for the solution to the problem during the search attempt. When it is found that the solution conditions are no longer met, it "backtracks" and returns to try other paths.
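A standard illustration of the try-then-undo pattern described above is generating all permutations; this sketch (names `BacktrackDemo`/`permute` are illustrative, not from the source) records a choice, recurses, and then "backtracks" by undoing the choice.

```java
import java.util.ArrayList;
import java.util.List;

public class BacktrackDemo {
    static List<List<Integer>> permute(int[] nums) {
        List<List<Integer>> result = new ArrayList<>();
        backtrack(nums, new ArrayList<>(), new boolean[nums.length], result);
        return result;
    }

    static void backtrack(int[] nums, List<Integer> path, boolean[] used,
                          List<List<Integer>> result) {
        if (path.size() == nums.length) {   // a complete solution was reached
            result.add(new ArrayList<>(path));
            return;
        }
        for (int i = 0; i < nums.length; i++) {
            if (used[i]) continue;          // this element is already on the current path
            used[i] = true;
            path.add(nums[i]);
            backtrack(nums, path, used, result);
            path.remove(path.size() - 1);   // backtrack: undo the choice and try another
            used[i] = false;
        }
    }

    public static void main(String[] args) {
        System.out.println(permute(new int[]{1, 2, 3}).size()); // 6
    }
}
```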
shortest path algorithm
Among the paths that start from a certain vertex and reach another vertex along the edges of the graph, the path with the smallest sum of weights on each edge is called the shortest path. There are the following algorithms to solve the shortest path problem, Dijkstra algorithm, Bellman-Ford algorithm, Floyd algorithm and SPFA algorithm, etc.
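Of the algorithms listed, Dijkstra's is the one most often implemented by hand; a minimal O(V^2) adjacency-matrix sketch follows (class name `DijkstraDemo` and the matrix convention "0 = no edge" are my own assumptions, not from the source).

```java
import java.util.Arrays;

public class DijkstraDemo {
    // Single-source shortest paths over an adjacency matrix (0 = no edge).
    static int[] dijkstra(int[][] g, int src) {
        int n = g.length;
        int[] dist = new int[n];
        boolean[] done = new boolean[n];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[src] = 0;
        for (int k = 0; k < n; k++) {
            int u = -1;
            for (int v = 0; v < n; v++)  // pick the closest unsettled vertex
                if (!done[v] && (u == -1 || dist[v] < dist[u])) u = v;
            if (dist[u] == Integer.MAX_VALUE) break; // remaining vertices unreachable
            done[u] = true;
            for (int v = 0; v < n; v++)  // relax every edge leaving u
                if (g[u][v] > 0 && dist[u] + g[u][v] < dist[v])
                    dist[v] = dist[u] + g[u][v];
        }
        return dist;
    }

    public static void main(String[] args) {
        int[][] g = { {0, 4, 1, 0}, {4, 0, 2, 5}, {1, 2, 0, 8}, {0, 5, 8, 0} };
        System.out.println(Arrays.toString(dijkstra(g, 0))); // [0, 3, 1, 8]
    }
}
```

A binary heap over the frontier brings this down to O(E log V) for sparse graphs.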
Maximum Subarray Algorithm
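The original leaves this node without detail; the widely used linear-time solution is Kadane's algorithm (class and method names below are my own), which keeps the best sum of a subarray ending at each index.

```java
public class MaxSubArrayDemo {
    // Kadane's algorithm: O(n) scan, O(1) extra space.
    static int maxSubArray(int[] nums) {
        int best = nums[0], cur = nums[0];
        for (int i = 1; i < nums.length; i++) {
            cur = Math.max(nums[i], cur + nums[i]); // extend the current run or restart here
            best = Math.max(best, cur);
        }
        return best;
    }

    public static void main(String[] args) {
        // best subarray is [4, -1, 2, 1] with sum 6
        System.out.println(maxSubArray(new int[]{-2, 1, -3, 4, -1, 2, 1, -5, 4})); // 6
    }
}
```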
longest common subsequence algorithm
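This node also has no detail in the original; the textbook approach is dynamic programming over a length table, sketched here (names `LcsDemo`/`lcs` are illustrative) where `dp[i][j]` is the LCS length of the first `i` characters of `a` and the first `j` of `b`.

```java
public class LcsDemo {
    static int lcs(String a, String b) {
        int[][] dp = new int[a.length() + 1][b.length() + 1];
        for (int i = 1; i <= a.length(); i++)
            for (int j = 1; j <= b.length(); j++)
                dp[i][j] = a.charAt(i - 1) == b.charAt(j - 1)
                         ? dp[i - 1][j - 1] + 1                   // characters match: extend
                         : Math.max(dp[i - 1][j], dp[i][j - 1]);  // otherwise drop one char
        return dp[a.length()][b.length()];
    }

    public static void main(String[] args) {
        System.out.println(lcs("ABCBDAB", "BDCABA")); // 4, e.g. "BCBA"
    }
}
```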
Minimum Spanning Tree Algorithm
There are many algorithms for constructing minimum spanning trees, but they all rely on the same property of minimum spanning trees, the MST property: assume N = (V, {E}) is a connected network and U is a non-empty subset of the vertex set V; if (u, v) is a minimum-weight edge with u in U and v in V-U, then some minimum spanning tree must contain the edge (u, v). Two algorithms that build a minimum spanning tree using the MST property are Prim's algorithm and Kruskal's algorithm.
For a connected network with n vertices we can build many different spanning trees, each usable as a communication network. The spanning tree whose total construction cost is minimal is called the minimum spanning tree of the connected network.
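Of the two algorithms named above, Prim's is the shorter to sketch; this minimal version (class name `PrimDemo` and the "0 = no edge" adjacency-matrix convention are my own assumptions) grows the tree from vertex 0, which is exactly the MST property in action: at each step it adds the cheapest edge crossing from the tree set U to V-U.

```java
import java.util.Arrays;

public class PrimDemo {
    // Prim's algorithm on an adjacency matrix; returns the total MST weight.
    static int primTotalWeight(int[][] g) {
        int n = g.length;
        boolean[] inTree = new boolean[n];
        int[] minEdge = new int[n];          // cheapest known edge linking v to the tree
        Arrays.fill(minEdge, Integer.MAX_VALUE);
        minEdge[0] = 0;
        int total = 0;
        for (int k = 0; k < n; k++) {
            int u = -1;
            for (int v = 0; v < n; v++)      // pick the cheapest vertex not yet in the tree
                if (!inTree[v] && (u == -1 || minEdge[v] < minEdge[u])) u = v;
            inTree[u] = true;
            total += minEdge[u];
            for (int v = 0; v < n; v++)      // the new tree vertex may offer cheaper edges
                if (!inTree[v] && g[u][v] > 0 && g[u][v] < minEdge[v])
                    minEdge[v] = g[u][v];
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] g = { {0, 2, 3, 0}, {2, 0, 1, 4}, {3, 1, 0, 5}, {0, 4, 5, 0} };
        System.out.println(primTotalWeight(g)); // MST uses edges 1 + 2 + 4 = 7
    }
}
```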
Java basics
object-oriented
feature
Encapsulation
Hide object properties to protect internal object state
Improved code usability and maintainability
Prohibit bad interactions between objects and improve modularity
Inheritance
Provides objects with the ability to obtain fields and methods from base classes
Polymorphism
The ability to display the same interface for different underlying data types
Abstraction
Separate class behavior from concrete implementation
Object-oriented basics
Define classes and create instances
method
Construction method
Purpose: Initialize all internal fields to appropriate values when creating an object instance
Default constructor
Initialize fields
Multiple construction methods
this variable
parameter
Formal parameters and actual parameters
Formal parameters: are parameters used when defining the function name and function body. The purpose is to receive the parameters passed in when calling the function, referred to as "formal parameters".
Actual parameters: When calling a function in the calling function, the parameters in parentheses after the function name are called "actual parameters", or "actual parameters" for short.
Parameter passing: The method parameter passing method in Java is by value.
If the argument is of a primitive type, a copy of the literal value of the primitive type is passed.
If the parameter is a reference type, what is passed is a copy of the address value in the heap of the object referenced by the parameter.
The fundamental difference between passing by value and passing by reference is whether a copy of the passed object is created.
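The two cases above can be demonstrated in a few lines; this sketch (the class `PassByValueDemo` is hypothetical) shows that reassigning a parameter never affects the caller, while mutating the referenced object does.

```java
public class PassByValueDemo {
    static void reassign(int[] arr) {
        arr = new int[]{9, 9};   // rebinds only the method's local copy of the reference
    }

    static void mutate(int[] arr) {
        arr[0] = 42;             // mutates the shared object through the copied reference
    }

    public static void main(String[] args) {
        int[] data = {1, 2};
        reassign(data);
        System.out.println(data[0]); // still 1: the caller's reference is untouched
        mutate(data);
        System.out.println(data[0]); // 42: the object itself was changed
    }
}
```

This is why "Java passes references by value" is the precise statement: the reference is copied, the object is not.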
Parameter binding
Method overloading: The method names are the same, but their parameters are different, which is called method overloading.
access control modifier
public: the widest access; classes, interfaces, exceptions, etc. anywhere on the classpath can access it. Commonly used for the interface a class or object presents to the outside world.
protected: mainly protects subclasses; members it modifies can be used by subclasses but not by unrelated classes, like something passed down through inheritance.
Default (sometimes called friendly): package-private access, designed for access within the same package; classes, interfaces, exceptions, etc. in the same package can access each other's default members.
private: access is limited to the inside of the class, which is the clearest expression of encapsulation; most member variables are declared private precisely so that no external class can access them.
Inheritance: extends
Definition: Java inheritance is the most significant feature of object-oriented. Inheritance is the derivation of a new class from an existing class. The new class can absorb the data attributes and behaviors of the existing class, and can expand new capabilities.
Function: Inheritance technology makes it very easy to reuse previous code, which can greatly shorten the development cycle and reduce development costs.
Keyword super: indicates parent class
Upcasting: Assignment that safely changes a subclass type to a parent class type is called upcasting.
Downcasting: If a parent class type is forced to a subclass type, it is a downward cast.
Operator instanceof
The difference between inheritance and combination: inheritance is an is relationship, and combination is a has relationship.
Polymorphism
Definition: The same operation acts on different objects and can have different interpretations and produce different execution results. At runtime, you can call methods in a derived class through a pointer to the base class.
Function: Allows adding more types of subclasses to implement functional expansion without modifying the code based on the parent class. Treating different subclass objects as parent classes can shield the differences between different subclass objects, write common code, and make common programming to adapt to changing needs.
Methods to achieve polymorphism: virtual functions, abstract classes, overrides, templates
Overriding
Definition: In an inheritance relationship, if a subclass defines a method with exactly the same signature as the parent class method, it is called overriding.
Override Object methods
toString(): Output instance as String
equals(): Determine whether two instances are logically equal
hashCode(): Calculate the hash value of an instance
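The three `Object` methods above are usually overridden together; a minimal sketch (the `Point` class is an invented example) keeps the contract that equal objects must have equal hash codes.

```java
import java.util.Objects;

public class Point {
    private final int x, y;

    public Point(int x, int y) { this.x = x; this.y = y; }

    @Override public String toString() {          // readable String form of the instance
        return "Point(" + x + ", " + y + ")";
    }

    @Override public boolean equals(Object o) {   // logical equality, not identity
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    @Override public int hashCode() {             // must agree with equals()
        return Objects.hash(x, y);
    }

    public static void main(String[] args) {
        System.out.println(new Point(1, 2).equals(new Point(1, 2))); // true
        System.out.println(new Point(1, 2)); // Point(1, 2)
    }
}
```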
tax example
Keyword final: Modifying method with final can prevent it from being overwritten by subclasses
abstract class
Definition: A class modified with abstract is an abstract class.
Function: Because the abstract class itself is designed to be inherited only, the abstract class can force the subclass to implement the abstract method it defines, otherwise the compilation will report an error. Therefore, abstract methods are actually equivalent to defining "specifications".
abstract programming
keywordabstract
interface
Definition: Interface is a pure abstract interface that is more abstract than an abstract class, because it cannot even have fields.
The difference between abstract class and interface
Keywordinterface
default method
Static fields and methods
Static field: static field
static fields of interface
Static method: static method
Initialization sequence of static variables and static methods
Package
Java defines a namespace called package: package. A class always belongs to a certain package. The class name (such as Person) is just an abbreviation. The real complete class name is package name.class name. Use packages to resolve name conflicts
Package scope: Classes located in the same package can access the fields and methods of the package scope. Fields and methods that are not modified by public, protected, or private are package scopes.
import keyword: import lets you write Xyz instead of the fully qualified abc.def.Xyz; it is purely a naming convenience and has nothing to do with access permissions.
Scope
The modifiers public, protected, and private can be used to limit the access scope.
Local variables: Variables defined inside a method are called local variables. The scope of local variables starts from the declaration of the variable and ends with the corresponding block. Method parameters are also local variables.
final modifier: Java also provides a final modifier. final does not conflict with access permissions and has many uses.
Decorating a class with final can prevent it from being inherited.
Decorating a method with final prevents it from being overridden by subclasses
Modifying a field with final prevents it from being reassigned
Decorating local variables with final prevents them from being reassigned
classpath and jar
module
Comparisons
Abstract classes and interfaces
Essence: Abstraction is the abstraction of classes and a template design; interface is the abstraction of behavior and a specification of behavior.
abstract class abstract modification
Provide a common type for subclasses
Encapsulate repeated properties and methods in subclasses
Abstract classes do not necessarily contain abstract methods, but classes with abstract methods must be abstract classes and cannot be instantiated.
Constructors and class methods (methods modified with static) cannot be declared as abstract methods
Define abstract methods
abstract modified method
A subclass of an abstract class, unless it is also an abstract class, must implement the methods declared by the abstract class
There must be an "is-a" relationship between the parent class and the derived class, that is, the parent class and the derived class should be conceptually the same.
interface interface modification
Interface methods are public by default. Before Java 8 an interface could not contain any method implementation, whereas abstract classes can have non-abstract methods; since Java 8 interfaces may also carry default and static method bodies.
In jdk9, the private method and private static method introduced
The implementation of the interface must be through a subclass. The subclass uses the keyword implements, and the interface can be implemented in multiple ways.
If a class implements an interface, it must implement all methods of the interface, but abstract classes do not necessarily
An interface cannot be instantiated with new, but it can be declared, but it must refer to an object that implements the interface.
Member variables in the interface can only be of public static final type
The implementer only implements the behavioral contract defined by the interface, a "like-a" relationship
The difference between overloading and overriding
Overloading: occurs in the same class, the method name must be the same, the parameter type and number are different
Overriding: occurs between parent and child classes; the method name and parameter list must be identical, and the return type must be the same as, or a subtype of, the parent method's return type (covariant returns)
If the parent class method access modifier is private, it will not be overridden in the subclass.
Compare with process-oriented
Process-oriented: higher performance, but harder to maintain, reuse, and extend than object-oriented code
Object-oriented: easy to maintain, reuse, and extend, at some performance cost compared with process-oriented code
Data types and operations
Basic data types
integer value
byte, short, int, long
Character type
char
boolean
boolean
floating point
double, float
type of packaging
Byte, Short, Integer, Long
Boolean
Character
Float, Double
The difference between basic types and packaging types
Wrapper types can be null, basic types cannot
Wrapper types can be used with generics, basic types cannot
Primitive types are more efficient than wrapped types
Basic types store specific values directly on the stack, while wrapper types store references in the heap.
The values of two wrapper types can be the same but not equal
Integer chenmo = new Integer(10);
Integer wanger = new Integer(10);
System.out.println(chenmo == wanger);      // false
System.out.println(chenmo.equals(wanger)); // true
Packing and unboxing
The process of converting basic types into wrapped types is called boxing
The process of converting a wrapped type into a basic type is called unboxing
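Both conversions are implicit in modern Java; a small sketch (the class `BoxingDemo` is invented) shows them, plus the classic `Integer` cache pitfall that makes `==` unreliable for boxed values.

```java
public class BoxingDemo {
    public static void main(String[] args) {
        Integer boxed = 127;          // autoboxing: compiler inserts Integer.valueOf(127)
        int raw = boxed;              // auto-unboxing: compiler inserts boxed.intValue()
        System.out.println(raw);      // 127

        // Values in [-128, 127] come from a cache, so == may "work" by accident...
        Integer a = 127, b = 127;
        System.out.println(a == b);   // true (same cached instance)

        // ...but outside the cache == compares distinct objects; always use equals().
        Integer c = 128, d = 128;
        System.out.println(c.equals(d)); // true
    }
}
```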
reference type
kind
interface
array
Array type: int[] ns = new int[5]
Command line parameters: String[] args
String
String objects are immutable
String is modified by final and therefore cannot be inherited.
StringBuffer and StringBuilder
StringBuilder has higher performance than StringBuffer, but is thread-unsafe
String concatenation:
Empty value null and empty string
enumerate
mark
Keywords
var
Operation
Integer operations
Arithmetic
Self-increment/self-decrement: ++, --
Displacement
Bit operations
Related concepts
Operation priority
overflow
type conversion
Floating point arithmetic
Arithmetic
Related concepts
overflow
forced transformation
type promotion
Boolean operations
Comparison operators: >, >=, <, <=, ==, !=
AND operation: &&
OR operation: ||
Not operation: !
Ternary operator: b ? x : y
Related concepts
Short-circuit operation: If the result of a Boolean operation expression can be determined in advance, subsequent calculations will not be executed and the result will be returned directly.
Relational operation precedence
reference type
strong reference
Default type
GC Roots reachability analysis: reachable, still referenced and not recycled; unreachable, scenario of recycling all programs, basic objects, custom objects, etc.
String str = "xxx";
soft reference
Slightly weaker than strong references
If the memory is sufficient, it will not be recycled; if the memory is insufficient, it will be recycled.
Generally used for resources that are very sensitive to memory, and are used in many cache scenarios: web page caching, image caching, etc.
SoftReference<String> softReference = new SoftReference<String>(new String("xxx")); System.out.println(softReference.get());
weak reference
The life cycle is shorter than that of soft references and can only survive until the next garbage collection.
Objects with a short life cycle, such as Key in ThreadLocal
WeakReference<String> weakReference = new WeakReference<String>(new String("Misout's blog"));
System.gc();
if (weakReference.get() == null) {
    System.out.println("weakReference has been recycled by GC");
}
// Result: weakReference has been recycled by GC
virtual reference
Must be used in conjunction with a reference queue
There are currently no usage scenarios in the industry. It may be used internally by the JVM team to track JVM garbage collection activities.
It will be recycled at any time. Once created, it may be recycled soon.
Function: In order to better manage the memory of objects and perform better garbage collection
PhantomReference<String> phantomReference = new PhantomReference<String>(new String("Misout's blog"), new ReferenceQueue<String>()); System.out.println(phantomReference.get()); //The result is always Null
process control
input and output
Output: System.out.println (output content)
Input: Scanner scanner = new Scanner(System.in); String name = scanner.nextLine();
if judgment
if ... else if ..
Determine reference type equality: equals() method
switch multiple selections
The switch statement can also match strings. When matching strings, it compares "content is equal"
Fall-through (cases without break fall into the next case)
cycle
while loop
do while loop
for loop
Loop control statement
break: jump out of the current loop
continue: End this loop early and continue executing the next loop directly.
Java core classes
String
Create: String s1 = "Hello"
Compare: s1.equals(s2)
Search: s1.contains(s2) / s1.indexOf(s2) / s1.lastIndexOf(s2) / s1.startsWith(s2) / s1.endsWith(s2)
Extraction: s1.substring(index1, index2)
Remove leading and trailing blanks: s1.trim() / s1.strip()
replace string
replaceFirst(String regex,String replacement)
Replace substring: s1.replace(s2, s3) / s1.replaceAll(regex, replacement)
Split: s1.split(regex)
Splicing: String.join("splicing symbol", String[])
Get string length
str.length()
Get the character at a certain position in the string
str.charAt(4)
Get a substring of a string
str.substring(2,5)
String comparison
Find the position of a substring in a string
str.indexOf('a')
str.indexOf('a',2)
str.lastIndexOf('a')
str.lastIndexOf('a',2)
Case conversion of characters in a string
toLowerCase()
toUpperCase()
Remove spaces from both ends of string
trim()
Split string into string array
split(String str)
Basic type conversion to string
String.valueOf(12.99)
Basic type conversion to String: String.valueOf(basic type)
String is converted to basic types: Integer.parseInt(s1) / Double.parseDouble(s1) / Boolean.parseBoolean(s1)
String -> char[]: char[ ] cs = s1.toCharArray()
char[] -> String:String s = new String(cs)
type conversion
Encoding: byte[] b1 = s1.getBytes("Some encoding method")
Decoding: String s1 = new String(b1, "Some encoding method")
encode decode
StringBuilder
Create: StringBuilder sb = new StringBuilder(buffer size);
Write string: sb.append(s1).append(s2)...
Delete string: sb.delete(start_index, end_index)
Conversion: String s1 = sb.toString()
StringJoiner
Create: var sj = new StringJoiner(separator, start symbol, end symbol)
Write string: sj.add(s1)
Conversion: String s1 = sj.toString()
type of packaging
Autoboxing: Integer n = 100; // The compiler automatically uses Integer.valueOf(int)
Automatic unboxing: int x = n; // The compiler automatically uses Integer.intValue()
Base conversion: int x2 = Integer.parseInt("100", 16); // 256, because it is parsed in hexadecimal
static variable
Number
JavaBeans
enum class
Why use enum classes?
Create: enum ClassName{A, B, C...}
Compare: ==
Return constant name: String s = ClassName.A.name()
Returns the ordinal number of the defined constant: int n = ClassName.A.ordinal()
Improve readability: toString()
Enumeration: switch
Record class
record keyword
BigInteger
Create: BigInteger a = new BigInteger("1234567890")
Operations: a.add(b) / a.subtract(b) / a.multiply(b) / a.mod(b) etc.
Conversion: such as longValueExact()
BigDecimal
Create: BigDecimal bd = new BigDecimal("123.4567")
Number of decimal places: bd.scale() // 4, four decimal places
Remove trailing zeros: bd.stripTrailingZeros()
Operation: bd.function(bd2)
Truncate
Compare: bd.equals(bd2)
Commonly used tools
Math calculations: Math
Generate fake random numbers: Random
Generate secure random numbers: SecureRandom
reflection
Class class
There are three ways to obtain a Class instance of a class
Get it directly through the static variable class of a class: Class cls = String.class
If there is an instance variable, it can be obtained through the getClass() method provided by the instance variable: String s = "Hello";Class cls = s.getClass();
If you know the complete class name of a class, you can get it through the static method Class.forName(): Class cls = Class.forName("java.lang.String");
Class instance comparison and instanceof comparison
Obtain the class information of the Object through reflection
Create instances of corresponding types through Class instances
Features of VM dynamically loading classes: In order to load different implementation classes according to conditions during runtime.
Dynamic assembly can be achieved
Reduce code coupling
dynamic proxy
The java.lang.Class class in JDK is one of the core classes provided to implement reflection.
dynamic proxy
At runtime, a target class is created and methods of the target class can be called and extended.
JDK dynamic proxy
Implement the InvocationHandler interface
Rewrite the invoke method and add business logic
Holds target class object
Provide static method to obtain proxy
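The four steps above can be sketched in one file; this is a minimal JDK dynamic proxy (the `Greeter` interface, `SimpleGreeter` target, and `LoggingHandler` names are invented for illustration) that wraps every method call with logging.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class ProxyDemo {
    interface Greeter { String greet(String name); }

    static class SimpleGreeter implements Greeter {
        public String greet(String name) { return "Hello, " + name; }
    }

    // Step 1-3: implement InvocationHandler, hold the target, add logic in invoke()
    static class LoggingHandler implements InvocationHandler {
        private final Object target;
        LoggingHandler(Object target) { this.target = target; }

        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            System.out.println("before " + method.getName()); // added business logic
            Object result = method.invoke(target, args);      // delegate to the target
            System.out.println("after " + method.getName());
            return result;
        }
    }

    // Step 4: a static method that hands out the proxy
    static Greeter createProxy(Greeter target) {
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[]{Greeter.class},
                new LoggingHandler(target));
    }

    public static void main(String[] args) {
        Greeter g = createProxy(new SimpleGreeter());
        System.out.println(g.greet("world")); // Hello, world (with before/after logs)
    }
}
```

Because JDK proxies can only implement interfaces, classes without one need the CGLib approach described next.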
CGLib dynamic proxy
Used ASM (bytecode manipulation framework) to manipulate bytecode to generate new classes
java serialization
Serialization: The process of converting Java objects into byte streams
Deserialization: The process of converting a byte stream into a Java object
JavaException
throwable class
Error class
Generally refers to issues related to virtual machines
Interruptions caused by this type of error cannot be recovered by the program itself. When encountering such an error, it is recommended to terminate the program.
NoClassDefFoundError
Thrown when a class that was present at compile time cannot be found on the classpath by the JVM when it tries to load it at runtime.
Exception class
Represents an exception that the program can handle, can be caught, and possibly recovered
When encountering such an exception, you should handle the exception as much as possible and resume the program instead of arbitrarily terminating the exception.
Unchecked Exception
Refers to a flaw or logic error in a program that cannot be recovered at runtime
RuntimeException
NullPointerException/null pointer exception
NullPointerException occurs when calling a method on a null object
IndexOutOfBoundsException array subscript out of bounds exception
NegativeArraySizeException Negative array length exception
ArithmeticException Mathematical calculation exception
ClassCastException type cast exception
SecurityException violation of security principles exception
Checked Exception
Represents invalid external conditions that the program cannot directly control
ClassNotFoundException
Thrown at runtime when a class is loaded by name (for example via Class.forName()) and cannot be found on the classpath.
NoSuchMethodException method not found exception
IOException input and output exception
NumberFormatException String to number exception
EOFException File ended exception
FileNotFoundException File not found exception
SQLException operation database exception
Throws
Acts on the declaration of a method, indicating that if an exception is thrown, the caller of the method will handle the exception.
The method will throw an exception of some type, so that its users know the type of exception to catch.
Abnormality is a possibility, but it does not necessarily occur
Throw
actually throws an exception
try-catch-finally
Both catch and finally can be omitted, but not both at the same time
finally is always executed; a return in catch does not take effect until the code in finally has finished executing.
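The catch/finally ordering is easy to verify empirically; in this sketch (the `FinallyDemo` class and its `log` parameter are invented) the finally block runs after the return value is fixed but before the method actually exits.

```java
public class FinallyDemo {
    static int demo(StringBuilder log) {
        try {
            throw new IllegalStateException("boom");
        } catch (IllegalStateException e) {
            log.append("catch;");
            return 1;                 // the return value is fixed here...
        } finally {
            log.append("finally;");   // ...but this still runs before the method returns
        }
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        System.out.println(demo(log)); // 1
        System.out.println(log);       // catch;finally;
    }
}
```

Note that a return inside finally itself would silently discard the catch block's return value, which is why returning from finally is widely discouraged.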
JavaEE basics
hierarchical model
Classic three layers
web layer
Servlet/JSP
Business Logic Layer (BLL)
EJB
persistence layer
JDBC, MyBatis, Hibernate
four-layer model
Presentation layer (PL)
Service layer (Service)
Business Logic Layer (BLL)
Data Access Layer (DAL)
five-layer model
Domain Object layer
DAO (data access layer)
business logic layer
MVC control layer
Presentation layer
Layered application
JPA (Java Persistence API)
JPA implementation
JMS (Java Message Service)
JCA (JavaEE Connector Architecture)
Managed beans, EJB
CDI (Contexts and Dependency Injection)
Web layer
Servlets, JSPs
Web Services
JavaIO/NIO
IO
streaming
byte stream
InputStream
PipedInputStream
ByteArrayInputStream
FileInputStream
FilterInputStream
BufferedInputStream
DataInputStream
PushBackInputStream
StringBufferInputStream
SequenceInputStream
ObjectInputStream
OutputStream
ObjectOutputStream
PipedOutputStream
FilterOutputStream
BufferedOutputStream
DataOutputStream
PrintStream
FileOutputStream
ByteArrayOutputStream
character stream
Writer
FilterWriter
StringWriter
PipedWriter
OutputStreamWriter
FileWriter
CharArrayWriter
BufferedWriter
PrintWriter
Reader
BufferedReader
LineNumberReader
CharArrayReader
InputStreamReader
FileReader
PipedReader
StringReader
FilterReader
non-streaming
file
other
SerializablePermission
FileSystem
Character Encoding
ASCII, American Standard Code for Information Interchange
Latin code, ISO8859-1
Chinese code, GB2312/GBK/GB18030
International Standard Code,Unicode
UTF-8, UTF-16, variable length encoding according to Unicode
IO model
BIO (Blocking I/O): Synchronous blocking I/O mode
NIO (New I/O): Synchronous non-blocking I/O model
Channel bidirectional channel
Buffer
Selector
Multiplexing
select
Schematic diagram
Shortcomings: 1) the number of file descriptors that can be monitored has a hard upper limit (1024); 2) descriptors are scanned linearly, which is inefficient
poll
epoll
Whenever an FD becomes ready, the kernel invokes a callback that places the fd directly into the ready list, which is more efficient; there is no limit on the maximum number of connections
AIO (Asynchronous I/O): Asynchronous non-blocking IO model
Signal driven IO model
Asynchronous IO model
Pseudo-asynchronous IO model
formal parameters and actual parameters
Parameters that appear in a function definition are called formal parameters. Parameters that appear in a function call are called actual parameters.
Formal parameter variables only allocate memory units when they are called. At the end of the call, the allocated memory units are released immediately.
Actual parameters can be constants, variables, expressions, functions, etc. No matter what type of quantities the actual parameters are, they must have definite values when making function calls so that these values can be transferred to the formal parameters.
In the general call-by-value mechanism, only the actual parameters can be transferred to the formal parameters, but the value of the formal parameters cannot be transferred in the reverse direction to the actual parameters. Therefore, during the function call, the formal parameter value changes, but the value in the actual parameter does not change. In the call-by-reference mechanism, the address of the actual parameter reference is passed to the formal parameter, so any changes that occur on the formal parameter actually occur on the actual parameter variable.
Java Web Basics
Servet JDBC Application (3.1)
db.properties file (properties file)
.properties file content:
# driver class
driverClassName=oracle.jdbc.OracleDriver
# connection URL
url=jdbc:oracle:thin:@localhost:1521/XE
# user name
username=system/litao
# password
password=123456
# optional settings
# initial number of pooled connections
initialSize=5
# maximum number of connections
maxActive=100
# maximum number of idle connections kept
maxIdle=10
# minimum number of idle connections kept
minIdle=5
# timeout in milliseconds
maxWait=10000
Import 3 packages
Write connection pool tool class
DbcpUtil
com.xdl.util package
Complete the function of logging in to a bank account
Create a bank account table in the database
Create an init.sql
/** Create the bank account table; drop it first if it exists */
drop table xdl_bank_account cascade constraints;
create table xdl_bank_account(
  id number constraint xdl_bank_account_id_pk primary key,
  acc_no varchar(30) constraint xdl_bank_account_acc_no_uk unique,
  acc_password varchar(30),
  acc_money number
);
/** Create a sequence for this table; drop it first if it exists */
drop sequence xdl_bank_account_id_seq;
create sequence xdl_bank_account_id_seq;
/** Insert test data */
insert into xdl_bank_account values(xdl_bank_account_id_seq.nextval, 'malaoshi', '17', 10000000);
commit;
Create a project and create entity classes based on tables (to put it bluntly, encapsulation)
Naming format: remove the underscores from the table name and capitalize each word's first letter. Field names match the table columns; generate constructors, getters/setters, and toString(), and implement Serializable for serialization.
com.xdl.bean package
Define the design of DAO interface methods
Naming format: add DAO after the entity class Only do one of the things of adding, deleting, modifying and checking
com.xdl.Dao
According to the DAO interface, combine DbcpUtil and JDBC programming in five steps to complete the DAO implementation class
Naming format: entity class name + DAO + Imp
Database connection code:
1. Load the driver
2. Get the connection
3. Define the SQL and obtain its precompiled environment (PreparedStatement); set parameters with setXXX()
4. Execute the SQL and process its return value (for select, traverse the result set)
5. Release resources
com.xdl.dao.imp
Write business logic class Service to encapsulate business methods
Naming format: entity class name + function + Service
1. Hold a DAO reference
2. Assign a value to the DAO
3. Encapsulate a business method (return dao.method())
com.xdl.service
Test class
com.xdl.test
Write an html page to issue a login request
Write a Servlet to receive user request parameters and call the method in the Service based on the parameters to see if the login is successful. If successful, the output will be login success. Otherwise, login failed.
Naming format: entity class name + Servlet
0. Set the encoding
1. Obtain the request parameters
2. Use the service to make the judgment
3. Encapsulate the parameters into objects
4. Obtain an output object and write the result to the browser
com.xdl.servlet
Status management
Why should we have state management?
Because the HTTP protocol is stateless: once the response is sent, the connection to the server is closed, and one request knows nothing about another request's data. But sometimes the data state of a previous request is needed — for example, when shopping, the goods added in earlier requests must still be known at checkout.
How to implement state management technology
Client-based state management technology
Cookies
principle
When the browser first requests the server, the server creates a Cookie object and passes it to the browser via the Set-Cookie response header. When the browser requests that server again, it carries the cookie back, so the server can learn the previous data state.
How to achieve?
Create a Cookie object Cookie cookie=new Cookie("key","value");
How to write to client response.addCookie(cookie);
summary
How to get the corresponding cookie in the request Cookie [] cookies =request.getCookies();
Get the cookie's name and value: getName(), getValue(); set the value: setValue("string value")
summary
Cookie life cycle issues
The default is the same as the browser lifetime. It will disappear when the browser is closed. The value is -1.
setMaxAge(int seconds); (seconds)
After the cookie expires, the browser will no longer carry it (How to set a three-month lifetime? setMaxAge(60*60*24*93);) Setting the lifetime to 0 means deleting the cookie immediately
Cookie path problem
The default path where the cookie is located is the path where the servlet is located.
The rule for carrying cookies is that the cookies under this path and the cookies under the parent path corresponding to this path will be carried.
You can modify the default path of Cookie through setPath("/path")
Such as: setPath("/servlet-day04") It means that the cookie is placed under the project All requests for this project will carry this cookie.
What does it mean if path is written as /? Representative for all requests
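The lifetime and path settings above can be sketched without a servlet container using `java.net.HttpCookie`, which has the same name/value/maxAge/path accessors (a real servlet would use `javax.servlet.http.Cookie` plus `response.addCookie`):

```java
import java.net.HttpCookie;

// Sketch of the cookie settings above using the JDK's HttpCookie.
public class CookieDemo {
    public static HttpCookie threeMonthCookie(String name, String value) {
        HttpCookie cookie = new HttpCookie(name, value);
        cookie.setMaxAge(60L * 60 * 24 * 93); // ~three months, in seconds
        cookie.setPath("/");                  // carried on all requests
        return cookie;
    }
    public static void main(String[] args) {
        HttpCookie c = threeMonthCookie("user", "malaoshi");
        System.out.println(c.getName() + "=" + c.getValue()
                + " maxAge=" + c.getMaxAge() + " path=" + c.getPath());
    }
}
```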
Server-based state management technology
Session
basic knowledge
Request parameters
Chinese garbled problem of request parameters
Tomcat 8: no encoding (mojibake) problem with the GET method
Tomcat 8: encoding problem with the POST method
On POST, the data is submitted in UTF-8, but Tomcat decodes it as ISO-8859-1 by default.
HttpServletRequest This type of object provides the corresponding API for tomcat to set the decoding encoding.
request.setCharacterEncoding("utf-8")
Tell tomcat to decode in utf-8
Must appear before getting parameters
This is only for post method
acc_no = new String(acc_no.getBytes("iso-8859-1"), "utf-8");
First recover the original bytes using ISO-8859-1, then decode them as UTF-8
This is a common solution
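A minimal sketch of that common solution: because ISO-8859-1 round-trips every byte, we can recover the raw UTF-8 bytes from the mis-decoded string and re-decode them correctly.

```java
import java.nio.charset.StandardCharsets;

// Fixing Tomcat's default ISO-8859-1 decoding of UTF-8 request bytes.
public class EncodingFix {
    public static String fix(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1); // recover raw bytes
        return new String(raw, StandardCharsets.UTF_8);             // re-decode as UTF-8
    }
    public static void main(String[] args) {
        String original = "马老师"; // what the client sent (UTF-8)
        String garbled = new String(original.getBytes(StandardCharsets.UTF_8),
                                    StandardCharsets.ISO_8859_1); // what Tomcat produced
        System.out.println(fix(garbled)); // 马老师
    }
}
```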
Works for GET and POST on both Tomcat 7 and Tomcat 8
Obtaining request parameters
Obtaining request parameters in String String data= request.getParameter("name") Get the corresponding request parameters based on the name corresponding to a single value
String [] datas=request.getParameterValues("name"); Get the corresponding request parameter and the array of corresponding values according to the name
HttpServletRequest API to obtain request header information
getMethod()
Get request method
getServletPath();
Get the request path of the servlet, which is equivalent to the url-pattern
getContextPath()
/Item name
getServerPort()
The port number
getServerName()
Get the server (host) name
getRemoteAddr()
Get remote client address
getLocalAddr()
Get server address
getRequestURL()
Uniform Resource Locator (URL)
http://localhost:8888/Web02/Zhuce
Protocol Host Port Project Request
getRequestURI()
Uniform Resource Identifier
/Web02/Zhuce
Project Request
Forwarding and Redirecting
the difference
The sendRedirect() method in the HttpServletResponse interface is used to implement redirection
Big data, cloud computing and other expansions
Operations and integration
Continuous integration (CI/CD)
Version Management Tool (SCM)
Git
SVN
warehouse management
GitLab
Maven repository manager
Apache Archiva
JFrog Artifactory
Sonatype Nexus
Build tools
Maven
Ant
Gradle
Code detection
SonarQube
Automated release
Jenkins
test
Distributed testing
Full link stress test
Integration Testing
Encryption Algorithm
AES
Advanced Encryption Standard (AES) is the most common symmetric encryption algorithm
Symmetric encryption algorithm uses the same key for encryption and decryption
RSA
RSA encryption algorithm is a typical asymmetric encryption algorithm, which is based on the mathematical problem of factoring large numbers. It is also the most widely used asymmetric encryption algorithm.
Asymmetric encryption uses two keys (public key-private key) to encrypt and decrypt data.
The public key is used for encryption and the private key is used for decryption.
CRC
Cyclic Redundancy Check (CRC) is a hash function that generates a short, fixed-length check code from a network packet or computer file; it is mainly used to detect or verify errors that may occur during data transmission or storage.
It uses the principles of division and remainder for error detection.
MD5
MD5 often appears as a file signature. When downloading a file, the page often provides a small text file with the .MD5 extension, or a line of characters: the MD5 digest computed over the entire file. After downloading, we can run an MD5 checksum tool over the downloaded file ourselves.
Comparing the two results confirms the integrity of the downloaded file.
Another common use is the encryption of sensitive website information, such as usernames and passwords, payment signatures, etc.
With the spread of HTTPS, websites now widely transmit data in plaintext from front end to back end and protect sensitive data with salted MD5 hashing, protecting the site and its data.
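A minimal sketch of computing an MD5 digest with the JDK's `MessageDigest`, with an optional salt as described above (for real password storage, a slow salted hash such as bcrypt is preferred over MD5):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// MD5 hex digest of (input + salt) using the JDK's MessageDigest.
public class Md5Demo {
    public static String md5Hex(String input, String salt) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest((input + salt).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is always present in the JDK
        }
    }
    public static void main(String[] args) {
        System.out.println(md5Hex("abc", "")); // 900150983cd24fb0d6963f7d28e17f72
    }
}
```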
new technology
Blockchain Technology (Java Edition)
Blockchain application
Bitcoin
Ethereum
Hyperledger
big data technology
Big Data
Hadoop
MapReduce
Hadoop MapReduce job life cycle
Client
JobTracker
TaskTracker
Task
Implementation process
1. Read the MapTask intermediate results from the remote node (called the "Shuffle stage");
2. Sort key/value pairs by key (called the "Sort stage");
3. Read each <key, value list> in sequence, call the user-defined reduce() function to process it, and save the final result to HDFS
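The three stages above — map to key/value pairs, shuffle/sort by key, then reduce each <key, value list> — can be illustrated in plain Java (this is an analogy with streams, not the Hadoop API):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Word count as a plain-Java analogy of map -> shuffle/sort -> reduce.
public class WordCount {
    public static Map<String, Long> run(List<String> lines) {
        return lines.stream()
                // map stage: emit one (word, 1) pair per word
                .flatMap(line -> Arrays.stream(line.split("\\s+")))
                // shuffle + sort stage: group by key into a key-sorted map;
                // reduce stage: sum each key's value list
                .collect(Collectors.groupingBy(w -> w, TreeMap::new, Collectors.counting()));
    }
    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a b a", "b a"))); // {a=3, b=2}
    }
}
```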
HDFS
Client
NameNode
Secondary NameNode
DataNode
YARN was introduced in Hadoop 2.0
HBase
Spark
core architecture
Spark Core
Spark SQL
Spark Streaming
MLlib
GraphX
core components
Cluster Manager-Control the entire cluster and monitor workers
Worker node-responsible for controlling computing nodes
Driver: Run the main() function of Application
Executor
The executor
A process that runs on a worker node on behalf of an Application
SPARK programming model
1. Users use the API provided by SparkContext (commonly used ones include textFile, sequenceFile, runJob, stop, etc.) to write the Driver application program. In addition, SQLContext, HiveContext and Streaming Context encapsulate SparkContext and provide APIs related to SQL, Hive and streaming computing.
2. User applications submitted using SparkContext will first use BlockManager and BroadcastManager to broadcast the Hadoop configuration of the task. The tasks are then converted into RDDs by DAGScheduler and organized into DAGs. The DAGs will also be divided into different Stages. Finally, TaskScheduler uses ActorSystem to submit the task to the cluster manager (Cluster Manager).
3. The cluster manager (ClusterManager) allocates resources to tasks, that is, allocates specific tasks to workers, and workers create Executors to handle the running of tasks. Standalone, YARN, Mesos, EC2, etc. can all be used as Spark cluster managers.
SPARK calculation model
SPARK operation process
1. Build the running environment system of Spark Application and start SparkContext
2.SparkContext applies for running Executor resources from the resource manager (can be Standalone, Mesos, Yarn) and starts StandaloneExecutorbackend.
3.Executor requests Task from SparkContext
4.SparkContext distributes applications to Executor
5.SparkContext is constructed into a DAG graph, the DAG graph is decomposed into stages, the Taskset is sent to the Task Scheduler, and finally the Task Scheduler sends the Task to the Executor for running
6.Task runs on the Executor and all resources are released after running.
SPARK RDD process
1. Create RDD object
2. The DAGScheduler module intervenes in the operation to calculate the dependencies between RDDs. The dependencies between RDDs form a DAG.
3. Each Job is divided into multiple stages. A main basis for dividing stages is whether the input of the current calculation factor is certain. If so, divide it into the same stage to avoid the message passing overhead between multiple stages.
Hive
Big data search
Lucene
ElasticSearch
Features
Based on Lucene basic architecture
Lucene is the progenitor of Java search
High real-time search performance
Regex and substring queries; in-memory indexes
member
Document: a document (analogous to a table row)
Index: the index (key-value data)
Analyzer: the tokenizer (word segmentation)
Solr
Features
Less suited to real-time search
Nutch
YARN
ResourceManager
1. ResourceManager is responsible for resource management and allocation of the entire cluster and is a global resource management system.
2. NodeManager reports resource usage to ResourceManager in a heartbeat manner (currently mainly CPU and memory usage). RM only accepts resource return information from NM, and leaves specific resource processing to NM itself.
3. YARN Scheduler allocates resources to applications based on their requests and is not responsible for the monitoring, tracking, running status feedback, startup, etc. of application jobs.
NodeManager
1. NodeManager is the resource and task manager on each node. It is the agent that manages this machine and is responsible for the running of the node program, as well as the management and monitoring of the node resources. Each node in the YARN cluster runs a NodeManager.
2. NodeManager regularly reports the usage of resources (CPU, memory) of this node and the running status of Container to ResourceManager. When ResourceManager goes down, NodeManager automatically connects to the RM backup node.
3. NodeManager receives and processes various requests from ApplicationMaster such as Container start and stop.
ApplicationMaster
1. Responsible for negotiating with the RM scheduler to obtain resources (represented by Container).
2. Further allocate the obtained tasks to internal tasks (secondary allocation of resources).
3. Communicate with NM to start/stop tasks.
4. Monitor the running status of all tasks, and re-apply resources for the task to restart the task when the task fails.
5. Currently, YARN comes with two ApplicationMaster implementations. One is the instance program DistributedShell used to demonstrate AM writing methods. It can apply for a certain number of Containers to run a Shell command or Shell script in parallel; the other is to run MapReduce applications. AM—MRAppMaster.
artificial intelligence technology
Neural Networks
machine learning
deep learning
Commonly used frameworks
DL4J
machine learning algorithm
decision tree
random forest algorithm
logistic regression
SVM
Naive Bayes
K nearest neighbor algorithm
K-means algorithm
Adaboost algorithm
Markov
Mathematical basis
Application scenarios
cloud computing
virtual machine
JRockitVM
HotSpotVM
cloud native
kubernetes
Docker
Docker uses a client-server (C/S) architecture model, using remote APIs to manage and create Docker containers.
Docker containers are created from Docker images.
Docker’s four network modes
Host
Container
None
Bridge mode
Storage drivers supported by Docker
AUFS
devicemapper
overlay2
zfs
vfs
Cloud architecture classification
public cloud
Purchase/Solution - Upper Level
Huge computing resources
Company A builds a cloud computing environment---->Company B/C/D
Abroad: aws/google/azure
Domestic: Aliyun / Tencent Cloud / Huawei Cloud / Kingsoft Cloud
Private Cloud
Deployment and construction-bottom layer
Private data/important data
Company A builds - Company A uses
Abroad: VMware vCloud
Domestic: Huawei DevCloud/H3C Cloud
Hybrid Cloud: public cloud + private cloud
Cloud computing business model
SaaS
Charge on a per-use basis by providing services that meet end-user needs
Internet application
Example
Salesforce
Huawei Cloud: WeLink
PaaS
Provide application running and development environment
Provide components for application development (e.g. database)
Example
Microsoft: Visual Studio Tools for Azure
Huawei Cloud: Devcloud software development cloud
IaaS
Rent computing, storage, network and other IT resources
Pay per use
Example
Amazon EC2 cloud host
Huawei Cloud: ECS
Openstack
IoT
Java Collections
collection framework
Frame structure diagram
Collection
List
ArrayList
Orderly and repeatable
Underlying the use of arrays
The get() and set() methods are fast, but insertion and deletion are slow.
Not thread safe
When capacity is insufficient, ArrayList grows to about 1.5× the current capacity (1.5× + 1 before JDK 7; oldCapacity + (oldCapacity >> 1) since JDK 7)
CopyOnWriteArrayList
Writes operate on a copy of the array; supports efficient concurrent reads and is thread-safe
Read lock-free ArrayList
Suitable for use in scenarios where read operations are much larger than write operations, such as caching
There is no concept of expansion. Each write operation requires a copy, so the write operation performance is poor.
Implemented interfaces such as List, RandomAccess, Cloneable, java.io.Serializable, etc.
Writing methods such as add/remove need to be locked. The purpose is to avoid copying N copies and causing concurrent writing.
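A minimal sketch of the snapshot behavior described above: because each write copies the backing array, an iterator sees a stable snapshot and never throws ConcurrentModificationException.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// CopyOnWriteArrayList iterators are snapshots of the array at creation time.
public class CowDemo {
    public static List<String> snapshotIteration() {
        List<String> list = new CopyOnWriteArrayList<>(new String[]{"a", "b"});
        List<String> seen = new ArrayList<>();
        Iterator<String> it = list.iterator(); // snapshot of ["a", "b"]
        list.add("c");                         // the write goes to a new copy
        while (it.hasNext()) seen.add(it.next());
        return seen;                           // ["a", "b"], not ["a", "b", "c"]
    }
    public static void main(String[] args) {
        System.out.println(snapshotIteration());
    }
}
```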
LinkedList
Orderly and repeatable
The bottom layer uses a doubly linked list (circular, with a header node, before JDK 7)
Random access (get()) is slow; insertion and deletion (add(), remove()) are fast
Not thread safe
Vector
Orderly and repeatable
Underlying the use of arrays
Random access is fast; insertion and deletion are slow
Thread safety, low efficiency
When the capacity is not enough, Vector expands to double its capacity by default.
Stack
The bottom layer is also an array, inherited from Vector
First in, last out, default capacity is 10
Application: Evaluating postfix expressions
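A minimal sketch of the stack application named above: evaluating a space-separated postfix (reverse Polish) expression. A Deque is used as the stack, as the JDK recommends over java.util.Stack.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Evaluate a space-separated postfix expression with an operand stack.
public class PostfixEval {
    public static int eval(String expr) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String token : expr.trim().split("\\s+")) {
            switch (token) {
                case "+": case "-": case "*": case "/": {
                    int b = stack.pop(), a = stack.pop(); // order matters for - and /
                    switch (token) {
                        case "+": stack.push(a + b); break;
                        case "-": stack.push(a - b); break;
                        case "*": stack.push(a * b); break;
                        default:  stack.push(a / b);
                    }
                    break;
                }
                default: stack.push(Integer.parseInt(token)); // operand
            }
        }
        return stack.pop();
    }
    public static void main(String[] args) {
        System.out.println(eval("3 4 + 2 *")); // (3 + 4) * 2 = 14
    }
}
```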
Set
HashSet
Arranged in no order and cannot be repeated
The bottom layer is implemented using a Hash table, and the internal one is a HashMap.
Fast access speed
TreeSet
Arranged in no order and cannot be repeated
The bottom layer is a red-black tree (via TreeMap)
Sorting storage, internally TreeMap and SortedSet
LinkedHashSet
Hash table storage plus a doubly linked list that records insertion order
Inherited from HashSet, internally LinkedHashMap
Queue
A queue can be implemented with arrays or linked lists; a deque additionally allows insertion and removal at both ends.
DelayQueue
It is a delayed blocking queue under the Java concurrent package, often used to implement scheduled tasks.
DelayQueue implements BlockingQueue
Mainly implemented using priority queues, supplemented by reentrant locks and conditions to control concurrency safety
PriorityQueue
A min-heap (small-top heap); not thread-safe
Not fully sorted; only the smallest element is guaranteed to be at the top of the heap
PriorityBlockingQueue
It is a priority blocking queue under the Java concurrent package, which is thread-safe.
ArrayDeque (double-ended queue)
It is a special queue that can enter and exit elements at both ends, hence the name double-ended queue.
It is a double-ended queue implemented as an array, which is not thread-safe.
ArrayDeque implements the Deque interface, which inherits from the Queue interface.
It can be used directly as a stack. Enqueue and dequeue reuse the array cyclically via the head and tail pointers.
When the capacity is insufficient, it will be expanded. Each expansion will double the capacity.
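A minimal sketch of the dual use described above: the same ArrayDeque serves as a stack (push/pop at the head, LIFO) or a queue (offer at the tail, poll at the head, FIFO).

```java
import java.util.ArrayDeque;

// ArrayDeque as both a LIFO stack and a FIFO queue.
public class DequeDemo {
    public static String asStack() {
        ArrayDeque<String> stack = new ArrayDeque<>();
        stack.push("a"); stack.push("b"); stack.push("c");
        return stack.pop() + stack.pop() + stack.pop(); // LIFO: "cba"
    }
    public static String asQueue() {
        ArrayDeque<String> queue = new ArrayDeque<>();
        queue.offer("a"); queue.offer("b"); queue.offer("c");
        return queue.poll() + queue.poll() + queue.poll(); // FIFO: "abc"
    }
    public static void main(String[] args) {
        System.out.println(asStack() + " " + asQueue());
    }
}
```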
Map
HashMap
Keys cannot be repeated, values can be repeated
Underlying hash table, internal array, singly linked list
In JDK 8, a bucket's linked list is converted to a red-black tree when its length exceeds 8
The Key value is allowed to be null, and the value can also be null.
Use the hash algorithm to configure the storage address according to hashCode()
Not thread safe
Concurrent writes can be lost
One thread's modification can be overwritten by another's
HashTable (hash table/hash table)
Keys cannot be repeated, values can be repeated
underlying hash table
Thread safety
Key and value are not allowed to be null
LinkedHashMap
Implemented based on HashMap and doubly linked list/red-black tree
orderly
insertion order
The order in which they are inserted is the order in which they are read out.
access sequence
After accessing a key, this key comes to the end.
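The access-order behavior above is exactly what makes LinkedHashMap a natural LRU cache. A minimal sketch (capacity and eviction policy are illustrative choices): after a get(), the key moves to the tail, and removeEldestEntry evicts the head when the capacity is exceeded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A tiny LRU cache built on an access-order LinkedHashMap.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;
    public LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true: get() moves the key to the end
        this.capacity = capacity;
    }
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict the least recently used entry
    }
    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");      // "a" becomes most recently used
        cache.put("c", 3);   // evicts "b", the least recently used
        System.out.println(cache.keySet()); // [a, c]
    }
}
```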
WeakHashMap
ConcurrentHashMap
Efficient and thread-safe
Key and value are not allowed to be null
TreeMap
Keys cannot be repeated, values can be repeated
The Key value is required to implement java.lang.Comparable
When iterating, TreeMap defaults to sorting in ascending order by Key value.
NavigableMap implementation based on red-black tree
Red-black trees are balanced binary trees
The average height is log(n), and the worst-case height will not exceed 2log(n).
Red-black trees can perform search, insertion, and deletion operations with a time complexity of O(log2(N))
Any imbalance is restored within three rotations
SortedMap interface
Map of overall ordering of keys
Properties
A HashTable subclass whose keys and values are Strings
Stream
Stream basic operations
MapReduce
Distributed computing model, mainly used in the search field
Generic erasure
The difference between generics and Object
Generic declaration
public <T> T doSomeThing(T t){return t; }
Generic reference
No more forced conversion, type safety is automatically checked at compile time, and implicit type conversion exceptions are avoided.
Object declaration
public Object doSomeThing(Object obj){ return obj; }
Object reference
A type conversion exception (ClassCastException) may occur
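A minimal sketch contrasting the two declarations above: the Object version compiles with an unchecked cast and fails at run time, while the generic version is checked at compile time and needs no cast.

```java
// Object-based API fails at run time; the generic version is compile-time safe.
public class GenericVsObject {
    public static Object identityObject(Object obj) { return obj; }
    public static <T> T identityGeneric(T t) { return t; }

    public static boolean objectCastFails() {
        Object result = identityObject("not a number");
        try {
            Integer n = (Integer) result; // ClassCastException at run time
            return n == null;             // never reached
        } catch (ClassCastException e) {
            return true;
        }
    }
    public static void main(String[] args) {
        String s = identityGeneric("safe"); // no cast needed, checked at compile time
        System.out.println(s + " " + objectCastFails());
    }
}
```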
Differences and Analogies
The difference between HashMap, HashSet, and HashTable
HashSet
HashMap
Hash collision
chain address method
linked list
JDK1.7
Bucket array + linked list
JDK1.8
Bucket array + linked list + red-black tree
HashTable
ConcurrentHashMap
The difference between HashMap, LinkedHashMap, and TreeMap
HashMap
Array + linked list / red-black tree
Values in no order
The key of at most one record is null
In most cases where sorting is not necessary
LinkedHashMap
Array + linked list / red-black tree
Values are taken in the order of insertion/order of modification, controlled by accessOrder
The key of at most one record is null
Use when items must be read out in the same order they were inserted.
TreeMap
red black tree
On insertion, entries are sorted by the natural order of keys or by a custom comparator.
key cannot be null when stored in the natural order of keys
When you need to follow the natural order of keys or even a custom order
The difference between ArrayList and LinkedList
Arraylist is based on Array (dynamic array) structure
The maximum array capacity is Integer.MAX_VALUE-8
The reserved 8 slots are:
① To store the array's object header words
② To avoid out-of-memory errors on some machines, so slightly less is allocated to reduce the chance of failure
③ Integer.MAX_VALUE can still be attempted as a maximum
LinkedList is based on a linked list structure
Adding and deleting data is efficient and only requires changing the pointer.
However, the average efficiency of accessing data is low, and the linked list needs to be traversed.
The difference between ArrayList and Vector
synchronicity
Vector is thread safe
Thread synchronization between methods
ArrayList is not thread-safe
Its methods are not synchronized
data growth
Vector grows twice its original size by default
ArrayList grows 1.5 times its original size
What is the difference between Collection and Collections
java.util.Collection is a collection interface
Provides general interface methods for basic operations on collection objects
java.util.Collections is a wrapper class
Various static methods related to collection operations
Cannot be instantiated, just like a tool class, serving Java's Collection framework
What is the difference between Array and ArrayList
Think of ArrayList as an "Array that automatically expands its capacity"
The size of Array is fixed, and the size of ArrayList changes dynamically.
Array can contain basic types and object types, ArrayList can only contain object types.
Queue is a typical first-in-first-out (FIFO) container
The difference between offer() and add() to add an element to the queue
If you want to add a new element to a full queue
The offer() method will return false
The add() method will throw an unchecked exception
The difference between peek() and element() when returning the head of the queue without removing it
The peek() method returns null when the queue is empty
The element() method throws NoSuchElementException
Remove and return the head of the queue. The difference between poll() and remove()
poll() returns null when the queue is empty
remove() will throw NoSuchElementException exception
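A minimal sketch of the three method pairs above, using a bounded ArrayBlockingQueue of capacity 1 so that both the full-queue and empty-queue behaviors can be shown:

```java
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;

// offer/add on a full queue, peek/element and poll/remove on an empty one.
public class QueuePairs {
    public static String demo() {
        ArrayBlockingQueue<String> q = new ArrayBlockingQueue<>(1);
        StringBuilder out = new StringBuilder();
        q.add("x");
        out.append(q.offer("y"));                     // false: queue is full
        try { q.add("y"); } catch (IllegalStateException e) {
            out.append(",add-throws");                // unchecked exception
        }
        q.poll();                                     // empty the queue
        out.append(",").append(q.peek());             // null on empty queue
        try { q.element(); } catch (NoSuchElementException e) {
            out.append(",element-throws");
        }
        out.append(",").append(q.poll());             // null on empty queue
        try { q.remove(); } catch (NoSuchElementException e) {
            out.append(",remove-throws");
        }
        return out.toString();
    }
    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```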
IteratorIterator
iterator pattern
Separate the traversal behavior of sequence type data structures from the objects being traversed
Iterable
Collection objects that implement this interface support iteration and can be iterated.
Can be used with foreach
Iterator: iterator
An object that provides an iteration mechanism. The specific method of iteration is specified by this Iterator interface.
The relationship between foreach and Iterator
Calling collection remove in foreach will cause the original collection to change and cause an error.
You should use the remove method of the iterator
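A minimal sketch of both cases above: removing through the collection inside a foreach trips the fail-fast check (ConcurrentModificationException), while Iterator.remove() stays consistent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Collection.remove inside foreach fails fast; Iterator.remove is the safe way.
public class SafeRemove {
    public static boolean foreachRemoveFails(List<Integer> list) {
        try {
            for (Integer n : list) {
                if (n % 2 == 0) list.remove(n); // structural change behind the iterator
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
    public static List<Integer> iteratorRemove(List<Integer> list) {
        for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
            if (it.next() % 2 == 0) it.remove(); // safe: iterator stays consistent
        }
        return list;
    }
    public static void main(String[] args) {
        System.out.println(foreachRemoveFails(new ArrayList<>(Arrays.asList(1, 2, 3, 4))));
        System.out.println(iteratorRemove(new ArrayList<>(Arrays.asList(1, 2, 3, 4))));
    }
}
```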
Comparison between for loop and iterator
The get() method in the for loop uses random access method
ArrayList is faster for random access
The next() method in iterator adopts sequential access method
LinkedList is faster for sequential access.
The difference between Iterator and ListIterator
ListIterator iterator
Different scope of use
Iterator can be applied to all collections, Set, List and Map and subtypes of these collections
ListIterator can only be used with List and its subtypes
ListIterator is more powerful and can add, delete, modify and query
Implementation methods and principles
Implementation principle of HashMap
Entry array
storage location
Source code
static int indexFor(int h, int length) { return h & (length-1);}
static final int hash(Object key) { int h; return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16); }
Because of the AND with (length - 1), and because the table length is usually far smaller than 2^16, only the low 16 bits of the hashCode would ever participate in the operation. If the high 16 bits also participate, the resulting index is better distributed.
So (h >>> 16) gets its high 16 bits and performs ^ operation with hashCode()
Because & and | will both bias the result to 0 or 1, which is not a uniform concept, so use XOR
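The two source snippets above combine as follows: perturb the hashCode by XOR-ing in its high 16 bits, then mask with (length - 1) to get the bucket index.

```java
// HashMap-style hash perturbation and bucket index computation.
public class HashIndex {
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
    static int indexFor(int h, int length) {
        return h & (length - 1); // works because the table length is a power of two
    }
    public static void main(String[] args) {
        int h = hash("a");                   // "a".hashCode() == 97; high bits are 0
        System.out.println(h);               // 97
        System.out.println(indexFor(h, 16)); // 97 & 15 == 1
    }
}
```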
Implementation principle of HashSet
No duplicate elements allowed
The value of HashSet is stored in the key of HashMap
The bottom layer of HashSet is implemented by HashMap, which is unordered.
The elements in HashSet are stored on the key of HashMap
value is unified into a meaningless static constant private static final Object PRESENT = new Object();
How to ensure that a collection cannot be modified
Use the Collections.unmodifiableCollection(Collection c) method under the Collections package
Any operation that changes the collection will throw Java.lang.UnsupportedOperationException exception
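A minimal sketch of the behavior described above: the wrapper returned by Collections.unmodifiableCollection rejects every mutating call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

// The unmodifiable wrapper throws UnsupportedOperationException on mutation.
public class ReadOnlyDemo {
    public static boolean addRejected() {
        Collection<String> readOnly =
                Collections.unmodifiableCollection(new ArrayList<>(Arrays.asList("a", "b")));
        try {
            readOnly.add("c");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }
    public static void main(String[] args) {
        System.out.println(addRejected()); // true
    }
}
```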
Various methods for converting between arrays and list collections
Array to List
Using the Collector in Stream
List<String> list5=Stream.of(str).collect(Collectors.toList());
List to array
Use toArray() method
String[] str2=list.toArray(new String[list.size()]);
List deduplication method
Using java8's stream to remove duplicates
List uniqueList = list.stream().distinct().collect(Collectors.toList());
Set does not allow duplicate elements
Traverse and remove duplicates
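A minimal sketch of the first two deduplication approaches above: Stream.distinct(), and a LinkedHashSet (which also preserves the original encounter order).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.stream.Collectors;

// Two order-preserving ways to deduplicate a List.
public class Dedup {
    public static List<Integer> viaStream(List<Integer> list) {
        return list.stream().distinct().collect(Collectors.toList());
    }
    public static List<Integer> viaSet(List<Integer> list) {
        return new ArrayList<>(new LinkedHashSet<>(list));
    }
    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(3, 1, 3, 2, 1);
        System.out.println(viaStream(input)); // [3, 1, 2]
        System.out.println(viaSet(input));    // [3, 1, 2]
    }
}
```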
microservices
theoretical knowledge
ESB (Enterprise Service Bus)
Contains content
Service metadata management
Service registration, life cycle
Protocol adaptation
intermediary services
Various integration scenarios support various message processing and conversion modes
Governance and monitoring
Logs and statistical analysis of service calls and message processing, service quality, service degradation, flow control, etc.
safety
Transmission communication security, data security, service call security, identity verification, etc.
other
Transaction management, high performance, high availability, high reliability, high stability, etc.
Circuit breaker (fuse)
Service registration discovery
tool
ZooKeeper
Consul
etcd
Eureka
Client registration (zookeeper)
Third-party registration (independent service Registrar)
When the service is started, Registrar is notified in some way, and then Registrar is responsible for initiating registration work with the registration center.
At the same time, the registration center maintains a heartbeat with the service; when the service becomes unavailable, the registration center deregisters it.
client discovery
Server discovery
API gateway
API Gateway is a server that acts as the single entry node into the system.
Request forwarding
Request forwarding mainly means taking the load of client requests and forwarding the requests to the different microservices behind the gateway.
Response merge
Combine the work that requires calling multiple service interfaces into one call to provide unified services to the outside world.
protocol conversion
The focus is on supporting protocol conversion between SOAP, JMS, and Rest.
data conversion
The focus is on supporting message format conversion capabilities between XML and Json (optional)
safety certificate
1. Token-based client access control and security policy
2. To encrypt transmission data and messages and decrypt them on the server, an independent SDK agent package is required on the client.
3. HTTPS-based transmission encryption, client and server digital certificate support
4. Service security authentication based on OAuth2.0 (authorization code, client, password mode, etc.)
Configuration center
Require
Efficient acquisition
real-time perception
distributed access
zookeeper configuration center
Configuration center data classification
Event scheduling (kafka)
Service tracking (starter-sleuth)
As the number of microservices keeps growing, it becomes necessary to trace a request as it propagates from one microservice to the next. Spring Cloud Sleuth solves this: it introduces unique IDs into the logs to guarantee consistency across microservice calls, so you can track how a request passes from one microservice to the next.
Service circuit breaker (Hystrix)
The principle of a circuit breaker is simple, like an electrical overload protector: it fails fast. If it detects many similar errors within a period of time, it forces subsequent calls to fail fast and stops accessing the remote server, preventing the application from repeatedly attempting an operation that is likely to fail. The application can keep executing without waiting for the error to be corrected or wasting CPU time on long timeouts. The circuit breaker also lets the application probe whether the error has been fixed; if so, the application tries the operation again.
Hystrix circuit breaker mechanism
When the number of failed Hystrix Command requests to backend services exceeds a certain proportion (default 50%), the circuit breaker will switch to the open circuit state (Open).
After the circuit breaker stays in the open state for a period of time (default 5 seconds), it automatically switches to the half-open state (HALF-OPEN). The outcome of the next request is then judged: if it succeeds, the breaker switches back to the closed state (CLOSED); otherwise it returns to the open state (OPEN).
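The CLOSED/OPEN/HALF-OPEN state machine described above can be sketched in plain Java (this is an illustration, not Hystrix itself; for brevity it counts consecutive failures instead of Hystrix's failure percentage, and time is passed in so the sketch is deterministic):

```java
// Minimal circuit-breaker state machine: CLOSED -> OPEN after repeated
// failures, OPEN -> HALF_OPEN after a wait, then one trial request decides
// between CLOSED and OPEN.
public class CircuitBreaker {
    public enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openMillis;
    private State state = State.CLOSED;
    private int failures = 0;
    private long openedAt = 0;

    public CircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    /** now is injected rather than read from the clock, for testability. */
    public synchronized boolean allowRequest(long now) {
        if (state == State.OPEN && now - openedAt >= openMillis) {
            state = State.HALF_OPEN; // let one trial request through
        }
        return state != State.OPEN;
    }

    public synchronized void recordSuccess() {
        failures = 0;
        state = State.CLOSED;
    }

    public synchronized void recordFailure(long now) {
        failures++;
        if (state == State.HALF_OPEN || failures >= failureThreshold) {
            state = State.OPEN;      // trip (or re-trip) the breaker
            openedAt = now;
            failures = 0;
        }
    }

    public synchronized State state() { return state; }
}
```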
Microservice middle platform
SpringCloud Alibaba
SpringCloud Netflix
Contains items
Eureka
Distributed middleware based on REST services for service management
Hystrix
A fault-tolerant framework that helps control component interactions between distributed systems
Perform service rollback when call fails
Support real-time monitoring, alarm and other operations
In distributed systems, stop cascading failures (service interruptions)
Feign
REST client, designed to simplify Web Service client development
Ribbon
The load balancing framework provides support for communication between various clients in the microservice cluster and realizes load balancing in the middle layer.
Zuul
Provide proxy, filtering, routing and other functions for microservice clusters
Netty and RPC framework
Netty
Netty principle
It is a high-performance, asynchronous event-driven NIO framework, implemented based on the API provided by JAVA NIO
It provides support for TCP, UDP and file transfer
All IO operations of Netty are asynchronous and non-blocking
Through the Future-Listener mechanism, users can easily obtain IO operation results actively or through the notification mechanism.
Netty high performance
IO multiplexing technology multiplexes the blocking of multiple IOs to the blocking of the same select, so that the system can handle multiple client requests at the same time in a single thread.
Compared with the traditional multi-thread/multi-process model, the biggest advantage of I/O multiplexing is that the system overhead is small, and the system does not need to create new additional processes or threads.
The system does not need to guard the operation of these processes and threads, which reduces the maintenance workload of the system and saves system resources.
Multiplex communication method
Because it aggregates a multiplexing Selector, Netty's IO thread NioEventLoop can concurrently serve hundreds or even thousands of client Channels. Since reads and writes are non-blocking, this keeps the IO thread fully utilized and avoids the thread suspension caused by frequent IO blocking.
Asynchronous communication NIO
Since Netty adopts an asynchronous communication mode, one IO thread can concurrently handle N client connections and their read/write operations, which fundamentally solves the scalability problem of the traditional one-connection-one-thread model of synchronous blocking IO.
Zero copy (DIRECT BUFFERS uses off-heap direct memory)
1. Netty uses DIRECT BUFFERS to receive and send ByteBuffer, using off-heap direct memory for Socket reading and writing, without the need for a secondary copy of the byte buffer. If traditional heap memory (HEAP BUFFERS) is used for Socket reading and writing, the JVM will copy the heap memory Buffer to direct memory and then write it to the Socket. Compared with direct memory outside the heap, the message has an additional memory copy of the buffer during the sending process.
2. Netty provides a composite Buffer object that can aggregate multiple ByteBuffer objects. Users can operate on the composite Buffer as conveniently as on a single Buffer, avoiding the traditional approach of merging several small Buffers into one large Buffer via memory copies.
3.Netty’s file transfer uses the transferTo method, which can directly send the data in the file buffer to the target Channel, avoiding the memory copy problem caused by the traditional cyclic write method.
Memory pool (buffer reuse mechanism based on memory pool)
Netty provides a buffer reuse mechanism based on memory pools, so ByteBufs can be recycled instead of reallocated.
Reactor thread model
Reactor single-threaded model
It means that all IO operations are completed on the same NIO thread
1) As a NIO server, receive the client’s TCP connection;
2) As a NIO client, initiate a TCP connection to the server;
3) Read the request or response message from the communication peer;
4) Send a message request or response message to the communication peer.
Since the Reactor mode uses asynchronous non-blocking IO, all IO operations will not cause blocking. In theory, one thread can handle all IO-related operations independently.
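A minimal single-threaded Reactor can be written directly on `java.nio`: one thread plus one `Selector` performs accept, read and write. This is only an illustrative sketch of the pattern (Netty's `NioEventLoop` is far more elaborate):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.*;
import java.nio.charset.StandardCharsets;

// Single-threaded Reactor: one thread + one Selector handles all IO events.
class SingleThreadReactor {
    /** Start an echo server on an ephemeral loopback port; returns the port. */
    static int startEchoServer() throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        int port = server.socket().getLocalPort();

        Thread reactor = new Thread(() -> {
            ByteBuffer buf = ByteBuffer.allocate(1024);
            try {
                while (selector.select() >= 0) {
                    var it = selector.selectedKeys().iterator();
                    while (it.hasNext()) {
                        SelectionKey key = it.next(); it.remove();
                        if (key.isAcceptable()) {        // 1) receive a TCP connection
                            SocketChannel c = server.accept();
                            c.configureBlocking(false);
                            c.register(selector, SelectionKey.OP_READ);
                        } else if (key.isReadable()) {   // 3)/4) read, then echo back
                            SocketChannel c = (SocketChannel) key.channel();
                            buf.clear();
                            int n = c.read(buf);
                            if (n < 0) { c.close(); continue; }
                            buf.flip();
                            while (buf.hasRemaining()) c.write(buf);
                        }
                    }
                }
            } catch (IOException ignored) { }
        });
        reactor.setDaemon(true);
        reactor.start();
        return port;
    }

    /** Simple blocking client used only to exercise the server. */
    static String echo(int port, String msg) throws IOException {
        try (SocketChannel c = SocketChannel.open(new InetSocketAddress("127.0.0.1", port))) {
            c.write(ByteBuffer.wrap(msg.getBytes(StandardCharsets.UTF_8)));
            ByteBuffer in = ByteBuffer.allocate(1024);
            c.read(in);
            in.flip();
            return StandardCharsets.UTF_8.decode(in).toString();
        }
    }

    public static void main(String[] args) throws IOException {
        int port = startEchoServer();
        System.out.println(echo(port, "ping")); // echoed back by the single reactor thread
    }
}
```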
Reactor multi-threading model
The biggest difference between the Reactor multi-threaded model and the single-threaded model is that a pool of NIO threads handles the IO operations.
NIO thread-Acceptor thread is used to monitor the server and receive the client's TCP connection request;
Network IO operations (reading, writing, etc.) are handled by a NIO thread pool
The thread pool can be implemented using the standard JDK thread pool, which contains a task queue and N available threads. These NIO threads are responsible for reading, decoding, encoding and sending messages;
Master-slave Reactor multi-threaded model
The server is no longer a separate NIO thread for receiving client connections, but an independent NIO thread pool.
After the Acceptor receives and processes the client's TCP connection request (which may include access authentication, etc.), it registers the newly created SocketChannel with an IO thread in the sub-reactor IO thread pool, which then handles that SocketChannel's reads, writes and codec work.
The Acceptor thread pool is only used for client login, handshake and security authentication. Once the link is successfully established, the link is registered to the IO thread of the back-end subReactor thread pool, and the IO thread is responsible for subsequent IO operations.
Lock-free design, thread binding
Netty adopts a serial lock-free design and performs serial operations inside the IO thread to avoid the performance degradation caused by multi-thread lock contention.
Multiple serialized threads can be started to run in parallel at the same time. This locally lock-free serial thread design has better performance than a queue-multiple worker thread model.
High performance serialization framework
1. SO_RCVBUF and SO_SNDBUF
Generally recommended values are 128K or 256K.
2. TCP_NODELAY:
Controls Nagle's algorithm, which coalesces small packets into larger ones to prevent network congestion
3. Soft interrupts: after enabling RPS (Receive Packet Steering), softirq processing can be spread across CPUs to improve network throughput.
RPS binds a soft interrupt to a CPU based on a packet hash value
Netty RPC implementation
RPC concept
Call a service on a remote computer just like calling a local service.
RPC can decouple systems very well. For example, WebService is an RPC based on the Http protocol.
Key technology
1. Service publishing and subscription: The server uses Zookeeper to register the service address, and the client obtains the available service address from Zookeeper.
2. Communication: Use Netty as the communication framework.
3.Spring: Use Spring to configure services, load beans, and scan annotations.
4. Dynamic proxy: The client uses the proxy mode to transparently call services.
5. Message encoding and decoding: Use Protostuff to serialize and deserialize messages.
core process
1. The service consumer (client) calls the service through local calling;
2. After receiving the call, the client stub is responsible for assembling methods, parameters, etc. into a message body that can be transmitted over the network;
3.client stub finds the service address and sends the message to the server;
4. The server stub decodes the message after receiving it;
5. The server stub calls local services based on the decoding results;
6. The local service is executed and the results are returned to the server stub;
7. The server stub packages the returned results into messages and sends them to the consumer;
8. The client stub receives the message and decodes it;
9. The service consumer gets the final result.
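Steps 1-2 (the client stub packing method and arguments into a transmittable request) are typically implemented with `java.lang.reflect.Proxy`. A minimal sketch, with the network replaced by a local `Function` standing in for Netty plus serialization (all names hypothetical):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.function.Function;

// The client stub is a dynamic proxy: every interface call is packed into
// an RpcRequest and handed to a "transport" (here a plain Function).
class RpcProxyDemo {
    record RpcRequest(String iface, String method, Object[] args) { }

    interface HelloWorldService { String hello(String name); }

    @SuppressWarnings("unchecked")
    static <T> T createStub(Class<T> iface, Function<RpcRequest, Object> transport) {
        InvocationHandler h = (proxy, method, args) ->
                transport.apply(new RpcRequest(iface.getName(), method.getName(), args));
        return (T) Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[]{iface}, h);
    }

    public static void main(String[] args) {
        // Fake server stub: "decodes" the request and calls the local service.
        Function<RpcRequest, Object> fakeTransport = req -> "Hello, " + req.args()[0];
        HelloWorldService svc = createStub(HelloWorldService.class, fakeTransport);
        System.out.println(svc.hello("Netty")); // the call transparently goes through the proxy
    }
}
```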
Message codec
Message data structure (interface name, method name, parameter types and values, timeout, requestID)
1. Interface name: In our example, the interface name is "HelloWorldService". If it is not passed, the server will not know which interface to call;
2. Method name: There may be many methods in an interface. If the method name is not passed, the server will not know which method to call;
3. Parameter types and parameter values: there are many parameter types, such as bool, int, long, double, string, map, list, and even struct (class), together with the corresponding parameter values;
4. Timeout:
5.requestID, identifies the unique request ID. The use of requestID will be described in detail in the following section.
6. The message returned by the server: generally includes the following content. Return value status code requestID
Serialization
plan
Protobuf
advantage
Performance
small volume
After serialization, the data size can be reduced by about 3 times
Serialization is fast
20-100 times faster than XML and JSON
Fast transfer speed
Because of its small size, the bandwidth and speed of transmission will be optimized.
Usage
Simple to use
proto compiler automatically serializes and deserializes
Low maintenance cost
Multiple platforms only need to maintain one set of object protocol files (.proto)
Good backward compatibility
That is, good scalability: the data structure can be updated directly without destroying the old data format.
Good encryption
Http transmission content capture can only see bytes
Scope of use
Cross-platform
cross language
Good scalability
shortcoming
Functional aspects
Not suitable for building with text-based markup documents (such as HTML) because text is not suitable for describing data structures.
other aspects
Less versatile
Json and XML have become standard writing tools in many industries, while Protobuf is only a tool used internally by Google.
Poor self-explanation
Stored in binary data stream mode (unreadable), you need to go through the .proto file to understand the data structure
Summarize
Protocol Buffer is smaller, faster, and easier to use & maintain than XML and Json!
Thrift
Avro
Communication process
requestID generation-AtomicLong
Store the callback object callback in the global ConcurrentHashMap
The calling thread synchronizes on the callback object's lock and waits (wait/notify)
The thread listening for responses receives the message, looks up the corresponding callback by requestID, acquires its lock and wakes up the waiting thread
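The four steps above can be sketched as follows. The names are illustrative (this is the generic pattern, not a specific framework's API), and the "wire" is simulated with a thread:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Synchronous call over an asynchronous channel: AtomicLong requestIDs,
// a global ConcurrentHashMap of callbacks, and wait/notify on the callback's lock.
class RpcCallbackDemo {
    static final AtomicLong REQUEST_ID = new AtomicLong();
    static final Map<Long, Callback> PENDING = new ConcurrentHashMap<>();

    static class Callback {
        private Object result;
        private boolean done;

        synchronized Object waitFor(long timeoutMillis) throws InterruptedException {
            long deadline = System.currentTimeMillis() + timeoutMillis;
            while (!done) {
                long left = deadline - System.currentTimeMillis();
                if (left <= 0) throw new InterruptedException("rpc timeout");
                wait(left); // block on the callback object's lock
            }
            return result;
        }

        synchronized void complete(Object r) { result = r; done = true; notifyAll(); }
    }

    /** Caller side: register the callback, send the request, block on the callback. */
    static Object call(Object payload, long timeoutMillis) throws InterruptedException {
        long id = REQUEST_ID.incrementAndGet();          // step 1: generate requestID
        Callback cb = new Callback();
        PENDING.put(id, cb);                             // step 2: store callback globally
        sendOverWire(id, payload);                       // would be a Netty write in reality
        try { return cb.waitFor(timeoutMillis); }        // step 3: wait on the lock
        finally { PENDING.remove(id); }
    }

    /** Simulated IO/listener thread: finds the callback by requestID and wakes the caller. */
    static void sendOverWire(long id, Object payload) {
        new Thread(() -> {
            Callback cb = PENDING.get(id);               // step 4: locate by requestID
            if (cb != null) cb.complete("ack:" + payload);
        }).start();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(call("hello", 1000)); // prints "ack:hello"
    }
}
```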
RMI implementation
1. Write a remote service interface, which must inherit the java.rmi.Remote interface, and the method must throw a java.rmi.RemoteException exception;
2. Write a remote interface implementation class, which must inherit the java.rmi.server.UnicastRemoteObject class;
3. Run the RMI compiler (rmic) and create the client stub class and server skeleton class;
4. Start an RMI registry to host these services;
5. Register the service in the RMI registry;
6. The client searches for remote objects and calls remote methods;
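The steps above can be run in a single JVM. Since Java 5, dynamic stubs are generated at runtime, so step 3 (running `rmic`) is no longer required in practice:

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// In-process RMI round trip covering steps 1-2 and 4-6.
class RmiDemo {
    // Step 1: the remote interface extends Remote; methods throw RemoteException.
    interface Hello extends Remote { String sayHello(String name) throws RemoteException; }

    // Step 2: the implementation extends UnicastRemoteObject (exported on construction).
    static class HelloImpl extends UnicastRemoteObject implements Hello {
        HelloImpl() throws RemoteException { super(); }
        public String sayHello(String name) { return "Hello, " + name; }
    }

    static String runOnce() throws Exception {
        Registry registry = LocateRegistry.createRegistry(0); // step 4: registry (ephemeral port)
        registry.rebind("HelloService", new HelloImpl());     // step 5: register the service
        Hello stub = (Hello) registry.lookup("HelloService"); // step 6: lookup...
        return stub.sayHello("RMI");                          // ...and invoke remotely
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runOnce());
    }
}
```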
Cross-language RPC framework
Hessian
Apache Thrift
gRPC
Hprose
Service Governance RPC Framework
Dubbo
composition
Provider
Consumer
Registry
Monitor
Supported containers
Spring
Jetty
Log4j
Logback
Supported protocols
dubbo (default)
RMI
hessian
webservice
http
thrift
Supported registries
zookeeper
redis
multicast
simple
Service governance
load balancing
Random load (default)
Configurable weight
consistent hashing
minimum activity
....
Configuration
The registry supports local caching (cached in the file system)
DubboX
JSF
Motan
database
storage engine
Hash storage engine
Supports addition, deletion, modification and random read operations, but does not support sequential scanning. The corresponding storage system is a key-value storage system.
Representative databases: redis, Memcache, storage system Bitcask
B-tree storage engine
Not only supports random reads, but also supports range scans
LSM tree (Log-Structured Merge Tree) storage engine
Distributed database
HBase
Overview
Starting from BigTable
BigTable is a distributed storage system proposed by Google. It processes massive data with the MapReduce distributed parallel computing model, uses Google's distributed file system GFS as its underlying storage, and relies on Chubby for system management services. It can scale to petabytes of data and thousands of machines, and offers wide applicability, scalability, high performance and high availability.
BigTable has the following features:
Support large-scale massive data
Distributed concurrent data processing is highly efficient
Easy to expand and support dynamic scaling
Good for cheap devices
Suitable for reading operations but not suitable for writing operations
Introduction to HBase
HBase uses Hadoop Mapreduce to process massive data in HBase and achieve high-performance computing;
Use Zookeeper as a collaborative service to achieve stable service and failure recovery;
Use HDFS as a highly reliable underlying storage, and use cheap clusters to provide massive data storage capabilities.
To facilitate data processing on HBase, Sqoop provides HBase with efficient, convenient data import from relational database management systems.
Pig and Hive provide high-level language support for HBase
HBase is an open source implementation of BigTable
The underlying technical correspondence between HBase and BigTable:
Relationship analysis between HBase and traditional relational databases
A traditional relational DBMS is a very mature and stable database management system. It typically provides disk-oriented storage and index structures, multi-threaded access, lock-based synchronized access mechanisms, log-based recovery and transaction mechanisms, etc.
The difference between HBase and traditional relational databases is mainly reflected in the following aspects:
type of data
Data operations
storage mode
Data index
data maintenance
Scalability
Hbase access interface
Type: Native Java API
Features: Conventional and efficient access methods
Usage occasions: Suitable for parallel batch processing of HBase table data by Hadoop MapReduce jobs
Type: HBase Shell
Features: HBase command line tool, simple interface
Usage occasions: suitable for HBase management
Type: Thrift Gateway
Features: Utilizes Thrift serialization technology to support C, PHP, Python and other languages
Usage occasions: suitable for other heterogeneous systems to access HBase table data online
Type: REST Gateway
Features: Remove language restrictions
Usage occasions: Support REST style HTTP API to access Hbase
Type: Pig
Features: Use Pig Latin streaming programming language to process data in HBase
Usage occasions: suitable for data statistics
Type: Hive
Features: Simple
Usage occasions: When you need to access HBase in a SQL-like way
HBase data model
Data model overview
HBase is a sparse, multi-dimensional, sorted mapping table. The table is indexed by row key, column family, column qualifier and timestamp.
Data model related concepts
Table
HBase uses tables to organize data. Tables are composed of rows and columns, and columns are divided into several column families.
row key
Each HBase table consists of several rows, and each row is identified by a row key (Row Key).
Column family
An HBase table is grouped into a collection of "column families", which are the basic unit of access control.
column qualifier
Data within a column family is located by column qualifiers (or columns).
Cell
In an HBase table, a cell (Cell) is determined by row key, column family and column qualifier.
Timestamp
Each cell holds multiple versions of the same data, indexed by timestamps.
Data coordinates
HBase uses coordinates to locate data in tables, and each value is accessed through coordinates.
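A toy model of this data model: a value is addressed by the 4-dimensional coordinate [row key, column family, column qualifier, timestamp], and each cell keeps multiple timestamped versions. This is pure illustration, not the HBase client API:

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

// row -> ("family:qualifier" -> (timestamp, newest first -> value))
class HBaseModelDemo {
    static final Map<String, Map<String, TreeMap<Long, String>>> TABLE = new TreeMap<>();

    static void put(String row, String famQual, long ts, String value) {
        TABLE.computeIfAbsent(row, r -> new TreeMap<>())
             .computeIfAbsent(famQual, c -> new TreeMap<>(Comparator.reverseOrder()))
             .put(ts, value); // same coordinates, new version
    }

    /** Like HBase's default read: return the newest version of the cell. */
    static String get(String row, String famQual) {
        TreeMap<Long, String> versions = TABLE.getOrDefault(row, Map.of()).get(famQual);
        return versions == null ? null : versions.firstEntry().getValue();
    }

    public static void main(String[] args) {
        put("r1", "info:name", 1L, "old");
        put("r1", "info:name", 2L, "new"); // same cell, newer timestamp
        System.out.println(get("r1", "info:name")); // prints "new"
    }
}
```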
conceptual view
In the conceptual view of HBase, a table can be regarded as a sparse, multi-dimensional mapping relationship.
physical view
At the conceptual view level, each table in HBase is composed of many rows, but at the internal storage level it uses column-based storage rather than the row-based storage of traditional relational databases. This is an important difference between HBase and traditional relational databases.
column-oriented storage
Through the previous discussion, we already know that HBase is column-oriented storage, and HBase is a "column database".
HBase implementation principle
Functional components of HBase
Library functions, linked to each client.
A Master server (also called Master).
Many Region servers.
Tables and Regions
An HBase instance stores many tables. Initially each table contains only one Region; as data is continuously inserted, the Region keeps growing (and eventually splits).
Region positioning
An HBase table may be very large and split into many Regions, which can be distributed to different Region servers.
HBase operating mechanism
HBase system architecture
client
Zookeeper server
Master main server
Region server
How the Region server works
The process of users reading and writing data
cache refresh
StoreFile
How Store works
The Region server is the core module of HBase, and the Store is the core of the Region server. Each Store corresponds to the storage of one column family in the table, and each Store contains one MemStore cache and several StoreFile files.
How HLog works
In a distributed environment, system failures must be considered. For example, when a Region server fails, all data in the MemStore cache (not yet written to files) would be lost. HBase therefore uses the HLog (write-ahead log) to ensure the system can be restored to a normal state after a failure.
HBASE programming practice
HBase commonly used Shell commands
HBase commonly used Java APIs and application examples
relational data
MySQL
Index optimization
Optimize Index-Binary Search Tree
Features
All non-leaf nodes have at most two child nodes
Each node stores a key
The left pointer of a non-leaf node points to a subtree smaller than its key, and the right pointer points to a subtree larger than its key.
TODO
Optimize index B-tree
Features
Multi-way search tree, not necessarily binary
Any node has at most M children, and M > 2;
The number of children of the root node is in [2, M];
The number of children of a non-root node is in [⌈M/2⌉, M];
Each node stores at least ⌈M/2⌉−1 and at most M−1 keys;
The number of keys in a non-leaf node = the number of child pointers − 1;
Keys of a non-leaf node: K[1], K[2], …, K[M-1], with K[i] < K[i+1];
Pointers of a non-leaf node: P[1], P[2], …, P[M];
P[1] points to the subtree with keys less than K[1], P[M] points to the subtree with keys greater than K[M-1];
Every other P[i] points to the subtree with keys in (K[i-1], K[i]);
All leaf nodes are located on the same layer;
A B-tree search starts at the root node and performs a binary search on the (ordered) key sequence within the node. If it hits, the search ends; otherwise it descends into the child covering the range of the query key, repeating until the corresponding child pointer is null or a leaf node is reached;
1. The key set is distributed throughout the tree; 2. Any key appears in only one node; 3. A search may end at a non-leaf node; 4. Search performance is equivalent to a binary search over the complete key set; 5. Automatic height control: since every non-root non-leaf node must have at least ⌈M/2⌉ children, a minimum node utilization is guaranteed, and the worst-case search cost is O(log_⌈M/2⌉ N), where M is the maximum number of subtrees of a non-leaf node and N is the total number of keys. B-tree performance is therefore always comparable to binary search (independent of M), and no rebalancing problem arises. Because of the ⌈M/2⌉ limit: when inserting into a full node, the node is split into two nodes each holding about M/2 entries; when a deletion leaves a node with fewer than ⌈M/2⌉ entries, it is merged with a sibling node;
Optimize index - B+ tree
Features
Is a variant of B-tree
A non-leaf node has the same number of subtree pointers as keys
The subtree pointer P[i] of a non-leaf node points to the subtree whose keys fall in [K[i], K[i+1]); in the B-tree the corresponding interval is open.
Add a link pointer to all leaf nodes
All keywords appear in leaf nodes
B+ tree searches only terminate at leaf nodes (a B-tree search can hit non-leaf nodes). The performance is equivalent to a binary search over the complete key set.
All keywords appear in the linked list of leaf nodes (dense index), and the linked list is ordered
Impossible to hit on non-leaf nodes
Non-leaf nodes are equivalent to the index of leaf nodes (sparse index), and leaf nodes are equivalent to the data layer that stores keywords.
More suitable for file indexing systems
Query efficiency is stable
The B* tree is a further variant of the B+ tree, with fewer node splits and higher space utilization.
Optimize index-Hash and BitMap
Only "=" and "IN" can be satisfied, and range queries cannot be used.
Cannot be used to sort data
Partial index keys cannot be used for querying
Dense and sparse indexes
dense index
In a dense index file, every search-key value has a corresponding index entry
INNODB
If a primary key is defined, the primary key is used as a dense index
If there is no primary key defined, the first unique non-empty index of the table is used as a dense index.
If there is no primary key and no non-empty index, innodb will generate a hidden primary key (dense index) internally.
A secondary (non-primary-key) index stores the indexed column values together with the corresponding primary key values, so a lookup involves two searches (secondary index first, then the primary-key index)
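The two lookup paths can be sketched with two ordered maps standing in for the two B+ trees (a toy model, not InnoDB's actual on-disk structures):

```java
import java.util.TreeMap;

// Clustered (dense) index: primary key -> full row.
// Secondary index: indexed value -> primary key, so a secondary lookup
// needs two searches ("going back to the table").
class InnodbIndexDemo {
    record Row(int id, String name, int age) { }

    static final TreeMap<Integer, Row> CLUSTERED = new TreeMap<>();     // pk -> row
    static final TreeMap<String, Integer> NAME_INDEX = new TreeMap<>(); // name -> pk

    static void insert(Row r) {
        CLUSTERED.put(r.id(), r);
        NAME_INDEX.put(r.name(), r.id());
    }

    /** Secondary-index lookup: search the secondary index, then the clustered index. */
    static Row findByName(String name) {
        Integer pk = NAME_INDEX.get(name);            // search #1: secondary index
        return pk == null ? null : CLUSTERED.get(pk); // search #2: clustered index
    }

    public static void main(String[] args) {
        insert(new Row(1, "alice", 30));
        insert(new Row(2, "bob", 25));
        System.out.println(findByName("bob")); // two lookups behind the scenes
    }
}
```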
sparse index
Sparse index files only create index entries for certain values of the index code
MyISAM
Index optimization: optimizing SQL
Check the slow query log to locate slow SQL
Use show variables like '%quer%' to view the configuration
slow_query_log: whether the slow query log is enabled
set global slow_query_log = ON
long_query_time: queries running longer than this threshold (in seconds) are written to the log
set global long_query_time = 1
slow_query_log_file: location of the slow log file
Use tools such as explain to analyze SQL
explain Main keywords
ID
The sequence number of the select within the query, indicating the order of select clauses or table operations. Equal ids execute top to bottom; among different ids, the larger id executes first.
select_type
Indicates the type of query, mainly used to distinguish between ordinary queries, joint queries, subqueries and other complex queries.
SIMPLE
PRIMARY
SUBQUERY
DERIVED
UNION
UNION RESULT
table
current execution table
possible_keys
Indexes on the current table that might be used; they are not necessarily actually used.
key
The actual index used. If it is NULL, it means that the index is not used (including no index is created or the index is invalid)
key_len
The number of bytes used in the index. You can use this column to calculate the index length used in the query. The shorter the length, the better without losing accuracy. key_len represents the maximum possible length of the index field, not the actual length. That is, key_len is calculated through table definition and not obtained through table query.
ref
Indicates which columns or constants are used to look up values on the index column; a constant is best if possible.
type
Query sorted from best to worst
system > const > eq_ref > ref > fulltext > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL
index means a full index scan; ALL means a full table scan.
Extra
If the following values appear, MySQL cannot complete the operation with indexes and efficiency suffers badly; optimize these cases as much as possible
Using filesort
Indicates that MySQL applies an external sort to the results instead of reading rows in index order. The sort may happen on disk or in memory. Any sort that cannot be completed using an index is called a "file sort".
Using temporary
Indicates that MySQL uses a temporary table to hold intermediate results; commonly seen with order by sorting and group by grouping
rows
Based on the table statistics and index selection, roughly estimate the number of rows that need to be read for the required record query. The fewer rows, the better.
Modify the SQL or try to make it use an index
Index leftmost matching reason
Leftmost matching principle: MySQL keeps matching index columns from left to right and stops at the first range condition. For example, with an index on (a,b,c,d) and conditions a=1 and b=2 and c>1 and d=2, column d cannot use the index; with an index on (a,b,d,c), all four columns can be used.
= and in conditions may appear in any order; the MySQL query optimizer rewrites them into a form the index can recognize.
reason
The index is a B+ tree underneath. A composite index is still a single B+ tree, but its key is composed of multiple columns rather than one. Since the tree can only be ordered by one comparison, the database sorts entries by the leftmost field first when building the tree.
Several situations of matching If you create a joint index (a,b,c)
Full value matching query
select * from table_name where a = '1' and b = '2' and c = '3'
select * from table_name where b = '2' and a = '1' and c = '3'
select * from table_name where c = '3' and b = '2' and a = '1'
...
The order of the sub-conditions does not affect index usage; the MySQL query optimizer automatically normalizes the query order
When matching to the leftmost column
select * from table_name where a = '1'
select * from table_name where a = '1' and b = '2'
select * from table_name where a = '1' and b = '2' and c = '3'
All starting from the leftmost column, continuous matching, using the index
select * from table_name where b = '2'
select * from table_name where c = '3'
select * from table_name where b = '1' and c = '3'
None of these start from the leftmost column, so the index is not used and a full table scan is performed.
select * from table_name where a = '1' and c = '3'
Because the columns are not contiguous, only the a column of the index is used; the condition on c is filtered after the rows are fetched.
Prefix matching (LIKE)
select * from table_name where a like 'As%'; -- prefixes are sorted, uses the index
select * from table_name where a like '%As'; -- full table scan
select * from table_name where a like '%As%'; -- full table scan
If a is a character type, the comparison rule is to compare the first character of the string. If the character of the first string is smaller, the string will be smaller. If the first character is the same, compare the second character. And so on. Therefore, prefix matching uses the index, and suffix matching and infix matching use the full table scan.
Range value matching
select * from table_name where a > 1 and a < 3
You can perform range queries on the leftmost column and use the index
select * from table_name where a > 1 and a < 3 and b > 1;
When querying in a multi-column range, only the leftmost column a can use the index for range query. In the range of 1<a<3, b is unordered and cannot use the index. After finding the records 1<a<3, you can only filter them one by one according to the condition b>1.
Exact match on one column plus a range on the next column
select * from table_name where a = 1 and b > 3;
When a=1, b is ordered, and the range query uses the joint index.
sort
By default MySQL falls back to filesort, which is relatively slow; if the order by columns match an index, the filesort step can be skipped.
select * from table_name order by a,b,c limit 10;
Use index
select * from table_name order by b,c,a limit 10;
The column order differs from the index order, so the index is not used
select * from table_name order by a limit 10; select * from table_name order by a,b limit 10;
Use partial index
select * from table_name where a =1 order by b,c limit 10;
The leftmost column of the joint index is a constant, and the index can be used for subsequent sorting.
Data reading and transaction isolation
Problems caused by lock module concurrency-transaction isolation mechanism
Lost update
Avoided at the database level under all MySQL transaction isolation levels
dirty read
READ_COMMITTED transaction isolation level and above can be avoided
The generation of dirty reads
Under the READ_UNCOMMITTED isolation level, two transactions modify the same row concurrently; once one of them rolls back, the other has performed a dirty read (transaction A read data that transaction B failed to commit and that was subsequently rolled back).
non-repeatable read
non-repeatable read generation
Transaction A reads a row multiple times while transactions B and C each modify that row and commit. Before A itself commits, its successive reads observe the modifications committed by B and then by C, so the same query returns different records each time.
phantom reading
Phantom reading occurs
Transaction A reads several rows of data. Transaction B modifies A's result set by inserting/deleting rows. When A operates again, it finds rows it never operated on, like an illusion (records that should not appear, appear).
Current read and snapshot read
Current read
select *** lock in share mode, select*** for update
update,delete,insert
Snapshot read
Non-blocking read without locking, select
How to implement non-blocking reading in InnoDB at RC and RR levels
DB_TRX_ID, DB_ROLL_PTR, DB_ROW_ID fields in the data row
undo log
read view
How to avoid phantom reading in RR
InnoDB adds GAP locks (next-key locks) to avoid phantom reads
Will GAP lock be used for primary key index or unique index?
If all the where conditions are hit, the GAP lock will not be used and only the record lock will be added.
Under the RR isolation level, Gap locks will be used in non-unique indexes, or in current reads that do not go through the index.
explain analysis and optimization
select_type (type of each select clause in the query)
(1) SIMPLE (simple SELECT, no UNION or subquery, etc.)
(2) PRIMARY (if the query contains any complex subparts, the outermost select is marked as PRIMARY)
(3) UNION (the second or subsequent SELECT statement in UNION)
(4) DEPENDENT UNION (the second or subsequent SELECT statement in UNION, depending on the external query)
(5) UNION RESULT (the result of UNION)
(6) SUBQUERY (the first SELECT in the subquery)
(7) DEPENDENT SUBQUERY (the first SELECT in the subquery, depends on the outer query)
(8) DERIVED (SELECT of derived table, subquery of FROM clause)
(9) UNCACHEABLE SUBQUERY (the results of a subquery cannot be cached, and the first row of the external link must be re-evaluated)
type (the way MySQL finds the required rows in a table, also known as the "access type")
ALL: Full Table Scan, MySQL will traverse the entire table to find matching rows
index: Full Index Scan, the difference between index and ALL is that the index type only traverses the index tree
range: Retrieve only rows in a given range, using an index to select rows (Alibaba requires the least)
ref: Indicates the connection matching conditions of the above table, that is, which columns or constants are used to find the value on the index column (Alibaba recommends that it is best to achieve this)
eq_ref: Similar to ref, the difference is that the index used is a unique index. For each index key value, only one record in the table matches. To put it simply, the primary key or unique key is used as the association condition in multi-table connections.
const, system: These types are used when MySQL optimizes a certain part of the query and converts it to a constant. If the primary key is placed in the where list, MySQL can convert the query into a constant. System is a special case of the const type. When the queried table has only one row, use system
NULL: MySQL decomposes the statement during the optimization process, and does not even need to access the table or index during execution. For example, selecting the minimum value from an index column can be completed through a separate index lookup.
possible_keys
Which index can MySQL use to find records in the table? If there is an index on the field involved in the query, the index will be listed, but it may not be used by the query.
Key
The key column shows the key (index) that MySQL actually decided to use.
key_len
Indicates the number of bytes used in the index. The length of the index used in the query can be calculated through this column (the value displayed by key_len is the maximum possible length of the index field, not the actual length used, that is, key_len is calculated based on the table definition, not retrieved from the table) without losing accuracy, the shorter the length, the better
ref
The join matching condition of the above table, that is, which columns or constants are used to find the value on the index column
rows
MySQL estimates the number of rows that need to be read to find the required records based on table statistics and index selection.
Extra
Using where: indicates that the MySQL server filters rows after the storage engine retrieves them, i.e. the where condition is applied at the server layer rather than being fully resolved by the index
Using temporary: Indicates that MySQL needs to use temporary tables to store result sets, which is common in sorting and grouping queries.
(If "Using temporary" or "Using filesort" appears, the query is inefficient and should be optimized)
Using join buffer: this value emphasizes that no index was usable for the join condition, so a join buffer is needed to store intermediate results. If it appears, consider adding an index to improve performance, depending on the specifics of the query.
Impossible where: This value emphasizes that the where statement will result in no qualifying rows.
Using filesort: The sorting operation in MySQL that cannot be completed using indexes is called "file sorting"
Select tables optimized away: the optimizer resolved the query from the index alone (e.g. an aggregate such as MIN/MAX on an indexed column), so no table needed to be read at execution time and only the single aggregate result row is returned
Mysql tuning
Field design
Try to use integers to represent strings
Choose small data types and specify short lengths whenever possible
Use not null whenever possible
A single table should not have too many fields
Fields can be reserved
Normal Format
First normal form 1NF: field atomicity
Second Normal Form: Eliminate partial dependence on primary keys
Third normal form: Eliminate transitive dependence on primary keys
Query cache
Configured via query_cache_type in my.ini on Windows and my.cnf on Linux
0: not enabled
1: enabled; everything is cached by default; add SQL_NO_CACHE to a SELECT statement to bypass the cache
2: enabled; nothing is cached by default; add SQL_CACHE to a SELECT statement to cache it explicitly (commonly used)
Cache invalidation problem
When the data table is modified, any cache based on the data table will be deleted.
Partition
Only when the search field is a partition field, the efficiency improvement brought by partitioning will be more obvious.
Split horizontally and vertically
Horizontal split: store data separately by creating several tables with the same structure
Vertical split: Put fields that are often used together in a separate table. There is a one-to-one correspondence between the split table records.
Typical server configuration
max_connections, maximum number of client connections
show variables like 'max_connections'
table_open_cache, table file handle cache
key_buffer_size, index cache size
innodb_buffer_pool_size, Innodb storage engine cache pool size
innodb_file_per_table
Stress testing tools
mysqlslap
Troubleshooting
Use the show processlist command to view all current connection information
Use the explain command to query the SQL statement execution plan
Turn on the slow query log and view the slow query SQL
NoSQL
MongoDB
Getting started with MongoDB
Nosql and sql usage scenario analysis
basic concepts
mongodb advanced
Common commands
Quick start
mongodo client driver
Add, delete, modify, search and aggregate
safely control
Advanced knowledge of mongodb
storage engine
index
High availability
Best practices and considerations
data structure
stack
A stack is a table that restricts insertion and deletion to only one position. This position is the end of the table and is called the top of the stack.
It is last in, first out (LIFO). There are only two basic operations on the stack: push (into the stack) and pop (out of the stack). The former is equivalent to inserting, and the latter is equivalent to deleting the last element.
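A minimal sketch of the two basic stack operations using `java.util.ArrayDeque` (the class and method names below, such as `StackDemo`, are illustrative):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class StackDemo {
    // push and pop are the only two basic stack operations (LIFO)
    public static int lastInFirstOut() {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(1);      // insert at the top of the stack
        stack.push(2);
        stack.push(3);
        return stack.pop(); // removes the most recently inserted element
    }

    public static void main(String[] args) {
        System.out.println(lastInFirstOut()); // 3: last in, first out
    }
}
```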
queue
A queue is a special linear table. The special thing is that it only allows deletion operations at the front end of the table (front) and insertion operations at the back end (rear) of the table.
A queue is a linear list with restricted operations. The end that performs the insertion operation is called the tail of the queue, and the end that performs the deletion operation is called the head of the queue.
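The head/tail discipline can be sketched with `java.util.Queue` (names like `QueueDemo` are illustrative):

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class QueueDemo {
    // insertion happens at the rear (offer), deletion at the front (poll): FIFO
    public static String head() {
        Queue<String> q = new ArrayDeque<>();
        q.offer("a"); // enqueue at the tail
        q.offer("b");
        q.offer("c");
        return q.poll(); // dequeue from the head: the oldest element
    }

    public static void main(String[] args) {
        System.out.println(head()); // a
    }
}
```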
Link
A linked list is a data structure, the same level as an array. For example, the ArrayList we use in Java is implemented based on an array.
The implementation principle of Linkedlist is linked list.
Linked lists are not efficient when performing loop traversal, but have obvious advantages when inserting and deleting.
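The trade-off above can be shown with `java.util.LinkedList`, whose end insertions and deletions are O(1) while an `ArrayList` would shift elements (`LinkDemo` is an illustrative name):

```java
import java.util.LinkedList;

public class LinkDemo {
    public static String endpoints() {
        LinkedList<Integer> list = new LinkedList<>();
        list.add(2);
        list.addFirst(1);   // O(1) head insertion, no element shifting
        list.addLast(3);    // O(1) tail insertion
        list.removeFirst(); // O(1) head deletion
        return list.toString();
    }

    public static void main(String[] args) {
        System.out.println(endpoints()); // [2, 3]
    }
}
```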
Hash Table
A hash table (also called a hash map) supports a search algorithm that, unlike linked lists and trees, does not require a series of comparison operations against keywords (a keyword is the value of a data item within a data element, used to identify that element).
The hash table algorithm aims to reach the sought data element in a single access, without any comparisons. Therefore, a fixed correspondence must be established between each element's storage location and its keyword (denoted key), so that every keyword maps to a unique storage location in the hash table.
Methods of constructing hash functions are:
(1) Direct addressing method: take the keyword or a linear function value of the keyword as the hash address.
That is: h(key) = key or h(key) = a * key + b, where a and b are constants.
(2) Digital analysis method
(3) Mid-square method: square the keyword and take the middle digits of the result as the hash address.
(4) Folding method: Divide the keyword into several parts with the same number of digits, and then take the superposition sum of these parts as the hash address.
(5) Division with remainder method: The remainder obtained after the keyword is divided by a number p not larger than the hash table length m is the hash address.
That is: h(key) = key MOD p, where p <= m (m is the hash table length).
(6) Random number method: Choose a random function and take the random function value of the keyword as its hash address.
That is: h(key) = random(key)
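Method (5), division with remainder, is the most commonly used; a minimal sketch (constants and class name `HashDemo` are illustrative, with p chosen prime and p <= m):

```java
public class HashDemo {
    static final int M = 11; // hash table length m
    static final int P = 11; // p <= m, ideally a prime to spread keys evenly

    // division-with-remainder method: h(key) = key MOD p
    public static int hash(int key) {
        return key % P;
    }

    public static void main(String[] args) {
        System.out.println(hash(1234));     // slot for key 1234
        System.out.println(hash(1234 + P)); // same slot: a collision to resolve
    }
}
```

Two keys that differ by a multiple of p land in the same slot, which is why a hash table also needs a collision-resolution strategy (chaining, open addressing, etc.).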
sorted binary tree
Each node of the sorted binary tree satisfies:
All node values in the left subtree are less than its root node value,
And the value of all nodes in the right subtree is greater than its root node value
insert operation
Starting from the root node, find the insertion position (that is, the parent of the new node). The process: compare the new node with the current node. If they are equal, the value already exists and cannot be inserted again. If it is smaller than the current node, search in the left subtree; when the left subtree is empty, the current node is the parent you are looking for, and the new node is inserted as its left child. If it is larger than the current node, search in the right subtree; when the right subtree is empty, the current node is the parent, and the new node is inserted as its right child.
Delete operation
Deletion operations are mainly divided into three situations:
The node to be deleted has no child nodes
The node to be deleted has only one child node
The node to be deleted has two child nodes
1. If the node to be deleted has no child nodes, you can delete it directly, that is, let its parent node leave the child node blank.
2. If the node to be deleted has only one child node, replace the node to be deleted with its child node.
3. If the node to be deleted has two child nodes, first find the replacement node of the node (that is, the smallest node in the right subtree), then replace the node to be deleted with the replacement node, and then delete the replacement node.
Query operation
The main process of search operation is:
First compare with the root node, and return if they are the same.
If it is less than the root node, search recursively in the left subtree.
If it is greater than the root node, search recursively in the right subtree.
So it is easy to get the maximum (rightmost deepest child node) and minimum (leftmost deepest child node) value in a sorted binary tree.
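The insert, search, min and max operations described above can be sketched as a small class (`Bst` and its members are illustrative names; delete is omitted for brevity):

```java
public class Bst {
    static final class Node {
        int val; Node left, right;
        Node(int v) { val = v; }
    }
    Node root;

    // compare with the current node; go left if smaller, right if larger
    void insert(int v) { root = insert(root, v); }
    private Node insert(Node n, int v) {
        if (n == null) return new Node(v);         // empty spot found: attach here
        if (v < n.val) n.left = insert(n.left, v);
        else if (v > n.val) n.right = insert(n.right, v);
        return n;                                  // equal: already present, ignore
    }

    boolean contains(int v) {
        Node n = root;
        while (n != null) {
            if (v == n.val) return true;
            n = v < n.val ? n.left : n.right;      // recurse into one subtree
        }
        return false;
    }

    // minimum: leftmost node; maximum: rightmost node
    int min() { Node n = root; while (n.left != null) n = n.left; return n.val; }
    int max() { Node n = root; while (n.right != null) n = n.right; return n.val; }

    public static void main(String[] args) {
        Bst t = new Bst();
        for (int v : new int[]{5, 3, 8, 1, 4}) t.insert(v);
        System.out.println(t.contains(4) + " " + t.min() + " " + t.max());
    }
}
```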
red black tree
Characteristics of red-black trees
(1) Each node is either black or red.
(2) The root node is black.
(3) Each leaf node (NIL) is black. [Note: the leaf node here refers to the empty (NIL or NULL) node!]
(4) If a node is red, its child node must be black.
(5) All paths from a node to the node’s descendant nodes contain the same number of black nodes.
Left rotation
Left-rotating x means setting the "right child of x" as the "father node of x"; that is, turning x into a left node (x becomes the left child of its right child y)!
Therefore, the "left" in left-rotation means "the rotated node will become a left node".
LEFT-ROTATE(T, x)
 y ← right[x]             // premise: the right child of x is y
 right[x] ← left[y]       // set y's left subtree β as x's right child
 p[left[y]] ← x           // set x as the father of β
 p[y] ← p[x]              // set x's father as y's father
 if p[x] = nil[T]
 then root[T] ← y         // case 1: x's father is empty, so y becomes the root
 else if x = left[p[x]]
 then left[p[x]] ← y      // case 2: x was a left child, y replaces it as left child
 else right[p[x]] ← y     // case 3: x was a right child, y replaces it as right child
 left[y] ← x              // set x as y's left child
 p[x] ← y                 // set y as x's father
Right rotation
Right-rotating x means setting the "left child of x" as the "father node of x"; that is, turning x into a right node (x becomes the right child of y)!
Therefore, the "right" in right rotation means "the rotated node will become a right node".
RIGHT-ROTATE(T, y)
 x ← left[y]              // premise: the left child of y is x
 left[y] ← right[x]       // set x's right subtree β as y's left child
 p[right[x]] ← y          // set y as the father of β
 p[x] ← p[y]              // set y's father as x's father
 if p[y] = nil[T]
 then root[T] ← x         // case 1: y's father is empty, so x becomes the root
 else if y = right[p[y]]
 then right[p[y]] ← x     // case 2: y was a right child, x replaces it as right child
 else left[p[y]] ← x      // case 3: y was a left child, x replaces it as left child
 right[x] ← y             // set y as x's right child
 p[y] ← x                 // set x as y's father
Add to
Step 1: Treat the red-black tree as a binary search tree and insert the node.
Step 2: Color the inserted node "red."
Step 3: Make it a red-black tree again through a series of rotation or coloring operations.
delete
Step 1: Treat the red-black tree as a binary search tree and delete the nodes.
This is the same as "deleting nodes in a regular binary search tree". There are 3 situations:
① The deleted node has no children (it is a leaf node). Then it can be deleted directly.
② The deleted node has only one child. Then delete it directly and let its only child take its place.
③ The deleted node has two children. Then first find its successor node; copy "the content of the successor node" into "this node"; and then delete "the successor node".
Step 2: Modify the tree through a series of "rotation and recoloring" to make it a red-black tree again.
Because after deleting the node in the "first step", it may violate the characteristics of the red-black tree. So the tree needs to be corrected by "rotating and recoloring" to make it a red-black tree again.
Recoloring is divided into 3 cases.
① Situation: x is a "red-and-black" node.
Processing: directly set x to black and end. All properties of the red-black tree are then restored.
② Situation: x is a "black-black" node, and x is the root.
Processing: do nothing and end. All properties of the red-black tree are then restored.
③ Situation: x is a "black-black" node, and x is not the root (this further splits into sub-cases handled by rotation and recoloring).
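In practice, `java.util.TreeMap` is implemented as a red-black tree, so the guaranteed O(log n) insert/delete/lookup behavior above can be observed directly (`RbDemo` is an illustrative name):

```java
import java.util.TreeMap;

public class RbDemo {
    // TreeMap is a red-black tree: put/remove rebalance internally
    // via the rotation and recoloring steps described above
    public static int smallestKeyAfterOps() {
        TreeMap<Integer, String> map = new TreeMap<>();
        for (int k : new int[]{50, 20, 70, 10, 30}) map.put(k, "v" + k);
        map.remove(10);        // deletion triggers rebalancing if needed
        return map.firstKey(); // leftmost node = smallest key
    }

    public static void main(String[] args) {
        System.out.println(smallestKeyAfterOps()); // 20
    }
}
```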
B-TREE
B-tree is also called a balanced multi-way search tree. The characteristics of an m-order B-tree (m-way tree) are as follows (where ceil(x) is the ceiling function):
1. Each node in the tree has at most m children;
2. Except for the root node and leaf nodes, each other node has at least ceil(m / 2) children;
3. If the root node is not a leaf node, it must have at least 2 children (special case: a root node without children, that is, the root node is a leaf node, and the entire tree has only one root node);
4. All leaf nodes appear in the same layer, and leaf nodes do not contain any keyword information (can be regarded as external nodes or queries
Failed nodes, in fact, these nodes do not exist, and the pointers pointing to these nodes are null);
5. Each non-terminal node contains n keywords: (n, P0, K1, P1, K2, P2, ..., Kn, Pn). Where:
a) Ki (i = 1..n) are the keywords, sorted in ascending order: K(i-1) < Ki.
b) Pi is a pointer to the root of a subtree; all keywords in the subtree pointed to by P(i-1) are smaller than Ki but larger than K(i-1).
c) The number of keywords n must satisfy: ceil(m/2) - 1 <= n <= m - 1.
The differences between a B+ tree of order m and a B-tree of order m are:
1. Nodes with n subtrees contain n keywords; (B-tree is n subtrees with n-1 keywords)
2. All leaf nodes contain information about all keywords and pointers to records containing these keywords, and the leaf nodes themselves are linked in order from small to large keywords. (The leaf nodes of the B-tree do not include all the information that needs to be found)
3. All non-terminal nodes can be regarded as the index part, and the node only contains the largest (or smallest) keyword in the root node of its subtree. (The non-terminal nodes of B-tree also contain valid information that needs to be found)
bitmap
The principle of a bitmap is to use a single bit to mark whether a number exists; one bit stores one piece of data, which greatly saves space.
Bitmap is a very commonly used data structure, for example inside Bloom filters, or for sorting sets of non-repeating integers.
A bitmap is usually implemented on top of an array; each array element contributes a run of binary bits, and all elements together form one large bit set.
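A minimal sketch of the dedup-and-sort use case using `java.util.BitSet` (the name `BitmapDemo` and the input data are illustrative):

```java
import java.util.BitSet;

public class BitmapDemo {
    // one bit per possible value: set(n) marks n as present;
    // scanning the set bits in order yields a duplicate-free sorted sequence
    public static String dedupSort(int[] nums, int max) {
        BitSet bits = new BitSet(max + 1);
        for (int n : nums) bits.set(n); // duplicates just set the same bit again
        StringBuilder sb = new StringBuilder();
        for (int i = bits.nextSetBit(0); i >= 0; i = bits.nextSetBit(i + 1))
            sb.append(i).append(' ');
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(dedupSort(new int[]{7, 3, 3, 9, 1, 7}, 10)); // 1 3 7 9
    }
}
```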
basic concept
Java basic concepts
JDK and JRE
JDK is the Java Development Kit, which includes the tools, IDE and JRE required to compile and run Java.
JRE is Java Runtime Environment, including JVM (Java Virtual Machine) and system class library
relation chart
JavaEE (Java Platform Enterprise Edition)
Contains technical standards
Applet
Is a Java program. It typically runs within a Java-enabled web browser
Applets are designed to be embedded in an HTML page
When a user browses an HTML page containing an Applet, the Applet's code is downloaded to the user's machine
EJB
JNDI
Java Naming and Directory Interface
JDBC
Servlet
A program that runs on a web server or application server
It acts as a middle layer between a database or application from a web browser and an HTTP server
Compare to CGI
Better performance and platform independent
Execute within the address space of the web server, eliminating the need to create a separate process to handle each client request
Servlet is trusted because the Java security manager enforces a series of restrictions
All functionality of the Java class library is available to Servlets
Architecture diagram
effect
Implement various interfaces and classes defined by the Servlet specification to provide underlying support for the operation of Servlet
Manage user-written Servlet classes and objects after instantiation
Provide HTTP service, equivalent to a simplified server
JSP
Java IDL (Interface Definition Language)/CORBA
XML
JMS (Java Message Service)
JTA (Java Transaction Architecture)
JTS (Java Transaction Service)
JavaMail
JAF (JavaBeans Activation Framework)
Classification
JavaWeb
Develop web programs
JavaSE
C/S architecture for developing and deploying Java applications for use in desktop, server, embedded and real-time environments
JavaME
Running environment on mobile devices and embedded devices (mobile phones, PAD, etc.)
relation chart
CGI (Common Gateway Interface)
JavaBeans
Is an object-oriented programming interface
It is a Java class that mainly focuses on how to integrate applications in development tools.
POJO (Plain Ordinary Java Object)
Refers to ordinary java objects that do not use Entity Beans
In essence, it can be understood as a simple entity class
You can easily use the POJO class as an object, and of course you can also easily call its get and set methods.
The POJO class also brings great convenience to our configuration in the struts framework.
Servlet
jsp
built-in objects
1 request javax.servlet.http.HttpServletRequest
Client request information: Http protocol header information, cookies, request parameters, etc.
2 response javax.servlet.http.HttpServletResponse
Used by the server to respond to client requests and return information
3 pageContext javax.servlet.jsp.PageContext
page context
4 session javax.servlet.http.HttpSession
Session between client and server
5 application javax.servlet.ServletContext
Used to obtain server-side application life cycle information
6 out javax.servlet.jsp.JspWriter
Output stream used to transmit content from the server to the client
7 config javax.servlet.ServletConfig
During initialization, the information passed by the Jsp engine to the Jsp page
8 page java.lang.Object
Points to the Jsp page itself
9 exception java.lang.Throwable
An exception occurs on the page and the exception object generated
Scope
page current page scope
The attribute values stored in this scope can only be retrieved from the current page.
request request scope
The scope is the time from the creation of the request to the death of the request. A request can involve multiple pages.
session session scope
The range is a period of time during which the client and server are continuously connected.
The user requested the page involved multiple times during the session validity period
application global scope
The scope is from the start to the stop of the server-side web application, and the pages involved in all requests in the entire web application.
Servlet component
What is a Servlet?
Can provide dynamic html response
It is a java class running on the server
Used to complete dynamic response to customer requests under b/s architecture
This java class needs to comply with the jsp/Servlet specification
How to write a Servlet
three ways
Create a dynamic web project and write a java class that inherits the javax.servlet.http.HttpServlet abstract class. Override the doGet() or doPost() method
Implement the javax.servlet.Servlet interface and override all its methods
Inherit the javax.servlet.GenericServlet abstract class and override the service() method
You need to download web-inf and configure web.xml
Deploy the project to the web server and use the url format to request the test
Create dynamic web projects
Dynamic web
Note: Select the tomcat running environment and web version
// The server tells the browser to render the response as HTML, encoded in UTF-8
resp.setContentType("text/html;charset=utf-8");
// Obtain an output stream to the browser and write to it
PrintWriter p = resp.getWriter();
p.write("<h1>" + s + "</h1>");
web.xml
<!-- Configure servlet -->
<servlet>
  <servlet-name>Consistent</servlet-name>
  <servlet-class>Package name.Class name</servlet-class>
</servlet>
<servlet-mapping>
  <servlet-name>Consistent</servlet-name>
  <url-pattern>/Write it yourself</url-pattern>
</servlet-mapping>
The relationship between Servlet, GenericServlet and HttpServlet
servlet life cycle
Creation of servlet object
The default is to create the Servlet when the first request comes
(via constructor)
You can set the xml tag <load-on-startup> (load at server startup) to a value greater than or equal to 0; 1 is recommended
Initialization of servlet objects
Immediately after the object is created, void init() is called to complete the initialization.
servlet Continuous service of objects
void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException
// called when the browser sends a GET request directly
void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException
// called when a form sends a POST request
void service(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException
// can handle both GET and POST requests
If a GET request arrives but only doPost() is overridden, the server responds with 405
@Override
protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
    doPost(request, response);
}
Solution: override doGet() to delegate to doPost()
The imminent demise of servlet objects
void destroy()
Once destroy() is called, the object enters the garbage collection system. The call itself does not destroy the object; it only makes it eligible for collection, because Java programmers can only suggest garbage collection, not force it.
die
Thread safety issues of servlets
servlet thread unsafe reasons
A single servlet instance serves all requests to the same servlet. When multiple threads perform write operations on the same member variable, thread-safety problems arise.
How to solve
via synchronized(this){}
Lock so threads are queued
Just assign an independent variable to each method call, do not make it a member variable
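The first fix (synchronizing access to a shared member variable) can be sketched in plain Java, outside a servlet (`CounterDemo` and the counts are hypothetical): with synchronization, concurrent writers lose no updates.

```java
public class CounterDemo {
    private static int shared = 0; // stands in for a servlet member variable

    // synchronized serializes the writes, so threads queue up and no update is lost
    public static synchronized void increment() { shared++; }

    public static int runThreads() {
        shared = 0;
        Thread[] ts = new Thread[4];
        for (int i = 0; i < ts.length; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) increment();
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        }
        return shared;
    }

    public static void main(String[] args) {
        System.out.println(runThreads()); // always 4000 with synchronization
    }
}
```

Without the synchronized keyword the final count would be nondeterministic, which is exactly the servlet thread-safety problem; the second fix (local variables per call) avoids sharing altogether and needs no lock.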
ServletContext object
What is a ServletContext object?
When the project is deployed to the web server, an object of type ServletContext will be generated for the project. This object can solve the problem of information sharing and transfer between multiple servlets
How to get it?
request.getServletContext()
Or this.getServletContext() in the servet service method
ServletContext API
Set data
setAttribute("name",value)
retrieve data
getAttribute("name")
delete data
removeAttribute("name")
Get the real deployment path
getRealPath("folder name")
Get global parameter data configured in web.xml
getInitParameter("name");
Configure global parameters:
<context-param> (context parameter)
  <param-name>xx</param-name>
  <param-value>utf-8</param-value>
</context-param>
ServletConfig object
What is a ServletConfig object?
Each independent servlet object has its own dedicated ServletConfig object serving it. Through this ServletConfig object, the servlet's related information and configuration can be obtained.
How to get it?
this.getServletConfig() in the servlet's service method
ServletConfig API
getServletName() Gets the name of the servlet
Refers to <servlet-name>
getServletContext() Gets the object of servlet shared information
getInitParameter("name") Gets the configuration information for this servlet object based on name
Written in <servlet> <init-param> <param-name> <param-value>
Servlet request forwarding
working principle
Features
Request forwarding does not support cross-domain access and can only jump to resources in the current application.
After the request is forwarded, the URL in the browser's address bar will not change, so the browser does not know that forwarding occurred within the server, let alone the number of forwardings.
Web resources participating in request forwarding share the same request object and response object.
Since the forward() method will first clear the response buffer, the generated response will only be sent to the client when it is forwarded to the last web resource.
request domain object
Compare with Context domain object
1) Different life cycles
2) Different scopes
3) The number of web applications is different
4) Different ways to achieve data sharing
Distributed systems and calls
(Enterprise Java Beans)EJB
It is a component running on an independent server. The client calls the EJB object through the network.
EJB cluster service
It is to connect servers with different functional modules through RMI communication to achieve a complete function.
Try not to use EJB
Simple pure Web application development, no need to use EJB
Used in conjunction with other service programs, the network protocol called or returned can solve
Applications with C/S structure that are accessed concurrently by many people
(Remote Method Invocation)RMI
The technical foundation of EJB is RMI, which enables remote calling through RMI.
It is a method that uses the object serialization mechanism to implement distributed computing and remote calling.
CORBA
Common Object Request Broker Architecture
Object-oriented distributed application system specification, a solution for interconnection of hardware and software systems in heterogeneous distributed environments
Common terms
ORB
Object Request Broker, object request proxy
In an object-oriented distributed environment, ORB can provide key communication facilities for distributing messages between applications, servers, and network facilities.
It is the core component of CORBA and provides a framework structure for identifying and locating objects, handling connection management, transmitting data and requesting communication.
CORBA objects
It is a "virtual" entity that can be located by an Object Request Broker (ORB) and can be called by client program requests.
IOR
Interoperable Object Reference: Interoperable Object Reference
Stores almost all inter-ORB protocol information, used to establish communication between clients and target objects, and provides a standardized object reference format for ORB interoperability.
Each IOR includes a host name, TCP/IP port number and an object key, which identifies the target object based on the host name and port
It mainly consists of three parts: warehouse ID, endpoint information and object key
CORBA system diagram
CORBA request calling steps
(1).Locate the target object
(2). Call the server application
(3). Pass the parameters required for the call
(4). If necessary, activate the servo program that calls the target object
(5). Wait for the request to end
(6). If the call is successful, return the out/inout parameters and pass the return value to the client.
(7). If the call fails, return an exception to the client.
feature
Realizes cross-platform and cross-language data transmission under distributed systems
Implemented local calling of remote methods (calling methods on the remote server locally and obtaining the return value)
Heterogeneous distributed system principles
Seeking platform-independent models and abstractions
Hide underlying complex details as much as possible without sacrificing too much performance
other
Java features
Compare with C++
C++ supports pointers, but Java has no concept of pointers
C++ supports multiple inheritance; Java does not (a class has a single superclass, but may implement multiple interfaces)
Java is a completely object-oriented language
Java reclaims memory automatically, while in C++ memory resources must be released by the program
Java does not support operator overloading, which is considered a prominent feature of C++
Java has no preprocessor; in place of #include it uses import statements
Java does not support default parameter values, but C++ does
C and C++ have no built-in string type, only character arrays; Java provides the String class
Java does not provide the goto statement
Java does not support the implicit automatic type conversions of C++; narrowing conversions require an explicit cast
fail-fast
Default refers to an error detection mechanism of Java collections
When a collection is structurally modified while being iterated (for example by another thread), the fail-fast mechanism may trigger, throwing ConcurrentModificationException.
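The mechanism can be triggered even in a single thread by modifying a list inside its own iteration; a minimal sketch (`FailFastDemo` is an illustrative name):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

public class FailFastDemo {
    // a structural modification mid-iteration trips the iterator's
    // modCount check on the next call to next()
    public static boolean triggersFailFast() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b"));
        try {
            for (String s : list) list.add("x"); // modify while iterating
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // fail-fast detected the concurrent modification
        }
    }

    public static void main(String[] args) {
        System.out.println(triggersFailFast()); // true
    }
}
```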
tool
Source code editing
Notepad
EditPlus
UltraEdit
Sublime Text
vim
IDE
Eclipse IDE
MyEclipse
Intellij IDEA
NetBeans
Servlet container/server
Tomcat
Structure diagram
JBoss
Jetty
WebSphere
WebLogic
GlassFish
code generation
Xdoclet
Compression algorithm
Gzip
High compression ratio, slow speed
deflate
deflate(lvl=1)
Low compression, fast
. . .
deflate(lvl=9)
High compression ratio, slow speed
Bzip2
LZMA
XZ
LZ4
LZ4(high)
LZ4(fast)
Very fast, up to 320M/S
LZO
Snappy
Snappy (framed)
Snappy(normal)
Programming ideas and design patterns
Design Principles
single responsibility principle
Open-Closed Principle
Liskov Substitution Principle
dependency inversion principle
Interface isolation principle
Synthetic Reuse Principle
Demeter's Law
Design Patterns
creational pattern
(1) Singleton mode (Singleton)
(2) Simple factory pattern (SimpleFactory)
(3) Factory Method Pattern (FactoryMothod)
(4)Abstract Factory Pattern (AbstratorFactory)
(5)Builder Pattern
(6)Prototype Pattern
structural pattern
(7)Adapter Pattern
(8)Bridge Pattern
(9)Decorator Pattern
(10)Composite Pattern
(11)Facade Pattern
(12)Flyweight Pattern
(13)Proxy Pattern
behavioral patterns
(14)Template Method
(15)Command Pattern
(16)Iterator Pattern
(17)Observer Pattern
(18)Mediator Pattern
(19)State Pattern
(20)Strategy Pattern
(21)Responsibility chain model
(22)Vistor Pattern
(23) Memento Pattern
(24)Interpreter mode
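As one illustration from the creational group above, a minimal thread-safe Singleton using the initialization-on-demand holder idiom (class names are illustrative):

```java
public class SingletonDemo {
    static final class Singleton {
        private Singleton() {} // private constructor: no outside instantiation

        // the holder class is loaded lazily on first getInstance() call,
        // and class loading guarantees thread safety without locks
        private static class Holder {
            static final Singleton INSTANCE = new Singleton();
        }

        static Singleton getInstance() { return Holder.INSTANCE; }
    }

    public static void main(String[] args) {
        // both calls return the single shared instance
        System.out.println(Singleton.getInstance() == Singleton.getInstance()); // true
    }
}
```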
Programming ideas
AOP Aspect-oriented
introduce
Separation of concerns: Different problems are solved by different parts
Aspect-oriented programming AOP is the embodiment of this technology
Universal function implementation corresponds to the so-called aspect (Aspect)
After the business function code and aspect code are separated, the architecture will become highly cohesive and low coupling
To keep functionality complete, aspects ultimately need to be woven back into the business logic (weaving)
AOP three weaving methods
Compile-time weaving (static proxies)
A special Java compiler is required, e.g. AspectJ
Weaving during class loading
Requires a special class loader or bytecode weaving agent, such as AspectJ or AspectWerkz
Runtime weaving (dynamic proxies)
Spring adopts this method to achieve simple implementation through dynamic proxy
Spring AOP
JDK dynamic proxy
The proxied class is received through reflection, and the proxied class must implement an interface.
InvocationHandler interface Proxy.newProxyInstance()
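A minimal runnable sketch of a JDK dynamic proxy built from these two pieces, `InvocationHandler` and `Proxy.newProxyInstance` (the interface, class and log text below are illustrative):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class ProxyDemo {
    interface Greeter { String greet(String name); }

    static class RealGreeter implements Greeter {
        public String greet(String name) { return "hello " + name; }
    }

    // the InvocationHandler plays the role of the aspect:
    // it runs around every method call on the proxy
    public static Greeter wrap(Greeter target) {
        InvocationHandler h = (proxy, method, args) -> {
            Object result = method.invoke(target, args); // delegate to the real object
            return "[logged] " + result;                 // advice applied after the call
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[]{Greeter.class}, // the proxied class must implement an interface
                h);
    }

    public static void main(String[] args) {
        System.out.println(wrap(new RealGreeter()).greet("world")); // [logged] hello world
    }
}
```

This is why JDK proxying requires the target to implement an interface: the generated proxy class extends `Proxy` and can only share a type with the target through interfaces, which is exactly the limitation CGLIB (subclassing) removes.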
CGLIB dynamic proxy
It is a code generation class library that can dynamically generate subclasses of a certain class at runtime.
Dynamic proxying is done through inheritance, so if a class is marked final it cannot be proxied by CGLIB
MethodInterceptor interface Enhancer class
AOP main noun concepts
Aspect
Aspect: The code for common functions is the implementation
Target
Target: to be woven into the Aspect object
Join point
Points that can serve as insertion opportunities; every method execution can be a join point
Pointcut
Define the Join Point where Aspect is actually applied, supporting regular expressions
Advice
Methods in the class and how this method is woven into the target method
Classification
Before notification
AfterRunning notification
Exception notificationAfterThrowing
Final notice After
Around notifications Around
Weaving
AOP implementation process
JDKProxy, Cglib to generate proxy objects
The details are determined by AopProxyFactory based on the configuration of the AdvisedSupport object.
The strategy: if the target implements an interface, JDKProxy is used by default; otherwise Cglib is used
JDKProxy
JDK dynamic proxy is an implementation method of proxy mode. It receives the class to be proxied through reflection and requires that the class to be proxied must implement the interface.
JDKProxy core
InvocationHandler interface and Proxy class
The reflection mechanism is more efficient in the process of generating classes
Cglib
Implement the proxy of the target class through inheritance, and the bottom layer is implemented with the help of ASM
If the target class is set to final and cannot be inherited You cannot use Cglib dynamic proxy
ASM is more efficient in the execution process after generating classes.
proxy mode
Interface Real implementation class Proxy class
Proxy implementation in Spring
The logic of the real implementation class is contained in the getBean method
What the getBean method returns is actually an instance of Proxy
The Proxy instance is dynamically generated by Spring using JDKProxy or Cglib
IOC control reverse
The core part of Spring Core
Dependency Inversion dependency injection
Example: when the superstructure depends directly on the lower structure, the code is almost unmaintainable; to change the Wheel you would have to modify every class above it
The meaning of dependency injection: Pass the bottom class as a parameter to the upper class to realize the "control" of the upper layer over the lower layer
Injection method
Setter injection
Interface injection
Annotation injection
constructor injection
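The car/wheel example above can be written as plain-Java constructor injection (the `Car` and `Wheel` names are illustrative, not from any framework):

```java
// The upper-layer Car receives its lower-layer Wheel instead of constructing
// it, so swapping the wheel no longer forces changes inside Car.
interface Wheel { int size(); }

class SportWheel implements Wheel {
    public int size() { return 17; }
}

class Car {
    private final Wheel wheel;
    // Dependency passed in from outside: "control" moves to the caller/container
    Car(Wheel wheel) { this.wheel = wheel; }
    int wheelSize() { return wheel.size(); }
}

public class ConstructorInjectionDemo {
    public static void main(String[] args) {
        // Wiring is done by the caller; an IOC container automates this step
        Car car = new Car(new SportWheel());
        System.out.println(car.wheelSize());
    }
}
```

With setter or annotation injection the wiring point changes, but the principle is the same: the dependency is supplied from outside rather than created inside.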
Advantages of IOC containers
Avoid using new everywhere to create classes and maintain them uniformly
In the process of creating an instance, you do not need to know the details
What happens when a project starts
1. When Spring starts, it reads the bean configuration information provided by the application
2. A bean definition registry is generated from the configuration that was read
3. Beans are instantiated according to the registry and their dependencies assembled, providing a ready runtime environment for upper layers
4. Java's reflection mechanism is used to instantiate the beans and establish the dependencies between them
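The registry-then-reflection idea can be sketched with a toy factory (`TinyBeanFactory` is a hypothetical name, not Spring's API; real bean definitions carry far more than a class name):

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration: a registry of bean "definitions" (here just class names)
// from which singleton beans are instantiated via reflection.
public class TinyBeanFactory {
    private final Map<String, String> registry = new HashMap<>();   // name -> class name
    private final Map<String, Object> singletons = new HashMap<>(); // instantiated beans

    public void register(String name, String className) {
        registry.put(name, className);
    }

    public Object getBean(String name) {
        Object bean = singletons.get(name);
        if (bean == null) {
            String className = registry.get(name);
            try {
                // Instantiate via reflection, as in step 4
                bean = Class.forName(className).getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot create bean " + name, e);
            }
            singletons.put(name, bean);
        }
        return bean;
    }

    public static void main(String[] args) {
        TinyBeanFactory f = new TinyBeanFactory();
        f.register("builder", "java.lang.StringBuilder");
        System.out.println(f.getBean("builder") == f.getBean("builder")); // cached singleton
    }
}
```

This omits dependency assembly entirely; it only shows the read-register-instantiate flow.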
What functions does IOC support?
dependency injection
Dependency check
Autowiring (automatic assembly)
Support collections
Specify initialization and destruction methods
Support callback for certain methods
Spring IOC container core interface
BeanFactory
Provides IOC configuration mechanism
Contains various definitions of beans to facilitate instantiation of beans
Establish dependencies between beans
Bean life cycle control
ApplicationContext
Inherits multiple interfaces
BeanFactory
Able to manage and assemble Beans
ResourcePatternResolver
Ability to load resource files
MessageSource
Ability to implement internationalization related functions
ApplicationEventPublisher
Ability to register listeners and implement monitoring mechanisms
BeanDefinition
Mainly used to describe bean definitions
BeanDefinitionRegistry
Provides methods to register BeanDefinition objects with the IOC container
BeanFactory and ApplicationContext comparison
BeanFactory is Spring's low-level infrastructure, oriented toward Spring itself; ApplicationContext is the higher-level interface for developers using the framework
refresh method
Provide conditions for IOC container and Bean life cycle management
Refresh Spring context information and define Spring context loading process
getBean method
Convert beanName
Load instance from cache
Instantiate beans
Detect parentBeanFactory
Initialize dependent beans
Create beans
Common interview questions
Spring Bean Scope
singleton
The default scope of the Spring container, there will be a unique Bean instance in the container
Suitable for stateless beans
prototype
For each getBean request, the container creates a new Bean instance
Suitable for stateful beans
request
A Bean instance will be created for each HTTP request
session
A Bean instance will be created for each Session
globalSession
A Bean instance will be created for each global Http Session. This scope only takes effect for Portlets
The request/session/globalSession scopes require additional web-container support
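The singleton/prototype distinction can be sketched with a toy container (`ScopedContainer` is a hypothetical name; Spring's real scope handling lives in the bean factory):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Singleton scope caches one instance per bean name; prototype scope builds
// a fresh instance on every getBean call.
public class ScopedContainer {
    private final Map<String, Supplier<Object>> factories = new HashMap<>();
    private final Map<String, Object> singletonCache = new HashMap<>();
    private final Map<String, Boolean> prototypes = new HashMap<>();

    public void register(String name, Supplier<Object> factory, boolean prototype) {
        factories.put(name, factory);
        prototypes.put(name, prototype);
    }

    public Object getBean(String name) {
        if (Boolean.TRUE.equals(prototypes.get(name))) {
            return factories.get(name).get();               // new instance per request
        }
        // singleton: create once, then always serve the cached instance
        return singletonCache.computeIfAbsent(name, n -> factories.get(n).get());
    }

    public static void main(String[] args) {
        ScopedContainer c = new ScopedContainer();
        c.register("single", Object::new, false);
        c.register("proto", Object::new, true);
        System.out.println(c.getBean("single") == c.getBean("single")); // true
        System.out.println(c.getBean("proto") == c.getBean("proto"));   // false
    }
}
```

The web scopes (request/session) follow the same caching idea, keyed by the current HTTP request or session instead of by a global map.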
Bean life cycle
InstantiateBean
Aware (inject Bean ID, BeanFactory, AppCtx)
Aware interface declares dependencies
BeanPostProcessor(s).postProcessBeforeInitialization
Pre-initialization hook: after Spring finishes instantiation, custom processing logic can be applied to beans instantiated by the container
InitializingBean(s).afterPropertiesSet
Custom bean init method
BeanPostProcessor(s).postProcessAfterInitialization
Post-initialization hook: custom operations after bean initialization completes
Bean initialization completed
Bean creation
Bean destruction process
If the DisposableBean interface is implemented, the destroy method will be called
If the destroy-method attribute is configured, the previously configured destruction method will be called.
Bean destruction
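The init/destroy callback pair can be mimicked in plain Java (the `Initializing`/`Disposable` interfaces below imitate, but are not, Spring's `InitializingBean` and `DisposableBean`; all names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

// Hand-rolled stand-ins for Spring's lifecycle callback interfaces.
interface Initializing { void afterPropertiesSet(); }
interface Disposable { void destroy(); }

class PooledResource implements Initializing, Disposable {
    final List<String> events = new ArrayList<>();
    public void afterPropertiesSet() { events.add("init"); }    // after properties are set
    public void destroy() { events.add("destroy"); }            // on container shutdown
}

public class LifecycleSketch {
    // What a container would do around such a bean
    static PooledResource run() {
        PooledResource bean = new PooledResource();  // 1. instantiate
        bean.afterPropertiesSet();                   // 2. init callback
        bean.destroy();                              // 3. destroy callback at shutdown
        return bean;
    }

    public static void main(String[] args) {
        System.out.println(run().events);
    }
}
```

In Spring, a bean configured with a `destroy-method` attribute gets the same treatment without implementing any interface: the container simply invokes the named method reflectively at shutdown.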
Advanced Java Basics
Theoretical basis
stack, heap and method area
Variables of basic data types, object references, and function calls are saved using the stack space in the JVM.
Objects created through the new keyword and constructor are placed in the heap space
The method area and heap are memory areas shared by each thread and are used to store class information, constants, static variables, code compiled by the JIT compiler, etc. that have been loaded by the JVM.
The stack space is the fastest to operate, but the stack is very small. Usually a large number of objects are placed in the heap space.
Both stack and heap sizes can be adjusted through JVM startup parameters.
Running out of stack space will cause a StackOverflowError, while insufficient heap and constant pool space will cause an OutOfMemoryError.
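Exhausting the stack can be demonstrated directly; this sketch (class and method names illustrative) triggers and catches a `StackOverflowError`, which is an `Error`, not an `Exception`:

```java
// Each call pushes a new frame on the thread's stack; with no base case the
// stack fills up and the JVM raises StackOverflowError.
public class StackDemo {
    static int depth = 0;

    static void recurse() {
        depth++;
        recurse(); // no base case: frames pile up until the stack is full
    }

    static int overflowDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // Catching an Error is normally bad practice; done here only to
            // observe how deep the stack got before overflowing.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("stack exhausted after " + overflowDepth() + " frames");
    }
}
```

The depth reached depends on the stack size (tunable with `-Xss`), which is why the exact number varies between runs and JVMs.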
Basic types and packaging types
Wrapper types can be null, basic types cannot
Wrapper types can be used with generics, basic types cannot
Primitive types are more efficient than wrapper types
Primitive local variables store their values directly on the stack, while wrapper types store an object in the heap and access it through a reference
Two wrapper objects can hold the same value yet not be identical (== compares references)
Integer chenmo = new Integer(10);
Integer wanger = new Integer(10);
System.out.println(chenmo == wanger);      // false
System.out.println(chenmo.equals(wanger)); // true
Packing and unboxing
The process of converting basic types into wrapped types is called boxing
The process of converting a wrapped type into a basic type is called unboxing
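Autoboxing goes through `Integer.valueOf`, which caches values in the range [-128, 127]; since `==` compares references, boxed values inside the cache compare equal while larger ones do not:

```java
// Autoboxing: Integer.valueOf caches [-128, 127], so == behaves differently
// inside and outside that range.
public class BoxingDemo {
    public static void main(String[] args) {
        Integer a = 127, b = 127;   // boxed from the cache: same object
        Integer c = 128, d = 128;   // outside the cache: distinct objects
        System.out.println(a == b);        // true  (same cached instance)
        System.out.println(c == d);        // false (different instances)
        System.out.println(c.equals(d));   // true  (value comparison)
        int e = c;                  // unboxing: compiles to c.intValue()
        System.out.println(e == 128);      // true
    }
}
```

This is one more reason to compare wrapper values with `equals` rather than `==`.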
String and StringBuilder, StringBuffer
String is a read-only string, which means that the content of the string referenced by String cannot be changed.
The string object represented by the StringBuffer/StringBuilder class can be modified directly
StringBuilder is meant for single-threaded use and is not thread-safe, but its performance is much better than StringBuffer's
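Because String is immutable, concatenating in a loop creates a new object on each iteration, whereas StringBuilder mutates one internal buffer. A small sketch (names illustrative):

```java
// StringBuilder appends into one mutable buffer instead of creating a new
// String per iteration.
public class StringBuildDemo {
    static String joinDigits() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 5; i++) {
            sb.append(i); // modifies sb in place
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinDigits()); // "01234"
    }
}
```

StringBuffer offers the same API with synchronized methods, which is where its extra cost comes from.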
final, finally, finalize
final
If a class is declared final, it cannot derive new subclasses, i.e. it cannot be inherited; in this sense it is the opposite of abstract
Declare variables as final to ensure that they are not changed during use
finally
Usually follows a try...catch block; the finally block always executes, i.e. whether the program completes normally or throws an exception, this code runs as long as the JVM does not exit
finalize
Java allows the use of the finalize() method to do necessary cleanup work before the garbage collector clears the object from memory.
static final/final static
A field declared static final cannot be modified once assigned a value, and can be accessed through the class name
A static final method cannot be overridden and can be called without creating an instance
== and hashCode and equals methods
equals() determines whether two objects are genuinely equal
The default implementation in the Object class is: return this == obj. True will be returned only if this and obj refer to the same object.
Override equals
Reflexivity: x.equals(x) must be true
For null: x.equals(null) must be false
Symmetry: x.equals(y) and y.equals(x) have the same result
Transitivity: if a equals b and b equals c, then a must also equal c
Consistency: during a run, repeated calls must keep returning the same result as long as the fields used by equals have not changed
hashCode() is used to narrow the search range
If you override equals, you must also override hashCode
The fields that participate in the equals function must also participate in the calculation of hashCode.
Equivalent (calling equals returns true) objects must produce the same hash code. Non-equivalent objects do not require the resulting hash codes to be different.
The fields that determine equality in equals take part in the hash computation; each significant field contributes a component to the final hash value
== Determine whether the object addresses are equal
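A sketch of a correct equals/hashCode pair following the rules above (the `Point` class is illustrative): the same fields drive both methods, so equal objects always share a hash code.

```java
import java.util.Objects;

// equals and hashCode are both computed from the same fields (x, y).
public class Point {
    private final int x, y;
    public Point(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;              // reflexivity fast path
        if (!(obj instanceof Point)) return false; // also covers obj == null
        Point other = (Point) obj;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);                 // same fields as equals
    }

    public static void main(String[] args) {
        System.out.println(new Point(1, 2).equals(new Point(1, 2)));                     // true
        System.out.println(new Point(1, 2).hashCode() == new Point(1, 2).hashCode());    // true
        System.out.println(new Point(1, 2) == new Point(1, 2));                          // false: different addresses
    }
}
```

Skipping the hashCode override would break HashMap/HashSet lookups: two equal points could land in different buckets.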
RPC aspect
The reason why serialization & deserialization of Protocol Buffer is simple & fast is
1. The encoding/decoding scheme is simple (only basic operations such as bit shifts)
2. Encoding/decoding is handled by Protocol Buffer's own framework code and compiler-generated code
The reasons why Protocol Buffer has good data compression effect (that is, the volume of data after serialization is small) is:
1. Unique encoding schemes, such as Varint and ZigZag encoding
2. T-L-V (tag-length-value) storage: fewer delimiters and a compact data layout
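As an illustration of ZigZag (one of the encodings mentioned above): it maps signed integers to unsigned ones so that small negative numbers also get short varint encodings (0→0, -1→1, 1→2, -2→3, ...). A minimal 32-bit sketch:

```java
// ZigZag mapping for 32-bit signed integers, as used by Protocol Buffers'
// sint32 fields before varint encoding.
public class ZigZag {
    static int encode(int n) {
        return (n << 1) ^ (n >> 31);   // arithmetic shift spreads the sign bit
    }

    static int decode(int z) {
        return (z >>> 1) ^ -(z & 1);   // invert the mapping
    }

    public static void main(String[] args) {
        System.out.println(encode(0));   // 0
        System.out.println(encode(-1));  // 1
        System.out.println(encode(1));   // 2
        System.out.println(decode(encode(-123456)));  // -123456
    }
}
```

Without ZigZag, a plain varint of -1 would occupy the maximum width, since its two's-complement form has all high bits set.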
database
How does InnoDB's repeatable-read isolation level avoid phantom reads?
For snapshot reads (non-blocking reads): pseudo-MVCC consistent views
For current reads: next-key locks (record lock + gap lock)
What you need to know about the redo log