MindMap Gallery CFA mind map
CFA stands for Chartered Financial Analyst, a qualification widely regarded as one of the most rigorous in the global investment industry. This mind map outlines the statistics and probability content, including basic concepts of statistics, probability, probability distributions, hypothesis testing, and sampling and estimation.
Edited at 2021-08-01 21:33:31
Statistics and Probability
Sampling and Estimation
sampling
type
1. stratified random sampling
Divide the population into groups (strata) first, then draw a simple random sample from each group
2. simple random sampling
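The "group first, then sample" idea behind stratified random sampling can be sketched as follows. This is a minimal illustration, not a prescribed method: the function name `stratified_sample`, the bond population, and the 10% sampling fraction are all hypothetical.

```python
import random

def stratified_sample(population, strata_key, fraction, seed=0):
    """Group items into strata, then draw a simple random sample from each."""
    rng = random.Random(seed)
    strata = {}
    for item in population:
        strata.setdefault(strata_key(item), []).append(item)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one item per stratum
        sample.extend(rng.sample(members, k))       # simple random sample within the stratum
    return sample

# Hypothetical population: 80 small-cap and 20 large-cap bonds
bonds = [("small", i) for i in range(80)] + [("large", i) for i in range(20)]
picked = stratified_sample(bonds, strata_key=lambda b: b[0], fraction=0.1)
```

Each stratum keeps its share of the population in the sample, which a single simple random draw does not guarantee.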
Data
1. time-series data
If there are structural changes during the period, this may lead to bias
2. cross-sectional data
error
1. data mining bias
A statistically significant result is not necessarily supported by economic theory
2. sample selection bias
survivorship bias
3. look-ahead bias
Uses data that was not yet available at the time the prediction would have been made
4. time-period bias
The result holds only within a specific time period and does not generalize to all periods.
estimate
point estimate
Use estimators to estimate population parameters
Desirable properties of an estimator
unbiasedness
The expected value of the statistic is equal to the population parameter
efficiency
Among unbiased sample statistics, the variance is the smallest
consistency
As the sample size increases, the probability that the sample statistic is close to the population parameter increases.
The sample mean is the best estimator of the population mean
Central limit theorem
Describes the sampling distribution of the sample mean
condition
n >= 30
The population has mean μ and a finite variance σ²
for simple random sampling
conclusion
The sample mean approximately follows a [normal distribution]
The [mean] of the sample mean equals the population [mean] μ
Variance of the sample mean = σ²/n
Standard error of the sample mean = σ/√n (use s/√n when σ is unknown)
Pay attention to the distinction between [standard deviation] and [standard error]
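A small simulation can make the difference between standard deviation and standard error concrete. The sketch below assumes a uniform population on [0, 1] (a hypothetical, non-normal choice) and compares the spread of many sample means with the σ/√n value from the central limit theorem.

```python
import random
import statistics

rng = random.Random(42)
n = 36          # sample size (>= 30)
trials = 5000   # number of repeated samples

# Assumed population: uniform on [0, 1], mean 0.5, variance 1/12 (not normal)
sigma = (1 / 12) ** 0.5   # population standard deviation

sample_means = [
    statistics.fmean(rng.random() for _ in range(n)) for _ in range(trials)
]

theoretical_se = sigma / n ** 0.5              # standard error: sigma / sqrt(n)
simulated_se = statistics.stdev(sample_means)  # spread of the sample means
```

The standard deviation describes the spread of individual observations; the standard error describes the spread of the sample mean, which shrinks as n grows.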
confidence interval estimate
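A range around the point estimate that is expected to contain the population parameter with a stated probability: point estimate ± reliability factor × standard error
5% significance level = 95% confidence level
Choosing the reliability factor: z-distribution (mean 0, variance 1) when the population variance is known & t-distribution when the population variance is unknown and replaced by the sample variance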
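As a sketch of the interval "point estimate ± reliability factor × standard error", the example below builds a z-based confidence interval with the standard library. It assumes the population standard deviation is known; the function name and the input numbers are hypothetical.

```python
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for the mean when the population sigma is known."""
    alpha = 1 - confidence                    # significance level, e.g. 5%
    z = NormalDist().inv_cdf(1 - alpha / 2)   # reliability factor (about 1.96 for 95%)
    se = sigma / n ** 0.5                     # standard error of the sample mean
    return sample_mean - z * se, sample_mean + z * se

# Hypothetical inputs: mean return 8%, sigma 20%, 100 observations
low, high = z_confidence_interval(sample_mean=0.08, sigma=0.20, n=100)
```

When the population variance is unknown, the same structure applies with the sample standard deviation and a t-distribution reliability factor with n-1 degrees of freedom.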
hypothesis testing
steps
null hypothesis & alternative hypothesis
Determine the test statistic
After standardization, the sample mean follows the standard normal distribution
significance & critical value
The critical value is the cutoff that determines whether to reject the null hypothesis.
two-tailed test & one-tailed test
The test statistic is the same in both cases; the main difference is the rejection region
p-value
The smallest significance level at which the null hypothesis can be rejected
If the p-value is less than the significance level, reject the null hypothesis; otherwise, do not reject it
Type 1 error & Type 2 error
decision making
statistical significance & economic significance
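The steps above can be sketched for a two-tailed z-test of a single mean. This is an illustration under the assumption that the population standard deviation is known; the function name `two_tailed_z_test` and the inputs are hypothetical.

```python
from statistics import NormalDist

def two_tailed_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Test H0: mu = mu0 against Ha: mu != mu0 with known sigma."""
    z = (sample_mean - mu0) / (sigma / n ** 0.5)   # standardized test statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # probability in both tails
    reject = p_value < alpha                       # decision rule
    return z, p_value, reject

# Hypothetical inputs: sample mean 12%, hypothesized mean 8%, sigma 20%, n = 100
z, p_value, reject = two_tailed_z_test(0.12, mu0=0.08, sigma=0.20, n=100)
```

Comparing the p-value with the significance level gives the same decision as comparing the test statistic with the critical value. Whether a rejected null also matters economically is a separate judgment.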
Hypothesis test for normal population
mean
single
Compare with a constant
Choose z test or t test (degrees of freedom n-1) according to the situation
two
Independent
The variance is unknown, assuming that [variance 1] and [variance 2] are equal
The degrees of freedom are n1 + n2 - 2
The variance is unknown, assuming that [variance 1] and [variance 2] are not equal
The degrees of freedom come from a more complex approximation formula
paired comparison test
Used when the two samples are related (dependent)
The test is based on the differences between paired observations, d = x1 - x2, with mean μd. H0: μd = μd0, Ha: μd ≠ μd0, and μd0 is usually 0
Use t test with n-1 degrees of freedom
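A paired comparison test works on the differences between matched observations. The sketch below computes the t-statistic and its n-1 degrees of freedom. The function name and the data are hypothetical, and the critical value would still need to be read from a t-table.

```python
import statistics

def paired_t_statistic(before, after, mu_d0=0.0):
    """t-statistic for H0: mean difference = mu_d0, using paired observations."""
    diffs = [a - b for a, b in zip(after, before)]   # one difference per pair
    n = len(diffs)
    d_bar = statistics.fmean(diffs)                  # sample mean difference
    s_d = statistics.stdev(diffs)                    # sample std dev of differences
    t_stat = (d_bar - mu_d0) / (s_d / n ** 0.5)
    return t_stat, n - 1                             # statistic and degrees of freedom

# Hypothetical paired data
t_stat, df = paired_t_statistic(before=[10, 12, 9, 11], after=[12, 15, 10, 14])
```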
Correlation coefficient
H0: ρ = 0 (no linear relationship); Ha: ρ ≠ 0 (a linear relationship exists)
ρ ranges from -1 to 1
Uses a t-test with n-2 degrees of freedom, because there are two variables
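The standard test statistic for a sample correlation r is t = r·√(n-2) / √(1-r²). A minimal sketch, with a hypothetical function name and inputs:

```python
def correlation_t_statistic(r, n):
    """t-statistic for H0: rho = 0, given sample correlation r and n observations."""
    t_stat = r * (n - 2) ** 0.5 / (1 - r ** 2) ** 0.5
    return t_stat, n - 2   # statistic and degrees of freedom

# Hypothetical inputs: r = 0.5 from 27 observations
t_stat, df = correlation_t_statistic(r=0.5, n=27)
```

For a fixed r, a larger sample gives a larger t-statistic, so even a modest correlation can be statistically significant when n is large.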
variance
single
Is it equal to a certain constant?
Chi-square statistic
Degrees of freedom are n-1
two
Are the two variances equal?
F distribution, degrees of freedom are n1-1, n2-1
The larger sample variance goes in the numerator, and the first degrees of freedom belong to the numerator, so F >= 1
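The convention of placing the larger sample variance in the numerator can be sketched as below. The function name and the two samples are hypothetical. The result is the F-statistic together with the numerator and denominator degrees of freedom.

```python
import statistics

def f_statistic(sample_a, sample_b):
    """F = larger sample variance / smaller sample variance, so F >= 1."""
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    # Swap so the larger variance and its degrees of freedom come first
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

f, df_num, df_den = f_statistic([1, 2, 3, 4, 5], [2, 4, 6, 8, 10, 12])
```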
Parametric and non-parametric tests
parametric
Concerned with population parameters
It is assumed that the population obeys a certain distribution
non-parametric
The population distribution is unknown, or the sample data do not follow a specific distribution.
Data are ranks (ordinal), so adding, subtracting, multiplying or dividing them is not meaningful.
Does not involve population parameters
Probability distributions
continuous random variable
The probability of any single point is 0
probability density function (PDF): probabilities are obtained for intervals of values, not for single points
cumulative distribution function CDF
Bounded between 0 and 1
Applications of discrete random variables
Bernoulli distribution
A single trial with only two possible outcomes
binomial random variable
Many trials, each with only two possible outcomes
The trials are independent, with the same probability of success.
Continuous uniform distribution
shortfall risk
The probability that the return falls below a threshold (minimum acceptable) level
Roy's safety-first ratio
The higher the ratio, the better
When the threshold return R_L equals the risk-free rate R_f, the SF ratio equals the Sharpe ratio
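Roy's safety-first ratio is commonly written as SFRatio = (E(Rp) - R_L) / σp. The sketch below ranks two hypothetical portfolios by it and, assuming normally distributed returns, converts each ratio into a shortfall probability. All names and numbers are illustrative.

```python
from statistics import NormalDist

def safety_first_ratio(expected_return, threshold_return, std_dev):
    """Roy's safety-first ratio: excess return over the threshold per unit of risk."""
    return (expected_return - threshold_return) / std_dev

# Hypothetical portfolios: (expected return, standard deviation)
portfolios = {"A": (0.10, 0.15), "B": (0.12, 0.25)}
threshold = 0.03   # minimum acceptable return R_L

ratios = {
    name: safety_first_ratio(mu, threshold, sd)
    for name, (mu, sd) in portfolios.items()
}
best = max(ratios, key=ratios.get)   # the higher the ratio, the better

# Under the normality assumption, shortfall risk = P(return < threshold) = N(-SFRatio)
shortfall = {name: NormalDist().cdf(-ratio) for name, ratio in ratios.items()}
```

Replacing `threshold` with the risk-free rate turns the same calculation into the Sharpe ratio.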
Lognormal distribution
Describe asset prices
non-negative number
Positively skewed (long right tail)
If ln(X) follows a normal distribution, then X follows a lognormal distribution
t-distribution
Small sample inference about the population
df=n-1
Confidence intervals are wider than those based on the normal distribution
Lower peak and fatter tails than the normal distribution: mean = 0, variance > 1, kurtosis > 3
multivariate distribution
multiple assets
3 sets of parameters: the mean of each asset, the variance of each asset, and the correlation coefficient between each pair of assets
With n assets, the number of distinct pairwise correlation coefficients is n(n-1)/2
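The number of distinct pairwise correlations among n assets is the number of ways to choose 2 assets out of n, that is, n(n-1)/2. A short sketch with a hypothetical helper name:

```python
import math

def parameter_counts(n_assets):
    """Count the means, variances and pairwise correlations for n assets."""
    means = n_assets
    variances = n_assets
    correlations = math.comb(n_assets, 2)   # n(n-1)/2 distinct pairs
    return means, variances, correlations

counts = parameter_counts(5)
```

The correlation count grows roughly with the square of n, which is why large portfolios need many estimates.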
Monte Carlo simulation
Assume the return r follows a specified distribution (for example, normal), repeatedly draw random values of r, simulate the outcomes to obtain their probability distribution, and then conduct scenario analysis
Disadvantages: Complex, if the assumptions are incorrect, the conclusions drawn will also be incorrect.
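A minimal Monte Carlo sketch: assume each period's return is normally distributed, draw many random paths, and read probabilities off the simulated outcomes. The distribution parameters, horizon, and function name are hypothetical assumptions. If they are wrong, the simulated probabilities are wrong too.

```python
import random

def simulate_terminal_values(start, mu, sigma, periods, n_paths, seed=0):
    """Simulate terminal portfolio values with i.i.d. normal period returns."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_paths):
        value = start
        for _ in range(periods):
            value *= 1 + rng.gauss(mu, sigma)   # one randomly drawn period return
        results.append(value)
    return results

# Hypothetical assumptions: 6% mean return, 15% volatility, 10 periods
values = simulate_terminal_values(100.0, mu=0.06, sigma=0.15, periods=10, n_paths=5000)
prob_loss = sum(v < 100.0 for v in values) / len(values)   # estimated P(ending below start)
```

Historical simulation follows the same loop but draws the period returns from past observed returns instead of from an assumed distribution.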
historical/back simulation
Make predictions based on past historical data
Disadvantages: historical data may not represent the future, so the results are unreliable if there are structural changes
Probability
odds
joint probability
P(AB) = P(A|B)*P(B)
Addition rule
P(A or B) = P(A) + P(B) - P(AB)
total probability rule
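The events must be mutually exclusive and exhaustive
P(success) = 90% * 80% + 10% * 10% = 73%, where there is a 90% chance of reviewing hard with an 80% chance of success, and a 10% chance of not reviewing hard with a 10% chance of success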
Bayes' formula
cause and effect
Modify probabilities when new information becomes available
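The total probability rule and Bayes' formula can be worked through together. The sketch reads the review example as a 90% chance of reviewing hard (80% success rate) and a 10% chance of not reviewing hard (10% success rate). That reading, and the variable names, are assumptions.

```python
# Prior probabilities of the two mutually exclusive, exhaustive scenarios
p_review = 0.90
p_no_review = 1 - p_review

# Conditional probabilities of success in each scenario
p_success_given_review = 0.80
p_success_given_no_review = 0.10

# Total probability rule: weight each conditional probability by its scenario
p_success = (p_success_given_review * p_review
             + p_success_given_no_review * p_no_review)

# Bayes' formula: update the probability of "reviewed hard" after observing success
p_review_given_success = p_success_given_review * p_review / p_success
```

Observing success raises the probability that the candidate reviewed hard from the 90% prior to about 98.6%.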
Expected value E(X)
The variance of a random variable is itself an expectation: Var(X) = E[(X - E(X))^2]
covariance
Measures whether two assets move in the same or opposite directions
When cov=0, there is no linear relationship
The value ranges from negative infinity to positive infinity and depends on the units of measurement.
correlation
Ranges from -1 to 1
< 0: negative correlation
> 0: positive correlation
The larger the absolute value, the stronger the correlation
= 0: no linear relationship
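Correlation is covariance divided by the product of the two standard deviations, which removes the units. The sketch below uses hypothetical return series and helper names to show that rescaling returns from decimals to percent changes the covariance but not the correlation.

```python
import statistics

def sample_covariance(xs, ys):
    """Sample covariance with an n-1 denominator."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Correlation = covariance / (std dev of x * std dev of y)."""
    return sample_covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))

# Hypothetical returns of two assets, in decimals and in percent
x = [0.02, -0.01, 0.03, 0.00]
y = [0.01, -0.02, 0.04, 0.01]
x_pct = [100 * v for v in x]
y_pct = [100 * v for v in y]

cov_dec, cov_pct = sample_covariance(x, y), sample_covariance(x_pct, y_pct)
corr_dec, corr_pct = correlation(x, y), correlation(x_pct, y_pct)
```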
Permutation and combination
Permutation: order matters
Combination: order does not matter
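The difference between the two counts can be checked with Python's standard library (`math.perm` and `math.comb`, available from Python 3.8). The choice of 5 items taken 2 at a time is only an illustration.

```python
import math

n, r = 5, 2

ordered = math.perm(n, r)     # n! / (n-r)!  : order matters
unordered = math.comb(n, r)   # n! / (r!(n-r)!) : order does not matter

# Each unordered selection of r items can be arranged in r! different orders
arrangements_per_selection = math.factorial(r)
```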
Basic concepts of statistics
Four measurement scales
nominal scale
men and women
ordinal scale
first place, second place
interval scale
first grade, second grade
ratio scale
Scored 99 points on the exam
Two commonly used graphs representing frequency
histogram
frequency polygon
central tendency
mean
arithmetic mean
geometric mean
Measure the average return on assets over multiple periods
harmonic mean
The average cost per share when a fixed amount is invested each period
weighted mean
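The four means can be compared on small hypothetical inputs using the standard library's `statistics` module. The return series, prices, and weights below are made up for illustration.

```python
import statistics

returns = [0.10, -0.05, 0.20]   # hypothetical yearly returns

arithmetic = statistics.fmean(returns)
# Geometric mean return: compound (1 + r), take the n-th root, subtract 1
geometric = statistics.geometric_mean([1 + r for r in returns]) - 1

# Harmonic mean: average cost per share when the same amount is invested at each price
prices = [10.0, 20.0]
average_cost = statistics.harmonic_mean(prices)

# Weighted mean: portfolio return from asset weights and asset returns
weights = [0.6, 0.4]
asset_returns = [0.08, 0.03]
weighted = sum(w * r for w, r in zip(weights, asset_returns))
```

For returns that vary, the geometric mean is below the arithmetic mean, and the harmonic mean of the prices is below their arithmetic mean of 15.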
median
n is an odd number
The value at position (n+1)/2
n is an even number
The average of the values at positions n/2 and (n+2)/2
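The position rules for the median, (n+1)/2 for odd n and the average of positions n/2 and (n+2)/2 for even n, can be written out directly. Positions are 1-based in the sorted data, so the code subtracts 1 for Python's 0-based indexing. The function name is hypothetical, and `statistics.median` gives the same result.

```python
def median(values):
    """Median via the 1-based position rules on the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        return ordered[(n + 1) // 2 - 1]             # position (n+1)/2
    lower = ordered[n // 2 - 1]                      # position n/2
    upper = ordered[(n + 2) // 2 - 1]                # position (n+2)/2
    return (lower + upper) / 2

odd_case = median([3, 1, 2])
even_case = median([4, 1, 3, 2])
```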
mode
highest frequency
one
unimodal
two
bimodal
three
trimodal
does not exist
When every value is different (no value repeats)
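The unimodal, bimodal and "no mode" cases can be told apart by counting frequencies. In this sketch the function name `modes` is hypothetical, and returning an empty list when no value repeats follows the convention in the outline above.

```python
from collections import Counter

def modes(values):
    """Return the most frequent values, or [] when no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []   # every value occurs once: the mode does not exist
    return sorted(v for v, c in counts.items() if c == top)

unimodal = modes([1, 2, 2, 3])
bimodal = modes([1, 1, 2, 3, 3])
no_mode = modes([1, 2, 3])
```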