Social statistics mind map
This is a mind map about social statistics. Social statistics is an applied branch of statistics. It systematically collects, organizes, analyzes, and presents data on human behavior in social settings in order to reveal the patterns behind the data.
Edited at 2023-11-08 18:00:28
Test of consistency of two population distributions
Test of two related samples
Two sets of data obtained by applying the same measurement to one group of individuals at two phases, before and after (pre-test and post-test)
Parametric test: t-test. Calculate the test statistic (t or Z) and decide according to the small-probability principle.
For small samples, the prerequisite is that the population is normally distributed.
Non-parametric test
Sign test, signed-rank test (Statistical reasoning: if the population does not change between pre-test and post-test, the differences between the two samples come mainly from random error, so n+ ≈ n− and the positive rank sum ≈ the negative rank sum.) The test works by comparing the consistency of the distributions of the two measurements. Test statistic: T = min(T+, |T−|)
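A minimal sketch of the signed-rank statistic T described above. The function name `signed_rank_statistic` and the sample data are illustrative assumptions, not part of the source. The sketch drops zero differences and gives tied absolute differences their average rank, which is a common convention but is not stated in the mind map.

```python
def signed_rank_statistic(before, after):
    """Return (T_plus, T_minus, T) for paired data, where T = min(T_plus, T_minus)."""
    # Paired differences; zero differences carry no sign and are dropped.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ordered = sorted(diffs, key=abs)

    # Assign ranks 1..n by absolute value, averaging ranks within ties.
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        i = j

    t_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    t_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return t_plus, t_minus, min(t_plus, t_minus)


# Hypothetical pre-test and post-test scores for five individuals.
before = [10, 12, 9, 14, 11]
after = [12, 11, 13, 14, 16]
t_plus, t_minus, t = signed_rank_statistic(before, after)
```

If pre-test and post-test come from the same distribution, T_plus and T_minus should be close. A very small T is evidence against that, and it is compared with a critical value from a signed-rank table.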
Test of two independent samples
Parametric test
t-test: use the difference between the characteristic values (summary statistics) of the two samples to test the difference between the corresponding characteristic values of the two populations
1. Using the mean (the research variable is a scale variable)
Test for the difference between two population means. Large sample: test statistic Z. Small sample: test statistic t (the two populations must be normally distributed and have equal variances).
2. Using the frequency (the research variable is a categorical variable)
Test the difference in frequency between two populations using the difference in frequency between the two samples
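For the small-sample test of two means above, the following sketch computes a pooled-variance t statistic, which is the usual form under the equal-variance condition. The function name `pooled_t` and the data are illustrative assumptions.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample1, sample2):
    """Return (t, degrees_of_freedom) for two independent samples with equal variances."""
    n1, n2 = len(sample1), len(sample2)
    # Pooled estimate of the common variance.
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    standard_error = sqrt(sp2 * (1 / n1 + 1 / n2))
    t = (mean(sample1) - mean(sample2)) / standard_error
    return t, n1 + n2 - 2


# Hypothetical scores from two independent groups.
t_value, df = pooled_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
```

The resulting t is compared with the t-distribution with n1 + n2 − 2 degrees of freedom.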
Non-parametric test
Statistical reasoning: if the two populations have the same distribution, the sample distributions will not differ much. After the two samples are pooled and ranked, their observations will tend to alternate, so there will be many runs, and the rank sum will be neither too large nor too small.
rank sum test
Test statistic: T (rank sum)
run test
Test statistic: r (number of runs)
cumulative frequency test
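The two statistics named above, the rank sum T and the number of runs r, can both be read off the pooled ranking. A minimal sketch, assuming no tied values and two non-empty samples; the function name and data are illustrative.

```python
def runs_and_rank_sum(sample_a, sample_b):
    """Return (number_of_runs, rank_sum_of_sample_a) after pooling and ranking.

    Assumes no tied values across the two samples.
    """
    pooled = sorted([(x, "A") for x in sample_a] + [(x, "B") for x in sample_b])
    labels = [label for _, label in pooled]
    # A new run starts whenever the sample label changes.
    runs = 1 + sum(1 for prev, cur in zip(labels, labels[1:]) if prev != cur)
    # Rank sum of sample A (ranks start at 1).
    rank_sum_a = sum(rank for rank, label in enumerate(labels, start=1) if label == "A")
    return runs, rank_sum_a


separated = runs_and_rank_sum([1, 2, 3], [4, 5, 6])    # samples do not overlap
interleaved = runs_and_rank_sum([1, 3, 5], [2, 4, 6])  # samples alternate
```

Fully separated samples give the minimum of two runs and an extreme rank sum, while well-mixed samples give many runs and a moderate rank sum.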
social statistics
Basics
Probability and Probability Distributions of Random Variables
Classical probability, addition rule, multiplication rule, conditional probability
Probability distribution of discrete random variables
Central tendency and dispersion: E(X) = Σ xᵢpᵢ, D(X) = Σ [xᵢ − E(X)]² pᵢ
two-point distribution, binomial distribution, Hypergeometric distribution, Poisson distribution
Poisson distribution: the probability distribution of the number of occurrences of a random event within a unit of time (or a unit of area)
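The formulas for E(X) and D(X) above can be applied directly to a Poisson distribution. The sketch below is illustrative: the function names, the rate λ = 2, and the truncation at k = 59 are assumptions made for the example.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with rate lam."""
    return exp(-lam) * lam ** k / factorial(k)


def expectation_and_variance(values, probs):
    """E(X) = sum of x_i * p_i and D(X) = sum of (x_i - E(X))^2 * p_i."""
    ex = sum(x * p for x, p in zip(values, probs))
    dx = sum((x - ex) ** 2 * p for x, p in zip(values, probs))
    return ex, dx


lam = 2.0
ks = list(range(60))  # truncate the infinite support; the remaining tail is negligible
probs = [poisson_pmf(k, lam) for k in ks]
ex, dx = expectation_and_variance(ks, probs)
```

Both `ex` and `dx` come out very close to λ, reflecting the fact that a Poisson distribution has mean and variance equal to its rate.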
Probability distribution of continuous random variables
normal distribution, standard normal distribution (z-score), chi-square distribution, t-distribution
The law of large numbers, central limit theorem and sampling distribution
Law of large numbers: ①When n is large enough, frequency ≈ probability ② When n is large enough, the sample mean can be used to estimate the population mean μ
Central limit theorem: when n is large enough, the means of all possible samples of size n are approximately normally distributed, with mean μ and variance σ²/n.
Sampling distribution of sample mean
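The sampling distribution of the sample mean given by the central limit theorem can be checked by simulation. This sketch draws repeated samples from a uniform distribution on [0, 1), for which μ = 0.5 and σ² = 1/12. The choice of population, sample size, number of replications, and seed are all assumptions made for illustration.

```python
import random
from statistics import mean, variance


def simulate_sample_means(n, replications, seed=0):
    """Draw `replications` samples of size n from Uniform[0, 1) and return their means."""
    rng = random.Random(seed)
    return [sum(rng.random() for _ in range(n)) / n for _ in range(replications)]


n = 30
sample_means = simulate_sample_means(n=n, replications=5000)

mu, sigma2 = 0.5, 1 / 12             # population mean and variance of Uniform[0, 1)
observed_mean = mean(sample_means)    # should be close to mu
observed_var = variance(sample_means) # should be close to sigma2 / n
```

The sample means cluster around μ, and their variance shrinks in proportion to 1/n.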
Sampling distribution of sample frequencies
P ~ N(π, π(1−π)/n); (P − π)/√(P(1−P)/n) ~ N(0, 1)
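A small sketch of the standardized sample frequency above, using the sample frequency P in the standard error as the formula does. The function name and the numbers are illustrative assumptions.

```python
from math import sqrt


def proportion_z(p, pi, n):
    """Standardize a sample frequency p against a population frequency pi."""
    return (p - pi) / sqrt(p * (1 - p) / n)


# Hypothetical case: 60 of 100 respondents agree; the population frequency is taken as 0.5.
z = proportion_z(p=0.6, pi=0.5, n=100)
```

For these numbers z is a little above 2, which lies beyond the usual two-sided 5% cutoff of 1.96.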
Sampling distribution of sample variance
Parameter Estimation
point estimate
Use a sample characteristic value (statistic) calculated from the sample observations to estimate the unknown population characteristic value (parameter)
interval estimate
Basic principles of hypothesis testing
According to the principle of small probability
Use the x̄ statistic
Use the z statistic
Use p-value
Two types of errors in hypothesis testing
Rejecting a true hypothesis (Type I error)
The null hypothesis is true, but the test rejects the null hypothesis
Accepting a false hypothesis (Type II error)
The null hypothesis is false, but it is not rejected. The size of this error probability (β) depends on how close the null hypothesis is to the true population. · β = P(x1 ≤ X ≤ x2) = P(z1 ≤ Z ≤ z2) · When converting to z-scores, use the mean of the true population: z1 = (x1 − μ1)/(s/√n) ☆: β cannot be calculated when the true population mean is unknown.
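The β calculation above can be sketched as follows, where x1 and x2 bound the acceptance region and μ1 is the mean of the true population. The function name and all numbers are hypothetical. The example takes a null mean of 100, s = 15, and n = 36, so a two-sided 5% acceptance region is roughly 95.1 to 104.9.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(x1, x2, mu1, s, n):
    """beta = P(z1 <= Z <= z2), with z-scores taken around the true mean mu1."""
    standard_error = s / sqrt(n)
    z1 = (x1 - mu1) / standard_error
    z2 = (x2 - mu1) / standard_error
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return standard_normal.cdf(z2) - standard_normal.cdf(z1)


# True mean is 105: the null hypothesis (mean 100) is false but fairly close.
beta_close = type_ii_error(x1=95.1, x2=104.9, mu1=105, s=15, n=36)
# True mean is 110: the null hypothesis is further from the truth.
beta_far = type_ii_error(x1=95.1, x2=104.9, mu1=110, s=15, n=36)
```

β is large when the true mean is close to the hypothesized one and shrinks as the two move apart, which is the relationship the note above describes.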
Basics of Probability Theory + Basics of Inferential Statistics
Testing the distribution characteristics of a single population
Verify whether the conditions for using parametric tests hold
Testing the distribution characteristics of categorical variables
Chi-square test
H0: πᵢ = πᵢ₀; H1: πᵢ ≠ πᵢ₀. πᵢ₀ is the frequency of each category calculated from the population distribution being tested; use it to find the expected frequencies, then compute the test statistic.
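A minimal sketch of the steps above: turn the hypothesized category frequencies into expected counts, then sum (observed − expected)² / expected, the usual chi-square statistic. The function name and the counts are illustrative assumptions.

```python
def chi_square_statistic(observed, hypothesized_freqs):
    """Goodness-of-fit statistic: sum of (O - E)^2 / E over all categories."""
    n = sum(observed)
    expected = [n * p for p in hypothesized_freqs]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts in three categories, tested against frequencies 0.5, 0.3, 0.2.
chi2 = chi_square_statistic([45, 35, 20], [0.5, 0.3, 0.2])
```

With k categories the statistic is compared with the chi-square distribution with k − 1 degrees of freedom. Here k = 3, and the 5% critical value for 2 degrees of freedom is 5.991, so this hypothetical sample would not reject H0.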
Testing of distribution characteristics of scale variables
Two-sample t-tests, analysis of variance, and regression analysis all presuppose that the population is normally distributed.
H0: X ~ N(x̄, s²); H1: the population is not normally distributed. Compute D, the maximum absolute difference between the sample cumulative frequency and the expected distribution function, and compare it with the critical value Dα.
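The statistic D above can be sketched as follows. The function name and data are illustrative assumptions. Because the sample cumulative frequency is a step function, the sketch checks the gap on both sides of each step.

```python
from statistics import NormalDist, mean, stdev


def ks_statistic_normal(data):
    """Largest gap between the sample cumulative frequency and the fitted normal CDF."""
    xs = sorted(data)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))  # N(x-bar, s^2) estimated from the sample
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = fitted.cdf(x)
        # Compare with the cumulative frequency just after and just before the step at x.
        d = max(d, abs(i / n - f), abs((i - 1) / n - f))
    return d


d = ks_statistic_normal([-2, -1, 0, 1, 2])
```

H0 is rejected when D exceeds the tabulated critical value Dα for the sample size.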
PS: 1. The relationship between two categorical variables and hypothesis testing 2. The relationship between two scale variables and hypothesis testing
Relationship and test of two categorical variables
Contingency correlation coefficient
Constructed on the proportional reduction in error (PRE) principle; can serve as an indicator of the strength of the relationship between two variables
λ coefficient λ=(E1-E2)/E1
τ coefficient (defining E1 and E2 more accurately)
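A sketch of the λ coefficient above for a contingency table, under the usual PRE reading: E1 is the number of errors made when predicting Y from its modal category alone, and E2 is the number of errors made when predicting Y from the modal category within each category of X. The table layout (rows for X, columns for Y), the function name, and the counts are assumptions made for illustration.

```python
def lambda_coefficient(table):
    """lambda = (E1 - E2) / E1 for predicting the column variable Y from the row variable X."""
    n = sum(sum(row) for row in table)
    column_totals = [sum(col) for col in zip(*table)]
    e1 = n - max(column_totals)                     # errors when ignoring X
    e2 = sum(sum(row) - max(row) for row in table)  # errors when using X
    return (e1 - e2) / e1


# Hypothetical 2x2 table of counts.
lam = lambda_coefficient([[30, 10], [10, 50]])
```

Here E1 = 40 and E2 = 20, so knowing X removes half of the prediction errors.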
rank correlation coefficient
Spearman's rank correlation coefficient
Gamma rank correlation coefficient
Kendall τ coefficient
Somers' d coefficient
Hypothesis test
Chi-square test
Test of contingency correlation coefficient (Φ coefficient, V coefficient)
Spearman's rank correlation coefficient test
Test of Gamma Rank Correlation Coefficient
Test of τc coefficient and d coefficient
Relationship and test of two scale variables
r (correlation coefficient)
R² (goodness of fit): R² = r²
F-test (overall test of linear regression equation)
t-test (test of regression coefficients)
e (residuals: normality, homogeneity of variance, mean = 0)
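The relation R² = r² listed above holds for simple linear regression with one predictor. The sketch below computes r from its definition and R² from the regression residuals, so the two can be compared. The function name and data are illustrative assumptions.

```python
from math import sqrt


def pearson_r_and_r_squared(xs, ys):
    """Return (r, R^2), with R^2 computed from the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)

    r = sxy / sqrt(sxx * syy)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1 - residual_ss / syy  # share of variation in y explained by the line
    return r, r_squared


r, r_squared = pearson_r_and_r_squared([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```

For these hypothetical data the line explains 60% of the variation in y, and squaring r gives the same value.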
The relationship between categorical variables and scale variables and hypothesis testing
Parametric test
1. Average comparison
2. Statistical tables, bar charts and line charts
3.eta (correlation ratio)
Hypothesis testing - one-way analysis of variance
Test statistic: F = [BSS/(m−1)] / [WSS/(n−m)]. H0: μ1 = μ2 = μ3 = ··· = μm. H1: at least one category's mean of the scale variable differs from those of the other categories.
Assumptions: homoscedasticity (equal variances), normality
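A minimal sketch of the F statistic above, where BSS is the between-group sum of squares, WSS the within-group sum of squares, m the number of categories, and n the total sample size. The function name and data are illustrative assumptions.

```python
def one_way_f(groups):
    """F = [BSS / (m - 1)] / [WSS / (n - m)] for a list of groups of observations."""
    n = sum(len(g) for g in groups)
    m = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group variation: group means around the grand mean, weighted by group size.
    bss = sum(len(g) * (gm - grand_mean) ** 2 for g, gm in zip(groups, group_means))
    # Within-group variation: observations around their own group mean.
    wss = sum((x - gm) ** 2 for g, gm in zip(groups, group_means) for x in g)

    return (bss / (m - 1)) / (wss / (n - m))


# Hypothetical scores in three categories.
f_value = one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

Here BSS = 26 and WSS = 6, so F = 13, which is compared with the F distribution with (m − 1, n − m) = (2, 6) degrees of freedom.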
Test of consistency of multiple population distributions
Testing of multiple independent samples (analyzing the relationship between a scale variable and a categorical variable)
Non-parametric test
One-way rank analysis of variance
Test statistic: H. H approximately follows the chi-square distribution with k−1 degrees of freedom.
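The mind map does not give the formula for H. A common form, used in this sketch, is H = 12/(n(n+1)) · Σ Rⱼ²/nⱼ − 3(n+1), where Rⱼ is the rank sum of group j after all observations are pooled and ranked. The function name and data are illustrative, and the sketch assumes there are no tied values.

```python
def kruskal_wallis_h(groups):
    """H statistic for k independent samples, assuming no tied observations."""
    pooled = sorted(x for g in groups for x in g)
    rank_of = {x: rank for rank, x in enumerate(pooled, start=1)}  # valid only without ties
    n = len(pooled)
    # Sum over groups of (rank sum)^2 / group size.
    weighted = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * weighted - 3 * (n + 1)


# Hypothetical data: three groups whose values do not overlap at all.
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

With k = 3 groups, H is compared with the chi-square distribution with 2 degrees of freedom.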
Median test
Testing of Multiple Related Samples
two-way rank analysis of variance
Kendall's W test