MindMap Gallery Statistics super focus (1)
This mind map covers the key points of statistics (part 1), including methods of collecting statistical data, the organizational forms of statistical surveys, the research object of statistics, and other topics.
Edited at 2023-12-03 01:20:47
Statistics super focus (1)
The most important:
How to collect statistical data:
1. Direct observation method: investigators go to the site and investigate in person.
2. Interview method: investigators question respondents item by item according to the survey outline.
3. Reporting method: statistical data are provided according to set reporting procedures.
4. Questionnaire method: questions are asked in the form of a questionnaire.
5. Experimental method: experiments are conducted on the survey objects in dedicated locations and under special conditions.
Organizational form of statistical survey:
1. Comprehensive survey: investigate every unit of the survey object one by one
Features: wide coverage, high cost
2. Census: a specially organized, one-time comprehensive survey
The research object of statistics: The research object of socioeconomic statistics is the quantitative aspect of a large number of socioeconomic phenomena as a whole (that is, the study of the quantitative characteristics and quantitative relationships of the socioeconomic phenomenon as a whole)
(Short answer) Characteristics of statistical research objects:
(1) Quantitativeness: Understanding the nature and laws of things quantitatively is the basic feature of statistical research; statistical research is not about abstract quantities, but concrete quantities with specific content. Statistics is the study of specific quantities that are closely related to the content and nature of the phenomenon being studied under qualitative stipulations.
(2) Totality: Statistics takes the quantitative characteristics of the overall phenomenon as its research object. Statistics requires a large number of observations and comprehensive analysis of the facts prevalent in each unit in the population to obtain quantitative characteristics that reflect the overall phenomenon.
(3) Variability: The characteristics of each unit in the population take different values because of complex random factors; this variation is the premise of statistical research.
Main contents of the investigation plan
1. Determine the purpose of the investigation (why)
2. Determine the survey object (population) and survey unit (individual) (Who)
3. Develop a survey outline (What)
Necessary investigation contents determined based on the purpose of the investigation
Determine the time (When) and location (Where) of the investigation
Descriptive statistics and inferential statistics:
1. Descriptive statistics: collection, organization, display and analysis of statistical data (data that reflects objective phenomena)
2. Inferential statistics: Use sample information and probability theory to estimate and test the quantitative characteristics of the population, etc.
The meaning of statistics and the two relationships
Three meanings: (1) statistical work, (2) statistical data, (3) statistics (the science)
Statistics is a methodological science about the collection, processing, identification, analysis and inference of various data information.
1) Statistical work: the activity of investigation and research
2) Statistical data: the results of that work
3) Statistics (the science): a methodological science that studies how to collect, organize, and analyze data.
Statistical work and statistical data are related as work and its results; statistical work and statistical science are related as practice and theory.
1. The result of statistical work is statistical data
2. The basis of statistical data and statistical science is statistical work
3. Statistical science is not only the theoretical summary of statistical work experience, but also the source of the principles and methods that guide statistical work.
Theoretical statistics and applied statistics (according to the degree of research and application of statistical methods)
Theoretical statistics is a statistical method system centered on methodology.
Applied statistics is the problem-centered application of statistical methods to solve practical problems.
Group midpoint calculation
Group midpoint (group mid-value): the midpoint between the upper and lower limits of a group, used to represent the general level of the mark values in that group.
Midpoint of a closed group = (upper limit + lower limit) ÷ 2
Midpoint of an open group with no lower limit = upper limit - 1/2 × width of the adjacent group
Midpoint of an open group with no upper limit = lower limit + 1/2 × width of the adjacent group
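The three rules above for the mid-value of a group can be sketched as follows. This is a minimal illustration, not a standard routine: the function name class_midpoint and the sample class limits are assumptions made for the example.

```python
from typing import Optional


def class_midpoint(lower: Optional[float], upper: Optional[float],
                   neighbor_width: Optional[float] = None) -> float:
    """Mid-value of a group; open-ended groups borrow the adjacent group's width."""
    if lower is not None and upper is not None:
        # Closed group: (upper limit + lower limit) / 2
        return (upper + lower) / 2
    if neighbor_width is None:
        raise ValueError("an open-ended group needs the adjacent group's width")
    if upper is not None:
        # Lower limit missing: upper limit - 1/2 * adjacent group width
        return upper - neighbor_width / 2
    if lower is not None:
        # Upper limit missing: lower limit + 1/2 * adjacent group width
        return lower + neighbor_width / 2
    raise ValueError("at least one limit is required")


# Hypothetical groups: "below 60", "60-70", "80 and above" (adjacent widths of 10)
print(class_midpoint(None, 60, 10))   # 55.0
print(class_midpoint(60, 70))         # 65.0
print(class_midpoint(80, None, 10))   # 85.0
```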
The relationship between the group width and the number of groups in an equal-width series
When compiling an equal-width series, group width = range ÷ number of groups, so for a given range the group width is inversely related to the number of groups
Frequency (count): the number of units falling in each group
Relative frequency (rate): the ratio of each group's frequency to the total frequency
Upward cumulative frequency (or relative frequency) distribution
First list the upper limit of each group, then accumulate the frequencies (or relative frequencies) from the group with the lowest mark values to the group with the highest.
The upward cumulative frequency of a group is the total number of units in all groups below that group's upper limit. The upward cumulative relative frequency of a group is that total as a proportion of all units.
Downward cumulative frequency (or relative frequency) distribution: first list the lower limit of each group, then accumulate the frequencies (or relative frequencies) from the group with the highest mark values to the group with the lowest.
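A short sketch of both accumulation directions may help. The group limits and counts below are made-up illustration data, and the variable names are assumptions.

```python
from itertools import accumulate

# Hypothetical grouped data: (lower limit, upper limit) and units per group
groups = [(50, 60), (60, 70), (70, 80), (80, 90)]
freq = [4, 10, 12, 6]
total = sum(freq)

# Upward accumulation: lowest group to highest, read against upper limits
up_cum = list(accumulate(freq))
# Downward accumulation: highest group to lowest, read against lower limits
down_cum = list(accumulate(reversed(freq)))[::-1]

for (lower, upper), f, up, down in zip(groups, freq, up_cum, down_cum):
    print(f"{lower}-{upper}: f={f}, "
          f"below {upper}: {up} ({up / total:.1%}), "
          f"{lower} or above: {down} ({down / total:.1%})")
```

With these counts the upward cumulative frequencies are 4, 14, 26, 32 and the downward ones are 32, 28, 18, 6; dividing by the total gives the cumulative relative frequencies.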
Total indicator: an indicator used to indicate the total scale, total level or workload of social and economic phenomena at a certain time, place and condition.
Relative indicator classification and meaning
Relative indicators, also known as statistical relative numbers, are the ratio of two related statistical indicators, reflecting the quantitative relationship between things.
Relative indicators are divided into: structural relative indicators, proportional relative indicators, intensity relative indicators, dynamic relative indicators, comparative relative indicators and plan relative indicators.
Determining the mode of a grouped (interval) series
The method is to examine the frequencies and proceed in two steps:
1. First determine the group with the highest frequency (the modal group)
2. Use proportional interpolation to calculate an approximate value of the mode
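The two steps can be sketched with the commonly taught lower-limit interpolation formula, mode ≈ L + d1 / (d1 + d2) × width, where d1 and d2 are the differences between the modal group's frequency and those of its lower and upper neighbors. The source does not spell out the formula, so treat this choice, the function name grouped_mode, and the sample data as assumptions.

```python
def grouped_mode(lower_limits, width, freq):
    """Approximate mode of an equal-width grouped series (lower-limit formula)."""
    # Step 1: locate the modal group (highest frequency; first one on ties)
    m = max(range(len(freq)), key=freq.__getitem__)
    prev_f = freq[m - 1] if m > 0 else 0
    next_f = freq[m + 1] if m < len(freq) - 1 else 0
    d1 = freq[m] - prev_f   # excess over the group just below
    d2 = freq[m] - next_f   # excess over the group just above
    if d1 + d2 == 0:
        # No excess on either side: fall back to the group midpoint
        return lower_limits[m] + width / 2
    # Step 2: proportional interpolation inside the modal group
    return lower_limits[m] + d1 / (d1 + d2) * width


# Hypothetical data: groups 50-60, 60-70, 70-80, 80-90
print(grouped_mode([50, 60, 70, 80], 10, [4, 10, 12, 6]))   # 72.5
```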
Chapter One
(Key) Concepts of statistical marks and indicators
A statistical mark, or simply a mark, is the name of an attribute or characteristic possessed by a population unit.
Statistical indicators: concepts and values that comprehensively reflect the quantitative characteristics of the population.
The relationship between marks, indicators and variables:
(Key) The difference between marks and indicators:
(1) A mark describes a characteristic of an individual population unit, whereas an indicator describes a characteristic of the population as a whole;
(2) Marks include qualitative marks, which cannot be expressed numerically, and quantitative marks, which can. All indicators are expressed numerically.
The connection between marks and indicators:
(1) The values of some statistical indicators are aggregated from the quantitative mark values of the population units. For example, the total grain output of a county is aggregated from the grain output of each of its townships.
(2) The two can be transformed into each other. If the research purpose changes so that the original population becomes a population unit, the corresponding statistical indicator becomes a quantitative mark, and vice versa. (Variables comprise the variable quantitative marks and all statistical indicators; the mark values and indicator values they take are variable values. A variable quantitative mark is not a statistical indicator, but it is still a variable. Among quantitative marks, those that do not vary are called constants or parameters, and those that vary are called variables.)
(Key) Concept and understanding of the statistical population
The statistical population, referred to as the population, refers to an organic whole composed of many individual things (units) that exist objectively and have a certain common property.
Features:
Homogeneity: All units in the population have some common properties.
Massiveness: The population always contains all or a sufficient number of units.
Variability: There are differences between units in the population.
Quality indicators: concept and understanding
Quality indicators include relative indicators and average indicators. They are statistical indicators that reflect the relative level of a phenomenon or the quality of work, such as population density or average salary.
Ordinal variable
An ordinal variable is a variable whose categories can be ranked in order.
(1) Classifies things and gives the order of the categories
(2) More precise than the nominal (classification) scale
(3) Does not measure the exact differences between categories
(4) Data are presented as categories, but the categories are ordered
(5) Supports > or < comparisons
Chapter Two
Concepts such as survey objects, survey units, survey items, reporting units, etc.
Survey object: refers to the overall phenomenon to be investigated, that is, the statistical population.
Survey unit: the population unit (individual) being surveyed, that is, the bearer of the survey content.
Reporting unit: The unit responsible for providing statistical data according to the specified date and format, that is, the unit that fills in and reports statistical data.
Investigation items: Necessary investigation contents determined according to the purpose of the investigation
Investigation time
Investigation time: Specify the time to which the investigation data belongs
Chapter Three
Statistical grouping
Statistical grouping is based on the purpose of statistical research and the characteristics of the research object, dividing each unit of the statistical population into several parts or groups with different properties according to a certain sign.
Functions of statistical grouping
(1) Distinguish the types of phenomena
(2) Study the internal structure of the population
(3) Study the dependence between phenomena
The relationship between the arithmetic mean, median, and mode in frequency distribution (relationship characteristics under each distribution)
Arithmetic mean: multiply the midpoint of each group in the frequency distribution histogram by that group's relative frequency, then add the products.
Median: the value that splits the area of the frequency distribution histogram into two equal halves
Mode: approximated by the group midpoint of the group with the highest frequency in the frequency distribution table.
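As a rough sketch of the first two definitions for grouped data: the mean weights each group midpoint by its frequency, and the median is interpolated inside the group where the cumulative frequency reaches half the total. The function names and sample data are assumptions; the interpolation formula is the usual textbook one rather than something stated in the source.

```python
def grouped_mean(midpoints, freq):
    """Frequency-weighted mean of the group midpoints."""
    return sum(x * f for x, f in zip(midpoints, freq)) / sum(freq)


def grouped_median(lower_limits, width, freq):
    """Value at which the cumulative frequency reaches half of the total."""
    half = sum(freq) / 2
    cum = 0
    for lower, f in zip(lower_limits, freq):
        if f > 0 and cum + f >= half:
            # Interpolate proportionally inside the median group
            return lower + (half - cum) / f * width
        cum += f
    raise ValueError("empty frequency distribution")


# Hypothetical groups 50-60, 60-70, 70-80, 80-90
freq = [4, 10, 12, 6]
mean = grouped_mean([55, 65, 75, 85], freq)
median = grouped_median([50, 60, 70, 80], 10, freq)
print(f"mean={mean:.2f}, median={median:.2f}")   # mean=71.25, median=71.67
```

For this made-up distribution the highest-frequency group is 70-80 (midpoint 75), so mean < median < mode, the ordering usually associated with a left-skewed distribution; a symmetric distribution would put the three values close together.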
Overall average
Population mean definition: The average of all individuals in the population is called the population mean.
Chapter Four
Nature of each mark variation (dispersion) indicator
(1) Range: also called the full range, the difference between the maximum value and the minimum value of a set of data.
(2) Mean deviation: the arithmetic mean of the absolute deviations of each variable value from the mean.
(3) Variance and standard deviation: the variance is the arithmetic mean of the squared deviations of each value from the mean.
The standard deviation is the square root of the variance, that is, the square root of the arithmetic mean of the squared deviations of each unit's mark value from the arithmetic mean.
(4) Coefficient of variation: the most commonly used coefficient of variation is the standard deviation coefficient (standard deviation ÷ mean). The larger the coefficient of variation, the greater the relative dispersion of the units in the population and the less representative the population mean; conversely, the smaller it is, the more representative the mean.
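The four measures can be computed with Python's standard statistics module. This sketch assumes the population formulas (dividing by N), which match the definitions above; the data values are invented for illustration.

```python
from statistics import fmean, pstdev, pvariance

values = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical mark values of 8 units

mean = fmean(values)
value_range = max(values) - min(values)                   # (1) range
mean_deviation = fmean([abs(x - mean) for x in values])   # (2) mean deviation
variance = pvariance(values)                              # (3) population variance
std_dev = pstdev(values)                                  #     population standard deviation
coeff_var = std_dev / mean                                # (4) standard deviation coefficient

print(value_range, mean_deviation, variance, std_dev, coeff_var)
```

Here the mean is 5, the range 7, the mean deviation 1.5, the variance 4, the standard deviation 2 and the coefficient of variation 0.4. Because the coefficient is unit-free, it can compare dispersion across populations with different means or units.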
Understanding plan completion indicators
The relative indicator of plan completion is referred to as the "plan completion degree indicator" or "plan completion percentage". It is the result of comparing the actual completed value of a socioeconomic phenomenon in a certain period with the planned target value, and is generally expressed as a percentage.
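In formula form this is actual value ÷ planned value × 100%. A tiny sketch, with a hypothetical function name and made-up figures:

```python
def plan_completion(actual: float, planned: float) -> float:
    """Plan completion degree as a percentage: actual / planned * 100."""
    if planned == 0:
        raise ValueError("planned value must be non-zero")
    return actual / planned * 100


# Hypothetical example: planned output 1000 units, actual output 1100 units
print(f"{plan_completion(1100, 1000):.1f}%")   # 110.0%
```

A result above 100% means the plan was over-fulfilled when the target is something to be maximized (such as output); for targets to be minimized (such as cost), a value below 100% is the favorable one.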
Chapter Five
(Key) Sample survey: concept and characteristics
A sample survey is a non-comprehensive survey. It is to select some units from all research objects according to the random principle as representatives of the whole for investigation.
Features
(1) It is a non-comprehensive investigation;
(2) Select investigation units for investigation based on the random principle;
(3) The purpose is to understand the overall comprehensive situation.
The random principle of sampling survey
The random principle means that, when survey units are selected, the choice of sample units is not affected by the investigators' subjective judgment or by other systematic factors, so that every unit in the population has an equal chance of being selected. Whether a unit is selected is purely a matter of chance. The random principle is the basic principle that random sampling must follow.
Understanding Sampling Error
The concept of sampling error: the unavoidable error that arises, even when the random principle is observed, from using sample indicators to represent population indicators. It excludes registration errors and systematic errors (the latter arise from violating the random principle, for example by deliberately selecting more of the good units).
The relationship between the error limit and the number of sampling units (mainly under simple repeated sampling conditions)
The greater the number of sampling units, the smaller the average sampling error; conversely, the fewer the sampling units, the larger the average sampling error.
Average sampling error, sampling limit error, and probability degree
Average sampling error: the standard deviation of the sample mean (or sample proportion). It reflects the average deviation between the sample mean (or sample proportion) and the population mean (or population proportion).
Sampling limit error: also known as the allowable error, it is the maximum allowable error between the sample indicator and the population indicator, set by the investigator according to the reliability required of the sampling inference. Generally denoted by Δ.
The probability degree t expresses the limit error relative to the average sampling error: it equals the limit error divided by the average sampling error, so Δ = t × average sampling error.
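A small numeric sketch of these relationships, assuming simple repeated sampling (sampling with replacement), where the average sampling error of the mean is σ / √n. The function names and the value of σ are assumptions for illustration.

```python
import math


def mean_sampling_error(sigma: float, n: int) -> float:
    """Average sampling error of the sample mean under repeated sampling."""
    return sigma / math.sqrt(n)


def limit_error(t: float, mu: float) -> float:
    """Sampling limit error: probability degree t times the average error."""
    return t * mu


sigma = 20.0   # hypothetical population standard deviation
for n in (100, 400):
    mu = mean_sampling_error(sigma, n)
    print(f"n={n}: average error={mu}, limit error at t=2: {limit_error(2, mu)}")
```

Quadrupling the sample size from 100 to 400 halves the average sampling error (2.0 to 1.0), and with t fixed the limit error halves as well.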
Chapter Seven
Positive correlation
Positive correlation: the two variables change in the same direction. When one variable increases, the other also increases; when one decreases, the other also decreases.
Determine the degree of correlation based on the correlation coefficient
r > 0 is positive correlation, r < 0 is negative correlation
|r| = 1 means perfect linear correlation; 0 < |r| < 1 means imperfect linear correlation, graded as follows:
|r|≤ 0.3 indicates weak correlation; 0.3< |r|≤ 0.5 indicates low correlation;
0.5<|r|≤0.8 is significant correlation; 0.8<|r|<1 is high correlation.
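The thresholds above translate directly into a small classifier. The function name describe_correlation is hypothetical, and treating r = 0 as "no linear correlation" is an assumption added for completeness rather than something the thresholds state.

```python
def describe_correlation(r: float) -> str:
    """Label a correlation coefficient using the thresholds listed above."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie in [-1, 1]")
    if r == 0:
        return "no linear correlation"   # assumption: not covered by the thresholds
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1:
        degree = "perfect"
    elif size > 0.8:
        degree = "high"
    elif size > 0.5:
        degree = "significant"
    elif size > 0.3:
        degree = "low"
    else:
        degree = "weak"
    return f"{degree} {direction} linear correlation"


for r in (0.85, -0.4, 0.3, -1.0):
    print(r, "->", describe_correlation(r))
```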
(Key) Relationship between regression coefficient and correlation coefficient
If the regression coefficient is greater than zero, the correlation coefficient is greater than zero; if the regression coefficient is less than zero, the correlation coefficient is less than zero. (Their value signs are the same)
1. The difference between correlation coefficient and regression coefficient
1. Different meanings: Correlation coefficient: It is a quantity that studies the degree of linear correlation between variables.
Regression coefficient: A parameter that represents the influence of the independent variable x on the dependent variable y in the regression equation.
2. Different applications: Correlation coefficient: illustrates the correlation between two variables.
Regression coefficient: illustrates the quantitative relationship between dependent changes between two variables.
3. Different units: Correlation coefficient: generally represented by the letter r, r has no unit
Regression coefficient: Generally expressed by slope b, b has units.
2. The relationship between regression coefficient and correlation coefficient:
1. If the regression coefficient is greater than zero, the correlation coefficient is greater than zero.
2. If the regression coefficient is less than zero, the correlation coefficient is less than zero.
The connections and differences between correlation analysis and regression analysis
Connections:
1. The theory and methods are consistent;
2. If there is no correlation, there will be no regression. The higher the degree of correlation, the better the regression;
3. The correlation coefficient and the regression coefficient have the same direction and can be calculated from each other.
Differences:
1. In correlation analysis, x and y are equivalent; in regression analysis, x and y must determine the independent variable and dependent variable;
2. In correlation analysis, x and y are both random variables. In regression analysis, only y is a random variable.
3. Correlation analysis measures the degree and direction of correlation. Regression analysis can not only reveal the impact of variable x on variable y, but can also use regression models for prediction and control.
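The claim that the two coefficients share a sign and can be calculated from each other follows from the identity r = b × σx / σy, where b is the slope of y on x. A minimal check on invented data (variable names are assumptions):

```python
from statistics import fmean, pstdev

# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

mx, my = fmean(x), fmean(y)
cov = fmean([(a - mx) * (c - my) for a, c in zip(x, y)])   # population covariance
sx, sy = pstdev(x), pstdev(y)

r = cov / (sx * sy)   # correlation coefficient: unit-free
b = cov / sx ** 2     # regression coefficient of y on x: units of y per unit of x

print(f"r={r:.4f}, b={b:.4f}, b*sx/sy={b * sx / sy:.4f}")
```

Since σx and σy are always positive, r and b necessarily have the same sign, and either can be recovered from the other once the two standard deviations are known.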
Be able to write and use a one-variable (simple) linear regression equation
Univariate linear regression can be used for predictive analysis.
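As a sketch of writing and using the equation y = a + b·x, the least-squares estimates are b = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and a = ȳ - b·x̄. The function name fit_line and the data are hypothetical; the data were chosen to lie exactly on y = 1 + 2x so the result is easy to verify by hand.

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for the line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x must vary to fit a line")
    b = sxy / sxx
    a = my - b * mx
    return a, b


x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]   # hypothetical data lying exactly on y = 1 + 2x

a, b = fit_line(x, y)
prediction = a + b * 6   # predicted y for a new x = 6
print(a, b, prediction)  # 1.0 2.0 13.0
```

Predictions are most trustworthy for x values near the observed range; extrapolating far beyond it assumes the linear relationship continues to hold.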
Chapter Eight
Average growth rate concept
Average growth rate: describes the average growth rate of a phenomenon over a period of time. It is equal to the average development speed minus 1.
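A brief numeric sketch, assuming the average development speed is computed by the geometric-mean (level) method, (last level ÷ first level)^(1/n) over n periods. The source does not state which method it uses, so this choice, the function name, and the figures are assumptions.

```python
def average_development_speed(levels):
    """Geometric-mean development speed over len(levels) - 1 periods."""
    n = len(levels) - 1
    if n < 1 or levels[0] <= 0 or levels[-1] <= 0:
        raise ValueError("need at least two positive levels")
    return (levels[-1] / levels[0]) ** (1 / n)


levels = [100, 110, 121]   # hypothetical levels in three consecutive years
speed = average_development_speed(levels)
growth_rate = speed - 1    # average growth rate = average development speed - 1

print(f"average development speed: {speed:.1%}")   # 110.0%
print(f"average growth rate: {growth_rate:.1%}")   # 10.0%
```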
Level analysis and speed analysis in time series
Level analysis includes development level, average development level, growth level and average growth level.
Speed analysis includes development speed and growth speed, average development speed and average growth speed.
Factors affecting time series:
irregular changes
cyclical changes
long-term trend
seasonal changes
Chapter Nine
What index forms do the comprehensive index of quality indicators and the comprehensive index of quantity indicators adopt?
(1) Quantity indicator index: when compiling a quantity indicator index, the base-period quality indicator should be used as the common measurement factor (weight).
(2) Quality indicator index: when compiling a quality indicator index, the reporting-period quantity indicator should be used as the common measurement factor (weight).
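Taking sales quantity q as the quantity indicator and price p as the quality indicator (a standard textbook pairing, assumed here), the two rules give Σq1·p0 / Σq0·p0 for the quantity index and Σq1·p1 / Σq1·p0 for the quality index. The figures below are invented.

```python
# Hypothetical data for two products; 0 = base period, 1 = reporting period
q0, q1 = [100, 200], [120, 180]   # quantities (quantity indicator)
p0, p1 = [10, 5], [12, 6]         # prices (quality indicator)


def total(q, p):
    """Sum of quantity * price over all products."""
    return sum(qi * pi for qi, pi in zip(q, p))


# Quantity index: prices held at the base period
quantity_index = total(q1, p0) / total(q0, p0)
# Quality (price) index: quantities held at the reporting period
quality_index = total(q1, p1) / total(q1, p0)

print(f"quantity index: {quantity_index:.1%}")   # 105.0%
print(f"quality index: {quality_index:.1%}")     # 120.0%
```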
Distinguish between the average indicator, the average indicator index, and the average index
1: The average indicator is a comprehensive indicator that reflects the general level of the units of a homogeneous population under given conditions of time and place. It is a representative value of the differing mark values of the units in the population, and also a measure of the central tendency of the variable's distribution.
2: The average indicator index is a relative number obtained by comparing the average indicator values of the same economic phenomenon in two different periods. It shows the direction and degree of change in the overall average level between the two periods.
3: The average index is a total index calculated as a weighted average of individual indices.
The calculated relationship between each index in the index system
In the index system, the quantitative relationship between the total index and each factor index is expressed as the total index is equal to the product of each factor index, and the change difference of the total index is equal to the sum of the change differences of each factor index.
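Both relationships can be checked numerically. The sketch below assumes a two-factor system (sales value = quantity × price) with base-period prices weighting the quantity index and reporting-period quantities weighting the price index; all figures and names are hypothetical.

```python
# Hypothetical data for two products; 0 = base period, 1 = reporting period
q0, q1 = [100, 200], [120, 180]   # quantities
p0, p1 = [10, 5], [12, 6]         # prices


def total(q, p):
    """Sum of quantity * price over all products."""
    return sum(qi * pi for qi, pi in zip(q, p))


v00, v10, v11 = total(q0, p0), total(q1, p0), total(q1, p1)

total_index = v11 / v00      # change in total sales value
quantity_index = v10 / v00   # effect of quantity changes
quality_index = v11 / v10    # effect of price changes

# Relative relationship: total index = product of the factor indices
print(total_index, quantity_index * quality_index)
# Absolute relationship: total difference = sum of the factor differences
print(v11 - v00, (v10 - v00) + (v11 - v10))   # 520 520
```

With these figures the total index of 126% factors into 105% × 120%, and the increase of 520 splits into 100 from quantity changes plus 420 from price changes.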
Index classification
Classification by object scope: overall index and individual index
Classification by the nature of the indicator being indexed: quantity indicator index and quality indicator index
Divided according to different base periods: chain index and fixed base index
Concept of statistical grouping: a method of organizing statistics that, based on the purpose and requirements of the statistical research and the inherent differences within the population, divides the socioeconomic phenomenon under study into several parts or groups according to a certain mark.
(Key) The meaning and types of relative indicators. Meaning: a relative indicator expresses the quantitative relationship formed by comparing one value with another.
There are four commonly used types: structural relative indicators, comparative relative indicators, intensity relative indicators, and dynamic relative indicators.
Nature of each mark variation indicator: the range, mean deviation, variance, standard deviation and coefficient of variation each measure how widely the mark values of the population units are dispersed. The larger the dispersion, the less representative the average is of the population.
(Key) Time-point indicators and period indicators:
1. The value of the period indicator is registered continuously, while the value of the time point indicator is obtained by discontinuous counting at a certain point in time;
2. The value of the period indicator is cumulative, but the value of the time point indicator is not cumulative;
3. The value of the period indicator is directly related to the length of the registered period, while the value of the time point indicator has no relationship to the length of the registered period.