What type of statistics to use

Variability refers to a set of statistics that show how much difference there is among the elements of a sample or population along the characteristics measured, and includes metrics such as range, variance, and standard deviation.
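As a minimal sketch of these three measures, the snippet below uses Python's standard `statistics` module. The list `marks` is a hypothetical sample invented for illustration, and the sample (n - 1) form of the variance is an assumption about which estimator is wanted.

```python
import statistics

# Hypothetical sample of exam marks, invented purely for illustration.
marks = [52, 61, 61, 68, 70, 74, 79, 85]

data_range = max(marks) - min(marks)          # distance between the extremes
sample_variance = statistics.variance(marks)  # sample variance (divides by n - 1)
sample_std = statistics.stdev(marks)          # square root of the sample variance

print(f"range={data_range}, variance={sample_variance:.2f}, std={sample_std:.2f}")
```

The range depends only on the two most extreme values, whereas the variance and standard deviation use every observation.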

The distribution refers to the overall "shape" of the data, which can be depicted on a chart such as a histogram or dot plot, and includes properties such as the probability distribution function, skewness, and kurtosis. Descriptive statistics can also describe differences between observed characteristics of the elements of a data set. Descriptive statistics help us understand the collective properties of the elements of a data sample and form the basis for testing hypotheses and making predictions using inferential statistics.
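Skewness and kurtosis can be computed directly from their definitions as standardized moments. The sketch below assumes the population (divide by n) form of both measures; statistical packages often apply small-sample corrections, so their output may differ slightly. The function names and data sets are hypothetical.

```python
import statistics

def skewness(values):
    """Population skewness: the third standardized moment."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((x - mean) / sd) ** 3 for x in values) / len(values)

def excess_kurtosis(values):
    """Population excess kurtosis: the fourth standardized moment minus 3."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((x - mean) / sd) ** 4 for x in values) / len(values) - 3.0

# Hypothetical data: one symmetric set, one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]

print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_skewed), excess_kurtosis(right_skewed))
```

A skewness near zero suggests a symmetric shape, while a positive value indicates a longer right tail.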

Inferential statistics are tools that statisticians use to draw conclusions about the characteristics of a population from the characteristics of a sample, and to decide how certain they can be of the reliability of those conclusions. Based on the sample size and distribution, statisticians can calculate the probability that statistics, which measure the central tendency, variability, distribution, and relationships between characteristics within a data sample, provide an accurate picture of the corresponding parameters of the whole population from which the sample is drawn.
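One common way to express how certain a sample-based estimate is uses a confidence interval for the population mean. The sketch below assumes a normal approximation; for small samples a t distribution would usually be more appropriate. The helper `mean_confidence_interval` and the data are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    mean = statistics.fmean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value of the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * standard_error, mean + z * standard_error

# Hypothetical sample: weekly purchases reported by 12 surveyed consumers.
purchases = [3, 5, 4, 6, 2, 5, 4, 7, 3, 4, 5, 6]

low, high = mean_confidence_interval(purchases)
print(f"95% interval for the mean: ({low:.2f}, {high:.2f})")
```

Raising the confidence level or shrinking the sample widens the interval, which reflects the trade-off between certainty and precision.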

Inferential statistics are used to make generalizations about large groups, such as estimating average demand for a product by surveying a sample of consumers' buying habits, or to attempt to predict future events, such as projecting the future return of a security or asset class based on returns in a sample period.

Regression analysis is a widely used technique of statistical inference for determining the strength and nature of the relationship between a dependent variable and one or more independent variables. The output of a regression model is often analyzed for statistical significance, which refers to the claim that a result generated by testing or experimentation is unlikely to have occurred by chance and is instead likely attributable to a specific cause elucidated by the data.
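As a rough illustration of how a regression quantifies the strength and nature of a relationship, the sketch below fits a straight line by ordinary least squares and reports the Pearson correlation coefficient. It assumes a single explanatory variable; the function name and data are hypothetical, and no significance test is performed.

```python
import statistics

def simple_linear_regression(x, y):
    """Least-squares slope and intercept, plus Pearson's r, for paired data."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx                    # nature (direction, size) of the relationship
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5         # strength of the linear relationship
    return slope, intercept, r

# Hypothetical observations, invented for illustration.
x_values = [1, 2, 3, 4, 5]
y_values = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept, r = simple_linear_regression(x_values, y_values)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, r={r:.3f}")
```

The slope describes how the dependent variable changes per unit of the explanatory variable, while r summarizes how closely the points follow the fitted line.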

Statistical significance is important for academic disciplines and practitioners that rely heavily on analyzing data and research. Descriptive statistics are used to describe or summarize the characteristics of a sample or data set, such as a variable's mean, standard deviation, or frequency. Inferential statistics, in contrast, employ any number of techniques to relate variables in a data set to one another, for example using correlation or regression analysis. These can then be used to estimate forecasts or infer causality.

Statistics are used widely across an array of applications and professions. Any time data are collected and analyzed, statistics are being done. This can range from government agencies to academic research to analyzing investments. Economists collect and look at all sorts of data, ranging from consumer spending to housing starts to inflation to GDP growth. In finance, analysts and investors collect data about companies, industries, sentiment, and market data on price and volume.

Together, the use of inferential statistics in these fields is known as econometrics.

Today statistics provides the basis for inference in most medical research. Yet, for want of exposure to statistical theory and practice, it continues to be regarded as the Achilles heel by all concerned in the loop of research and publication: the researchers (authors), reviewers, editors and readers.

Most of us are familiar to some degree with descriptive statistical measures such as those of central tendency and those of dispersion. However, we falter at inferential statistics.

This need not be the case, particularly with the widespread availability of powerful and at the same time user-friendly statistical software. As we have outlined below, a few fundamental considerations will lead one to select the appropriate statistical test for hypothesis testing.

However, it is important that the appropriate statistical analysis is decided before starting the study, at the stage of planning itself, and the sample size chosen is optimum.

These cannot be decided arbitrarily after the study is over and data have already been collected. The great majority of studies can be tackled through a basket of some 30 tests out of the much larger number that are in use.

The test to be used depends upon the type of the research question being asked. The other determining factors are the type of data being analyzed and the number of groups or data sets involved in the study.

The following schemes, based on five generic research questions, should help.

Question 1: Is there a difference between groups that are unpaired? Groups or data sets are regarded as unpaired if there is no possibility of the values in one data set being related to or being influenced by the values in the other data sets.

Different tests are required for quantitative or numerical data and qualitative or categorical data as shown in Fig. For numerical data, it is important to decide if they follow the parameters of the normal distribution curve (Gaussian curve), in which case parametric tests are applied.

If distribution of the data is not normal or if one is not sure about the distribution, it is safer to use non-parametric tests.

When comparing more than two sets of numerical data, a multiple group comparison test such as one-way analysis of variance (ANOVA) or the Kruskal-Wallis test should be used first. Repeatedly applying the t test or its non-parametric counterpart, the Mann-Whitney U test, to a multiple group situation increases the possibility of incorrectly rejecting the null hypothesis.
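The inflation of the error rate under repeated pairwise testing can be approximated with a short calculation. The sketch below assumes every pairwise comparison is independent and uses the same significance level, which is a simplification because comparisons that share a group are correlated. The function name is hypothetical.

```python
from math import comb

def familywise_error_rate(n_groups, alpha=0.05):
    """Approximate chance of at least one false positive across all pairwise tests.

    Assumes the pairwise comparisons are independent (a simplification).
    """
    n_comparisons = comb(n_groups, 2)  # number of distinct pairs of groups
    return 1 - (1 - alpha) ** n_comparisons

for groups in (2, 3, 4, 5):
    rate = familywise_error_rate(groups)
    print(f"{groups} groups, {comb(groups, 2)} comparisons: {rate:.3f}")
```

With two groups the rate equals the nominal significance level, and it climbs quickly as groups are added, which is why a single multiple group comparison test is applied first.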

Figure: Tests to address the question "Is there a difference between groups?" in the unpaired (parallel and independent groups) situation.

Question 2: Is there a difference between groups which are paired? Pairing signifies that data sets are derived by repeated measurements. Pairing will also occur if subject groups are different but values in one group are in some way linked or related to values in the other group. A crossover study design also calls for the application of paired group tests for comparing the effects of different interventions on the same subjects.
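For paired numerical data, the analysis works on the within-pair differences rather than on the two groups separately. The sketch below computes the paired t statistic under the assumption that the differences are roughly normally distributed. The function name and the measurements are hypothetical, and the p-value lookup is omitted.

```python
import math
import statistics

def paired_t_statistic(before, after):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    differences = [a - b for b, a in zip(before, after)]
    n = len(differences)
    mean_difference = statistics.fmean(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return mean_difference / standard_error, n - 1

# Hypothetical repeated measurements on the same six subjects.
before = [140, 152, 138, 147, 160, 155]
after = [135, 150, 130, 145, 151, 149]

t_value, dof = paired_t_statistic(before, after)
print(f"t={t_value:.3f} on {dof} degrees of freedom")
```

Because each subject serves as its own control, variation between subjects cancels out of the differences, which is what distinguishes paired tests from their unpaired counterparts.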

Sometimes subjects are deliberately paired to match baseline characteristics such as age, sex, severity or duration of disease. A scheme similar to Fig.

Descriptive statistics is the term given to the analysis of data that helps describe, show or summarize data in a meaningful way such that, for example, patterns might emerge from the data.

Descriptive statistics do not, however, allow us to make conclusions beyond the data we have analysed or reach conclusions regarding any hypotheses we might have made. They are simply a way to describe our data. Descriptive statistics are very important because if we simply presented our raw data it would be hard to visualize what the data were showing, especially if there were a lot of them. Descriptive statistics therefore enable us to present the data in a more meaningful way, which allows simpler interpretation of the data.

For example, if we had the results of students' coursework, we may be interested in the overall performance of those students. We would also be interested in the distribution or spread of the marks. Descriptive statistics allow us to do this. How to properly describe data through statistics and graphs is an important topic and is discussed in other Laerd Statistics guides. Typically, there are two general types of statistic that are used to describe data: measures of central tendency and measures of spread.

When we use descriptive statistics it is useful to summarize our group of data using a combination of forms of description, including tabulated description.

We have seen that descriptive statistics provide information about our immediate group of data. For example, we could calculate the mean and standard deviation of the exam marks for the students and this could provide valuable information about this group of students.

Any group of data like this, which includes all the data you are interested in, is called a population. A population can be small or large, as long as it includes all the data you are interested in.
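When the data really are the whole population, the mean and standard deviation are parameters rather than estimates, and the standard deviation divides by N instead of n - 1. The sketch below contrasts the two using Python's `statistics` module. The marks are hypothetical, and treating them as a complete population is an assumption made for illustration.

```python
import statistics

# Hypothetical exam marks, treated here as the entire population of interest.
population_marks = [55, 62, 68, 70, 71, 75, 80, 83, 88, 92]

mu = statistics.fmean(population_marks)      # population mean
sigma = statistics.pstdev(population_marks)  # population SD (divides by N)

# If the same marks were only a sample from a larger group, the sample
# standard deviation (divides by n - 1) would be the usual estimate.
s = statistics.stdev(population_marks)

print(f"mean={mu:.1f}, population SD={sigma:.2f}, sample SD={s:.2f}")
```

The sample standard deviation is always slightly larger than the population form for the same values, and the gap shrinks as the number of observations grows.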


