Normal Distribution: Definition, Parameters, Characteristics, and Empirical Rules

In probability theory, the normal distribution occupies an important position in many statistical analyses. It is also widely used to model phenomena in everyday life.

Examples include measurement errors, height, blood pressure, and the interpretation of IQ scores, all of which take the normal distribution as their main reference. A more complete explanation of the definition, parameters, and characteristics of the normal distribution follows:

Definition of Normal Distribution

The normal distribution is a distribution of a continuous random variable. Its graph is a bell-shaped curve. It is also called the Gaussian distribution. The distribution is described by a probability density function, which shows how the values of the variable are spread. The graph of this function is a symmetrical bell curve.
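The density function mentioned above can be written out directly. The following is a minimal sketch using only the Python standard library; the function name `normal_pdf` and the evaluation points are assumptions chosen for illustration, not something taken from the article.

```python
import math


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Evaluate the curve at the mean and one unit to either side of it.
peak = normal_pdf(0.0)
left = normal_pdf(-1.0)
right = normal_pdf(1.0)

print(f"density at the mean: {peak:.4f}")
print(f"density at -1 and +1: {left:.4f}, {right:.4f}")
```

The two side values are equal and both are lower than the value at the mean, which is the symmetry and central peak that give the curve its bell shape.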

Because the values are spread evenly around the center, the curve peaks in the middle and slopes down equally on both sides. The name Gaussian distribution refers to Carl Friedrich Gauss, the German mathematician who developed the theory using a two-parameter exponential function in the period 1794-1809.

Even so, the groundwork for the theory was laid earlier by Abraham de Moivre in 1733. The normal distribution has an important position in probability and statistics. Applying it is considered important because it can increase the objectivity of an assessment. It also helps in placing members into the most appropriate group, for example when grouping employees or evaluating student scores.

Grouping by the same criteria also avoids biased judgments that concentrate in only one category. Because the distribution is symmetrical and centered on the mean of all data in a population, one-sided or unbalanced judgments can be avoided. The normal distribution also helps in assessing normality and central tendency. In statistics, especially probability statistics, the normality of the data is important and cannot be ignored. The Gaussian distribution makes the degree of normality and the central tendency of the data easier to determine.

Normal Distribution Parameters

Given its main characteristics and parameters, the normal distribution has an important position in probability and statistics. Its application is considered important for several reasons: it increases the objectivity of assessments, it helps place members into the most appropriate group, and it supports evaluating student grades or grouping employees according to the same criteria, which avoids biased or skewed assessments in only one category.

With a symmetrical distribution centered on the mean of all data in a population, one-sided or unbalanced judgments can be avoided. The normal distribution also helps determine the degree of normality and the central tendency. In statistics, especially probability statistics, the normality of the data is important and cannot be ignored.

The Gaussian distribution makes the central tendency and the level of normality of the data easier to determine. For those studying probability statistics or looking for supporting theory when working with data, this material can serve as a reference.

As with other distributions in probability statistics, the shape of the normal curve is determined by its parameters. This distribution has two: the mean (average value) and the standard deviation, explained below:

  • The mean is the center of the distribution. It determines where the peak of the bell curve lies, and the other values are spread around it.
  • The standard deviation measures variability and determines the width of the curve. It describes how far the data tend to spread from the mean as the center point. The smaller the standard deviation, the narrower and more pointed the curve. It also describes the typical distance between the mean and the observed data.
  • Population parameters versus sample estimates. The mean and standard deviation are parameters that apply to the entire population. Statisticians denote them with the Greek symbols μ (mu) for the population mean and σ (sigma) for the population standard deviation. However, population parameters are usually unknown because it is generally not possible to measure the entire population. A random sample can be used to estimate them instead. Statisticians denote the sample estimates by x̅ for the sample mean and s for the sample standard deviation.

Characteristics of the Normal Distribution

The normal distribution is important in statistics. Some statistical hypothesis tests assume that the data follow a normal distribution. Parametric tests typically make this assumption, while nonparametric tests do not require normally distributed data.

Both linear and nonlinear regression also assume that the residuals follow a normal distribution. The central limit theorem states that as the sample size increases, the sampling distribution of the mean approaches a normal distribution even when the original variable is not normally distributed. As a description of how data are distributed, the normal distribution has a number of main characteristics, including the following:

  • The mean, median, and mode are equal. Because it has a single mode, the distribution is often described as unimodal. The curve is symmetrical, with a bell shape.
  • The peak of the curve is at the mean. This value lies exactly in the middle of the curve, and the data are distributed around a vertical line drawn down from that midpoint.
  • Together, the mean and the standard deviation determine the location and shape of the distribution.
  • The total area under the normal curve is 1, with ½ to the right of the mean and ½ to the left. A total area of 1 is true of all continuous probability distributions; the even split follows from the symmetry of the normal curve.
  • It follows that half of the population has a value less than the mean, while the other half has a value greater than the mean.
  • The tails of the curve on either side extend indefinitely. They approach the horizontal axis ever more closely but never intersect it.
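Several of the characteristics listed above can be checked numerically. The sketch below is an illustration only: the parameters (mean 100, standard deviation 15), the helper names, and the choice of a trapezoid rule over eight standard deviations on each side are all assumptions, not taken from the article.

```python
import math


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def trapezoid_area(f, a: float, b: float, steps: int = 10_000) -> float:
    """Approximate the area under f between a and b with the trapezoid rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


mu, sigma = 100.0, 15.0  # hypothetical parameters
lo, hi = mu - 8 * sigma, mu + 8 * sigma  # wide enough that the omitted tails are negligible


def f(x: float) -> float:
    return normal_pdf(x, mu, sigma)


total_area = trapezoid_area(f, lo, hi)  # close to 1
left_area = trapezoid_area(f, lo, mu)   # close to 0.5
tail_value = f(mu + 6 * sigma)          # very small, but still above zero

print(f"total area: {total_area:.6f}")
print(f"area left of the mean: {left_area:.6f}")
print(f"density six standard deviations out: {tail_value:.3e}")
```

The density far from the mean is tiny but positive, which is what it means for the tails to approach the horizontal axis without touching it.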

Empirical Rules for the Normal Distribution

In a normal distribution, about 68% of observations are within +/- 1 standard deviation of the mean, about 95% are within +/- 2 standard deviations, and about 99.7% are within +/- 3 standard deviations. This property is known as the Empirical Rule, which describes the percentage of data that falls within a given number of standard deviations from the mean for a bell-shaped curve.
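These percentages can be reproduced from the error function in the Python standard library. This is a minimal sketch; the helper name `prob_within` is an assumption made for illustration.

```python
import math


def prob_within(k: float) -> float:
    """Probability that a normal value lies within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(f"within +/- {k} standard deviations: {prob_within(k):.4f}")
```

The loop prints 0.6827, 0.9545, and 0.9973, which round to the 68%, 95%, and 99.7% of the Empirical Rule.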

Standard Normal Distribution and Standard Score

The standard normal distribution is the normal distribution with a mean of 0 and a standard deviation of 1. It is also known as the Z distribution. Values in the standard normal distribution are known as standard scores or Z-scores. A standard score represents the number of standard deviations that a particular observation lies above or below the mean. For example, a standard score of 1.5 indicates that the observation is 1.5 standard deviations above the mean. A negative score represents a value below the mean, and the mean itself has a Z-score of 0.

Standardization: How to Calculate Z Values

A standardized scale allows you to make comparisons between observations that would otherwise be difficult. The process of converting to this scale is known as standardization, and it allows you to compare observations and calculate probabilities across different populations.

To standardize your data, you convert raw measurements into Z-scores. To calculate the standard score of an observation, start from the raw measurement, subtract the mean, and then divide by the standard deviation. Mathematically, the formula is $Z = \frac{X - \mu}{\sigma}$, where $X$ represents the raw value of the measurement of interest.

$\mu$ and $\sigma$ represent the parameters of the population from which the observation is taken. Once the data are standardized, they can be placed on the standard normal distribution. In this way, standardization allows you to compare different types of observations based on where each observation sits within its own distribution.
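Standardizing a whole set of measurements follows the same steps: subtract the mean, then divide by the standard deviation. The sketch below is illustrative only. The data are hypothetical, the function name `standardize` is an assumption, and the data are treated as a complete population, so the population standard deviation is used.

```python
import statistics


def standardize(values: list[float]) -> list[float]:
    """Convert raw values to Z-scores using the mean and population standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("cannot standardize values with zero spread")
    return [(v - mean) / sd for v in values]


# Hypothetical raw measurements, invented for illustration.
raw = [52.0, 55.0, 61.0, 64.0, 68.0]
z_scores = standardize(raw)

print([round(z, 3) for z in z_scores])
print(f"mean of Z-scores: {statistics.fmean(z_scores):.3f}")
print(f"standard deviation of Z-scores: {statistics.pstdev(z_scores):.3f}")
```

After standardization the values have a mean of 0 and a standard deviation of 1, which is what places them on the standard normal scale.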


Standard Score for Comparing Male and Female Height

Suppose we want to compare the heights of male and female students. Imagine a man who is 170 cm tall and a woman who is 165 cm tall. Comparing the raw values, it is easy to see that the man is taller than the woman.

However, let's compare their standard scores. To do so, we need the properties of the height distribution for men and for women. Assume height follows a normal distribution with the following parameter values:

Male height: $\mu$ = 175, $\sigma$ = 30
Female height: $\mu$ = 160, $\sigma$ = 10

Now we will calculate the Z-scores:

Men's Z-score = (170-175)/30 = -0.1667
Women's Z-score = (165-160)/10 = 0.5

The man's Z-score (-0.1667) is negative, which means he is shorter than the average man. The woman's Z-score (0.5) is positive, which means she is taller than the average woman. So although the man is taller in raw terms, the woman is taller relative to her own distribution.
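The two calculations above can be reproduced with a small helper. This is a minimal sketch; the function name `z_score` is an assumption made for illustration, and the numbers are the ones used in the calculation above.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


z_man = z_score(170, 175, 30)    # height 170 cm, mean 175, standard deviation 30
z_woman = z_score(165, 160, 10)  # height 165 cm, mean 160, standard deviation 10

print(f"man:   {z_man:.4f}")
print(f"woman: {z_woman:.4f}")
```

A negative result places the observation below its group mean and a positive result places it above, so the two scores can be compared even though they come from different distributions.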

Finding the Area Under the Normal Distribution Curve

The normal distribution is a probability distribution. As with any probability distribution, the proportion of the area under the curve between two points indicates the probability that a value will fall in that interval. To learn more about this property, first understand what is meant by a probability distribution.

Usually, statistical software is used to find the area under the curve. However, when working with the normal distribution you can convert values to standard scores and look up the corresponding areas in a Standard Normal Distribution Table. Because there is an infinite number of different normal distributions, a publisher cannot print a table for each one. Instead, we can convert values from any normal distribution into Z-scores and then use the table of standard scores to calculate probabilities.

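For example, for the standard normal distribution (mean 0, standard deviation 1), the area between Z = -3 and Z = 0.6667 can be found from the signed areas measured from the mean:

p_bottom = -0.4986501019683699
p_top = 0.2475074624530771
Area Under Curve = p_top - p_bottom = 0.746157564421447
Normal Distribution (mean, std): 0, 1
Integration of the curve between -3 and 0.66666666666666446 gives 0.746574571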
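The area shown above can be recomputed without a printed table. The sketch below is one possible approach, not necessarily the method that produced the figures above: it builds the standard normal cumulative distribution from `math.erf`, and the function names `normal_cdf` and `area_between` are assumptions made for illustration.

```python
import math


def normal_cdf(z: float) -> float:
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def area_between(a: float, b: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Probability that a normal value with mean mu and std sigma falls between a and b."""
    z_a = (a - mu) / sigma  # convert both bounds to Z-scores first
    z_b = (b - mu) / sigma
    return normal_cdf(z_b) - normal_cdf(z_a)


area = area_between(-3.0, 2.0 / 3.0)
print(f"area between -3 and 0.6667: {area:.6f}")
```

Because the bounds are converted to Z-scores inside `area_between`, the same function works for any normal distribution by passing its `mu` and `sigma`.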

Book Recommendations

1. Introduction to Research Statistics

By Dr. Nila Kesumawati, M.Si. The concepts and examples of basic statistical calculations in this book are presented in a simple and organized manner, making it easier for readers to understand how to analyze data from educational research. A concept map at the beginning of each chapter helps the reader see the divisions, differences, and links between topics in outline.

This book works well as teaching material in basic statistics and even research statistics courses because the material is arranged hierarchically and adapted to the syllabus of tertiary institutions. It can also help students prepare their final assignments, and even researchers who conduct qualitative research.

2. Statistics for Modern Economics and Finance 3rd Edition

Book 1, by Suharyadi and Purwanto SK. Along with economic and financial developments, statistics are used to examine the business cycle and the influence of economic policies on a country's economy. Statistics for Modern Economics and Finance 3rd Edition is specially designed for modern economics and finance, applying statistical theory to actual cases.

In this edition, there are several updates to examples and cases that are more contextual. This book is divided into 2 books (volumes). Book 1 emphasizes descriptive statistics which consists of 10 chapters and is planned to be completed in 1 semester for Statistics 1 course. Book 2 emphasizes inferential statistics which also consists of 10 chapters and can be completed in 1 semester for Statistics 2 course.

3. Statistical Functions for Analyzing Data

This book covers not only statistical functions but also all the supporting functions that allow us to work in the field of statistics, including data preparation and data processing before calculations are performed with the statistical functions. Note that as of Excel 2010, many Excel function names have been changed to make each name consistent with its use. Statistical functions are one of the groups that received the most improvements.

This is convenient for Excel users because names that used to feel awkward and difficult to recognize and memorize are now more consistent with their meaning. Examples include BINOM.DIST for the Binomial Distribution and BINOM.INV for the inverse of the Binomial. Likewise, there is GAMMA.DIST for the Gamma Distribution and GAMMA.INV for Gamma inverse values. There are also the VAR.P and VAR.S functions, which calculate the variance of an entire population and the variance of a sample. Many other Excel function names were improved in the same way.