MYD-I Weekly Assignments

Answer the weekly assignments in your assignment booklet and bring them on time (at class time).

Week 1: 22.02.12

Qualitative data involves assigning non-numerical items to groups or categories. Qualitative data are also referred to as categorical data. The qualitative characteristic or classification group of an item is an attribute. Categorical variables are typically assigned attributes using a nominal, ordinal, or binary scale.

Nominal variables are categorical variables that have three or more possible levels with no natural ordering.

Ordinal variables are categorical variables that have three or more possible levels with a natural ordering, such as strongly disagree, disagree, neutral, agree, and strongly agree.

Binary variables are categorical variables that have two possible levels (e.g., yes/no).

Week 2 - 29.02.12


A population data set includes all items of the set, such as the height of every person in the United States, or the volume of every can of soda pop that a manufacturer produces. If the desired information is available for all items in the population, we have what is referred to as a census. In practice, we rarely have a complete set of data. We usually collect data in samples, such as the volumes of the last thirty cans of pop.

Week 3 - 07.03.12


To compute the population standard deviation, we use the population mean and divide by n instead of n-1. In practice, the population standard deviation is rarely used because the true population mean is usually unknown. The use of the sample standard deviation is particularly important for smaller sample sizes. However, as the sample size gets large (say n > 100), the difference between dividing by n versus n-1 may become negligible. The typical notation used to represent the sample standard deviation is S; the Greek letter σ is used to represent the population standard deviation. The terms S or σ̂ represent the estimate of the population standard deviation.
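
The difference between dividing by n and by n-1 can be checked with a short sketch. The function names and the measurement values below are hypothetical, chosen only for illustration; the formulas are the standard ones described above.

```python
import math

def population_std(values):
    """Population standard deviation: divide the squared deviations by n."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

def sample_std(values):
    """Sample standard deviation: divide the squared deviations by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))

# Hypothetical measurements, invented for illustration only.
small = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # n = 8
large = small * 25                                # n = 200, same spread

# Ratio of the two estimates: sqrt(n / (n - 1)).
ratio_small = sample_std(small) / population_std(small)
ratio_large = sample_std(large) / population_std(large)

print(f"n = {len(small)}: ratio = {ratio_small:.4f}")
print(f"n = {len(large)}: ratio = {ratio_large:.4f}")
```

For the small data set the two estimates differ by several percent, while for 200 values the ratio is very close to 1, which is the sense in which the choice of divisor becomes negligible for large n.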


Week 4 - 14.03.12


First, arrange the data into frequency ranges (bins) of equal width. The choice of the number and width of the bins is up to the analyst. For continuous data, a general rule of thumb is to set the number of bins equal to the square root of the number of samples (rounded to the nearest whole number). To obtain the bin width, divide the range of the data set by the number of bins (rounded to the desired precision of the measurement data). This example has 100 samples and a range of 0.23 mm. Thus, an analyst might create 10 bins (= sqrt(100)) of width 0.02 mm (0.23/10 = 0.023, rounded to 0.02). In this example, the value 25.34 was chosen as the starting point because relatively few values are below it.
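
The bin arithmetic above can be sketched as follows. The function name `bin_plan` and its `precision` parameter (number of decimal places) are assumptions made for this illustration; the numbers 100, 0.23, and 25.34 come from the example in the text.

```python
import math

def bin_plan(n_samples, data_range, precision):
    """Return (number_of_bins, bin_width) using the square-root rule of thumb."""
    # Number of bins: square root of the sample count, rounded to a whole number.
    n_bins = round(math.sqrt(n_samples))
    # Bin width: range divided by the number of bins, rounded to the
    # measurement precision (given here as a number of decimal places).
    width = round(data_range / n_bins, precision)
    return n_bins, width

n_bins, width = bin_plan(n_samples=100, data_range=0.23, precision=2)

# Bin edges starting at 25.34, the starting point chosen in the text.
start = 25.34
edges = [round(start + i * width, 2) for i in range(n_bins + 1)]

print(n_bins, width)
print(edges)
```

Because the width is rounded down from 0.023 to 0.02, the ten bins span 0.20, slightly less than the full range of 0.23. This is consistent with choosing a starting point that leaves a few low values outside the first bin.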

Week 5:

Listen to the videos at the addresses and answer the following questions:

1. Write down the terms (in both English and Turkish, with descriptions if possible) in all videos that:

a. You already know

b. Are new to you

2. Write down exactly what is spoken during at least 30 seconds of any video. (Write the name of the video you choose.)

3. What did you learn from these videos?

Assignments after the midterm:

Week 1:


p-values and statistical significance


When conducting a statistical analysis, the p-value is the probability of observing a difference at least as large as the one in the sample if no true difference exists (for example, if two machines are not producing different mean outputs). A common method for determining significance in a statistical comparison is to conclude that a difference exists if the p-value is less than the alpha error level.


For example, suppose a comparison of two bottle-filling machines gives a p-value of 0.24. Assuming alpha = 0.05, you would not conclude that the two means are different. However, if the p-value is 0.003, you would conclude that the means are different. In this example, a difference this large would be expected only 0.3% of the time if the two means were truly equal.
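
The decision rule can be sketched in a few lines. This is only an illustration: the function names and the fill volumes are hypothetical, and the p-value is computed with a normal (z) approximation, which assumes reasonably large samples. The samples here are kept short for readability, and for samples this small a t-test would be the more usual choice.

```python
import math
from statistics import NormalDist, mean, variance

def two_sample_p_value(a, b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    # Standard error of the difference between the two sample means.
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / se
    # Probability of a |z| at least this large if the true means are equal.
    return 2 * (1 - NormalDist().cdf(abs(z)))

def is_significant(p_value, alpha=0.05):
    """Decision rule from the text: conclude a difference if p < alpha."""
    return p_value < alpha

# The two p-values discussed in the text.
print(is_significant(0.24))   # not significant at alpha = 0.05
print(is_significant(0.003))  # significant at alpha = 0.05

# Hypothetical fill volumes (ml) for two machines, invented for illustration.
machine_a = [500.1, 499.8, 500.3, 500.0, 499.9, 500.2]
machine_b = [501.1, 500.8, 501.3, 501.0, 500.9, 501.2]
p = two_sample_p_value(machine_a, machine_b)
print(is_significant(p))
```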

Week 2:


Z-scores transform data onto the standard normal distribution, whose mean = 0 and variance (σ²) = 1. Z-scores provide a mapping from the distribution of some variable to a standardized scale. These mappings express each value as the number of standard deviations it lies away from the mean. If the mean of a process = 4 mm and the standard deviation = 1 mm, then an observed value of 1 mm lies 3 standard deviations below the mean. For this example, Z = -3 is equivalent to an actual observation of 1 mm.
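
The mapping in this example can be written out as a small sketch. The function names `z_score` and `from_z` are hypothetical; the values (mean 4, standard deviation 1, observation 1) are those used in the text.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

def from_z(z, mean, std_dev):
    """Inverse mapping: recover the observation that corresponds to z."""
    return mean + z * std_dev

z = z_score(1.0, mean=4.0, std_dev=1.0)
x = from_z(z, mean=4.0, std_dev=1.0)

print(z)  # -3.0
print(x)  # 1.0
```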

Week 3

When drawing conclusions based on correlation coefficients, several important items must be considered: 

·  Correlation coefficients only measure linear relationships. A meaningful nonlinear relationship can exist even if the correlation coefficient is 0.

·  Correlation does NOT always indicate cause and effect. One should not conclude that changes to one variable cause changes in another. Properly controlled experiments are needed to verify that a correlation relationship indicates causation. 

·  A correlation coefficient is very sensitive to extreme values. A single value that is very different from the others in a data set can change the value of the coefficient a great deal. In the example below, the correlation is 0.9, but the scatter plot suggests that an outlier more likely explains the relationship than the predictor variable. If you removed the outlier value, the correlation between these two variables would drop to 0.1 over the smaller range of X.
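
Two of the points above, the restriction to linear relationships and the sensitivity to extreme values, can be illustrated with a minimal sketch. The function name `pearson_r` and all data values are hypothetical and invented for this illustration; they are not the data behind the 0.9 and 0.1 figures quoted in the text.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# A perfect nonlinear relationship (y = x^2) with zero linear correlation.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]
r_parabola = pearson_r(xs, ys)

# A cluster with no linear trend, then the same cluster plus one outlier.
x_cluster = [1, 2, 3, 4, 5]
y_cluster = [2, 1, 3, 1, 2]
r_cluster = pearson_r(x_cluster, y_cluster)
r_with_outlier = pearson_r(x_cluster + [20], y_cluster + [20])

print(f"parabola:     r = {r_parabola:.2f}")
print(f"cluster only: r = {r_cluster:.2f}")
print(f"with outlier: r = {r_with_outlier:.2f}")
```

In this invented data set a single extreme point raises the coefficient from roughly zero to above 0.9, which is the kind of effect a scatter plot reveals and the coefficient alone hides.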