BCA 6th Sem -Data Science and Machine Learning UNIT-III MCQ

 


  • UNIT-I
    Introduction to Data Science:
    - Evolution of Data Science
    - Data Science Roles
    - Stages in a Data Science Project
    - Applications of Data Science in various fields
    - Data Security Issues

    Unit-1 MCQ's
  • UNIT-II
    Data Collection and Data Pre-Processing:
    - Data Collection Strategies
    - Data Pre-Processing Overview
    - Data Cleaning
    - Data Integration and Transformation
    - Data Reduction

    Unit-2 MCQ's
  • UNIT-III
    Exploratory Data Analytics:
    - Descriptive Statistics: Mean, Standard Deviation
    - Skewness and Kurtosis
    - Box Plots
    - Pivot Table
    - Correlation Statistics
    - ANOVA

    Unit-3 MCQ's
  • UNIT-IV
    - Idea of Machines learning from data
    - Classification of problem: Regression and Classification
    - Supervised and Unsupervised learning

  • UNIT-V
    Neural Networks:
    - History
    - Artificial and biological neural networks
    - Artificial intelligence and neural networks
    - Biological neurons
    - Models of single neurons
    - Different neural network models

    Unit-5 MCQ's

    Data Science and Machine Learning 

                

    Basic EDA Concepts

    1. What is the primary goal of Exploratory Data Analysis (EDA)?
      a) Predicting future outcomes
      b) Summarizing main characteristics of data
      c) Building machine learning models
      d) Automating data collection

      Answer: b) Summarizing main characteristics of data

    2. Which Python library is widely used for EDA?
      a) TensorFlow
      b) OpenCV
      c) Pandas
      d) Scikit-learn

      Answer: c) Pandas

    3. Which of the following plots is most suitable for visualizing the distribution of a numerical variable?
      a) Pie chart
      b) Bar chart
      c) Histogram
      d) Line chart

      Answer: c) Histogram

    4. What is a common way to check for missing values in a Pandas DataFrame?
      a) df.describe()
      b) df.isnull().sum()
      c) df.sort_values()
      d) df.corr()

      Answer: b) df.isnull().sum()

    5. Which measure of central tendency is most resistant to outliers?
      a) Mean
      b) Median
      c) Mode
      d) Standard deviation

      Answer: b) Median

    6. Which visualization tool is best for detecting outliers?
      a) Histogram
      b) Bar chart
      c) Scatter plot
      d) Box plot

      Answer: d) Box plot

    7. Which function provides summary statistics for a Pandas DataFrame?
      a) df.head()
      b) df.describe()
      c) df.shape
      d) df.dtypes

      Answer: b) df.describe()

    8. Which of the following helps in detecting multicollinearity?
      a) Heatmap
      b) Box plot
      c) Scatter plot
      d) Histogram

      Answer: a) Heatmap

    9. Which of the following can be used to detect skewness in data?
      a) Scatter plot
      b) Histogram
      c) Box plot
      d) Line plot

      Answer: b) Histogram

    10. What does a correlation coefficient of 0 indicate?
      a) Perfect negative correlation
      b) Perfect positive correlation
      c) No correlation
      d) Strong correlation

    Answer: c) No correlation
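
The Pandas calls named in questions 4, 5, 7 and 8 can be tried together on a small table. This is a minimal sketch: the DataFrame, its column names and its values are made up for illustration and do not come from any course dataset.

```python
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "age": [23, 25, 31, None, 40],
    "income": [30000, 32000, 45000, 52000, 250000],
})

summary = df.describe()        # count, mean, std, min, quartiles, max per column
missing = df.isnull().sum()    # number of missing values per column
corr = df.corr()               # pairwise Pearson correlations (input for a heatmap)

# The single large income (250000) pulls the mean up but leaves the median alone.
income_mean = df["income"].mean()
income_median = df["income"].median()

print(summary)
print(missing)
print(corr)
print("mean:", income_mean, "median:", income_median)
```

The gap between the mean and the median of `income` is the effect question 5 asks about: one extreme value moves the mean a long way while the median stays at the middle observation.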

    Advanced EDA Concepts

    11. Which statistical measure quantifies the spread of data?
      a) Mean
      b) Standard deviation
      c) Median
      d) Mode

    Answer: b) Standard deviation

    12. Which Pandas function is used to detect duplicate rows in a dataset?
      a) df.dropna()
      b) df.duplicated()
      c) df.isnull()
      d) df.fillna()

    Answer: b) df.duplicated()

    13. What does a right-skewed histogram indicate?
      a) The mean is less than the median
      b) The mean is greater than the median
      c) The data is evenly distributed
      d) The data has no skew

    Answer: b) The mean is greater than the median

    14. Which type of variable is best represented by a bar chart?
      a) Continuous
      b) Categorical
      c) Numerical
      d) Interval

    Answer: b) Categorical

    15. Which method is used to replace missing values with the median in Pandas?
      a) df.fillna(df.mean())
      b) df.fillna(df.median())
      c) df.dropna()
      d) df.replace()

    Answer: b) df.fillna(df.median())

    16. Which function in Pandas is used to check data types of columns?
      a) df.head()
      b) df.dtypes
      c) df.shape
      d) df.describe()

    Answer: b) df.dtypes

    17. Which visualization technique is best for showing the relationship between two numerical variables?
      a) Box plot
      b) Histogram
      c) Scatter plot
      d) Bar chart

    Answer: c) Scatter plot

    18. Which measure is useful for identifying the dispersion of data?
      a) Mean
      b) Standard deviation
      c) Mode
      d) Median

    Answer: b) Standard deviation

    19. Which of the following is a measure of shape in EDA?
      a) Variance
      b) Skewness
      c) Standard deviation
      d) Median

    Answer: b) Skewness

    20. Which method is used to normalize a dataset?
      a) Min-max scaling
      b) Standard deviation calculation
      c) Calculating the mean
      d) Removing duplicates

    Answer: a) Min-max scaling
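
The cleaning steps asked about above (duplicate detection, median imputation, data types, min-max scaling) can be sketched in a few lines of Pandas. The table below is hypothetical, and the min-max formula is written out by hand rather than taken from a library helper.

```python
import pandas as pd

# Hypothetical data: one missing score and one repeated row.
df = pd.DataFrame({
    "score": [10.0, 20.0, None, 40.0, 40.0],
    "group": ["a", "b", "b", "c", "c"],
})

dup_mask = df.duplicated()   # True for a row that repeats an earlier row
types = df.dtypes            # data type of each column

# Replace the missing score with the column median (median of 10, 20, 40, 40 is 30).
filled = df.copy()
filled["score"] = filled["score"].fillna(filled["score"].median())

# Min-max scaling: (x - min) / (max - min) maps the column onto [0, 1].
col = filled["score"]
scaled = (col - col.min()) / (col.max() - col.min())

print(dup_mask.tolist())
print(types)
print(filled)
print(scaled.tolist())
```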

    21. The mean of a dataset is also known as:

    a) Median
    b) Average
    c) Mode
    d) Range

    Answer: b) Average

    22. What is the mean of the numbers 10, 20, 30, 40, and 50?
    a) 30
    b) 25
    c) 35
    d) 40

    Answer: a) 30

    23. If all values in a dataset are increased by 5, how does the mean change?
    a) Increases by 5
    b) Decreases by 5
    c) Remains the same
    d) Increases by 10

    Answer: a) Increases by 5

    24. The median of a dataset is:
    a) The most frequently occurring value
    b) The middle value when arranged in order
    c) The sum of all values divided by the total count
    d) The difference between maximum and minimum values

    Answer: b) The middle value when arranged in order

    25. If a dataset has an even number of values, the median is:
    a) The smallest value
    b) The average of the two middle values
    c) The largest value
    d) The mode

    Answer: b) The average of the two middle values

    26. Which measure of central tendency is most affected by outliers?
    a) Mean
    b) Median
    c) Mode
    d) Range

    Answer: a) Mean

    27. Which measure of central tendency is best for skewed data?
    a) Mean
    b) Median
    c) Mode
    d) Variance

    Answer: b) Median

    28. The mode of a dataset is:
    a) The most frequently occurring value
    b) The middle value
    c) The sum of all values divided by the count
    d) The range

    Answer: a) The most frequently occurring value

    29. A dataset with two modes is called:
    a) Unimodal
    b) Bimodal
    c) Multimodal
    d) Non-modal

    Answer: b) Bimodal

    30. Which measure of central tendency can have more than two values?
    a) Mean
    b) Median
    c) Mode
    d) Range

    Answer: c) Mode
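
The answers on mean, median and mode can be checked with Python's standard `statistics` module. A small sketch follows; apart from the numbers 10 to 50 from question 22, the lists are made-up examples.

```python
import statistics

data = [10, 20, 30, 40, 50]                 # numbers from question 22
mean = statistics.mean(data)
median = statistics.median(data)

# Adding 5 to every value shifts the mean by exactly 5 (question 23).
shifted_mean = statistics.mean([x + 5 for x in data])

# With an even count, the median is the average of the two middle values (question 25).
even_median = statistics.median([10, 20, 30, 40])

# A dataset can have more than one mode (questions 29 and 30).
modes = statistics.multimode([1, 2, 2, 3, 3, 4])

# One extreme value moves the mean far more than the median (question 26).
with_outlier = data + [1000]
outlier_mean = statistics.mean(with_outlier)
outlier_median = statistics.median(with_outlier)

print(mean, median, shifted_mean, even_median, modes)
print(outlier_mean, outlier_median)
```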


    Measures of Dispersion (Standard Deviation, Variance, Range, IQR)

    31. The range of a dataset is calculated as:
    a) The sum of all values divided by total count
    b) The difference between the highest and lowest values
    c) The middle value
    d) The most frequently occurring value

    Answer: b) The difference between the highest and lowest values

    32. Standard deviation measures:
    a) The central value of a dataset
    b) The dispersion of data points from the mean
    c) The most frequent value
    d) The correlation between variables

    Answer: b) The dispersion of data points from the mean

    33. A dataset with a standard deviation of 0 means:
    a) The data has high variation
    b) All values are the same
    c) The data has outliers
    d) The mean is zero

    Answer: b) All values are the same

    34. Variance is:
    a) The square root of the standard deviation
    b) The square of the standard deviation
    c) The sum of all values
    d) The difference between the highest and lowest values

    Answer: b) The square of the standard deviation

    35. If all values in a dataset are increased by 10, how does the standard deviation change?
    a) Increases
    b) Decreases
    c) Remains the same
    d) Becomes zero

    Answer: c) Remains the same

    36. If all values in a dataset are multiplied by 3, the standard deviation:
    a) Increases 3 times
    b) Remains the same
    c) Decreases
    d) Becomes zero

    Answer: a) Increases 3 times

    37. Which of the following is NOT a measure of dispersion?
    a) Standard deviation
    b) Variance
    c) Median
    d) Range

    Answer: c) Median

    38. A dataset has a variance of 16. What is its standard deviation?
    a) 2
    b) 4
    c) 8
    d) 16

    Answer: b) 4

    39. A smaller standard deviation indicates:
    a) Higher spread in data
    b) Lower spread in data
    c) More outliers
    d) A higher mean

    Answer: b) Lower spread in data

    40. The interquartile range (IQR) is:
    a) Q3 - Q1
    b) Q2 - Q1
    c) Q3 - Mean
    d) Q3 + Q1

    Answer: a) Q3 - Q1
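
The shift and scale rules from questions 35 and 36 are easy to confirm numerically. The sketch below uses a made-up dataset and the population formulas (`pstdev`, `pvariance`); the quartiles come from `statistics.quantiles`, whose default interpolation is only one of several quartile conventions, so the IQR of a small sample can differ slightly between tools.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative values only

sd = statistics.pstdev(data)             # population standard deviation
variance = statistics.pvariance(data)    # square of the standard deviation
value_range = max(data) - min(data)      # highest minus lowest

# Adding a constant leaves the spread unchanged; multiplying by 3 triples it.
shifted_sd = statistics.pstdev([x + 10 for x in data])
scaled_sd = statistics.pstdev([x * 3 for x in data])

# Interquartile range: Q3 - Q1.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(sd, variance, value_range)
print(shifted_sd, scaled_sd)
print(q1, q3, iqr)
```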


    Application-Based Questions

    41. Which is more resistant to outliers:
    a) Mean
    b) Median
    c) Standard deviation
    d) Variance

    Answer: b) Median

    42. A higher variance means:
    a) Data points are closer to the mean
    b) Data points are spread out
    c) The dataset is symmetric
    d) The mean is zero

    Answer: b) Data points are spread out

    43. If the standard deviation of a dataset is 5, what is the variance?
    a) 5
    b) 10
    c) 25
    d) 50

    Answer: c) 25

    44. In a normal distribution, approximately 68% of data falls within:
    a) 1 standard deviation from the mean
    b) 2 standard deviations from the mean
    c) 3 standard deviations from the mean
    d) No standard deviations

    Answer: a) 1 standard deviation from the mean

    45. A standard deviation of 0 means:
    a) No variation in data
    b) Data is highly spread out
    c) Data is negatively skewed
    d) The mean is zero

    Answer: a) No variation in data

    46. In a perfectly symmetrical dataset, the mean and median are:
    a) Always different
    b) Always equal
    c) Sometimes equal
    d) Unrelated

    Answer: b) Always equal

    47. Which is not affected by extreme values?
    a) Mean
    b) Median
    c) Standard deviation
    d) Variance

    Answer: b) Median

    48. The standard deviation of {5, 5, 5, 5, 5} is:
    a) 0
    b) 5
    c) 25
    d) 10

    Answer: a) 0

    49. The sum of squared deviations from the mean is used to calculate:
    a) Range
    b) Variance
    c) Median
    d) Mode

    Answer: b) Variance

    50. A lower standard deviation means:
    a) Higher consistency in data
    b) More variability
    c) More outliers
    d) A higher range

    Answer: a) Higher consistency in data
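
The 68% figure in question 44 follows from the normal distribution itself: the share of values within k standard deviations of the mean is erf(k / √2). A short sketch using only the standard library (the function name is chosen here for illustration):

```python
import math

def share_within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```

Running it gives roughly 68%, 95% and 99.7% for one, two and three standard deviations.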

    Skewness MCQs

    51. Skewness measures the ______________ of a dataset.
    a) Spread
    b) Symmetry
    c) Central tendency
    d) Variability

    Answer: b) Symmetry

    52. If a distribution has a long right tail, it is called:
    a) Positively skewed
    b) Negatively skewed
    c) Normally distributed
    d) Symmetric

    Answer: a) Positively skewed

    53. In a negatively skewed distribution:
    a) The mean is greater than the median
    b) The median is greater than the mean
    c) The mean and median are equal
    d) The data has no outliers

    Answer: b) The median is greater than the mean

    54. In a perfectly symmetrical distribution, the skewness value is:
    a) 1
    b) -1
    c) 0
    d) Undefined

    Answer: c) 0

    55. Which of the following distributions is most likely to have a skewness value close to zero?
    a) A uniform distribution
    b) A normal distribution
    c) A bimodal distribution
    d) An exponential distribution

    Answer: b) A normal distribution

    56. A left-skewed distribution has:
    a) A longer tail on the right
    b) A longer tail on the left
    c) No tails
    d) Equal tails on both sides

    Answer: b) A longer tail on the left

    57. If the skewness of a dataset is greater than 1, the distribution is:
    a) Heavily skewed
    b) Symmetric
    c) Normally distributed
    d) Bimodal

    Answer: a) Heavily skewed

    58. Which measure is least affected by skewness?
    a) Mean
    b) Median
    c) Mode
    d) Variance

    Answer: b) Median

    59. Which formula is used to calculate skewness?
    a) Karl Pearson’s coefficient of skewness
    b) Interquartile range formula
    c) Central limit theorem
    d) Least squares method

    Answer: a) Karl Pearson’s coefficient of skewness

    60. In a negatively skewed distribution, which measure of central tendency is the largest?
    a) Mean
    b) Median
    c) Mode
    d) None of the above

    Answer: c) Mode
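
The sign rules above can be checked with a moment-based skewness. Note that this sketch uses the third standardized moment (population form), which is a different formula from Karl Pearson's coefficient in question 59; the sample lists are invented for illustration.

```python
def skewness(values):
    """Third standardized moment: m3 / m2**1.5 (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 3, 4, 20]      # long tail on the right
left_tailed = [-20, 1, 2, 3, 4]      # long tail on the left

print(skewness(symmetric))      # zero for a symmetric sample
print(skewness(right_tailed))   # positive
print(skewness(left_tailed))    # negative
```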


    Kurtosis MCQs

    61. Kurtosis measures the ______________ of a dataset.
    a) Central tendency
    b) Spread
    c) Shape of the tails
    d) Mean deviation

    Answer: c) Shape of the tails

    62. A normal distribution has a kurtosis value of:
    a) 1
    b) 2
    c) 3
    d) 0

    Answer: c) 3

    63. If a distribution has kurtosis greater than 3, it is called:
    a) Platykurtic
    b) Mesokurtic
    c) Leptokurtic
    d) Symmetric

    Answer: c) Leptokurtic

    64. A leptokurtic distribution has:
    a) Thin tails
    b) Fat tails
    c) No tails
    d) Uniform shape

    Answer: b) Fat tails

    65. A platykurtic distribution has:
    a) Higher peaks and thicker tails
    b) Lower peaks and thinner tails
    c) Equal tails on both sides
    d) A perfect bell shape

    Answer: b) Lower peaks and thinner tails

    66. Which distribution is considered mesokurtic?
    a) Uniform distribution
    b) Normal distribution
    c) Exponential distribution
    d) Poisson distribution

    Answer: b) Normal distribution

    67. Kurtosis measured relative to the normal distribution's value of 3 (that is, kurtosis minus 3) is known as:
    a) Excess kurtosis
    b) Standard kurtosis
    c) Absolute kurtosis
    d) Moderate kurtosis

    Answer: a) Excess kurtosis

    68. A dataset with kurtosis less than 3 is classified as:
    a) Leptokurtic
    b) Platykurtic
    c) Mesokurtic
    d) Skewed

    Answer: b) Platykurtic

    69. Which of the following is true about a distribution with high kurtosis?
    a) It has outliers far from the mean
    b) It is always symmetrical
    c) It has a flat peak
    d) It is negatively skewed

    Answer: a) It has outliers far from the mean

    70. Which statistic measures whether a dataset has light or heavy tails?
    a) Mean
    b) Standard deviation
    c) Kurtosis
    d) Variance

    Answer: c) Kurtosis
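
The three labels used above can be tied to numbers with a moment-based kurtosis. In this sketch the fourth standardized moment is computed by hand; the tolerance around 3 used to call a sample mesokurtic is an arbitrary assumption, and both sample lists are invented.

```python
def kurtosis(values):
    """Fourth standardized moment: m4 / m2**2 (a normal distribution gives 3)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2

def classify(k, tol=0.25):
    """Label a kurtosis value; tol is an illustrative cut-off, not a standard."""
    if k > 3 + tol:
        return "leptokurtic"
    if k < 3 - tol:
        return "platykurtic"
    return "mesokurtic"

flat = list(range(1, 11))                          # evenly spread, light tails
heavy = [-10, -1, 0, 0, 0, 0, 0, 0, 1, 10]         # two far-out values

for name, sample in (("flat", flat), ("heavy", heavy)):
    k = kurtosis(sample)
    print(name, round(k, 3), "excess:", round(k - 3, 3), classify(k))
```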


    Combination MCQs (Skewness & Kurtosis)

    71. Which of the following does NOT affect skewness and kurtosis?
    a) Outliers
    b) Data distribution
    c) Sample size
    d) Mean

    Answer: d) Mean

    72. A normal distribution has:
    a) Skewness of 0 and kurtosis of 3
    b) Skewness of 1 and kurtosis of 0
    c) Skewness of -1 and kurtosis of 1
    d) Skewness of 2 and kurtosis of 4

    Answer: a) Skewness of 0 and kurtosis of 3

    73. A distribution with high skewness and high kurtosis has:
    a) A long tail and many extreme values
    b) A short tail and no outliers
    c) No skewness
    d) Equal probabilities for all values

    Answer: a) A long tail and many extreme values

    74. What happens if a dataset has a high positive skewness and high kurtosis?
    a) The dataset has a long right tail and many extreme values
    b) The dataset is perfectly symmetric
    c) The dataset is normally distributed
    d) The dataset has no outliers

    Answer: a) The dataset has a long right tail and many extreme values

    75. If a dataset has negative skewness and low kurtosis, the data is:
    a) Left-skewed with thin tails
    b) Right-skewed with thick tails
    c) Normally distributed
    d) Symmetric

    Answer: a) Left-skewed with thin tails

    76. When the mean and median are equal, the skewness is likely to be:
    a) Positive
    b) Negative
    c) Zero
    d) Undefined

    Answer: c) Zero

    77. Which measure helps determine whether a distribution has extreme outliers?
    a) Mean
    b) Variance
    c) Kurtosis
    d) Standard deviation

    Answer: c) Kurtosis

    78. Which of the following distributions would most likely have high kurtosis?
    a) A dataset with many extreme outliers
    b) A uniform distribution
    c) A symmetric, bell-shaped distribution
    d) A dataset with no variation

    Answer: a) A dataset with many extreme outliers

    79. A leptokurtic distribution is more likely to have:
    a) Extreme values
    b) A flat shape
    c) No skewness
    d) A symmetrical spread

    Answer: a) Extreme values

    80. When interpreting skewness and kurtosis values, which reference assumption are they typically compared against?
    a) The dataset is normally distributed
    b) The dataset has equal variance
    c) The dataset is unimodal
    d) The dataset contains outliers

    Answer: a) The dataset is normally distributed

    Skewness Numericals

    81. Given the following dataset: {10, 15, 20, 25, 80}, what is the mean?
    a) 30
    b) 20
    c) 50
    d) 25

    Solution:

    $$\text{Mean} = \frac{10 + 15 + 20 + 25 + 80}{5} = \frac{150}{5} = 30$$

    Answer: a) 30


    82. Using the same dataset {10, 15, 20, 25, 80}, what is the median?
    a) 20
    b) 30
    c) 25
    d) 35

    Solution:
    The middle value (when arranged in ascending order) is 20.
    Answer: a) 20


    83. Using the same dataset, what is the skewness?
    a) Positively skewed
    b) Negatively skewed
    c) Symmetric
    d) Cannot be determined

    Solution:
    The mean (30) is greater than the median (20), indicating a right-skewed (positively skewed) distribution.
    Answer: a) Positively skewed


    84. A dataset has Mean = 50, Median = 45, Mode = 40. What is the approximate skewness using Karl Pearson’s coefficient?
    a) 0.5
    b) 1.0
    c) -1.0
    d) -0.5

    Solution:

    $$\text{Skewness} = \frac{3 (\text{Mean} - \text{Median})}{\text{Standard Deviation}}$$

    Since Mean > Median > Mode, the distribution is positively skewed, which rules out options c and d.

    $$\text{Skewness} = \frac{3 (50 - 45)}{\sigma} = \frac{15}{\sigma}$$

    The question does not give the standard deviation, so the exact value cannot be computed from the data provided. The stated answer of 1.0 corresponds to σ = 15.
    Answer: b) 1.0
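
Karl Pearson's median-based coefficient from question 84 is a one-line function. The sketch below applies it to the dataset from questions 81 to 83 using the population standard deviation, and then shows how the result for question 84 depends on the standard deviation; the σ values tried there are hypothetical, since the question does not supply one.

```python
import statistics

def pearson_median_skewness(mean, median, sd):
    """Karl Pearson's coefficient: 3 * (mean - median) / standard deviation."""
    return 3 * (mean - median) / sd

# Dataset from questions 81-83.
data = [10, 15, 20, 25, 80]
coef = pearson_median_skewness(
    statistics.mean(data),
    statistics.median(data),
    statistics.pstdev(data),   # population standard deviation (an assumption)
)
print(round(coef, 3))          # positive, so right-skewed

# Question 84: Mean = 50, Median = 45, with hypothetical standard deviations.
for sd in (10, 15, 30):
    print(sd, pearson_median_skewness(50, 45, sd))
```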


    Kurtosis Numericals

    85. The kurtosis of a dataset is 5. What type of kurtosis does it have?
    a) Mesokurtic
    b) Platykurtic
    c) Leptokurtic
    d) Cannot be determined

    Solution:
    Since kurtosis > 3, the distribution has high kurtosis (leptokurtic).
    Answer: c) Leptokurtic


    86. A dataset has the following values: {5, 10, 15, 20, 25, 30, 100}. What type of kurtosis is likely?
    a) Platykurtic
    b) Mesokurtic
    c) Leptokurtic
    d) None

    Solution:
    The dataset has an extreme outlier (100), leading to higher kurtosis (leptokurtic).
    Answer: c) Leptokurtic


    Combination of Skewness and Kurtosis Numericals

    87. If the skewness of a dataset is -1.5 and kurtosis is 1.8, the distribution is:
    a) Positively skewed and leptokurtic
    b) Negatively skewed and platykurtic
    c) Normally distributed
    d) Cannot be determined

    Solution:

    • Negative skewness means it is left-skewed.

    • Kurtosis < 3 means it is platykurtic.
      Answer: b) Negatively skewed and platykurtic


    88. If a dataset has a skewness of 0.2 and kurtosis of 2.9, the shape of the distribution is:
    a) Normal
    b) Positively skewed and leptokurtic
    c) Negatively skewed and platykurtic
    d) Positively skewed and platykurtic

    Solution:

    • Skewness ≈ 0 means it is almost symmetric.

    • Kurtosis ≈ 3 means it is mesokurtic (normal distribution).
      Answer: a) Normal
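
In Pandas, both shape statistics are available as Series methods. One caveat when comparing with the thresholds used in these questions: `Series.kurt()` reports excess kurtosis (Fisher's definition, where a normal distribution scores 0, not 3), and both methods use bias-corrected sample formulas. A sketch on the dataset from question 86:

```python
import pandas as pd

s = pd.Series([5, 10, 15, 20, 25, 30, 100])   # dataset from question 86

skew = s.skew()          # sample skewness (bias-corrected)
excess_kurt = s.kurt()   # excess kurtosis: compare against 0, not 3

print("skewness:", round(skew, 3))
print("excess kurtosis:", round(excess_kurt, 3))
print("right-skewed:", skew > 0, "| heavier tails than normal:", excess_kurt > 0)
```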

    89. A box plot is also known as a:
    a) Bar chart
    b) Whisker plot
    c) Histogram
    d) Scatter plot

    Answer: b) Whisker plot


    90. The box in a box plot represents which statistical measure?
    a) Mean
    b) Range
    c) Interquartile Range (IQR)
    d) Standard Deviation

    Answer: c) Interquartile Range (IQR)


    91. In a box plot, the median is represented by:
    a) The bottom of the box
    b) The top of the box
    c) The line inside the box
    d) The end of the whiskers

    Answer: c) The line inside the box


    92. In a standard box plot, the whiskers extend to:
    a) The highest and lowest values in the dataset
    b) The mean of the dataset
    c) The interquartile range
    d) The farthest data points within 1.5 times the interquartile range (IQR) of the quartiles

    Answer: d) The farthest data points within 1.5 times the interquartile range (IQR) of the quartiles


    93. Which of the following is NOT shown in a box plot?
    a) Outliers
    b) Median
    c) Mean
    d) First quartile (Q1)

    Answer: c) Mean


    94. If a box plot is right-skewed, which of the following is true?
    a) The median is closer to Q3
    b) The median is closer to Q1
    c) The whiskers are equal in length
    d) There are no outliers

    Answer: b) The median is closer to Q1


    95. If the whiskers of a box plot are very long, this suggests that:
    a) The data has low variability
    b) The data is heavily skewed or has many outliers
    c) The dataset follows a normal distribution
    d) The dataset has only one unique value

    Answer: b) The data is heavily skewed or has many outliers


    Box Plot Numerical Questions

    96. Given the following five-number summary: {10, 20, 30, 40, 50}, what is the interquartile range (IQR)?
    a) 10
    b) 20
    c) 30
    d) 40

    Solution:

    $$\text{IQR} = Q3 - Q1 = 40 - 20 = 20$$

    Answer: b) 20


    97. If the Q1 = 25 and Q3 = 75, what are the upper and lower limits for outliers?
    a) Lower = -50, Upper = 150
    b) Lower = 0, Upper = 100
    c) Lower = 50, Upper = 75
    d) Lower = 25, Upper = 75

    Solution:

    $$\text{IQR} = Q3 - Q1 = 75 - 25 = 50$$

    $$\text{Lower Bound} = Q1 - 1.5 \times \text{IQR} = 25 - (1.5 \times 50) = 25 - 75 = -50$$

    $$\text{Upper Bound} = Q3 + 1.5 \times \text{IQR} = 75 + (1.5 \times 50) = 75 + 75 = 150$$

    Answer: a) Lower = -50, Upper = 150


    98. A dataset has a median of 35, Q1 = 20, and Q3 = 50. Which of the following statements is true?
    a) The dataset is left-skewed
    b) The dataset is right-skewed
    c) The dataset is symmetric
    d) Cannot be determined

    Solution:
    Since median (35) is exactly between Q1 (20) and Q3 (50), the data is symmetrical.
    Answer: c) The dataset is symmetric


    99. If a box plot shows that Q1 = 15, Q3 = 45, and the maximum whisker extends to 70, what is the likely presence of outliers?
    a) No outliers
    b) Outliers exist beyond 70
    c) The data is symmetric
    d) The data follows a normal distribution

    Solution:

    • IQR = 45 - 15 = 30

    • Upper limit = Q3 + 1.5 × IQR = 45 + (1.5 × 30) = 45 + 45 = 90

    • Lower limit = Q1 - 1.5 × IQR = 15 - 45 = -30

    • Since the max value 70 is within the range (-30, 90), there are no outliers.

    Answer: a) No outliers
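
The outlier limits worked out in questions 97 and 99 follow the same 1.5 × IQR rule, which can be written as a small helper. The function names and the sample values passed to `find_outliers` are chosen here for illustration.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier limits: Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(values, q1, q3):
    """Values falling outside the fences for the given quartiles."""
    lower, upper = iqr_fences(q1, q3)
    return [v for v in values if v < lower or v > upper]

print(iqr_fences(25, 75))   # question 97
print(iqr_fences(15, 45))   # question 99

# Hypothetical values checked against the question 99 quartiles: only 95 exceeds 90.
print(find_outliers([10, 30, 70, 95], q1=15, q3=45))
```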


    100. A dataset has the five-number summary: {12, 18, 25, 30, 60}. Which statement is true?
    a) The data is right-skewed
    b) The data is left-skewed
    c) The data is symmetric
    d) Cannot be determined

    Solution:

    • Q1 = 18, Q3 = 30, Median = 25

    • The upper whisker (60) is much farther from Q3 (30) than the lower whisker (12) is from Q1 (18).

    • This suggests positive (right) skewness.

    Answer: a) The data is right-skewed
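
The whisker comparison used in question 100 can be expressed as a rough rule of thumb on a five-number summary. This sketch assumes the whiskers run all the way to the minimum and maximum (no separately plotted outliers) and treats exactly equal whiskers as symmetric, which is a simplification.

```python
def skew_hint(five_number_summary):
    """Compare whisker lengths of (min, Q1, median, Q3, max) to guess the skew direction."""
    minimum, q1, median, q3, maximum = five_number_summary
    lower_whisker = q1 - minimum
    upper_whisker = maximum - q3
    if upper_whisker > lower_whisker:
        return "right-skewed"
    if upper_whisker < lower_whisker:
        return "left-skewed"
    return "roughly symmetric"

print(skew_hint((12, 18, 25, 30, 60)))   # question 100
print(skew_hint((10, 20, 30, 40, 50)))   # question 96
```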

    Pivot Table MCQs

    101. A Pivot Table is used for:
    a) Data visualization
    b) Data summarization
    c) Data cleaning
    d) Data encryption

    Answer: b) Data summarization


    102. In a Pivot Table, which of the following fields is used to categorize data?
    a) Values
    b) Filters
    c) Rows and Columns
    d) All of the above

    Answer: d) All of the above


    103. What function is commonly used in a Pivot Table to summarize numerical data?
    a) SUM
    b) COUNT
    c) AVERAGE
    d) All of the above

    Answer: d) All of the above


    104. A Pivot Table can be created in:
    a) Microsoft Excel
    b) Google Sheets
    c) Python (Pandas)
    d) All of the above

    Answer: d) All of the above
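
For the Pandas case in question 104, a pivot table is built with `pd.pivot_table`. The sales table below is hypothetical; `index` and `columns` play the role of the Rows and Columns fields, `values` is the field being summarized, and `aggfunc` selects the summary function (sum, mean, count and so on).

```python
import pandas as pd

# Hypothetical sales records.
sales = pd.DataFrame({
    "region": ["North", "North", "South", "South", "South"],
    "product": ["A", "B", "A", "A", "B"],
    "amount": [100, 150, 200, 50, 300],
})

# Total amount per region (rows) and product (columns).
totals = pd.pivot_table(sales, index="region", columns="product",
                        values="amount", aggfunc="sum")

# Same layout, summarized with the average instead.
averages = pd.pivot_table(sales, index="region", columns="product",
                          values="amount", aggfunc="mean")

print(totals)
print(averages)
```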


    Correlation Statistics MCQs

    105. The correlation coefficient (r) measures the relationship between:
    a) Two categorical variables
    b) Two numerical variables
    c) One numerical and one categorical variable
    d) None of the above

    Answer: b) Two numerical variables


    106. If the correlation coefficient (r) is -1, the relationship between variables is:
    a) Perfectly positive
    b) Perfectly negative
    c) No correlation
    d) Weak correlation

    Answer: b) Perfectly negative


    107. If two variables have no correlation, the correlation coefficient (r) is:
    a) -1
    b) 0
    c) 1
    d) Undefined

    Answer: b) 0


    108. Which correlation coefficient represents the strongest linear relationship?
    a) r = -0.8
    b) r = 0.5
    c) r = -0.3
    d) r = 0.1

    Answer: a) r = -0.8


    109. If correlation is positive, it means:
    a) As one variable increases, the other decreases
    b) As one variable increases, the other also increases
    c) The variables are not related
    d) The data has outliers

    Answer: b) As one variable increases, the other also increases


    110. Which method is commonly used to calculate correlation?
    a) Pearson’s correlation
    b) Spearman’s rank correlation
    c) Kendall’s tau correlation
    d) All of the above

    Answer: d) All of the above


    Correlation Numericals

    111. If the covariance between X and Y is 15 and the standard deviations of X and Y are 3 and 5, what is the Pearson correlation coefficient?
    a) 1.5
    b) 1.0
    c) 0.5
    d) 2.5

    Solution:

    $$r = \frac{\text{Cov}(X, Y)}{\sigma_X \cdot \sigma_Y} = \frac{15}{3 \times 5} = \frac{15}{15} = 1.0$$

    Answer: b) 1.0


    112. If two datasets have a correlation of -0.85, what does it indicate?
    a) Strong positive correlation
    b) Strong negative correlation
    c) No correlation
    d) Weak correlation

    Answer: b) Strong negative correlation
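
The formula from question 111 can be applied either to a given covariance and standard deviations or to raw data. A minimal sketch using population formulas; the x and y lists are invented to produce the two extreme cases.

```python
import math

def pearson_from_cov(cov_xy, sd_x, sd_y):
    """r = Cov(X, Y) / (sd_X * sd_Y)."""
    return cov_xy / (sd_x * sd_y)

def pearson_r(xs, ys):
    """Pearson correlation computed from raw paired data (population formulas)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

print(pearson_from_cov(15, 3, 5))               # question 111
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))    # y rises with x
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))    # y falls as x rises
```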


    ANOVA (Analysis of Variance) MCQs

    113. The purpose of ANOVA is to compare:
    a) Two means
    b) Three or more means
    c) Standard deviations
    d) Medians

    Answer: b) Three or more means


    114. Which of the following is an assumption of ANOVA?
    a) Normality of data
    b) Homogeneity of variance
    c) Independence of observations
    d) All of the above

    Answer: d) All of the above


    115. What does a low p-value (< 0.05) in an ANOVA test indicate?
    a) The groups have similar means
    b) At least one group mean is significantly different
    c) The test is invalid
    d) The data is not normal

    Answer: b) At least one group mean is significantly different


    116. Which statistic is used in ANOVA to determine significance?
    a) t-statistic
    b) F-statistic
    c) Chi-square
    d) z-score

    Answer: b) F-statistic


    ANOVA Numericals

    117. Given the following sample means:

    • Group A: 15

    • Group B: 20

    • Group C: 30

    The grand mean (overall mean), assuming equal group sizes, is:
    a) 20
    b) 21.67
    c) 25
    d) 30

    Solution:

    $$\text{Grand Mean} = \frac{15 + 20 + 30}{3} = \frac{65}{3} \approx 21.67$$

    Answer: b) 21.67


    118. If the between-group variance = 50 and within-group variance = 10, what is the F-ratio?
    a) 2.5
    b) 5.0
    c) 10.0
    d) 50.0

    Solution:

    $$F = \frac{\text{Between-Group Variance}}{\text{Within-Group Variance}} = \frac{50}{10} = 5.0$$

    Answer: b) 5.0


    119. A one-way ANOVA is used when comparing:
    a) One group’s variance
    b) Two independent groups
    c) Three or more independent groups
    d) Paired samples

    Answer: c) Three or more independent groups


    120. In an ANOVA test, a large F-statistic means:
    a) The variances within groups are large
    b) The means of the groups are significantly different
    c) There is no significant difference
    d) The test is not valid

    Answer: b) The means of the groups are significantly different
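
The F-ratio in question 118 can be traced back to raw data with a hand-rolled one-way ANOVA. The three groups below are invented so that their means are 15, 20 and 30, matching question 117; this sketch computes only the F-statistic, not the p-value.

```python
def one_way_anova_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    k = len(groups)              # number of groups
    n = len(all_values)          # total number of observations

    # Between-group sum of squares: group size times squared distance of its mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of each value around its own group mean.
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical groups with means 15, 20 and 30.
groups = [[14, 15, 16], [19, 20, 21], [29, 30, 31]]
f_stat = one_way_anova_f(groups)
print(round(f_stat, 2))

# Question 118 directly: between-group variance 50, within-group variance 10.
print(50 / 10)
```

A large F here reflects group means that sit far apart relative to the small spread inside each group, which is the situation question 120 describes.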
