# MSc Applied Statistics

STA 3100 Probability Theory

Set functions and measures. Probability, product and Lebesgue-Stieltjes measures and spaces. Radon-Nikodym theorem. Random variables and measurable functions including moments and norms. Conditional distributions and moments including the law of iterated expectations. Inversion theorem. Modes of convergence: almost sure, in probability and in distribution. Convergence of sequences and sums of independent random variables. Kolmogorov inequality. Borel-Cantelli lemmas. Laws of large numbers. Central limit theorem.

STA 3101 Multivariate Analysis

Distribution of mean vector. Wishart distribution. Hotelling's T-squared statistic, including power function and optimum properties. Inferences relating to mean vector and covariance matrix. Partial and multiple correlation coefficients and their distributions. Roy's union-intersection principle and its role in multivariate analysis. Test of independence for sets of variates. Multivariate analysis of variance (MANOVA). Discriminant and principal component analysis. Canonical variables and correlations. Cluster and factor analysis.

STA 3102 Parametric Regression Analysis

Classical least squares theory: methods of OLS (ordinary least squares), statistical properties of OLS estimators and hypothesis testing. GLS (generalized least squares) theory: method, heteroscedasticity, serial correlation, and methods for panel data. Asymptotic least squares (when the regressors are stochastic): consistency and asymptotic normality of OLS estimates. Consistent estimation of covariance matrix. NLS (nonlinear least squares) theory: nonlinear specifications, the method, and asymptotic properties of NLS estimators. QML (quasi-maximum likelihood) theory: Kullback-Leibler information criterion, asymptotic properties of QML estimators, information matrix equality, and applications of QML estimation to models such as probit and logit, multinomial, and conditional logit.

STA 3103 Statistical Inference Theory

Estimation: maximum likelihood methods; Cramér-Rao inequality and its generalization; Bhattacharyya bounds; Rao-Blackwell theorem; symmetric functions and UMVE (uniformly minimum variance estimators); linear, minimum risk, Bayes, minimax, and interval estimation; and invariant estimators. Hypothesis testing: randomized and non-randomized tests, unbiased tests and confidence intervals, consistency and efficiency of tests, similar region tests, completeness and similarity, likelihood ratio tests, Bartlett's test for homogeneity of variances, and the principle of invariance.

STA 3104 Design and Analysis of Experiments

One- and two-way ANOVA (analysis of variance), with and without interaction. Kronecker product, fixed and random effects models, covariance matrix, Rao-Zyskind theorem, Herbach lemma, and Tukey's tests. Randomised blocks. Latin and Graeco-Latin square designs. Balanced incomplete blocks, Atiqullah's theorem, factorial designs, confounding, and factorial replications. Random and mixed models, and split-plot designs. Nested and hierarchical designs. Use of statistical packages such as SAS, Minitab, and Genstat.

STA 3105 Design and Analysis of Sample Surveys

Sampling techniques: simple random, stratified random, cluster and multistage. Model-based inference: properties of ratio, regression, Horvitz-Thompson and combined ratio estimators. Variance estimation techniques: linearization, and resampling techniques such as jackknife, BRR (balanced repeated replication), and bootstrap. Bias-robust methods: nonparametric regression for finite population total and estimator of its variance, and potential unexploited extensions such as neural network and spline regression, post-stratification, two-phase sampling and repeated surveys. Non-response types: item, non-coverage and unit non-responses. Estimator of population mean: under fixed population model and its properties. Imputation: random, deductive, mean value, hot deck and NN (nearest neighbour). Small area estimation. Model-assisted surveys.

STA 3106 Time Series Analysis

Stationary stochastic processes. Moments and autocorrelations. AR (autoregressive) and MA (moving average) models and their respective conditions of stationarity and invertibility. Mixed models and AR representation of MA and ARMA. Yule-Walker equations and partial autocorrelations, with examples. Integrated models such as ARIMA (autoregressive integrated moving average) and SARIMA (seasonal autoregressive integrated moving average). Model identification: ideas based on SACF (sample autocorrelation function) and PACF (partial autocorrelation function) including difficulties with real data. Estimation of parameters based on maximum likelihood, least squares and QML (quasi-maximum likelihood) methods. Forecasting by exponential smoothing and Box-Jenkins methods. Fourier representation of non-periodic and periodic sequences and functions. The spectral representation of autoregressive and stationary processes. The periodogram and its sampling properties.

STA 3107 Nonparametric Regression Analysis

General nonparametric regression models, in both fixed and stochastic design. Estimation of regression functions using common smoothing techniques: kernel, NN (nearest neighbour), spline and local polynomial. Properties of the estimators. Choice of smoothing parameters: measures of estimation quality, rates of convergence, and bandwidth selection by cross-validation. Asymptotic distribution of the estimates, boundary kernels and bootstrap method. Orthogonal series expansion and wavelets: Fourier and other orthogonal series, density and regression estimates, and windowed Fourier transform. Neural networks: from perceptron to non-linear neuron, neural network estimates and their properties, and network specification.

STA 3108 Epidemiological Research Methods

Measurements, study types, credibility of a study, sampling variability, and statistical versus clinical significance. Tables: 2-by-2, general contingency, one-way ANOVA (analysis of variance) by regression and its extension to multicategorical determinants. Mantel-Haenszel methods: confounding in 2-by-2 tables, combining odds ratios, relative risk, and multiple risk factors. Logistic regression: modelling confounding, stratified data and multiple risk factors. Case-by-case data, Poisson regression, and modelling risks and outcome severity. Matching: matched pairs and logistic modelling, before-after studies, and pros and cons of matching. The problem of sample size determination.

STA 3109 Modelling Extremal Events

Maxima, block maxima, domains of attraction, GEV (generalized extreme value) and GP (generalized Pareto) distributions, quantiles and mean excess function. Estimation under GEV distribution and maximum domain of attraction. Fitting excesses over threshold.

Heavy-tailed processes: heavy-tailed time series, estimation of autocovariance and power transfer functions, parameter estimation of the ARMA (autoregressive moving average) process, extremal index, conditional quantiles, and value-at-risk.

STA 3110 Demography and Vital Statistics

Life tables: their construction and properties. Makeham and Gompertz curves. Rates of mortality, stable and stationary populations, stochastic models, population growth, family size, and crude and specific fertility, gross and net reproduction, crude mortality, and standardized rates. Demographic transition, social and economic determinants of population, and population projection. Mathematical and component methods. Logistic curve fitting. Methods of simulation.

STA 3111 Stochastic Processes

Recurrent events, random walk and Markov chains. Simple time-dependent processes such as Poisson, pure birth, pure death, and birth-and-death processes. Nonhomogeneous birth and death processes. Immigration, emigration, queueing, logistic and branching processes. Stationary Markov and diffusion processes. Renewal theory. Martingales. Queueing theory. Point processes.

STA 3112 Financial Time Series and Risk Management

Features of discrete financial time series such as stock prices and foreign exchange rates. AR (autoregressive), AR-ARCH (autoregressive-autoregressive conditional heteroscedastic), GARCH (generalized autoregressive conditional heteroscedastic), and other related processes. GARCH-in-Mean and GARCH-in-Variance. ARMA (autoregressive moving average), ARMA-GARCH, VaR (Value-at-Risk), coherent risk measures, expected shortfall, and backtesting ideas.

STA 3113 Decision Theory

Statistical games and decision functions. Randomised, non-randomised, minimax, Bayes and non-randomised minimum decision rules. Admissible and minimax estimators and tests. Minimal complete class of tests. Principles of sufficiency and invariance. Hunt-Stein theorem. Minimax invariant decision rules. Sequential probability ratio test.

STA 3114 Survival and Clinical Data Analysis

Survivor, hazard and cumulative hazard functions. Censoring: Kaplan-Meier survival curve and parametric models. Comparison of two groups including log-rank test. Inclusion of covariates. Proportional hazard model including application of model checking, computing risks and extensions. Clinical trials: organization and planning, protocol, patient selection, response justification and randomization methods. Uncontrolled and blind trials, placebos and ethical issues. Size of a clinical trial. Maintaining trial progress. Forms and data management: protocol deviations, methods of data analysis, binary responses, cross-over trials, survival data and prognostic factors.

STA 3115 Statistical Consulting (2 units)

Candidates are given the opportunity to serve as consultancy interns under the close supervision of an academic staff member involved in consultancy activities. Instruction and experience in consultant interaction, communication skills, statistical practice, statistical computation and technical report writing will form the main focus. The student shall produce a typewritten report.

STA 3116 Non-Parametric Methods

Order statistics and non-parametric estimation. Distribution-free tests for goodness of fit and independence. Tests for comparison of two populations. Fisher-Pitman randomisation theory, k-sample tests, method of paired comparisons, power and asymptotic relative efficiency. Robust procedures. Optimum non-parametric tests. Non-parametric methods in multivariate analysis.

STA 3117 Basics of Statistical Inference

Estimation: point estimates and confidence intervals for population parameters such as mean, variance and proportions. Hypothesis testing: inference about measures of association such as chi-square, contingency coefficient, gamma, phi, Cramér's V and uncertainty coefficient. Inference about means: one- and two-sample t-tests. ANOVA (analysis of variance) models: one- and two-way ANOVA and MANOVA (multivariate analysis of variance). Measures of relationship such as bivariate, partial and distance correlation, and simple and multiple linear regression. Non-parametric tests: binomial, two-way contingency tables, Mann-Whitney U, Kruskal-Wallis, median, Friedman and Cochran tests.

It will be assumed that the candidate is equipped with the relevant theory and is computer literate, since most of the computation will be done using statistical software such as S-PLUS, SAS, SPSS or Genstat.

STA 3200 Statistics Project (2 units)

The project is taken in the second year of study. It provides the candidate with an opportunity to focus on an application of statistical methodology, to develop research skills and to consolidate knowledge accumulated from other units. The candidate shall undertake the research in a branch of statistics under appropriate supervision, in order to develop advanced research skills and techniques, and to present the findings in a documented scholarly form. The project should make an independent contribution to learning, or offer a critical perspective on existing methodology.