



















































ISYE 6501 MIDTERM ACTUAL EXAM NEWEST VERSION - 2025/2026- WITH 100+ QUESTIONS AND VERIFIED ANSWERS (100% SUCCESS)
Typology: Exams
What is a support vector? A point that "holds up" a shape; in an SVM, the data points closest to the classifier, which determine where the margin lines sit.
Does ...[2/3(a-1) + 1/3(a+1)] move an SVM classifier up or down? Up
How do you make errors more costly in a soft SVM classifier? Include a multiplier for the point-error term.
When we increase the sum of the squares of the coefficients we... decrease the distance between the margin lines.
In an SVM soft classifier we trade off between maximizing ___ and minimizing ___. Margin and errors.
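The margin and soft-classifier cards above can be sketched in a few lines. This is only an illustration under stated assumptions: the classifier is a_1*x_1 + ... + a_m*x_m + a_0 = 0 with margin lines at +1 and -1, labels are +1 or -1, and the error for a point is the hinge-style term max(0, 1 - y * score). The function names and the parameter `lam` are hypothetical, not taken from the exam.

```python
import math


def margin_width(coeffs):
    """Distance between the two margin lines: 2 / sqrt(sum of squared coefficients)."""
    return 2.0 / math.sqrt(sum(a * a for a in coeffs))


def soft_margin_objective(coeffs, intercept, points, labels, lam):
    """Total point error plus lam times the sum of squared coefficients.

    Assumed form: sum_j max(0, 1 - y_j * (a . x_j + a_0)) + lam * sum_i a_i^2.
    A small lam emphasizes the error term; a large lam emphasizes a wide margin.
    """
    total_error = 0.0
    for x, y in zip(points, labels):
        score = sum(a * xi for a, xi in zip(coeffs, x)) + intercept
        total_error += max(0.0, 1.0 - y * score)
    return total_error + lam * sum(a * a for a in coeffs)


# Doubling the coefficients halves the distance between the margin lines.
narrow = margin_width([6.0, 8.0])
wide = margin_width([3.0, 4.0])
```

Making one kind of error more costly would correspond to multiplying that point's error term by a weight greater than 1 before summing.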
If lambda gets small, which gets emphasized: a large margin or minimizing training error? Minimizing errors.
If an SVM coefficient is very close to zero... that term is not very important to the classification.
What is the difference between standardization and scaling? Scaling maps values into a bounded range (for example, 0 to 1). Standardization rescales values to mean 0 and standard deviation 1: (value - factor mean) / (factor standard deviation).
What is the 2-norm? Euclidean distance
What is the 1-norm? The rectilinear (Manhattan) distance
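To make the scaling versus standardization card concrete, here is a small sketch. It assumes scaling means a linear map onto [0, 1] and that the population standard deviation is used for standardization; the card does not say which standard deviation is meant. The sample data and function names are made up for illustration.

```python
from statistics import mean, pstdev


def min_max_scale(values, lo=0.0, hi=1.0):
    """Scale values linearly into the bounded range [lo, hi]."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        raise ValueError("all values are identical; cannot scale")
    return [lo + (v - v_min) * (hi - lo) / span for v in values]


def standardize(values):
    """Return (value - factor mean) / (factor standard deviation) for each value."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("zero standard deviation; cannot standardize")
    return [(v - mu) / sigma for v in values]


heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]  # hypothetical data
scaled = min_max_scale(heights_cm)
standardized = standardize(heights_cm)
```

Scaled values are guaranteed to stay inside the chosen range, while standardized values are unbounded but centered on zero.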
The whiskers on a box plot extend to... the 10th and 90th percentiles (or the 5th and 95th).
Why are hypothesis tests generally not sufficient for change detection? They are slow to detect changes.
In CUSUM, T is _____ and C is _____. T is the threshold and C is a "bring-down factor".
In a CUSUM model, you adjust T and C to manage the tradeoff between... early detection and false alarms.
In exponential smoothing, if the data is less random, then you want to pick an alpha that is... close to 1.
What is the initial condition for T in exponential smoothing with trending? T_i=
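The roles of T and C in the CUSUM cards above can be seen in a short sketch. It assumes the common form for detecting an increase, S_t = max(0, S_{t-1} + (x_t - mu - C)), with a change flagged once S_t >= T. The function name, the sample data, and the parameter values are hypothetical.

```python
def cusum_increase(observations, mu, c, t):
    """Run CUSUM for an upward shift in the mean.

    mu: expected mean when no change has happened
    c:  the "bring-down factor" subtracted at every step
    t:  the threshold; a change is flagged when the running sum reaches it
    Returns the running sums and the index of the first alarm (or None).
    """
    s = 0.0
    history = []
    first_alarm = None
    for i, x in enumerate(observations):
        s = max(0.0, s + (x - mu - c))
        history.append(s)
        if first_alarm is None and s >= t:
            first_alarm = i
    return history, first_alarm


# The mean shifts from 10 to 12 at index 3 (hypothetical data).
data = [10, 10, 10, 12, 12, 12, 12]
sums, alarm = cusum_increase(data, mu=10, c=0.5, t=4)
```

Lowering T or C detects the shift sooner but makes false alarms on random noise more likely, which is the tradeoff the card describes.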
In cyclic exponential smoothing, L represents... the length of the cycle or season.
In cyclic exponential smoothing, C_1 ... C_L = ___?
What measure is used to determine the quality of fit of a linear regression line to data? The sum of the squared differences between the line and the data points (sum of squared errors).
What is the formula for point error in linear regression? y(i) - yhat(i) = y(i) - (a(0) + a(1)*x(i))
Taking partial derivatives, setting them equal to zero, and solving the resulting system of equations helps us do what? Minimize the error and find the optimal coefficients for linear regression.
What does AIC stand for? Akaike Information Criterion
What is "likelihood"? A measure of the probability (density) of the observed data for any given parameter set.
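For a single predictor, setting the partial derivatives of the sum of squared errors to zero gives a closed-form solution. The sketch below assumes simple linear regression y = a0 + a1*x; the function names and data are illustrative only.

```python
def fit_line(xs, ys):
    """Least-squares coefficients (a0, a1) for y = a0 + a1 * x.

    These are the solutions of dSSE/da0 = 0 and dSSE/da1 = 0.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    a1 = s_xy / s_xx
    a0 = y_bar - a1 * x_bar
    return a0, a1


def sum_squared_error(xs, ys, a0, a1):
    """Sum over points of (y_i - (a0 + a1 * x_i)) squared."""
    return sum((y - (a0 + a1 * x)) ** 2 for x, y in zip(xs, ys))


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # hypothetical data lying exactly on y = 1 + 2x
a0, a1 = fit_line(xs, ys)
```

Any other choice of coefficients gives a larger sum of squared errors on the same data.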
What is maximum likelihood? The parameters that give the highest probability (likelihood) of the observed data.
What is MLE? Maximum likelihood estimate. In linear regression with normally distributed errors, it is the set of parameters that minimizes the sum of squared errors.
What is the formula for AIC? AIC = 2k - 2ln(L*), where L* is the maximum likelihood value and k is the number of parameters estimated.
What is the penalty term in AIC and what does it do? 2k. It helps prevent overfitting.
A model that is fit to random effects and not real ones is said to be? Overfit
What does corrected AIC account for? The fact that we cannot have infinitely many data points.
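The AIC cards translate directly into code. The sketch below takes the maximized log-likelihood ln(L*) as an input. The corrected-AIC formula shown, AIC + 2k(k+1)/(n-k-1), is the commonly used small-sample correction and is an assumption here, since the cards do not state it. All numbers are hypothetical.

```python
def aic(max_log_likelihood, k):
    """AIC = 2k - 2 ln(L*); lower is better."""
    return 2 * k - 2 * max_log_likelihood


def aic_corrected(max_log_likelihood, k, n):
    """AIC with the usual small-sample correction (assumed form)."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 data points")
    return aic(max_log_likelihood, k) + 2 * k * (k + 1) / (n - k - 1)


# Two hypothetical models fit to the same 10 data points.
simple_model = aic_corrected(max_log_likelihood=-10.0, k=3, n=10)
complex_model = aic_corrected(max_log_likelihood=-9.5, k=6, n=10)
```

The more complex model fits slightly better but is penalized for its extra parameters, so its corrected AIC is higher. As n grows, the correction term shrinks toward zero.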
In exponential smoothing, how should you adjust for a lot of randomness? Make alpha close to 0.
What does SVM stand for? Support Vector Machine
Is written text structured or unstructured? Unstructured
In exponential smoothing with trend, what does beta do? Adjusts for trending.
In exponential smoothing, what is C_t? A multiplicative seasonality factor at time t.
In exponential smoothing, what is L? The length of a cycle.
In exponential smoothing, what does gamma do? Adjusts how much cycles contribute to the model.
In multiplicative seasonality, the first L values of C are set to what?
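The effect of alpha can be checked with basic exponential smoothing, without trend or seasonality. This sketch assumes the update S_t = alpha * x_t + (1 - alpha) * S_{t-1} with the starting value S_1 = x_1; the function name and data are illustrative.

```python
def exponential_smoothing(series, alpha):
    """Basic exponential smoothing: S_t = alpha * x_t + (1 - alpha) * S_{t-1}.

    Assumes the initial condition S_1 = x_1.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed


noisy = [10.0, 20.0, 10.0, 20.0]  # hypothetical observations
balanced = exponential_smoothing(noisy, 0.5)
trust_data = exponential_smoothing(noisy, 1.0)     # alpha = 1: follow the data
trust_history = exponential_smoothing(noisy, 0.0)  # alpha = 0: ignore new data
```

With alpha near 1 the estimate tracks each observation, which suits data with little randomness; with alpha near 0 it leans on the previous estimate, which damps random swings.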
1-norm: Similar to rectilinear distance; measures the sum of the absolute lengths along each dimension.
2-norm: Similar to Euclidean distance; measures the straight-line length of a vector from the origin.
Additive seasonality: Seasonal effect that is added to a baseline value.
Adjusted R-squared/Adjusted R²: Variant of R² that encourages simpler models by penalizing the use of too many variables.
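Both norms are special cases of the p-norm, (sum of |v_i|^p)^(1/p). A minimal sketch, with an illustrative function name and vector:

```python
def p_norm(vector, p):
    """Return the p-norm of a vector: (sum |v_i|^p)^(1/p), for p >= 1."""
    if p < 1:
        raise ValueError("p must be at least 1")
    return sum(abs(v) ** p for v in vector) ** (1.0 / p)


v = [3.0, -4.0]
one_norm = p_norm(v, 1)  # rectilinear length: |3| + |-4|
two_norm = p_norm(v, 2)  # straight-line length: sqrt(3^2 + 4^2)
```

Applying the same function to the difference of two points gives the corresponding distance between them.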
Attribute: A characteristic or measurement - for example, a person's height or the color of a car. Aka "feature", "covariate" or "predictor".
Autoregression: Regression technique using past values of time series data as predictors of future values.
Autoregressive integrated moving average (ARIMA): Time series model that uses differences between observations when data is nonstationary. Also called Box-Jenkins.
Bayes' theorem/Bayes' rule: Fundamental rule of conditional probability: P(A|B) = P(B|A) P(A) / P(B).
Bayesian information criterion (BIC): Model selection technique that trades off model fit and model complexity. Generally penalizes complexity more than AIC. Lower is better.
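A short worked example of Bayes' rule, P(A|B) = P(B|A) P(A) / P(B), may help. The scenario and every probability below are made up for illustration: A is "has the condition" and B is "test is positive".

```python
def bayes_posterior(p_b_given_a, p_a, p_b):
    """Bayes' rule: P(A|B) = P(B|A) * P(A) / P(B)."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_b_given_a * p_a / p_b


# Hypothetical numbers: 1% prevalence, 90% true-positive rate,
# 5% false-positive rate.
p_a = 0.01
p_b_given_a = 0.90
p_b_given_not_a = 0.05

# Total probability of a positive test, over both cases of A.
p_b = p_b_given_a * p_a + p_b_given_not_a * (1.0 - p_a)

p_a_given_b = bayes_posterior(p_b_given_a, p_a, p_b)
```

Under these assumed numbers a positive test raises the probability of A from 1% to roughly 15%, because false positives from the much larger unaffected group dominate.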
Bayesian regression: Regression model that incorporates estimates of how coefficients and error are distributed.
BIC: Bayesian information criterion
Binary data: Data that can take only two different values (true/false, 0/1, black/white, on/off, etc.).
Binary variable: Variable that can take just two values: 0 and 1.
Box and whisker plot: Graphical representation of data showing the middle range of data (the "box"), reasonable ranges of variability ("whiskers"), and points (possible outliers) outside those ranges.
Box-Cox transformation: Transformation of a non-normally-distributed response to one that is closer to a normal distribution.
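The glossary does not give the Box-Cox formula. As a sketch, the commonly used one-parameter form is (y^lam - 1) / lam for lam != 0 and ln(y) for lam = 0, defined for positive y. The function name and data below are illustrative, and in practice lam is chosen to make the transformed data look as normal as possible.

```python
import math


def box_cox(values, lam):
    """One-parameter Box-Cox transform (assumed standard form); values must be positive."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


skewed = [1.0, 4.0, 9.0, 16.0]  # hypothetical right-skewed response
sqrt_like = box_cox(skewed, 0.5)
logged = box_cox(skewed, 0.0)
```

With lam = 1 the data are only shifted by 1, so their shape is unchanged; smaller values of lam pull in a long right tail.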
Classification tree: Tree-based method for classification. After branching to split the data, each subset is analyzed with its own classification model.
Classifier: A boundary that separates the data into two or more categories. Also (more generally) an algorithm that performs classification.
Cluster: A group of points identified as near/similar to each other.
Cluster center: In some clustering algorithms (like k-means clustering), the central point (often the centroid) of a cluster of data points.
Clustering: Separation of data points into groups ("clusters").
Collective outlier: A set of data points that is (uncommonly) different from others - for example, a missing heartbeat in an electrocardiogram.
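The cluster and cluster-center entries can be illustrated with one iteration of k-means: assign each point to its nearest center, then move each center to the centroid of its points. This is a minimal sketch that assumes Euclidean distance and leaves an empty cluster's center unchanged; the function name and data are hypothetical.

```python
import math


def kmeans_step(points, centers):
    """One k-means iteration: assign points to nearest center, then recompute centroids."""
    assignments = [
        min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
        for p in points
    ]
    new_centers = []
    for j, old_center in enumerate(centers):
        members = [p for p, a in zip(points, assignments) if a == j]
        if not members:
            new_centers.append(old_center)  # assumption: keep an empty cluster's center
            continue
        centroid = tuple(sum(coord) / len(members) for coord in zip(*members))
        new_centers.append(centroid)
    return assignments, new_centers


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(1.0, 1.0), (9.0, 1.0)]
assignments, centers_after = kmeans_step(points, centers)
```

Repeating the step until the assignments stop changing gives the usual k-means procedure.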
Concave function: A function whose values between any two points are always above (or equal to) the straight line connecting those two points.
Concordance index: Area under the ROC curve; an estimate of the classification model's accuracy. Also called AUC.
Confusion matrix: Visualization of classification model performance, showing counts of correct and incorrect classifications for each category.
Constant: A number that remains the same.
Constraint: Part of an optimization model that describes a restriction on the solution (the values of the variables).
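A confusion matrix and the concordance index can both be computed from labels and model scores. The sketch below assumes binary labels (1 and 0) and uses the pairwise interpretation of AUC: the fraction of (positive, negative) pairs in which the positive point gets the higher score, counting ties as half. The names, threshold, and data are illustrative.

```python
def confusion_counts(labels, scores, threshold=0.5):
    """Counts of true/false positives/negatives at a given score threshold."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1:
            counts["TP" if y == 1 else "FP"] += 1
        else:
            counts["FN" if y == 1 else "TN"] += 1
    return counts


def concordance_index(labels, scores):
    """Fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]  # hypothetical model outputs
matrix = confusion_counts(labels, scores)
auc = concordance_index(labels, scores)
```

The confusion matrix depends on the chosen threshold, while the concordance index summarizes ranking quality across all thresholds.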
Cross-validation: Validation technique where a model is tested on data different from what it was trained on.
CUSUM: Change detection method that compares observed distribution mean with a threshold level of change. Short for "cumulative sum".
Data point: Observation/record of (perhaps multiple) measurements for a single member of a population or data set. In the standard tabular format, a row of data.
Decision: Choice of action.
Decision tree: Tree-based method for decision-making. After branching to split the data, each subset is analyzed with its own decision model (or just has its own decision applied).
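One common way to test a model on data it was not trained on is k-fold cross-validation, where each fold takes a turn as the test set. The sketch below only produces the index splits and assumes the rows are already in random order; the function name is hypothetical.

```python
def k_fold_indices(n, k):
    """Split row indices 0..n-1 into k (train, test) pairs with disjoint test folds."""
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and n")
    # Spread any remainder over the first n % k folds.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds


folds = k_fold_indices(6, 3)
```

A model would be fit on each `train` list and scored on the matching `test` list, and the k scores averaged.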
Descriptive analytics: Loosely speaking, the use of analytics to explain or describe what has happened.
Detrending: Removal of trend, such as a change in the mean over time, from time series data.
Differencing: Using the difference of successive values in time series data, rather than the values themselves. Sometimes nonstationary data will have stationary differences.
Dimension: A feature of the data points (for example, height or credit score).
Distance: How far it is between two points; there are different ways to measure it (see Minkowski distance).
Distribution-fitting: Determining whether a set of data seems to follow a certain probability distribution, or determining which of several distributions the data is close to.
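Differencing is easy to show on a small series. In this sketch the function name and the data are illustrative; a series with a steady upward trend has constant first differences.

```python
def difference(series, lag=1):
    """Return x_t - x_(t-lag) for each t where both values exist."""
    if lag < 1:
        raise ValueError("lag must be at least 1")
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


trending = [5, 7, 9, 11]       # mean changes over time (nonstationary)
first_diff = difference(trending)

curved = [1, 3, 6, 10]         # differences themselves still trend
second_diff = difference(difference(curved))
```

The differenced series is shorter than the original by `lag` values each time the operation is applied.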