## Notes

- Amazon Comprehend can detect and redact PII
- Synthetic Minority Oversampling Technique (SMOTE) is an oversampling approach where the minority class is oversampled by creating *synthetic data points* instead of *oversampling with replacement*
  - Creates synthetic fraudulent cases
  - Part of SageMaker Data Wrangler
- SageMaker *fast file* input mode
  - Has *file mode*, *pipe mode*, and *fast file mode*
  - *file mode* downloads the full dataset to the Docker container before training starts
  - *pipe mode* streams data from S3 to the training algorithm (better performance)
  - *fast file mode* streams data from S3 on demand while exposing it as files, so training can start before the whole dataset is loaded
- bias drift with `ModelBiasMonitor`
- data drift with `DefaultModelMonitor`
- feature attribution drift with `ModelExplainabilityMonitor`
- model quality drift with `ModelQualityMonitor`
- Explainability analysis
  - [Difference in proportion of labels (DPL)](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-data-bias-metric-true-label-imbalance.html) (a pre-training bias metric rather than an explainability method)
  - [Partial dependence plots (PDP)](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-processing-job-analysis-results.html#clarify-processing-job-analysis-results-pdp)
  - [Shapley values](http://docs.aws.amazon.com/sagemaker/latest/dg/clarify-shapley-values.html)
- SageMaker endpoints
  - asynchronous endpoint
    - Most cost effective
    - Processes requests for up to 60 minutes
  - real-time endpoint
    - Only processes requests for up to 60 seconds
  - batch endpoint (batch transform)
    - Maximum payload size of 100 MB per request
  - serverless endpoint
    - Cannot configure a VPC with this endpoint
    - 60 second time limit
- [[AWS Glue]]
- TensorBoard visualizes and analyzes intermediate tensors
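
The SMOTE idea from the notes above (interpolating new minority points instead of duplicating existing ones) can be sketched in plain Python. This is a minimal illustration, not how Data Wrangler implements it: the function name `smote`, its parameters, and the toy fraud data are all hypothetical.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Create n_synthetic points by interpolating between minority-class neighbors.

    Hypothetical helper for illustration only; `minority` is a list of
    equal-length numeric tuples.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbors (excluding itself).
        others = [p for j, p in enumerate(minority) if j != i]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        # Place the new point at a random position on the segment
        # between the base point and the chosen neighbor.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic


# Assumed toy data: (amount, hour_of_day) for a few fraudulent transactions.
fraud = [(120.0, 2.0), (135.0, 3.0), (110.0, 1.0), (150.0, 4.0)]
new_points = smote(fraud, n_synthetic=6, k=2)
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies, which is the difference from oversampling with replacement (exact copies).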