## Notes
- [[#Domain 1 28% Data Preparation for Machine Learning]]
- Domain 2: 26% ML Model Development
- Domain 3: 22% Deployment and Orchestration of ML Workflows
- Domain 4: 24% ML Solution Monitoring, Maintenance, and Security
## Domain 1: 28% Data Preparation for Machine Learning
- Data formats and ingestion mechanisms
    - Validated formats
    - Non-validated formats
    - Apache Parquet
    - JSON
    - CSV
    - Apache ORC
    - Apache Avro
    - RecordIO
- Core AWS data services
    - S3
    - Amazon Elastic File System (EFS)
    - Amazon FSx for NetApp ONTAP
- AWS streaming services
    - Kinesis
    - Flink
    - Kafka
- AWS storage options and tradeoffs
- Extracting data from storage
    - S3
    - Elastic Block Store (EBS)
    - EFS
    - RDS
    - DynamoDB
    - S3 Transfer Acceleration
    - EBS Provisioned IOPS
- Choosing appropriate data formats based on access patterns
- Ingesting data into SageMaker Data Wrangler and SageMaker Feature Store
- Merging data from multiple sources (AWS Glue, Spark)
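
A minimal sketch of why the access pattern drives the format choice, using only the Python standard library. The sample records and all function names are hypothetical. The column dictionary only imitates the layout idea behind columnar formats such as Parquet and ORC; it is not how those formats are actually encoded on disk.

```python
import csv
import io

# Hypothetical sample records; the field names are made up for illustration.
records = [
    {"user_id": 1, "country": "DE", "clicks": 12},
    {"user_id": 2, "country": "US", "clicks": 7},
    {"user_id": 3, "country": "US", "clicks": 30},
]


def to_csv(rows):
    """Row-oriented text: all values of one record sit together on one line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_columns(rows):
    """Column-oriented layout: one list per field (the idea behind Parquet/ORC)."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def total_clicks_from_csv(text):
    # Every field of every row is parsed just to reach the single column needed.
    return sum(int(row["clicks"]) for row in csv.DictReader(io.StringIO(text)))


def total_clicks_from_columns(columns):
    # Only the "clicks" column is touched; the other columns are never read.
    return sum(columns["clicks"])


csv_text = to_csv(records)
columns = to_columns(records)
print(total_clicks_from_csv(csv_text))
print(total_clicks_from_columns(columns))
```

Both calls return the same total, but the row-oriented path has to read whole records while the columnar path reads one field. That is the usual reasoning for preferring Parquet or ORC when queries aggregate a few columns, and CSV or JSON when whole records are written or read one at a time.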
## References
- [[AWS MOC]]