Presentation
Comprehensive Multi-Stage Evaluation of Language Models for Scientific Skill and Safety Red-Teaming
DescriptionThe evaluation of skill and safety in language models (LMs) and foundation models (FMs) within scientific domains is increasingly critical as these models become more integral to research and discovery. We propose a multi-stage evaluation framework that integrates automatic question-answer generation for skill assessment and automated red-teaming for safety evaluation, tailored specifically for scientific applications. The framework includes domain-specific benchmark creation and rigorous validation processes to ensure that the models are both knowledgeable and safe for deployment. Addressing these challenges is essential for developing reliable AI systems that can support scientific innovation responsibly.